AWS QuickSight is the next generation Business Intelligence (BI) application that can help build interactive visualizations on top of various data sources hosted on Amazon cloud infrastructure.
This blog is the third in a series on AWS QuickSight where we will see how easy it is to manipulate your data in QuickSight caching layer called SPICE without the need of IT. Get started with AWS QuickSight by reading the first blog.
What is SPICE and How To Use It
SPICE is an abbreviation for Super-fast, Parallel, In-memory, Calculation, Engine and is the brains behind QuickSight. It can source data from S3, RDS databases, Redshift, RDBMS databases and in future EMR, Kinesis and DynamoDB. SPICE has a great toolset of functions that can do light weight transformation of data in memory and thereby eliminate the need for traditional ETL (Extract Transform and Load).
The following figure shows the position of SPICE in an overall BI solution. On the left hand side, you see the various sources that can push data to SPICE. Data from the various sources is cached by SPICE, which uses a combination of columnar storage, in-memory technologies, machine code generation, data compression and auto-replication for high availability. The results from SPICE serve QuickSight analysis and in the future SPICE will integrate with AWS BI partners like Tableau, TIBCO, Qlick, Domo using its SQL-like interface.
For each paid QuickSight user, you will get 10GB of SPICE capacity. This SPICE capacity is pooled across users for the Amazon QuickSight account. Each QuickSight account also gets 1GB free capacity.
Importing Datasets from Redshift to SPICE
Let’s consider a scenario where you have 2 tables in a Redshift instance and want to load them to SPICE and join them. Follow these steps:
- From the QuickSight home page, click Manage Data
- Select New Data Set and then select the Redshift option
- Enter details about the Redshift data source and select the new table that was created as shown in the following screenshot.
4. In the data preparation page, select both the tables ‘usasalarybyzip’ and ‘usazipcodes’. See Figure 3.
SPICE then shows the resultant combined data set as shown in the following screenshot.
5. Click Prepare data & visualize to visualize the joined data set.
In the QuickSight visualization, you can change the visualization type, dimensions, measures and see an interesting stacked bar that shows which city has best average median salary in orange and the corresponding population in blue.
The following screenshot shows that Pound Ridge city has average median salary as $219,554 and a population of 4,764.
Enriching Your Data
SPICE allows extending datasets imported into SPICE with additional derived fields.
Let’s look into the features using the following steps:
- From the QuickSight home page, click on Manage Data
- Select the ‘usasalarybyzipandcity’ that was created in the previous section of Joining data sets and then click Edit dataset.
- Select New Field and then follow the wizard on the right to create a new derived field.
- In this example, we are going to use the ‘toUppper’ function in the formula and convert the statement name to uppercase.
SPICE is the brains behind QuickSight and enables it to scale to thousands of users who can all simultaneously perform fast interactive analysis from a variety of AWS data sources. After data is imported to SPICE, it can be joined with other data sets and get enriched with a wide range of string, date and numeric functions. Additionally, we can also filter data based to restrict what dataset gets analyzed. The capacity for SPICE is 10GB per user and is pooled across all users in the account that allows organizations to automatically change capacity as they grow in team size.
For further reading, order “Effective Business Intelligence with QuickSight” by Rajesh Nadipalli.
More ways to spice up your data can be found below:
About the AuthorMore Content by Rajesh Nadipalli