Analyzing Flights within the U.S. with AWS and Energy BI | by Aashish Nair | Jul, 2023
Information Warehousing with AWS Redshift
With AWS Glue, the information that was initially in a flat mannequin can now be represented with a extra becoming star schema in an information warehouse.
The cloud knowledge warehouse for this knowledge shall be created with AWS Redshift Serverless. This entails making a namespace named flights-namespace
in addition to a database named dev
. As well as, it requires a workgroup named flights-workgroup
, which shall be used to write down SQL queries.
Observe: The workgroup has been configured to permit units exterior of the VPC to entry the database. This shall be helpful when creating the visualization with Energy BI
Now, we will open the question editor in Redshift and begin creating the very fact and dimension tables within the dev
database.
First, the 4 tables within the schema should be created within the warehouse utilizing the next instructions:
The 4 tables at the moment are within the knowledge warehouse, however they’re all empty for the reason that knowledge remains to be within the flights-data-processed
bucket.
The information might be copied into this knowledge warehouse utilizing the COPY
command.
As an illustration, the information in flights.csv
might be copied into the flights
desk utilizing the next command syntax:
Observe: the
iam_role
variable must be assigned no matter iam function is was chosen when creating the workgroup.
By executing the COPY
command for every of the csv recordsdata within the flights-data-processed
bucket, the 4 tables must be crammed with the required knowledge.
For example, here’s a preview of the airport desk: