programming multi-part question and need the explanation and answer to help me learn.
I have attached the assignment guidelines and an example the professor gave about how the model should look. It should be a denormalized star schema with Fact Tables, Dimensional Tables, PKs and FKs, Data Relationships and SCDs.
Requirements:
Overview of the Assignment:
This assignment will go through steps to develop a data warehouse design.
Part 1 – Review the business requirements
CityTour is a national chain of metropolitan tours (yes they have double decker tour busses!). City Tours specializes in themes, for example haunted tours, tours in the river and the harbor (you may want to review Boston Duck boat tours, and Boston trolly tours for ideas)
CityTour would like to introduce data warehousing and analytics to build their business. You have been hired as a data architect to create an initial Constellation data warehouse design. On the next page is CityTour’s relational OLTP data model. In addition, they would like to correlate the tour data with weather data to see if there are trends from weather which might impact tours.
CityTour OLTP Data Model on next page
External Weather data is to come via an API such as Here is a sample JSON API response to give you a sense of the data returned:
{ “coord”: { “lon”: 10.99, “lat”: 44.34 }, “weather”: [ { “id”: 501, “main”: “Rain”, “description”: “moderate rain”, “icon”: “10d” } ], “base”: “stations”, “main”: { “temp”: 298.48, “feels_like”: 298.74, “temp_min”: 297.56, “temp_max”: 300.05, “pressure”: 1015, “humidity”: 64, “sea_level”: 1015, “grnd_level”: 933 }, “visibility”: 10000, “wind”: { “speed”: 0.62, “deg”: 349, “gust”: 1.18 }, “rain”: { “1h”: 3.16 }, “clouds”: { “all”: 100 }, “dt”: 1661870592, “sys”: { “type”: 2, “id”: 2075663, “country”: “IT”, “sunrise”: 1661834187, “sunset”: 1661882248 }, “timezone”: 7200, “id”: 3163858, “name”: “Zocca”, “cod”: 200 }
CityTour OLTP Data Model
Part 1 – Business Rules
Determine four business questions your data warehouse design will answer. Keep these questions in mind as you move on to the rest of the assignment. One of the questions needs to consider some sort of correlation with external weather data.
Your response of four business questions goes here
Part 2 – Design a constellation schema warehouse
Create and insert an ERD showing the constellation ERD schema below. Requirements are as follows:
Determine four to five (non-date/time) SCDs – make sure to include at least one type 2 and one type 3
Determine two to three date dimensions of different grains, consider a role-playing or bitemporal date dimensions. You will use table in question 2b to outline the facts and explain your design choice for the temporal dimensions
Determine two to three fact tables, your design should include at least one Snapshot (transactional) fact, and at least one cumulative fact.
Your dimensional model design goes here
Questions on Dimensions
For EACH non-date/time SCD use the following table explain your SCD design
Outline the fact tables which contain role-playing or bitemporal design
Questions on Fact tables
Question For each of your business questions in part 1, how are the measures tied to your questions? Highlight the dimensions, facts and measures involved.
Question: Outline the fact tables and explain why they are cumulative or snapshot
Question: Which attributes in the OLTP schema will transform to measures and what measures can be derived/calculated that should be included?
Part 3 – Assumptions/Appendix (optional)
You may need to make some assumptions in order to work on this assignment. Clearly state any assumptions you needed to make here, along with your reasoning why the assumption was appropriate (one or two sentences should be sufficient in most cases). Do not remove any functionality, or trivialize, any of the assignment requirements. Feel free to augment.
ProjectDimPKProjectIDProjectNameProjectType2023ProjectType2022ProjectType2021ProjectType2020CustomerDimPKCustomerDimIDCustomerIDCustomerContactNameCustomerAddress1CustomerAddress2CustomerCityCustomerStateCustomerZipCodeEffectiveDateExpireDateCurrentFlagProjectFactPK,FKProjectIDPK,FKEstimatedCompletionDateIDPK,FKActualCompletionDateIDPK,FKCustomerDimIDFinishedOnTime (bit)DaysToComplete (integer)EstimatedRevenue (decimal)ActualRevenue (decimal)RevenueDifference (decimal)DateDimPKDateDimIDDateDayQuarterMonthYearSCD3SCD0Snapshot (Transactional fact)SCD2ProjectStageDimPKProjectStageIDProjectNameProjectTypeProjectStageDescriptionEffectiveDateExpireDateCurrentFlagProjectCumFactPK,FKProjectIDPK,FKActualCompletionMonthDimPK,FKCustomerDimIDPercentageFinishedOnTime (decimal)AverageDaysToComplete (integer)AverageEstimatedRevenue (decimal)AverageActualRevenue (decimal)RevenueDifference (decimal)MonthDimPKMonthDimIDQuarterMonthYearCumulative (Periodic fact)SCD0EmployeeDimPKTeamMemberDimIDEmployeeIDEmployeeNameTeamMemberIDRoleNameHourlyPayRateHourlyBillRateEmployeeCityEmployeeStateEffectiveDateExpireDateCurrentFlagProjectStageFactPK,FKProjectStageIDPK,FKActualStartDateIDPK,FKActualCompletionDateIDPK,FKCompletionProcessedDateIDPK,FKEmployeeDimIDHoursTakenToComplete (integer)DaysStartedLate (integrer)NumberOfDaysWorked (integer)FinishedOnTime (bit)DaysToComplete (integer)EstimatedCost (decimal)ActualCost (decimal)CostDifference (decimal)ActCostToCompleteEmployee(decimal)Snapshot (Transactional fact)SCD2SCD2Jack PolnarCS68910/15/2023