Word Count : 5000 words
The first coursework assignment for this module is an individual assignment worth 50% of the overall module mark.
MN-M534 Business Analytics Assessment 2022
The UK government has committed to ending of sales of all new petrol and diesel engine cars by 2030. Electric car sales are continuing to grow in the UK, with over 395,000 plug in electric cars registered on UK roads at the end of 2021, with nearly 28,000 new cars registered in December alone. Electric cars now total 10.7% of all new UK car sales. However, to meet the UK government target, electric cars will need to represent a much greater proportion of new registrations for each of the years leading to the 2030 deadline.
A new research team within the Department for Transport has commissioned Buxton Analytics to undertake a research project to identify key insights to the available data on electric car usage to inform the DfT as to the key factors that impact the UK meeting the 2030 deadline.
You are a Business Analyst working for Buxton Analytics tasked with providing insight to the data. You are required to undertake the necessary background research on this topic to provide a detailed analytical report for the Department for Transport (DfT). You will then be expected to include your findings in a report to be approved by the executive and submitted to the client.
The following variables are included within the MN-M534electricCarStats.xlsx file
· Date · Diesel Cost (£ Litre)
· DfT Survey results for #1
reason for purchase · Household Energy Cost (£ Ave)
· Govt in power · #Public Charging Stations
· Cars Registered · HMRC zero tax incentive in place
· Ave Fast Charge time
(hrs:mins) · Ave Range (miles)
· CPI Index · Cost of Chargers (£ Ave)
· Ave new Car Cost (£) · Vehicle Tax (£ Ave ICE)
· HMRC Zero Tax Incentive In
Place · Covid 19 period Y/N (UK)
· Petrol Cost (£ Litre) · Ave UK salary(£)
The DfT requires the following questions to be answered:
1. The DfT has provided a spreadsheet of data to be analysed – MN-M534electricCarStats.xlsx, but has stated that they think there are a number of duplicate rows in the data. Your initial analysis has identified that the data file should contain 60 mths of distinct electric car sales data, but the file provided contains 65 records. Using appropriate features in Excel, identify the duplicate records and remove these from the excel file. Your answer must indicate the method you have used to search for duplicates and a screenshot of the command syntax.
2. Using appropriate methods of descriptive analytics compare the following numerical variables.
· Ave new Car Cost (£)
· CPI Index
· Ave Range (miles)
· #Public charging stations
· Govt Subsidy to Purchase
· Ave UK salary(£)
Your answer can include various graphs/plots, descriptive statistics and correlations. Make sure to discuss whether (and how in statistical terms) each variable has an impact on the number of registered electric cars and any relationships or observations worth noting in the data. (17 marks)
3. Create dummy variables for the following categorical variables so you are able to use them as independent variables within the regression in Q6.
· HMRC zero tax incentive in place
· Covid 19 period Y/N (UK)
· Monthly DfT Survey results
Explain why it is required to use dummy variables when using categorical variables with regression and the required rules for processing. (5 marks)
4. Using appropriate methods of descriptive analytics compare the number of registered electric cars with the categorical variables listed in Q3.
Your answer can include various graphs/plots, descriptive statistics and correlations. Make sure to discuss whether (and how in statistical terms) each variable has an impact on the number of registered electric cars and any relationships or observations worth noting in the data. (15 marks)
5. Describe the concept of multicollinearity and why it is an issue for regression models? What is the impact if highly correlated independent variables are used in a multiple regression? Develop a model for the numerical variables listed in Q2 and separately those in Q3. Identify which specific independent variables are highly correlated. Clearly explain the rational and process for your decision? (10 marks)
6. Develop a multiple linear regression using number of cars registered monthly as the dependent variable and the following variables from the MN-M534electricCarStats.xlsx file with the dummy variables created in Q3. Be sure to take account of any multicollinearity issues from Q5 when constructing your model.
Ave new car cost (£)
Govnt subsidy to purchase (£)
#Public charging stations
Ave UK salary(£)
Dummy variable A
Dummy variable B………
Discuss your findings using the data outputs from the regression model including the relative significance of individual independent variables? Are there variables that can be
eliminated to improve the model? Are there any other variables in the data file not currently used that you feel could be included in the regression to improve the model?
7. Based on your analysis what are your recommendations and conclusions for the client (DfT) for the following key questions:
a) In assessing the influence of the independent variables, which have a statistical impact on new car registrations?
b) The DfT has committed to delivering feedback to the society of motor manufacturers based on the analysis of this data, what aspects of the data analysis do you feel are of greatest use to electric car manufactures and why?
c) Comment on the cyclical patterns for new car registrations?
d) Based on your assessment of the data analytics outputs, what are the key areas the DfT should be focussing on to increase the numbers of new car registrations in time for 2030? (12 marks)
The managerial report should contain
Abstract – this should be brief no more than 200 words and answer the following question
- What did I do in a nutshell?
- What is the problem? – no more than 200 words
- How did I solve the problem? – no more than 200 words
· Results & Discussion – around 1400 words for this section
- What did I find out? – describe the results
- What do the results mean? – interpret the results
- Tables/graphs/figures are not included in the word count
- This is where you answer questions 1-6
· Conclusion – around 500 words for this section
- Bringing everything together – what does it all mean?
- Make sure you discuss the overall aim of the assignment in this section.
- This is where you answer question 7
- Whose work did I refer to?
- Remember to include any textbooks that you may have used to help with analysis and interpretation.
- Not included in word count
- Extra information
- Not included in word count
The power point presentation should show the following for questions 1-4
- What you did?
- How you did it?
- Why you did it?
- What were the results?
For question 5 the power point should answer the set questions using a minimum of 1 slide per question and a maximum of 2 slides per question