Machine Learning on Big Data CN7030


This coursework (CRWK) must be attempted individually. It is divided into two
sections: (1) two big data analytics scenarios and (2) a presentation. You must attend the
presentation. If you do not attend the presentation, you fail this module.
The overall mark for the CRWK comes from two main activities, as follows:
1- Report (around 3,000 words, with a tolerance of ±10%) (60%)
2- Presentation (40%)

Marking Scheme

Topic                 Total mark   Remarks (breakdown of marks for each sub-task)

Big Data Analytics    50           (30) Develop one multi-class classifier and one
using PySpark                      clustering model. Explain the features and
                                   configurations you wish to apply.
                                   (20) Evaluate and visualize the accuracy/performance
                                   and the working solution for each method you applied.

Data Streaming        40           (40) Complete two tasks for data streaming analytics.
analytics using                    You should put a screenshot of the working solution
PySpark                            in the report.

Documentation         10           (10) Write a scientific report.

Total:                100

IMPORTANT: you MUST include your PySpark code in the report as text.
DO NOT include it as a screenshot; otherwise it will be counted as plagiarism.
