Word Count: 1000 words

Learning Outcomes Assessed:

  1. Critically appraise the challenges posed by the management and processing of complex datasets and data inputs.

  2. Discuss, compare and contrast advanced techniques and algorithms for working with complex datasets and data types using data science.

  3. Critically evaluate and select state-of-the-art data science techniques and algorithms for selected/given applications involving complex data.

  4. Apply advanced techniques and algorithms and critically analyse and evaluate the results.

Task(s) – content 

An energy company in Aberdeen has tasked you with producing a viability study on incorporating Machine Learning and Computer Vision for corrosion detection in subsea pipelines. However, they won’t share their data. Instead, the inspection engineers have come up with the idea of compiling an image dataset with two subsets:

Surface: Images of metallic objects, with 128 negative instances (i.e., metal objects with no corrosion) and 1104 positive instances (i.e., rusted and corroded metallic objects).

Underwater: Pictures taken subsea, with 27 negative instances (i.e., showing animals or coral reefs) and 27 positive instances (i.e., showing corroded objects under the sea).
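
The class counts above imply a strong imbalance in the Surface subset and a very small Underwater subset. As a minimal sketch only (not a required part of the solution), the arithmetic can be checked as follows. The counts are copied from the description above; the names `counts`, `class_weights` and `positive_fraction` are hypothetical, and inverse-frequency weighting is just one common way to handle imbalance, assumed here for illustration.

```python
# Class counts copied from the dataset description above.
counts = {
    "surface": {"negative": 128, "positive": 1104},
    "underwater": {"negative": 27, "positive": 27},
}


def positive_fraction(class_counts):
    """Share of positive (corroded) images in one subset."""
    total = sum(class_counts.values())
    return class_counts["positive"] / total


def class_weights(class_counts):
    """Inverse-frequency weights: n_samples / (n_classes * n_in_class).

    This is an illustrative assumption, not a method prescribed by the brief.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * n) for label, n in class_counts.items()}


for subset, class_counts in counts.items():
    total = sum(class_counts.values())
    weights = class_weights(class_counts)
    print(
        f"{subset}: {total} images, "
        f"{positive_fraction(class_counts):.1%} positive, "
        f"weights = {{negative: {weights['negative']:.3f}, "
        f"positive: {weights['positive']:.3f}}}"
    )
```

With these counts the Surface subset is roughly 90% positive, so a model that always predicts "corroded" would already score about 90% accuracy there. Metrics other than plain accuracy are therefore worth considering when evaluating results.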

The report structure and presentation will also be marked, so make sure that your notebook is presented in an elegant and professional manner. Run all cells and DON’T clear the output, as markers won’t run the code (instead, markers will just read the notebook as if it were a report).