Data Science Techniques and Applications
Overview
- Credit value: 15 credits at Level 7
- Module convenor and tutor: Alessandro Provetti
- Prerequisites: Principles of Programming, Programming with Data and Data Analytics using R
- Assessment: a two-hour examination (80%) and a coursework assignment (20%)
Module description
This module presents data science as a set of nine computational problems, then examines the geometrical interpretation of data and its consequences. The 'Rating and Ranking' and 'Complex Network' models are also studied in some depth.
The module has been designed to overlap with the Machine Learning/Applied Machine Learning modules.
Indicative Module Syllabus
- Data science as nine computational problems
- Statistics, linear algebra and information theory
- Python modules for data analytics such as NumPy and Scikit-learn
- The geometric view of data; the curse of dimensionality; spectral and decomposition techniques
- Advanced techniques: Non-negative Matrix Factorization (NMF) and Factorization Machines (FM)
- Rating and ranking, and their use in prediction (for example, in sports)
- From data to networks (graphs), and their relevant properties
- Network analysis in biology, international trade, computer networks, web search and finance
Learning objectives
By the end of this module, you will be able to:
- understand data science as nine computational and modelling problems
- deploy techniques for quantitative data analysis, such as information entropy, spectral analysis and matrix decomposition
- use Python to apply the techniques learned on the module
- validate and evaluate data analysis results
- demonstrate satisfactory knowledge of network models.
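
As an informal illustration of two techniques named in the objectives above (information entropy and matrix decomposition), the following minimal Python sketch uses NumPy. It is not taken from the module materials: the function name `shannon_entropy` and the toy matrix `X` are assumptions made purely for this example.

```python
import numpy as np


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`.

    Hypothetical helper written for this illustration only.
    """
    # Count how often each distinct value occurs.
    _, counts = np.unique(labels, return_counts=True)
    probs = counts / counts.sum()
    # H = -sum(p * log2(p)); every p here is > 0, so log2 is well defined.
    return float(-(probs * np.log2(probs)).sum())


# Toy data matrix, made up for illustration (3 observations, 2 features).
X = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# Singular value decomposition: X = U @ diag(s) @ Vt,
# with the singular values in `s` sorted in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Multiplying the factors back together should recover X up to rounding.
X_rebuilt = U @ np.diag(s) @ Vt
```

A uniform distribution over two values gives an entropy of one bit, while a constant sequence gives zero. The singular values in `s` indicate how much of the data's variation lies along each principal direction.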