Foundations of Data Science II

Overview

Module description

This module covers further fundamental aspects of data science and analytics and is a direct continuation of Foundations of Data Science I (FDS I). As well as consolidating the knowledge you acquired in FDS I, you will develop the further mathematical knowledge and skills needed for the BSc Data Science programme and by data scientists and analysts in general. These include basic elements of calculus, further topics in linear algebra, continuous probability theory and further statistics.

The module will show you how to use Python, a popular and powerful programming language, to solve computational tasks from these mathematical subjects. In particular, it will introduce you to widely used Python libraries and packages for solving problems in calculus, probability theory and statistics.

Indicative module syllabus

  • Differentiation
  • Indefinite and definite integration
  • Solving systems of polynomial equations (eg Newton’s method) and basic optimisation algorithms (eg gradient descent)
  • Continuous probability (random variables, pdf, cdf, expectation, variance and correlation)
  • Common distribution families (Poisson, normal, etc)
  • Probabilistic inequalities and concentration (LLN, CLT, etc)
  • Statistical testing (hypothesis testing, chi-squared testing)
  • Sampling and confidence intervals
  • Eigenvalues and eigenvectors
  • Singular value decomposition (SVD)
  • Tools: Python

Learning objectives

By the end of this module, you will be able to:

  • demonstrate satisfactory knowledge of basic calculus, further linear algebra and matrix theory, continuous probability theory and statistics, and relevant Python libraries and packages
  • demonstrate satisfactory Python programming skills for solving computational tasks from calculus, linear algebra, continuous probability theory and statistics
  • understand the link between the basic knowledge acquired from the module and data science/analytics applications.