Probability Models

Overview

Module description

This module introduces the mathematical foundations of probability and shows how probability can be used to model scenarios that occur in everyday life. For those planning to specialise in the statistical aspects of data science, we provide numerous examples, many of which we explore with the aid of a statistical programming language and package.

Indicative syllabus

  • Introduction to a statistical computer package (R or another package)
  • Descriptive statistics and graphical methods
  • The mathematical theory of probability
  • Conditional probability and Bayes’ Theorem
  • Discrete random variables and probability distributions
  • Continuous random variables and probability distributions, including the normal distribution
  • Mean and variance of probability distributions
  • Populations, samples and sampling distributions, including Student’s t and chi-square distributions
  • Point and confidence interval estimation for the population mean of normally distributed data

Learning objectives

By the end of this module, you will be able to:

  • use a statistical computer package to summarise and illustrate data through the use of summary statistics and graphical techniques
  • carry out basic probability calculations using the mathematical theory of probability
  • work with conditional probabilities and use Bayes’ Theorem to update them where appropriate
  • describe the characteristics of discrete and continuous probability distributions and give examples of each
  • recognise situations where the use of certain probability distributions is appropriate and carry out the corresponding calculations of probabilities (i) by direct calculation, (ii) by the use of statistical tables, or (iii) by the use of a statistical package
  • carry out calculations of probabilities for normally distributed random variables (i) by the use of statistical tables or (ii) by the use of a statistical package
  • find point estimates from random samples drawn from the normal distribution and construct confidence intervals for the unknown population mean (i) by the use of statistical tables or (ii) by the use of a statistical package.
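
To give a flavour of the last objective, here is a minimal sketch of a confidence interval for a population mean, using a critical value read from a t table. It is written in Python purely for illustration; the module itself may use R or another package. The function name `t_interval`, the sample values and the constant `T_CRIT_95_DF9` are all assumptions made up for this example, and the data are assumed to come from a normal distribution.

```python
import math
import statistics


def t_interval(sample, t_crit):
    """Return (mean, lower, upper) for a two-sided confidence interval.

    t_crit is the critical value of Student's t with len(sample) - 1
    degrees of freedom, looked up in a statistical table by the caller.
    """
    n = len(sample)
    mean = statistics.mean(sample)   # point estimate of the population mean
    sd = statistics.stdev(sample)    # sample standard deviation (n - 1 divisor)
    half_width = t_crit * sd / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Hypothetical sample of 10 measurements, invented for illustration only.
sample = [4.9, 5.1, 5.0, 4.8, 5.2, 5.3, 4.7, 5.0, 5.1, 4.9]

# Two-sided 95% critical value for 9 degrees of freedom, as given
# (to three decimal places) in standard t tables.
T_CRIT_95_DF9 = 2.262

mean, lower, upper = t_interval(sample, T_CRIT_95_DF9)
print(f"point estimate: {mean:.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

A statistical package would compute the critical value for you instead of requiring a table lookup; the arithmetic is otherwise the same.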