# Statistics: Theory and Practice

## Overview

• Credit value: 30 credits at Level 6
• Convenor: Robert Russell
• Assessment: three problem sets (10% each) and a three-hour examination (70%)

## Module description

This module gives an overview of the main theoretical ideas that underpin statistical practice, in both routine and innovative applications.

If your focus is on statistics, this will give you the necessary knowledge for final-year undergraduate or postgraduate study. The module can also serve as a ‘stopping-off’ point if you are a mathematician wishing to push your statistical knowledge beyond introductory level.

You will also gain a working knowledge of a high-level statistical programming language, such as R.

### Indicative syllabus

#### Probability and distribution theory

• Probability spaces, review of conditional probability and independence
• Discrete and continuous random variables and their moments
• Functions of random variables, with emphasis on generating functions
• Collections of random variables, conditional distributions and expectation
• The multivariate normal distribution, with emphasis on the bivariate normal

#### Introduction to statistical inference

• Point and interval estimation (with examples relating to the normal distribution)
• Introduction to hypothesis testing (with examples relating to the normal distribution)
• Likelihood and sufficiency, the Factorization Theorem
• Maximum likelihood estimators
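
As a brief illustration of the likelihood and sufficiency topics above, the Factorization Theorem can be stated roughly as follows. The notation ($f$, $g$, $h$, $T$, $\theta$) is assumed here for illustration and is not taken from the module materials.

```latex
% Factorization Theorem (standard statement; notation is an assumption)
% X = (X_1, \dots, X_n) has joint density or mass function f(x; \theta).
T(X) \text{ is sufficient for } \theta
\iff
f(x;\theta) = g\bigl(T(x);\theta\bigr)\, h(x)
\quad \text{for all } x \text{ and } \theta,
% where g depends on x only through T(x), and h does not depend on \theta.
```

For example, for an independent sample from $N(\mu, 1)$, the joint density factorizes with $T(x) = \sum_i x_i$, so the sample sum (equivalently, the sample mean) is sufficient for $\mu$.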

#### Completely randomized one-way design

• Introduction to R
• Design and analysis of a completely randomized one-way design (theory and practice in R)
• The chi-square and F distributions, and their relationship to analysis of variance techniques
• Least squares estimators
• Estimation and comparison of treatment effects
• Analysis of residuals
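
As an indication of how the chi-square and F distributions enter the analysis, the one-way model and its sum-of-squares decomposition can be sketched as below. The symbols ($k$ treatments, $n_i$ observations on treatment $i$, $N = \sum_i n_i$) are assumptions made for this sketch, as is the usual assumption of independent normal errors with common variance.

```latex
% One-way model (assumed notation): i = 1, \dots, k treatments, j = 1, \dots, n_i replicates
y_{ij} = \mu + \tau_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2)

% Total variation splits into between-treatment and residual parts
\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
  = \sum_{i=1}^{k} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
  + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2

% Under H_0: \tau_1 = \dots = \tau_k, the ratio of mean squares has an F distribution
F = \frac{\sum_{i} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 / (k - 1)}
         {\sum_{i}\sum_{j} (y_{ij} - \bar{y}_{i\cdot})^2 / (N - k)}
  \sim F_{k-1,\, N-k}
```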

#### Linear regression

• Simple linear regression, analysis of residuals and prediction
• Multiple linear regression, ANOVA, testing redundancy
• Stepwise regression
• Modelling linear regression using R
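
For orientation, the least squares estimator that underlies both simple and multiple linear regression can be written in matrix form. This is a sketch under assumed notation: $y$ is the $n \times 1$ response vector and $X$ the $n \times p$ design matrix, taken to have full column rank.

```latex
% Linear model (assumed notation)
y = X\beta + \varepsilon, \qquad \mathbb{E}(\varepsilon) = 0

% Least squares estimator: minimises \lVert y - X\beta \rVert^2 over \beta
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y

% Fitted values and residuals, used in the analysis of residuals
\hat{y} = X\hat{\beta}, \qquad e = y - \hat{y}
```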

## Learning objectives

By the end of this module, you will be able to:

• set up and carry out a simple designed experiment that tests the influence of certain factors using ANOVA techniques
• collate and analyse data arising from a simple designed experiment in a statistical package such as R, and draw appropriate conclusions
• specify and recognise the joint distribution of several random variables given appropriate assumptions on the marginal distributions and their dependence structure
• specify and recognise the multivariate normal distribution, and some of its important properties, particularly in relation to specific graphical properties of the bivariate normal distribution
• derive key results pertaining to the chi-square and F distributions, and relate these to the theoretical basis for the ANOVA technique
• formulate and derive maximum likelihood estimators (and appreciate how these differ from those based on the method of moments)
• determine whether a statistic is sufficient for a given parameter
• appreciate the theoretical underpinning of hypothesis testing and describe how hypothesis tests are carried out under several different paradigms
• determine whether a given data set is amenable to analysis using multiple linear regression
• import or enter data into a statistical package such as R, and perform multiple linear regression principally using command-line functions (rather than menu-driven GUI operations)
• interpret and draw conclusions from a statistical analysis, and present these conclusions so that they can either i) be well understood by a statistician, or ii) be accessible, without being misleading, to an intelligent layperson or non-statistician (who may be involved in policy development)