Maths: Statistics 1 (S1)


    D1 Types of variable:

    • Qualitative: non-numerical (e.g. colour)
    • Quantitative: numerical (e.g. length)
    • Continuous: can take any value (e.g. age)
    • Discrete: can only take certain values (e.g. cost)
    • Categorical: listed by a category/property and not a number

    D2 - 8 Data presentation:

    • Know how to use and create pie charts, bar charts, line charts, histograms, stem and leaf diagrams, box and whisker plots and cumulative frequency charts
    • In a box and whisker plot (boxplot), an outlier is a value more than 1.5 × IQR beyond the nearest quartile

    D9 Skewness:

    • Skewness can be described as positive, symmetrical or negative
    • A positive skew has most of the distribution concentrated on the left, with a longer tail to the right (the distribution is right-skewed)
    • A negative skew has most of the distribution concentrated on the right, with a longer tail to the left (the distribution is left-skewed)
    • A distribution can have many types of shape, including unimodal (one peak), bimodal (two peaks, sometimes, but not always, of the same height) and uniform (constant)

    D10 Measures of central tendency:

    • The mean is calculated with ∑x ÷ n, where n is the number of items
    • The symbol for the mean is x̄, x bar
    • The position of the median is found with (n + 1) ÷ 2 (the list must be in ascending order). If n is odd then there is a single middle value. If it is even, then the two middle values are added then divided by two to find the median
    • The mode is the value which appears most frequently. The data is bimodal if two values occur more often than the rest, which may suggest that the data has been taken from two populations
    • The mid-range is the value midway between the upper-extreme and lower-extreme (highest and lowest values), calculated by adding them and dividing by 2
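
    As a quick check (not part of the spec), here is a minimal Python sketch of these four averages; the data set {1, 2, 2, 3, 10} is made up purely for illustration:

      # Four averages for a made-up data set
      from collections import Counter

      data = sorted([1, 2, 2, 3, 10])
      n = len(data)

      mean = sum(data) / n                                   # (sum of x) / n = 3.6

      # median: the value in position (n + 1) / 2 of the ordered list
      if n % 2 == 1:
          median = data[(n + 1) // 2 - 1]                    # odd n: the single middle value
      else:
          median = (data[n // 2 - 1] + data[n // 2]) / 2     # even n: average of the two middle values

      counts = Counter(data)
      modes = [v for v, c in counts.items() if c == max(counts.values())]   # may contain more than one value

      mid_range = (max(data) + min(data)) / 2                # (highest + lowest) / 2

      print(mean, median, modes, mid_range)                  # 3.6 2 [2] 5.5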

    D11 Usefulness of each measure of central tendency:

    • The mean is the best known average and makes use of all of the data, but is affected by extreme values and cannot be obtained graphically
    • The median is not influenced by extreme values and can be obtained even if some of the data values are unknown, but it can often only be estimated and it cannot be used for further statistical calculations like the mean can
    • The mode is also unaffected by extreme values and is easy to calculate, but there may be more than one and sometimes it cannot be determined exactly

    D13 Ranges and percentiles:

    • The range is found with largest value - smallest value, or xmax - xmin
    • The quartiles divide the ordered data into four equal parts; their positions are found as follows:
      - Q1: (n + 1) ÷ 4
      - Q2: 2(n + 1) ÷ 4 (the median)
      - Q3: 3(n + 1) ÷ 4
    • The interquartile range (IQR) is equal to Q3 - Q1
    • To find the xth percentile, use x(n + 1) ÷ 100 to locate its position, in the same way as for the quartiles (Q1 = the 25th percentile, Q2 = the 50th percentile/the median, etc); see the sketch below
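
    A short Python sketch of this position rule, interpolating between neighbouring values when the position is not a whole number (quartile/percentile conventions differ slightly between textbooks and calculators, so treat this as one reasonable reading of the rule above; the data set is made up):

      # xth percentile using the position rule x(n + 1) / 100
      def percentile(data, x):
          data = sorted(data)
          pos = x * (len(data) + 1) / 100                 # 1-based position in the ordered list
          whole, frac = int(pos), pos - int(pos)
          if whole < 1:
              return data[0]
          if whole >= len(data):
              return data[-1]
          # interpolate between the two neighbouring values
          return data[whole - 1] + frac * (data[whole] - data[whole - 1])

      data = [1, 3, 5, 7, 9, 11, 13]                      # n = 7
      q1, q2, q3 = percentile(data, 25), percentile(data, 50), percentile(data, 75)
      print(q1, q2, q3, q3 - q1)                          # 3.0 7.0 11.0 8.0 (IQR = 8)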

    D14 Measures of spread:

    • The sum of squares (or Sxx) is calculated with Sxx = ∑(x - x̄)² = ∑x² - nx̄²
      - Therefore, to calculate Sxx, find the sum of all the values squared and subtract n multiplied by the mean squared
      - For example, for the data {1, 2, 10}, ∑x² = 1² + 2² + 10² = 105
      - n × x̄² = 3 × (13/3)² ≈ 56.3
      - Therefore, Sxx = 105 - 56.3 = 48.7 (to 3 sig. figs.)
    • Remember not to use a rounded mean when calculating Sxx
    • Mean square deviation (MSD) = Sxx ÷ n
    • Root mean square deviation (RMSD) is the root of the mean square deviation
    • Variance (s²) = Sxx ÷ (n - 1)
    • Sample standard deviation (s) is the root of the variance
      - Sample standard deviation (just called standard deviation in the exam) uses the symbol sx in Casio calculators
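
    A minimal Python sketch of these definitions, reproducing the Sxx = 48.7 example above (note the exact, unrounded mean is used throughout):

      # Measures of spread for the data {1, 2, 10}
      data = [1, 2, 10]
      n = len(data)
      mean = sum(data) / n                                # exact mean, not rounded

      sxx = sum(x**2 for x in data) - n * mean**2         # Sxx = sum of x^2 - n * mean^2
      msd = sxx / n                                       # mean square deviation
      rmsd = msd ** 0.5                                   # root mean square deviation
      variance = sxx / (n - 1)                            # s^2
      s = variance ** 0.5                                 # sample standard deviation

      print(round(sxx, 1))                                # 48.7
      print(round(msd, 2), round(rmsd, 2))                # 16.22 4.03
      print(round(variance, 2), round(s, 2))              # 24.33 4.93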

    D15 Linear coding:

    • Adding/subtracting a constant from all of the data will change the mean by this amount and will not affect standard deviation
      - For example, the data {2, 3, 4, 5, 6} has a mean of 4 and standard deviation of 1.58
      - Adding 1 to each item (to make {3, 4, 5, 6, 7}) will increase the mean by 1 but the standard deviation will remain at 1.58
    • Multiplying/dividing all of the data by a constant will multiply/divide both the mean and the standard deviation by that factor
      - Using the above example, multiplying each item by 2 will result in {4, 6, 8, 10, 12}
      - The mean will now be 2 x 4 = 8 and standard deviation will be 2 x 1.58 = 3.16
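
    A quick Python check of both linear coding rules, using the {2, 3, 4, 5, 6} example above (the sample standard deviation, with divisor n - 1, is used here):

      # Effect of adding a constant and of multiplying by a constant
      def mean_and_sd(data):
          n = len(data)
          m = sum(data) / n
          sxx = sum((x - m)**2 for x in data)
          return m, (sxx / (n - 1)) ** 0.5                # sample standard deviation

      data = [2, 3, 4, 5, 6]
      print(mean_and_sd(data))                            # (4.0, 1.58...)
      print(mean_and_sd([x + 1 for x in data]))           # mean shifts to 5.0, sd unchanged
      print(mean_and_sd([2 * x for x in data]))           # mean doubles to 8.0, sd doubles to 3.16...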

    D16 Outliers:

    • An outlier can be defined as a value at least 2 standard deviations from the mean, or more than 1.5 × IQR beyond the nearest quartile
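
    A rough Python sketch applying both rules to a made-up data set; the quartile positions are simply rounded down here, which is a simplification of the (n + 1) ÷ 4 rule given earlier:

      # Flag outliers by both rules
      data = sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
      n = len(data)
      mean = sum(data) / n
      s = (sum((x - mean)**2 for x in data) / (n - 1)) ** 0.5

      # Rule 1: at least 2 standard deviations from the mean
      rule1 = [x for x in data if abs(x - mean) >= 2 * s]

      # Rule 2: more than 1.5 x IQR beyond the nearest quartile
      q1 = data[(n + 1) // 4 - 1]                         # position (n + 1) / 4, rounded down
      q3 = data[3 * (n + 1) // 4 - 1]                     # position 3(n + 1) / 4, rounded down
      iqr = q3 - q1
      rule2 = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

      print(rule1, rule2)                                 # [100] [100]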

    u1 - u4 Probability - notation:

    • P(A) is the probability of A
    • P(B) is the probability of B
    • P(A∩B) is the probability of A and B occurring
    • P(A∪B) is the probability of A occurring, B occurring or both occurring
    • P(A') is the probability of A not occurring
    • P(A|B) is the probability of A occurring given that B has already occurred

    u5 Mutually exclusive and independent events:

    • Independent events are not affected by one another
      - If two events are independent, then P(A) × P(B) = P(A∩B)
    • If P(B|A) = P(B), then A and B are independent
    • Mutually exclusive events cannot both happen at the same time - e.g. getting heads and tails in one flip
      - If two events are mutually exclusive, P(A∩B) = 0 and P(A∪B) = P(A) + P(B)
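
    These rules can be checked by listing all 36 equally likely outcomes when two fair dice are rolled; the events used below are chosen purely for illustration:

      # Independence and mutual exclusivity checked by enumeration
      from fractions import Fraction
      from itertools import product

      outcomes = list(product(range(1, 7), repeat=2))     # all 36 (first die, second die) pairs

      def prob(event):
          return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

      A = lambda o: o[0] == 6                             # first die shows a 6
      B = lambda o: o[1] % 2 == 0                         # second die is even (independent of A)
      C = lambda o: o[0] + o[1] == 2                      # total is 2 (mutually exclusive with D)
      D = lambda o: o[0] + o[1] == 12                     # total is 12

      print(prob(lambda o: A(o) and B(o)) == prob(A) * prob(B))     # True:  P(A∩B) = P(A) x P(B)
      print(prob(lambda o: C(o) and D(o)))                          # 0:     P(C∩D) = 0
      print(prob(lambda o: C(o) or D(o)) == prob(C) + prob(D))      # True:  P(C∪D) = P(C) + P(D)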

    u6, u7 Calculating outcomes:

    • When two mutually exclusive events occur, the probability of either A or B occurring is equal to P(A) + P(B)
    • The probability of two independent events occurring (e.g. two coins flipped, both getting tails) is equal to P(A) x P(B), 0.5 x 0.5 = 0.25 in this example
      TO DO: Anything I've missed? Check back over spec

    R1, R2 Discrete random variables:

    • A probability distribution is a table showing the probability of each possible outcome, for example:
      x          0      1      2      3      4
      P(X = x)   2/10   1/10   2/10   3/10   2/10
    • Here, the probability that X = 0 is 2/10, that X = 3 is 3/10, etc
    • The probabilities will always sum to 1
    • These are known as discrete random variables because they can only take a set of values (here only 0, 1, 2, 3 and 4) and their probabilities sum to 1

    R3 Expectation:

    • Expectation is the mean value, written as µ or E(X)
    • It is calculated with ∑xP(X = x) (so sum each value multiplied by its probability)
    • In the above table, the expectation = (0 × 2/10) + (1 × 1/10) + (2 × 2/10) + (3 × 3/10) + (4 × 2/10) = 2.2
      - Therefore, the mean value is 2.2
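
    A one-line Python check of this calculation, using the distribution table above:

      # E(X) = sum of x * P(X = x)
      from fractions import Fraction

      dist = {0: Fraction(2, 10), 1: Fraction(1, 10), 2: Fraction(2, 10),
              3: Fraction(3, 10), 4: Fraction(2, 10)}
      assert sum(dist.values()) == 1                      # the probabilities sum to 1

      print(float(sum(x * p for x, p in dist.items())))   # 2.2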

    R4 Variance:

    • Variance is a measure of spread and is the square of standard deviation
    • It is represented with Var(X)
    • From a probability distribution, it can be calculated with Var(X) = E(X²) - E(X)²
      - Another way of writing this is Var(X) = E((X - µ)²)
    • For the above table:
      - E(X²) = (0² × 2/10) + (1² × 1/10) + (2² × 2/10) + (3² × 3/10) + (4² × 2/10) = 6.80
      - E(X)² = 2.2 (calculated in the expectation section) squared = 4.84
      - Therefore, Var(X) = 6.80 - 4.84 = 1.96
      - Standard deviation is the root of this, 1.40
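
    And the same check for the variance and standard deviation:

      # Var(X) = E(X^2) - E(X)^2 for the same distribution
      dist = {0: 2/10, 1: 1/10, 2: 2/10, 3: 3/10, 4: 2/10}

      e_x = sum(x * p for x, p in dist.items())           # 2.2
      e_x2 = sum(x**2 * p for x, p in dist.items())       # 6.8
      var = e_x2 - e_x**2                                 # 6.8 - 4.84 = 1.96
      print(round(var, 2), round(var ** 0.5, 2))          # 1.96 1.4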

    H1 - H3 Binomial distributions:

    • A binomial distribution applies when there is a fixed number of independent, repeated trials, each of which is either a success (probability p) or a failure (probability q)
      - Therefore p + q = 1
    • If X has a binomial distribution B(n, p), this can be written as X ~ B(n, p)
    • The sample size is denoted by n
    • P(X = r) = nCr × p^r × q^(n - r)
    • Example: A coin is tossed ten times. What is the probability of it coming down heads five times and tails five times?
      - p = 0.5 and q = 0.5 since it is a fair coin
      - n = 10 because there are ten tosses
      - Consider heads a success and tails a failure
      - r = 5 because this is the number of successes (heads) we are testing for
      - Therefore, the probability of 5 successes and 5 failures is equal to 10C5 × 0.5^5 × 0.5^(10 - 5) = 0.246
    • Probabilities can be added. So for the above example of coin tosses, the probability of getting 4 or 5 heads is equal to
      10C5 × 0.5^5 × 0.5^(10 - 5) + 10C4 × 0.5^4 × 0.5^(10 - 4) = 0.451
    • Remember that all of the probabilities add to 1. This can sometimes be used to reduce the number of calculations - for example, to find the probability of getting 0 - 9 heads, it is quicker to subtract the probability of 10 heads from 1 than to add up all the probabilities from 0 to 9
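
    A short Python sketch of the formula, reproducing the coin-toss probabilities above (math.comb gives nCr):

      # X ~ B(10, 0.5): probabilities for the ten-coin-toss example
      from math import comb

      def binomial(n, p, r):
          return comb(n, r) * p**r * (1 - p)**(n - r)     # nCr * p^r * q^(n - r)

      print(round(binomial(10, 0.5, 5), 3))                           # P(X = 5) = 0.246
      print(round(binomial(10, 0.5, 5) + binomial(10, 0.5, 4), 3))    # P(X = 4 or 5) = 0.451
      print(round(1 - binomial(10, 0.5, 10), 3))                      # P(0 to 9 heads) = 0.999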

    Binomial probability tables:

    • A cumulative binomial table shows P(X ≤ x) when X ~ B(n,p)
      - For example, if a die is rolled 8 times (n = 8), the probability of getting 0, 1, 2 or 3 sixes (p = 1/6 and x = 3) can be read straight from the table
      - Looking up n = 8, p = 1/6, x = 3 gives P(X ≤ 3) = 0.9693
      - These tables start on page 12 of the exam formula booklet
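
    The tabulated value can also be checked directly by summing the binomial probabilities for r = 0 to 3:

      # P(X <= 3) for X ~ B(8, 1/6), as read from the cumulative table
      from math import comb

      n, p = 8, 1/6
      print(round(sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(4)), 4))   # 0.9693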

    H4, H5 n!:

    • n! (n factorial) is used to calculate the number of arrangements of a set of objects/digits etc, without removing or adding any
    • For example, there are 3! = 3 x 2 x 1 = 6 ways of arranging the letters A, B and C, only using each once

    H3 nCr, combinations:

    • nCr is the number of ways to select r objects from n. For example, with the letters A, B, C and D, there are 4C2 = 6 possible ways of selecting two of the four letters randomly (AB, AC, AD, BC, BD, CD)
    • The formula is nCr = n! ÷ ((n - r)! × r!)
    • Combination is used when order does not matter
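
    A quick Python illustration of the 4C2 example, both by listing the selections and by the formula:

      # 4C2: choosing 2 letters from A, B, C, D when order does not matter
      from itertools import combinations
      from math import comb, factorial

      letters = ["A", "B", "C", "D"]
      print(list(combinations(letters, 2)))                        # the 6 unordered pairs
      print(comb(4, 2))                                            # 6
      print(factorial(4) // (factorial(4 - 2) * factorial(2)))     # n! / ((n - r)! x r!) = 6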

    nPr Permutations:

    • Permutations are used when order does matter
    • The formula is nPr = n! ÷ (n - r)!, which is the same as nCr × r!
      - Therefore, for the example used in combinations, there are 4P2 = 24 ÷ 2 = 12 permutations (AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC)
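
    And the same for 4P2, where order does matter:

      # 4P2: choosing 2 letters from A, B, C, D when order does matter
      from itertools import permutations
      from math import factorial, perm

      letters = ["A", "B", "C", "D"]
      print(len(list(permutations(letters, 2))))          # 12 ordered pairs
      print(perm(4, 2))                                   # 12
      print(factorial(4) // factorial(4 - 2))             # n! / (n - r)! = 24 / 2 = 12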

    H6 mean = np:

    • If X ~ B(n,p), E(X) = np = number of trials × probability of success
    • For example, if a fair coin is tossed twenty times, the expected number of heads is n × p = 20 × 0.5 = 10 (which is also the most likely number of heads here)
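
    A quick Python check that np gives the expectation, and that 10 is also the most likely value for B(20, 0.5):

      # X ~ B(20, 0.5): E(X) = np
      from math import comb

      n, p = 20, 0.5
      print(n * p)                                                        # 10.0
      probs = {r: comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)}
      print(max(probs, key=probs.get))                                    # 10 is the most likely value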

    TO DO: H7 - be able to calculate the expected frequencies of the various possible outcomes from a series of binomial trials

    H8 - 13 Hypothesis testing:

    • The null hypothesis, H0, is the default assumption being tested - it is assumed to be true unless there is sufficient evidence against it
    • The alternative hypothesis, H1, is the opposite
    • The significance level is the probability below which the null hypothesis is rejected: H0 is rejected if the probability of obtaining a result at least as extreme as the one observed, assuming H0 is true, is less than the significance level. For example, to test at the 5% level whether a coin is biased towards heads, the null hypothesis is that p = 0.5 and the alternative hypothesis is that p > 0.5 (heads is more likely)
    • The critical region is the set of values of the test statistic for which the null hypothesis is rejected
    • The acceptance region is the opposite to the critical region - the set of values for which the null hypothesis is accepted
    • The critical value is the value separating the regions of acceptance and rejection
    • A 1-tail test is a test where only one direction is being tested for (e.g. a die is biased towards 1s)
    • A 2-tail test is a test for both directions (e.g. testing whether a die is biased, without the direction of the bias being stated)

    Example question:

      A manufacturer produces titanium bicycle frames. The bicycle frames are tested before use and on average 5% of them are found to be faulty. A cheaper manufacturing process is introduced and the manufacturer wishes to check whether the proportion of faulty bicycle frames has increased. A random sample of 18 bicycle frames is selected and it is found that 4 of them are faulty. Carry out a hypothesis test at the 5% significance level to investigate whether the proportion of faulty bicycle frames has increased. (8 marks)


      Full marks solution:

    • Let p = the probability that a randomly selected frame is faulty
    • H0: p = 0.05
    • H1: p > 0.05
    • Let X = the number of faulty frames in the sample of 18, so X ~ B(18, 0.05) under H0
    • P(X ≥ 4) = 1 - P(X ≤ 3) = 0.0109
    • Note: the inequality sign in P(X ≥ 4) points in the same direction as the sign in H1. This is always the case
    • 0.0109 < 0.05 (significance level = 0.05) ∴ reject H0
    • There is evidence to suggest that the proportion of faulty frames has increased.
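
    A short Python sketch of the calculation behind this test, reproducing the probability of 0.0109 and the conclusion (a check of the working, not part of the written solution):

      # Under H0, X ~ B(18, 0.05); 4 faulty frames were observed
      from math import comb

      n, p0, observed = 18, 0.05, 4

      def pmf(r):
          return comb(n, r) * p0**r * (1 - p0)**(n - r)

      p_value = 1 - sum(pmf(r) for r in range(observed))   # P(X >= 4) = 1 - P(X <= 3)
      print(round(p_value, 4))                             # 0.0109
      print("reject H0" if p_value < 0.05 else "do not reject H0")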