Maths: Statistics 1 (S1)


    D1 Types of variable:

    • Qualitative: non-numerical (e.g. colour)
    • Quantitative: numerical (e.g. length)
    • Continuous: can take any value (e.g. age)
    • Discrete: can only take certain values (e.g. cost)
    • Categorical: listed by a category/property and not a number

    D2 - 8 Data presentation:

    • Know how to use and create pie charts, bar charts, line charts, histograms, stem and leaf diagrams, box and whisker plots and cumulative frequency charts
    • In a box and whisker plot (boxplot), an outlier is a value more than 1.5 × IQR beyond the nearest quartile

    D9 Skewness:

    • Skewness can be described as positive, symmetrical or negative
    • A positive skew has most of the distribution concentrated on the left, with a longer tail to the right (the distribution is right-skewed)
    • A negative skew has most of the distribution concentrated on the right, with a longer tail to the left (the distribution is left-skewed)
    • A distribution can have many types of shape, including unimodal (one peak), bimodal (two peaks, sometimes, but not always, of the same height) and uniform (constant)

    D10 Measures of central tendency:

    • The mean is calculated with ∑x ÷ n, where n is the number of items
    • The symbol for the mean is x̄, x bar
    • The position of the median is found with (n + 1) ÷ 2 (the list must be in ascending order). If n is odd then there is a single middle value. If it is even, then the two middle values are added then divided by two to find the median
    • The mode is the value which appears most frequently. The data is bimodal if two values occur more often than the rest, which may suggest that the data has been taken from two populations
    • The mid-range is the value midway between the upper-extreme and lower-extreme (highest and lowest values), calculated by adding them and dividing by 2
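
    As a quick check (not part of the spec), here is a minimal Python sketch of these four averages; the data set {1, 2, 2, 3, 10} is made up purely for illustration:

      # Four averages for a made-up data set
      from collections import Counter

      data = sorted([1, 2, 2, 3, 10])
      n = len(data)

      mean = sum(data) / n                                   # (sum of x) / n = 3.6

      # median: the value in position (n + 1) / 2 of the ordered list
      if n % 2 == 1:
          median = data[(n + 1) // 2 - 1]                    # odd n: the single middle value
      else:
          median = (data[n // 2 - 1] + data[n // 2]) / 2     # even n: average of the two middle values

      counts = Counter(data)
      modes = [v for v, c in counts.items() if c == max(counts.values())]   # may contain more than one value

      mid_range = (max(data) + min(data)) / 2                # (highest + lowest) / 2

      print(mean, median, modes, mid_range)                  # 3.6 2 [2] 5.5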

    D11 Usefulness of each measure of central tendency:

    • The mean is the best known average and makes use of all of the data, but is affected by extreme values and cannot be obtained graphically
    • The median is not influenced by extreme values and can be obtained even if some of the data values are unknown, but it can often only be estimated and it cannot be used for further statistical calculations like the mean can
    • The mode is also unaffected by extreme values and is easy to calculate, but there may be more than one and sometimes it cannot be determined exactly

    D13 Ranges and percentiles:

    • The range is found with largest value - smallest value, or xmax - xmin
    • The quartiles divide the ordered data into four equal parts; their positions are found as follows:
      - Q1: (n + 1) ÷ 4
      - Q2: 2(n + 1) ÷ 4 (the median)
      - Q3: 3(n + 1) ÷ 4
    • The interquartile range (IQR) is equal to Q3 - Q1
    • To find the xth percentile, use x(n + 1) ÷ 100 to locate its position, in the same way as for the quartiles (Q1 = the 25th percentile, Q2 = the 50th percentile/the median, etc); see the sketch below
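
    A short Python sketch of this position rule, interpolating between neighbouring values when the position is not a whole number (quartile/percentile conventions differ slightly between textbooks and calculators, so treat this as one reasonable reading of the rule above; the data set is made up):

      # xth percentile using the position rule x(n + 1) / 100
      def percentile(data, x):
          data = sorted(data)
          pos = x * (len(data) + 1) / 100                 # 1-based position in the ordered list
          whole, frac = int(pos), pos - int(pos)
          if whole < 1:
              return data[0]
          if whole >= len(data):
              return data[-1]
          # interpolate between the two neighbouring values
          return data[whole - 1] + frac * (data[whole] - data[whole - 1])

      data = [1, 3, 5, 7, 9, 11, 13]                      # n = 7
      q1, q2, q3 = percentile(data, 25), percentile(data, 50), percentile(data, 75)
      print(q1, q2, q3, q3 - q1)                          # 3.0 7.0 11.0 8.0 (IQR = 8)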

    D14 Measures of spread:

    • The sum of squares (or Sxx) is calculated with Sxx = ∑(x - x̄)² = ∑x² - nx̄²
      - Therefore, to calculate Sxx, find the sum of all the values squared and subtract n multiplied by the mean squared
      - For example, for the data {1, 2, 10}, ∑x² = 1² + 2² + 10² = 105
      - n × x̄² = 3 × (13/3)² ≈ 56.3
      - Therefore, Sxx = 105 - 56.3 = 48.7 (to 3 sig. figs.)
    • Remember not to use a rounded mean when calculating Sxx
    • Mean square deviation (MSD) = Sxx ÷ n
    • Root mean square deviation (RMSD) is the root of the mean square deviation
    • Variance (s²) = Sxx ÷ (n - 1)
    • Sample standard deviation (s) is the root of the variance
      - Sample standard deviation (just called standard deviation in the exam) uses the symbol sx in Casio calculators
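
    A minimal Python sketch of these definitions, reproducing the Sxx = 48.7 example above (note the exact, unrounded mean is used throughout):

      # Measures of spread for the data {1, 2, 10}
      data = [1, 2, 10]
      n = len(data)
      mean = sum(data) / n                                # exact mean, not rounded

      sxx = sum(x**2 for x in data) - n * mean**2         # Sxx = sum of x^2 - n * mean^2
      msd = sxx / n                                       # mean square deviation
      rmsd = msd ** 0.5                                   # root mean square deviation
      variance = sxx / (n - 1)                            # s^2
      s = variance ** 0.5                                 # sample standard deviation

      print(round(sxx, 1))                                # 48.7
      print(round(msd, 2), round(rmsd, 2))                # 16.22 4.03
      print(round(variance, 2), round(s, 2))              # 24.33 4.93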

    D15 Linear coding:

    • Adding/subtracting a constant from all of the data will change the mean by this amount and will not affect standard deviation
      - For example, the data {2, 3, 4, 5, 6} has a mean of 4 and standard deviation of 1.58
      - Adding 1 to each item (to make {3, 4, 5, 6, 7}) will increase the mean by 1 but the standard deviation will remain at 1.58
    • Multiplying/dividing all of the data by a constant will multiply/divide both the mean and the standard deviation by that factor
      - Using the above example, multiplying each item by 2 will result in {4, 6, 8, 10, 12}
      - The mean will now be 2 x 4 = 8 and standard deviation will be 2 x 1.58 = 3.16
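
    A quick Python check of both linear coding rules, using the {2, 3, 4, 5, 6} example above (the sample standard deviation, with divisor n - 1, is used here):

      # Effect of adding a constant and of multiplying by a constant
      def mean_and_sd(data):
          n = len(data)
          m = sum(data) / n
          sxx = sum((x - m)**2 for x in data)
          return m, (sxx / (n - 1)) ** 0.5                # sample standard deviation

      data = [2, 3, 4, 5, 6]
      print(mean_and_sd(data))                            # (4.0, 1.58...)
      print(mean_and_sd([x + 1 for x in data]))           # mean shifts to 5.0, sd unchanged
      print(mean_and_sd([2 * x for x in data]))           # mean doubles to 8.0, sd doubles to 3.16...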

    D16 Outliers:

    • An outlier can be defined as a value at least 2 standard deviations from the mean, or more than 1.5 × IQR beyond the nearest quartile
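
    A rough Python sketch applying both rules to a made-up data set; the quartile positions are simply rounded down here, which is a simplification of the (n + 1) ÷ 4 rule given earlier:

      # Flag outliers by both rules
      data = sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
      n = len(data)
      mean = sum(data) / n
      s = (sum((x - mean)**2 for x in data) / (n - 1)) ** 0.5

      # Rule 1: at least 2 standard deviations from the mean
      rule1 = [x for x in data if abs(x - mean) >= 2 * s]

      # Rule 2: more than 1.5 x IQR beyond the nearest quartile
      q1 = data[(n + 1) // 4 - 1]                         # position (n + 1) / 4, rounded down
      q3 = data[3 * (n + 1) // 4 - 1]                     # position 3(n + 1) / 4, rounded down
      iqr = q3 - q1
      rule2 = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

      print(rule1, rule2)                                 # [100] [100]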

    u1 - u4 Probability - notation:

    • P(A) is the probability of A
    • P(B) is the probability of B
    • P(A∩B) is the probability of A and B occurring
    • P(A∪B) is the probability of A occurring, B occurring or both occurring
    • P(A') is the probability of A not occurring
    • P(A|B) is the probability of A occurring given that B has already occurred

    u5 Mutually exclusive and independent events:

    • Independent events are not affected by one another
      - If two events are independent, then P(A) × P(B) = P(A∩B)
    • If P(B|A) = P(B), then A and B are independent
    • Mutually exclusive events cannot both happen at the same time - e.g. getting heads and tails in one flip
      - If two events are mutually exclusive, P(A∩B) = 0 and P(A∪B) = P(A) + P(B)
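
    These rules can be checked by listing all 36 equally likely outcomes when two fair dice are rolled; the events used below are chosen purely for illustration:

      # Independence and mutual exclusivity checked by enumeration
      from fractions import Fraction
      from itertools import product

      outcomes = list(product(range(1, 7), repeat=2))     # all 36 (first die, second die) pairs

      def prob(event):
          return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

      A = lambda o: o[0] == 6                             # first die shows a 6
      B = lambda o: o[1] % 2 == 0                         # second die is even (independent of A)
      C = lambda o: o[0] + o[1] == 2                      # total is 2 (mutually exclusive with D)
      D = lambda o: o[0] + o[1] == 12                     # total is 12

      print(prob(lambda o: A(o) and B(o)) == prob(A) * prob(B))     # True:  P(A∩B) = P(A) x P(B)
      print(prob(lambda o: C(o) and D(o)))                          # 0:     P(C∩D) = 0
      print(prob(lambda o: C(o) or D(o)) == prob(C) + prob(D))      # True:  P(C∪D) = P(C) + P(D)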

    u6, u7 Calculating outcomes:

    • When two mutually exclusive events occur, the probability of either A or B occurring is equal to P(A) + P(B)
    • The probability of two independent events occurring (e.g. two coins flipped, both getting tails) is equal to P(A) x P(B), 0.5 x 0.5 = 0.25 in this example
      TO DO: Anything I've missed? Check back over spec

    R1, R2 Discrete random variables:

    • A probability distribution is a table showing the probability of each possible outcome, for example:
      x          0      1      2      3      4
      P(X = x)   2/10   1/10   2/10   3/10   2/10
    • Here, the probability that X = 0 is 2/10, that X = 3 is 3/10, etc
    • The probabilities will always sum to 1
    • These are known as discrete random variables because they can only take a set of values (here only 0, 1, 2, 3 and 4) and their probabilities sum to 1

    R3 Expectation:

    • Expectation is the mean value, written as µ or E(X)
    • It is calculated with ∑xP(X = x) (so sum each value multiplied by its probability)
    • In the above table, the expectation = (0 × 2/10) + (1 × 1/10) + (2 × 2/10) + (3 × 3/10) + (4 × 2/10) = 2.2
      - Therefore, the mean value is 2.2
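
    A one-line Python check of this calculation, using the distribution table above:

      # E(X) = sum of x * P(X = x)
      from fractions import Fraction

      dist = {0: Fraction(2, 10), 1: Fraction(1, 10), 2: Fraction(2, 10),
              3: Fraction(3, 10), 4: Fraction(2, 10)}
      assert sum(dist.values()) == 1                      # the probabilities sum to 1

      print(float(sum(x * p for x, p in dist.items())))   # 2.2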

    R4 Variance:

    • Variance is a measure of spread and is the square of standard deviation
    • It is represented with Var(X)
    • From a probability distribution, it can be calculated with Var(X) = E(X²) - E(X)²
      - Another way of writing this is Var(X) = E((X - µ)²)
    • For the above table:
      - E(X²) = (0² × 2/10) + (1² × 1/10) + (2² × 2/10) + (3² × 3/10) + (4² × 2/10) = 6.80
      - E(X)² = 2.2 (calculated in the expectation section) squared = 4.84
      - Therefore, Var(X) = 6.80 - 4.84 = 1.96
      - Standard deviation is the root of this, 1.40
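
    And the same check for the variance and standard deviation:

      # Var(X) = E(X^2) - E(X)^2 for the same distribution
      dist = {0: 2/10, 1: 1/10, 2: 2/10, 3: 3/10, 4: 2/10}

      e_x = sum(x * p for x, p in dist.items())           # 2.2
      e_x2 = sum(x**2 * p for x, p in dist.items())       # 6.8
      var = e_x2 - e_x**2                                 # 6.8 - 4.84 = 1.96
      print(round(var, 2), round(var ** 0.5, 2))          # 1.96 1.4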

    H1 - H3 Binomial distributions:

    • A binomial distribution applies when there is a fixed number of independent, repeated trials, each of which is either a success (probability p) or a failure (probability q)
      - Therefore p + q = 1
    • If X has a binomial distribution B(n, p), this can be written as X ~ B(n, p)
    • The sample size is denoted by n
    • P(X = r) = nCr × p^r × q^(n - r)
    • Example: A coin is tossed ten times. What is the probability of it coming down heads five times and tails five times?
      - p = 0.5 and q = 0.5 since it is a fair coin
      - n = 10 because there are ten tosses
      - Consider heads a success and tails a failure
      - r = 5 because this is the number of successes (heads) we are testing for
      - Therefore, the probability of 5 successes and 5 failures is equal to 10C5 × 0.5^5 × 0.5^(10 - 5) = 0.246
    • Probabilities can be added. So for the above example of coin tosses, the probability of getting 4 or 5 heads is equal to
      10C5 × 0.5^5 × 0.5^(10 - 5) + 10C4 × 0.5^4 × 0.5^(10 - 4) = 0.451
    • Remember that all of the probabilities add to 1. This can sometimes be used to reduce the number of calculations - for example, to find the probability of getting 0 - 9 heads, it is quicker to subtract the probability of 10 heads from 1 than to add up all the probabilities from 0 to 9
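
    A short Python sketch of the formula, reproducing the coin-toss probabilities above (math.comb gives nCr):

      # X ~ B(10, 0.5): probabilities for the ten-coin-toss example
      from math import comb

      def binomial(n, p, r):
          return comb(n, r) * p**r * (1 - p)**(n - r)     # nCr * p^r * q^(n - r)

      print(round(binomial(10, 0.5, 5), 3))                           # P(X = 5) = 0.246
      print(round(binomial(10, 0.5, 5) + binomial(10, 0.5, 4), 3))    # P(X = 4 or 5) = 0.451
      print(round(1 - binomial(10, 0.5, 10), 3))                      # P(0 to 9 heads) = 0.999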

    Binomial probability tables:

    • A cumulative binomial table shows P(X ≤ x) when X ~ B(n,p)
      - For example, if a die is rolled 8 times (n = 8), the probability of getting 0, 1, 2 or 3 sixes (p = 1/6 and x = 3) can be read straight from the table
      - Looking up n = 8, p = 1/6, x = 3 gives P(X ≤ 3) = 0.9693
      - These tables start on page 12 of the exam formula booklet
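
    The tabulated value can also be checked directly by summing the binomial probabilities for r = 0 to 3:

      # P(X <= 3) for X ~ B(8, 1/6), as read from the cumulative table
      from math import comb

      n, p = 8, 1/6
      print(round(sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(4)), 4))   # 0.9693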

    H4, H5 n!:

    • n! (n factorial) is used to calculate the number of arrangements of a set of objects/digits etc, without removing or adding any
    • For example, there are 3! = 3 x 2 x 1 = 6 ways of arranging the letters A, B and C, only using each once

    H3 nCr, combinations:

    • nCr is the number of ways to select r objects from n. For example, with the letters A, B, C and D, there are 4C2 = 6 possible ways of selecting two of the four letters randomly (AB, AC, AD, BC, BD, CD)
    • The formula is nCr = n! ÷ ((n - r)! × r!)
    • Combination is used when order does not matter
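
    A quick Python illustration of the 4C2 example, both by listing the selections and by the formula:

      # 4C2: choosing 2 letters from A, B, C, D when order does not matter
      from itertools import combinations
      from math import comb, factorial

      letters = ["A", "B", "C", "D"]
      print(list(combinations(letters, 2)))                        # the 6 unordered pairs
      print(comb(4, 2))                                            # 6
      print(factorial(4) // (factorial(4 - 2) * factorial(2)))     # n! / ((n - r)! x r!) = 6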

    nPr Permutations:

    • Permutations are used when order does matter
    • The formula is nPr = n! ÷ (n - r)!, which is the same as nCr × r!
      - Therefore, for the example used in combinations, there are 4P2 = 24 ÷ 2 = 12 permutations (AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC)
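
    And the same for 4P2, where order does matter:

      # 4P2: choosing 2 letters from A, B, C, D when order does matter
      from itertools import permutations
      from math import factorial, perm

      letters = ["A", "B", "C", "D"]
      print(len(list(permutations(letters, 2))))          # 12 ordered pairs
      print(perm(4, 2))                                   # 12
      print(factorial(4) // factorial(4 - 2))             # n! / (n - r)! = 24 / 2 = 12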

    H6 mean = np:

    • If X ~ B(n,p), E(X) = np = number of trials × probability of success
    • For example, if a fair coin is tossed twenty times, the expected number of heads is n × p = 20 × 0.5 = 10 (which is also the most likely number of heads here)
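
    A quick Python check that np gives the expectation, and that 10 is also the most likely value for B(20, 0.5):

      # X ~ B(20, 0.5): E(X) = np
      from math import comb

      n, p = 20, 0.5
      print(n * p)                                                        # 10.0
      probs = {r: comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)}
      print(max(probs, key=probs.get))                                    # 10 is the most likely value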

    TO DO: H7 - be able to calculate the expected frequencies of the various possible outcomes from a series of binomial trials

    H8 - 13 Hypothesis testing:

    • The null hypothesis, H0, is the default assumption being tested - it is assumed to be true unless there is sufficient evidence against it
    • The alternative hypothesis, H1, is the opposite
    • The significance level is the probability below which the null hypothesis is rejected: H0 is rejected if the probability of obtaining a result at least as extreme as the one observed, assuming H0 is true, is less than the significance level. For example, to test at the 5% level whether a coin is biased towards heads, the null hypothesis is that p = 0.5 and the alternative hypothesis is that p > 0.5 (heads is more likely)
    • The critical region is the set of values of the test statistic for which the null hypothesis is rejected
    • The acceptance region is the opposite to the critical region - the set of values for which the null hypothesis is accepted
    • The critical value is the value separating the regions of acceptance and rejection
    • A 1-tail test is a test where only one direction is being tested for (e.g. a die is biased towards 1s)
    • A 2-tail test is a test for both directions (e.g. testing whether a die is biased, without the direction of the bias being stated)

    Example question:

      A manufacturer produces titanium bicycle frames. The bicycle frames are tested before use and on average 5% of them are found to be faulty. A cheaper manufacturing process is introduced and the manufacturer wishes to check whether the proportion of faulty bicycle frames has increased. A random sample of 18 bicycle frames is selected and it is found that 4 of them are faulty. Carry out a hypothesis test at the 5% significance level to investigate whether the proportion of faulty bicycle frames has increased. (8 marks)


      Full marks solution:

    • Let p = the probability that a randomly selected frame is faulty
    • H0: p = 0.05
    • H1: p > 0.05
    • Let X = the number of faulty frames in the sample of 18, so X ~ B(18, 0.05) under H0
    • P(X ≥ 4) = 1 - P(X ≤ 3) = 0.0109
    • Note: the inequality sign in P(X ≥ 4) points in the same direction as the sign in H1. This is always the case
    • 0.0109 < 0.05 (significance level = 0.05) ∴ reject H0
    • There is evidence to suggest that the proportion of faulty frames has increased.
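
    A short Python sketch of the calculation behind this test, reproducing the probability of 0.0109 and the conclusion (a check of the working, not part of the written solution):

      # Under H0, X ~ B(18, 0.05); 4 faulty frames were observed
      from math import comb

      n, p0, observed = 18, 0.05, 4

      def pmf(r):
          return comb(n, r) * p0**r * (1 - p0)**(n - r)

      p_value = 1 - sum(pmf(r) for r in range(observed))   # P(X >= 4) = 1 - P(X <= 3)
      print(round(p_value, 4))                             # 0.0109
      print("reject H0" if p_value < 0.05 else "do not reject H0")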