Image credit: Sigmund
Introduction
This is a tutorial on common problems in Statistics, each with a suggested solution. The problems do not target a particular level, so any of them may be at undergraduate, master’s, or doctoral level. Questions may come from any topic in Statistics rather than following a chapter-by-chapter order. In most cases, I did not design these questions. Some come from my lecture notes, some from online portals where solutions are not given, some from well-written Statistics books such as those by Robert V. Hogg et al. and Sheldon Ross, and some from sources that are not clearly known (anonymous). Where I am certain of the original source of a question, I will do my best to cite it. If you feel a problem is not properly cited, please draw my attention to it.
Problem 1.
To study the effect of temperature on yield in a chemical process, five batches were produced at each of three temperature levels. The results are given below.
i.) Construct an analysis of variance (ANOVA) table for this problem.
ii.) Use a 0.05 level of significance to test whether the temperature level has an effect on the mean yield of the process.
Source
multiple online sources
| Batch | 50°C | 60°C | 70°C |
|---|---|---|---|
| 1 | 34 | 30 | 23 |
| 2 | 24 | 31 | 28 |
| 3 | 36 | 34 | 28 |
| 4 | 39 | 23 | 30 |
| 5 | 32 | 27 | 31 |
Suggested solution
Clearly, this is a one-way ANOVA problem, and it can be solved in many ways. I think it’s best to decompose the total sum of squares into the treatment sum of squares and the residual (error) sum of squares.
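The total sum of squares decomposes as follows:
$$\sum_{i=1}^{3}\sum_{j=1}^{5}\left(y_{ij}-\bar{y}_{..}\right)^2 = 5\sum_{i=1}^{3}\left(\bar{y}_{i.}-\bar{y}_{..}\right)^2 + \sum_{i=1}^{3}\sum_{j=1}^{5}\left(y_{ij}-\bar{y}_{i.}\right)^2$$
that is, $SS_{Total} = SS_{Treatment} + SS_{Error}$, where $y_{ij}$ is the yield of batch $j$ at temperature level $i$, $\bar{y}_{i.}$ is the mean yield at level $i$, and $\bar{y}_{..}$ is the grand mean.
The treatment means are 33, 29 and 28, and the grand mean is 30, so the sums of squares (SS) are:
$$SS_{Treatment} = 5\left[(33-30)^2 + (29-30)^2 + (28-30)^2\right] = 70$$
$$SS_{Total} = 306, \qquad SS_{Error} = SS_{Total} - SS_{Treatment} = 306 - 70 = 236$$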
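As a quick numerical check of the decomposition, here is a minimal sketch in plain Python. The variable names (`yields`, `ss_treatment`, and so on) are my own and only illustrative; the data are the fifteen yields from the table in the problem statement.

```python
# One-way ANOVA sums of squares for the yield data (standard library only).
# Keys are illustrative labels for the three temperature levels.
yields = {
    "50C": [34, 24, 36, 39, 32],
    "60C": [30, 31, 34, 23, 27],
    "70C": [23, 28, 28, 30, 31],
}

all_values = [y for group in yields.values() for y in group]
grand_mean = sum(all_values) / len(all_values)

# Treatment (between-level) sum of squares: group size times the squared
# deviation of each level mean from the grand mean.
ss_treatment = sum(
    len(group) * (sum(group) / len(group) - grand_mean) ** 2
    for group in yields.values()
)

# Error (within-level) sum of squares: squared deviations of each yield
# from its own level mean.
ss_error = sum(
    (y - sum(group) / len(group)) ** 2
    for group in yields.values()
    for y in group
)

# Total sum of squares: squared deviations from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

print(ss_treatment, ss_error, ss_total)
```

Computing the error sum of squares directly, rather than by subtraction, makes it possible to confirm that the two components really do add up to the total.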
i.
ANOVA table
| Source | DF | Sum of squares | Mean sum of squares | F-value |
|---|---|---|---|---|
| Treatment | 2 | 70 | 35 | 1.78 |
| Error | 12 | 236 | 19.667 | |
| Total | 14 | 306 | | |
ii.
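Hypothesis formulation
$$H_0: \mu_{50} = \mu_{60} = \mu_{70} \quad \text{versus} \quad H_1: \text{at least one mean yield differs}$$
Since the computed $F = 1.78$ is less than the critical value $F_{0.05;\,2,\,12} = 3.89$, we fail to reject $H_0$. There is no sufficient evidence to conclude that the temperature level has an effect on the mean yield of the process.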
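The decision in part ii can be checked numerically. The sketch below is only one way to do it, and it rests on an assumption worth stating: when the numerator degrees of freedom equal 2, the upper tail of the F distribution has the closed form $P(F > f) = (1 + 2f/d_2)^{-d_2/2}$, so no statistics library is needed. The helper names `f_sf_df1_two` and `f_crit_df1_two` are hypothetical, and they are not valid for other numerator degrees of freedom.

```python
# F test for the one-way ANOVA, using the sums of squares from the ANOVA table.
ss_treatment, ss_error = 70.0, 236.0
df_treatment, df_error = 2, 12
alpha = 0.05

ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_stat = ms_treatment / ms_error


def f_sf_df1_two(f, df2):
    """P(F > f) for F(2, df2). Valid ONLY when the numerator df is 2."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_crit_df1_two(alpha, df2):
    """Upper-alpha critical value of F(2, df2), from inverting the tail formula."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


p_value = f_sf_df1_two(f_stat, df_error)
f_critical = f_crit_df1_two(alpha, df_error)
reject_null = f_stat > f_critical

print(round(f_stat, 2), round(f_critical, 2), round(p_value, 2), reject_null)
```

For a design with more than three treatment levels the closed form no longer applies, and a general F distribution routine from a statistics package would be the safer choice.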
Problem 2.
Compute the correlation coefficient for each of the following probability densities:
i.)
ii.)
Source
Hogg, Craig et al.
Suggested solution
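Mathematically, the correlation coefficient of $X$ and $Y$ is
$$\rho = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[XY] - E[X]\,E[Y]}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}$$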
i.)
The marginals are;
The expectation and variance of
The expectation and variance of
The covariance of
Correlation coefficient:
ii.)
The bivariate joint distribution is:
| X/Y | 1 | 3 |
|---|---|---|
| 1 | 3/22 | 7/22 |
| 2 | 2/11 | 4/11 |
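The marginals are:
$$p_X(1) = \tfrac{5}{11}, \quad p_X(2) = \tfrac{6}{11}; \qquad p_Y(1) = \tfrac{7}{22}, \quad p_Y(3) = \tfrac{15}{22}$$
The expectation and variance of $X$:
$$E[X] = \tfrac{17}{11}, \qquad E[X^2] = \tfrac{29}{11}, \qquad \operatorname{Var}(X) = \tfrac{29}{11} - \left(\tfrac{17}{11}\right)^2 = \tfrac{30}{121}$$
The expectation and variance of $Y$:
$$E[Y] = \tfrac{26}{11}, \qquad E[Y^2] = \tfrac{71}{11}, \qquad \operatorname{Var}(Y) = \tfrac{71}{11} - \left(\tfrac{26}{11}\right)^2 = \tfrac{105}{121}$$
The covariance of $X$ and $Y$:
$$E[XY] = \tfrac{3 + 21 + 8 + 48}{22} = \tfrac{40}{11}, \qquad \operatorname{Cov}(X, Y) = \tfrac{40}{11} - \tfrac{17}{11}\cdot\tfrac{26}{11} = -\tfrac{2}{121}$$
Correlation coefficient:
$$\rho = \frac{-2/121}{\sqrt{(30/121)(105/121)}} = \frac{-2}{\sqrt{3150}} \approx -0.036$$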
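The steps for part ii (marginals, moments, covariance, correlation) can be sketched with exact fractions so that no rounding creeps in. This is a minimal illustration based only on the joint table above; the helper `expect` and the variable names are my own.

```python
from fractions import Fraction
from math import sqrt

# Joint pmf from the table: keys are (x, y) pairs.
joint = {
    (1, 1): Fraction(3, 22), (1, 3): Fraction(7, 22),
    (2, 1): Fraction(2, 11), (2, 3): Fraction(4, 11),
}
assert sum(joint.values()) == 1  # a valid pmf sums to one


def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - mean_x ** 2
var_y = expect(lambda x, y: y * y) - mean_y ** 2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

# The square root is irrational here, so the last step uses floats.
rho = float(cov_xy) / sqrt(float(var_x * var_y))

print(mean_x, var_x, mean_y, var_y, cov_xy, round(rho, 4))
```

Working from the joint table alone means the marginals never have to be typed in by hand, which removes one common source of arithmetic slips.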
Problem 3.
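A construction company wins two road rehabilitation projects, RH1 and RH2.
The marginal distributions of the completion times (in months) are:

| RH1 completion time (months) | 24 | 30 | 36 |
|---|---|---|---|
| Probability | 0.30 | 0.60 | 0.10 |

| RH2 completion time (months) | 18 | 24 | 30 |
|---|---|---|---|
| Probability | 0.30 | 0.50 | 0.20 |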
Assuming that the completion times of the projects are independent, find the probability that:
i.) the two projects will be completed at the same time,
ii.) both projects will be completed in less than 30 months, and
iii.) project RH1 takes longer time to complete than project RH2.
iv.) Find the expected completion time for each project and interpret your results.
Source
unknown
Suggested solution
The joint distribution table is shown below.
| RH1/RH2 | 18 | 24 | 30 |
|---|---|---|---|
| 24 | 0.09 | 0.15 | 0.06 |
| 30 | 0.18 | 0.30 | 0.12 |
| 36 | 0.03 | 0.05 | 0.02 |
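i.)
$$P(RH1 = RH2) = P(24, 24) + P(30, 30) = 0.15 + 0.12 = 0.27$$
ii.)
$$P(RH1 < 30,\ RH2 < 30) = P(24, 18) + P(24, 24) = 0.09 + 0.15 = 0.24$$
iii.)
$$P(RH1 > RH2) = 0.09 + 0.18 + 0.30 + 0.03 + 0.05 + 0.02 = 0.67$$
iv.)
Using the row and column totals of the joint table as the marginal probabilities:
$$E[RH1] = 24(0.30) + 30(0.60) + 36(0.10) = 28.8$$
$$E[RH2] = 18(0.30) + 24(0.50) + 30(0.20) = 23.4$$
Whenever the company wins a rehabilitation project categorized as RH1, the expected time to completion is about 28.8 months. For a project categorized as RH2, the expected time to completion is approximately 23.4 months.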
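All four parts can be reproduced with a short sketch. The marginal probabilities below are the row and column totals of the joint table in the suggested solution, and the joint pmf is rebuilt as their product under the stated independence assumption. The helper `prob` and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Marginal pmfs (completion time in months -> probability), taken from the
# row totals (RH1) and column totals (RH2) of the joint table.
rh1 = {24: Fraction(3, 10), 30: Fraction(6, 10), 36: Fraction(1, 10)}
rh2 = {18: Fraction(3, 10), 24: Fraction(5, 10), 30: Fraction(2, 10)}

# Independence: each joint probability is the product of the marginals.
joint = {(a, b): rh1[a] * rh2[b] for a, b in product(rh1, rh2)}


def prob(event):
    """Probability of the set of (RH1, RH2) outcomes where event(a, b) is true."""
    return sum(p for (a, b), p in joint.items() if event(a, b))


p_same_time = prob(lambda a, b: a == b)                   # part i
p_both_under_30 = prob(lambda a, b: a < 30 and b < 30)    # part ii
p_rh1_longer = prob(lambda a, b: a > b)                   # part iii

# Part iv: expected completion times in months.
mean_rh1 = sum(t * p for t, p in rh1.items())
mean_rh2 = sum(t * p for t, p in rh2.items())

print(float(p_same_time), float(p_both_under_30), float(p_rh1_longer))
print(float(mean_rh1), float(mean_rh2))
```

Expressing each event as a small predicate over the pair of completion times keeps the code close to the wording of the question, so a different event only needs a different lambda.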