07 Hypothesis Testing
Material:¶
ASPE: 9.1-9.3 + 9.5 + 9.8 + 10.1-10.4 + 10.6
Session from 20/21:
Session Description¶
Hypothesis testing is a statistical method used to evaluate whether a hypothesis about a population parameter is supported by the data. The key elements of hypothesis testing are the formulation of a null hypothesis and an alternative hypothesis, the selection of an appropriate test statistic, the choice of a significance level (alpha), the calculation of a p-value, and the comparison of the p-value to the significance level to decide whether to reject or fail to reject the null hypothesis. The significance level is a pre-determined threshold that bounds the probability of rejecting the null hypothesis when it is in fact true (a Type I error), and the p-value is the probability of observing the data, or more extreme data, if the null hypothesis is true. Hypothesis testing is widely used in many fields to make inferences about population parameters based on sample data, and it is an essential tool for scientific research and decision-making.
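These steps can be sketched with a one-sample t-test in Python (the data below is hypothetical, purely for illustration; `scipy.stats.ttest_1samp` returns the test statistic and the two-sided p-value):

```python
import numpy as np
from scipy import stats

# Hypothetical sample; H0: mu = 50, H1: mu != 50
sample = np.array([51.2, 49.8, 50.5, 52.1, 48.9, 50.7, 51.5, 49.4])
alpha = 0.05  # significance level, chosen before looking at the data

t_stat, p_value = stats.ttest_1samp(sample, popmean=50)
if p_value < alpha:
    print(f"Reject H0 (t = {t_stat:.2f}, p = {p_value:.4f})")
else:
    # here p > alpha, so we fail to reject H0
    print(f"Fail to reject H0 (t = {t_stat:.2f}, p = {p_value:.4f})")
```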
Key Concepts¶
- Basics of hypothesis testing
- Type I and II errors
- P-values, critical values, and test statistics
- Tests on mean and proportion
- One and two tailed tests
- Paired t-test
- Contingency table tests (next time in recap)
Exercises¶
Exercise 1 (Book 9.3.8)¶
Cloud seeding has been studied for many decades as a weather modification procedure (for an interesting study of this subject, see the article in Technometrics, "A Bayesian Analysis of a Multiplicative Treatment Effect in Weather Modification," 1975 , Vol. 17, pp. 161-166). The rainfall in acre-feet from 20 clouds that were selected at random and seeded with silver nitrate follows: \(18.0,30.7,19.8,27.1,22.3,18.8,31.8,23.4\), \(21.2,27.9,31.9,27.1,25.0,24.7,26.9,21.8,29.2,34.8,26.7\), and 31.6.
- Can you support a claim that mean rainfall from seeded clouds exceeds 25 acre-feet? Use \(\alpha=0.01\). Find the \(P\)-value.
- Check that rainfall is normally distributed.
- Compute the power of the test if the true mean rainfall is 27 acre-feet.
- What sample size would be required to detect a true mean rainfall of 27.5 acre-feet if you wanted the power of the test to be at least 0.9 ?
- Explain how the question in part (a) could be answered by constructing a one-sided confidence bound on the mean rainfall.
Answer
-
1) The parameter of interest is the true mean rainfall, \(\mu\).
2) \(H_0: \mu = 25\)
3) \(H_1: \mu > 25\).
4) \(\mathrm{t}_0=\frac{\bar{x}-\mu_0}{s / \sqrt{n}}\)
5) Reject \(\mathrm{H}_0\) if \(\mathrm{t}_0>\mathrm{t}_{\alpha, \mathrm{n}-1}\) where \(\alpha=0.01\) and \(\mathrm{t}_{0.01,19}=2.539\) for \(\mathrm{n}=20\)
6) \(\overline{\mathrm{x}}=26.04\), \(\mathrm{s}=4.78\), \(\mathrm{n}=20\)
\[ \mathrm{t}_0=\frac{26.04-25}{4.78 / \sqrt{20}}=0.97 \]
7) Because \(0.97<2.539\), fail to reject the null hypothesis. There is insufficient evidence to conclude that the true mean rainfall is greater than 25 acre-feet at \(\alpha=0.01\). The \(P\)-value satisfies \(0.10<P<0.25\).
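Steps 4)–7) can be checked directly from the data with scipy (the `alternative='greater'` argument, available in scipy ≥ 1.6, gives the one-sided p-value):

```python
import numpy as np
from scipy import stats

rainfall = np.array([18.0, 30.7, 19.8, 27.1, 22.3, 18.8, 31.8, 23.4,
                     21.2, 27.9, 31.9, 27.1, 25.0, 24.7, 26.9, 21.8,
                     29.2, 34.8, 26.7, 31.6])

t0, p = stats.ttest_1samp(rainfall, popmean=25, alternative='greater')
print(f"t0 = {t0:.2f}, P-value = {p:.4f}")  # t0 ≈ 0.97, and 0.10 < P < 0.25
```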
-
The data on the normal probability plot falls along a line. Therefore, the normality assumption is reasonable.
-
\(d=\frac{\delta}{\sigma}=\frac{\left|\mu-\mu_0\right|}{\sigma}=\frac{|27-25|}{4.78}=0.42\)
Using the OC curve (Chart VII h) for \(\alpha=0.01\), \(d=0.42\), and \(n=20\), we obtain \(\beta \cong 0.7\) and a power of \(1-0.7=0.3\).
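The OC-curve reading can be checked numerically with the noncentral t distribution (a sketch; it uses the sample s = 4.78 as the assumed σ, which is an approximation):

```python
import numpy as np
from scipy import stats

n, alpha = 20, 0.01
sigma = 4.78                                # sample s used as the assumed sigma
ncp = (27 - 25) / sigma * np.sqrt(n)        # noncentrality parameter under mu = 27
t_crit = stats.t.ppf(1 - alpha, n - 1)      # one-sided critical value t_{0.01,19}
power = stats.nct.sf(t_crit, n - 1, ncp)    # P(t0 > t_crit | mu = 27)
print(f"power = {power:.3f}")               # close to the OC-curve reading of 0.3
```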
-
\(d=\frac{\delta}{\sigma}=\frac{\left|\mu-\mu_0\right|}{\sigma}=\frac{|27.5-25|}{4.78}=0.52\)
Using the OC curve (Chart VII h) for \(\alpha=0.01\), \(d=0.52\), and \(\beta \cong 0.1\) (power \(=0.9\)), we obtain \(n=75\).
-
\(99 \%\) lower confidence bound on the mean rainfall
\[ \begin{aligned} & \bar{x}-t_{0.01,19}\left(\frac{s}{\sqrt{n}}\right) \leq \mu \\ & 26.04-2.539\left(\frac{4.78}{\sqrt{20}}\right) \leq \mu \\ & 23.326 \leq \mu \end{aligned} \]
Because the lower confidence bound is less than 25, there is insufficient evidence to conclude that the true mean rainfall is greater than 25 acre-feet at \(\alpha=0.01\).
Exercise 2 (Book 9.3.9)¶
A 1992 article in the Journal of the American Medical Association ("A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlich") reported body temperature, gender, and heart rate for a number of subjects. The body temperatures for 25 female subjects follow: \(97.8,97.2\), \(97.4,97.6,97.8,97.9,98.0,98.0,98.0,98.1,98.2,98.3,98.3\), \(98.4,98.4,98.4,98.5,98.6,98.6,98.7,98.8,98.8,98.9,98.9\), and 99.0.
- Test the hypothesis \(H_0: \mu=98.6\) versus \(H_1: \mu \neq 98.6\), using \(\alpha=0.05\). Find the \(P\)-value.
- Check the assumption that female body temperature is normally distributed.
- Compute the power of the test if the true mean female body temperature is as low as 98.0 .
- What sample size would be required to detect a true mean female body temperature as low as 98.2 if you wanted the power of the test to be at least 0.9 ?
- Explain how the question in part (a) could be answered by constructing a two-sided confidence interval on the mean female body temperature.
Answer
-
1) The parameter of interest is the true mean female body temperature, \(\mu\).
2) \(H_0: \mu = 98.6\)
3) \(H_1: \mu \neq 98.6\).
4) \(t_0=\frac{\bar{x}-\mu_0}{s / \sqrt{n}}\)
5) Reject \(\mathrm{H}_0\) if \(\left|\mathrm{t}_0\right| \geq \mathrm{t}_{\alpha / 2, \mathrm{n}-1} \quad\) where \(\alpha=0.05\) and \(\mathrm{t}_{\alpha / 2, \mathrm{n}-1}=2.064\) for \(\mathrm{n}=25\)
6) \(\bar{x}=98.264, \mathrm{~s}=0.4821, \mathrm{n}=25\)
\[ t_0=\frac{98.264-98.6}{0.4821 / \sqrt{25}}=-3.48 \]
7) Because \(\left|t_0\right|=3.48>2.064\), reject the null hypothesis. Conclude that the true mean female body temperature differs from \(98.6^{\circ} \mathrm{F}\) at \(\alpha=0.05\).
\[ P \text {-value }=2(0.001)=0.002 \]
-
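The test in part (a) can be verified with scipy (`ttest_1samp` is two-sided by default):

```python
import numpy as np
from scipy import stats

temps = np.array([97.8, 97.2, 97.4, 97.6, 97.8, 97.9, 98.0, 98.0, 98.0,
                  98.1, 98.2, 98.3, 98.3, 98.4, 98.4, 98.4, 98.5, 98.6,
                  98.6, 98.7, 98.8, 98.8, 98.9, 98.9, 99.0])

t0, p = stats.ttest_1samp(temps, popmean=98.6)
print(f"t0 = {t0:.2f}, P-value = {p:.4f}")  # t0 ≈ -3.48, P ≈ 0.002
```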
The data on the normal probability plot falls along a line. The normality assumption is reasonable.
-
\(d=\frac{\delta}{\sigma}=\frac{\left|\mu-\mu_0\right|}{\sigma}=\frac{|98-98.6|}{0.4821}=1.24\)
Using the OC curve (Chart VII e) for \(\alpha=0.05\), \(d=1.24\), and \(n=25\), we obtain \(\beta \cong 0\) and a power of \(1-0 \cong 1\).
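As in Exercise 1, the OC-curve reading can be checked with the noncentral t distribution (a sketch; the sample s = 0.4821 is used as the assumed σ):

```python
import numpy as np
from scipy import stats

n, alpha, sigma = 25, 0.05, 0.4821
ncp = abs(98.0 - 98.6) / sigma * np.sqrt(n)  # noncentrality under a true mean of 98.0
t_crit = stats.t.ppf(1 - alpha / 2, n - 1)   # two-sided critical value t_{0.025,24}
# Two-sided power: probability of landing in either rejection tail
power = stats.nct.sf(t_crit, n - 1, ncp) + stats.nct.cdf(-t_crit, n - 1, ncp)
print(f"power = {power:.4f}")                # essentially 1, as read from the OC curve
```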
-
\(d=\frac{\delta}{\sigma}=\frac{\left|\mu-\mu_0\right|}{\sigma}=\frac{|98.2-98.6|}{0.4821}=0.83\)
Using the OC curve (Chart VII e) for \(\alpha=0.05\), \(d=0.83\), and \(\beta \cong 0.1\) (power \(=0.9\)), we obtain \(n=20\).
-
\(95 \%\) two sided confidence interval
\[ \begin{aligned} \bar{x}-t_{0.025,24}\left(\frac{s}{\sqrt{n}}\right) & \leq \mu \leq \bar{x}+t_{0.025,24}\left(\frac{s}{\sqrt{n}}\right) \\ 98.264-2.064\left(\frac{0.4821}{\sqrt{25}}\right) & \leq \mu \leq 98.264+2.064\left(\frac{0.4821}{\sqrt{25}}\right) \\ 98.065 & \leq \mu \leq 98.463 \end{aligned} \]
We conclude that the mean female body temperature differs from 98.6 at \(\alpha=0.05\) because 98.6 is not contained in the confidence interval.
Exercise 3 (Book 9.5.3)¶
An article in the British Medical Journal ["Comparison of Treatment of Renal Calculi by Operative Surgery, Percutaneous Nephrolithotomy, and Extra-Corporeal Shock Wave Lithotripsy" (1986, Vol. 292, pp. 879-882)] reported that percutaneous nephrolithotomy (PN) had a success rate in removing kidney stones of 289 of 350 patients. The traditional method was \(78 \%\) effective.
- Is there evidence that the success rate for PN is greater than the historical success rate? Find the \(P\)-value.
- Explain how the question in part (a) could be answered with a confidence interval.
Answer
-
1) The parameter of interest is the true success rate, \(p\).
2) \(\mathrm{H}_0: p=0.78\)
3) \(\mathrm{H}_1: p>0.78\)
4) \(z_0=\frac{x-n p_0}{\sqrt{n p_0\left(1-p_0\right)}}\) or \(z_0=\frac{\hat{p}-p_0}{\sqrt{\frac{p_0\left(1-p_0\right)}{n}}}\); Either approach will yield the same conclusion
5) Reject \(H_0\) if \(\mathrm{z}_0>\mathrm{Z}_\alpha\). Since the value for \(\alpha\) is not given, we assume \(\alpha=0.05\), so \(\mathrm{z}_\alpha=\mathrm{z}_{0.05}=1.65\)
6)
\(x=289\)
\(n=350\)
\(\hat{p}=\frac{289}{350} \cong 0.83\)
\[ z_0=\frac{x-n p_0}{\sqrt{n p_0\left(1-p_0\right)}}=\frac{289-350(0.78)}{\sqrt{350(0.78)(0.22)}}=2.06 \]
7) Because \(2.06>1.65\), reject the null hypothesis and conclude the true success rate is greater than 0.78 at \(\alpha=0.05\).
\[ \text { P-value }=1-0.9803=0.0197 \]
-
The \(95 \%\) lower confidence bound:
\[ \begin{array}{r} \hat{p}-z_\alpha \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \leq p \\ 0.83-1.65 \sqrt{\frac{0.83(0.17)}{350}} \leq p \\ 0.7969 \leq p \end{array} \]
Because the hypothesized value lies below the lower confidence bound \((0.78<0.7969)\), reject the null hypothesis.
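Steps 6)–7) can be checked numerically (the exact P-value differs slightly from 0.0197 because the table lookup rounds \(z_0\) to 2.06):

```python
import numpy as np
from scipy import stats

x, n, p0 = 289, 350, 0.78
z0 = (x - n * p0) / np.sqrt(n * p0 * (1 - p0))
p_value = stats.norm.sf(z0)  # one-sided (upper-tail) P-value
print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}")  # z0 ≈ 2.06
```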
Exercise 4 (Book 9.5.4)¶
An article in Fortune (September 21, 1992) claimed that nearly one-half of all engineers continue academic studies beyond the B.S. degree, ultimately receiving either an M.S. or a Ph.D. degree. Data from an article in Engineering Horizons (Spring 1990) indicated that 117 of 484 new engineering graduates were planning graduate study.
- Are the data from Engineering Horizons consistent with the claim reported by Fortune? Use \(\alpha=0.05\) in reaching your conclusions. Find the \(P\)-value for this test.
- Discuss how you could have answered the question in part (a) by constructing a two-sided confidence interval on \(p\).
Answer
-
1) The parameter of interest is the true proportion of engineering students planning graduate studies
2) \(\mathrm{H}_0: \mathrm{p}=0.50\)
3) \(\mathrm{H}_1: \mathrm{p} \neq 0.50\)
4) \(z_0=\frac{x-n p_0}{\sqrt{n p_0\left(1-p_0\right)}}\) or \(z_0=\frac{\hat{p}-p_0}{\sqrt{\frac{p_0\left(1-p_0\right)}{n}}}\); Either approach will yield the same conclusion
5) Reject \(\mathrm{H}_0\) if \(\mathrm{z}_0<-\mathrm{z}_{\alpha / 2}\) or \(\mathrm{z}_0>\mathrm{z}_{\alpha / 2}\), where \(\alpha=0.05\) and \(\mathrm{z}_{\alpha / 2}=\mathrm{z}_{0.025}=1.96\)
6) \(\mathrm{x}=117\), \(\mathrm{n}=484\)
\[ \begin{aligned} & \hat{p}=\frac{117}{484}=0.2417 \\ & z_0=\frac{x-n p_0}{\sqrt{n p_0\left(1-p_0\right)}}=\frac{117-484(0.5)}{\sqrt{484(0.5)(0.5)}}=-11.36 \end{aligned} \]
7) Because \(-11.36<-1.96\), reject the null hypothesis and conclude that the true proportion of engineering students planning graduate studies differs from 0.5 at \(\alpha=0.05\).
\[ \text { P-value }=2[1-\Phi(11.36)] \cong 0 \]
-
\(\hat{p}=\frac{117}{484}=0.2417 \approx 0.242\)
\[ \begin{aligned} \hat{p}-z_{\alpha / 2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} & \leq p \leq \hat{p}+z_{\alpha / 2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \\ 0.242-1.96 \sqrt{\frac{0.242(0.758)}{484}} & \leq p \leq 0.242+1.96 \sqrt{\frac{0.242(0.758)}{484}} \\ 0.204 & \leq p \leq 0.280 \end{aligned} \]
Because the \(95 \%\) confidence interval does not contain the value 0.5, we conclude that the true proportion of engineering students planning graduate studies differs from 0.5.
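The test statistic and two-sided P-value can be verified with a few lines of scipy:

```python
import numpy as np
from scipy import stats

x, n, p0 = 117, 484, 0.5
z0 = (x - n * p0) / np.sqrt(n * p0 * (1 - p0))
p_value = 2 * stats.norm.sf(abs(z0))  # two-sided P-value
print(f"z0 = {z0:.2f}, P-value = {p_value:.2e}")  # z0 ≈ -11.36, P ≈ 0
```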
Exercise 5 (Exam 2014.2.c)¶
An IT company receives its printed circuit boards from two different suppliers, 1 and 2. Records show that 5% of the circuit boards from supplier 1 and 3% of the circuit boards from supplier 2 are defective. 60% of the company’s current circuit boards come from supplier 2, and the remaining from supplier 1. The company usually keeps a stock of 2000 circuit boards.
- Is there sufficient evidence to support the claim that the rate of defectives depends very significantly on supplier?
Answer
-
\(\mathrm{H}_0\) : The rate of defectives is independent of the supplier
\(\mathrm{H}_1\) : The rate of defectives depends on the supplier
Level of significance \(= 0.01\)
P-value \(= 0.0298\)
Because \(0.0298 > 0.01\), we fail to reject \(\mathrm{H}_0\): there is insufficient evidence to support the claim that the rate of defectives depends very significantly on the supplier. With \(\alpha = 0.05\), however, we would be able to reject \(\mathrm{H}_0\).
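A sketch of the underlying computation: with the stock of 2000 boards (800 from supplier 1, 1200 from supplier 2), the defect rates give the 2×2 table below, and `scipy.stats.chi2_contingency` (which applies the Yates continuity correction by default for 2×2 tables) reproduces the quoted p-value:

```python
import numpy as np
from scipy import stats

# Defective / OK counts: 5% of 800 boards from supplier 1, 3% of 1200 from supplier 2
table = np.array([[40, 760],
                  [36, 1164]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # p ≈ 0.0298
```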
Exercise 6 (Exam 2015.3)¶
Different screens and their hue bias were tested and the result is displayed in the following table:
| | Blueish | Reddish | Greenish |
|---|---|---|---|
| Display 1 | 46 | 82 | 72 |
| Display 2 | 42 | 38 | 20 |
| Display 3 | 52 | 40 | 8 |
Is there sufficient evidence to conclude that screens and hue bias are significantly dependent? Design an appropriate test to answer this question.
Answer
\(H_0\) : Screens and hue bias are independent
\(H_1\) : Screens and hue bias are dependent
From the template, we obtain a p-value \(= 0.0000\). From this we reject the null hypothesis and conclude that screens and hue bias are dependent.
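The template result can be reproduced with a chi-square test of independence in scipy (a sketch):

```python
import numpy as np
from scipy import stats

table = np.array([[46, 82, 72],
                  [42, 38, 20],
                  [52, 40, 8]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.2e}")  # p ≈ 0, so reject H0
```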
Exercise 7 (Exam 2015.4)¶
Two different machines, \(A\) and \(B\), which are used to measure blood pressure, are tested on 12 different patients such that each patient has his/her blood pressure measured by both machines. The results for the systolic blood pressure are displayed in the table below:
| Patient | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Machine A | 119 | 130 | 141 | 123 | 149 | 156 | 134 | 108 | 123 | 138 | 119 | 156 |
| Machine B | 112 | 126 | 145 | 112 | 138 | 156 | 130 | 112 | 112 | 119 | 112 | 152 |
- Determine the mean, standard deviation and interquartile range for both sets of data
- Is it possible to conclude with statistical significance that the two machines give different measurements? Design an appropriate test to answer this question.
- Explain what the P -value obtained in b) actually means.
Answer
-
| | Machine A | Machine B |
|---|---|---|
| Mean | 133 | 127.1667 |
| St. Dev. | 15.4626 | 16.8136 |
| IQR | 21 | 27.75 |
-
\(H_0\) : Mean machine A is equal to mean of machine B
\(H_1\) : Mean machine A is not equal to mean of machine B
Since each patient is measured by both machines, the two samples are paired, so we use a paired t-test on the per-patient differences (a t-test rather than a z-test, since the sample is small). We obtain a p-value \(= 0.0117\). From this we reject the null hypothesis and conclude that the machines are significantly different.
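The quoted p-value can be reproduced with scipy's paired t-test:

```python
import numpy as np
from scipy import stats

A = np.array([119, 130, 141, 123, 149, 156, 134, 108, 123, 138, 119, 156])
B = np.array([112, 126, 145, 112, 138, 156, 130, 112, 112, 119, 112, 152])

t0, p = stats.ttest_rel(A, B)  # paired t-test on the per-patient differences
print(f"t0 = {t0:.2f}, p = {p:.4f}")  # p ≈ 0.0117, reject H0 at alpha = 0.05
```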
-
The p-value indicates the probability, given that the null hypothesis is true, of obtaining a result at least as extreme as the one observed; i.e. under the assumption that the two machines yield similar measurements, the probability of observing a difference at least as large as that in assignment a) is \(0.0117\).
Exercise 8 (Exam 2016.4)¶
An industrial safety program was recently instituted in the computer chip industry. The average weekly loss (averaged over 1 month) in labor-hours due to accidents in 10 similar plants both before and after the program are as follows:
| Plant | Before | After |
|---|---|---|
| 1 | 30.5 | 23 |
| 2 | 18.5 | 21 |
| 3 | 24.5 | 22 |
| 4 | 32 | 28.5 |
| 5 | 16 | 14.5 |
| 6 | 15 | 15.5 |
| 7 | 23.5 | 24.5 |
| 8 | 25.5 | 21 |
| 9 | 28 | 23.5 |
| 10 | 18 | 16.5 |
- Determine whether the safety program has had a significant effect on reducing labor-hours due to accidents in the 10 plants.
- Set up a \(95 \%\) confidence interval on the average difference and state how this interval could have been used to answer question a).
- Is there evidence to support the claim that the program has had an effect at the \(1 \%\) level of significance?
Answer
-
Before = [30.5, 18.5, 24.5, 32, 16, 15, 23.5, 25.5, 28, 18]
After = [23, 21, 22, 28.5, 14.5, 15.5, 24.5, 21, 23.5, 16.5]
df = pd.DataFrame({'Before': Before, 'After': After})
df['Difference'] = df['Before'] - df['After']
meandiff = np.mean(df['Difference'])
We check for normality of the differences
stats.probplot(df['Difference'], plot=plt)
plt.ylabel('Difference in Labor-hours')
plt.show()
print('Skewness = ' + repr(round(stats.skew(df['Difference']), 4)))
print('Kurtosis = ' + repr(round(stats.kurtosis(df['Difference']), 4)))

fig, ax = plt.subplots()
df['Difference'].plot.kde(ax=ax, legend=False, title='Distribution');
Skewness = 0.1328
Kurtosis = -0.7163
Plotting
n1 = len(df['Before'])
SE1 = stats.sem(df['Before'])
mean1 = np.mean(df['Before'])
n2 = len(df['After'])
SE2 = stats.sem(df['After'])
mean2 = np.mean(df['After'])

x1 = np.linspace(mean1 - 4*SE1, mean1 + 4*SE1, 1000)
x2 = np.linspace(mean2 - 4*SE2, mean2 + 4*SE2, 1000)
y1 = stats.t.pdf(x1, n1 - 1, mean1, SE1)
y2 = stats.t.pdf(x2, n2 - 1, mean2, SE2)
plt.plot(x1, y1, color='red')
plt.plot(x2, y2, color='blue')
plt.show()
val = stats.ttest_rel(df['Before'], df['After'])
alpha = 0.05
stat = abs(round(val[0], 2))
pvalue = round(val[1], 4) / 2  # one-sided test, so halve the two-sided p-value
crit = abs(round(stats.t.ppf(alpha, n1 - 1), 2))
stat
2.27
if pvalue < alpha:
    print("Reject since " + repr(pvalue) + ' < ' + repr(alpha))
else:
    print("Fail to reject since " + repr(pvalue) + ' > ' + repr(alpha))
a) Reject since 0.02485 < 0.05
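For part b), a \(95 \%\) confidence interval on the average difference can be set up with `stats.t.interval` (a sketch reusing the Before/After data):

```python
import numpy as np
from scipy import stats

before = np.array([30.5, 18.5, 24.5, 32, 16, 15, 23.5, 25.5, 28, 18])
after = np.array([23, 21, 22, 28.5, 14.5, 15.5, 24.5, 21, 23.5, 16.5])
diff = before - after

ci = stats.t.interval(0.95, len(diff) - 1, loc=diff.mean(), scale=stats.sem(diff))
print(f"95% CI for the mean difference: [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Because the interval lies (just) above zero, it leads to the same conclusion as the test in a) at the 5% level.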
-
c) The test statistic and p-value are unchanged; with \(\alpha = 0.01\) we have \(0.02485 > 0.01\), so we fail to reject and there is no evidence of an effect at the \(1 \%\) level of significance.
Exercise 9 (Exam 2017.5)¶
A recent study among 254 computer science graduates from Aarhus University was made in order to determine how successful the former students were in their current employment. 98 of these students had taken a course in linear algebra and of these 92 were classified as "successful" in their current employment. 136 of the students who had not taken a course in linear algebra were classified as "successful" in their current employment.
- Is there evidence to support the claim that computer science graduates who had taken a linear algebra course were more successful in their current employment than those who had not taken such a course?
- Explain the meaning of the p-value obtained in question (a), i.e. what does this probability refer to?
Answer
alg = 98            # graduates who took linear algebra
algs = 92           # of those, classified as successful
nonalg = 254 - alg  # graduates who did not take linear algebra
nonalgs = 136       # of those, classified as successful
-
Since we have two proportions, we can use a test of the difference between two proportions:
val = sm.stats.proportions_ztest([algs, nonalgs], [alg, nonalg], value=None, alternative='larger')
stat = abs(round(val[0], 2))
pvalue = round(val[1], 4)
alpha = 0.05
crit = stats.norm.isf(alpha / 2)
if pvalue < alpha:
    print("Reject since " + repr(pvalue) + ' < ' + repr(alpha))
else:
    print("Fail to reject since " + repr(pvalue) + ' > ' + repr(alpha))
Reject since 0.0432 < 0.05
import numpy as np
from scipy.stats import norm
from IPython.display import display, Markdown

# Calculate the pooled proportion
pooled_p = (algs + nonalgs) / (alg + nonalg)

# Calculate the standard error
std_error = np.sqrt(pooled_p * (1 - pooled_p) * (1/alg + 1/nonalg))

# Calculate the z-score
z_score = (algs/alg - nonalgs/nonalg) / std_error

# Determine the p-value for the larger alternative
p_value = norm.sf(z_score)  # sf is the survival function, which is 1-cdf

# Set significance level
alpha = 0.05

# Display the results
display(Markdown("### Z-Test Results for Two Proportions"))
display(Markdown(f"**Z-Score:** {z_score:.2f}"))
display(Markdown(f"**P-Value:** {p_value:.4f}"))
display(Markdown(f"**Significance Level (Alpha):** {alpha}"))

# Decision based on p-value
if p_value < alpha:
    display(Markdown("**Conclusion:** Reject the null hypothesis since p-value < alpha."))
else:
    display(Markdown("**Conclusion:** Fail to reject the null hypothesis since p-value > alpha."))
Z-Test Results for Two Proportions
Z-Score: 1.71
P-Value: 0.0432
Significance Level (Alpha): 0.05
Conclusion: Reject the null hypothesis since p-value < alpha.
val[0]
1.7143021919557946
-
The p-value is the probability, assuming the two groups have the same true success rate, of observing a difference in sample proportions at least as large as the one obtained (i.e. \(z \geq 1.71\)); here that probability is \(0.0432\).
Exercise 10 (Reexam 2018.4)¶
Two producers of batteries measure the longevity of 30 batteries of the same type, which were randomly chosen from a larger batch of such batteries. The lifetime (in hundreds of hours) is displayed in "Batteries.xlsx".
- Check the dataset for outliers and replace any outliers with the mean lifetime of the producer in question. Use this cleaned dataset in the following questions.
- Determine estimates for the quartiles, average lifetime, standard deviation and variance of each producer's batteries
- Set up \(95 \%\) confidence intervals for each producer's mean battery lifetime, and accompany the intervals with plots that display the rejection region.
- Is it reasonable to conclude that the lifetimes of the two producers' batteries follow a normal distribution? Explain using plots and discussing skewness and kurtosis.
- Set up a \(95 \%\) confidence interval for the difference between the two producers' mean battery lifetimes, and accompany the interval with a plot that displays the rejection region.
- Is there significant evidence to support the claim that the mean lifetime of the batteries from the two producers differ from one another?
Answer
-
df = pd.read_excel('Batteries.xlsx')
df.head()
   Producer 1  Producer 2
0      2.1162      1.1259
1      2.5135      3.1725
2      1.8137      2.4492
3      0.8075      3.7766
4      1.5554      4.4673

q3, q1 = np.percentile(df['Producer 1'], [75, 25])
iqr = q3 - q1
upper = q3 + 1.5 * iqr
lower = q1 - 1.5 * iqr
average = df.loc[(df['Producer 1'] < upper) & (df['Producer 1'] > lower), 'Producer 1'].mean()
df['Producer 1'] = np.where((df['Producer 1'] > upper) | (df['Producer 1'] < lower), average, df['Producer 1'])

q3, q1 = np.percentile(df['Producer 2'], [75, 25])
iqr = q3 - q1
upper = q3 + 1.5 * iqr
lower = q1 - 1.5 * iqr
average = df.loc[(df['Producer 2'] < upper) & (df['Producer 2'] > lower), 'Producer 2'].mean()
df['Producer 2'] = np.where((df['Producer 2'] > upper) | (df['Producer 2'] < lower), average, df['Producer 2'])
-
df1 = df['Producer 1']
df2 = df['Producer 2']

print('Producer 1: ')
print('q1 = ', round(df1.quantile(0.25), 4))
print('q2 = ', round(df1.quantile(0.5), 4))
print('q3 = ', round(df1.quantile(0.75), 4))
print('q4 = ', round(df1.quantile(1), 4))
print('average = ', round(df1.mean(), 4))
print('std = ', round(df1.std(), 4))
print('var = ', round(df1.var(), 4))
print(' ')
print('Producer 2: ')
print('q1 = ', round(df2.quantile(0.25), 4))
print('q2 = ', round(df2.quantile(0.5), 4))
print('q3 = ', round(df2.quantile(0.75), 4))
print('q4 = ', round(df2.quantile(1), 4))
print('average = ', round(df2.mean(), 4))
print('std = ', round(df2.std(), 4))
print('var = ', round(df2.var(), 4))
Producer 1:
q1 = 1.1888
q2 = 1.9196
q3 = 2.4074
q4 = 3.637
average = 1.9029
std = 0.9109
var = 0.8298
Producer 2:
q1 = 2.0375
q2 = 2.5158
q3 = 3.0069
q4 = 4.4673
average = 2.4776
std = 0.9124
var = 0.8325
-
Using t.interval
from scipy import stats

n = len(df1)
mean = np.mean(df1)
SE = stats.sem(df1)
Level = 0.95
CI = stats.t.interval(Level, n - 1, loc=mean, scale=SE)
print('A ' + repr(Level * 100) + '% confidence interval for the sample mean is ['
      + repr(round(CI[0], 2)) + '; ' + repr(round(CI[1], 2)) + ']')
A 95.0% confidence interval for the sample mean is [1.56; 2.24]
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

# Using the formula
data = df1

# Calculate the mean and standard deviation of the data
mean = np.mean(data)
std_dev = np.std(data, ddof=1)  # ddof=1 gives an unbiased estimator of the population variance

# Calculate the standard error of the mean
std_error = std_dev / np.sqrt(len(data))

# Set the significance level and degrees of freedom for the t-distribution
alpha = 0.05  # 95% confidence level
dof = len(data) - 1

# Calculate the critical t-value for the two-tailed t-test
t_crit = stats.t.ppf(1 - alpha/2, dof)

# Calculate the confidence interval for the mean
lower = mean - t_crit * std_error
upper = mean + t_crit * std_error

# Print the confidence interval
print("95% Confidence Interval mean Producer 1: [{:.2f}, {:.2f}]".format(lower, upper))

# Plot the t-distribution with the confidence interval shaded
x = np.linspace(mean - 4*std_error, mean + 4*std_error, 1000)
y = stats.t.pdf(x, dof, mean, std_error)
plt.plot(x, y, 'k', linewidth=2)
shade = np.linspace(lower, upper, 300)
plt.fill_between(shade, stats.t.pdf(shade, dof, mean, std_error), alpha=0.5)
plt.axvline(x=lower, linestyle='--', color='k')
plt.axvline(x=upper, linestyle='--', color='k')
plt.title("t-Distribution with 95% Confidence Interval \n for the mean of batteries from producer 1")
plt.xlabel("t-value")
plt.ylabel("Probability density")
plt.show()

# Repeat for producer 2
data = df2
mean = np.mean(data)
std_dev = np.std(data, ddof=1)
std_error = std_dev / np.sqrt(len(data))
dof = len(data) - 1
t_crit = stats.t.ppf(1 - alpha/2, dof)
lower = mean - t_crit * std_error
upper = mean + t_crit * std_error
print("95% Confidence Interval mean Producer 2: [{:.2f}, {:.2f}]".format(lower, upper))

x = np.linspace(mean - 4*std_error, mean + 4*std_error, 1000)
y = stats.t.pdf(x, dof, mean, std_error)
plt.plot(x, y, 'k', linewidth=2)
shade = np.linspace(lower, upper, 300)
plt.fill_between(shade, stats.t.pdf(shade, dof, mean, std_error), alpha=0.5)
plt.axvline(x=lower, linestyle='--', color='k')
plt.axvline(x=upper, linestyle='--', color='k')
plt.title("t-Distribution with 95% Confidence Interval \n for the mean of batteries from producer 2")
plt.xlabel("t-value")
plt.ylabel("Probability density")
plt.show()
95% Confidence Interval mean Producer 1: [1.56, 2.24]
95% Confidence Interval mean Producer 2: [2.14, 2.82]
-
stats.probplot(df1, plot=plt)
plt.ylabel('Producer 1')
plt.show()
print('Skewness = ' + repr(round(stats.skew(df1), 4)))
print('Kurtosis = ' + repr(round(stats.kurtosis(df1), 4)))

fig, ax = plt.subplots()
df1.plot.kde(ax=ax, legend=False, title='Distribution of mean of Producer 1');
Skewness = 0.1134
Kurtosis = -0.4326
stats.probplot(df2, plot=plt)
plt.ylabel('Producer 2')
plt.show()
print('Skewness = ' + repr(round(stats.skew(df2), 4)))
print('Kurtosis = ' + repr(round(stats.kurtosis(df2), 4)))

fig, ax = plt.subplots()
df2.plot.kde(ax=ax, legend=False, title='Distribution of mean of Producer 2');
Skewness = -0.1821
Kurtosis = -0.19
-
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

data1 = df1
data2 = df2

# Calculate the mean and standard deviation of the data
mean1 = np.mean(data1)
mean2 = np.mean(data2)
std_dev1 = np.std(data1, ddof=1)  # ddof=1 gives an unbiased estimator of the population variance
std_dev2 = np.std(data2, ddof=1)

# Calculate the standard error of the difference in means
std_error = np.sqrt((std_dev1 ** 2 / len(data1)) + (std_dev2 ** 2 / len(data2)))

# Set the significance level and degrees of freedom for the t-distribution
alpha = 0.05  # 95% confidence level
dof = len(data1) + len(data2) - 2  # renamed from 'df' to avoid shadowing the DataFrame

# Calculate the critical t-value for the two-tailed t-test
t_crit = stats.t.ppf(1 - alpha/2, dof)

# Calculate the confidence interval for the difference in means
diff = mean1 - mean2
lower = diff - t_crit * std_error
upper = diff + t_crit * std_error

# Print the confidence interval
print("95% Confidence Interval for the Difference in Means: [{:.2f}, {:.2f}]".format(lower, upper))

# Plot the t-distribution with the central region shaded and the critical values marked
x = np.linspace(-4, 4, 1000)
y = stats.t.pdf(x, dof)
plt.plot(x, y, 'k', linewidth=2)
shade = np.linspace(-t_crit, t_crit, 300)
plt.fill_between(shade, stats.t.pdf(shade, dof), alpha=0.5)
plt.axvline(x=t_crit, linestyle='--', color='k')
plt.axvline(x=-t_crit, linestyle='--', color='k')
plt.title("t-Distribution with 95% Confidence Interval for the Difference in Means")
plt.xlabel("t-value")
plt.ylabel("Probability density")
plt.show()
95% Confidence Interval for the Difference in Means: [-1.05, -0.10]
-
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

data1 = df1
data2 = df2

# Set the significance level
alpha = 0.05  # 95% confidence level

# Perform a two-sample t-test with equal variances
t_stat, p_value = stats.ttest_ind(data1, data2, equal_var=True)

# Calculate the critical t-value for the two-tailed t-test
dof = len(data1) + len(data2) - 2
t_crit = stats.t.ppf(1 - alpha/2, dof)

# Print the results of the hypothesis test
if p_value < alpha:
    print("Reject since ", round(p_value, 4), ' < ', alpha)
else:
    print("Fail to reject since ", round(p_value, 4), '\u2265', alpha)

# Plot the t-distribution with the rejection region shaded
x = np.linspace(-4, 4, 1000)
y = stats.t.pdf(x, dof)
plt.plot(x, y, 'k', linewidth=2)
shade1 = np.linspace(-t_crit, t_crit, 300)
shade2 = np.linspace(t_crit, 4, 300)
plt.fill_between(shade1, stats.t.pdf(shade1, dof), alpha=0.5)
plt.fill_between(shade2, stats.t.pdf(shade2, dof), alpha=0.5)
plt.axvline(x=t_crit, linestyle='--', color='k')
plt.axvline(x=-t_crit, linestyle='--', color='k')

# Add an arrow pointing to the observed test statistic
plt.annotate("p = {:.4f}".format(p_value), xy=(t_stat, 0.1), xytext=(t_stat + 1, 0.3),
             arrowprops=dict(facecolor='green', shrink=0.05))

plt.title("t-Distribution with Hypothesis Test Results")
plt.xlabel("t-value")
plt.ylabel("Probability density")
plt.show()
Reject since 0.0177 < 0.05