# Statistics question: hypothesis testing

Is the water on your airline flight safe to drink? It is not feasible to analyze the water on every flight, so sampling is necessary. In August and September 2004, the Environmental Protection Agency (EPA) found bacterial contamination in water samples from the lavatories and galley water taps on 20 of 158 randomly selected U.S. flights. Alarmed by the data, the EPA ordered sanitation improvements, and then tested water samples again in November and December 2004. In the second sample, bacterial contamination was found in 29 of 169 randomly sampled flights. As noted in the problem, the percentage of contaminated water samples in 2004 equaled 12.7%. The EPA continues to randomly sample bacterial contamination of water taps in airlines. In a recent random sample of 200 flights, 20 were found to be contaminated. Perform the appropriate test to determine if the proportion of water taps with bacterial contamination is less than the 2004 percentage of 12.7%. Complete the test using the PVALUE method for the decision rule (instead of the test statistic method). Be sure to state the hypotheses, decision, decision rule and conclusion.

As noted in the problem, the percentage of contaminated water samples in 2004 equaled 12.7%.

In a recent random sample of 200 flights, 20 were found to be contaminated.

Perform the appropriate test to determine if the proportion of water taps with bacterial contamination is less than the 2004 percentage of 12.7%.

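Clearly this calls for a test of proportions. One question may arise, though: since the 2004 figure of 12.7% is itself a sample result, should we test the difference between two independent sample proportions, or test a single proportion?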

In fact, there were two samples in 2004:

(1) In August and September 2004, the Environmental Protection Agency (EPA) found bacterial contamination in water samples from the lavatories and galley water taps on 20 of 158 randomly selected U.S. flights.

(2) tested water samples again in November and December 2004. In the second sample, bacterial contamination was found in 29 of 169 randomly sampled flights.

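The 12.7% is the result of the first sample (20/158), whereas the contamination rate in the second sample was even higher, 17.2% (29/169).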
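As a quick arithmetic check of the two 2004 rates quoted above, a minimal sketch (the variable names are illustrative only):

```python
# Counts taken from the problem statement.
first_rate = 20 / 158    # August-September 2004 sample
second_rate = 29 / 169   # November-December 2004 sample

print(f"first sample:  {first_rate:.1%}")   # 12.7%
print(f"second sample: {second_rate:.1%}")  # 17.2%
```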

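The problem says "to determine if... is less than the 2004 percentage of 12.7%", rather than asking for a comparison with "the situation in 2004". In other words, it takes the lower of the two 2004 contamination rates as a fixed reference point, and the goal is to show that the recent result is better than that 2004 figure of "12.7%". This is a one-sample test of a single proportion, and the hypotheses to test are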

H0: p ≥ 12.7%

Ha: p < 12.7%

Decision rule: reject H0 if the p-value ≤ the significance level.

Calculation:

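Let X be the number of contaminated flights in the sample of n = 200, so X ~ Binomial(200, p). With the observed count x = 20, the exact p-value is

p-value = P[X ≤ x; p = 12.7%] = P[X ≤ 20; p = 0.127] = 0.1482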
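The exact lower-tail probability can be reproduced with the standard library alone. The following is a minimal sketch, not the method the post necessarily used; `binom_cdf` is a helper name introduced here for illustration.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P[X <= k] for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, x, p0 = 200, 20, 0.127   # sample size, observed count, reference proportion

# Lower-tail p-value for Ha: p < p0.
p_exact = binom_cdf(x, n, p0)
print(f"exact p-value = {p_exact:.4f}")  # the post reports 0.1482
```

Summing the probability mass function directly is adequate for n = 200; a statistics library would be the usual choice for larger problems.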

Using the normal approximation (with a continuity correction),

p-value = P[X ≤ 20; p = 0.127] = P[X < 20.5; p = 0.127]

≈ P[Z < (20.5 - 200 × 0.127) / √(200 × 0.127 × 0.873)]

= P[Z < -1.0406] = 0.1490
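The normal-approximation step above can be checked numerically. This sketch uses only `math.erf` for the standard normal CDF; `normal_cdf` is a helper name introduced here for illustration.

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, x, p0 = 200, 20, 0.127

mean = n * p0                    # 25.4
sd = sqrt(n * p0 * (1 - p0))     # sqrt(200 * 0.127 * 0.873)
z = (x + 0.5 - mean) / sd        # continuity correction: 20 -> 20.5
p_normal = normal_cdf(z)

print(f"z = {z:.4f}, approximate p-value = {p_normal:.4f}")
# z = -1.0406, approximate p-value = 0.1490
```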

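Conclusion: the p-value (about 0.15) exceeds the usual significance levels such as 0.05 or 0.10, so we fail to reject H0. The data do not confirm that the contamination rate has improved (p < 0.127); for now, we can only regard the contamination rate as still being at the 12.7% level.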
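The decision rule can be applied mechanically to the two p-values reported above. Note that the problem statement does not specify a significance level, so `ALPHA = 0.05` below is an assumption made only for illustration, and `reject_h0` is a hypothetical helper name.

```python
def reject_h0(p_value: float, alpha: float) -> bool:
    """p-value decision rule: reject H0 when p-value <= alpha."""
    return p_value <= alpha

ALPHA = 0.05  # assumed; the problem does not give a significance level

for label, p_value in [("exact binomial", 0.1482), ("normal approximation", 0.1490)]:
    verdict = "reject H0" if reject_h0(p_value, ALPHA) else "fail to reject H0"
    print(f"{label}: p-value = {p_value:.4f} -> {verdict}")
```

Both p-values lead to the same decision at any significance level below about 0.148, which covers the conventional choices of 0.01, 0.05, and 0.10.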