SINHA MATH 140 PROJECT: ANOVA Randomization Hypothesis Test

Plagiarism Warning: Do not copy an online article or another Math 140 student’s work. This is plagiarism. Cheating will go on your permanent record. Further discipline will be determined by the COC dean of students. A plagiarism checker may be used if I suspect cheating. I have also had problems with students simply copying online ANOVA articles as their project. Do not do this either. This is also plagiarism. Your project answers should be intro level and use the given data and the simple, straightforward language used during lectures and in the textbook. Online articles are usually written by people in advanced statistics and will not answer these questions appropriately.

Grading Rubric: The project is worth 100 points. There are 25 things to put on your project listed below in bold as “Put these on the project report”. Some of them are pictures found on StatKey. Others are answers to questions or sentences to write. Each of these is worth 4% of the project grade. (4 points each)

Key Note about Assumptions (Conditions): This is a Census of students in the Fall 2015 semester. Even though it is not a random sample, we will be assuming that the sample data represents the population of COC statistics students. We will also be assuming that individual observations from this data are independent of each other, even though in reality this might not be the case.

Part I: Pick your two columns of data.

Open the Math 075/140 Combined Survey Data Fall 2015. Data Link.

This data was taken from statistics students (Math 140) and pre-stat students (Math 075) in the Fall 2015 semester.

There are 36 columns of data to choose from. Pick one categorical data set (column of words) and one quantitative data set (numerical measurement data) from the 075/140 combined survey data.

Put these on the Project Report:

1. What categorical variable did you pick? For example: The type of sandwich left out to spoil.

2. What quantitative variable did you pick? For example: The number of ants.

Part II: Write the Null Hypothesis, Alternative Hypothesis and Choose Your Claim

Test the claim that there is a relationship between your categorical variable and your quantitative variable. Use the following null and alternative hypotheses. The number of options in your categorical data will determine the number of groups in your ANOVA test. If your categorical data has two options, then the null will be µ1 = µ2. If your data has three options, then the null will be µ1 = µ2 = µ3. If your categorical data has 6 options, then your null will be µ1 = µ2 = µ3 = µ4 = µ5 = µ6. Here is an example null and alternative hypothesis. Your null and alternative hypotheses should have symbolic notation with “µ” and also the relationship statement. Do not just copy the null and alternative below. You should not say that the categorical and quantitative variables are related; you should say that the type of sandwich is related to the number of ants, if those were the variables you picked. Also, the claim could be either statement. Which do you think is true? Do you think the two columns of data you picked are related or not? That is the claim. Be sure to label which statement is the claim. This is a right-tailed test. (Remember all ANOVA tests are right-tailed.)

H0 : µ1 = µ2 = µ3 = µ4 = … (The type of sandwich is NOT RELATED to the number of ants.)

HA : at least one is ≠ (The type of sandwich is RELATED to the number of ants.) CLAIM
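
The way the number of options in the categorical column sets the number of µ's can be sketched in a few lines of Python. This is only an illustration and not part of the project requirements. The helper name write_hypotheses and the sandwich list are made up for the example, and your report should still use your own two variables.

```python
def write_hypotheses(categories, cat_name, quant_name):
    """Return (k, H0, HA) for a one-way ANOVA on a categorical column.

    Hypothetical helper for illustration; not part of StatKey or the handout.
    """
    # Each distinct option in the categorical column is one ANOVA group.
    k = len(set(categories))
    # One mean per group: µ1 = µ2 = ... = µk
    mus = " = ".join(f"µ{i}" for i in range(1, k + 1))
    h0 = f"H0 : {mus} ({cat_name} is NOT RELATED to {quant_name}.)"
    ha = f"HA : at least one is ≠ ({cat_name} is RELATED to {quant_name}.)"
    return k, h0, ha


# Made-up categorical column with three options, so three groups.
sandwiches = ["ham", "peanut butter", "ham", "tuna", "tuna", "peanut butter"]
k, h0, ha = write_hypotheses(sandwiches, "The type of sandwich", "the number of ants")
print(h0)
print(ha)
```

With three sandwich types the null comes out as µ1 = µ2 = µ3. A column with six options would give six µ's.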

Put these on the Project Report:

Data Link: http://teachoutcoc.org/files/math_075_140_combined_survey_data_fall_2015.xlsx

3. Write your null hypothesis H0 as seen in the example above. Be sure to include the symbolic notation with “µ” AND the relationship statement with the two variables you picked.

4. Write your alternative hypothesis HA as seen in the example above. Be sure to include the symbolic notation with “µ” AND the relationship statement with the two variables you picked.

5. Is your claim that the categorical and quantitative variables are related (HA) or not related (H0)?

Part III: Paste your Data into StatKey and Find your F-test statistic

Copy and paste your categorical column of data and the quantitative column of data next to each other in Excel. The categorical column should be on the left. The quantitative column should be on the right. Now highlight both columns without the titles, right click and copy. Do NOT copy the titles.

Go to the “ANOVA for Difference in Means” under the “More Advanced Randomization Tests” menu. StatKey Link.

Click on “Edit Data” and copy and paste the two columns of data (categorical on the left and quantitative on the right) into StatKey. Do NOT paste the titles. If you do paste the data with the titles, delete the titles. Uncheck the box for “header row” and click OK.
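
For anyone curious how an F-test statistic comes out of two columns like these, here is a minimal Python sketch. It uses the standard one-way ANOVA formula (variance between groups divided by variance within groups). It is not StatKey's own code, which this handout does not show. The function names and the tiny data set are made up for illustration, and you should still use StatKey for the project.

```python
import random
from statistics import mean, stdev


def group_values(rows):
    """Split (category, value) rows into one list of values per category."""
    groups = {}
    for category, value in rows:
        groups.setdefault(category, []).append(value)
    return groups


def f_statistic(rows):
    """One-way ANOVA F: (SS between / (k - 1)) / (SS within / (n - k))."""
    groups = group_values(rows)
    values = [value for _, value in rows]
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        g_mean = mean(g)
        ss_between += len(g) * (g_mean - grand_mean) ** 2
        ss_within += sum((v - g_mean) ** 2 for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def randomization_f(rows, rng):
    """F for ONE randomization sample: shuffle the numbers across the labels."""
    labels = [category for category, _ in rows]
    values = [value for _, value in rows]
    rng.shuffle(values)
    return f_statistic(list(zip(labels, values)))


# Made-up data: categorical on the left, quantitative on the right, no titles.
rows = [
    ("ham", 1), ("ham", 2), ("ham", 3),
    ("peanut butter", 2), ("peanut butter", 3), ("peanut butter", 4),
    ("tuna", 6), ("tuna", 7), ("tuna", 8),
]

# Sample size, mean and standard deviation for each group.
for name, g in group_values(rows).items():
    print(name, len(g), mean(g), stdev(g))

print("Original sample F:", f_statistic(rows))  # 21.0 for this made-up data
print("One randomization F:", randomization_f(rows, random.Random(1)))
```

The original sample F comes from the data exactly as collected. A randomization sample F comes from the same numbers shuffled across the groups, which is why the two should not be confused.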

Put these on the Project Report:

6. Copy and paste a picture of the “Original Sample Statistics” printout into your Project Report. You can find this on the top right of the StatKey page. It should show the F-test statistic from your data, and the sample size, mean, and standard deviation for each of your groups. It should look like the following but the numbers will be different. Do NOT copy and paste the “randomization sample” by mistake. Only copy the one that says “original sample”. The F from “original sample” is your one and only F-test statistic for your data. You do not need to copy the ANOVA table either.

7. Assumptions

