Problem 1 - 5 pts
bovids <- read.table("../../static/datasets/bovid_occurrences_table.txt",
                     header=TRUE, sep="\t")
#add rownames
rownames(bovids) <- bovids$taxon
#drop the last column (the taxon labels, which are now stored as row names)
bovids <- bovids[,-9]

Chi-square is a good choice here because all the expected cell frequencies are well above 5. It tests the null hypothesis that the row variable (taxon) and the column variable (site) are independent, that is, not associated.

myTest <- chisq.test(bovids)

Just to make sure, let's check that our expected values are high enough. Remember, any expected value less than 5 would call for Fisher’s exact test instead. All of our values are well above 5, so we are good with chi-square.

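The check described above can also be done programmatically. The following is a minimal sketch on a small made-up table (`counts` and its labels are hypothetical, not the bovid data). It shows that each expected count is the row total times the column total divided by the grand total, and that one logical test can confirm every expected value is at least 5.

```r
# Hypothetical 2 x 3 table of counts (illustration only, not the bovid data)
counts <- matrix(c(30, 20, 25,
                   10, 40, 35),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("taxonA", "taxonB"),
                                 c("site1", "site2", "site3")))

# Expected count under independence: row total * column total / grand total
expected_by_hand <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# chisq.test() stores the same matrix in its $expected component
test <- chisq.test(counts)
same_expected <- isTRUE(all.equal(expected_by_hand, test$expected))

# TRUE when every expected count meets the rule of thumb of at least 5
all_ok <- all(test$expected >= 5)
```

If `all_ok` were FALSE, `fisher.test(counts)` would be the alternative mentioned above; for larger tables it may need `simulate.p.value = TRUE` to finish in reasonable time.
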
myTest$expected
##                 site1    site2    site3    site4     site5    site6    site7
## Gazella      168.3372 138.7644 141.4942 221.5681  88.71824 76.88915 161.9677
## Connochaetes 188.1332 155.0828 158.1336 247.6239  99.15127 85.93110 181.0146
## Tragelaphus  167.9099 138.4122 141.1351 221.0058  88.49307 76.69400 161.5566
## Aepyceros    215.6197 177.7406 181.2371 283.8022 113.63741 98.48576 207.4611
##                 site8
## Gazella      184.2610
## Connochaetes 205.9296
## Tragelaphus  183.7933
## Aepyceros    236.0162

Now we look at the results of the test and interpret them.

myTest
##
## Pearson's Chi-squared test
##
## data: bovids
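## X-squared = 508.66, df = 21, p-value < 2.2e-16

The p-value is far below 0.05, so we reject the null hypothesis of independence: the four taxa are not represented in the same proportions across the eight sites.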
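
To make the test output less of a black box, here is a minimal sketch of how the statistic, degrees of freedom, and p-value can be reproduced by hand. It again uses a small hypothetical table (`counts` is made up for illustration, not the bovid data) and compares the hand calculation with `chisq.test()`.

```r
# Hypothetical 2 x 3 table of counts (illustration only, not the bovid data)
counts <- matrix(c(30, 20, 25,
                   10, 40, 35),
                 nrow = 2, byrow = TRUE)

# Expected counts under independence
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson's statistic: sum over cells of (observed - expected)^2 / expected
x_squared <- sum((counts - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(counts) - 1) * (ncol(counts) - 1)

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

# Reference result from the built-in test
test <- chisq.test(counts)
```

The same formula applied to the bovid table gives (4 - 1) * (8 - 1) = 21 degrees of freedom, matching the `df = 21` reported above. To see which taxon and site combinations contribute most to a significant result, the `stdres` component of the `chisq.test()` result (standardized residuals) is one place to look.
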