PS 405 – Week 4 Section: Difference of Means, ANOVA, and Matrix Algebra
D.J. Flynn
February 4, 2014

t-tests
- test whether two group means are equal, using the two sample means
- hypotheses:
    $H_0$: no difference in means
    $H_A$: a nonzero (significant) difference
- calculating the t-stat:

$$ t = \frac{\text{statistic} - \text{hypothesized difference}}{\text{SE of estimate}} $$

Gender/partisanship example
Question: are men and women equally likely to be Democrats?
t-stat for difference in proportions:

$$ t = \frac{P_m - P_f}{\sqrt{\dfrac{P_m(1 - P_m)}{n_m} + \dfrac{P_f(1 - P_f)}{n_f}}} $$

where $P_m$ and $P_f$ are the sample proportions of Democrats among men and women, and $n_m$ and $n_f$ are the two sample sizes.

The p-value that R estimates is for the null of no difference; the confidence interval is for the difference between the two means.

Interpretation of p-value: "if the null hypothesis is true, how often would we observe a difference at least this large under repeated sampling?" It does NOT say that there is a p% chance that the true difference is equal to X.

Logic of ANOVA and the F test
- running theme: experiments with >2 groups
- does assignment to a particular group (X) affect some continuous outcome (Y)?
- this question can be answered with one-way ANOVA (AKA F-test)
- two sources of variation in DV:
    - intended: independent variable/factor
    - unintended: error/residual
- goal of ANOVA: determine share of variance explained by X

ANOVA table
- Go through table quickly
- F statistic (sometimes called F-act):

$$ F = \frac{\text{explained variance}}{\text{unexplained variance}} = \frac{MS_A}{MS_E} $$

- look up critical F-stat based on numerator df, denominator df, and confidence level (equivalently, the significance level $\alpha$)
- if F-act > F-critical, then we reject the null of independence (Y does not depend on group, i.e. all group means are equal)

ANOVA in R
1. identify independent and dependent variables
2. determine variable structures (and change if necessary)
3. estimate ANOVA and call up results

Determining variable structure
- str(variable) returns the structure of a variable: integer, factor, character, numeric, logical
- important because ANOVAs are used for categorical IVs
- practice:

    # datasets ships with base R and is loaded by default,
    # so no install.packages() call is needed
    library(datasets)
    names(chickwts)         # variable names in the data frame
    str(chickwts$weight)    # numeric: the continuous outcome (DV)
    str(chickwts$feed)      # factor: the categorical treatment (IV)
    levels(chickwts$feed)   # the feed groups being compared

Estimating ANOVAs in R

    anova <- aov(weight ~ feed, data = chickwts)
    summary(anova)

                Df Sum Sq Mean Sq F value   Pr(>F)
    feed         5 231129   46226   15.37 5.94e-10 ***
    Residuals   65 195556    3009
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What happens if we instead estimate aov(feed ~ weight)?

    wrong.model <- aov(feed ~ weight, data = chickwts)

    Warning messages:
    1: In model.response(mf, "numeric") :
      using type = "numeric" with a factor response will be ignored
    2: In Ops.factor(y, z$residuals) : '-' not meaningful for factors

R warns because feed is a factor: aov() needs the continuous outcome on the left of ~ and the grouping factor on the right.

Another example
We have data on which undergraduate institution people attended and mid-life satisfaction (0-100):

    names(my.data)
    [1] "school"       "satisfaction"

    table(my.data$school)
    school
    fsu  uf  um
      5   5   5

    my.anova <- aov(satisfaction ~ school, data = my.data)
    summary(my.anova)

                Df Sum Sq Mean Sq F value  Pr(>F)
    school       2   7216    3608   11.85 0.00144 **
    Residuals   12   3655     305
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
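To connect the ANOVA table to the F statistic defined earlier, the chickwts ANOVA can be rebuilt by hand. This is a minimal sketch, not part of the original slides: the variable names (ss.between, f.act, f.crit, and so on) are my own, and the 0.05 significance level is an assumption. The resulting F should agree with the F value that summary() reports for aov(weight ~ feed, data = chickwts).

```r
# One-way ANOVA "by hand" on the built-in chickwts data.
# All object names below are illustrative, not from the slides.
y <- chickwts$weight   # continuous outcome (DV)
g <- chickwts$feed     # grouping factor (IV)

grand.mean  <- mean(y)
group.means <- tapply(y, g, mean)     # mean weight within each feed group
group.n     <- tapply(y, g, length)   # number of chicks in each feed group

# Explained (between-group) and unexplained (within-group) sums of squares
ss.between <- sum(group.n * (group.means - grand.mean)^2)
ss.within  <- sum((y - ave(y, g))^2)  # ave() returns each chick's group mean

# Degrees of freedom: numerator = groups - 1, denominator = N - groups
df.between <- nlevels(g) - 1
df.within  <- length(y) - nlevels(g)

# Mean squares and the F statistic (MS_A / MS_E)
ms.between <- ss.between / df.between
ms.within  <- ss.within / df.within
f.act      <- ms.between / ms.within

# Critical value and p-value from the F distribution
alpha   <- 0.05   # assumed significance level
f.crit  <- qf(1 - alpha, df.between, df.within)
p.value <- pf(f.act, df.between, df.within, lower.tail = FALSE)

reject.null <- f.act > f.crit
print(c(F = f.act, F.crit = f.crit, p = p.value))
```

Using qf() replaces the printed F table: it returns the critical value for any combination of numerator df, denominator df, and significance level.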
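Returning to the gender/partisanship example, the difference-in-proportions t-stat can also be computed directly from its formula. The counts below are hypothetical numbers made up for illustration (the slides give no data), and the p-value and confidence interval rely on a large-sample normal approximation, which is an assumption on my part.

```r
# Hypothetical survey counts, invented for illustration only.
n.m   <- 500   # men surveyed
dem.m <- 200   # men identifying as Democrats
n.f   <- 500   # women surveyed
dem.f <- 250   # women identifying as Democrats

# Sample proportions P_m and P_f
p.m <- dem.m / n.m
p.f <- dem.f / n.f

# Standard error of the difference (unpooled, matching the slide's formula)
se.diff <- sqrt(p.m * (1 - p.m) / n.m + p.f * (1 - p.f) / n.f)

# t = (statistic - hypothesized difference) / SE, with a null difference of 0
t.stat <- ((p.m - p.f) - 0) / se.diff

# Two-sided p-value under the normal approximation (assumed: large samples)
p.value <- 2 * pnorm(abs(t.stat), lower.tail = FALSE)

# 95% confidence interval for the difference in proportions
ci <- (p.m - p.f) + c(-1, 1) * qnorm(0.975) * se.diff

print(c(t = t.stat, p = p.value, lower = ci[1], upper = ci[2]))
```

With these made-up counts the interval lies entirely below zero, which is the same conclusion as rejecting the null of no difference at the 5% level.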