Introduction to Statistics (MAT/SST 115.03 2008S)
You can use R to compute the test statistic. Recall that this test statistic has the form
(x1bar - x2bar)/sqrt(s1*s1/n1 + s2*s2/n2)
You can fill in the values.
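One way to fill in the values is to wrap the formula in a small helper. The sketch below is only an illustration: the function name two_sample_t and the summary values passed to it are made up for the example and are not taken from the exercise.

```r
# Two-sample t statistic computed from summary statistics.
# two_sample_t is a hypothetical helper, not a built-in R function.
two_sample_t <- function(x1bar, x2bar, s1, s2, n1, n2) {
  standard_error <- sqrt(s1^2 / n1 + s2^2 / n2)
  (x1bar - x2bar) / standard_error
}

# Made-up summary values, for illustration only.
t_stat <- two_sample_t(x1bar = 10, x2bar = 8, s1 = 3, s2 = 4, n1 = 9, n2 = 16)
t_stat
```

Substituting your own sample means, standard deviations, and sample sizes gives the statistic for the exercise.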
Okay, you need to compute
(xfbar-xmbar) +/- tkstar*sqrt(sf*sf/nf + sm*sm/nm)
I'll leave you to fill in the details.
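If you want to check your arithmetic, the interval can be sketched in R with qt, which returns the critical value tkstar. The helper name two_sample_ci, the made-up inputs, and the conservative choice of k = min(n1, n2) - 1 degrees of freedom are assumptions of this sketch; use whatever degrees of freedom your text prescribes.

```r
# Confidence interval for a difference in means from summary statistics.
# two_sample_ci is a hypothetical helper. The degrees of freedom use the
# conservative choice min(n1, n2) - 1, which is an assumption here.
two_sample_ci <- function(x1bar, x2bar, s1, s2, n1, n2, conf_level = 0.95) {
  df <- min(n1, n2) - 1
  t_star <- qt(1 - (1 - conf_level) / 2, df = df)  # upper critical value
  margin <- t_star * sqrt(s1^2 / n1 + s2^2 / n2)
  c(lower = (x1bar - x2bar) - margin, upper = (x1bar - x2bar) + margin)
}

# Made-up summary values, for illustration only.
ci <- two_sample_ci(10, 8, 3, 4, 9, 16)
ci
```

The interval is centered on the difference in sample means, so the midpoint of the two endpoints should equal that difference.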
As the book suggests, you can use the applet. You can also use
R's pt function to compute the
p-value. Since it's a two-sided test, we should
double the computed value.
t = (1.861-2.089)/sqrt(1.777*1.777/654 + 1.760*1.760/813)
2*pt(t, df=653)
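A note on why doubling works here: pt returns the area to the left of its argument, and the t statistic in this problem is negative, so pt(t, df) is already the small tail area. If the statistic had been positive, doubling pt(t, df) would give a value greater than 1. The sketch below uses the same numbers but takes -abs(t) so the computation works for either sign; the variable names are just for illustration.

```r
# Summary values taken from the computation above.
t_stat <- (1.861 - 2.089) / sqrt(1.777^2 / 654 + 1.760^2 / 813)

# pt() gives the area to the left of its argument, so -abs(t_stat)
# yields a tail area whether t_stat is negative or positive.
p_value <- 2 * pt(-abs(t_stat), df = 653)

t_stat
p_value
```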
This is one of those fun times in which our data set combines a number
of essentially independent columns into a single data frame. Since R
pads the empty cells in the data frame with NA values,
our analyses may be slightly more complicated.
Let's start by loading the data. There's little enough data that we can look at all of it.
CommuteTimes = read.csv("/home/rebelsky/Stats115/Data/HypoCommute.csv")
CommuteTimes
The columns are named A1 (for Alex's Route 1),
A2 (for Alex's Route 2), B1 (for Barb's Route 1), and
so on and so forth.
You should be able to read the sample size from the table. To get
the sample mean and standard deviation, we can use
mean and
sd, but need to tell the functions
to ignore the NA values. (Having to tell the functions
to deal with the NA values differently is one of the disadvantages of
combining the columns.)
mean(CommuteTimes$A1, na.rm=TRUE)
sd(CommuteTimes$A1, na.rm=TRUE)
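To see what the NA padding does, here is a small sketch with a made-up data frame standing in for the commute data (the values are invented, and the name commute is hypothetical). It also shows one way to count the sample size without reading it off the table.

```r
# Hypothetical data frame: the second column has fewer observations,
# so it is padded with NA.
commute <- data.frame(
  A1 = c(22, 25, 19, 24, 21),
  A2 = c(30, 28, 27, NA, NA)
)

mean(commute$A2)                 # NA, because of the missing values
mean(commute$A2, na.rm = TRUE)   # mean of the three observed values
sd(commute$A2, na.rm = TRUE)     # standard deviation of the observed values
sum(!is.na(commute$A2))          # sample size, counting only observed values
```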
R makes two-sample t-tests very easy to compute.
Just call t.test with the two samples.
t.test(CommuteTimes$A1,CommuteTimes$A2)
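Two details are worth knowing when you read the output. First, t.test drops NA values on its own, so no na.rm argument is needed. Second, by default it runs the Welch version of the test with approximate (Welch-Satterthwaite) degrees of freedom, so its p-value may differ slightly from a hand computation that uses a conservative number of degrees of freedom. The sketch below uses made-up vectors (route1 and route2 are hypothetical names) to show how to pull individual pieces out of the result.

```r
# Made-up samples of unequal length, padded with NA as in a combined data frame.
route1 <- c(22, 25, 19, 24, 21, 23)
route2 <- c(30, 28, 27, 31, NA, NA)

result <- t.test(route1, route2)  # Welch test by default (var.equal = FALSE)

result$statistic   # the t statistic
result$parameter   # the approximate degrees of freedom
result$p.value     # two-sided p-value
result$conf.int    # 95% confidence interval for the difference in means
```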
We repeat the t-test, telling it to use a different confidence level.
t.test(CommuteTimes$A1,CommuteTimes$A2, conf.level=.90)
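Lowering the confidence level narrows the interval, since a smaller critical value multiplies the same standard error. A quick sketch with made-up vectors (route1 and route2 are hypothetical names, not columns of the commute data):

```r
# Made-up samples, for illustration only.
route1 <- c(22, 25, 19, 24, 21, 23)
route2 <- c(30, 28, 27, 31)

ci95 <- t.test(route1, route2)$conf.int
ci90 <- t.test(route1, route2, conf.level = 0.90)$conf.int

diff(ci95)  # width of the 95% interval
diff(ci90)  # width of the 90% interval, which is narrower
```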
You should be able to figure out how to do these computations by revisiting the Alex examples from above.
Since we ended up with very different sample sizes (I'm not sure why),
I've put the data into two files, ConvenientSequence.csv
and InconvenientSequence.csv.
Convenient = read.csv("/home/rebelsky/Stats115/Data/ConvenientSequence.csv")
Inconvenient = read.csv("/home/rebelsky/Stats115/Data/InconvenientSequence.csv")
Here are the commands you might use to build four windows with the four separate displays.
library(BHH2, lib.loc="/home/rebelsky/Stats115/Packages")
X11()
boxplot(Convenient,horizontal=TRUE)
X11()
boxplot(Inconvenient,horizontal=TRUE)
X11()
dotPlot(Convenient)
X11()
dotPlot(Inconvenient)
Are we doing a two-sided test or a one-sided test? If it's a one-sided test, what is the direction of the test? Use your answer to figure out which of the following commands to select.
t.test(Convenient,Inconvenient)
t.test(Convenient,Inconvenient, alternative="greater")
t.test(Convenient,Inconvenient, alternative="less")
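To see how the three choices relate, here is a sketch with made-up vectors (x and y are hypothetical, not the sequence data). The alternative is stated relative to the first sample, so "greater" tests whether the mean of the first sample exceeds the mean of the second.

```r
# Made-up samples, for illustration only.
x <- c(12, 15, 11, 14, 13, 16)
y <- c(10, 9, 12, 11, 8)

p_two     <- t.test(x, y)$p.value
p_greater <- t.test(x, y, alternative = "greater")$p.value  # H_a: mean(x) > mean(y)
p_less    <- t.test(x, y, alternative = "less")$p.value     # H_a: mean(x) < mean(y)

c(two.sided = p_two, greater = p_greater, less = p_less)
```

The two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them. Note also that with a one-sided alternative, t.test reports a one-sided confidence bound, so one endpoint of the interval is infinite.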
Add conf.level=.90 to your previous answer to compute
the confidence interval.
Copyright (c) 2007-8 Samuel A. Rebelsky.
This work is licensed under a Creative Commons
Attribution-NonCommercial 2.5 License. To view a copy of this
license, visit http://creativecommons.org/licenses/by-nc/2.5/
or send a letter to Creative Commons, 543 Howard Street, 5th Floor,
San Francisco, California, 94105, USA.