#INTRODUCTION: 
#A quick reminder on how to use the sample function.  We used it in lab 5!

#Sample from a fair die.  Roll the die 10 times.
x <- c(1,2,3,4,5,6)  
sample(x, 10, replace=TRUE)  #This assumes the die is fair!


#This time the die is not fair.  Again roll the die 10 times.
x <- c(1,2,3,4,5,6)  
sample(x, 10, prob=c(0.1, 0.2, 0.3, 0.1, 0.1, 0.2), replace=TRUE)  #Not a fair die.

#This is the distribution we assume here:
X  P(X=x)
1  0.1
2  0.2
3  0.3
4  0.1
5  0.1
6  0.2

#Use the commands above to become familiar with the "sample" function.
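
#One way to check that "sample" respects the prob argument is to roll the unfair die many times
#and compare the observed relative frequencies with the assumed distribution.  This is only a sketch:
#the seed, the number of rolls (10000), and the names p, rolls, freq are choices made for this illustration.

```r
set.seed(1)  #Illustrative seed, only so that the run can be repeated.
x <- c(1,2,3,4,5,6)
p <- c(0.1, 0.2, 0.3, 0.1, 0.1, 0.2)  #The assumed distribution of the unfair die.

#Roll the unfair die 10000 times.
rolls <- sample(x, 10000, prob=p, replace=TRUE)

#Relative frequency of each face (factor() keeps all six faces in the table).
freq <- table(factor(rolls, levels=x)) / length(rolls)

#Assumed probabilities next to the observed relative frequencies.
round(rbind(assumed=p, observed=as.numeric(freq)), 3)
```

#The two rows should be close to each other, and they get closer as the number of rolls increases.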

==================================================================================
#PART A:
#Sample from two fair dice and record the sum.  The distribution of the sum of two fair dice was discussed in class.  Please see handout #32, page 5.
#Create the sums (2 to 12)
x <- seq(2, 12, 1)

#Type x and press enter to see what you get.

#Take a sample of size n=100 from this distribution once.
s1 <- sample(x, 100, prob=c(seq(1,6,1)/36, seq(5,1,-1)/36), replace=TRUE)

#Compute the sample mean and sample standard deviation of these 100 sums:
mean(s1)
sd(s1)
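
#For comparison, the population mean and standard deviation of the sum can be computed directly
#from the distribution.  A short sketch (the names p, mu, sigma are introduced here for illustration):

```r
x <- seq(2, 12, 1)
p <- c(seq(1,6,1)/36, seq(5,1,-1)/36)  #P(X=x) for the sums 2, 3, ..., 12.

mu <- sum(x * p)                    #Population mean: sum of x * P(X=x).
sigma <- sqrt(sum((x - mu)^2 * p))  #Population standard deviation.

mu
sigma
```

#Here mu is 7 and sigma is sqrt(35/6), about 2.415.  The values of mean(s1) and sd(s1) should be near these but will not match them exactly.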


#What to do:
1.  Generate 100 samples each of size 100.
2.  Compute the sample mean and sample standard deviation for each sample.
3.  Use the t distribution to construct 100 confidence intervals, each with a 95% confidence level.  Note:  The degrees of freedom here are 100-1=99.
4.  How many confidence intervals do we expect to miss the true mean?  How many confidence intervals actually missed the true mean, mu=7?
5.  Compute the length of each interval.
6.  Create a data frame with the following columns:  m, s, ci_left, ci_right.
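
#One possible way to carry out steps 1-6 is sketched below.  It is not the only approach.
#Assumptions made for this illustration: the seed, the use of replicate() and apply(),
#and the names n, k, samples, tcrit, missed, len, ci.

```r
set.seed(10)  #Illustrative seed, only so that the run can be repeated.
x <- seq(2, 12, 1)
p <- c(seq(1,6,1)/36, seq(5,1,-1)/36)
n <- 100  #Size of each sample.
k <- 100  #Number of samples.
mu <- 7   #True mean of the sum of two fair dice.

#Step 1: Each column of this n x k matrix is one sample of size n.
samples <- replicate(k, sample(x, n, prob=p, replace=TRUE))

#Step 2: Sample mean and sample standard deviation of each column.
m <- apply(samples, 2, mean)
s <- apply(samples, 2, sd)

#Step 3: 95% confidence intervals based on the t distribution with n-1 degrees of freedom.
tcrit <- qt(0.975, df=n-1)
ci_left <- m - tcrit * s / sqrt(n)
ci_right <- m + tcrit * s / sqrt(n)

#Step 4: Count the intervals that do not contain mu.
missed <- sum(ci_left > mu | ci_right < mu)
missed

#Step 5: Length of each interval.
len <- ci_right - ci_left

#Step 6: Collect the results in a data frame.
ci <- data.frame(m, s, ci_left, ci_right)
head(ci)
```

#With a 95% confidence level we expect about 5 of the 100 intervals to miss mu.  The actual count changes from run to run.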
==================================================================================
#PART B:
Using the same commands as in part (A) answer questions 1-6 on the following real data set:

a <- read.table("http://www.stat.ucla.edu/~nchristo/statistics10/soil.txt",header=TRUE)

Assume the mean of the variable lead is mu=153 ppm.
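
#A sketch of how one sample and one interval could be obtained when sampling from a data column
#instead of a known distribution.  Assumptions: the vector v below is a synthetic placeholder
#(NOT the soil data) so that the sketch runs without the file; with the real data one would use
#v <- a$lead.  Sampling with replacement, as in part (A), is also an assumption.

```r
set.seed(5)  #Illustrative seed, only so that the run can be repeated.

#Synthetic placeholder values, NOT the soil data.  With the real data use: v <- a$lead
v <- rlnorm(150, meanlog=5, sdlog=0.6)

#One sample of size 100 from the data values (assumption: with replacement).
s1 <- sample(v, 100, replace=TRUE)

#95% confidence interval computed by hand with the t distribution.
n <- length(s1)
manual <- mean(s1) + c(-1, 1) * qt(0.975, df=n-1) * sd(s1) / sqrt(n)

#The same interval from the built-in t.test function, as a check.
builtin <- t.test(s1, conf.level=0.95)$conf.int

manual
builtin
```

#The two intervals should agree.  To answer questions 1-6, repeat this for 100 samples and compare each interval with mu=153.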