Homework 1
Due Friday, January 19






1) A Bernoulli random variable, X, is one for which there are only two outcomes: 1 or 0.  The probability of observing a 1 is given by p.  That is, P(X = 1) = p, and therefore P(X = 0) = 1-p.
Such a random variable could be used to model the outcome of a single flip of a coin, for example.
a) What is the expected value of X?
b) What is the variance? The standard deviation?

2) A binomial random variable, Y, represents the number of successes in n independent experiments in which the outcome is either "success" (1) or "failure" (0).  The probability of success at each trial is the same: p.  Y can be represented as a sum of n independent Bernoulli random variables.  A classic use of the binomial distribution is to model the number of heads in a fixed number, n, of coin tosses.
a) Derive the expected value for Y.
b) Derive the variance and standard deviation.
c) Toss a fair coin 100 times.  How many heads should one expect?  68% of the time, the number of heads will be between what two numbers, roughly?

3) In xlisp-stat, normal probabilities can be determined using the normal-cdf function, which returns P(Z <= z) where Z is a N(0,1) random variable and z is a value you supply.
To learn how to use this command, type (help 'normal-cdf).  Also check out (apropos 'normal) for a list of all commands that have the word "normal" in them.
a) Find P(-1 < Z < 1), P(-2 < Z < 2), P(-3 < Z < 3).
b) Suppose that the distribution of men's heights (in inches) in the U.S. can be modeled by a N(67, 3) distribution.  Let X represent the height of a randomly selected man from this population.  Find P(X > 70), P(X > 72), P(66 < X < 80).
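
For anyone who wants to check their reasoning outside xlisp-stat, here is a minimal sketch in Python. It is an illustration only, not part of xlisp-stat: NormalDist from Python's statistics module stands in for normal-cdf, and the mean of 100 and SD of 15 are made-up values chosen for the example. It shows how a cdf of the form P(Z <= z) yields upper-tail and interval probabilities, and how a non-standard normal is standardized first.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

p_le = Z.cdf(1.5)                     # P(Z <= 1.5)
p_gt = 1 - Z.cdf(1.5)                 # P(Z > 1.5), by the complement rule
p_between = Z.cdf(1.5) - Z.cdf(0.5)   # P(0.5 < Z < 1.5)

# Hypothetical example: X is normal with mean 100 and SD 15.
# Standardize first, then use the standard normal cdf.
mu, sd = 100, 15
p_x_gt_120 = 1 - Z.cdf((120 - mu) / sd)  # P(X > 120)

print(p_le, p_gt, p_between, p_x_gt_120)
```

The same three patterns (cdf, one minus cdf, difference of two cdfs) carry over directly to normal-cdf.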

4) The command (normal-rand 100) generates a list of 100 observations from a standard normal distribution.  Generate a similar list from each of the following distributions:
a) N(10,1)
b) N(0,5)
c) N(10,5)
Provide histograms of your lists as evidence.  HINT: You will need to assign a name to your lists, else you will get a list of 100 numbers scrolling across your screen.  Try something like
(def x (normal-rand 100))
(histogram x)
Hint #2:  Just to remind you how to do arithmetic operations in xlisp:  (+ 3 4) returns 7, (* 3 4) returns 12, (/ 3 4) returns .75, etc.  (* (+ 3 4) 3) returns 21 (the innermost parentheses are evaluated first).
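
If prefix notation still feels odd, the following sketch may help. It is a toy evaluator written in Python (not xlisp, and not how xlisp is actually implemented); the nested-tuple representation and the name evaluate are invented for this illustration. It mimics the rule that inner expressions are evaluated before the operator that contains them.

```python
import operator

# Map each Lisp-style operator symbol to the Python function that applies it.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(expr):
    """Evaluate a prefix expression written as nested tuples, e.g. ("+", 3, 4)."""
    if not isinstance(expr, tuple):
        return expr  # a bare number evaluates to itself
    op, *args = expr
    values = [evaluate(a) for a in args]  # inner expressions are evaluated first
    result = values[0]
    for v in values[1:]:
        result = OPS[op](result, v)  # apply the operator left to right
    return result

inner = evaluate(("+", 3, 4))             # 3 + 4 = 7
outer = evaluate(("*", ("+", 3, 4), 3))   # (3 + 4) * 3 = 21
quotient = evaluate(("/", 3, 4))          # 3 / 4 = 0.75
print(inner, outer, quotient)
```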

5) Fun Normal Distribution Facts (#1 in a possibly very short series).  Suppose you wanted to graph the standard normal distribution perfectly to scale out to +- 10.  If your graph is one millimeter tall at x = +- 10, how tall must it be at x = 0?
 

6) In xlisp-stat, binomial probabilities can be determined using the binomial-pmf  function and the binomial-cdf function.  The former gives P(X = x) for an x which you supply.  ("pmf" stands for "probability mass function".)  The latter gives P(X <= x) for an x you supply.  To learn how to use these functions, type (help 'binomial-pmf) and (help 'binomial-cdf).   (A useful command is the APROPOS command, which can be used if you can't remember the exact command. For example, (apropos 'binomial) returns a list of all functions with the word "binomial" in them.)
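
To see what a pmf and a cdf compute, here is a minimal Python sketch (an illustration outside xlisp-stat). The function names and the argument order (x, n, p) are my own choices for this example, so check (help 'binomial-pmf) for the order xlisp-stat actually expects. The values n = 4 and p = 0.5 are made up.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial X with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): add up the pmf over 0, 1, ..., x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

p_equal_2 = binomial_pmf(2, 4, 0.5)     # 6/16
p_at_most_2 = binomial_cdf(2, 4, 0.5)   # (1 + 4 + 6)/16
# For a discrete X, P(1 <= X <= 3) = P(X <= 3) - P(X <= 0).
p_1_to_3 = binomial_cdf(3, 4, 0.5) - binomial_cdf(0, 4, 0.5)
print(p_equal_2, p_at_most_2, p_1_to_3)
```

Note the last line: because X takes only integer values, whether an endpoint is included changes the answer, so be careful about which cdf values you subtract.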

According to the Central Limit Theorem, the normal distribution approximates the binomial distribution for "sufficiently large" n.  How large is sufficiently large?  For each distribution below, compute the probability of observing a value within 1 SD of the mean:  P(mu - sigma < X < mu + sigma).  Do so first for the distribution given, and then for the normal approximation.
a) X is binomial, n = 10, p = .1
b) binomial, n = 25, p = .1
c) binomial, n = 50, p = .1
d) binomial, n = 100, p = .1
e) binomial, n = 500, p = .1
f) binomial, n = 1000, p = .1

You can further explore the central limit theorem (and I strongly encourage you to take a look at this) via: http://www.ruf.rice.edu/~lane/stat_sim/index.html
This page lets you choose a "parent" distribution, and then take a random sample, form a statistic, and repeat.  The CLT says that the distribution of a sum (or other linear combination) of n independent RVs is approximately normal, and this approximation improves as n increases.  (The further the parent distribution is from normal, the bigger n will need to be.)  Because a binomial RV is a sum of Bernoulli RVs, the CLT applies.  This web page requires Java 1.1.  The CLT is particularly useful when applied to averages (an average is a sum of observations divided by n).  Compare the sampling distributions of averages with medians, for example.
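
If the applet will not run for you, the same experiment can be sketched in a few lines of Python (an illustration only, not the applet's code). The parent distribution (uniform on 0 to 1), the sample size n = 30, the 2000 repetitions, and the seed are all arbitrary choices made for this sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

n = 30        # sample size for each average
reps = 2000   # number of repeated samples

# Parent distribution: uniform on (0, 1), which has mean 1/2 and variance 1/12.
means = [statistics.fmean([random.random() for _ in range(n)])
         for _ in range(reps)]

mu = 0.5
sigma = (1 / 12 / n) ** 0.5  # SD of an average of n independent draws

# Fraction of simulated averages within 1 SD of the mean.
within_one_sd = sum(mu - sigma < m < mu + sigma for m in means) / reps
print(within_one_sd)  # compare with the normal value of about 0.68
```

Try rerunning with a smaller n, or replacing the average with the median, and see how the fraction changes.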

7) For each of the data sets, summarize in whatever way you think appropriate: (hint: you can summarize by hand, without using the computer)
(Taken from Problem Solving: A Statistician's Guide, Christopher Chatfield, Chapman and Hall.)
a) The marks (out of 100 and ordered by size) of 20 students in a mathematics exam:
30, 35, 37, 40, 40, 49, 51, 54, 54, 55
57, 58, 60, 60, 62, 62, 65, 67, 74, 89

b) The number of days work missed by 20 workers in one year (ordered by size):
0,0,0,0,0,0,0,1,1,1
2,2,3,3,4,5,5,5,8,45

c) The number of issues of a particular monthly magazine read by 20 people in a year:
0,1,11,0,0,0,2,12,0,0
12,1,0,0,0,0,12,0,11,0

d) The height (in meters) of 20 women who are being investigated for a certain medical condition:
1.52, 1.60, 1.57, 1.52, 1.60, 1.75, 1.74, 1.63, 1.55, 1.63
1.65, 1.55, 1.65, 1.60, 1.68, 2.50, 1.52, 1.65, 1.60, 1.65

Do one of (8)-(12):

8) If X is a discrete uniform random variable -- that is, P(X = k) = 1/n for k = 1, 2, ..., n -- find E(X) and Var(X).
9) Let X have the cdf F(x) = 1 - x^(-alpha) for x >= 1  (that last term is x raised to the power of -alpha).
    Find E(X) for those values of alpha for which E(X) exists.  Find Var(X) for those values of alpha for which Var(X) exists.
10) Let X be a discrete RV that takes on values 0,1,2 with probabilities 1/2, 3/8, 1/8, respectively.
    a) Find E(X)
    b) let Y = X^2.  Find the pmf for Y and use it to find E(Y).
12) Let X be a continuous random variable with the density function f(x) = 2x for 0 <= x <= 1.
    a) Find E(X)
    b) Find E(X^2) and Var(X)
 

13) Let X be a continuous random variable with the density function f(x) = ax for 0 <= x <= 5 and for some constant a.
a) Find a
b) Find P(-5 < X < 3)

14) Let X be a random variable that counts the number of events that occur in some time interval.  Suppose these assumptions hold:
i) the time can be divided into small sub-intervals so that the probability of two events happening in any one sub-interval is 0.  (Two events can't happen simultaneously.)
ii) the events in any subinterval are independent.  (An event happening at one time does not influence whether or not an event will happen in any other time interval.)
iii) the probability of an event occurring is the same in each interval.  (The rate at which events occur is constant.  Call this rate lambda.)

Then the probability distribution of X is the Poisson distribution: P(X = k) = (lambda^(k) / k! ) * exp(-lambda) for k = 0,1,2,...
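
The formula can be evaluated directly. Here is a minimal Python sketch (an illustration outside xlisp-stat); the function name poisson_pmf and the rate lambda = 2 are made up for this example. It also checks numerically that the probabilities add up to 1 and that the mean of the distribution equals lambda.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = (lam^k / k!) * exp(-lam) for k = 0, 1, 2, ..."""
    return lam**k / factorial(k) * exp(-lam)

lam = 2.0  # hypothetical rate, chosen only for this example

p0 = poisson_pmf(0, lam)  # equals exp(-lam), since lam^0 / 0! = 1

# Summing over k = 0..59 captures essentially all of the probability.
total = sum(poisson_pmf(k, lam) for k in range(60))
mean = sum(k * poisson_pmf(k, lam) for k in range(60))
print(p0, total, mean)
```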

A study in 1898 found that Prussian cavalry officers were kicked to death at the rate of 0.61 per year.

a) Explain why the assumptions hold and this might be a good model.
b) Find P(X = 0), P(X = 3), P(X = 10)
c) Suppose we're observing cars passing beneath an overpass on a one-lane road.  If we let X represent the number of cars that pass through in an hour, would the Poisson distribution be a good model?  Explain.  What if X represented the number of cars in 24 hours?  Would the model hold in heavy traffic?