Statistics M11/Economics M40 Lab 3: Probability Exercises and Distributions

DUE FEBRUARY 16, 2001

Purpose: The purpose of this lab is to become comfortable with random processes using Stata.

Data: There is no data for this assignment. You will generate your own data. Unfortunately, you will need to work either on a computer in the Statistics Lab (which allows you to save things to its hard drive) or on your own home computer.

Introduction: In Chapter 4, you are introduced to the concepts of randomness and chance in statistics. In this lab, you will see illustrations of probability and construct probability distributions using simulations.

Start up Stata and issue the command:

heads2 1000 .5

If the program "heads2" is installed on your computer, a graph should appear and it should look something like Figure 4.1 in your textbook (page 291). Ask yourself "what is this graphic trying to teach me?"

This is a graph of the cumulative (total) proportion of tosses that land "heads" when a fair coin is tossed 1,000 times.
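If you want to see the same idea outside Stata, here is a minimal Python sketch of a running proportion of heads. It is not the heads2 program (I am not showing how heads2 works internally); the function name cumulative_proportions and the seed value are made up for illustration.

```python
import random

def cumulative_proportions(n_tosses, p_heads, seed=None):
    """Return the running proportion of heads after each toss."""
    rng = random.Random(seed)  # seed is optional; it only makes the run repeatable
    heads_so_far = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        if rng.random() < p_heads:  # this toss lands heads with probability p_heads
            heads_so_far += 1
        proportions.append(heads_so_far / toss)
    return proportions

# Hypothetical run: 1,000 tosses of a fair coin
props = cumulative_proportions(1000, 0.5, seed=2001)
for toss in (10, 100, 1000):
    print(f"after {toss:4d} tosses: proportion of heads = {props[toss - 1]:.3f}")
```

The early proportions can wander far from 0.5, while the later ones should settle close to it. That settling is what the graph is trying to show.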

If "heads2" is not working on your computer and you are either in the Statistics Lab or at home, follow these instructions:

Click on the word "HELP" on the menu bar

Click on the line that reads "STB and User-Written Programs"

Click on the colored letters which read: http://www.stata.com

Click on the colored letters which read users

Click on the colored letters which read ucla

Click on the colored letters which read heads2 (you will need to scroll down a few pages to find it; it's in the far left-hand column of words)

Click on the colored letters which read (click here to install)

When Stata is finished installing, it will read something like "click here to return to the previous page". At this point, close the help window and return to your Stata commands.

DO NOT TRY THE PREVIOUS 8 INSTRUCTIONS IN THE CLICC LABS. THEY WILL NOT WORK.

ASSIGNMENT

After you issue the command:

heads2 1000 .5

four variables are created (see your variable window). We're only interested in the variable heads. I am going to have you issue the command "tabulate heads", but before you do, answer this question:

(a) Question: If I toss a coin 1,000 times, how many heads do I expect or what percentage do I expect to be heads?

Hopefully you wrote 500 or 50%. Now issue the command:

tabulate heads

For the variable heads, 1=heads 0=tails. Note how many heads you got (you will not get the same as your classmates) and issue this command:

heads2 10 .5, nograph

This does the same thing as before, but no graph is produced. I want you to issue the command "tabulate heads" again, but before you do, answer this question:

(b) Question: If I toss a coin 10 times, how many heads do I expect or what percentage do I expect to be heads?

Hopefully you wrote 5 or 50%. Now issue the command:

tabulate heads

Note how many heads you got (you will not get the same as your classmates). Issue this command:

heads2 100 .5 10, nograph

This command simulates 100 people each tossing a fair coin 10 times. The number of heads each person gets is stored in the variable heads.

Now issue the command:

tabulate heads

What you will see is a distribution of the count of heads. That is, of the 100 people who each tossed a coin 10 times, some got only 1 head, some got 2, some got 3, some got 5, some got 10, etc.
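For a cross-check outside Stata, the following Python sketch mimics the idea of 100 people each tossing a fair coin 10 times and then tabulating the number of heads. It is only an illustration under my own assumptions, not the heads2 code; the function name simulate_head_counts and the seed are hypothetical.

```python
import random
from collections import Counter

def simulate_head_counts(n_people, p_heads, n_tosses, seed=None):
    """For each person, count the heads in n_tosses tosses of the coin."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_heads for _ in range(n_tosses))
        for _ in range(n_people)
    ]

# Hypothetical run: 100 people, fair coin, 10 tosses each
counts = simulate_head_counts(100, 0.5, 10, seed=2001)
table = Counter(counts)  # maps number of heads -> number of people

print("heads  people  percent")
for k in range(11):
    print(f"{k:5d}  {table[k]:6d}  {100 * table[k] / len(counts):6.1f}")
```

The printed table plays the same role as the output of tabulate heads: one row per possible count of heads, with the number and percentage of people who got that count.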

(c) Question: What percentage of the 100 people got exactly 5 heads? What percentage got exactly 4 heads or exactly 6 heads? What percentage got 3 or 7? What percentage got 2 or 8? What percentage got 1 or 9 heads? What percentage got 0 heads or all 10 heads?

And before you go on to the next part, issue the command

graph heads, bin(11) xlab t1("Simulation of 100 people tossing coins") normal

(d) Print out this graphic. Question: Does it look at all familiar?

This is going to seem boring, but there is something to be learned here. I want you to modify the commands slightly to perform 3 simulations:

heads2 1000 .5 10, nograph

tabulate heads

graph heads, bin(11) xlab t1("Simulation of 1000 people tossing coins") normal

heads2 10000 .5 10, nograph

tabulate heads

graph heads, bin(11) xlab t1("Simulation: 10000 people tossing coins") normal

heads2 100000 .5 10, nograph

tabulate heads

graph heads, bin(11) xlab t1("Simulation: 100000 people tossing coins") normal

The last one might take a while. Make sure "normal" is on the same line as the word graph.

(a) For each of the simulations, please print out the histogram of results and include the percentages for:

Exactly 5 heads

Exactly 4 heads or Exactly 6 heads

Exactly 3 heads or Exactly 7 heads

Exactly 2 heads or Exactly 8 heads

Exactly 1 head or Exactly 9 heads

Exactly 0 heads or Exactly 10 heads

(b) Compare these percentages to the set of numbers on Page T-9 (back of your book). Check the rightmost column, the one titled ".50", and look at the set starting with the number .0010. This is the theoretical probability distribution for the number of heads in 10 tosses of a fair coin. How does your final table, involving 100,000 "samples", compare?
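If you do not have the book handy, the theoretical numbers can be reproduced from the binomial formula, P(k heads) = C(10, k) x (.5)^k x (.5)^(10-k). Here is a short Python sketch, assuming the table column is the binomial distribution with n = 10 and p = .50; the function name binomial_pmf is my own.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n independent tosses with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Same groupings as the percentages requested in the lab
print(f"exactly 5 heads:       {pmf[5]:.4f}")
for low in range(4, -1, -1):
    high = n - low
    print(f"exactly {low} or {high:2d} heads: {pmf[low] + pmf[high]:.4f}")
```

The probability of 0 heads is 1/1024, which rounds to .0010, the first number in the set you are asked to look at. Multiply each probability by 100 to compare it with the percentages from tabulate.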

I've been using the example of how "da market" behaves in a random 3-day period. The market, in the lecture examples, behaves like a biased coin, being "up" 60% of the time and "not up" 40% of the time. We can use the heads2 program to see how our theoretical discrete probabilities match up with a simulation.

Issue the command:

heads2 1 .6 3, nograph

tabulate heads

You would interpret the result as: in a single 3-day trading period, there were (result of the tabulate) up days.

To see how the market might behave in a little more than a year:

heads2 100 .6 3, nograph

tabulate heads

And to see how the market might behave over a very long period of time:

heads2 100000 .6 3, nograph

tabulate heads

Now issue the summarize command and tell me the mean and standard deviation of the number of "up days" in 3 days:

summarize heads

(a) Question: How does your simulated result compare with the theoretical result calculated during lecture? Is it very similar or very different?
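As a rough check on your comparison, here is a Python sketch that computes the mean and standard deviation both ways. It assumes the three days are independent with a 60% chance of "up" each day (a binomial model with n = 3 and p = .6), so that the mean is n x p and the standard deviation is the square root of n x p x (1 - p). If the lecture calculation used a different setup, adjust accordingly. The seed and variable names are made up for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

n_days, p_up = 3, 0.6

# Theoretical values under the binomial assumption
theory_mean = n_days * p_up
theory_sd = sqrt(n_days * p_up * (1 - p_up))

# Simulate 100,000 separate 3-day periods and count the up days in each
rng = random.Random(2001)
up_days = [
    sum(rng.random() < p_up for _ in range(n_days))
    for _ in range(100_000)
]

print(f"theoretical mean = {theory_mean:.4f}, sd = {theory_sd:.4f}")
print(f"simulated   mean = {mean(up_days):.4f}, sd = {stdev(up_days):.4f}")
```

With this many simulated periods, the two lines should agree to about two decimal places, although your exact simulated numbers will depend on the random draws.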

You should start thinking about any single 3-day period as a "single sample" and the theoretical probability distribution as the population of "all possible samples".
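To make "all possible samples" concrete, this Python sketch lists every possible 3-day sequence of "up" and "not up" days and adds up the probabilities by number of up days. It assumes the days are independent with a 60% chance of "up" each day; the variable names are my own.

```python
from itertools import product
from collections import defaultdict

p_up = 0.6
distribution = defaultdict(float)  # number of up days -> total probability

# Enumerate all 2 x 2 x 2 = 8 possible 3-day sequences
for days in product(("up", "not up"), repeat=3):
    prob = 1.0
    for day in days:
        prob *= p_up if day == "up" else 1 - p_up
    distribution[days.count("up")] += prob

for k in sorted(distribution):
    print(f"{k} up days: probability {distribution[k]:.3f}")
```

Each simulated 3-day period from heads2 is one draw from this population of eight sequences, which is why the tabulate percentages approach these probabilities as the number of simulated periods grows.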

RECAP

Check to make sure the program "heads2" works on the computer you are working on. It works in the Statistics Lab and on your home computer (if you have Stata and can do an installation as instructed in the lab above).
Find all of the "bold" sections of this lab involving questions and printouts. Answer the questions and provide all of the supporting graphs and tables.
Staple your assignment together. Put your name and your section identification (e.g., TA's name or section) on it and turn it in on or before February 16, 2001.