1.      Introduction to Curves

A density curve is a curve that can be used to represent a large number of observations. Sometimes it is easier to work with curves than with histograms. A density curve:

a. Is always on or above the horizontal axis.

b. Has area exactly equal to 1 (or 100% when expressed as a percentage) under it. The areas under the curve represent proportions of observations.

The median of a density curve is the point with 50% of the observations (or .5 when expressed as a proportion) on either side. The mean is like the balancing point of the curve.

Notation: x-bar stands for the mean of a list of numbers. Here, the Greek letter mu (μ) stands for the mean of a density curve. Likewise, s stands for the standard deviation of a list of numbers; here, the lower-case Greek letter sigma (σ) stands for the standard deviation of a density curve.

2.      The Standard Normal Distribution (pages T-2 and T-3)

Properties

a. Symmetric, bell-shaped

b. Mean= 0, SD= 1

c. The median is where 50% (half) of the observations are on either side. In this distribution, the mean is equal to the median.

d. Area under the curve is equal to 1 (or 100% when expressed as a percentage). Areas under the curve represent proportions of the observations.

e. 68%-95%-almost 100% rule:

About 68% of the area (and therefore of the observations) falls within plus or minus 1 SD of the mean. About 95% falls within plus or minus 2 SD of the mean, and nearly 100% (99.7%) falls within plus or minus 3 SD of the mean.

f. Never touches the x-axis, so in theory it is possible to have extreme observations in a normal distribution.
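
The 68%-95%-almost 100% rule in item e can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard library; the name area_within is made up here for illustration and is not from the book.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1 (property b above).
Z = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve within plus or minus k SD of the mean.

    area_within is a hypothetical helper name used only for this sketch.
    """
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```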

3.      Using a standard normal table (inside front cover of your book or pages T-2 and T-3)

First step: Draw a picture...

            (i) Draw the x-axis. These are the STANDARDIZED SCORES or Z-SCORES.

            (ii) Draw a normal curve above it. The area under the curve gives PERCENTAGES.

            (iii) Shade in the area that you are looking for (or know).

Then solve for the area you want in terms of "left-hand areas," since the table gives the area to the left of each z-score.  Examples.
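
Working in terms of left-hand areas can be sketched in code as well. This is only an illustration, assuming Python 3.8 or later; the function names left_area, right_area and area_between are hypothetical, and the z-values are the ones used as examples in these notes.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

def left_area(z):
    """Area to the left of z (what a left-hand-area table reports)."""
    return Z.cdf(z)

def right_area(z):
    """Area to the right of z: total area 1 minus the left-hand area."""
    return 1 - left_area(z)

def area_between(a, b):
    """Area between a and b (a < b): difference of two left-hand areas."""
    return left_area(b) - left_area(a)

print(f"{left_area(1.31):.4f}")            # 0.9049
print(f"{right_area(1.31):.4f}")           # 0.0951
print(f"{area_between(-0.57, 1.31):.4f}")  # 0.6206
```

Every area you might shade in the picture reduces to one or two table lookups this way.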

 

4.      Working with standard units

 

A score z is in STANDARD UNITS if it tells how many SD's the original score is above or below the average. For example, if z = 1.31, then the original score was 1.31 SD's above average; if z = -0.57, then the original score was 0.57 SD's BELOW average.

 

A formula for converting any value into a standard score z:

 

              z = (value of interest - average of all other values) / (standard deviation of the other values)
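
The conversion into standard units, and back again, can be written as a pair of small functions. This is a minimal sketch in Python; the names standard_units and from_standard_units are made up for illustration, and the numbers are taken from the SAT example later in these notes (average 500, SD 100).

```python
def standard_units(value, average, sd):
    """Convert a raw value to a z-score: how many SD's it is from the average."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (value - average) / sd

def from_standard_units(z, average, sd):
    """Convert a z-score back to the original units."""
    return average + z * sd

z = standard_units(650, average=500, sd=100)
print(z)                                 # 1.5
print(from_standard_units(z, 500, 100))  # 650.0
```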

 

Examples of standard units from the S & P 500, comparing  Merck & Co., Inc. with the William Wrigley Jr. Co.

 

5. Why Bother with Standard Units?

 

Standard units allow quick comparisons across variables with different units of measure. Z-scores (standard units) have no units of their own; everything you convert is standardized. For example, if it mattered to you in a stock screening to own shares in a company that is as high above average on Earnings Per Share (EPS) as it is on the Percentage Change in the last 13 weeks, you would like neither Merck nor Wrigley, but might choose Maytag Corporation, whose EPS had z = +.81 and whose Percentage Change in the last 13 weeks had z = +1.20.

 

Not that I am suggesting that you invest solely on the basis of these variables.  But the normal table is a tool that can be used with a variety of datasets especially when you are working with variables with different units.  It provides a single measurement scale for variables that are approximately normal.

 

You should not attempt normal calculations on non-normal variables.

 

6. Assessing Normality

 

A.  Common sense: if the normal curve implies nonsense results (for example, that people have negative incomes, or that some women have a negative number of children), the normal curve doesn't apply and using the normal curve will give the wrong answer.

     

Example: in 1980, the average number of children born per woman was 1.95, with an SD of 1.91. Does the normal curve apply? Try calculating how many children a woman would have if she is 2 standard deviations BELOW the mean.
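
The common-sense check in this example can be worked out directly. The sketch below assumes Python 3.8 or later and uses only the two figures given above (average 1.95, SD 1.91); the variable names are made up for illustration.

```python
from statistics import NormalDist

average = 1.95  # average number of children per woman, from the example
sd = 1.91       # SD, from the example

# A woman 2 SD's below the mean would have a negative number of children.
two_sd_below = average - 2 * sd
print(f"{two_sd_below:.2f}")  # -1.87

# If the normal curve applied, this would be the share of women
# with fewer than zero children (roughly 15%), which is nonsense.
share_below_zero = NormalDist(mu=average, sigma=sd).cdf(0)
print(f"{share_below_zero:.2f}")  # 0.15
```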

 

B. Construct  a histogram: if the data look like a normal curve, the normal curve probably applies; otherwise, it does not.  If you are interested, in Stata, you can issue the command:

 

graph variablename, normal

 

and Stata will draw a normal curve over your histogram so you can compare. (In Stata 8 and later, the equivalent command is: histogram variablename, normal)

 

C. Do the data fall in a 68-95-99.7% pattern?  If yes, the data are probably close to normal.  You can examine any numeric variable in Stata by issuing the command

 

pnorm variablename

 

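and Stata will draw a normal probability plot: the variable is graphed against a straight line that represents what a normal distribution (and hence the 68-95-99.7 pattern) would produce.  Deviations from the line suggest deviations from normality.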
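
Outside Stata, the 68-95-99.7 pattern can also be checked by counting. This is a rough sketch in Python, not a substitute for the plot; share_within is a hypothetical helper name, and the data here are simulated purely for illustration.

```python
import random
from statistics import mean, stdev

def share_within(data, k):
    """Proportion of the data within plus or minus k SD's of its own average."""
    avg = mean(data)
    sd = stdev(data)
    inside = [x for x in data if abs(x - avg) <= k * sd]
    return len(inside) / len(data)

# Simulated, roughly normal data (an assumption for this sketch only).
random.seed(0)
sample = [random.gauss(0, 1) for _ in range(10_000)]

for k in (1, 2, 3):
    # For normal data, expect values near 0.68, 0.95 and 0.997.
    print(k, round(share_within(sample, k), 3))
```

Shares far from 68%, 95% and 99.7% suggest the variable is not approximately normal.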

 

7.  More about using the standard normal curve

 

If the data are normally distributed, then raw scores can be converted into standard units to find percentages; also, percentages can be converted into standard units and then converted back into raw scores (original numbers).

 

Examples

            a. SAT math scores are normally distributed with a mean of 500 and an SD of 100. What percentile rank is a math score of 650?

                        (first step: draw a picture ... want an area)

                        (next step: convert to standard units; z = 1.50)

                        (final step: look up the area; rank is the 93.3rd percentile)

            b. What SAT math score defines the top 10%?

                        (first step: draw a picture ... want an original score)

                        (next step: using the area, find the standardized score: z=1.28)

                        (final step: convert back from standard units; score is 628)
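
Both SAT examples can be reproduced without the table. The following is a minimal sketch, assuming Python 3.8 or later; the variable names are made up for illustration, and the results agree with the table-based answers above up to rounding.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)  # SAT math: mean 500, SD 100

# Example a: raw score -> standard units -> left-hand area (percentile rank).
z_650 = (650 - 500) / 100
percentile_650 = sat.cdf(650)
print(z_650)                     # 1.5
print(f"{percentile_650:.3f}")   # 0.933

# Example b: area -> standard units -> raw score.
# The top 10% starts where the left-hand area is 0.90.
z_top10 = NormalDist().inv_cdf(0.90)
cutoff = 500 + z_top10 * 100
print(f"{z_top10:.2f}")          # 1.28
print(f"{cutoff:.0f}")           # 628
```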

 

 

INSTABROKER!, an internet brokerage, claims that its average trade execution time is 59 seconds. It also claims that the standard deviation is 15 seconds. Suppose the execution times are normally distributed. What is the chance of a trade requiring more than 90 seconds?

           

To answer this, you might draw a curve then fill in the information, then calculate a Z score.

 

            z = (90 - 59) / 15 = 2.07

 

Your chance of having to wait more than 90 seconds is 1 - .9808 from the table, or .0192, or about 1.9%.  In 10,000 trades, about 192 would take longer than 90 seconds.