A basic rule of thumb for investors in the stock market is to "diversify"; that is, to spread one's money across stocks that are likely to behave differently in response to various market conditions. Risk to the investor is reduced because, under a given set of circumstances, some stocks in the portfolio will rise while others fall. How can one determine which stocks are similar and which are not for the purpose of diversification?

The data provided are daily stock prices from January 1988 through October 1991, for ten aerospace companies. Given this information, the first step toward answering the question posed above is to reformulate the question in terms of these data. For example, two stocks may be considered similar if they maintain approximately the same level, vary to a similar degree, or tend to move up and down in related ways over some relevant time period. An initial analysis might use some graphical techniques to examine these aspects of the data.

Make histograms of these price series.
What information is lost in converting the raw data into histograms? What is gained?
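
As a starting point for thinking about what a histogram discards, the following minimal sketch bins two short price series by hand. The prices, the bin width of 5 dollars, and the helper name `histogram_counts` are all hypothetical and are not taken from the aerospace data.

```python
from collections import Counter


def histogram_counts(prices, bin_width):
    """Count how many prices fall in each bin [k*w, (k+1)*w).

    Returns a dict mapping the left edge of each non-empty bin to its count.
    """
    counts = Counter(int(p // bin_width) for p in prices)
    return {k * bin_width: counts[k] for k in sorted(counts)}


# Hypothetical prices (NOT the aerospace data): one series rises steadily,
# the other is the same set of prices in reverse order.
rising = [20.0, 21.5, 23.0, 24.5, 26.0, 27.5, 29.0]
falling = list(reversed(rising))

print(histogram_counts(rising, 5))   # {20: 4, 25: 3}
print(histogram_counts(rising, 5) == histogram_counts(falling, 5))  # True
```

The two series move in opposite directions over time, yet their bin counts are identical, which is one way to approach the question of what is lost.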

## Time Plots

Another simple tool for comparing price series over time is the univariate time plot. Plot stock price against day for each of the ten companies for which a price series is provided. Are the y-axis scales the same for all plots? What advantages are there in making all scales the same? What are the disadvantages? Look at the overall shapes of the plots.
Can you group the companies according to the shapes?
Are these groupings a sensible answer to the question posed above concerning similarity, or should one also consider the level of activity?
That is, given two graphs with roughly the same shape, would you consider them similar even if one averaged about 20 dollars and the other about 65? What about variability? How can you assess variability in these graphs? Would a great difference in variability be enough for you to place two otherwise similar stocks in different groups?
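
The trade-off between a common y-axis and per-plot scales can be explored numerically before any plotting is done. The sketch below is one possible approach, not a prescribed method: `shared_ylim` finds a single range covering every series, while `rescale_to_unit` removes level and spread so that only shape remains. Both function names and the two short series are hypothetical.

```python
def shared_ylim(series_by_company):
    """Return one (low, high) range that covers every price series."""
    all_prices = [p for prices in series_by_company.values() for p in prices]
    return min(all_prices), max(all_prices)


def rescale_to_unit(prices):
    """Map a series onto [0, 1] so that only its shape is retained."""
    low, high = min(prices), max(prices)
    if high == low:
        # A constant series has no shape to speak of; return a flat line.
        return [0.0 for _ in prices]
    return [(p - low) / (high - low) for p in prices]


# Hypothetical series (NOT the aerospace data): same movements,
# one trading near 20 dollars and the other near 65.
series = {
    "A": [20.0, 21.0, 22.0, 21.0, 23.0],
    "B": [65.0, 66.0, 67.0, 66.0, 68.0],
}

ylim = shared_ylim(series)
shapes = {name: rescale_to_unit(p) for name, p in series.items()}

print(ylim)                        # (20.0, 68.0)
print(shapes["A"] == shapes["B"])  # True
```

On the shared range, each series occupies only a thin band of the plot, so level differences are obvious but shape is hard to see; after rescaling, the shapes coincide but all information about level and variability is gone.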

## Descriptive Statistics

It might also be useful to have one or two numbers that capture relevant characteristics of a stock's behavior. Mean and variance are two descriptive statistics often used to summarize data. Compute the means of stock prices for Companies A through J. Which company has the highest mean price? The lowest? Find the means on the histograms. Does this mean that the company with the higher mean is a better investment than the company with the lower mean? Describe the histograms of the companies with the highest and lowest means.
What is different? What is the same?
Just by looking at the histogram, which company's stock looks more variable?
What does variability mean in the context of stock prices?
Two possible measures of variability are the variance and the interquartile range. Compute the variance and interquartile range for each company. Which is a better measure of variability, thinking of variability as risk? Do these two measures tell the same story about these two stocks?
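
The statistics named above can be computed with Python's standard `statistics` module, as in the rough sketch below. The helper `summarize` and both price series are hypothetical; the second series differs from the first by a single extreme day, chosen only to show that the two measures of variability can react differently. Note that `statistics.variance` is the sample variance and that `statistics.quantiles` uses its default "exclusive" method here; other quartile conventions give slightly different interquartile ranges.

```python
import statistics


def summarize(prices):
    """Return mean, sample variance, and interquartile range of a series."""
    q1, _median, q3 = statistics.quantiles(prices, n=4)
    return {
        "mean": statistics.mean(prices),
        "variance": statistics.variance(prices),  # divides by n - 1
        "iqr": q3 - q1,
    }


# Hypothetical prices (NOT the aerospace data).
calm = [30.0, 31.0, 29.0, 30.0, 32.0, 30.0, 31.0, 29.0]
# Same series, except that the 32-dollar day is replaced by a 60-dollar spike.
spiky = [30.0, 31.0, 29.0, 30.0, 60.0, 30.0, 31.0, 29.0]

for name, prices in {"calm": calm, "spiky": spiky}.items():
    print(name, summarize(prices))
```

In this made-up example the single spike inflates the variance by roughly a factor of one hundred while leaving the interquartile range unchanged. Whether that sensitivity is desirable depends on whether rare large moves are part of what one means by risk.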
