Statistical Analysis of Brink's Data


Stanley S. Bentow and David R. Afshartous

UCLA Statistics


Introduction

In March of 1978, Brink's Inc. was awarded the contract to collect coins from approximately 70,000 parking meters in the city of New York. In response to an anonymous tip that not all money collected by Brink's was being delivered to the city's finance depository, the city began an investigation of parking meter collections. Through the use of "salted" coins (coins treated with a fluorescent substance and inserted into specific meters) and surveillance cameras, five Brink's employees were arrested and subsequently convicted of grand larceny and criminal possession of stolen property. When they were arrested, they had in their possession $4,500 in coins stolen that day from parking meter collections.

As a result, a civil suit was filed by the City of New York alleging that Brink's had failed to honor its contract and acted negligently. The city was seeking monetary compensation from Brink's for losses incurred by the criminal activities of its employees. Brink's was subsequently found guilty of negligence and breach of contract. The question remained as to what the actual dollar amount of the damages was.

The purpose of this paper is to examine the statistical methods by which the City of New York's expert witness arrived at the amount of damages, and to contrast them with the criticisms raised by the defense's expert witness. We suggest an alternative approach as well.

The City's Case

William Fairley was hired to help the city's attorneys determine the amount of money which Brink's had failed to collect over the entire term of its contract (May '78 to Mar '80). Although the exact monetary loss is impossible to determine, the law allows for the introduction of testimony regarding the estimation of such loss as a ``...matter of just and reasonable inference.'' Brink's' contract was terminated in March of 1980 and a new contract was awarded to CDC, Inc. in June of 1980. Fairley noted that the first ten months of collection by CDC exceeded every ten-month period collected by Brink's by $1,000,000. He decided to compare the last ten months of Brink's (Jun '79 to Mar '80) to the first ten months of CDC (Jun '80 to Mar '81). He and the city's attorneys considered this comparison reasonable because the introduction of very strict oversight during the CDC period assured the absence of theft. Thus, he planned to argue that the difference in revenue between the two periods was due to theft. Fairley noted that there exist two threats to any causal interpretation of the difference in revenue collected over the two periods.
  1. General Time Effects of Trend or Seasonality: The increase in revenue during the CDC period may have been the result of an ongoing trend, i.e., if Brink's' contract had not been terminated, Brink's too would have experienced an increase in revenue. Alternatively, the increase may have been due to a seasonal effect, i.e., the months which exhibited an increase may have been months which ordinarily exhibit higher than average revenue collection.
  2. Specific Causes: There may have been other factors contributing to the rise in revenue during the CDC period, so that the observed difference in revenue collection was due to these factors rather than to theft.
Fairley addressed these threats to causal interpretation by doing the following:
  1. Investigate the possible existence of general time factors such as trend and seasonality.
  2. Consider whether other causes aside from theft could explain the observed difference.

Trend and Seasonality

Fairley first considered whether trend or seasonal effects would discredit a causal attribution of the observed revenue difference to theft. He controlled for seasonal confounding by choosing the exact same calendar months for comparing the Brink's and CDC 10-month periods. (However, he also noted that seasonal adjustment was used when comparing the months closest to the transition from Brink's to CDC.)

In order to investigate the existence of any trend, Fairley fit piecewise regressions to the data over the 22 month period. As Fairley himself points out, piecewise regression is attractive because it estimates changes in level (intercept shift) and trend (slope shift) over the 22 month period, while a single regression line does not. He used seasonally adjusted monthly average revenue per meter day as the dependent variable and month as the independent variable (the data are seasonally adjusted by dividing the city average for the month by the predicted revenue for the month, and then multiplying this by the period average for the city). It seems somewhat strange that he seasonally adjusts the data given his previous comments about how choosing the same 10 calendar months over each period made such adjustment unnecessary. Fairley argued that although there exists no evidence for a general upward trend (the main point of contention for the defense, see next section), there does exist significant evidence of a change in level (intercept shift) over the two periods.

Since we don't have access to the results of this regression, we estimated similar models ourselves. We fit many models, for both unadjusted and seasonally adjusted data. We formed averages as revenue per collection day, while Fairley forms averages as revenue per meter day. That is, we compute the monthly average by dividing monthly revenue by the total number of times Brink's collected for the month, while Fairley divides the monthly revenue by the number of days of the month that parking meters were in effect. Since we have neither his regression results nor the data concerning collection days, we are unable to examine whether or not these different procedures produce different results. For most of the models, while there exists strong evidence for an intercept shift, there isn't much evidence for a slope shift.
However, upon deleting the last CDC observation (which has a Cook's distance twice as big as any other observation), the following regression results are obtained.

Model: MODEL1
Dependent Variable: Total Monthly Revenue (including area 1A)

Analysis of Variance

                   Sum of         Mean
Source    DF      Squares       Square      F Value    Prob>F

Model      3  72094721798  24031573933        2.852    0.0725
Error     15 126391276555 8426085103.7
C Total   18 198485998353

    Root MSE   91793.70950     R-square       0.3632
    Dep Mean 1765510.21053     Adj R-sq       0.2359
    C.V.           5.19927

Parameter Estimates

                 Parameter    Standard     T for H0:
Variable  DF      Estimate       Error   Parameter=0 Prob > |T|

INTERCEP   1       1720364  62707.041458    27.435    0.0001
DUMMY*     1        558901  213199.54966     2.621    0.0193
MONTH      1   1977.127273  10106.154803     0.196    0.8475
INT**      1        -29958  15574.630545    -1.924    0.0736

*  Dummy = 0-1 indicator variable, 1 = CDC, 0 = Brink's.
** Int   = Month*Dummy 
The overall fit of the model is improved, and the slope shift approaches statistical significance. Regarding the substantive issue of an upward trend in revenue, however, this shift is in the wrong direction: the estimated slope shift is negative. Thus, Fairley's piecewise regressions appear to account for the two possible confounders of a causal interpretation, namely trend and seasonality.
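
The dummy-plus-interaction specification reported above can be sketched in a few lines. The sketch below rests on two assumptions that are ours, not statements from the SAS run: that the dependent variable is the total monthly revenue listed in the shortfall table later in this paper, and that months are coded 1 to 10 for Brink's and 13 to 21 for CDC (April and May 1980 skipped, last CDC month deleted). Because the model contains both a dummy and a month-by-dummy interaction, its coefficients coincide with those obtained by fitting a separate least squares line to each contractor, which is what the helper `fit_line` (a name introduced here for illustration) does.

```python
# Total monthly revenue, June through March, from the shortfall table in this paper.
brinks = [1685938, 1644110, 1746709, 1754081, 1853363,
          1754081, 1692441, 1801019, 1702335, 1678305]
cdc = [1941688, 1889106, 1741465, 1832510, 1926233,
       1670136, 1948290, 1627594, 1655290, 1844604]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# ASSUMED month coding: 1..10 for Brink's, 13..21 for CDC
# (two-month gap between contracts, final CDC observation dropped).
brinks_months = list(range(1, 11))
cdc_months = list(range(13, 22))

a_brinks, b_brinks = fit_line(brinks_months, brinks)
a_cdc, b_cdc = fit_line(cdc_months, cdc[:-1])

# Coefficients of the pooled model: the Brink's line, plus the
# CDC-minus-Brink's shifts in intercept (DUMMY) and slope (INT).
intercept = a_brinks
month_slope = b_brinks
dummy_shift = a_cdc - a_brinks
slope_shift = b_cdc - b_brinks

print(f"INTERCEP {intercept:14.2f}")
print(f"DUMMY    {dummy_shift:14.2f}")
print(f"MONTH    {month_slope:14.2f}")
print(f"INT      {slope_shift:14.2f}")
```

Under these assumptions the four estimates agree with the parameter estimates in the table to the precision shown there. Standard errors and Cook's distances are not computed by this sketch.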

Fairley goes on to introduce a more elaborate nonlinear model based on borough-level data, arguing that such a model provides more degrees of freedom, etc. Based on this model, he estimates an intercontractor difference and uses it to arrive at an estimate of the damages for which Brink's is liable. Unfortunately, we currently do not have borough-level data and thus cannot verify his results. In any event, we will argue in section 5 that a simpler model based solely on city-wide totals is sufficient and also easier for a jury to understand.

Other Specific Causes

It was a safe bet that Brink's' attorneys would disagree with the claim that the observed difference in revenue collection was almost completely attributable to theft. Other factors besides theft might explain the observed difference in revenue collection. However, Fairley noted that any such factor would have to satisfy four criteria:
  1. Suddenness
  2. Size
  3. Uniqueness
  4. Uniformity
Fairley recommended that the city consult an expert on parking meter theft to investigate whether other factors satisfying the above existed during the given time period. Although the expert initially provided a long list of potential factors, none satisfied all the criteria. In addition, he testified that he had no reason to believe that certain factors were either more or less prevalent during the CDC period, i.e., there was no bias towards overcollection during the CDC period.

Brink's Case

Bruce Levin, the expert witness for Brink's, attacked the City's case on two issues.
  1. The assumption that the Brink's and CDC data are best described by their averages over their individual ten month periods.
  2. The claim that the difference in revenues returned by Brink's and CDC is entirely due to theft.
With regard to (1), Levin argues that there exists an upward trend throughout the Brink's period, followed by a ``leveling off'' during the CDC period. He argues that a horizontal line (constant trend) through the Brink's period is clearly not representative of the actual trend. Fairley utilized only 10 months' worth of Brink's data, since there were only 10 months of CDC data to compare it to. Levin argues that ``Fairley has forcefully and correctly argued elsewhere that effects are not reliably estimated from a brief span of experience, and that the widest related base of experience should be surveyed in statistical estimation problems (see Fairley, 1979).'' This is further discussed in section 5.

With regard to (2), Levin argues that the difference between the two periods may (emphasis our own) be attributable to factors other than theft. He presents evidence of a gasoline shortage that led to ``marked drops'' in revenues for automobile toll bridges and tunnels. This happened during the Brink's period, June-September 1979.

Levin's analysis was based on data for the entire Brink's period for two sets of data. Since an area of the city designated as area 1A was not under the jurisdiction of Brink's, Levin examined the given time period for city data excluding 1A and for area 1A alone. He searched for a trend in each area.

Levin plotted five-month moving averages over the Brink's and CDC contracts (May '78 to Mar '81) for the city without area 1A and for area 1A alone. He argued that the increase in revenue collected in area 1A during the CDC period showed that other factors besides theft were accounting for the increase in revenue across the city as a whole. In addition, he argued that these five-month moving averages exhibited an upward trend operating over the Brink's period.
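
A five-month moving average of the kind Levin plotted can be sketched as follows. This is only an illustration under stated assumptions: the series is the city-wide total monthly revenue for the Brink's period taken from Appendix A (which includes area 1A, unlike Levin's two series), the window is centered (we do not know whether Levin used a centered or trailing window), and the function name `centered_moving_average` is ours.

```python
# Total monthly revenue for the 23 months of the Brink's contract (Appendix A).
# NOTE: this series includes area 1A, so it is not one of Levin's two series.
revenue = [1337159, 1532810, 1318521, 1502054, 1393093, 1564212,
           1474861, 1554116, 1572284, 1129834, 1781470, 1659206,
           1752172, 1685938, 1644110, 1746709, 1754081, 1853363,
           1754081, 1692441, 1801019, 1702335, 1678305]


def centered_moving_average(series, window=5):
    """Average each value with its neighbours; window must be odd.

    The first and last window // 2 points have no full window
    and are dropped, so the result is shorter than the input.
    """
    half = window // 2
    return [sum(series[i - half:i + half + 1]) / window
            for i in range(half, len(series) - half)]


smoothed = centered_moving_average(revenue, window=5)

# Appendix A numbers the Brink's months 13..35; the first smoothed
# value is centered on month 15.
for month, value in zip(range(15, 15 + len(smoothed)), smoothed):
    print(f"month {month}: {value:12.1f}")
```

Smoothing of this kind damps single-month swings such as the low value in month 22, which makes a gradual rise easier to see but also easier to overstate.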

Levin also fitted regression lines separately to the monthly city revenue excluding area 1A and to area 1A alone over the entire period of the Brink's contract (May '78 to Mar '80). He found a positive slope for the city-wide data, and argued that if this regression line were extrapolated over the CDC period, the predicted revenue difference for the two 10-month periods would be $2,155,000, making the observed $1,000,000 difference less surprising. With regard to the line fitted to the area 1A data, he also found a positive slope. Thus, the conclusions he reached through the regression analysis are similar to the ones he made from the moving average analysis, i.e., the difference in revenue could be accounted for by factors besides theft.
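
The logic of Levin's extrapolation can be illustrated with a single trend line. The sketch below is not a reproduction of his analysis: it uses the city-wide totals from Appendix A (which include area 1A, whereas Levin excluded it), and it reads "predicted revenue difference" as the difference between fitted values for the two 10-month periods, which is our interpretation. Under that reading the difference is simply 10 months times a 12-month lag times the slope.

```python
# Total monthly revenue for the Brink's contract, months 13..35 (Appendix A).
# ASSUMPTION: these totals include area 1A; Levin's series excluded it.
revenue = [1337159, 1532810, 1318521, 1502054, 1393093, 1564212,
           1474861, 1554116, 1572284, 1129834, 1781470, 1659206,
           1752172, 1685938, 1644110, 1746709, 1754081, 1853363,
           1754081, 1692441, 1801019, 1702335, 1678305]
months = list(range(13, 36))

# Ordinary least squares fit of revenue on month over the Brink's period.
n = len(months)
x_bar = sum(months) / n
y_bar = sum(revenue) / n
sxx = sum((x - x_bar) ** 2 for x in months)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(months, revenue))
slope = sxy / sxx
intercept = y_bar - slope * x_bar


def predict(month):
    """Fitted (or extrapolated) revenue for a given month index."""
    return intercept + slope * month


last_brinks = range(26, 36)  # Jun '79 to Mar '80
first_cdc = range(38, 48)    # Jun '80 to Mar '81, the same months a year later

implied_difference = (sum(predict(m) for m in first_cdc)
                      - sum(predict(m) for m in last_brinks))

print(f"slope per month:       {slope:12.2f}")
print(f"implied 10-month diff: {implied_difference:12.2f}")
```

With these inputs the implied difference is of the same order as the figure Levin reported, which shows how sensitive such an extrapolation is to the assumption that the Brink's-period trend would have continued unchanged.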

Discussion

After hearing expert testimony from both Fairley and Levin, the jury decided in favor of the city and set damages at $1,000,000. In trying to rebut Fairley's argument, Levin introduced several interesting points, but seems to have erred by stating that ``any and all of the difference in observed revenue cannot be attributable to theft.'' Given the coin salting, video tapes, and possession of $4,500 in coins by Brink's employees, it is hard to believe that none of the observed difference was a result of theft. Moreover, although Levin concluded that other causes may have been responsible for the observed difference in revenue, he did not sufficiently prove that one did in fact exist. Fairley and the city's lawyers, on the other hand, acknowledged the potential influence of other factors, and also hired a parking meter theft expert to investigate whether these factors were prevalent across the given time periods. Levin should not have expected the jury to merely accept the possibility of other factors. Furthermore, although we like the idea of introducing area 1A into the analysis, this may have done more harm than good to Brink's case. For example, under cross-examination Levin admitted that only 45 of the city's 70,000 parking meters were located in area 1A, and that the malfunction of one meter in this area could cause a 2% change in monthly revenue (p. 238). To be sure, although any given jury is likely to be unfamiliar with statistical procedures such as regression, most individuals can easily recognize that the number 45 is much smaller than 70,000.

Our Approach

Our critique of both sides' arguments runs along these lines. It may have been beneficial to produce a simpler analysis, given the courtroom setting. For the comparison of the two 10-month periods, the following analysis may have been more accessible. As Fairley noted, each ten-month period consists of the exact same calendar months, thereby controlling for seasonal effects. Thus, we may form a residual difference for these months, measuring the surplus/shortage in Brink's monthly revenue compared to that of CDC's monthly revenue.
               Total Revenues

         Brink's      CDC      Shortfall 

  Jun    1685938    1941688     -255750
  Jul    1644110    1889106     -244996
  Aug    1746709    1741465        5244
  Sep    1754081    1832510      -78429
  Oct    1853363    1926233      -72870
  Nov    1754081    1670136       83945
  Dec    1692441    1948290     -255849
  Jan    1801019    1627594      173425
  Feb    1702335    1655290       47045
  Mar    1678305    1844604     -166299
======================================================= 
                                 -764,534.0  Total
                                  -76,453.4  Monthly Average 
=======================================================
It is quite easy for a jury to relate to the shortfalls in revenue, as they are formed by taking simple differences. Regression methods are not as convincing here for two reasons. First, it is easier to attack regression models. For instance, although our models indicate significant intercept shifts, the fit of these models is very poor (F = 2.04; prob > F = .15), an easy point for the defense to attack. Second, regression models are often inaccessible to a jury. In addition, matching calendar months is a method of controlling for seasonality that is easy to understand. The data above yield a shortfall for each of the ten corresponding months. A scatterplot and boxplot are shown in Figure 1a and Figure 1b respectively.
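
The shortfall column, its total, and its monthly average follow directly from the two revenue columns in the table above. A minimal sketch of that arithmetic, with variable names of our own choosing:

```python
# Total monthly revenue by contractor for matching calendar months.
months = ["Jun", "Jul", "Aug", "Sep", "Oct",
          "Nov", "Dec", "Jan", "Feb", "Mar"]
brinks = [1685938, 1644110, 1746709, 1754081, 1853363,
          1754081, 1692441, 1801019, 1702335, 1678305]
cdc = [1941688, 1889106, 1741465, 1832510, 1926233,
       1670136, 1948290, 1627594, 1655290, 1844604]

# Shortfall = Brink's minus CDC for the same calendar month;
# a negative value means Brink's delivered less than CDC.
shortfall = [b - c for b, c in zip(brinks, cdc)]

total = sum(shortfall)
monthly_average = total / len(shortfall)

for name, b, c, s in zip(months, brinks, cdc, shortfall):
    print(f"{name}  {b:9d}  {c:9d}  {s:9d}")
print(f"Total           {total:12.1f}")
print(f"Monthly average {monthly_average:12.1f}")
```

Note that Brink's collected more than CDC in four of the ten months, which is why the spread of the shortfalls matters when a single monthly figure is extrapolated.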

Given these data, what is a reasonable way of forming an estimate of the total amount of damages to be awarded to the city? There are several simple ways to proceed. On average, Brink's collected $76,453 less than CDC per month for the months in which we have comparative information. In light of this fact, a quick and dirty damage estimate is:

    Damages = Number of Months of Brink's Contracts x Average Shortfall per Month = 23 x 76,453.4 = 1,758,428.2
Thus, the above procedure yields approximately 1.8 million dollars in damages. This is a liberal estimate because of the high variability in the shortfall (standard deviation $153,086, about twice the magnitude of the mean). It should also be noted that using the median shortfall produces very similar results in this case, since the median monthly shortfall of $75,650 yields a damage estimate of about 1.7 million dollars. A way to adjust for this variability is to use a trimmed estimate of the mean. Here, a ten percent trimmed mean yields an average monthly shortfall of $85,264. Using this figure we now have the following damage estimate:
    Damages = Number of Months of Brink's Contracts x Average Shortfall per Month = 23 x 85,264 = 1,961,072
Thus, using the trimmed mean results in approximately 1.96 million dollars in damages. A conservative estimate of damages would be formed by taking a one-sided 5% trimmed mean, i.e., deleting the month in which Brink's collected the least revenue vis-a-vis CDC. Using this procedure, the average monthly shortfall is $56,520, giving a damage estimate as follows:
    Damages = Number of Months of Brink's Contracts x Average Shortfall per Month = 23 x 56,520 = 1,299,960
Thus, even using this extremely conservative (in favor of Brink's) estimate, the damages should be no less than 1.3 million dollars. This is very close to Fairley's estimate based on his model for borough level data.
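
The competing damage estimates above differ only in how the ten monthly shortfalls are summarized before being multiplied by the 23 contract months. The following sketch recomputes each summary with Python's standard `statistics` module; the trimming is done by sorting and slicing, which is our own simple implementation rather than a library routine. Small differences from the figures in the text arise because the text rounds the monthly shortfall to whole dollars before multiplying.

```python
import statistics

# Monthly shortfalls (Brink's minus CDC), June through March.
shortfall = [-255750, -244996, 5244, -78429, -72870,
             83945, -255849, 173425, 47045, -166299]
CONTRACT_MONTHS = 23

ordered = sorted(shortfall)  # most negative (largest shortfall) first

summaries = {
    "mean": statistics.mean(shortfall),
    "median": statistics.median(shortfall),
    # Two-sided trim: drop the single smallest and largest of ten values.
    "trimmed mean": statistics.mean(ordered[1:-1]),
    # One-sided trim: drop only the month with the largest shortfall.
    "one-sided trimmed mean": statistics.mean(ordered[1:]),
}

# Sample standard deviation (n - 1 in the denominator).
std_dev = statistics.stdev(shortfall)

# Damages are reported as positive dollars, hence the sign change.
damages = {name: -value * CONTRACT_MONTHS for name, value in summaries.items()}

print(f"standard deviation: {std_dev:,.0f}")
for name in summaries:
    print(f"{name:24s} monthly {-summaries[name]:12,.2f}   "
          f"damages {damages[name]:14,.2f}")
```

Laying the estimates side by side makes it clear that the choice of summary statistic moves the damage figure by several hundred thousand dollars, which is the point a jury would need to weigh.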

However, all of the above methods are susceptible to Levin's criticism that there exists an upward trend over the entire Brink's period. Figure 2 is a plot of total revenue per month over the entire Brink's period. Indeed, there appears to be an increasing trend, with two distinct periods (the 13 months prior to the 10-month comparison period, and the 10-month comparison period itself). If we look at Figure 2 carefully, we see that there is an upward linear trend that appears to end at the beginning of the ten-month comparison period. A regression over the entire 23-month period exhibited both a significant intercept and slope shift, with the slope of the 10-month comparison period not being significantly different from zero (F=22.3; Prob > F=.0001). Note that month 22 was not included in the analysis since it was a highly influential observation and appeared to be an outlier (Cook's Distance=.29, four times larger than the next highest value).

If we assume that theft was occurring throughout the entire Brink's period, then factors other than theft most likely contributed to the increasing revenue over time. Possible explanations include an increase in the number of cars, the building of shopping malls, or an increase in the rate charged per meter. Although the city's parking meter theft expert refuted most of these possibilities, the trend is still apparent in the graph. Therefore, in trying to be fair to Brink's, we attempted to control for this upward trend by downweighting the months prior to the ten-month comparison period. Our argument goes as follows:

The average Brink's/CDC shortfall was calculated from data for the flat 10-month comparison period. If we divide the total revenue for each month of the preceding 13-month period by the average revenue per month for the ensuing 10-month period, we obtain a rough estimate of how much weight each of these earlier months should be given. We then multiply each of these ``weights'' by the expected monthly shortfall calculated above and sum the results to obtain an estimate of the expected shortfall over the entire period. (All months during the 10-month comparison period are given a weight of one.) Using this downweighting procedure we produce a more conservative damage estimate of 1.6 million dollars (approximately $130,000 less than our original raw estimate of $1,758,428; see Appendix A for calculations).
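
The downweighting procedure just described amounts to a short calculation, sketched below with the Appendix A revenue figures. Variable names are ours. The sketch carries full precision in the weights, so individual entries can differ from the rounded appendix values in the last digit.

```python
# Total monthly revenue during the Brink's contract (Appendix A).
earlier_revenue = [1337159, 1532810, 1318521, 1502054, 1393093,
                   1564212, 1474861, 1554116, 1572284, 1129834,
                   1781470, 1659206, 1752172]          # months 13..25
comparison_revenue = [1685938, 1644110, 1746709, 1754081, 1853363,
                      1754081, 1692441, 1801019, 1702335, 1678305]  # months 26..35

AVERAGE_SHORTFALL = 76453.4  # mean monthly Brink's/CDC shortfall, in dollars

# Average Brink's revenue over the 10-month comparison period.
comparison_avg = sum(comparison_revenue) / len(comparison_revenue)

# Earlier months are weighted by their revenue relative to that average;
# comparison-period months all receive a weight of one.
weights = ([r / comparison_avg for r in earlier_revenue]
           + [1.0] * len(comparison_revenue))

weighted_shortfalls = [w * AVERAGE_SHORTFALL for w in weights]
damages = sum(weighted_shortfalls)

for month, w, s in zip(range(13, 36), weights, weighted_shortfalls):
    print(f"{month:3d}  {w:8.5f}  {s:10.2f}")
print(f"Damage estimate: {damages:,.2f}")
```

Two of the earlier months (23 and 25) have revenue above the comparison-period average and therefore receive weights slightly greater than one, so the procedure is not strictly a downweighting for every month.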

In summary, we believe the method we used to calculate the average monthly Brink's/CDC revenue shortfall is reasonable for two reasons. It controls for seasonality in a straightforward manner, and it is easy for the jury to understand. We also feel that the general upward trend prior to the 10-month comparison period needs to be taken into consideration when forming the damage estimate. Although any method of downweighting is susceptible to criticism, our method employs a simple approach which we consider fair to Brink's and the City and reasonable given the data at hand.

Appendix A

       Month  Total      10-Month     Weight     Shortfall
              Revenue    Brink's Avg

       13     1337159    1731238.2    0.77237      59050.43
       14     1532810    1731238.2    0.88538      67690.59
       15     1318521    1731238.2    0.76161      58227.35
       16     1502054    1731238.2    0.86762      66332.37
       17     1393093    1731238.2    0.80468      61520.53
       18     1564212    1731238.2    0.90352      69077.34
       19     1474861    1731238.2    0.85191      65131.50
       20     1554116    1731238.2    0.89769      68631.49
       21     1572284    1731238.2    0.90818      69433.81
       22     1129834    1731238.2    0.65262      49894.72
       23     1781470    1731238.2    1.02901      78671.69
       24     1659206    1731238.2    0.95839      73272.38
       25     1752172    1731238.2    1.01209      77377.86
       26     1685938    1731238.2    1.00000      76453.40
       27     1644110    1731238.2    1.00000      76453.40
       28     1746709    1731238.2    1.00000      76453.40
       29     1754081    1731238.2    1.00000      76453.40
       30     1853363    1731238.2    1.00000      76453.40
       31     1754081    1731238.2    1.00000      76453.40
       32     1692441    1731238.2    1.00000      76453.40
       33     1801019    1731238.2    1.00000      76453.40
       34     1702335    1731238.2    1.00000      76453.40
       35     1678305    1731238.2    1.00000      76453.40
                                                 ==========
                                                 1628846.05
 

References

  1. DeGroot, M.H., Fienberg, S.E., and Kadane, J.B. (1986), Statistics and the Law, New York: John Wiley & Sons, Inc.
  2. Weisberg, S. (1985), Applied Linear Regression, New York: John Wiley & Sons, Inc.