================================================
KEYWORDS FOR DATASET: Prison Classification, Logistic regression
================================================

=======================================================
ACCOMPANYING DATA PROVIDED BY: Dr. Richard Berk and Dr. Jan de Leeuw
UCLA, Dept. of Statistics
=======================================================

================================
GENERAL EXPLANATION OF THE STUDY
================================

This data set is from a study by the California Department of Corrections (CDC) on the effectiveness of prisoner placement, and the likelihood of misconduct while incarcerated. Because of the cost of running high security facilities, it is important to be able to sort inmates into different levels of risk and then place them into the lowest security level that eliminates risk to other inmates, staff and themselves. The data included here can be used to examine how well those goals are achieved.

Most prisoners are assigned to a facility based on a classification score. The classification score is based on the length of the sentence and other variables, including age, marital status and prior convictions. There are, for most inmates, four levels of facilities to which one can be assigned. A Level 1 facility has the lowest security and a Level 4 facility has the highest security. Level 4 facilities are reserved for the most dangerous prisoners, or prisoners who need protection from other inmates.

For this study, all the prisoners in the lower security Levels 1-3 are combined and compared to those assigned to Level 4 facilities. It should also be noted that this data set includes only prisoners assigned by a classification score. Some prisoners are also assigned to a security level because of other issues or constraints, such as whether there are beds available, or because the offender is a particular risk if he escapes.
This study was published as: "An Evaluation of California's Inmate Classification System Using a Generalized Regression Discontinuity Design," Journal of the American Statistical Association, Dec. 1999, Vol. 94, No. 448, Applications and Case Studies.

================================
BRIEF DESCRIPTION OF THE DATA
================================

In January 1994, the CDC began enrolling inmates in this study. A total of 3,922 inmates are included, of whom 3,918 have classification scores.

The response variable indicates whether the prisoner committed any misconduct violations. All incidents of misconduct were recorded, including less serious violations, such as not standing for a count or not showing up for an assignment, as well as more serious violations, such as drug trafficking or assaulting a corrections officer.

A "Strike 2" inmate is a prisoner who is serving time for a second felony and who was sentenced under a California law mandating sentence length enhancements. A "Strike 3" inmate is a prisoner who is serving time for a third felony, in which case that same law mandated a life sentence. Since such prisoners have little to lose, they are usually assigned to maximum-security (Level 4) prisons.

Treat refers to the security level to which prisoners were assigned. As noted above, the lower security levels (1-3) are combined into one level.

This is an observational study, since the prisoners were not randomly selected and assigned.

================================
HOW TO USE THE DATA FILES
================================

The data file is space-delimited text. The first row contains the list of variables and each remaining row contains the corresponding data for one prisoner. Missing data are indicated by a period. Explanations for all variable abbreviations are given below. There are five variables in the data set.

RESPONSE........Misconduct Violation (1) or not (0)
SCORE...........Classification Score
STRIKE 2........Two Striker Inmate (1) or not (0)
STRIKE 3........Three Striker Inmate (1) or not (0)
TREAT...........Classified to Level 4 (1) or not (0)

===================================
STATISTICAL TESTS AND ANALYSES USED
===================================

1. Logistic Regression: Misconduct as the response, Score and Treat as predictors
2. Data Tables: Comparison between 2-strikers, 3-strikers
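
The two analyses listed above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code. The embedded rows are invented so that the example runs without the data file. The strike columns are assumed to be named STRIKE2 and STRIKE3, without an embedded space, and the header of the real file may differ. The helper names read_rows, strike_table, solve and fit_logistic are hypothetical. Rows with a missing model variable are dropped before fitting, which is only one possible way to handle missing values. Only the standard library is used, and the fit is a plain Newton-Raphson iteration with no safeguards such as step halving.

```python
import io
import math
from collections import Counter

# Invented rows in the layout described above (space-delimited, header row,
# "." for missing). They are NOT taken from the real data file.
# Assumption: strike columns are named STRIKE2/STRIKE3 (no embedded space).
SAMPLE = """\
RESPONSE SCORE STRIKE2 STRIKE3 TREAT
0 10 0 0 0
1 15 0 0 0
0 22 1 0 0
0 28 0 0 0
1 33 1 0 0
0 40 0 0 0
1 45 0 0 0
0 50 1 0 0
1 55 0 0 1
0 58 1 0 1
0 62 0 1 1
1 66 0 0 1
0 70 0 1 1
1 75 1 0 1
0 80 0 1 1
1 . 0 1 1
"""

def read_rows(handle):
    """Parse space-delimited text into dicts; '.' becomes None."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if fields:
            rows.append({name: None if v == "." else float(v)
                         for name, v in zip(header, fields)})
    return rows

def strike_table(rows):
    """Count (strike group, RESPONSE) pairs for the data-table comparison."""
    table = Counter()
    for r in rows:
        if r["RESPONSE"] is None:
            continue
        group = ("strike2" if r["STRIKE2"] == 1
                 else "strike3" if r["STRIKE3"] == 1 else "neither")
        table[(group, int(r["RESPONSE"]))] += 1
    return table

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def fit_logistic(X, y, max_iter=25, tol=1e-10):
    """Maximum-likelihood logistic regression by Newton-Raphson."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k                      # score vector X'(y - p)
        info = [[0.0] * k for _ in range(k)]  # information matrix X'WX
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
        step = solve(info, grad)  # Newton step
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

rows = read_rows(io.StringIO(SAMPLE))
table = strike_table(rows)
# Complete-case analysis: drop rows missing any model variable.
complete = [r for r in rows if None not in (r["RESPONSE"], r["SCORE"], r["TREAT"])]
X = [[1.0, r["SCORE"], r["TREAT"]] for r in complete]  # intercept, SCORE, TREAT
y = [r["RESPONSE"] for r in complete]
beta = fit_logistic(X, y)
print("strike table:", dict(table))
print("intercept, SCORE, TREAT coefficients:", beta)
```

To try this on the real file, io.StringIO(SAMPLE) would be replaced by an open file handle, after checking the column names in the header row. In practice a statistics package (for example glm in R or Logit in statsmodels) would normally be used for the fit, since it also reports standard errors.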