The 'glmc' Package


 

 

Statistical Software for Constrained Generalized Linear Models

The glmc package is

Copyright (c) 2004, Mark S. Handcock, University of California - Los Angeles


The glmc package for R

This package fits generalized linear models where the parameters are subject to linear constraints. It is an adaptation of the glm function in R to allow for parameter estimation using constrained maximum likelihood.

 

The model is specified by giving a symbolic description of the linear predictor, a description of the error distribution, and a matrix of constraints on the parameters.

 

You need to be using a current version of R (version 3.0.0 or later). It runs on all platforms that R runs (e.g., Windows, LINUX, UNIX, and Macintosh). For more information on R see http://www.r-project.org

 

The package can be installed directly from within R from its location here.


Installation

If you need a detailed description of how to install the glmc package, scroll down further.  Otherwise, start R in some directory and type:

% R

> install.packages("glmc")

Don't type the R prompt ">".  This downloads and installs the glmc package and the emplik package on which it depends. Once this is done, the glmc package can be used by typing the call:

> library(glmc)

The functions are documented within R. To get help type in R :

> help(package="glmc")
> help(glmc)

To get an interactive demonstration of the package, type in R :

> demo(package="glmc")

To find the citation for the package type in R :

> citation(package="glmc")

The correct citation is:

Sanjay Chaudhuri, Mark S. Handcock, and Michael S. Rendall (2004). glmc: An R package for generalized linear models subject to constraints. URL http://www.stat.ucla.edu/~handcock/combining.


A Brief Demonstration of Package Use

To see the code for this type: help(glmc)

# In this example we apply the methodology to combine survey data

# from the British Household Panel Survey (BHPS)(Taylor et al.,

# 1995) with population level information from the birth

# registration system on the General Fertility Rate (GFR) to

# estimate annual birth probabilities by parity. This situation was

# considered by Handcock, Huovilainen, and Rendall (2000).

 

# Specify the data.

 

> n <- rbind(c(5903, 230), c(5157, 350))

 

> dimnames(n) <- list(c("no child", "child"), c("no birth",

    "birth"))

 

> n

         no birth birth

no child     5903   230

child        5157   350

 

#

# Create indicator covariates

#

 

> mat <- matrix(0, nrow = sum(n), ncol = 2)

 

> mat <- rbind(matrix(1, nrow = n[1, 1], ncol = 1) %*%

               c(0, 0), matrix(1, nrow = n[1, 2], ncol = 1) %*% c(0, 1),

               matrix(1, nrow = n[2, 1], ncol = 1) %*% c(1, 0), matrix(1,

               nrow = n[2, 2], ncol = 1) %*% c(1, 1))

 

#Specifying the population constraints.

#

# From the combination of birth registration numerator and

# population estimate denominator we can determine the general

# fertility rate (GFR) for the years 1992 to 1996, of England and

# Wales, which we assume to be measured without error (Office for

# National Statistics, 1998). The population level value of the GFR

# was found to be 0.06179.

 

> gfr <- 0.06179 * matrix(1, nrow = nrow(mat), ncol = 1)

 

> g <- matrix(1, nrow = nrow(mat), ncol = 1)

 

> amat <- matrix(mat[, 2] * g - gfr, ncol = 1)

 

# Method 1. Defining constraints in the data frame.

 

> hrh <- data.frame(birth = mat[, 2], child = mat[,

    1], constraints = amat)

 

 

# Method 2. Defining constraints through a matrix.

 

> gfit <- glmc(birth ~ child, data = hrh, family = "binomial",

    emplik.method = "Owen", control = glmc.control(maxit.glm = 10,

        maxit.weights = 200, itertrace.weights = TRUE, gradtol.weights = 10^(-6)))

[1] -0.02656434  -14.43904679 1320.26413390    1.00000000

[1] -0.02200310 -15.32395291 375.80863906   1.00000000

[1] -0.0217611 -15.3261430  18.0697997   1.0000000

[1] -0.02176048 -15.32614305   0.04572550   1.00000000

[1] -2.176048e-02 -1.532614e+01  2.942563e-07  4.000000e+00

 

> summary(gfit)

 

Call:

glmc(formula = birth ~ child, family = "binomial", data = hrh,

    control = glmc.control(maxit.glm = 10, maxit.weights = 200,

        itertrace.weights = TRUE, gradtol.weights = 10^(-6)),

    emplik.method = "Owen")

 

Deviance Residuals:

    Min       1Q   Median       3Q      Max 

-0.4045  -0.4045  -0.3091  -0.3091   2.4759 

 

Coefficients:

            Estimate Std. Error z value Pr(>|z|)   

(Intercept) -3.01731    0.05199 -58.039  < 2e-16 ***

child        0.55496    0.08700   6.379 1.78e-10 ***

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 

(Dispersion parameter for binomial family taken to be 1)

 

Null     deviance: 0.46373  on 11639  degrees of freedom

Residual deviance: 0.45934  on 11638  degrees of freedom

AIC: 4

 

Number of Fisher Scoring iterations: 6

 

 

> gfit <- glmc(birth ~ child, data = hrh, family = binomial(link = logit),

    emplik.method = "Owen", control = glmc.control(maxit.glm = 10,

    maxit.weights = 200, itertrace.weights = TRUE, gradtol.weights = 10^(-10)),

    Amat = amat, confn  .... [TRUNCATED]

[1]   -0.02656434  -14.43904679 1320.26413390    1.00000000

[1]  -0.02200310 -15.32395291 375.80863906   1.00000000

[1]  -0.0217611 -15.3261430  18.0697997   1.0000000

[1]  -0.02176048 -15.32614305   0.04572550   1.00000000

[1] -2.176048e-02 -1.532614e+01  2.942563e-07  4.000000e+00

[1] -2.176048e-02 -1.532614e+01  2.469629e-07  2.000000e+00

[1] -2.176048e-02 -1.532614e+01  7.396291e-08  0.000000e+00

 

> summary(gfit)

 

Call:

glmc(formula = birth ~ child, family = binomial(link = logit),

    data = hrh, control = glmc.control(maxit.glm = 10, maxit.weights = 200,

    itertrace.weights = TRUE, gradtol.weights = 10^(-10)),

    emplik.method = "Owen", Amat = amat, confn = NULL)

 

Deviance Residuals:

    Min       1Q   Median       3Q      Max 

-0.4045  -0.4045  -0.3091  -0.3091   2.4759 

 

Coefficients:

            Estimate Std. Error z value Pr(>|z|)   

(Intercept) -3.01731    0.05199 -58.039  < 2e-16 ***

child        0.55496    0.08700   6.379 1.78e-10 ***

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 

(Dispersion parameter for binomial family taken to be 1)

 

Null     deviance: 0.46373  on 11639  degrees of freedom

Residual deviance: 0.45934  on 11638  degrees of freedom

AIC: 4

 

Number of Fisher Scoring iterations: 6

 

#

# Method 3. Defining constraints through a function.

#

 

> fn <- function(t, Y, X) {

    grf <- 0.06179 * matrix(1, nrow = nrow(as.matrix(X)), ncol = 1)

    g <- matrix(1, nrow = nrow(X), ncol = 1)

    amat <- matrix(Y * g - grf, ncol = 1)

    return(amat)

}

 

> gfit <- glmc(birth ~ child, data = hrh, family = binomial(link = logit),

    optim.method = "BFGS", emplik.method = "Owen", control = glmc.control(maxit.glm = 10,

        maxit.optim = 10^(8), reltol.optim = 10^(-10), trace.optim = 9,

        REPO .... [TRUNCATED]

initial  value 8843.408055

iter   2 value 368.338941

iter   3 value 180.974974

iter   4 value 26.398217

iter   5 value 25.697260

iter   6 value 16.475018

iter   7 value 16.105789

iter   8 value 15.360107

iter   9 value 15.333173

iter  10 value 15.327196

iter  11 value 15.326147

iter  12 value 15.326145

iter  13 value 15.326144

iter  14 value 15.326144

iter  14 value 15.326144

iter  15 value 15.326144

iter  15 value 15.326144

iter  16 value 15.326143

iter  16 value 15.326143

iter  17 value 15.326143

iter  17 value 15.326143

iter  17 value 15.326143

final  value 15.326143

converged

 

> summary(gfit)

 

Call:

glmc(formula = birth ~ child, family = binomial(link = logit),

    data = hrh, control = glmc.control(maxit.glm = 10, maxit.optim = 10^(8),

    reltol.optim = 10^(-10), trace.optim = 9, REPORT.optim = 1,

    maxit.weights = 200, gradtol.weights = 10^(-6), itertrace.weights = FALSE),

    optim.method = "BFGS", emplik.method = "Owen", optim.hessian = TRUE,

    Amat = NULL, confn = fn)

 

Deviance Residuals:

    Min       1Q   Median       3Q      Max 

-0.4045  -0.4045  -0.3091  -0.3091   2.4759 

 

Coefficients:

            Estimate Std. Error z value Pr(>|z|)   

(Intercept) -3.01735    0.05185 -58.194  < 2e-16 ***

child        0.55503    0.08676   6.397 1.58e-10 ***

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 

(Dispersion parameter for binomial family taken to be 1)

 

Null     deviance: 0.46373  on 11639  degrees of freedom

Residual deviance: 0.45934  on 11638  degrees of freedom

AIC: 4

 

Number of Fisher Scoring iterations: 6

 

> 


Pre-Installation: Setting up a personal R library

 directory

If you do not have Administrator access under Windows, you may need to set a personal R library directory. This can contain R packages that you install that are separate from the system versions. If you have not done this before, here is a simple way to set it up (for this and other packages).

 

You need to set a user environment variable named R_LIBS to be the path to the place you want your personal library directory to be. For example, mine is set to "H:\library". To set this variable, search for "Add Environment Variables" in Windows help.


Updating the packages

 

To update the packages, you only need to type in R :

> update.packages()


 

About this web site


UW - CSSS

Contact: Webmaster or CSSS