A 2-Step Empirical Likelihood Approach for Combining Sample and Population Data in Regression Estimation

Sanjay Chaudhuri, University of Washington
Mark S. Handcock, University of Washington
Michael Rendall, RAND

In addition to the sample, limited information on the relationship between the explanatory variables and the dependent variable is sometimes available from population-level data. Using constrained maximum likelihood estimation, we have previously shown that such population-level information can be incorporated to achieve a large reduction in the bias and variance of the estimated regression coefficients. Here we propose an alternative 2-step approach based on empirical likelihood. In the first step, we compute optimal weights for the sample that maximize the empirical likelihood subject to the population constraints. In the second step, these weights are used in a standard statistical package to produce the parameter estimates and standard errors, with the Hessian derived separately. As with the constrained MLE, the use of population constraints leads to correct and substantially lower standard errors. However, the 2-step approach is computationally much more flexible, allowing estimation with multiple population constraints and multiple covariates.

Presented in Session 57: Statistical Demography