ordFacReg {OrdFacReg} | R Documentation |
This function computes estimates in least squares or logistic regression where coefficients corresponding to dummy variables of ordered factors are estimated to be in non-decreasing order and at least 0. An active set algorithm as described in Duembgen et al. (2007) is used.
ordFacReg(D, Z, fact, ordfact, ordering = NA, type = c("LS", "logreg"),
intercept = TRUE, display = 0, eps = 0)
D |
Response vector, either in |
Z |
Matrix of predictors. Factors are coded with levels from 1 to |
fact |
Specify columns in |
ordfact |
Specify columns in |
ordering |
Vector of the same length as |
type |
Specify type of response variable. |
intercept |
If |
display |
If |
eps |
Quantity to which the criterion in the Basic Procedure 2 in Duembgen et al. (2007) is compared. |
For a detailed description of the problem and the algorithm we refer to Rufibach (2010).
L |
Value of the criterion function at the maximum. |
beta |
Computed regression coefficients. |
A |
Set |
design.matrix |
Design matrix that was generated. |
Kaspar Rufibach (maintainer)
kaspar.rufibach@gmail.com
http://www.kasparrufibach.ch
Duembgen, L., Huesler, A. and Rufibach, K. (2010). Active set and EM algorithms for log-concave densities based on complete and censored data. Technical report 61, IMSV, Univ. of Bern, available at http://arxiv.org/abs/0707.4643.
Rufibach, K. (2010). An Active Set Algorithm to Estimate Parameters in Generalized Linear Models with Ordered Predictors. Comput. Statist. Data Anal., 54, 1442-1456.
ordFacRegCox
computes estimates for Cox-regression.
## ========================================================
## To illustrate least squares estimation, we generate the same data
## that was used in Rufibach (2010), Table 1.
## ========================================================
## --------------------------------------------------------
## initialization
## --------------------------------------------------------
n <- 200
Z <- NULL
intercept <- FALSE
## --------------------------------------------------------
## quantitative variables
## --------------------------------------------------------
n.q <- 3
set.seed(14012009)
if (n.q > 0){for (i in 1:n.q){Z <- cbind(Z, rnorm(n, mean = 1, sd = 2))}}
## --------------------------------------------------------
## unordered factors
## --------------------------------------------------------
un.levels <- 3
for (i in 1:length(un.levels)){Z <- cbind(Z, sample(rep(1:un.levels[i],
each = ceiling(n / un.levels)))[1:n])}
fact <- n.q + 1:length(un.levels)
## --------------------------------------------------------
## ordered factors
## --------------------------------------------------------
levels <- 8
for (i in 1:length(un.levels)){Z <- cbind(Z, sample(rep(1:levels[i],
each = ceiling(n / levels)))[1:n])}
ordfact <- n.q + length(un.levels) + 1:length(levels)
## --------------------------------------------------------
## generate data matrices
## --------------------------------------------------------
Y <- prepareData(Z, fact, ordfact, ordering = NA, intercept)$Y
## --------------------------------------------------------
## generate response
## --------------------------------------------------------
D <- apply(Y * matrix(c(rep(c(2, -3, 0), each = n), rep(c(1, 1), each = n),
rep(c(0, 2, 2, 2, 2, 5, 5), each = n)), ncol = ncol(Y)), 1, sum) +
rnorm(n, mean = 0, sd = 4)
## --------------------------------------------------------
## compute estimates
## --------------------------------------------------------
res1 <- lmLSE(D, Y)
res2 <- ordFacReg(D, Z, fact, ordfact, ordering = "i", type = "LS", intercept,
display = 1, eps = 0)
b1 <- res1$beta
g1 <- lmSS(b1, D, Y)$dL
b2 <- res2$beta
g2 <- lmSS(b2, D, Y)$dL
Ls <- c(lmSS(b1, D, Y)$L, lmSS(b2, D, Y)$L)
names(Ls) <- c("LSE", "ordFact")
disp <- cbind(1:length(b1), round(cbind(b1, g1, cumsum(g1)), 4),
round(cbind(b2, g2, cumsum(g2)), 4))
## --------------------------------------------------------
## display results
## --------------------------------------------------------
disp
Ls
## ========================================================
## Artificial data is used to illustrate logistic regression.
## ========================================================
## --------------------------------------------------------
## initialization
## --------------------------------------------------------
set.seed(1977)
n <- 500
Z <- NULL
intercept <- FALSE
## --------------------------------------------------------
## quantitative variables
## --------------------------------------------------------
n.q <- 2
if (n.q > 0){for (i in 1:n.q){Z <- cbind(Z, rnorm(n, rgamma(2, 2, 1)))}}
## --------------------------------------------------------
## unordered factors
## --------------------------------------------------------
un.levels <- c(8, 2)
for (i in 1:length(un.levels)){Z <- cbind(Z, sample(round(runif(n, 0,
un.levels[i] - 1)) + 1))}
fact <- n.q + 1:length(un.levels)
## --------------------------------------------------------
## ordered factors
## --------------------------------------------------------
levels <- c(2, 4, 10)
for (i in 1:length(levels)){Z <- cbind(Z, sample(round(runif(n, 0,
levels[i] - 1)) + 1))}
ordfact <- n.q + length(un.levels) + 1:length(levels)
## --------------------------------------------------------
## generate response
## --------------------------------------------------------
D <- sample(c(rep(0, n / 2), rep(1, n/2)))
## --------------------------------------------------------
## generate design matrix
## --------------------------------------------------------
Y <- prepareData(Z, fact, ordfact, ordering = NA, intercept)$Y
## --------------------------------------------------------
## compute estimates
## --------------------------------------------------------
res1 <- matrix(glm.fit(Y, D, family = binomial(link = logit))$coefficients, ncol = 1)
res2 <- ordFacReg(D, Z, fact, ordfact, ordering = NA, type = "logreg",
intercept = intercept, display = 1, eps = 0)
b1 <- res1
g1 <- logRegDeriv(b1, D, Y)$dL
b2 <- res2$beta
g2 <- logRegDeriv(b2, D, Y)$dL
Ls <- unlist(c(logRegLoglik(res1, D, Y), res2$L))
names(Ls) <- c("MLE", "ordFact")
disp <- cbind(1:length(b1), round(cbind(b1, g1, cumsum(g1)), 4),
round(cbind(b2, g2, cumsum(g2)), 4))
## --------------------------------------------------------
## display results
## --------------------------------------------------------
disp
Ls
## --------------------------------------------------------
## compute estimates when the third ordered factor should
## have *decreasing* estimated coefficients
## --------------------------------------------------------
res3 <- ordFacReg(D, Z, fact, ordfact, ordering = c("i", "i", "d"),
type = "logreg", intercept = intercept, display = 1, eps = 0)
b3 <- res3$beta
g3 <- logRegDeriv(b3, D, Y)$dL
Ls <- unlist(c(logRegLoglik(res1, D, Y), res2$L, res3$L))
names(Ls) <- c("MLE", "ordFact ddd", "ordFact iid")
disp <- cbind(1:length(b1), round(cbind(b1, b2, b3), 4))
## --------------------------------------------------------
## display results
## --------------------------------------------------------
disp
Ls