pagfl {PAGFL}    R Documentation

Pairwise Adaptive Group Fused Lasso

Description

The pairwise adaptive group fused lasso (PAGFL) by Mehrabani (2023) jointly estimates the latent group structure and group-specific slope parameters in a panel data model. It can handle static and dynamic panels, either with or without endogenous regressors.

Usage

pagfl(
  formula,
  data,
  index = NULL,
  n_periods = NULL,
  lambda,
  method = "PLS",
  Z = NULL,
  min_group_frac = 0.05,
  bias_correc = FALSE,
  kappa = 2,
  max_iter = 5000,
  tol_convergence = 1e-08,
  tol_group = 0.001,
  rho = 0.07 * log(N * n_periods)/sqrt(N * n_periods),
  varrho = max(sqrt(5 * N * n_periods * p)/log(N * n_periods * p) - 7, 1),
  verbose = TRUE,
  parallel = TRUE,
  ...
)

## S3 method for class 'pagfl'
print(x, ...)

## S3 method for class 'pagfl'
formula(x, ...)

## S3 method for class 'pagfl'
df.residual(object, ...)

## S3 method for class 'pagfl'
summary(object, ...)

## S3 method for class 'pagfl'
coef(object, ...)

## S3 method for class 'pagfl'
residuals(object, ...)

## S3 method for class 'pagfl'
fitted(object, ...)

Arguments

formula

a formula object describing the model to be estimated.

data

a data.frame or matrix holding a panel data set. If no index variables are provided, the panel must be balanced and ordered in the long format \bold{Y}=(Y_1^\prime, \dots, Y_N^\prime)^\prime, Y_i = (Y_{i1}, \dots, Y_{iT})^\prime with Y_{it} = (y_{it}, x_{it}^\prime)^\prime. Conversely, if data is not ordered or not balanced, data must include two index variables, declaring the cross-sectional unit i and the time period t for each observation.

index

a character vector holding two strings specifying the variable names that identify the cross-sectional unit and time period for each observation. The first string denotes the individual unit, while the second string represents the time period. In case of a balanced panel data set that is ordered in the long format, index can be left empty if the number of time periods n_periods is supplied.

n_periods

the number of observed time periods T. If an index character vector is passed, this argument can be left empty.

lambda

the tuning parameter. \lambda governs the strength of the penalty term. Either a single \lambda or a vector of candidate values can be passed. If a vector is supplied, a BIC-type IC automatically selects the best fitting parameter value.

method

the estimation method. Options are

"PLS"

for using the penalized least squares (PLS) algorithm. We recommend PLS in case of (weakly) exogenous regressors (Mehrabani, 2023, sec. 2.2).

"PGMM"

for using the penalized Generalized Method of Moments (PGMM). PGMM is required when instrumenting endogenous regressors, in which case a matrix Z containing the necessary exogenous instruments must be supplied (Mehrabani, 2023, sec. 2.3).

Default is "PLS".

Z

an NT \times q matrix or data.frame of exogenous instruments, where q \geq p, \bold{Z}=(z_1, \dots, z_N)^\prime, z_i = (z_{i1}, \dots, z_{iT})^\prime and z_{it} is a q \times 1 vector. \bold{Z} is only required when method = "PGMM" is selected. When using "PLS", leave \bold{Z} as NULL; any supplied \bold{Z} is disregarded. Default is NULL.

min_group_frac

the minimum group size as a fraction of the total number of individuals N. In case a group falls short of this threshold, a hierarchical classifier allocates its members to the remaining groups. Default is 0.05.

bias_correc

logical. If TRUE, a Split-panel Jackknife bias correction following Dhaene and Jochmans (2015) is applied to the slope parameters. We recommend using the correction when facing a dynamic panel. Default is FALSE.

kappa

a non-negative weight placed on the adaptive penalty weights. Default is 2.

max_iter

the maximum number of iterations for the ADMM estimation algorithm. Default is 5000.

tol_convergence

the tolerance limit for the stopping criterion of the iterative ADMM estimation algorithm. Default is 1 * 10^{-8}.

tol_group

the tolerance limit for within-group differences. Two individuals i, j are assigned to the same group if the Frobenius norm of their coefficient vector difference is below this threshold. Default is 0.001.

rho

the tuning parameter balancing the fitness and penalty terms in the IC that determines the penalty parameter \lambda. If left unspecified, the heuristic \rho = 0.07 \frac{\log(NT)}{\sqrt{NT}} of Mehrabani (2023, sec. 6) is used. We recommend the default.

varrho

the non-negative Lagrangian ADMM penalty parameter. For PLS, the \varrho value is trivial. However, for PGMM, small values lead to slow convergence. If left unspecified, the default heuristic \varrho = \max(\frac{\sqrt{5NTp}}{\log(NTp)}-7, 1) is used.

verbose

logical. If TRUE, helpful warning messages are shown. Default is TRUE.

parallel

logical. If TRUE, certain operations are parallelized across multiple cores. Default is TRUE.

...

ellipsis

x

an object of class pagfl.

object

an object of class pagfl.
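As a quick check of the default heuristics for rho and varrho shown in the Usage signature, the following base R sketch evaluates both expressions. The panel dimensions N, n_periods and p are hypothetical values picked for illustration; inside pagfl they are presumably taken from the data.

```r
# Hypothetical panel dimensions, chosen only for illustration
N <- 20          # number of cross-sectional units
n_periods <- 80  # number of time periods T
p <- 2           # number of explanatory variables

# Default IC tuning parameter, copied from the Usage signature
rho <- 0.07 * log(N * n_periods) / sqrt(N * n_periods)

# Default ADMM penalty parameter, copied from the Usage signature;
# the max() floors the value at 1
varrho <- max(sqrt(5 * N * n_periods * p) / log(N * n_periods * p) - 7, 1)

print(c(rho = rho, varrho = varrho))
```

Evaluating the expressions by hand like this makes it easier to judge how far a manually chosen rho or varrho departs from the defaults.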

Details

Consider the grouped panel data model

y_{it} = \gamma_i + \beta^\prime_{i} x_{it} + \epsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,

where y_{it} is the scalar dependent variable, \gamma_i is an individual fixed effect, x_{it} is a p \times 1 vector of explanatory variables, and \epsilon_{it} is a zero mean error. The coefficient vector \beta_i is subject to the latent group pattern

\beta_i = \sum_{k = 1}^K \alpha_k \bold{1} \{i \in G_k \},

with \cup_{k = 1}^K G_k = \{1, \dots, N\}, G_k \cap G_j = \emptyset and \| \alpha_k \| \neq \| \alpha_j \| for any k \neq j.
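To make the grouped model concrete, here is a minimal base R sketch that simulates data from it. It does not use the package's sim_DGP; all dimensions, the slope matrix alpha and the membership vector g are hypothetical choices made for illustration.

```r
set.seed(1)
N <- 6; n_periods <- 10; p <- 2; K <- 2

# Hypothetical group-specific slopes alpha_k (K x p) with distinct norms
alpha <- rbind(c(1, -1), c(-2, 0.5))

# Hypothetical group memberships g_i: units 1-3 in group 1, units 4-6 in group 2
g <- rep(1:K, each = N / K)

# beta_i = alpha_k for every i in G_k, stored as an N x p matrix
beta <- alpha[g, , drop = FALSE]

# Balanced panel stacked in the long format, unit by unit
unit <- rep(1:N, each = n_periods)                # unit index of each row
gamma <- rnorm(N)                                 # individual fixed effects
X <- matrix(rnorm(N * n_periods * p), ncol = p)   # regressors x_it
eps <- rnorm(N * n_periods)                       # zero mean errors

# y_it = gamma_i + beta_i' x_it + eps_it
y <- gamma[unit] + rowSums(X * beta[unit, , drop = FALSE]) + eps
```

The rows of beta take only K distinct values, which is the latent structure the estimator is meant to recover.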

The PLS method jointly estimates the latent group structure and group-specific coefficients by minimizing the following criterion:

\frac{1}{T} \sum^N_{i=1} \sum^{T}_{t=1}(\tilde{y}_{it} - \beta^\prime_i \tilde{x}_{it})^2 + \frac{\lambda}{N} \sum_{1 \leq i} \sum_{i<j \leq N} \dot{w}_{ij} \| \beta_i - \beta_j \|,

where \tilde{y}_{it} is the demeaned scalar dependent variable, \tilde{x}_{it} denotes a p \times 1 vector of demeaned weakly exogenous explanatory variables, \lambda is the penalty tuning parameter and \dot{w}_{ij} reflects adaptive penalty weights (see Mehrabani, 2023, eq. 2.6). \| \cdot \| denotes the Frobenius norm. The adaptive weights \dot{w}_{ij} are obtained by a preliminary individual least squares estimation. The solution \hat{\bold{\beta}} is computed via an iterative alternating direction method of multipliers (ADMM) algorithm (see Mehrabani, 2023, sec. 5.1).
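The criterion can be evaluated for any candidate coefficient matrix with a few lines of base R. The sketch below is an illustration, not the package's implementation: the helper pls_criterion is hypothetical, the panel is assumed balanced with units indexed 1, ..., N, and the adaptive weights are assumed to take the form w_ij = \| \dot{\beta}_i - \dot{\beta}_j \|^{-\kappa} based on unit-wise least squares estimates (check Mehrabani, 2023, eq. 2.6 for the exact definition).

```r
# Hypothetical helper: value of the PLS criterion for an N x p matrix `beta`.
# `unit` holds the unit index (1, ..., N) of each row of the stacked data,
# `w` is an N x N matrix of adaptive weights (only entries with i < j are used).
pls_criterion <- function(y, X, unit, beta, lambda, w) {
  n_units <- nrow(beta)
  n_per <- length(y) / n_units                        # T, balanced panel assumed
  y_til <- y - ave(y, unit)                           # demean y within units
  X_til <- apply(X, 2, function(x) x - ave(x, unit))  # demean each regressor
  resid <- y_til - rowSums(X_til * beta[unit, , drop = FALSE])
  fit <- sum(resid^2) / n_per
  d <- as.matrix(dist(beta))                          # pairwise Euclidean norms
  penalty <- lambda / n_units * sum((w * d)[upper.tri(d)])
  fit + penalty
}

# Small simulated panel with two groups (all values are illustrative)
set.seed(2)
N <- 4; n_per <- 30; p <- 2; kappa <- 2
unit <- rep(1:N, each = n_per)
beta_true <- rbind(c(1, 1), c(1, 1), c(-1, 2), c(-1, 2))
X <- matrix(rnorm(N * n_per * p), ncol = p)
y <- rnorm(N)[unit] + rowSums(X * beta_true[unit, ]) + rnorm(N * n_per, sd = 0.1)

# Preliminary unit-wise least squares (intercept absorbs the fixed effect)
beta_init <- t(sapply(1:N, function(i) {
  idx <- unit == i
  coef(lm(y[idx] ~ X[idx, , drop = FALSE]))[-1]
}))

# Assumed form of the adaptive weights; the diagonal is never used
w <- as.matrix(dist(beta_init))^(-kappa)
diag(w) <- 0

crit_true <- pls_criterion(y, X, unit, beta_true, lambda = 0.5, w = w)
crit_init <- pls_criterion(y, X, unit, beta_init, lambda = 0.5, w = w)
print(c(grouped = crit_true, unit_wise = crit_init))
```

Pairs with nearly identical preliminary estimates receive large weights, so the penalty pushes them toward a common coefficient vector, while pairs that are far apart are barely penalized.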

PGMM employs a set of instruments \bold{Z} to control for endogenous regressors. Using PGMM, \bold{\beta} = (\beta_1^\prime, \dots, \beta_N^\prime)^\prime is estimated by minimizing:

\sum^N_{i = 1} \left[ \frac{1}{T} \sum_{t=1}^T z_{it} (\Delta y_{it} - \beta^\prime_i \Delta x_{it}) \right]^\prime W_i \left[\frac{1}{T} \sum_{t=1}^T z_{it}(\Delta y_{it} - \beta^\prime_i \Delta x_{it}) \right] + \frac{\lambda}{N} \sum_{1 \leq i} \sum_{i<j \leq N} \ddot{w}_{ij} \| \beta_i - \beta_j \|.

The weights \ddot{w}_{ij} are obtained by an initial GMM estimation. \Delta denotes the first-difference operator, \Delta y_{it} = y_{it} - y_{i t-1}. W_i represents a data-driven q \times q weight matrix. See Mehrabani (2023, eq. 2.10) for more details. \bold{\beta} is again estimated employing an efficient ADMM algorithm (Mehrabani, 2023, sec. 5.2).
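The unpenalized part of the PGMM criterion is a quadratic form in the sample moments of each unit. The base R sketch below computes that quadratic form for a single unit. It is only an illustration under stated assumptions: the helper gmm_unit is hypothetical, W_i is set to the identity matrix rather than a data-driven weight matrix, and the moments are averaged over the T - 1 periods that remain after differencing.

```r
# Hypothetical helper: g_i' W_i g_i for one unit, where g_i is the average of
# z_it * (dy_it - beta_i' dx_it) over the periods available after differencing
gmm_unit <- function(y_i, X_i, Z_i, beta_i, W_i) {
  dy <- diff(y_i)                  # first differences of y
  dX <- diff(X_i)                  # column-wise first differences of X
  Z_d <- Z_i[-1, , drop = FALSE]   # instruments aligned with the differences
  u <- dy - drop(dX %*% beta_i)    # differenced residuals
  g <- colMeans(Z_d * u)           # q sample moments
  drop(crossprod(g, W_i %*% g))
}

# Noise-free illustrative data for one unit (all values are hypothetical)
set.seed(3)
n_per <- 20; p <- 1; q <- 2
X_i <- matrix(rnorm(n_per * p), ncol = p)
Z_i <- matrix(rnorm(n_per * q), ncol = q)
beta_i <- 0.8
y_i <- 1.5 + drop(X_i %*% beta_i)  # the fixed effect 1.5 vanishes in differences
W_i <- diag(q)                     # identity weights, an assumption

val_true <- gmm_unit(y_i, X_i, Z_i, beta_i, W_i)      # moments at the true slope
val_off <- gmm_unit(y_i, X_i, Z_i, beta_i + 1, W_i)   # moments at a wrong slope
print(c(true_slope = val_true, wrong_slope = val_off))
```

Because the data contain no noise, the quadratic form is numerically zero at the true slope and strictly positive away from it, which is the behavior the minimization exploits.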

Two individuals are assigned to the same group if \| \hat{\beta}_i - \hat{\beta}_j \| \leq \epsilon_{\text{tol}}, where \epsilon_{\text{tol}} is given by tol_group. Subsequently, the number of groups follows as the number of distinct elements in \hat{\bold{\beta}}. Given an estimated group structure, it is straightforward to obtain post-Lasso estimates using least squares.
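The tolerance rule can be sketched in base R as follows. The helper assign_groups is hypothetical and uses a simple greedy pass over the units; the package may classify units differently, for instance when enforcing min_group_frac.

```r
# Hypothetical helper: units whose estimated coefficient vectors differ by at
# most tol_group in Euclidean norm share a group label
assign_groups <- function(beta_hat, tol_group = 0.001) {
  n_units <- nrow(beta_hat)
  groups <- integer(n_units)   # 0 marks a unit that is not yet assigned
  k <- 0
  for (i in seq_len(n_units)) {
    if (groups[i] == 0) {      # unit i opens a new group
      k <- k + 1
      groups[i] <- k
    }
    for (j in seq_len(n_units)) {
      if (j > i && groups[j] == 0) {
        gap <- sqrt(sum((beta_hat[i, ] - beta_hat[j, ])^2))
        if (gap <= tol_group) groups[j] <- groups[i]
      }
    }
  }
  groups
}

# Illustrative estimates: units 1, 2 and 4 nearly coincide, as do units 3 and 5
beta_hat <- rbind(
  c(1, 1),
  c(1, 1 + 1e-5),
  c(-1, 2),
  c(1, 1),
  c(-1, 2)
)
groups <- assign_groups(beta_hat, tol_group = 0.001)
K_hat <- length(unique(groups))
print(groups)   # memberships 1 1 2 1 2
print(K_hat)    # 2 groups
```

Given such a membership vector, the post-Lasso step amounts to a separate least squares fit on the pooled, demeaned observations of each group.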

We suggest identifying a suitable \lambda parameter by passing a logarithmically spaced grid of candidate values with a lower limit close to 0 and an upper limit that leads to a fully homogeneous panel. A BIC-type information criterion then selects the best fitting \lambda value.
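A candidate grid of this kind can be built in base R as sketched below. The endpoints and the grid length are hypothetical; a logarithmically spaced grid needs a strictly positive lower endpoint, so a small positive value stands in for 0, and the upper endpoint has to be found by experimentation for the data at hand.

```r
# Hypothetical endpoints and grid size, to be adapted to the application
lambda_min <- 1e-3   # small positive stand-in for a lower limit of 0
lambda_max <- 10     # should be large enough to yield a single group
n_lambda <- 20

# Equally spaced on the log scale, i.e. a constant ratio between neighbors
lambda_grid <- exp(seq(log(lambda_min), log(lambda_max), length.out = n_lambda))
print(signif(lambda_grid, 3))

# The vector can then be supplied through the lambda argument, e.g.
# pagfl(y ~ ., data = df, n_periods = 80, lambda = lambda_grid)
```

If the selected value sits at either end of the grid, widening the grid in that direction is a sensible next step.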

Value

An object of class pagfl holding

model

a data.frame containing the dependent and explanatory variables as well as cross-sectional and time indices,

coefficients

a K \times p matrix of the post-Lasso group-specific parameter estimates,

groups

a list containing (i) the total number of groups \hat{K} and (ii) a vector of estimated group memberships (\hat{g}_1, \dots, \hat{g}_N), where \hat{g}_i = k if i is assigned to group k,

residuals

a vector of residuals of the demeaned model,

fitted

a vector of fitted values of the demeaned model,

args

a list of additional arguments,

IC

a list containing (i) the value of the IC, (ii) the employed tuning parameter \lambda, and (iii) the mean squared error,

convergence

a list containing (i) a logical variable indicating if convergence was achieved and (ii) the number of executed ADMM algorithm iterations,

call

the function call.

A pagfl object has print, summary, fitted, residuals, formula, df.residual, and coef S3 methods.

Author(s)

Paul Haimerl

References

Dhaene, G., & Jochmans, K. (2015). Split-panel jackknife estimation of fixed-effect models. The Review of Economic Studies, 82(3), 991-1030. doi:10.1093/restud/rdv007.

Mehrabani, A. (2023). Estimation and identification of latent group structures in panel data. Journal of Econometrics, 235(2), 1464-1482. doi:10.1016/j.jeconom.2022.12.002.

Examples

# Simulate a panel with a group structure
sim <- sim_DGP(N = 20, n_periods = 80, p = 2, n_groups = 3)
y <- sim$y
X <- sim$X
df <- cbind(y = c(y), X)

# Run the PAGFL procedure
estim <- pagfl(y ~ ., data = df, n_periods = 80, lambda = 0.5, method = "PLS")
summary(estim)

# Let's pass a panel data set with explicit cross-sectional and time indicators
i_index <- rep(1:20, each = 80)
t_index <- rep(1:80, 20)
df <- data.frame(y = c(y), X, i_index = i_index, t_index = t_index)
estim <- pagfl(
  y ~ ., data = df, index = c("i_index", "t_index"),
  lambda = 0.5, method = "PLS"
)
summary(estim)

[Package PAGFL version 1.1.0 Index]