EllDistrEst.adapt {ElliptCopulas}    R Documentation

Estimation of the generator of the elliptical distribution by kernel smoothing with adaptive choice of the bandwidth

Description

A continuous elliptical distribution has a density of the form

f_X(x) = {|\Sigma|}^{-1/2} g\left( (x-\mu)^\top \, \Sigma^{-1} \, (x-\mu) \right),

where x \in \mathbb{R}^d, \mu \in \mathbb{R}^d is the mean, \Sigma is a d \times d positive-definite matrix, and g: \mathbb{R}_+ \rightarrow \mathbb{R}_+ is a function called the density generator of X. The goal is to estimate g at some point \xi by

\widehat{g}_{n,h,a}(\xi) := \dfrac{\xi^{\frac{-d+2}{2}} \psi_a'(\xi)}{n h s_d} \sum_{i=1}^n \left[ K\left( \dfrac{ \psi_a(\xi) - \psi_a(\xi_i) }{h} \right) + K\left( \dfrac{ \psi_a(\xi) + \psi_a(\xi_i) }{h} \right) \right],

where s_d := \pi^{d/2} / \Gamma(d/2), \Gamma is the Gamma function, h and a are tuning parameters (respectively the bandwidth and a parameter controlling the bias at \xi = 0), \psi_a(\xi) := -a + (a^{d/2} + \xi^{d/2})^{2/d}, \xi \in \mathbb{R}, K is a kernel function and \xi_i := (X_i - \mu)^\top \, \Sigma^{-1} \, (X_i - \mu), for a sample X_1, \dots, X_n. This function computes "optimal asymptotic" values of the bandwidth h and of the tuning parameter a, starting from a first-step bandwidth that the user must provide.
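The formula above can be sketched in base R for fixed values of h and a. This is only an illustration of the definition, not the package's implementation: the helper names s_d, psi_a, dpsi_a and g_hat are hypothetical, the kernel is assumed to be Gaussian, and mu = 0 with Sigma equal to the identity is assumed when computing the xi_i.

```r
# Hypothetical helpers illustrating the definition; not part of ElliptCopulas.
s_d <- function(d) pi^(d / 2) / gamma(d / 2)

psi_a <- function(xi, a, d) -a + (a^(d / 2) + xi^(d / 2))^(2 / d)

# Derivative of psi_a with respect to xi (chain rule).
dpsi_a <- function(xi, a, d) {
  xi^(d / 2 - 1) * (a^(d / 2) + xi^(d / 2))^(2 / d - 1)
}

# Estimator of g at a single point xi, for fixed bandwidth h and parameter a.
g_hat <- function(xi, xi_obs, h, a, d, K = dnorm) {
  n <- length(xi_obs)
  # xi^((2 - d) / 2) * psi_a'(xi) simplifies algebraically to the factor
  # below, which avoids evaluating Inf * 0 at xi = 0.
  prefactor <- (a^(d / 2) + xi^(d / 2))^(2 / d - 1)
  kernel_sum <- sum(
    K((psi_a(xi, a, d) - psi_a(xi_obs, a, d)) / h) +
    K((psi_a(xi, a, d) + psi_a(xi_obs, a, d)) / h)
  )
  prefactor * kernel_sum / (n * h * s_d(d))
}

set.seed(1)
d <- 3
X <- matrix(rnorm(500 * d), ncol = d)
xi_obs <- rowSums(X^2)                 # assumes mu = 0 and Sigma = identity

g_hat(1, xi_obs, h = 0.2, a = 1, d = d)  # estimate of g(1)
exp(-1 / 2) / (2 * pi)^(3 / 2)           # Gaussian generator at 1, for comparison
```

The values h = 0.2 and a = 1 are arbitrary choices for the illustration; selecting them from the data is the purpose of EllDistrEst.adapt.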

Usage

EllDistrEst.adapt(
  X,
  mu = 0,
  Sigma_m1 = diag(NCOL(X)),
  grid,
  h_firstStep,
  grid_a = NULL,
  Kernel = "gaussian",
  mpfr = FALSE,
  precBits = 100,
  dopb = TRUE
)

Arguments

X

a matrix of size n \times d, assumed to be n i.i.d. observations (rows) of a d-dimensional elliptical distribution.

mu

mean of X. This can be the true value or an estimate. It must be a vector of dimension d.

Sigma_m1

inverse of the covariance matrix of X. This can be the true value or an estimate. It must be a matrix of dimension d \times d.

grid

vector containing the values at which we want the generator to be estimated.

h_firstStep

a vector of size 2 containing the first-step bandwidths to be used. The first one is used for the estimation of the asymptotic mean-squared error (AMSE). The second one is used for the first-step estimation of g. From these two estimators, a final value of the bandwidth h is determined, which is used for the final estimator of g.

If h_firstStep is of length 1, its value is reused for both purposes (estimation of the AMSE and first-step estimation of g).

grid_a

the grid of possible values of a to be used. If NULL (the default), a default sequence is used.

Kernel

name of the kernel. Possible choices are "gaussian", "epanechnikov", "triangular".

mpfr

if mpfr = TRUE, multiple-precision floating-point arithmetic is used via the package Rmpfr. This gives higher numerical accuracy at the expense of computing time. This option is recommended in higher dimensions.

precBits

number of bits of precision used for floating-point arithmetic (only used if mpfr = TRUE).

dopb

a Boolean value. If dopb = TRUE, a progress bar is displayed.
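Since mu and Sigma_m1 may be estimates, one possible way to build them with base R is sketched below. The names mu_hat, Sigma_m1_hat and xi_obs are illustrative only, and the empirical mean and covariance are just one choice of estimator among others.

```r
set.seed(1)
n <- 200
d <- 3
X <- matrix(rnorm(n * d), ncol = d)

mu_hat <- colMeans(X)            # vector of length d
Sigma_m1_hat <- solve(cov(X))    # d x d inverse of the empirical covariance

# Squared Mahalanobis distances xi_i = (X_i - mu)' Sigma^{-1} (X_i - mu);
# inverted = TRUE signals that the inverse covariance is supplied.
xi_obs <- mahalanobis(X, center = mu_hat, cov = Sigma_m1_hat, inverted = TRUE)
```

These objects could then be supplied as mu = mu_hat and Sigma_m1 = Sigma_m1_hat in the call to EllDistrEst.adapt.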

Value

a list with the following elements:

Author(s)

Alexis Derumigny, Victor Ryan

References

Ryan, V., & Derumigny, A. (2024). On the choice of the two tuning parameters for nonparametric estimation of an elliptical distribution generator. arXiv:2408.17087.

See Also

EllDistrEst for the nonparametric estimation of the elliptical distribution density generator, EllDistrSim for the simulation of elliptical distribution samples.

estim_tilde_AMSE, which is used in this function, estimates a component of the asymptotic mean-squared error (AMSE) of the nonparametric estimator of the elliptical density generator, assuming h has been optimally chosen.

Examples

n = 500
d = 3
X = matrix(rnorm(n * d), ncol = d)
grid = seq(0, 5, by = 0.1)

result = EllDistrEst.adapt(X = X, grid = grid, h_firstStep = 0.05)
plot(grid, result$g, type = "l")
lines(grid, result$first_step_g, col = "blue")

# Computation of true values (generator of the standard Gaussian in dimension d = 3)
g = exp(-grid/2)/(2*pi)^(3/2)
lines(grid, g, type = "l", col = "red")

plot(grid, result$best_a, type = "l", col = "red")
plot(grid, result$best_h, type = "l", col = "red")

sum((g - result$g)^2, na.rm = TRUE) < sum((g - result$first_step_g)^2, na.rm = TRUE)


[Package ElliptCopulas version 0.1.4.1 Index]