CMB.stabpath {gfboost} | R Documentation
CMB stability paths
Description
Draws a stability path plot for CMB.
Usage
CMB.stabpath(
  D,
  nsing,
  Bsing = 1,
  alpha = 1,
  singfam = Gaussian(),
  evalfam = Gaussian(),
  sing = FALSE,
  Mseq,
  m_iter = 100,
  kap = 0.1,
  LS = FALSE,
  best = 1,
  wagg,
  robagg = FALSE,
  lower = 0,
  B,
  ncmb,
  ...
)
Arguments
D |
Data matrix. Must be an n \times (p+1)-dimensional data frame in the format (X,Y). The X part must not
contain an intercept column consisting only of ones, since this column is added automatically.
|
nsing |
Number of observations (rows) used for the SingBoost submodels.
|
Bsing |
Number of subsamples on which the SingBoost models are validated. Default is 1. Not to be confused with the parameter B for the Stability Selection.
|
alpha |
Optional real number in ]0,1]. Defines the fraction of best SingBoost models used in the aggregation step. Default is 1 (use all models).
|
singfam |
A SingBoost family. The SingBoost models are trained based on the corresponding loss function. Default is Gaussian() (squared loss).
|
evalfam |
A SingBoost family. The SingBoost models are validated according to the corresponding loss function. Default is Gaussian() (squared loss).
|
sing |
If sing=FALSE and the singfam family is a standard Boosting family contained in the package
mboost, the CMB aggregation procedure is executed for the corresponding standard Boosting models. Default is FALSE.
|
Mseq |
A vector of different values for M.
|
m_iter |
Number of SingBoost iterations. Default is 100.
|
kap |
Learning rate (step size). Must be a real number in ]0,1]. Default is 0.1. It is recommended to use
a value smaller than 0.5.
|
LS |
If a singfam object that is already provided by mboost is used, the respective Boosting algorithm
will be performed in the singular iterations if LS is set to TRUE. Default is FALSE.
|
best |
Needed in the case of localized ranking. The parameter K of the localized ranking loss is
computed as best \cdot n (rounded up to the next integer). Default is 1. Warning: a parameter K passed to the
LocRank family is ignored when executing SingBoost.
|
wagg |
Type of row weight aggregation. 'weights1' indicates that the selection frequencies of the (best)
SingBoost models are averaged. 'weights2' respects the validation losses for each model and downweights the ones
with higher validation losses.
|
robagg |
Optional. If robagg=TRUE, the best SingBoost models are ignored when executing the
aggregation to avoid inlier effects. Only reasonable in combination with lower. Default is FALSE.
|
lower |
Optional argument. Only reasonable when setting robagg=TRUE. lower is a real number in [0,1[ (a rather
small number is recommended) and indicates the fraction of best-performing SingBoost models that the
aggregation ignores to avoid possible inlier effects. Default is 0.
|
B |
Number of subsamples of size ncmb of the training data for CMB aggregation.
|
ncmb |
Number of samples used for CMB. Integer that must be smaller than the number of samples in D.
|
... |
Optional further arguments
|
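The constraints stated in the argument table can be checked before calling the function. The following base R sketch builds a data frame in the (X,Y) format from the built-in iris data and computes K from best. The helper check_cmb_args is hypothetical and not part of gfboost, and taking n to be the number of rows of D is an assumption made only for this illustration.

```r
# Build D in the (X, Y) format: predictors first, response last, no intercept column.
X <- model.matrix(Sepal.Length ~ ., data = iris)[, -1]  # drop the "(Intercept)" column
D <- data.frame(X, Y = iris$Sepal.Length)

n <- nrow(D)       # number of observations
p <- ncol(D) - 1   # number of predictor columns (Species is dummy-coded)

# Hypothetical helper (NOT part of gfboost): checks the documented ranges.
check_cmb_args <- function(D, alpha = 1, kap = 0.1, lower = 0, ncmb) {
  stopifnot(
    is.data.frame(D),
    alpha > 0, alpha <= 1,   # alpha in ]0,1]
    kap > 0, kap <= 1,       # kap in ]0,1]
    lower >= 0, lower < 1,   # lower in [0,1[
    ncmb < nrow(D)           # ncmb must be smaller than the number of rows
  )
  invisible(TRUE)
}

check_cmb_args(D, alpha = 0.5, kap = 0.1, lower = 0, ncmb = 100)

# K for the localized ranking loss: best * n, rounded up.
# Assumption: n is nrow(D); the function itself may use a different sample size.
best <- 0.25
K <- ceiling(best * n)
```

The helper stops with an error as soon as one of the documented ranges is violated, which makes invalid settings visible before any boosting model is fitted.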
Value
relev |
List of relevant variables (represented as their column number).
|
ind |
Vector of relevant variables (represented as their column number).
|
References
Werner, T., Gradient-Free Gradient Boosting, PhD Thesis, Carl von Ossietzky University Oldenburg, 2020
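Examples
A minimal sketch of a call, assuming the signature shown under Usage. The data set, the values chosen for nsing, Mseq, B and ncmb, and the seed are illustrative assumptions and not recommendations. The call is only executed if gfboost is installed.

```r
# Data in the (X, Y) format: predictors first, response last, no intercept column.
X <- model.matrix(Sepal.Length ~ ., data = iris)[, -1]
D <- data.frame(X, Y = iris$Sepal.Length)

# Only run the call when the gfboost package is available.
if (requireNamespace("gfboost", quietly = TRUE)) {
  set.seed(1)  # arbitrary seed, for reproducible subsampling
  res <- gfboost::CMB.stabpath(
    D,
    nsing = 80,          # rows used for each SingBoost submodel (assumed value)
    Bsing = 1,
    alpha = 1,
    sing = FALSE,
    Mseq = c(5, 10),     # candidate values for M (assumed values)
    m_iter = 100,
    kap = 0.1,
    LS = FALSE,
    wagg = "weights1",   # average the selection frequencies
    robagg = FALSE,
    lower = 0,
    B = 10,              # number of subsamples for CMB aggregation (assumed value)
    ncmb = 100           # smaller than nrow(D) = 150
  )
  str(res$relev)  # relevant variables as a list
  str(res$ind)    # relevant variables as a vector
}
```

The families singfam and evalfam are left at their defaults (Gaussian(), squared loss), so no family object has to be constructed by hand.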
[Package gfboost version 0.1.1 Index]