tsmoothlm {esemifar} | R Documentation
This function runs an iterative plug-in algorithm to find the optimal bandwidth for the estimation of the nonparametric trend in equidistant time series (with long-memory errors) and then employs the resulting bandwidth via either local polynomial or kernel regression.
tsmoothlm(
y,
pmin = c(0, 1, 2, 3, 4, 5),
pmax = c(0, 1, 2, 3, 4, 5),
qmin = c(0, 1, 2, 3, 4, 5),
qmax = c(0, 1, 2, 3, 4, 5),
p = c(1, 3),
mu = c(0, 1, 2, 3),
InfR = c("Opt", "Nai", "Var"),
bStart = 0.15,
bb = c(0, 1),
cb = 0.05,
method = c("lpr", "kr")
)
y | a numeric vector that contains the time series ordered from past to present.
pmin | an integer value
pmax | an integer value
qmin | an integer value
qmax | an integer value
p | an integer
mu | an integer
InfR | a character object that represents the inflation rate in the form
bStart | a numeric object that indicates the starting value of the bandwidth for the iterative process; should be
bb | can be set to 0 or 1
cb | a numeric value that indicates the percentage of omitted observations on each side of the observation period for the automated bandwidth selection; is set to 0.05 by default
method | the final smoothing approach; "lpr" or "kr"
The trend is estimated based on the additive nonparametric regression model for an equidistant time series
y_t = m(x_t) + \epsilon_t,
where y_t is the observed time series, x_t is the rescaled time on the interval [0, 1], m(x_t) is a smooth and deterministic trend function and \epsilon_t are stationary errors with E(\epsilon_t) = 0 that are assumed to follow a FARIMA(p, d, q) model (see also Beran and Feng, 2002a, Beran and Feng, 2002b and Beran and Feng, 2002c).
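As a rough illustration of the model above, the following base R sketch simulates a series made of a smooth trend plus FARIMA(0, d, 0) errors. The trend function, the value d = 0.3, the innovation standard deviation, the truncation lag and all object names are assumptions chosen for the example; none of them come from the package.

```r
# Illustrative sketch only: simulate y_t = m(x_t) + eps_t with long-memory errors.
set.seed(1)
n <- 500
x <- (1:n) / n                       # rescaled time x_t = t / n

# Hypothetical smooth trend (an assumption for this example)
m <- 2 + sin(2 * pi * x)

# FARIMA(0, d, 0) errors via a truncated MA(infinity) representation:
# psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j
d <- 0.3
J <- 1000                            # truncation lag (assumption)
psi <- cumprod(c(1, ((1:J) - 1 + d) / (1:J)))
innov <- rnorm(n + J, sd = 0.1)

# One-sided convolution; the first J values are NA and are dropped
eps_full <- stats::filter(innov, psi, method = "convolution", sides = 1)
eps <- as.numeric(eps_full)[(J + 1):(n + J)]

y <- m + eps                         # simulated observations
```

A series like `y` could then be passed to `tsmoothlm()` in the same way as the data in the example at the end of this page.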
The iterative plug-in (IPI) algorithm, which numerically minimizes the Asymptotic Mean Integrated Squared Error (AMISE), is based on the proposal of Beran and Feng (2002a).
The function calculates suitable estimates for c_f, the variance factor, and I[m^{(k)}] over different iterations. In each iteration, a bandwidth is obtained in accordance with the AMISE that once more serves as an input for the following iteration. The process repeats until either convergence or the 40th iteration is reached. For further details on the asymptotic theory or the algorithm, please see Letmathe et al. (2023).
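The control flow described above (update the bandwidth, feed it back in, stop at convergence or after 40 iterations) can be sketched in base R as follows. This is only a sketch of the loop structure: `update_bandwidth()` is a hypothetical toy stand-in, not the bandwidth formula used by `tsmoothlm()`, and the tolerance `tol` is likewise an assumption, since the convergence criterion is not stated here.

```r
# Illustrative sketch of the iteration logic only.
# update_bandwidth() is a hypothetical stand-in and NOT the package's formula.
update_bandwidth <- function(b) 0.5 * b + 0.06   # toy contraction, fixed point 0.12

ipi_sketch <- function(b_start = 0.15, max_iter = 40, tol = 1e-4) {
  b_path <- numeric(0)
  b_old <- b_start
  for (i in seq_len(max_iter)) {
    b_new <- update_bandwidth(b_old)     # bandwidth from the current estimates
    b_path <- c(b_path, b_new)
    if (abs(b_new - b_old) < tol) break  # convergence reached
    b_old <- b_new                       # input for the next iteration
  }
  list(b_opt = b_new, iterations = b_path, n_iter = length(b_path))
}

out <- ipi_sketch()
```

In the real algorithm each update would re-estimate c_f and I[m^{(k)}] from the data before computing the next bandwidth; the toy update above only makes the stopping rule visible.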
To apply the function, the following arguments are needed: a data input y, an order of polynomial p, a kernel weighting function defined by the smoothness parameter mu, an inflation rate setting InfR (see also Beran and Feng, 2002b), a starting value for the relative bandwidth bStart, a boundary method bb, a boundary cut-off percentage cb and a final smoothing method method. In fact, aside from the input vector y, every argument has a default setting that can be adjusted for the individual case. Theoretically, the initial bandwidth does not affect the selected optimal bandwidth. However, in practice local minima of the AMISE might exist and influence the selected bandwidth. Therefore, the default setting is bStart = 0.15. In the rare case of a clearly unsuitable optimal bandwidth, a starting bandwidth that differs from the default value is a first possible approach to obtain a better result. Other argument adjustments can be tried as well. For more specific information on the input arguments consult the section Arguments.
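One way to check the sensitivity to the starting bandwidth is to rerun the function with a few values of bStart and compare the printed results. The sketch below assumes the esemifar package and its gdpG7 data are installed and reuses the call from the example at the end of this page; the particular starting values are arbitrary choices for illustration.

```r
# Sketch: compare the selected bandwidth for several starting values.
# The loop runs only if the esemifar package is installed.
starts <- c(0.10, 0.15, 0.20)   # illustrative starting bandwidths (assumption)

if (requireNamespace("esemifar", quietly = TRUE)) {
  y <- log(esemifar::gdpG7$gdp)
  for (b in starts) {
    est <- esemifar::tsmoothlm(y, p = 1, pmax = 1, qmax = 1,
                               InfR = "Opt", bStart = b)
    print(est)                  # the print method reports the selected bandwidth
  }
}
```

If the printed bandwidths agree across starting values, a local minimum of the AMISE is less likely to have driven the result.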
When applying the function, an optimal bandwidth is obtained based on a strongly modified version of the IPI algorithm of Beran and Feng (2002a). In a second step, the nonparametric trend of the series is calculated with respect to the chosen bandwidth and the selected regression method ("lpr" or "kr"). Please note that method = "lpr" is strongly recommended by the authors. Moreover, it is notable that p is automatically set to 1 for method = "kr". The output object is then a list that contains, among other components, the original time series, the estimated trend values and the series without the trend.
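To make the relation between the three series concrete, here is a minimal base R sketch that smooths a hypothetical series with a simple kernel (Nadaraya-Watson) estimator on rescaled time and then removes the trend. The data, the hand-picked bandwidth and the Epanechnikov-type kernel are assumptions for the example; the sketch has no boundary correction and is not the package's kernel regression.

```r
# Illustrative sketch: kernel trend estimate on rescaled time, followed by
# detrending. Not the package's implementation.
set.seed(42)
n <- 200
x <- (1:n) / n
orig <- 1 + x^2 + rnorm(n, sd = 0.05)    # hypothetical input series

b <- 0.15                                # bandwidth, chosen by hand here
kern <- function(u) ifelse(abs(u) <= 1, 1 - u^2, 0)  # Epanechnikov shape

# Weighted average of the observations around each time point
trend <- vapply(x, function(x0) {
  w <- kern((x - x0) / b)
  sum(w * orig) / sum(w)
}, numeric(1))

res <- orig - trend                      # series without the trend
```

By construction, `orig` equals `trend + res`, which mirrors how the original series, the trend estimates and the detrended series in the output list relate to each other.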
The default print method for this function delivers only key numbers such as the iteration steps and the generated optimal bandwidth rounded to the fourth decimal. The exact numbers and results such as the estimated nonparametric trend series are saved within the output object and can be addressed via the $ sign.
The function returns a list with different components:
the Bayesian Information Criterion of the optimal FARIMA(p,d,q) model.
the percentage of omitted observations on each side of the observation period.
the optimal bandwidth chosen by the IPI-algorithm.
the boundary bandwidth method used within the IPI; always equal to 1.
the starting value of the (relative) bandwidth; input argument.
the estimated variance factor; in contrast to the definitions given in the Details section, this object actually contains an estimated value of 2\pi c_f, i.e. it corresponds to the estimated sum of autocovariances.
the long-memory parameter of the optimal FARIMA(p,d,q) model.
the model fit of the selected FARIMA(p,d,q) model.
the estimated value of I[m^{(k)}].
the setting for the inflation rate according to the chosen algorithm.
the bandwidths of the single iteration steps.
the smoothness parameter of the second order kernel; input argument.
the number of observations.
the total number of iterations until convergence.
the original input series; input argument.
the order p of the optimal FARIMA(p,d,q) model.
the order of polynomial used in the IPI-algorithm; also used for the final smoothing, if method = "lpr"; input argument.
the order q of the optimal FARIMA(p,d,q) model.
the estimated residual series.
the considered order of derivative of the trend; is always zero for this function.
the weighting system matrix used within the local polynomial regression; this matrix is a condensed version of a complete weighting system matrix; in each row of ws, the weights for conducting the smoothing procedure at a specific observation time point can be found; the first [nb + 0.5] rows, where n corresponds to the number of observations, b is the bandwidth considered for smoothing and [.] denotes the integer part, contain the weights at the [nb + 0.5] left-hand boundary points; the weights in row [nb + 0.5] + 1 are representative for the estimation at all interior points and the remaining rows contain the weights for the right-hand boundary points; each row has exactly 2[nb + 0.5] + 1 elements, more specifically the weights for observations of the nearest 2[nb + 0.5] + 1 time points; moreover, the weights are normalized, i.e. the weights are obtained under consideration of the time points x_t = t/n, where t = 1, 2, ..., n.
the nonparametric estimates of the trend.
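The layout of the weighting system matrix ws described above can be illustrated with a small base R sketch. It computes [nb + 0.5] and the row length for hypothetical values of n and b, and then builds one row of local linear weights for an interior point. The kernel shape (1 - u^2)^mu, the slightly widened bandwidth (h + 0.5) / n and the helper `lp_weights()` are assumptions made for the example, not the package's internal code.

```r
# Illustrative sketch of the layout described above; not the package's code.
n <- 200
b <- 0.1
h <- floor(n * b + 0.5)          # [nb + 0.5], here 20
row_len <- 2 * h + 1             # elements per row, here 41

# Local linear weights for estimating the trend at x0 from the points xs,
# assuming a kernel proportional to (1 - u^2)^mu on [-1, 1]
lp_weights <- function(xs, x0, bw, mu = 1) {
  u <- (xs - x0) / bw
  k <- ifelse(abs(u) <= 1, (1 - u^2)^mu, 0)
  X <- cbind(1, xs - x0)                     # design matrix for p = 1
  as.numeric(solve(t(X) %*% (k * X), t(k * X))[1, ])
}

x <- (1:n) / n                   # normalized time points x_t = t / n
interior <- h + 1                # first interior point
idx <- (interior - h):(interior + h)
# Bandwidth widened by half a step so all 2h + 1 points get positive weight
w_int <- lp_weights(x[idx], x[interior], bw = (h + 0.5) / n)
```

The resulting vector `w_int` has 2[nb + 0.5] + 1 elements that sum to one, and applying it to the corresponding stretch of observations gives the local linear estimate at that interior point.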
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Sebastian Letmathe (Scientific Employee) (Department of Economics,
Paderborn University),
Package Creator and Maintainer
Dominik Schulz (Scientific Employee) (Department of Economics,
Paderborn University),
Author
Beran, J. and Feng, Y. (2002a). Iterative plug-in algorithms for SEMIFAR models - definition, convergence, and asymptotic properties. Journal of Computational and Graphical Statistics, 11(3), 690-713.
Beran, J. and Feng, Y. (2002b). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54(2), 291-311.
Beran, J. and Feng, Y. (2002c). SEMIFAR models - a semiparametric approach to modelling trends, long-range dependence and nonstationarity. Computational Statistics & Data Analysis, 40(2), 393-419.
Letmathe, S., Beran, J. and Feng, Y. (2023). An extended exponential SEMIFAR model with application in R. Communications in Statistics - Theory and Methods: 1-13.
### Example 1: G7-GDP ###
library(esemifar)

# Logarithm of test data
# -> the logarithm of the data is assumed to follow the additive model
test_data <- gdpG7
y <- log(test_data$gdp)
n <- length(y)

# Apply tsmoothlm() to estimate the trend
result <- tsmoothlm(y, p = 1, pmax = 1, qmax = 1, InfR = "Opt")
trend1 <- result$ye

# Plot of the results
t <- seq(from = 1962, to = 2020, length.out = n)
plot(t, y, type = "l", xlab = "Year", ylab = "log(G7-GDP)", bty = "n",
     lwd = 1, lty = 3,
     main = "Estimated trend for log-quarterly G7-GDP, Q1 1962 - Q4 2019")
points(t, trend1, type = "l", col = "red", lwd = 1)
title(sub = expression(italic("Figure 1")), col.sub = "gray47",
      cex.sub = 0.6, adj = 0)

# Print the key results (iteration steps and selected bandwidth)
result