DiscretizeData {NPHazardRate} | R Documentation
Defines equispaced, disjoint intervals based on the range of the sample and calculates the empirical hazard rate estimate at each interval center.
DiscretizeData(xin, xout)
xin: A vector of input values.
xout: Grid points where the function will be evaluated.
The function defines the subinterval length \Delta = (0.8\max(X_i) - \min(X_i))/N, where N is the sample size. Then, at each bin (subinterval) center, the empirical hazard rate estimate is calculated by
c_i = \frac{f_i}{\Delta(N - F_i + 1)},
where f_i is the frequency of observations in the ith bin and F_i = \sum_{j\leq i} f_j is the cumulative frequency up to and including the ith bin, that is, the unnormalized empirical cumulative distribution estimate.
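The formulas above can be traced by hand with a few lines of base R. The following is only a minimal sketch, not the package's implementation: the helper name `empirical_hazard_sketch`, the choice of N bins starting at the sample minimum, and the decision to leave observations beyond the last bin edge uncounted are all assumptions made for illustration.

```r
# Hypothetical helper: NOT the NPHazardRate code, only the formulas stated above.
empirical_hazard_sketch <- function(x) {
  N <- length(x)
  stopifnot(N > 0, 0.8 * max(x) > min(x))   # the bin length must be positive

  # Subinterval length: Delta = (0.8 * max(X_i) - min(X_i)) / N
  Delta <- (0.8 * max(x) - min(x)) / N

  # Assumption: N equispaced bins whose first edge is the sample minimum
  breaks  <- min(x) + Delta * (0:N)
  centers <- breaks[-1] - Delta / 2

  # f_i: number of observations falling in the ith bin
  # (assumption: values beyond the last edge are not counted)
  idx    <- findInterval(x, breaks, rightmost.closed = TRUE)
  inside <- idx >= 1 & idx <= N
  f      <- tabulate(idx[inside], nbins = N)

  # F_i: cumulative frequency, sum of f_j for j <= i
  cumf <- cumsum(f)

  # c_i = f_i / (Delta * (N - F_i + 1))
  ci <- f / (Delta * (N - cumf + 1))

  list(Delta = Delta, centers = centers, f = f, cumf = cumf, ci = ci)
}

# Small deterministic sample so each quantity can be checked by hand
res <- empirical_hazard_sketch(c(1, 2, 3, 4, 10))
print(res)
```

Because F_i never exceeds N, the denominator N - F_i + 1 is always at least 1, so the estimate stays finite even in the last occupied bin.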
A list with the components accessed in the example below: BinCenters (the bin centers), ci (the empirical hazard rate estimates at the bin centers), and Delta (the bin length).
x<-seq(0, 5, length.out=100) # design points where the estimate will be calculated
SampleSize<-100 # amount of data to be generated
ti<-rweibull(SampleSize, .6, 1) # draw a random sample
ui<-rexp(SampleSize, .2) # censoring sample
cat("\n AMOUNT OF CENSORING (%): ", length(which(ti>ui))/length(ti)*100, "\n")
x1<-pmin(ti,ui) # observed data
cen<-rep.int(1, SampleSize) # initialize censoring indicators
cen[which(ti>ui)]<-0 # 0's mark censored observations
a.use<-DiscretizeData(ti, x) # discretize the data
BinCenters<-a.use$BinCenters # get the bin centers
ci<-a.use$ci # get empirical hazard rate estimates
Delta<-a.use$Delta # bin length