Title: | Smooth Survival Models, Including Generalized Survival Models |
---|---|
Description: | R implementation of generalized survival models (GSMs), smooth accelerated failure time (AFT) models and Markov multi-state models. For the GSMs, g(S(t|x))=eta(t,x) for a link function g, survival S at time t with covariates x and a linear predictor eta(t,x). The main assumption is that the time effect(s) are smooth <doi:10.1177/0962280216664760>. For fully parametric models with natural splines, this re-implements Stata's 'stpm2' function, which are flexible parametric survival models developed by Royston and colleagues. We have extended the parametric models to include any smooth parametric smoothers for time. We have also extended the model to include any smooth penalized smoothers from the 'mgcv' package, using penalized likelihood. These models include left truncation, right censoring, interval censoring, gamma frailties and normal random effects <doi:10.1002/sim.7451>, and copulas. For the smooth AFTs, S(t|x) = S_0(t*eta(t,x)), where the baseline survival function S_0(t)=exp(-exp(eta_0(t))) is modelled for natural splines for eta_0, and the time-dependent cumulative acceleration factor eta(t,x)=\int_0^t exp(eta_1(u,x)) du for log acceleration factor eta_1(u,x). The Markov multi-state models allow for a range of models with smooth transitions to predict transition probabilities, length of stay, utilities and costs, with differences, ratios and standardisation. |
Authors: | Mark Clements [aut, cre], Xing-Rong Liu [aut], Benjamin Christoffersen [aut], Paul Lambert [ctb], Lasse Hjort Jakobsen [ctb], Alessandro Gasparini [ctb], Gordon Smyth [cph], Patrick Alken [cph], Simon Wood [cph], Rhys Ulerich [cph] |
Maintainer: | Mark Clements <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 1.6.6 |
Built: | 2024-11-22 06:11:00 UTC |
Source: | https://github.com/mclements/rstpm2 |
This implements the accelerated failure time models S_0(t exp(beta x)) and S_0(int_0^t exp(beta x(u)) du). The baseline function S_0(t*) is modelled as exp(-exp(eta_0(log(t*)))), where eta_0(log(t*)) is a linear predictor using natural splines.
aft(formula, data, smooth.formula = NULL, df = 3, tvc = NULL, cure.formula = ~1, control = list(), init = NULL, weights = NULL, tvc.intercept = TRUE, tvc.integrated = FALSE, timeVar = "", time0Var = "", cure = FALSE, mixture = FALSE, contrasts = NULL, subset = NULL, ...)
aft(formula, data, smooth.formula = NULL, df = 3, tvc = NULL, cure.formula = ~1, control = list(), init = NULL, weights = NULL, tvc.intercept = TRUE, tvc.integrated = FALSE, timeVar = "", time0Var = "", cure = FALSE, mixture = FALSE, contrasts = NULL, subset = NULL, ...)
formula |
a formula object, with the response on the left of a |
data |
a data-frame in which to interpret the variables named in the
|
smooth.formula |
a formula for describing the time effects for the linear predictor, excluding the baseline S_0(t*), but including time-dependent acceleration factors. The time-dependent acceleration factors can be modelled with any smooth functions. |
df |
an integer that describes the degrees of freedom for the |
tvc |
a list with the names of the time-varying coefficients. This uses natural splines
(e.g. |
cure.formula |
a formula for describing the cure fraction. |
control |
|
init |
|
weights |
an optional vector of 'prior weights' to be used in the
fitting process. Should be |
tvc.intercept |
logical for whether to include an intercept in the time-varying acceleration factor (defaults to TRUE) |
tvc.integrated |
logical for whether the time-varying acceleration factor should be based on a integration, rather than a cumulative effect (defaults to FALSE) |
timeVar |
string variable defining the time variable. By default, this is determined from the survival object, however this may be ambiguous if two variables define the time. |
time0Var |
string variable to determine the entry variable; useful for when more than one data variable is used in the entry time. |
cure |
logical for whether to model for cure using a non-mixture model (default=FALSE) |
mixture |
logical for whether to model for cure using a mixture model (default=FALSE) |
contrasts |
an optional list. See the |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
... |
additional arguments to be passed to the |
The implementation extends the mle2
object from the
bbmle
package. The model inherits all of the methods from the
mle2
class.
An aft-class
object that inherits from mle2-class
.
Mark Clements.
summary(aft(Surv(rectime,censrec==1)~hormon,data=brcancer,df=4))
summary(aft(Surv(rectime,censrec==1)~hormon,data=brcancer,df=4))
Regression object for aft
.
Objects can be created by calls of the form new("aft", ...)
and
aft( ...)
.
args
:Object of class "list"
~~
Class for mle2
, directly.
signature(x = "aft", y = "missing")
: ...
signature(x = "aft")
: ...
signature(object = "aft")
: ...
signature(object = "aft", ...)
: ...
showClass("aft")
showClass("aft")
Defined as the identity function.
bhazard(x)
bhazard(x)
x |
Input (and output) value |
Returns the input value
See https://www.stata-press.com/data/r11/brcancer.dta.
data(brcancer)
data(brcancer)
A data frame with 686 observations on the following 15 variables.
id
a numeric vector
hormon
hormonal therapy
x1
age, years
x2
menopausal status
x3
tumour size, mm
x4
tumour grade
x5
number of positive nodes
x6
progesterone receptor, fmol
x7
estrogen receptor, fmol
rectime
recurrence free survival time, days
censrec
censoring indicator
x4a
tumour grade>=2
x4b
tumour grade==3
x5e
exp(-0.12*x5)
data(brcancer) ## maybe str(brcancer) ; plot(brcancer) ...
data(brcancer) ## maybe str(brcancer) ; plot(brcancer) ...
Generic method to update the coef in an object.
coef(x) <- value
coef(x) <- value
x |
object to be updated |
value |
value of the coefficient to be updated. |
This simple generic method is used for the numerical delta method.
The updated object is returned.
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (x, value) UseMethod("coef<-")
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (x, value) UseMethod("coef<-")
Diagnoses of colon cancer.
data(colon)
data(colon)
A data frame with 15564 observations on the following 13 variables.
sex
Sex (1=male, 2=female))
age
Age at diagnosis
stage
Clinical stage at diagnosis (1=Unknown, 2=Localised, 3=Regional, 4=Distant)
mmdx
Month of diagnosis
yydx
Year of diagnosis
surv_mm
Survival time in months
surv_yy
Survival time in years
status
Vital status at last contact (1=Alive, 2=Dead: cancer, 3=Dead; other, 4=Lost to follow-up)
subsite
Anatomical subsite of tumour (1=Coecum and ascending, 2=Transverse, 3=Descending and sigmoid, 4=Other and NOS)
year8594
Year of diagnosis (1=Diagnosed 75-84, 2=Diagnosed 85-94)
agegrp
Age in 4 categories (1=0-44, 2=45-59, 3=60-74, 4=75+)
dx
Date of diagnosis
exit
Date of exit
Caution: there is a colon
dataset in the survival
package. We recommend using data(colon,package="rstpm2")
to
ensure the correct dataset is used.
data(colon,package="rstpm2") # avoids name conflict with survival::colon ## maybe str(colon) ; ...
data(colon,package="rstpm2") # avoids name conflict with survival::colon ## maybe str(colon) ; ...
coxph
modelTest for a time-varying effect in the coxph
model by re-fitting
the partial likelihood including a time-varying effect, plot
the effect size, and return the re-fitted model. The
main advantage of this function over the tt()
special is
that it scales well for moderate sized datasets
(cf. tt
which expands the dataset and scales very poorly).
cox.tvc(obj, var=NULL, method="logt")
cox.tvc(obj, var=NULL, method="logt")
obj |
A |
var |
String for the effect name. Currently assumes simple continuous effects. |
method |
A string representing the possible time transformations. Currently only "logt". |
Returns a tvcCoxph
object (which inherits from the mle2
class) of the re-fitted model.
## As per the example for cox.zph: fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data=ovarian) temp <- rstpm2:::cox.tvc(fit, "age") print(temp) # display the results plot(temp) # plot curves
## As per the example for cox.zph: fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data=ovarian) temp <- rstpm2:::cox.tvc(fit, "age") print(temp) # display the results plot(temp) # plot curves
S3 method for to provide exponentiated coefficents with confidence intervals.
eform(object, ...) ## S3 method for class 'stpm2' eform(object, parm, level = 0.95, method = c("Profile","Delta"), name = "exp(beta)", ...) ## Default S3 method: eform(object, parm, level = 0.95, method = c("Delta","Profile"), name = "exp(beta)", ...)
eform(object, ...) ## S3 method for class 'stpm2' eform(object, parm, level = 0.95, method = c("Profile","Delta"), name = "exp(beta)", ...) ## Default S3 method: eform(object, parm, level = 0.95, method = c("Delta","Profile"), name = "exp(beta)", ...)
object |
regression object |
parm |
not currently used |
level |
significance level for the confidence interval |
method |
method for confidence interval estimation |
name |
name for the fitted value |
... |
other arguments |
Numerical gradient for a function at a given value (internal).
grad(func, x, ...)
grad(func, x, ...)
func |
Function taking a vector argument x (returns a vector of length>=1) |
x |
vector of arguments for where the gradient is wanted. |
... |
other arguments to the function |
(func(x+delta,...)-func(x-delta,...))/(2 delta) where delta is the third root of the machine precision times pmax(1,abs(x)).
A vector if func(x) has length 1, otherwise a matrix with rows for x and columns for func(x).
Mark Clements.
numDelta()
This implements the generalised survival model g(S(t|x)) = eta, where g is a link function, S is survival, t is time, x are covariates and eta is a linear predictor. The linear predictor can include either parametric or penalised smoothers for the time effects, for time:covariate interactions and for covariate effects. The main model assumption is that the time effects in the linear predictor are smooth. This extends the class of flexible parametric survival models developed by Royston and colleagues. The model has been extended to include relative survival (excess hazards), Gamma frailties and normal random effects.
gsm(formula, data, smooth.formula = NULL, smooth.args = NULL, df = 3, cure = FALSE, tvc = NULL, tvc.formula = NULL, control = list(), init = NULL, weights = NULL, robust = FALSE, baseoff = FALSE, timeVar = "", time0Var = "", use.gr = NULL, optimiser=NULL, log.time.transform=TRUE, reltol=NULL, trace = NULL, link.type=c("PH","PO","probit","AH","AO"), theta.AO=0, contrasts = NULL, subset = NULL, robust_initial=NULL, coxph.strata = NULL, coxph.formula = NULL, logH.formula = NULL, logH.args = NULL, bhazard = NULL, bhazinit=NULL, copula=FALSE, frailty = !is.null(cluster) & !robust & !copula, cluster = NULL, logtheta=NULL, nodes=NULL, RandDist=c("Gamma","LogN"), recurrent = FALSE, adaptive = NULL, maxkappa = NULL, sp=NULL, criterion=NULL, penalty=NULL, smoother.parameters=NULL, Z=~1, outer_optim=NULL, alpha=1, sp.init=1, penalised=FALSE, ...) stpm2(formula, data, weights=NULL, subset=NULL, coxph.strata=NULL, ...) pstpm2(formula, data, weights=NULL, subset=NULL, coxph.strata=NULL, ...)
gsm(formula, data, smooth.formula = NULL, smooth.args = NULL, df = 3, cure = FALSE, tvc = NULL, tvc.formula = NULL, control = list(), init = NULL, weights = NULL, robust = FALSE, baseoff = FALSE, timeVar = "", time0Var = "", use.gr = NULL, optimiser=NULL, log.time.transform=TRUE, reltol=NULL, trace = NULL, link.type=c("PH","PO","probit","AH","AO"), theta.AO=0, contrasts = NULL, subset = NULL, robust_initial=NULL, coxph.strata = NULL, coxph.formula = NULL, logH.formula = NULL, logH.args = NULL, bhazard = NULL, bhazinit=NULL, copula=FALSE, frailty = !is.null(cluster) & !robust & !copula, cluster = NULL, logtheta=NULL, nodes=NULL, RandDist=c("Gamma","LogN"), recurrent = FALSE, adaptive = NULL, maxkappa = NULL, sp=NULL, criterion=NULL, penalty=NULL, smoother.parameters=NULL, Z=~1, outer_optim=NULL, alpha=1, sp.init=1, penalised=FALSE, ...) stpm2(formula, data, weights=NULL, subset=NULL, coxph.strata=NULL, ...) pstpm2(formula, data, weights=NULL, subset=NULL, coxph.strata=NULL, ...)
formula |
a formula object, with the response on the left of a |
data |
a data.frame in which to interpret the variables named in
the |
smooth.formula |
either a parametric formula or a penalised |
df |
an integer that describes the degrees of freedom for the |
smooth.args |
a list describing the arguments for the |
tvc |
a list with the names of the time-varying coefficients. For a parametric
model, this uses natural splines (e.g. |
tvc.formula |
separate formula for the time-varying effects. This is combined with
|
baseoff |
Boolean used to determine whether fully define the model
using |
logH.args |
as per |
logH.formula |
as per |
cure |
logical for whether to estimate a cure model (parametric model only). |
control |
list of arguments passed to |
init |
|
coxph.strata |
variable in the |
weights |
an optional vector of 'prior weights' to be used in the
fitting process. Should be |
robust |
Boolean used to determine whether to use a robust variance estimator. |
bhazard |
variable for the baseline hazard for relative survival |
bhazinit |
scalar used to adjust the background cumulative hazards for calculating
initial values. Default=0.1. Deprecated argument: use of the
|
copula |
logical to indicate whether to use a copula model (experimental) |
timeVar |
variable defining the time variable. By default, this is determined from the survival object, however this may be ambiguous if two variables define the time |
sp |
fix the value of the smoothing parameters. |
use.gr |
in R, a Boolean to determine whether to use the gradient
in the optimisation. Default=TRUE, Deprecated argument: use of the
|
criterion |
in Rcpp, determine whether to use "GCV" or "BIC" for for the smoothing parameter selection. |
penalty |
use either the "logH" penalty, which is the default
penalty from mgcv, or the "h" hazard penalty. Default="logH". Deprecated argument: use of the
|
smoother.parameters |
for the hazard penalty, a list with components which are lists with components var, transform and inverse. |
alpha |
an ad hoc tuning parameter for the smoothing parameter. |
sp.init |
initial values for the smoothing parameters. |
trace |
integer for trace reporting; 0 represents no additional
reporting. Default=0. Deprecated argument: use of the
|
contrasts |
an optional list. See the |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
coxph.formula |
additional formula used to improve the fitting of initial values [optional and rarely used]. |
time0Var |
string variable to determine the entry variable; useful for when more than one data variable is used in the entry time. |
link.type |
type of link function. For "PH" (generalised proportional hazards), g(S)=log(-log(S)); for "PO" (generalised proportional odds), g(S)=-logit(S); for "probit" (generalised probit), g(S)=-probit(S); for "AH" (generalised additive hazards), g(S)=-log(S); for "AO" (generalised Aranda-Ordaz), g(S)=log((S^(-theta.AO)-1)/theta.AO). |
theta.AO |
theta parameter for the Aranda-Ordaz link type. |
optimiser |
select which optimiser is used. Default="BFGS". Deprecated argument: use of the
|
log.time.transform |
should a log-transformation be used for calculating the derivative of the design matrix with respect to time? (default=TRUE) |
recurrent |
logical for whether clustered, left truncated data are recurrent or for first event (where the latter requires an adjustment for the frailties or random effects) |
frailty |
logical for whether to fit a shared frailty model |
cluster |
variable that determines the cluster for the frailty. This can be a vector, a string for the column, or a name. This can also be specified using a special. |
logtheta |
initial value for log-theta used in the gamma shared frailty
model (defaults to value from a |
nodes |
number of integration points for Gaussian
quadrature. Default=9. Deprecated argument: use of the
|
RandDist |
type of distribution for the random effect or frailty |
adaptive |
logical for whether to use adaptive or non-adaptive
quadrature, Default=TRUE. Deprecated argument: use of the
|
maxkappa |
double float value for the maximum value of the weight
used in the constraint. Default=1000. Deprecated argument: use of the
|
Z |
formula for the design matrix for the random effects |
reltol |
list with components for search and final relative
tolerances. Default=list(search=1e-10, final=1e-10, outer=1e-5). Deprecated argument: use of the
|
outer_optim |
Integer to indicate the algorithm for outer optimisation. If outer_optim=1 (default), then use Neldear-Mead, otherwise use Nlm. |
robust_initial |
logical for whether to use Nelder-Mead
to find initial values (max 50 iterations). This is useful for
ill-posed initial values. Default= FALSE. Deprecated argument: use of the
|
penalised |
logical to show whether to use penalised models with
|
... |
additional arguments to be passed to the |
The implementation extends the mle2
object from the
bbmle
package.
The default smoothers for time on the linear predictor scale are
nsxs(log(time),df=3)
for the parametric model and
s(log(time))
for the penalised model.
A frequently asked question is: why does rstpm2 give different spline
estimates to flexsurv and Stata's stpm2? The short answer is that
rstpm2 uses a different natural spline basis compared with flexsurv
and Stata's stpm2 and slightly different
knot placement than Stata's stpm2. If the knot
placement is the same, then the predictions and other coefficients are
expected to be very similar. As a longer answer, the default smoother
in rstpm2 is to use an extension of the splines::ns
function
(rstpm2::nsx
), which uses a QR projection of B-splines for
natural splines. In contrast, flexsurv and Stata's stpm2 use truncated
power splines for the natural spline basis (also termed 'restricted
cubic splines'). The B-splines are known to
have good numerical properties, while Stata's stpm2 implementation
defaults to using matrix orthogonalisation to account for any
numerical instability in the truncated power basis. Furthermore,
rstpm2 allows for any smooth parametric function to be used as a
smoother in stpm2
/gsm
, which is an extension over
flexsurv and Stata's stpm2. Finally, it may be difficult to get rstpm2 and
Stata's stpm2 to return the same estimates: although
nsx
includes an argument stata.stpm2.compatible = FALSE
(change to TRUE
for
compatibility), the design matrix for rstpm2 is based on individuals
with events, while Stata's stpm2 determines the spline knots from the
individuals with events and the design matrix is otherwise based on all individuals.
Either a stpm2-class
or pstpm2-class
object.
Mark Clements, Xing-Rong Liu, Benjamin Christoffersen.
## Not run: data(brcancer) summary(fit <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3)) ## some predictions head(predict(fit,se.fit=TRUE,type="surv")) head(predict(fit,se.fit=TRUE,type="hazard")) ## some plots plot(fit,newdata=data.frame(hormon=0),type="hazard") plot(fit,newdata=data.frame(hormon=0),type="surv") ## time-varying coefficient summary(fit.tvc <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3, tvc=list(hormon=3))) anova(fit,fit.tvc) # compare with and without tvc ## some more plots plot(fit.tvc,newdata=data.frame(hormon=0),type="hr",var="hormon", ylim=c(0,2)) lines(fit.tvc,newdata=data.frame(hormon=1),type="hr",var="hormon", col=2) plot(fit.tvc,newdata=data.frame(hormon=0),type="sdiff",var="hormon") plot(fit.tvc,newdata=data.frame(hormon=0),type="hdiff",var="hormon") library(scales) cols <- c(alpha("red",alpha=0.2), alpha("blue",alpha=0.2)) plot(fit.tvc,newdata=data.frame(hormon=0),type="hazard",ci.col=cols[1]) lines(fit.tvc,newdata=data.frame(hormon=1),type="hazard",lty=2,ci.col=cols[2], ci=TRUE) legend("topright",legend=c("No hormonal treatment", "(95 lty=c(1,1,2,1), lwd=c(1,10,1,10), col=c("black",cols[1],"black",cols[2]), bty="n") ## compare number of knots hormon0 <- data.frame(hormon=0) plot(fit,type="hazard",newdata=hormon0) AIC(fit) for (df in 4:6) { fit.new <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=df) plot(fit.new,type="hazard",newdata=hormon0,add=TRUE,ci=FALSE,line.col=df) print(AIC(fit.new)) } ## compatibility with Stata's stpm2 using the smooth.formula argument (see Details) summary(stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer, smooth.formula=~nsx(log(rectime),df=3,stata.stpm2.compatible=TRUE))) summary(stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer, smooth.formula=~nsx(log(rectime),df=3,stata=TRUE)+ hormon:nsx(log(rectime),df=3,stata=TRUE))) ## End(Not run)
## Not run: data(brcancer) summary(fit <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3)) ## some predictions head(predict(fit,se.fit=TRUE,type="surv")) head(predict(fit,se.fit=TRUE,type="hazard")) ## some plots plot(fit,newdata=data.frame(hormon=0),type="hazard") plot(fit,newdata=data.frame(hormon=0),type="surv") ## time-varying coefficient summary(fit.tvc <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3, tvc=list(hormon=3))) anova(fit,fit.tvc) # compare with and without tvc ## some more plots plot(fit.tvc,newdata=data.frame(hormon=0),type="hr",var="hormon", ylim=c(0,2)) lines(fit.tvc,newdata=data.frame(hormon=1),type="hr",var="hormon", col=2) plot(fit.tvc,newdata=data.frame(hormon=0),type="sdiff",var="hormon") plot(fit.tvc,newdata=data.frame(hormon=0),type="hdiff",var="hormon") library(scales) cols <- c(alpha("red",alpha=0.2), alpha("blue",alpha=0.2)) plot(fit.tvc,newdata=data.frame(hormon=0),type="hazard",ci.col=cols[1]) lines(fit.tvc,newdata=data.frame(hormon=1),type="hazard",lty=2,ci.col=cols[2], ci=TRUE) legend("topright",legend=c("No hormonal treatment", "(95 lty=c(1,1,2,1), lwd=c(1,10,1,10), col=c("black",cols[1],"black",cols[2]), bty="n") ## compare number of knots hormon0 <- data.frame(hormon=0) plot(fit,type="hazard",newdata=hormon0) AIC(fit) for (df in 4:6) { fit.new <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=df) plot(fit.new,type="hazard",newdata=hormon0,add=TRUE,ci=FALSE,line.col=df) print(AIC(fit.new)) } ## compatibility with Stata's stpm2 using the smooth.formula argument (see Details) summary(stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer, smooth.formula=~nsx(log(rectime),df=3,stata.stpm2.compatible=TRUE))) summary(stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer, smooth.formula=~nsx(log(rectime),df=3,stata=TRUE)+ hormon:nsx(log(rectime),df=3,stata=TRUE))) ## End(Not run)
Extract design information from an stpm2/gsm object and newdata for use in C++
gsm_design(object, newdata, newdata0 = NULL, t0 = NULL, inflate = 100)
gsm_design(object, newdata, newdata0 = NULL, t0 = NULL, inflate = 100)
object |
stpm2/gsm object |
newdata |
list or data-frame used for evaluation |
newdata0 |
list or data-frame used for evaluation at the entry time |
t0 |
possible delayed entry time (numeric scalar) |
inflate |
double value to inflate minimum and maximum times for root finding |
list that can be read by 'gsm ssim::read_gsm(SEX args)' in C++
Set useful default and allow changes for the gsm call. This is meant to make the gsm call simpler.
gsm.control(parscale = 1, maxit = 300, optimiser = c("BFGS", "NelderMead"), trace = 0, nodes = 9, adaptive = TRUE, kappa.init = 1, maxkappa = 1000, suppressWarnings.coxph.frailty = TRUE, robust_initial = FALSE, bhazinit = 0.1, eps.init = 1e-5, use.gr = TRUE, penalty = c("logH", "h"), outer_optim = 1, reltol.search = 1e-10, reltol.final = 1e-10, reltol.outer = 1e-05, criterion = c("GCV", "BIC"))
gsm.control(parscale = 1, maxit = 300, optimiser = c("BFGS", "NelderMead"), trace = 0, nodes = 9, adaptive = TRUE, kappa.init = 1, maxkappa = 1000, suppressWarnings.coxph.frailty = TRUE, robust_initial = FALSE, bhazinit = 0.1, eps.init = 1e-5, use.gr = TRUE, penalty = c("logH", "h"), outer_optim = 1, reltol.search = 1e-10, reltol.final = 1e-10, reltol.outer = 1e-05, criterion = c("GCV", "BIC"))
parscale |
numeric vector or scalar for the scaling of the parameter values; default 1 |
maxit |
integer for the maximum number of iterations for the optimisation process |
optimiser |
which optimiser to use for the outer optimisation |
trace |
integer indicating the trace level for each optimiser |
nodes |
number of quadrature nodes |
adaptive |
logical for whether to use adaptive or non-adaptive quadrature, Default=TRUE. |
kappa.init |
initial value for the quadratic penalty for inequality constraints |
eps.init |
initial value for epsilon |
maxkappa |
double float value for the maximum value of the weight used in the constraint. |
suppressWarnings.coxph.frailty |
logical |
robust_initial |
Not currently documented. |
bhazinit |
Not currently documented. |
use.gr |
Logical for whether to use gradients. |
penalty |
Not currently documented. |
outer_optim |
Not currently documented. |
reltol.search |
Relative tolerance. Not currently documented. |
reltol.final |
Relative tolerance. Not currently documented. |
reltol.outer |
Relative tolerance. Not currently documented. |
criterion |
Not currently documented. |
A functional approach to defining an increment in one or more variables in a data-frame. Given a variable name and an increment value, return a function that takes any data-frame to return a data-frame with incremented values.
incrVar(var, increment = 1)
incrVar(var, increment = 1)
var |
String for the name(s) of the variable(s) to be incremented |
increment |
Value that the variable should be incremented. |
Useful for defining transformations for calculating rate ratios.
A function with a single data
argument that increments the variables
in the data
list/data-frame.
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (var, increment = 1) { n <- length(var) if (n > 1 && length(increment)==1) increment <- rep(increment, n) function(data) { for (i in 1:n) { data[[var[i]]] <- data[[var[i]]] + increment[i] } data } }
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (var, increment = 1) { n <- length(var) if (n > 1 && length(increment)==1) increment <- rep(increment, n) function(data) { for (i in 1:n) { data[[var[i]]] <- data[[var[i]]] + increment[i] } data } }
Legendre quadrature rule for n=200.
data(legendre.quadrature.rule.200)
data(legendre.quadrature.rule.200)
A data frame with 200 observations on the following 2 variables.
x
x values between -1 and 1
w
weights
data(legendre.quadrature.rule.200) ## maybe str(legendre.quadrature.rule.200) ; ...
data(legendre.quadrature.rule.200) ## maybe str(legendre.quadrature.rule.200) ; ...
S3 methods for lines
## S3 method for class 'stpm2' lines(x, newdata = NULL, type = "surv", col = 1, ci.col= "grey", lty = par("lty"), ci = FALSE, rug = FALSE, var = NULL, exposed = NULL, times = NULL, type.relsurv = c("excess", "total", "other"), ratetable = survival::survexp.us, rmap, scale = 365.24, ...) ## S3 method for class 'pstpm2' lines(x, newdata = NULL, type = "surv", col = 1, ci.col= "grey", lty = par("lty"), ci = FALSE, rug = FALSE, var = NULL, exposed = NULL, times = NULL, ...)
## S3 method for class 'stpm2' lines(x, newdata = NULL, type = "surv", col = 1, ci.col= "grey", lty = par("lty"), ci = FALSE, rug = FALSE, var = NULL, exposed = NULL, times = NULL, type.relsurv = c("excess", "total", "other"), ratetable = survival::survexp.us, rmap, scale = 365.24, ...) ## S3 method for class 'pstpm2' lines(x, newdata = NULL, type = "surv", col = 1, ci.col= "grey", lty = par("lty"), ci = FALSE, rug = FALSE, var = NULL, exposed = NULL, times = NULL, ...)
x |
an |
newdata |
required list of new data. This defines the unexposed newdata (excluding the event times). |
type |
specify the type of prediction |
col |
line colour |
lty |
line type |
ci.col |
confidence interval colour |
ci |
whether to plot the confidence interval band (default=TRUE) |
rug |
whether to add a rug plot of the event times to the current plot (default=TRUE) |
var |
specify the variable name or names for the exposed/unexposed (names are given as characters) |
exposed |
function that takes newdata and returns the exposed
dataset. By default, this increments |
times |
specifies the times. By default, this uses a span of the observed times. |
type.relsurv |
type of predictions for relative survival models: either "excess", "total" or "other" |
scale |
scale to go from the days in the |
rmap |
an optional list that maps data set names to the ratetable
names. See |
ratetable |
a table of event rates used in relative survival when
|
... |
additional arguments (add to the |
A numerically efficient algorithm to calculate predictions from a continuous time, nonhomogeneous Markov multi-state model. The main inputs are the models for the transition intensities, the initial values, the transition matrix and the covariate patterns. The predictions include state occupancy probabilities (possibly with discounting and utilities), length of stay and costs. Standard errors are calculated using the delta method. Includes, differences, ratios and standardisation.
markov_msm(x, trans, t = c(0,1), newdata = NULL, init=NULL, tmvar = NULL, sing.inf = 1e+10, method="adams", rtol=1e-10, atol=1e-10, slow=FALSE, min.tm=1e-8, utility=function(t) rep(1, nrow(trans)), utility.sd=rep(0,nrow(trans)), use.costs=FALSE, transition.costs=function(t) rep(0, sum(!is.na(trans))), # per transition transition.costs.sd=rep(0,sum(!is.na(trans))), state.costs=function(t) rep(0,nrow(trans)), # per unit time state.costs.sd=rep(0,nrow(trans)), discount.rate = 0, block.size=500, spline.interpolation=FALSE, debug=FALSE, ...) ## S3 method for class 'markov_msm' vcov(object, ...) ## S3 method for class 'markov_msm' as.data.frame(x, row.names=NULL, optional=FALSE, ci=TRUE, P.conf.type="logit", L.conf.type="log", C.conf.type="log", P.range=c(0,1), L.range=c(0,Inf), C.range=c(0,Inf), state.weights=NULL, obs.weights=NULL, ...) ## S3 method for class 'markov_msm_diff' as.data.frame(x, row.names=NULL, optional=FALSE, P.conf.type="plain", L.conf.type="plain", C.conf.type="plain", P.range=c(-Inf,Inf), L.range=c(-Inf,Inf), C.range=c(-Inf,Inf), ...) ## S3 method for class 'markov_msm_ratio' as.data.frame(x, row.names=NULL, optional=FALSE, ...) standardise(x, ...) ## S3 method for class 'markov_msm' standardise(x, weights = rep(1,nrow(x$newdata)), normalise = TRUE, ...) ## S3 method for class 'markov_msm' plot(x, y, stacked=TRUE, which=c('P','L'), xlab="Time", ylab=NULL, col=2:6, border=col, ggplot2=FALSE, lattice=FALSE, alpha=0.2, strata=NULL, ...) ## S3 method for class 'markov_msm' subset(x, subset, ...) ## S3 method for class 'markov_msm' diff(x, y, ...) ratio_markov_msm(x, y, ...) ## S3 method for class 'markov_msm' rbind(..., deparse.level=1) ## S3 method for class 'markov_msm' transform(`_data`, ...) collapse_markov_msm(object, which=NULL, sep="; ") zeroModel(object) hrModel(object,hr=1,ci=NULL,seloghr=NULL) aftModel(object,af=1,ci=NULL,selogaf=NULL) addModel(...) hazFun(f, tmvar="t", ...) splineFun(time,rate,method="natural",scale=1,...)
markov_msm(x, trans, t = c(0,1), newdata = NULL, init=NULL, tmvar = NULL, sing.inf = 1e+10, method="adams", rtol=1e-10, atol=1e-10, slow=FALSE, min.tm=1e-8, utility=function(t) rep(1, nrow(trans)), utility.sd=rep(0,nrow(trans)), use.costs=FALSE, transition.costs=function(t) rep(0, sum(!is.na(trans))), # per transition transition.costs.sd=rep(0,sum(!is.na(trans))), state.costs=function(t) rep(0,nrow(trans)), # per unit time state.costs.sd=rep(0,nrow(trans)), discount.rate = 0, block.size=500, spline.interpolation=FALSE, debug=FALSE, ...) ## S3 method for class 'markov_msm' vcov(object, ...) ## S3 method for class 'markov_msm' as.data.frame(x, row.names=NULL, optional=FALSE, ci=TRUE, P.conf.type="logit", L.conf.type="log", C.conf.type="log", P.range=c(0,1), L.range=c(0,Inf), C.range=c(0,Inf), state.weights=NULL, obs.weights=NULL, ...) ## S3 method for class 'markov_msm_diff' as.data.frame(x, row.names=NULL, optional=FALSE, P.conf.type="plain", L.conf.type="plain", C.conf.type="plain", P.range=c(-Inf,Inf), L.range=c(-Inf,Inf), C.range=c(-Inf,Inf), ...) ## S3 method for class 'markov_msm_ratio' as.data.frame(x, row.names=NULL, optional=FALSE, ...) standardise(x, ...) ## S3 method for class 'markov_msm' standardise(x, weights = rep(1,nrow(x$newdata)), normalise = TRUE, ...) ## S3 method for class 'markov_msm' plot(x, y, stacked=TRUE, which=c('P','L'), xlab="Time", ylab=NULL, col=2:6, border=col, ggplot2=FALSE, lattice=FALSE, alpha=0.2, strata=NULL, ...) ## S3 method for class 'markov_msm' subset(x, subset, ...) ## S3 method for class 'markov_msm' diff(x, y, ...) ratio_markov_msm(x, y, ...) ## S3 method for class 'markov_msm' rbind(..., deparse.level=1) ## S3 method for class 'markov_msm' transform(`_data`, ...) collapse_markov_msm(object, which=NULL, sep="; ") zeroModel(object) hrModel(object,hr=1,ci=NULL,seloghr=NULL) aftModel(object,af=1,ci=NULL,selogaf=NULL) addModel(...) hazFun(f, tmvar="t", ...) splineFun(time,rate,method="natural",scale=1,...)
For markov_msm
:
x |
list of functions or parametric or penalised survival models. Currently
the models include
combinations of |
trans |
Transition matrix describing the states and transitions
in the multi-state model. If S is the number of states in the
multi-state model, |
t |
numerical vector for the times to evaluation the predictions. Includes the start time |
newdata |
|
init |
vector of the initial values with the same length as the number of states. Defaults to the first state having an
initial value of 1 (i.e. |
tmvar |
specifies the name of the time variable. This should be set for
regression models that do not specify this (e.g. |
sing.inf |
If there is a singularity in the observed hazard,
for example a Weibull distribution with |
method |
For For |
rtol |
relative error tolerance, either a
scalar or an array as long as the number of states. Passed to |
atol |
absolute error tolerance, either a scalar or an array as
long as the number of states. Passed to |
slow |
logical to show whether to use the slow |
min.tm |
Minimum time used for evaluations. Avoids log(0) for some models. |
utility |
a function of the form |
utility.sd |
a function of the form |
use.costs |
logical for whether to use costs. Default: FALSE |
transition.costs |
a function of the form |
transition.costs.sd |
a function of the form |
state.costs |
a function of the form |
state.costs.sd |
a function of the form |
discount.rate |
numerical value for the proportional reduction (per unit time) in the length of stay and costs |
block.size |
divide |
spline.interpolation |
logical for whether to use spline interpolation for the transition hazards rather than the model predictions directly (default=TRUE). |
debug |
logical flag for whether to keep the full output from the ordinary differential equation in the |
... |
other arguments. For |
For as.data.frame.markov_msm
:
row.names |
add in row names to the output data-frame |
optional |
(not currently used) |
ci |
logical for whether to include confidence intervals. Default: TRUE |
P.conf.type |
type of transformation for the confidence interval
calculation for the state occupancy probabilities. Default: log-log transformation. This is changed for
|
L.conf.type |
type of transformation for the confidence interval
calculation for the length of stay calculation. Default: log transformation. This is changed for
|
C.conf.type |
type of transformation for the confidence interval
calculation for the length of stay calculation. Default: log transformation. This is changed for
|
P.range |
valid values for the state occupancy probabilities. Default: (0,1). This is changed for
|
L.range |
valid values for the state occupancy probabilities. Default: (0,Inf). This is changed for
|
C.range |
valid values for the state occupancy probabilities. Default: (0,Inf). This is changed for
|
state.weights |
Not currently documented |
obs.weights |
Not currently documented |
For standardise.markov_msm
:
weights |
numerical vector to use in standardising the state occupancy probabilities, length of stay and costs. Default: 1 for each observation. |
normalise |
logical for whether to normalise the weights to 1. Default: TRUE |
For plot.markov_msm
:
y |
(currently ignored) |
stacked |
logical for whether to stack the plots. Default: TRUE |
xlab |
x-axis label |
ylab |
x-axis label |
col |
colours (ignored if |
border |
border colours for the |
ggplot2 |
use |
alpha |
alpha value for confidence bands (ggplot) |
lattice |
use |
strata |
formula for the stratification factors for the plot |
For subset.markov_msm
:
subset |
expression that is evaluated on the |
For transform.markov_msm
:
_data |
an object of class |
For rbind.markov_msm
:
deparse.level |
not currently used |
For collapse.states
:
which |
either an index of the states to collapse or a character vector of the state names to collapse |
sep |
separator to use for the collapsed state names |
For zeroModel
to predict zero rates:
object |
survival regression object to be wrapped |
For hrModel
to predict rates times a hazard ratio:
hr |
hazard ratio |
seloghr |
alternative specification for the se of the log(hazard ratio); see also |
For aftModel
to predict accelerated rates:
af |
acceleration factor |
selogaf |
alternative specification for the se of the log(acceleration factor); see also |
addModel
predict rates based on adding rates from different models
hazFun
provides a rate function without uncertainty:
f |
rate function, possibly with |
splineFun
predicts rates using spline interpolation:
time |
exact times |
rate |
rates as per |
scale |
rate multiplier (e.g. |
The predictions are calculated using an ordinary differential equation solver. The algorithm uses a single run of the solver to calculate the state occupancy probabilities, length of stay, costs and their partial derivatives with respect to the model parameters. The predictions can also be combined to calculate differences, ratios and standardised.
The current implementation supports a list of models for each transition.
The current implementation also only allows for a vector of initial values rather than a matrix. The predictions will need to be re-run for different vectors of initial values.
For as.data.frame.markov_msm_ratio
, the data are provided in
log form, hence the default transformations and bounds are as per
as.data.frame.markov_msm_diff
, with untransformed data on the
real line.
TODO: allow for one model to predict for the different transitions.
markov_msm
returns an object of class
"markov_msm"
.
The function summary
is used to
obtain and print a summary and analysis of variance table of the
results. The generic accessor functions coef
and vcov
extract
various useful features of the value returned by markov_msm
.
An object of class "markov_msm"
is a list containing at least the
following components:
time |
a numeric vector with the times for the predictions |
P |
an |
L |
an |
Pu |
an |
Lu |
an |
newdata |
a |
vcov |
the variance-covariance matrix for the models of the transition intensities |
trans |
copy of the |
call |
the call to the function |
For debugging:
res |
data returned from the ordinary differential equation solver. This may include more information on the predictions |
Mark Clements
## Not run: if (requireNamespace("deSolve")) { library(readstata13) library(mstate) library(ggplot2) library(survival) ## Two states: Initial -> Final ## Note: this shows how to use markov_msm to estimate survival and risk probabilities based on ## smooth hazard models. two_states <- function(model, ...) { transmat = matrix(c(NA,1,NA,NA),2,2,byrow=TRUE) rownames(transmat) <- colnames(transmat) <- c("Initial","Final") rstpm2::markov_msm(list(model), ..., trans = transmat) } ## Note: the first argument is the hazard model. The other arguments are arguments to the ## markov_msm function, except for the transition matrix, which is defined by the new function. death = gsm(Surv(time,status)~factor(rx), data=survival::colon, subset=(etype==2), df=3) cr = two_states(death, newdata=data.frame(rx="Obs"), t = seq(0,2500, length=301)) plot(cr,ggplot=TRUE) ## Competing risks ## Note: this shows how to adapt the markov_msm model for competing risks. competing_risks <- function(listOfModels, ...) { nRisks = length(listOfModels) transmat = matrix(NA,nRisks+1,nRisks+1) transmat[1,1+(1:nRisks)] = 1:nRisks rownames(transmat) <- colnames(transmat) <- c("Initial",names(listOfModels)) rstpm2::markov_msm(listOfModels, ..., trans = transmat) } ## Note: The first argument for competing_risks is a list of models. Names from that list are ## used for labelling the states. The other arguments are as per the markov_msm function, ## except for the transition matrix, which is defined by the competing_risks function. recurrence = gsm(Surv(time,status)~factor(rx), data=survival::colon, subset=(etype==1), df=3) death = gsm(Surv(time,status)~factor(rx), data=survival::colon, subset=(etype==2), df=3) cr = competing_risks(list(Recurrence=recurrence,Death=death), newdata=data.frame(rx=levels(survival::colon$rx)), t = seq(0,2500, length=301)) ## Plot the probabilities for each state for three different treatment arms plot(cr, ggplot=TRUE) + facet_grid(~ rx) ## And: differences in probabilities cr_diff = diff(subset(cr,rx=="Lev+5FU"),subset(cr,rx=="Obs")) plot(cr_diff, ggplot=TRUE, stacked=FALSE) ## Extended example: Crowther and Lambert (2017) ## library(rstpm2); library(readstata13); library(ggplot2) mex.1 <- read.dta13("http://fmwww.bc.edu/repec/bocode/m/multistate_example.dta") transmat <- rbind("Post-surgery"=c(NA,1,2), "Relapsed"=c(NA,NA,3), "Died"=c(NA,NA,NA)) colnames(transmat) <- rownames(transmat) mex.2 <- transform(mex.1,osi=(osi=="deceased")+0) levels(mex.2$size)[2] <- ">20-50 mm" # fix typo mex <- mstate::msprep(time=c(NA,"rf","os"),status=c(NA,"rfi","osi"), data=mex.2,trans=transmat,id="pid", keep=c("age","size","nodes","pr_1","hormon")) mex <- transform(mex, size2=(unclass(size)==2)+0, # avoids issues with TRUE/FALSE size3=(unclass(size)==3)+0, hormon=(hormon=="yes")+0, Tstart=Tstart/12, Tstop=Tstop/12) ## c.ar <- stpm2(Surv(Tstart,Tstop,status) ~ age + size2 + size3 + nodes + pr_1 + hormon, data = mex, subset=trans==1, df=3, tvc=list(size2=1,size3=1,pr_1=1)) c.ad <- stpm2(Surv(Tstart, Tstop, status) ~ age + size + nodes + pr_1 + hormon, data = mex, subset=trans==2, df=1) c.rd <- stpm2( Surv(Tstart,Tstop,status) ~ age + size + nodes + pr_1 + hormon, data=mex, subset=trans==3, df=3, tvc=list(pr_1=1)) ## nd <- expand.grid(nodes=seq(0,20,10), size=levels(mex$size)) nd <- transform(nd, age=54, pr_1=3, hormon=0, size2=(unclass(size)==2)+0, size3=(unclass(size)==3)+0) ## Predictions system.time(pred1 <- rstpm2::markov_msm(list(c.ar,c.ad,c.rd), t = seq(0,15,length=301), newdata=nd, trans = transmat)) # ~2 seconds pred1 <- transform(pred1, Nodes=paste("Nodes =",nodes), Size=paste("Size",size)) ## Figure 3 plot(pred1, ggplot=TRUE) + facet_grid(Nodes ~ Size) + xlab("Years since surgery") plot(pred1, ggplot=TRUE, flipped=TRUE) + facet_grid(Nodes ~ Size) + xlab("Years since surgery") plot(pred1, strata=~nodes+size, xlab="Years since surgery", lattice=TRUE) ## Figure 4 plot(subset(pred1, nodes==0 & size=="<=20 mm"), stacked=FALSE, ggplot=TRUE) + facet_grid(. ~ state) + xlab("Years since surgery") ## Figure 5 a <- diff(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">20-50 mm")) a <- transform(a, label = "Prob(Size<=20 mm)-Prob(20mm<Size<50mm)") b <- ratio_markov_msm(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">20-50 mm")) b <- transform(b,label="Prob(Size<=20 mm)-Prob(20mm<Size<50mm)") ## c <- diff(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">50 mm")) c <- transform(c, label = "Prob(Size<=20 mm)-Prob(Size>=50mm)") d <- ratio_markov_msm(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">50 mm")) d <- transform(d,label= "Prob(Size<=20 mm)-Prob(Size>=50mm)") ## e <- diff(subset(pred1,nodes==0 & size==">20-50 mm"), subset(pred1,nodes==0 & size==">50 mm")) e <- transform(e,label="Prob(20mm<Size<50 mm)-Prob(Size>=50mm)") f <- ratio_markov_msm(subset(pred1,nodes==0 & size==">20-50 mm"), subset(pred1,nodes==0 & size==">50 mm")) f <- transform(f, label = "Prob(20mm<Size<50 mm)-Prob(Size>=50mm)") ## combine diffs <- rbind(a,c,e) ratios <- rbind(b,d,f) ## Figure 5 plot(diffs, stacked=FALSE, ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(-0.4, 0.4)) + facet_grid(label ~ state) ## plot(ratios, stacked=FALSE, ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(0, 3)) + facet_grid(label ~ state) ## Figure 6 plot(subset(pred1, nodes==0 & size=="<=20 mm"), stacked=FALSE, which="L", ggplot2=TRUE) + facet_grid(. ~ state) + xlab("Years since surgery") ## Figure 7 plot(diffs, stacked=FALSE, which="L", ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(-4, 4)) + facet_grid(label ~ state) plot(ratios, stacked=FALSE, which="L", ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(0.1, 10)) + coord_trans(y="log10") + facet_grid(label ~ state) } ## End(Not run)
## Not run: if (requireNamespace("deSolve")) { library(readstata13) library(mstate) library(ggplot2) library(survival) ## Two states: Initial -> Final ## Note: this shows how to use markov_msm to estimate survival and risk probabilities based on ## smooth hazard models. two_states <- function(model, ...) { transmat = matrix(c(NA,1,NA,NA),2,2,byrow=TRUE) rownames(transmat) <- colnames(transmat) <- c("Initial","Final") rstpm2::markov_msm(list(model), ..., trans = transmat) } ## Note: the first argument is the hazard model. The other arguments are arguments to the ## markov_msm function, except for the transition matrix, which is defined by the new function. death = gsm(Surv(time,status)~factor(rx), data=survival::colon, subset=(etype==2), df=3) cr = two_states(death, newdata=data.frame(rx="Obs"), t = seq(0,2500, length=301)) plot(cr,ggplot=TRUE) ## Competing risks ## Note: this shows how to adapt the markov_msm model for competing risks. competing_risks <- function(listOfModels, ...) { nRisks = length(listOfModels) transmat = matrix(NA,nRisks+1,nRisks+1) transmat[1,1+(1:nRisks)] = 1:nRisks rownames(transmat) <- colnames(transmat) <- c("Initial",names(listOfModels)) rstpm2::markov_msm(listOfModels, ..., trans = transmat) } ## Note: The first argument for competing_risks is a list of models. Names from that list are ## used for labelling the states. The other arguments are as per the markov_msm function, ## except for the transition matrix, which is defined by the competing_risks function. recurrence = gsm(Surv(time,status)~factor(rx), data=survival::colon, subset=(etype==1), df=3) death = gsm(Surv(time,status)~factor(rx), data=survival::colon, subset=(etype==2), df=3) cr = competing_risks(list(Recurrence=recurrence,Death=death), newdata=data.frame(rx=levels(survival::colon$rx)), t = seq(0,2500, length=301)) ## Plot the probabilities for each state for three different treatment arms plot(cr, ggplot=TRUE) + facet_grid(~ rx) ## And: differences in probabilities cr_diff = diff(subset(cr,rx=="Lev+5FU"),subset(cr,rx=="Obs")) plot(cr_diff, ggplot=TRUE, stacked=FALSE) ## Extended example: Crowther and Lambert (2017) ## library(rstpm2); library(readstata13); library(ggplot2) mex.1 <- read.dta13("http://fmwww.bc.edu/repec/bocode/m/multistate_example.dta") transmat <- rbind("Post-surgery"=c(NA,1,2), "Relapsed"=c(NA,NA,3), "Died"=c(NA,NA,NA)) colnames(transmat) <- rownames(transmat) mex.2 <- transform(mex.1,osi=(osi=="deceased")+0) levels(mex.2$size)[2] <- ">20-50 mm" # fix typo mex <- mstate::msprep(time=c(NA,"rf","os"),status=c(NA,"rfi","osi"), data=mex.2,trans=transmat,id="pid", keep=c("age","size","nodes","pr_1","hormon")) mex <- transform(mex, size2=(unclass(size)==2)+0, # avoids issues with TRUE/FALSE size3=(unclass(size)==3)+0, hormon=(hormon=="yes")+0, Tstart=Tstart/12, Tstop=Tstop/12) ## c.ar <- stpm2(Surv(Tstart,Tstop,status) ~ age + size2 + size3 + nodes + pr_1 + hormon, data = mex, subset=trans==1, df=3, tvc=list(size2=1,size3=1,pr_1=1)) c.ad <- stpm2(Surv(Tstart, Tstop, status) ~ age + size + nodes + pr_1 + hormon, data = mex, subset=trans==2, df=1) c.rd <- stpm2( Surv(Tstart,Tstop,status) ~ age + size + nodes + pr_1 + hormon, data=mex, subset=trans==3, df=3, tvc=list(pr_1=1)) ## nd <- expand.grid(nodes=seq(0,20,10), size=levels(mex$size)) nd <- transform(nd, age=54, pr_1=3, hormon=0, size2=(unclass(size)==2)+0, size3=(unclass(size)==3)+0) ## Predictions system.time(pred1 <- rstpm2::markov_msm(list(c.ar,c.ad,c.rd), t = seq(0,15,length=301), newdata=nd, trans = transmat)) # ~2 seconds pred1 <- transform(pred1, Nodes=paste("Nodes =",nodes), Size=paste("Size",size)) ## Figure 3 plot(pred1, ggplot=TRUE) + facet_grid(Nodes ~ Size) + xlab("Years since surgery") plot(pred1, ggplot=TRUE, flipped=TRUE) + facet_grid(Nodes ~ Size) + xlab("Years since surgery") plot(pred1, strata=~nodes+size, xlab="Years since surgery", lattice=TRUE) ## Figure 4 plot(subset(pred1, nodes==0 & size=="<=20 mm"), stacked=FALSE, ggplot=TRUE) + facet_grid(. ~ state) + xlab("Years since surgery") ## Figure 5 a <- diff(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">20-50 mm")) a <- transform(a, label = "Prob(Size<=20 mm)-Prob(20mm<Size<50mm)") b <- ratio_markov_msm(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">20-50 mm")) b <- transform(b,label="Prob(Size<=20 mm)-Prob(20mm<Size<50mm)") ## c <- diff(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">50 mm")) c <- transform(c, label = "Prob(Size<=20 mm)-Prob(Size>=50mm)") d <- ratio_markov_msm(subset(pred1,nodes==0 & size=="<=20 mm"), subset(pred1,nodes==0 & size==">50 mm")) d <- transform(d,label= "Prob(Size<=20 mm)-Prob(Size>=50mm)") ## e <- diff(subset(pred1,nodes==0 & size==">20-50 mm"), subset(pred1,nodes==0 & size==">50 mm")) e <- transform(e,label="Prob(20mm<Size<50 mm)-Prob(Size>=50mm)") f <- ratio_markov_msm(subset(pred1,nodes==0 & size==">20-50 mm"), subset(pred1,nodes==0 & size==">50 mm")) f <- transform(f, label = "Prob(20mm<Size<50 mm)-Prob(Size>=50mm)") ## combine diffs <- rbind(a,c,e) ratios <- rbind(b,d,f) ## Figure 5 plot(diffs, stacked=FALSE, ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(-0.4, 0.4)) + facet_grid(label ~ state) ## plot(ratios, stacked=FALSE, ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(0, 3)) + facet_grid(label ~ state) ## Figure 6 plot(subset(pred1, nodes==0 & size=="<=20 mm"), stacked=FALSE, which="L", ggplot2=TRUE) + facet_grid(. ~ state) + xlab("Years since surgery") ## Figure 7 plot(diffs, stacked=FALSE, which="L", ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(-4, 4)) + facet_grid(label ~ state) plot(ratios, stacked=FALSE, which="L", ggplot2=TRUE) + xlab("Years since surgery") + ylim(c(0.1, 10)) + coord_trans(y="log10") + facet_grid(label ~ state) } ## End(Not run)
A numerically efficient algorithm to calculate predictions from a continuous time, nonhomogeneous Markov multi-state model. The main inputs are are a list of Aalen's additive hazards models, the initial values, the transition matrix and the covariate patterns. The predictions include state occupancy probabilities and length of stay. Standard errors are calculated using the delta method. Includes differences and standardisation.
markov_sde(models, trans, newdata, init = NULL, nLebesgue = 10000 + 1, los = FALSE, nOut = 300, weights = 1) ## S3 method for class 'markov_sde' standardise(x, ...) ## S3 method for class 'markov_sde' plot(x, y, stacked=TRUE, which=c("P","L"), index=NULL, xlab="Time", ylab=NULL, col=2:6, border=col, ggplot2=FALSE, lattice=FALSE, alpha=0.2, strata=NULL, ...) ## S3 method for class 'markov_sde' as.data.frame(x, row.names=NULL, optional=NULL, ci=TRUE, P.conf.type="logit", L.conf.type="log", P.range=c(0,1), L.range=c(0,Inf), ...)
markov_sde(models, trans, newdata, init = NULL, nLebesgue = 10000 + 1, los = FALSE, nOut = 300, weights = 1) ## S3 method for class 'markov_sde' standardise(x, ...) ## S3 method for class 'markov_sde' plot(x, y, stacked=TRUE, which=c("P","L"), index=NULL, xlab="Time", ylab=NULL, col=2:6, border=col, ggplot2=FALSE, lattice=FALSE, alpha=0.2, strata=NULL, ...) ## S3 method for class 'markov_sde' as.data.frame(x, row.names=NULL, optional=NULL, ci=TRUE, P.conf.type="logit", L.conf.type="log", P.range=c(0,1), L.range=c(0,Inf), ...)
models |
list of models. Currently allows only for |
trans |
Transition matrix describing the states and transitions
in the multi-state model. If S is the number of states in the
multi-state model, |
newdata |
|
init |
vector of the initial values with the same length as the number of states. Defaults to the first state having an
initial value of 1 (i.e. |
nLebesgue |
Number of steps for the continuous integration |
los |
logical variable for whether to estimate the length of stay |
nOut |
number of rows to represent the continuous changes |
weights |
numeric vector to represent differences or standardisation |
For plot.markov_sde
:
y |
(currently ignored) |
stacked |
logical for whether to stack the plots. Default: TRUE |
index |
indicator of which row of |
which |
character to indicate either transition probabilities ( |
xlab |
x-axis label |
ylab |
x-axis label |
col |
colours (ignored if |
border |
border colours for the |
ggplot2 |
use |
alpha |
alpha value for confidence bands (ggplot) |
lattice |
use |
strata |
formula for the stratification factors for the plot |
For as.data.frame.markov_sde
:
row.names |
add in row names to the output data-frame |
optional |
(not currently used) |
ci |
logical for whether to include confidence intervals. Default: TRUE |
P.conf.type |
type of transformation for the confidence interval
calculation for the state occupancy probabilities. Default: logit transformation. This is changed to |
L.conf.type |
type of transformation for the confidence interval
calculation for the length of stay calculation. Default: log transformation. |
P.range |
valid values for the state occupancy probabilities. Default: (0,1). |
L.range |
valid values for the state occupancy probabilities. Default: (0,Inf). |
For standardise.markov_sde
:
x |
object to extract standardised values |
... |
other arguments. For |
Uses an approach developed by Ryalen and colleagues. This is a re-implementation in C++.
The current implementation only allows for a vector of initial values rather than a matrix. The predictions will need to be re-run for different vectors of initial values.
markov_sde
returns an object of class
"markov_sde"
.
Mark Clements
Generate the B-spline basis matrix for a natural cubic spline (with eXtensions).
nsx(x, df = NULL, knots = NULL, intercept = FALSE, Boundary.knots = range(x), derivs = if (cure) c(2, 1) else c(2, 2), log = FALSE, centre = FALSE, cure = FALSE, stata.stpm2.compatible = FALSE)
nsx(x, df = NULL, knots = NULL, intercept = FALSE, Boundary.knots = range(x), derivs = if (cure) c(2, 1) else c(2, 2), log = FALSE, centre = FALSE, cure = FALSE, stata.stpm2.compatible = FALSE)
x |
the predictor variable. Missing values are allowed. |
df |
degrees of freedom. One can supply |
knots |
breakpoints that define the spline. The default is no
knots; together with the natural boundary conditions this results in
a basis for linear regression on |
intercept |
if |
Boundary.knots |
boundary points at which to impose the natural
boundary conditions and anchor the B-spline basis (default the range
of the data). If both |
derivs |
an integer vector of length 2 with values between 0 and 2 giving the derivative constraint order at the left and right boundary knots; an order of 2 constrains the second derivative to zero (f”(x)=0); an order of 1 constrains the first and second derivatives to zero (f'(x)=f”(x)=0); an order of 0 constrains the zero, first and second derivatives to zero (f(x)=f'(x)=f”(x)=0) |
log |
a Boolean indicating whether the underlying values have been log transformed; (deprecated: only used to calculate derivatives in rstpm2:::stpm2Old |
centre |
if specified, then centre the splines at this value (i.e. f(centre)=0) (default=FALSE) |
cure |
a Boolean indicated whether to estimate cure; changes the default derivs argument, such that the right boundary has the first and second derivatives constrained to zero; defaults to FALSE |
stata.stpm2.compatible |
a Boolean to determine whether to use Stata stpm's default knot placement; defaults to FALSE |
A matrix of dimension length(x) * df
where either df
was
supplied or if knots
were supplied,
df = length(knots) + 1 + intercept
.
Attributes are returned that correspond to the arguments to ns
,
and explicitly give the knots
, Boundary.knots
etc for
use by predict.nsx()
.
nsx()
is based on the functions ns
and spline.des
. It
generates a basis matrix for representing the family of
piecewise-cubic splines with the specified sequence of
interior knots, and the natural boundary conditions. These enforce
the constraint that the function is linear beyond the boundary knots,
which can either be supplied, else default to the extremes of the
data. A primary use is in modeling formula to directly specify a
natural spline term in a model.
The extensions from ns
are: specification of the
derivative constraints at the boundary knots; whether to centre the
knots; incorporation of cure using derivatives; compatible knots
with Stata's stpm2; and an indicator for a log-transformation of
x
for calculating derivatives.
Hastie, T. J. (1992) Generalized additive models. Chapter 7 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
ns
, bs
, predict.nsx
, SafePrediction
require(stats); require(graphics); require(splines) nsx(women$height, df = 5) summary(fm1 <- lm(weight ~ ns(height, df = 5), data = women)) ## example of safe prediction plot(women, xlab = "Height (in)", ylab = "Weight (lb)") ht <- seq(57, 73, length.out = 200) lines(ht, predict(fm1, data.frame(height=ht)))
require(stats); require(graphics); require(splines) nsx(women$height, df = 5) summary(fm1 <- lm(weight ~ ns(height, df = 5), data = women)) ## example of safe prediction plot(women, xlab = "Height (in)", ylab = "Weight (lb)") ht <- seq(57, 73, length.out = 200) lines(ht, predict(fm1, data.frame(height=ht)))
Generate the B-spline basis matrix for the first derivative of a natural cubic spline (with eXtensions).
nsxD(x, df = NULL, knots = NULL, intercept = FALSE, Boundary.knots = range(x), derivs = if (cure) c(2, 1) else c(2, 2), log = FALSE, centre = FALSE, cure = FALSE, stata.stpm2.compatible = FALSE)
nsxD(x, df = NULL, knots = NULL, intercept = FALSE, Boundary.knots = range(x), derivs = if (cure) c(2, 1) else c(2, 2), log = FALSE, centre = FALSE, cure = FALSE, stata.stpm2.compatible = FALSE)
x |
the predictor variable. Missing values are allowed. |
df |
degrees of freedom. One can supply |
knots |
breakpoints that define the spline. The default is no
knots; together with the natural boundary conditions this results in
a basis for linear regression on |
intercept |
if |
Boundary.knots |
boundary points at which to impose the natural
boundary conditions and anchor the B-spline basis (default the range
of the data). If both |
derivs |
an integer vector of length 2 with values between 0 and 2 giving the derivative constraint order at the left and right boundary knots; an order of 2 constrains the second derivative to zero (f”(x)=0); an order of 1 constrains the first and second derivatives to zero (f'(x)=f”(x)=0); an order of 0 constrains the zero, first and second derivatives to zero (f(x)=f'(x)=f”(x)=0) |
log |
a Boolean indicating whether the underlying values have been log transformed; (deprecated: only used to calculate derivatives in rstpm2:::stpm2Old |
centre |
if specified, then centre the splines at this value (i.e. f(centre)=0) (default=FALSE) |
cure |
a Boolean indicated whether to estimate cure; changes the default derivs argument, such that the right boundary has the first and second derivatives constrained to zero; defaults to FALSE |
stata.stpm2.compatible |
a Boolean to determine whether to use Stata stpm's default knot placement; defaults to FALSE |
A matrix of dimension length(x) * df
where either df
was
supplied or if knots
were supplied,
df = length(knots) + 1 + intercept
.
Attributes are returned that correspond to the arguments to ns
,
and explicitly give the knots
, Boundary.knots
etc for
use by predict.nsxD()
.
nsxD()
is based on the functions ns
and spline.des
. It
generates a basis matrix for representing the family of
piecewise-cubic splines with the specified sequence of
interior knots, and the natural boundary conditions. These enforce
the constraint that the function is linear beyond the boundary knots,
which can either be supplied, else default to the extremes of the
data. A primary use is in modeling formula to directly specify a
natural spline term in a model.
The extensions from ns
are: specification of the
derivative constraints at the boundary knots; whether to centre the
knots; incorporation of cure using derivatives; compatible knots
with Stata's stpm2; and an indicator for a log-transformation of
x
for calculating derivatives.
Hastie, T. J. (1992) Generalized additive models. Chapter 7 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
ns
, bs
, predict.nsx
, SafePrediction
require(stats); require(graphics); require(splines) nsx(women$height, df = 5) summary(fm1 <- lm(weight ~ ns(height, df = 5), data = women)) ## example of safe prediction plot(women, xlab = "Height (in)", ylab = "Weight (lb)") ht <- seq(57, 73, length.out = 200) lines(ht, predict(fm1, data.frame(height=ht)))
require(stats); require(graphics); require(splines) nsx(women$height, df = 5) summary(fm1 <- lm(weight ~ ns(height, df = 5), data = women)) ## example of safe prediction plot(women, xlab = "Height (in)", ylab = "Weight (lb)") ht <- seq(57, 73, length.out = 200) lines(ht, predict(fm1, data.frame(height=ht)))
Given a regression object and an independent prediction function (as a function of the coefficients), calculate the point estimate and standard errors
numDeltaMethod(object, fun, gd=NULL, conf.int=FALSE, level=0.95, ...)
numDeltaMethod(object, fun, gd=NULL, conf.int=FALSE, level=0.95, ...)
object |
A regression object with methods |
fun |
An independent prediction function with signature
|
gd |
Specified gradients |
conf.int |
Logical for whether to also calculate the confidence interval |
level |
Numeric for the level of the confidence interval |
... |
Other arguments passed to |
A more user-friendly interface is provided by predictnl
.
fit |
Point estimates |
se.fit |
Standard errors |
Estimate |
Point estimates |
SE |
Standard errors |
conf.low |
Lower confidence interval (if conf.int=TRUE) |
conf.high |
Upper confidence interval (if conf.int=TRUE) |
See Also predictnl
.
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (object, fun, ...) { coef <- coef(object) est <- fun(coef, ...) Sigma <- vcov(object) gd <- grad(fun, coef, ...) se.est <- as.vector(sqrt(colSums(gd * (Sigma %*% gd)))) data.frame(Estimate = est, SE = se.est) }
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (object, fun, ...) { coef <- coef(object) est <- fun(coef, ...) Sigma <- vcov(object) gd <- grad(fun, coef, ...) se.est <- as.vector(sqrt(colSums(gd * (Sigma %*% gd)))) data.frame(Estimate = est, SE = se.est) }
Given an stpm2
fit, return a plot
## S4 method for signature 'stpm2' plot(x,y,newdata,type="surv", xlab="Time",line.col=1,ci.col="grey", add=FALSE,ci=TRUE,rug=TRUE, var=NULL,exposed=NULL,times=NULL,...) ## S4 method for signature 'pstpm2' plot(x,y,newdata,type="surv", xlab="Time",line.col=1,ci.col="grey", add=FALSE,ci=TRUE,rug=TRUE, var=NULL,exposed=NULL,times=NULL,...)
## S4 method for signature 'stpm2' plot(x,y,newdata,type="surv", xlab="Time",line.col=1,ci.col="grey", add=FALSE,ci=TRUE,rug=TRUE, var=NULL,exposed=NULL,times=NULL,...) ## S4 method for signature 'pstpm2' plot(x,y,newdata,type="surv", xlab="Time",line.col=1,ci.col="grey", add=FALSE,ci=TRUE,rug=TRUE, var=NULL,exposed=NULL,times=NULL,...)
x |
an |
y |
not used (for generic compatibility) |
newdata |
required list of new data. This defines the unexposed newdata (excluding the event times). |
type |
specify the type of prediction |
xlab |
x-axis label |
line.col |
line colour |
ci.col |
confidence interval colour |
ci |
whether to plot the confidence interval band (default=TRUE) |
add |
whether to add to the current plot ( |
rug |
whether to add a rug plot of the event times to the current plot (default=TRUE) |
var |
specify the variable name or names for the exposed/unexposed (names are given as characters) |
exposed |
function that takes newdata and returns the exposed
dataset. By default, this increments |
times |
specifies the times. By default, this uses a span of the observed times. |
... |
additional arguments (add to the |
an stpm2
fit
Background mortality rates for the colon dataset.
data(popmort)
data(popmort)
A data frame with 10600 observations on the following 5 variables.
sex
Sex (1=male, 2=female)
prob
One year probability of survival
rate
All cause mortality rate
age
Age by single year of age through to age 105 years
year
Calendar period
data(popmort) ## maybe str(popmort) ; ...
data(popmort) ## maybe str(popmort) ; ...
Given an stpm2
fit and an optional list of new data, return predictions
## S4 method for signature 'stpm2' predict(object, newdata=NULL, type=c("surv","cumhaz","hazard","density","hr","sdiff", "hdiff","loghazard","link","meansurv","meansurvdiff","meanhr", "odds","or","margsurv","marghaz","marghr","meanhaz","af", "fail","margfail","meanmargsurv","uncured","rmst","probcure", "lpmatrix", "gradh", "gradH","rmstdiff","lpmatrixD"), grid=FALSE,seqLength=300, type.relsurv=c("excess","total","other"), scale=365.24, rmap, ratetable=survival::survexp.us, se.fit=FALSE,link=NULL,exposed=NULL,var=NULL, keep.attributes=FALSE, use.gr=TRUE,level=0.95, n.gauss.quad=100,full=FALSE,...) ## S4 method for signature 'pstpm2' predict(object, newdata=NULL, type=c("surv","cumhaz","hazard","density","hr","sdiff", "hdiff","loghazard","link","meansurv","meansurvdiff","meanhr", "odds","or","margsurv","marghaz","marghr","meanhaz","af", "fail","margfail","meanmargsurv","rmst","lpmatrix", "gradh", "gradH","rmstdiff","lpmatrixD"), grid=FALSE,seqLength=300, se.fit=FALSE,link=NULL,exposed=NULL,var=NULL, keep.attributes=FALSE, use.gr=TRUE,level=0.95, n.gauss.quad=100,full=FALSE,...)
## S4 method for signature 'stpm2' predict(object, newdata=NULL, type=c("surv","cumhaz","hazard","density","hr","sdiff", "hdiff","loghazard","link","meansurv","meansurvdiff","meanhr", "odds","or","margsurv","marghaz","marghr","meanhaz","af", "fail","margfail","meanmargsurv","uncured","rmst","probcure", "lpmatrix", "gradh", "gradH","rmstdiff","lpmatrixD"), grid=FALSE,seqLength=300, type.relsurv=c("excess","total","other"), scale=365.24, rmap, ratetable=survival::survexp.us, se.fit=FALSE,link=NULL,exposed=NULL,var=NULL, keep.attributes=FALSE, use.gr=TRUE,level=0.95, n.gauss.quad=100,full=FALSE,...) ## S4 method for signature 'pstpm2' predict(object, newdata=NULL, type=c("surv","cumhaz","hazard","density","hr","sdiff", "hdiff","loghazard","link","meansurv","meansurvdiff","meanhr", "odds","or","margsurv","marghaz","marghr","meanhaz","af", "fail","margfail","meanmargsurv","rmst","lpmatrix", "gradh", "gradH","rmstdiff","lpmatrixD"), grid=FALSE,seqLength=300, se.fit=FALSE,link=NULL,exposed=NULL,var=NULL, keep.attributes=FALSE, use.gr=TRUE,level=0.95, n.gauss.quad=100,full=FALSE,...)
object |
an |
newdata |
optional list of new data (required if type in
("hr","sdiff","hdiff","meansurvdiff","or","uncured")). For type in
("hr","sdiff","hdiff","meansurvdiff","or","af","uncured"), this defines the unexposed
newdata. This can be combined with |
type |
specify the type of prediction:
|
grid |
whether to merge newdata with a regular sequence of event times (default=FALSE) |
seqLength |
length of the sequence used when |
type.relsurv |
type of predictions for relative survival models: either "excess", "total" or "other" |
scale |
scale to go from the days in the |
rmap |
an optional list that maps data set names to the ratetable
names. See |
ratetable |
a table of event rates used in relative survival when
|
se.fit |
whether to calculate confidence intervals (default=FALSE) |
link |
allows a different link for the confidence interval calculation (default=NULL, such that switch(type,surv="cloglog",cumhaz="log",hazard="log",hr="log",sdiff="I", hdiff="I",loghazard="I",link="I",odds="log",or="log",margsurv="cloglog", marghaz="log",marghr="log")) |
exposed |
a function that takes newdata and returns a transformed
data-frame for those exposed or the counterfactual. By default, this increments |
var |
specify the variable name or names for the exposed/unexposed (names are given as characters) |
keep.attributes |
Boolean to determine whether the output should include the newdata as an attribute (default=TRUE) |
use.gr |
Boolean to determine whether to use gradients in the variance calculations when they are available (default=TRUE) |
level |
confidence level for the confidence intervals (default=0.95) |
n.gauss.quad |
number of Gauassian quadrature points used for integrations (default=100) |
full |
logical for whether to return a full data-frame with
predictions and |
... |
additional arguments (for generic compatibility) |
The confidence interval estimation is based on the delta method using numerical differentiation.
A data-frame with components Estimate
, lower
and
upper
, with an attribute "newdata" for the newdata
data-frame.
an stpm2
fit
Evaluate a predefined spline basis at given values.
## S3 method for class 'nsx' predict(object, newx, ...)
## S3 method for class 'nsx' predict(object, newx, ...)
object |
the result of a call to |
newx |
the |
... |
Optional additional arguments. At present no additional arguments are used. |
An object just like object
, except evaluated at the new values
of x
.
These are methods for the generic function predict
for
objects inheriting from classes "nsx"
. See
predict
for the general behavior of this function.
nsx
.
basis <- nsx(women$height, df = 5) newX <- seq(58, 72, length.out = 51) # evaluate the basis at the new data predict(basis, newX)
basis <- nsx(women$height, df = 5) newX <- seq(58, 72, length.out = 51) # evaluate the basis at the new data predict(basis, newX)
A simple, yet exceedingly useful, approach to estimate the variance of a function using the numerical delta method. A number of packages provide functions that analytically calculate the gradients; we use numerical derivatives, which generalises to models that do not offer analytical derivatives (e.g. ordinary differential equations, integration), or to examples that are tedious or error-prone to calculate (e.g. sums of predictions from GLMs).
## Default S3 method: predictnl(object, fun, newdata=NULL, gd=NULL, ...) ## S3 method for class 'lm' predictnl(object, fun, newdata=NULL, ...) ## S3 method for class 'formula' predict(object,data,newdata,na.action,type="model.matrix",...) ## S3 method for class 'predictnl' confint(object, parm, level=0.95, ...)
## Default S3 method: predictnl(object, fun, newdata=NULL, gd=NULL, ...) ## S3 method for class 'lm' predictnl(object, fun, newdata=NULL, ...) ## S3 method for class 'formula' predict(object,data,newdata,na.action,type="model.matrix",...) ## S3 method for class 'predictnl' confint(object, parm, level=0.95, ...)
object |
An object with |
fun |
A function that takes |
newdata |
An optional argument that defines newdata to be passed to |
gd |
An optional matrix of gradients. If this is not specified, then the gradients are calculated using finite differences. |
parm |
currently ignored |
level |
significance level for 2-sided confidence intervals |
data |
object used to define the model frame |
na.action |
passed to |
type |
currently restricted to |
... |
Other arguments that are passed to |
The signature for fun
is either fun(object, ...)
or fun(object, newdata=NULL,
...)
.
The different predictnl
methods call the utility function
numDeltaMethod
, which in turn calls the grad
function for
numerical differentiation. The numDeltaMethod
function calls the
standard coef
and vcov
methods, and the non-standard
`coef<-`
method for changing the coefficients in a regression
object. This non-standard method has been provided for several
regression objects and essentially mirrors the coef
method.
One potential issue is that some predict
methods do not
re-calculate their predictions for the fitted dataset (i.e. when
newdata=NULL
). As the predictnl
function changes the
fitted coefficients, it is required that the predictions are
re-calculated. One solution is to pass newdata
as an argument to
both predictnl
and fun
; alternatively, newdata
can
be specified in fun
. These approaches are described in the examples
below. The numDeltaMethod
method called by predictnl
provides a warning when the variance estimates are zero, which may be
due to this cause.
For completeness, it is worth discussing why the example
predictnl(fit,predict)
does not work for when fit
is a
glm
object. First, predict.glm
does not update the
predictions for the fitted data. Second, the default predict
method has a signature predict(object, ...)
, which does not
include a newdata
argument. We could then either (i) require that
a newdata
argument be passed to the fun
function for all
examples, which would make this corner case work, or (ii) only pass the
newdata
argument if it is non-null or in the formals for the
fun
function, which would fail for this corner case. The current
API defaults to the latter case (ii). To support this approach, the
predictnl.lm
method replaces a null newdata
with
object$data
. We also provide a revised
numdelta:::predict.lm
method that performs the same operation,
although its use is not encouraged due to its clumsiness.
Returns an object of class
an object with class c("predictnl","data.frame")
elements
c("fit","se.fit","Estimate","SE")
and with methods print
and confint
. Note that the Estimate and SE fields are deprecated
and their use is discouraged, as we would like to remove them from future releases.
Mark Clements
df <- data.frame(x=0:1, y=c(10, 20)) fit <- glm(y ~ x, df, family=poisson) predictnl(fit, function(obj,newdata) diff(predict(obj,newdata,type="response")))
df <- data.frame(x=0:1, y=c(10, 20)) fit <- glm(y ~ x, df, family=poisson) predictnl(fit, function(obj,newdata) diff(predict(obj,newdata,type="response")))
~~ Methods for function predictnl
~~
signature(object = "mle2", ...)
: Similar to
predictnl.default, using S4 methods.
Regression object for pstpm2
.
Objects can be created by calls of the form new("pstpm2", ...)
and
pstpm2( ...)
.
xlevels
:Object of class "list"
~~
contrasts
:Object of class "listOrNULL"
~~
terms
:Object of class "terms"
~~
gam
:Object of class "gam"
~~
logli
:Object of class "function"
~~
timeVar
:Object of class "character"
~~
time0Var
:Object of class "character"
~~
time0Expr
:Object of class "nameOrcall"
~~
timeExpr
:Object of class "nameOrcall"
~~
like
:Object of class "function"
~~
model.frame
:Object of class "list"
~~
delayed
:Object of class "logical"
~~
frailty
:Object of class "logical"
~~
x
:Object of class "matrix"
~~
xd
:Object of class "matrix"
~~
termsd
:Object of class "terms"
~~
Call
:Object of class "character"
~~
y
:Object of class "Surv"
~~
sp
:Object of class "numeric"
~~
nevent
:Object of class "numeric"
~~
link
:Object of class "list"
~~
edf
:Object of class "numeric"
~~
edf_var
:Object of class "numeric"
~~
df
:Object of class "numeric"
~~
call
:Object of class "language"
~~
call.orig
:Object of class "language"
~~
coef
:Object of class "numeric"
~~
fullcoef
:Object of class "numeric"
~~
vcov
:Object of class "matrix"
~~
min
:Object of class "numeric"
~~
details
:Object of class "list"
~~
minuslogl
:Object of class "function"
~~
method
:Object of class "character"
~~
data
:Object of class "list"
~~
formula
:Object of class "character"
~~
optimizer
:Object of class "character"
~~
args
:Object of class "list"
~~
Class for mle2
, directly.
signature(x = "pstpm2", y = "missing")
: ...
signature(x = "pstpm2", ...)
: ...
signature(object = "pstpm2",...)
: ...
signature(object = "pstpm2",...,k=2)
: ...
signature(object = "pstpm2",...,nobs=NULL, k=2)
: ...
signature(object = "pstpm2",..., nobs = NULL)
: ...
signature(object = "pstpm2",..., nobs = NULL, dispersion = 1, k = 2)
: ...
signature(object = "pstpm2",..., dispersion = 1, k = 2)
: ...
signature(object = "pstpm2",...)
: ...
signature(object = "pstpm2",...)
: ...
signature(object = "pstpm2",...)
: ...
showClass("pstpm2")
showClass("pstpm2")
Given an stpm2
or pstpm2
fit, return residuals
## S4 method for signature 'stpm2' residuals(object, type=c("li","gradli")) ## S4 method for signature 'pstpm2' residuals(object, type=c("li","gradli"))
## S4 method for signature 'stpm2' residuals(object, type=c("li","gradli")) ## S4 method for signature 'pstpm2' residuals(object, type=c("li","gradli"))
object |
an |
type |
specify the type of residuals:
|
The gradients are analytical.
A vector or matrix.
an stpm2
fit
Various utility functions used internally to the rstpm2 package.
lhs(formula) rhs(formula) lhs(formula) <- value rhs(formula) <- value
lhs(formula) rhs(formula) lhs(formula) <- value rhs(formula) <- value
formula |
A formula |
value |
A symbolic value to replace the current value. |
Given an stpm2
fit and a data-frame
of new data, return simulated values
## S4 method for signature 'stpm2' simulate(object, nsim=1, seed=NULL, newdata=NULL, lower=1e-06, upper=1e+05, start=NULL, ...) ## S4 method for signature 'pstpm2' simulate(object, nsim=1, seed=NULL, newdata=NULL, lower=1e-06, upper=1e+05, start=NULL, ...)
## S4 method for signature 'stpm2' simulate(object, nsim=1, seed=NULL, newdata=NULL, lower=1e-06, upper=1e+05, start=NULL, ...) ## S4 method for signature 'pstpm2' simulate(object, nsim=1, seed=NULL, newdata=NULL, lower=1e-06, upper=1e+05, start=NULL, ...)
object |
an stpm2 or pstpm2 object |
nsim |
number of simulations per row in newdata |
seed |
optional random number seed |
newdata |
list of new data. If not specified, then defaults to object@data |
lower |
smallest possible time |
upper |
largest possible time |
start |
left truncated entry time (assumed to be zero if NULL) |
... |
additional arguments (for generic compatibility) |
an stpm2
fit
set.seed(1002) fit1 <- gsm(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3) simulate(fit1, nsim=10, newdata=data.frame(hormon=1)) simulate(fit1, newdata=data.frame(hormon=0:1))
set.seed(1002) fit1 <- gsm(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3) simulate(fit1, nsim=10, newdata=data.frame(hormon=1)) simulate(fit1, newdata=data.frame(hormon=0:1))
Utility to use a smooth function in markov_msm based on piece-wise constant values
smoothpwc(midts, rates, tmvar = "t", offsetvar = "", ...)
smoothpwc(midts, rates, tmvar = "t", offsetvar = "", ...)
midts |
mid-point values for time in each segment |
rates |
rates at those mid-points (or for the interval) |
tmvar |
string for the time variable |
offsetvar |
string for a time offset variable |
... |
other arguments |
Uses splines to smooth the log-rates. This assumes that the rates are strictly greater than zero.
a function that is used in markov_msm
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (midts, rates, tmvar = "t", offsetvar = "", ...) { log.smoother <- splinefunx(midts, log(rates), constant.right = TRUE) haz <- function(newdata) { t <- newdata[[tmvar]] + (if (offsetvar != "") newdata[[offsetvar]] else 0) exp(log.smoother(t)) } structure(list(haz = haz), class = "smoothpwc") }
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (midts, rates, tmvar = "t", offsetvar = "", ...) { log.smoother <- splinefunx(midts, log(rates), constant.right = TRUE) haz <- function(newdata) { t <- newdata[[tmvar]] + (if (offsetvar != "") newdata[[offsetvar]] else 0) exp(log.smoother(t)) } structure(list(haz = haz), class = "smoothpwc") }
Regression object for stpm2
.
Objects can be created by calls of the form new("stpm2", ...)
and
stpm2( ...)
.
xlevels
:Object of class "list"
~~
contrasts
:Object of class "listOrNULL"
~~
terms
:Object of class "terms"
~~
logli
:Object of class "function"
~~
lm
:Object of class "lm"
~~
timeVar
:Object of class "character"
~~
time0Var
:Object of class "character"
~~
timeExpr
:Object of class "nameOrcall"
~~
time0Expr
:Object of class "nameOrcall"
~~
delayed
:Object of class "logical"
~~
frailty
:Object of class "logical"
~~
interval
:Object of class "logical"
~~
model.frame
:Object of class "list"
~~
call.formula
:Object of class "formula"
~~
x
:Object of class "matrix"
~~
xd
:Object of class "matrix"
~~
termsd
:Object of class "terms"
~~
Call
:Object of class "character"
~~
y
:Object of class "Surv"
~~
link
:Object of class "list"
~~
call
:Object of class "language"
~~
call.orig
:Object of class "language"
~~
coef
:Object of class "numeric"
~~
fullcoef
:Object of class "numeric"
~~
vcov
:Object of class "matrix"
~~
min
:Object of class "numeric"
~~
details
:Object of class "list"
~~
minuslogl
:Object of class "function"
~~
method
:Object of class "character"
~~
data
:Object of class "list"
~~
formula
:Object of class "character"
~~
optimizer
:Object of class "character"
~~
args
:Object of class "list"
~~
Class mle2
, directly.
signature(x = "stpm2", y = "missing")
: ...
signature(x = "stpm2", ...)
: ...
signature(object = "stpm2", ...)
: ...
signature(object = "stpm2", ...)
: ...
signature(object = "stpm2", ...)
: ...
showClass("stpm2")
showClass("stpm2")
"tvcCoxph"
Experimental approach to modelling time-dependent effects in Cox regression.
Objects can be created by calls of the form new("tvcCoxph", ...)
or cox.tvc(...)
. See the mle2
documentation.
call
:Object of class "language"
~~
call.orig
:Object of class "language"
~~
coef
:Object of class "numeric"
~~
fullcoef
:Object of class "numeric"
~~
vcov
:Object of class "matrix"
~~
min
:Object of class "numeric"
~~
details
:Object of class "list"
~~
minuslogl
:Object of class "function"
~~
method
:Object of class "character"
~~
data
:Object of class "list"
~~
formula
:Object of class "character"
~~
optimizer
:Object of class "character"
~~
Class mle2
, directly.
signature(x = "tvcCoxph", y = "missing")
: ...
showClass("tvcCoxph")
showClass("tvcCoxph")
Methods for function update
signature(object = "stpm2", ...)
: Similar to
update.default, using S4 methods.
The function voptimize
searches the interval from
lower
to upper
for a minimum or maximum of
the vectorised function f
with respect to its first argument.
optimise
is an alias for optimize
.
voptimize(f, interval, ..., lower=pmin(interval[,1], interval[,2]), upper=pmax(interval[,1], interval[,2]), maximum = FALSE, tol = .Machine$double.eps^0.25) voptimise(f, interval, ..., lower=pmin(interval[,1], interval[,2]), upper=pmax(interval[,1], interval[,2]), maximum = FALSE, tol = .Machine$double.eps^0.25)
voptimize(f, interval, ..., lower=pmin(interval[,1], interval[,2]), upper=pmax(interval[,1], interval[,2]), maximum = FALSE, tol = .Machine$double.eps^0.25) voptimise(f, interval, ..., lower=pmin(interval[,1], interval[,2]), upper=pmax(interval[,1], interval[,2]), maximum = FALSE, tol = .Machine$double.eps^0.25)
f |
the function to be optimized. The function is
either minimized or maximized over its first argument
depending on the value of |
interval |
a matrix with two columns containing the end-points of the interval to be searched for the minimum. |
... |
additional named or unnamed arguments to be passed
to |
lower , upper
|
the lower and upper end points of the interval to be searched. |
maximum |
logical. Should we maximize or minimize (the default)? |
tol |
the desired accuracy. |
Note that arguments after ...
must be matched exactly.
The method used is a combination of golden section search and
successive parabolic interpolation, and was designed for use with
continuous functions. Convergence is never much slower
than that for a Fibonacci search. If f
has a continuous second
derivative which is positive at the minimum (which is not at lower
or
upper
), then convergence is superlinear, and usually of the
order of about 1.324.
The function f
is never evaluated at two points closer together
than , where
is approximately
sqrt(.Machine$double.eps)
and is the final abscissa
optimize()$minimum
.
If f
is a unimodal function and the computed values of f
are always unimodal when separated by at least
, then
approximates the abscissa of the
global minimum of
f
on the interval lower,upper
with an
error less than .
If f
is not unimodal, then optimize()
may approximate a
local, but perhaps non-global, minimum to the same accuracy.
The first evaluation of f
is always at
where
(a,b) = (lower, upper)
and
is the golden section ratio.
Almost always, the second evaluation is at
.
Note that a local minimum inside
will be found as
solution, even when
f
is constant in there, see the last
example.
f
will be called as f(x, ...)
for a numeric value
of x.
The argument passed to f
has special semantics and used to be
shared between calls. The function should not copy it.
The implementation is a vectorised version of the optimize
function.
A list with components minimum
(or maximum
)
and objective
which give the location of the minimum (or maximum)
and the value of the function at that point.
Based on R's C translation of Fortran code https://netlib.org/fmm/fmin.f
(author(s) unstated)
based on the Algol 60 procedure localmin
given in the reference.
Brent, R. (1973) Algorithms for Minimization without Derivatives. Englewood Cliffs, NJ: Prentice-Hall.
optimize
for the standard single optimiser solver,
nlm
, uniroot
.
library(graphics) f <- function (x, a) (x - a)^2 xmin <- voptimize(f, lower=c(0, 0), upper=c(1,1), tol = 0.0001, a = c(1/3,2/3)) xmin ## See where the function is evaluated: voptimize(function(x) x^2*(print(x)-1), lower = c(0,0), upper = c(10,10)) ## "wrong" solution with unlucky interval and piecewise constant f(): f <- function(x) ifelse(x > -1, ifelse(x < 4, exp(-1/abs(x - 1)), 10), 10) fp <- function(x) { print(x); f(x) } plot(f, -2,5, ylim = 0:1, col = 2) voptimize(fp, cbind(-4, 20)) # doesn't see the minimum voptimize(fp, cbind(-7, 20)) # ok
library(graphics) f <- function (x, a) (x - a)^2 xmin <- voptimize(f, lower=c(0, 0), upper=c(1,1), tol = 0.0001, a = c(1/3,2/3)) xmin ## See where the function is evaluated: voptimize(function(x) x^2*(print(x)-1), lower = c(0,0), upper = c(10,10)) ## "wrong" solution with unlucky interval and piecewise constant f(): f <- function(x) ifelse(x > -1, ifelse(x < 4, exp(-1/abs(x - 1)), 10), 10) fp <- function(x) { print(x); f(x) } plot(f, -2,5, ylim = 0:1, col = 2) voptimize(fp, cbind(-4, 20)) # doesn't see the minimum voptimize(fp, cbind(-7, 20)) # ok
The function vuniroot
searches the interval from lower
to upper
for a root (i.e., zero) of the vectorised function f
with
respect to its first argument.
Setting extendInt
to a non-"no"
string, means searching
for the correct interval = c(lower,upper)
if sign(f(x))
does not satisfy the requirements at the interval end points; see the
‘Details’ section.
vuniroot(f, interval, ..., lower, upper, f.lower = f(lower, ...), f.upper = f(upper, ...), extendInt = c("no", "yes", "downX", "upX"), check.conv = FALSE, tol = .Machine$double.eps^0.25, maxiter = 1000, trace = 0, n = NULL)
vuniroot(f, interval, ..., lower, upper, f.lower = f(lower, ...), f.upper = f(upper, ...), extendInt = c("no", "yes", "downX", "upX"), check.conv = FALSE, tol = .Machine$double.eps^0.25, maxiter = 1000, trace = 0, n = NULL)
f |
the function for which the root is sought. |
interval |
a matrix with two columns containing the end-points of the interval to be searched for the root. |
... |
additional named or unnamed arguments to be passed
to |
lower , upper
|
the lower and upper end points of the interval to be searched. |
f.lower , f.upper
|
the same as |
extendInt |
character string specifying if the interval
|
check.conv |
logical indicating whether a convergence warning of the
underlying |
tol |
the desired accuracy (convergence tolerance). |
maxiter |
the maximum number of iterations. |
trace |
integer number; if positive, tracing information is produced. Higher values giving more details. |
n |
integer number; size of input vector to |
Note that arguments after ...
must be matched exactly.
Either interval
or both lower
and upper
must be
specified: the upper endpoint must be strictly larger than the lower
endpoint.
The function values at the endpoints must be of opposite signs (or
zero), for extendInt="no"
, the default. Otherwise, if
extendInt="yes"
, the interval is extended on both sides, in
search of a sign change, i.e., until the search interval
satisfies
.
If it is known how changes sign at the root
, that is, if the function is increasing or decreasing there,
extendInt
can (and typically should) be specified as
"upX"
(for “upward crossing”) or "downX"
,
respectively. Equivalently, define , to
require
at the solution. In that case, the search interval
possibly is extended to be such that
and
.
vuniroot()
uses a C++ subroutine based on ‘"zeroin"’ (from Netlib)
and algorithms given in the reference below. They assume a
continuous function (which then is known to have at least one root in
the interval).
Convergence is declared either if f(x) == 0
or the change in
x
for one step of the algorithm is less than tol
(plus an
allowance for representation error in x
).
If the algorithm does not converge in maxiter
steps, a warning
is printed and the current approximation is returned.
f
will be called as f(x, ...)
for a numeric value
of x.
The argument passed to f
has special semantics and used to be
shared between calls. The function should not copy it.
A list with at least three components: root
and f.root
give the location of the root and the value of the function evaluated
at that point. iter
gives the number of
iterations used.
Further components may be added in future: component init.it
was added in R 3.1.0.
Based on ‘zeroin.c’ in https://netlib.org/c/brent.shar.
Brent, R. (1973) Algorithms for Minimization without Derivatives. Englewood Cliffs, NJ: Prentice-Hall.
uniroot
for the standard single root solver
polyroot
for all complex roots of a polynomial;
optimize
, nlm
.
require(utils) # for str ## some platforms hit zero exactly on the first step: ## if so the estimated precision is 2/3. f <- function (x, a) x - a str(xmin <- vuniroot(f, lower=c(0, 0), upper=c(1,1), tol = 0.0001, a = c(1/3,2/3))) ## same example with scalars for lower and upper -- using the n argument str(xmin <- vuniroot(f, lower=0, upper=1, tol = 0.0001, n=2, a = c(1/3,2/3))) ## handheld calculator example: fixed point of cos(.): vuniroot(function(x) cos(x) - x, lower = -pi, upper = pi, tol = 1e-9)$root str(vuniroot(function(x) x*(x^2-1) + .5, lower = -2, upper = 2, tol = 0.0001)) str(vuniroot(function(x) x*(x^2-1) + .5, lower = -2, upper = 2, tol = 1e-10)) ## Find the smallest value x for which exp(x) > 0 (numerically): r <- vuniroot(function(x) 1e80*exp(x) - 1e-300, cbind(-1000, 0), tol = 1e-15) str(r, digits.d = 15) # around -745, depending on the platform. exp(r$root) # = 0, but not for r$root * 0.999... minexp <- r$root * (1 - 10*.Machine$double.eps) exp(minexp) # typically denormalized ##--- vuniroot() with new interval extension + checking features: -------------- f1 <- function(x) (121 - x^2)/(x^2+1) f2 <- function(x) exp(-x)*(x - 12) tools::assertCondition(vuniroot(f1, cbind(0,10)), "error", verbose=TRUE) tools::assertCondition(vuniroot(f2, cbind(0, 2)), "error", verbose=TRUE) ##--> error: f() .. end points not of opposite sign ## where as 'extendInt="yes"' simply first enlarges the search interval: u1 <- vuniroot(f1, cbind(0,10),extendInt="yes", trace=1) u2 <- vuniroot(f2, cbind(0,2), extendInt="yes", trace=2) stopifnot(all.equal(u1$root, 11, tolerance = 1e-5), all.equal(u2$root, 12, tolerance = 6e-6)) ## The *danger* of interval extension: ## No way to find a zero of a positive function, but ## numerically, f(-|M|) becomes zero : tools::assertCondition(u3 <- vuniroot(exp, cbind(0,2), extendInt="yes", trace=TRUE), "error", verbose=TRUE) ## Nonsense example (must give an error): tools::assertCondition( vuniroot(function(x) 1, cbind(0,1), extendInt="yes"), "error", verbose=TRUE) ## Convergence checking : sinc_ <- function(x) ifelse(x == 0, 1, sin(x)/x) curve(sinc_, -6,18); abline(h=0,v=0, lty=3, col=adjustcolor("gray", 0.8)) vuniroot(sinc_, cbind(0,5), extendInt="yes", maxiter=4) #-> "just" a warning ## now with check.conv=TRUE, must signal a convergence error : vuniroot(sinc_, cbind(0,5), extendInt="yes", maxiter=4, check.conv=TRUE) ### Weibull cumulative hazard (example origin, Ravi Varadhan): cumhaz <- function(t, a, b) b * (t/b)^a froot <- function(x, u, a, b) cumhaz(x, a, b) - u n <- 10 u <- -log(runif(n)) a <- 1/2 b <- 1 ## Find failure times ru <- vuniroot(froot, u=u, a=a, b=b, interval= cbind(rep(1.e-14,n), rep(1e4,n)), extendInt="yes")$root ru2 <- vuniroot(froot, u=u, a=a, b=b, interval= cbind(rep(0.01,n), rep(10,n)), extendInt="yes")$root stopifnot(all.equal(ru, ru2, tolerance = 6e-6)) r1 <- vuniroot(froot, u= 0.99, a=a, b=b, interval= cbind(0.01, 10), extendInt="up") stopifnot(all.equal(0.99, cumhaz(r1$root, a=a, b=b))) ## An error if 'extendInt' assumes "wrong zero-crossing direction": vuniroot(froot, u= 0.99, a=a, b=b, interval= cbind(0.1, 10), extendInt="down")
require(utils) # for str ## some platforms hit zero exactly on the first step: ## if so the estimated precision is 2/3. f <- function (x, a) x - a str(xmin <- vuniroot(f, lower=c(0, 0), upper=c(1,1), tol = 0.0001, a = c(1/3,2/3))) ## same example with scalars for lower and upper -- using the n argument str(xmin <- vuniroot(f, lower=0, upper=1, tol = 0.0001, n=2, a = c(1/3,2/3))) ## handheld calculator example: fixed point of cos(.): vuniroot(function(x) cos(x) - x, lower = -pi, upper = pi, tol = 1e-9)$root str(vuniroot(function(x) x*(x^2-1) + .5, lower = -2, upper = 2, tol = 0.0001)) str(vuniroot(function(x) x*(x^2-1) + .5, lower = -2, upper = 2, tol = 1e-10)) ## Find the smallest value x for which exp(x) > 0 (numerically): r <- vuniroot(function(x) 1e80*exp(x) - 1e-300, cbind(-1000, 0), tol = 1e-15) str(r, digits.d = 15) # around -745, depending on the platform. exp(r$root) # = 0, but not for r$root * 0.999... minexp <- r$root * (1 - 10*.Machine$double.eps) exp(minexp) # typically denormalized ##--- vuniroot() with new interval extension + checking features: -------------- f1 <- function(x) (121 - x^2)/(x^2+1) f2 <- function(x) exp(-x)*(x - 12) tools::assertCondition(vuniroot(f1, cbind(0,10)), "error", verbose=TRUE) tools::assertCondition(vuniroot(f2, cbind(0, 2)), "error", verbose=TRUE) ##--> error: f() .. end points not of opposite sign ## where as 'extendInt="yes"' simply first enlarges the search interval: u1 <- vuniroot(f1, cbind(0,10),extendInt="yes", trace=1) u2 <- vuniroot(f2, cbind(0,2), extendInt="yes", trace=2) stopifnot(all.equal(u1$root, 11, tolerance = 1e-5), all.equal(u2$root, 12, tolerance = 6e-6)) ## The *danger* of interval extension: ## No way to find a zero of a positive function, but ## numerically, f(-|M|) becomes zero : tools::assertCondition(u3 <- vuniroot(exp, cbind(0,2), extendInt="yes", trace=TRUE), "error", verbose=TRUE) ## Nonsense example (must give an error): tools::assertCondition( vuniroot(function(x) 1, cbind(0,1), extendInt="yes"), "error", verbose=TRUE) ## Convergence checking : sinc_ <- function(x) ifelse(x == 0, 1, sin(x)/x) curve(sinc_, -6,18); abline(h=0,v=0, lty=3, col=adjustcolor("gray", 0.8)) vuniroot(sinc_, cbind(0,5), extendInt="yes", maxiter=4) #-> "just" a warning ## now with check.conv=TRUE, must signal a convergence error : vuniroot(sinc_, cbind(0,5), extendInt="yes", maxiter=4, check.conv=TRUE) ### Weibull cumulative hazard (example origin, Ravi Varadhan): cumhaz <- function(t, a, b) b * (t/b)^a froot <- function(x, u, a, b) cumhaz(x, a, b) - u n <- 10 u <- -log(runif(n)) a <- 1/2 b <- 1 ## Find failure times ru <- vuniroot(froot, u=u, a=a, b=b, interval= cbind(rep(1.e-14,n), rep(1e4,n)), extendInt="yes")$root ru2 <- vuniroot(froot, u=u, a=a, b=b, interval= cbind(rep(0.01,n), rep(10,n)), extendInt="yes")$root stopifnot(all.equal(ru, ru2, tolerance = 6e-6)) r1 <- vuniroot(froot, u= 0.99, a=a, b=b, interval= cbind(0.01, 10), extendInt="up") stopifnot(all.equal(0.99, cumhaz(r1$root, a=a, b=b))) ## An error if 'extendInt' assumes "wrong zero-crossing direction": vuniroot(froot, u= 0.99, a=a, b=b, interval= cbind(0.1, 10), extendInt="down")