Perform a phylogenetic regression using the Cauchy Process, by numerical optimization.

cauphylm(
  formula,
  data = list(),
  phy,
  model = c("cauchy", "lambda"),
  lower.bound = list(disp = 0, lambda = 0),
  upper.bound = list(disp = Inf, lambda = NULL),
  starting.value = list(disp = NULL, lambda = NULL),
  hessian = FALSE
)

Arguments

formula

a model formula.

data

a data frame containing variables in the model. If not found in data, the variables are taken from current environment.

phy

a phylogenetic tree of class phylo.

model

a model for the trait evolution. One of "cauchy" or "lambda" (see Details).

lower.bound

named list with lower bound values for the parameters. See Details for the default values.

upper.bound

named list with upper bound values for the parameters. See Details for the default values.

starting.value

named list initial values for the parameters. See Details for the default values.

hessian

if TRUE, then the numerical hessian is computed, for confidence interval computations. See compute_vcov.

Value

coefficients

the named vector of estimated coefficients.

disp

the maximum likelihood estimate of the dispersion parameter.

logLik

the maximum of the log likelihood.

p

the number of all parameters of the model.

aic

AIC value of the model.

fitted.values

fitted values

residuals

raw residuals

y

response

X

design matrix

n

number of observations (tips in the tree)

d

number of dependent variables

formula

the model formula

call

the original call to the function

model

the phylogenetic model for the covariance

phy

the phylogenetic tree

lambda

the ml estimate of the lambda parameter (for model="lambda")

Details

This function fits a Cauchy Process on the phylogeny, using maximum likelihood and the "fixed.root" method (see fitCauchy). It further assumes that the root value x0 is a linear combination of the covariables in formula. The corresponding regression model is: $$Y = X \beta + E,$$ with:

\(Y\)

the vector of traits at the tips of the tree;

\(X\)

the regression matrix of covariables in formula;

\(\beta\)

the vector of coefficients;

\(E\)

a centered error vector that is Cauchy distributed, and can be seen as the result of a Cauchy process starting at 0 at the root, and with a dispersion disp (see fitCauchy).

Unless specified by the user, the initial values for the parameters are taken according to the following heuristics:

coefficients:

\(\beta\) are obtained from a robust regression using lmrob.S;

disp:

is initialized from the trait centered and normalized by tip heights, with one of the following statistics, taken from Rousseeuw & Croux 1993:

Unless specified by the user, disp is taken positive unbounded.

The function uses nloptr for the numerical optimization of the (restricted) likelihood, computed with function logDensityTipsCauchy. It uses algorithms BOBYQA and MLSL_LDS for local and global optimization.

If model="lambda", the CP is fit on a tree with branch lengths re-scaled using the Pagel's lambda transform (see transf.branch.lengths), and the lambda value is estimated using numerical optimization. The default initial value for the lambda parameter is computed using adequate robust moments. The default maximum value is computed using phytools:::maxLambda, and is the ratio between the maximum height of a tip node over the maximum height of an internal node. This can be larger than 1. The default minimum value is 0.

References

Bastide, P. and Didier, G. 2023. The Cauchy Process on Phylogenies: a Tractable Model for Pulsed Evolution. Systematic Biology. doi:10.1093/sysbio/syad053.

Rothenberg T. J., Fisher F. M., Tilanus C. B. 1964. A Note on Estimation from a Cauchy Sample. Journal of the American Statistical Association. 59:460–463.

Rousseeuw P.J., Croux C. 1993. Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association. 88:1273–1283.

Examples

# Simulate tree and data
set.seed(1289)
phy <- ape::rphylo(20, 0.1, 0)
error <- rTraitCauchy(n = 1, phy = phy, model = "cauchy",
                      parameters = list(root.value = 0, disp = 0.1))
x1 <- ape::rTraitCont(phy, model = "BM", sigma = 0.1, root.value = 0)
trait <- 3 + 2*x1 + error
# Fit the data
fit <- cauphylm(trait ~ x1, phy = phy)
fit
#> Call:
#> cauphylm(formula = trait ~ x1, phy = phy)
#> 
#>    AIC logLik 
#>   77.8  -35.9 
#> 
#> Parameter estimate(s) using ML:
#> dispersion: 0.05249818 
#> 
#> Coefficients:
#> (Intercept)          x1 
#>    2.005913    1.421920 
# Approximate confidence intervals
confint(fit)
#> Approximated asymptotic confidence interval using the Hessian.
#>                  2.5 %     97.5 %
#> (Intercept) 0.42927112 3.58255462
#> x1          0.68876658 2.15507244
#> disp        0.01916989 0.08582647