Title: | Statistical Analysis for Random Objects and Non-Euclidean Data |
---|---|
Description: | Provides implementation of statistical methods for random objects lying in various metric spaces, which are not necessarily linear spaces. The core of this package is Fréchet regression for random objects with Euclidean predictors, which allows one to perform regression analysis for non-Euclidean responses under some mild conditions. Examples include distributions in L^2-Wasserstein space, covariance matrices endowed with power metric (with Frobenius metric as a special case), Cholesky and log-Cholesky metrics. References: Petersen, A., & Müller, H.-G. (2019) <doi:10.1214/17-AOS1624>. |
Authors: | Yaqing Chen [aut, cre], Alvaro Gajardo [aut], Jianing Fan [aut], Qixian Zhong [aut], Paromita Dubey [aut], Kyunghee Han [aut], Satarupa Bhattacharjee [aut], Hans-Georg Müller [cph, ths, aut] |
Maintainer: | Yaqing Chen <[email protected]> |
License: | BSD_3_clause + file LICENSE |
Version: | 0.2.0 |
Built: | 2025-03-02 05:05:01 UTC |
Source: | https://github.com/functionaldata/tfrechet |
Smooth backfitting procedure for estimating nonparametric functional additive models for density responses
AddDensReg(Ly, X, x = NULL, hu = NULL, hx = NULL, dSup = NULL)
AddDensReg(Ly, X, x = NULL, hu = NULL, hx = NULL, dSup = NULL)
Ly |
A list of n vectors holding random samples independently drawn from each density response |
X |
An n-by-d design matrix whose row vectors consist of regressors for component functions |
x |
A grid matrix whose row vectors consist of evaluation points for component functions. Defaults to the observed design matrix. |
hu |
A scalar bandwidth for kernel smoothing marginal mean function of the functional additive model |
dSup |
A 2-dimensional vector that specify the lower and upper limits of the common support for density responses. Defaults to the range of the observed random samples. |
huA |
d-dimensional vector bandwidth for kernel smoothing each component function |
AddDensReg
fits additive functional regression models for density responses, where the conditional mean function of a transformed density response is given by the summation of nonparametric univariate functions associated with d covariates, respectively.
Instead of having functional responses as infinite-dimensional random objects, AddDensReg
has inputs with random samples independently drawn from each random density, and density responses are reconstructed by kernel density estimation.
The current version only supports the log-quantile density (LQD) transformation proposed by Petersen and Mueller (2016).
A list holding the following fields:
lqdGrid |
A grid where the LQD transformed density responses are evaluated. Defaults to an equally space grid over |
lqdSbfMean |
The marginal mean function estimates evaluated on |
LlqdSbfComp |
A list of d matrices holding the estimates of each component function evaluated on |
densGrid |
A grid where density responses are evaluated. Defaults to an equally space grid over the range of the observed random samples. |
densSbfMean |
The density inversion of |
LdensSbfComp |
A list of d matrices holding the density inversion of each component function together with |
Han, K., Mueller, H.-G., and Park, B. U. (2020), "Additive functional regression for densities as responses", Journal of the American Statistical Association , 115 (530), pp.997-1010.
Peterson, A. and Mueller, H.-G. (2016), "Functional data analysis for density functions by transformation to a Hilbert space", The Annals of Statistics, 44(1), pp.183-218
SBFitting
in the fdapace
package
library(MASS) # additive component functions g1 <- function (u, x1) sin(2*pi*u) * (2*x1 - 1) g2 <- function (u, x2) sin(2*pi*u) * sin(2*pi*x2) g <- function (u, x) g1(u, x[1]) + g2(u, x[2]) # generating random samples from conditional quantile functions GenLqdNoise <- function (u, e) e[1]*sin(pi*u) + e[2]*sin(2*pi*u) GenQdensResp <- function (u, x, e) exp(g(u, x) + GenLqdNoise(u, e)) GenQuantileResp <- function (u, x, e) { tmp1 <- integrate(GenQdensResp, lower = 0, upper = u, x = x, e = e)$value tmp2 <- integrate(GenQdensResp, lower = 0, upper = 1, x = x, e = e)$value return (tmp1 / tmp2) } set.seed(999) n <- 150 N <- 250 Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2, ncol = 2) X <- pnorm(mvrnorm(n, rep(0, 2), Sigma)) Ly <- list() for (i in 1:n) { U_i <- runif(N) E_i <- c(rnorm(1, 0, 0.1), rnorm(1, 0, 0.05)) Ly[[i]] <- sapply(1:N, function (l) GenQuantileResp(U_i[l], X[i,], E_i)) } M <- 51 x1 <- x2 <- seq(0, 1, length.out = M) x <- cbind(x1, x2) hu <- 0.05 hx <- c(0.075, 0.075) dSup <- c(0, 1) # estimating the functional additive model estAddDensReg <- AddDensReg(Ly = Ly, X = X, x = x, hu = hu, hx = hx, dSup = dSup) # true LQD component functions g1Eval <- g2Eval <- matrix(nrow = length(estAddDensReg$lqdGrid), ncol = M) for (l in seq(estAddDensReg$lqdGrid)) { for (m in seq(M)) { g1Eval[l,m] <- g1(estAddDensReg$lqdGrid[l], x1[m]) g2Eval[l,m] <- g2(estAddDensReg$lqdGrid[l], x2[m]) } } # LQD component function estimates g0Sbf <- estAddDensReg$lqdSbfMean gjSbf <- estAddDensReg$LlqdSbfComp # true density component functions dens1Eval <- dens2Eval <- matrix(nrow = length(estAddDensReg$densGrid), ncol = M) for (m in seq(M)) { dens1Eval[,m] <- fdadensity::lqd2dens(lqd = g1Eval[,m], lqdSup = estAddDensReg$lqdGrid, dSup = estAddDensReg$densGrid) dens2Eval[,m] <- fdadensity::lqd2dens(lqd = g2Eval[,m], lqdSup = estAddDensReg$lqdGrid, dSup = estAddDensReg$densGrid) } # density component function estimates dens0Sbf <- estAddDensReg$densSbfMean densjSbf <- estAddDensReg$LdensSbfComp # graphical illustration of LQD component function estimates par(mfrow = c(2,2)) par(mar=rep(0.5, 4)+0.1) persp(estAddDensReg$lqdGrid, x1, g1Eval, theta = 35, phi = 35, xlab = '\n u', ylab = '\n x1', zlab = '\n LQD component (g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$lqdGrid, x2, g2Eval, theta = 35, phi = 35, xlab = '\n u', ylab = '\n x1', zlab = '\n LQD component (g2)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$lqdGrid, x1, gjSbf[[1]], theta = 35, phi = 35, xlab = '\n u', ylab = '\n x1', zlab = '\n SBF estimate (g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$lqdGrid, x2, gjSbf[[2]], theta = 35, phi = 35, xlab = '\n u', ylab = '\n x2', zlab = '\n SBF estimate (g2)', border = NA, shade = 0.5, ticktype = 'detailed') # graphical illustration of density component function estimates par(mfrow = c(2,2)) par(mar=rep(0.5, 4)+0.1) persp(estAddDensReg$densGrid, x1, dens1Eval, theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n Component density \n (Inversion of g0 + g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$densGrid, x2, dens2Eval, theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n Component density \n (Inversion of g0 + g2)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$densGrid, x1, densjSbf[[1]], theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n SBF estimate \n (Inversion of g0 + g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$densGrid, x2, densjSbf[[2]], theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n SBF estimate \n (Inversion of g0 + g2)', border = NA, shade = 0.5, ticktype = 'detailed') # fitted density responses fitAddDensReg <- AddDensReg(Ly = Ly, X = X, hu = hu, hx = hx, dSup = dSup) # fitted LQD component functions g0SBFit <- fitAddDensReg$lqdSbfMean gjSBFit <- fitAddDensReg$LlqdSbfComp # fitted density responses densSBFit <- lapply(1:n, function (i) { gSBFit <- g0SBFit + gjSBFit[[1]][,i] + gjSBFit[[2]][,i] densSBFit_i <- fdadensity::lqd2dens(lqd = gSBFit, lqdSup = fitAddDensReg$lqdGrid, dSup = fitAddDensReg$densGrid) return (densSBFit_i) } ) # graphical illustration of fitted density responses set.seed(999) ind <- sample(1:n, 12) par(mfrow = c(3, 4)) par(mar=c(4, 4, 4, 1)+0.1) for (i in ind) { hist_i <- hist(Ly[[i]], plot = FALSE) hist(Ly[[i]], probability = TRUE, ylim = range(c(hist_i$density, densSBFit[[i]])), xlab = 'Y', main = paste(i, '-th random sample \n with X = (', round(X[i,1],2), ', ', round(X[i,2],2), ')', sep = '')) lines(fitAddDensReg$densGrid, densSBFit[[i]], col = 2, lwd = 2) }
library(MASS) # additive component functions g1 <- function (u, x1) sin(2*pi*u) * (2*x1 - 1) g2 <- function (u, x2) sin(2*pi*u) * sin(2*pi*x2) g <- function (u, x) g1(u, x[1]) + g2(u, x[2]) # generating random samples from conditional quantile functions GenLqdNoise <- function (u, e) e[1]*sin(pi*u) + e[2]*sin(2*pi*u) GenQdensResp <- function (u, x, e) exp(g(u, x) + GenLqdNoise(u, e)) GenQuantileResp <- function (u, x, e) { tmp1 <- integrate(GenQdensResp, lower = 0, upper = u, x = x, e = e)$value tmp2 <- integrate(GenQdensResp, lower = 0, upper = 1, x = x, e = e)$value return (tmp1 / tmp2) } set.seed(999) n <- 150 N <- 250 Sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2, ncol = 2) X <- pnorm(mvrnorm(n, rep(0, 2), Sigma)) Ly <- list() for (i in 1:n) { U_i <- runif(N) E_i <- c(rnorm(1, 0, 0.1), rnorm(1, 0, 0.05)) Ly[[i]] <- sapply(1:N, function (l) GenQuantileResp(U_i[l], X[i,], E_i)) } M <- 51 x1 <- x2 <- seq(0, 1, length.out = M) x <- cbind(x1, x2) hu <- 0.05 hx <- c(0.075, 0.075) dSup <- c(0, 1) # estimating the functional additive model estAddDensReg <- AddDensReg(Ly = Ly, X = X, x = x, hu = hu, hx = hx, dSup = dSup) # true LQD component functions g1Eval <- g2Eval <- matrix(nrow = length(estAddDensReg$lqdGrid), ncol = M) for (l in seq(estAddDensReg$lqdGrid)) { for (m in seq(M)) { g1Eval[l,m] <- g1(estAddDensReg$lqdGrid[l], x1[m]) g2Eval[l,m] <- g2(estAddDensReg$lqdGrid[l], x2[m]) } } # LQD component function estimates g0Sbf <- estAddDensReg$lqdSbfMean gjSbf <- estAddDensReg$LlqdSbfComp # true density component functions dens1Eval <- dens2Eval <- matrix(nrow = length(estAddDensReg$densGrid), ncol = M) for (m in seq(M)) { dens1Eval[,m] <- fdadensity::lqd2dens(lqd = g1Eval[,m], lqdSup = estAddDensReg$lqdGrid, dSup = estAddDensReg$densGrid) dens2Eval[,m] <- fdadensity::lqd2dens(lqd = g2Eval[,m], lqdSup = estAddDensReg$lqdGrid, dSup = estAddDensReg$densGrid) } # density component function estimates dens0Sbf <- estAddDensReg$densSbfMean densjSbf <- estAddDensReg$LdensSbfComp # graphical illustration of LQD component function estimates par(mfrow = c(2,2)) par(mar=rep(0.5, 4)+0.1) persp(estAddDensReg$lqdGrid, x1, g1Eval, theta = 35, phi = 35, xlab = '\n u', ylab = '\n x1', zlab = '\n LQD component (g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$lqdGrid, x2, g2Eval, theta = 35, phi = 35, xlab = '\n u', ylab = '\n x1', zlab = '\n LQD component (g2)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$lqdGrid, x1, gjSbf[[1]], theta = 35, phi = 35, xlab = '\n u', ylab = '\n x1', zlab = '\n SBF estimate (g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$lqdGrid, x2, gjSbf[[2]], theta = 35, phi = 35, xlab = '\n u', ylab = '\n x2', zlab = '\n SBF estimate (g2)', border = NA, shade = 0.5, ticktype = 'detailed') # graphical illustration of density component function estimates par(mfrow = c(2,2)) par(mar=rep(0.5, 4)+0.1) persp(estAddDensReg$densGrid, x1, dens1Eval, theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n Component density \n (Inversion of g0 + g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$densGrid, x2, dens2Eval, theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n Component density \n (Inversion of g0 + g2)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$densGrid, x1, densjSbf[[1]], theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n SBF estimate \n (Inversion of g0 + g1)', border = NA, shade = 0.5, ticktype = 'detailed') persp(estAddDensReg$densGrid, x2, densjSbf[[2]], theta = 35, phi = 35, xlab = '\n y', ylab = '\n x1', zlab = '\n\n SBF estimate \n (Inversion of g0 + g2)', border = NA, shade = 0.5, ticktype = 'detailed') # fitted density responses fitAddDensReg <- AddDensReg(Ly = Ly, X = X, hu = hu, hx = hx, dSup = dSup) # fitted LQD component functions g0SBFit <- fitAddDensReg$lqdSbfMean gjSBFit <- fitAddDensReg$LlqdSbfComp # fitted density responses densSBFit <- lapply(1:n, function (i) { gSBFit <- g0SBFit + gjSBFit[[1]][,i] + gjSBFit[[2]][,i] densSBFit_i <- fdadensity::lqd2dens(lqd = gSBFit, lqdSup = fitAddDensReg$lqdGrid, dSup = fitAddDensReg$densGrid) return (densSBFit_i) } ) # graphical illustration of fitted density responses set.seed(999) ind <- sample(1:n, 12) par(mfrow = c(3, 4)) par(mar=c(4, 4, 4, 1)+0.1) for (i in ind) { hist_i <- hist(Ly[[i]], plot = FALSE) hist(Ly[[i]], probability = TRUE, ylim = range(c(hist_i$density, densSBFit[[i]])), xlab = 'Y', main = paste(i, '-th random sample \n with X = (', round(X[i,1],2), ', ', round(X[i,2],2), ')', sep = '')) lines(fitAddDensReg$densGrid, densSBFit[[i]], col = 2, lwd = 2) }
Generate color bar/scale.
color.bar( colVal = NULL, colBreaks = NULL, min = NULL, max = NULL, lut = NULL, nticks = 5, ticks = NULL, title = NULL )
color.bar( colVal = NULL, colBreaks = NULL, min = NULL, max = NULL, lut = NULL, nticks = 5, ticks = NULL, title = NULL )
colVal |
A numeric vector giving the variable values to which each color is corresponding. It overrides |
colBreaks |
A numeric vector giving the breaks dividing the range of variable into different colors. It overrides |
min |
A scalar giving the minimum value of the variable represented by colors. |
max |
A scalar giving the maximum value of the variable represented by colors. |
lut |
Color vector. Default is |
nticks |
An integer giving the number of ticks used in the axis of color bar. |
ticks |
A numeric vector giving the locations of ticks used in the axis of color bar; it overrides |
title |
A character giving the label of the variable according to which the color bar is generated. |
No return value.
Fréchet mean computation for covariance matrices.
CovFMean(M = NULL, optns = list())
CovFMean(M = NULL, optns = list())
M |
A q by q by n array (resp. a list of q by q matrices) where |
optns |
A list of options control parameters specified by |
Available control options are
Metric type choice, "frobenius"
, "power"
, "log_cholesky"
, "cholesky"
- default: "frobenius"
which corresponds to the power metric with alpha
equal to 1.
The power parameter for the power metric, which can be any non-negative number. Default is 1 which corresponds to Frobenius metric.
A vector of weights to compute the weighted barycenter. The length of weights
is equal to the sample size n. Default is equal weights.
A list containing the following fields:
Mout |
A list containing the Fréchet mean of the covariance matrices in |
optns |
A list containing the |
Petersen, A. and Müller, H.-G. (2019). Fréchet regression for random objects with Euclidean predictors. The Annals of Statistics, 47(2), 691–719.
Petersen, A., Deoni, S. and Müller, H.-G. (2019). Fréchet estimation of time-varying covariance matrices from sparse data, with application to the regional co-evolution of myelination in the developing brain. The Annals of Applied Statistics, 13(1), 393–419.
Lin, Z. (2019). Riemannian geometry of symmetric positive definite matrices via Cholesky decomposition. Siam. J. Matrix. Anal, A. 40, 1353–1370.
#Example M input n=10 #sample size m=5 # dimension of covariance matrices M <- array(0,c(m,m,n)) for (i in 1:n){ y0=rnorm(m) aux<-diag(m)+y0%*%t(y0) M[,,i]<-aux } Fmean=CovFMean(M=M,optns=list(metric="frobenius"))
#Example M input n=10 #sample size m=5 # dimension of covariance matrices M <- array(0,c(m,m,n)) for (i in 1:n){ y0=rnorm(m) aux<-diag(m)+y0%*%t(y0) M[,,i]<-aux } Fmean=CovFMean(M=M,optns=list(metric="frobenius"))
Plots for Fréchet regression for covariance matrices.
CreateCovRegPlot(x, optns = list())
CreateCovRegPlot(x, optns = list())
x |
A |
optns |
A list of control options specified by |
Available control options are
A vector holding the indices of elements in x$Mout
at which the plots will be made. Default is
1:length(x$Mout)
when x$Mout
is of length no more than 3;
c(1,round(length(x$Mout)/2),length(x$Mout))
when x$Mout
is of length greater than 3.
An integer — default: 1; subsequent figures will be drawn in an optns$nrow
-by-ceiling(length(ind.xout)/optns$nrow)
array.
Character with two choices, "continuous" and "categorical". The former plots the correlations in a continuous scale of colors by magnitude while the latter categorizes the positive and negative entries into two different colors. Default is "continuous"
Character, the ordering method of the correlation matrix.
"original"
for original order (default);
"AOE"
for the angular order of the eigenvectors;
"FPC"
for the first principal component order;
"hclust"
for the hierarchical clustering order, drawing 4 rectangles on the graph according to the hierarchical cluster;
"alphabet"
for alphabetical order.
Character, the visualization method of correlation matrix to be used. Currently, it supports seven methods, named "circle" (default), "square", "ellipse", "number", "pie", "shade" and "color".
Logical, indicating if output is shown as correlation or covariance matrix. Default is FALSE
and corresponds to a covariance matrix.
Character, "full" (default), "upper" or "lower", display full matrix, lower triangular or upper triangular matrix.
No return value.
#Example y input n=20 # sample size t=seq(0,1,length.out=100) # length of data x = matrix(runif(n),n) theta1 = theta2 = array(0,n) for(i in 1:n){ theta1[i] = rnorm(1,x[i],x[i]^2) theta2[i] = rnorm(1,x[i]/2,(1-x[i])^2) } y = matrix(0,n,length(t)) phi1 = sqrt(3)*t phi2 = sqrt(6/5)*(1-t/2) y = theta1%*%t(phi1) + theta2 %*% t(phi2) xout = matrix(c(0.25,0.5,0.75),3) Cov_est=GloCovReg(x=x,y=y,xout=xout,optns=list(corrOut = FALSE, metric="power",alpha=3)) CreateCovRegPlot(Cov_est, optns = list(ind.xout = 2, plot.method = "shade")) CreateCovRegPlot(Cov_est, optns = list(plot.method = "color"))
#Example y input n=20 # sample size t=seq(0,1,length.out=100) # length of data x = matrix(runif(n),n) theta1 = theta2 = array(0,n) for(i in 1:n){ theta1[i] = rnorm(1,x[i],x[i]^2) theta2[i] = rnorm(1,x[i]/2,(1-x[i])^2) } y = matrix(0,n,length(t)) phi1 = sqrt(3)*t phi2 = sqrt(6/5)*(1-t/2) y = theta1%*%t(phi1) + theta2 %*% t(phi2) xout = matrix(c(0.25,0.5,0.75),3) Cov_est=GloCovReg(x=x,y=y,xout=xout,optns=list(corrOut = FALSE, metric="power",alpha=3)) CreateCovRegPlot(Cov_est, optns = list(ind.xout = 2, plot.method = "shade")) CreateCovRegPlot(Cov_est, optns = list(plot.method = "color"))
Create kernel density estimate along the support of the raw data using the HADES method.
CreateDensity( y = NULL, histogram = NULL, freq = NULL, bin = NULL, optns = list() )
CreateDensity( y = NULL, histogram = NULL, freq = NULL, bin = NULL, optns = list() )
y |
A vector of raw readings. |
histogram |
A |
freq |
A frequency vector. Use this option when frequency table is only available, but not the raw sample or the histogram object. The corresponding |
bin |
A bin vector having its length with |
optns |
A list of options control parameters specified by |
Available control options are
The bandwidth value for the smoothed mean function; positive numeric - default: determine automatically based on the data-driven bandwidth selector proposed by Sheather and Jones (1991)
The number of support points the KDE; numeric - default: 101.
The size of the bin to be used; numeric - default: diff(range(y))/1000
. It only works when the raw sample is available.
smoothing kernel choice, "rect"
, "gauss"
, "epan"
, "gausvar"
, "quar"
- default: "gauss"
.
logical if we expect the distribution to have infinite support or not; logical - default: FALSE
.
User defined output grid for the support of the KDE, it overrides nRegGrid
; numeric - default: NULL
.
A list containing the following fields:
bw |
The bandwidth used for smoothing. |
x |
A vector of length |
y |
A vector of length |
H.-G. Müller, J.L. Wang and W.B. Capra (1997). "From lifetables to hazard rates: The transformation approach." Biometrika 84, 881–892.
S.J. Sheather and M.C. Jones (1991). "A reliable data-based bandwidth selection method for kernel density estimation." JRSS-B 53, 683–690.
H.-G. Müller, U. Stadtmüller, and T. Schmitt. (1987) "Bandwidth choice and confidence intervals for derivatives of noisy data." Biometrika 74, 743–749.
### compact support case # input: raw sample set.seed(100) n <- 100 x0 <-seq(0,1,length.out=51) Y <- rbeta(n,3,2) f1 <- CreateDensity(y=Y,optns = list(outputGrid=x0)) # input: histogram histY <- hist(Y) f2 <- CreateDensity(histogram=histY,optns = list(outputGrid=x0)) # input: frequency table with unequally spaced (random) bins binY <- c(0,sort(runif(9)),1) freqY <- c() for (i in 1:(length(binY)-1)) { freqY[i] <- length(which(Y>binY[i] & Y<=binY[i+1])) } f3 <- CreateDensity(freq=freqY, bin=binY,optns = list(outputGrid=x0)) # plot plot(f1$x,f1$y,type='l',col=2,lty=2,lwd=2, xlim=c(0,1),ylim=c(0,2),xlab='domain',ylab='density') points(f2$x,f2$y,type='l',col=3,lty=3,lwd=2) points(f3$x,f3$y,type='l',col=4,lty=4,lwd=2) points(x0,dbeta(x0,3,2),type='l',lwd=2) legend('topleft', c('true','raw sample','histogram','frequency table (unequal bin)'), col=1:4,lty=1:4,lwd=3,bty='n') ### infinite support case # input: raw sample set.seed(100) n <- 200 x0 <-seq(-3,3,length.out=101) Y <- rnorm(n) f1 <- CreateDensity(y=Y,optns = list(outputGrid=x0)) # input: histogram histY <- hist(Y) f2 <- CreateDensity(histogram=histY,optns = list(outputGrid=x0)) # input: frequency table with unequally spaced (random) bins binY <- c(-3,sort(runif(9,-3,3)),3) freqY <- c() for (i in 1:(length(binY)-1)) { freqY[i] <- length(which(Y>binY[i] & Y<=binY[i+1])) } f3 <- CreateDensity(freq=freqY, bin=binY,optns = list(outputGrid=x0)) # plot plot(f1$x,f1$y,type='l',col=2,lty=2,lwd=2, xlim=c(-3,3),ylim=c(0,0.5),xlab='domain',ylab='density') points(f2$x,f2$y,type='l',col=3,lty=3,lwd=2) points(f3$x,f3$y,type='l',col=4,lty=4,lwd=2) points(x0,dnorm(x0),type='l',lwd=2) legend('topright', c('true','raw sample','histogram','frequency table (unequal bin)'), col=1:4,lty=1:4,lwd=3,bty='n')
### compact support case # input: raw sample set.seed(100) n <- 100 x0 <-seq(0,1,length.out=51) Y <- rbeta(n,3,2) f1 <- CreateDensity(y=Y,optns = list(outputGrid=x0)) # input: histogram histY <- hist(Y) f2 <- CreateDensity(histogram=histY,optns = list(outputGrid=x0)) # input: frequency table with unequally spaced (random) bins binY <- c(0,sort(runif(9)),1) freqY <- c() for (i in 1:(length(binY)-1)) { freqY[i] <- length(which(Y>binY[i] & Y<=binY[i+1])) } f3 <- CreateDensity(freq=freqY, bin=binY,optns = list(outputGrid=x0)) # plot plot(f1$x,f1$y,type='l',col=2,lty=2,lwd=2, xlim=c(0,1),ylim=c(0,2),xlab='domain',ylab='density') points(f2$x,f2$y,type='l',col=3,lty=3,lwd=2) points(f3$x,f3$y,type='l',col=4,lty=4,lwd=2) points(x0,dbeta(x0,3,2),type='l',lwd=2) legend('topleft', c('true','raw sample','histogram','frequency table (unequal bin)'), col=1:4,lty=1:4,lwd=3,bty='n') ### infinite support case # input: raw sample set.seed(100) n <- 200 x0 <-seq(-3,3,length.out=101) Y <- rnorm(n) f1 <- CreateDensity(y=Y,optns = list(outputGrid=x0)) # input: histogram histY <- hist(Y) f2 <- CreateDensity(histogram=histY,optns = list(outputGrid=x0)) # input: frequency table with unequally spaced (random) bins binY <- c(-3,sort(runif(9,-3,3)),3) freqY <- c() for (i in 1:(length(binY)-1)) { freqY[i] <- length(which(Y>binY[i] & Y<=binY[i+1])) } f3 <- CreateDensity(freq=freqY, bin=binY,optns = list(outputGrid=x0)) # plot plot(f1$x,f1$y,type='l',col=2,lty=2,lwd=2, xlim=c(-3,3),ylim=c(0,0.5),xlab='domain',ylab='density') points(f2$x,f2$y,type='l',col=3,lty=3,lwd=2) points(f3$x,f3$y,type='l',col=4,lty=4,lwd=2) points(x0,dnorm(x0),type='l',lwd=2) legend('topright', c('true','raw sample','histogram','frequency table (unequal bin)'), col=1:4,lty=1:4,lwd=3,bty='n')
Obtain Fréchet means of densities with respect to -Wasserstein distance.
DenFMean(yin = NULL, hin = NULL, qin = NULL, optns = list())
DenFMean(yin = NULL, hin = NULL, qin = NULL, optns = list())
yin |
A matrix or list holding the sample of measurements for the observed distributions. If |
hin |
A list holding the histograms of an observed distribution. |
qin |
A matrix or list holding the quantile functions of the response. If |
optns |
A list of options control parameters specified by |
Available control options are qSup
, nqSup
,
bwDen
, ndSup
, dSup
, delta
,
kernelDen
, infSupport
, and denLowerThreshold
.
See LocDenReg
for details.
A vector of weights to compute the weighted barycenter. The length of weights
is equal to the sample size. Default is equal weights.
A list containing the following components:
dout |
A numeric vector holding the density of the Fréchet mean. |
dSup |
A numeric vector giving the domain grid of |
qout |
A numeric vector holding the quantile function of the Fréchet mean. |
qSup |
A numeric vector giving the domain grid of |
optns |
A list of control options used. |
xin = seq(0,1,0.05) yin = lapply(xin, function(x) { rnorm(100, rnorm(1,x + x^2,0.005), 0.05) }) res <- DenFMean(yin=yin) plot(res)
xin = seq(0,1,0.05) yin = lapply(xin, function(x) { rnorm(100, rnorm(1,x + x^2,0.005), 0.05) }) res <- DenFMean(yin=yin) plot(res)
Distance computation between two covariance matrices
dist4cov(A = NULL, B = NULL, optns = list())
dist4cov(A = NULL, B = NULL, optns = list())
A |
an p by p matrix |
B |
an p by p matrix |
optns |
A list of options control parameters specified by |
Available control options are
Metric type choice, "frobenius"
, "power"
, "log_cholesky"
and "cholesky"
- default: "frobenius"
, which corresponds to the power metric with alpha
equal to 1.
The power parameter for the power metric, which can be any non-negative number. Default is 1 which corresponds to Frobenius metric.
A list containing the following fields:
dist |
the distance between covariance matrices |
optns |
A list containing the |
Petersen, A. and Müller, H.-G. (2016). Fréchet integration and adaptive metric selection for interpretable covariances of multivariate functional data. Biometrika, 103, 103–120.
Petersen, A. and Müller, H.-G. (2019). Fréchet regression for random objects with Euclidean predictors. The Annals of Statistics, 47(2), 691–719.
Petersen, A., Deoni, S. and Müller, H.-G. (2019). Fréchet estimation of time-varying covariance matrices from sparse data, with application to the regional co-evolution of myelination in the developing brain. The Annals of Applied Statistics, 13(1), 393–419.
# M input as array m <- 5 # dimension of covariance matrices M <- array(0,c(m,m,2)) for (i in 1:2) { y0 <- rnorm(m) aux <- diag(m) + y0 %*% t(y0) M[,,i] <- aux } A <- M[,,1] B <- M[,,2] frobDist <- dist4cov(A=A, B=B, optns=list(metric="frobenius"))
# M input as array m <- 5 # dimension of covariance matrices M <- array(0,c(m,m,2)) for (i in 1:2) { y0 <- rnorm(m) aux <- diag(m) + y0 %*% t(y0) M[,,i] <- aux } A <- M[,,1] B <- M[,,2] frobDist <- dist4cov(A=A, B=B, optns=list(metric="frobenius"))
Wasserstein distance between two distributions. Wasserstein distance between two distributions.
dist4den(d1 = NULL, d2 = NULL, fctn_type = NULL, optns = list())
dist4den(d1 = NULL, d2 = NULL, fctn_type = NULL, optns = list())
d1 , d2
|
Lists holding the density functions or quantile functions of the two distributions.
Each list consists of two numeric vectors |
fctn_type |
Character vector of length 1 holding the function type in |
optns |
A list of control parameters specified by |
Available control options are:
A scalar giving the length of the support grid of quantile functions based on which the Wasserstein distance (i.e., the
distance between the quantile functions) is computed. Default is 201.
A scalar holding the Wasserstein distance between
d1
and d2
.
d1 <- list(x = seq(-6,6,0.01)) d1$y <- dnorm(d1$x) d2 <- list(x = d1$x + 1) d2$y <- dnorm(d2$x, mean = 1) dist <- dist4den(d1 = d1,d2 = d2)
d1 <- list(x = seq(-6,6,0.01)) d1$y <- dnorm(d1$x) d2 <- list(x = d1$x + 1) d2$y <- dnorm(d2$x, mean = 1) dist <- dist4den(d1 = d1,d2 = d2)
Compute an exponential map for a unit hypersphere.
expSphere(base, tg)
expSphere(base, tg)
base |
A unit vector of length |
tg |
A vector of length |
A unit vector of length .
Generate a "natural" frame (orthonormal basis) for the tangent space at x
on the unit sphere.
frameSphere(x)
frameSphere(x)
x |
A unit vector of length |
The first elements of the
th basis vector are given by
,
,
,
,
,
, respectively.
The rest elements (if any) of the
th basis vector are all zero.
A -by-
matrix where columns hold the orthonormal basis of the tangent space at
x
on the unit sphere.
frameSphere(c(1,0,0,0))
frameSphere(c(1,0,0,0))
Provides implementation of statistical methods for random objects
lying in various metric spaces, which are not necessarily linear spaces.
The core of this package is Fréchet regression for random objects with
Euclidean predictors, which allows one to perform regression analysis
for non-Euclidean responses under some mild conditions.
Examples include distributions in -Wasserstein space,
covariance matrices endowed with power metric (with Frobenius metric as a special case), Cholesky and log-Cholesky metrics.
References: Petersen, A., & Müller, H.-G. (2019) <doi:10.1214/17-AOS1624>.
Global Fréchet regression of covariance matrices with Euclidean predictors.
GloCovReg(x, y = NULL, M = NULL, xout, optns = list())
GloCovReg(x, y = NULL, M = NULL, xout, optns = list())
x |
An n by p matrix of predictors. |
y |
An n by l matrix, each row corresponds to an observation, l is the length of time points where the responses are observed. See 'metric' option in 'Details' for more details. |
M |
A q by q by n array (resp. a list of q by q matrices) where |
xout |
An m by p matrix of output predictor levels. |
optns |
A list of options control parameters specified by |
Available control options are
Boolean indicating if output is shown as correlation or covariance matrix. Default is FALSE
and corresponds to a covariance matrix.
Metric type choice, "frobenius"
, "power"
, "log_cholesky"
, "cholesky"
- default: "frobenius"
which corresponds to the power metric with alpha
equal to 1.
For power (and Frobenius) metrics, either y
or M
must be input; y
would override M
. For Cholesky and log-Cholesky metrics, M
must be input and y
does not apply.
The power parameter for the power metric. Default is 1 which corresponds to Frobenius metric.
A covReg
object — a list containing the following fields:
xout |
An m by p matrix of output predictor levels. |
Mout |
A list of estimated conditional covariance or correlation matrices at |
optns |
A list containing the |
Petersen, A. and Müller, H.-G. (2019). Fréchet regression for random objects with Euclidean predictors. The Annals of Statistics, 47(2), 691–719.
Petersen, A., Deoni, S. and Müller, H.-G. (2019). Fréchet estimation of time-varying covariance matrices from sparse data, with application to the regional co-evolution of myelination in the developing brain. The Annals of Applied Statistics, 13(1), 393–419.
Lin, Z. (2019). Riemannian geometry of symmetric positive definite matrices via Cholesky decomposition. Siam. J. Matrix. Anal, A. 40, 1353–1370.
#Example y input n=50 # sample size t=seq(0,1,length.out=100) # length of data x = matrix(runif(n),n) theta1 = theta2 = array(0,n) for(i in 1:n){ theta1[i] = rnorm(1,x[i],x[i]^2) theta2[i] = rnorm(1,x[i]/2,(1-x[i])^2) } y = matrix(0,n,length(t)) phi1 = sqrt(3)*t phi2 = sqrt(6/5)*(1-t/2) y = theta1%*%t(phi1) + theta2 %*% t(phi2) xout = matrix(c(0.25,0.5,0.75),3) Cov_est=GloCovReg(x=x,y=y,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=3)) #Example M input n=10 #sample size m=5 # dimension of covariance matrices M <- array(0,c(m,m,n)) for (i in 1:n){ y0=rnorm(m) aux<-diag(m)+y0%*%t(y0) M[,,i]<-aux } x=cbind(matrix(rnorm(n),n),matrix(rnorm(n),n)) #vector of predictor values xout=cbind(runif(3),runif(3)) #output predictor levels Cov_est=GloCovReg(x=x,M=M,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=3))
#Example y input n=50 # sample size t=seq(0,1,length.out=100) # length of data x = matrix(runif(n),n) theta1 = theta2 = array(0,n) for(i in 1:n){ theta1[i] = rnorm(1,x[i],x[i]^2) theta2[i] = rnorm(1,x[i]/2,(1-x[i])^2) } y = matrix(0,n,length(t)) phi1 = sqrt(3)*t phi2 = sqrt(6/5)*(1-t/2) y = theta1%*%t(phi1) + theta2 %*% t(phi2) xout = matrix(c(0.25,0.5,0.75),3) Cov_est=GloCovReg(x=x,y=y,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=3)) #Example M input n=10 #sample size m=5 # dimension of covariance matrices M <- array(0,c(m,m,n)) for (i in 1:n){ y0=rnorm(m) aux<-diag(m)+y0%*%t(y0) M[,,i]<-aux } x=cbind(matrix(rnorm(n),n),matrix(rnorm(n),n)) #vector of predictor values xout=cbind(runif(3),runif(3)) #output predictor levels Cov_est=GloCovReg(x=x,M=M,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=3))
Global Fréchet regression for densities with respect to -Wasserstein distance.
GloDenReg( xin = NULL, yin = NULL, hin = NULL, qin = NULL, xout = NULL, optns = list() )
GloDenReg( xin = NULL, yin = NULL, hin = NULL, qin = NULL, xout = NULL, optns = list() )
xin |
An n by p matrix or a vector of length n (if p=1) with input measurements of the predictors. |
yin |
A matrix or list holding the sample of observations of the response. If |
hin |
A list holding the histograms of the response corresponding to each row in |
qin |
A matrix or list holding the quantile functions of the response. If |
xout |
A k by p matrix or a vector of length k (if p=1) with output measurements of the predictors. Default is |
optns |
A list of control parameters specified by |
Available control options are qSup
, nqSup
,
lower
, upper
, Rsquared
, bwDen
, ndSup
, dSup
,
delta
, kernelDen
, infSupport
, and denLowerThreshold
.
Rsquared
is explained as follows and see LocDenReg
for the other options.
A logical variable indicating whether R squared would be returned. Default is FALSE
.
A list containing the following components:
xout |
Input |
dout |
A matrix or list holding the output densities corresponding to |
dSup |
A numeric vector giving the domain grid of |
qout |
A matrix holding the quantile functions of the output densities. Each row corresponds to a value in |
qSup |
A numeric vector giving the domain grid of |
xin |
Input |
din |
Densities corresponding to the input |
qin |
Quantile functions corresponding to the input |
Rsq |
A scalar giving the R squared value if |
optns |
A list of control options used. |
Petersen, A., & Müller, H.-G. (2019). "Fréchet regression for random objects with Euclidean predictors." The Annals of Statistics, 47(2), 691–719.
xin = seq(0,1,0.05) yin = lapply(xin, function(x) { rnorm(100, rnorm(1,x,0.005), 0.05) }) qSup = seq(0,1,0.02) xout = seq(0,1,0.25) res1 <- GloDenReg(xin=xin, yin=yin, xout=xout, optns = list(qSup = qSup)) plot(res1) hin = lapply(yin, function(y) hist(y, breaks = 50, plot=FALSE)) res2 <- GloDenReg(xin=xin, hin=hin, xout=xout, optns = list(qSup = qSup)) plot(res2)
xin = seq(0,1,0.05) yin = lapply(xin, function(x) { rnorm(100, rnorm(1,x,0.005), 0.05) }) qSup = seq(0,1,0.02) xout = seq(0,1,0.25) res1 <- GloDenReg(xin=xin, yin=yin, xout=xout, optns = list(qSup = qSup)) plot(res1) hin = lapply(yin, function(y) hist(y, breaks = 50, plot=FALSE)) res2 <- GloDenReg(xin=xin, hin=hin, xout=xout, optns = list(qSup = qSup)) plot(res2)
Global Fréchet regression for spherical data with respect to the geodesic distance.
GloSpheReg(xin = NULL, yin = NULL, xout = NULL)
GloSpheReg(xin = NULL, yin = NULL, xout = NULL)
xin |
A vector of length |
yin |
An |
xout |
A vector of length |
A list containing the following components:
xout |
Input |
yout |
A |
xin |
Input |
yin |
Input |
Petersen, A., & Müller, H.-G. (2019). "Fréchet regression for random objects with Euclidean predictors." The Annals of Statistics, 47(2), 691–719.
n <- 101 xin <- seq(-1,1,length.out = n) theta_true <- rep(pi/2,n) phi_true <- (xin + 1) * pi / 4 ytrue <- apply( cbind( 1, phi_true, theta_true ), 1, pol2car ) yin <- t( ytrue ) xout <- xin res <- GloSpheReg(xin=xin, yin=yin, xout=xout)
n <- 101 xin <- seq(-1,1,length.out = n) theta_true <- rep(pi/2,n) phi_true <- (xin + 1) * pi / 4 ytrue <- apply( cbind( 1, phi_true, theta_true ), 1, pol2car ) yin <- t( ytrue ) xout <- xin res <- GloSpheReg(xin=xin, yin=yin, xout=xout)
Local Fréchet regression of covariance matrices with Euclidean predictors.
LocCovReg(x, y = NULL, M = NULL, xout, optns = list())
LocCovReg(x, y = NULL, M = NULL, xout, optns = list())
x |
An n by p matrix of predictors. |
y |
An n by l matrix, each row corresponds to an observation, l is the length of time points where the responses are observed. See 'metric' option in 'Details' for more details. |
M |
A q by q by n array (resp. a list of q by q matrices) where |
xout |
An m by p matrix of output predictor levels. |
optns |
A list of options control parameters specified by |
Available control options are
Boolean indicating if output is shown as correlation or covariance matrix. Default is FALSE
and corresponds to a covariance matrix.
Metric type choice, "frobenius"
, "power"
, "log_cholesky"
, "cholesky"
- default: "frobenius"
which corresponds to the power metric with alpha
equal to 1.
For power (and Frobenius) metrics, either y
or M
must be input; y
would override M
. For Cholesky and log-Cholesky metrics, M
must be input and y
does not apply.
The power parameter for the power metric. Default is 1 which corresponds to Frobenius metric.
A vector of length p holding the bandwidths for conditional mean estimation if y
is provided. If bwMean
is not provided, it is chosen by cross validation.
A vector of length p holding the bandwidths for conditional covariance estimation. If bwCov
is not provided, it is chosen by cross validation.
Name of the kernel function to be chosen from "rect"
, "gauss"
, "epan"
, "gausvar"
, "quar"
. Default is "gauss"
.
A covReg
object — a list containing the following fields:
xout |
An m by p matrix of output predictor levels. |
Mout |
A list of estimated conditional covariance or correlation matrices at |
optns |
A list containing the |
Petersen, A. and Müller, H.-G. (2019). Fréchet regression for random objects with Euclidean predictors. The Annals of Statistics, 47(2), 691–719.
Petersen, A., Deoni, S. and Müller, H.-G. (2019). Fréchet estimation of time-varying covariance matrices from sparse data, with application to the regional co-evolution of myelination in the developing brain. The Annals of Applied Statistics, 13(1), 393–419.
Lin, Z. (2019). Riemannian geometry of symmetric positive definite matrices via Cholesky decomposition. Siam. J. Matrix. Anal, A. 40, 1353–1370.
#Example y input n=30 # sample size t=seq(0,1,length.out=100) # length of data x = matrix(runif(n),n) theta1 = theta2 = array(0,n) for(i in 1:n){ theta1[i] = rnorm(1,x[i],x[i]^2) theta2[i] = rnorm(1,x[i]/2,(1-x[i])^2) } y = matrix(0,n,length(t)) phi1 = sqrt(3)*t phi2 = sqrt(6/5)*(1-t/2) y = theta1%*%t(phi1) + theta2 %*% t(phi2) xout = matrix(c(0.25,0.5,0.75),3) Cov_est=LocCovReg(x=x,y=y,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=3)) #Example M input n=30 #sample size m=30 #dimension of covariance matrices M <- array(0,c(m,m,n)) for (i in 1:n){ y0=rnorm(m) aux<-15*diag(m)+y0%*%t(y0) M[,,i]<-aux } x=matrix(rnorm(n),n) xout = matrix(c(0.25,0.5,0.75),3) #output predictor levels Cov_est=LocCovReg(x=x,M=M,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=0))
#Example y input n=30 # sample size t=seq(0,1,length.out=100) # length of data x = matrix(runif(n),n) theta1 = theta2 = array(0,n) for(i in 1:n){ theta1[i] = rnorm(1,x[i],x[i]^2) theta2[i] = rnorm(1,x[i]/2,(1-x[i])^2) } y = matrix(0,n,length(t)) phi1 = sqrt(3)*t phi2 = sqrt(6/5)*(1-t/2) y = theta1%*%t(phi1) + theta2 %*% t(phi2) xout = matrix(c(0.25,0.5,0.75),3) Cov_est=LocCovReg(x=x,y=y,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=3)) #Example M input n=30 #sample size m=30 #dimension of covariance matrices M <- array(0,c(m,m,n)) for (i in 1:n){ y0=rnorm(m) aux<-15*diag(m)+y0%*%t(y0) M[,,i]<-aux } x=matrix(rnorm(n),n) xout = matrix(c(0.25,0.5,0.75),3) #output predictor levels Cov_est=LocCovReg(x=x,M=M,xout=xout,optns=list(corrOut=FALSE,metric="power",alpha=0))
Local Fréchet regression for densities with respect to -Wasserstein distance.
LocDenReg( xin = NULL, yin = NULL, hin = NULL, qin = NULL, xout = NULL, optns = list() )
LocDenReg( xin = NULL, yin = NULL, hin = NULL, qin = NULL, xout = NULL, optns = list() )
xin |
An n by p matrix or a vector of length n if p=1 holding the n observations of the predictor. |
yin |
A matrix or list holding the sample of observations of the response. If |
hin |
A list holding the histograms of the response corresponding to each predictor value in the corresponding row of |
qin |
A matrix or list holding the quantile functions of the response. If |
xout |
An m by p matrix or a vector of length m if p=1 holding the m output predictor values. Default is |
optns |
A list of control parameters specified by |
Available control options are
A vector of length p used as the bandwidth for the Fréchet regression or "CV"
(default), i.e., a data-adaptive selection done by cross-validation.
A character holding the type of kernel functions for local Fréchet regression for densities; "rect"
, "gauss"
, "epan"
, "gausvar"
, "quar"
- default: "gauss"
.
A numeric vector holding the grid on [0,1] quantile functions take value on. Default is an equidistant grid.
A scalar giving the length of qSup
. Default is 201.
A scalar with the lower bound of the support of the distribution. Default is NULL
.
A scalar with the upper bound of the support of the distribution. Default is NULL
.
A 2 by p matrix whose columns contain the bandwidth selection range for each corresponding dimension of the predictor xin
for the case when bwReg
equals "CV"
. Default is NULL
and is automatically chosen by a data-adaptive method.
The bandwidth value used in CreateDensity()
for density estimation; positive numeric - default: determine automatically based on the data-driven bandwidth selector proposed by Sheather and Jones (1991).
The number of support points the kernel density estimation uses in CreateDensity()
; numeric - default: 101.
User defined output grid for the support of kernel density estimation used in CreateDensity()
, it overrides nRegGrid
; numeric - default: NULL
The size of the bin to be used used in CreateDensity()
; numeric - default: diff(range(y))/1000
. It only works when the raw sample is available.
A character holding the type of kernel functions used in CreateDensity()
for density estimation; "rect"
, "gauss"
, "epan"
, "gausvar"
, "quar"
- default: "gauss"
.
logical if we expect the distribution to have infinite support or not, used in CreateDensity()
for density estimation; logical - default: FALSE
FALSE
or a positive value giving the lower threshold of the densities used in CreateDensity()
; default: 0.001 * mean(qin[,ncol(qin)] - qin[,1])
.
A list containing the following components:
xout |
Input |
dout |
A matrix or list holding the output densities corresponding to |
dSup |
A numeric vector giving the domain grid of |
qout |
A matrix holding the quantile functions of the output densities. Each row corresponds to a value in |
qSup |
A numeric vector giving the domain grid of |
xin |
Input |
din |
Densities corresponding to the input |
qin |
Quantile functions corresponding to the input |
optns |
A list of control options used. |
Petersen, A., & Müller, H.-G. (2019). "Fréchet regression for random objects with Euclidean predictors." The Annals of Statistics, 47(2), 691–719.
xin = seq(0,1,0.05) yin = lapply(xin, function(x) { rnorm(100, rnorm(1,x + x^2,0.005), 0.05) }) qSup = seq(0,1,0.02) xout = seq(0,1,0.1) res1 <- LocDenReg(xin=xin, yin=yin, xout=xout, optns = list(bwReg = 0.12, qSup = qSup)) plot(res1) xout <- xin hin = lapply(yin, function(y) hist(y, breaks = 50)) res2 <- LocDenReg(xin=xin, hin=hin, xout=xout, optns = list(qSup = qSup)) plot(res2)
xin = seq(0,1,0.05) yin = lapply(xin, function(x) { rnorm(100, rnorm(1,x + x^2,0.005), 0.05) }) qSup = seq(0,1,0.02) xout = seq(0,1,0.1) res1 <- LocDenReg(xin=xin, yin=yin, xout=xout, optns = list(bwReg = 0.12, qSup = qSup)) plot(res1) xout <- xin hin = lapply(yin, function(y) hist(y, breaks = 50)) res2 <- LocDenReg(xin=xin, hin=hin, xout=xout, optns = list(qSup = qSup)) plot(res2)
Local Fréchet regression for spherical data with respect to the geodesic distance.
LocSpheReg(xin = NULL, yin = NULL, xout = NULL, optns = list())
LocSpheReg(xin = NULL, yin = NULL, xout = NULL, optns = list())
xin |
A vector of length n with input measurement points. |
yin |
An n by m matrix holding the spherical data, of which the sum of squares of elements within each row is 1. |
xout |
A vector of length k with output measurement points; Default: |
optns |
A list of options control parameters specified by |
Available control options are
A scalar used as the bandwidth or "CV"
(default).
A character holding the type of kernel functions for local Fréchet regression for densities; "rect"
, "gauss"
, "epan"
, "gausvar"
, "quar"
- default: "gauss"
.
A list containing the following components:
xout |
Input |
yout |
A k by m matrix holding the fitted responses, of which each row is a spherical vector, corresponding to each element in |
xin |
Input |
yin |
Input |
optns |
A list of control options used. |
Petersen, A., & Müller, H.-G. (2019). "Fréchet regression for random objects with Euclidean predictors." The Annals of Statistics, 47(2), 691–719.
set.seed(1) n <- 200 # simulate the data according to the simulation in Petersen & Müller (2019) xin <- runif(n) err_sd <- 0.2 xout <- seq(0,1,length.out = 51) phi_true <- acos(xin) theta_true <- pi * xin ytrue <- cbind( sin(phi_true) * cos(theta_true), sin(phi_true) * sin(theta_true), cos(phi_true) ) basis <- list( b1 = cbind( cos(phi_true) * cos(theta_true), cos(phi_true) * sin(theta_true), -sin(phi_true) ), b2 = cbind( sin(theta_true), -cos(theta_true), 0 ) ) yin_tg <- basis$b1 * rnorm(n, mean = 0, sd = err_sd) + basis$b2 * rnorm(n, mean = 0, sd = err_sd) yin <- t(sapply(seq_len(n), function(i) { tgNorm <- sqrt(sum(yin_tg[i,]^2)) if (tgNorm < 1e-10) { return(ytrue[i,]) } else { return(sin(tgNorm) * yin_tg[i,] / tgNorm + cos(tgNorm) * ytrue[i,]) } })) res <- LocSpheReg(xin=xin, yin=yin, xout=xout, optns = list(bw = 0.15, kernel = "epan"))
set.seed(1) n <- 200 # simulate the data according to the simulation in Petersen & Müller (2019) xin <- runif(n) err_sd <- 0.2 xout <- seq(0,1,length.out = 51) phi_true <- acos(xin) theta_true <- pi * xin ytrue <- cbind( sin(phi_true) * cos(theta_true), sin(phi_true) * sin(theta_true), cos(phi_true) ) basis <- list( b1 = cbind( cos(phi_true) * cos(theta_true), cos(phi_true) * sin(theta_true), -sin(phi_true) ), b2 = cbind( sin(theta_true), -cos(theta_true), 0 ) ) yin_tg <- basis$b1 * rnorm(n, mean = 0, sd = err_sd) + basis$b2 * rnorm(n, mean = 0, sd = err_sd) yin <- t(sapply(seq_len(n), function(i) { tgNorm <- sqrt(sum(yin_tg[i,]^2)) if (tgNorm < 1e-10) { return(ytrue[i,]) } else { return(sin(tgNorm) * yin_tg[i,] / tgNorm + cos(tgNorm) * ytrue[i,]) } })) res <- LocSpheReg(xin=xin, yin=yin, xout=xout, optns = list(bw = 0.15, kernel = "epan"))
Compute a log map for a unit hypersphere.
logSphere(base, x)
logSphere(base, x)
base |
A unit vector of length |
x |
A unit vector of length |
A tangent vector of length .
Plots for Fréchet regression for univariate densities.
## S3 method for class 'denReg' plot( x, obj = NULL, prob = NULL, xlab = NULL, ylab = NULL, main = NULL, ylim = NULL, xlim = NULL, col.bar = TRUE, widrt = 4, col.lab = NULL, nticks = 5, ticks = NULL, add = FALSE, pos.prob = 0.9, colPalette = NULL, ... )
## S3 method for class 'denReg' plot( x, obj = NULL, prob = NULL, xlab = NULL, ylab = NULL, main = NULL, ylim = NULL, xlim = NULL, col.bar = TRUE, widrt = 4, col.lab = NULL, nticks = 5, ticks = NULL, add = FALSE, pos.prob = 0.9, colPalette = NULL, ... )
x |
A |
obj |
An integer indicating which output to be plotted; 1, 2, 3, 4, and 5 for |
prob |
A vector specifying the probability levels for reference chart if |
xlab |
Character holding the label for x-axis; default: |
ylab |
Character holding the label for y-axis; default: |
main |
Character holding the plot title; default: |
ylim |
A numeric vector of length 2 holding the range of the y-axis to be drawn; default: automatically determined by the input |
xlim |
A numeric vector of length 2 holding the range of the x-axis to be drawn; default: automatically determined by the input |
col.bar |
A logical variable indicating whether a color bar is presented on the right of the plot - default: |
widrt |
A scalar giving the width ratio between the main plot and the color bar - default: 4. |
col.lab |
A character giving the color bar label. |
nticks |
An integer giving the number of ticks used in the axis of color bar. |
ticks |
A numeric vector giving the locations of ticks used in the axis of color bar; it overrides |
add |
Logical; only works when |
pos.prob |
|
colPalette |
A function that takes an integer argument (the required number of colors) and returns a character vector of colors interpolating the given sequence
(e.g., |
... |
Can set up |
No return value.
see DenFMean
, GloDenReg
and LocDenReg
for example code.
Transform polar to Cartesian coordinates
pol2car(p)
pol2car(p)
p |
A vector of length |
A vector of length holding the corresponding Cartesian coordinates
where is given by
p[1]
and is given by
p[i+1]
for .
pol2car(c(1, 0, pi/4)) # should equal c(1,0,1)/sqrt(2) pol2car(c(1, pi, 0)) # should equal c(-1,0,0)
pol2car(c(1, 0, pi/4)) # should equal c(1,0,1)/sqrt(2) pol2car(c(1, pi, 0)) # should equal c(-1,0,0)
Geodesic distance on spheres.
SpheGeoDist(y1, y2)
SpheGeoDist(y1, y2)
y1 , y2
|
Two unit vectors, i.e., with |
A scalar holding the geodesic distance between y1
and y2
.
d <- 3 y1 <- rnorm(d) y1 <- y1 / sqrt(sum(y1^2)) y2 <- rnorm(d) y2 <- y2 / sqrt(sum(y2^2)) dist <- SpheGeoDist(y1,y2)
d <- 3 y1 <- rnorm(d) y1 <- y1 / sqrt(sum(y1^2)) y2 <- rnorm(d) y2 <- y2 / sqrt(sum(y2^2)) dist <- SpheGeoDist(y1,y2)
on a unit hypersphereCompute gradient w.r.t. y of the geodesic distance on a unit hypersphere
SpheGeoGrad(x, y)
SpheGeoGrad(x, y)
x , y
|
Two unit vectors. |
A vector holding radient w.r.t. y
of the geodesic distance between x
and y
.
of the geodesic distance
on a unit hypersphereHessian of the geodesic distance
on a unit hypersphere
SpheGeoHess(x, y)
SpheGeoHess(x, y)
x , y
|
Two unit vectors. |
A Hessian matrix.