| Title: | Riemannian Metrics for Symmetric Positive Definite Matrices |
|---|---|
| Description: | Implements various Riemannian metrics for symmetric positive definite matrices, including AIRM (Affine Invariant Riemannian Metric, <doi:10.1007/s11263-005-3222-z>), Log-Euclidean (<doi:10.1002/mrm.20965>), Euclidean, Log-Cholesky (<doi:10.1137/18M1221084>), and Bures-Wasserstein metrics (<doi:10.1016/j.exmath.2018.01.002>). Provides functions for computing logarithmic and exponential maps, vectorization, and statistical operations on the manifold of positive definite matrices. |
| Authors: | Nicolas Escobar [aut, cre] (ORCID: <https://orcid.org/0009-0006-0800-5692>), Jaroslaw Harezlak [ths] |
| Maintainer: | Nicolas Escobar <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.5 |
| Built: | 2026-06-09 06:09:31 UTC |
| Source: | https://github.com/nicoesve/riemtan |
This function computes the Riemannian exponential map for the Affine-Invariant Riemannian Metric (AIRM).
airm_exp(sigma, v, validate = FALSE)airm_exp(sigma, v, validate = FALSE)
sigma |
A symmetric positive-definite matrix of class |
v |
A tangent vector of class |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
A symmetric positive-definite matrix of class dppMatrix.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() v <- diag(c(1, 0.5)) |> Matrix::symmpart() |> Matrix::pack() airm_exp(sigma, v) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() v <- diag(c(1, 0.5)) |> Matrix::symmpart() |> Matrix::pack() airm_exp(sigma, v) }
This function computes the Riemannian logarithmic map for the Affine-Invariant Riemannian Metric (AIRM).
airm_log(sigma, lambda, validate = FALSE)airm_log(sigma, lambda, validate = FALSE)
sigma |
A symmetric positive-definite matrix of class |
lambda |
A symmetric positive-definite matrix of class |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
A symmetric matrix of class dspMatrix, representing the tangent space image of lambda at sigma.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() lambda <- diag(c(2, 3)) |> Matrix::nearPD() |> _$mat |> Matrix::pack() airm_log(sigma, lambda) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() lambda <- diag(c(2, 3)) |> Matrix::nearPD() |> _$mat |> Matrix::pack() airm_log(sigma, lambda) }
Converts a vector back into a tangent matrix relative to a reference point using AIRM.
airm_unvec(sigma, w)airm_unvec(sigma, w)
sigma |
A symmetric positive-definite matrix of class |
w |
A numeric vector, representing the vectorized tangent image. |
A symmetric matrix of class dspMatrix, representing the tangent vector.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() w <- c(1, sqrt(2), 2) airm_unvec(sigma, w) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() w <- c(1, sqrt(2), 2) airm_unvec(sigma, w) }
Vectorizes a tangent matrix into a vector in Euclidean space using AIRM.
airm_vec(sigma, v)airm_vec(sigma, v)
sigma |
A symmetric positive-definite matrix of class |
v |
A symmetric matrix of class |
A numeric vector, representing the vectorized tangent image.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() v <- diag(c(1, 0.5)) |> Matrix::symmpart() |> Matrix::pack() airm_vec(sigma, v) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) sigma <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() v <- diag(c(1, 0.5)) |> Matrix::symmpart() |> Matrix::pack() airm_vec(sigma, v) }
This function computes the Riemannian exponential map using the Bures-Wasserstein metric for symmetric positive-definite matrices. The map operates by solving a Lyapunov equation and then constructing the exponential.
bures_wasserstein_exp(sigma, v, validate = FALSE)bures_wasserstein_exp(sigma, v, validate = FALSE)
sigma |
A symmetric positive-definite matrix of class |
v |
A symmetric matrix of class |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
A symmetric positive-definite matrix of class dppMatrix, representing the point on the manifold.
This function computes the Riemannian logarithmic map using the Bures-Wasserstein metric for symmetric positive-definite matrices.
bures_wasserstein_log(sigma, lambda, validate = FALSE)bures_wasserstein_log(sigma, lambda, validate = FALSE)
sigma |
A symmetric positive-definite matrix of class |
lambda |
A symmetric positive-definite matrix of class |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
A symmetric matrix of class dspMatrix, representing the tangent space image of lambda at sigma.
Compute the Bures-Wasserstein Inverse Vectorization
bures_wasserstein_unvec(sigma, w)bures_wasserstein_unvec(sigma, w)
sigma |
A symmetric positive-definite matrix of class |
w |
A numeric vector representing the vectorized tangent image |
A symmetric matrix of class dspMatrix
Compute the Bures-Wasserstein Vectorization
bures_wasserstein_vec(sigma, v)bures_wasserstein_vec(sigma, v)
sigma |
A symmetric positive-definite matrix of class |
v |
A symmetric matrix of class |
A numeric vector representing the vectorized tangent image
This function computes the Frechet mean of a sample using an iterative algorithm with optional parallel processing.
compute_frechet_mean( sample, tol = 0.05, max_iter = 20, lr = 0.2, batch_size = 32, progress = FALSE )compute_frechet_mean( sample, tol = 0.05, max_iter = 20, lr = 0.2, batch_size = 32, progress = FALSE )
sample |
An object of class |
tol |
A numeric value specifying the tolerance for convergence. Default is 0.05. |
max_iter |
An integer specifying the maximum number of iterations. Default is 20. |
lr |
A numeric value specifying the learning rate. Default is 0.2. |
batch_size |
Integer. The number of samples to process in each batch during computation. Default is 32. |
progress |
Logical indicating whether to show progress during computation (default: FALSE). Requires progressr package. |
The function iteratively updates the reference point of the sample until the change in the reference point is less than the specified tolerance or the maximum number of iterations is reached. If the tangent images are not already computed, they will be computed before starting the iterations.
When parallel processing is enabled (via set_parallel_plan()), the relocate() function will use parallel
processing for relocating tangent images in each iteration, which can significantly speed up computation
for large samples.
The computed Frechet mean as a dppMatrix object.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) # Load the AIRM metric object data(airm) # Create a CSample object with example data conns <- list( diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack(), diag(c(2, 3)) |> Matrix::nearPD() |> _$mat |> Matrix::pack() ) sample <- CSample$new(conns = conns, metric_obj = airm) # Compute the Frechet mean compute_frechet_mean(sample, tol = 0.01, max_iter = 50, lr = 0.1) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) # Load the AIRM metric object data(airm) # Create a CSample object with example data conns <- list( diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack(), diag(c(2, 3)) |> Matrix::nearPD() |> _$mat |> Matrix::pack() ) sample <- CSample$new(conns = conns, metric_obj = airm) # Compute the Frechet mean compute_frechet_mean(sample, tol = 0.01, max_iter = 50, lr = 0.1) }
Convenience function to configure progressr handlers for the current session.
configure_progress(handler = "txtprogressbar", ...)configure_progress(handler = "txtprogressbar", ...)
handler |
Character string specifying the handler type:
|
... |
Additional arguments passed to |
This is a convenience wrapper around progressr::handlers().
If progressr is not available, a warning is issued and the function returns NULL.
Invisibly returns the configured handlers (via progressr::handlers()).
## Not run: # Use base R text progress bar configure_progress("txtprogressbar") # Use cli style (if cli package is installed) configure_progress("cli") # Disable progress reporting configure_progress("none") ## End(Not run)## Not run: # Use base R text progress bar configure_progress("txtprogressbar") # Use cli style (if cli package is installed) configure_progress("cli") # Disable progress reporting configure_progress("none") ## End(Not run)
Convenience function to create a ParquetBackend from a validated directory.
create_parquet_backend(data_dir, cache_size = 10, validate = TRUE)create_parquet_backend(data_dir, cache_size = 10, validate = TRUE)
data_dir |
Path to directory containing Parquet files and metadata |
cache_size |
Maximum number of matrices to cache in memory (default: 10) |
validate |
If TRUE, validates directory before creating backend (default: TRUE) |
A ParquetBackend object
## Not run: backend <- create_parquet_backend("my_connectomes") sample <- CSample$new(backend = backend, metric_obj = airm()) ## End(Not run)## Not run: backend <- create_parquet_backend("my_connectomes") sample <- CSample$new(backend = backend, metric_obj = airm()) ## End(Not run)
This class represents a sample of connectomes, with various properties and methods to handle their tangent and vectorized images.
connectomesConnectomes data
tangent_imagesTangent images data
vector_imagesVector images data
sample_sizeSample size
matrix_sizeMatrix size
mfd_dimManifold dimension
is_centeredCentering status
frechet_meanFrechet mean
riem_metricRiemannian Metric used
variationVariation of the sample
sample_covSample covariance
ref_pointReference point for tangent or vectorized images
distancesDistances to the Frechet mean
batch_sizeBatch size for compute_fmean
new()
Initialize a CSample object
CSample$new( conns = NULL, tan_imgs = NULL, vec_imgs = NULL, centered = NULL, ref_pt = NULL, metric_obj, batch_size = NULL, backend = NULL )
connsA list of connectomes (default is NULL).
tan_imgsA list of tangent images (default is NULL).
vec_imgsA matrix whose rows are vectorized images (default is NULL).
centeredBoolean indicating whether tangent or vectorized images are centered (default is NULL).
ref_ptA connectome (default is identity)
metric_objObject of class rmetric representing the Riemannian metric used.
batch_sizeThe batch size for compute_fmean (default is 32).
backendA DataBackend object (ListBackend or ParquetBackend). If NULL and conns is provided, a ListBackend will be created automatically.
A new CSample object.
load_connectomes_batched()
Load connectomes in parallel batches from ParquetBackend. This method is particularly useful for large Parquet-backed datasets where loading all matrices at once would be memory-prohibitive.
CSample$load_connectomes_batched( indices = NULL, batch_size = 50, progress = FALSE )
indicesOptional vector of indices to load. If NULL (default), loads all matrices.
batch_sizeNumber of matrices to load per batch (default: 50). Larger batches use more memory but may be faster.
progressLogical indicating whether to show progress (default: FALSE).
This method only works when the CSample was initialized with a ParquetBackend. It loads matrices in parallel batches, clearing the cache between batches to manage memory usage. Sequential loading is used for ListBackend.
A list of dppMatrix objects.
\dontrun{
# Create CSample with ParquetBackend
backend <- create_parquet_backend("my_data")
sample <- CSample$new(backend = backend, metric_obj = airm)
# Load first 100 matrices in batches of 20
conns <- sample$load_connectomes_batched(indices = 1:100, batch_size = 20)
}
compute_tangents()
This function computes the tangent images from the connectomes.
CSample$compute_tangents(ref_pt = default_ref_pt(private$p), progress = FALSE)
ref_ptA reference point, which must be a dppMatrix object (default is default_ref_pt).
progressLogical indicating whether to show progress (default: FALSE).
Error if ref_pt is not a dppMatrix object or if conns is not specified.
None
compute_conns()
This function computes the connectomes from the tangent images.
CSample$compute_conns(progress = FALSE)
progressLogical indicating whether to show progress (default: FALSE).
Error if tangent images are not specified.
None
compute_vecs()
This function computes the vectorized tangent images from the tangent images.
CSample$compute_vecs(progress = FALSE)
progressLogical indicating whether to show progress (default: FALSE).
Error if tangent images are not specified.
None
compute_unvecs()
This function computes the tangent images from the vector images.
CSample$compute_unvecs(progress = FALSE)
progressLogical indicating whether to show progress (default: FALSE).
Error if vec_imgs is not specified.
None
compute_fmean()
This function computes the Frechet mean of the sample.
CSample$compute_fmean( tol = 0.001, max_iter = 100, lr = 0.2, batch_size = NULL, progress = FALSE )
tolTolerance for the convergence of the mean (default is 0.05).
max_iterMaximum number of iterations for the computation (default is 20).
lrLearning rate for the optimization algorithm (default is 0.2).
batch_sizeThe batch size (default is the instance's batch_size).
progressLogical indicating whether to show progress (default: FALSE).
None
change_ref_pt()
This function changes the reference point for the tangent images.
CSample$change_ref_pt(new_ref_pt, progress = FALSE)
new_ref_ptA new reference point, which must be a dppMatrix object.
progressLogical indicating whether to show progress (default: FALSE).
Error if tangent images have not been computed or if new_ref_pt is not a dppMatrix object.
None
center()
Center the sample
CSample$center()
This function centers the sample by computing the Frechet mean if it is not already computed, and then changing the reference point to the computed Frechet mean. Error if tangent images are not specified. Error if the sample is already centered.
None. This function is called for its side effects.
compute_variation()
Compute Variation
CSample$compute_variation()
This function computes the variation of the sample. It first checks if the vector images are null, and if so, it computes the vectors, computing first the tangent images if necessary. If the sample is not centered, it centers the sample and recomputes the vectors. Finally, it calculates the variation as the mean of the sum of squares of the vector images. Error if vec_imgs is not specified.
None. This function is called for its side effects.
compute_dists()
Compute distances
CSample$compute_dists()
This function computes the distances of the elements of the sample to the Frechet mean. It first checks if the vector images are null, and if so, it computes the vectors, computing first the tangent images if necessary. If the sample is not centered, it centers the sample and recomputes the vectors. Finally, it calculates the distances as the Euclidean norms of the vector images. Error if vec_imgs is not specified.
None. This function is called for its side effects.
compute_sample_cov()
Compute Sample Covariance
CSample$compute_sample_cov()
This function computes the sample covariance matrix for the vector images. It first checks if the vector images are null, and if so, it computes the vectors, computing first the tangent images if necessary.
None. This function is called for its side effects.
clone()
The objects of this class are cloneable with this method.
CSample$clone(deep = FALSE)
deepWhether to make a deep clone.
## ------------------------------------------------ ## Method `CSample$load_connectomes_batched` ## ------------------------------------------------ ## Not run: # Create CSample with ParquetBackend backend <- create_parquet_backend("my_data") sample <- CSample$new(backend = backend, metric_obj = airm) # Load first 100 matrices in batches of 20 conns <- sample$load_connectomes_batched(indices = 1:100, batch_size = 20) ## End(Not run)## ------------------------------------------------ ## Method `CSample$load_connectomes_batched` ## ------------------------------------------------ ## Not run: # Create CSample with ParquetBackend backend <- create_parquet_backend("my_data") sample <- CSample$new(backend = backend, metric_obj = airm) # Load first 100 matrices in batches of 20 conns <- sample$load_connectomes_batched(indices = 1:100, batch_size = 20) ## End(Not run)
This class represents a collection of CSample objects (samples of connectomes), providing methods to aggregate, analyze, and compute statistics across multiple samples sharing the same Riemannian metric.
list_of_samplesThe list of CSample objects aggregated in this super-sample.
sample_sizeThe total number of connectomes in all samples.
matrix_sizeThe size of the connectome matrices.
mfd_dimThe dimension of the manifold.
riem_metricThe Riemannian metric used by all samples.
variationThe total variation of the aggregated sample.
sample_covThe sample covariance matrix of the aggregated sample.
full_sampleThe aggregated CSample object containing all connectomes.
frechet_meanThe Frechet mean of the aggregated sample.
WithinThe within-group covariance matrix (W).
TotalThe total covariance matrix (T).
new()
Initialize a CSuperSample object
CSuperSample$new(samples)
samplesA list of CSample objects. All must use the same Riemannian metric.
A new CSuperSample object.
compute_variation()
Compute the total variation of the aggregated sample.
CSuperSample$compute_variation()
None. This function is called for its side effects. The result is stored in the variation active binding.
compute_sample_cov()
Compute the sample covariance matrix of the aggregated sample.
CSuperSample$compute_sample_cov()
None. This function is called for its side effects. The result is stored in the sample_cov active binding.
gather()
Gather all connectomes from the list of samples into a single CSample object.
CSuperSample$gather()
None. This function is called for its side effects. The result is stored in the full_sample active binding.
compute_fmean()
Compute the Frechet mean of the aggregated sample.
CSuperSample$compute_fmean(batch_size = NULL)
batch_sizeOptional batch size parameter passed to the underlying compute_fmean function.
None. This function is called for its side effects. The result is stored in the frechet_mean active binding.
compute_W()
Compute the within-group covariance matrix (W) for the samples.
CSuperSample$compute_W()
None. This function is called for its side effects. The result is stored in the Within active binding.
compute_T()
Compute the total covariance matrix (T) for the samples.
CSuperSample$compute_T()
None. This function is called for its side effects. The result is stored in the Total active binding.
clone()
The objects of this class are cloneable with this method.
CSuperSample$clone(deep = FALSE)
deepWhether to make a deep clone.
Default reference point
default_ref_pt(p)default_ref_pt(p)
p |
the dimension |
A diagonal matrix of the desired dimension
Computes the differential of the matrix exponential map located at a point a, evaluated at x
dexp(a, x)dexp(a, x)
a |
A symmetric matrix of class dspMatrix |
x |
A symmetric matrix representing tangent vector of class dspMatrix |
A positive definite symmetric matrix representing the differential located at a and evaluated at x, of class dppMatrix
Computes the differential of the matrix logarithm map at a point Sigma, evaluated at H
dlog(sigma, h)dlog(sigma, h)
sigma |
A symmetric positive definite matrix of class dspMatrix |
h |
A symmetric matrix representing tangent vector of class dsyMatrix |
A symmetric matrix representing the differential evaluated at H of class dsyMatrix
Compute the Euclidean Exponential
euclidean_exp(sigma, v, validate = FALSE)euclidean_exp(sigma, v, validate = FALSE)
sigma |
A reference point. |
v |
A tangent vector to be mapped back to the manifold at |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
The point on the manifold corresponding to the tangent vector at sigma.
Compute the Euclidean Logarithm
euclidean_log(sigma, lambda, validate = FALSE)euclidean_log(sigma, lambda, validate = FALSE)
sigma |
A reference point. |
lambda |
A point on the manifold. |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
The tangent space image of lambda at sigma.
Converts a vector back into a tangent matrix relative to a reference point using Euclidean metric.
euclidean_unvec(sigma, w)euclidean_unvec(sigma, w)
sigma |
A symmetric matrix. |
w |
A numeric vector, representing the vectorized tangent image. |
A symmetric matrix, representing the tangent vector.
Converts a symmetric matrix into a vector representation.
euclidean_vec(sigma, v)euclidean_vec(sigma, v)
sigma |
A symmetric matrix. |
v |
A vector. |
A numeric vector, representing the vectorized tangent image.
Returns the number of parallel workers configured in the current future plan.
get_n_workers()get_n_workers()
Integer specifying the number of workers, or 1 if sequential processing is enabled.
## Not run: set_parallel_plan("multisession", workers = 4) get_n_workers() # Returns 4 set_parallel_plan("sequential") get_n_workers() # Returns 1 ## End(Not run)## Not run: set_parallel_plan("multisession", workers = 4) get_n_workers() # Returns 4 set_parallel_plan("sequential") get_n_workers() # Returns 1 ## End(Not run)
Half-underscore operation for use in the log-Cholesky metric
half_underscore(x)half_underscore(x)
x |
A symmetric matrix (object of class dsyMatrix) |
The strictly lower triangular part of the matrix, plus half its diagonal part
Create an Identity Matrix
id_matr(sigma)id_matr(sigma)
sigma |
A matrix. |
An identity matrix of the same dimensions as sigma.
Checks whether parallel processing is currently enabled based on the future plan.
is_parallel_enabled()is_parallel_enabled()
Logical value: TRUE if parallel processing is enabled, FALSE if sequential.
set_parallel_plan(), should_parallelize()
## Not run: # Check current status is_parallel_enabled() # Enable parallel processing set_parallel_plan("multisession") is_parallel_enabled() # Returns TRUE # Disable parallel processing set_parallel_plan("sequential") is_parallel_enabled() # Returns FALSE ## End(Not run)## Not run: # Check current status is_parallel_enabled() # Enable parallel processing set_parallel_plan("multisession") is_parallel_enabled() # Returns TRUE # Disable parallel processing set_parallel_plan("sequential") is_parallel_enabled() # Returns FALSE ## End(Not run)
Checks whether the progressr package is available and can be used for progress reporting.
is_progress_available()is_progress_available()
Logical value: TRUE if progressr is available, FALSE otherwise.
if (is_progress_available()) { message("Progress reporting is available") }if (is_progress_available()) { message("Progress reporting is available") }
Backend implementation using in-memory list storage. This wraps the existing list-based storage mechanism for backwards compatibility.
riemtan::DataBackend -> ListBackend
new()
Initialize a ListBackend
ListBackend$new(matrices)
matricesA list of dppMatrix objects
get_matrix()
Get a specific matrix by index
ListBackend$get_matrix(i)
iInteger index
A dppMatrix object
get_all_matrices()
Get all matrices
ListBackend$get_all_matrices()
A list of dppMatrix objects
length()
Get the number of matrices
ListBackend$length()
Integer count
get_dimensions()
Get matrix dimensions
ListBackend$get_dimensions()
Integer p (matrices are p x p)
clone()
The objects of this class are cloneable with this method.
ListBackend$clone(deep = FALSE)
deepWhether to make a deep clone.
This function computes the Riemannian exponential map using the Log-Cholesky metric for symmetric positive-definite matrices. The map operates by transforming the tangent vector via Cholesky decomposition of the reference point.
log_cholesky_exp(sigma, v, validate = FALSE)log_cholesky_exp(sigma, v, validate = FALSE)
sigma |
A symmetric positive-definite matrix of class |
v |
A symmetric matrix of class |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
A symmetric positive-definite matrix of class dppMatrix, representing the point on the manifold.
This function computes the Riemannian logarithmic map using the Log-Cholesky metric for symmetric positive-definite matrices. The Log-Cholesky metric operates by transforming matrices via their Cholesky decomposition.
log_cholesky_log(sigma, lambda, validate = FALSE)log_cholesky_log(sigma, lambda, validate = FALSE)
sigma |
A symmetric positive-definite matrix of class |
lambda |
A symmetric positive-definite matrix of class |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
A symmetric matrix of class dspMatrix, representing the tangent space image of lambda at sigma.
Compute the Log-Cholesky Inverse Vectorization
log_cholesky_unvec(sigma, w)log_cholesky_unvec(sigma, w)
sigma |
A symmetric positive-definite matrix of class |
w |
A numeric vector representing the vectorized tangent image |
A symmetric matrix of class dspMatrix
Compute the Log-Cholesky Vectorization
log_cholesky_vec(sigma, v)log_cholesky_vec(sigma, v)
sigma |
A symmetric positive-definite matrix of class |
v |
A symmetric matrix of class |
A numeric vector representing the vectorized tangent image
This function computes the Euclidean exponential map.
log_euclidean_exp(ref_pt, v, validate = FALSE)log_euclidean_exp(ref_pt, v, validate = FALSE)
ref_pt |
A reference point. |
v |
A tangent vector to be mapped back to the manifold at |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
The point on the manifold corresponding to the tangent vector at ref_pt.
Compute the Log-Euclidean Logarithm
log_euclidean_log(sigma, lambda, validate = FALSE)log_euclidean_log(sigma, lambda, validate = FALSE)
sigma |
A reference point. |
lambda |
A point on the manifold. |
validate |
A logical value indicating whether to validate input arguments. Default is FALSE. |
The tangent space image of lambda at sigma.
Converts a vector back into a tangent matrix relative to a reference point using Euclidean metric.
log_euclidean_unvec(sigma, w)log_euclidean_unvec(sigma, w)
sigma |
A symmetric matrix. |
w |
A numeric vector, representing the vectorized tangent image. |
A symmetric matrix, representing the tangent vector.
Converts a symmetric matrix into a vector representation.
log_euclidean_vec(sigma, v)log_euclidean_vec(sigma, v)
sigma |
A symmetric matrix. |
v |
A vector. |
A numeric vector, representing the vectorized tangent image.
Constructs a metric object that contains the necessary functions for Riemannian operations.
metric(log, exp, vec, unvec)metric(log, exp, vec, unvec)
log |
A function representing the Riemannian logarithmic map. This function should accept a |
exp |
A function representing the Riemannian exponential map. This function should accept a |
vec |
A function representing the vectorization operation for tangent spaces. This function should accept a |
unvec |
A function representing the inverse of the vectorization operation. This function should accept a |
An object of class rmetric containing the specified functions.
Ready-to-use metric objects for various Riemannian geometries on the manifold of symmetric positive definite matrices.
airm log_euclidean euclidean log_cholesky bures_wassersteinairm log_euclidean euclidean log_cholesky bures_wasserstein
Objects of class rmetric containing four functions:
Computes the Riemannian logarithm
Computes the Riemannian exponential
Performs vectorization
Performs inverse vectorization
An object of class rmetric of length 4.
An object of class rmetric of length 4.
An object of class rmetric of length 4.
An object of class rmetric of length 4.
An object of class rmetric of length 4.
This module provides functions to configure and manage parallel processing using the futureverse framework (future + furrr packages).
Backend implementation using Apache Arrow Parquet files with lazy loading. Matrices are stored as individual Parquet files and loaded on-demand with LRU caching.
riemtan::DataBackend -> ParquetBackend
new()
Load metadata from JSON file
Load a matrix from Parquet file
Update LRU cache with a new matrix
Initialize a ParquetBackend
ParquetBackend$new(data_dir, cache_size = 10)
data_dirPath to directory containing Parquet files and metadata.json
cache_sizeMaximum number of matrices to cache (default 10)
iInteger index
iInteger index
matA dppMatrix object
A dppMatrix object
get_matrix()
Get a specific matrix by index
ParquetBackend$get_matrix(i)
iInteger index
A dppMatrix object
get_all_matrices()
Get all matrices (loads all from disk if necessary)
ParquetBackend$get_all_matrices(parallel = NULL, progress = FALSE)
parallelLogical indicating whether to use parallel loading (default: NULL, auto-detect)
progressLogical indicating whether to show progress (default: FALSE)
A list of dppMatrix objects
get_matrices_parallel()
Load multiple matrices in parallel (batch loading)
ParquetBackend$get_matrices_parallel(indices, progress = FALSE)
indicesVector of integer indices to load
progressLogical indicating whether to show progress (default: FALSE)
A list of dppMatrix objects
length()
Get the number of matrices
ParquetBackend$length()
Integer count
get_dimensions()
Get matrix dimensions
ParquetBackend$get_dimensions()
Integer p (matrices are p x p)
get_metadata()
Get metadata
ParquetBackend$get_metadata()
List containing metadata information
clear_cache()
Clear the cache
ParquetBackend$clear_cache()
get_cache_size()
Get current cache size
ParquetBackend$get_cache_size()
Integer number of cached matrices
clone()
The objects of this class are cloneable with this method.
ParquetBackend$clone(deep = FALSE)
deepWhether to make a deep clone.
This module provides utilities for progress reporting during computationally intensive operations, using the progressr package.
Changes the reference point for tangent space representations on a Riemannian manifold. Supports parallel processing via the futureverse framework for improved performance on large datasets.
relocate(old_ref, new_ref, images, met, progress = FALSE)relocate(old_ref, new_ref, images, met, progress = FALSE)
old_ref |
A reference point on the manifold to be replaced. Must be an object of class |
new_ref |
The new reference point on the manifold. Must be an object of class |
images |
A list of tangent representations relative to the old reference point. Each element in the list must be an object of class |
met |
A metric object of class |
progress |
Logical indicating whether to show progress during computation (default: FALSE). Requires progressr package. |
This function uses parallel processing when the number of images exceeds a threshold and
parallel processing is enabled via set_parallel_plan(). For small datasets, sequential
processing is used automatically to avoid parallelization overhead.
A list of tangent representations relative to the new reference point. Each element in the returned list will be an object of class dspMatrix.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) data(airm) old_ref <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() new_ref <- diag(c(2, 3)) |> Matrix::nearPD() |> _$mat |> Matrix::pack() images <- list( diag(2) |> Matrix::symmpart() |> Matrix::pack(), diag(c(1, 0.5)) |> Matrix::symmpart() |> Matrix::pack() ) relocate(old_ref, new_ref, images, airm) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) data(airm) old_ref <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() new_ref <- diag(c(2, 3)) |> Matrix::nearPD() |> _$mat |> Matrix::pack() images <- list( diag(2) |> Matrix::symmpart() |> Matrix::pack(), diag(c(1, 0.5)) |> Matrix::symmpart() |> Matrix::pack() ) relocate(old_ref, new_ref, images, airm) }
Convenience function to reset parallel processing to sequential mode.
Equivalent to set_parallel_plan("sequential").
reset_parallel_plan()reset_parallel_plan()
Invisibly returns the future plan object.
## Not run: # Enable parallel processing set_parallel_plan("multisession", workers = 4) # Reset to sequential reset_parallel_plan() ## End(Not run)## Not run: # Enable parallel processing set_parallel_plan("multisession", workers = 4) # Reset to sequential reset_parallel_plan() ## End(Not run)
Simulates random samples from a Riemannian normal distribution on symmetric positive definite matrices.
rspdnorm(n, refpt, disp, met)rspdnorm(n, refpt, disp, met)
n |
Number of samples to generate. |
refpt |
Reference point on the manifold, represented as a symmetric positive definite matrix. Must be an object of class |
disp |
Dispersion matrix defining the spread of the distribution. Must be an object of class |
met |
A metric object of class |
An object of class CSample containing the generated samples.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) data(airm) refpt <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() disp <- diag(3) |> Matrix::nearPD() |> _$mat |> Matrix::pack() rspdnorm(10, refpt, disp, airm) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) data(airm) refpt <- diag(2) |> Matrix::nearPD() |> _$mat |> Matrix::pack() disp <- diag(3) |> Matrix::nearPD() |> _$mat |> Matrix::pack() rspdnorm(10, refpt, disp, airm) }
Wrapper for the matrix logarithm
safe_logm(x)safe_logm(x)
x |
A matrix |
Its matrix logarithm
Configure the parallel processing strategy for riemtan operations. Uses the future package to manage parallel backends.
set_parallel_plan(strategy = "sequential", workers = NULL)set_parallel_plan(strategy = "sequential", workers = NULL)
strategy |
Character string specifying the parallel strategy:
|
workers |
Integer specifying the number of parallel workers.
If |
The multisession strategy is recommended for most users as it works on all platforms.
The multicore strategy is faster on Unix-like systems but is not available on Windows.
To disable parallel processing, use set_parallel_plan("sequential").
Invisibly returns the future plan object.
future::plan(), is_parallel_enabled(), should_parallelize()
## Not run: # Enable parallel processing with automatic worker detection set_parallel_plan("multisession") # Use 4 parallel workers set_parallel_plan("multisession", workers = 4) # Disable parallel processing set_parallel_plan("sequential") ## End(Not run)## Not run: # Enable parallel processing with automatic worker detection set_parallel_plan("multisession") # Use 4 parallel workers set_parallel_plan("multisession", workers = 4) # Disable parallel processing set_parallel_plan("sequential") ## End(Not run)
Internal function to determine if an operation should use parallel processing based on the number of items to process and current configuration.
should_parallelize(n, threshold = 10)should_parallelize(n, threshold = 10)
n |
Integer specifying the number of items to process. |
threshold |
Integer specifying the minimum number of items required for parallel processing to be beneficial (default: 10). Below this threshold, sequential processing is used even if parallelization is enabled. |
This function returns TRUE only if:
Parallel processing is enabled (via set_parallel_plan())
The number of items n is at least threshold
For small numbers of items, the overhead of parallelization typically outweighs the benefits, so sequential processing is used.
Logical value: TRUE if parallel processing should be used, FALSE otherwise.
set_parallel_plan(), is_parallel_enabled()
## Not run: # With parallel processing enabled set_parallel_plan("multisession") should_parallelize(5) # FALSE (below threshold) should_parallelize(20) # TRUE (above threshold) # With parallel processing disabled set_parallel_plan("sequential") should_parallelize(100) # FALSE (sequential plan) ## End(Not run)## Not run: # With parallel processing enabled set_parallel_plan("multisession") should_parallelize(5) # FALSE (below threshold) should_parallelize(20) # TRUE (above threshold) # With parallel processing disabled set_parallel_plan("sequential") should_parallelize(100) # FALSE (sequential plan) ## End(Not run)
Reverse isometry from tangent space at identity to tangent space at P
spd_isometry_from_identity(sigma, v)spd_isometry_from_identity(sigma, v)
sigma |
A symmetric positive-definite matrix of class |
v |
A symmetric matrix of class |
A symmetric matrix of class dspMatrix
Isometry from tangent space at P to tangent space at identity
spd_isometry_to_identity(sigma, v)spd_isometry_to_identity(sigma, v)
sigma |
A symmetric positive-definite matrix of class |
v |
A symmetric matrix of class |
A symmetric matrix of class dspMatrix
TangentImageHandler Class
TangentImageHandler Class
This class handles tangent images on a manifold. It provides methods to set a reference point, compute tangents, and perform various operations using a provided metric. Initialize the TangentImageHandler
Error if the tangent images have not been specified
Error if the reference point is not an object of class dppMatrix
Error if the matrix is not of type dspMatrix Tangent images getter
ref_pointA matrix of type dppMatrix
tangent_imagesA list of dspMatrix objects
new()
TangentImageHandler$new(metric_obj, reference_point = NULL)
metric_objAn rmetric object for operations.
reference_pointAn optional reference point on the manifold.
A new instance of TangentImageHandler. Set a new reference point.
set_reference_point()
If tangent images have been created, it recomputes them by mapping to the manifold and then to the new tangent space.
TangentImageHandler$set_reference_point(new_ref_pt, progress = FALSE)
new_ref_ptA new reference point of class dppMatrix.
progressLogical indicating whether to show progress (default: FALSE)
None. Computes the tangent images from the points in the manifold
compute_tangents()
TangentImageHandler$compute_tangents(manifold_points, progress = FALSE)
manifold_pointsA list of connectomes
progressLogical indicating whether to show progress (default: FALSE)
None Computes vectorizations from tangent images
compute_vecs()
TangentImageHandler$compute_vecs(progress = FALSE)
progressLogical indicating whether to show progress (default: FALSE)
A matrix, each row of which is a vectorization Computes connectomes from tangent images
compute_conns()
TangentImageHandler$compute_conns(progress = FALSE)
progressLogical indicating whether to show progress (default: FALSE)
A list of connectomes Setter for the tangent images
set_tangent_images()
TangentImageHandler$set_tangent_images(reference_point, tangent_images)
reference_pointA connectome
tangent_imagesA list of tangent images
None Appends a matrix to the list of tangent images
add_tangent_image()
TangentImageHandler$add_tangent_image(image)
imageMatrix to be added
get_tangent_images()
TangentImageHandler$get_tangent_images()
list of tangent matrices Wrapper for set_reference_point
relocate_tangents()
TangentImageHandler$relocate_tangents(new_ref_pt, progress = FALSE)
new_ref_ptThe new reference point
progressLogical indicating whether to show progress (default: FALSE)
None
clone()
The objects of this class are cloneable with this method.
TangentImageHandler$clone(deep = FALSE)
deepWhether to make a deep clone.
Validates that a backend object inherits from DataBackend and implements required methods.
validate_backend(backend)validate_backend(backend)
backend |
A backend object to validate |
None. Throws an error if validation fails.
Validates the connections input.
validate_conns(conns, tan_imgs, vec_imgs, centered)validate_conns(conns, tan_imgs, vec_imgs, centered)
conns |
List of connection matrices. |
tan_imgs |
List of tangent images. |
vec_imgs |
Matrix of vector images. |
centered |
Logical indicating if the data is centered. |
None. Throws an error if the validation fails.
Validate arguments for Riemannian logarithms
validate_exp_args(sigma, v)validate_exp_args(sigma, v)
sigma |
A dppMatrix object |
v |
A dspMatrix object |
Error if sigma and lambda are not of the same dimensions
None
Validate arguments for Riemannian logarithms
validate_log_args(sigma, lambda)validate_log_args(sigma, lambda)
sigma |
A dppMatrix object |
lambda |
A dppMatrix object |
Error if sigma and lambda are not of the same dimensions
None
Validates that the metric is not NULL.
validate_metric(metric)validate_metric(metric)
metric |
The metric to validate. |
None. Throws an error if the metric is NULL.
Validate Parquet Directory Structure
validate_parquet_dir(data_dir, check_files = TRUE)validate_parquet_dir(data_dir, check_files = TRUE)
data_dir |
Path to directory to validate |
check_files |
If TRUE, validates that Parquet files exist (default: TRUE) |
None. Throws an error if validation fails.
Validates that a directory contains properly formatted Parquet files and metadata for use with ParquetBackend.
validate_parquet_directory(data_dir, verbose = TRUE)validate_parquet_directory(data_dir, verbose = TRUE)
data_dir |
Path to directory to validate |
verbose |
If TRUE, prints validation details (default: TRUE) |
Checks:
Directory exists
metadata.json exists and is valid
All expected Parquet files exist
Parquet files have correct dimensions
Logical indicating whether the directory is valid
## Not run: validate_parquet_directory("my_connectomes") ## End(Not run)## Not run: validate_parquet_directory("my_connectomes") ## End(Not run)
Validates the tangent images input.
validate_tan_imgs(tan_imgs, vec_imgs, centered)validate_tan_imgs(tan_imgs, vec_imgs, centered)
tan_imgs |
List of tangent images. |
vec_imgs |
List of vector images. |
centered |
Logical indicating if the data is centered. |
None. Throws an error if the validation fails.
Validate arguments for inverse vectorization
validate_unvec_args(sigma, w)validate_unvec_args(sigma, w)
sigma |
A dppMatrix object |
w |
A numeric vector |
Error if the dimensionalities don't match
None
Validate arguments for vectorization
validate_vec_args(sigma, v)validate_vec_args(sigma, v)
sigma |
A dppMatrix object |
v |
A dspMatrix object |
Error if sigma and v are not of the same dimensions
None
Validates the vector images input.
validate_vec_imgs(vec_imgs, centered)validate_vec_imgs(vec_imgs, centered)
vec_imgs |
List of vector images. |
centered |
Logical indicating if the data is centered. |
None. Throws an error if the validation fails.
Converts a symmetric matrix into a vector representation specific to operations at the identity matrix.
vec_at_id(v)vec_at_id(v)
v |
A symmetric matrix of class |
A numeric vector, representing the vectorized tangent image.
if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) v <- diag(c(1, sqrt(2))) |> Matrix::symmpart() |> Matrix::pack() vec_at_id(v) }if (requireNamespace("Matrix", quietly = TRUE)) { library(Matrix) v <- diag(c(1, sqrt(2))) |> Matrix::symmpart() |> Matrix::pack() vec_at_id(v) }
Exports a list of SPD matrices (connectomes) to individual Parquet files with accompanying metadata.
write_connectomes_to_parquet( connectomes, output_dir, file_pattern = "matrix_%04d.parquet", subject_ids = NULL, provenance = NULL, overwrite = FALSE )write_connectomes_to_parquet( connectomes, output_dir, file_pattern = "matrix_%04d.parquet", subject_ids = NULL, provenance = NULL, overwrite = FALSE )
connectomes |
A list of dppMatrix objects representing SPD matrices |
output_dir |
Path to output directory (will be created if it doesn't exist) |
file_pattern |
File naming pattern with %d placeholder for index (default: "matrix_%04d.parquet") |
subject_ids |
Optional vector of subject/sample identifiers (default: NULL) |
provenance |
Optional list containing data provenance information (default: NULL) |
overwrite |
If TRUE, overwrites existing directory (default: FALSE) |
Creates a directory structure:
Individual Parquet files (one per matrix)
metadata.json with dimensions, file pattern, and optional metadata
The metadata.json file contains:
n_matrices: Number of matrices
matrix_dim: Dimension p (matrices are p x p)
file_pattern: Naming pattern for Parquet files
subject_ids: Optional subject identifiers
provenance: Optional provenance information
Invisibly returns the path to the output directory
## Not run: # Create sample data mats <- replicate(10, Matrix::pack(Matrix::Matrix(diag(5), sparse = FALSE)), simplify = FALSE) # Write to Parquet write_connectomes_to_parquet( mats, output_dir = "my_connectomes", subject_ids = paste0("subj_", 1:10), provenance = list(study = "Example Study", date = Sys.Date()) ) ## End(Not run)## Not run: # Create sample data mats <- replicate(10, Matrix::pack(Matrix::Matrix(diag(5), sparse = FALSE)), simplify = FALSE) # Write to Parquet write_connectomes_to_parquet( mats, output_dir = "my_connectomes", subject_ids = paste0("subj_", 1:10), provenance = list(study = "Example Study", date = Sys.Date()) ) ## End(Not run)