Title: | Numerical Estimation of Sparse Hessians |
---|---|
Description: | Estimates Hessian of a scalar-valued function, and returns it in a sparse Matrix format. The sparsity pattern must be known in advance. The algorithm is especially efficient for hierarchical models with a large number of heterogeneous units. See Braun, M. (2017) <doi:10.18637/jss.v082.i10>. |
Authors: | Michael Braun [aut, cre, cph] |
Maintainer: | Michael Braun <[email protected]> |
License: | MPL (== 2.0) |
Version: | 0.3.3.7 |
Built: | 2025-03-07 04:35:01 UTC |
Source: | https://github.com/braunm/sparsehessianfd |
Estimate sparse Hessians using finite differences of gradients.
The Hessian is returned as a sparse Matrix (dgCMatrix-class). The user supplies the objective function, the gradient, and the row and column indices of the non-zero elements of the lower triangle of the Hessian (i.e., the sparsity structure must be known in advance).
In a typical case, you should only need to use the sparseHessianFD initializer, and the fn, gr and hessian methods of the sparseHessian class, and the Matrix.to.Coord utility function.
Braun, Michael. 2017. sparseHessianFD: An R Package for Estimating Sparse Hessian Matrices. Journal of Statistical Software 82 (10): 1-22.
Coleman, Thomas F, and Jin-Yi Cai. 1986. The Cyclic Coloring Problem and Estimation of Sparse Hessian Matrices. SIAM Journal on Algebraic Discrete Methods 7 (2): 221-235
Coleman, Thomas F, Burton S Garbow, and Jorge J More. 1985. Software for Estimating Sparse Hessian Matrices. ACM Transaction on Mathematical Software 11 (4) (December): 363-377.
Coleman, Thomas F and Jorge J More. 1984. Estimation of Sparse Hessian Matrices and Graph Coloring Problems. Mathematical Programming 28 (3) (October): 243-270
Powell, M J D and Ph L Toint. 1979. On the Estimation of Sparse Hessian Matrices. SIAM Journal on Numerical Analysis 16 (6) (December): 1060-1074.
Functions for binary choice example in the vignette.
binary.f(P, data, priors, order.row = FALSE) binary.grad(P, data, priors, order.row = FALSE) binary.hess(P, data, priors, order.row = FALSE)
binary.f(P, data, priors, order.row = FALSE) binary.grad(P, data, priors, order.row = FALSE) binary.hess(P, data, priors, order.row = FALSE)
P |
Numeric vector of length |
data |
Named list of data matrices Y and X, and choice count integer T |
priors |
Named list of matrices inv.Omega and inv.Sigma |
order.row |
Determines order of heterogeneous coefficients in parameter vector. If TRUE, heterogeneous coefficients are ordered by unit. If FALSE, they are ordered by covariate. |
These functions are used by the heterogeneous binary choice example in the vignette. There are N heterogeneous units, each making T binary choices. The choice probabilities depend on k covariates.
Log posterior density, gradient and Hessian. The Hessian is a dgCMatrix object.
Sample datasets for vignette
The package provides four sample datasets for the hierarchical binary choice model described in the vignette. These datasets are:
,
,
,
,
is the number of heterogeneous units.
is the number of covariates.
The datasets were generated using the code in data-raw/binary_sim.R.
cyclic coloring from a lower triangular pattern matrix
coloring(L)
coloring(L)
L |
sparsity pattern of the Hessian as a lower triangular pattern matrix |
For internal use. Exported in order for replication files for JSS article to run.
Integer vector of length nvars with color assignments for each variable.
Returns indices and pointers that define a sparse Hessian in compressed format. Inputs are the row and column indices.
Coord.to.Pointers( rows, cols, dims, triangle = TRUE, lower = TRUE, symmetric = FALSE, order = c("column", "row", "triplet"), index1 = TRUE )
Coord.to.Pointers( rows, cols, dims, triangle = TRUE, lower = TRUE, symmetric = FALSE, order = c("column", "row", "triplet"), index1 = TRUE )
rows , cols
|
row and column indices of non-zero elements |
dims |
2-element vector for number of rows and columns. |
triangle |
Is input intended to be a triangular (TRUE) or full
(FALSE) matrix. See details for how this argument is interpreted
for different values of |
lower |
If |
symmetric |
If TRUE, and matrix is triangular, then the matrix will be treated as symmetric, with only the triangular elements provided. If matrix is neither triangular nor symmetric, then symmetric=TRUE will probably trigger an error. |
order |
Determines the indexing/compression scheme for the output matrix. Use "triplet" to get row and column indices. Defaults to the same class as M. |
index1 |
TRUE if using 1-based indexing. FALSE for 0-based indexing. |
triangle
and order
have the following interpretation:
Input rows
and cols represent lower
or upper triangle of a matrix. If order="symmetric"
, then
the output list will be for a full, symmetric matrix. Otherwise,
the output list will be for only the lower or upper triangle. Any
elements outside of the specified triangle will trigger an error.
Input rows
and cols represent a full
matrix. If that matrix is not symmetric, then
order=="symmetric"
will trigger an error.
If symmetric=FALSE
and order='triplet'
, the output
list should contain the same row and column indices as the input
list.
A list. See Matrix.to.Pointers (no values are included in return list).
Matrix.to.Pointers
These functions were in earlier versions, but will no longer be maintained, and are not even guaranteed to work now.
Build sparse matrix from data in CSC (column compressed) format.
Converts row and column indices to a pattern Matrix object of Matrix class
This function is deprecated. Use sparseHessianFD
instead.
Sym.CSC.to.Matrix(H, nvars) Coord.to.Sym.Pattern.Matrix(H, nvars) Coord.to.Pattern.Matrix( rows, cols, dims, compressed = TRUE, symmetric = FALSE, index1 = TRUE ) new.sparse.hessian.obj( x, fn, gr, hs, fd.method = 0L, eps = sqrt(.Machine$double.eps), ... ) sparseHessianFD.new( x, fn, gr, rows, cols, direct = NULL, eps = sqrt(.Machine$double.eps), ... )
Sym.CSC.to.Matrix(H, nvars) Coord.to.Sym.Pattern.Matrix(H, nvars) Coord.to.Pattern.Matrix( rows, cols, dims, compressed = TRUE, symmetric = FALSE, index1 = TRUE ) new.sparse.hessian.obj( x, fn, gr, hs, fd.method = 0L, eps = sqrt(.Machine$double.eps), ... ) sparseHessianFD.new( x, fn, gr, rows, cols, direct = NULL, eps = sqrt(.Machine$double.eps), ... )
H |
a list containing Hessian data. See details. |
nvars |
the number of rows (and columns) in the matrix. |
rows , cols
|
row and column indices of non-zero elements |
dims |
2-element vector for number of rows and columns in matrix |
compressed |
If TRUE, returns a matrix is compressed column (default=TRUE) |
symmetric |
If TRUE, matrix will be symmetric, and only the lower triangular elements need to be provided (default=FALSE) |
index1 |
TRUE if input row and col use 1-based indexing, and FALSE for 0-based indexing. |
x |
variable vector for initialization |
fn |
R function that returns function value |
gr |
R function that returns the gradient of the function |
hs |
list of two vectors: row and column indices of non-zero elements of lower triangle of Hessian. See details. |
fd.method |
If TRUE, use direct method for computatation. Otherwise, use indirect/substitution method. See references. |
eps |
The perturbation amount for finite differencing of the gradient to compute the Hessian. Defaults to sqrt(.Machine$double.eps). |
... |
Other parameters to be passed to fn and gr. |
direct |
If TRUE, use direct method for computatation. Otherwise, use indirect/substitution method. See references. |
Use Matrix::sparseMatrix instead of Sym.CSC.to.Matrix.
Use Coord.to.Pattern.Matrix with symmetric=TRUE instead of Coord.to.Sym.Pattern.Matrix.
This function is useful to prototype a sparsity pattern. No assumptions are made about symmetry.
hs is a list of two elements:
Integer vector of row indices of non-zero elements in lower triangle of Hessian.
Integer vector of column indices of non-zero elements in lower triangle of Hessian.
This function is deprecated. Use sparseHessianFD
instead.
An object of Matrix class.
A sparse pattern matrix
An object of class sparseHessianFD
Generate proper vertex coloring of a sparse undirected graph.
get_colors(pntr, idx, nvars)
get_colors(pntr, idx, nvars)
pntr , idx
|
row pointers and column indices of the adjacency matrix, in compressed column-oriented format. Must use zero-based indexing. |
nvars |
Number of vertices. |
For internal use. You should not have to call this function directly.
An integer vector of length nvars, where each element represents the color of the corresponding vertex. Indices are zero-based.
Utility function to extract row and column indices of the non-zero elements of a sparse matrix.
Matrix.to.Coord(M, index1 = TRUE)
Matrix.to.Coord(M, index1 = TRUE)
M |
A matrix that is a subclass of sparseMatrix, as defined in the Matrix package. |
index1 |
TRUE if the index of the first element should be 1, and FALSE if 0. |
A wrapper to Matrix.to.Pointers for order='triplet'
and values=FALSE
, for extracting the row and column indices
of a sparsity pattern from a matrix that has that same pattern.
A list with two named elements.
Integer vector containing row indices of non-zero elements
Integer vector containing column indices of non-zero elements
M1 <- as(kronecker(diag(3), matrix(TRUE,2,2)),"sparseMatrix") C <- Matrix.to.Coord(M1) M2 <- Matrix::sparseMatrix(i=C$rows, j=C$cols) all.equal(M1,M2)
M1 <- as(kronecker(diag(3), matrix(TRUE,2,2)),"sparseMatrix") C <- Matrix.to.Coord(M1) M2 <- Matrix::sparseMatrix(i=C$rows, j=C$cols) all.equal(M1,M2)
Returns a list of row indices, column indices, pointers, and/or values of a sparse Hessian.
Matrix.to.Pointers( M, as.symmetric = Matrix::isSymmetric(M), values = !is(M, "nMatrix"), order = NULL, index1 = TRUE )
Matrix.to.Pointers( M, as.symmetric = Matrix::isSymmetric(M), values = !is(M, "nMatrix"), order = NULL, index1 = TRUE )
M |
A sparse Matrix, as defined in the Matrix package. |
as.symmetric |
Defaults to isSymmetric(M). If M is symmetric, and as.symmetric is FALSE, then index/pointer elements in the output list will be labeled according to order. If M is not symmetric, and as.symmetric is TRUE, then an error will be triggered. |
values |
If TRUE, values are returned in list as 'x'. Defaults to TRUE for numeric and logical matrices, and FALSE for pattern matrices. If M is a pattern matrix, values=TRUE will trigger a warning. |
order |
Determines the indexing/compression scheme for the output matrix. Use 'triplet" to get row and column indices. Defaults to the same class as M. |
index1 |
TRUE (default) if return indices and pointers should use 1-based indexing. FALSE for 0-based indexing. |
This function is included primarily for debugging purposes. It is used internally, but would not ordinarily be called by an end user.
A list with the following elements. If order=='row',
Integer vector containing column indices of non-zero elements
Integer vector containing pointers to elements of jCol at which the next row begins.
If order=='column'
Integer vector containing row indices of non-zero elements
Integer vector containing pointers to elements of iRow at which the next column begins.
If order=='triplet'
Row indices of non-zero elements
Column indices of non-zero elements
If as.symmetric is TRUE, then the row/column orientation does not matter.
Integer vector containing indices of non-zero elements
Integer vector containing pointers to elements of idx at which the next row or column begins.
If values=TRUE, the return list includes x, the values of the non-zero elements. The 'class' element is the name of the sparse matrix class to which the output corresponds (identifies numeric type, pattern, and indexing/compression scheme).
Matrix.to.Coord
A reference class for computing sparse Hessians
The sparseHessianFD function calls the initializer for the sparseHessianFD class, and returns a sparseHessianFD object.
sparseHessianFD(x, fn, gr, rows, cols, delta, index1, complex, ...)
The function, gradient and sparsity pattern are declared as part of the initialization.
Once initialized, the $hessian method will evaluate the Hessian at x.
obj <- sparseHessian(x, fn, gr, rows, cols, ...) obj$hessian(x)
For convenience, the class provides wrapper methods to the fn
and gr
functions that were specified in the initializer.
obj$fn(P) ## wrapper to objective function obj$gr(P) ## wrapper to gradient obj$fngr(P) ## list of obj function and gradient obj$fngrhs(P) ## list of obj function, gradient and Hessian.
an vector at which the function, gradient and Hessian are initialized and tested.
R functions that return the function value and gradient, evaluated at x.
Numeric vectors with row and column indices of the non-zero elements in the lower triangle (including diagonal) of the Hessian.
The perturbation amount for finite difference (or complex step) of the gradient to compute the Hessian. Defaults to 1e-07.
TRUE if rows and cols use 1-based (R format) indexing (FALSE for 0-based (C format) indexing.
TRUE if Hessian will be computed using the complex step method, and FALSE (default) if using finite differences. If TRUE, both fn and gr must accept complex arguments and return complex values.
other arguments to be passed to fn and gr.
Other methods are described below. Do not access any of the fields directly. The internal structure is subject to change in future versions.
fn1
A closure for calling fn(x, ...).
gr1
A closure for calling gr(x, ...).
iRow,jCol
Numeric vectors with row and column indices of the non-zero elements in the lower triangle (including diagonal) of the Hessian.
delta
The perturbation amount for finite differencing of the gradient to compute the Hessian. Defaults to 1e-07.
index1
TRUE if rows and cols use 1-based (R format) indexing (FALSE for 0-based (C format) indexing.
complex
TRUE if Hessian will be computed using the complex step method, and FALSE (default) if using finite differences.
D
raw finite differences (internal use only)
nvars
Number of variables (length of x)
nnz
Number of non-zero elements in the lower triangle of the Hessian.
ready
TRUE if object has been initialized, and Hessian has been partitioned.
idx,pntr
Column indices and row pointers for non-zero elements in lower triangle of the permuted Hessian. Row-oriented compressed storage.
colors
A vector representation of the partitioning of the columns. There are nvars elements, one for each column of the permuted Hessian. The value corresponds to the "color" for that column.
perm,invperm
Permutation vector and its inverse
fn(x)
Return function value, evaluated at x: fn(x, ...)
fngr(x)
Return list of function value and gradient, evaluated at x
fngrhs(x)
Return list of function value, gradient, and Hessian, evaluated at x
get_invperm()
Return integer vector of inverse of permutation used for computing Hessian
get_nnz()
Return number of non-zero elements in lower triangle of Hessian
get_nvars()
Return dimension (number of rows or columns) of Hessian
get_pattern()
Return pattern matrix of lower triangle of Hessian
get_perm()
Return integer vector of permutation used for computing Hessian
get_perm_pattern()
Return pattern matrix of lower triangle of *permuted* Hessian
gr(x)
Return gradient, evaluated at x: gr(x,...)
hessian(x)
Return sparse Hessian, evaluated at x, as a dgCMatrix object.
initialize(
x,
fn,
gr,
rows,
cols,
delta = 1e-07,
index1 = TRUE,
complex = FALSE,
...
)
Initialize object with functions to compute the objective function and gradient (fn and gr), row and column indices of non-zero elements (rows and cols), an initial variable vector x at which fn and gr can be evaluated, a finite differencing parameter delta, flags for 0 or 1-based indexing (index1), whether the complex step method will be used, and other arguments (...) to be passed to fn and gr.
partition()
Return the partitioning used to compute finite differences
pointers(out.index1 = index1)
Return list with indices (idx) and pointers (pntr) for sparsity pattern of the compressed sparse Hessian. Since the Hessian is symmetric, the indices and pointers for row-oriented and column-oriented storage patterns are the same.
## Log posterior density of hierarchical binary choice model. See vignette. set.seed(123) data("binary_small") N <- length(binary[["Y"]]) k <- NROW(binary[["X"]]) T <- binary[["T"]] P <- rnorm((N+1)*k) priors <- list(inv.Sigma = rWishart(1,k+5,diag(k))[,,1], inv.Omega = diag(k)) true.hess <- binary.hess(P, binary, priors) pattern <- Matrix.to.Coord(Matrix::tril(true.hess)) str(pattern) obj <- sparseHessianFD(P, fn=binary.f, gr=binary.grad, rows=pattern[["rows"]], cols=pattern[["cols"]], data=binary, priors=priors) hs <- obj$hessian(P) all.equal(hs, true.hess) f <- obj$fn(P) ## obj function df <- obj$gr(P) ## gradient fdf <- obj$fngr(P) ## list of obj function and gradient fdfhs <- obj$fngrhs(P) ## list of obj function, gradient and Hessian.
## Log posterior density of hierarchical binary choice model. See vignette. set.seed(123) data("binary_small") N <- length(binary[["Y"]]) k <- NROW(binary[["X"]]) T <- binary[["T"]] P <- rnorm((N+1)*k) priors <- list(inv.Sigma = rWishart(1,k+5,diag(k))[,,1], inv.Omega = diag(k)) true.hess <- binary.hess(P, binary, priors) pattern <- Matrix.to.Coord(Matrix::tril(true.hess)) str(pattern) obj <- sparseHessianFD(P, fn=binary.f, gr=binary.grad, rows=pattern[["rows"]], cols=pattern[["cols"]], data=binary, priors=priors) hs <- obj$hessian(P) all.equal(hs, true.hess) f <- obj$fn(P) ## obj function df <- obj$gr(P) ## gradient fdf <- obj$fngr(P) ## list of obj function and gradient fdfhs <- obj$fngrhs(P) ## list of obj function, gradient and Hessian.
Estimate Hessian using triangular subsitution algorithm
subst(Y, colors, jCol, ipntr, delta, nvars, nnz)
subst(Y, colors, jCol, ipntr, delta, nvars, nnz)
Y |
Matrix of finite differences of gradients |
colors |
Vector of length nvars that identifies color of each variable |
jCol , ipntr
|
Column indices and row pointers for non-zero elements of lower triangle of Hessian (row-oriented compressed format). |
delta |
Perturbation factor used to compute finite differences of gradients. |
nvars |
Dimension of Hessian (number of variables) |
nnz |
Number of non-zero elements in the lower triangle of the Hessian. |
For internal use. You should not have to call this function directly.
A sparse Hessian of class dgCMatrix.