| Type: | Package |
| Title: | High-Dimensional Ising Model Selection |
| Version: | 0.1.0 |
| Description: | Fits an Ising model to a binary dataset using L1 regularized logistic regression and extended BIC. Also includes a fast lasso logistic regression function for high-dimensional problems. Uses the 'libLBFGS' optimization library by Naoaki Okazaki. |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| Encoding: | UTF-8 |
| LazyData: | true |
| Depends: | R (≥ 3.1.0) |
| Imports: | Rcpp (≥ 0.12.8), data.table (≥ 1.9.6) |
| Suggests: | igraph, IsingSampler |
| LinkingTo: | Rcpp, RcppEigen (≥ 0.3.2.9) |
| RoxygenNote: | 5.0.1 |
| NeedsCompilation: | yes |
| Packaged: | 2016-11-24 15:30:45 UTC; prati |
| Author: | Pratik Ramprasad [aut, cre], Jorge Nocedal [ctb, cph], Naoaki Okazaki [ctb, cph] |
| Maintainer: | Pratik Ramprasad <pratik.ramprasad@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2016-11-25 08:43:07 |
rIsing: High-Dimensional Ising Model Selection.
Description
Fits an Ising model to a binary dataset using L1-regularized logistic regression and extended BIC. Also includes a fast lasso logistic regression function for high-dimensional problems. Uses the 'libLBFGS' optimization library by Naoaki Okazaki.
rIsing functions
- logreg: L1-regularized logistic regression using OWL-QN L-BFGS-B optimization.
- ising: Ising model selection using L1-regularized logistic regression and extended BIC.
High-Dimensional Ising Model Selection
Description
Ising Model selection using L1-regularized logistic regression and extended BIC.
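As a rough sketch of the node-wise problem behind this description, following Ravikumar et al. (see References) and assuming variables coded as 0/1 (the coding rIsing uses internally is not stated here), each node s is regressed on all other nodes with an L1 penalty. The 1/n scaling of the log-likelihood is one common convention and is an assumption, not something this manual confirms:

```latex
P(x_s = 1 \mid x_{\setminus s})
  = \frac{1}{1 + \exp\left(-\left(\theta_s + \sum_{t \neq s} \theta_{st} x_t\right)\right)},
\qquad
\hat{\theta}_{s}
  = \arg\min_{\theta} \left\{ -\frac{1}{n} \sum_{i=1}^{n}
      \log P_{\theta}\left(x_s^{(i)} \mid x_{\setminus s}^{(i)}\right)
      + \lambda \sum_{t \neq s} |\theta_{st}| \right\}
```

The estimated neighborhood of node s is then the set of t with a nonzero fitted coefficient.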
Usage
ising(X, gamma = 0.5, min_sd = 0, nlambda = 50,
lambda.min.ratio = 0.001, symmetrize = "mean")
Arguments
X |
The design matrix. |
gamma |
(non-negative double) Parameter for the extended BIC (default 0.5). Higher gamma encourages sparsity. See references for more details. |
min_sd |
(non-negative double) Columns of |
nlambda |
(positive integer) The number of regularization parameters (lambda values) in the regularization path (default 50). A longer regularization path will likely yield more accurate results, but will take more time to run. |
lambda.min.ratio |
(non-negative double) The ratio |
symmetrize |
The method used to symmetrize the output adjacency matrix. Must be one of "min", "max", "mean" (default), or FALSE. "min" and "max" correspond to the Wainwright min/max, respectively (see reference 1). "mean" corresponds to the coefficient-wise mean of the output adjacency matrix and its transpose. If FALSE, the output matrix is not symmetrized. |
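The three symmetrization options can be illustrated with a small base R sketch. The helper name `symmetrize_sketch` is hypothetical, and the reading of "min"/"max" as "keep whichever of the two coefficients has the smaller/larger absolute value" is an assumption, not confirmed rIsing behavior:

```r
# Hypothetical helper, not part of rIsing: symmetrize a square coefficient matrix.
symmetrize_sketch <- function(A, method = c("mean", "min", "max")) {
  method <- match.arg(method)
  At <- t(A)
  switch(method,
    # coefficient-wise mean of the matrix and its transpose
    mean = (A + At) / 2,
    # assumed "min" rule: keep the entry with the smaller absolute value
    min = ifelse(abs(A) <= abs(At), A, At),
    # assumed "max" rule: keep the entry with the larger absolute value
    max = ifelse(abs(A) >= abs(At), A, At)
  )
}

# Toy asymmetric estimate: node 1 -> 2 is -0.6, node 2 -> 1 is 0.2
A <- matrix(c(0, 0.2, -0.6, 0), nrow = 2)
symmetrize_sketch(A, "mean")  # off-diagonal entries become -0.2
symmetrize_sketch(A, "min")   # off-diagonal entries become 0.2
symmetrize_sketch(A, "max")   # off-diagonal entries become -0.6
```

Under this reading, "min" is the more conservative choice, since an edge survives only if both node-wise regressions select it.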
Value
A list containing the estimated adjacency matrix (Theta) and the optimal regularization parameter for each node (lambda), as selected by extended BIC.
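For orientation, the extended BIC of Barber and Drton (see References) for a candidate neighborhood J of a node is usually written as below. This is a sketch of the criterion as given in that paper; the exact scaling used inside rIsing is an assumption:

```latex
\mathrm{BIC}_{\gamma}(J) = -2 \log L(\hat{\theta}_J) + |J| \left( \log n + 2 \gamma \log p \right)
```

Here n is the number of observations, p the number of candidate covariates, and |J| the number of nonzero coefficients. With gamma = 0 the criterion reduces to the ordinary BIC, and larger gamma penalizes each additional edge more heavily.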
References
1. Ravikumar, P., Wainwright, M. J. and Lafferty, J. D. (2010). High-dimensional Ising model selection using L1-regularized logistic regression. https://arxiv.org/pdf/1010.0311v1
2. Barber, R. F. and Drton, M. (2015). High-dimensional Ising model selection with Bayesian information criteria. https://arxiv.org/pdf/1403.3374v2
Examples
## Not run:
# simulate a dataset using IsingSampler
library(IsingSampler)
n <- 1e3
p <- 10
Theta <- matrix(sample(c(-0.5,0,0.5), replace = TRUE, size = p*p), nrow = p, ncol = p)
Theta <- Theta + t(Theta) # adjacency matrix must be symmetric
diag(Theta) <- 0
X <- unname(as.matrix(IsingSampler(n, graph = Theta, thresholds = 0, method = "direct") ))
m1 <- ising(X, symmetrize = "mean", gamma = 0.5, nlambda = 50)
# Visualize output using igraph
library(igraph)
ig <- graph_from_adjacency_matrix(m1$Theta, "undirected", weighted = TRUE, diag = FALSE)
plot.igraph(ig, vertex.color = "skyblue")
## End(Not run)
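Since the example simulates from a known Theta, the estimate could also be checked against the truth. The base R sketch below uses a hypothetical helper, `support_recovery_sketch`, that is not part of rIsing; it compares the nonzero pattern of the upper triangles only, so it assumes both matrices are symmetric:

```r
# Hypothetical helper, not part of rIsing: edge recovery rates for a symmetric estimate.
support_recovery_sketch <- function(theta_hat, theta_true, tol = 1e-8) {
  ut <- upper.tri(theta_true)           # each undirected edge counted once
  est <- abs(theta_hat[ut]) > tol       # edges present in the estimate
  tru <- theta_true[ut] != 0            # edges present in the truth
  c(true_positive_rate  = sum(est & tru)  / max(sum(tru), 1),
    false_positive_rate = sum(est & !tru) / max(sum(!tru), 1))
}

# Toy 3-node illustration (made-up matrices, not output of ising())
theta_true <- matrix(0, 3, 3); theta_true[1, 2] <- theta_true[2, 3] <- 0.5
theta_true <- theta_true + t(theta_true)
theta_hat <- matrix(0, 3, 3); theta_hat[1, 2] <- theta_hat[1, 3] <- 0.4
theta_hat <- theta_hat + t(theta_hat)
support_recovery_sketch(theta_hat, theta_true)
```

With the objects from the example above, the call would be `support_recovery_sketch(m1$Theta, Theta)`.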
L1-Regularized Logistic Regression
Description
L1-regularized logistic regression using OWL-QN L-BFGS-B optimization.
Usage
logreg(X, y, nlambda = 50, lambda.min.ratio = 0.001, lambda = NULL,
scale = TRUE, type = 2)
Arguments
X |
The design matrix. |
y |
Vector of binary observations of length equal to |
nlambda |
(positive integer) The number of regularization parameters (lambda values) in the regularization path (default 50). |
lambda.min.ratio |
(non-negative double) The ratio of |
lambda |
A user-supplied vector of regularization parameters. Under the default option ( |
scale |
(boolean) Whether to scale |
type |
(integer 1 or 2) Type 1 aggregates the input data based on repeated rows in |
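To give a feel for what aggregating on repeated rows can mean, here is a minimal base R sketch. The helper `aggregate_rows_sketch` is hypothetical and is not rIsing's implementation (which may well rely on data.table); it simply collapses identical rows of a design matrix into one row with a trial count and a success count:

```r
# Hypothetical helper, not part of rIsing: collapse duplicated rows of X.
aggregate_rows_sketch <- function(X, y) {
  key <- apply(X, 1, paste, collapse = "_")  # one string key per row
  first <- !duplicated(key)                  # first occurrence of each distinct row
  idx <- match(key, key[first])              # group index for every original row
  k <- sum(first)
  list(
    X = X[first, , drop = FALSE],                # distinct rows
    trials = tabulate(idx, nbins = k),           # how often each row occurs
    successes = as.vector(rowsum(y, idx))        # sum of y within each group
  )
}

# Toy binary data: the row (1, 0) appears three times
X <- rbind(c(1, 0), c(0, 1), c(1, 0), c(1, 0))
y <- c(1, 0, 0, 1)
aggregate_rows_sketch(X, y)
```

Aggregation of this kind tends to pay off when X is binary with few columns, since many rows then coincide and the likelihood can be evaluated on far fewer distinct rows.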
Value
A list containing the matrix of fitted weights (wmat), the vector of regularization parameters, sorted in decreasing order (lambda), and the vector of log-likelihoods corresponding to lambda (logliks).
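The returned log-likelihoods can be combined with the number of nonzero weights to score each lambda with an information criterion. The sketch below applies the extended BIC form of Barber and Drton; the helper `ebic_sketch` is hypothetical, the input numbers are made up for illustration, and it assumes that logliks holds total (not averaged) log-likelihoods, which this manual does not state:

```r
# Hypothetical helper, not part of rIsing: extended BIC along a regularization path.
# loglik: log-likelihood per lambda (assumed to be totals, not per-observation means)
# df:     number of nonzero coefficients per lambda (intercept excluded)
ebic_sketch <- function(loglik, df, n, p, gamma = 0.5) {
  -2 * loglik + df * (log(n) + 2 * gamma * log(p))
}

# Made-up path of three lambda values, for illustration only
loglik <- c(-700, -650, -640)
df <- c(0, 3, 10)
scores <- ebic_sketch(loglik, df, n = 1000, p = 100, gamma = 0.5)
which.min(scores)  # index of the lambda with the smallest criterion value
```

With a fitted object, df would have to be counted from wmat; whether lambdas index its rows or columns is not specified here, so that step is left out.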
Examples
# simulate some logistic regression data
n <- 1e3
p <- 100
X <- matrix(rnorm(n*p),n,p)
wt <- sample(seq(0,9),p+1,replace = TRUE) / 10
z <- cbind(1,X) %*% wt + rnorm(n)
probs <- 1 / (1 + exp(-z))
y <- rbinom(n, 1, probs) # one Bernoulli draw per observation
m1 <- logreg(X, y)
m2 <- logreg(X, y, nlambda = 100, lambda.min.ratio = 1e-4, type = 1)
## Not run:
# Performance comparison
library(glmnet)
library(microbenchmark)
nlambda = 50; lambda.min.ratio = 1e-3
microbenchmark(
logreg_type1 = logreg(X, y, nlambda = nlambda,
lambda.min.ratio = lambda.min.ratio, type = 1),
logreg_type2 = logreg(X, y, nlambda = nlambda,
lambda.min.ratio = lambda.min.ratio, type = 2),
glmnet = glmnet(X, y, family = "binomial",
nlambda = nlambda, lambda.min.ratio = lambda.min.ratio),
times = 20L
)
## End(Not run)