| Type: | Package |
| Title: | Sparse and Group Sparse Linear Models |
| Version: | 1.0-0 |
| Date: | 2026-06-05 |
| Description: | Fits the solution paths of classical sparse regression models with efficient active set algorithms by solving small sub-problems. Include LASSO, SCAD, MCP, (Sparse) Group-LASSO, Cooperative-LASSO, (Group) LAVA, (Generalized) Fused-Lasso and (Generalized) Elastic-Net. Also provides methods for model selection purpose (information criteria, cross-validation, stability selection). |
| URL: | https://github.com/jchiquet/quadrupen, https://jchiquet.github.io/quadrupen/ |
| BugReports: | https://github.com/jchiquet/quadrupen/issues |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| Language: | en-US |
| SystemRequirements: | GNU make |
| Depends: | R (≥ 4.1.0) |
| Imports: | dplyr, ggplot2, grid, Matrix, methods, parallel, R6, Rcpp, rlang, scales, tidyr, lifecycle |
| Suggests: | elasticnet, glmnet, igraph, knitr, lars, MASS, ncvreg, grpreg, rmarkdown, testthat |
| LinkingTo: | Rcpp, RcppArmadillo |
| Collate: | 'init.R' 'quadrupen-package.R' 'QuadrupenFit-R6Class.R' 'quadrupen-S3Methods.R' 'Regularizers-R6Class.R' 'DataModel-R6Class.R' 'StabilityPath-R6Class.R' 'CrossValidation-R6Class.R' 'InformationCriteria-R6Class.R' 'bounded-reg.R' 'fused-lasso.R' 'sparse_lm.R' 'sparse_lm_aliases.R' 'group_sparse_lm.R' 'group_sparse_lm_aliases.R' 'group-lava.R' 'lava.R' 'utils.R' 'ridge.R' 'RcppExports.R' |
| Packaged: | 2026-06-05 13:35:07 UTC; jchiquet |
| Config/roxygen2/version: | 8.0.0 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | yes |
| Author: | Julien Chiquet |
| Maintainer: | Julien Chiquet <julien.chiquet@inrae.fr> |
| Repository: | CRAN |
| Date/Publication: | 2026-06-05 14:40:02 UTC |
Sparsity by Worst-Case Quadratic Penalties
Description
Fits the solution paths of classical sparse regression models with efficient active set algorithms by solving small sub-problems. Depending on the penalty, the sub-problems can be solved exactly (i.e. for the LASSO) or with generic solvers. The available optimizer includes quadratic solvers, Newton-based approaches and generic FISTA or PGD algorithms. Also provides a few methods for model selection purpose (information criteria, cross-validation, stability selection).
Details
Quadrupen covers the following regularizers
LASSO (Least Absolute Shrinkage and Selection Operator)
SCAD (Smoothly Clip Absolute Deviation)
MCP (Minimax Concave Penalty)
Group-LASSO (L1/L2 or L1/Linfty)
Cooperative-LASSO
Sparse Group-LASSO and Sparse Cooperative-LASSO
Bounded Regression (L-infty norm).
For all these regularizers, Quadrupen offers the possibility to add an ridge-like "structured" penalty to embed some external knowledge about the statistical dependence between the features. This is sometimes referred to as the "Structured Elastic-Net".
We also provide in the package the implementation of the Generalized Fused-LASSO originally proposed by Holger Hoefling now archived from CRAN (original repo here).
While likely not as fast as highly specialized packages like glmnet, the use of a working set algorithm combined with efficient solvers, sparse matrix support when applicable, and templated C++ code makes it both competitive and versatile.
Features:
The more important functions of the package are sparse_lm() and group_sparse_lm() functions,
which fits a linear model either with element-wise or group-wise sparsity. The functions
lasso(), elastic_net(), scad(), mcp() and group_lasso(), group_l1linf(), coop_lasso(),
sparse_group_lasso(), sparse_coop_lasso() are only aliases for these two main functions.
The functions lava(), group_lava(), fused_lasso() and bounded_reg() are also available and specific.
We also included R6 and S3 methods for plotting, cross-validation and for the stability selection procedure of Meinshausen and Buhlmann (2010).
Algorithm:
The general strategy of the algorithm relies on maintaining an
active set of variables, starting from a vector of zeros. The
underlying optimization problem is solved only on the activated
variables, thus handling with small smooth problems with
increasing size. Hence, by considering a decreasing grid of values
for the penalty \lambda_1 and fixing
\lambda_2, we may explore the whole path of
solutions at a reasonable numerical cost, providing that
\lambda_1 does not end up too small.
For the \ell_1-based methods (available in the
elastic_net function), the size of the underlying problems
solved is related to the number of nonzero coefficients in the
vector of parameters. With the \ell_\infty-norm,
(available in the boundary.reg function), we do not produce
sparse estimator. Nevertheless, the size of the systems solved
along the path deals with the number of unbounded variables for
the current penalty level, which is quite smaller than the number
of predictors for a reasonable \lambda_1. The same
kind of proposal was made in Zhao, Rocha and Yu (2009).
Underlying optimization is performed by direct resolution of (quadratic) sub problems, which is the main purpose of this package. We also implemented the popular and versatile proximal (FISTA) approaches for routine checks and numerical comparisons. A Proximal Gradient Descent approach with Anderson acceleration is also included.
The default setting uses the most appropriate solver (quadratic or FISTA). The quadratic approach, which gave its name to the package, has been optimized to be the method of choice for small and medium scale problems, and produce very accurate solutions (in particular for Elastic-Net/Lasso, Group-Lasso and Bounded Regression). However, the first order methods (PGD and FISTA) remain competitive in particular in situations where the problem is close to singular, in which case the Cholesky/Eigen value decomposition used in the quadratic solver can be computationally unstable.
Author(s)
Julien Chiquet julien.chiquet@inrae.fr
References
Yves Grandvalet, Julien Chiquet and Christophe Ambroise, "Sparsity by Worst-case Quadratic Penalties", https://doi.org/10.48550/arXiv.1210.2077, 2012.
Fan, Jianqing, and Runze Li. “Variable selection via nonconcave penalized likelihood and its oracle properties.” JASA, 2001
Nicolas Meinshausen and Peter Buhlmann. Stability Selection, JRSS(B), 2010.
Hoefling, Holger. “A path algorithm for the fused lasso signal approximator.” JCGS, 2010.
Martin Slawski, Wolfgang zu Castell, and Gerhard Tutz. Feature selection guided by structural information, AOAS, 2010.
Zhang, C. H. Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 2010.
Yuan, Ming, and Yi Lin. “Model selection and estimation in regression with grouped variables.”, JRSS(B), 2006.
Simon, Noah, et al. “A sparse-group lasso.” JCGS, 2013
Peng Zhao, Guillerme Rocha and Bin Yu. The composite absolute penalties family for grouped and hierarchical variable selection, The Annals of Statistics, 2009.
Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net, JRSS(B), 2006.
Robert Tibshirani. Regression Shrinkage and Selection via the Lasso, JRSS(B), 1996.
See Also
Useful links:
Report bugs at https://github.com/jchiquet/quadrupen/issues
Class "BoundedRegression"
Description
Class of object returned by the fitting function bounded_reg(). Inherits fields
and methods of QuadrupenFit.
Super class
QuadrupenFit -> BoundedRegressionFit
Active bindings
penaltycharacter describing the regularizer/penalty
lambdainfvector of tuning parameters for the linf penalty
lambda2vector of tuning parameters for the l2 penalty
Methods
Public methods
Inherited methods
BoundedRegressionFit$new()
Initialize a BoundedRegressionFit model
Usage
BoundedRegressionFit$new(data, intercept, regParam)
Arguments
dataa
DataModelobjectintercepta logical; should an intercept be included in the mode?
regParama list with two elements, a vector and a scalar, for the regularization
BoundedRegressionFit$clone()
The objects of this class are cloneable with this method.
Usage
BoundedRegressionFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
Class CrossValidation
Description
Class of object returned by the QuadrupenFit$cross_validate() method or the
cross_validate() function. Owns print() and plot() methods.
Active bindings
dataa data frame containing the mean cross-validated error and its associated standard error for each values of
lambda1andlambda2.foldslist of
Kvectors indicating the folds used for cross-validation.lambda1vector of
\lambda_1(\ell_1or\ell_\inftypenalty levels) for which each cross-validation has been performed.lambda2vector (or scalar) of
\ell_2-penalty levels for which each cross-validation has been performed.lambda1_minlevel of
\lambda_1that minimizes the error estimated by cross-validation.lambda2_minlevel of
\lambda_2that minimizes the error estimated by cross-validation.lambda1_1selargest level of
\lambda_1such as the cross-validated error is within 1 standard error of the minimum.lambda2_1selargest level of
\lambda_2such that the cross-validated error is within 1 standard error of the minimum (only relevant for ridge regression).
Methods
Public methods
CrossValidation$new()
Constructor for a CrossValidation object
Should be called internally by an object QuadrupenFit$cross_validate()
Usage
CrossValidation$new(cv_error, folds)
Arguments
cv_errordata frame storing output of a cv job
foldslist of K folds used for cross-validation
CrossValidation$show()
User friendly print method
Usage
CrossValidation$show()
CrossValidation$print()
User friendly print method
Usage
CrossValidation$print()
CrossValidation$plotCV_1D()
Plot 1-dimensional cross-validation
Usage
CrossValidation$plotCV_1D( log_scale = TRUE, title = "Cross-validation error", se = TRUE )
Arguments
log_scalelogical, should a log-scale be used for the x-axis
titlegraph title
selogical, should confidence band be displayed (TRUE by default)
Returns
a ggplot object
CrossValidation$plotCV_2D()
Plot 2-dimensional cross-validation output (grid lambda1 x lambda2)
Usage
CrossValidation$plotCV_2D(title = "Cross-validation error")
Arguments
titlegraph title
Returns
a ggplot2 object
CrossValidation$plot()
Plot cross-validation job by choosing the most appropriate output (1D- or 2D)
Usage
CrossValidation$plot(log_scale = TRUE, title = "Cross-validation error")
Arguments
log_scalelogical, should a log-scale be used for the x-axis
titlegraph title
Returns
a ggplot object
CrossValidation$clone()
The objects of this class are cloneable with this method.
Usage
CrossValidation$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
Data Class
Description
Class for storing data and various fixed quantity
Public fields
Xmatrix of regressor
yvector of response
C_invInverse of the Cholesky decomposition of S
SSDP structuring matrix
wyvector of observation weights
Active bindings
dnumber of regressor
nsample size
sparse_encodinglogical indicating if the matrix of regressor is sparsely encoded
varnamescharacter, the names of the covariates/regressors
normxnorm of each column of X
Methods
Public methods
DataModel$new()
constructor for DataModel
Usage
DataModel$new( covariates, outcome, cov_struct, obs_weights = rep(1, length(outcome)), check_args = TRUE )
Arguments
covariatesmatrix of covariates/regressors
outcomevector of outcome/response
cov_structsdp matrix structuring the covariates/regressors
obs_weightsvector of observations weights
check_argslogical, should args be check at initialization?
cov_weightsvector of covariates/regressors weights
DataModel$CholStruct()
Compute Cholesky factorization of the Structuring matrix
Usage
DataModel$CholStruct()
DataModel$splitTrainTest()
a function splitting the data into train and test folds
Usage
DataModel$splitTrainTest( nfolds = 10, folds = split(sample(1:self$n), rep(1:nfolds, length = self$n)) )
Arguments
nfoldsthe number of folds
foldsa list of vectors describing the folds (optional)
Returns
a list with train and test data and id.
DataModel$splitSubSamples()
a function splitting data into subsamples
Usage
DataModel$splitSubSamples(
n_subsamples = 50,
subsample_size = floor(self$n/2),
subsamples = replicate(n_subsamples, sample(1:self$n, subsample_size), simplify =
FALSE),
weakness = 1
)
Arguments
n_subsamplesthe number of subsamples
subsample_sizethe subsample size
subsampleslist with vector of subsamples (optional)
weaknesscoefficient for randomly weighting the regressor, default to 1
Returns
a list of DataModel, resampling of the original
DataModel$clone()
The objects of this class are cloneable with this method.
Usage
DataModel$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
Class "FusedLassoFit"
Description
Class of object returned by the fitting function fused_lasso(). Inherits fields
and methods of QuadrupenFit.
Super class
QuadrupenFit -> FusedLassoFit
Active bindings
penaltycharacter describing the regularizer/penalty
lambda1vector of tuning parameters for the l1 penalty
lambda2vector of tuning parameters for the fusion penalty
Methods
Public methods
Inherited methods
FusedLassoFit$new()
Initialize a FusedLassoFit model
Usage
FusedLassoFit$new(data, intercept, regParam)
Arguments
dataa
DataModelobjectintercepta logical; should an intercept be included in the mode?
regParama list with two elements, a vector and a scalar, for the regularization
FusedLassoFit$clone()
The objects of this class are cloneable with this method.
Usage
FusedLassoFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
Class "GroupLavaFit"
Description
Class of object returned by the fitting function group_lava(). Inherits fields
and methods of QuadrupenFit and LavaFit
Super classes
QuadrupenFit -> LavaFit -> GroupLavaFit
Active bindings
lambda1vector of tuning parameters for the l1 group penalty (sparse component)
lambda2vector of tuning parameters for the l2 penalty (dense component)
penaltycharacter describing the regularizer/penalty
groupvector of integers indicating group belonging
typestring indicating whether the
\ell_1/\ell_2or the\ell_1/\ell_\inftygroup-Lasso must be fitted.
Methods
Public methods
Inherited methods
GroupLavaFit$new()
Initialize a GroupLavaFit model
Usage
GroupLavaFit$new(data, intercept, group, type, regParam)
Arguments
dataa
DataModelobjectintercepta logical; should an intercept be included in the mode?
groupvector of integers indicating group belonging.
typestring indicating whether the
\ell_1/\ell_2or the\ell_1/\ell_\inftygroup-Lasso must be fitted.regParama list with two elements, a vector and a scalar, for the regularization
GroupLavaFit$clone()
The objects of this class are cloneable with this method.
Usage
GroupLavaFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
Class InformationCriteria
Description
Class of object returned by the QuadrupenFit$criteria() method or the
criteria() function. Owns print() and plot() methods.
Active bindings
dataa data frame containing the values of various information criteria (AIC, BIC, lmBIC, eBIC, GCV) along the value of
lambda1.lambdavector of
\lambda_1(\ell_1or\ell_\inftypenalty levels) for which each cross-validation has been performed.namesa vector of characters storing the names of the precomputed criteria
Methods
Public methods
InformationCriteria$new()
Constructor for a InformationCriteria object
Should be called internally by an object QuadrupenFit$criteria()
Usage
InformationCriteria$new(value)
Arguments
valuedata frame storing output of
QuadrupenFit$criteria()
InformationCriteria$show()
User friendly print method
Usage
InformationCriteria$show()
InformationCriteria$print()
User friendly print method
Usage
InformationCriteria$print()
InformationCriteria$plot()
Plot the the desired criteria
Usage
InformationCriteria$plot(
criteria = self$names,
log_scale = TRUE,
xvar = c("lambda", "fraction", "df"),
title = "Information Criteria"
)
Arguments
criteriaa vector of character with the criteria to plot. The default plot all the criteria available (stored in the field
names)log_scalelogical; indicates if a log-scale should be used when
xvar="lambda". Default isTRUE.xvarvariable to plot on the X-axis: either
"df"(the estimated degrees of freedom),"lambda"(\lambda_1penalty level) or"fraction"(\ell_1-norm of the coefficients). Default is set to"lambda".titlegraph title
Returns
a ggplot2 object
InformationCriteria$clone()
The objects of this class are cloneable with this method.
Usage
InformationCriteria$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
Class "LavaFit"
Description
Class of object returned by the fitting function lava(). Inherits fields
and methods of QuadrupenFit
Super class
QuadrupenFit -> LavaFit
Active bindings
penaltycharacter describing the regularizer/penalty
lambda1vector of tuning parameters for the l1 penalty (sparse component)
lambda2vector of tuning parameters for the l2 penalty (dense component)
sparse_coefsparse part of the decomposition of the coefficients
dense_coefdense part of the decomposition of the coefficients
debiaslogical, should we rely on the debias coefficient of the regularizer (if available) or not
Methods
Public methods
Inherited methods
LavaFit$new()
Initialize a LavaFit model
Usage
LavaFit$new(data, intercept, regParam)
Arguments
dataa
DataModelobjectintercepta logical; should an intercept be included in the mode?
regParama list with two elements, a vector and a scalar, for the regularization
LavaFit$fit()
function performing the optimization
Usage
LavaFit$fit(control)
Arguments
controllist controlling the optimization process Plot method for lava regularization path
LavaFit$plot_path()
Produce a plot of the solution path of a LavaFit object.
Usage
LavaFit$plot_path(
xvar = c("lambda", "fraction", "df"),
log_scale = TRUE,
component = "both",
title = paste("Lava path:", component, "component(s)"),
standardize = TRUE,
labels = NULL
)
Arguments
xvarvariable to plot on the X-axis: either
"lambda"(\ell_1penalty level, or\ell_2for ridge and\ell_\infty) or"fraction"(\ell_1-norm of the coefficients) ordffor estimated degrees of freedom. Default is set to"lambda".log_scalelogical; indicates if a log-scale should be used when
xvar="lambda". Default isTRUE.componenta character indicating the component to plot: both (sum of sparse and dense), sparse or dense. Default to both.
titlethe title. Default is set to the model name followed by what is on the Y-axis.
standardizelogical; standardize the coefficients before plotting (with the norm of the predictor). Default is
TRUE.labelsvector indicating the names associated to the plotted variables. When specified, a legend is drawn in order to identify each variable. Only relevant when the number of predictor is small. Remind that the intercept does not count. Default is
NULL.
Returns
a ggplot2 object .
LavaFit$clone()
The objects of this class are cloneable with this method.
Usage
LavaFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
Class "QuadrupenFit"
Description
Class of object returned by any fitting function of the
quadrupen package (elastic_net or
bounded_reg).
This class comes with the usual predict(), fitted(), coef(),
residuals(), show(), print() and deviance() S3 methods.
Specific R6 methods are available for model extraction QuadrupenFit$get_model(),
cross validation QuadrupenFit$cross_validate(), stability selection
QuadrupenFit$stability_path(), criteria derivation QuadrupenFit$criteria()
and plotting QuadrupenFit$plot(). They come with equivalent S3 methods : cross_validate(),
stability() and plot().
The "path"plot is available as soon as a fit has been performed.
For the others, the appropriate post-treatments must have been made via the
methods QuadrupenFit$criteria(), QuadrupenFit$cross_validate() or
QuadrupenFit$stability()
All plots functions are given with the default arguments, except for labels and log_scale.
If you need more control, please use the dedicated methods: QuadrupenFit$plot_path(),
InformationCriteria$plot(), CrossValidation$plot(),
StabilityPath$plot() or the corresponding S3 methods.
Plot method for regularization path
Active bindings
nvarnumber of coefficient (without intercept)
nobssample size
dataModelan object with class
DataModelstoring the datamajor_tuningvector of "leading" tuning parameters (either l1, linf or l2)
minor_tuningvector of "minor" tuning parameters (either l1 or l2)
is_l2_regularizedBoolean indicating if l2 regularization is applied
optim_monitoringlist monitoring the optimization
optim_configlist with low level options used for optimization.
fittedMatrix of fitted values, each column corresponding to a value of
lambda1.coefficientsMatrix (class
"dgCMatrix") of coefficients with respect to the original input. The number of rows corresponds the length oflambda1.interceptA vector containing the successive values of the (unpenalized) intercept. Equals to zero if
intercepthas been set toFALSE.debiaslogical, should we rely on the debias coefficient of the regularizer (if available) or not
residualsMatrix of residuals, each column corresponding to a value of
lambda1.deviancethe model deviance
degrees_freedomEstimated degree of freedoms for the successive
lambda1.r_squaredvector giving the coefficient of determination as a function of lambda1.
information_criteriaobject with class
InformationCriteriastoring various information criteria (AIC, BIC, GCV, etc) for the current fit.cross_validationobject with class
CrossValidationstoring output of CV job. Only available once method cross_validate has been called.stability_pathobject with class
StabilityPathstoring output of stability selection. Only available once method $stability has been called.
Methods
Public methods
QuadrupenFit$new()
Initialize a QuadrupenFit model
Usage
QuadrupenFit$new(data, intercept, regParam)
Arguments
dataa DataModel object
intercepta logical; should an intercept be included in the mode?
regParama list with two elements, a vector and a scalar, for the regularization
QuadrupenFit$show()
User friendly print method
Usage
QuadrupenFit$show()
QuadrupenFit$print()
User friendly print method
Usage
QuadrupenFit$print()
QuadrupenFit$fit()
function performing the optimization
Usage
QuadrupenFit$fit(control)
Arguments
controllist controlling the optimization process
QuadrupenFit$get_model()
Model extraction
Usage
QuadrupenFit$get_model(selection, type = c("coefficients", "penalty", "index"))
Arguments
selectioneither a character (model selection criteria) of a scalar (lambda value)
typecharacter for the desired output
Returns
either a vector of coefficients, a scalar or the model index
QuadrupenFit$predict()
Predict response for new sample based on the current model
Usage
QuadrupenFit$predict(newx = NULL, selection = NULL)
Arguments
newxmatrix of new values for the regressor with which to predict. If omitted, the fitted values are used.
selectioneither a character (model selection criteria) of a scalar (lambda value)
Returns
a vector of predicted value Cross-validation for Quadrupen object
QuadrupenFit$cross_validate()
Function that computes K-fold cross-validated error of a
quadrupen fit, possibly on a grid of lambda1, lambda2.
Usage
QuadrupenFit$cross_validate( K = 10, folds = split(sample(1:self$nobs), rep(1:K, length = self$nobs)), lambda2 = self$minor_tuning, verbose = TRUE, cores = 1 )
Arguments
Kinteger indicating the number of folds. Default is 10.
foldslist of
Kvectors that describes the folds to use for the cross-validation. By default, the folds are randomly sampled with the specified K. The same folds are used for each values oflambda2.lambda2tunes the
\ell_2-penalty (ridge-like) of the fit. If none is provided, a vector of values is generated and a CV is performed on a grid oflambda2andlambda1, using the same folds for eachlambda2.verboselogical; indicates if the progression (the current
lambda2should be displayed. Default isTRUE.coresthe number of cores to use. The default uses 1 core (safer in case your BLAS/LAPACK libraries are multithreaded)
Returns
an object with class CrossValidation is sent back and stored as a field of the original QuadrupenFit object.
QuadrupenFit$stability()
Compute the stability path of a (possibly randomized) fitting procedure as introduced by Meinshausen and Buhlmann (2010).
Usage
QuadrupenFit$stability(
n_subsamples = 50,
subsample_size = floor(self$nobs/2),
subsamples = replicate(n_subsamples, sample(1:self$nobs, subsample_size), simplify =
FALSE),
weakness = 1,
verbose = TRUE,
cores = 1
)
Arguments
n_subsamplesinteger indicating the number of subsamplings used to estimate the selection probabilities. Default is 100.
subsample_sizeinteger indicating the size of each subsamples. Default is
floor(n/2).subsampleslist with
subsamplesentries with vectors describing the folds to use for the stability procedure. By default, the folds are randomly sampled with the specifiedn_subsamplesandsubsample_sizeargument.weaknessCoefficient used for randomizing the weights of each features. Default is 1' for no randomization. See details below.
verboselogical; indicates if the progression should be displayed. Default is
TRUE.coresthe number of cores to use. The default uses 1 core (safer in case your BLAS/LAPACK libraries are multithreaded)
Returns
an object with class StabilityPath is sent back and stored as a field of the original QuadrupenFit object.
QuadrupenFit$criteria()
Produce a plot or send back the values of some penalized criteria
accompanied with the vector(s) of parameters selected
accordingly. The default behavior plots the BIC and the AIC (with
respective factor \log(n) and 2) yet the user can specify any
penalty.
Usage
QuadrupenFit$criteria(
penalty = setNames(c(2, log(self$nobs), log(self$nvar), log(self$nobs) + 2 *
log(self$nvar)), c("AIC", "BIC", "mBIC", "eBIC")),
sigma = NULL
)
Arguments
penaltya vector with as many penalties a desired. The default contains the penalty corresponding to the AIC and the BIC (
2and\log(n)). Setting the "names" attribute, as done in the default definition, leads to outputs which are easier to read.sigmascalar: an estimate of the residual variance. When available, it is plugged-in the criteria, which may be more relevant. If
NULL(the default), it is estimated as usual (see details).
Returns
an object with class InformationCriteria is sent back and stored as a field of the original QuadrupenFit object.
QuadrupenFit$plot()
Plot method for QuadrupenFit
Usage
QuadrupenFit$plot(
type = c("path", "criteria", "crossval", "stability"),
log_scale = TRUE,
labels = NULL
)
Arguments
typethe type of plot, either
"path"for regularization path;"criteria"for BIC-like information criteria ;"crossval"for cross-validation plot ; and"stability"for stability path.log_scalelogical; indicates if a log-scale should be used when
xvar="lambda". Default isTRUE.labelsvector indicating the names associated to the plotted variables. When specified, a legend is drawn in order to identify each variable. Only relevant when the number of predictor is small. Remind that the intercept does not count. Default is
NULL.
QuadrupenFit$plot_path()
Produce a plot of the solution path of a QuadrupenFit object.
Usage
QuadrupenFit$plot_path(
xvar = c("lambda", "fraction", "df"),
log_scale = TRUE,
title = paste("Path for", self$penalty),
standardize = TRUE,
labels = NULL
)
Arguments
xvarvariable to plot on the X-axis: either
"lambda"(\ell_1penalty level, or\ell_2for ridge and\ell_\infty) or"fraction"(\ell_1-norm of the coefficients) ordffor estimated degrees of freedom. Default is set to"lambda".log_scalelogical; indicates if a log-scale should be used when
xvar="lambda". Default isTRUE.titlethe title. Default is set to the model name followed by what is on the Y-axis.
standardizelogical; standardize the coefficients before plotting (with the norm of the predictor). Default is
TRUE.labelsvector indicating the names associated to the plotted variables. When specified, a legend is drawn in order to identify each variable. Only relevant when the number of predictor is small. Remind that the intercept does not count. Default is
NULL.
Returns
a ggplot2 object .
Examples
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) diag(Sigma) <- 1 n <- 50 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Plot the Lasso path plot(lasso(x,y), title="Lasso solution path") ## Plot the Elastic-net path plot(elastic_net(x,y), title = "Elastic-net solution path") ## Plot the Elastic-net path (fraction on X-axis, unstandardized coefficient) plot(elastic_net(x,y, lambda2=10), standardize=FALSE, xvar="fraction") ## Plot the Bounded regression path (fraction on X-axis) plot(bounded_reg(x,y, lambda2=10), xvar="fraction")
QuadrupenFit$clone()
The objects of this class are cloneable with this method.
Usage
QuadrupenFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
See also InformationCriteria, CrossValidation and
StabilityPath
cross_validate()
Stability selection for Quadrupen object
stability()
Penalized criteria based on estimation of degrees of freedom
Examples
## ------------------------------------------------
## Method `QuadrupenFit$plot_path()`
## ------------------------------------------------
## Not run:
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Plot the Lasso path
plot(lasso(x,y), title="Lasso solution path")
## Plot the Elastic-net path
plot(elastic_net(x,y), title = "Elastic-net solution path")
## Plot the Elastic-net path (fraction on X-axis, unstandardized coefficient)
plot(elastic_net(x,y, lambda2=10), standardize=FALSE, xvar="fraction")
## Plot the Bounded regression path (fraction on X-axis)
plot(bounded_reg(x,y, lambda2=10), xvar="fraction")
## End(Not run)
Class "RidgeRegressionFit"
Description
Class of object returned by the fitting function ridge(). Inherits fields
and methods of QuadrupenFit.
Super class
QuadrupenFit -> RidgeRegressionFit
Active bindings
penaltycharacter describing the regularizer/penalty
lambda2vector of tuning parameters for the l2 penalty
Methods
Public methods
Inherited methods
RidgeRegressionFit$new()
Initialize a RidgeRegressionFit model
Usage
RidgeRegressionFit$new(data, intercept, regParam)
Arguments
dataa
DataModelobjectintercepta logical; should an intercept be included in the mode?
regParama list with two elements, a vector and a scalar, for the regularization
RidgeRegressionFit$clone()
The objects of this class are cloneable with this method.
Usage
RidgeRegressionFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
Class "SparseFit"
Description
Class of object returned by the fitting function elastic_net(). Inherits fields
and methods of QuadrupenFit
Super class
QuadrupenFit -> SparseFit
Active bindings
lambda1vector of tuning parameters for the l1 penalty
lambda2vector of tuning parameters for the l2 penalty
penaltycharacter describing the regularizer/penalty
typestring the type of group-wise regularization applied
unbiasing_tuningunbiasing coefficient of the MCP or SCAD penalties
Methods
Public methods
Inherited methods
SparseFit$new()
Initialize a SparseFit model
Usage
SparseFit$new(data, intercept, type, regParam)
Arguments
dataa
DataModelobjectintercepta logical; should an intercept be included in the mode?
typestring the type of group-wise regularization applied
regParama list with two elements, a vector and a scalar, for the regularization
SparseFit$clone()
The objects of this class are cloneable with this method.
Usage
SparseFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
Class "SparseGroupFit"
Description
Class of object returned by the fitting function group_sparse_lm(). Inherits fields
and methods of QuadrupenFit
Super class
QuadrupenFit -> SparseGroupFit
Active bindings
lambda1vector of tuning parameters for the l1 group penalty
lambda2vector of tuning parameters for the l2 penalty
alphamixing parameter of the sparse group-penalty
penaltycharacter describing the regularizer/penalty
groupvector of integers indicating group belonging
typestring the type of group-wise regularization applied
mixture_tuningmixture coefficient of the sparse group penalty
is_group_sparseboolean indicating if sparse group or group penalty is applied
Methods
Public methods
Inherited methods
SparseGroupFit$new()
Initialize a SparseGroupFit model
Usage
SparseGroupFit$new(data, intercept, group, type, regParam)
Arguments
dataa
DataModelobjectintercepta logical; should an intercept be included in the mode?
groupvector of integers indicating group belonging.
typestring indicating whether the
\ell_1/\ell_2or the\ell_1/\ell_\inftygroup-Lasso must be fitted.regParama list with two elements, a vector and a scalar, for the regularization
SparseGroupFit$clone()
The objects of this class are cloneable with this method.
Usage
SparseGroupFit$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
See Also
QuadrupenFit, group_sparse_lm()
Class StabilityPath
Description
Class of object returned by the QuadrupenFit$cross_validate() method or the
cross_validate() function. Owns print() and plot() methods.
Public fields
probabilitiesa
Matrixobject containing the estimated probabilities of selection along the path of solutions.regParama list with the levels of the regularizing parameters used
subsamplesa list that contains the folds used for each subsample.
Active bindings
nvarnumber of variables (without intercept)
nobsnumber of observation/sample size
nonzerovariables with a non-null probability of selection along the stability path
nonzeroprobsubset of the probabilities stability path on the nonzero variables
Methods
Public methods
StabilityPath$new()
Constructor for a StabilityPath object
Should be called internally by an object QuadrupenFit$stability()
Usage
StabilityPath$new(probabilities, regParam, subsamples)
Arguments
probabilitiesa
Matrixobject containing the estimated probabilities of selection along the path of solutions.regParama list with the levels of the regularizing parameters used
subsamplesa list that contains the folds used for each subsample.
StabilityPath$show()
User friendly print method
Usage
StabilityPath$show()
StabilityPath$print()
User friendly print method
Usage
StabilityPath$print()
StabilityPath$selection()
Perform variable selection based on the stability path
Usage
StabilityPath$selection(
sel_mode = c("rank", "PFER"),
cutoff = 0.75,
PFER = 2,
nvarsel = floor(self$nobs/log(self$nvar))
)
Arguments
sel_modea character string, either
"rank"or"PFER". In the first case, the selection is based on the rank of total probabilities by variables along the path: the firstnvarselvariables are selected (see below). In the second case, the PFER control is used as described in Meinshausen and Buhlmannn's paper. Default is"rank".cutoffvalue of the cutoff probability (only relevant when
sel_modeequals"PFER").PFERvalue of the per-family error rate to control (only relevant when
sel_modeequals"PFER").nvarselnumber of variables selected (only relevant when
sel_modeequals"rank". Default isfloor(n/log(p)).
StabilityPath$plot()
Produce a plot of the stability path obtained by stability selection.
Usage
StabilityPath$plot(
xvar = "lambda",
title = "Stability path",
labels = rep("unknown status", self$nvar),
sel_mode = c("rank", "PFER"),
cutoff = 0.75,
PFER = 2,
nvarsel = min(self$nvar, floor(self$nobs/log(self$nvar)))
)
Arguments
xvarvariable to plot on the X-axis: either
"lambda"(first penalty level) or"fraction"(fraction of the penalty level applied tune by\lambda_1). Default is"lambda".titletitle title. If none given, a somewhat appropriate title is automatically generated.
labelsan optional vector of labels for each variable in the path (e.g., 'relevant'/'irrelevant'). See examples.
sel_modea character string, either
"rank"or"PFER". In the first case, the selection is based on the rank of total probabilities by variables along the path: the firstnvarselvariables are selected (see below). In the second case, the PFER control is used as described in Meinshausen and Buhlmannn's paper. Default is"rank".cutoffvalue of the cutoff probability (only relevant when
sel_modeequals"PFER").PFERvalue of the per-family error rate to control (only relevant when
sel_modeequals"PFER").nvarselnumber of variables selected (only relevant when
sel_modeequals"rank". Default isfloor(n/log(p)).plotlogical; indicates if the graph should be plotted. Default is
TRUE. IfFALSE, only the ggplot2 object is sent back.
Returns
a list with a ggplot2 object which can be plotted
via the print method, and a vector of selected variables
corresponding to method of choice ("rank" or
"PFER").
Examples
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
Soo <- matrix(0.75,25,25) ## bloc correlation between zero variables
Sww <- matrix(0.75,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo) + 0.2
diag(Sigma) <- 1
n <- 100
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Build a vector of label for true nonzeros
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- c("relevant")
labels <- factor(labels, ordered=TRUE, levels=c("relevant","irrelevant"))
enet <- elastic_net(x, y, lambda2 = 10, struct = solve(Sigma), minratio = 1e-2)
stab <- stability(enet, n_subsamples = 200)
## Build the plot an recover the selected variable
plot(stab, labels=labels)
cat("\nFalse positives for the randomized Elastic-net with stability selection: ",
sum(labels[stab$selection()] != "relevant"))
cat("\nDONE.\n")
StabilityPath$clone()
The objects of this class are cloneable with this method.
Usage
StabilityPath$clone(deep = FALSE)
Arguments
deepWhether to make a deep clone.
Examples
## ------------------------------------------------
## Method `StabilityPath$plot()`
## ------------------------------------------------
## Not run:
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
Soo <- matrix(0.75,25,25) ## bloc correlation between zero variables
Sww <- matrix(0.75,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo) + 0.2
diag(Sigma) <- 1
n <- 100
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Build a vector of label for true nonzeros
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- c("relevant")
labels <- factor(labels, ordered=TRUE, levels=c("relevant","irrelevant"))
enet <- elastic_net(x, y, lambda2 = 10, struct = solve(Sigma), minratio = 1e-2)
stab <- stability(enet, n_subsamples = 200)
## Build the plot an recover the selected variable
plot(stab, labels=labels)
cat("\nFalse positives for the randomized Elastic-net with stability selection: ",
sum(labels[stab$selection()] != "relevant"))
cat("\nDONE.\n")
## End(Not run)
Fit a linear model with infinity-norm plus ridge-like regularization
Description
Adjust a linear model penalized by a (possibly
weighted) \ell_\infty-norm (bounding the
magnitude of the parameters) and a (possibly structured)
\ell_2-norm (ridge-like). The solution path is computed
at a grid of values for the infinity-penalty, fixing the amount of
\ell_2 regularization. See details for the criterion
optimized.
Usage
bounded_reg(
x,
y,
lambda1 = NULL,
lambda2 = 0.01,
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
control = list()
)
bounded.reg(
x,
y,
lambda1 = NULL,
lambda2 = 0.01,
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
control = list()
)
Arguments
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized is
|
y |
response vector. |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
penscale |
vector with real positive values that weight the penalty of each feature. Default sets all weights to 1. |
struct |
matrix structuring the coefficients, possibly
sparsely encoded. Must be at least positive semidefinite (this is
checked internally). If |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
nlambda1 |
integer that indicates the number of values to put
in the |
minratio |
minimal value of |
maxfeat |
integer; limits the number of features ever to
enter the model; i.e., non-zero coefficients for the Elastic-net:
the algorithm stops if this number is exceeded and |
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
Details
The optimized criterion is the following:
βhat
λ1,λ2 =
argminβ 1/2 RSS(β) + λ1
| D β |∞ + λ/2 2
βT S β,
where
D is a diagonal matrix, whose diagonal terms are provided
as a vector by the penscale argument. The \ell_2
structuring matrix S is provided via the struct
argument, a positive semidefinite matrix (possibly of class
Matrix).
Note that the quadratic algorithm for the bounded regression may
become unstable along the path because of singularity of the
underlying problem, e.g. when there are too much correlation or
when the size of the problem is close to or smaller than the
sample size. In such cases, it might be a good idea to switch to
the proximal solver, slower yet more robust. This is the strategy
automatically adopted in code, that will send a warning in verbose mode
while switching the method to 'fista' and keep on
optimizing on the remainder of the path.
Singularity of the system can also be avoided with a larger
\ell_2-regularization, via lambda2, or a
"not-too-small" \ell_\infty regularization, via
a larger 'minratio' argument.
Value
an object with class QuadrupenFit.
an object with class BoundedRegressionFit, inheriting from QuadrupenFit.
Examples
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Infinity norm without/with an additional l2 regularization term
## and with structuring prior
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- "relevant"
plot(bounded_reg(x,y,lambda2=0) , label=labels) ## a mess
plot(bounded_reg(x,y,lambda2=10), label=labels) ## good guys are at the boundaries
plot(bounded_reg(x,y,lambda2=10,struct=solve(Sigma)), label=labels) ## even better
Extract model coefficients
Description
Extracts model coefficients from a QuadrupenFit object
Usage
## S3 method for class 'QuadrupenFit'
coef(object, selection = NULL, ...)
Arguments
object |
a QuadrupenFit object |
selection |
either a character (model selection criteria) of a scalar (lambda value) |
... |
not used, only here for S3 compatibility |
Value
a vector of coefficients
Penalized criteria based on estimation of degrees of freedom
Description
Produce a plot or send back the values of some penalized criteria
accompanied with the vector(s) of parameters selected
accordingly. The default behavior plots the BIC and the AIC (with
respective factor \log(n) and 2) yet the user can specify any
penalty.
Usage
criteria(
object,
penalty = setNames(c(2, log(object$nobs), log(object$nvar), log(object$nobs) + 2 *
log(object$nvar)), c("AIC", "BIC", "mBIC", "eBIC")),
sigma = NULL
)
## S3 method for class 'QuadrupenFit'
criteria(
object,
penalty = setNames(c(2, log(object$nobs), log(object$nvar), log(object$nobs) + 2 *
log(object$nvar)), c("AIC", "BIC", "mBIC", "eBIC")),
sigma = NULL
)
Arguments
object |
output of a fitting procedure of the quadrupen
package (e.g. |
penalty |
a vector with as many penalties a desired. The
default contains the penalty corresponding to the AIC and the BIC
( |
sigma |
scalar: an estimate of the residual variance. When
available, it is plugged-in the criteria, which may be more
relevant. If |
Value
an object with class InformationCriteria is sent back and stored as a field of the original QuadrupenFit object.
Methods (by class)
-
criteria(QuadrupenFit): S3 method for information criteria of a QuadrupenFit
Note
When sigma is provided, the criterion takes the form
When it is unknown, it writes
n*log(RSS) + penalty * dfEstimation of the degrees of freedom (for the elastic-net, the LASSO and also bounded regression) are computed by applying and adapting the results of Tibshirani and Taylor (see references below).
References
Ryan Tibshirani and Jonathan Taylor. Degrees of freedom in lasso problems, Annals of Statistics, 40(2) 2012.
Examples
## Not run:
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Plot penalized criteria for the Elastic-net path
criteria(elastic_net(x,y, lambda2=1))
#' Plot penalized criteria for the Bounded regression
criteria(bounded_reg(x,y, lambda2=1))
## End(Not run)
Cross-validation for Quadrupen object
Description
Function that computes K-fold cross-validated error of a
quadrupen fit, possibly on a grid of lambda1, lambda2.
Usage
cross_validate(
object,
K = 10,
folds = split(sample(1:object$nobs), rep(1:K, length = object$nobs)),
lambda2 = object$minor_tuning,
verbose = TRUE,
cores = 1
)
## S3 method for class 'QuadrupenFit'
cross_validate(
object,
K = 10,
folds = split(sample(1:object$nobs), rep(1:K, length = object$nobs)),
lambda2 = object$minor_tuning,
verbose = TRUE,
cores = parallel::detectCores() - 2
)
Arguments
object |
an R6 object with class QuadrupenFit |
K |
integer indicating the number of folds. Default is 10. |
folds |
list of |
lambda2 |
tunes the |
verbose |
logical; indicates if the progression (the current
|
cores |
the number of cores to use. The default uses 1 core (safer in case your BLAS/LAPACK libraries are multithreaded) |
Value
an object with class CrossValidation is sent back and stored as a
field of the original QuadrupenFit object.
Methods (by class)
-
cross_validate(QuadrupenFit): S3 method for cross-validation of a QuadrupenFit
Note
If the user runs the fitting method with option
'bulletproof' set to FALSE, the algorithm may stop
at an early stage of the path. Early stops are handled internally,
in order to provide results on the same grid of penalty tuned by
\lambda_1. This is done by means of NA
values, so as mean and standard error are consistently
evaluated. If, while cross-validating, the procedure experiences
too much early stops, a warning is sent to the user, in which
case you should reconsider the grid of lambda1 used for the
cross-validation. If bulletproof is TRUE (the
default), there is nothing to worry about, except a possible slow
down when any switching to the proximal algorithm is required.
Examples
## Not run:
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variable
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1
diag(Sigma) <- 1
n <- 100
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
enet <- elastic_net(x, y, nlambda1=50)
## Use fewer lambda1 values by overwritting the default parameters
## and cross-validate over the sequences lambda1 and lambda2
cv.grid <- cross_validate(enet, lambda2=10^seq(2,-2,len=50))
## Rerun simple cross-validation with the appropriate lambda2
cv.10K <- crossval(x,y, lambda2=cv.grid$lambda2_min)
## Try leave one out also
cv.loo <- crossval(x,y, K=n, lambda2=cv.grid$lambda2_min)
plot(cv.grid)
plot(cv.10K)
plot(cv.loo)
## Performance for selection purpose
cat("\nFalse positives with the minimal 10-CV choice: ", sum(sign(beta) != sign(cv.10K$beta_min )))
cat("\nFalse positives with the minimal LOO-CV choice: ", sum(sign(beta) != sign(cv.loo$beta_min)))
## End(Not run)
Extract model deviance
Description
Extracts the deviance of a QuadrupenFit object
Usage
## S3 method for class 'QuadrupenFit'
deviance(object, ...)
Arguments
object |
a QuadrupenFit object |
... |
not used, only here for S3 compatibility |
Value
a scalar
Extracts model fitted values
Description
Extracts model fitted values
Usage
## S3 method for class 'QuadrupenFit'
fitted(object, ...)
Arguments
object |
a QuadrupenFit object |
... |
not used, only here for S3 compatibility |
Value
A matrix of fitted values extracted from object.
A function for fitting generalized fused-Lasso problems
Description
This function fits the standard version of the fused lasso.
It can take a general matrix x and allows for possible weights on the
lambda1 and lambda2 penalties.
Usage
fused_lasso(
x,
y,
lambda1 = NULL,
lambda2 = 1,
pen_fused = c("L1", "L2", "Huber"),
penscale = rep(1, ncol(x)),
struct = NULL,
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 50, length(lambda1)),
minratio = 0.01,
maxfeat = ifelse(lambda2 < 1, min(nrow(x), ncol(x)), min(2 * nrow(x), ncol(x))),
beta0 = rep(0, ncol(x)),
control = list()
)
Arguments
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized is
|
y |
response vector. |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
pen_fused |
penalty used for fusing the variables (either L1, L2 or Huber). Default is L1 |
penscale |
vector with real positive values that weight the penalty of each feature. Default sets all weights to 1. |
struct |
Description of the graph that corresponds to the |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
nlambda1 |
integer that indicates the number of values to put
in the |
minratio |
minimal value of |
maxfeat |
integer; limits the number of features ever to
enter the model; i.e., non-zero coefficients for the Elastic-net:
the algorithm stops if this number is exceeded and |
beta0 |
a starting point for the vector of parameter. By default, will initialized zero. May save time in some situation. |
control |
list of argument controlling low level options of the algorithm:
|
Details
The optimized criterion is the following:
βhat
λ1,λ2 =
argminβ 1/2 RSS(β) + λ1
| D β |1 + λ/2 2
βT S β,
where
D is a diagonal matrix, whose diagonal terms are provided
as a vector by the penscale argument. The \ell_1 fusion penalty
is structured by a possibly weighted graph G provided via the struct
argument, as a symmetric (undirected) adjacency matrix.
Value
an object with class FusedLassoFit, inheriting from QuadrupenFit.
Author(s)
Original code by Holger Hoefling, refactoring by Julien Chiquet
Examples
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
res <- fused_lasso(x, y, lambda2=5)
G <- igraph::make_ring(ncol(x)) |> igraph::as_adjacency_matrix(sparse = FALSE)
resG <- fused_lasso(x, y, lambda2=5, struct = G)
plot(res)
plot(resG)
Fit a linear model with group-lava regularization
Description
Adjust a the group-lava regularized linear models, that is a lava transformation
of the data plus a mixture of either a (possibly weighted)
\ell_1/\ell_2- or
\ell_1/\ell_\infty-norm, and a (possibly
structured) \ell_2-norm (ridge-like). The solution path
is computed at a grid of values for the
\ell_1/\ell_q-penalty. See details for the criterion
optimized.
Usage
group_lava(
x,
y,
group,
type = c("l2", "coop", "linf"),
lambda1 = NULL,
lambda2 = 1,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
beta0 = numeric(ncol(x)),
control = list()
)
Arguments
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized is
|
y |
response vector. |
group |
vector of integers indicating group belonging. Must
match the number of column in |
type |
string indicating whether the |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
weights |
vector with real positive values that weight the observations (like in weighted least square). Default sets all weights to 1. |
penscale |
vector with real positive values that weight the penalty of each feature. Default sets all weights to 1. |
struct |
matrix structuring the coefficients, possibly
sparsely encoded. Must be at least positive semidefinite (this is
checked internally). If |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
refit |
logical: indicates if the non null coefficients should be refit to avoid excessive bias. Default is FALSE. Can be changed later (both raw and refit coefficients are stored). |
nlambda1 |
integer that indicates the number of values to put
in the |
minratio |
minimal value of |
maxfeat |
integer; limits the number of features ever to
enter the model; i.e., non-zero coefficients for the Elastic-net:
the algorithm stops if this number is exceeded and |
beta0 |
a starting point for the vector of parameter. By default, will initialized zero. May save time in some situation. |
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
Details
The optimized criterion is the following: θhat λ1,λ2 = argminθ = β+δ 1/2n RSS(β + δ) + λ1 Ωg(β) + λ2/2 δT S δ,
where \Omega_g is the group-wise mixed norm:
\ell_1/\ell_2 (Group-LAVA) or
\ell_1/\ell_\infty, controlled by the type argument.
The \ell_2 structuring matrix S is provided via struct.
Value
an object with class GroupLavaFit, inheriting from QuadrupenFit.
References
Chernozhukov, Victor, Christian Hansen, and Yuan Liao. "A lava attack on the recovery of sums of dense and sparse signals." The Annals of Statistics (2017): 39-76. https://doi.org/10.1214/16-AOS1434
Examples
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
delta <- runif(sum(c(25,10,25,10,25)),-.1,.1)
grp <- rep(1:5, c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- "relevant"
## Not run:
## Standard Group-Lasso
plot(group_lava(x,y,grp), label=labels)
plot(group_lava(x,y,grp, lambda2=.5), label=labels)
plot(group_lava(x,y,grp, lambda2=10), label=labels)
plot(group_lava(x,y,grp, lambda2=10,struct=solve(Sigma)), label=labels)
## L1/LINF Group-Lasso
plot(group_lava(x, y, grp, type = "linf"), label=labels)
plot(group_lava(x, y, grp, type = "linf", lambda2=.5), label=labels)
plot(group_lava(x, y, grp, type = "linf", lambda2=10), label=labels)
plot(group_lava(x, y, grp, type = "linf", lambda2=10,struct=solve(Sigma)), label=labels)
## Cooperative-Lasso
plot(group_lava(x, y, grp, type = "coop"), label=labels)
plot(group_lava(x, y, grp, type = "coop", lambda2=.5), label=labels)
plot(group_lava(x, y, grp, type = "coop", lambda2=10), label=labels)
plot(group_lava(x, y, grp, type = "coop", lambda2=10,struct=solve(Sigma)), label=labels)
## End(Not run)
Fit a linear model with (sparse) group regularisation
Description
Adjust a linear model with (sparse) group regularization, that is, a
mixture of an element-wise \ell_1 norm and a group-wise mixed-norm
(either \ell_1/\ell_2, \ell_1/\ell_\infty or
cooperative). We also add a (possibly structured) \ell_2-norm (ridge-like).
The solution path is computed on an automatically tuned grid of values for
the sparse group penalty. The mixture coefficient and the amount of ridge-like
regularization are fixed by the user. See details for the criterion optimized.
Usage
group_sparse_lm(
x,
y,
group,
type = c("l2", "coop", "linf"),
lambda1 = NULL,
lambda2 = 0.01,
alpha = 0,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ifelse(lambda2 < 0.01, min(2 * nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
beta0 = numeric(ncol(x)),
control = list()
)
group_lasso(
x,
y,
group,
lambda1 = NULL,
lambda2 = 0,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ncol(x),
beta0 = numeric(ncol(x)),
control = list(method = "quadra")
)
group_l1linf(
x,
y,
group,
lambda1 = NULL,
lambda2 = 0,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ncol(x),
beta0 = numeric(ncol(x)),
control = list()
)
coop_lasso(
x,
y,
group,
lambda1 = NULL,
lambda2 = 0,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ncol(x),
beta0 = numeric(ncol(x)),
control = list()
)
sparse_group_lasso(
x,
y,
group,
lambda1 = NULL,
lambda2 = 0,
alpha = 0.5,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ncol(x),
beta0 = numeric(ncol(x)),
control = list()
)
sparse_group_l1linf(
x,
y,
group,
lambda1 = NULL,
lambda2 = 0,
alpha = 0.5,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ncol(x),
beta0 = numeric(ncol(x)),
control = list()
)
sparse_coop_lasso(
x,
y,
group,
lambda1 = NULL,
lambda2 = 0,
alpha = 0.5,
weights = rep(1, nrow(x)),
penscale = sqrt(tabulate(group)),
intercept = TRUE,
normalize = TRUE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = ncol(x),
beta0 = numeric(ncol(x)),
control = list()
)
Arguments
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized is
|
y |
response vector. |
group |
vector of integers indicating group belonging. Must
match the number of column in |
type |
string indicating the sparse-group variant to be fitted. Could be "l2", "coop", or "linf". Default is "l2" (regular Group-Lasso) |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
alpha |
real scalar in (0,1); tunes mixture between |
weights |
vector with real positive values that weight the observations (like in weighted least square). Default sets all weights to 1. |
penscale |
vector with real positive values that weight the penalty of each feature. Default sets all weights to 1. |
struct |
matrix structuring the coefficients, possibly
sparsely encoded. Must be at least positive semidefinite (this is
checked internally). If |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
refit |
logical: indicates if the non null coefficients should be refit to avoid excessive bias. Default is FALSE. Can be changed later (both raw and refit coefficients are stored). |
nlambda1 |
integer that indicates the number of values to put
in the |
minratio |
minimal value of |
maxfeat |
integer; limits the number of features ever to
enter the model; i.e., non-zero coefficients for the Elastic-net:
the algorithm stops if this number is exceeded and |
beta0 |
a starting point for the vector of parameter. By default, will initialized zero. May save time in some situation. |
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
Details
The optimized criterion is the following: βhat λ1,λ2 = argminβ 1/2 RSS(β) + λ1 [(1-α) Ωg(β) + α | D β |1] + λ2/2 βT S β,
where \Omega_g is the group-wise mixed norm:
\ell_1/\ell_2 (Group-Lasso),
\ell_1/\ell_\infty, or cooperative (Clime), controlled
by the type argument; D is a diagonal matrix whose
diagonal terms are given by penscale; \alpha
tunes the mixture between the group and element-wise penalties;
and S is the \ell_2 structuring matrix provided
via struct, a positive semidefinite matrix (possibly of
class Matrix).
Value
an object with class SparseGroupFit, inheriting from QuadrupenFit.
See Also
See also QuadrupenFit
Examples
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
grp <- rep(1:5, c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Various sparse group linear models without/with an additional l2 regularization term
## and with structuring prior
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- "relevant"
## Group-Lasso
plot(group_lasso(x, y, grp), label=labels)
## Sparse Group-Lasso
plot(sparse_group_lasso(x, y, grp, alpha = 0.75), label=labels)
## Not run:
## Sparse Group-Lasso + L2 regularization
plot(group_sparse_lm(x, y, grp, type = "l2", alpha = .75, lambda2=.5),
label=labels)
plot(group_sparse_lm(x, y, grp, type = "l2", alpha = .75, lambda2=10),
label=labels)
plot(group_sparse_lm(x, y, grp, type = "l2", alpha = .75, lambda2=10,
struct=solve(Sigma)), label=labels)
## Group-Lasso L1/LINF
plot(group_l1linf(x, y, grp), label=labels)
## Sparse Group-Lasso L1/LINF
plot(sparse_group_l1linf(x, y, grp, alpha = 0.75), label=labels)
## Sparse L1/LINF Group-Lasso + L2 regularization
plot(group_sparse_lm(x, y, grp, type = "linf", alpha = .75, lambda2=.5),
label=labels)
plot(group_sparse_lm(x, y, grp, type = "linf", alpha = .75, lambda2=10),
label=labels)
plot(group_sparse_lm(x, y, grp, type = "linf", alpha = .75, lambda2=10,
struct=solve(Sigma)), label=labels)
## Cooperative-Lasso
plot(coop_lasso(x, y, grp), label=labels)
## Sparse Cooperative-Lasso
plot(sparse_coop_lasso(x, y, grp, alpha = 0.75), label=labels)
## Sparse Cooperative-Lasso + L2 regularization
plot(group_sparse_lm(x, y, grp, type = "coop", alpha = .75, lambda2=.5),
label=labels)
plot(group_sparse_lm(x, y, grp, type = "coop", alpha = .75, lambda2=10),
label=labels)
plot(group_sparse_lm(x, y, grp, type = "coop", alpha = .75, lambda2=10,
struct=solve(Sigma)), label=labels)
## End(Not run)
Auxiliary functions to check the given class of an object
Description
Auxiliary functions to check the given class of an object
Usage
isQuadrupenFit(Robject)
Arguments
Robject |
an R object to evaluate |
Value
logical
Fit a linear model with lava regularization
Description
Adjust a lava regularized linear model, that is a lava transformation
of the data followed by a
(possibly weighted) \ell_1-norm. The solution path is
computed at a grid of values for the \ell_1-penalty. See
details for the criterion optimized.
Usage
lava(
x,
y,
lambda1 = NULL,
lambda2 = 1,
weights = rep(1, nrow(x)),
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = 0.01,
maxfeat = min(nrow(x), ncol(x)),
beta0 = numeric(ncol(x)),
control = list()
)
Arguments
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized is
|
y |
response vector. |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
weights |
vector with real positive values that weight the observations (like in weighted least square). Default sets all weights to 1. |
penscale |
vector with real positive values that weight the penalty of each feature. Default sets all weights to 1. |
struct |
matrix structuring the coefficients, possibly
sparsely encoded. Must be at least positive semidefinite (this is
checked internally). If |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
refit |
logical: indicates if the non null coefficients should be refit to avoid excessive bias. Default is FALSE. Can be changed later (both raw and refit coefficients are stored). |
nlambda1 |
integer that indicates the number of values to put
in the |
minratio |
minimal value of |
maxfeat |
integer; limits the number of features ever to
enter the model; i.e., non-zero coefficients for the Elastic-net:
the algorithm stops if this number is exceeded and |
beta0 |
a starting point for the vector of parameter. By default, will initialized zero. May save time in some situation. |
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
Details
The optimized criterion is the following: βhat λ1 = argminθ = β+δ 1/2n RSS(β + δ) + λ1 | β |1 + λ/2 2 δT S δ, .
Value
an object with class QuadrupenFit.
an object with class LavaFit, inheriting from QuadrupenFit.
References
Chernozhukov, Victor, Christian Hansen, and Yuan Liao. "A lava attack on the recovery of sums of dense and sparse signals." The Annals of Statistics (2017): 39-76. https://doi.org/10.1214/16-AOS1434
Examples
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
delta <- runif(sum(c(25,10,25,10,25)),-.1,.1)
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- "relevant"
## The solution path of the LAVA
out <- lava(x,y)
out$plot_path(component = "sparse", labels=labels)
out$plot_path(component = "dense", labels=labels)
Plot method for quadrupen objects
Description
S3 plot methods for QuadrupenFit, CrossValidation and
StabilityPath objects, delegating to their respective R6 $plot() method.
Usage
## S3 method for class 'QuadrupenFit'
plot(x, ...)
## S3 method for class 'CrossValidation'
plot(x, ...)
## S3 method for class 'StabilityPath'
plot(x, ...)
Arguments
x |
a QuadrupenFit, CrossValidation or StabilityPath object. |
... |
additional arguments passed to the underlying R6 |
Value
a ggplot2 object.
Functions
-
plot(CrossValidation): Plot method for a CrossValidation object -
plot(StabilityPath): Plot method for a StabilityPath object
Perform model prediction
Description
Predict response for new sample based on the current model
Usage
## S3 method for class 'QuadrupenFit'
predict(object, newx = NULL, selection = NULL, ...)
Arguments
object |
an R object to evaluate |
newx |
matrix of new values for the regressor with which to predict. If omitted, the fitted values are used. |
selection |
either a character (model selection criteria) of a scalar (lambda value) |
... |
not used, only here for S3 compatibility |
Value
a vector of predicted value
Extract model residuals
Description
Extracts model residuals from a QuadrupenFit object
Usage
## S3 method for class 'QuadrupenFit'
residuals(object, newx = NULL, newy = NULL, ...)
Arguments
object |
a QuadrupenFit object |
newx |
matrix of new covariates for out-of-sample residuals. Must be provided together with |
newy |
vector of new responses for out-of-sample residuals. Must be provided together with |
... |
not used, only here for S3 compatibility |
Value
Matrix of residuals, each column corresponding to a value of lambda1.
Fit a linear model with a structured ridge regularization
Description
Adjust a linear model with ridge regularization (possibly
structured \ell_2-norm). The solution path is computed
at a grid of values for the \ell_2-penalty. See details
for the criterion optimized.
Usage
ridge(
x,
y,
lambda = NULL,
weights = rep(1, nrow(x)),
struct = Matrix::Diagonal(ncol(x), 1),
penscale = rep(1, ncol(x)),
intercept = TRUE,
normalize = TRUE,
nlambda = 100,
minratio = 1e-05,
lambda_max = 100,
control = list()
)
Arguments
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized is
|
y |
response vector. |
lambda |
sequence of decreasing |
weights |
vector with real positive values that weight the observations (like in weighted least square). Default sets all weights to 1. |
struct |
matrix structuring the coefficients, possibly
sparsely encoded. Must be at least positive semidefinite (this is
checked internally). If |
penscale |
vector with real positive values that weight the penalty of each feature. Default sets all weights to 1. |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
nlambda |
integer that indicates the number of values to put
in the |
minratio |
minimal value of |
lambda_max |
the largest value of |
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
Details
The optimized criterion is the following:
βhat
λ2 = argminβ 1/2
RSS(β) + λ/2 2 βT S
β, where the
\ell_2 structuring positive semidefinite matrix
S is provided via the struct argument (possibly of
class Matrix).
Value
an object with class RidgeRegressionFit, inheriting from QuadrupenFit.
See Also
See also QuadrupenFit
Examples
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- "relevant"
plot(ridge(x,y) , label=labels) ## a mess
plot(ridge(x,y, struct=solve(Sigma)), label=labels) ## even better
Variable selection from a stability path
Description
S3 generic for variable selection based on a StabilityPath object, as introduced by Meinshausen and Buhlmann (2010).
Usage
selection(object, ...)
## S3 method for class 'StabilityPath'
selection(
object,
sel_mode = c("rank", "PFER"),
cutoff = 0.75,
PFER = 2,
nvarsel = NULL,
...
)
Arguments
object |
a StabilityPath object. |
... |
not used, only here for S3 compatibility. |
sel_mode |
a character string, either |
cutoff |
probability threshold for |
PFER |
per-family error rate to control for |
nvarsel |
number of variables to select for |
Value
an integer vector of selected variable indices.
Methods (by class)
-
selection(StabilityPath): S3 method for variable selection from a StabilityPath
References
N. Meinshausen and P. Buhlmann (2010). Stability Selection, JRSS(B).
See Also
Fit a linear model with sparse regularization
Description
Adjust a linear model with sparse regularization.
We also add a (possibly structured) \ell_2-norm
(ridge-like). The solution path is computed at a grid of values for the
\ell_1-penalty, fixing the amount of \ell_2
regularization. See details for the criterion optimized.
Usage
sparse_lm(
x,
y,
type = c("l1", "mcp", "scad"),
lambda1 = NULL,
lambda2 = 0.01,
eta = 3.7,
weights = rep(1, nrow(x)),
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
beta0 = numeric(ncol(x)),
control = list()
)
elastic.net(
x,
y,
lambda1 = NULL,
lambda2 = 0.5,
weights = rep(1, nrow(x)),
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
beta0 = numeric(ncol(x)),
control = list(method = "quadra")
)
elastic_net(
x,
y,
lambda1 = NULL,
lambda2 = 0.5,
weights = rep(1, nrow(x)),
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
beta0 = numeric(ncol(x)),
control = list(method = "quadra")
)
lasso(
x,
y,
lambda1 = NULL,
weights = rep(1, nrow(x)),
penscale = rep(1, ncol(x)),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = min(nrow(x), ncol(x)),
beta0 = numeric(ncol(x)),
control = list(method = "quadra")
)
mcp(
x,
y,
lambda1 = NULL,
lambda2 = 0,
eta = 3,
weights = rep(1, nrow(x)),
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
beta0 = numeric(ncol(x)),
control = list(method = "quadra")
)
scad(
x,
y,
lambda1 = NULL,
lambda2 = 0,
eta = 3.7,
weights = rep(1, nrow(x)),
penscale = rep(1, ncol(x)),
struct = Matrix::Diagonal(ncol(x), 1),
intercept = TRUE,
normalize = TRUE,
refit = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
minratio = ifelse(nrow(x) <= ncol(x), 0.01, 1e-04),
maxfeat = ifelse(lambda2 < 0.01, min(nrow(x), ncol(x)), min(4 * nrow(x), ncol(x))),
beta0 = numeric(ncol(x)),
control = list(method = "quadra")
)
Arguments
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized is
|
y |
response vector. |
type |
string indicating the sparse variant to be fitted. Could be "l1", "mcp" or "scad". Default is "l1". be careful as scad and mcp are still experimental and have not been fully tested yet |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
eta |
real positive scalar for tuning SCAD or MCP penalties. Default is 3.7. Ignored when type == "l1". |
weights |
vector with real positive values that weight the observations (like in weighted least square). Default sets all weights to 1. |
penscale |
vector with real positive values that weight the penalty of each feature. Default sets all weights to 1. |
struct |
matrix structuring the coefficients, possibly
sparsely encoded. Must be at least positive semidefinite (this is
checked internally). If |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
refit |
logical: indicates if the non null coefficients should be refit to avoid excessive bias. Default is FALSE. Can be changed later (both raw and refit coefficients are stored). |
nlambda1 |
integer that indicates the number of values to put
in the |
minratio |
minimal value of |
maxfeat |
integer; limits the number of features ever to
enter the model; i.e., non-zero coefficients for the Elastic-net:
the algorithm stops if this number is exceeded and |
beta0 |
a starting point for the vector of parameter. By default, will initialized zero. May save time in some situation. |
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
Details
The optimized criterion is the following:
βhat
λ1,λ2 =
argminβ 1/2 RSS(β) + λ1
penη(D β) + λ/2 2
βT S β,
where
D is a diagonal matrix, whose diagonal terms are provided
as a vector by the penscale argument. The \ell_2
structuring matrix S is provided via the struct
argument, a positive semidefinite matrix (possibly of class
Matrix).
Value
an object with class SparseFit, inheriting from QuadrupenFit.
See Also
See also SparseFit
Examples
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- Matrix::bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- "relevant"
## Lasso
plot(lasso(x, y), label=labels)
## SCAD
plot(scad(x, y), label=labels)
## MCP
plot(mcp(x, y), label=labels)
## Elastic-net
plot(elastic_net(x,y,lambda2=1), label=labels)
## Structured Elastic-net (l2-structuring prior)
plot(elastic_net(x,y,lambda2=3,struct=solve(Sigma)), label=labels)
## SCAD + L2
plot(scad(x,y, eta = 3.7, lambda2=1), label=labels)
## MCP + L2
plot(mcp(x, y, eta = 3, lambda2=1), label=labels)
Stability selection for Quadrupen object
Description
Compute the stability path of a (possibly randomized) fitting procedure as introduced by Meinshausen and Buhlmann (2010).
Usage
stability(
object,
n_subsamples = 50,
subsample_size = floor(object$nobs/2),
subsamples = replicate(n_subsamples, sample(1:object$nobs, subsample_size), simplify =
FALSE),
weakness = 1,
verbose = TRUE,
cores = 1
)
## S3 method for class 'QuadrupenFit'
stability(
object,
n_subsamples = 50,
subsample_size = floor(object$nobs/2),
subsamples = replicate(n_subsamples, sample(1:object$nobs, subsample_size), simplify =
FALSE),
weakness = 1,
verbose = TRUE,
cores = parallel::detectCores() - 2
)
Arguments
object |
an R6 object with class QuadrupenFit |
n_subsamples |
integer indicating the number of subsamplings used to estimate the selection probabilities. Default is 100. |
subsample_size |
integer indicating the size of each subsamples.
Default is |
subsamples |
list with |
weakness |
Coefficient used for randomizing the weights of each features. Default is 1' for no randomization. See details below. |
verbose |
logical; indicates if the progression should be
displayed. Default is |
cores |
the number of cores to use. The default uses 1 core (safer in case your BLAS/LAPACK libraries are multithreaded) |
Value
an object with class StabilityPath is sent back and stored as a
field of the original QuadrupenFit object.
Methods (by class)
-
stability(QuadrupenFit): S3 method for stability selection of a QuadrupenFit
Note
When weakness < 1, the penscale argument
that weights the penalty tuned by \lambda_1 is
perturbed (divided) for each subsample by a random variable
uniformly distributed on
[α,1],
where
α is
the weakness parameter.
If the user runs the fitting method with option
'bulletproof' set to FALSE, the algorithm may stop
at an early stage of the path. Early stops of the underlying
fitting function are handled internally, in the following way: we
chose to simply skip the results associated with such runs, in
order not to bias the stability selection procedure. If it occurs
too often, a warning is sent to the user, in which case you should
reconsider the grid of lambda1 for stability selection. If
bulletproof is TRUE (the default), there is nothing
to worry about, except a possible slow down when any switching to
the proximal algorithm is required.
References
N. Meinshausen and P. Buhlmann (2010). Stability Selection, JRSS(B).
Examples
## Not run:
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
Soo <- matrix(0.75,25,25) ## bloc correlation between zero variables
Sww <- matrix(0.75,10,10) ## bloc correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.2
diag(Sigma) <- 1
n <- 100
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Build a vector of label for true nonzeros
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- c("relevant")
labels <- factor(labels, ordered=TRUE, levels=c("relevant","irrelevant"))
enet <- elastic_net(x, y, lambda2 = 10, struct = solve(Sigma), minratio = 1e-2)
stab <- stability(enet, n_subsamples = 200)
## Build the plot an recover the selected variable
plot(stab, labels=labels)
stabpath <- plot(stab, xvar="fraction", labels=labels, sel_mode="PFER", cutoff=0.75, PFER=1)
cat("\nFalse positives for the randomized Elastic-net with stability selection: ",
sum(labels[stab$selection()] != "relevant"))
cat("\nDONE.\n")
## End(Not run)