Blaze is an open-source, high-performance C++ math library for dense and sparse arithmetic. With its state-of-the-art Smart Expression Template implementation Blaze combines the elegance and ease of use of a domain-specific language with HPC-grade performance, making it one of the most intuitive and fastest C++ math libraries available. The RcppBlaze package includes the header files from the Blaze library with disabling some functionalities related to link to the thread and system libraries which make RcppBlaze be a header-only library. Therefore, users do not need to install Blaze.
You can install:
the stable version from CRAN with
the latest development version from github with
If you encounter a bug, please file a reproducible example on github.
CompressedVector
and CompressedMatrix
only support int
, float
and double
types. Logical part is not supported because there is no enough resource for it (helps are welcome). Since Matrix
only provides l*[CTR]Matrix
and d*[CTR]Matrix
and does not support z*[CTR]Matrix
.
Since CustomVector
and CustomMatrix
are mapping memory which user should manage by himself. Therefore, we provide another function to help user do convertion from R objects. We provide RcppBlaze::copyToCustomVector
and RcppBlaze::copyToCustomMatrix
to make easier on data copy. Below is the example code:
// For CustomVector<int>
using iCustomVectorUU = blaze::CustomVector<int, blaze::unaligned, blaze::unpadded>;
using iCustomVectorAP = blaze::CustomVector<int, blaze::aligned, blaze::padded>;
// initialize R IntegerVector
Rcpp::IntegerVector intVec = Rcpp::IntegerVector::create(-2, -1, 0, 1, 2 );
size_t int_vec_size = Rf_xlength(intVec);
size_t intVecPaddedSize = blaze::nextMultiple<size_t>(int_vec_size, blaze::SIMDTrait<int>::size);
// unaligned & unpadded CustomVector
std::unique_ptr<int[], blaze::ArrayDelete> data_unpadded(new int[int_vec_size]);
iCustomVectorUU cv_uu_int(data_unpadded.get(), int_vec_size);
RcppBlaze::copyToCustomVector(intVec, cv_uu_int);
// aligned & padded CustomVector
std::unique_ptr<int[], blaze::Deallocate> data_padded(blaze::allocate<int>(intVecPaddedSize));
iCustomVectorAP cv_ap_int(data_padded.get(), int_vec_size, intVecPaddedSize);
RcppBlaze::copyToCustomVector(intVec, cv_ap_int);
// For CustomMatrix<int>
using iCustomMatrixUU = blaze::CustomMatrix<int, blaze::unaligned, blaze::unpadded, blaze::columnMajor>;
using iCustomMatrixAP = blaze::CustomMatrix<int, blaze::aligned, blaze::padded, blaze::columnMajor>;
using iCustomMatrixAP_RM = blaze::CustomMatrix<int, blaze::aligned, blaze::padded, blaze::rowMajor>;
// initialize R IntegerMatrix
Rcpp::IntegerMatrix intMat(2, 3);
std::fill(intMat.begin(), intMat.end(), 8);
intMat(0, 1) = 5;
intMat(1, 2) = 4;
Rcpp::Shield<SEXP> intMatDimsSexp(Rf_getAttrib(intMat, R_DimSymbol));
int* intMatDims = INTEGER(intMatDimsSexp);
size_t m = (size_t) intMatDims[0], n = (size_t) intMatDims[1];
// column-major parameters
size_t intSimdSize = blaze::SIMDTrait<int>::size;
size_t intMatPaddedRows = blaze::nextMultiple<size_t>(m, intSimdSize);
// unaligned & unpadded column-major CustomMatrix
std::unique_ptr<int[], blaze::ArrayDelete> data_unpadded(new int[m*n]);
iCustomMatrixUU cm_uu_int(data_unpadded.get(), m, n);
RcppBlaze::copyToCustomMatrix(intMat, cm_uu_int);
// aligned & padded column-major CustomMatrix
std::unique_ptr<int[], blaze::Deallocate> data_padded(blaze::allocate<int>(intMatPaddedRows * n));
iCustomMatrixAP cm_ap_int(data_padded.get(), m, n, intMatPaddedRows);
RcppBlaze::copyToCustomMatrix(intMat, cm_ap_int);
// row-major parameters
size_t intMatPaddedCols = blaze::nextMultiple<size_t>(n, intSimdSize);
// aligned & padded row-major CustomMatrix
std::unique_ptr<int[], blaze::Deallocate> data_rm_padded(blaze::allocate<int>(m * intMatPaddedCols));
iCustomMatrixAP_RM cm_ap_rm_int(data_rm_padded.get(), m, n, intMatPaddedCols);
RcppBlaze::copyToCustomMatrix(intMat, cm_ap_rm_int);
You can refer to the file lmBenchmark.R to find the code. Below code and corresponding results show that RcppBlaze
have better performance than RcppArmadillo
and RcppGSL
. However, RcppEigen
can provide more efficient algorithms (LDLt
and LLt
) for linear model fitting (about 2.4 times faster).
source(system.file("examples", "lmBenchmark.R", package = "RcppBlaze"))
# lm benchmark for n = 10000 and p = 100: nrep = 20
# Unit: milliseconds
# expr min lq mean median uq max neval
# lm.fit 28.9307 30.76115 34.718960 31.32925 32.71110 94.3755 20
# blaze_qr 66.7372 67.63715 68.735135 68.52440 69.61730 71.2159 20
# blaze_ldlt 2.7382 3.21195 3.878950 3.48345 3.94490 9.0361 20
# blaze_llt 2.8695 3.26960 3.807020 3.61675 3.91585 6.9932 20
# blaze_lu 2.8755 3.12790 3.573530 3.45100 3.78790 5.0707 20
# eigen_PivQR 16.8238 17.60885 18.730360 19.01100 19.52710 20.0469 20
# eigen_LDLt 1.9877 2.15395 2.342755 2.28880 2.45690 3.1641 20
# eigen_SVD 122.9787 127.08120 130.421575 129.57880 133.20360 141.0751 20
# eigen_SymmEig 7.1041 7.59225 7.911580 7.94090 8.21375 8.9926 20
# lapack_GESDD 97.5447 99.89155 103.044495 102.50625 104.55925 117.6772 20
# eigen_QR 12.5883 13.26180 13.731200 13.67780 14.13895 15.0327 20
# eigen_LLt 2.0867 2.24700 2.384330 2.40915 2.50055 2.6669 20
# arma_fastLm 52.9779 55.86840 58.040880 57.72055 58.99520 71.6400 20
# arma_direct_solve 18.6251 19.64045 19.992785 19.96915 20.28160 21.4155 20
# arma_qr 63.1927 66.41025 68.602565 67.99540 69.83185 76.9579 20
# arma_chol 18.6167 19.47850 20.352435 19.99895 20.83785 24.0279 20
# arma_pinv 19.4529 20.48360 20.834835 20.89880 21.03085 22.9177 20
# GSL 281.8348 294.52890 310.380445 308.12135 325.44070 345.3462 20
Above results are run on my desktop (i9-13900K, DDR5-4000 128GB).
Ching-Chuan Chen, Klaus Iglberger, Georg Hager, Christian Godenschwager, Tobias Scharpff
BSD-3 License