Type: | Package |
Title: | Collection of Utility Functions for Data Analysis and Computing |
Version: | 0.0.7 |
Date: | 2025-07-19 |
Maintainer: | Meng Xu <mengxu98@qq.com> |
Description: | Provides utility functions for data analysis and scientific computing. Includes functions for parallel processing, and other computational tasks to streamline workflows. |
License: | MIT + file LICENSE |
URL: | https://mengxu98.github.io/thisutils/ |
BugReports: | https://github.com/mengxu98/thisutils/issues |
Depends: | R (≥ 4.1.0) |
Imports: | cli, doParallel, foreach, Matrix, methods, parallel, purrr, Rcpp, RcppArmadillo, RcppParallel, rlang, stats |
Suggests: | httr2 |
LinkingTo: | Rcpp, RcppArmadillo, RcppParallel |
Config/Needs/website: | mengxu98/thistemplate |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Language: | en-US |
NeedsCompilation: | yes |
Packaged: | 2025-07-20 08:06:23 UTC; mx |
Author: | Meng Xu |
Repository: | CRAN |
Date/Publication: | 2025-07-20 08:40:02 UTC |
Collection of Utility Functions for Data Analysis and Computing
Description
Provides utility functions for data analysis and scientific computing. Includes functions for parallel processing, and other computational tasks to streamline workflows.
Author(s)
Meng Xu (Maintainer), mengxu98@qq.com
Source
https://mengxu98.github.io/thisutils/
See Also
Useful links:
Value selection operator
Description
This operator returns the left side if it's not NULL, otherwise it returns the right side.
Usage
a %ss% b
Arguments
a |
The left side value to check |
b |
The right side value to use if a is NULL |
Value
a if it is not NULL, otherwise b
Examples
NULL %ss% 10
5 %ss% 10
Add package file
Description
Automatically generate a file containing functions and related code for R package development.
Usage
add_pkg_file(
desc_file,
pkg_name = NULL,
title = NULL,
pkg_description = NULL,
author_name = NULL,
author_email = NULL,
github_url = NULL,
output_dir = NULL,
use_figlet = TRUE,
figlet_font = "Slant",
colors = c("red", "yellow", "green", "magenta", "cyan", "yellow", "green", "white",
"magenta", "cyan"),
unicode = TRUE,
verbose = TRUE
)
Arguments
desc_file |
The DESCRIPTION file.
Must be provided, it will be used to extract package information.
Using |
pkg_name |
Character string, the name of the package. Default is NULL, which will be read from DESCRIPTION file. |
title |
Character string, title of the package. Default is NULL, which will be read from DESCRIPTION file. |
pkg_description |
Character string, short description of the package. Default is NULL, which will be read from DESCRIPTION file. |
author_name |
Character string, name of the package author. Default is NULL, which will be read from DESCRIPTION file. |
author_email |
Character string, email of the package author. Default is NULL, which will be read from DESCRIPTION file. |
github_url |
Character string, GitHub URL of the package. Default is NULL, which will be read from DESCRIPTION file or constructed based on package name. |
output_dir |
Character string, directory where to save the package file. Default is NULL, you should specify it, such as 'R'. |
use_figlet |
Logical, whether to use figlet for ASCII art generation. Default is TRUE. Details see figlet. |
figlet_font |
Character string, figlet font to use. Default is "Slant". |
colors |
Character vector, colors to use for the logo elements. |
unicode |
Logical, whether to use Unicode symbols. Default is TRUE. |
verbose |
Logical, whether to print progress messages. Default is TRUE. |
Value
Creates a file in specified output directory
Convert sparse matrix into dense matrix
Description
Convert sparse matrix into dense matrix
Usage
as_matrix(x, parallel = FALSE, sparse = FALSE)
Arguments
x |
A matrix. |
parallel |
Logical value, default is |
sparse |
Logical value, default is |
Value
A dense or sparse matrix
Examples
m <- simulate_sparse_matrix(
1000, 1000,
decimal = 3
)
system.time(
a <- as.matrix(m)
)
system.time(
b <- as_matrix(m)
)
system.time(
c <- as_matrix(m, parallel = TRUE)
)
system.time(
d <- as_matrix(m, sparse = TRUE)
)
m[1:5, 1:5]
a[1:5, 1:5]
b[1:5, 1:5]
c[1:5, 1:5]
identical(a, b)
identical(a, c)
identical(b, c)
identical(a, d)
identical(b, d)
Capitalizes the characters Making the first letter uppercase
Description
Capitalizes the characters Making the first letter uppercase
Usage
capitalize(x, force_tolower = FALSE)
Arguments
x |
A vector of character strings to be capitalized. |
force_tolower |
Whether to force the remaining letters to be lowercase. |
Examples
x <- c(
"hello world",
"Hello world",
"hello World"
)
capitalize(x)
Check sparsity of matrix
Description
Check sparsity of matrix
Usage
check_sparsity(x)
Arguments
x |
A matrix. |
Value
Sparsity of matrix
Download file from the Internet
Description
Download file from the Internet
Usage
download(
url,
destfile,
methods = c("auto", "wget", "libcurl", "curl", "wininet", "internal"),
quiet = FALSE,
...,
max_tries = 2
)
Arguments
url |
a |
destfile |
a character string (or vector, see the |
methods |
Methods to be used for downloading files. The default is to try different download methods in turn until the download is successfully completed. |
quiet |
If |
... |
Other arguments passed to utils::download.file. |
max_tries |
Number of tries for each download method. |
figlet
Description
Create ASCII art text using figlet fonts.
Usage
figlet(
text,
font = "Slant",
width = getOption("width", 80),
justify = "left",
absolute = FALSE,
strip = TRUE
)
Arguments
text |
Text to make bigger |
font |
Name of font, path to font, or 'figlet_font' object |
width |
Width to use when justifying and breaking lines |
justify |
Text justification to use in rendering ("left", "centre", "right") |
absolute |
Logical, indicating if alignment is absolute |
strip |
Logical, indicating if whitespace should be removed |
Value
An object of class 'figlet_text' which is a character vector with a handy print method
References
http://www.figlet.org/ https://github.com/richfitz/rfiglet https://github.com/jbkunst/figletr
Examples
figlet("thisutils")
Get a figlet font
Description
Get a figlet font
Usage
figlet_font(font)
Arguments
font |
Path or name of the font to load |
Value
A 'figlet_font' object for use with [figlet]
Invoke a function with a list of arguments
Description
Invoke a function with a list of arguments
Usage
invoke_fun(.fn, .args = list(), ..., .env = rlang::caller_env())
Arguments
.fn |
A function, or function name as a string. |
.args |
A list of arguments. |
... |
Other arguments passed to the function. |
.env |
Environment in which to evaluate the call. This will be most useful if .fn is a string, or the function has side-effects. |
Examples
f <- function(x, y) {
x + y
}
invoke_fun(f, list(x = 1, y = 2))
invoke_fun("f", list(x = 1, y = 2))
invoke_fun("f", x = 1, y = 2)
List available figlet fonts
Description
List all figlet font files available in the package or system.
Usage
list_figlet_fonts()
Value
Character vector of available font names.
Examples
list_figlet_fonts()
Print diagnostic message
Description
Integrate the message printing function with the cli
package,
and the base::message function.
The message could be suppressed by base::suppressMessages.
Usage
log_message(
...,
verbose = TRUE,
message_type = c("info", "success", "warning", "error"),
cli_model = TRUE,
timestamp = TRUE,
timestamp_format = "%Y-%m-%d %H:%M:%S",
level = 1,
symbol = " ",
text_color = NULL,
back_color = NULL,
multiline_indent = TRUE,
.envir = parent.frame(),
.frame = .envir
)
Arguments
... |
Text to print. |
verbose |
Logical value, default is |
message_type |
Type of message, default is |
cli_model |
Logical value, default is |
timestamp |
Logical value, default is |
timestamp_format |
Character value, default is |
level |
Integer value, default is |
symbol |
Character value, default is |
text_color |
Character value, default is |
back_color |
Character value, default is |
multiline_indent |
Logical value, default is |
.envir |
The environment to evaluate calls in. Default is |
.frame |
The frame to use for error reporting. Default is |
Value
Formated message
printed to the console.
References
https://cli.r-lib.org/articles/index.html
Examples
# basic usage
log_message("Hello, ", "world!")
log_message("hello, world!")
log_message("Hello, world!", timestamp = FALSE)
log_message("Hello, ", "world!", message_type = "success")
log_message("Hello, world!", message_type = "warning")
log_message("Hello, ", "world!", cli_model = FALSE)
# suppress messages
suppressMessages(log_message("Hello, world!"))
log_message("Hello, world!", verbose = FALSE)
options(log_message.verbose = FALSE)
log_message("Hello, world!")
options(log_message.verbose = TRUE)
# cli formatting
log_message("hello, {.arg world}!")
message("hello, {.arg world}!")
log_message("hello, {.code world}!")
message("hello, {.code world}!")
log_message("{.emph R}")
log_message("Hello, {.file world}!")
log_message("press {.kbd ENTER}")
log_message("address: {.email example@example.com}")
log_message("URL: {.url https://example.com}")
log_message("Some {.field field}")
log_message("Hello, {.pkg world}!")
log_message("Hello, {.val world}!")
# set indentation
log_message("Hello, world!", level = 2)
log_message("Hello, world!", symbol = "->")
# multiline message
log_message("Line 1\nLine 2\nLine 3", multiline_indent = TRUE)
log_message(
"Multi-line\ncolored\nmessage",
text_color = "blue",
multiline_indent = FALSE
)
log_message(
"Multi-line\ncolored\nmessage",
text_color = "blue",
multiline_indent = FALSE,
timestamp = FALSE
)
# color formatting
log_message(
"This is a red message",
text_color = "red"
)
log_message(
"This is a message with background",
back_color = "cyan"
)
log_message(
"This is a message with both text and background",
text_color = "red",
back_color = "cyan"
)
# color formatting also works with `cli_model = FALSE`
log_message(
"This is a red message",
text_color = "red",
cli_model = FALSE
)
log_message(
"This is a message with background",
back_color = "cyan",
cli_model = FALSE
)
log_message(
"This is a message with both text and background",
text_color = "red",
back_color = "cyan",
cli_model = FALSE
)
# cli variables
fun <- function(x) {
log_message("{.val x}")
log_message("{.val {x}}")
log_message("{.val {x + 1}}")
}
fun(1)
Normalize numeric vector
Description
Normalize numeric vector
Usage
normalization(x, method = "max_min", na_rm = TRUE, ...)
Arguments
x |
Input numeric vector. |
method |
Method used for normalization. |
na_rm |
Whether to remove |
... |
Parameters for other methods. |
Value
Normalized numeric vector
Examples
x <- c(runif(2), NA, -runif(2))
x
normalization(x, method = "max_min")
normalization(x, method = "maximum")
normalization(x, method = "sum")
normalization(x, method = "softmax")
normalization(x, method = "z_score")
normalization(x, method = "mad")
normalization(x, method = "unit_vector")
normalization(x, method = "unit_vector", na_rm = FALSE)
Parallelize a function
Description
Parallelize a function
Usage
parallelize_fun(x, fun, cores = 1, export_fun = NULL, verbose = TRUE)
Arguments
x |
A vector or list to apply over. |
fun |
The function to be applied to each element. |
cores |
The number of cores to use for parallelization with foreach::foreach.
Default is |
export_fun |
The functions to export the function to workers. |
verbose |
Logical value, default is |
Value
A list of computed results
Examples
parallelize_fun(1:3, function(x) {
Sys.sleep(0.2)
x^2
})
parallelize_fun(list(1, 2, 3), function(x) {
Sys.sleep(0.2)
x^2
}, cores = 2)
Correlation and covariance calculation for sparse matrix
Description
Correlation and covariance calculation for sparse matrix
Usage
pearson_correlation(x, y = NULL)
Arguments
x |
Sparse matrix or character vector. |
y |
Sparse matrix or character vector. |
Value
A list with covariance and correlation matrices.
Examples
m1 <- simulate_sparse_matrix(
100, 100
)
m2 <- simulate_sparse_matrix(
100, 100,
sparsity = 0.05
)
a <- pearson_correlation(m1, m2)
a$cov[1:5, 1:5]
a$cor[1:5, 1:5]
Print logo
Description
Print logo
Usage
## S3 method for class 'thisutils_logo'
print(x, ...)
Arguments
x |
Input information. |
... |
Other parameters. |
Value
Print the ASCII logo
coefficient of determination (R^2
)
Description
coefficient of determination (R^2
)
Usage
r_square(y_true, y_pred)
Arguments
y_true |
A numeric vector with ground truth values. |
y_pred |
A numeric vector with predicted values. |
Value
R^2
value
Examples
y <- rnorm(100)
y_pred <- y + rnorm(100, sd = 0.5)
r_square(y, y_pred)
Remove and normalize spaces
Description
Remove leading/trailing spaces and normalize multiple spaces between words in character strings.
Usage
remove_space(
x,
trim_start = TRUE,
trim_end = FALSE,
collapse_multiple = TRUE,
preserve_newlines = TRUE
)
Arguments
x |
A vector of character strings. |
trim_start |
Logical value, default is 'TRUE'. Whether to remove leading spaces before the first word. |
trim_end |
Logical value, default is 'FALSE'. Whether to remove trailing spaces after the last word. |
collapse_multiple |
Logical value, default is 'TRUE'. Whether to collapse multiple consecutive spaces between words into a single space. |
preserve_newlines |
Logical value, default is 'TRUE'. Whether to preserve newline characters when collapsing spaces. |
Value
A character vector with spaces normalized according to the specified parameters.
Examples
x <- c(
" hello world ",
" test case ",
"no space",
" multiple spaces "
)
remove_space(x)
remove_space(x, trim_start = FALSE)
remove_space(x, trim_end = TRUE)
remove_space(x, collapse_multiple = FALSE)
remove_space(
x,
trim_start = FALSE,
trim_end = FALSE,
collapse_multiple = FALSE
)
# with newlines
multiline <- c(
"hello\n\n world ",
" first \n second "
)
remove_space(multiline)
remove_space(multiline, preserve_newlines = FALSE)
Rescale numeric vector
Description
Rescale numeric vector
Usage
rescale(x, from = range(x, na.rm = TRUE, finite = TRUE), to = c(0, 1))
Arguments
x |
A numeric vector. |
from |
The range of the original data. |
to |
The range of the rescaled data. |
Value
A numeric vector with rescaled values.
Examples
x <- rnorm(10)
rescale(x)
rescale(x, from = c(0, 1))
rescale(x, to = c(0, 2))
Generate a simulated sparse matrix
Description
This function generates a sparse matrix with a specified number of rows and columns, a given sparsity level, and a distribution function for the non-zero values.
Usage
simulate_sparse_matrix(
nrow,
ncol,
sparsity = 0.95,
distribution_fun = function(n) stats::rpois(n, lambda = 0.5) + 1,
decimal = 0,
seed = 1
)
Arguments
nrow |
Number of rows in the matrix. |
ncol |
Number of columns in the matrix. |
sparsity |
Proportion of zero elements (sparsity level). Default is 0.95, meaning 95% of elements are zero (5% are non-zero). |
distribution_fun |
Function to generate non-zero values. |
decimal |
Numeric value, default is |
seed |
Random seed for reproducibility. |
Value
A sparse matrix of class "dgCMatrix"
Examples
simulate_sparse_matrix(1000, 500) |>
check_sparsity()
simulate_sparse_matrix(10, 10, decimal = 1)
simulate_sparse_matrix(10, 10, decimal = 5)
A sparse correlation function
Description
Safe correlation function which returns a sparse matrix without missing values
Usage
sparse_cor(
x,
y = NULL,
method = "pearson",
allow_neg = TRUE,
remove_na = TRUE,
remove_inf = TRUE,
...
)
Arguments
x |
Sparse matrix or character vector. |
y |
Sparse matrix or character vector. |
method |
Method to use for calculating the correlation coefficient. |
allow_neg |
Logical. Whether to allow negative values or set them to 0. |
remove_na |
Logical. Whether to replace NA values with 0. |
remove_inf |
Logical. Whether to replace infinite values with 1. |
... |
Other arguments passed to stats::cor function. |
Value
A correlation matrix.
Examples
m1 <- simulate_sparse_matrix(
500, 100
)
m2 <- simulate_sparse_matrix(
500, 100,
seed = 2025
)
a <- sparse_cor(m1)
b <- sparse_cor(m1, m2)
c <- as_matrix(
cor(as_matrix(m1)),
sparse = TRUE
)
d <- as_matrix(
cor(as_matrix(m1), as_matrix(m2)),
sparse = TRUE
)
a[1:5, 1:5]
c[1:5, 1:5]
all.equal(a, c)
b[1:5, 1:5]
d[1:5, 1:5]
all.equal(b, d)
m1[sample(1:500, 10)] <- NA
m2[sample(1:500, 10)] <- NA
sparse_cor(m1, m2)[1:5, 1:5]
system.time(
sparse_cor(m1)
)
system.time(
cor(as_matrix(m1))
)
system.time(
sparse_cor(m1, m2)
)
system.time(
cor(as_matrix(m1), as_matrix(m2))
)
Split indices.
Description
An optimised version of split for the special case of splitting row indices into groups.
Usage
split_indices(group, n = 0L)
Arguments
group |
Integer indices |
n |
The largest integer (may not appear in index).
This is hint: if the largest value of |
Value
A list of vectors of indices.
References
Examples
split_indices(sample(10, 100, rep = TRUE))
split_indices(sample(10, 100, rep = TRUE), 10)
Wrap text
Description
Wrap text
Usage
str_wrap(x, width = 80)
Arguments
x |
A character vector to wrap. |
width |
The maximum width of the lines. |
Value
A character vector with wrapped text.
Examples
str_wrap(rep("Hello, world!", 10))
str_wrap(rep("Hello, world!", 10), width = 10)
The logo of thisutils
Description
The thisutils logo, using ASCII or Unicode characters Use cli::ansi_strip to get rid of the colors.
Usage
thisutils_logo(unicode = cli::is_utf8_output())
Arguments
unicode |
Unicode symbols on UTF-8 platforms. Default is cli::is_utf8_output. |
References
https://github.com/tidyverse/tidyverse/blob/main/R/logo.R
Examples
thisutils_logo()
Try to evaluate an expression a set number of times before failing
Description
The function is used as a fail-safe if your R code sometimes works and sometimes doesn't, usually because it depends on a resource that may be temporarily unavailable. It tries to evaluate the expression 'max_tries' times. If all the attempts fail, it throws an error; if not, the evaluated expression is returned.
Usage
try_get(expr, max_tries = 5, error_message = "", retry_message = "Retrying...")
Arguments
expr |
The expression to be evaluated. |
max_tries |
The maximum number of attempts to evaluate the expression before giving up. Default is set to 5. |
error_message |
a string, additional custom error message you would like to be displayed when an error occurs. |
retry_message |
a string, a message displayed when a new try to evaluate the expression would be attempted. |
Value
This function returns the evaluated expression if successful, otherwise it throws an error if all attempts are unsuccessful.
Examples
f <- function() {
value <- runif(1, min = 0, max = 1)
if (value > 0.5) {
log_message("value is larger than 0.5")
return(value)
} else {
log_message(
"value is smaller than 0.5",
message_type = "error"
)
}
}
f_evaluated <- try_get(expr = f())
print(f_evaluated)
Unnest a list-column
Description
Implement similar functions to the tidyr::unnest function.
Usage
unnest_fun(data, cols, keep_empty = FALSE)
Arguments
data |
A data frame. |
cols |
Columns to unnest. |
keep_empty |
By default, you get one row of output for each element of the list your unchopping/unnesting.
This means that if there's a size-0 element (like |
Examples
data <- data.frame(
id = 1:3,
x = c("a", "b", "c"),
stringsAsFactors = FALSE
)
data$data <- list(
c(1, 2),
c(3, 4, 5),
c(6)
)
unnest_fun(data, cols = "data")
data2 <- data.frame(
id = 1:3,
x = c("a", "b", "c"),
stringsAsFactors = FALSE
)
data2$data <- list(
c(1, 2),
numeric(0),
c(6)
)
unnest_fun(data2, cols = "data")
unnest_fun(data2, cols = "data", keep_empty = TRUE)