| Type: | Package |
| Title: | Create Tables for Reporting Clinical Trials |
| Version: | 0.1.15 |
| Author: | Armin Ströbel |
| Maintainer: | Armin Ströbel <arminstroebel@web.de> |
| Description: | Create Tables for Reporting Clinical Trials. Calculates descriptive statistics and hypothesis tests, arranges the results in a table ready for reporting with LaTeX, HTML or Word. |
| License: | GPL-3 |
| Depends: | R (≥ 3.5) |
| Encoding: | UTF-8 |
| LazyData: | true |
| RoxygenNote: | 7.3.2 |
| VignetteBuilder: | knitr |
| Imports: | stats (≥ 3.4), doBy (≥ 4.6), plyr (≥ 1.8.4), reshape2 (≥ 1.4.3), Hmisc (≥ 4.1), settings (≥ 0.2.4), DescTools (≥ 0.99.24), effsize (≥ 0.7.1) |
| Suggests: | testthat, knitr, survival, rmarkdown |
| URL: | https://github.com/arminstroebel/atable |
| BugReports: | https://github.com/arminstroebel/atable/issues |
| NeedsCompilation: | no |
| Packaged: | 2024-08-29 08:58:46 UTC; armin |
| Repository: | CRAN |
| Date/Publication: | 2024-08-29 11:30:10 UTC |
atable: Create Tables for Reporting Clinical Trials
Description
The package provides functions for descriptive statistics and hypothesis tests, and for arranging the results for printing.
Details
The main function is atable. See documentation there.
Author(s)
Maintainer: Armin Ströbel arminstroebel@web.de (ORCID)
Authors:
Alan Haynes alan.haynes@ctu.unibe.ch (ORCID)
See Also
Useful links:
Report bugs at https://github.com/arminstroebel/atable/issues
add_name_to_statistics
Description
Adds a column to a data.frame x with value name as character. Helper function. Not intended to be called by the user.
Usage
add_name_to_statistics(x, name, ...)
## S3 method for class 'list'
add_name_to_statistics(x, name, ...)
## S3 method for class 'data.frame'
add_name_to_statistics(
x,
name,
colname_for_variable = atable_options("colname_for_variable"),
...
)
Arguments
x |
an object |
name |
a value |
... |
passed to methods |
colname_for_variable |
a character length 1. Default is defined in atable_options |
Details
Checks whether the new field already exists.
Value
x now with new field colname_for_variable
Methods (by class)
- add_name_to_statistics(list): applies add_name_to_statistics to all fields of the list.
- add_name_to_statistics(data.frame): adds field colname_for_variable to the data.frame. Checks for a name clash with this field, as there are many user-defined fields.
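The name-clash check described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `add_name_column` is a hypothetical helper, and its default column name simply mirrors the documented default of `colname_for_variable`.

```r
# Hypothetical helper, for illustration only (not atable's internal code).
add_name_column <- function(x, name, colname = "variable___") {
  stopifnot(is.data.frame(x), is.character(colname), length(colname) == 1)
  # Refuse to overwrite an existing column of the same name.
  if (colname %in% colnames(x)) {
    stop("Column '", colname, "' already exists in x.")
  }
  # Store the name as character, repeated once per row.
  x[[colname]] <- rep(as.character(name), nrow(x))
  x
}

stats <- data.frame(tag = c("mean", "sd"), value = c(1.5, 0.3))
named <- add_name_column(stats, "Numeric")
```

Calling `add_name_column(named, "Numeric")` a second time stops with an error, because the column is already present.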
add_name_to_tests
Description
Helper-function to add a field to a list or data.frame
Usage
add_name_to_tests(x, name, ...)
## S3 method for class 'list'
add_name_to_tests(x, name, ...)
## S3 method for class 'data.frame'
add_name_to_tests(
x,
name,
colname_for_variable = atable_options("colname_for_variable"),
...
)
Arguments
x |
an object |
name |
a value |
... |
passed to methods |
colname_for_variable |
a character length 1. Default is defined in atable_options |
Details
Not intended to be called by the user.
Checks whether the new field already exists.
Value
x now with new field colname_for_variable
Methods (by class)
- add_name_to_tests(list): applies add_name_to_statistics to all fields of the list.
- add_name_to_tests(data.frame): adds field colname_for_variable to the data.frame. Checks for a name clash with this field, as there are many user-defined fields.
Adds a column to a data.frame
Description
The new column has name atable_options('colname_for_observations') and class 'count_me'.
Usage
add_observation_column(DD)
Arguments
DD |
A data.frame. |
Details
Throws an error if a column of that name is already present in DD.
Value
As DD now with one more column.
Create Tables for Reporting of Clinical Trials
Description
Applies descriptive statistics and hypothesis tests to data, and arranges the results for printing.
Usage
atable(x, ...)
## S3 method for class 'data.frame'
atable(
x,
target_cols,
group_col = NULL,
split_cols = NULL,
format_to = atable_options("format_to"),
drop_levels = TRUE,
add_levels_for_NA = FALSE,
blocks = NULL,
add_margins = atable_options("add_margins"),
indent_character = NULL,
indent = atable_options("indent"),
...
)
## S3 method for class 'formula'
atable(formula, data, ...)
Arguments
x |
An object. If |
... |
Passed from and to other methods. You can use the ellipsis ... to modify atable:
For example the default-statistics for numeric variables are mean and sd. To change these statistics pass
a function to argument See examples below how to modify atable by ... . Actually Here is a list of the statistics and hypothesis tests that can be modified by
|
target_cols |
A character vector containing some column names of Descriptive statistics and hypothesis test are applied to these columns depending on their class.
The descriptive statistics are defined by Hypothesis test are defined by |
group_col |
A character of length 1 containing a column of |
split_cols |
A character vector containing some of |
format_to |
A character vector of length 1. Specifies the format of the output of |
drop_levels |
A logical. If |
add_levels_for_NA |
If |
blocks |
|
add_margins |
A logical with length one, |
indent_character |
A character with length 1 or |
indent |
A logical with length one, |
formula |
A formula of the form |
data |
Passed to |
Value
Results depend on format_to:
'Raw': A list with two elements called 'statistics_result' and 'tests_result', that contain all results of the descriptive statistics and the hypothesis tests. This format is useful when extracting a specific result unformatted (when format_to is not 'Raw', all numbers are also returned, but as rounded characters for printing and squeezed into a data.frame).
'statistics_result': contains a data.frame with colnames split_cols, group_col, target_cols. split_cols and group_col retain their original values (now as factor). target_cols contain lists with the results of function statistics. As the result of function statistics is also a list, target_cols contain lists of lists.
'tests_result': has the same structure as 'statistics_result', but contains the results of two_sample_htest and multi_sample_htest. Note that tests_result only exists if split_cols is not NULL.
'Word': A data.frame. Column atable_options('colname_for_group') contains all combinations of the levels of split_cols and the names of the results of function format_statistics.
Further columns are the levels of group_col and the names of the results of format_tests.
The levels of split_cols and the statistics are arranged vertically. The hypothesis tests are arranged horizontally.
'HTML': Same as for format_to = 'Word' but a different character indents the first column.
'Console': Meant for printing in the R console for interactive analysis. Same as for format_to = 'Word' but a different character indents the first column.
'Latex': Same as for format_to = 'Word' but a different character indents the first column and with translate_to_LaTeX applied afterwards.
Methods (by class)
- atable(data.frame): applies descriptive statistics and hypothesis tests, arranges the results for printing.
- atable(formula): parses the formula and passes its parts to atable.
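How a formula such as `Numeric + Logical ~ Group | Split1 + Split2` can be taken apart is sketched below in base R. This is only an illustration of the idea; `split_formula` is a hypothetical name and atable's own parser may work differently.

```r
# Sketch only: splits a formula of the form targets ~ group | splits.
split_formula <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 3)
  rhs <- f[[3]]
  # `|` binds tighter than `~`, so the right-hand side is a call to `|`.
  has_split <- is.call(rhs) && identical(rhs[[1]], as.name("|"))
  list(
    target_cols = all.vars(f[[2]]),
    group_col   = all.vars(if (has_split) rhs[[2]] else rhs),
    split_cols  = if (has_split) all.vars(rhs[[3]]) else NULL
  )
}

parts <- split_formula(Numeric + Logical ~ Group | Split1 + Split2)
```

Here `parts$target_cols` holds the variables left of `~`, `parts$group_col` the variable between `~` and `|`, and `parts$split_cols` the variables right of `|`.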
Examples
# See vignette for more examples:
# utils::vignette('atable_usage', package = 'atable')
# Analyse datasets::ToothGrowth:
# Length of tooth for each dose level and delivery method:
atable::atable(datasets::ToothGrowth,
target_cols = 'len',
group_col = 'supp',
split_cols = 'dose',
format_to = 'Word')
# Print in .docx with e.g. flextable::regulartable and officer::body_add_table
# Analyse datasets::ChickWeight:
# Weight of chickens for each time point and diet:
atable(weight ~ Diet | Time, datasets::ChickWeight, format_to = 'Latex')
# Print as .pdf with e.g. Hmisc::latex
# Analyse atable::test_data:
atable(Numeric + Logical + Factor + Ordered ~ Group | Split1 + Split2,
atable::test_data, format_to = 'HTML')
# Print as .html with e.g. knitr::kable and options(knitr.kable.NA = '')
# Modify atable: calculate median and MAD for numeric variables
new_stats <- function(x, ...){list(Median = median(x, na.rm = TRUE),
MAD = mad(x, na.rm = TRUE))}
atable(atable::test_data,
target_cols = c('Numeric', 'Numeric2'),
statistics.numeric = new_stats,
format_to = 'Console')
# Print in Console with format_to = 'Console'.
# Analyse mtcars and add labels and units via package Hmisc
mtcars <- within(datasets::mtcars, {gear <- factor(gear)})
# Add labels and units.
attr(mtcars$mpg, 'alias') = 'Consumption [Miles (US)/ gallon]'
Hmisc::label(mtcars$qsec) = 'Quarter Mile Time'
units(mtcars$qsec) = 's'
# apply atable
atable::atable(mpg + hp + gear + qsec ~ cyl | vs,
mtcars,
format_to = 'Console')
# Blocks
# In datasets::mtcars the variables cyl, disp and hp are related to the engine and am and gear are
# related to the gearbox. So grouping them together is desirable.
atable::atable(datasets::mtcars,
target_cols = c("cyl", "disp", "hp", "am", "gear", "qsec") ,
blocks = list("Engine" = c("cyl", "disp", "hp"),
"Gearbox" = c("am", "gear")),
format_to = "Console")
# Note that Variable qsec is not blocked and thus not indented.
# add_margins
atable::atable(atable::test_data,
target_cols = "Numeric",
group_col = "Group",
split_cols = "Split1",
add_margins = TRUE,
format_to = "Console")
# The column 'Total' contains the results of the ungrouped atable-call:
# The number of observations is the sum of observations of the groups.
# The default of add_margins can be changed via atable_options.
More compact formatting than atable()
Description
This is a wrapper for atable(), calculating the same statistics, but with different format.
Usage
atable_compact(x, ...)
## S3 method for class 'data.frame'
atable_compact(
x,
target_cols,
group_col = NULL,
indent_character = atable_options("indent_character_compact"),
blocks = NULL,
format_factor = atable_options("format_statistics_compact.statistics_factor"),
format_numeric = atable_options("format_statistics_compact.statistics_numeric"),
...
)
Arguments
x |
object passed to atable. |
... |
Passed to |
target_cols |
character. Some of colnames(x). |
group_col |
character or NULL. If character then, one of colnames(x). |
indent_character |
character length 1. Default is defined in atable_options("indent_character_compact").
For Latex-Format use e.g. |
blocks |
NULL or a list, passed to atable, see help there. |
format_factor |
a function that defines the format of factor variables.
Default is defined in |
format_numeric |
a function that defines the format of numeric variables. Analog to format_factor. |
Details
The compact formatting is:
Numeric target_cols get one line in the table; the line contains the mean and SD of the variable.
Factor target_cols also get one line in the table when they have only two levels: only the first level is displayed and the name of the variable is omitted. This is intended for items like "Sex at birth: Female/Male". Knowing the percentage of Female is sufficient in this case (when NAs are not counted). Be careful with items like "Pregnant: Yes/No". Here only the level "Yes" will be printed and the name of the item (Pregnant) is omitted, making the table uninformative. Factors with three or more levels get one line per level; the levels are indented and a header line containing the name of the variable is added.
Arguments in ... are passed to atable. See the help there.
atable_compact is not designed for split atables, so argument split_cols must be omitted or NULL.
Also argument format_to is ignored.
Other features of atable (blocking, add_margins, alias) are available, see examples.
Value
data.frame
Methods (by class)
- atable_compact(data.frame): a compact version of atable.
Examples
# For Console:
atable_compact(
atable::test_data,
target_cols = c("Numeric", "Numeric2", "Split2", "Factor", "Ordered"),
group_col = "Group2",
blocks = list("Primary Endpoint" = "Numeric",
"Secondary Endpoints" = c("Numeric2", "Split2", "Factor")),
add_margins = TRUE)
# The target_cols are "Numeric", "Numeric2", "Split2", "Factor", "Ordered".
# The group_col is "Group2".
# The data.frame is grouped by group_col and the summary statistics of the target_cols are
# calculated: mean, sd for numeric, counts and percentages for factors.
# Some target_cols are blocked: the first block 'Primary Endpoint' contains the variable Numeric.
# The second block 'Secondary Endpoints' contains the variables "Numeric2", "Split2", "Factor".
# The blocks are indented.
# For variable Split2 only the first level is reported, as the variable has only two levels and
# the name 'Split2' does not appear in the table.
# The variable Factor has more than two levels, so all of them are
# reported and appropriately indented.
# The variable Ordered is not part of a block and thus not indented.
# For Latex:
# Same as for Console, but with different indent_character:
tab = atable_compact(atable::test_data,
target_cols = c("Numeric", "Numeric2", "Logical", "Factor", "Ordered"),
group_col = "Group2",
indent_character = "\\quad")
tab = atable::translate_to_LaTeX(tab)
# Then call e.g. Hmisc::latex(tab, ...)
# Example for Word format:
## Not run:
tab = atable_compact(
atable::test_data,
target_cols = c("Numeric", "Numeric2", "Split2", "Factor", "Ordered", "Character"),
group_col = "Group2",
blocks = list("Primary Endpoint" = "Numeric",
"Secondary Endpoints" = c("Numeric2", "Split2", "Factor")),
add_margins = TRUE,
indent_character = paste0(rep(intToUtf8(160), 5), collapse = ""))
# The argument indent_character has the value intToUtf8(160) (non breakable space).
# This is the important part:
# Spaces at the beginning of a cell of a data.frame are somehow lost on the way to the docx.
# Other indent_characters may also do the job.
# doc = officer::read_docx()
# doc = officer::body_add_table(doc,tab)
# print(doc, target = "atable_Word.docx")
# Other packages may exist for Word-export.
## End(Not run)
A longitudinal version of atable
Description
This is a wrapper for atable(), calculating the same statistics, but with different format.
Usage
atable_longitudinal(x, ...)
## S3 method for class 'data.frame'
atable_longitudinal(
x,
target_cols,
split_cols,
group_col = NULL,
format_numeric = atable_options("format_statistics_longitudinal.statistics_numeric"),
format_factor = atable_options("format_statistics_longitudinal.statistics_factor"),
...
)
Arguments
x |
object passed to atable. Currently x must be a data.frame. |
... |
Passed to |
target_cols |
character. Exactly one of colnames(x). |
split_cols |
character. Exactly one of colnames(x). |
group_col |
character or NULL. If character then, one of colnames(x). |
format_numeric |
a function that defines the format of numeric variables. Analog to format_factor. |
format_factor |
a function that defines the format of factor variables.
Default is defined in |
Details
The intention is to report longitudinal data, i.e. data measured on the same objects at multiple time points.
This function allows only one target_col and only one split_col (the time point of the measurement).
The longitudinal formatting is:
The names of the target_col and split_col do not show up in the table. The names should thus be written in the caption of the table.
Numeric target_cols get one line in the table; the format of the statistics is: mean (sd), N, missing.
Factor target_cols also get one line in the table when they have only two levels: only the first level is displayed and the name of the variable is omitted. This is intended for items like "Sex at birth: Female/Male". Knowing the percentage of Female is sufficient in this case (when NAs are not counted). The name of the target_cols and its first level should be stated in the caption of the table, otherwise the table is uninformative. The format of the statistics is: percent
Factors with three or more levels get one line per level and the name of the variable is omitted. The format of the statistics is: percent
Argument blocks must be omitted, as there is only one target_col and nothing to block.
See examples.
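The numeric line format "mean (sd), N, missing" can be illustrated with a small base R sketch. `format_numeric_line` is a hypothetical helper, and reading N as the number of non-missing values is an assumption; the package's default formatter may differ in rounding and layout.

```r
# Illustrative only: builds a "mean (sd), N, missing" string.
# Assumption: N counts the non-missing values.
format_numeric_line <- function(x, digits = 2) {
  n_missing <- sum(is.na(x))
  n_valid <- length(x) - n_missing
  sprintf("%s (%s), %d, %d",
          format(round(mean(x, na.rm = TRUE), digits), nsmall = digits),
          format(round(sd(x, na.rm = TRUE), digits), nsmall = digits),
          n_valid, n_missing)
}

line <- format_numeric_line(c(1, 2, 3, NA))
```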
Value
data.frame
Methods (by class)
- atable_longitudinal(data.frame): a longitudinal version of atable.
Examples
# create data with a time-variable
x = atable::test_data
set.seed(42)
x = within(x, {time = sample(paste0("time_", 1:5), size=nrow(x), replace = TRUE)})
split_cols = "time"
group_col = "Group2"
# table for a factor with two levels
atable_longitudinal(x,
target_cols = "Split2",
group_col = group_col,
split_cols = split_cols,
add_margins = TRUE)
# table for a factor with three levels
atable_longitudinal(x,
target_cols = "Split1",
group_col = group_col,
split_cols = split_cols,
add_margins = TRUE)
# table for a numeric variable
atable_longitudinal(x,
target_cols = "Numeric",
group_col = group_col,
split_cols = split_cols,
add_margins = TRUE)
# To print the table in Word or with Latex, use
# e.g. Hmisc::latex or officer::body_add_table.
# No further modification of the table is needed.
# See atable_compact for examples.
Set or get options
Description
Set or get options for the atable-package via the settings package.
Usage
atable_options(...)
Arguments
... |
Option names to retrieve option values or |
Details
These options control some aspects of the atable package.
For restoring the default values see atable_options_reset.
Supported options
The following options are supported:
add_margins: A logical with length 1, TRUE or FALSE. This is the default value of atable's argument add_margins. See the help there.
colname_for_total: A character with length 1. Default is 'Total'. This character will show up in the results of atable when add_margins is TRUE and group_col is not NULL.
replace_NA_by: A character with length 1, or NULL. Default is 'missing'. Used in function replace_NA. This character will show up in the results of atable, so it can be modified.
colname_for_variable: A character with length 1. Default is 'variable___'. Used in functions add_name_to_tests and add_name_to_statistics. This character will not show up in the results and is only used internally for intermediate data.frames. There may be name clashes with user-supplied data.frames, so modification may be necessary.
colname_for_observations: A character with length 1. Default is 'Observations'. Used in function add_observation_column. This character will show up in the results of atable, so it can be modified. There may be name clashes with user-supplied data.frames, so modification may be necessary.
colname_for_blocks: A character with length 1. Default is 'block_name___'. Used in function indent_data_frame_with_blocks. This character will not show up in the results and is only used internally for intermediate data.frames. There may be name clashes with user-supplied data.frames, so modification may be necessary.
labels_TRUE_FALSE: A character of length 2. Default is c('yes', 'no'). Currently used in function statistics.logical (see statistics) to cast logical to factor. TRUE is mapped to labels_TRUE_FALSE[1] and FALSE to labels_TRUE_FALSE[2]. These characters may show up in the results of atable, so they can be modified.
labels_Mean_SD: A character of length 1. Default is 'Mean (SD)'. Currently used in function format_statistics as a name for the mean and standard deviation of numeric variables. This character may show up in the results of atable, so it can be modified.
labels_valid_missing: A character of length 1. Default is 'valid (missing)'. Currently used in function format_statistics as a name for the number of valid and missing values of numeric variables. This character may show up in the results of atable, so it can be modified.
format_to: A character of length 1. Default is 'Latex'. Currently used in function atable.
colname_for_group: A character of length 1. Default is 'Group'. This character will show up in the results of atable. This column will contain all values of DD[split_cols] and DD[target_cols].
colname_for_value: A character of length 1. Default is 'value'. This character shows up in the results of atable when group_col is NULL. The column will contain the results of the statistics.
colname_for_variable_compact: A character of length 1. Default is intToUtf8(160), a non-breaking space. This character will show up in the results of atable_compact as name of the first column.
statistics.numeric: Either NULL or a function. Default is NULL. If a function, then it will replace atable:::statistics.numeric when atable is called. The function must mimic statistics: see the help there.
statistics.factor: Analog to argument statistics.numeric.
statistics.ordered: Analog to argument statistics.numeric.
two_sample_htest.numeric: Either NULL or a function. Default is NULL. If a function, then it will replace atable:::two_sample_htest.numeric when atable is called. The function must mimic two_sample_htest: see the help there.
two_sample_htest.factor: Analog to argument two_sample_htest.numeric.
two_sample_htest.ordered: Analog to argument two_sample_htest.numeric.
multi_sample_htest.numeric: Either NULL or a function. Default is NULL. If a function, then it will replace atable:::multi_sample_htest.numeric when atable is called. The function must mimic multi_sample_htest: see the help there.
multi_sample_htest.factor: Analog to argument multi_sample_htest.numeric.
multi_sample_htest.ordered: Analog to argument multi_sample_htest.numeric.
format_statistics.statistics_numeric: Either NULL or a function. Default is NULL. If a function, then it will replace atable:::format_statistics.statistics_numeric. The function must mimic format_statistics: see the help there.
format_statistics.statistics_factor: Analog to argument format_statistics.statistics_numeric.
format_tests.htest: Either NULL or a function. Default is NULL. If a function, then it will replace format_tests.htest. The function must mimic format_tests: arguments are x and the ellipsis ... . Result is a data.frame with 1 row and unique colnames.
format_tests.htest_with_effect_size: Analog to argument format_tests.htest.
format_p_values: A function with one argument returning a character with the same length as the argument. This function is called by format_tests to produce printable p-values.
format_percent: A function with one argument returning a character with the same length as the argument. This function is called by format_statistics for factors to produce printable percentages.
format_numbers: A function with one argument returning a character with the same length as the argument. This function is called by format_statistics and format_tests for numbers that are not p-values or percentages.
digits: How many digits a number should have in the table. Default 2. Used by format_percent and passed to format.
get_alias.default: A function with arguments x and ... returning a character or NULL. This function is called by get_alias and create_alias_mapping to retrieve alternative variable names to print in the table.
get_alias.labelled: A function with arguments x and ... that must return a character. This function is called by get_alias on the columns that have class labelled.
modifiy_colnames_without_alias: A function with arguments x and ... returning a character. This function is called by create_alias_mapping on the columns for which is.null(get_alias(x)) is TRUE. Replaces underscores by blanks and then calls trimws.
indent_character: A character with length 1. Passed to indent_data_frame. Every option of format_to has a corresponding indent_character. See the help of atable for these options.
indent_character_compact: A character with length 1. Passed to atable_compact. Value is " " for viewing in the console. Use "\quad" for Latex and intToUtf8(160) for Word.
indent: A logical with length 1. Passed to atable. Controls whether indent_data_frame is called.
format_statistics_compact.statistics_factor: A function with the same properties as format_statistics. Used as a default value for atable_compact.
format_statistics_compact.statistics_numeric: A function with the same properties as format_statistics. Used as a default value for atable_compact.
format_statistics_longitudinal.statistics_factor: A function with the same properties as format_statistics. Used as a default value for atable_longitudinal.
format_statistics_longitudinal.statistics_numeric: A function with the same properties as format_statistics. Used as a default value for atable_longitudinal.
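As an illustration of the contract for format_p_values (one argument in, a character of the same length out), a replacement formatter could look like the following base R sketch. The name `my_format_p_values` and the 0.001 threshold are illustrative choices, not package defaults.

```r
# Sketch of a p-value formatter: one argument in, character of the same length out.
my_format_p_values <- function(p) {
  out <- formatC(p, format = "f", digits = 3)
  # Very small p-values are reported as a bound instead of "0.000".
  out[!is.na(p) & p < 0.001] <- "<0.001"
  out[is.na(p)] <- NA_character_
  out
}

formatted_p <- my_format_p_values(c(0.04567, 0.00002, NA))

# To register it (requires the atable package):
# atable_options(format_p_values = my_format_p_values)
```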
Examples
atable_options() # show all options
atable_options('replace_NA_by' = 'no value') # set a new value
atable_options('replace_NA_by') # return the new value
Reset atable_options to default
Description
Does as the name implies. See also atable_options.
Usage
atable_options_reset()
Examples
atable_options('replace_NA_by') # show options
atable_options('replace_NA_by' = 'foo bar') # set a new value
atable_options('replace_NA_by') # show options
atable_options_reset() # restore all defaults
atable_options('replace_NA_by') # as before
Checks the output of function create_alias_mapping
Description
Checks the output of function create_alias_mapping.
Usage
check_alias_mapping(Alias_mapping)
Arguments
Alias_mapping |
Result of function |
Value
TRUE if Alias_mapping has the following properties:
Alias_mapping is a non-empty data.frame with character columns 'old' and 'new', without NA and "".
Column 'new' has no duplicates.
Else throws an error. Prints the duplicates of column 'new', if available.
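The listed properties can be expressed as a short base R predicate. This is an illustrative re-implementation under the stated properties, not the package code: `is_valid_alias_mapping` is a hypothetical name, and it returns FALSE where check_alias_mapping would throw an error.

```r
# Illustrative re-implementation of the documented checks (not the package code).
is_valid_alias_mapping <- function(m) {
  is.data.frame(m) && nrow(m) > 0 &&
    all(c("old", "new") %in% colnames(m)) &&
    is.character(m$old) && is.character(m$new) &&
    !anyNA(m$old) && !anyNA(m$new) &&
    all(m$old != "") && all(m$new != "") &&
    anyDuplicated(m$new) == 0
}

good <- data.frame(old = c("mpg", "qsec"),
                   new = c("Consumption", "Quarter Mile Time"),
                   stringsAsFactors = FALSE)
# Two columns mapped to the same alias: column 'new' has a duplicate.
bad <- data.frame(old = c("a", "b"),
                  new = c("Same", "Same"),
                  stringsAsFactors = FALSE)
```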
Checks the output of function format_statistics
Description
Checks the output of function format_statistics.
Usage
check_format_statistics(x)
Arguments
x |
Result of function |
Value
TRUE if x has the following properties:
x is a non-empty data.frame with 2 columns called 'tag' and 'value'.
Column 'tag' has class factor and no duplicates.
Column 'value' is a character.
Else throws an error.
Checks the output of function format_tests
Description
Checks the output of function format_tests.
Usage
check_format_tests(x)
Arguments
x |
Result of function |
Value
TRUE if x has the following properties:
x is a data.frame with exactly one row and with unique colnames. Else throws an error.
Checks the output of function statistics
Description
Checks the output of function statistics.
Usage
check_statistics(x)
Arguments
x |
Result of function |
Value
TRUE if x has the following properties:
x is a named list with length > 0.
The names of the list must not have duplicates.
The names may contain NA. Else an error.
Checks the output of functions two_sample_htest and multi_sample_htest
Description
Checks the output of function two_sample_htest and multi_sample_htest.
Usage
check_tests(x)
Arguments
x |
Result of function |
Value
TRUE if x has the following properties:
x is a named list with length > 0.
The names of the list must not have duplicates.
The names may contain NA. Else an error.
Most hypothesis-test-functions in R like t.test or chisq.test return an object of class htest.
Such an object passes these checks.
Additional fields can be added to these objects and they will still pass this check.
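The structural properties above can be inspected directly on a standard test result using base R. The added field `effect_size` is a made-up name used only to show that extra fields keep the structure intact.

```r
# A standard htest object from base R.
res <- t.test(len ~ supp, data = datasets::ToothGrowth)

class(res)                        # "htest"
is_named_list <- is.list(res) && length(res) > 0 && !is.null(names(res))
names_unique <- anyDuplicated(names(res)) == 0

# Hypothetical extra field: the object is still a named list with unique names.
res$effect_size <- NA_real_
still_unique <- anyDuplicated(names(res)) == 0
```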
Get Aliases of column names
Description
Data.frames passed to atable must have syntactically valid column names,
see is_syntactically_valid_name.
So no blanks or special characters are allowed.
But reporting in human-readable language needs special characters.
These functions allow atable to handle arbitrary characters for pretty printing.
Usage
create_alias_mapping(DD, ...)
Arguments
DD |
A data.frame |
... |
Passed from and to other methods. |
Details
We use attributes here, to assign alternative names to columns.
Also class labelled created by Hmisc's label is supported.
See get_alias for the function that does the actual work.
If no aliases are found, then underscores in the column names of DD will be replaced by blanks.
See Examples in ?atable.
Value
create_alias_mapping returns a data.frame with two columns old and new and
as many rows as DD has columns. Column old contains the original column names of
DD and column new their aliases.
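The attribute-based lookup with its underscore fallback can be sketched in base R as follows. `alias_or_fallback` is a hypothetical function that only handles the `alias` attribute; it ignores class labelled and is not the package's implementation.

```r
# Sketch: read attribute 'alias', else replace underscores by blanks.
alias_or_fallback <- function(DD) {
  new <- vapply(colnames(DD), function(nm) {
    a <- attr(DD[[nm]], "alias")
    if (is.null(a)) trimws(gsub("_", " ", nm)) else as.character(a)
  }, character(1))
  data.frame(old = colnames(DD), new = unname(new), stringsAsFactors = FALSE)
}

DD <- data.frame(mpg = c(21, 22.8), quarter_mile_time = c(16.5, 18.6))
attr(DD$mpg, "alias") <- "Consumption [Miles (US)/ gallon]"
mapping <- alias_or_fallback(DD)
```

The result has one row per column of `DD`: the aliased column keeps its attribute text, the other falls back to its name with underscores replaced by blanks.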
Format statistics
Description
The results of function statistics must be formatted before printing. format_statistics does this.
Usage
format_statistics(x, ...)
## S3 method for class 'statistics_numeric'
format_statistics(x, format_statistics.statistics_numeric = NULL, ...)
## S3 method for class 'statistics_factor'
format_statistics(x, format_statistics.statistics_factor = NULL, ...)
## S3 method for class 'statistics_count_me'
format_statistics(x, ...)
## Default S3 method:
format_statistics(x, ...)
Arguments
x |
An object. |
... |
Passed from and to other methods. |
format_statistics.statistics_numeric |
Either |
format_statistics.statistics_factor |
Analog to argument format_statistics.statistics_numeric |
Details
This function defines which statistics are printed in the final table and how they are formatted.
The format depends on the class of x. See section Methods.
If you are not pleased with the current format, you may alter these functions.
But you must keep the original output format, see section Value.
Function check_format_statistics checks if the output of format_statistics is suitable for further processing.
Value
A non-empty data.frame with 2 columns called 'tag' and 'value'.
Column 'tag' has class factor and no duplicates.
Column 'value' is a character.
See also function check_format_statistics.
Methods (by class)
- format_statistics(statistics_numeric): Defines how to format class statistics_numeric. Returns a data.frame with 2 rows. Column 'tag' contains 'Mean_SD' and 'valid_missing'. Column 'value' contains two values: the first is the rounded mean and standard deviation pasted together, with the standard deviation in brackets; the second is the number of non-missing and missing values pasted together, with the number of missing values in brackets.
- format_statistics(statistics_factor): Defines how to format class statistics_factor. Returns a data.frame. Column 'tag' contains all names of x. Column 'value' contains the percentages and the total number of values in brackets.
- format_statistics(statistics_count_me): Defines how to format class statistics_count_me. Returns a data.frame. Column 'tag' contains the empty character ''. The empty character is chosen because colname_for_observations already appears in the final table. Column 'value' contains the number of observations. See also 'colname_for_observations' in atable_options.
- format_statistics(default): Returns a data.frame. Column 'tag' contains all names of x. Column 'value' contains all elements of x, rounded by format.
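A replacement formatter has to return the tag/value data.frame described in section Value. The base R sketch below shows one for a statistics result holding a median and a MAD; `format_median_mad` and the field names `Median` and `MAD` are illustrative assumptions that match the new_stats example in the atable examples.

```r
# Hypothetical formatter for a statistics result with fields Median and MAD.
format_median_mad <- function(x, ...) {
  data.frame(
    tag = factor("Median (MAD)"),                      # factor, no duplicates
    value = sprintf("%.1f (%.1f)", x$Median, x$MAD),   # character
    stringsAsFactors = FALSE
  )
}

values <- c(1, 2, 4, 8)
stats <- list(Median = median(values), MAD = mad(values))
formatted <- format_median_mad(stats)
```

According to the argument list above, such a function could be supplied as format_statistics.statistics_numeric, either in the call or via atable_options.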
Formats hypothesis test results
Description
The results of functions two_sample_htest and multi_sample_htest
must be formatted before printing. format_tests does this.
Usage
format_tests(x, ...)
## S3 method for class 'htest'
format_tests(x, format_tests.htest = NULL, ...)
## S3 method for class 'htest_with_effect_size'
format_tests(x, format_tests.htest_with_effect_size = NULL, ...)
## Default S3 method:
format_tests(x, ...)
Arguments
x |
An object. |
... |
Passed from and to other methods. |
format_tests.htest |
Either |
format_tests.htest_with_effect_size |
Analog to argument format_tests.htest |
Details
This function defines which test results are printed in the final table and how they are formatted.
The format depends on the class of x. See section Methods.
If you are not pleased with the current format you may alter these functions.
But you must keep the original output-format, see section Value.
Function check_format_tests checks if the output of format_tests is suitable for further processing.
Value
A non-empty data.frame with one row.
See also function check_format_tests.
Methods (by class)
- format_tests(htest): Defines how to format class htest. Returns a data.frame with 1 row. Column p contains the p-value of x.
- format_tests(htest_with_effect_size): Defines how to format class htest_with_effect_size. Returns a data.frame with 1 row. Column p contains the p-value of x. Column stat contains the test statistic. Column Effect Size (CI) contains an effect size and its 95% confidence interval.
- format_tests(default): Tries to cast to data.frame with one row. Uses the names of the list as colnames.
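The one-row contract can be illustrated with a base R sketch that formats a standard htest object. `format_htest_sketch` and its rounding choices are hypothetical; only the output shape (one row, unique column names) follows the documentation.

```r
# Hypothetical formatter: one row, unique column names, printable characters.
format_htest_sketch <- function(x, ...) {
  data.frame(
    p = formatC(x$p.value, format = "f", digits = 3),
    stat = formatC(unname(x$statistic), format = "f", digits = 2),
    stringsAsFactors = FALSE
  )
}

res <- t.test(len ~ supp, data = datasets::ToothGrowth)
out <- format_htest_sketch(res)
```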
Get Aliases of column names
Description
Retrieves attributes label and units of class labelled and attribute alias otherwise.
Usage
get_alias(x, ...)
## S3 method for class 'labelled'
get_alias(x, ...)
## Default S3 method:
get_alias(x, ...)
## S3 method for class 'data.frame'
get_alias(x, ...)
## S3 method for class 'list'
get_alias(x, ...)
Arguments
x |
An object. Aliases will be retrieved of |
... |
Passed from and to other methods. |
Details
We use attributes here, to assign alternative names to columns.
Also class labelled created by Hmisc's label is supported.
This is a workhorse function; see create_alias_mapping for the high-level function.
Value
For atomic vectors a character or NULL; for non-atomic vectors the results of get_alias applied to its elements.
Methods (by class)
- get_alias(labelled): Retrieve attributes label and units, if available. Units are bracketed by '[]'. See also label and units. The user may alter this method via atable_options, see help there.
- get_alias(default): Retrieve attribute alias via attr. This attribute may be an arbitrary character. If there is no attribute alias, then get_alias.default returns NULL.
- get_alias(data.frame): Calls get_alias on every column.
- get_alias(list): Calls get_alias on every element of the list.
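The attribute lookup described in the methods can be sketched in base R as follows. The helper get_alias_sketch is hypothetical: it checks for a label attribute instead of dispatching on class labelled, and the space between label and bracketed units is an assumption.

```r
# Hypothetical helper, not part of atable: reads the attributes that the
# methods above mention.
get_alias_sketch <- function(x) {
  if (is.list(x)) {
    # data.frames and lists: recurse into columns / elements
    return(lapply(x, get_alias_sketch))
  }
  label <- attr(x, "label", exact = TRUE)
  units <- attr(x, "units", exact = TRUE)
  if (!is.null(label)) {
    # labelled-style: label plus units in square brackets, if any
    if (is.null(units)) return(label)
    return(paste0(label, " [", units, "]"))
  }
  attr(x, "alias", exact = TRUE) # NULL when no alias is set
}

DD <- data.frame(weight = c(70, 82), height = c(1.7, 1.8), id = 1:2)
attr(DD$weight, "alias") <- "Body weight"
attr(DD$height, "label") <- "Height"
attr(DD$height, "units") <- "m"

aliases <- get_alias_sketch(DD)
str(aliases)
```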
Indents data.frames
Description
Indents data.frames for printing them as tables.
Usage
indent_data_frame(
DD,
keys,
values = setdiff(colnames(DD), keys),
character_empty = "",
numeric_empty = NA,
indent_character = "\\quad",
colname_indent = "Group"
)
Arguments
DD |
A data.frame. Should be sorted by |
keys |
A character. Subset of |
values |
A character. Subset of colnames(DD). DD[keys] must be class character, factor or numeric. |
character_empty |
A character. Default ''. This character will be put in the new lines in class character columns. |
numeric_empty |
A numeric. Default NA. This value will be put in the new lines in class numeric columns. |
indent_character |
A character. character for one indent. Default is '\quad' (meant for latex). Can also be ' ' for Word. |
colname_indent |
A character. Default 'Group'. Name of the new column with the indented keys. |
Details
Squeezes multiple key columns into one column and indents the values accordingly.
Adds new lines with the indented keys to the data.frame.
Meant for wide tables that need to be narrower and more 'readable'.
Meant for printing with e.g. xtable::xtable or Hmisc::latex or officer::body_add_table.
Look at the examples for a more precise description.
Meant for left-aligned columns. That is why the indent_character is inserted
to the left of the original values.
Value
A data.frame. Columns: c(colname_indent, values).
Column colname_indent contains all combinations of DD[keys], now indented, squeezed into this column and cast to character.
Columns 'values' contain all values of DD[values] unchanged.
Number of rows is sum(cumprod(sapply(DD[keys], nlevels))).
Examples
DD <- expand.grid(Arm = paste0('Arm ', c(1,2,4)),
                  Gender = c('Male', 'Female'),
                  Haircolor = c('Red', 'Green', 'Blue'),
                  Income = c('Low', 'Med', 'High'), stringsAsFactors = TRUE)
DD <- doBy::orderBy(~ Arm + Gender + Haircolor + Income, DD)
DD$values1 <- runif(dim(DD)[1])
DD$values2 <- 1
DD$values3 <- sample(letters[1:4], size = nrow(DD), replace = TRUE)
keys = c('Arm', 'Gender', 'Haircolor', 'Income')
values = c('values1', 'values2', 'values3')
## Not run:
DDD <- indent_data_frame(DD, keys, indent_character = ' ')
# print both:
Hmisc::latex(DD,
             file = '',
             longtable = TRUE,
             caption = 'Original table',
             rowname = NULL)
Hmisc::latex(DDD,
             file = '',
             longtable = TRUE,
             caption = 'Indented table',
             rowname = NULL)
## End(Not run)
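For the example data above, the row count stated in section Value can be checked with base R alone. This is only a sketch: it assumes that all keys are factors, that every combination of their levels occurs in DD, and that the formula is read as the sum of cumulative products of the per-key level counts.

```r
# Rebuild the key columns of the example above (all combinations present).
DD <- expand.grid(Arm = paste0("Arm ", c(1, 2, 4)),
                  Gender = c("Male", "Female"),
                  Haircolor = c("Red", "Green", "Blue"),
                  Income = c("Low", "Med", "High"),
                  stringsAsFactors = TRUE)
keys <- c("Arm", "Gender", "Haircolor", "Income")

levels_per_key <- sapply(DD[keys], nlevels)   # 3, 2, 3, 3
expected_rows <- sum(cumprod(levels_per_key)) # 3 + 6 + 18 + 54
expected_rows
```

Under these assumptions the 54 original rows are joined by 3 + 6 + 18 new lines, one for each combination of the first one, two and three keys.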
Checks if valid name
Description
Checks for valid names by make.names,
i.e. x is valid iff make.names does nothing with x.
Usage
is_syntactically_valid_name(x)
Arguments
x |
An object. |
Value
A logical with length 1. TRUE when x is a character with length > 0, without duplicates,
and valid. Otherwise FALSE, with a warning that describes what is wrong.
Examples
x <- c('asdf', NA,'.na', '<y', 'asdf', 'asdf.1')
is_syntactically_valid_name(x)
is_syntactically_valid_name(x[FALSE]) # FALSE because empty
is_syntactically_valid_name(NA) # FALSE because not character
is_syntactically_valid_name(as.character(NA)) # FALSE because NA
is_syntactically_valid_name('NA') # FALSE. make.names changes 'NA' to 'NA.'
is_syntactically_valid_name(letters) # TRUE
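The rule "x is valid iff make.names does nothing with x" can be sketched in base R. The helper is_valid_name_sketch is hypothetical and omits the warnings that the real function gives.

```r
# Hypothetical helper, not part of atable; it omits the warnings.
is_valid_name_sketch <- function(x) {
  is.character(x) &&
    length(x) > 0 &&
    !anyNA(x) &&
    !anyDuplicated(x) &&
    identical(make.names(x), x) # valid iff make.names changes nothing
}

is_valid_name_sketch(letters)      # all names pass unchanged
is_valid_name_sketch(c("a", "<y")) # '<y' is altered by make.names
is_valid_name_sketch("NA")         # 'NA' is altered to 'NA.'
```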
Calculates multi sample hypothesis tests
Description
Calculates multi sample hypothesis tests depending on the class of its input.
Usage
multi_sample_htest(value, group, ...)
## S3 method for class 'logical'
multi_sample_htest(value, group, ...)
## S3 method for class 'factor'
multi_sample_htest(value, group, multi_sample_htest.factor = NULL, ...)
## S3 method for class 'character'
multi_sample_htest(value, group, ...)
## S3 method for class 'ordered'
multi_sample_htest(value, group, multi_sample_htest.ordered = NULL, ...)
## S3 method for class 'numeric'
multi_sample_htest(value, group, multi_sample_htest.numeric = NULL, ...)
Arguments
value |
An atomic vector. |
group |
A factor, same length as |
... |
Passed to methods. |
multi_sample_htest.factor |
Analogous to argument two_sample_htest.numeric |
multi_sample_htest.ordered |
Analogous to argument two_sample_htest.numeric |
multi_sample_htest.numeric |
Either |
Details
Calculates multi sample hypothesis tests depending on the class of its input.
Results are passed to function format_tests for the final table.
If you are not pleased with the current hypothesis tests you may alter these functions.
But you must keep the original output-format, see section Value.
Function check_tests checks if the output of multi_sample_htest is suitable for further processing.
The function multi_sample_htest is essentially a wrapper
to standardize the arguments of various hypothesis test functions.
Value
A named list with length > 0.
Most hypothesis test functions in R like t.test or chisq.test return an
object of class 'htest'. 'htest'-objects are a suitable output for function multi_sample_htest.
Function check_tests checks if the output is suitable for further processing.
Methods (by class)
- multi_sample_htest(logical): Casts to factor and then calls method multi_sample_htest again.
- multi_sample_htest(factor): Calls chisq.test.
- multi_sample_htest(character): Casts value to factor and then calls method multi_sample_htest again.
- multi_sample_htest(ordered): Calls kruskal.test.
- multi_sample_htest(numeric): Calls multi_sample_htest's method on ordered(value).
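The two base R tests named in the methods can be called directly with a value and a group vector, as in the sketch below. The data are simulated for illustration, and calling kruskal.test on the numeric values directly is a simplification of the dispatch described above.

```r
set.seed(42)
group <- factor(rep(c("A", "B", "C"), each = 30))
value_factor <- factor(sample(c("x", "y", "z"), 90, replace = TRUE))
value_numeric <- rnorm(90)

# factor: chi-squared test on the contingency table of value and group
res_factor <- chisq.test(table(value_factor, group))

# ordered / numeric: Kruskal-Wallis test of value across the groups
res_numeric <- kruskal.test(value_numeric, group)

class(res_factor)
class(res_numeric)
```

Both calls return objects of class 'htest', which is the kind of output described in section Value.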
Replaces NA
Description
Replaces NA in characters, factors and data.frames.
Usage
replace_NA(x, ...)
## S3 method for class 'character'
replace_NA(x, replacement = atable_options("replace_NA_by"), ...)
## S3 method for class 'factor'
replace_NA(x, ...)
## S3 method for class 'ordered'
replace_NA(x, ...)
## S3 method for class 'data.frame'
replace_NA(x, ...)
## S3 method for class 'list'
replace_NA(x, ...)
## Default S3 method:
replace_NA(x, ...)
Arguments
x |
An object. |
... |
Passed to methods. |
replacement |
A character of length 1. Default value is defined
in |
Details
The atable package aims to create readable tables. For readers without a
computing background NA has no meaning, so replace_NA exists.
Methods for character, factor, ordered, list and data.frame available.
Default method returns x unchanged.
Gives a warning when replacement is already present in x, but still
does the replacement.
Silently returns x unchanged when there are no NA in x.
Silently returns x unchanged when replacement is not a character of
length 1 or when replacement is NA.
Value
Same class as x, now with NA replaced by replacement.
Methods (by class)
- replace_NA(character): replaces NA with replacement.
- replace_NA(factor): applies replace_NA to the levels of the factor. A factor with length > 0 without levels will get the level replacement.
- replace_NA(ordered): as factor.
- replace_NA(data.frame): applies replace_NA to all columns.
- replace_NA(list): applies replace_NA to all elements of the list.
- replace_NA(default): returns x unchanged.
Examples
Character <- c(NA,letters[1:3], NA)
Factor <- factor(Character)
Ordered <- ordered(Factor)
Numeric <- rep(1, length(Factor))
Factor_without_NA <- factor(letters[1:length(Factor)])
DD <- data.frame(Character, Factor, Ordered,
Numeric, Factor_without_NA,
stringsAsFactors = FALSE)
## Not run:
DD2 <- replace_NA(DD, replacement = 'no value')
summary(DD)
summary(DD2) # now with 'no value' instead of NA in columns Character, Factor and Ordered
atable_options(replace_NA_by = 'not measured') # use atable_options to set replacement
DD3 <- replace_NA(DD)
summary(DD3) # now with 'not measured' instead of NA
atable_options_reset() # set 'replace_NA_by' back to default
## End(Not run)
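The idea behind the factor method can be sketched in base R without the package. The helper replace_na_factor_sketch is hypothetical, and placing the replacement as the last level is an assumption rather than documented behaviour.

```r
# Hypothetical helper, not part of atable: replaces NA in a factor by
# adding the replacement as a new level (placed last here, an assumption).
replace_na_factor_sketch <- function(x, replacement) {
  if (!anyNA(x)) return(x) # nothing to replace
  lv <- unique(c(levels(x), replacement))
  out <- as.character(x)
  out[is.na(out)] <- replacement
  factor(out, levels = lv)
}

f <- factor(c(NA, "a", "b", "c", NA))
f2 <- replace_na_factor_sketch(f, "no value")
table(f2)
```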
Replaces consecutive elements
Description
If x[i+1] = x[i], then x[i+1] is replaced by the value of by, for i = 1, ..., length(x)-1.
Usage
replace_consecutive(x, by = "", fun_for_identical = base::identical)
Arguments
x |
A character or factor. |
by |
A character with length 1. |
fun_for_identical |
A function with two arguments called |
Details
Equality (=) is tested with function identical by default.
A different function can be supplied via argument fun_for_identical.
Value
A character, same length as x, now with consecutive duplicates replaced by by.
If length(x) < 2, x is returned unchanged.
Examples
x <- rep(c('a','b','c','d'), times=c(2,4,1,3))
x
## Not run: replace_consecutive(x)
# NA should not be identical. So change fun_for_identical
fun_for_identical <- function(x,y) !is.na(x) && !is.na(y) && identical(x,y)
x <- c(1,1,3,3,NA,NA, 4)
x
## Not run: replace_consecutive(x, by="99")
## Not run: replace_consecutive(x, by="99", fun_for_identical = fun_for_identical)
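A minimal base R sketch of the replacement rule is given below. The helper replace_consecutive_sketch and its argument same are hypothetical. It compares each element with its predecessor in the original vector, which is one reading of the description above and may differ in detail from the package's implementation.

```r
# Hypothetical helper, not part of atable.
replace_consecutive_sketch <- function(x, by = "", same = identical) {
  x <- as.character(x)
  if (length(x) < 2) return(x)
  # compare each element with its predecessor in the original vector
  dup <- c(FALSE, unname(mapply(same, x[-1], x[-length(x)])))
  x[dup] <- by
  x
}

x <- rep(c("a", "b", "c", "d"), times = c(2, 4, 1, 3))
out <- replace_consecutive_sketch(x)
out

# Treat NA as never identical to anything, including another NA.
not_na_same <- function(x, y) !is.na(x) && !is.na(y) && identical(x, y)
out_na <- replace_consecutive_sketch(c(1, 1, 3, 3, NA, NA, 4),
                                     by = "99", same = not_na_same)
out_na
```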
A data.frame with standardized random data of various classes
Description
A data.frame intended for testing the atable function with standardized random data and missing values in various classes.
Usage
standardized_test_data
Format
A data frame with 1080 rows and 7 variables:
- Split1
A factor with 2 levels without NA. The two levels have the same frequency (540).
- Split2
A factor with 2 levels with NA. The two levels and the NA have the same frequency (360).
- Group
A factor with 2 levels with NA. The two levels and the NA have the same frequency (360).
- Logical
A logical.
- Factor
A factor with 3 levels.
- Ordered
Class ordered with 4 levels.
- Numeric
Class numeric.
Details
For every subset defined by a triplet of the levels of Split1, Split2 and Group the variables have the following properties:
- 60 observations.
- Logical has exactly the same number of TRUE and FALSE and NA (20).
- Factor has exactly the same number of levels taken and NA (15).
- Ordered has exactly the same number of levels taken and NA (12).
- Numeric is sampled from a normal distribution and then standardized to sd 1 and with 6 NA. Its mean is 12 when Group is 'Treatment' and 10 otherwise (up to 10^-17).
Examples
atable::atable(Logical + Numeric + Factor + Ordered ~ Group | Split1 + Split2,
               atable::standardized_test_data, add_levels_for_NA = TRUE, format_to = 'Word')
Calculates descriptive statistics
Description
Calculates descriptive statistics depending on the class of its input.
Usage
statistics(x, ...)
## S3 method for class 'numeric'
statistics(x, statistics.numeric = NULL, ...)
## S3 method for class 'factor'
statistics(x, statistics.factor = NULL, ...)
## S3 method for class 'logical'
statistics(x, labels_TRUE_FALSE = atable_options("labels_TRUE_FALSE"), ...)
## S3 method for class 'character'
statistics(x, ...)
## S3 method for class 'ordered'
statistics(x, statistics.ordered = NULL, ...)
## S3 method for class 'count_me'
statistics(x, ...)
Arguments
x |
An object. Statistics will be calculated of |
... |
Passed from and to other methods. |
statistics.numeric |
Either |
statistics.factor |
Analogous to argument statistics.numeric |
labels_TRUE_FALSE |
For relabeling logicals. See also |
statistics.ordered |
Analogous to argument statistics.numeric |
Details
Calculates descriptive statistics depending on the class of its input.
Results are passed to function format_statistics.
If you are not pleased with the current descriptive statistics you may alter these functions.
But you must keep the original output-format, see section Value.
Function check_statistics checks if the output of statistics is suitable for further processing.
Value
The results of statistics are passed to function format_statistics.
So the results of statistics must have a class for which the generic format_statistics has a method.
format_statistics has a default method, which accepts lists. So the results of statistics can be a
named list with length > 0. The names of the list must have no duplicates.
Function check_statistics checks if the output of statistics is suitable for further processing.
Methods (by class)
- statistics(numeric): Descriptive statistics are: length, number of missing values, mean and standard deviation. Class of the result is 'statistics_numeric' and there is a method format_statistics_to_Latex.statistics_numeric. This function is meant for interval scaled variables.
- statistics(factor): Counts the numbers of occurrences of the levels of x with function table. This function is meant for nominal and ordinal scaled variables.
- statistics(logical): Casts x to factor, then applies statistics again. The labels for TRUE and FALSE can also be modified by setting atable_options('labels_TRUE_FALSE').
- statistics(character): Casts x to factor, then applies statistics again.
- statistics(ordered): Casts x to factor, then applies statistics again.
- statistics(count_me): Returns the length of x. For class 'count_me' see add_observation_column.
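The output format required in section Value, a named list with length > 0 and no duplicated names, can be sketched in base R. The helpers statistics_numeric_sketch and statistics_factor_sketch are hypothetical, and the element names are assumptions; the real methods additionally set classes such as 'statistics_numeric'.

```r
# Hypothetical helpers, not part of atable.
statistics_numeric_sketch <- function(x) {
  list(length = length(x),
       missing = sum(is.na(x)),
       mean = mean(x, na.rm = TRUE),
       sd = sd(x, na.rm = TRUE))
}

statistics_factor_sketch <- function(x) {
  tab <- table(x) # counts per level, NA not counted
  setNames(as.list(as.vector(tab)), names(tab))
}

num <- c(1, 2, 3, NA, 6)
stats_num <- statistics_numeric_sketch(num)
stats_fac <- statistics_factor_sketch(factor(c("a", "b", "a", "c")))
str(stats_num)
str(stats_fac)
```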
A data.frame with random data of various classes
Description
A data.frame intended for testing the atable function with random data and missing values in various classes.
Usage
test_data
Format
A data frame with 129 rows and 11 variables:
- Split1
A factor with 2 levels, drawn uniformly.
- Split2
A factor with 3 levels, drawn uniformly.
- Group
A factor with 2 levels, drawn uniformly.
- Group2
A factor with 3 levels, drawn uniformly.
- Numeric
A sample from the standard normal distribution.
- Numeric2
A sample from the normal distribution with mean 4 and sd 3.
- Logical
A logical, drawn uniformly from TRUE, FALSE and NA.
- Factor
A factor with 4 levels drawn with weights 1:1:2:2.
- Ordered
Class ordered with 3 levels, drawn uniformly.
- Character
Class character drawn uniformly from c('a', 'b', '').
- Date
Class Date, generated by adding 2001-05-25 to a sample of the Poisson distribution with lambda 42.
6 Missing values were randomly added to each of Numeric, Numeric2, Factor, Ordered, Character and Date.
A wrapper for latexTranslate
Description
translate_to_LaTeX calls latexTranslate.
Usage
translate_to_LaTeX(x, ...)
## S3 method for class 'data.frame'
translate_to_LaTeX(x, ...)
## S3 method for class 'list'
translate_to_LaTeX(x, ...)
## S3 method for class 'character'
translate_to_LaTeX(
x,
inn = NULL,
out = NULL,
pb = FALSE,
greek = FALSE,
na = "",
...
)
## S3 method for class 'numeric'
translate_to_LaTeX(x, ...)
## S3 method for class 'factor'
translate_to_LaTeX(x, ...)
## S3 method for class 'logical'
translate_to_LaTeX(x, ...)
Arguments
x |
An object. |
inn, out, pb, greek, na, ... |
As in |
Details
The result is suitable for printing with latex.
translate_to_LaTeX uses the S3 object system. See section Methods.
Value
Same length as x, now translated to latex.
Methods (by class)
- translate_to_LaTeX(data.frame): Applies latexTranslate to rownames(x), colnames(x) and all columns of x.
- translate_to_LaTeX(list): Translates all elements of x.
- translate_to_LaTeX(character): As latexTranslate.
- translate_to_LaTeX(numeric): Casts to character and then translates.
- translate_to_LaTeX(factor): Translates the levels of the factor.
- translate_to_LaTeX(logical): Casts to character and then translates.
Two sample hypothesis tests and effect size
Description
Calculates two sample hypothesis tests and effect size depending on the class of its input.
Usage
two_sample_htest(value, group, ...)
## S3 method for class 'character'
two_sample_htest(value, group, ...)
## S3 method for class 'factor'
two_sample_htest(value, group, two_sample_htest.factor = NULL, ...)
## S3 method for class 'logical'
two_sample_htest(value, group, ...)
## S3 method for class 'numeric'
two_sample_htest(value, group, two_sample_htest.numeric = NULL, ...)
## S3 method for class 'ordered'
two_sample_htest(value, group, two_sample_htest.ordered = NULL, ...)
Arguments
value |
An atomic vector. These values will be tested. |
group |
A factor with two levels and same length as |
... |
Passed to methods. |
two_sample_htest.factor |
Analogous to argument two_sample_htest.numeric |
two_sample_htest.numeric |
Either |
two_sample_htest.ordered |
Analogous to argument two_sample_htest.numeric |
Details
Results are passed to function format_tests for the final table.
So the results of two_sample_htest must have a class for which the generic
format_tests has a method.
If you are not pleased with the current hypothesis tests you may alter these functions. But you must keep the original output-format, see section Value.
Note that the various statistical test functions in R have heterogeneous arguments:
for example chisq.test and ks.test do not have
formula/data as arguments, whereas wilcox.test and
kruskal.test do. So the function two_sample_htest is essentially
a wrapper to standardize the arguments of various hypothesis test functions.
As two_sample_htest is only intended to be applied to unpaired two sample data,
the two arguments value and group are sufficient to describe the data.
Note that e.g. for class numeric the p-value is calculated by ks.test and the effect
size's 95% CI by cohen.d. As these are two different functions, the results may be
contradictory: the p-value of ks.test can be smaller than 0.05
while the CI of cohen.d contains 0.
Value
A named list with length > 0, where all elements of the list are atomic and have the same length.
Most hypothesis-test-functions in R like t.test or chisq.test
return an object of class 'htest'. 'htest'-objects are a suitable output for function
two_sample_htest. Function check_tests checks if the output is suitable for
further processing.
Methods (by class)
- two_sample_htest(character): Casts value to factor and then calls method two_sample_htest again.
- two_sample_htest(factor): Calls chisq.test on value. Effect size is the odds ratio calculated by fisher.test (if value has two levels), or Cramer's V by CramerV.
- two_sample_htest(logical): Casts value to factor and then calls two_sample_htest again.
- two_sample_htest(numeric): Calls ks.test on value. Effect size is Cohen's d calculated by cohen.d.
- two_sample_htest(ordered): Calls wilcox.test on value. Effect size is Cliff's delta calculated by cliff.delta.
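The point made in Details about heterogeneous arguments can be illustrated with base R. The three wrappers below are hypothetical and only show how a common (value, group) interface might be put in front of ks.test, wilcox.test and chisq.test; they do not compute effect sizes and are not the package's methods.

```r
set.seed(7)
group <- factor(rep(c("Control", "Treatment"), each = 40))
value <- rnorm(80, mean = ifelse(group == "Treatment", 0.5, 0))

# Hypothetical wrappers, not part of atable: each takes (value, group).
ks_sketch <- function(value, group) {
  parts <- split(value, group) # ks.test takes two separate samples
  ks.test(parts[[1]], parts[[2]])
}
wilcox_sketch <- function(value, group) {
  # wilcox.test accepts a formula and a data argument
  wilcox.test(value ~ group, data = data.frame(value = value, group = group))
}
chisq_sketch <- function(value, group) {
  chisq.test(table(value, group)) # chisq.test needs a contingency table
}

res_ks <- ks_sketch(value, group)
res_wx <- wilcox_sketch(value, group)
res_chi <- chisq_sketch(factor(value > 0), group)
c(ks = res_ks$p.value, wilcox = res_wx$p.value, chisq = res_chi$p.value)
```

All three results are of class 'htest', so they share one structure even though the underlying functions are called differently.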