Title: | Tools for Working with ICD Codes and Comorbidity Algorithms |
Version: | 0.6.0 |
Description: | Provides tools for working with medical coding schemas such as the International Classification of Diseases (ICD). Includes functions for comorbidity classification algorithms such as the Pediatric Complex Chronic Conditions (PCCC), Charlson, and Elixhauser indices. |
Depends: | R (≥ 3.5.0) |
License: | GPL-2 |
Language: | en-US |
Encoding: | UTF-8 |
LazyData: | true |
Suggests: | data.table, kableExtra, knitr, rmarkdown, tibble (≥ 2.0.0) |
RoxygenNote: | 7.3.3 |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-10-09 07:13:40 UTC; peterdewitt |
Author: | Peter DeWitt |
Maintainer: | Peter DeWitt <peter.dewitt@cuanschutz.edu> |
Repository: | CRAN |
Date/Publication: | 2025-10-15 19:20:08 UTC |
medicalcoder
Description
An R package for working with ICD codes and comorbidity assessments.
Details
medicalcoder is a lightweight, base-R package for working with ICD-9 and
ICD-10 diagnosis and procedure codes. It provides fast, dependency-free tools
to look up, validate, and manipulate ICD codes, while also implementing
widely used comorbidity algorithms such as Charlson, Elixhauser, and the
Pediatric Complex Chronic Conditions (PCCC).
Designed for portability and reproducibility, the package avoids external
dependencies (requiring only R >= 3.5.0) yet offers a rich set of curated ICD
code libraries from the United States' Centers for Medicare and Medicaid
Services (CMS), Centers for Disease Control and Prevention (CDC), and the World Health
Organization (WHO).
The package balances performance with elegance: its internal caching,
efficient joins, and compact data structures make it practical for
large-scale health data analyses, while its clean design makes it easy to
extend or audit. Whether you need to flag comorbidities, explore ICD
hierarchies, or standardize clinical coding workflows, medicalcoder provides
a robust, transparent foundation for research and applied work in biomedical
informatics.
Implementation
The medicalcoder package was intentionally designed and built to have zero
dependencies beyond R version 3.5.0 (needed due to a change in data serialization)
and zero imports. The package is completely self-contained for the purposes
of installation and use.
This design choice was made for several reasons.
- Ease of installation:
  - The only requirement is R >= 3.5.0.
  - No need for external files, downloads, or other packages for the ICD database.
- Works well with different data paradigms: base R data.frames, tidyverse tibbles, and data.tables from the data.table package.
One of the reasons for focusing on building a self-contained package with no
need for additional namespaces is to make installation and use in a
pseudo-air-gapped system easier. The author of this package routinely works
on machines with extremely limited access to the World Wide Web. As
such, relying on any system dependencies or other R packages can become
difficult as the machine may or may not have the needed software. So long as
R >= 3.5.0 is available, medicalcoder will work.
A great deal of thought went into the performance of the methods and the size of the package. The internal data sets, for example, are not stored in a structure that is easy for an end user to work with directly. When the package namespace is loaded, the needed internal lookup tables are generated and cached.
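The build-once-and-cache idea described above can be sketched in base R. This is only an illustration under stated assumptions: the names `.cache`, `.raw_codes`, `build_cache()`, and `get_lookup()` are hypothetical and are not taken from the medicalcoder source, and the compact storage format is invented for the example.

```r
# Hypothetical sketch: expand compact internal data into lookup tables once
# and keep the results in an environment. Not the package's actual code.
.cache <- new.env(parent = emptyenv())

# Toy compact storage: one delimited string per ICD version
.raw_codes <- list(icd9 = "0010|0011|0019", icd10 = "A000|A001|A009")

build_cache <- function() {
  for (nm in names(.raw_codes)) {
    codes <- strsplit(.raw_codes[[nm]], "|", fixed = TRUE)[[1]]
    assign(nm, data.frame(code = codes, stringsAsFactors = FALSE),
           envir = .cache)
  }
  invisible(NULL)
}

# Later calls read the cached table instead of rebuilding it
get_lookup <- function(name) get(name, envir = .cache, inherits = FALSE)

build_cache()          # a package would typically do this from .onLoad()
get_lookup("icd10")
```

Storing the compact form keeps the installed package small, at the cost of a short one-time expansion when the namespace loads.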
Author(s)
Maintainer: Peter DeWitt peter.dewitt@cuanschutz.edu (ORCID)
Other contributors:
Tell Bennett tell.bennett@cuanschutz.edu (ORCID) [contributor]
Seth Russell seth.russell@cuanschutz.edu (ORCID) [contributor]
Meg Rebull meg.rebull@cuanschutz.edu (ORCID) [contributor]
See Also
comorbidities(), get_icd_codes(), is_icd()
Vignette for working with ICD codes:
- vignette(topic = "icd", package = "medicalcoder")
Vignettes for applying comorbidities:
- vignette(topic = "comorbidities", package = "medicalcoder")
- vignette(topic = "pccc", package = "medicalcoder")
- vignette(topic = "charlson", package = "medicalcoder")
- vignette(topic = "elixhauser", package = "medicalcoder")
Comorbidities
Description
Apply established comorbidity algorithms to ICD-coded data. Supported methods include several variants of the Charlson comorbidity system, Elixhauser, and the Pediatric Complex Chronic Conditions (PCCC).
Usage
comorbidities(
data,
icd.codes,
method,
id.vars = NULL,
icdv.var = NULL,
icdv = NULL,
dx.var = NULL,
dx = NULL,
poa.var = NULL,
poa = NULL,
age.var = NULL,
primarydx.var = NULL,
primarydx = NULL,
flag.method = c("current", "cumulative"),
full.codes = TRUE,
compact.codes = TRUE,
subconditions = FALSE
)
Arguments
data |
A |
icd.codes |
Character scalar naming the column in |
method |
Character string indicating the comorbidity algorithm to
apply to |
id.vars |
Optional character vector of column names. When
missing, the entire input |
icdv.var |
Character scalar naming the column in |
icdv |
An integer value of |
dx.var |
Character scalar naming the column in |
dx |
An integer indicating that all |
poa.var |
Character scalar naming the column with present-on-admission
flags: integer |
poa |
Integer scalar |
age.var |
Character scalar naming the column in |
primarydx.var |
Character scalar naming the column in |
primarydx |
An integer value of |
flag.method |
When |
full.codes , compact.codes |
Logical; when |
subconditions |
Logical scalar; when |
Details
When flag.method = "current", only codes from the index encounter
contribute to flags. When a longitudinal method is selected (e.g.,
"cumulative"), prior encounters for the same id.vars combination may
contribute to condition flags. For the cumulative method to work, id.vars
needs to be a character vector of length 2 or more. The last variable listed
in id.vars will be considered the encounter id and should be sortable. For
example, say you have data with a hospital, patient, and encounter id. The
id.vars could be one of two entries: c("hospital", "patient", "encounter")
or c("patient", "hospital", "encounter"). In both cases the return will be
the same, because "encounter" within the hospital/patient id interaction is
the same as "encounter" within the patient/hospital interaction.
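The difference between the two flagging modes can be illustrated on toy data with base R. This is a sketch of the idea only, assuming a single hypothetical condition and a simple carry-forward rule; it is not how comorbidities() is implemented, and the column names are made up for the example.

```r
# Toy data: one row per patient/encounter with a 0/1 flag for a single
# hypothetical condition coded at that encounter.
enc <- data.frame(
  patid   = c("P1", "P1", "P1", "P2", "P2"),
  enc_seq = c(1L, 2L, 3L, 1L, 2L),
  flag    = c(0L, 1L, 0L, 0L, 0L)
)

# Order by patient, then by the sortable encounter id
enc <- enc[order(enc$patid, enc$enc_seq), ]

# "current": only codes on the encounter itself count
enc$current <- enc$flag

# "cumulative": once flagged, the condition carries forward within patient
enc$cumulative <- ave(enc$flag, enc$patid, FUN = cummax)

enc
```

P1's third encounter has no qualifying code of its own, yet the cumulative column stays at 1 because the second encounter was flagged.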
It is critically important that the data[[tail(id.vars, 1)]] variable can
be sorted. Just because your data is sorted in temporal order does not mean
that the results will be correct if tail(id.vars, 1) is not in the same
order as the data. For example, say you had the following:
patid | enc_id | date |
P1 | 10823090 | Aug 2023 |
P1 | 10725138 | Jul 2025 |
id.vars = c("patid", "enc_id") will give the wrong result because enc_id
10725138 would be sorted to come before enc_id 10823090.
id.vars = c("patid", "date") would be sufficient input, assuming that date
has been correctly stored. Adding a column enc_seq, e.g.,
patid | enc_id | date | enc_seq |
P1 | 10823090 | Aug 2023 | 1 |
P1 | 10725138 | Jul 2025 | 2 |
and calling comorbidities() with id.vars = c("patid", "enc_seq") will
have better performance than using the date and will clear up any possible
issues with non-sequential encounter ids from the source data.
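The ordering problem above, and one way to build an enc_seq column, can be reproduced in base R. This is a minimal sketch: the table only gives month and year, so the day of the month used here is an assumption.

```r
# The two encounters from the table above (day of month is assumed)
visits <- data.frame(
  patid  = c("P1", "P1"),
  enc_id = c(10823090, 10725138),
  date   = as.Date(c("2023-08-01", "2025-07-01"))
)

# Sorting on enc_id puts the 2025 encounter ahead of the 2023 encounter
visits[order(visits$patid, visits$enc_id), c("enc_id", "date")]

# Derive a sortable sequence number from the date, within patient
visits$enc_seq <- ave(
  as.numeric(visits$date), visits$patid,
  FUN = function(d) rank(d, ties.method = "first")
)

visits
```

With enc_seq in place, id.vars = c("patid", "enc_seq") orders the encounters chronologically regardless of how the source system assigned enc_id.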
Value
The return object will be slightly different depending on the value of
method and subconditions.
- When subconditions = FALSE, a medicalcoder_comorbidities object (a data.frame with attributes) is returned. It contains column(s) for id.vars, if defined in the function call. For all methods there will be the following columns:
  - num_cmrb: a count of comorbidities/conditions flagged
  - cmrb_flag: a 0/1 integer indicator for at least one comorbidity/condition.
  Additional columns:
  - PCCC methods:
    - For method = "pccc_v2.0" and method = "pccc_v2.1", there is one indicator column per condition.
    - For method = "pccc_v3.0" and method = "pccc_v3.1", there are four columns per condition:
      - <condition>_dxpr_or_tech: the condition was flagged due to the presence of either a diagnostic or procedure code, or was flagged due to the presence of a technology dependence code along with at least one comorbidity being flagged by a diagnostic or procedure code.
      - <condition>_dxpr_only: the condition was flagged due to the presence of a non-technology dependent diagnostic or procedure code only.
      - <condition>_tech_only: the condition was flagged due to the presence of a technology dependent code only and at least one other comorbidity was flagged by a non-technology dependent code.
      - <condition>_dxpr_and_tech: the patient had both diagnostic or procedure codes and a technology dependence code for the condition.
  - For Charlson variants, indicator columns are returned for the relevant conditions, cci (Charlson Comorbidity Index), and age_score.
  - For Elixhauser variants, indicator columns are returned for all relevant comorbidities, mortality, and readmission indices.
- When subconditions = TRUE and the method is a PCCC variant, a list of length two is returned: the first element contains condition indicators; the second element is a named list of data.frames with indicators for subconditions within each condition.
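To make the relationship between the indicator columns and the two summary columns concrete, here is a small base R sketch for the simple case of one indicator column per condition. The condition names `cond_a` and `cond_b` are hypothetical, and the package may derive these columns differently, particularly for the four-column PCCC v3 layout.

```r
# Toy result with one 0/1 indicator column per (hypothetical) condition
flags <- data.frame(
  patid  = 1:3,
  cond_a = c(0L, 1L, 1L),
  cond_b = c(0L, 0L, 1L)
)

cond_cols <- c("cond_a", "cond_b")

# num_cmrb: how many conditions were flagged for each row
flags$num_cmrb <- as.integer(rowSums(flags[cond_cols]))

# cmrb_flag: 1 when at least one condition was flagged, else 0
flags$cmrb_flag <- as.integer(flags$num_cmrb > 0L)

flags
```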
References
Pediatric Complex Chronic Conditions:
Feudtner, C., Feinstein, J.A., Zhong, W. et al. Pediatric complex chronic conditions classification system version 2: updated for ICD-10 and complex medical technology dependence and transplantation. BMC Pediatr 14, 199 (2014). https://doi.org/10.1186/1471-2431-14-199
Feinstein JA, Hall M, Davidson A, Feudtner C. Pediatric Complex Chronic Condition System Version 3. JAMA Netw Open. 2024;7(7):e2420579. https://doi.org/10.1001/jamanetworkopen.2024.20579
Charlson Comorbidities:
Mary E. Charlson, Peter Pompei, Kathy L. Ales, C.Ronald MacKenzie, A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation, Journal of Chronic Diseases, Volume 40, Issue 5, 1987, Pages 373-383, ISSN 0021-9681, https://doi.org/10.1016/0021-9681(87)90171-8.
Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992 Jun;45(6):613-9. https://doi.org/10.1016/0895-4356(92)90133-8. PMID: 1607900.
Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, Saunders LD, Beck CA, Feasby TE, Ghali WA. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005 Nov;43(11):1130-9. https://doi.org/10.1097/01.mlr.0000182534.19832.83. PMID: 16224307.
Quan H, Li B, Couris CM, Fushimi K, Graham P, Hider P, Januel JM, Sundararajan V. Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries. Am J Epidemiol. 2011 Mar 15;173(6):676-82. https://doi.org/10.1093/aje/kwq433. Epub 2011 Feb 17. PMID: 21330339.
Glasheen WP, Cordier T, Gumpina R, Haugh G, Davis J, Renda A. Charlson Comorbidity Index: ICD-9 Update and ICD-10 Translation. Am Health Drug Benefits. 2019 Jun-Jul;12(4):188-197. PMID: 31428236; PMCID: PMC6684052.
Elixhauser Comorbidities:
Agency for Healthcare Research and Quality (AHRQ). Elixhauser Comorbidity Software Refined for ICD-10-CM Diagnoses, v2025.1 [Internet]. 2025. Available from: https://www.hcup-us.ahrq.gov/toolssoftware/comorbidityicd10/comorbidity_icd10.jsp
See Also
- vignette(topic = "comorbidities", package = "medicalcoder")
- vignette(topic = "pccc", package = "medicalcoder")
- vignette(topic = "charlson", package = "medicalcoder")
- vignette(topic = "elixhauser", package = "medicalcoder")
Examples
pccc_v3.1_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "pccc_v3.1",
flag.method = 'current',
poa = 1)
summary(pccc_v3.1_results)
pccc_v3.1_subcondition_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "pccc_v3.1",
flag.method = 'current',
poa = 1,
subconditions = TRUE)
summary(pccc_v3.1_subcondition_results)
charlson_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "charlson_quan2011",
flag.method = 'current',
poa = 1)
summary(charlson_results)
elixhauser_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "elixhauser_ahrq2025",
primarydx = 1,
flag.method = 'current',
poa = 1)
summary(elixhauser_results)
Get Charlson Codes
Description
Retrieve a copy of internal lookup tables for the ICD codes used in assessing Charlson comorbidities.
Usage
get_charlson_codes()
Value
A data.frame with the following columns:
- icdv: Integer vector indicating if the code is from ICD-9 or ICD-10
- dx: Integer vector. 1 if the code is diagnostic (ICD-9-CM, ICD-10-CM, WHO, CDC Mortality), or 0 if the code is procedural (ICD-9-PCS, ICD-10-PCS)
- full_code: Character vector with the ICD code and any relevant decimal point
- code: Character vector with the compact ICD code omitting any relevant decimal point
- condition: Character vector of the conditions
- charlson_<variant>: Integer vector indicating if the code is part of the <variant> of the Charlson comorbidities.
See Also
- get_charlson_index_scores() for a lookup table of the index scores by comorbidity.
- get_icd_codes() for the lookup table of all ICD codes.
- get_pccc_codes() for the lookup table of ICD codes used for the PCCC.
- get_elixhauser_codes() for the lookup table of ICD codes used for the Elixhauser comorbidities.
- comorbidities() for applying comorbidity algorithms to a data set.
Examples
head(get_charlson_codes())
str(get_charlson_codes())
Get Charlson Index Scores
Description
Retrieve a copy of internal lookup tables of index scores used in assessing Charlson comorbidities.
Usage
get_charlson_index_scores()
Value
A data.frame with the following columns:
- condition: Character vector of the conditions
- index: Character vector indicating if the score is for the mortality or the readmission index score
- charlson_<variant>: the index scores for the variant
See Also
- get_charlson_codes() for a lookup table of the ICD codes mapping to the Charlson comorbidities.
- comorbidities() for applying comorbidity algorithms to a data set.
Examples
head(get_charlson_index_scores())
str(get_charlson_index_scores())
Get Elixhauser Codes
Description
Retrieve a copy of internal lookup tables for the ICD codes used in assessing Elixhauser comorbidities.
Usage
get_elixhauser_codes()
Value
A data.frame with the following columns:
- icdv: Integer vector indicating if the code is from ICD-9 or ICD-10
- dx: Integer vector. 1 if the code is diagnostic (ICD-9-CM, ICD-10-CM, WHO, CDC Mortality), or 0 if the code is procedural (ICD-9-PCS, ICD-10-PCS)
- full_code: Character vector with the ICD code and any relevant decimal point
- code: Character vector with the compact ICD code omitting any relevant decimal point
- condition: Character vector of the conditions
- elixhauser_<variant>: Integer vector indicating if the code is part of the <variant> of the Elixhauser comorbidities.
See Also
- get_elixhauser_index_scores() for the lookup table of the condition-by-condition scores for mortality and readmission indices.
- get_elixhauser_poa() for the lookup table of the conditions which do and do not require associated ICD codes to be present-on-admission to flag the comorbidity.
- get_icd_codes() for the lookup table of all ICD codes.
- get_pccc_codes() for the lookup table of ICD codes used for the PCCC.
- get_charlson_codes() for the lookup table of ICD codes used for the Charlson comorbidities.
- comorbidities() for applying comorbidity algorithms to a data set.
Examples
head(get_elixhauser_codes())
str(get_elixhauser_codes())
Get Elixhauser Index Scores
Description
Functions to get a copy of internal lookup tables for the ICD codes and index scores used in assessing Elixhauser comorbidities.
Usage
get_elixhauser_index_scores()
Value
A data.frame with the following columns:
- condition: Character vector of the conditions
- index: Character vector indicating if the score is for the mortality or the readmission index score
- elixhauser_<variant>: Integer vector of the scores
See Also
- get_elixhauser_codes() for the lookup table of ICD codes mapping to the Elixhauser comorbidities.
- get_elixhauser_poa() for the lookup table of the conditions which do and do not require associated ICD codes to be present-on-admission to flag the comorbidity.
- comorbidities() for applying comorbidity algorithms to a data set.
Examples
head(get_elixhauser_index_scores())
str(get_elixhauser_index_scores())
Get Elixhauser Present-on-Admission Requirements
Description
Retrieve a copy of internal lookup table with details on which Elixhauser comorbidities do and do not require the associated ICD codes to be present-on-admission to be flagged.
Usage
get_elixhauser_poa()
Value
A data.frame with the following columns:
- condition: Character vector of the conditions
- desc: Character vector with a verbose description of the condition
- poa_required: Integer indicator of whether the code needs to be present on admission to be considered a comorbidity
- elixhauser_<variant>: indicators for the Elixhauser <variant>
See Also
- get_elixhauser_index_scores() for the lookup table of the condition-by-condition scores for mortality and readmission indices.
- get_elixhauser_codes() for the lookup table of ICD codes mapping to the Elixhauser comorbidities.
- comorbidities() for applying comorbidity algorithms to a data set.
Examples
head(get_elixhauser_poa())
str(get_elixhauser_poa())
Get ICD Codes
Description
Retrieve a copy of the internal lookup table for all known ICD codes.
Usage
get_icd_codes(with.descriptions = FALSE, with.hierarchy = FALSE)
Arguments
with.descriptions |
Logical scalar, if |
with.hierarchy |
Logical scalar, if |
Details
Sources
There are three sources of ICD codes.
- cms: Codes from the ICD-9-CM, ICD-9-PCS, ICD-10-CM, and ICD-10-PCS standards.
- who: Codes from the World Health Organization.
- cdc: Codes from the CDC Mortality coding standard.
Fiscal and Calendar Years
When reporting years there is a mix of fiscal and calendar years.
Fiscal years are the United States Federal Government fiscal years, running from October 1 to September 30. For example, fiscal year 2013 started October 1 2012 and ended on September 30 2013.
Calendar years run January 1 to December 31.
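The fiscal-year rule stated above can be written as a small base R helper. The function name `fiscal_year()` is hypothetical and is not part of medicalcoder; it only encodes the October 1 to September 30 definition given in the text.

```r
# U.S. federal fiscal year for a date: October through December belong to
# the next calendar year's fiscal year.
fiscal_year <- function(date) {
  date  <- as.Date(date)
  year  <- as.integer(format(date, "%Y"))
  month <- as.integer(format(date, "%m"))
  year + (month >= 10L)
}

# Fiscal year 2013 started 2012-10-01 and ended 2013-09-30
fiscal_year(c("2012-09-30", "2012-10-01", "2013-09-30"))
# → 2012 2013 2013
```

A helper like this is useful when an encounter date has to be matched against the fiscal-year columns of cms codes.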
Within the ICD data there are columns known_start, known_end,
assignable_start, assignable_end, desc_start, and desc_end. For ICD codes
with src == "cms", these are fiscal years. For codes with src == "cdc" or
src == "who", these are calendar years.
known_start is the first fiscal or calendar year (depending on source) that
the medicalcoder package has definitive source data for. ICD-9-CM started in
the United States in fiscal year 1980. Source information that could be
downloaded from the CDC and CMS and added to the source code for the
medicalcoder package goes back to 1997. As such, 1997 is the "known start".
known_end is the last fiscal or calendar year (depending on source)
for which we have definitive source data. For ICD-9-CM and ICD-9-PCS
that is 2015. For ICD-10-CM and ICD-10-PCS, which are active, it is just the
last year of known data. ICD-10 from the WHO ends in 2019.
Header and Assignable Codes
"Assignable" indicates that the code is the most granular for the source.
Ideally codes are reported with the greatest level of detail but that is not
always the case. Also, the greatest level of detail can differ between
sources.
Example: C86 is a header code for cms and who because codes C86.0, C86.1,
C86.2, C86.3, C86.4, C86.5, and C86.6 all exist in both standards. No code
with a fifth digit exists in the who source, so all these four-digit codes are
'assignable.' In the cms standard, C86.0 was assignable through fiscal
year 2024. In fiscal year 2025 codes C86.00 and C86.01 were added, making
C86.0 a header code and C86.00 and C86.01 assignable codes.
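The header/assignable distinction can be approximated with a prefix test on compact codes. The sketch below is a simplification for illustration, using only the C86 codes named in the example; `find_headers()` is a hypothetical helper, and the package relies on its curated lookup tables rather than this rule.

```r
# A code is treated as a header when another code in the same set extends it.
find_headers <- function(codes) {
  vapply(
    codes,
    function(cd) any(startsWith(codes, cd) & codes != cd),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Toy subsets in compact form (decimal point omitted)
cms_fy2025 <- c("C86", "C860", "C8600", "C8601")
who        <- c("C86", "C860", "C861")

data.frame(code = cms_fy2025, header = find_headers(cms_fy2025))
data.frame(code = who,        header = find_headers(who))
```

C860 is a header in the toy cms set because C8600 and C8601 extend it, while in the toy who set it has no longer descendants and is assignable.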
Value
A data.frame.
The default return has the following columns:
- icdv: Integer vector indicating if the code is from ICD-9 or ICD-10
- dx: Integer vector. 1 if the code is diagnostic (ICD-9-CM, ICD-10-CM, WHO, CDC Mortality), or 0 if the code is procedural (ICD-9-PCS, ICD-10-PCS)
- full_code: Character vector with the ICD code and any relevant decimal point
- code: Character vector with the compact ICD code omitting any relevant decimal point
- src: Character vector reporting the source of the information. See Details.
- known_start: Integer vector reporting the first known year of use. See Details.
- known_end: Integer vector reporting the last known year of use. See Details.
- assignable_start: Integer vector reporting the first known year the code was assignable. See Details.
- assignable_end: Integer vector reporting the last known year the code was assignable. See Details.
When with.descriptions = TRUE there are the following additional columns:
- desc: Character vector of descriptions. For cms codes, descriptions from CMS are used preferentially over CDC.
- desc_start: Integer vector of the first year the description was used.
- desc_end: Integer vector of the last year the description was used.
When with.hierarchy = TRUE there are the following additional columns:
- chapter
- subchapter
- category
- subcategory
- subclassification
- subsubclassification
- extension
See Also
is_icd(), lookup_icd_codes(), vignette(topic = "icd", package = "medicalcoder")
Examples
icd_codes <- get_icd_codes()
str(icd_codes)
# Explore the change in the assignable year for C86 code between CMS and
# WHO
subset(get_icd_codes(), grepl("^C86$", full_code))
subset(get_icd_codes(), grepl("^C86\\.\\d$", full_code))
subset(get_icd_codes(), grepl("^C86\\.0(\\d|$)", full_code))
is_icd("C86", headerok = FALSE) # FALSE
is_icd("C86", headerok = TRUE) # TRUE
is_icd("C86", headerok = TRUE, src = "cdc") # Not a CDC mortality code
lookup_icd_codes("^C86\\.0\\d*", regex = TRUE)
Pediatric Complex Chronic Conditions ICD Codes
Description
Retrieve a copy of internal lookup tables for the ICD codes mapping to the Pediatric Complex Chronic Conditions (PCCC) conditions and subconditions by variant.
Usage
get_pccc_codes()
Value
A data.frame with the following columns:
- icdv: Integer vector indicating if the code is from ICD-9 or ICD-10.
- dx: Integer vector. 1 if the code is diagnostic (ICD-9-CM, ICD-10-CM, WHO, CDC Mortality), or 0 if the code is procedural (ICD-9-PCS, ICD-10-PCS).
- full_code: Character vector with the ICD code and any relevant decimal point.
- code: Character vector with the compact ICD code omitting any relevant decimal point.
- condition: Character vector of the conditions.
- subcondition: Character vector of the subconditions.
- transplant_flag: Integer vector indicating if the code is associated with a transplant.
- tech_dep_flag: Integer vector indicating if the code is associated with technology dependence.
- pccc_<variant>: Integer vector indicating if the code is part of the v2.0, v2.1, v3.0, or v3.1 variant.
See Also
- get_pccc_conditions() for a reference of the PCCC conditions and subconditions.
- get_icd_codes() for the lookup table of all ICD codes.
- comorbidities() for applying comorbidity algorithms to a data set.
Examples
head(get_pccc_codes())
str(get_pccc_codes())
Pediatric Complex Chronic Condition and Subconditions
Description
Retrieve a copy of internal lookup tables for the syntax valid and human readable labels of the Pediatric Complex Chronic Conditions (PCCC) conditions and subconditions.
Usage
get_pccc_conditions()
Value
A data.frame with the following columns:
- condition: (character) syntax valid name for the condition
- subconditions: (character) syntax valid name for the subcondition
- conditions_label: (character) human readable label for the condition
- subconditions_label: (character) human readable label for the subcondition
See Also
- get_pccc_codes() for the lookup table of ICD codes used for the PCCC.
- comorbidities() for applying comorbidity algorithms to a data set.
Examples
get_pccc_conditions()
Convert ICD Compact Codes to Full Codes
Description
Take an assumed ICD compact code string and convert to a full code based on the ICD version (9 or 10) and type (diagnostic or procedure). This method only formats strings and does not validate the code(s).
Usage
icd_compact_to_full(x, icdv, dx)
Arguments
x |
Character vector |
icdv |
Integer vector of allowed ICD versions. Use |
dx |
Integer vector indicating allowed code type(s): |
Value
A character vector the same length as x.
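The string formatting involved can be sketched in base R from the standard decimal placement for each code type: after the third character for ICD-9-CM diagnoses (after the fourth for E codes), after the second for ICD-9 procedures, after the third for ICD-10-CM, and no decimal point for ICD-10-PCS. The helper `sketch_compact_to_full()` is hypothetical, takes scalar icdv and dx for simplicity, and is not the package's implementation of icd_compact_to_full().

```r
# Illustrative only: insert the decimal point by ICD version and code type.
sketch_compact_to_full <- function(x, icdv, dx) {
  # Position after which the decimal point goes; NA means no decimal point
  pos <- if (icdv == 9L && dx == 1L) {
    ifelse(startsWith(x, "E"), 4L, 3L)   # ICD-9-CM, E codes are Exxx.x
  } else if (icdv == 9L && dx == 0L) {
    rep(2L, length(x))                   # ICD-9 procedures: xx.xx
  } else if (icdv == 10L && dx == 1L) {
    rep(3L, length(x))                   # ICD-10-CM: xxx.xxxx
  } else {
    rep(NA_integer_, length(x))          # ICD-10-PCS: no decimal point
  }
  # Only codes longer than the prefix receive a decimal point
  add_dot <- !is.na(pos) & nchar(x) > pos
  out <- x
  out[add_dot] <- paste0(
    substring(x[add_dot], 1L, pos[add_dot]), ".",
    substring(x[add_dot], pos[add_dot] + 1L)
  )
  out
}

sketch_compact_to_full(c("7993", "E0100", "055"), icdv = 9L, dx = 1L)
sketch_compact_to_full("7993", icdv = 9L, dx = 0L)
sketch_compact_to_full("C8600", icdv = 10L, dx = 1L)
```

As with the documented function, this only formats strings; it makes no claim that the result is a known ICD code.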
See Also
- get_icd_codes() to retrieve the internal lookup table of ICD codes.
- lookup_icd_codes() for retrieving details on a specific set of ICD codes.
- is_icd() to test if a string is a known ICD code.
Other ICD tools: is_icd(), lookup_icd_codes()
Is ICD
Description
Answer the question "is the character string x a valid ICD code?"
ICD codes should be character vectors. is_icd will assess for both
"full codes" (decimal point present when appropriate) and "compact codes"
(decimal point omitted).
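A short base R illustration of why compact codes can be ambiguous: dropping the decimal point maps distinct full codes onto the same string. The example uses the ICD-9 diagnostic code 799.3 and the ICD-9 procedure code 79.93 discussed in the Examples; it does not call any medicalcoder function.

```r
# Two different full codes: one diagnostic, one procedural
full <- c(dx = "799.3", pr = "79.93")

# Compact form: remove the decimal point
compact <- gsub(".", "", full, fixed = TRUE)
compact

# Both collapse to the same compact string
length(unique(compact)) == 1L
```

This is why a compact string such as "7993" may need the icdv and dx arguments to be interpreted unambiguously.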
ICD-10 code "C00" is a header code because the four-character codes C00.0, C00.1, C00.2, C00.3, C00.4, C00.5, C00.6, C00.7, C00.8, and C00.9 exist. Those four-character codes are assignable (as of 2025) because no five-character descendants (e.g., C00.40) exist.
When the source is the World Health Organization (WHO) or CDC Mortality, years refer to calendar years. CDC/CMS sources use the U.S. federal fiscal year, which starts on October 1 (e.g., fiscal year 2024 runs 2023-10-01 to 2024-09-30).
Usage
is_icd(
x,
icdv = c(9L, 10L),
dx = c(1L, 0L),
src = c("cms", "who", "cdc"),
year,
headerok = FALSE,
ever.assignable = missing(year),
warn.ambiguous = TRUE,
full.codes = TRUE,
compact.codes = TRUE
)
Arguments
x |
Character vector of ICD codes (full or compact form). |
icdv |
Integer vector of allowed ICD versions. Use |
dx |
Integer vector indicating allowed code type(s): |
src |
Character vector of code sources. One or more of |
year |
Numeric scalar. Calendar or fiscal year to reference. Default is the most current year available per source. For ICD-9, the latest year is 2015; ICD-10 sources are updated annually. Calendar year for WHO and CDC mortality. Fiscal year for CMS. |
headerok |
Logical scalar. If |
ever.assignable |
Logical scalar. If |
warn.ambiguous |
Logical scalar. If |
full.codes |
Logical scalar. If |
compact.codes |
Logical scalar. If |
Details
Similarly for ICD-9-CM: "055" is a header for measles; 055.0, 055.1, 055.2, 055.8, and 055.9 are assignable. Codes 055.3–055.6 do not exist. Code 055.7 is a header because 055.71 and 055.72 exist.
Some codes change status across years. For example, ICD-9-CM 516.3 was assignable in fiscal years 2006–2011, then became a header in 2012–2015.
Value
A logical vector the same length as x.
See Also
- get_icd_codes() to retrieve the internal lookup table of ICD codes.
- lookup_icd_codes() for retrieving details on a specific set of ICD codes.
- icd_compact_to_full() converts a string from a compact format to the full format based on ICD version and type (diagnostic or procedure).
Other ICD tools: icd_compact_to_full(), lookup_icd_codes()
Examples
################################################################################
# Some ICD-9 diagnostic codes
x <- c("136.2", "718.60", "642.02")
is_icd(x, icdv = 9, dx = 1)
is_icd(x, icdv = 9, dx = 0)
is_icd(x, icdv = 10, dx = 1)
is_icd(x, icdv = 10, dx = 0)
is_icd(x, icdv = 9, dx = 1, headerok = TRUE)
is_icd(x, icdv = 9, dx = 1, year = 2006)
################################################################################
# ICD code with, or without a dot. The ICD-9 diagnostic code 799.3 and ICD-9
# procedure code 79.93 both become 7993 when assessed against the ICD code look
# up tables. As such "7993" is a valid ICD-9 diagnostic and procedure code,
# whereas 799.3 is only a valid dx code, and 79.93 is only a valid pr code.
# Further, codes such as ".7993", "7.993", "7993.", are all non-valid codes.
x <- c("7993", ".7993", "7.993", "79.93", "799.3", "7993.")
data.frame(
x,
dx = is_icd(x, icdv = 9, dx = 1),
pr = is_icd(x, icdv = 9, dx = 0)
)
################################################################################
# Example of an ICD-9 code that was assignable, but became a header when
# more descriptive codes were introduced: ICD-9 diagnostic code 516.3
lookup_icd_codes(paste0("516.3", c("", as.character(0:9))))
# ICD-9 code 516.3 was an assignable code through fiscal year 2011.
is_icd("516.3")
# If `year` is omitted, and `ever.assignable = FALSE` then the `year` is
# implied to be the max `known_end` year for ICD codes matched by `icdv`,
# `dx`, and `src`.
is_icd("516.3", ever.assignable = FALSE)
# when `year` is provided then `ever.assignable` is `FALSE` by default and
# the return is TRUE when 516.3 was assignable and FALSE otherwise.
is_icd("516.3", year = 2015)
is_icd("516.3", year = 2011)
# when year is a non-assignable year, but `ever.assignable = TRUE` the return
# will be TRUE. Useful if you know the data is retrospective and collected
# through fiscal year 2015.
is_icd("516.3", year = 2015, ever.assignable = TRUE)
################################################################################
# Consider the string E010
# - This could be an ICD-9-CM full code
# - Could be an ICD-10-CM compact code
lookup_icd_codes("E010")
subset(get_icd_codes(with.descriptions = TRUE), grepl("^E010$", code))
is_icd("E010")
is_icd("E010", icdv = 9) # FALSE because it is a header code and was never assignable
is_icd("E010", icdv = 9, ever.assignable = TRUE) # FALSE
is_icd("E010", icdv = 9, headerok = TRUE) # TRUE
Lookup ICD Codes
Description
Functions for working with ICD codes.
ICD-10 code "C00" is a header code because the four-character codes C00.0, C00.1, C00.2, C00.3, C00.4, C00.5, C00.6, C00.7, C00.8, and C00.9 exist. Those four-character codes are assignable (as of 2025) because no five-character descendants (e.g., C00.40) exist.
When the source is the World Health Organization (WHO) or CDC Mortality, years refer to calendar years. CDC/CMS sources use the U.S. federal fiscal year, which starts on October 1 (e.g., fiscal year 2024 runs 2023-10-01 to 2024-09-30).
Usage
lookup_icd_codes(
x,
regex = FALSE,
full.codes = TRUE,
compact.codes = TRUE,
...
)
Arguments
x |
Character vector of ICD codes (full or compact form). |
regex |
Logical scalar. If |
full.codes |
Logical scalar. If |
compact.codes |
Logical scalar. If |
... |
Passed to |
Details
ICD codes should be character vectors. These tools work with either "full codes" (decimal point present when appropriate) or "compact codes" (decimal point omitted).
Similarly for ICD-9-CM: "055" is a header for measles; 055.0, 055.1, 055.2, 055.8, and 055.9 are assignable. Codes 055.3–055.6 do not exist. Code 055.7 is a header because 055.71 and 055.72 exist.
Some codes change status across years. For example, ICD-9-CM 516.3 was assignable in fiscal years 2006–2011, then became a header in 2012–2015.
Value
A data.frame with one or more rows per input, including columns
- match_type: did the input match a full or compact code
- icdv: ICD version (9 or 10)
- dx: diagnostic code (1) or procedure code (0)
- full_code: the full code string
- code: the compact code string
- src: the source (CMS, CDC, or WHO)
- year ranges (known_*, assignable_*).
See Also
- get_icd_codes() to retrieve the internal lookup table of ICD codes.
- is_icd() to test if a string is a known ICD code.
- icd_compact_to_full() converts a string from a compact format to the full format based on ICD version and type (diagnostic or procedure).
Other ICD tools: icd_compact_to_full(), is_icd()
Synthetic Data
Description
Synthetic Data
Usage
mdcr
Format
mdcr is a data.frame with 4 columns, stored in long format: each row records
one ICD code for one patient id.
- patid: patient identifier, integer values
- icdv: ICD version; integer values, 9 or 10
- dx: indicator column for ICD diagnostic (1) or procedure (0) codes
- code: ICD code; character values
See Also
Other datasets: mdcr_longitudinal
Synthetic Longitudinal Data
Description
Synthetic Longitudinal Data
Usage
mdcr_longitudinal
Format
mdcr_longitudinal is a data.frame with four columns. The codes are
expected to be treated as diagnostic codes but there are a few ICD-9 codes
which could match to procedure codes as well.
- patid: patient identifier, integer values
- date: date the diagnostic code was recorded
- icdv: ICD version 9 or 10, integer valued
- code: ICD codes; character values
See Also
Other datasets: mdcr
Summaries of Comorbidities
Description
Build summaries (counts and percentages) for each comorbidity and other summary statistics by method.
Usage
## S3 method for class 'medicalcoder_comorbidities'
summary(object, ...)
Arguments
object |
a |
... |
additional parameters, not currently used |
Value
Either a list or a data.frame.
Examples
pccc_v3.1_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "pccc_v3.1",
flag.method = 'current',
poa = 1)
summary(pccc_v3.1_results)
charlson_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "charlson_quan2011",
flag.method = 'current',
poa = 1)
summary(charlson_results)
elixhauser_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "elixhauser_ahrq2025",
primarydx = 1,
flag.method = 'current',
poa = 1)
summary(elixhauser_results)
Summaries of Comorbidities with Subconditions
Description
Build summaries (counts and percentages) for each Pediatric Complex Chronic Condition (PCCC) condition and subcondition.
Usage
## S3 method for class 'medicalcoder_comorbidities_with_subconditions'
summary(object, ...)
Arguments
object |
a |
... |
additional parameters, not currently used |
Value
A data.frame with five columns.
- condition: the primary condition
- subcondition: the subcondition(s) within the condition. There will be a row where subcondition is NA, which is used to report the count and percent_of_cohort for the condition overall.
- count: the number of rows in object with the applicable condition and subcondition.
- percent_of_cohort: a numeric value within [0, 100] for the percent of rows in object with the flagged condition and subcondition.
- percent_of_those_with_condition: a numeric value within [0, 100] for the percent of rows with the flagged subcondition among the subset of rows in object with the primary condition. Will be NA for the primary condition.
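The two percentage columns differ only in their denominator, which a toy calculation in base R makes explicit. The column names `condition_x` and `subcondition_y` are hypothetical, and this sketch is not the summary method's code.

```r
# Toy cohort of four rows with 0/1 flags for one hypothetical condition and
# one hypothetical subcondition nested within it
cohort <- data.frame(
  condition_x    = c(1L, 1L, 0L, 1L),
  subcondition_y = c(1L, 0L, 0L, 1L)
)

n_total        <- nrow(cohort)
n_condition    <- sum(cohort$condition_x)
n_subcondition <- sum(cohort$subcondition_y)

# Denominator: every row in the cohort
percent_of_cohort <- 100 * n_subcondition / n_total

# Denominator: only the rows flagged for the primary condition
percent_of_those_with_condition <- 100 * n_subcondition / n_condition

c(percent_of_cohort, percent_of_those_with_condition)
```

Here two of four rows have the subcondition (50 percent of the cohort), which is two of the three rows that have the primary condition.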
See Also
comorbidities(), vignette(topic = "pccc", package = "medicalcoder")
Examples
pccc_v3.1_subcondition_results <-
comorbidities(data = mdcr,
icd.codes = "code",
id.vars = "patid",
dx.var = "dx",
method = "pccc_v3.1",
flag.method = 'current',
poa = 1,
subconditions = TRUE)
summary(pccc_v3.1_subcondition_results)