puremoe ships three MeSH reference tables: a thesaurus
of descriptors and entry terms, a tree of hierarchical classifications,
and a frequency table of descriptor occurrence across PubMed.
data_mesh_thesaurus() downloads and combines the MeSH
Descriptor Thesaurus and Supplementary Concept Records (SCR). The result has
one row per term, including synonyms and entry terms for each descriptor.
data_mesh_trees() provides the hierarchical
classification structure. Each descriptor can appear in multiple
branches; tree_location encodes the full path (e.g.,
I01.880.604 = Social Sciences > Political Science >
Political Systems).
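
Because tree_location is a dot-delimited path, the ancestors of any node can be recovered by truncating the string one level at a time. The following is a minimal base R sketch of that idea; tree_ancestors() is a hypothetical helper written for illustration and is not part of puremoe.

```r
# Hypothetical helper (not part of puremoe): expand a dot-delimited
# tree location into itself plus every ancestor path above it.
tree_ancestors <- function(tree_location) {
  # Split on literal dots, e.g. "I01.880.604" -> "I01", "880", "604"
  parts <- strsplit(tree_location, ".", fixed = TRUE)[[1]]
  # Rebuild the path one level at a time, from the root downwards
  vapply(
    seq_along(parts),
    function(i) paste(parts[seq_len(i)], collapse = "."),
    character(1)
  )
}

tree_ancestors("I01.880.604")
# returns c("I01", "I01.880", "I01.880.604")
```

Matching these prefixes against the tree_location column would then give the descriptor at each level of the branch.
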
data_mesh_frequencies is a bundled dataset giving the
frequency of each MeSH descriptor across the full PubMed corpus (39.7 M
PMIDs, April 2026). Proportions use the total corpus as denominator,
making them suitable as a baseline for enrichment analyses against
arbitrary PubMed subsets.
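
As a rough illustration of such an enrichment comparison, the sketch below divides each descriptor's share in a subset by its share in the full corpus. All numbers are invented, and the column names (descriptor, corpus_prop, n) are assumptions made for this example; they may not match the actual columns of data_mesh_frequencies.

```r
# Toy baseline standing in for data_mesh_frequencies.
# Values and column names are invented for illustration only.
baseline <- data.frame(
  descriptor  = c("Humans", "Political Systems", "Democracy"),
  corpus_prop = c(0.50, 0.001, 0.002)  # share of all PMIDs with the descriptor
)

# Toy descriptor counts for a hypothetical subset of 1,000 PMIDs.
subset_counts <- data.frame(
  descriptor = c("Humans", "Political Systems", "Democracy"),
  n          = c(600, 20, 10)          # subset PMIDs with the descriptor
)
subset_size <- 1000

# Join subset counts to the baseline and compute the enrichment ratio.
enrich <- merge(subset_counts, baseline, by = "descriptor")
enrich$subset_prop <- enrich$n / subset_size
enrich$ratio <- enrich$subset_prop / enrich$corpus_prop

# Most enriched descriptors first.
enrich[order(-enrich$ratio), c("descriptor", "subset_prop", "corpus_prop", "ratio")]
```

A ratio well above 1 marks a descriptor that is over-represented in the subset relative to PubMed as a whole. A ratio near 1, as for a near-universal descriptor such as Humans in this toy data, indicates no particular enrichment.
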
The thesaurus and tree datasets are ~10 MB and are fetched from GitHub on
each call by default. To avoid re-downloading every session, set
use_persistent_storage = TRUE; the files are then cached to a
system data directory and reused on subsequent calls.
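
A call with caching enabled might look like the sketch below. It assumes that both download functions accept use_persistent_storage as a named argument, as the text above suggests, and it needs the puremoe package plus network access on the first run.

```r
library(puremoe)

# First call downloads and caches the files; later calls reuse the cache.
# Assumes both functions accept use_persistent_storage.
thesaurus <- data_mesh_thesaurus(use_persistent_storage = TRUE)
trees     <- data_mesh_trees(use_persistent_storage = TRUE)

# The frequency table is bundled with the package, so no download is needed.
head(data_mesh_frequencies)
```
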