This vignette demonstrates how to use table and plotting functions provided by MeasurementDiagnostics to visualise results.
We use the package's mock data so the examples are fully reproducible.
We call summariseMeasurementUse() once, supplying custom histogram bins for all numeric variables. The call returns a summarised_result containing all the diagnostic checks: summary estimates, plus density and histogram estimates for visualising the distributions of numeric variables. Everything is computed both overall for the measurement codelist and stratified by sex.
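Each entry of the `histogram` argument in the call that follows is a named list of `c(lower, upper)` ranges, one per bin. As a rough illustration of what such a specification means, the base R sketch below assigns made-up values to bin labels. It assumes both bounds are inclusive, which is a reading of the labels rather than documented behaviour, and `assign_bin()` is a hypothetical helper, not a MeasurementDiagnostics function.

```r
# Named bins in the same shape as one entry of the `histogram` argument:
# each element is c(lower, upper).
bins <- list(
  "0-30"   = c(0, 30),
  "31-90"  = c(31, 90),
  "91-365" = c(91, 365),
  "366+"   = c(366, Inf)
)

# Hypothetical helper (not part of MeasurementDiagnostics): return the label
# of the first bin whose closed interval [lower, upper] contains each value,
# or NA when no bin matches. Inclusive bounds are an assumption.
assign_bin <- function(x, bins) {
  vapply(x, function(value) {
    hit <- vapply(bins, function(b) value >= b[1] && value <= b[2], logical(1))
    if (any(hit)) names(bins)[which(hit)[1]] else NA_character_
  }, character(1))
}

days <- c(8, 30, 31, 249, 2886)  # made-up day counts
labels <- unname(assign_bin(days, bins))

# Count how many values fall in each bin, keeping the bin order.
bin_counts <- table(factor(labels, levels = names(bins)))
bin_counts
```

Under this inclusive-bounds assumption, a value such as 30.5 falls between two bins and receives no label. That is harmless for whole-day counts but could matter for continuous variables such as value_as_number.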
result <- summariseMeasurementUse(
  cdm = cdm,
  codes = alkaline_phosphatase_codes,
  bySex = TRUE,
  byYear = FALSE,
  byConcept = FALSE,
  histogram = list(
    days_between_measurements = list(
      "0-30" = c(0, 30), "31-90" = c(31, 90), "91-365" = c(91, 365), "366+" = c(366, Inf)
    ),
    measurements_per_subject = list(
      "0" = c(0, 0), "1" = c(1, 1), "2-3" = c(2, 3), "4+" = c(4, 1000)
    ),
    value_as_number = list(
      "low" = c(0, 5.999), "mid" = c(6, 10.999), "high" = c(11, Inf)
    )
  )
)

There is one table function corresponding to each diagnostic check:
- tableMeasurementSummary(): subjects with measurements, counts per subject, days between measurements.
- tableMeasurementValueAsNumber(): numeric value summaries (by unit where available).
- tableMeasurementValueAsConcept(): frequency of concept values.
You can customise which columns appear in the header, which are used as grouping columns, and which to hide.
# 1. Measurement summary table (timings / counts)
tableMeasurementSummary(
  result,
  header = c("codelist_name", "sex"),
  hide = c("cdm_name", "domain_id")
)

Codelist name: alkaline_phosphatase

| Variable name | Variable level | Estimate name | Sex: overall | Sex: Female | Sex: Male |
|---|---|---|---|---|---|
| Number subjects | – | N (%) | 67 (67.00%) | 40 (40.00%) | 27 (27.00%) |
| Days between measurements | – | Median [Q25 – Q75] | 249 [67 – 645] | 240 [53 – 1,133] | 267 [81 – 415] |
| | | Range | 8 to 2,886 | 8 to 2,886 | 8 to 2,743 |
| Measurements per subject | – | Median [Q25 – Q75] | 1.00 [1.00 – 2.00] | 1.00 [1.00 – 2.00] | 1.00 [1.00 – 2.00] |
| | | Range | 1.00 to 4.00 | 1.00 to 4.00 | 1.00 to 3.00 |
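To make the quantities in this table concrete, the base R sketch below derives measurements per subject and days between measurements from a few made-up records. It assumes that days between measurements are the gaps between consecutive measurement dates within a subject. The package's internal computation is not shown in this vignette, so treat this as an illustration only.

```r
# Made-up measurement records: one row per measurement.
records <- data.frame(
  subject_id = c(1, 1, 1, 2, 3, 3),
  measurement_date = as.Date(c(
    "2020-01-01", "2020-01-31", "2020-06-29",
    "2021-03-15",
    "2019-05-01", "2019-05-09"
  ))
)

# Measurements per subject: number of records for each subject.
per_subject <- as.vector(table(records$subject_id))

# Days between measurements (assumed definition): gaps between consecutive
# dates within a subject. Subjects with a single record contribute no gap.
gaps <- unname(unlist(lapply(
  split(records$measurement_date, records$subject_id),
  function(d) as.numeric(diff(sort(d)))
)))

# The kinds of estimates reported in the table: median, quartiles, range.
median(per_subject)
quantile(gaps, c(0.25, 0.5, 0.75))
range(gaps)
```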
Numeric value summary, as produced by tableMeasurementValueAsNumber():

| CDM name | Unit concept name | Unit concept ID | Variable name | Estimate name | Sex: overall | Sex: Female | Sex: Male |
|---|---|---|---|---|---|---|---|
| alkaline_phosphatase | | | | | | | |
| mock database | kilogram | 9529 | Measurement records | N | 50 | 33 | 17 |
| | | | Value as number | Median [Q25 – Q75] | 8.77 [7.07 – 10.48] | 8.12 [6.60 – 10.22] | 9.13 [8.26 – 11.17] |
| | | | | Q05 – Q95 | 5.70 – 11.84 | 5.58 – 11.60 | 6.64 – 11.83 |
| | | | | Q01 – Q99 | 5.43 – 12.11 | 5.41 – 11.99 | 6.08 – 12.11 |
| | | | | Range | 5.36 to 12.18 | 5.36 to 12.04 | 5.94 to 12.18 |
| | | | | Missing value, N (%) | 2 (4.00%) | 2 (6.06%) | 0 (0.00%) |
| | NA | - | Measurement records | N | 50 | 27 | 23 |
| | | | Value as number | Median [Q25 – Q75] | 8.77 [7.10 – 10.44] | 8.55 [6.85 – 10.01] | 8.92 [7.39 – 10.88] |
| | | | | Q05 – Q95 | 5.77 – 11.77 | 5.75 – 11.32 | 6.18 – 11.80 |
| | | | | Q01 – Q99 | 5.50 – 12.04 | 5.61 – 11.94 | 5.59 – 11.93 |
| | | | | Range | 5.44 to 12.11 | 5.58 to 12.11 | 5.44 to 11.96 |
| | | | | Missing value, N (%) | 3 (6.00%) | 3 (11.11%) | 0 (0.00%) |
# 3. Concept-value summary table (values recorded as concepts)
tableMeasurementValueAsConcept(result)

| CDM name | Variable name | Value as concept name | Value as concept ID | Estimate name | Sex: overall | Sex: Female | Sex: Male |
|---|---|---|---|---|---|---|---|
| alkaline_phosphatase | | | | | | | |
| mock database | Measurement records | Low | 4267416 | N (%) | 34 (34.00%) | 16 (26.67%) | 18 (45.00%) |
| | | High | 4328749 | N (%) | 33 (33.00%) | 26 (43.33%) | 7 (17.50%) |
| | | NA | NA | N (%) | 33 (33.00%) | 18 (30.00%) | 15 (37.50%) |

The plotting helpers each produce a fixed set of plot types, while leaving you free to choose which variables are used for colouring and faceting and which go on the horizontal and vertical axes. They return ggplot objects, so the plots can be customised further with standard ggplot2 layers.
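As a small illustration of that last point, the sketch below builds a stand-in ggplot object from made-up data and then adds ordinary ggplot2 layers to it. The data frame `toy` and the plot `p` are placeholders for whatever a plotting helper returns; only ggplot2 functions are used.

```r
library(ggplot2)

# Made-up data standing in for plotted estimates.
toy <- data.frame(
  sex = c("Female", "Male"),
  median_days = c(240, 267)
)

# Stand-in for a plot returned by one of the plotting helpers.
p <- ggplot(toy, aes(x = sex, y = median_days, fill = sex)) +
  geom_col()

# Standard ggplot2 layers can be added to any ggplot object.
p_custom <- p +
  labs(x = "Sex", y = "Median days between measurements") +
  theme_minimal() +
  theme(legend.position = "none")
```

The same `+` pattern should apply to the objects returned by the plotting functions described next, since they are ggplot objects as well.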
plotMeasurementSummary() visualises days_between_measurements and measurements_per_subject. Supported plot types are "boxplot", "barplot", and "densityplot".
The variable specified in y must be either "days_between_measurements" or "measurements_per_subject", as it is used to filter which of the summary results to plot.
result |>
  plotMeasurementSummary(
    x = "codelist_name",
    y = "days_between_measurements",
    plotType = "boxplot"
  )

result |>
  plotMeasurementSummary(
    x = "sex",
    y = "measurements_per_subject",
    plotType = "boxplot",
    colour = "sex",
    facet = NULL
  ) +
  theme(legend.position = "none")

If density estimates were computed, we can also use plotType = "densityplot" for these variables. The y argument selects the variable to plot, while the x argument is ignored for this plot type.
result |>
  plotMeasurementSummary(
    y = "measurements_per_subject",
    plotType = "densityplot",
    colour = "sex",
    facet = NULL
  )

Since we supplied specific bins for the histogram counts of these variables, we can also use plotType = "barplot":
result |>
  plotMeasurementSummary(
    x = "variable_level",
    plotType = "barplot",
    colour = "variable_level",
    facet = "sex"
  )

result |>
  plotMeasurementSummary(
    y = "measurements_per_subject",
    plotType = "barplot",
    colour = "sex",
    facet = "variable_level"
  )

plotMeasurementValueAsNumber() visualises distributions of numeric measurement values. It offers three plot types, like the measurement summary plots; the example here is a boxplot.
result |>
  plotMeasurementValueAsNumber(
    x = "sex",
    plotType = "boxplot",
    facet = "unit_concept_name",
    colour = "sex"
  )

plotMeasurementValueAsConcept() visualises concept-coded measurement values and their frequencies. Next we plot counts for each concept value in the codelist.
result |>
  plotMeasurementValueAsConcept(
    x = "count",
    y = "variable_level",
    facet = "cdm_name",
    colour = "sex"
  ) +
  ylab("Value as Concept Name")

Instead of counts, we can also plot the percentage for each concept:
result |>
  plotMeasurementValueAsConcept(
    x = "variable_level",
    y = "percentage",
    facet = "cdm_name",
    colour = "sex"
  ) +
  xlab("Value as Concept Name")

The OmopViewer package supports results produced by MeasurementDiagnostics and provides a user-friendly way to quickly generate a Shiny application for exploring diagnostic results interactively.
For example, the following code exports a static Shiny app that allows users to navigate the tables and plots generated in this vignette.
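A minimal sketch of such an export, assuming OmopViewer's `exportStaticApp()` is called with the `result` object and a target `directory`. The fallback to an empty summarised_result is added here only so the snippet runs on its own, and the temporary directory is an arbitrary choice.

```r
library(OmopViewer)

# `result` is the summarised_result created earlier in this vignette.
# If it is not available, fall back to an empty summarised_result so the
# snippet still runs on its own (assumption for illustration only).
if (!exists("result")) {
  result <- omopgenerics::emptySummarisedResult()
}

# Write the static Shiny app files into a temporary directory.
exportStaticApp(result = result, directory = tempdir())
```

In practice you would likely point `directory` at a project folder rather than a temporary one, so the generated app can be kept and shared.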
Tables and plots in MeasurementDiagnostics are generated using the visOmopResults package. Users who wish to create custom tables or visualisations directly from a summarised_result object can do so with the functions that visOmopResults provides.
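As a hedged sketch of that route, the example below builds a table with `visOmopTable()` from the mock summarised_result that visOmopResults ships, used here as a stand-in for the vignette's `result`. The choice of `header` and `hide` columns is illustrative, and `type = "tibble"` is used so the output is a plain data frame rather than a formatted table.

```r
library(visOmopResults)

# Mock summarised_result from visOmopResults, standing in for the
# `result` object created earlier in this vignette.
mock_result <- mockSummarisedResult()

# Build a table directly from the summarised_result.
# type = "tibble" returns a data frame instead of a formatted gt table.
custom_table <- visOmopTable(
  result = mock_result,
  header = "cohort_name",
  hide = "cdm_name",
  type = "tibble"
)

head(custom_table)
```

With the vignette's own result the same call pattern should apply, with `header` and `hide` set to columns that exist in that result, such as "sex" or "codelist_name".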
MeasurementDiagnostics is integrated into the PhenotypeR
package. When cohorts are defined based on measurement codes,
PhenotypeR automatically applies
summariseCohortMeasurementUse() to generate measurement
diagnostics during cohort construction, using the codelists linked to
each cohort.
This integration allows users to assess measurement codelists and cohorts as part of a broader phenotype development workflow.