| Type: | Package |
| Title: | Execute and View Data Quality Checks on OMOP CDM Database |
| Version: | 2.8.6 |
| Date: | 2026-01-22 |
| Author: | Katy Sadowski [aut, cre], Clair Blacketer [aut], Maxim Moinat [aut], Ajit Londhe [aut], Anthony Sena [aut], Anthony Molinaro [aut], Frank DeFalco [aut], Pavel Grafkin [aut] |
| Maintainer: | Katy Sadowski <sadowski@ohdsi.org> |
| Description: | Assesses data quality in Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) databases. Executes data quality checks and provides an R 'shiny' application to view the results. |
| License: | Apache License 2.0 |
| Config/build/clean-inst-doc: | FALSE |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| URL: | https://github.com/OHDSI/DataQualityDashboard |
| BugReports: | https://github.com/OHDSI/DataQualityDashboard/issues |
| Depends: | R (≥ 3.2.2), DatabaseConnector (≥ 2.0.2) |
| Imports: | magrittr, ParallelLogger, dplyr, jsonlite, rJava, SqlRender (≥ 1.10.1), plyr, stringr, rlang, tidyselect, readr |
| Suggests: | testthat, knitr, rmarkdown, markdown, shiny, ggplot2, Eunomia (≥ 2.0.0), duckdb, R.utils, devtools |
| RoxygenNote: | 7.3.2 |
| Encoding: | UTF-8 |
| NeedsCompilation: | no |
| Packaged: | 2026-01-24 18:39:14 UTC; katysadowski |
| Repository: | CRAN |
| Date/Publication: | 2026-01-28 18:50:23 UTC |
Applies the 'Not Applicable' status to a single check
Description
Applies the 'Not Applicable' status to a single check
Usage
.applyNotApplicable(x)
Arguments
x |
Results from a single check |
Value
A numeric value (0 or 1) indicating whether the check is not applicable
Determines whether a check should be marked notApplicable and, if so, the notApplicableReason
Description
Determines whether a check should be marked notApplicable and, if so, the notApplicableReason
Usage
.calculateNotApplicableStatus(checkResults)
Arguments
checkResults |
A dataframe containing the results of the data quality checks |
Value
A dataframe with updated check results including notApplicable status and reasons
Determines whether all checks required for the 'Not Applicable' status are present in checkNames
Description
Determines whether all checks required for the 'Not Applicable' status are present in checkNames
Usage
.containsNAchecks(checkNames)
Arguments
checkNames |
A character vector of check names |
Value
A logical value indicating whether all required checks are present
Internal function to evaluate the data quality checks against given thresholds.
Description
Internal function to evaluate the data quality checks against given thresholds.
Usage
.evaluateThresholds(checkResults, tableChecks, fieldChecks, conceptChecks)
Arguments
checkResults |
A dataframe containing the results of the data quality checks |
tableChecks |
A dataframe containing the table checks |
fieldChecks |
A dataframe containing the field checks |
conceptChecks |
A dataframe containing the concept checks |
Value
A dataframe with updated check results including pass/fail status and threshold values
Internal function to define the id of each check.
Description
Internal function to define the id of each check.
Usage
.getCheckId(
checkLevel,
checkName,
cdmTableName,
cdmFieldName = NA,
conceptId = NA,
unitConceptId = NA
)
Arguments
checkLevel |
The level of the check. Options are table, field, or concept |
checkName |
The name of the data quality check |
cdmTableName |
The name of the CDM data table the quality check is applied to |
cdmFieldName |
The name of the field in the CDM data table the quality check is applied to |
conceptId |
The concept id the quality check is applied to |
unitConceptId |
The unit concept id the quality check is applied to |
Value
A character string representing the unique check ID
Determines whether all checks expected for calculating the 'Not Applicable' status are present
Description
Determines whether all checks expected for calculating the 'Not Applicable' status are present
Usage
.hasNAchecks(checkResults)
Arguments
checkResults |
A dataframe containing the results of the data quality checks |
Value
A logical value indicating whether all required checks are present
Internal function to determine if the connection needs auto commit
Description
Internal function to determine if the connection needs auto commit
Usage
.needsAutoCommit(connectionDetails, connection)
Arguments
connectionDetails |
A connectionDetails object for connecting to the CDM database |
connection |
A connection for connecting to the CDM database using the DatabaseConnector::connect(connectionDetails) function. |
Value
A logical value indicating if the connection needs auto commit
Internal function to send the fully qualified sql to the database and return the numerical result.
Description
Internal function to send the fully qualified sql to the database and return the numerical result.
Usage
.processCheck(
connection,
connectionDetails,
check,
checkDescription,
sql,
outputFolder
)
Arguments
connection |
A connection for connecting to the CDM database using the DatabaseConnector::connect(connectionDetails) function. |
connectionDetails |
A connectionDetails object for connecting to the CDM database. |
check |
The data quality check |
checkDescription |
The description of the data quality check |
sql |
The fully qualified sql for the data quality check |
outputFolder |
The folder to output logs and SQL files to. |
Value
A dataframe containing the check results
Internal function to read threshold files
Description
Internal function to read threshold files
Usage
.readThresholdFile(checkThresholdLoc, defaultLoc)
Arguments
checkThresholdLoc |
The location of the threshold file |
defaultLoc |
The default location of the threshold file |
Value
A dataframe containing the threshold data
Internal function to put the results of each quality check into a dataframe.
Description
Internal function to put the results of each quality check into a dataframe.
Usage
.recordResult(
result = NULL,
check,
checkDescription,
sql,
executionTime = NA,
warning = NA,
error = NA
)
Arguments
result |
The result of the data quality check |
check |
The data quality check |
checkDescription |
The description of the data quality check |
sql |
The fully qualified sql for the data quality check |
executionTime |
The total time it took to execute the data quality check |
warning |
Any warnings returned from the server |
error |
Any errors returned from the server |
Value
A dataframe containing the check results
Internal function to run and process each data quality check.
Description
Internal function to run and process each data quality check.
Usage
.runCheck(
checkDescription,
tableChecks,
fieldChecks,
conceptChecks,
connectionDetails,
connection,
cdmDatabaseSchema,
vocabDatabaseSchema,
resultsDatabaseSchema,
writeTableName,
cohortDatabaseSchema,
cohortTableName,
cohortDefinitionId,
outputFolder,
sqlOnlyUnionCount,
sqlOnlyIncrementalInsert,
sqlOnly
)
Arguments
checkDescription |
The description of the data quality check |
tableChecks |
A dataframe containing the table checks |
fieldChecks |
A dataframe containing the field checks |
conceptChecks |
A dataframe containing the concept checks |
connectionDetails |
A connectionDetails object for connecting to the CDM database |
connection |
A connection for connecting to the CDM database using the DatabaseConnector::connect(connectionDetails) function. |
cdmDatabaseSchema |
The fully qualified database name of the CDM schema |
vocabDatabaseSchema |
The fully qualified database name of the vocabulary schema (default is to set it as the cdmDatabaseSchema) |
resultsDatabaseSchema |
The fully qualified database name of the results schema |
writeTableName |
The table to write DQD results to. Used when sqlOnly or writeToTable is TRUE. |
cohortDatabaseSchema |
The schema where the cohort table is located. |
cohortTableName |
The name of the cohort table. |
cohortDefinitionId |
The cohort definition id for the cohort you wish to run the DQD on. The package assumes a standard OHDSI cohort table called 'Cohort' |
outputFolder |
The folder to output logs and SQL files to |
sqlOnlyUnionCount |
(OPTIONAL) How many SQL commands to union before inserting them into the output table (can speed processing when queries are run in parallel). Default is 1. |
sqlOnlyIncrementalInsert |
(OPTIONAL) Boolean to determine whether to insert check results and associated metadata into the output table. Default is FALSE (for backwards compatibility with versions <= v2.2.0) |
sqlOnly |
Should the SQLs be executed (FALSE) or just returned (TRUE)? |
Value
A dataframe containing the check results or SQL queries (NULL if sqlOnlyIncrementalInsert is TRUE)
Internal function to summarize the results of the DQD run.
Description
Internal function to summarize the results of the DQD run.
Usage
.summarizeResults(checkResults)
Arguments
checkResults |
A dataframe containing the results of the checks after running against the database |
Value
A list containing summary statistics of the check results
Internal function to write the check results to a csv file.
Description
Internal function to write the check results to a csv file.
Usage
.writeResultsToCsv(
checkResults,
csvPath,
columns = c("checkId", "failed", "passed", "isError", "notApplicable", "checkName",
"checkDescription", "thresholdValue", "notesValue", "checkLevel", "category",
"subcategory", "context", "checkLevel", "cdmTableName", "cdmFieldName", "conceptId",
"unitConceptId", "numViolatedRows", "pctViolatedRows", "numDenominatorRows",
"executionTime", "notApplicableReason", "error", "queryText"),
delimiter = ","
)
Arguments
checkResults |
A dataframe containing the fully summarized data quality check results |
csvPath |
The path where the csv file should be written |
columns |
The columns to be included in the csv file. Default is all columns in the checkResults dataframe. |
delimiter |
The delimiter for the file. Default is comma. |
Value
NULL (writes results to CSV file)
Write DQD results to json
Description
Write DQD results to json
Usage
.writeResultsToJson(result, outputFolder, outputFile)
Arguments
result |
A DQD results object (list) |
outputFolder |
The output folder |
outputFile |
The output filename |
Value
NULL (writes results to JSON file)
Internal function to write the check results to a table in the database. Requires write access to the database
Description
Internal function to write the check results to a table in the database. Requires write access to the database
Usage
.writeResultsToTable(
connectionDetails,
resultsDatabaseSchema,
checkResults,
writeTableName,
cohortDefinitionId
)
Arguments
connectionDetails |
A connectionDetails object for connecting to the CDM database |
resultsDatabaseSchema |
The fully qualified database name of the results schema |
checkResults |
A dataframe containing the fully summarized data quality check results |
writeTableName |
The name of the table to be written to the database. Default is "dqdashboard_results". |
cohortDefinitionId |
(OPTIONAL) The cohort definition id for the cohort you wish to run the DQD on. The package assumes a standard OHDSI cohort table called 'Cohort' with the fields cohort_definition_id and subject_id. |
Value
NULL (writes results to database table)
Convert JSON results file case
Description
Convert a DQD JSON results file between camelCase and (all-caps) snake_case. Enables viewing of pre-v2.1.0 results files in later DQD versions, and vice versa.
Usage
convertJsonResultsFileCase(
jsonFilePath,
writeToFile,
outputFolder = NA,
outputFile = "",
targetCase
)
Arguments
jsonFilePath |
Path to the JSON results file to be converted |
writeToFile |
Whether or not to write the converted results back to a file (must be either TRUE or FALSE) |
outputFolder |
The folder to output the converted JSON results file to |
outputFile |
(OPTIONAL) File to write converted results JSON object to. Default is name of input file with a "_camel" or "_snake" postfix |
targetCase |
Case into which the results file parameters should be converted (must be either "camel" or "snake") |
Value
DQD results object (a named list)
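A minimal sketch of converting an older results file to camelCase for viewing in a current DQD version; the file path and output folder shown here are hypothetical.

```r
library(DataQualityDashboard)

# Convert a pre-v2.1.0 (snake_case) results file to camelCase and write it
# out; the converted object is also returned invisibly as a named list.
results <- convertJsonResultsFileCase(
  jsonFilePath = "old_results.json",   # hypothetical input path
  writeToFile = TRUE,
  outputFolder = "output",             # hypothetical output folder
  targetCase = "camel"
)
```

With the default outputFile, the converted file is written with a "_camel" (or "_snake") postfix appended to the input file name.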
Execute DQ checks
Description
This function will connect to the database, generate the sql scripts, and run the data quality checks against the database. By default, results will be written to a json file as well as a database table.
Usage
executeDqChecks(
connectionDetails,
cdmDatabaseSchema,
resultsDatabaseSchema,
vocabDatabaseSchema = cdmDatabaseSchema,
cdmSourceName,
numThreads = 1,
sqlOnly = FALSE,
sqlOnlyUnionCount = 1,
sqlOnlyIncrementalInsert = FALSE,
outputFolder,
outputFile = "",
verboseMode = FALSE,
writeToTable = TRUE,
writeTableName = "dqdashboard_results",
writeToCsv = FALSE,
csvFile = "",
checkLevels = c("TABLE", "FIELD", "CONCEPT"),
checkNames = c(),
checkSeverity = c("fatal", "convention", "characterization"),
cohortDefinitionId = c(),
cohortDatabaseSchema = resultsDatabaseSchema,
cohortTableName = "cohort",
tablesToExclude = c("CONCEPT", "VOCABULARY", "CONCEPT_ANCESTOR",
"CONCEPT_RELATIONSHIP", "CONCEPT_CLASS", "CONCEPT_SYNONYM", "RELATIONSHIP", "DOMAIN"),
cdmVersion = "5.3",
tableCheckThresholdLoc = "default",
fieldCheckThresholdLoc = "default",
conceptCheckThresholdLoc = "default"
)
Arguments
connectionDetails |
A connectionDetails object for connecting to the CDM database |
cdmDatabaseSchema |
The fully qualified database name of the CDM schema |
resultsDatabaseSchema |
The fully qualified database name of the results schema |
vocabDatabaseSchema |
The fully qualified database name of the vocabulary schema (default is to set it as the cdmDatabaseSchema) |
cdmSourceName |
The name of the CDM data source |
numThreads |
The number of concurrent threads to use to execute the queries |
sqlOnly |
Should the SQLs be executed (FALSE) or just returned (TRUE)? |
sqlOnlyUnionCount |
(OPTIONAL) In sqlOnlyIncrementalInsert mode, how many SQL commands to union in each query to insert check results into results table (can speed processing when queries done in parallel). Default is 1. |
sqlOnlyIncrementalInsert |
(OPTIONAL) In sqlOnly mode, boolean to determine whether to generate SQL queries that insert check results and associated metadata into results table. Default is FALSE (for backwards compatibility to <= v2.2.0) |
outputFolder |
The folder to output logs, SQL files, and JSON results file to |
outputFile |
(OPTIONAL) File to write results JSON object |
verboseMode |
Boolean to determine if the console will show all execution steps. Default is FALSE |
writeToTable |
Boolean to indicate if the check results will be written to the dqdashboard_results table in the resultsDatabaseSchema. Default is TRUE |
writeTableName |
The name of the results table. Defaults to 'dqdashboard_results'. Used when sqlOnly or writeToTable is TRUE. |
writeToCsv |
Boolean to indicate if the check results will be written to a csv file. Default is FALSE |
csvFile |
(OPTIONAL) CSV file to write results |
checkLevels |
Choose which DQ check levels to execute. Default is all 3 (TABLE, FIELD, CONCEPT) |
checkNames |
(OPTIONAL) Choose which check names to execute. Names can be found in inst/csv/OMOP_CDM_v[cdmVersion]_Check_Descriptions.csv. Note that "cdmTable", "cdmField" and "measureValueCompleteness" are always executed. |
checkSeverity |
Choose which DQ check severity levels to execute. Default is all 3 (fatal, convention, characterization) |
cohortDefinitionId |
The cohort definition id for the cohort you wish to run the DQD on. The package assumes a standard OHDSI cohort table with the fields cohort_definition_id and subject_id. |
cohortDatabaseSchema |
The schema where the cohort table is located. |
cohortTableName |
The name of the cohort table. Defaults to 'cohort'. |
tablesToExclude |
(OPTIONAL) Choose which CDM tables to exclude from the execution. |
cdmVersion |
The CDM version to target for the data source. Options are "5.2", "5.3", or "5.4". By default, "5.3" is used. |
tableCheckThresholdLoc |
The location of the threshold file for evaluating the table checks. If not specified the default thresholds will be applied. |
fieldCheckThresholdLoc |
The location of the threshold file for evaluating the field checks. If not specified the default thresholds will be applied. |
conceptCheckThresholdLoc |
The location of the threshold file for evaluating the concept checks. If not specified the default thresholds will be applied. |
Value
A list object of results
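A minimal sketch of a full run against the Eunomia example dataset (assumes the Eunomia package is installed; Eunomia stores the CDM in the "main" schema, and the source name and output folder below are illustrative choices).

```r
library(DataQualityDashboard)

# Eunomia ships a small example OMOP CDM in a local database.
connectionDetails <- Eunomia::getEunomiaConnectionDetails()

results <- executeDqChecks(
  connectionDetails = connectionDetails,
  cdmDatabaseSchema = "main",        # Eunomia's CDM schema
  resultsDatabaseSchema = "main",
  cdmSourceName = "Eunomia",         # illustrative source name
  outputFolder = "output",           # logs and JSON results land here
  writeToTable = FALSE               # skip writing to a database table
)
```

The returned list can be inspected directly, and the JSON file written to outputFolder can be passed to viewDqDashboard to browse the results.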
List DQ checks
Description
Details on all checks defined by the DataQualityDashboard Package.
Usage
listDqChecks(
cdmVersion = "5.3",
tableCheckThresholdLoc = "default",
fieldCheckThresholdLoc = "default",
conceptCheckThresholdLoc = "default"
)
Arguments
cdmVersion |
The CDM version to target for the data source. By default, 5.3 is used. |
tableCheckThresholdLoc |
The location of the threshold file for evaluating the table checks. If not specified the default thresholds will be applied. |
fieldCheckThresholdLoc |
The location of the threshold file for evaluating the field checks. If not specified the default thresholds will be applied. |
conceptCheckThresholdLoc |
The location of the threshold file for evaluating the concept checks. If not specified the default thresholds will be applied. |
Value
A list containing check descriptions, table checks, field checks, and concept checks
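A short sketch of browsing the check definitions without touching a database; the element names shown in the comment reflect the Value description above.

```r
library(DataQualityDashboard)

# Retrieve all check definitions and default thresholds for CDM v5.4.
checks <- listDqChecks(cdmVersion = "5.4")

# The result is a list of data frames: check descriptions plus the
# table-, field-, and concept-level threshold tables.
names(checks)
```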
Re-evaluate Thresholds
Description
Re-evaluate an existing DQD result against an updated thresholds file.
Usage
reEvaluateThresholds(
jsonFilePath,
outputFolder,
outputFile,
tableCheckThresholdLoc = "default",
fieldCheckThresholdLoc = "default",
conceptCheckThresholdLoc = "default",
cdmVersion = "5.3"
)
Arguments
jsonFilePath |
Path to the JSON results file generated using the execute function |
outputFolder |
The folder to output new JSON result file to |
outputFile |
File to write results JSON object to |
tableCheckThresholdLoc |
The location of the threshold file for evaluating the table checks. If not specified the default thresholds will be applied. |
fieldCheckThresholdLoc |
The location of the threshold file for evaluating the field checks. If not specified the default thresholds will be applied. |
conceptCheckThresholdLoc |
The location of the threshold file for evaluating the concept checks. If not specified the default thresholds will be applied. |
cdmVersion |
The CDM version to target for the data source. By default, 5.3 is used. |
Value
A list containing the re-evaluated DQD results
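A minimal sketch of re-scoring an existing run against a custom table-level threshold file, avoiding a full re-execution; all file paths here are hypothetical.

```r
library(DataQualityDashboard)

# Re-evaluate a previous run's results against updated table thresholds.
reEvaluated <- reEvaluateThresholds(
  jsonFilePath = file.path("output", "results.json"),     # prior run output
  outputFolder = "output",
  outputFile = "results_reevaluated.json",
  tableCheckThresholdLoc = "my_table_thresholds.csv"      # hypothetical custom thresholds
)
```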
View DQ Dashboard
Description
View DQ Dashboard
Usage
viewDqDashboard(jsonPath, launch.browser = NULL, display.mode = NULL, ...)
Arguments
jsonPath |
The fully-qualified path to the JSON file produced by executeDqChecks |
launch.browser |
Passed on to shiny::runApp() |
display.mode |
Passed on to shiny::runApp() |
... |
Extra parameters for shiny::runApp() like "port" or "host" |
Value
NULL (launches Shiny application)
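A one-line sketch of launching the dashboard on a results file from a prior run; the path is hypothetical.

```r
library(DataQualityDashboard)

# Launch the Shiny dashboard on a previously generated JSON results file.
viewDqDashboard(jsonPath = file.path("output", "results.json"))
```

Extra shiny::runApp() arguments such as port can be passed through `...`, e.g. `viewDqDashboard(jsonPath, port = 8080)`.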
Write DQD results database table to json
Description
Write DQD results database table to json
Usage
writeDBResultsToJson(
connection,
resultsDatabaseSchema,
cdmDatabaseSchema,
writeTableName,
outputFolder,
outputFile
)
Arguments
connection |
A connection object |
resultsDatabaseSchema |
The fully qualified database name of the results schema |
cdmDatabaseSchema |
The fully qualified database name of the CDM schema |
writeTableName |
Name of DQD results table in the database to read from |
outputFolder |
The folder to output the json results file to |
outputFile |
The output filename of the json results file |
Value
NULL (writes results to JSON file)
Write JSON Results to CSV file
Description
Write JSON Results to CSV file
Usage
writeJsonResultsToCsv(
jsonPath,
csvPath,
columns = c("checkId", "failed", "passed", "isError", "notApplicable", "checkName",
"checkDescription", "thresholdValue", "notesValue", "checkLevel", "category",
"subcategory", "context", "checkLevel", "cdmTableName", "cdmFieldName", "conceptId",
"unitConceptId", "numViolatedRows", "pctViolatedRows", "numDenominatorRows",
"executionTime", "notApplicableReason", "error", "queryText"),
delimiter = ","
)
Arguments
jsonPath |
Path to the JSON results file generated using the execute function |
csvPath |
Path to the CSV output file |
columns |
(OPTIONAL) List of desired columns |
delimiter |
(OPTIONAL) CSV delimiter |
Value
NULL (writes results to CSV file)
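A minimal sketch of exporting a results file to CSV with a trimmed column set; the paths are hypothetical and the column names are drawn from the default columns argument above.

```r
library(DataQualityDashboard)

# Export selected columns of a JSON results file to CSV.
writeJsonResultsToCsv(
  jsonPath = file.path("output", "results.json"),
  csvPath = file.path("output", "results.csv"),
  columns = c("checkId", "checkName", "failed", "pctViolatedRows")
)
```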
Write JSON Results to SQL Table
Description
Write JSON Results to SQL Table
Usage
writeJsonResultsToTable(
connectionDetails,
resultsDatabaseSchema,
jsonFilePath,
writeTableName = "dqdashboard_results",
cohortDefinitionId = c(),
singleTable = FALSE
)
Arguments
connectionDetails |
A connectionDetails object for connecting to the CDM database |
resultsDatabaseSchema |
The fully qualified database name of the results schema |
jsonFilePath |
Path to the JSON results file generated using the execute function |
writeTableName |
Name of table in the database to write results to |
cohortDefinitionId |
If writing results for a single cohort this is the ID that will be appended to the table name |
singleTable |
If TRUE, writes all results to a single table. If FALSE (default), writes to 3 separate tables by check level (table, field, concept) (NOTE this default behavior will be deprecated in the future) |
Value
NULL (writes results to database table)
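A sketch of loading a JSON results file into a single database table; the connection parameters and schema names are hypothetical, and createConnectionDetails comes from DatabaseConnector.

```r
library(DataQualityDashboard)

# Hypothetical connection details; substitute your own dbms/server/credentials.
connectionDetails <- DatabaseConnector::createConnectionDetails(
  dbms = "postgresql",
  server = "localhost/cdm",
  user = "user",
  password = "password"
)

# Write all results into one table rather than three per-level tables.
writeJsonResultsToTable(
  connectionDetails = connectionDetails,
  resultsDatabaseSchema = "results",
  jsonFilePath = file.path("output", "results.json"),
  writeTableName = "dqdashboard_results",
  singleTable = TRUE
)
```

Setting singleTable = TRUE avoids the per-check-level table layout that the documentation notes is slated for deprecation.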