This vignette assumes a SQL server at localhost (we
use PostgreSQL), with data in OMOP Common Data Model v5.4 format in
schema cdm_new_york3. The patient records shown in this
example are synthetic data from Synthea(TM) Patient
Generator.
library(phea)
library(dplyr)
# Connect to SQL server.
dbcon <- DBI::dbConnect(RPostgres::Postgres(),
host = fabcred$pg$host, port = fabcred$pg$port, dbname = fabcred$pg$database,
user = fabcred$pg$user, password = fabcred$pg$pass)
# Setup Phea
setup_phea(dbcon, 'cdm_new_york3')In this vignette we identify:
Case A.
Case B.
Case C.
Here’s how we compute it:
In both cases, we will normalize the unit of measurement of serum creatinine to md/dL prior to computing formulas. We could also use formulas for that, and the result would be the same.
We collect SCr records from MEASUREMENT and convert the
units to mg/dL.
# Serum creatinine codes used:
# Loinc 38483-4 Creatinine [Mass/volume] in Blood, OMOP CDM concept ID 3051825
# Loinc 2160-0 Creatinine [Mass/volume] in Serum or Plasma, OMOP CDM concept ID 3016723
# "A" records: Unit is 'mg/dL'.
scr_records_a <- sqlt(measurement) |>
filter(measurement_concept_id %in% c(3051825, 3016723) &&
unit_source_value == 'mg/dL')
# "B" records: Unit is 'µmol/L', but we convert to 'mg/dL'.
scr_records_b <- sqlt(measurement) |>
filter(measurement_concept_id %in% c(3051825, 3016723) &&
unit_source_value == 'µmol/L') |>
mutate( # Convert µmol/L to mg/dL
value_as_number = value_as_number / 88.42,
unit_source_value = 'mg/dL')
# Combine all available serum creatinine records.
scr_records <- union_all(scr_records_a, scr_records_b)
# Make a record source.
scr_record_source <- make_record_source(scr_records,
ts = 'measurement_datetime',
pid = 'person_id')We collect GFR records from MEASUREMENT.
# GFR codes used:
# Loinc 77147-7 Glomerular filtration rate/1.73 sq M.predicted [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (MDRD)
# OMOP CDM concept ID 46236952
gfr_records <- sqlt(measurement) |>
filter(measurement_concept_id == 46236952)
gfr_record_source <- make_record_source(gfr_records,
ts = 'measurement_datetime',
pid = 'person_id')Formula scr_case_a contains the logic of case A
(difference ≥0.3 mg/dL), scr_case_b contains case B (ratio
≥1.5), and gfr_case_c contains case C (two GFRs under 60
within 3-5 months).
Notice we use 48 days instead of 48 hours, and 7 months instead of 7 days. This is just to make a visually meaningful example out of the limited data that is produced by Synthea(TM). Those time intervals can be changed to “48 hours” and “7 days” to produce the real phenotype.
scr_change <- calculate_formula(
components = list(
# Current SCr
scr = make_component(scr_record_source),
# Minimum value within 48-hour window
scr_48h_min = make_component(scr_record_source,
window = "'48 days'::interval",
fn = list(value_as_number = 'min')),
# Minimum value within 7-day window
scr_7d_min = make_component(scr_record_source,
window = "'7 months'::interval",
fn = list(value_as_number = 'min')),
# Current glomerular filtration rate (GFR)
gfr = make_component(gfr_record_source),
# Glomerular filtration rate 3 to 5 months older than phenotype date
gfr_prior = make_component(gfr_record_source,
delay = "'3 months'::interval",
window = "'5 months'::interval",
fn = list(value_as_number = 'min'))),
fml = list(
scr_case_a = 'scr_value_as_number - scr_48h_min_value_as_number >= 0.3',
scr_case_b = 'scr_value_as_number / scr_7d_min_value_as_number >= 1.5',
gfr_case_c = 'gfr_value_as_number < 60 AND gfr_prior_value_as_number < 60'),
export = c(
'scr_measurement_datetime',
'scr_48h_min_measurement_datetime',
'scr_7d_min_measurement_datetime',
'gfr_measurement_datetime',
'gfr_prior_measurement_datetime'),
cascaded = FALSE, # Because we don't need to use results of prior formulas inside other formulas.
)Let us take a small peek at 15 rows from the phenotype results.
head_shot(scr_change, 15) |>
kable()| phea_row_id | pid | ts | window | scr_value_as_number | scr_48h_min_value_as_number | scr_7d_min_value_as_number | gfr_value_as_number | gfr_prior_value_as_number | scr_measurement_datetime | scr_48h_min_measurement_datetime | scr_7d_min_measurement_datetime | gfr_measurement_datetime | gfr_prior_measurement_datetime | scr_case_a | scr_case_b | gfr_case_c |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 2014-03-07 | 00:00:00 | 1.4 | 1.4 | 1.4 | NA | NA | 2014-03-07 | 2014-03-07 | 2014-03-07 | NA | NA | FALSE | FALSE | NA |
| 2 | 1 | 2014-03-07 | 00:00:00 | 1.4 | 1.4 | 1.4 | 124.5 | NA | 2014-03-07 | 2014-03-07 | 2014-03-07 | 2014-03-07 | NA | FALSE | FALSE | FALSE |
| 3 | 1 | 2016-03-11 | 735 days | 1.4 | 1.3 | 1.3 | 120.9 | NA | 2014-03-07 | 2016-03-11 | 2016-03-11 | 2016-03-11 | NA | FALSE | FALSE | FALSE |
| 4 | 1 | 2016-03-11 | 00:00:00 | 1.3 | 1.3 | 1.3 | 120.9 | NA | 2016-03-11 | 2016-03-11 | 2016-03-11 | 2016-03-11 | NA | FALSE | FALSE | FALSE |
| 5 | 1 | 2018-03-16 | 735 days | 1.3 | 1.3 | 1.3 | 120.9 | NA | 2018-03-16 | 2018-03-16 | 2018-03-16 | 2016-03-11 | NA | FALSE | FALSE | FALSE |
| 6 | 1 | 2018-03-16 | 00:00:00 | 1.3 | 1.3 | 1.3 | 151.7 | NA | 2018-03-16 | 2018-03-16 | 2018-03-16 | 2018-03-16 | NA | FALSE | FALSE | FALSE |
| 7 | 1 | 2020-03-20 | 735 days | 1.3 | 1.3 | 1.3 | 107.3 | NA | 2018-03-16 | 2020-03-20 | 2020-03-20 | 2020-03-20 | NA | FALSE | FALSE | FALSE |
| 8 | 1 | 2020-03-20 | 00:00:00 | 1.3 | 1.3 | 1.3 | 107.3 | NA | 2020-03-20 | 2020-03-20 | 2020-03-20 | 2020-03-20 | NA | FALSE | FALSE | FALSE |
| 9 | 1 | 2022-02-11 | 693 days | 1.5 | 1.5 | 1.5 | 107.3 | NA | 2022-02-11 | 2022-02-11 | 2022-02-11 | 2020-03-20 | NA | FALSE | FALSE | FALSE |
| 10 | 1 | 2022-02-11 | 693 days | 1.8 | 1.5 | 1.5 | 107.3 | NA | 2022-02-11 | 2022-02-11 | 2022-02-11 | 2020-03-20 | NA | TRUE | FALSE | FALSE |
| 11 | 1 | 2022-02-11 | 00:00:00 | 1.8 | 1.5 | 1.5 | 84.8 | NA | 2022-02-11 | 2022-02-11 | 2022-02-11 | 2022-02-11 | NA | TRUE | FALSE | FALSE |
| 12 | 1 | 2022-02-11 | 00:00:00 | 1.8 | 1.5 | 1.5 | 132.3 | NA | 2022-02-11 | 2022-02-11 | 2022-02-11 | 2022-02-11 | NA | TRUE | FALSE | FALSE |
| 13 | 1 | 2022-03-25 | 42 days | 1.8 | 1.5 | 1.5 | 84.1 | NA | 2022-02-11 | 2022-02-11 | 2022-02-11 | 2022-03-25 | NA | TRUE | FALSE | FALSE |
| 14 | 1 | 2022-03-25 | 42 days | 1.8 | 1.5 | 1.5 | 105.7 | NA | 2022-02-11 | 2022-02-11 | 2022-02-11 | 2022-03-25 | NA | TRUE | FALSE | FALSE |
| 15 | 1 | 2022-03-25 | 00:00:00 | 1.5 | 1.5 | 1.5 | 105.7 | NA | 2022-03-25 | 2022-02-11 | 2022-02-11 | 2022-03-25 | NA | FALSE | FALSE | FALSE |
For demonstration purposes, let us pick an “interesting” patient. By
“interesting” I just mean a patient who at different times was
TRUE and FALSE in each of the three criteria.
A patient with variability in their timeline.
# Find an "interesting" case to plot: patients with each criteria at times TRUE, at times FALSE
patients <- scr_change |>
group_by(pid) |>
summarise(.groups = 'drop',
n_a = n_distinct(scr_case_a),
n_b = n_distinct(scr_case_b),
n_c = n_distinct(gfr_case_c)) |>
mutate(sort_variable = n_a + n_b + n_c) |>
select(pid, sort_variable) |>
collect()
patients <- patients |>
arrange(desc(sort_variable)) |>
pull('pid')
patients <- patients[1:20]
random_patient <- sample(patients, 1)
message('Sampled patient: ', random_patient)
#> Sampled patient: 729Then we plot all data for the chosen patient (pid =
729).
scr_change |>
select(-ends_with('datetime')) |>
mutate(
scr_case_a_or_b = scr_case_a | scr_case_b) |>
phea_plot(random_patient)
#> Collecting lazy table, done. (turn this message off with `verbose` or `.verbose` in setup_phea())At the end of this report I include a large table with all the data for this patient, directly from the record sources, for maximum verification.
To see the SQL query underlying the phenotype, use helper function
code_shot(), or dbplyr::sql_render(), or the
.clip_sql option in calculate_formula().
code_shot(scr_change)SELECT
*,
scr_value_as_number - scr_48h_min_value_as_number >= 0.3 AS "scr_case_a",
scr_value_as_number / scr_7d_min_value_as_number >= 1.5 AS "scr_case_b",
gfr_value_as_number < 60 AND gfr_prior_value_as_number < 60 AS "gfr_case_c"
FROM (
SELECT
"phea_row_id",
"pid",
"ts",
greatest("scr_ts", "scr_48h_min_ts", "scr_7d_min_ts", "gfr_ts", "gfr_prior_ts") - least("scr_ts", "scr_48h_min_ts", "scr_7d_min_ts", "gfr_ts", "gfr_prior_ts") AS "window",
"scr_value_as_number",
"scr_48h_min_value_as_number",
"scr_7d_min_value_as_number",
"gfr_value_as_number",
"gfr_prior_value_as_number",
"scr_measurement_datetime",
"scr_48h_min_measurement_datetime",
"scr_7d_min_measurement_datetime",
"gfr_measurement_datetime",
"gfr_prior_measurement_datetime"
FROM (
SELECT
row_number() OVER (ORDER BY "pid", "ts") AS "phea_row_id",
"pid",
"ts",
phea_last_value_ignore_nulls(case when "name" = 236253 then "measurement_datetime" else null end) OVER (PARTITION BY "pid" ORDER BY "ts" ROWS UNBOUNDED PRECEDING) AS "scr_measurement_datetime",
phea_last_value_ignore_nulls(case when "name" = 236253 then "value_as_number" else null end) OVER (PARTITION BY "pid" ORDER BY "ts" ROWS UNBOUNDED PRECEDING) AS "scr_value_as_number",
phea_last_value_ignore_nulls(case when "name" = 236253 then "ts" else null end) OVER (PARTITION BY "pid" ORDER BY "ts" ROWS UNBOUNDED PRECEDING) AS "scr_ts",
min(case when "name" = 236253 then "measurement_datetime" else null end) over (partition by "pid" order by "ts" range between '48 days'::interval preceding and current row) AS "scr_48h_min_measurement_datetime",
min(case when "name" = 236253 then "value_as_number" else null end) over (partition by "pid" order by "ts" range between '48 days'::interval preceding and current row) AS "scr_48h_min_value_as_number",
phea_last_value_ignore_nulls(case when "name" = 236253 then "ts" else null end) over (partition by "pid" order by "ts" range between '48 days'::interval preceding and current row) AS "scr_48h_min_ts",
min(case when "name" = 236253 then "measurement_datetime" else null end) over (partition by "pid" order by "ts" range between '7 months'::interval preceding and current row) AS "scr_7d_min_measurement_datetime",
min(case when "name" = 236253 then "value_as_number" else null end) over (partition by "pid" order by "ts" range between '7 months'::interval preceding and current row) AS "scr_7d_min_value_as_number",
phea_last_value_ignore_nulls(case when "name" = 236253 then "ts" else null end) over (partition by "pid" order by "ts" range between '7 months'::interval preceding and current row) AS "scr_7d_min_ts",
phea_last_value_ignore_nulls(case when "name" = 666684 then "measurement_datetime" else null end) OVER (PARTITION BY "pid" ORDER BY "ts" ROWS UNBOUNDED PRECEDING) AS "gfr_measurement_datetime",
phea_last_value_ignore_nulls(case when "name" = 666684 then "value_as_number" else null end) OVER (PARTITION BY "pid" ORDER BY "ts" ROWS UNBOUNDED PRECEDING) AS "gfr_value_as_number",
phea_last_value_ignore_nulls(case when "name" = 666684 then "ts" else null end) OVER (PARTITION BY "pid" ORDER BY "ts" ROWS UNBOUNDED PRECEDING) AS "gfr_ts",
min(case when "name" = 666684 then "measurement_datetime" else null end) over (partition by "pid" order by "ts" range between '5 months'::interval preceding and '3 months'::interval preceding) AS "gfr_prior_measurement_datetime",
min(case when "name" = 666684 then "value_as_number" else null end) over (partition by "pid" order by "ts" range between '5 months'::interval preceding and '3 months'::interval preceding) AS "gfr_prior_value_as_number",
phea_last_value_ignore_nulls(case when "name" = 666684 then "ts" else null end) over (partition by "pid" order by "ts" range between '5 months'::interval preceding and '3 months'::interval preceding) AS "gfr_prior_ts"
FROM (
(
SELECT
236253 AS "name",
"person_id" AS "pid",
"measurement_datetime" AS "ts",
"measurement_datetime",
"value_as_number"
FROM (
(
SELECT *
FROM "cdm_new_york3"."measurement"
WHERE ("measurement_concept_id" IN (3051825.0, 3016723.0) AND "unit_source_value" = 'mg/dL')
)
UNION ALL
(
SELECT
"measurement_id",
"person_id",
"measurement_concept_id",
"measurement_date",
"measurement_datetime",
"measurement_time",
"measurement_type_concept_id",
"operator_concept_id",
"value_as_number" / 88.42 AS "value_as_number",
"value_as_concept_id",
"unit_concept_id",
"range_low",
"range_high",
"provider_id",
"visit_occurrence_id",
"visit_detail_id",
"measurement_source_value",
"measurement_source_concept_id",
'mg/dL' AS "unit_source_value",
"unit_source_concept_id",
"value_source_value",
"measurement_event_id",
"meas_event_field_concept_id"
FROM "cdm_new_york3"."measurement"
WHERE ("measurement_concept_id" IN (3051825.0, 3016723.0) AND "unit_source_value" = 'µmol/L')
)
) "q01"
)
UNION ALL
(
SELECT
666684 AS "name",
"person_id" AS "pid",
"measurement_datetime" AS "ts",
"measurement_datetime",
"value_as_number"
FROM "cdm_new_york3"."measurement"
WHERE ("measurement_concept_id" = 46236952.0)
)
) "q02"
) "q03"
) "q04"calculate_formula()Sometimes the best way to check a result is to manually, painstakingly go over the data. Below I print all rows from the record sources for the patient that was plotted above.
scr_records |>
filter(person_id == random_patient) |>
collect() |>
arrange(measurement_datetime) |>
select(measurement_id, person_id, measurement_datetime, value_as_number, unit_source_value) |>
kable()| measurement_id | person_id | measurement_datetime | value_as_number | unit_source_value |
|---|---|---|---|---|
| 291341 | 729 | 2003-06-08 | 1.3 | mg/dL |
| 292202 | 729 | 2004-06-13 | 1.4 | mg/dL |
| 292012 | 729 | 2005-06-19 | 1.3 | mg/dL |
| 292605 | 729 | 2006-04-30 | 1.5 | mg/dL |
| 292619 | 729 | 2006-04-30 | 2.1 | mg/dL |
| 291412 | 729 | 2006-06-25 | 1.6 | mg/dL |
| 291427 | 729 | 2006-06-25 | 2.1 | mg/dL |
| 292230 | 729 | 2007-07-01 | 2.1 | mg/dL |
| 291480 | 729 | 2008-07-06 | 1.9 | mg/dL |
| 292272 | 729 | 2008-08-17 | 1.9 | mg/dL |
| 292289 | 729 | 2009-07-12 | 1.9 | mg/dL |
| 292662 | 729 | 2010-03-14 | 3.5 | mg/dL |
| 291377 | 729 | 2010-07-18 | 2.8 | mg/dL |
| 292542 | 729 | 2011-07-24 | 2.2 | mg/dL |
| 291722 | 729 | 2012-04-01 | 2.5 | mg/dL |
| 292160 | 729 | 2012-07-29 | 4.0 | mg/dL |
| 291992 | 729 | 2012-08-12 | 3.2 | mg/dL |
| 292444 | 729 | 2013-08-04 | 3.1 | mg/dL |
| 292643 | 729 | 2013-11-24 | 3.4 | mg/dL |
| 292328 | 729 | 2014-05-18 | 3.0 | mg/dL |
| 291836 | 729 | 2014-08-10 | 2.4 | mg/dL |
| 291907 | 729 | 2015-08-16 | 2.6 | mg/dL |
| 291605 | 729 | 2015-09-13 | 2.0 | mg/dL |
| 292345 | 729 | 2016-08-21 | 2.9 | mg/dL |
| 292477 | 729 | 2016-10-09 | 2.9 | mg/dL |
| 291877 | 729 | 2016-11-06 | 2.4 | mg/dL |
| 292389 | 729 | 2017-08-27 | 2.3 | mg/dL |
| 291943 | 729 | 2017-12-03 | 6.9 | mg/dL |
| 292679 | 729 | 2018-01-28 | 4.3 | mg/dL |
| 291918 | 729 | 2018-07-29 | 4.8 | mg/dL |
| 292568 | 729 | 2018-08-26 | 5.1 | mg/dL |
| 292502 | 729 | 2018-09-09 | 4.1 | mg/dL |
| 291636 | 729 | 2018-09-30 | 3.9 | mg/dL |
| 291495 | 729 | 2018-12-23 | 5.2 | mg/dL |
| 291822 | 729 | 2019-01-27 | 4.4 | mg/dL |
| 292140 | 729 | 2019-07-21 | 5.2 | mg/dL |
| 292096 | 729 | 2019-09-15 | 5.8 | mg/dL |
| 291687 | 729 | 2019-12-22 | 6.9 | mg/dL |
| 291588 | 729 | 2020-02-16 | 4.2 | mg/dL |
| 291801 | 729 | 2020-06-21 | 4.9 | mg/dL |
| 291732 | 729 | 2020-09-20 | 4.9 | mg/dL |
| 291447 | 729 | 2020-11-15 | 6.1 | mg/dL |
| 291516 | 729 | 2021-03-14 | 4.9 | mg/dL |
| 291559 | 729 | 2021-04-11 | 6.8 | mg/dL |
| 292372 | 729 | 2021-07-11 | 4.3 | mg/dL |
| 292419 | 729 | 2021-09-26 | 4.1 | mg/dL |
| 292031 | 729 | 2021-10-10 | 4.8 | mg/dL |
| 291759 | 729 | 2022-01-16 | 2.9 | mg/dL |
| 291779 | 729 | 2022-01-16 | 4.4 | mg/dL |
| 291533 | 729 | 2022-02-06 | 4.6 | mg/dL |
| 292521 | 729 | 2022-07-10 | 4.6 | mg/dL |
| 291964 | 729 | 2022-10-02 | 5.3 | mg/dL |
gfr_records |>
filter(person_id == random_patient) |>
collect() |>
arrange(measurement_datetime) |>
select(measurement_id, person_id, measurement_datetime, value_as_number, unit_source_value) |>
kable()| measurement_id | person_id | measurement_datetime | value_as_number | unit_source_value |
|---|---|---|---|---|
| 291362 | 729 | 2003-06-08 | 128.3 | mL/min/{1.73_m2} |
| 292187 | 729 | 2004-06-13 | 93.7 | mL/min/{1.73_m2} |
| 292026 | 729 | 2005-06-19 | 159.2 | mL/min/{1.73_m2} |
| 292599 | 729 | 2006-04-30 | 121.2 | mL/min |
| 292608 | 729 | 2006-04-30 | 63.1 | mL/min/{1.73_m2} |
| 291406 | 729 | 2006-06-25 | 146.8 | mL/min |
| 291416 | 729 | 2006-06-25 | 63.4 | mL/min/{1.73_m2} |
| 292216 | 729 | 2007-07-01 | 62.6 | mL/min/{1.73_m2} |
| 291482 | 729 | 2008-07-06 | 70.6 | mL/min/{1.73_m2} |
| 292271 | 729 | 2008-08-17 | 70.7 | mL/min/{1.73_m2} |
| 292304 | 729 | 2009-07-12 | 70.0 | mL/min/{1.73_m2} |
| 292665 | 729 | 2010-03-14 | 38.0 | mL/min/{1.73_m2} |
| 291366 | 729 | 2010-07-18 | 47.2 | mL/min/{1.73_m2} |
| 292546 | 729 | 2011-07-24 | 58.8 | mL/min/{1.73_m2} |
| 291702 | 729 | 2012-04-01 | 47.1 | mL/min/{1.73_m2} |
| 292166 | 729 | 2012-07-29 | 30.0 | mL/min/{1.73_m2} |
| 292003 | 729 | 2012-08-12 | 36.8 | mL/min/{1.73_m2} |
| 292435 | 729 | 2013-08-04 | 38.7 | mL/min/{1.73_m2} |
| 292634 | 729 | 2013-11-24 | 34.9 | mL/min/{1.73_m2} |
| 292317 | 729 | 2014-05-18 | 38.4 | mL/min/{1.73_m2} |
| 291847 | 729 | 2014-08-10 | 49.7 | mL/min/{1.73_m2} |
| 291893 | 729 | 2015-08-16 | 44.3 | mL/min/{1.73_m2} |
| 291620 | 729 | 2015-09-13 | 57.5 | mL/min/{1.73_m2} |
| 292360 | 729 | 2016-08-21 | 39.0 | mL/min/{1.73_m2} |
| 292488 | 729 | 2016-10-09 | 40.1 | mL/min/{1.73_m2} |
| 291881 | 729 | 2016-11-06 | 47.8 | mL/min/{1.73_m2} |
| 292393 | 729 | 2017-08-27 | 48.6 | mL/min/{1.73_m2} |
| 291949 | 729 | 2017-12-03 | 16.5 | mL/min/{1.73_m2} |
| 292690 | 729 | 2018-01-28 | 26.2 | mL/min/{1.73_m2} |
| 291929 | 729 | 2018-07-29 | 23.4 | mL/min/{1.73_m2} |
| 292578 | 729 | 2018-08-26 | 22.2 | mL/min/{1.73_m2} |
| 292509 | 729 | 2018-09-09 | 27.2 | mL/min/{1.73_m2} |
| 291632 | 729 | 2018-09-30 | 28.7 | mL/min/{1.73_m2} |
| 291485 | 729 | 2018-12-23 | 21.7 | mL/min/{1.73_m2} |
| 291833 | 729 | 2019-01-27 | 25.0 | mL/min/{1.73_m2} |
| 292150 | 729 | 2019-07-21 | 21.3 | mL/min/{1.73_m2} |
| 292109 | 729 | 2019-09-15 | 19.2 | mL/min/{1.73_m2} |
| 291698 | 729 | 2019-12-22 | 16.0 | mL/min/{1.73_m2} |
| 291592 | 729 | 2020-02-16 | 25.8 | mL/min/{1.73_m2} |
| 291797 | 729 | 2020-06-21 | 22.3 | mL/min/{1.73_m2} |
| 291743 | 729 | 2020-09-20 | 22.6 | mL/min/{1.73_m2} |
| 291458 | 729 | 2020-11-15 | 17.9 | mL/min/{1.73_m2} |
| 291525 | 729 | 2021-03-14 | 22.1 | mL/min/{1.73_m2} |
| 291570 | 729 | 2021-04-11 | 15.9 | mL/min/{1.73_m2} |
| 292382 | 729 | 2021-07-11 | 25.0 | mL/min/{1.73_m2} |
| 292425 | 729 | 2021-09-26 | 26.6 | mL/min/{1.73_m2} |
| 292047 | 729 | 2021-10-10 | 22.6 | mL/min/{1.73_m2} |
| 291768 | 729 | 2022-01-16 | 24.3 | mL/min/{1.73_m2} |
| 291788 | 729 | 2022-01-16 | 89.6 | mL/min |
| 291530 | 729 | 2022-02-06 | 23.4 | mL/min/{1.73_m2} |
| 292532 | 729 | 2022-07-10 | 23.5 | mL/min/{1.73_m2} |
| 291975 | 729 | 2022-10-02 | 20.1 | mL/min/{1.73_m2} |
One approach to use Phea’s results inside Atlas is to produce novel
records (e.g. new rows in OBSERVATION or
MEASUREMENT) using custom concept IDs, then ETL those back
into the dataset.
Below I illustrate how to do it with case A, assuming the new records
will go into MEASUREMENT. Notice I put the difference
inside value_as_number, just in case.
case_a_custom_concept_id <- 2000000001
# Case A: SCr increase by >=0.3 over 48 hours
data_for_etl <- scr_change |>
filter(scr_case_a) |> # Keep only rows where case A was TRUE
transmute(
measurement_datetime = ts,
person_id = pid,
value_as_number = scr_value_as_number - scr_48h_min_value_as_number,
measurement_concept_id = case_a_custom_concept_id)
head_shot(data_for_etl) |>
kable()| measurement_datetime | person_id | value_as_number | measurement_concept_id |
|---|---|---|---|
| 2022-02-11 | 1 | 0.3 | 2000000001 |
| 2022-02-11 | 1 | 0.3 | 2000000001 |
| 2022-02-11 | 1 | 0.3 | 2000000001 |
| 2022-03-25 | 1 | 0.3 | 2000000001 |
| 2022-03-25 | 1 | 0.3 | 2000000001 |
| 2016-01-29 | 3 | 0.5 | 2000000001 |
| 2016-11-11 | 3 | 1.7 | 2000000001 |
| 2018-02-09 | 3 | 0.7 | 2000000001 |
| 2022-03-04 | 3 | 1.5 | 2000000001 |
| 2006-09-24 | 19 | 0.7 | 2000000001 |