Summary

Row

Methods

We used four age categories: Under 25, 25 – 30, 30 – 34, and Over 34. The number of categories and precise ranges was somewhat arbitrary, but we adjusted the ranges to keep the number of individuals in each group similar in size within the Fairlabs Washington University data set.

The number of categories for the race field is reduced from the original data set by combining all categories with fewer than 2,000 patients into a single “Other” category. The threshold’s purpose is to ensure there is enough data within each group for a robust analysis. Mothers with multiple races were also grouped into the Other category.

We used R’s glm package to do multivariate logistic regression analyses with the age and race categories as independent variables. We repeated the analysis with a varying dependent variable and subpopulation for the metrics described below. We used the majority population as the reference group.

For local activation, we queried mothers who received our standard urine drug screen or urine pain drug screen either ordered directly or as part of a panel. Only drugs in our urine pain drug screen were counted as positives. This set of drugs has some differences from those in the Fairlabs data set. We queried drug screens up to 6 months before the date of birth and up to 30 days after it. We only analyzed the one drug screen closest to the date of birth for each mother.

Row

Demographic Parity

We measure demographic parity using the proportion of each race and age group that was given drug testing. If the proportions are equal for all demographics, the conditions for demographic parity are met.

The dashed green line marks the date of QI intervention.

Row

Visual

Row

Predictive Parity

We measure predictive parity using the proportion of each demographic that tested positive out of the people tested within that demographic. If the proportions are equal the conditions of predictive parity are met. We measure it separately for THC and Non-THC drugs.

Row

Visual

Row

Equalized Odds

We measure equalized odds using the proportion of each demographic and age subpopulation that was given drug testing with a relevant order indication such as “Substance use during pregnancy, excluding marijuana” and “History of opioids prescribed during pregnancy”.
Goal: Achieve parity in all race and age groups

Row

Visual

Row

General Group Equity

We measure general group equity using the proportion of each demographic and age subpopulation that received intervention for the correct event.
Equation: (TP+FP)/(TP+FN)
Definition:
- True positive is # of patients tested positive for non THC substance and was reported to CPS
- False positive is # of patients tested negative for non THC substance and was reported to CPS
- True negative is # of patients tested negative for non THC substance and wasn’t reported to CPS
- False negative is # of patients tested positive for non THC substance and was reported to CPS
Goal: Achieve as close to a ratio of 1 as possible so that a group is not under-served (ratio < 1) or over-served (ratio > 1)

Row

Visual

Row

Equal Outcomes for non tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS without an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between non-testing and CPS report only.
Goal: Achieve parity in all race groups

Row

Visual

Row

Equal Outcomes for tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS with an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between testing and CPS report only.
Goal: Achieve parity in all race groups

Row

Visual

Statistics

Row

Demographic Parity

We measure demographic parity using the proportion of each race and age group that was given drug testing.
Goal: Achieve parity in all race and age groups

Row

Logistic regression model- Pre Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 2.27 1.85, 2.79 <0.001
    Other 0.35 0.18, 0.61 <0.001
Age group


    Under 25 1.00
    25 - 30 0.82 0.65, 1.04 0.11
    30 - 34 0.56 0.42, 0.73 <0.001
    Over 34 0.55 0.41, 0.73 <0.001
1 OR = Odds Ratio, CI = Confidence Interval

Logistic regression model - Post Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 1.61 1.16, 2.25 0.004
    Other 0.46 0.18, 0.98 0.068
Age group


    Under 25 1.00
    25 - 30 0.54 0.34, 0.83 0.006
    30 - 34 0.92 0.61, 1.39 0.7
    Over 34 0.65 0.41, 1.01 0.061
1 OR = Odds Ratio, CI = Confidence Interval

Row

Predictive Parity

We measure predictive parity as the proportion of each demographic’s tested subpopulation that tested positive for THC or Non-THC drugs.
Goal: Achieve parity in all race groups and drug detection types

Row

Non-THC Logistic regression model- Pre Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 0.53 0.35, 0.78 0.001
    Other 0.10 0.01, 0.56 0.033
Age group


    Under 25 1.00
    25 - 30 1.90 1.19, 3.04 0.007
    30 - 34 2.38 1.39, 4.10 0.002
    Over 34 2.37 1.36, 4.11 0.002
1 OR = Odds Ratio, CI = Confidence Interval

Non-THC Logistic regression model - Post Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 0.28 0.13, 0.57 <0.001
    Other 0.12 0.01, 0.72 0.026
Age group


    Under 25 1.00
    25 - 30 2.11 0.83, 5.50 0.12
    30 - 34 4.23 1.79, 10.6 0.001
    Over 34 3.49 1.33, 9.68 0.013
1 OR = Odds Ratio, CI = Confidence Interval

THC Logistic regression model - Pre Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 2.68 1.74, 4.23 <0.001
    Other 1.29 0.27, 4.63 0.7
Age group


    Under 25 1.00
    25 - 30 1.21 0.78, 1.90 0.4
    30 - 34 0.85 0.49, 1.46 0.6
    Over 34 0.67 0.37, 1.19 0.2
1 OR = Odds Ratio, CI = Confidence Interval

THC Logistic regression model - Post Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 1.51 0.73, 3.21 0.3
    Other 1.99 0.25, 11.6 0.5
Age group


    Under 25 1.00
    25 - 30 3.32 1.31, 8.70 0.013
    30 - 34 1.08 0.43, 2.66 0.9
    Over 34 0.91 0.31, 2.54 0.9
1 OR = Odds Ratio, CI = Confidence Interval

Row

Equal Outcomes for non tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS without an UDS indicating non-THC drug use. There may be other evidence not available in this data set that would be relevant to such a report, so this visual serves to illustrate the relationship between non-testing and CPS reporting only.
Goal: Achieve parity in all race groups

Row

Non tested mothers’ Logistic regression model - Pre Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 3.56 2.05, 6.53 <0.001
    Other 0.89 0.21, 2.70 0.9
Age group


    Under 25 1.00
    25 - 30 0.94 0.51, 1.69 0.8
    30 - 34 0.75 0.37, 1.47 0.4
    Over 34 0.60 0.28, 1.21 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Non tested mothers’ Logistic regression model - Post Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 1.60 0.84, 3.09 0.2
    Other 0.81 0.19, 2.41 0.7
Age group


    Under 25 1.00
    25 - 30 2.18 0.60, 10.2 0.3
    30 - 34 6.41 2.13, 27.7 0.003
    Over 34 5.28 1.70, 23.1 0.010
1 OR = Odds Ratio, CI = Confidence Interval

Tested mothers’ Logistic regression model - Pre Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 0.85 0.57, 1.26 0.4
    Other 0.83 0.21, 2.76 0.8
Age group


    Under 25 1.00
    25 - 30 1.33 0.85, 2.09 0.2
    30 - 34 1.46 0.86, 2.48 0.2
    Over 34 1.47 0.86, 2.51 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Tested mothers’ Logistic regression model - Post Intervention

Characteristic OR1 95% CI1 p-value
Race group


    White 1.00
    Black or African American 0.78 0.40, 1.54 0.5
    Other 0.44 0.05, 2.61 0.4
Age group


    Under 25 1.00
    25 - 30 5.18 2.05, 13.9 <0.001
    30 - 34 3.00 1.33, 6.98 0.009
    Over 34 4.58 1.83, 12.1 0.002
1 OR = Odds Ratio, CI = Confidence Interval

Demographics

Row

Demographics of All Mothers

Demographics of Tested Mothers

Row

Demographics of Mothers Positive for Drugs Excluding THC

Demographics of Mothers Positive for THC

---
title: "Fairlabs Dashboard"
header-includes:
  - \usepackage{comment}
output:
  flexdashboard::flex_dashboard:
    css: style.css
    theme: cosmo
    orientation: rows
    vertical_layout: scroll
    source_code: embed
---
```{r, include=FALSE}
library(flexdashboard)
library(tidyverse)
library(plotly)
library(knitr)
library(janitor)
library(reshape2)
require(scales)
library(glue)
library(gtsummary)
library(htmltools)
library(lubridate)
```

```{r}
# cleans raw data based on user defined input columns

clean_data <- function(df, input){

  # Pull user defined columns

  # dates
  delivery_date_in <- input[input$column_out=='delivery_date','column_in']
  cps_reporting_date_in <- input[input$column_out=='cps_reporting_date','column_in']
  uds_collection_date_in <- input[input$column_out=='uds_collection_date','column_in']
  uds_test_in <- input[input$column_out=='uds_test','column_in']
  maternal_birth_date_in <- input[input$column_out=='maternal_birth_date','column_in']
  # static QI intervention date from dictionary 
  intervention_date_in <- as.Date(input[input$column_out=='intervention_date','column_in'])

  # filled if non-THC and THC are stored as individual drug columns
  non_thc_col <- strsplit(input[input$column_out=='non_thc_cols','column_in'], split ='\\\\t')[[1]]
  thc_col_in <- input[input$column_out=='thc_col','column_in']
  # filled if non-THC and THC are stored as true/false 
  non_thc_detect_in <- input[input$column_out=='non_thc_detect','column_in']
  thc_detect_in <- input[input$column_out=='thc_detect','column_in']

  maternal_age_in <- input[input$column_out=='maternal_age','column_in']
  maternal_race_in <- input[input$column_out=='maternal_race','column_in']

  order_indication_in <- input[input$column_out=='order_indication','column_in']
  # Used to determine which order indication is for non-THC drug use
  ord_indict_non_thc_in <- strsplit(input[input$column_out=='ord_indict_non_thc','column_in'], split ='\\\\t')[[1]]

  # Checks if each column exist, and format for summary and visual uses
  if (delivery_date_in == "") {
    df$delivery_date <- NA
    df$delivery_qtr <- NA
  } else {
    df$delivery_date <- as.Date(df[[delivery_date_in]])
    df$delivery_qtr <- floor_date(as.Date(format.Date(df$delivery_date, '%Y-%m-01')), unit = "quarter")
  }

  if (cps_reporting_date_in == "") {
    df$cps_reporting_date <- NA
    df$cps_report <- NA
  } else {
    df$cps_reporting_date <- as.Date(df[[cps_reporting_date_in]])
    df$cps_report <- if_else(is.na(df$cps_reporting_date), FALSE, TRUE)
  }

  if (is.na(intervention_date_in)) {
    df$pre_post_QI <- NA
  } else {
    df <- df %>%
      mutate(pre_post_QI = factor(if_else(delivery_date >= intervention_date_in, 'Post', 'Pre'), 
                                  levels = c('Pre', 'Post')))
  }

  if ((uds_collection_date_in == "" & uds_test_in != "")) {
    df$uds_collection_date <- NA
    df$uds_test <- df[[uds_test_in]]
  } else if (uds_collection_date_in != "") {
    df$uds_collection_date <- as.Date(df[[uds_collection_date_in]])
    df$uds_test <- if_else(is.na(df$uds_collection_date), FALSE, TRUE)
  } else {
    df$uds_collection_date <- NA
    df$uds_test <- NA
  }

  if (thc_detect_in != "" & thc_col_in == "") {
    df$thc_detect <- df[[thc_detect_in]]
  } else if (thc_col_in != "") {
    df$detected_tetrahydrocannabinol <- df[[thc_col_in]]
    df$thc_detect <- if_else(df$detected_tetrahydrocannabinol==1, TRUE, FALSE)
  } else {
    df$thc_detect <- NA
  }

  if (non_thc_detect_in != "" & is_empty(non_thc_col)) {
    df$non_thc_detect <- df[[non_thc_detect_in]]
  } else if (!is_empty(non_thc_col)) {
    df <- df %>%
      mutate(non_thc_detect = if_else(rowSums(df[non_thc_col]) > 0, TRUE, FALSE))
  } else {
    df$non_thc_detect <- NA
  }

  if (maternal_age_in != "") {
    df$maternal_age <- df[[maternal_age_in]]
    df <- df %>%
      mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25', 
                                maternal_age < 30 ~ '25 - 30', 
                                maternal_age < 34 ~ '30 - 34', 
                                TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
  } else if (maternal_birth_date_in != "") {
    df$maternal_birth_date = as.Date(df[[maternal_birth_date_in]])
    df$maternal_age <- floor(as.numeric(difftime(delivery_date, maternal_birth_date, units = 'days'))/365.25)
    df <- df %>%
    mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25', 
                              maternal_age < 30 ~ '25 - 30', 
                              maternal_age < 34 ~ '30 - 34', 
                              TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
  } else {
    df$maternal_age <- NA
    df$age_group <- NA
  }

  if (maternal_race_in != "") {
    df$maternal_race <- df[[maternal_race_in]]
    race_above_thresh <- df %>%
      group_by(maternal_race) %>% summarise(total = n()) %>%
      filter(total > 2000)

    df <- df %>%
      mutate(race_group = case_when(
        maternal_race %in% race_above_thresh$maternal_race ~ maternal_race,
        TRUE ~ 'Other'))
    ref_group <- df %>% group_by(race_group) %>% summarise(total = n()) %>% filter(total == max(total))
    df$race_group <- relevel(
      factor(df$race_group),
      ref=ref_group$race_group)
  } else {
    df$maternal_race <- NA
    df$race_group <- NA
  }

  if (order_indication_in == "") {
    df$ord_indict_non_thc <- NA
  } else {
    df$order_indication <- df[[order_indication_in]]
    df$ord_indict_non_thc <- if_else(df$order_indication %in% ord_indict_non_thc_in, TRUE, FALSE)
  }

  df
}

```

```{r}
data <- read.csv("/input.csv")
input <- read.csv("/dict.txt", sep = '\t')

race_pal <- setNames(object = c("#e0aae8","#bae4bc","#7bccc4", '#2b8cbe'), nm = c('Black or African American', 'White', 'Other', 'Asian'))
age_pal <- setNames(object = c("#e0aae8","#bae4bc","#7bccc4", '#2b8cbe'), nm = c('Under 25', '25 - 30', '30 - 34', 'Over 34'))
intervention_date <- as.Date(input[input$column_out=='intervention_date','column_in'])

data <- clean_data(data, input)
```


```{r}
# data summary

## demographic parity
demo_sum <- data %>%
  group_by(race_group,age_group,delivery_qtr) %>%
  summarise(total = n(),
         perc_total = round(sum(uds_test, na.rm = TRUE)/n(), digits = 2))

## predictive parity
pred_non_thc_sum <- data %>%
  filter(uds_test) %>%
  group_by(race_group,delivery_qtr) %>%
  summarise(total = n(),
         perc_total = round(sum(non_thc_detect, na.rm = TRUE)/n(), digits = 2))

pred_thc_sum <- data %>%
  filter(uds_test) %>%
  group_by(race_group,delivery_qtr) %>%
  summarise(total = n(),
         perc_total = round(sum(thc_detect, na.rm = TRUE)/n(), digits = 2))

## equalized odds depends on order indication data
if (sum(is.na(data$ord_indict_non_thc)) == nrow(data)) {
  equal_odds_wide <- NA
  # if no data available, display error message
  error_ord <- "The data set is missing order indication data. The dashboard cannot create this visual or table."
} else {
  error_ord <- NA
  ## equalized odds
  equal_odds <- data %>%
    filter(uds_test) %>%
    group_by(race_group,age_group,pre_post_QI,ord_indict_non_thc,non_thc_detect) %>%
    count() %>%
    ungroup() %>%
    mutate(ord_ind_detect = factor(case_when(ord_indict_non_thc & non_thc_detect ~ 'TP',
                                      ord_indict_non_thc & !non_thc_detect ~ 'FP',
                                      !ord_indict_non_thc & !non_thc_detect ~ 'TN',
                                      !ord_indict_non_thc & non_thc_detect ~ 'FN')))

  equal_odds_wide <- pivot_wider(equal_odds%>%
    select(c("race_group","age_group","pre_post_QI",'ord_ind_detect', 'n')), names_from = ord_ind_detect, values_from = n) %>%
    mutate(FPR = (FP/(FP+TN)), 
           TPR = (TP/(TP+FN)), 
           ratio = FPR/TPR)
}

## equal outcome and group benefit depends on cps report data
if (sum(is.na(data$cps_report)) == nrow(data)) {
  equal_out <- NA
  group_benefit_wide <- NA
  # if no data available, display error message
  error_cps <- "The data set is missing CPS report data. The dashboard cannot create this visual or table."
} else {
  error_cps <- NA
  ## equal outcomes
  equal_out <- data %>%
    group_by(uds_test, race_group, non_thc_detect,cps_report,delivery_qtr) %>%
    count() %>%
    ungroup() %>%
    group_by(uds_test, race_group,delivery_qtr) %>%
    mutate(total = sum(n),
           perc_total = round(n/total, digits = 2)) %>%
    ungroup()

  ## group benefit equality
  group_benefit <- data %>%
    filter(uds_test) %>%
    group_by(race_group,age_group,pre_post_QI,cps_report,non_thc_detect) %>%
    count() %>%
    ungroup() %>%
    mutate(cps_detect = factor(case_when(cps_report & non_thc_detect ~ 'TP',
                                         cps_report & !non_thc_detect ~ 'FP',
                                         !cps_report & !non_thc_detect ~ 'TN',
                                         !cps_report & non_thc_detect ~ 'FN')))

  group_benefit_wide <- pivot_wider(group_benefit %>%
    select(c("race_group","age_group","pre_post_QI","cps_detect","n")), names_from = cps_detect, values_from = n) %>%
    mutate(ratio = ((TP+FP)/(TP+FN)))
}
```

```{r, child = 'visuals.rmd'}
```

```{r, child = 'section.rmd'}
```

```{r, child = 'demographics.rmd'}
```