Summary

Row

Methods

We used three categories for the race field: Black or African American, White and Other. The Other category includes all other racial groups in the dataset, mothers with an unspecified race and those with multiple races. None of these categories were analyzed individually because the Other category only makes up 10% of the datapoints in total, so there would be insufficient data for a robust analysis.

We used four age categories: Under 25, 25 – 30, 30 – 34, and Over 34. The number of categories and precise ranges was somewhat arbitrary, but we adjusted the ranges to keep the number of individuals in each group similar in size.

We used R’s glm package to do multivariate logistic regression analyses with the age and race categories as independent variables. We repeated the analysis with a varying dependent variable and subpopulation for the metrics described below.

Row

Demographic Parity

We measure demographic parity using the proportion of each race and age group that was given drug testing. If the proportions are equal for all demographics, the conditions for demographic parity are met. If no intervention date was given, then the same table will show for both pre and post intervention sections.

The dashed green line marks the date of QI intervention.

Row

Visual

Row

Predictive Parity

We measure predictive parity using the proportion of each demographic’s subpopulation that tested positive out of the people tested. If the proportions are equal the conditions of predictive partity are met. We measure it separately for THC or Non-THC drugs.

Row

Visual

Row

Equalized Odds

We measure equalized odds using the proportion of each demographic and age subpopulation that was given drug testing with a relevant order indication such as “Substance use during pregnancy, excluding marijuana” and “History of opioids prescribed during pregnancy”.
Goal: Achieve parity in all race and age groups

Row

Visual

Row

General Group Equity

We measure general group equity using the proportion of each demographic and age subpopulation that received intervention for the correct event.
Equation: (TP+FP)/(TP+FN)
Definition:
- True positive is # of patients tested positive for non THC substance and was reported to CPS
- False positive is # of patients tested negative for non THC substance and was reported to CPS
- True negative is # of patients tested negative for non THC substance and wasn’t reported to CPS
- False negative is # of patients tested positive for non THC substance and was reported to CPS
Goal: Achieve as close to a ratio of 1 as possible so that a group is not under-served (ratio < 1) or over-served (ratio > 1)

Row

Visual

Row

Equal Outcomes for non tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS without an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between non-testing and CPS report only.
Goal: Achieve parity in all race groups

Row

Visual

Row

Equal Outcomes for tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS with an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between testing and CPS report only.
Goal: Achieve parity in all race groups

Row

Visual

Statistics

Row

Demographic Parity

We measure demographic parity using the proportion of each demographic and age subpopulation that was given drug testing. If no intervention date was given, then the same table will show for both pre and post intervention sections.
Goal: Achieve parity in all race and age groups

Row

Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.9 -2.5, -1.3 <0.001
    White -0.82 -1.0, -0.61 <0.001
Age group


    Under 25 0.00
    25 - 30 -0.19 -0.43, 0.04 0.11
    30 - 34 -0.59 -0.87, -0.31 <0.001
    Over 34 -0.60 -0.88, -0.32 <0.001
1 OR = Odds Ratio, CI = Confidence Interval

Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.3 -2.2, -0.51 0.003
    White -0.48 -0.81, -0.15 0.004
Age group


    Under 25 0.00
    25 - 30 -0.62 -1.1, -0.18 0.006
    30 - 34 -0.08 -0.49, 0.33 0.7
    Over 34 -0.43 -0.90, 0.01 0.061
1 OR = Odds Ratio, CI = Confidence Interval

Row

Predictive Parity

We measure predictive parity as the proportion of each demographic’s subpopulation that tested positive for THC or Non-THC drugs. If no intervention date was given, then the same table will show for both pre and post intervention sections.
Goal: Achieve parity in all race groups and drug detection types

Row

Non-THC Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.6 -4.5, 0.05 0.12
    White 0.64 0.25, 1.0 0.001
Age group


    Under 25 0.00
    25 - 30 0.64 0.17, 1.1 0.007
    30 - 34 0.87 0.33, 1.4 0.002
    Over 34 0.86 0.31, 1.4 0.002
1 OR = Odds Ratio, CI = Confidence Interval

Non-THC Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.88 -3.0, 0.89 0.3
    White 1.3 0.56, 2.0 <0.001
Age group


    Under 25 0.00
    25 - 30 0.75 -0.18, 1.7 0.12
    30 - 34 1.4 0.58, 2.4 0.001
    Over 34 1.2 0.29, 2.3 0.013
1 OR = Odds Ratio, CI = Confidence Interval

THC Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.74 -2.3, 0.50 0.3
    White -0.99 -1.4, -0.55 <0.001
Age group


    Under 25 0.00
    25 - 30 0.19 -0.25, 0.64 0.4
    30 - 34 -0.16 -0.71, 0.38 0.6
    Over 34 -0.40 -0.99, 0.17 0.2
1 OR = Odds Ratio, CI = Confidence Interval

THC Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other 0.28 -1.7, 2.0 0.8
    White -0.41 -1.2, 0.31 0.3
Age group


    Under 25 0.00
    25 - 30 1.2 0.27, 2.2 0.013
    30 - 34 0.07 -0.84, 0.98 0.9
    Over 34 -0.10 -1.2, 0.93 0.9
1 OR = Odds Ratio, CI = Confidence Interval

Row

Equal Outcomes for non tested mothers

We measure equal outcomes for non tested mothers using the proportion of each demographic subpopulation that were reported to CPS without an UDS indicating non-THC drug use. There may be other evidence not available in this dataset that would be relevant to such a report, so this visual serves to illustrate the relationship between non-testing and CPS report only. If no intervention date was given, then the same table will show for both pre and post intervention sections.
Goal: Achieve parity in all race groups

Row

Non tested mothers’ Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -1.4 -2.8, -0.37 0.021
    White -1.3 -1.9, -0.72 <0.001
Age group


    Under 25 0.00
    25 - 30 -0.07 -0.67, 0.53 0.8
    30 - 34 -0.28 -0.98, 0.38 0.4
    Over 34 -0.52 -1.3, 0.19 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Non tested mothers’ Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.69 -2.1, 0.40 0.3
    White -0.47 -1.1, 0.18 0.2
Age group


    Under 25 0.00
    25 - 30 0.78 -0.51, 2.3 0.3
    30 - 34 1.9 0.75, 3.3 0.003
    Over 34 1.7 0.53, 3.1 0.010
1 OR = Odds Ratio, CI = Confidence Interval

Tested mothers’ Logistic regression model - Pre Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.03 -1.4, 1.2 >0.9
    White 0.16 -0.23, 0.56 0.4
Age group


    Under 25 0.00
    25 - 30 0.29 -0.16, 0.74 0.2
    30 - 34 0.38 -0.15, 0.91 0.2
    Over 34 0.39 -0.15, 0.92 0.2
1 OR = Odds Ratio, CI = Confidence Interval

Tested mothers’ Logistic regression model - Post Intvervention

Characteristic log(OR)1 95% CI1 p-value
Race group


    Black or African American 0.00
    Other -0.58 -2.6, 1.2 0.5
    White 0.25 -0.43, 0.92 0.5
Age group


    Under 25 0.00
    25 - 30 1.6 0.72, 2.6 <0.001
    30 - 34 1.1 0.28, 1.9 0.009
    Over 34 1.5 0.60, 2.5 0.002
1 OR = Odds Ratio, CI = Confidence Interval
---
title: "Fairlabs Dashboard"
header-includes:
  - \usepackage{comment}
output:
  flexdashboard::flex_dashboard:
    css: style.css
    theme: cosmo
    orientation: rows
    vertical_layout: scroll
    source_code: embed
---
```{r, include=FALSE}
library(flexdashboard)
library(tidyverse)
library(plotly)
library(knitr)
library(janitor)
library(reshape2)
require(scales)
library(glue)
library(gtsummary)
library(htmltools)
library(lubridate)
```

```{r}
# cleans raw data based on user defined input columns

clean_data <- function(df, input){

  # Pull user defined columns

  # dates
  delivery_date_in <- input[input$column_out=='delivery_date','column_in']
  cps_reporting_date_in <- input[input$column_out=='cps_reporting_date','column_in']
  uds_collection_date_in <- input[input$column_out=='uds_collection_date','column_in']
  uds_test_in <- input[input$column_out=='uds_test','column_in']
  maternal_birth_date_in <- input[input$column_out=='maternal_birth_date','column_in']
  # static QI intervention date from dictionary 
  intervention_date_in <- as.Date(input[input$column_out=='intervention_date','column_in'])

  # filled if non-THC and THC are stored as individual drug columns
  non_thc_col <- strsplit(input[input$column_out=='non_thc_cols','column_in'], split ='\\\\t')[[1]]
  thc_col_in <- input[input$column_out=='thc_col','column_in']
  # filled if non-THC and THC are stored as true/false 
  non_thc_detect_in <- input[input$column_out=='non_thc_detect','column_in']
  thc_detect_in <- input[input$column_out=='thc_detect','column_in']

  maternal_age_in <- input[input$column_out=='maternal_age','column_in']
  maternal_race_in <- input[input$column_out=='maternal_race','column_in']

  order_indication_in <- input[input$column_out=='order_indication','column_in']
  # Used to determine which order indication is for non-THC drug use
  ord_indict_non_thc_in <- strsplit(input[input$column_out=='ord_indict_non_thc','column_in'], split ='\\\\t')[[1]]

  # Checks if each column exist, and format for summary and visual uses
  if (delivery_date_in == "") {
    df$delivery_date <- NA
    df$delivery_qtr <- NA
  } else {
    df$delivery_date <- as.Date(df[[delivery_date_in]])
    df$delivery_qtr <- floor_date(as.Date(format.Date(df$delivery_date, '%Y-%m-01')), unit = "quarter")
  }

  if (cps_reporting_date_in == "") {
    df$cps_reporting_date <- NA
    df$cps_report <- NA
  } else {
    df$cps_reporting_date <- as.Date(df[[cps_reporting_date_in]])
    df$cps_report <- if_else(is.na(df$cps_reporting_date), FALSE, TRUE)
  }

  if (is.na(intervention_date_in)) {
    df$pre_post_QI <- NA
  } else {
    df <- df %>%
      mutate(pre_post_QI = factor(if_else(delivery_date >= intervention_date_in, 'Post', 'Pre'), 
                                  levels = c('Pre', 'Post')))
  }

  if ((uds_collection_date_in == "" & uds_test_in != "")) {
    df$uds_collection_date <- NA
    df$uds_test <- df[[uds_test_in]]
  } else if (uds_collection_date_in != "") {
    df$uds_collection_date <- as.Date(df[[uds_collection_date_in]])
    df$uds_test <- if_else(is.na(df$uds_collection_date), FALSE, TRUE)
  } else {
    df$uds_collection_date <- NA
    df$uds_test <- NA
  }

  if (thc_detect_in != "" & thc_col_in == "") {
    df$thc_detect <- df[[thc_detect_in]]
  } else if (thc_col_in != "") {
    df$detected_tetrahydrocannabinol <- df[[thc_col_in]]
    df$thc_detect <- if_else(df$detected_tetrahydrocannabinol==1, TRUE, FALSE)
  } else {
    df$thc_detect <- NA
  }

  if (non_thc_detect_in != "" & is_empty(non_thc_col)) {
    df$non_thc_detect <- df[[non_thc_detect_in]]
  } else if (!is_empty(non_thc_col)) {
    df <- df %>%
      mutate(non_thc_detect = if_else(rowSums(df[non_thc_col]) > 0, TRUE, FALSE))
  } else {
    df$non_thc_detect <- NA
  }

  if (maternal_age_in != "") {
    df$maternal_age <- df[[maternal_age_in]]
    df <- df %>%
      mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25', 
                                maternal_age < 30 ~ '25 - 30', 
                                maternal_age < 34 ~ '30 - 34', 
                                TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
  } else if (maternal_birth_date_in != "") {
    df$maternal_birth_date = as.Date(df[[maternal_birth_date_in]])
    df$maternal_age <- floor(as.numeric(difftime(delivery_date, maternal_birth_date, units = 'days'))/365.25)
    df <- df %>%
    mutate(age_group = factor(case_when(maternal_age < 25 ~ 'Under 25', 
                              maternal_age < 30 ~ '25 - 30', 
                              maternal_age < 34 ~ '30 - 34', 
                              TRUE ~ 'Over 34'), levels=c('Under 25','25 - 30','30 - 34','Over 34')))
  } else {
    df$maternal_age <- NA
    df$age_group <- NA
  }

  if (maternal_race_in != "") {
    df$maternal_race <- df[[maternal_race_in]]
    df <- df %>%
      mutate(race_group = case_when(maternal_race == 'Black or African American' ~ maternal_race, 
                                   maternal_race == 'White' ~ maternal_race, 
                                   TRUE ~ 'Other'))
  } else {
    df$maternal_race <- NA
    df$race_group <- NA
  }

  if (order_indication_in == "") {
    df$ord_indict_non_thc <- NA
  } else {
    df$order_indication <- df[[order_indication_in]]
    df$ord_indict_non_thc <- if_else(df$order_indication %in% ord_indict_non_thc_in, TRUE, FALSE)
  }

  df
}

```

```{r}
data <- read.csv("/input.csv")
input <- read.csv("/dict.txt", sep = '\t')

race_pal <- setNames(object = c("#66c2a5","#fc8d62","#8da0cb"), nm = c('Black or African American', 'White', 'Other'))
age_pal <- setNames(object = c("#f0f9e8","#bae4bc","#7bccc4", '#2b8cbe'), nm = c('Under 25', '25 - 30', '30 - 34', 'Over 34'))
intervention_date <- as.Date(input[input$column_out=='intervention_date','column_in'])

data <- clean_data(data, input)
```


```{r}
# data summary

## demographic parity
demo_sum <- data %>%
  group_by(race_group,age_group,delivery_qtr,uds_test) %>%
  count() %>%
  ungroup() %>%
  group_by(race_group,age_group,delivery_qtr) %>%
  mutate(total = sum(n),
         perc_total = round(n/total, digits = 2)) %>%
  ungroup()

## predictive parity
pred_non_thc_sum <- data %>%
  filter(uds_test) %>%
  group_by(race_group,delivery_qtr,non_thc_detect) %>%
  count() %>%
  ungroup() %>%
  group_by(race_group,delivery_qtr) %>%
  mutate(total = sum(n),
         perc_total = round(n/total, digits = 2)) %>%
  ungroup()

pred_thc_sum <- data %>%
  filter(uds_test) %>%
  group_by(race_group,delivery_qtr,thc_detect) %>%
  count() %>%
  ungroup() %>%
  group_by(race_group,delivery_qtr) %>%
  mutate(total = sum(n),
         perc_total = round(n/total, digits = 2)) %>%
  ungroup()

## equalized odds depends on order indication data
if (sum(is.na(data$ord_indict_non_thc)) == nrow(data)) {
  equal_odds_wide <- NA
  # if no data available, display error message
  error_ord <- "The data set is missing order indication data. The dashboard cannot create this visual or table."
} else {
  error_ord <- NA
  ## equalized odds
  equal_odds <- data %>%
    filter(uds_test) %>%
    group_by(race_group,age_group,pre_post_QI,ord_indict_non_thc,non_thc_detect) %>%
    count() %>%
    ungroup() %>%
    mutate(ord_ind_detect = factor(case_when(ord_indict_non_thc & non_thc_detect ~ 'TP',
                                      ord_indict_non_thc & !non_thc_detect ~ 'FP',
                                      !ord_indict_non_thc & !non_thc_detect ~ 'TN',
                                      !ord_indict_non_thc & non_thc_detect ~ 'FN')))

  equal_odds_wide <- pivot_wider(equal_odds%>%
    select(c("race_group","age_group","pre_post_QI",'ord_ind_detect', 'n')), names_from = ord_ind_detect, values_from = n) %>%
    mutate(FPR = (FP/(FP+TN)), 
           TPR = (TP/(TP+FN)), 
           ratio = FPR/TPR)
}

## equal outcome and group benefit depends on cps report data
if (sum(is.na(data$cps_report)) == nrow(data)) {
  equal_out <- NA
  group_benefit_wide <- NA
  # if no data available, display error message
  error_cps <- "The data set is missing CPS report data. The dashboard cannot create this visual or table."
} else {
  error_cps <- NA
  ## equal outcomes
  equal_out <- data %>%
    group_by(uds_test, race_group, non_thc_detect,cps_report,delivery_qtr) %>%
    count() %>%
    ungroup() %>%
    group_by(uds_test, race_group,delivery_qtr) %>%
    mutate(total = sum(n),
           perc_total = round(n/total, digits = 2)) %>%
    ungroup()

  ## group benefit equality
  group_benefit <- data %>%
    filter(uds_test) %>%
    group_by(race_group,age_group,pre_post_QI,cps_report,non_thc_detect) %>%
    count() %>%
    ungroup() %>%
    mutate(cps_detect = factor(case_when(cps_report & non_thc_detect ~ 'TP',
                                         cps_report & !non_thc_detect ~ 'FP',
                                         !cps_report & !non_thc_detect ~ 'TN',
                                         !cps_report & non_thc_detect ~ 'FN')))

  group_benefit_wide <- pivot_wider(group_benefit %>%
    select(c("race_group","age_group","pre_post_QI","cps_detect","n")), names_from = cps_detect, values_from = n) %>%
    mutate(ratio = ((TP+FP)/(TP+FN)))
}
```

```{r, child = 'visuals.rmd'}
```

```{r, child = 'section.rmd'}
```