Bug in ordered multinomial models with missing categories #28

dmphillippo · 2024-01-09T11:12:49Z

In ordered multinomial models where studies do not observe all categories, data points could be assigned the wrong category in a couple of scenarios:

Where a study does not observe the first category; absolute estimates in this study would be wrong
Where a study does not observe a middle category; relative and absolute estimates in this study would be wrong

For example, consider the HTA psoriasis data with categories removed from some studies:

Code

library(multinma)
library(dplyr)
library(tidyr)
library(ggplot2)

options(mc.cores = parallel::detectCores())

# Remove some categories
hta_dat <- hta_psoriasis %>% 
  mutate(PASI50 = ifelse(studyc == "Elewski", NA, PASI50),
         PASI75 = ifelse(studyc == "Leonardi", NA, PASI75))

hta_net <- set_agd_arm(hta_dat,
                       study = paste(studyc, year),
                       trt = trtc,
                       r = multi(r0 = sample_size - rowSums(cbind(PASI50, PASI75, PASI90), na.rm = TRUE),
                                 PASI50, PASI75, PASI90,
                                 inclusive = FALSE,
                                 type = "ordered"))
hta_net

hta_fit_FE <- nma(hta_net,
                  trt_effects = "fixed",
                  link = "probit",
                  prior_intercept = normal(scale = 100),
                  prior_trt = normal(scale = 10),
                  prior_aux = flat())

hta_fit_FE

predict(hta_fit_FE, 
        baseline = distr(qnorm, mean = -1.097, sd = 123^-0.5), 
        type = "response")

# Plot predictions against observed data
hta_pred_FE <- predict(hta_fit_FE, type = "response")

hta_pred_mod <- 
  hta_pred_FE %>% 
  as_tibble() %>% 
  mutate(method = "NMA",
         outcomef = recode_factor(gsub("^pred\\[.+: (.+), PASI(50|75|90)\\]$", "\\2", parameter), 
                                  "50" = "PASI 50",
                                  "75" = "PASI 75",
                                  "90" = "PASI 90"),
         .trt = factor(gsub("^pred\\[.+: (.+?)(, PASI(50|75|90))?\\]$", "\\1", parameter), 
                       levels = levels(hta_net$treatments)),
         estimate = mean, conf.low = `2.5%`, conf.high = `97.5%`)

# Get observed proportions
get_binom_ci <- function(r, n) {
  if (is.na(r)) return(tibble(estimate = NA_real_, conf.low = NA_real_, conf.high = NA_real_))
  bt <- binom.test(r, n)
  tibble(estimate = bt$estimate,
         conf.low = bt$conf.int[1],
         conf.high = bt$conf.int[2])
}

hta_pred_obs <- 
  hta_dat %>%
  mutate(studyc = paste(studyc, year)) %>% 
  pivot_longer(matches("PASI(50|75|90)"), names_to = "outcome", values_to = "value",
               names_pattern = "(PASI(?:50|75|90))") %>% 
  rowwise() %>% #group_by(studyc, trtc, outcome) %>%
  summarise(studyc, trtc, outcome, get_binom_ci(value, sample_size))%>% 
  mutate(method = "Observed", 
         outcomef = recode_factor(outcome,
                                  PASI50 = "PASI 50",
                                  PASI75 = "PASI 75", 
                                  PASI90 = "PASI 90"),
         .trt = factor(trtc, levels = levels(hta_net$treatments)),
         .study = factor(studyc, levels = levels(hta_net$studies)))

hta_pred_all <- bind_rows(hta_pred_mod, hta_pred_obs) %>% 
  mutate(methodf = factor(method, levels = c("NMA", "Observed")))

# Plot
ggplot(hta_pred_all,
       aes(x = estimate*100, xmin = conf.low*100, xmax = conf.high*100, 
           y = forcats::fct_rev(.trt), shape = methodf, colour = methodf)) +
  geom_errorbar(width = 0, position = position_dodge(width = 0.75), size = 0.3) +
  geom_point(position = position_dodge(width = 0.75), fill = "white") +
  xlab("Percent achieving PASI outcome") + ylab("Treatment") +
  coord_cartesian(xlim = c(0, 100)) +
  facet_grid(.study~outcomef) +
  scale_shape_manual("Method", values = c(`NMA` = 16, Observed = 21)) +
  scale_colour_manual("Method", values = c(`NMA` = "#00468B", Observed = "#8B0000")) +
  theme_multinma()

The output plot with the current package version (0.5.1) with the issue looks like this, whereas the fixed version with the correct categories looks like this.

The issue is particularly noticeable in this example for studies omitting the first category (PASI 50).

Data points could be assigned the wrong category in a couple of scenarios: * Where a study does not observe the first category; absolute estimates in this study would be wrong * Where a study does not observe a middle category; relative and absolute estimates in this study would be wrong

dmphillippo · 2024-01-10T09:31:52Z

Fix merged into development branch, for next CRAN release.

dmphillippo · 2024-01-24T14:12:37Z

Now on CRAN, with the release of version 0.6.0

dmphillippo closed this as completed Jan 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug in ordered multinomial models with missing categories #28

Bug in ordered multinomial models with missing categories #28

dmphillippo commented Jan 9, 2024

dmphillippo commented Jan 10, 2024

dmphillippo commented Jan 24, 2024

Bug in ordered multinomial models with missing categories #28

Bug in ordered multinomial models with missing categories #28

Comments

dmphillippo commented Jan 9, 2024

dmphillippo commented Jan 10, 2024

dmphillippo commented Jan 24, 2024