Mediation analysis with weighted GLMs in R: What could possibly go wrong?
Issues (and some solutions) when working with R packages {mediation} and {survey}
statistics
mediation
survey
glm
fragile families
Authors
Vasco Brazão
Priya Devendran
Published
December 21, 2023
This blogpost documents the challenges we faced when attempting to run a mediation analysis using the mediation package with a binomial mediator and outcome while incorporating sampling weights into the analysis with the help of the survey package. Issues and solutions are summarized below. Click the issue number to jump straight to the relevant section of the post.
Description: We need to specify jackknife estimation of standard errors for survey to properly handle our replicate weights.
Solution: Besides specifying the weights and repweights arguments, we also have to specify combined_weights = TRUE, type = "JKn", scales = 1, rscales = 1, and mse = TRUE.
Description: When using multiple-trial binomial GLMs for the mediator and outcome models, mediate() complains: “weights on outcome and mediator models not identical”.
Solution: The problem occurs when we specify the GLMs with a proportion as the outcome and the total number of trials given in the weights argument, if the number of trials is different for both models. We can overcome this by specifying the outcome as cbind(successes, failures). However, according to the package documentation, mediate() should not accept multiple-trial binomial mediators in the first place, leaving us with some doubts about whether all the estimates are calculated correctly in this case.
Description: When using a single trial binomial GLM for the mediator model and a multiple-trial binomial GLM for the outcome model, mediate() complains: “weights on outcome and mediator models not identical”.
Solution: As in Issue 2, the issue is solved by using the cbind(successes, failures) specification for the outcome model.
Description: When the mediator model is ordinal (fit with MASS::polr()), mediate() will not work with a binomial outcome model either way it’s specified.
Solution: No solution was found, other than using an ordinal mediator model with a binary outcome model.
Description: mediate() can work via bootstrap or quasi-Bayesian approximation, and has an option to calculate “robust standard errors” for the latter. Which options work with survey-weighted GLMs?
Solution: Bootstrapping should probably not be used. robustSE should be set to FALSE. If marginaleffects is used for comparisons, vcov should be set to vcov(model) or NULL, as the sandwich package does not correctly compute robust standard errors for models fit with survey.
Description: When you apply summary() to an object produced by mediate(), the upper bound of the confidence intervals is rounded to only two decimal places, whereas there are more decimal places for the lower bound.
Solution: We can modify the function print.summary.mediate() to get around this problem, although the current solution then messes with how the p-values are reported.
Background
Code
# load the packages we will use
library(tidyverse)
library(here)
library(mediation)
library(survey)
library(srvyr)
library(gt)
library(MASS)
library(marginaleffects)

# directory for the post
post_folder <- "2023-12-21_mediation-weighted-glm"

# number of simulations for mediation analysis
n_sims <- 1000
As part of a research program seeking to better understand intimate partner violence (IPV) through an intersectional lens, we wanted to use the Future of Families & Child Wellbeing Study1 dataset to run a mediation analysis. The data come from a survey with a complex design, and baseline and replicate weights are provided for each participant in the “national” sample.2 These weights should allow us to make estimates nationally representative. Our mediator and outcome variables could reasonably be construed as binomial outcomes, and this was our preferred approach to fitting the necessary models. However, we encountered several difficulties along the way, and attempt to document and make sense of them here.
For demonstrative purposes we created a synthetic dataset using the synthpop package. This dataset mirrors the characteristics of our data without it being possible to identify any real individuals from it.
We have three variables that will be used for mediation, race (“White” or “Black”), informal_support (an integer from 0 to 3), and IPV (an integer from 0 to 24). We manually create _prop and _max variables for informal support and IPV, so that we can use the proportion as an outcome in our GLMs and use the weights argument to specify the number of trials. We also create informal_support_binary, which classifies women as having high access to informal support (takes the value 1) when they score 3 out of 3, and low access (takes the value 0) when they score less than 3, as well as IPV_binary, which classifies women as having experienced some form of IPV (takes the value 1) if they score more than 0 and as not having experienced IPV (takes the value 0) if they score exactly 0. These binarized variables will become useful later on. Finally, we create the ordered factors informal_support_ordinal and IPV_ordinal to be used in ordinal models later on.
Each person also has a baseline weight, m1natwt, and 33 replicate weights, m1natwt_rep1 and so on. Behold:
Code
head(dat) |> gt::gt()
| race | informal_support | IPV | informal_support_prop | informal_support_max | informal_support_binary | informal_support_ordinal | IPV_prop | IPV_max | IPV_binary | IPV_ordinal | m1natwt | m1natwt_rep1 | m1natwt_rep2 | m1natwt_rep3 | m1natwt_rep4 | m1natwt_rep5 | m1natwt_rep6 | m1natwt_rep7 | m1natwt_rep8 | m1natwt_rep9 | m1natwt_rep10 | m1natwt_rep11 | m1natwt_rep12 | m1natwt_rep13 | m1natwt_rep14 | m1natwt_rep15 | m1natwt_rep16 | m1natwt_rep17 | m1natwt_rep18 | m1natwt_rep19 | m1natwt_rep20 | m1natwt_rep21 | m1natwt_rep22 | m1natwt_rep23 | m1natwt_rep24 | m1natwt_rep25 | m1natwt_rep26 | m1natwt_rep27 | m1natwt_rep28 | m1natwt_rep29 | m1natwt_rep30 | m1natwt_rep31 | m1natwt_rep32 | m1natwt_rep33 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Black | 3 | 18 | 1 | 3 | 1 | 3 | 0.75000000 | 25 | 1 | 18 | 41.13614 | 37.85516 | 41.24734 | 42.44724 | 44.39487 | 43.73881 | 44.45506 | 40.57142 | 40.78370 | 41.43095 | 34.34425 | 44.46661 | 0.00000 | 0.00000 | 37.94593 | 41.50523 | 43.69954 | 41.31135 | 39.22795 | 42.50257 | 42.02170 | 46.58911 | 41.82341 | 45.53830 | 42.27204 | 42.28968 | 43.25393 | 40.67841 | 0.00000 | 42.64928 | 42.28931 | 42.82657 | 43.13633 | 44.10038 |
| Black | 3 | 4 | 1 | 3 | 1 | 3 | 0.16666667 | 25 | 1 | 4 | 19.97570 | 22.96839 | 20.49259 | 20.21818 | 21.76786 | 20.79500 | 20.41886 | 21.54646 | 19.36423 | 19.80069 | 19.47093 | 19.58197 | 20.42065 | 19.78305 | 20.23575 | 21.36903 | 20.48008 | 20.91415 | 20.30889 | 21.36875 | 19.87386 | 21.06265 | 20.41880 | 23.67257 | 20.81135 | 21.42606 | 21.32215 | 19.58435 | 20.62730 | 21.86095 | 20.91533 | 20.08396 | 20.84812 | 20.38860 |
| Black | 3 | 0 | 1 | 3 | 1 | 3 | 0.00000000 | 25 | 0 | 0 | 12.24003 | 12.68406 | 11.59396 | 12.10748 | 12.00069 | 13.18730 | 11.88648 | 12.80008 | 0.00000 | 12.55978 | 11.89597 | 12.56892 | 12.57526 | 12.47836 | 11.46254 | 12.59188 | 12.65551 | 11.98436 | 12.49196 | 12.09257 | 13.70510 | 13.18920 | 12.23701 | 12.59980 | 12.74326 | 13.78494 | 12.39277 | 13.22096 | 11.71815 | 13.12761 | 12.33747 | 11.22335 | 12.79405 | 13.22079 |
| Black | 3 | 0 | 1 | 3 | 1 | 3 | 0.00000000 | 25 | 0 | 0 | 121.76894 | 129.03618 | 131.50407 | 124.49722 | 125.08351 | 115.39091 | 124.86470 | 0.00000 | 131.66747 | 121.04597 | 126.42019 | 125.73327 | 120.78219 | 132.49487 | 114.30053 | 127.66466 | 124.08061 | 125.39723 | 130.57684 | 134.18793 | 122.83195 | 0.00000 | 124.68690 | 119.24700 | 128.60410 | 128.66008 | 124.02149 | 126.49405 | 123.43700 | 141.58299 | 135.33669 | 126.14429 | 131.47270 | 125.73920 |
| White | 3 | 11 | 1 | 3 | 1 | 3 | 0.45833333 | 25 | 1 | 11 | 85.96230 | 87.22457 | 83.79171 | 91.27752 | 80.11185 | 88.11595 | 88.67043 | 90.75494 | 91.45559 | 85.68980 | 86.45096 | 92.47426 | 81.41688 | 99.84825 | 79.96129 | 85.80882 | 87.77487 | 89.85811 | 84.30115 | 95.09169 | 89.03872 | 90.58102 | 90.92682 | 102.89832 | 93.30130 | 94.10433 | 86.57666 | 88.45363 | 82.06447 | 89.72732 | 97.44034 | 91.89236 | 94.17548 | 0.00000 |
| Black | 3 | 2 | 1 | 3 | 1 | 3 | 0.08333333 | 25 | 1 | 2 | 98.04827 | 101.53748 | 0.00000 | 99.79310 | 97.24787 | 102.89133 | 105.71425 | 105.91802 | 102.19711 | 100.52937 | 103.41740 | 95.16078 | 97.53283 | 98.28993 | 103.28411 | 105.16373 | 102.85249 | 97.12798 | 107.91775 | 102.52666 | 107.20479 | 100.78193 | 95.96693 | 97.10217 | 101.69630 | 99.66428 | 101.72647 | 96.58415 | 95.98499 | 100.43622 | 0.00000 | 105.47424 | 96.22710 | 95.11009 |
We’ll need to run two models — the first predicting the mediator, the second predicting the outcome — and feed these models into mediation::mediate().
Issue 1: Correctly applying the Fragile Families weights
We fit the models using survey::svyglm() in order to be able to take the weights into account. However, svyglm() will not work with our dataframe directly — we need to create a “survey design” object by telling survey specifically how to handle the weights in our data. To do this correctly, we refer to the FFCW’s brief guide as well as other resources, like this CrossValidated question and the answers to it. Let’s build up the code step by step.
First, we pick the function that will create our survey design object. As we are dealing with replicate weights, we use the survey::svrepdesign() function and look at its documentation. We ignore the variables argument, as we want all the variables that are already in our data to be included.
Next, we see the repweights and weights arguments, which ask us to specify the replicate weights and the sampling weights, respectively. Our FFCW guide says (emphasis in the original):
When setting the weights in your statistical package, you will need to indicate the name of the basic weight, as well as the replicate weights.
…
As an example, to weight the baseline mother data to be representative of births in the 77 cities, you would apply the weight m1natwt (and replicate weights m1natwt_rep1- m1natwt_rep33).
To do this in survey::svrepdesign(), we specify weights = ~m1natwt and repweights = "m1natwt.rep[1-9]+".
The data argument is straightforward, we just need to specify the dataframe we wish to use, dat.
Now the fun begins. How do we set the type argument, and do we need anything else? The FFCW guide is made for Stata users, and shows the following general command for specifying the survey design:
So far we’ve incorporated the “pweight=BASICWEIGHT” and the “REPLICATES” part, but the Stata code reveals some more intricacies. We can see based on “vce(jack)” and “jkrw” that Jackknife variance estimation is used, and that corresponds to setting type = "JKn".
Next we set combined.weights = TRUE, which does not seem to have an equivalent in the Stata code, but seems to be the correct approach according to the function documentation.3
Stata’s “multiplier(1)” means we need to set scales = 1 as well as rscales = 1, as explained by Lumley in this answer.
And finally, “mse” has its survey equivalent in the argument mse = TRUE.
Because we like to stay close to the tidyverse, the code we actually run uses the srvyr package instead. This way we get to avoid specifying our sampling weight as a formula (recall “~m1natwt”) and our replicate weights as a regular expression:
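Putting the pieces together, the design call looks something like this (a sketch: the object name dat_weights matches the model output shown later in the post, but the tidyselect helper and the exact spelling of the scale/rscales arguments should be checked against the as_survey_rep() documentation):

dat_weights <- dat |>
  srvyr::as_survey_rep(
    weights = m1natwt,
    repweights = dplyr::starts_with("m1natwt_rep"),
    type = "JKn",
    combined_weights = TRUE,
    scale = 1,
    rscales = 1,
    mse = TRUE
  )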
With the weighted dataframe in hand, we can use survey::svyglm() to fit our GLMs predicting informal_support and IPV.
Our informal support variable is the sum of three items for which each mother could get a 1 or a 0 (indicating that she did or did not have access to that form of support). As such, it seems sensible to model it as a binomial outcome — each mother has a certain number of “successes” out of three possible trials.
Call:
survey::svyglm(formula = informal_support_prop ~ race, weights = informal_support_max,
design = dat_weights, family = "binomial")
Survey design:
Called via srvyr
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.689 7.167 0.515 0.610
raceBlack -1.732 7.069 -0.245 0.808
(Dispersion parameter for binomial family taken to be 1299.299)
Number of Fisher Scoring iterations: 6
We get the following warning many times:
Warning: non-integer #successes in a binomial glm!
This is survey’s way of asking us to specify the family as “quasibinomial” instead of “binomial”, since our outcome is a proportion (weighted by the number of trials) and not binary. However, if we do that, mediation will not work with our models.
Our IPV variable is the sum of twelve items for which each mother could get a score of 0, 1, or 2 (indicating how often she experienced a given form of violence from her partner). As such, it also seems sensible to model it as a binomial outcome — each mother has a certain number of “successes” out of 24 possible “trials”.4
Call:
survey::svyglm(formula = IPV_prop ~ race + informal_support_prop,
weights = IPV_max, design = dat_weights, family = "binomial")
Survey design:
Called via srvyr
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.4968 1.3423 -1.115 0.274
raceBlack 0.1207 1.2946 0.093 0.926
informal_support_prop -0.7482 1.3249 -0.565 0.576
(Dispersion parameter for binomial family taken to be 291.4434)
Number of Fisher Scoring iterations: 5
(Here again we suppressed the warning that would have otherwise popped up.)
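With both models in hand, we feed them to mediation::mediate(). A sketch of the call (model_m_1 and model_y_1 are the names that appear in the warnings below; the treat, mediator, and sims settings are our assumptions). Running it produces a couple of warnings:

med_1 <- mediation::mediate(
  model.m = model_m_1,
  model.y = model_y_1,
  treat = "race",
  mediator = "informal_support_prop",
  sims = n_sims
)

summary(med_1)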
weights taken as sampling weights, not total number of trials
Warning in mediation::mediate(model.m = model_m_1, model.y = model_y_1, :
treatment and control values do not match factor levels; using White and Black
as control and treatment, respectively
A small but statistically significant mediation effect! The warning about weights not being taken as total number of trials makes us nervous, but then again, we did give the models sampling weights, so maybe it’s ok?
Issue 2: Mediation with “multiple-trial binomial mediator” runs for weighted but not for unweighted models?
We knew that we would eventually perform a sensitivity analysis where all models would be run without using the sampling weights. Running the previous steps with the glm() command resulted in an issue:
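(A sketch of those unweighted fits; the object names follow the error message below, the rest is our assumption.)

model_m_2 <- glm(
  informal_support_prop ~ race,
  weights = informal_support_max,
  family = "binomial",
  data = dat
)

model_y_2 <- glm(
  IPV_prop ~ race + informal_support_prop,
  weights = IPV_max,
  family = "binomial",
  data = dat
)

med_2 <- mediation::mediate(
  model.m = model_m_2,
  model.y = model_y_2,
  treat = "race",
  mediator = "informal_support_prop",
  sims = n_sims
)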
Error in mediation::mediate(model.m = model_m_2, model.y = model_y_2, : weights on outcome and mediator models not identical
It seems like mediation::mediate() is taking the weights we pass to the GLM function (which correspond to total number of trials, not to sampling weights) as sampling weights, and complaining when it notices that these weights are not the same for the mediator and outcome models. Vasco perhaps clumsily posted this as an issue on the mediation package Github, but at least for now there has been no answer. We can work around this by using an alternative way of specifying the same GLM model — instead of using a proportion as an outcome and passing the number of trials through the weights argument, we can instead use a vector of “successes” and “failures” (which corresponds to the total number of trials minus the number of successes) as the outcome:
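(A sketch; med_2.1 is the name used later in the post, the other object names are ours.)

model_m_2.1 <- glm(
  cbind(informal_support, informal_support_max - informal_support) ~ race,
  family = "binomial",
  data = dat
)

model_y_2.1 <- glm(
  cbind(IPV, IPV_max - IPV) ~ race + informal_support_prop,
  family = "binomial",
  data = dat
)

med_2.1 <- mediation::mediate(
  model.m = model_m_2.1,
  model.y = model_y_2.1,
  treat = "race",
  mediator = "informal_support_prop",
  sims = n_sims
)

This specification runs without the weights complaint. The mediate() documentation, however, gives us pause: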
As of version 4.0, the mediator model can be of either ‘lm’, ‘glm’ (or `bayesglm’), ‘polr’ (or `bayespolr’), ‘gam’, ‘rq’, `survreg’, or `merMod’ class, corresponding respectively to the linear regression models, generalized linear models, ordered response models, generalized additive models, quantile regression models, parametric duration models, or multilevel models.. For binary response models, the ‘mediator’ must be a numeric variable with values 0 or 1 as opposed to a factor. Quasi-likelihood-based inferences are not allowed for the mediator model because the functional form must be exactly specified for the estimation algorithm to work. The ‘binomial’ family can only be used for binary response mediators and cannot be used for multiple-trial responses. This is due to conflicts between how the latter type of models are implemented in glm and how ‘mediate’ is currently written.
Thus, until the package author clarifies whether this implementation results in correct estimates, we are cautious about trusting them.
Sanity check: comparing results with linear probability models
As a sanity check, to become a little more confident that mediate is working as intended even though it’s not supposed to work well with a multiple-trial binomial mediator, we compare the results of our model that did run (med_2.1) with a mediation run on linear probability models (basically, using lm() instead of glm() even though the outcomes are proportions).
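A sketch of the linear-probability comparison (med_2.2 is the name used below; everything else is our assumption):

model_m_2.2 <- lm(informal_support_prop ~ race, data = dat)

model_y_2.2 <- lm(IPV_prop ~ race + informal_support_prop, data = dat)

med_2.2 <- mediation::mediate(
  model.m = model_m_2.2,
  model.y = model_y_2.2,
  treat = "race",
  mediator = "informal_support_prop",
  sims = n_sims
)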
The results seem fairly similar, but this provides limited comfort.
Bonus finding: you can specify a mediator that is not technically the variable that your mediator model predicts
Note that in the models used for med_2.1 there is a potential discrepancy. The mediator model is formulated as formula = cbind(informal_support, informal_support_max - informal_support) ~ race, while the outcome model is formulated as formula = cbind(IPV, IPV_max - IPV) ~ race + informal_support_prop. So the outcome model is using informal_support_prop (our intended mediator) as a predictor of IPV, but the mediator model uses the variables informal_support and informal_support_max to internally generate the proportion of successes. In the call to mediation::mediate() we can then specify mediator = "informal_support_prop", and all seems to work as intended. What if we used informal_support as predictor in the outcome model instead? We can then specify mediator = "informal_support" and end up with something like this:
Looking at the last three rows (Average Causal Mediated Effect (average), Average Direct Effect (average), and Proportion Mediated (average)), the results seem different from the ones estimated in med_2.1 (which used informal_support_prop as mediator) and med_2.2 (which used linear probability models). But looking closer, you’ll notice that the estimates related to the mediator, namely the Average Causal Mediated Effect and the Proportion Mediated, just seem to be divided by three!
My assumption about how that came to be: When we use informal_support_prop, which can only take values on the \((0, 1)\) interval, the ACME estimates correspond to the change in probability of IPV associated with going from 0 to 1 on the mediator and thus span the full range of the mediator. When we use informal_support, which takes values in the set \(\{0, 1, 2, 3\}\), the estimates also correspond to the increase in probability of IPV associated with an increase of 1 on the mediator (e.g., from 0 to 1), but this is only \(\frac{1}{3}\) of the total possible way. Thus, applying this effect three times would be equivalent to moving from 0 to 3 on the mediator scale, the same as moving from 0 to 1 when the mediator is a proportion.5
Issue 3: Using a binary mediator and a binomial outcome — works with cbind(successes, failures) but not with proportion as outcome and weights argument specified
Next, we attempted to run everything with the same binomial outcome and a binary mediator.
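A sketch of that attempt (the model names follow the error below; here we show unweighted glm() fits, with the outcome again specified as a proportion plus a weights argument):

model_m_3 <- glm(
  informal_support_binary ~ race,
  family = "binomial",
  data = dat
)

model_y_3 <- glm(
  IPV_prop ~ race + informal_support_binary,
  weights = IPV_max,
  family = "binomial",
  data = dat
)

med_3 <- mediation::mediate(
  model.m = model_m_3,
  model.y = model_y_3,
  treat = "race",
  mediator = "informal_support_binary",
  sims = n_sims
)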
Error in mediation::mediate(model.m = model_m_3, model.y = model_y_3, : weights on outcome and mediator models not identical
Once again we find that mediate() complains about different weights on mediator and outcome models, even though, according to the documentation, it works with a binary mediator and a binomial outcome. We then try using the vector of successes and failures specification for the outcome model:
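(A sketch; model_y_3.1 and med_3.1 are our own names.)

model_y_3.1 <- glm(
  cbind(IPV, IPV_max - IPV) ~ race + informal_support_binary,
  family = "binomial",
  data = dat
)

med_3.1 <- mediation::mediate(
  model.m = model_m_3,
  model.y = model_y_3.1,
  treat = "race",
  mediator = "informal_support_binary",
  sims = n_sims
)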
This approach runs, but we are still left wondering whether we can trust the estimates.
Issue 4: Ordinal mediator and binomial outcome don’t work together
Because we would prefer not to binarize the social support variable in order to keep all the information that we can, we also explored using an ordinal mediator model. mediate() works with ordinal models fit with MASS::polr(), so we attempted that.
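A sketch of the models for this attempt (model_m_4 appears in a later error message; model_y_4 and the exact formulas are our assumptions — the outcome keeps the cbind(successes, failures) binomial specification):

model_m_4 <- MASS::polr(
  informal_support_ordinal ~ race,
  data = dat,
  Hess = TRUE
)

model_y_4 <- glm(
  cbind(IPV, IPV_max - IPV) ~ race + informal_support_ordinal,
  family = "binomial",
  data = dat
)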
The quasi-Bayesian approximation (King et al. 2000) cannot be used if ‘model.m’ is of class ‘rq’ or ‘gam’, or if ‘model.y’ is of class ‘gam’, ‘polr’ or ‘bayespolr’. In these cases, either an error message is returned or use of the nonparametric bootstrap is forced. Users should note that use of the nonparametric bootstrap often requires significant computing time, especially when ‘sims’ is set to a large value.
We can get around this by setting the argument boot to TRUE.
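(A sketch of the call; med_4 is our own name.)

med_4 <- mediation::mediate(
  model.m = model_m_4,
  model.y = model_y_4,
  treat = "race",
  mediator = "informal_support_ordinal",
  boot = TRUE,
  sims = n_sims
)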
Error in cbind(IPV, IPV_max - IPV): object 'IPV' not found
Now we get a different problem. It seems the boot package is having trouble with the way we set up our outcome model — hopefully using the other implementation will take care of that.
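(A sketch; the name model_y_4.1 follows the error message below, med_4.1 is ours.)

model_y_4.1 <- glm(
  IPV_prop ~ race + informal_support_ordinal,
  weights = IPV_max,
  family = "binomial",
  data = dat
)

med_4.1 <- mediation::mediate(
  model.m = model_m_4,
  model.y = model_y_4.1,
  treat = "race",
  mediator = "informal_support_ordinal",
  boot = TRUE,
  sims = n_sims
)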
Error in mediation::mediate(model.m = model_m_4, model.y = model_y_4.1, : weights on outcome and mediator models not identical
Finally, we reach a dead-end. It seems once again the mediate() function is taking the prior weights we fed into our glm() as sampling weights and complaining that our ordinal model doesn’t have the same sampling weights. If we are willing to throw away some information, we can make it all work with a binary outcome, thusly:
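(Our best guess at what this step looks like, based on the error messages and the discussion that follows: the mediator is refit with survey::svyolr() and the outcome with survey::svyglm(), both on the weighted design.)

model_m_5 <- survey::svyolr(
  informal_support_ordinal ~ race,
  design = dat_weights
)

model_y_5 <- survey::svyglm(
  IPV_binary ~ race + informal_support_ordinal,
  design = dat_weights,
  family = "binomial"
)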
Error in optim(s0, fmin, gmin, method = "BFGS", ...): initial value in 'vmmin' is not finite
This throws an error, which this Stack Overflow post can help us solve. We can help the function by providing plausible start values for the coefficients and the intercepts/thresholds. We have one coefficient and three thresholds (from 0 to 1, from 1 to 2, and from 2 to 3), and we know they should be relatively close to zero as well as that the thresholds should be in ascending order. With the syntax start = c(coefficients, thresholds) we can make it work:
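(A sketch; the specific start values are our guess at “plausible” ones — one coefficient near zero followed by three ascending thresholds.)

model_m_5 <- survey::svyolr(
  informal_support_ordinal ~ race,
  design = dat_weights,
  start = c(0, -1, 0, 1)
)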
Error in mediation::mediate(model.m = model_m_5, model.y = model_y_5, : weights on outcome and mediator models not identical
Not this again 😭. We knew that the weights used for the svyglm() and svyolr() models were the same, so there must be something else happening in the background. Vasco did some investigating and again turned to Stack Overflow. We can see what weights are stored in the model objects generated by svyglm() and svyolr() using the model.frame() function. In the following two code blocks we also use dplyr::slice_head() to display just the first line of results output by model.frame().
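Something like this (a sketch, shown here as a single block; the slice_head() calls just keep the printout short):

# first row of the model frame stored in the ordinal (svyolr) model
model.frame(model_m_5) |>
  dplyr::slice_head(n = 1)

# first row of the model frame stored in the binomial (svyglm) model
model.frame(model_y_5) |>
  dplyr::slice_head(n = 1)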
The first weight for our ordinal model appears to be 41.13614, while the binomial model uses 0.1142592. We can extract all the weights for both models and compare them. Let’s look at their ratio.
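(A sketch of the comparison; stats::model.weights() pulls the "(weights)" column out of each model frame.)

w_ordinal  <- model.weights(model.frame(model_m_5))
w_binomial <- model.weights(model.frame(model_y_5))

# ratio of the weights stored in the two models
unique(w_ordinal / w_binomial)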
There seem to be three different values that all look the same — perhaps this will make more sense once we learn more about floating point arithmetic, but for now we’ll say the ratio between the weights is always the same. On Stack Overflow, Thomas Lumley, maintainer of the survey package, explains in a reply to my post:
Some functions in the survey package rescale the weights to have unit mean because the large weights (thousands or tens of thousands) in some national surveys can cause convergence problems.
Since the results aren’t affected at all, the mediate package could probably work around this fairly easily. However, there’s a rescale=FALSE option to svyglm to not rescale the weights, provided for this sort of purpose.
If you then have convergence problems in svyglm you could manually rescale the weights to have unit mean before doing any of the analyses.
Good to know!
Let’s set rescale = FALSE in the svyglm() model and try our luck:
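(A sketch — this is the same call as the workaround shown further down, just without the pwts trick.)

model_y_5.1 <- survey::svyglm(
  formula = IPV_binary ~ race + informal_support_ordinal,
  design = dat_weights,
  family = "binomial",
  rescale = FALSE
)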
Error in is.data.frame(pwts): object 'pwts' not found
This is perplexing, but someone else on Stack Overflow a couple of years ago encountered a similar problem, where object “.survey.prob.weights” was not found when rescale was set to FALSE. After looking at the source code and the accepted answer by user xilliam to the Stack Overflow post, we try to assign the prior weights contained in the survey design object to an object named pwts in the global environment, and that makes svyglm() happy again.
Code
pwts <- dat_weights$pweights

model_y_5.1 <- survey::svyglm(
  formula = IPV_binary ~ race + informal_support_ordinal,
  design = dat_weights,
  family = "binomial",
  rescale = FALSE
)

# remove pwts from the global environment
rm(list = "pwts")
However, we can’t be sure if this correctly solves the problem or introduces new errors. Vasco posted about this issue on Stack Overflow in the hopes that perhaps Thomas Lumley can clarify this for us.
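With matching weights in hand, we can try mediate() once more, along these lines (a sketch; the boot setting and the name med_5 are our assumptions):

med_5 <- mediation::mediate(
  model.m = model_m_5,
  model.y = model_y_5.1,
  treat = "race",
  mediator = "informal_support_ordinal",
  boot = TRUE,
  sims = n_sims
)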
Error in MASS::polr(formula, data = df, ..., Hess = TRUE, model = FALSE, : formal argument "data" matched by multiple actual arguments
Ok, we give up here.
Issue 6: Options and choices to boot — To robustSE or not to robustSE?
For now, we’ve ignored an optional setting to mediate(): the robustSE argument defaults to FALSE, and, according to the documentation, “If ‘TRUE’, heteroskedasticity-consistent standard errors will be used in quasi-Bayesian simulations. Ignored if ‘boot’ is ‘TRUE’ or neither ‘model.m’ nor ‘model.y’ has a method for vcovHC in the sandwich package.” So we have three options:
boot = TRUE and, automatically, robustSE = FALSE
boot = FALSE and robustSE = TRUE, and
boot = FALSE and robustSE = FALSE.
And we know that, when chosen, the heteroskedasticity-consistent standard errors will be calculated with the help of the sandwich package.
According to the mediation package vignettes, the authors recommend setting boot = TRUE: “In general, as long as computing power is not an issue, analysts should estimate confidence intervals via the bootstrap with more than 1000 resamples, which is the default number of simulations.” So we could do that. Unfortunately, computing power is an issue, as we wanted to run a multiverse analysis that meant we would use the mediate() function a bunch of times. To figure out whether to set robustSE to TRUE or FALSE, it could be useful to see which option gives us results that are closer to the bootstrap option. Let’s use a binary mediator and outcome, just so we are sure there aren’t other issues at play.
For convenience, we’ll create a function called compare_meds() that takes in different mediation models and outputs a table.
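Here is a sketch of the comparison (all object names are ours, and we show survey-weighted binary fits; the compare_meds() helper simply pulls the average ACME, ADE, and proportion mediated out of each mediate object):

# binary mediator and binary outcome, fit on the weighted design
model_m_6 <- survey::svyglm(informal_support_binary ~ race,
                            design = dat_weights, family = "binomial")

model_y_6 <- survey::svyglm(IPV_binary ~ race + informal_support_binary,
                            design = dat_weights, family = "binomial")

# the three option combinations
med_boot   <- mediation::mediate(model_m_6, model_y_6, treat = "race",
                                 mediator = "informal_support_binary",
                                 boot = TRUE, sims = n_sims)

med_robust <- mediation::mediate(model_m_6, model_y_6, treat = "race",
                                 mediator = "informal_support_binary",
                                 robustSE = TRUE, sims = n_sims)

med_plain  <- mediation::mediate(model_m_6, model_y_6, treat = "race",
                                 mediator = "informal_support_binary",
                                 robustSE = FALSE, sims = n_sims)

# gather the average ACME, ADE, and proportion mediated (with CIs) into one table
compare_meds <- function(...) {
  meds <- list(...)
  purrr::imap_dfr(meds, function(med, name) {
    tibble::tibble(
      model    = name,
      effect   = c("ACME (average)", "ADE (average)", "Prop. mediated (average)"),
      estimate = c(med$d.avg, med$z.avg, med$n.avg),
      ci_lower = c(med$d.avg.ci[1], med$z.avg.ci[1], med$n.avg.ci[1]),
      ci_upper = c(med$d.avg.ci[2], med$z.avg.ci[2], med$n.avg.ci[2])
    )
  }) |>
    gt::gt()
}

compare_meds(boot = med_boot, robustSE = med_robust, plain = med_plain)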
We can see that the point estimates are very similar, and in fact running the mediations several times reveals that the remaining differences seem to be due to simulation variability. Given this, Vasco would personally rather specify robustSE = FALSE, since it does not make sense to correct for heteroskedasticity in a logistic regression model, which cannot suffer from heteroskedasticity in the first place.
While the point estimates are similar, the confidence intervals are much tighter when boot or robustSE is set to TRUE, and suddenly all estimates are significant! What gives?
First, an answer by Thomas Lumley (maintainer of the survey package) on Stack Overflow suggests that bootstrapping doesn’t work well with these survey-weighted models, although his answer does not fully map onto our situation. In another answer, this time about applying robust standard errors with the sandwich package6, Lumley is clearer: “For svyglm models, vcov() already produces the appropriate sandwich estimator, and I don’t think the ‘sandwich’ package knows enough about the internals of the object to get it right.” A few years prior, Achim Zeileis, maintainer of the sandwich package, had given a different perspective, which also boils down to: do not use sandwich on models fit with survey.
There you have it, boot = FALSE and robustSE = FALSE is the way to go.
Bonus: this goes for using marginaleffects as well
Often we are not interested in mediation, but rather would like a clean and interpretable way to report results from our GLMs. One could not be faulted for preferring to do this on the predicted probabilities scale rather than dealing with odds-ratios and other headaches. For this, the marginaleffects package can be of enormous help.
For example, we can examine the predicted probability of experiencing IPV for Black and White women:
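(A sketch; the model name and formula are our own — any of the survey-weighted binary outcome models from above would do.)

model_y_binary <- survey::svyglm(
  IPV_binary ~ race + informal_support_binary,
  design = dat_weights,
  family = "binomial"
)

# average predicted probability of experiencing any IPV, by race
marginaleffects::avg_predictions(model_y_binary, by = "race")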
And then compare the probabilities themselves, for which we again have the option of “correcting” the standard errors via the vcov argument. Look at the difference in the confidence intervals and p-values spit out when we don’t or do specify vcov = "HC":
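(A sketch, reusing the model from the previous block.)

# difference in predicted probabilities with the default variance-covariance matrix ...
marginaleffects::avg_comparisons(model_y_binary, variables = "race")

# ... and with the sandwich-style "HC" correction
marginaleffects::avg_comparisons(model_y_binary, variables = "race", vcov = "HC")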
The estimates are the same, but the standard error (wrongly) becomes minuscule if we apply the “correction” for heteroskedasticity without realizing that sandwich does not work properly on models fit with survey.
Issue 7: Bounds of the confidence intervals are rounded inconsistently
Did you notice? Each time we use summary() on a mediate() object, the lower bound of the 95% CI is reported with 5 decimal places, while the upper bound is reported with 2 decimal places. This also bothered Stack Overflow user sateayam over 4 years ago, and luckily user Roland answered with a fix:
Code
trace(mediation:::print.summary.mediate, at = 11, tracer = quote({
  printCoefmat <- function(x, digits) {
    p <- x[, 4]
    x[, 1:3] <- sprintf("%.6f", x[, 1:3])
    x[, 4] <- sprintf("%.2f", p)
    print(x, quote = FALSE, right = TRUE)
  }
}), print = FALSE)
The interim solution is to print the summary twice, once with the original function for expected p-value behavior, and another time with the edited function for confidence-interval consistency. Someone more well-versed might want to take a shot at adapting the code so that the behavior is maintained for the p-values7 and fixed for the confidence intervals.
Conclusion
Mediation analysis is always tricky business8, but it feels especially discouraging when the software you need doesn’t seem to want to cooperate. We hope this post will at least shorten someone else’s journey into getting their mediation and/or survey models to behave.
Another option, which we won’t go into, would be to take the responses to the items themselves and use a multi-level ordinal model with item as the random grouping factor.
Of course, the effects mentioned here are not really the effects of moving from X to Y on the mediator, but the mediated effect that changing from Race = “White” to Race = “Black” has on the probability of IPV through its effect on the mediator.
@online{brazão2023,
author = {Brazão, Vasco and Devendran, Priya},
title = {Mediation Analysis with Weighted {GLMs} in {R:} {What} Could
Possibly Go Wrong?},
date = {2023-12-21},
langid = {en}
}
For attribution, please cite this work as:
Brazão, V., & Devendran, P. (2023, December 21). Mediation
analysis with weighted GLMs in R: What could possibly go wrong?