# Linear Mixed Models
The `lmer()` class makes it easy to work with linear mixed models (multi-level models) estimated via maximum-likelihood or restricted-maximum-likelihood estimation. Random effects are specified using the same formula syntax as lme4 in R, and all methods work the same as they do for `lm()` models.
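For example, all of the usual lme4 conventions apply. Here is an illustrative sketch (the outcome `y`, predictor `x`, and grouping factors are hypothetical placeholders):

```python
# Common lme4-style random-effects specifications (illustrative only):
# 'y ~ x + (1 | group)'                     random intercept per group
# 'y ~ x + (x | group)'                     correlated random intercept & slope
# 'y ~ x + (1 | group) + (0 + x | group)'   uncorrelated random intercept & slope
# 'y ~ x + (1 | g1) + (1 | g2)'             crossed random intercepts
```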
## Core Methods and Attributes
| Method | Description |
|---|---|
| `.fit()` | Fit a model |
| `.anova()` | Calculate a Type-III ANOVA table |
| `.summary()` | Get a nicely formatted table of estimates and fit information |
| `.empredict()` | Compute marginal predictions at arbitrary values of predictors |
| `.emmeans()` | Compute marginal means/contrasts at specific factor levels |
| `.simulate()` | Simulate new observations from the fitted model |
| `.predict()` | Make predictions given a new dataset |
lmer models store the results of their last estimation as polars DataFrames, like `lm()` models, but with a few additional attributes:
| Attribute | Stores |
|---|---|
| `.params` | parameter estimates |
| `.ranef` | cluster-level deviances for random-effects terms |
| `.ranef_var` | variances and covariances for random-effects terms |
| `.fixef` | cluster-level estimates for random-effects terms (BLUPs) |
| `.result_fit` | parameter estimates, errors, & inferential statistics |
| `.result_fit_stats` | model fit statistics |
| `.result_anova` | ANOVA table |
```python
from pymer4.models import lmer
from pymer4 import load_dataset

sleep = load_dataset('sleep')
```
## Random Intercepts
```python
model_i = lmer('Reaction ~ Days + (1 | Subject)', data=sleep)
model_i.fit()
```
As with `lm()` models, we can inspect estimates with the `.params` attribute. In the current model these are the estimates for the overall model intercept and the slope for \(Days\):
```python
model_i.params
```
| term | estimate |
|---|---|
| str | f64 |
| "(Intercept)" | 251.405105 |
| "Days" | 10.467286 |
We can see estimates for each cluster (each \(Subject\)), commonly called BLUPs, in the `.fixef` attribute. Because we only estimated a random intercept per \(Subject\), all estimates (slopes) for \(Days\) are the same:
```python
model_i.fixef.head()
```
| level | (Intercept) | Days |
|---|---|---|
| str | f64 | f64 |
| "308" | 292.188815 | 10.467286 |
| "309" | 173.555551 | 10.467286 |
| "310" | 188.296537 | 10.467286 |
| "330" | 255.811547 | 10.467286 |
| "331" | 261.621294 | 10.467286 |
These estimates are derived from the random deviances of each \(Subject\) around the fixed parameter estimates. We can see those with `.ranef`, which only contains a column for random intercepts:
```python
model_i.ranef.head()
```
| level | (Intercept) |
|---|---|
| str | f64 |
| "308" | 40.78371 |
| "309" | -77.849554 |
| "310" | -63.108567 |
| "330" | 4.406442 |
| "331" | 10.216189 |
We can see the variability across \(Subject\)s using `.ranef_var`:
```python
model_i.ranef_var
```
| group | term | estimate | conf_low | conf_high |
|---|---|---|---|---|
| str | str | f64 | f64 | f64 |
| "Subject" | "sd__(Intercept)" | 37.123827 | null | null |
| "Residual" | "sd__Observation" | 30.991234 | null | null |
Of course, it's often easiest to see all this information at once using `.summary()`:
```python
model_i.summary()
```
Formula: lmer(Reaction~Days+(1|Subject))

Number of observations: 180; Confidence intervals: parametric

Log-likelihood: -893; AIC: 1794; BIC: 1807; Residual error: 30.991

| | Term | Estimate | SE | CI-low | CI-high | T-stat | DF | p | |
|---|---|---|---|---|---|---|---|---|---|
| **Random Effects:** | | | | | | | | | |
| Subject-sd | (Intercept) | 37.124 | | | | | | | |
| Residual-sd | Observation | 30.991 | | | | | | | |
| **Fixed Effects:** | | | | | | | | | |
| | (Intercept) | 251.405 | 9.747 | 231.233 | 271.577 | 25.794 | 22.810 | <.001 | *** |
| | Days | 10.467 | 0.804 | 8.879 | 12.055 | 13.015 | 161.000 | <.001 | *** |

Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1
## Random Slopes
We can easily extend our model with random slopes by adding them to the formula. Before inspecting the parameter estimates, it can be helpful to test whether we really need the extra parameters by comparing this model to our first model using a likelihood-ratio test:
```python
from pymer4.models import compare

model_s = lmer('Reaction ~ Days + (Days | Subject)', data=sleep)
compare(model_i, model_s)
```
Analysis of Deviance Table

Model 1: lmer(Reaction~Days+(1|Subject))

Model 2: lmer(Reaction~Days+(Days|Subject))

| Model | AIC | BIC | logLik | npar | -2*log(L) | Chisq | Df | Pr(>Chisq) | |
|---|---|---|---|---|---|---|---|---|---|
| 1.00 | 1794.5 | 1807.2 | -893.2 | 4.00 | 1.79K | | | | |
| 2.00 | 1755.6 | 1774.8 | -871.8 | 6.00 | 1.75K | 42.14 | 2.00 | <.001 | *** |

Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1
It looks like it was worth it!
Let's calculate bootstrapped confidence intervals, which will give us uncertainty estimates around both fixed- and random-effects terms. By default we don't get confidence intervals around random-effects terms, but we can with bootstrapping. We can see that our estimate of the standard deviation across \(Subject\)s' intercepts is ~25 with a 95% CI of [11, 36], while the standard deviation of their slopes for \(Days\) is about 6 with a 95% CI of [3, 9]:
```python
model_s.fit(conf_method='boot', summary=True)
```
Formula: lmer(Reaction~Days+(Days|Subject))

Number of observations: 180; Confidence intervals: boot; Bootstrap iterations: 1000

Log-likelihood: -871; AIC: 1755; BIC: 1774; Residual error: 25.592

| | Term | Estimate | CI-low | CI-high | SE | T-stat | DF | p | |
|---|---|---|---|---|---|---|---|---|---|
| **Random Effects:** | | | | | | | | | |
| Subject-sd | (Intercept) | 24.741 | 12.992 | 35.750 | | | | | |
| Subject-sd | Days | 5.922 | 3.512 | 8.450 | | | | | |
| Subject-cor | (Intercept) | 0.066 | -0.500 | 0.906 | | | | | |
| Residual-sd | Observation | 25.592 | 22.530 | 28.605 | | | | | |
| **Fixed Effects:** | | | | | | | | | |
| | (Intercept) | 251.405 | 237.684 | 265.801 | 6.825 | 36.838 | 17.000 | <.001 | *** |
| | Days | 10.467 | 7.462 | 13.584 | 1.546 | 6.771 | 17.000 | <.001 | *** |

Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1
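The same bootstrapped intervals are also available programmatically: after a bootstrapped fit, the conf_low and conf_high columns of `.ranef_var` are populated (a small sketch):

```python
# Random-effects SDs and correlations with bootstrapped 95% CIs
model_s.ranef_var.select(['group', 'term', 'estimate', 'conf_low', 'conf_high'])
```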
## Variance Partitioning
Of particular interest with multi-level models is variance partitioning: how much of the explained variability is attributable to the fixed vs random portions of the model. This is captured by three statistics included in `.result_fit_stats`:

- **Total Explained Variance** (`R2_conditional`): what's explained by fixed effects & random effects
- **Fixed Effects Explained Variance** (`R2_marginal`): what's explained by fixed effects only
- **Intraclass Correlation Coefficient** (`ICC`): what's explained by random effects only; also known as the variance partition coefficient (VPC) or measurement repeatability
You can read more details about the implementation we use here, and see a quick numerical check after the table below.
```python
model_s.result_fit_stats
```
| AIC | AICc | BIC | ICC | R2_conditional | R2_marginal | REMLcrit | RMSE | Sigma | df_residual | logLik | nobs | sigma |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | i64 | f64 | i64 | f64 |
| 1755.628272 | 1756.113821 | 1774.786013 | 0.72166 | 0.79922 | 0.278651 | 1743.628272 | 23.438047 | 25.591796 | 174 | -871.814136 | 180 | 25.591796 |
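These three statistics hang together numerically: the reported ICC equals the random-effects share of the variance not explained by fixed effects, which we can check against the values above (a quick sketch; this matches the adjusted-ICC definition used by R's performance package, which appears to be what's reported here):

```python
stats = model_s.result_fit_stats
r2_conditional = stats['R2_conditional'][0]
r2_marginal = stats['R2_marginal'][0]

# Random-effects share of the variance not explained by fixed effects
icc_check = (r2_conditional - r2_marginal) / (1 - r2_marginal)
print(round(icc_check, 4))  # ~0.7217, matching the ICC column
```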
## Model issues
When fitting more complex models or models with redundant parameters, you'll often run into model convergence and/or model singularity issues. Models will print out warning messages from R when this happens to let you know, but you can also suppress these by passing `verbose=False` to `.fit()`:
```python
bad_model = lmer('Reaction ~ Days + (Days | Subject) + (1 | Days)', data=sleep)
bad_model.fit()
```
```
R messages:
boundary (singular) fit: see help('isSingular')
```
```python
# No warning!
bad_model.fit(verbose=False)
```
You can always see if there were any suppressed warnings using `.show_logs()`:
```python
bad_model.show_logs()
```
```
R messages:
boundary (singular) fit: see help('isSingular')
```
Convergence issues can often be due to a lack of variance for some of the random-effects terms. The "gradient" value should be a very tiny number that reflects the change in the optimizer's error during fitting. Values noticeably greater than 0 suggest the optimizer struggled to find the "best" parameter estimates.
```python
mtcars = load_dataset('mtcars')

bad_model = lmer('mpg ~ 1 + (hp | cyl)', data=mtcars)
bad_model.set_factors('cyl')
bad_model.fit()
```
```
R messages:
Convergence status
: [1] FALSE
attr(,"gradient")
[1] 1.145261
```
We can see that the variability of the random slope for \(hp\) across \(cyl\) is very small (~0.034), which makes this challenging for the optimizer:
```python
bad_model.ranef_var
```
| group | term | estimate | conf_low | conf_high |
|---|---|---|---|---|
| str | str | f64 | f64 | f64 |
| "cyl" | "sd__(Intercept)" | 6.192348 | null | null |
| "cyl" | "cor__(Intercept).hp" | -0.869679 | null | null |
| "cyl" | "sd__hp" | 0.034038 | null | null |
| "Residual" | "sd__Observation" | 3.120614 | null | null |
We can attempt to bootstrap the model to see how much the variability changes. We again see convergence warnings, and the range of estimates includes very small values for the random slope. We're also having a hard time estimating the correlation between the intercept and slope, leading to a huge confidence interval:
```python
bad_model.fit(conf_method='boot')
bad_model.ranef_var
```
```
R messages:
Convergence status
: [1] FALSE
attr(,"gradient")
[1] 1.145261
```
| group | term | estimate | conf_low | conf_high |
|---|---|---|---|---|
| str | str | f64 | f64 | f64 |
| "cyl" | "sd__(Intercept)" | 6.192348 | 0.0 | 13.387067 |
| "cyl" | "cor__(Intercept).hp" | -0.869679 | -1.0 | 1.0 |
| "cyl" | "sd__hp" | 0.034038 | 3.3944e-7 | 0.087685 |
| "Residual" | "sd__Observation" | 3.120614 | 2.287559 | 3.949821 |
We can try to simplify the model by removing the correlation between random-effects terms. The model still fails to converge, but the gradient is much smaller now. It looks like the lack of variability in the random slope is still giving the optimizer issues:
```python
simple_model = lmer('mpg ~ 1 + (1 | cyl) + (0 + hp | cyl)', data=mtcars)
simple_model.set_factors('cyl')
simple_model.fit(conf_method='boot')
simple_model.ranef_var
```
```
R messages:
Convergence status
: [1] FALSE
attr(,"gradient")
[1] 0.1034371
```
| group | term | estimate | conf_low | conf_high |
|---|---|---|---|---|
| str | str | f64 | f64 | f64 |
| "cyl" | "sd__(Intercept)" | 5.706231 | 0.0 | 11.294004 |
| "cyl.1" | "sd__hp" | 0.012059 | 0.0 | 0.04829 |
| "Residual" | "sd__Observation" | 3.182668 | 2.266804 | 4.016254 |
Let's simplify further by retaining only random intercepts:
```python
simpler_model = lmer('mpg ~ 1 + (1 | cyl)', data=mtcars)
simpler_model.set_factors('cyl')
simpler_model.fit(conf_method='boot')
simpler_model.ranef_var
```
| group | term | estimate | conf_low | conf_high |
|---|---|---|---|---|
| str | str | f64 | f64 | f64 |
| "cyl" | "sd__(Intercept)" | 5.761425 | 0.392901 | 11.721516 |
| "Residual" | "sd__Observation" | 3.222468 | 2.349387 | 4.015793 |
No more warnings! We can verify by checking out the logs:
```python
# Empty
simpler_model.show_logs()
```
And the convergence status, which shows successful convergence and a tiny gradient:
```python
print(simpler_model.convergence_status)
```
```
Convergence status
: [1] TRUE
attr(,"gradient")
[1] 7.001001e-06
```
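In longer scripts it can be useful to fail fast on convergence problems. A minimal sketch, assuming `.convergence_status` stringifies to the R printout shown above:

```python
def assert_converged(model):
    # The R printout contains '[1] TRUE' when the optimizer converged
    status = str(model.convergence_status)
    if 'TRUE' not in status:
        raise RuntimeError(f"Model did not converge:\n{status}")

assert_converged(simpler_model)  # passes silently for this model
```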
## Summary
`lmer()` models support all the same features as `lm()` models for working with categorical predictors and generating marginal estimates and comparisons, so be sure to check out the linear regression and ANOVA tutorials for more details.