
ARK Foundation

Published in Blog, Featured, News and Media

The Karlson-Holm-Breen (KHB) Method: Why Logistic Mediation Results Might Be Misleading?


Written by Ibrahim Hasan and S M Abdullah

A simple story with a built-in complicated problem

Suppose you are a health researcher studying health inequality. In particular, you want to know how educational attainment (the exposure) relates to a non-communicable disease (NCD) outcome. Your preliminary logistic regression shows that people with low educational attainment are more likely to be hypertensive. Since the literature establishes that NCD outcomes depend causally on physical activity (Lee et al., 2012), you add physical activity to the model to make the analysis more realistic. In the new results, suppose the education coefficient drops by 30%, and you conclude that “physical inactivity explains about one-third of the education gap in hypertension”. Your analysis is methodologically careful; nonetheless, there is a real possibility that the “one-third” figure is misleading. This is not due to data quality or flawed study design, but because logistic regression behaves differently from what many of us assume. Unless you correct for that difference, your mediation results mix real effects with a statistical illusion. This is where the Karlson-Holm-Breen (KHB) method comes into play (Karlson, Holm and Breen, 2012).

Before getting into the technical aspects, let’s explore why mediation analysis matters in the first place. In public health and social science, researchers rarely just want to know whether X affects Y. They want to know how. In the scenario above, for instance, the question becomes “does education reduce hypertension partly because it increases physical activity?” This is what mediation analysis tries to answer. It separates the direct effect of an exposure (X affects Y) from the indirect effect operating through a mediator (X affects Y through M). In linear regression, this decomposition is simple: comparing the coefficient of X with and without the mediator gives the extent of mediation. Logistic regression is more complicated.
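To see why the linear case is simple, here is a minimal sketch in R on simulated data. The variable names, sample size, and effect sizes are illustrative assumptions, not values from any study. It checks that the drop in the coefficient of X after adding M equals the product of the X-to-M and M-to-Y coefficients.

```r
# Simulated data: names and effect sizes are illustrative assumptions
set.seed(1)
n <- 5000
X <- rnorm(n)                       # exposure
M <- 0.5 * X + rnorm(n)             # mediator depends on X
Y <- 0.3 * X + 0.4 * M + rnorm(n)   # continuous outcome

b_total  <- coef(lm(Y ~ X))["X"]    # total effect of X
fit_full <- lm(Y ~ X + M)
b_direct <- coef(fit_full)["X"]     # direct effect of X
b_m      <- coef(fit_full)["M"]     # effect of M on Y, given X
a_x      <- coef(lm(M ~ X))["X"]    # effect of X on M

indirect_diff <- unname(b_total - b_direct)   # difference of coefficients
indirect_prod <- unname(a_x * b_m)            # product of coefficients
all.equal(indirect_diff, indirect_prod)       # the two agree in OLS
```

In ordinary least squares the two quantities coincide as an algebraic identity, which is why comparing coefficients across nested linear models is unproblematic.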

What most researchers do and why it breaks

The standard approach for mediation analysis has the following steps:

Step 1: Estimate Model 1:  logit(Y) ~ X

Step 2: Estimate Model 2:  logit(Y) ~ X + Mediator

Step 3: Estimate the proportion mediated:  Mediation Proportion (PM) = (β₁ − β₂) / β₁, where β₁ and β₂ are the coefficients of X in Model 1 and Model 2

In linear regression, this works perfectly: researchers augment the model with a variable, examine how the coefficient changes, and attribute the change to the added variable. The coefficient of X stays meaningful across models. In logistic regression, however, the coefficient of X changes for two reasons: real mediation and a built-in scaling issue called non-collapsibility. Ignoring the second reason biases the mediation estimate.
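The three steps above can be sketched in R as follows. The data are simulated, and the coefficients used to generate them (0.6 for X, 1.0 for M, 0.5 for the effect of X on M) are assumptions chosen purely for illustration.

```r
# Naive difference-of-coefficients approach on simulated binary data
set.seed(2)
n <- 20000
X <- rnorm(n)                                         # exposure
M <- 0.5 * X + rnorm(n)                               # mediator
Y <- rbinom(n, 1, plogis(-0.5 + 0.6 * X + 1.0 * M))   # binary outcome

b1 <- coef(glm(Y ~ X,     family = binomial()))["X"]  # Step 1: Model 1
b2 <- coef(glm(Y ~ X + M, family = binomial()))["X"]  # Step 2: Model 2
pm_naive <- unname((b1 - b2) / b1)                    # Step 3: naive PM
pm_naive
```

Under these assumed values, the latent-scale total effect is 0.6 + 1.0 × 0.5 = 1.1, so the share operating through M would be 0.5 / 1.1, roughly 45%. The naive figure printed by the sketch need not match this, because Model 1 and Model 2 are not on the same scale.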

The non-collapsibility problem

A statistical measure is collapsible if it remains stable when conditioned on an additional variable, provided that variable is not a confounder. Augmenting a linear regression model with a variable unrelated to X is expected to leave the coefficient of X unchanged. In logistic regression, by contrast, the coefficient of X may change even if the added variable is unrelated to X. This change does not mean the relationship itself has altered; it reflects a mathematical property of odds ratios: they are non-collapsible. As a result, logistic regression coefficients are not directly comparable across nested models, even when fitted to the same data. When researchers observe coefficient changes after adding a mediator, part of the change may simply be a scaling artefact.
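A short simulation can make non-collapsibility concrete. In this sketch (all names and effect sizes are illustrative assumptions), Z is generated independently of X, so it is neither a confounder nor a mediator, yet adding it changes the coefficient of X.

```r
# Non-collapsibility: Z is independent of X but strongly predicts Y
set.seed(3)
n <- 50000
X <- rbinom(n, 1, 0.5)                                # binary exposure
Z <- rnorm(n)                                         # independent of X
Y <- rbinom(n, 1, plogis(-0.5 + 1.0 * X + 2.0 * Z))

b_marginal    <- coef(glm(Y ~ X,     family = binomial()))["X"]
b_conditional <- coef(glm(Y ~ X + Z, family = binomial()))["X"]
c(marginal    = unname(b_marginal),
  conditional = unname(b_conditional),
  cor_XZ      = cor(X, Z))
```

The correlation between X and Z should be close to zero, while the coefficient of X should be noticeably larger in the model that includes Z. None of that difference reflects confounding or mediation.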

The reason: a fixed hidden variance

Logistic regression can be read as a model for an unobserved continuous variable underlying the binary outcome. The residual variance on this latent scale is fixed at π²/3 ≈ 3.29 (the variance of the standard logistic distribution), and this does not change regardless of what variables the researcher adds to the model.

In linear regression, residual variance is estimated freely from the data. Adding a new variable shrinks the residual variance, while the other coefficients remain stable. In logistic regression, the residual variance cannot shrink because it is fixed at π²/3. When a mediator explains part of the latent outcome, the unexplained part becomes smaller, but the model still forces the error variance to equal π²/3, so it rescales the remaining coefficients upward instead. This is why naive mediation in logistic regression misstates indirect effects (typically understating them, because the full-model coefficient is inflated) and can even create artificial suppression effects. Winship and Mare noted this in 1984 (Winship and Mare, 1984).
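The rescaling can be written out with a latent-variable sketch. The notation here is introduced for illustration and is not taken from the original post: y* is the latent outcome, with y = 1 when y* > 0; u and v are errors treated as standard logistic (the reduced-model error is only approximately so); σ_F and σ_R are the latent scale factors of the full and reduced models; and b denotes what the logit model actually estimates.

```latex
% Latent-variable reading of the logit model; Var(u) = Var(v) = \pi^2/3 by assumption
\begin{align*}
\text{Full model:}\quad       & y^* = \alpha_F + \beta_F x + \gamma m + \sigma_F u, & b_F &= \beta_F / \sigma_F \\
\text{Reduced model:}\quad    & y^* = \alpha_R + \beta_R x + \sigma_R v,            & b_R &= \beta_R / \sigma_R \\
\text{Naive difference:}\quad & b_R - b_F = \frac{\beta_R}{\sigma_R} - \frac{\beta_F}{\sigma_F}, & \sigma_R &> \sigma_F \text{ when } \gamma \neq 0
\end{align*}
```

The naive difference therefore mixes the quantity of interest, β_R − β_F, with the change in scale from σ_R to σ_F. A valid comparison needs both coefficients divided by the same scale factor.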

The KHB solution: residualise the mediator

Karlson, Holm, and Breen proposed a solution to this problem in 2012 (Karlson, Holm and Breen, 2012). Instead of adding only the mediator (M) to the logistic model, first residualise it (regress M on the exposure X) and retain the residuals M̃. This residualised variable is, by construction, unrelated to X: all variance in M attributable to X has been partialled out.

Adding M̃ to the reduced model does not take away any of X’s effect: because M̃ is uncorrelated with X, the coefficient on X still represents the total effect. At the same time, M̃ carries the same additional information about Y as M does, so the reduced model (with M̃) and the full model (with M) are reparameterisations of the same model and fit identically. They therefore share the same residual variance and the same error distribution, which makes the comparison of the X coefficients valid.
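The reparameterisation argument can be checked numerically. The sketch below uses simulated data with assumed effect sizes and no covariates. It compares the log-likelihoods of the two models and checks that the gap between the two X coefficients equals the coefficient of M multiplied by the slope from regressing M on X.

```r
# Check that the model with M and the model with M_tilde are the same fit
set.seed(4)
n <- 5000
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- rbinom(n, 1, plogis(-0.5 + 0.6 * X + 1.0 * M))

M_tilde  <- residuals(lm(M ~ X))        # part of M not explained by X
mod_full <- glm(Y ~ X + M,       family = binomial())
mod_red  <- glm(Y ~ X + M_tilde, family = binomial())

# Identical log-likelihood means identical fit, hence a common scale
ll_full <- as.numeric(logLik(mod_full))
ll_red  <- as.numeric(logLik(mod_red))

# Gap between the X coefficients versus (M -> Y) times (X -> M)
theta   <- coef(lm(M ~ X))["X"]
gap     <- unname(coef(mod_red)["X"] - coef(mod_full)["X"])
product <- unname(coef(mod_full)["M"] * theta)

c(ll_full = ll_full, ll_red = ll_red, gap = gap, product = product)
```

Because M̃ is a linear combination of M, X, and a constant, both models span the same set of predictors. This is why the log-likelihoods agree and why the indirect effect on the log-odds scale has the familiar product-of-coefficients form.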

R and Stata code

The core R code would look like this:

# Step 1: Residualise the mediator on the exposure (and covariates)
M_tilde <- residuals(lm(M ~ X + covariates))

# Step 2: Full model with X + M gives the direct effect (NDE)
mod_full <- glm(Y ~ X + M + covariates, family = binomial())

# Step 3: Reduced model with X + M_tilde gives the total effect (scale fixed)
mod_red  <- glm(Y ~ X + M_tilde + covariates, family = binomial())

# KHB decomposition
TE  <- coef(mod_red)["X"]    # Total Effect
NDE <- coef(mod_full)["X"]   # Natural Direct Effect
NIE <- TE - NDE              # Natural Indirect Effect (via M)
PM  <- NIE / TE              # Proportion Mediated

Fortunately, the KHB method is comparatively simple to implement in both Stata and R, which makes it accessible to most applied researchers.

Karlson, Holm, and Breen have provided the ‘khb’ command for Stata, available from the SSC archive. Installation and basic use are as follows:

ssc install khb

khb logit y x || m, disentangle

The disentangle option provides a full decomposition, including the contribution of each mediator when multiple mediators are specified.

In R, the KHB package provides equivalent functionality:

install.packages("KHB")

khb(model.full, model.reduced, mediator = "m")

Both implementations report total, direct, and indirect effects on the log-odds scale, along with standard errors and confidence intervals derived via the delta method or bootstrapping. Because both models are on an identical scale, the additive decomposition (Total Effect = Direct Effect + Indirect Effect) holds exactly on the log-odds scale, which is the principal advantage of the KHB approach over conventional methods.
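For readers who want to see how a bootstrap interval for the indirect effect might be built by hand, here is a minimal percentile-bootstrap sketch in base R. It is not the procedure used inside the Stata or R commands mentioned above. The helper `khb_indirect()`, the simulated data, and the choice of 200 replicates are all illustrative assumptions.

```r
# Percentile bootstrap for the KHB indirect effect (illustrative sketch)
set.seed(5)
n <- 2000
dat <- data.frame(X = rnorm(n))
dat$M <- 0.5 * dat$X + rnorm(n)
dat$Y <- rbinom(n, 1, plogis(-0.5 + 0.6 * dat$X + 1.0 * dat$M))

# Hypothetical helper: indirect effect = total effect - direct effect
khb_indirect <- function(d) {
  d$M_tilde <- residuals(lm(M ~ X, data = d))   # residualise within each sample
  te  <- coef(glm(Y ~ X + M_tilde, family = binomial(), data = d))["X"]
  nde <- coef(glm(Y ~ X + M,       family = binomial(), data = d))["X"]
  unname(te - nde)
}

B <- 200                                         # number of bootstrap replicates
boot_est <- replicate(B, khb_indirect(dat[sample(n, replace = TRUE), ]))
ci <- quantile(boot_est, c(0.025, 0.975))        # percentile interval
c(estimate = khb_indirect(dat), ci)
```

The mediator is residualised again inside every resample, so the uncertainty from that first-stage regression is carried into the interval. In practice a larger number of replicates would usually be preferred.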

Reporting the Decomposition

The KHB paper (Karlson, Holm and Breen, 2012) presents three equivalent forms of the decomposition, each suited to different reporting conventions:

Method | What it shows
Difference | Raw KHB coefficient change (total effect minus direct effect); analogous to the standard logit coefficient difference
Ratio | Total effect relative to direct effect; scale-free
Percentage change | Mediated fraction expressed as Mediation Proportion (PM); easiest to communicate to non-specialists
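As a small worked example of the three forms, the sketch below uses made-up coefficients (a total effect of 1.10 and a direct effect of 0.60 on the log-odds scale). It assumes the ratio is taken as total effect over direct effect and the percentage as the share of the total effect that is mediated.

```r
# Hypothetical KHB coefficients on the log-odds scale (illustrative numbers only)
TE  <- 1.10   # total effect: coefficient on X in the model with M_tilde
NDE <- 0.60   # direct effect: coefficient on X in the model with M

difference <- TE - NDE                 # indirect effect in log-odds units
ratio      <- TE / NDE                 # total effect relative to direct effect
percentage <- 100 * (TE - NDE) / TE    # percent of the total effect mediated

round(c(difference = difference, ratio = ratio, percentage = percentage), 2)
```

With these numbers the difference is 0.50, the ratio is about 1.83, and the percentage is about 45.45. All three summarise the same pair of coefficients, so the choice between them is a matter of reporting convention.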

Assumptions and limitations

Like all methods for causal mediation analysis, KHB rests on identifying assumptions that researchers should be explicit about.

  • No unmeasured confounding of X → Y: the exposure-outcome relationship should not be confounded by unmeasured variables.
  • No unmeasured confounding of M → Y: the mediator-outcome relationship is similarly unconfounded. This is often the more demanding requirement in practice.
  • No X–M interaction: the standard KHB decomposition assumes no interaction between the exposure and the mediator in their effect on the outcome.
  • Correct model specification: as with all regression-based methods, misspecification can bias results.
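The no-interaction assumption is one of the few that can be probed directly in the data. One possible check, sketched here on simulated data with assumed effect sizes, is a likelihood-ratio test comparing the main-effects model with a model that adds an X-by-M product term.

```r
# Likelihood-ratio check for an exposure-mediator interaction (illustrative)
set.seed(6)
n <- 5000
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- rbinom(n, 1, plogis(-0.5 + 0.6 * X + 1.0 * M))   # generated without interaction

mod_main <- glm(Y ~ X + M, family = binomial())       # main effects only
mod_int  <- glm(Y ~ X * M, family = binomial())       # adds the X:M term

lrt <- anova(mod_main, mod_int, test = "Chisq")       # test of the X:M term
lrt
```

A small p-value would suggest that the effect of the mediator differs by exposure level, in which case a single direct/indirect split on the log-odds scale is harder to interpret. A large p-value does not prove the absence of interaction; it only fails to detect one.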

 

It is also important to note that the KHB decomposition produces estimates on the log-odds scale, which may limit direct substantive interpretation. Researchers who need results on the probability scale (e.g. risk differences) should consider the counterfactual causal mediation framework (VanderWeele, 2015), which produces natural direct and indirect effects on the probability scale, typically with bootstrap confidence intervals. In practice, both approaches can be used together: KHB as the primary decomposition within the logistic framework (Karlson, Holm and Breen, 2012), and the counterfactual approach as a sensitivity check on the probability scale (VanderWeele, 2015).

The Takeaway

For nearly three decades following Winship and Mare’s identification of the rescaling problem (Winship and Mare, 1984), mediation estimates from logistic regression models were routinely reported without correction for a statistical artefact that could partially, and in some cases substantially, distort conclusions.

The KHB method does not require abandoning logistic regression. It requires residualising the mediator before adding it to the model, which costs almost nothing computationally and removes the rescaling artefact from the comparison. The method is not a substitute for thoughtful causal reasoning; the identifying assumptions remain demanding, and cross-sectional designs cannot confirm causal direction. But it ensures that the arithmetic of effect decomposition is at least internally consistent, and that the proportion mediated reported in published work reflects genuine mediation rather than a scaling artefact.

References

  1. Lee, I-Min, Eric J. Shiroma, Felipe Lobelo, Pekka Puska, Steven N. Blair, and Peter T. Katzmarzyk; Lancet Physical Activity Series Working Group. 2012. “Effect of physical inactivity on major non-communicable diseases worldwide: an analysis of burden of disease and life expectancy.” Lancet 380(9838): 219-229. doi: 10.1016/S0140-6736(12)61031-9
  2. Karlson, Kristian Bernt, Anders Holm, and Richard Breen. 2012. “Comparing regression coefficients between same-sample nested models using logit and probit: A new method.” Sociological Methodology 42: 286-313.
  3. Winship, Christopher, and Robert D. Mare. 1984. “Regression models with ordinal variables.” American Sociological Review 49(4): 512-525.
  4. VanderWeele, Tyler J. 2015. Explanation in causal inference: Methods for mediation and interaction. New York: Oxford University Press.

Authors: Ibrahim Hasan¹ and S M Abdullah¹,²

  1. ARK Foundation, Dhaka, Bangladesh
  2. University of Dhaka, Dhaka, Bangladesh
