Engineers routinely ask how much failure is tolerable before a system stops being viable. Call that threshold the failure ceiling or the maximum tolerable failure rate (MTFR). The MTFR is the failure rate beyond which the consequences of failure outweigh the benefits of operation — and at which management should intervene.
But payers do not have an equivalent statistic for treatment coverage. They have the number needed to treat, which describes how efficiently a treatment produces a desired outcome in a population. They have cost-effectiveness analysis, which asks whether the clinical value produced is worth the cost. Neither answers a different and increasingly important question: How much failure can a treatment absorb before the population stops benefiting overall?
Defining the statistic
For a defined cohort of patients initiating treatment over a defined time horizon, each patient experiences either clinical success — defined by an outcome the treatment is intended to produce — or failed treatment. Failures take several forms — such as discontinuation because of intolerance, discontinuation because of inadequate response, initial response followed by loss of effect after discontinuation, or discontinuation because the treatment worked too well. For example, a hair-loss drug can grow hair so well that some patients grow unwanted hair on the face and body too, and stop.
For a cohort of N patients, a successful treatment produces some clinical benefit, measured against no treatment. A failed treatment produces some clinical loss — the foregone benefit, minus any partial benefit accrued before failure, plus any iatrogenic harm. Benefits and losses vary from patient to patient. Across the cohort, total benefit is the sum across all successful treatments; total loss is the sum across all failed treatments. The cohort produces net clinical benefit when total benefit exceeds total loss.
Let B and L denote the mean benefit per success and the mean loss per failure in the cohort, expressed in whatever units the treatment is intended to produce, such as quality-adjusted life years (QALYs), life-years, maintained response, or prevented events. Let f denote the failure rate. The cohort produces net clinical benefit only when:
(1 − f) × B > f × L
Setting the two sides equal at break-even and solving for f gives the maximum tolerable failure rate:
MTFR = B / (B + L)
This is the share of the cohort that can fail before the value generated by successful treatments is offset by the value lost through failed treatments. Below that rate, the treatment is producing enough durable success to justify the observed failures. Above it, the treatment may still help some patients, but is no longer producing enough durable success for the cohort as a whole. If a treatment is operating above its MTFR, the payer faces a performance problem that price alone cannot solve: Too many treatments are failing to achieve the intended durable outcome.
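The break-even logic can be checked numerically. The following is a minimal sketch, not a validated pharmacoeconomic tool: the function names and the values B = 3 and L = 1 are hypothetical, chosen only to show that net benefit per patient changes sign at f = B / (B + L).

```python
def mtfr(mean_benefit: float, mean_loss: float) -> float:
    """Maximum tolerable failure rate, B / (B + L)."""
    if mean_benefit <= 0 or mean_loss < 0:
        raise ValueError("mean_benefit must be positive and mean_loss non-negative")
    return mean_benefit / (mean_benefit + mean_loss)


def net_benefit_per_patient(failure_rate: float, mean_benefit: float, mean_loss: float) -> float:
    """(1 - f) * B - f * L: positive below the MTFR, negative above it."""
    return (1 - failure_rate) * mean_benefit - failure_rate * mean_loss


# Hypothetical values in arbitrary benefit units (not taken from any study).
B, L = 3.0, 1.0
ceiling = mtfr(B, L)

for f in (0.60, ceiling, 0.90):
    net = net_benefit_per_patient(f, B, L)
    print(f"failure rate {f:.2f}: net benefit per patient {net:+.2f}")
```

With these illustrative inputs the ceiling is 75%: a 60% failure rate leaves the cohort with positive net benefit, and a 90% failure rate leaves it negative.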
Why the B-to-L ratio matters
If B equals L — when the typical success and the typical failure represent equal magnitudes of clinical impact in opposite directions — the MTFR is 50%. Real treatments, however, rarely produce that equality. The size of the gap between what a success produces and what a failure costs is what makes the MTFR specific to that treatment. Partial credit for failures raises the ceiling: When patients receive some benefit before the treatment fails, mean L shrinks and the treatment can tolerate more failures. Iatrogenic harm lowers the ceiling: When failures include serious adverse events, mean L grows and the treatment can tolerate fewer failures. Lost-time costs work the same way as iatrogenic harm, lowering the ceiling, and matter more in oncology and progressive disease than in obesity. The factors compete; whether the ceiling ends up above or below 50% depends on their relative sizes for a given treatment.
Two treatments with the same B and the same observed persistence rate can have very different MTFRs if their failures carry different magnitudes of foregone benefit, partial credit, and harm. The MTFR for any treatment is context-specific — there is no universal value, and the comparison that matters is between a treatment’s MTFR and its observed failure rate, not between MTFRs across treatments.
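To make the comparison concrete, here is a small sketch of two hypothetical treatments that share the same B and the same observed failure rate but differ in what a failure costs. The class name, field names, and all numbers are illustrative assumptions; the loss formula follows the definition given earlier (foregone benefit, minus partial benefit, plus iatrogenic harm).

```python
from dataclasses import dataclass


@dataclass
class FailureProfile:
    """Components of the mean loss per failure (hypothetical structure)."""
    foregone_benefit: float  # benefit a success would have produced
    partial_benefit: float   # benefit accrued before failure (partial credit)
    iatrogenic_harm: float   # harm caused by the treatment itself

    def mean_loss(self) -> float:
        return self.foregone_benefit - self.partial_benefit + self.iatrogenic_harm


def mtfr(mean_benefit: float, mean_loss: float) -> float:
    return mean_benefit / (mean_benefit + mean_loss)


B = 1.0                  # same mean benefit per success for both treatments
observed_failure = 0.50  # same observed failure rate for both treatments

# Illustrative numbers only: A gives partial credit, B carries more harm.
profiles = {
    "treatment A": FailureProfile(foregone_benefit=1.0, partial_benefit=0.4, iatrogenic_harm=0.1),
    "treatment B": FailureProfile(foregone_benefit=1.0, partial_benefit=0.0, iatrogenic_harm=0.5),
}

for name, profile in profiles.items():
    ceiling = mtfr(B, profile.mean_loss())
    status = "below" if observed_failure < ceiling else "at or above"
    print(f"{name}: L={profile.mean_loss():.2f}, MTFR={ceiling:.0%}, observed failure is {status} the ceiling")
```

Under these assumptions treatment A has a ceiling of roughly 59% and treatment B one of 40%, so the same 50% failure rate is tolerable for the first and not for the second.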
Three operational challenges shape any application of the framework. First, the choice of units for B and L (QALYs, life-years, maintained response, prevented events) is consequential and likely to be contested, since different units produce different MTFRs for the same treatment. Second, the threshold for defining success within those units (what counts as a clinically meaningful response, and over what time horizon) has the same effect: Tighter definitions yield lower success rates and different MTFRs. Third, the data needed to compute observed failure rates at the relevant horizon may not exist. Extrapolation from shorter follow-up is often necessary and requires explicit assumptions about persistence decay (how rapidly patients stop therapy) and rebound (how often outcomes deteriorate after they stop). Beyond these three, computing B and L requires averaging across individual benefits and losses, which vary from patient to patient. Means are the convention, but skewed distributions of benefit or loss can make the mean misrepresent the typical patient. None of these problems is unique to the MTFR; all are present in cost-effectiveness analysis as well. The framework does not resolve them, but it does require that they be addressed explicitly rather than buried in model assumptions.
Applying the statistic to GLP-1s
For GLP-1s in obesity, the most practical unit is maintained weight loss — the first of the three challenges. A different choice, such as QALYs derived from cardiometabolic risk reduction, would produce a different MTFR. Success is defined as at least 5% body-weight reduction from baseline maintained at 48 months post-initiation. The 5% threshold and 48-month timepoint together resolve the second challenge — how success is defined within the chosen unit. The 5% threshold is a widely used marker of clinically meaningful weight loss and appears throughout obesity-drug development, although more recent obesity-drug trials often use mean percentage weight change as the primary endpoint and at least 5% weight loss as a key categorical endpoint. The 48-month horizon is illustrative. A payer’s exposure to the coverage decision depends on member churn, and the right horizon will differ across payers and member populations. Payers need to know not only whether patients lose weight during treatment, but whether coverage policies produce a durable benefit over a multiyear period.
Under this definition, a failed treatment is one that does not produce and maintain at least 5% weight loss at 48 months — whether through intolerance, inadequate response, too much weight loss, or weight regain after discontinuation. No single study tracks all these failure modes through 48 months in a real-world covered population, which is why the calculation here is illustrative — and why data availability is the third operational challenge.
A simple way to approximate the durable success rate is:
Durable success rate = 48-month persistence × probability of maintaining at least 5% weight loss among persistent users
If 48-month persistence is 25% to 40% (a scenario range extrapolated from a recent analysis reporting one-year persistence of less than two thirds, assuming continued attrition through 48 months), and 80% to 90% of persistent users maintain at least 5% weight loss, then the durable success rate would be roughly 20% to 36%. The implied failure rate is therefore roughly 64% to 80%. Persistence may be improving: The same analysis found that one-year persistence among commercially insured adults without diabetes nearly doubled, from 33% among 2021 initiators to 61% among early-2024 initiators, as supply shortages resolved. But even at 45% to 50% persistence through 48 months — yielding 36% to 45% durable success — the implied failure rate would only fall to roughly 55% to 64%.
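The arithmetic behind these ranges can be reproduced in a few lines. This is a sketch of the approximation stated above, using the scenario inputs from the text; the function names are hypothetical, and pairing the low ends together and the high ends together is an assumption carried over from how the ranges are quoted. The approximation also treats patients who do not persist as failures.

```python
def durable_success_rate(persistence: float, maintain_given_persistent: float) -> float:
    """48-month persistence times the share of persistent users keeping >= 5% loss."""
    return persistence * maintain_given_persistent


def failure_rate_range(persistence_range, maintain_range):
    """Return (lowest, highest) implied failure rate for paired low/high inputs."""
    low_success = durable_success_rate(persistence_range[0], maintain_range[0])
    high_success = durable_success_rate(persistence_range[1], maintain_range[1])
    # The highest success rate implies the lowest failure rate, and vice versa.
    return (1 - high_success, 1 - low_success)


maintain = (0.80, 0.90)  # share of persistent users maintaining >= 5% loss

base = failure_rate_range((0.25, 0.40), maintain)        # scenario range from the text
optimistic = failure_rate_range((0.45, 0.50), maintain)  # improved-persistence scenario

print(f"base scenario failure rate: {base[0]:.0%} to {base[1]:.0%}")
print(f"optimistic scenario failure rate: {optimistic[0]:.0%} to {optimistic[1]:.0%}")
```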
For GLP-1s, the MTFR is likely above the 50% baseline because failed treatments may still produce temporary benefit before discontinuation or regain (partial credit). But the ceiling is not unlimited: Gastrointestinal side effects, early discontinuation, weight regain, and rare-but-serious adverse events reduce the value of failed treatments. A plausible MTFR might be in the 55% to 65% range, depending on how the payer values temporary benefit and how much discontinuation is driven by clinical versus coverage factors.
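One way to see what a 55% to 65% ceiling assumes is to invert the formula: since MTFR = B / (B + L), the implied ratio of mean loss to mean benefit is L / B = 1 / MTFR − 1. The sketch below only rearranges the formula; the function name is hypothetical, and the output is not an estimate of actual GLP-1 losses.

```python
def implied_loss_to_benefit_ratio(mtfr: float) -> float:
    """Invert MTFR = B / (B + L) to recover L / B."""
    if not 0 < mtfr <= 1:
        raise ValueError("mtfr must be in the interval (0, 1]")
    return 1 / mtfr - 1


for ceiling in (0.50, 0.55, 0.65):
    ratio = implied_loss_to_benefit_ratio(ceiling)
    print(f"MTFR {ceiling:.0%} implies mean loss per failure = {ratio:.2f} x mean benefit per success")
```

Read this way, a ceiling in the 55% to 65% range corresponds to a typical failure costing roughly 0.54 to 0.82 times what a typical success delivers, which is the partial-credit assumption expressed as a ratio.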
Under these assumptions, the observed failure rate for GLP-1 obesity treatment plausibly overlaps with, and may exceed, the MTFR. This is a warning signal rather than a precise verdict. The estimate depends heavily on persistence, rebound after discontinuation, and the value assigned to temporary weight loss.
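Because both the observed failure rate and the MTFR are ranges rather than point values, the comparison is a range comparison. The following sketch uses the illustrative ranges quoted above; the function name and the three labels are assumptions introduced for this example, not an established classification.

```python
def compare_to_ceiling(observed_range, mtfr_range) -> str:
    """Classify an observed failure-rate range against an MTFR range."""
    obs_low, obs_high = observed_range
    mtfr_low, mtfr_high = mtfr_range
    if obs_high < mtfr_low:
        return "below ceiling"
    if obs_low > mtfr_high:
        return "above ceiling"
    return "overlaps ceiling"


mtfr_range = (0.55, 0.65)  # plausible MTFR range from the text

scenarios = {
    "base (25% to 40% persistence)": (0.64, 0.80),
    "optimistic (45% to 50% persistence)": (0.55, 0.64),
}

for name, observed in scenarios.items():
    print(f"{name}: {compare_to_ceiling(observed, mtfr_range)}")
```

Both scenarios overlap the ceiling, and most of the base range sits above it, which is why the result reads as a warning signal rather than a verdict.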
What not to count
Any threshold statistic invites misuse. Some failures are driven by the treatment’s clinical properties; others are driven by the coverage decision itself — prior-authorization burden, formulary changes, high cost sharing. An MTFR computed against raw failure rates would let a payer point to a number partly of its own making. The defensible version separates clinical failures from coverage-driven failures. Both may reduce real-world performance, but they should not be treated as the same kind of failure. A payer should not be able to create discontinuation through coverage decisions and then cite that discontinuation as evidence that the drug itself fails.
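A payer's reporting could keep the two kinds of failure apart from the start. The sketch below assumes each patient record carries a single outcome label; the labels, the grouping into clinical and coverage-driven reasons, and the cohort counts are all hypothetical. It reports the rates separately and does not attempt to estimate how many coverage-driven discontinuations would have failed clinically anyway.

```python
from collections import Counter

# Hypothetical outcome labels; a real classification would need clinical review.
CLINICAL_REASONS = {"intolerance", "inadequate_response", "regain_after_stopping"}
COVERAGE_REASONS = {"prior_authorization", "formulary_change", "cost_sharing"}


def failure_breakdown(outcomes):
    """Return raw, clinical, and coverage-driven failure rates for a cohort."""
    if not outcomes:
        raise ValueError("cohort is empty")
    counts = Counter(outcomes)
    unknown = set(counts) - CLINICAL_REASONS - COVERAGE_REASONS - {"success"}
    if unknown:
        raise ValueError(f"unrecognized outcome labels: {sorted(unknown)}")
    n = len(outcomes)
    clinical = sum(counts[reason] for reason in CLINICAL_REASONS)
    coverage = sum(counts[reason] for reason in COVERAGE_REASONS)
    return {
        "raw": (clinical + coverage) / n,
        "clinical": clinical / n,
        "coverage_driven": coverage / n,
    }


# Illustrative cohort of 100 patients (made-up counts).
cohort = (
    ["success"] * 30
    + ["intolerance"] * 25
    + ["inadequate_response"] * 15
    + ["regain_after_stopping"] * 10
    + ["prior_authorization"] * 8
    + ["formulary_change"] * 5
    + ["cost_sharing"] * 7
)

rates = failure_breakdown(cohort)
for kind, rate in rates.items():
    print(f"{kind} failure rate: {rate:.0%}")
```

In this made-up cohort the raw failure rate is 70%, of which 50 points are clinical and 20 points trace to coverage decisions; only the clinical share speaks to the drug's own performance.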
Antecedents and implications of adopting the MTFR
The methodology has a natural home. ICER has done this kind of work before — building the cost-effectiveness ratio and the willingness-to-pay threshold into standard practice, and more recently developing the equal value of life years gained (evLYG) metric to address criticism that the QALY undervalues life with disability. The MTFR is a candidate for similar treatment: a population-level statistic that complements cost-effectiveness analysis, addresses a gap the existing pharmacoeconomic toolkit does not, and could be standardized through inclusion in ICER’s Value Assessment Framework. ICER could also help to establish which units are appropriate for which therapeutic areas — as it has done with other methodology guidance. If ICER takes it up, the methodology gains the institutional weight that adoption requires.
For manufacturers, an MTFR-aware payer community creates pressure in both directions. Before launch, the knowledge that real-world performance will eventually be measured against a tolerable failure rate changes what manufacturers invest in — patient-selection support, adherence and persistence services, provider and patient education. After launch, once real-world failure rates emerge and the gap is visible, the pressure becomes direct: Manufacturers either expand the investments that close the gap, take on value-based contracts that share risk on observed performance, or watch coverage tighten around them.
Engineers know how much failure their systems can absorb. Payers should know too.

