Kaplan–Meier (KM) plots are the main way of depicting time-to-event data. They appear in a large proportion of papers reporting clinical trial results. They are so ubiquitous that they are sometimes simply termed “survival curves”. It is increasingly apparent that many of the people who need to read them could understand them better. The KMunicate format takes two steps forward in improving the information presented in KM graphs.
The issue with time-to-event data arises from censoring. We want to measure the time to some event, but some participants do not experience the event during the follow-up period. We only know that they have not had the event for a certain amount of time. These participants are “censored” as being event-free at the end of that known amount of time. More information about these censored participants should become known in the future. The earlier timepoints on any time-to-event graph will be based on more participants still at risk of the event than the later timepoints: the later timepoints will have fewer participants because they have either been censored before reaching that time or have previously had the event.
The first problem is showing which patients are contributing data across time. KM graphs may be presented with a “risk table” underneath, which usually shows the number of patients who have reached a given timepoint without recording the event of interest. However, it is not readily apparent to many readers how many people have been censored before a given time and how many events have happened.
The solution is an extended risk table. This shows the status of participants at each timepoint: every participant is either ‘at risk’ (reached the timepoint without an event), ‘event’ (the event happened before the timepoint) or ‘censored’ (censored by the timepoint). These three numbers add up to the total number of participants. It is simple and clear, and uses very little extra space.
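To make the three-way split concrete, here is a minimal sketch in plain Python. It is not the KMunicate package's implementation: the function name `extended_risk_table`, the toy data and the boundary convention (a participant whose follow-up ends exactly at a timepoint still counts as at risk at that timepoint) are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def extended_risk_table(
    times: Sequence[float],
    events: Sequence[bool],
    timepoints: Sequence[float],
) -> List[Dict[str, float]]:
    """Count at-risk, event and censored participants at each timepoint.

    times[i]  : follow-up time of participant i
    events[i] : True if follow-up ended with the event, False if censored
    Assumed convention: follow-up ending exactly at a timepoint counts as
    still at risk at that timepoint.
    """
    rows = []
    for t in timepoints:
        at_risk = sum(1 for ft in times if ft >= t)
        had_event = sum(1 for ft, e in zip(times, events) if ft < t and e)
        censored = sum(1 for ft, e in zip(times, events) if ft < t and not e)
        rows.append(
            {"time": t, "at_risk": at_risk, "events": had_event, "censored": censored}
        )
    return rows


# Toy data for ten hypothetical participants (not from any real trial).
example_times = [2, 3, 3, 5, 6, 8, 9, 12, 12, 12]
example_events = [True, False, True, True, False, True, False, False, False, False]

for row in extended_risk_table(example_times, example_events, [0, 4, 8, 12]):
    # The three counts always add up to the total number of participants.
    assert row["at_risk"] + row["events"] + row["censored"] == len(example_times)
    print(row)
```

Because every participant falls into exactly one of the three categories at every timepoint, each row sums to the total, which is what lets a reader check the table at a glance.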
The second problem is that there is more uncertainty about the right-hand side of a graph than the left, but most readers don’t recognise this. With fewer contributing participants, the steps in the curves become larger and more distracting! The uncertainty needs to be made apparent, particularly because survival at later times may be of more interest than at earlier times.
The solution is to show confidence intervals around the lines. For any given proportion of patients event-free, these would be narrower where there are more contributing participants and wider where there are fewer contributing participants. This width readily reminds the reader about the uncertainty of the estimates over time. Many stats packages have been able to do this for ages, but the results looked too messy to be production-ready. Most packages can now do gentle, transparent shading to show the uncertainty. This means it is the right time to draw on this function!
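The way the interval widens as participants drop out can be seen in a short sketch. This one uses the Kaplan–Meier product-limit estimate with Greenwood's variance formula and a plain (untransformed) normal-approximation interval. That choice is an assumption made for illustration: statistical packages often default to log or log-log transformed intervals, and the function name `kaplan_meier_with_ci` and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def kaplan_meier_with_ci(
    times: Sequence[float],
    events: Sequence[bool],
    z: float = 1.96,
) -> List[Dict[str, float]]:
    """Kaplan-Meier estimate with Greenwood standard errors.

    Returns one row per distinct event time. The interval is the plain
    S(t) +/- z * SE, clipped to [0, 1] (an illustrative choice).
    """
    n_at_risk = len(times)
    survival = 1.0
    greenwood_sum = 0.0
    rows = []
    for t in sorted(set(times)):
        d = sum(1 for ft, e in zip(times, events) if ft == t and e)      # events at t
        c = sum(1 for ft, e in zip(times, events) if ft == t and not e)  # censored at t
        if d > 0:
            survival *= 1 - d / n_at_risk
            if n_at_risk > d:  # term is undefined once everyone at risk has the event
                greenwood_sum += d / (n_at_risk * (n_at_risk - d))
            se = survival * math.sqrt(greenwood_sum)
            rows.append(
                {
                    "time": t,
                    "survival": survival,
                    "se": se,
                    "lower": max(0.0, survival - z * se),
                    "upper": min(1.0, survival + z * se),
                }
            )
        # Events and censorings at t both leave the risk set afterwards.
        n_at_risk -= d + c
    return rows


# Toy data for ten hypothetical participants (not from any real trial).
km_times = [2, 3, 3, 5, 6, 8, 9, 12, 12, 12]
km_events = [True, False, True, True, False, True, False, False, False, False]

for row in kaplan_meier_with_ci(km_times, km_events):
    print(row)
```

In this toy example the standard error grows at each successive event time as the risk set shrinks, which is the widening band that the shading is meant to convey.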
The KMunicate format brings together the extended risk table with the gentle shading of confidence intervals. This format supports the reader in understanding the graph. The KMunicate format should be the new standard for presenting survival curves.
We developed a number of ways in which this extra information could be presented. We explored them by collecting the thoughts of nearly 1200 researchers (including statisticians and clinicians) and journal editors. The KMunicate approach is the preferred approach which addresses both issues.
You can read more about the methods in the BMJ Open paper.
The Lancet used this format in reporting time-to-event data for the RADICALS-RT trial (Parker 2020).
It is recommended for use by the CHAMP checklist for assessing medical papers.
The approach was used by the COVID-19 Working Group in reporting the increased risk of death from notable COVID-19 variants in an early 2021 pre-print.
This is easiest if you use R. The KMunicate package is available from CRAN here or on GitHub here.
As a prize for putting this together, Alessandro Gasparini won the beautiful hand-carved spoon on the right.
There is no official version yet in Stata and someone may wish to develop this. In the meantime, you can easily build your own by adapting from some scripts demonstrating how to do it ‘by hand’.
If you use SAS, there is a script, though it is inevitably fairly long.
The Python library lifelines also allows for KMunicate-style tables.
The extended risk table takes up a little more space than the standard risk table. This could be challenging for journals with strict limitations on space or for use in composite diagrams, although with online publishing there is no need for journals to feel or impose this pressure. Wide confidence intervals will also overlap, so opacity should be reduced as the number of arms increases. This approach may work best with 4 groups or fewer.
We would be delighted if you tell us when you submit a manuscript that uses KM graphs in this format, whether the paper has been published and, if so, where, and what reaction you had from reviewers and journals. We may use this as part of a list of examples. You can email us here: firstname.lastname@example.org
It would also be helpful if you cite the KMunicate paper. In the Methods section, you might put:
'KM graphs have been presented using the approach exemplified by the KMunicate project. Morris TP, Jarvis CI, Cragg W, et al. Proposals on Kaplan–Meier plots in medical research and a survey of stakeholder views: KMunicate. BMJ Open 2019;9:e030215. doi: 10.1136/bmjopen-2019-030215'
Getting improved approaches to be widely taken up is not straightforward. The ability to draw diagrams in this way is quite new.
Are you a journal editor? Can you convince your editorial board to make this standard? Would you like us to present to you? Contact us.
Are you a guideline developer? Can you make this part of your guideline for good reporting?
Are you a researcher? Can you submit your paper in this format and make sure that this is in the “accepted” version that you upload to PubMed or PubMedCentral?
Feel like writing to someone? Why not drop a line to ICMJE or the editor of your favourite journal saying why this should be the new standard?
And follow along for examples on Twitter.
No. Most will show a basic risk table with the numbers still at risk at specific times. Some will also show numbers in brackets between these timepoints. In most journals, the number in brackets is the number of participants having events between timepoints. However, in some journals the number in brackets is the number of participants censored between the timepoints. Some journals align the values in a way that makes it difficult to tell what is by a timepoint, what is at a timepoint and what is between timepoints. This is likely to leave uncertain readers more confused. A single standard that shows all of the information would resolve this. KMunicate does this.
The KMunicate team is now spread around the world. We do have a number of thoughts and will re-visit them after KMunicate has been widely implemented.