What is a simulation study?

Sometimes we don’t know the best way to analyse our data, or we don’t know the best data to collect when planning a clinical trial. A simulation study is a computer-based experiment that aims to answer questions like:

  • What data should we collect?
  • How many people are needed in each arm of this trial?
  • Which method of analysis is most accurate?

In a simulation study, we use random numbers to create many data sets like the data we would expect to collect. We then analyse each data set and look at how good the results are, using suitable performance measures. Because we simulated the data, we know the correct answer, which helps us to evaluate the results.
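The steps just described can be sketched in a few lines of Python. This is only an illustrative sketch, not a method from our tutorials: the normal data, the true mean of 10, the sample size of 50 and the names `simulate_once` and `run_simulation` are all assumptions chosen for the example. It simulates many data sets, analyses each one, and then checks the results against the known truth.

```python
import random
import statistics


def simulate_once(rng, n, true_mean, sd):
    """Generate one data set and analyse it: return the estimate and a 95% CI."""
    data = [rng.gauss(true_mean, sd) for _ in range(n)]
    estimate = statistics.fmean(data)
    se = statistics.stdev(data) / n ** 0.5
    # Normal-approximation interval (an assumption; adequate for n = 50)
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se


def run_simulation(n_sim=2000, n=50, true_mean=10.0, sd=2.0, seed=2024):
    """Repeat the experiment n_sim times and summarise performance."""
    rng = random.Random(seed)  # fixed seed so the study can be reproduced
    results = [simulate_once(rng, n, true_mean, sd) for _ in range(n_sim)]
    estimates = [r[0] for r in results]
    # Bias: average estimate minus the truth, which we know because we set it
    bias = statistics.fmean(estimates) - true_mean
    # Coverage: proportion of intervals that contain the truth
    coverage = sum(low <= true_mean <= high for _, low, high in results) / n_sim
    return {"bias": bias, "coverage": coverage}


summary = run_simulation()
print(summary)
```

Because the true mean is set by us, bias and coverage can be computed directly, which is not possible with real data.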

Who uses simulation studies?

Researchers (especially statisticians) use simulation studies to evaluate or compare statistical methods. This is useful both in methods development and when deciding how to analyse a particular data set.

Clinical trial statisticians use simulation studies to compare different ways to design a clinical trial – in particular, designs with different numbers of participants. They can then decide which trial design offers the best chance of getting a reliable answer.
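As a hedged illustration of comparing designs by sample size, the sketch below estimates the power of a simple two-arm trial for several numbers of participants per arm. Everything specific here is an assumption for the example: a continuous, normally distributed outcome, a true treatment effect of 0.5 standard deviations, a two-sided z-test at the 5% level, and the hypothetical function names `trial_is_significant` and `estimate_power`.

```python
import random
import statistics
from statistics import NormalDist


def trial_is_significant(rng, n_per_arm, effect, sd, alpha=0.05):
    """Simulate one two-arm trial and test the difference in means."""
    control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
    treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    se = ((statistics.variance(control) + statistics.variance(treated)) / n_per_arm) ** 0.5
    # Two-sided z-test (an assumption; a t-test would be more usual for small trials)
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return p_value < alpha


def estimate_power(n_per_arm, effect=0.5, sd=1.0, n_sim=2000, seed=1):
    """Power = proportion of simulated trials that detect the true effect."""
    rng = random.Random(seed)
    hits = sum(trial_is_significant(rng, n_per_arm, effect, sd) for _ in range(n_sim))
    return hits / n_sim


# Compare candidate designs that differ only in the number of participants per arm
power_by_size = {n: estimate_power(n) for n in (20, 40, 64, 100)}
for n, power in power_by_size.items():
    print(f"n per arm = {n:3d}: estimated power = {power:.2f}")
```

The same structure extends to more realistic designs by changing how the data are generated and analysed inside `trial_is_significant`.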

Our work on simulation studies

We are long-term users of simulation studies as a tool in developing statistical methods. We developed the ADEMP framework:

  • A: Aims – what are you trying to find out?
  • D: Data generating mechanisms – how will the simulated data sets be generated?
  • E: Estimands / target of estimation – what is the analysis aiming to find out?
  • M: Methods of analysis – how will each simulated data set be analysed?
  • P: Performance measures – how will you evaluate the methods?
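To show how the five ADEMP elements can map onto code, here is a minimal sketch of a small simulation study. The scientific question (sample mean versus sample median as estimators of the centre of a symmetric distribution), the two data generating mechanisms, and all function and variable names are assumptions made for illustration; they are not taken from our tutorials or software.

```python
import random
import statistics

# A: Aim - compare the sample mean and sample median as estimators of the
#    centre of a symmetric distribution, with and without outliers.

# D: Data generating mechanisms
def dgm_normal(rng, n):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def dgm_contaminated(rng, n):
    # 10% of observations come from a much wider normal distribution
    return [rng.gauss(0.0, 5.0 if rng.random() < 0.1 else 1.0) for _ in range(n)]

DGMS = {"normal": dgm_normal, "contaminated": dgm_contaminated}

# E: Estimand - the centre of the distribution, which is 0 under both mechanisms
TRUE_VALUE = 0.0

# M: Methods of analysis
METHODS = {"mean": statistics.fmean, "median": statistics.median}

# P: Performance measures - bias (with its Monte Carlo SE) and empirical SE
def performance(estimates, true_value):
    emp_se = statistics.stdev(estimates)
    return {
        "bias": statistics.fmean(estimates) - true_value,
        "bias_mcse": emp_se / len(estimates) ** 0.5,
        "emp_se": emp_se,
    }

def run_study(n_sim=1000, n=30, seed=7):
    rng = random.Random(seed)
    results = {}
    for dgm_name, dgm in DGMS.items():
        estimates = {m: [] for m in METHODS}
        for _ in range(n_sim):
            data = dgm(rng, n)  # every method analyses the same data set
            for m, method in METHODS.items():
                estimates[m].append(method(data))
        for m in METHODS:
            results[(dgm_name, m)] = performance(estimates[m], TRUE_VALUE)
    return results

results = run_study()
for key, perf in results.items():
    print(key, {k: round(v, 3) for k, v in perf.items()})
```

Applying every method to the same simulated data sets keeps the comparison fair, and reporting a Monte Carlo standard error shows how much of any difference could be due to the finite number of repetitions.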

We ran our first short course in Leicester in 2015 and published a tutorial on how to use simulation studies to evaluate statistical methods in 2019. By 2023, the course had run 19 times.

In 2023, we published a second tutorial on how to check a simulation study.

We wrote a Stata command, simsum, for the analysis of simulation studies. The R package rsimsum (available from CRAN) is an expanded version of this. We are currently producing a suite of Stata programs for further analysis and graphical display of simulation studies. The current version is available on GitHub.
