What Is a “Well-Powered” Study?
From the Desk of Dr. Danielle Meadows
The research process I introduced back in October 2024 has many nuances that add to its complexity and duration. I want to take time each month to explore some of those nuances and how they factor into Open Medicine Foundation Canada (OMFCA) research programs, starting at the beginning of the process with “Study Design”.
One important aspect of designing a study is determining the number of participants that will be included. It’s a difficult balance between trying to design a study with the best odds of producing an impactful result and working within the resources available.
At OMFCA, we try to use our resources wisely, focusing on scientifically rigorous research that has the potential to guide future research and patient care as quickly as possible. Conducting well-powered research is a big part of that mission, so I want to explain a little bit about what that means from a science perspective and how it impacted the design of our first clinical trial.
The Heart of the Matter
- A “well-powered” study is one that has enough data to provide confidence in the results presented.
- In research, a power analysis is a calculation used to determine sample sizes needed to consider a study well-powered.
- OMF’s Life Improvement Trial (LIFT) will include enough patients to detect both medium and large effects from the drugs being investigated.
- As a general rule of thumb, studies that have fewer than 15 participants in each group will require follow-up in a larger project to validate the results.
OMF aims to support well-powered research studies designed to:
- understand the root causes of ME/CFS, Long COVID, and other complex chronic diseases,
- detect significant molecular signatures that may help us identify a biomarker or diagnostic test for these diseases, or
- test treatments based on rigorous scientific evidence.
What Does “Well-Powered” Mean?
“Well-powered” is a statistical way of saying that the study has enough data to reliably detect a real effect if one exists, so that any result discovered is unlikely to be due to chance. Typically, this is accomplished by increasing the sample size: the more participants showing the same result, the more likely the result is due to the thing being studied.
You can also consider the term well-powered as a way to describe a set of measurements. Take a clinical trial, for example. The set of measurements would be your primary outcomes (e.g., a set of surveys), and whether they are well-powered or not is impacted by the number of statistical tests being run on them. The more statistical tests done, the lower the power, because the significance threshold for each test usually has to be tightened to account for the multiple comparisons.
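To put rough numbers on that last point, here is a minimal sketch. It assumes a simple two-group comparison of means, a normal approximation to the test, and a Bonferroni correction (dividing the significance level by the number of tests). None of these choices come from a specific OMFCA study; the function names and the 64-per-group example are purely illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Uses a normal approximation and ignores the negligible chance of a
    significant result in the wrong direction.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect_size * sqrt(n_per_group / 2) - z_crit)


def power_after_bonferroni(effect_size, n_per_group, n_tests, alpha=0.05):
    """Power of each individual test once alpha is split across n_tests (Bonferroni)."""
    return approx_power(effect_size, n_per_group, alpha / n_tests)


# Hypothetical example: medium effect (0.5), 64 participants per group.
for n_tests in (1, 5, 10):
    power = power_after_bonferroni(0.5, 64, n_tests)
    print(f"{n_tests:>2} test(s): power is about {power:.2f}")
```

Under these assumptions, power drops from about 0.8 with a single test to roughly 0.5 with ten tests, even though the number of participants has not changed.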
How do researchers know if their study is well-powered?
Typically, during the design phase of the research process, researchers will conduct what’s called a power analysis to help them calculate the sample size needed to see the desired effect. There are a lot of factors that go into a power analysis, but some of the main ones include:
- Effect size: The effect size is often defined as the standardized difference between the means of two groups. In other words, it describes how large a change you expect to see in a study. Effect size values most commonly fall between 0 and 1, although larger values are possible; 0.2 is considered a small effect size, 0.5 medium, and 0.8 large. In general, the sample size you need goes up as the effect size goes down, roughly with the inverse square of the effect size, so halving the expected effect roughly quadruples the number of participants needed. This is particularly true in heterogeneous populations where the standard deviation in measurements is high.
- Statistical analysis model: There are many statistical models that are used to determine if research results are significant. Which model is used depends on things like how many study arms there are and how many outcomes are being analyzed. Power analyses, therefore, have to consider which statistical model will be used in order to determine how many participants are needed in each study arm.
- Significance level: Statistical analysis results are most often reported using p-values. Prior to performing a study, researchers should set a significance level, which is the threshold for a p-value where the results are considered significant. The majority of studies set their significance level at p<0.05, which essentially means that if there were no real effect or relationship, a result at least this extreme would show up by chance less than 5% of the time.
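As a rough illustration of how these factors combine, the sketch below estimates the sample size per group for the simplest possible design: two groups, one outcome, a two-sided test. It uses a standard normal-approximation formula and assumes a target power of 80%, a common convention that is not stated above. The function name is hypothetical. Trials that use more complex models, such as repeated measurements over time, need their own power analysis, and the numbers here will not match theirs.

```python
from math import ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-group comparison.

    Normal-approximation formula: n = 2 * ((z_alpha + z_power) / d) ** 2,
    where d is the standardized effect size. An exact t-test calculation
    gives a result that is typically one participant larger.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = STD_NORMAL.inv_cdf(power)          # desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Small, medium, and large effect sizes as described above.
for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
    print(f"{label:>6} effect ({d}): about {sample_size_per_group(d)} per group")
```

With these assumptions, a large effect needs about 25 participants per group, a medium effect about 63, and a small effect about 393, which shows how quickly the required sample size grows as the expected effect shrinks.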
To make this concept a little more tangible, let’s use the Life Improvement Trial (LIFT) as an example. Because of the complexity of studying the effect of multiple drugs in one trial, the LIFT will use a statistical model that accounts for data with multiple timepoints and subgroups. The LIFT significance level is set at p<0.05. Given these factors, a power analysis determined that 40 participants per study arm (160 participants total) will allow us to detect both large and medium effect sizes, which we hope will correlate with clinically meaningful changes.
How can I tell if a research study is well-powered?
First and foremost, look at the number of participants included in each group in the study, sometimes reported as an “n” – an n of 20 means there are 20 participants in that group. While there are many factors that go into power analyses, as a general rule of thumb, if there are fewer than 15 participants in each group, the study is likely under-powered. Under-powered studies may still provide useful information, but it’s important that the results are validated with larger, more robust studies before they are used to influence clinical action.
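One way to see where a rule of thumb like this comes from is to ask how large an effect a small study could reliably detect. The sketch below does that for a simple two-group comparison, again assuming a normal approximation, a significance level of 0.05, and 80% power. These are illustrative assumptions, not the method behind the rule of thumb itself, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def min_detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized effect a two-group study can detect at the given power.

    Rearranges the normal-approximation sample size formula to solve for
    the effect size instead of the sample size.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


for n in (10, 15, 20, 30):
    print(f"n = {n:>2} per group: detectable effect size of about {min_detectable_effect(n):.2f} or larger")
```

Under these assumptions, a study with 15 participants per group can only reliably detect effect sizes of about 1.0 or more, which is beyond the 0.8 conventionally labelled large. Smaller real effects could easily be missed, which is why such results need validation in larger studies.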