Consider study population, sample size and trial statistics

Study Population
The target population consists of all members of a defined group of people that has been selected because of its relevance to a research question. For example, to study maternal healthcare, the target population would likely be pregnant women, or pregnant women with the specific issue that is being addressed by the research question. The study population is the subset of this group who would be eligible to take part in the study because they meet the entry criteria and live in the area where the study is taking place (or are likely to present at the health facility being used as a research site).

The study population has to be clearly defined according to specific characteristics such as age, sex or geographical accessibility to health services. Also, the way the study population is defined depends on the problem that needs to be investigated and the objectives of the study.

These are important criteria to consider at the protocol planning stage. How many participants does your study need to answer your question? (This should by now have been determined by the statistical considerations.) Where and how can you recruit? Are the numbers, or the rarity of presentation, such that you might need more than one research site in order to complete the study in a realistic time frame?
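
As a rough illustration of the multi-site question above, the sketch below estimates how many sites would be needed to reach a recruitment target in the time available. The function name and all the figures (eligible patients per month, consent rate, recruitment window) are hypothetical assumptions chosen for illustration, not values from any real study.

```python
import math

def sites_needed(target_n, eligible_per_site_per_month, consent_rate, months_available):
    """Estimate the number of sites needed to enrol target_n participants in time.

    Assumes (for illustration only) that every site sees the same number of
    eligible patients each month and that a fixed proportion of them consent.
    """
    if not 0 < consent_rate <= 1:
        raise ValueError("consent_rate must be in (0, 1]")
    enrolled_per_site = eligible_per_site_per_month * consent_rate * months_available
    return math.ceil(target_n / enrolled_per_site)

# Hypothetical figures: 300 participants needed, 10 eligible patients per site
# per month, half of whom consent, and an 18-month recruitment window.
print(sites_needed(300, 10, 0.5, 18))
```

In practice recruitment rates vary between sites and over time, so an estimate like this is only a starting point for planning.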

The patient population can make or break a clinical study. Firstly, you need the right set of participants to answer your question, and so the inclusion and exclusion criteria must be right. If your criteria are too broad, you risk not answering the primary objective because of confounding factors. Equally, criteria which are too narrow will mean that the population is so limited that the data might be irrelevant to the real-life clinical setting. Secondly, you need to recruit in a timely manner to ensure you can answer your question. Lastly, there is the question of retention: losing participants during the follow-up visits is one of the most detrimental problems in conducting clinical research studies. You can improve the chances of retaining patients through your selection criteria; for example, this might involve including subjects only if they live within an accessible distance of the study site.

Sample bias in study protocols
In selecting the patient sample, several biases can be introduced, for example through the exclusion of patients, the use of a retrospective sample versus a prospectively collected sample, or the way patients are enrolled. The study population therefore needs to be thoroughly described, so that the assumptions made about the patient pool are transparent.

The following factors need to be considered in reducing the chance of bias in selecting the patient population to be studied:

1. Exclusion of patients
The inclusion and exclusion criteria define the characteristics that make study participants representative of the eventual target population, whom the research is intended to benefit. Specific qualities such as age, duration and severity of disease, and previous treatment and its effects all describe the potential patient population. When analysing and reporting the results, it will be important to describe who was studied and also to describe the patients who were excluded or who declined to participate in the study, because they may differ from the patients actually studied. Some exclusions are random: for example, a patient whose CT scan becomes corrupted and therefore unusable, or a patient who died for reasons unrelated to the study drug, such as a car accident. Other exclusions are not random, and might introduce bias. Your study documentation needs to capture this information; a useful tool for this is a screening log in which the reasons for enrolment or exclusion are recorded.
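
A screening log can be as simple as one record per patient screened. The sketch below shows one possible shape for such a log and a summary of the reasons for non-enrolment; the field names, identifiers and reasons are hypothetical and would need to follow the study's own documentation requirements.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScreeningEntry:
    screening_id: str                          # hypothetical identifier format
    enrolled: bool
    reason_not_enrolled: Optional[str] = None  # None when the patient was enrolled

def summarise(log):
    """Return the number enrolled and a tally of the reasons for non-enrolment."""
    enrolled = sum(1 for entry in log if entry.enrolled)
    reasons = Counter(entry.reason_not_enrolled for entry in log if not entry.enrolled)
    return enrolled, dict(reasons)

# Illustrative entries only.
log = [
    ScreeningEntry("S001", True),
    ScreeningEntry("S002", False, "declined consent"),
    ScreeningEntry("S003", False, "lives too far from site"),
    ScreeningEntry("S004", False, "declined consent"),
]
print(summarise(log))
```

Tallying the reasons in this way makes it easier to report who was excluded and to spot exclusions that are not random.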

For instance, in determining the sensitivity of a particular diagnostic test, patients referred to a tertiary medical centre might have more severe disease than patients treated for the same disease in a community hospital. This factor might make a diagnostic test appear to be more sensitive than it actually is, since it is generally easier to detect severe disease.
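
The effect can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that a test detects 95% of severe cases and 70% of mild cases, and compares the overall sensitivity that would be observed in a tertiary centre (assumed 80% severe cases) with that in a community hospital (assumed 30% severe cases). None of these figures come from a real test.

```python
def observed_sensitivity(prop_severe, sens_severe, sens_mild):
    """Overall sensitivity as a weighted average across disease severity."""
    return prop_severe * sens_severe + (1 - prop_severe) * sens_mild

# Hypothetical detection rates: 95% for severe disease, 70% for mild disease.
tertiary = observed_sensitivity(0.8, 0.95, 0.70)   # mostly severe cases
community = observed_sensitivity(0.3, 0.95, 0.70)  # mostly mild cases
print(f"tertiary centre: {tertiary:.3f}, community hospital: {community:.3f}")
```

The test itself is identical in both settings; only the case mix differs, yet the sensitivity measured in the tertiary centre is noticeably higher.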

Some exclusions may have little to do with the characteristics of the target population; instead, they seek to exclude patients with concomitant conditions that may put them at higher risk and/or could obscure the assessment of the drug effect. In this case, the investigator will need to assess the risk/benefit ratio, particularly for patients in whom the study medication is already known to be harmful.

Taking everything into account, trying to eliminate bias entirely could mean that patients are difficult or impossible to recruit. One way of circumventing this issue is to write the protocol in a way that leaves room for investigator discretion. For instance, gastric bleeding will be a contraindication in a study involving an anti-inflammatory drug, so the exclusion criteria could be written so that patients with a history of major bleeding are not expressly excluded but are left to the investigator’s judgement. This gives the investigator the flexibility to decide how far back the patient’s history is relevant for study entry.

In these cases of co-morbidity, the assessment of the true impact of the study treatment could be biased, because the intervention may exacerbate the co-morbidity or confer some benefit on it.

2. Retrospective vs prospective studies

Retrospective studies, by their nature, will have some element of bias because of the time period involved and because they sometimes depend on human memory to assess the severity and timing of the disease. If an investigator were to design a study aimed at determining the severity of dyspnoea in patients with suspected pulmonary embolism, it is likely that patients diagnosed with pulmonary embolism and hospitalised for treatment will remember their dyspnoea more vividly, and rate it as more severe, than patients not diagnosed with pulmonary embolism who were sent home. This would exaggerate the difference in reported dyspnoea between the two groups, compared with what would be seen if all of the patients were asked about dyspnoea at the time of presentation.

Clinical research participants are selected on the basis that they can follow the study procedures and adhere to scheduled appointments. Participants are also restricted in the concomitant medication they may take, or in the times at which it can be administered. All of this introduces some bias into the patient population, as patients are not always compliant in the real world.

By the nature of clinical studies, patients will normally be referred to the clinical site where the investigation is taking place and will need to “volunteer” to participate in the study. This implies that only certain types of patients are enrolled: those who are unfazed by trying new interventions, or those who are desperate because of the type or severity of their disease or because they lack other access to free healthcare. These patients may not necessarily represent the general population.

Conclusions
While it is important to identify the patient population, it is difficult to remove bias entirely. It is hoped that the points above will help researchers bear in mind some of the constraints in selecting patients and be aware of where bias could creep in.
That said, science is moving towards individualised medicine, and selecting very specific patient populations for research may not be a bad thing after all, as long as there are sufficient patients in the general population who “fit” within this specific group to benefit from treatment. This is evident in the use of anti-cancer treatments such as Herceptin in HER2-positive breast cancer patients, and in many more treatments that are specific to a sub-group of patients.

Sample Size and Trial Statistics
Determining the sample size is a crucial component of clinical research study design and needs to be fully considered to make sure the design can answer the question. If the sample size is too large, the study may waste time, resources and money, whilst a sample size that is too small may lead to inaccurate results. An underpowered study may therefore be deemed unethical, as patients will have been needlessly exposed to research procedures and interventions while the question remains unanswered.

Choosing a sample size involves a combination of statistical, logistical and pragmatic considerations. These include the number of participants one can reasonably expect to recruit in the given time period with the available resources, and the estimated screen failure and drop-out rates. The sample size estimation formula will provide the number of evaluable subjects required to achieve the desired statistical significance for a given hypothesis. In practice, however, more subjects may need to be enrolled to account for potential drop-outs.
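
A common way to allow for drop-outs and screen failures is to divide the required number by the proportion expected to remain at each stage. The sketch below applies this adjustment; the function names and the rates used (15% drop-out, 25% screen failure) are illustrative assumptions only.

```python
import math

def number_to_enrol(n_evaluable, dropout_rate):
    """Participants to enrol so that n_evaluable are expected to complete."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_evaluable / (1 - dropout_rate))

def number_to_screen(n_enrol, screen_failure_rate):
    """Patients to screen so that n_enrol are expected to pass screening."""
    if not 0 <= screen_failure_rate < 1:
        raise ValueError("screen_failure_rate must be in [0, 1)")
    return math.ceil(n_enrol / (1 - screen_failure_rate))

# Hypothetical example: 200 evaluable subjects required,
# 15% expected drop-out and 25% expected screen failure.
enrol = number_to_enrol(200, 0.15)
screen = number_to_screen(enrol, 0.25)
print(enrol, screen)
```

Note that the adjustment divides by the proportion retained rather than simply adding the drop-out percentage to the required number, which would underestimate the enrolment needed.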

The final figure calculated indicates the minimum number of participants required to reliably answer the research question.
There are many statistical packages dedicated to performing sample size and power calculations, and it is often possible to perform such calculations using standard statistical software. However, many of the calculations can also be done manually with the aid of statistical books or literature. The formula to adopt will depend on what the test drug is compared with (i.e. an active control or placebo) and on the type of study (e.g. retrospective or crossover).

One of the most important steps before calculating the sample size is to define the following parameters:
- Statistical significance level (α), or the false-positive rate, typically set at 5% for most trials (a result is then declared significant when P<0.05). The level is fixed before the trial starts and does not change with the sample size; choosing a stricter (smaller) significance level increases the sample size required.
- Clinically significant effect (delta). To detect a smaller difference, one needs a larger sample, and vice versa. Statistical significance and clinical significance are two different concepts and should not be confused with each other.
- Power: adequate power for a trial is widely accepted as greater than or equal to 0.8 (or 80%). Power is defined as 1-β, where β is the false-negative rate; for 80% power, β is 0.2 (or 20%). Power is directly related to the sample size: as the sample size increases, the power increases, and the higher the power, the lower the chance of missing a real effect.
- Variability: if the primary outcome is a continuous measurement such as blood pressure or peak flow, the natural variability in the population needs to be estimated; this is measured by the standard deviation. Natural variability can be thought of as ‘noise’ that makes the ‘signal’ more difficult to hear. A simple way to alleviate this problem is to take a few repeated measurements and use the average.
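
To show how these parameters interact, the sketch below approximates the power of a two-group comparison of means for a given number of participants per group. It uses the normal approximation for a two-sided test and ignores the negligible chance of a significant result in the wrong direction. The standard deviation of 15 and difference of 5 are illustrative assumptions, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two group means.

    Assumes equal group sizes, a common standard deviation `sd`
    and a true difference between means of `delta`.
    """
    z = NormalDist()
    standard_error = sd * sqrt(2 / n_per_group)
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(abs(delta) / standard_error - z_crit)

# Illustrative values only: true difference 5 units, standard deviation 15 units.
for n in (50, 142, 300):
    print(n, round(approx_power(n, delta=5, sd=15), 2))
```

Under these assumptions, power rises steadily with the number of participants per group, and roughly 142 per group are needed to reach the conventional 80%.
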
Other factors in determining sample size include:
- The study design: there are various types of trial design, depending on the question the researcher sets out to answer. These include parallel group, crossover and randomised controlled designs. The sample size estimation will differ depending on the design adopted for the study.
- Hypothesis testing or trial question: this describes the aim of the trial, which can be equality, non-inferiority, superiority or equivalence. Equality and equivalence trials are two-sided trials, whereas non-inferiority and superiority trials are one-sided trials. Superiority or non-inferiority trials can be conducted only if prior information is available about the test drug on a specific end point.
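
Whether a test is one-sided or two-sided changes the critical value used in the sample size formula. The sketch below compares the factor (z_alpha + z_beta)^2, which scales the required sample size in the usual normal-approximation formulas, for a one-sided and a two-sided test at the same nominal α of 0.05 and 80% power. The function name is hypothetical, and keeping α at 0.05 for the one-sided test is an assumption made for comparison only; a one-sided test at 0.025 uses the same critical value as a two-sided test at 0.05.

```python
from statistics import NormalDist

def size_multiplier(alpha, power, two_sided):
    """Return (z_alpha + z_beta)^2, the factor that scales the required sample size."""
    z = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = z.inv_cdf(1 - tail)   # critical value for the chosen significance level
    z_beta = z.inv_cdf(power)       # value corresponding to the desired power
    return (z_alpha + z_beta) ** 2

two_sided = size_multiplier(0.05, 0.80, two_sided=True)
one_sided = size_multiplier(0.05, 0.80, two_sided=False)
print(round(two_sided, 2), round(one_sided, 2))
```

All else being equal, the one-sided test at the same nominal α needs a smaller sample, which is why the choice of hypothesis has to be fixed before the calculation is done.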

In designing a clinical trial, the assumption is that the new intervention will be better than its comparator. It is, however, important to estimate reliably and realistically how much better the intervention will be, drawing either on smaller trials of the same intervention or on similar trials conducted previously.

Another way of coming up with an estimate is to consider what observed treatment effect would make practitioners change their current clinical practice. The key here is to define a difference between test and reference which can be considered clinically meaningful. For example, in designing a trial for a treatment to lower blood pressure, one could argue that an average lowering of systolic BP of 5 mmHg is clinically important; however, a lowering of 10 mmHg might be needed before prescribers would consider changing their current prescribing habit (which they have followed for many years) to a new medication.
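
The choice of clinically meaningful difference has a large effect on the sample size. The sketch below uses the standard normal-approximation formula for comparing two means, n per group = 2 (z_alpha + z_beta)^2 sd^2 / delta^2, to compare the two differences from the blood pressure example. The standard deviation of 15 mmHg is an assumption made for illustration, as are the two-sided α of 0.05 and power of 80%.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants per group to detect a mean difference `delta` (two-sided test).

    Normal-approximation formula assuming two equal-sized parallel groups
    with a common standard deviation `sd`.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

# Assumed standard deviation of systolic BP: 15 mmHg (illustrative only).
for delta in (5, 10):
    print(f"difference of {delta} mmHg: {n_per_group(delta, sd=15)} per group")
```

Because the difference enters the formula squared, halving the difference to be detected roughly quadruples the number of participants required.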

Conclusion
It is important that the sample size is accurately estimated; otherwise, patients may be recruited needlessly or the results may not be reliable enough to be accepted.
Many institutions, including academic, research and commercial institutions, will have access to a statistical team who can help researchers work out the sample size. Where this is not available, it is worthwhile seeking the advice of a statistician to help with this invaluable task and so stand a chance of designing a successful trial.

References
- Michael Silverman, “Clinical Development - the clinical trial protocol patient population”, Biostrategics (WordPress), May 2011. http://biostrategics.wordpress.com/2011/05/02/clinical-development-the-clinical-trial-protocol-patient-population/
- Ella A. Kazerooni, “Population and sample”, American Journal of Roentgenology, November 2001, vol. 177 no. 5, 993-999
- Lawrence M. Friedman, Curt D. Furberg, David L. DeMets, “Study Population”, Fundamentals of Clinical Trials, 4th edition, 2010, chapter 4, 55-65
- Tushar V Sakpal, “Sample Size Estimation in Clinical Trial”, Perspectives in Clinical Research, 2010 Apr-Jun; 1(2): 67–69
- Oxford Radcliffe NHS, “Medical Statistics online help: Sample size & Power for clinical Trials”, Nov 2001. http://www.oxfordradcliffe.nhs.uk/research/projects/documents/medical-statistics-online-help.pdf