**Study Population**

The study population consists of all members of a defined group of people who have been selected because of their relevance to a research question. For example, to study maternal healthcare, the target population would likely be pregnant women, or pregnant women with the specific issue that is being addressed by the research question. The study population is those who would be eligible to take part in the study because they meet the entry criteria and live in the area where the study is taking place (or are likely to present at the health facility being used as a research site).

The study population has to be clearly defined according to specific characteristics such as age, sex or geographical accessibility to health services. Also, the way the study population is defined depends on the problem that needs to be investigated and the objectives of the study.

These are important criteria to consider at the protocol planning stage. How many participants does your study need to answer your question? (This should by now have been determined by the statistical considerations.) Where and how can you recruit? Are the numbers, or the rarity of presentation, such that you might need more than one research site in order to complete the study in a realistic time frame?

The patient population can make or break a clinical study. Firstly, you need the right set of participants to answer your question, so the inclusion and exclusion criteria must be right. If your criteria are too broad, you risk not answering the primary objective because of confounding factors. Equally, criteria that are too narrow will limit the population so much that the data might be irrelevant to the real-life clinical setting. Secondly, you need to recruit in a timely manner to ensure you can answer your question. Lastly, there is the question of retention: losing participants during follow-up is one of the most detrimental problems in conducting clinical research studies, so use your selection criteria to optimise the chances of retaining patients. For example, this might involve including subjects only if they live within an accessible distance of the study site.

**Sample bias in study protocols**

In selecting the patient sample, several biases can be introduced, for example through the exclusion of patients, the use of a retrospective rather than a prospectively collected sample, or the way patients are enrolled. The study population therefore needs to be thoroughly described, so that the assumptions made about the patient pool are transparent.

The following factors need to be considered in reducing the chance of bias in selecting the patient population to be studied:

1. **Exclusion of patients**

The selection criteria denote the characteristics that will make study participants representative of the eventual target population, whom the research is intended to benefit. Specific qualities such as age, duration and severity of disease, and previous treatment and its effects all describe the potential patient population. When analysing and reporting the results, it will be important to describe who was studied and also to describe patients who were excluded or declined to participate in the study, because they may be different from the patients actually studied. Some exclusions are random: for example, a patient whose CT scan becomes corrupt and therefore unusable, or a patient who died for reasons unrelated to the study drug, such as a car accident. Other exclusions are not random, and might introduce bias. Your study documentation needs to capture this information; a useful tool for this is a screening log in which the reasons for enrolment or exclusion are recorded.

For instance, in determining the sensitivity of a particular diagnostic test, patients referred to a tertiary medical centre might have more severe disease than patients treated for the same disease in a community hospital. This factor might make a diagnostic test appear to be more sensitive than it actually is, since it is generally easier to detect severe disease.
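The effect of case mix on apparent sensitivity can be illustrated with a small calculation. This is only a sketch: the function name and all of the numbers (the proportions of severe cases and the sensitivities for severe and mild disease) are hypothetical values chosen for illustration, not data from any study.

```python
def apparent_sensitivity(p_severe: float, sens_severe: float, sens_mild: float) -> float:
    """Overall sensitivity as a case-mix weighted average of the
    sensitivity in severe disease and the sensitivity in mild disease."""
    return p_severe * sens_severe + (1 - p_severe) * sens_mild

# Hypothetical test: detects 95% of severe cases but only 70% of mild cases.
SENS_SEVERE = 0.95
SENS_MILD = 0.70

# Hypothetical case mix: 80% severe at a tertiary centre, 30% in a community hospital.
tertiary = apparent_sensitivity(0.80, SENS_SEVERE, SENS_MILD)
community = apparent_sensitivity(0.30, SENS_SEVERE, SENS_MILD)

print(f"Tertiary centre:    {tertiary:.3f}")
print(f"Community hospital: {community:.3f}")
```

Under these assumed figures the same test looks noticeably more sensitive in the tertiary sample, purely because of who was enrolled.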

Some exclusions may have little to do with characteristics of the target population but may seek to exclude patients with concomitant conditions that may put them at higher risk and/or could obscure the assessment of the drug effect. In this case, the risk/benefit ratio will need to be assessed by the investigator, particularly for those patients for whom the study medication is already known to be harmful.

Taking everything into account, trying to eliminate bias entirely could mean that patients are difficult or impossible to recruit. One way of circumventing this issue is to write the protocol in a way that leaves room for investigator discretion. For instance, gastric bleeding will be a contraindication in a study involving an anti-inflammatory drug, and the exclusion criteria could therefore be written so that patients with a history of major bleeding are not expressly excluded but left to the investigator’s judgement. This gives the investigator the flexibility to decide how far back the patient’s history is deemed relevant for study entry.

In these cases of co-morbidity, the assessment of the true impact of the study treatment could be biased, as the intervention may exacerbate the co-morbidity or confer some benefit on it.

2. **Retrospective vs prospective studies**

Retrospective studies, by their nature, will have some element of bias due to the time period involved and, sometimes, the need to depend on human memory to assess the severity and timing of the disease. If an investigator were to design a study aimed at determining the severity of dyspnoea in patients with suspected pulmonary embolism, it is likely that patients diagnosed with pulmonary embolism who are hospitalised for treatment will remember their dyspnoea more vividly and rate it as more severe than patients not diagnosed with pulmonary embolism who were sent home. This would exaggerate the difference in reported dyspnoea between the two groups, compared with what would be seen if all of the patients were asked about dyspnoea.

Clinical research participants are selected on the basis that they can follow the study procedure and adhere to scheduled appointments. Patients are also restricted in the concomitant medication they may take, or in the times at which it can be administered. All of this introduces some bias in the patient population, as patients are not always compliant in the real world.

By the nature of clinical studies themselves, patients will normally be referred to the clinical site where the investigation is taking place and will need to “volunteer” to participate in the study. This implies that only certain types of patients are enrolled: those who are unfazed by trying new interventions, or those who are desperate because of the type or severity of their disease or their lack of other access to free healthcare. These patients may not necessarily represent the general population.

**Conclusions**

While it is important to identify the patient population and difficult to remove bias entirely, it is hoped that the points stated above will help any potential researcher bear in mind some of the constraints with selecting patients and be aware of where bias could creep in.

That said, science is moving towards individualised medicine, and selecting very specific patient populations for research may not be a bad thing after all, as long as there are sufficient patients in the general population who will “fit” within this specific group to benefit from treatment. This is evident in the use of anti-cancer treatments such as Herceptin in HER2-positive breast cancer patients, and in many more treatments that are specific to a sub-group of patients.

**Sample Size and Trial Statistics**

Determining the sample size is a crucial component of clinical research study design and needs to be fully considered to make sure the design can answer the question. If the sample size is too large, it may waste time, resources and money, whilst a sample size that is too small may lead to inconclusive results and may therefore be deemed unethical, as patients will have been needlessly exposed to research procedures and interventions while the question remains unanswered.

Choosing a sample size involves a combination of logistical and pragmatic considerations. These include the number of participants one can reasonably expect to recruit in the given time period with the available resources, and the estimated screen failure and dropout rates. The sample size estimation formula will provide the number of evaluable subjects required to achieve the desired statistical significance for a given hypothesis. In practice, however, it may be necessary to enrol more subjects to account for potential dropouts.

The final figure calculated indicates the minimum number of participants required to reliably answer the research question.
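As a rough illustration of allowing for dropouts and screen failures, the number of evaluable subjects can be divided by the proportion expected to remain at each stage. The sketch below assumes this simple inflation approach; the function names and the example rates (15% dropout, 20% screen failure, 128 evaluable subjects) are hypothetical and would need to be replaced by estimates appropriate to the study.

```python
import math

def enrolment_target(n_evaluable: int, dropout_rate: float) -> int:
    """Number of subjects to enrol so that roughly n_evaluable complete
    the study, given the expected proportion who drop out."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in the range [0, 1)")
    return math.ceil(n_evaluable / (1 - dropout_rate))

def screening_target(n_enrolled: int, screen_failure_rate: float) -> int:
    """Number of subjects to screen so that roughly n_enrolled pass
    screening, given the expected proportion who fail screening."""
    if not 0 <= screen_failure_rate < 1:
        raise ValueError("screen_failure_rate must be in the range [0, 1)")
    return math.ceil(n_enrolled / (1 - screen_failure_rate))

# Hypothetical figures: 128 evaluable subjects needed, 15% dropout, 20% screen failure.
n_enrol = enrolment_target(128, 0.15)
n_screen = screening_target(n_enrol, 0.20)

print(f"Enrol {n_enrol} subjects; screen about {n_screen}.")
```

Dividing by the retained proportion, rather than simply adding 15% to the evaluable number, gives a slightly larger and safer target.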

There are many statistical packages dedicated to performing sample size and power calculations. It is often possible to perform such calculations using standard statistical software, and stand-alone packages are also available for this purpose. However, many of the calculations required can be done manually with the aid of statistical books or literature. The type of formula to adopt will depend on what the test drug is compared with (i.e. an active comparator or placebo) and on the type of study (retrospective, crossover, etc.).

Before calculating the sample size, the following parameters need to be defined:

- Statistical significance level (α), or the false-positive rate, typically set at 5% for most trials (a result is declared significant when P<0.05). The level is chosen in advance and does not change with the sample size; choosing a stricter (smaller) level increases the sample size required.
- Clinically significant effect (delta). To detect a smaller difference, a larger sample is needed, and vice versa. Statistical significance and clinical significance are two different concepts and should not be confused with each other.
- Power: adequate power for a trial is widely accepted as greater than or equal to 0.8 (or 80%). Power is defined as 1-β, where β is the false-negative rate; in this case, β would be 0.2 (or 20%). Power is directly related to the sample size: as the sample size increases, the power increases, and the higher the power, the lower the chance of missing a real effect.
- Variability: if the primary outcome is a continuous measurement such as blood pressure or peak flow, the natural variability in the population needs to be estimated; this is quantified by the standard deviation. Natural variability can be thought of as ‘noise’ that makes the ‘signal’ more difficult to hear. A simple way to alleviate this problem is to take a few repeated measurements and use the average.
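To show how these four parameters combine, here is a minimal sketch of one commonly used textbook formula: the normal-approximation sample size for comparing the means of two equal-sized parallel groups with a two-sided test. It is not necessarily the formula appropriate for a given design, the function name is an illustrative choice, and the inputs (a difference of 5 units with a standard deviation of 10) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_two_means(delta: float, sd: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects needed PER GROUP to detect a difference in means of `delta`,
    assuming a common standard deviation `sd`, a two-sided significance
    level `alpha` and the requested power (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # value corresponding to power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Hypothetical inputs: detect a difference of 5 units when the SD is 10.
n_80 = sample_size_two_means(delta=5, sd=10)               # 80% power
n_90 = sample_size_two_means(delta=5, sd=10, power=0.90)   # 90% power
n_strict = sample_size_two_means(delta=5, sd=10, alpha=0.01)

print(f"80% power, alpha 0.05: {n_80} per group")
print(f"90% power, alpha 0.05: {n_90} per group")
print(f"80% power, alpha 0.01: {n_strict} per group")
```

With these inputs the calculation gives 63 per group at 80% power and 85 per group at 90% power, and a stricter significance level raises the requirement further. Dedicated software typically applies a small correction for the t-distribution, so its answers may be slightly larger.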

**Other factors in determining sample size include:**

**Study design** - there are various types of trial design, depending on the question the researcher sets out to answer. These include parallel group, crossover and randomised controlled designs. The sample size estimation will differ depending on the design adopted for the study.

**Hypothesis testing or trial question** - this describes the aim of the trial which can be equality, non-inferiority, superiority or equivalence. Equality and equivalence trials are two-sided trials whereas non-inferiority and superiority trials are one-sided trials. Superiority or non-inferiority trials can be conducted only if there is prior information available about the test drug on a specific end point.
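The practical consequence of a one-sided versus a two-sided test is the critical value that enters the sample size formula. The short sketch below compares the standard normal critical values; the function name is an illustrative choice. Note that many trials run one-sided tests at 2.5% rather than 5%, in which case the critical value is the same as for a two-sided test at 5%.

```python
from statistics import NormalDist

def critical_z(alpha: float, two_sided: bool) -> float:
    """Standard normal critical value for a given significance level.
    A two-sided test splits alpha equally between the two tails."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

z_two_sided = critical_z(0.05, two_sided=True)        # about 1.96
z_one_sided = critical_z(0.05, two_sided=False)       # about 1.645
z_one_sided_025 = critical_z(0.025, two_sided=False)  # matches the two-sided 5% value

print(f"Two-sided, alpha 0.05:  {z_two_sided:.3f}")
print(f"One-sided, alpha 0.05:  {z_one_sided:.3f}")
print(f"One-sided, alpha 0.025: {z_one_sided_025:.3f}")
```

A smaller critical value means fewer subjects for the same power, which is why the choice of sidedness and level should be fixed in the protocol before the sample size is calculated.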

In designing a clinical trial, the assumption is that the new intervention will be better than its comparator. It is, however, important to estimate reliably and realistically how much better the intervention will be, either from smaller trials of the same intervention or from similar trials conducted previously.

Another way of coming up with an estimate is to consider what observed treatment effect would make practitioners change their current clinical practice. The key here is to define a difference between test and reference which can be considered clinically meaningful. For example, in designing a trial for a treatment to lower blood pressure, one could argue that an average lowering of systolic BP of 5 mm Hg is clinically important; however, it might take a lowering of 10 mm Hg to persuade prescribers to change a prescribing habit they have followed for many years in favour of a new medication.
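The choice between these two target differences has a large effect on the number of subjects needed. The sketch below applies the normal-approximation formula for comparing two group means (two-sided 5% significance, 80% power) to both targets. The standard deviation of 15 mm Hg is a hypothetical value assumed purely for illustration, as is the function name.

```python
import math
from statistics import NormalDist

ASSUMED_SD = 15.0  # mm Hg; hypothetical standard deviation, not taken from any study

def n_per_group(delta: float, sd: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group sample size for a two-sided comparison of two means
    (normal approximation, equal group sizes, common SD)."""
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_total * sd / delta) ** 2)

n_for_5 = n_per_group(delta=5.0, sd=ASSUMED_SD)    # target difference of 5 mm Hg
n_for_10 = n_per_group(delta=10.0, sd=ASSUMED_SD)  # target difference of 10 mm Hg

print(f"Detect  5 mm Hg: {n_for_5} per group")
print(f"Detect 10 mm Hg: {n_for_10} per group")
```

Because the sample size depends on the square of the ratio of the standard deviation to the target difference, halving the difference to be detected roughly quadruples the number of subjects required.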

**Conclusion**

It is important that the sample size is accurately estimated; otherwise, more patients than necessary may be recruited, or too few may be recruited for the results to be conclusive.

Many institutions, including academic, research and commercial institutions, will have access to a statistical team who can help researchers work out the sample size. Where this is lacking, it is worthwhile to seek the advice of a statistician on this task in order to stand a chance of designing a successful trial.
