Critical Appraisal of Evidence - Prognosis Scenario
Are the results of this study valid?
Information about prognosis can come from a variety of study types. Cohort studies (in which investigators follow 1 or more groups of individuals over time and monitor for the occurrence of the outcome of interest) are the best source of evidence about prognosis. Randomised controlled trials can also provide information about prognosis, although trial participants may not be representative of the population with the disorder. Case-control studies (in which investigators retrospectively determine prognostic factors by defining the exposure of cases who have already experienced the outcome of interest and of controls who haven't) are useful when the outcome of interest is rare or when the required follow-up is long. The strength of inference that can be drawn from case-control studies is limited because they are more susceptible to bias.
Returning to our clinical scenario from the question formulation tutorial:
You see a 70 year old man in your outpatient clinic 3 months after he was discharged from your service with an ischemic stroke. He is in sinus rhythm, has mild residual left-sided weakness but is otherwise well. His only medication is ASA and he has no allergies. He recently saw an article on the BMJ website describing the risk of seizure after a stroke and is concerned that this will happen to him.
Our search of the literature to answer this question retrieved an article from the BMJ (1997;315:1582-7).
How do we critically appraise this prognosis paper? We'll start with validity. The following list outlines the questions that we need to consider when deciding if a prognosis paper is valid.
Was a defined, representative sample of patients assembled at a common (usually early) point in the course of their disease?
We hope to find that the individuals included in the study are representative of the underlying population (and reflect the spectrum of illness). But, from what point in the target disorder should patients be followed? Above, we state 'usually early' implying an inception cohort (a group of people who are assembled at an early point in their disease), but clinicians may want information about prognosis in later stages of a target disorder. Thus, a study that assembled patients at a later point in the disease may provide useful information. However, if observations are made at different points in the course of disease for various people in the cohort, the relative timing of outcome events would be difficult to interpret. Thus, the ideal cohort is one in which participants are all at a similar stage in the course of the same disease.
Returning to the paper we found, the study included patients who were entered after their first stroke. Further details on entry procedures aren't included in the study.
Was patient follow-up sufficiently long and complete?
Ideally, we'd like to see a follow-up period for a study that lasts until every patient recovers or has one of the other outcomes of interest, or until the elapsed time of observation is of clinical interest to clinicians or patients. If follow-up is short, it may be that too few study patients will have the outcome of interest, thus providing little information of use to a patient.
The more patients who are unavailable for follow-up, the less accurate the estimate of the risk of the outcome. Losses may occur because patients are too ill (or too well) to be followed or may have died, and the failure to document these losses threatens the validity of the study. Sometimes, however, losses to follow-up are unavoidable and unrelated to prognosis. Although an analysis showing that the baseline demographics of these patients are similar to those followed up provides some reassurance that certain types of participants were not selectively lost, such an analysis is limited to those characteristics that were measured at baseline. Investigators cannot control for unmeasured traits that may be important prognostically, and that may have been more or less prevalent in the lost participants than in the followed-up participants. Most evidence-based journals of secondary publication (like ACP Journal Club and Evidence Based Medicine) require at least 80% follow-up for a prognosis study to be considered valid.
In the study we retrieved, follow-up was sufficiently complete and patients were followed from 2 to 6.5 years.
Were objective outcome criteria applied in a "blind" fashion?
We need to assess whether and how explicit criteria for each outcome of interest were applied and if there is evidence that they were applied without knowledge of the prognostic factors under consideration. Blinding is crucial if any judgement is required to assess the outcome because unblinded investigators may search more aggressively for outcomes in people with the characteristic(s) felt to be of prognostic importance than in other individuals. Blinding may be unnecessary if the assessments are preplanned for all patients and/or are unequivocal, such as total mortality. However, judging the underlying cause of death is difficult and requires blinding to the presence of the risk factor to ensure that it is unbiased.
In the study we identified, patients were asked at follow-up if they had a seizure and if they said "yes", a study neurologist subsequently assessed them. It is unclear if the study neurologist was "blind".
If subgroups with different prognoses are identified, was there adjustment for important prognostic factors and was there validation in an independent, "test set" of patients?
We often want to know if patients with certain characteristics will have a different prognosis. For example, are patients with an intracranial hemorrhage at increased risk of seizure? Demographic, disease-specific or comorbid variables that are associated with the outcome of interest are called prognostic factors. They need not be causal but must be strongly enough associated with the development of an outcome to predict its occurrence.
The identification of a prognostic factor for the first time could be the result of a chance difference in its distribution between patients with different prognoses. Therefore, the initial patient group in which the variable was identified as a prognostic factor may be considered to be a training set or a hypothesis generation set. Indeed, if investigators were to search for multiple potential prognostic factors in the same data set, a few would likely emerge on the basis of chance alone. Ideally, therefore, data from a second independent patient group, or a "test set" would be required to confirm the importance of a prognostic factor. Although this degree of evidence has often not been collected in the past, an increasing number of reports are describing a second, independent study validating the predictive power of prognostic factors. If a second, independent study validates these prognostic factors, it can be called a clinical prediction guide.
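The point about chance findings can be illustrated with a short calculation. The sketch below is only an illustration, assuming that each candidate factor is tested independently at the conventional 5% significance level and that none is truly related to the outcome. The function name and the numbers of factors are hypothetical and do not come from the study.

```python
def chance_of_spurious_factor(n_factors, alpha=0.05):
    """Probability that at least one of n_factors appears 'significant'
    by chance alone, assuming independent tests at level alpha and no
    true association for any factor (both are simplifying assumptions)."""
    return 1 - (1 - alpha) ** n_factors


# Hypothetical numbers of candidate prognostic factors screened in one data set.
for k in (1, 5, 10, 20):
    print(f"{k} factors tested: {chance_of_spurious_factor(k):.0%} chance of a spurious finding")
```

Under these assumptions, screening 10 factors gives roughly a 40% chance that at least one looks prognostic purely by chance, which is why confirmation in a separate "test set" matters.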
In the study we found, the investigators looked at patients with different stroke types and identified that patients in these groups had different risks of seizures. This was not tested in an independent group of patients to see if it holds true.
If the study fails any of the above criteria, we need to consider if the flaw is significant and threatens the validity of the study. If this is the case, we'll need to look for another study. Returning to our clinical scenario, the paper we found satisfies all of the above criteria and we will proceed to assessing it for importance.
Are the results of this study important?
How likely are the outcomes over time?
Typically, results of prognosis studies are reported in one of three ways: as a percentage of the outcome of interest at a particular point in time (e.g. 1 year survival rates), as median time to the outcome (e.g. the length of follow-up by which 50% of patients have died) or as event curves (e.g. survival curves) that illustrate, at each point in time, the proportion of the original study sample who have not yet had a specified outcome.
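To make the three reporting formats concrete, here is a minimal sketch that builds an event curve and reads the other two summaries from it. It assumes the Kaplan-Meier (product-limit) method, which is a common way to construct such curves but is not named in the tutorial. The follow-up times, the function names and the data are all hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, proportion still event-free)] at each event time.

    times  -- follow-up time for each patient (e.g. months)
    events -- True if the outcome occurred at that time, False if the
              patient was censored (lost or study ended) at that time
    """
    at_risk = len(times)
    event_free = 1.0
    curve = []
    for t in sorted(set(times)):
        had_event = sum(1 for ti, e in zip(times, events) if ti == t and e)
        censored = sum(1 for ti, e in zip(times, events) if ti == t and not e)
        if had_event:
            event_free *= 1 - had_event / at_risk
            curve.append((t, event_free))
        at_risk -= had_event + censored
    return curve


def event_free_at(curve, t):
    """Proportion event-free at time t (the 'percentage at a point in time')."""
    value = 1.0
    for time, proportion in curve:
        if time <= t:
            value = proportion
    return value


def median_time(curve):
    """First time at which 50% or fewer remain event-free, or None if not reached."""
    for time, proportion in curve:
        if proportion <= 0.5:
            return time
    return None


# Hypothetical cohort of 10 patients, follow-up in months.
times = [2, 3, 3, 5, 8, 8, 12, 12, 12, 12]
events = [True, True, False, True, True, False, False, False, False, False]

curve = kaplan_meier(times, events)
print("Event curve:", [(t, round(s, 3)) for t, s in curve])
print("Event-free at 6 months:", round(event_free_at(curve, 6), 3))
print("Median time to event:", median_time(curve))
```

In this made-up cohort fewer than half of the patients have the outcome, so the median time is not reached and the function returns None. Published studies often face the same situation when the outcome is uncommon.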
From the study we found, the risk of seizure after any type of stroke is 5.7% at 1 year.
How precise is this prognostic estimate?
The precision of the estimate is best reflected by its 95% confidence interval: the range of values within which we can be 95% sure that the population value lies. The narrower the confidence interval, the more precise the estimate. If survival over time is the outcome of interest, earlier follow-up periods usually include results from more patients than later periods, so that survival curves are more precise (i.e. have narrower confidence intervals) earlier in follow-up.
To calculate the 95% confidence interval for the study we identified, we can use the following equation:
95% Confidence Interval = p +/- 1.96 x SE
where:
Standard Error (SE) = sqrt(p x (1 - p) / n)
and 'p' is the proportion of people with the outcome of interest and 'n' is the sample size.
From the study, n = 675 and p = 0.057
SE = sqrt(0.057 x 0.943 / 675)
SE = 0.009
Therefore the 95% CI is:
0.057 +/- 1.96 x 0.009 = 3.9% to 7.5%
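The same calculation can be reproduced in a few lines of code. This is a minimal sketch, assuming the usual normal-approximation standard error for a proportion, sqrt(p x (1 - p) / n). The function name is hypothetical, and the inputs are the values of n and p quoted above.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    p -- observed proportion with the outcome
    n -- sample size
    z -- multiplier for the desired confidence level (1.96 for 95%)
    Returns (lower, upper, standard_error).
    """
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se, se


lower, upper, se = proportion_ci(p=0.057, n=675)
print(f"SE = {se:.4f}")
print(f"95% CI = {lower:.1%} to {upper:.1%}")
```

Carrying the unrounded standard error (about 0.0089) through the calculation gives an interval of roughly 4.0% to 7.4%. The hand calculation above rounds the SE to 0.009 first, which accounts for the slightly wider 3.9% to 7.5%.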