
Diagnostic studies: how to get started with appraising the evidence

Posted on 30th June 2020 by Richard Colling


The focus of evidence-based healthcare appraisal is often on interventional studies, but before we can treat patients appropriately we must get the diagnosis right. The task of working through differential diagnoses is complex and involves a blend of clinical experience and the best evidence. Clinical experience may take years to develop, but the basics of appraising diagnostic studies are not too challenging to get started on.


Once you have found a paper you are interested in appraising, the first thing to decide is what research question the study is trying to address. Read the Introduction and try to frame the question with a model such as PIRTO:

Population – who are the patients being tested?

Index test – the new test being studied

Reference test – the comparator method of establishing the diagnosis

Target condition – the diagnosis of interest

Outcome – measures of the index test performance

Once framed in this way it is easy to work through and appraise the various points. You may have already used PIRTO to formulate your literature search strategy – how close is the paper to your question?


Read the Methods section. Are the patients in the study representative of your patient (or patients)? If not, then this might not be the best study to inform you and you might even want to discard it at this stage. An important point not to overlook when making this decision is disease prevalence. Consider whether the disease prevalence in the study population is similar to what you will encounter when testing your patients, because prevalence affects some outcome measures (such as predictive values). It is very easy to confuse a screening context with a diagnostic context: prevalence is usually much lower in a screening population than among patients who present with symptoms, so the same test can give very different predictive values in the two settings.
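To see why prevalence matters, here is a minimal Python sketch that applies Bayes' theorem to a hypothetical test. The sensitivity, specificity and prevalence figures are illustrative assumptions, not data from any study, and `predictive_values` is a helper name made up for this example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test used at a given disease prevalence.

    All three inputs are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test: 90% sensitive and 90% specific (assumed values).
for setting, prevalence in [("screening", 0.01), ("diagnostic clinic", 0.30)]:
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"{setting}: prevalence {prevalence:.0%}, PPV {ppv:.1%}, NPV {npv:.1%}")
```

With these assumed numbers, the positive predictive value is roughly 8% at 1% prevalence but roughly 79% at 30% prevalence, even though the test itself has not changed.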

If you are happy to continue, look for risks of bias. Were the patients selected randomly or consecutively, or were highly selected patients chosen (e.g. only severely affected patients)? Choosing clear-cut examples of the disease to test in the study will increase the chances of getting good accuracy measures and so could introduce a significant risk of bias in a diagnostic study. Also ask how many patients were included in the study and whether this was enough. Was a sample size calculation outlined?

Index test

Make sure the study clearly sets out the details of the index test and that the study is definitely evaluating the same test you are interested in! Check the Introduction and Methods. Who carried out the index test? Were they (and the patients) blinded to the reference test results? This is important to minimise the risk of biasing results. Was there any degree of interpretation involved in testing? This is important because there is quite a difference between reading a machine-generated report (positive/negative) and making a judgement about the test outcome (e.g. consolidation on an X-ray). It’s important that the study defines ‘positive’ and ‘negative’ results prior to testing and that there is a plan for handling indeterminate results.

Although not necessarily a significant risk of bias, prospective studies where new patients are tested are usually preferable to retrospective studies where tests are carried out on previously diagnosed patients (or their stored samples).

Reference test

Similar considerations apply to the reference test as to the index test. Also consider how reliably the ground truth diagnosis was established. Check the Methods. Ideally a ‘gold standard’ test or clinical criteria should be used. Sometimes, long-term outcome is the best measure. If a surrogate test is used, check the reported accuracy of the reference test. Did all patients in the study have the reference test? If not, then the outcome measures may not be reliable. Make sure the index test and reference test were both performed within a short time period, so that any potential clinical interventions could not have affected either test.

Target condition

Is the condition being diagnosed stated clearly and is this the diagnosis you are interested in? Check the Introduction and Methods.


How are the test results evaluated? This is what most people will think of as the crucial part of the study. Diagnostic studies will usually report sensitivity and specificity, and perhaps predictive values or likelihood ratios: find these in the Results section. You need to decide what is most useful in your clinical context. Take a look at the recent post ‘Sensitivity and specificity explained’, which comprehensively covers various outcome measures. Remember, if you are looking to confirm a diagnosis, then you need a test with high specificity and high positive predictive value. If you want to rule out a diagnosis, then you need a test with high sensitivity and a high negative predictive value. Some studies may report 95% confidence intervals for measures of test accuracy; if these are wide, be cautious. Some studies may report AUC and ROC curves instead: look out for a blog on this topic coming soon! How were missing and indeterminate results handled? Excluding these may inflate the outcome measures.
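If a paper reports the underlying 2x2 table (index test against reference test), you can check the headline figures yourself. The Python sketch below is one way to do that under stated assumptions: the counts are invented for illustration, the function names are hypothetical, and the Wilson score method is just one common choice for an approximate 95% confidence interval, not necessarily the one a given paper used.

```python
from math import sqrt


def accuracy_measures(tp, fp, fn, tn):
    """Compute common accuracy measures from 2x2 table counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Raises ZeroDivisionError if a denominator is zero
    (for example, LR+ when specificity is exactly 1).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Invented counts: 50 patients with the condition, 100 without.
tp, fp, fn, tn = 45, 10, 5, 90
for name, value in accuracy_measures(tp, fp, fn, tn).items():
    print(f"{name}: {value:.3f}")

low, high = wilson_interval(tp, tp + fn)  # interval for sensitivity
print(f"sensitivity 95% CI: {low:.3f} to {high:.3f}")
```

Re-running `wilson_interval` with the same proportion but fewer patients (say 9 out of 10) gives a much wider interval, which is the reason for caution when a study is small.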


In the end you need to bring this all together. Decide if the internal validity of the study is acceptable. In other words, you need to decide if the risk of bias is satisfactorily minimised for you to accept the findings. If not, then you might want to reject the study (or adjust your opinion of the findings accordingly). Next, you need to decide if the accuracy of the test (sensitivity etc.) was found to be acceptable for use in clinical practice. If not, then perhaps look for a better test. Finally, you need to decide if the findings are externally valid – that the study population was representative enough that the findings can be applied to your patient or patients (your population). Don’t forget to check for any weaknesses the authors may have highlighted in the Discussion.


This is a basic and simple way of making a quick appraisal of a diagnostic study and by no means is this comprehensive. This should however be an easy way to get started looking at diagnostic studies and serve as a useful framework to build upon. You will come to realise that there are no correct answers when evaluating diagnostic tests and that experience and judgement are key – even in the world of evidence-based healthcare!

One last bonus for you: now that you know the basics of appraising a diagnostic paper, you also now know the basics of designing a diagnostic study.


CEBM. Critical Appraisal tools. Available from: https://www.cebm.net/2014/06/critical-appraisal [accessed June 12 2020].

The opinions expressed here are the author’s own and do not reflect those of any affiliated organisation or institution.


Richard Colling

Richard is a medical doctor, health researcher, teacher, and pathologist based in the UK. He is currently Senior Clinical Research Fellow and a DPhil (PhD) student in evidence-based healthcare at the University of Oxford.
