Bedside Ultrasound Evidence Echoes
As you might know – I am an Ultrasound tragic. You may not know that I will also confess to being a little obsessed with biostatistics. So in the lead up to SMACC GOLD I have been doing some research and have come up with a few interesting infographics that I thought I would share with you all.
The concept is what I call the “Sonospectrum” – everything we do in diagnostic medicine is imperfect. Clinical examination is a relatively blunt tool. Bedside ultrasound has its great moments, and other times when it just cannot give us a robust answer. Even the CT scan, the ionising oracle, has its own shortcomings (mainly due to overcalling diagnoses). Each modality of ultrasound sits somewhere on the spectrum – for example, ED US is great for detecting intrauterine pregnancy but not so powerful when it comes to excluding a PE (based on ECHO). For some clinical questions ultrasound is more potent than other forms of examination and imaging – it is the “gold standard”. So how do you know when to use it as a ‘diagnostic test’, when it is a ‘useful tool’, when it is an ‘extension of the clinical examination’ and when it is just a guy ‘fiddling with knobs at the bedside’?
In medical school we all learn about the sensitivity and specificity of these tests – the “test characteristics”. However, lately I have started teaching and using “likelihood ratios” as the numbers I carry around in my head to aid my practice. Likelihood ratios are more useful than sensitivity and specificity, as they can be applied to individual patients in their particular context. Here is how it basically works:
- Define a clinical question, e.g. “Is there a drainable abscess under that cellulitis?” OR “Is there a pneumothorax?”
- Decide on your pre-test probability (either by using clinical acumen, or an existing tool, e.g. the Wells score for DVT, or just use the known prevalence of the disease)
- Know the +LR and -LR. Are they going to be able to change your management? That is – will doing this test move your post-test probability into a range where it will either allow you to treat / intervene OR stop the process with comfort that you have in practical terms “excluded” disease? Will you push the probability past an upper or lower test threshold?
- Do the test. Get an answer [sometimes the answer remains elusive, e.g. Why did I think I could scan the 240 kg man with RIF pain…? Doh!]
- Carry on. Integrate this information into the clinical picture and do the needful.
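If you want to see the arithmetic behind those steps, here is a minimal sketch in Python. It uses the standard odds form of Bayes' theorem: convert the pre-test probability to odds, multiply by the LR, then convert back. The function names, the 30% pre-test probability, the LRs of 10 and 0.1, and the 80% / 5% thresholds are all made up for illustration; they are not taken from the infographics or from any paper.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio cannot be negative")
    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


def decision(post_prob: float, treat_threshold: float, exclude_threshold: float) -> str:
    """Say which side of the (hypothetical) test thresholds we ended up on."""
    if post_prob >= treat_threshold:
        return "treat"
    if post_prob <= exclude_threshold:
        return "excluded in practical terms"
    return "still in the grey zone"


# Illustrative numbers only: pre-test probability 30%, +LR 10, -LR 0.1,
# treat above 80%, stop looking below 5%.
pre_test = 0.30
after_positive = post_test_probability(pre_test, 10)
after_negative = post_test_probability(pre_test, 0.1)
print(f"Positive scan: {after_positive:.0%} -> {decision(after_positive, 0.80, 0.05)}")
print(f"Negative scan: {after_negative:.0%} -> {decision(after_negative, 0.80, 0.05)}")
```

With these made-up numbers a positive scan takes you from 30% to about 81%, and a negative scan takes you down to about 4%, so either result would cross a threshold and change management. Rerun it with an LR of 2 and you will see why a weak test rarely moves you anywhere useful.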
Bedside ultrasound is a newish field – and the evidence basis of what we use it for is young, but rapidly expanding. In the last 5 years there has been an explosion of published papers looking at the test characteristics of bedside US done in EDs by point-of-care providers… i.e. you, or me… not the crew in the Radiology department. A lot of the evidence comes in the form of relatively small, underpowered studies. There are a few meta-analyses of these data sets. So a quick disclaimer to the data presented below – I completely accept that a lot of it is based on relatively weak numbers. However, we have to start somewhere.
Often my colleagues will say: “What did the scan show?” However, in my head the question ought to be: “Does the result of my scan mean that I can change this patient’s management?” These are two very different questions! In order to be a rational practitioner at the bedside with a probe in one hand – we need to know these numbers. Ok, you don’t need to know exactly the LRs for every exam – but you need to have a ballpark idea about what you can actually achieve with your scan. I sometimes kick myself for missing things with the bedside US, and yet I know that for a lot of what we do – it is an insensitive tool, the specificity is often stronger. And that is the point – we need to know what questions we can answer and not get too carried away with ourselves. Being realistic requires discipline.
So – enough foreplay. Here are a couple of tables. They rank a pile of “tests” by the relative potency of their likelihood ratios. An LR tells you how much more (or less) likely a given test result is in a patient who has the disease than in a patient who does not: +LR = sensitivity / (1 − specificity), and -LR = (1 − sensitivity) / specificity. Remember when it comes to likelihood ratios the basic rules are as follows:
For POSITIVE LIKELIHOODS – that is, “if the test is positive, how much more likely is it that the disease is present?”
- +LR > 20 is very potent (hang your hat on it!)
- 10 – 20 is strong (considered diagnostic in most settings)
- 5 – 10 is good (few clinical exam findings are better than 5)
- 2 – 5 is just barely useful (common for a lot of the basic blood work ordered in ED)
- < 2 is unhelpful, and probably will not change your post-test probability significantly, unless you were already very close to a threshold
- LR = 1 means that the test does nothing for the diagnostic process in either direction
For NEGATIVE LIKELIHOODS
- A negative likelihood ratio tells us “if the test is negative, how much less likely is it that the disease is present?”
- It works just the same in reverse: divide 1 by the negative (fractional) value and you have the equivalent positive value.
- So a -LR of 0.1 carries the same weight as a +LR of 10, and a -LR of 0.25 is as useful as a +LR of 4. If you see a heap of zeroes after the decimal point, it is strong!
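Here is a rough Python sketch of those rules of thumb. It derives both LRs from sensitivity and specificity using the standard formulas, then bands an LR by the cut-offs listed above, comparing a -LR through its reciprocal. The function names, the wording of the band labels, the way boundary values (exactly 5, 10 or 20) are assigned, and the 80% / 95% example figures are my own assumptions for illustration, not numbers from the infographics.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (+LR, -LR) from sensitivity and specificity given as fractions."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must be strictly between 0 and 1")
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


def strength(lr: float) -> str:
    """Band an LR by the rules of thumb; an LR below 1 is judged by its reciprocal."""
    if lr <= 0:
        raise ValueError("lr must be greater than 0")
    equivalent = lr if lr >= 1.0 else 1.0 / lr
    # Boundary handling is an assumption: the cut-offs in the text overlap at 5, 10 and 20.
    if equivalent > 20:
        return "very potent"
    if equivalent >= 10:
        return "strong"
    if equivalent >= 5:
        return "good"
    if equivalent >= 2:
        return "barely useful"
    if equivalent > 1:
        return "unhelpful"
    return "does nothing"


# Hypothetical test: 80% sensitive, 95% specific.
pos_lr, neg_lr = likelihood_ratios(0.80, 0.95)
print(f"+LR {pos_lr:.1f} is {strength(pos_lr)}")
print(f"-LR {neg_lr:.2f} is {strength(neg_lr)}")
```

That hypothetical test comes out with a +LR of roughly 16 (strong) but a -LR of roughly 0.21 (barely useful), which is the familiar pattern for a lot of bedside US: good for ruling in, much weaker for ruling out.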
CLICK on these to open the infographics. Have a read down the list of “tests” – there are a few surprises. Of course, the gold standards used to generate the data are crucial, so you will need to read some papers to find those out – too hard to put into a table, sorry. However, I think this is useful in ordering / ranking your bedside US tests, so you know what question to ask and what you can answer in the ED.
So let me know if you think this is useful. Or if you have a paper to analyse, a clinical question about the diagnostic characteristics of an US modality – hit me on the comments or the email. I am expanding this project constantly. And if you are going to be at the SMACC US Workshop – then this is good stuff to digest before the day of awesomeness that is in store for you!
I think folks generally work better with whole numbers. The +LR graph is awesome; my only suggestion would be to list the negative LR graph as reciprocals (i.e. numbers greater than 1) for easier comparison. The math ends up being the same: you just divide your pre-test odds instead of multiplying them.
It always irks me a bit when I hear someone quote the sensitivity or specificity of ultrasound to diagnose a particular condition.
As we know, ultrasound is highly operator dependent. The sensitivity & specificity of this tool is not a static entity. It changes each and every time a different person picks up the ultrasound probe. An expert will be fantastic at picking up a sonographic Murphy’s sign but a novice will have trouble even finding the gallbladder.
In addition, ultrasound is also “patient dependent.” We all know that obese patients or those with a lot of bowel gas can make things challenging. And we have all had that skinny patient who, for whatever reason, seems to have an integument that is impervious to sonographic energy.
Furthermore, much of the published literature looking at the sensitivity & specificity of bedside ultrasound comes from highly motivated and skilled practitioners, so it may not be externally valid or translate into real world practice.
Therefore, I wonder if we should call for a moratorium on studies reporting the sensitivity and specificity (and therefore likelihood ratios) of point-of-care ultrasound and focus on what is intellectually more honest.
Although it is far from sounding scientific, it would be good to know if bedside ultrasound is “pretty darn good” at including or excluding particular conditions when performed by reasonably good providers with knowledge and experience in the specific application. Yes, looking at the numbers plotted to several decimal points gives us a pseudo-scientific idea. But maybe there is a simpler and more genuine way.
Perhaps a modified 5 point Likert scale would be the best answer for stating the utility for bedside ultrasound: terrible, not very good, neutral, pretty darn good, friggin great.
Ok… we can work on the Likert scale for the utility of point-of-care ultrasound, but the concept remains the same. The test performance of ultrasound is not a static entity, so let’s be honest about it.
Appreciate your perspective. Trust me – after wading through a pile of ED-based US papers it is very clear that there is a large range in the quality of trials, for all the reasons you have mentioned. Lots of confounders, lots of small trials, lots of outliers in the stats.
However, the same argument could be made for a lot of medicine – the academic centres do some things differently – so can we apply any of the evidence to a wider target population?
A lot of the trials are pragmatic and do describe the experience level of the sonographers; some even give breakdowns of the test characteristics by the seniority of the providers. So there is some insight into these effects.
The main goal of my producing this list / graphic was to try and put bedside US in context along the spectrum of clinical exam – X-ray – CT – … and to highlight the relative inadequacy of a lot of the commonly used imaging tests (e.g. AXR for SBO).
Agree that it would be unwise to use a lot of this data to actually make decisions in an individual patient’s context – that is what we are paid to do! But it is worthwhile to have an idea as to what questions you should be asking and might be able to answer with bedside US. The reality is that this technology is spreading and being used by all types of providers in lots of environments – so we need to have some perspective on the utility – rather than a purely binary “Yes” or “no” approach to the images we get on the screen.
If you like likelihood ratios, check out the website “A Life at Risk” (http://www.alifeatrisk.com/)
How come everyone talks about ultrasound being “user-dependent” but you almost never hear about reading an ECG being user-dependent, or reading a head CT or a CXR, or listening to a chest, etc.? Surely everything we do is “user-dependent”, so why single out ultrasound?