Should Cochrane reviews of surgical interventions concluding ‘no evidence of benefit’ come with a health warning?

Mr James W Fairley BSc MBBS FRCS MS
Consultant ENT Surgeon

Last updated 23 July 2007
© 2007 – 2015 JW Fairley

Limitations of Evidence Based Medicine

Cochrane reviews of surgical treatments suffer from a lack of admissible evidence because very few high quality randomised controlled trials (RCTs) of surgical treatments are done. RCTs of surgical treatments are not done for various reasons:

These constitute some of the more significant inherent limitations in the methodology used to produce “robust and reliable evidence”.

Lack of formal acknowledgement of methodological limitations in systematic reviews of surgical interventions

Inherent limitations in conducting surgical trials are not prominently acknowledged in Cochrane Reviews. The search methodology, and the standards used to select or reject trials from inclusion in the review, are of course prominently featured, but little is said as to

  • why there may be a dearth of acceptable evidence, and
  • whether the required RCTs are ever likely to be done.

Until recently, this didn’t matter much, because the reviews were likely to be read by interested and informed professionals only. Now, however, Summaries & Conclusions are easily available on-line to healthcare commissioners and the general public.

Serious consequences of misinterpretation of Cochrane review conclusions

“No evidence of benefit” is being misinterpreted as “this treatment does not work”. Evidence based commissioning of health services is shifting funding away from treatments which lack the “evidence based” imprimatur. This has serious health implications. Not only could effective surgical treatments be denied to those who need them, but the stepwise evolutionary development of surgical techniques (mainly resulting from the application of technological advances) risks being strangulated by regulatory and funding requirements to show evidence of effectiveness at each stage.

Benefit of treatment is obvious

It is obvious to those of us who carry out mastoidectomy and tympanoplasty surgery for chronic suppurative otitis media that these operations, when carried out by competent surgeons, offer the best chance of definitive treatment for the patient. Although there are large published case series, there are no randomised controlled trials, so a Cochrane review is obliged to conclude “No evidence of benefit”. Although this particular review has now been withdrawn (to be replaced by seven separate reviews including one specifically on surgical interventions) the conclusion is misleading for patients and health commissioners. The argument can be illustrated by reductio ad absurdum:

No evidence of benefit for surgical treatment?

Although no-one has yet published a Cochrane review of the incision and drainage of abscesses, I would predict the conclusion “no evidence of benefit” because there are no RCTs on the subject.

  • There are no RCTs because surgeons know what to do with an abscess.
  • They have known it for thousands of years.
  • The benefit of treatment is obvious.
  • In the case of the abscess, it is obvious to all.
  • In the case of mastoidectomy and tympanoplasty, it is only obvious to the specialist in the field.


Difficulties in defining outcome measures

Where surgical treatment has a hard outcome measure – such as death versus survival in cancer or cardiac surgery – the choice of primary outcome measure for trials is clear. But most of the conditions we treat are not likely to result in death; we are operating to improve quality of life. Until recently, this was thought too difficult to measure, but the application of psychometric techniques to patients’ symptoms in the 1980s began to allow a more quantitative approach to soft outcome measurement.

An explosion of research interest in the 1990s and early 21st Century has resulted in hundreds of disease-specific outcome measures, as well as numerous validated general health outcome measures. We are now spoilt for choice. Despite this, there remain significant difficulties in defining suitable outcome measures, particularly when it comes to variations in surgical technique.

An example of difficulties in defining outcome measures for a Cochrane review

In the protocol for a planned Cochrane review on the use of topical anaesthesia in flexible fibreoptic laryngoscopy, the clinical questions were:

  1. was topical anaesthesia needed at all?

    and, if so,

  2. which would be the best agent?

There was a clear difficulty in defining suitable outcome measures. Although “obtaining an adequate view” was the most important outcome measure, I commented that the definition of this was inherently subjective and operator-dependent. Furthermore, my personal standard would be “consistently obtaining an excellent view” and this standard could vary depending on the purpose of the examination. Secondary outcome measures would be patient discomfort and the incidence of side effects from the topical anaesthetic agents. Several parameters of the local anaesthetic preparation could be important, including:

  • rapidity of onset
  • degree of vasoconstriction/vasodilatation
  • depth of anaesthesia
  • duration of action.

Different clinical circumstances might well dictate different preferences for these parameters.

  • For a specialist working alone in an office clinical environment, with the fibreoptic examination equipment immediately to hand in the same room, rapid onset and short duration of local anaesthesia would be preferable.
  • For a specialist working in a typical busy overbooked NHS clinic, with several doctors carrying out consultations but sharing a treatment room, it would not be unusual to give the local anaesthetic spray and then find another doctor occupying the treatment room, and/or have to wait for the laryngoscope to be re-sterilised, while getting on with seeing the next patient. A longer duration of action would be preferable in that clinic.
  • Vasoconstriction would be an important attribute if the examination were to include an assessment of the nose and sinuses, but is virtually irrelevant in the context of a voice clinic concentrating on the larynx.

In other words, there are horses for courses. In the main, I prefer to use 10% cocaine spray, but there are perfectly legitimate reasons why different specialists might prefer other techniques. Furthermore, the same specialist might use different techniques in different clinical circumstances.

Selection of technique – best left to the expert

The selection of a particular technique is akin to the selection of a particular golf club for a particular shot. It is best left to the discretion of the expert. The more expert the golfer, the more likely he is to have the ability – and the requirement – to use a variety of different clubs, according to his professional assessment of the exact circumstances of each shot. Not only that, but another golfer with a different set of preferences might get just as good a result. It is therefore unlikely that any review based on analysis of RCTs could come up with useful generalisable recommendations.

Why does this matter?

Until recently, this didn’t matter, because it was acknowledged that professionals knew best how to organise their specialist work. Now, however, managers and healthcare commissioners, ignorant of the subtleties and complexity of clinical expertise, will read the summaries of reviews and, armed with the assurance that there is “no robust and reliable evidence” to support one technique over another, will insist on standardisation to the cheapest. They congratulate themselves, convinced that they are “making the best use of scarce resources”. Would they also like to tell Tiger Woods that, since there is no robust and reliable evidence for his preference, he has to use a limited selection from the cheapest golf clubs?

Difficulties in controlling for surgical skill/preference

The desire of politicians and public to produce league tables has brought surgical skill into sharp public relief. Managers and health economists increasingly opine about the need to standardise “unexplained variances”. They seem to regard it as an unacceptable failing that 50% of surgeons are below average. An education department spokesman was once forced onto the defensive when confronted with the statistic that 50% of children were scoring below average marks in tests.

  • When setting up and interpreting RCTs of surgical techniques, the skill and preference of the surgeon become extremely important variables, and almost impossible to control.
  • Skill and preference are not independent variables. The one influences the other considerably.

Surgical skill

Skill is made up of a combination of inherent ability, which is developed by good training, then perfected and maintained by experience and regular practice. A surgeon who does not practise his skills, rather like a pianist who does not play, will tend to lose them. Some, less gifted with natural talent, will never be great no matter how much they practise. Some have had indifferent teachers, have never seen the finer points of their craft, and therefore aspire to nothing better.

Preference of the surgeon for one technique over another

Most established surgeons prefer familiar techniques, and, if the results are reasonable and predictable, carry on using them. Some surgeons are highly adaptable, migrate easily to new techniques, and are comfortable with a wide variety of approaches, while others are methodical “plodders”, or become so with advancing years. A bad choice of technique has serious and immediate consequences for the patient – and the reputation of the surgeon. Preference for the tried and tested is, therefore, understandably commoner than the desire to try the latest thing, yet advances rely on the latter.

Can you apply the results of an RCT to your practice? – choice of technique for water shot

This can be illustrated by another golfing analogy – the “Across the water” shot.

Suppose the green lies directly over a stretch of water. The best shot is to drive straight over, onto the green in one – but only if you have the skill and the equipment to do it reliably. A much safer option is to take the dog-leg, go around the hazard – it will require at least two strokes, but should get you there more safely.

Now suppose a group of top golfer-surgeons get together and publish a randomised controlled trial of the straight across the water versus the dog leg “operation” for this “condition”. Being top players, they will consistently succeed with the more difficult shot. Their multi-centre RCT will conclude that the best operation is the straight over the water shot.

Now, supposing Mr Average Surgeon starts trying this. We will rapidly find that not all surgeons can walk on water…. and, unfortunately, no amount of “specialist training” can bring the bulk of the profession above average skill levels.

We now have to come back to, and rely upon, individual insight and clinical judgement as to which operation to do – even though there may be published RCTs giving “the answer”.
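The across-the-water argument can be put into rough numbers. The following is only a sketch in Python, with every figure invented for illustration: the dog-leg is assumed always to take 2 strokes, a successful straight shot 1 stroke, and a ball in the water is assumed to cost 4 strokes in total (the failed shot, a penalty, then the dog-leg). Under those assumptions the straight shot takes 4 − 3p strokes on average, where p is the golfer's probability of clearing the water, so it only beats the dog-leg when p is above 2/3.

```python
# Illustrative sketch only: all stroke counts and probabilities are assumptions.

DOG_LEG_STROKES = 2  # assumed: the safe route always takes two strokes


def expected_strokes_straight(p_clear, strokes_if_wet=4):
    """Average strokes for the straight shot.

    p_clear        -- probability that this golfer clears the water (assumed)
    strokes_if_wet -- total strokes if the ball lands in the water (assumed)
    """
    return p_clear * 1 + (1 - p_clear) * strokes_if_wet


def better_choice(p_clear):
    """Return the option with the lower expected number of strokes."""
    if expected_strokes_straight(p_clear) < DOG_LEG_STROKES:
        return "straight"
    return "dog-leg"


# Hypothetical skill levels: the trial's top players versus Mr Average Surgeon.
for label, p in [("trial golfer-surgeons", 0.95), ("Mr Average Surgeon", 0.50)]:
    print(label, round(expected_strokes_straight(p), 2), better_choice(p))
```

With these made-up figures the trial participants do better going straight over, while the average player does better on the dog-leg. The trial result is correct for the people who took part in it, and still the wrong advice for most of those who read it.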

Professional judgement

Judgement is something beyond skill. A surgeon with good judgement will tend to avoid getting into situations which test his advanced skills.

Teamwork and the performance environment

It must also be realised that surgeons do not work as isolated individuals, but as part of a team. Teamwork in the operating theatre is crucial. The finest and most skilled operator can have his performance ruined by an unfamiliar anaesthetist producing poor conditions, a clumsy assistant, a scrub nurse who doesn’t know the instruments, bad maintenance of equipment, or a thousand other mishaps which can and will happen. In the UK, NHS surgeons have less and less control over these aspects of their performance environment. It is part of surgical judgement to know in which clinical environment to attempt certain cases. None of these factors can be controlled for in RCTs. They will not appear in any evidence base, and yet they are fundamental to successful surgical practice.

Techniques are continuously evolving

Modern surgery has developed primarily by the application of new technology. ENT as a speciality could not develop until the invention of the electric light bulb made reliable illumination of dark recesses possible. Improved visualisation – especially fibreoptic endoscopy and other forms of imaging – together with reliable, safe anaesthesia, and improved peri-operative care – have resulted in progressive improvements over the decades. Virtually no advances in surgery have been made by learning from the results of RCTs.

Although there are some seismic changes – such as the introduction of cross sectional imaging in the early CT scanners of the 1970s – most of the improvements that occur in surgical techniques are incremental – a small improvement in an instrument, a slightly brighter light source, a higher resolution, finer fibre endoscope.

Who is best placed to recognize improvements in the quality of surgical equipment?

The improvement in quality is obvious to the skilled surgeon, who has the device in his hand, but not necessarily to others. Surgeons who are highly motivated to obtain the best results for their patients demand the best equipment. As skilled artisans, they know they can do a better job with better tools. If, however, they are obliged to produce “the evidence base” for using this new and more expensive piece of kit, the evidence base is unlikely to be forthcoming. In order to demonstrate an improvement in clinical outcome to the standards required for a Cochrane review, trials involving large numbers of patients will need to be set up. This is likely to take several years. By the time the results are available, it is almost certain that the technology will have moved on.
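To give a rough sense of why incremental improvements demand large trials, the sketch below uses the standard normal-approximation formula for comparing the mean outcome in two groups: patients per arm ≈ 2(z₁ + z₂)² / d², where d is the standardised effect size (the difference in outcome divided by its standard deviation), z₁ corresponds to the significance level and z₂ to the power. The two-sided 5% significance level, 80% power and the effect sizes tried are conventional or illustrative assumptions, not figures taken from any Cochrane review, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist  # standard library, Python 3.8+


def patients_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate patients needed in each arm of a two-arm trial.

    effect_size -- standardised difference between the arms (assumed value)
    alpha       -- two-sided significance level (conventional assumption)
    power       -- chance of detecting the difference if it is real (assumption)
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# From a large improvement down to a small, incremental one (illustrative sizes).
for d in (0.8, 0.5, 0.2, 0.1):
    print(d, patients_per_arm(d))
```

Because the required number rises with the inverse square of the effect size, halving the size of the improvement roughly quadruples the number of patients needed. A slightly brighter light source or a slightly finer endoscope is exactly the kind of small effect that pushes a trial into the thousands.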

Conclusions & Recommendations

Cochrane reviews of surgical treatments which conclude “no evidence of benefit” due to a lack of high quality randomised controlled trials should display, prominently in the plain language summary, a Health Warning to non-experts interpreting the conclusions. This warning should include the following:

  • An explanation of what is meant by “no evidence of benefit” and, specifically, that it does not mean “this treatment does not work”.
  • A list of reasons why RCTs of surgical treatments may not be available.
  • Specific recommendation that healthcare commissioners seek expert local advice before acting on the conclusions of the review.


Although a number of golfing analogies appear in this article, I would like to make it clear that I am not a golfer. I gave up the unequal struggle to achieve consistent and reliable results 20 years ago, when it became clear to me that the amount of remedial training and practice required to compensate for my deficiency of natural talent in this area would considerably exceed the time available to a busy professional with a young family.

