November 29, 2023

Evaluating digital health services as drugs is stifling innovation.

Prof. Nick Barber
Head of Clinical Outcomes

There is an argument that digital health services should be assessed like drugs, and I would like to refute it. The issue is not that we would be using a sledgehammer to crack a nut but that we would be using one to turn a bolt.

My PhD was in clinical pharmacology, involving laboratory testing and trials of drugs. Since then, I have run NHS services and evaluated human and digital systems related to medicines. For 20 years, I collaborated on evaluations with the Information Systems Group at the London School of Economics.

Let’s consider how drugs are licensed. Candidate drug molecules are created, and after laboratory testing, successful ones will be used in increasing numbers of people. If early studies are successful, they culminate in a large randomised controlled trial to evaluate the drug’s effectiveness and identify side effects. If all goes well, the product will be licensed by a national body, made widely available and monitored (somewhat weakly) for problems. This process is incredibly resource-intensive. Only one in seven drugs that enter Phase 1 trials becomes licensed, a process that takes an average of 12 years. It costs over $1 billion to bring a molecule to market.

The problems with using this methodology as a template for digital health technologies fall into two related categories: 1) what we are evaluating and 2) the appropriateness of clinical trials as a methodology. This is not an attack on drug licensing or on randomised controlled trials.

Medicines and digital systems are very different things. One is fixed, and the other is infinitely adaptable. In the language of philosophy, they have different ontologies – natures – and so we would expect to use different epistemologies – ways of knowing about them.

Here are four issues with evaluating digital systems in the same way as evaluating drugs.

1. Pace of change

The aspirin molecule has not changed in over a century and, by definition, never will. This means we can accrete knowledge about it. Trials build on trials to explain its actions and quantify its clinical effects. All studies on aspirin measure the same molecule.

In contrast, digital technologies such as patient apps keep changing. They evolve to respond to user needs, the marketplace and technical developments. This causes an unavoidable problem as the ‘thing’ we are evaluating will soon become different.

Thirty years ago, I studied an automated dispensing system in the USA; let’s call it the Fixit. It is still available, using the same name, yet its hardware, software, and ownership have changed multiple times. Studies of the effectiveness of the Fixit from 30 years ago are now of nothing more than historical interest. Yet a literature search on the Fixit today will likely lump all studies together and be misleading for decision-making. If the organisation approaches product design as it should, iteratively and by learning from user behaviour, earlier versions will likely be less effective than later ones.

2. Potential for harm

Licensing medicines is so intensive because of their potential to cause harm. After licensing, they will permeate the bodies of tens of millions of people, and we wish to know, or be fairly certain, how well they will work and what risks may be associated. The horrific consequences of thalidomide drove much more robust evaluation processes, which have continued to be refined.

In contrast, digital apps are usually there to inform, gather data, and supplement rather than replace clinical care. In the case of Aide, for example, people can choose which data to enter about themselves and which information resources to access. It can be ignored or deleted at will; hence, the risks of harm to users are far less severe.

3. Usability

Drugs are designed to treat diseases rather than people. Their evaluation is based on a biomedical model rather than recognising individual people. This undoubtedly contributes to problems with the usability of medicines. Around 30-50% of people with a long-term condition do not adhere to their medication regime.

Drug companies may be able to improve this by reformulation or novel dosage forms; however, these require changes to licensing and manufacturing plants, both of which are time-consuming and very expensive.

In contrast, apps are very adaptable; it is in their interest to be person-centric at scale and to make using them an easy and enjoyable experience. Evaluation methods need to take this characteristic of serving the individual specifically into account.

4. Effectiveness

The level of proof that an intervention ‘works’ was set over 100 years ago at less than a 5% probability of the decision being wrong. Historians of statistics have found no rational reason for adopting this degree of certainty, but, with slight variations, it has become a standard level of proof for the effectiveness of medicines. One can see the need for a high level of proof for a molecule that will be used worldwide.

However, those of us who have managed and developed services have done so without any such sense of certainty. Policymakers at the system and national levels can institute practices that can create enormous harm without evidence that their innovations are 95% certain to succeed.

We need different levels of proof to adopt technologies, such as digital health platforms, which are inherently less risky than drugs and can be quickly adapted if problems emerge.

Any organisational leader will have introduced change with some trepidation, but with digital systems, they can be reassured that they are able to monitor and adapt the new system in light of the experience of implementing it. Perhaps we need to look to the law for inspiration: Crown Courts require proof beyond reasonable doubt, which is akin to 95% certainty, whereas in civil courts the necessary level of proof is based on ‘the balance of probabilities’. Perhaps a lower confidence level should be considered satisfactory for formative evaluation studies of digital health.


Digital product development is a complex emergent system. It requires rapid and effective feedback loops to learn and build on what works. It is an adaptive technology. Early studies provide feedback that can improve effectiveness and usability and reduce risk. The method of evaluation of apps should take this into account. An example of this approach is Michael Quinn Patton’s book Developmental Evaluation. In my view, it provides a good, gradualist guide to the issues in developing and evaluating, amongst other things, digital health platforms.

We should not adopt a process analogous to the licensing of drugs in evaluating apps. That is not to say that randomised controlled trials have no place; they do, but their own risks and benefits need to be recognised, and their adoption should be a rational act rather than ‘de rigueur’. Moving from research application to the final publication of a large trial can take five years. In that time, an app will have changed, clinical practices may have changed, and alternative digital services may have appeared on the market.

We need to be more flexible. 

This is not to say that healthcare systems would be exposing patients to unnecessary risks. Quality assurance measures are in place; there will be a clinical safety case, risk register, data security standards, etc. Quality systems from management can offer a helpful perspective on product development. One example is Juran’s three intersecting areas of quality: Quality Control, Quality Improvement and Quality Planning. Juran was clear that the quality plan should be based on the customer’s view of quality, tying in with the increasingly person-centric direction of health care.

There is a moral imperative to get technologies that help patients, health professionals and the NHS into use as quickly as possible. Evaluation must enhance, not hamper, that process.

An insistence on over-elaborate and, in my view, inappropriate evaluation methodologies stifles innovation. The high costs of running a trial favour well-funded existing companies and are a barrier to start-ups' original thinking and innovation. Adopting appropriate evaluation methods speeds access to good products to benefit patients, health professionals and the NHS.

Nick Barber is a pharmacist with an international reputation in patient safety, patient centredness and technology implementation. He is an Emeritus Professor at University College London School of Pharmacy, the former Vice President of the Royal Pharmaceutical Society and the Head of Clinical Outcomes at Aide Health.

