How we can predict patient response to anti-HIV treatment

Innovation Matters


With funding from the European Union's Sixth Framework Programme, the IBM Haifa Research Lab is working on EuResist, a research project that is developing an integrated European system for the clinical management of antiretroviral drug resistance.

Some 33.2 million people are infected with the human immunodeficiency virus type 1 (HIV-1). To date, doctors have about 20 antiretroviral compounds at their disposal to fight the pandemic. These are divided into four classes, with each class of compounds targeting a specific stage of viral replication. Common treatment combines three or four compounds from at least two classes. While such combinations have made HIV a treatable condition, the disease itself has not been eradicated. Treatment must be prolonged, possibly for the patient's entire life. Long-term toxicity, difficulty in adhering to complex regimens, possible pharmacokinetic problems, and intrinsically limited potency are all factors favoring the growth of drug-resistant viral strains. Indeed, drug resistance is now a major cause of treatment failure.

This paper presents work carried out by members of the Machine-Learning group at the Haifa Research Lab (HRL). Our work attempts to predict patient response to a combination of antiretroviral treatments by using a data-driven prediction system. Such a system requires a large database for training and validation. To this end, we have created the EuResist integrated database, which integrates demographic, clinical and genomic information on HIV patients from three large and expanding databases in different European countries. This work, carried out by the Healthcare and Life Sciences group, has resulted in one of the largest clinical genomic HIV databases in the world.

Our main goal was to rank combinations of antiretroviral treatments according to their potential success for a given patient and a given HIV genotype. Moreover, we needed to correctly predict the response to each combination of antiretroviral treatment. We used the viral load (VL) measure, i.e., the number of copies of viral RNA per milliliter of blood, to assess therapies. We defined a therapy success as a viral load below 500 copies per milliliter, or a reduction of at least 2 log compared to the baseline VL measure, after about eight weeks (4-12 weeks) of treatment (short-term response).
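The success definition above can be expressed as a small labeling rule. The following is a minimal sketch, not the EuResist labeling code: the function and constant names are hypothetical, and it assumes that "2 log" means a hundredfold (log10) drop and that measurements outside the 4-12 week window are left unlabeled.

```python
# Hypothetical names; thresholds taken from the definition in the text.
SUCCESS_VL_THRESHOLD = 500.0     # copies/ml
MIN_LOG10_REDUCTION = 2.0        # assumption: "2 log" means log10 units
FOLLOWUP_WINDOW_WEEKS = (4, 12)  # short-term response window


def is_therapy_success(baseline_vl, followup_vl, weeks_on_therapy):
    """Return True/False for success/failure, or None if unlabeled."""
    first_week, last_week = FOLLOWUP_WINDOW_WEEKS
    if not first_week <= weeks_on_therapy <= last_week:
        return None  # follow-up test falls outside the short-term window
    if followup_vl < SUCCESS_VL_THRESHOLD:
        return True
    # Here followup_vl >= 500, so the division is safe.
    # A drop of k log10 units is the same as a ratio of 10**k.
    return baseline_vl / followup_vl >= 10 ** MIN_LOG10_REDUCTION


print(is_therapy_success(100_000, 400, 8))    # below the 500 copies/ml threshold
print(is_therapy_success(100_000, 5_000, 8))  # drop is smaller than 2 log
```

How to treat a therapy with several VL tests inside the window (for example, which test to use) is left open here, since the text does not specify it.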

The EuResist integrated database is equipped with an “automatic labeling of therapies” feature. Currently it contains data collected from about 18,500 patients with 65,000 different therapies. Only one-third of the therapies contain therapy response information. Of these, 13,935 therapies are successful and 6,314 are failure therapies. Only five percent of the therapy records contain response information as well as genotypic data (999 failure therapies and 2,144 successful ones).
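The proportions quoted above can be checked with a few lines of arithmetic. This sketch uses only the counts given in the text; because the total of 65,000 therapies is stated as approximate, the resulting fractions are approximate too. The variable names are illustrative.

```python
total_therapies = 65_000  # "about 65,000" in the text, so ratios are approximate

labeled_success, labeled_failure = 13_935, 6_314
geno_success, geno_failure = 2_144, 999

labeled = labeled_success + labeled_failure   # therapies with response information
geno_labeled = geno_success + geno_failure    # ...that also have genotypic data

labeled_fraction = labeled / total_therapies
geno_fraction = geno_labeled / total_therapies
success_rate_labeled = labeled_success / labeled
success_rate_geno = geno_success / geno_labeled

print(f"labeled: {labeled} ({labeled_fraction:.1%} of all therapies)")
print(f"labeled with genotype: {geno_labeled} ({geno_fraction:.1%})")
print(f"success rate, labeled: {success_rate_labeled:.1%}")
print(f"success rate, labeled with genotype: {success_rate_geno:.1%}")
```

The two success rates come out close to each other (roughly two successes for every failure), so a predictor that always answered "success" would already be right about two-thirds of the time. That is a useful baseline to keep in mind when reading accuracy figures.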

We created a generative-discriminative (GD) prediction engine aimed at predicting therapy response given the virus genotypic data. In the training phase, we made use of the largest available data set, about 20,000 therapies, most of which were missing the genotypic information. We applied cross-validation as a model-selection technique to choose the Bayesian network that would contribute most to the overall prediction. We used accuracy (the share of correct predictions of success and failure therapies), as well as the area under the curve (AUC), on cross-validation and on a set-aside test set, to assess the performance of the prediction engine. Our tests showed that:

• The GD prediction engine outperforms current systems that are based on learning from HIV response to single drugs in the lab (in-vitro data).
• Prediction improves when patient history is available.
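The two performance measures mentioned above, accuracy and area under the curve, can be computed as in the following sketch. It is an illustration rather than the evaluation code used in the project: it assumes the curve in question is the ROC curve, that each engine outputs a score where higher means "more likely success", and that a cutoff of 0.5 is used for accuracy. The labels and scores are toy values, not EuResist data.

```python
def accuracy(labels, scores, threshold=0.5):
    """Share of therapies whose predicted class matches the true label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via its rank interpretation: the probability
    that a random success gets a higher score than a random failure
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = success, 0 = failure
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2]
print(accuracy(labels, scores))  # 3 of 5 correct at the 0.5 cutoff
print(auc(labels, scores))       # 5 of 6 success/failure pairs ranked correctly
```

Unlike accuracy, the AUC does not depend on the cutoff, which makes it the more informative of the two when successes outnumber failures.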

We devoted part of our work to coordinating prediction engines generated by other EuResist partners. We found that two other prediction engines seemed to perform well:

• The evolutionary engine (EV), developed by our partners from MPI.
• The mixed effects engine (ME), developed by our partners from Rome.

The comparison between the engines' accuracies has yielded at least two important findings:

• Different engines tend to agree on successful therapies much more than on failed ones. (As Tolstoy writes in Anna Karenina, "Happy families are all alike; every unhappy family is unhappy in its own way.") See Figure 1.


Figure 1


• The larger share of 'all wrong' cases among failure therapies, compared with successful therapies, was due mostly to inherent noise. In other words, many of these cases were characterized by inconsistent VL tests. For example, the test used for labeling the therapy showed a failure, while at least one other test showed a drop below 400 copies/ml, which indicates a definite success of the therapy.
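The kind of comparison behind these two findings can be sketched as a per-class breakdown of how often the engines are all correct, all wrong, or in disagreement. The function name is hypothetical and the inputs are toy values chosen for illustration; they are not the project's results.

```python
from collections import Counter


def agreement_by_class(labels, *engine_predictions):
    """For success and failure therapies separately, return the fraction of
    cases where all engines are correct, all are wrong, or they are mixed."""
    counts = {"success": Counter(), "failure": Counter()}
    for y, preds in zip(labels, zip(*engine_predictions)):
        n_correct = sum(p == y for p in preds)
        if n_correct == len(preds):
            outcome = "all correct"
        elif n_correct == 0:
            outcome = "all wrong"
        else:
            outcome = "mixed"
        counts["success" if y else "failure"][outcome] += 1
    return {
        cls: {k: v / sum(c.values()) for k, v in c.items()}
        for cls, c in counts.items() if c
    }


# Toy predictions from three engines (1 = success, 0 = failure)
labels = [1, 1, 1, 0, 0, 0]
gd = [1, 1, 0, 1, 0, 1]
ev = [1, 1, 1, 1, 0, 0]
me = [1, 1, 1, 1, 1, 0]
result = agreement_by_class(labels, gd, ev, me)
print(result["success"])
print(result["failure"])
```

Cases that land in the "all wrong" bucket for failure therapies are natural candidates for a second look at the underlying VL tests, in line with the noise explanation above.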

We are now working on combining these three engines into a single system. Our analysis shows that this single system outperforms each of the individual prediction engines in most tests. In the few remaining cases, its performance is very similar to that of the best-performing engine. Moreover, the standard deviation of prediction errors was consistently lower in the combined engine. This leads to a more robust prediction system, which will be available online in June 2008.
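The text does not say how the three engines are combined. As one simple possibility, the sketch below averages the engines' predicted success probabilities and reports the mean and standard deviation of the absolute prediction error. The averaging rule is an assumption made for illustration, the function names are hypothetical, and the scores are toy values.

```python
from statistics import mean, pstdev


def combine_mean(*engine_scores):
    """Combine engines by averaging their predicted success probabilities.
    (An assumed combination rule, not necessarily the EuResist one.)"""
    return [mean(per_therapy) for per_therapy in zip(*engine_scores)]


def error_spread(labels, scores):
    """Mean and standard deviation of the absolute prediction error."""
    errors = [abs(y - s) for y, s in zip(labels, scores)]
    return mean(errors), pstdev(errors)


# Toy scores from three engines (probability of success)
labels = [1, 1, 0, 0]
gd = [0.9, 0.6, 0.4, 0.1]
ev = [0.7, 0.8, 0.2, 0.3]
me = [0.8, 0.7, 0.3, 0.2]

combined = combine_mean(gd, ev, me)
for name, scores in [("GD", gd), ("EV", ev), ("ME", me), ("combined", combined)]:
    avg_err, std_err = error_spread(labels, scores)
    print(f"{name}: mean error {avg_err:.3f}, std {std_err:.3f}")
```

Averaging tends to smooth out errors that the engines do not share, which is one plausible reason a combined system would show a lower error spread than its individual components.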

Our partners in the European Union 6th Framework project include:

Informa S.r.l., Università degli Studi di Siena, Italy; Karolinska Institute, Sweden; Max-Planck-Institute for Informatics, University Hospital of Cologne, Germany; RMKI, Hungary; Kingston University; and the European Federation of Pharmaceutical Industries and Associations (EFPIA).

The EuResist combined system will be available in June 2008.

Visit the EuResist website for updates.

Read "EuResist collaboration receives global media attention."

Related Publications  

E. Aharoni, A. Altmann, G. Borgulya, R. D'Autilia, F. Incardona, R. Kaiser, C. Kent, T. Lengauer, H. Neuvirth, Y. Peres, A. Petroczi, M. Prosperi, M. Rosen-Zvi, E. Schülter, T. Sing, A. Sönnerborg, R. Thompson and M. Zazzi. Integration of viral genomics with clinical data to predict response to anti-HIV treatment. IST-Africa 2007 Conference & Exhibition, Maputo, Mozambique. May 2007.

A. Altmann, M. Rosen-Zvi, M. Prosperi, E. Aharoni, H. Neuvirth, E. Schülter, J. Büch, Y. Peres, F. Incardona, A. Sönnerborg, R. Kaiser, M. Zazzi and T. Lengauer. The EuResist approach for predicting response to anti HIV-1 therapy. Accepted for an oral presentation at the 6th European HIV Drug Resistance Workshop, Cascais, Portugal. 2008.

H. Neuvirth, M. Rosen-Zvi, N. Srebro, E. Aharoni, M. Zazzi and N. Tishby. Improved Prediction of HIV Resistance In-Vitro by Biochemically-Driven Models. NIPS 2006 workshop talk: "New Problems and Methods in Computational Biology". December 2006.

M. Rosen-Zvi, H. Neuvirth, E. Aharoni, M. Zazzi and N. Tishby. Consistent dimensionality reduction scheme and its application to clinical HIV data. NIPS 2006 workshop poster: "Novel applications of dimensionality reduction". December 2006.

M. Rosen-Zvi, E. Aharoni, A. Altmann, R. Kaiser, T. Lengauer, H. Neuvirth, M. Prosperi, F. Bazsó, F. Incardona and M. Zazzi. EuResist: European data, European interpretation systems. European Journal of Medical Research 12, August 2007.

M. Rosen-Zvi, A. Altmann, M. Prosperi, E. Aharoni, H. Neuvirth, E. Schülter, J. Büch, Y. Peres, F. Incardona, A. Sönnerborg, R. Kaiser, M. Zazzi and T. Lengauer. Selecting anti-HIV therapies based on a variety of genomic and clinical factors. 16th Annual International Conference Intelligent Systems for Molecular Biology (ISMB). 2008.

M. Zazzi, E. Aharoni, A. Altmann, F. Bazsó, P. Bidgood, G. Borgulya, J. Denholm-Prince, M. Fielder, C. Kent, T. Lengauer, T. Nepusz, H. Neuvirth, Y. Peres, A. Petroczi, M. Prosperi, L. Romano, M. Rosen-Zvi, E. Schülter, T. Sing, A. Sönnerborg, R. Thompson, G. Ulivi, L. Zalány and F. Incardona. EuResist: exploration of multiple modeling techniques for prediction of response to treatment. Proceedings of the 5th European HIV Drug Resistance Workshop. March 2007.

At IBM, the EuResist project is part of the Global Pandemic Initiative Accomplishment.

Last updated April 1, 2008

Innovator's corner  

Michal Rosen-Zvi, Researcher

What's the potential for the work you are doing?
The EuResist prediction system is expected to be available online to support virologists when they have to select a treatment for a patient. More generally, this work corroborates the personalized treatment approach and shows that large-scale databases of clinical genomic data enhanced with analytical tools can provide a powerful decision-support system.

What is the most interesting part of your research?
The findings that can be made in these vast volumes of data are exciting. We compare treatment approaches and find significant differences between the responses to the different treatments.

Who or what inspired you to go into this field?
My first steps as a machine learning researcher were at UC Berkeley. Professor Michael I. Jordan, whose class I attended, was my host there and was a source of inspiration for me.

What is your favorite invention of all time?
The written word, which probably originated in Egypt about 6,000 years ago, is my favorite invention. The ability to communicate ideas across countries and across different time periods simply by putting them in writing has made the world far richer than it otherwise would have been and has given the concept of eternity a whole different meaning.

Research team  

Ehud Aharoni

Hani Neuvirth

Yardena Peres

Carmel Kent
