By M Vidyasagar
After (too long!) an absence, Mathukumalli Vidyasagar (“Sagar”) returns with his Computational Biology Corner column. This time Sagar recounts an incident that reinforces the need to critically review your measurements!
In my last column, about a year ago, I addressed the lack of standardization in biological instrumentation. In that column I bemoaned the fact that two different platforms, each of which claims to measure exactly the same quantity, namely the amount of messenger RNA (mRNA) produced by various cancer tumor tissues, produce wildly different measurements. I must apologetically return to the same theme in this column, albeit under different circumstances.
To refresh the memory of the reader (and to introduce my earlier column to those who had not read it when it originally appeared), my students downloaded data on about 580 ovarian cancer tumors from the web site of the National Cancer Institute (NCI), specifically The Cancer Genome Atlas (TCGA) project. Expression levels of several genes in each tumor sample had been measured using two different platforms. But when the two sets of measurements were plotted against each other, there was no resemblance whatsoever between them! Therefore, any prognostic predictor based on one set of measurements will fail miserably on the other set of data, and it does not matter which one is used! To us engineers, it sounds fairly absurd to say “Well, if you use this platform, then these are the genes that give you the best predictions of your chances of recovery, but if you use that platform, then an entirely different set of genes is the best predictor.”
But today’s column goes one better, because it concerns two sets of measurements taken on ostensibly the same platform, but at two different points in time. To me the lack of repeatability on the same platform is far worse than the lack of repeatability across platforms, because it would cause me to question the worth of even a single platform.
In brief, here is the story. About 18 months ago, one of our collaborators measured the expression levels of 1,428 micro-RNAs (miRNAs) in 94 endometrial cancer tumors. He found that, out of the 1,428 × 94 = 134,232 measurements, about 43% came out as “NaN” (Not a Number). We had assumed that the NaN readings arose because the quantities being measured were too small to register, and thus replaced them with a very small number. Later on we got a fresh supply of 30 more tumors, and when some of the same miRNAs were measured on the new samples, there were hardly any NaN entries! So our collaborators re-measured the miRNAs on three of the old samples, and this time again there were hardly any NaN readings! Exploring the mystery further, our collaborators discovered that the company that manufactured the hybridization system for the measurement platform had gone out of business, so the core facility was now using a different (and supposedly functionally equivalent) system for hybridization. Except that the expression levels were now 4 to 10 times higher, or an addition of roughly 2 to 3 on a binary logarithmic scale. So the two measurement systems, taken as a whole, were not at all identical. Knowing this, however, we could somehow “normalize” for this phenomenon. But we were hardly prepared for what happened next.
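The arithmetic in this story is easy to check. The sketch below, which uses only the Python standard library, reproduces the measurement count, converts a fold change into a shift on the binary logarithmic scale, and shows one possible way to estimate a constant offset from samples measured on both systems. The function names and the median-offset idea are illustrative assumptions on my part, not the procedure our collaborators actually used, and the toy numbers in the usage note are invented.

```python
import math
import statistics

# Counts quoted in the column: 1,428 miRNAs measured on 94 tumors.
N_MIRNAS = 1428
N_TUMORS = 94
TOTAL_MEASUREMENTS = N_MIRNAS * N_TUMORS  # 134,232


def nan_fraction(values):
    """Return the fraction of readings that are NaN."""
    values = list(values)
    return sum(math.isnan(v) for v in values) / len(values)


def log2_shift(fold_change):
    """Additive shift on a log2 scale that corresponds to a fold change."""
    return math.log2(fold_change)


def median_offset(old_log2, new_log2):
    """Median of (new - old) over paired log2 readings, skipping NaNs.

    Hypothetical normalization: subtracting this offset from the new
    readings would bring them onto the scale of the old ones, assuming
    the change between systems is a constant shift on the log2 scale.
    """
    diffs = [
        new - old
        for old, new in zip(old_log2, new_log2)
        if not (math.isnan(old) or math.isnan(new))
    ]
    return statistics.median(diffs)
```

For instance, `log2_shift(4)` is exactly 2 and `log2_shift(10)` is about 3.32, so the figure of 2 to 3 quoted above is a rounded one. With made-up paired readings such as `median_offset([1.0, 2.0, 3.0], [3.5, 4.5, 5.0])`, the estimated offset is 2.5. A constant offset of this kind can only correct a uniform shift; it cannot repair a measurement whose distribution has changed shape.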
One particular miRNA measurement on the old and the new samples did not match at all! The diagram below shows the situation. The blue curve is the set of measurements of the original samples and the red is for the new batch of 30 samples. There is the well-known two-sample Kolmogorov-Smirnov (K-S) test that allows us to test whether or not two sets of samples are generated by the same (unknown) probability distribution. In this case, however, no such fancy mathematics is needed, because one can see with the naked eye that the two sets of samples have nothing to do with each other.
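For readers who would like to see the fancy mathematics anyway, here is a minimal sketch of the two-sample K-S test using only the Python standard library. The statistic D is the largest gap between the two empirical distribution functions. The p-value uses the standard asymptotic series for the Kolmogorov distribution with a common small-sample correction, so it is only an approximation. The function name `ks_two_sample` is my own, and in practice one would more likely call a library routine such as `scipy.stats.ks_2samp`.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, approximate p-value) for the two-sample K-S test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)

    # D is the largest absolute difference between the two empirical
    # CDFs. Both are step functions, so it suffices to compare them at
    # every observed value.
    d = max(
        abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
        for x in a + b
    )

    # Asymptotic p-value: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2),
    # with a widely used small-sample correction applied to lam.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        # The series converges slowly here, and Q is 1 to many decimals.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)
```

Called on two batches of readings, for example `ks_two_sample(old_values, new_values)` with hypothetical lists of 94 and 30 numbers, a value of D close to 1 together with a tiny p-value says formally what the eye sees in the plot: the two batches were not drawn from the same distribution.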
But it is the reason why the two sets of samples bear no resemblance to each other that is most interesting. Further investigation revealed that the platform maker (not the maker of the hybridization system that had been changed) had replaced a primer that binds to the 3′ end of the miRNA with another primer that binds to the 5′ end. Obviously this would affect the biochemistry considerably, and I find it inexcusable that a vendor would make such a fundamental change while still maintaining the facade that the new platform is “the same” as the old platform.
Note also that we were able to detect this anomaly only because we had 30 new samples to compare against the original 94 samples. What if we had just one sample? We would have perhaps rejected this measurement as an outlier, whereas in reality the problem lay elsewhere.
So now I am beginning to wonder just how seriously one should take the outcomes of experimental biology. The biologists themselves are not to be blamed. After all, a hundred years ago each physics laboratory painstakingly built its own equipment, with all the lack of standardization and lack of repeatability that entailed. There can be no denying that physics advanced more rapidly after instrumentation became standardized. In today’s physics or engineering laboratories, researchers change from one vendor to another without a second thought. They certainly do not need to worry that the vendors are selling them a bill of goods under false pretenses. In turning away from local home-grown instrumentation to commercial vendors, biologists are following the path earlier taken by physicists. But it appears that the biology researchers are indeed being sold a bill of goods under false pretenses.
One of my collaborators told me (as I stated in my earlier column) that researchers in biology know this, and do a lot of pro bono quality control work for equipment vendors. He also told me recently that “biologists still prefer small, hypothesis-driven research to large, data-driven research” for precisely this reason. So this would seem to give the lie to the widely repeated statement that, with so much data being generated at an unprecedented pace, great advances are just around the corner.
When I give talks on the role of computation in biology, I often start off by saying:
- Data is not information
- Information is not knowledge
- Knowledge is not wisdom
But now I think I need to put one more bullet in front of all those bullets, namely:
- Measurements are not data!
Mathukumalli Vidyasagar is the Founding Head of the Bioengineering Department, University of Texas at Dallas. He is a Fellow of the Royal Society, UK.