Algorithms Good to GREAT: The Search for the Perfect Acronym

By M. Vidyasagar
University of Texas at Dallas

“What’s in a name? That which we call a rose
By any other name would smell as sweet”
(Romeo & Juliet, Act II, Scene II)

Clearly William Shakespeare did not understand marketing. Perhaps there was a time when an idea would sell simply on its own merits, and did not need a catchy name. Indeed, this still seems to be the case in my home field, control theory, where I cannot think of even a single catchy acronym.

What few acronyms there are, are pretty descriptive, such as LQR (Linear Quadratic Regulator), LQG (Linear Quadratic Gaussian), and so on. The most famous algorithm of linear control theory developed during the past thirty years doesn’t even have a name, let alone an acronym. The algorithm for H-infinity optimal controller design published in [1] can perhaps be called “the coupled Riccati equation algorithm”, but that is about it. The paper itself is often referred to as DGKF after its authors, but the algorithm continues on in blissful anonymity.

Contrast this with the situation that prevails in computational biology. I can say that it is virtually impossible to publish a paper in this area unless one has a catchy acronym for whatever one is doing. Just consider the following sampler:

BLAST: Basic Local Alignment Search Tool (Anyone who has tried to read the papers [2] and [3] that present the theory can vouch for the fact that the technique is anything but “basic”.)

GLIMMER: The “IMM” stands for “Interpolated Markov Model”

HMMER (pronounced “Hammer”): The “HMM” stands for “Hidden Markov Model”

PANTHER: Protein Analysis Through Evolutionary Relationships (The know-it-all Microsoft Word editor won’t allow me to misspell Analysis and Through)

CanPredict: “A computational tool for predicting cancer-associated mutations”

I could go on and on. Indeed the reader can browse just about any issue of any journal in computational biology and come up with a similar list.

To my mind the grand-daddy of them all is:

TCGA: The Cancer Genome Atlas

Lest we think that this is purely an American phenomenon, from across the big pond comes:

COSMIC: Catalogue of Somatic Mutations in Cancer (The spelling of “Catalogue” ought to reveal that this one comes from the UK, from the Sanger Institute as a matter of fact.)

There is a good reason why GREAT is in upper case letters in the title of this column. It too is an acronym, and stands for “Genomic Regions Enrichment of Annotations Tool”. So you see, dear reader, it is no longer sufficient for you to invent an algorithm that is merely “great” – it has to be “GREAT”.

Just today there was a news item that a professor at MIT has developed an algorithm for detecting short tandem repeats (STRs) called, what else, “lobSTR”. Though I read the news item with a great deal of care, I am unable to find the rationale for “lob” in “lobSTR”. It also makes me wonder whether the good professor would have called his algorithm “lobSTR” if he had been working in a land-locked state, say Kansas.

Continuing on in that vein, I propose TORNADO: “Transcriptome of RNA-Derivatives Optimization”. No need to thank me, all you biology professors in Nebraska, Kansas, Arkansas, Oklahoma, etc. I have done the heavy lifting in inventing the acronym. All you have to do now is to invent the algorithm itself.

Let me conclude with the observation that charity begins at home (and also that confession is good for the soul). My students and I invented an algorithm for identifying a handful of the most informative features from genome-wide expression studies. The algorithm combines the “Student” t-test, a support vector machine (SVM) that uses the ell-one norm (sum of absolute values) instead of the more common ell-two norm (Euclidean norm), and finally, recursive feature elimination (RFE). So naturally we called our algorithm

ell-one SVM t-test and RFE

leading to the second-level acronym:

lone star

With a name like that, and considering our domicile, how can we miss?

For Further Reading

1. John C. Doyle, Keith Glover, Pramod P. Khargonekar and Bruce A. Francis, “State-Space Solutions to Standard H_2 and H_infinity Control Problems”, IEEE Transactions on Automatic Control, 34(8), 831-847, August 1989.

2. A. Dembo, S. Karlin and O. Zeitouni, “Critical phenomena for sequence matching with scoring”, Annals of Probability, 22(4), 1993-2021, 1994.

3. A. Dembo, S. Karlin and O. Zeitouni, “Limit distribution of maximal non-aligned two-sequence segmental score”, Annals of Probability, 22(4), 2022-2039, 1994.

IEEE Life Sciences

June 2012 Contributors