Science (whose etymological root means ‘knowledge’) is supposed to rest on objective and verifiable facts, but, as I have blogged before, there can occasionally be a tendency to groupthink. This is a phenomenon in which, simply by being repeated often enough, ‘facts’ become widely ‘known’ even when they have little or no experimental basis (something easily amplified via the Internet). I experienced a related issue during my D.Phil., when I traced back a chain of literature references behind a widely cited claim that the purportedly passive concentrative uptake ratio of a particular lipophilic cation into Escherichia coli was independent of its external concentration over a huge range. The claim turned out to originate in a review (by an exponent of the very theory he was trying to defend) in which the ‘finding’ was in fact cited as unpublished data. Since the intracellular concentration that would have been achieved at the highest external concentration considerably exceeded the aqueous solubility of the substance involved, one can only wonder what, if anything, had really been found; in this case an active transporter was almost certainly involved, as documented for yeast – and see also a review of carriers for pharmaceutical xenobiotics.

A nice book that I regularly dip into is Lloyd & Mitchison’s The Book of General Ignorance, whose seductive attraction is that it explodes widely held beliefs as mistaken. I am not going to spoil it by giving the reasons why ‘facts’ such as these are mistaken, but the dust jacket states, for example, that ‘Henry VIII had six wives – wrong’ and ‘Everest is the highest mountain in the world – wrong’. More common in science than claims being completely wrong is that we see only partial truths and ascribe a greater contribution to a specific finding than is actually the case; note too that in nonlinear networks individual contributions are rarely independent for finite changes. That X affects Y does not mean that X is the only thing that affects Y.

Another ‘fact’ I was told early in my career was that the rate of expression of recombinant proteins in E. coli is governed by codon usage (aka codon bias), and that for high-level expression this simply needed to be optimized and all would be well. Actually, as in almost any other area of biology, multiple phenomena contribute to the control of metabolic fluxes, and this one is no exception. In a really lovely piece of work just published, Kudla and colleagues varied the DNA sequence (and thus, inter alia, the codon usage) by producing an engineered library of 154 GFP variants that nevertheless (because of the redundancy of the genetic code) encoded the same amino acid sequence; the variants spanned a huge (roughly 250-fold) range of GFP expression. However, codon bias did not correlate with gene expression: most of the variance was instead due to variation in the stability of mRNA folding near the ribosome-binding site. While it is well known that the rate of translation is governed strongly by multiple elements involved in initiation, this is a nice example of the complications to be expected in synthetic biology.
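The kind of comparison involved can be sketched in a few lines of Python. This is not the authors’ actual analysis: the variant measurements below are invented purely for illustration, with a codon-bias score (e.g. a codon adaptation index) that does not track expression and a 5′-mRNA folding free energy that does, mirroring the qualitative conclusion of the study.

```python
# Hypothetical sketch: asking which sequence feature tracks expression
# across a variant library. All numbers below are invented for illustration.

def rank(values):
    """1-based ranks for Spearman's rho (no ties in this toy data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = float(r)
    return ranks

def spearman(x, y):
    """Spearman rank correlation (no tie correction; fine for a sketch)."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented per-variant values: codon-bias score, 5'-mRNA folding free
# energy (kcal/mol; less negative = less structured), relative expression.
cai        = [0.55, 0.80, 0.62, 0.91, 0.48, 0.70, 0.85, 0.60]
fold_dG    = [-9.1, -3.2, -7.8, -8.5, -2.1, -4.0, -6.5, -1.5]
expression = [ 3.0, 40.0,  6.0,  5.0, 70.0, 30.0, 12.0, 90.0]

print("rho(codon bias, expression) =", round(spearman(cai, expression), 2))
print("rho(folding dG, expression) =", round(spearman(fold_dG, expression), 2))
```

In this toy data the folding-energy correlation is strong while the codon-bias correlation is negligible; with real sequences one would of course compute both features from the DNA itself (e.g. folding energies from an RNA-folding package) rather than assume them.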

In a similar vein, a multiplicity of elements contributes to answering the question of what makes a good antibiotic target or a good TB vaccine (see also an ontology and curated database of epitopes), where the Rpf family of bacterial cytokines scores especially well. Scoring these contributions by sensitivity analysis allows one to estimate the relative contributions of such multiple causes, either in terms of variance (as above) or via information theory.
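The variance-based version of this idea can be illustrated with a minimal sketch. The model and its weights below are invented for demonstration; the point is that each input’s first-order sensitivity index, S_i = Var(E[Y|X_i]) / Var(Y), quantifies how much of the output variance that input explains on its own, here estimated crudely by binning on each input in turn.

```python
# Minimal variance-based sensitivity analysis sketch. The model is a
# hypothetical "score" in which x1 dominates, x2 matters a little and
# x3 is almost irrelevant; the weights are invented for illustration.
import random

random.seed(0)

def model(x1, x2, x3):
    return 4.0 * x1 + 1.0 * x2 + 0.1 * x3

N, BINS = 20000, 20
xs = [[random.random() for _ in range(3)] for _ in range(N)]
ys = [model(*x) for x in xs]

mean_y = sum(ys) / N
var_y = sum((y - mean_y) ** 2 for y in ys) / N

def first_order_index(dim):
    """Estimate S_dim = Var(E[Y | X_dim]) / Var(Y) by binning on X_dim."""
    bins = [[] for _ in range(BINS)]
    for x, y in zip(xs, ys):
        bins[min(int(x[dim] * BINS), BINS - 1)].append(y)
    cond_means = [sum(b) / len(b) for b in bins if b]
    m = sum(cond_means) / len(cond_means)
    return sum((c - m) ** 2 for c in cond_means) / len(cond_means) / var_y

for i in range(3):
    print(f"S_{i + 1} ~ {first_order_index(i):.2f}")
```

For this toy model the indices recovered are close to the analytical values (about 0.94, 0.06 and under 0.01), so the analysis correctly ranks x1 as the dominant cause without pretending the others contribute nothing; for serious work one would use a dedicated method (e.g. Sobol’ indices with proper sampling) rather than this binning shortcut.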

As H. L. Mencken famously (if approximately) said, “For every complex problem there is an answer that is clear, simple, and wrong”. Sensitivity analysis can explain why.
