The recipe for much of modelling in systems and network biology is comparatively easy. First one establishes the topology or ‘structure’ of the network (the curly-arrow version seen in wallcharts – such as those for metabolism, now available electronically – of ‘who talks to whom?’). Then one finds the equations – such as that of Henri, Michaelis and Menten – describing the effect of the concentrations of the various interacting partners on the rate of the reaction step involved. Finally one determines the parameters of those equations (such as the rate constants, measured using the methods of molecular enzymology, and the ‘fixed’ concentrations of substrates and other effectors). Armed with the model, it is then straightforward to run it using appropriate software (e.g. the COPASI software found on the web), either with ordinary differential equations or, when necessary, stochastic methods. These give the time evolution of the system variables – typically the concentrations and fluxes of molecules – and one may also determine the local or global sensitivities and summation laws relating every variable to every parameter. Even for large models, modern algorithms (including those for solving so-called ‘stiff’ ordinary differential equations) require comparatively little computer power.
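For concreteness, here is a minimal sketch of that forward-modelling recipe in Python (using scipy rather than COPASI): a single irreversible Michaelis–Menten step is integrated as an ODE, and crude scaled local sensitivities are estimated by finite differences. All names and parameter values are illustrative assumptions, not taken from any real system.

```python
# Minimal forward-modelling sketch: one irreversible Michaelis-Menten step S -> P.
# Parameter values are illustrative only.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y, Vmax, Km):
    """dS/dt and dP/dt for a single irreversible Michaelis-Menten step."""
    S, P = y
    v = Vmax * S / (Km + S)           # Henri-Michaelis-Menten rate law
    return [-v, v]

params = dict(Vmax=1.0, Km=0.5)       # illustrative rate parameters
y0 = [10.0, 0.0]                      # initial [S] and [P]
t_end = 5.0

sol = solve_ivp(rhs, (0, t_end), y0, t_eval=np.linspace(0, t_end, 51),
                args=(params['Vmax'], params['Km']),
                method='LSODA')       # LSODA also copes with stiff systems
print("final [S], [P]:", sol.y[:, -1])

# Crude scaled local sensitivities of final [P] to each parameter,
# estimated from a 1% finite-difference perturbation.
def P_end(Vmax, Km):
    s = solve_ivp(rhs, (0, t_end), y0, args=(Vmax, Km), method='LSODA')
    return s.y[1, -1]

base = P_end(**params)
for name in params:
    perturbed = dict(params)
    perturbed[name] *= 1.01
    sens = (P_end(**perturbed) - base) / (0.01 * base)
    print(f"scaled sensitivity of final [P] to {name}: {sens:.3f}")
```

A dedicated package such as COPASI computes these sensitivities (and much more) directly; the point here is only to show how short the forward step is once topology, equations and parameters are in hand.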

The first problem is that we do not necessarily know any of the above (neither the topology, nor the equations, nor the parameters), and all we have – as nowadays produced by ‘omics’ methods – is a set of measurements of the time series of variables. The second problem is that the parameters determine the variables and not the other way round. How then do we infer the network – the systems biology model – that is best capable of fitting experimental measurements of variables (or even the desired behaviour of variables, as may be of importance in synthetic biology or metabolic engineering)? This is an ‘inverse’ or ‘system identification’ problem; it often requires considerable computer power and is rather hard, as such problems are usually highly underdetermined – many different sets of parameters can fit the same data. To avoid (re)inventing the wheel we would like to know which methods work best. (We have recently published a new method of our own, and a separate paper on the essentially similar problem of detecting the mode or site of action of something that effectively changes a parameter.)
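To make the inverse problem concrete, the toy sketch below (again in Python with scipy, and emphatically not the published method referred to above) generates a noisy synthetic time series of product concentration from the same Michaelis–Menten toy model and then refits Vmax and Km by least squares from several starting guesses. Rather different parameter pairs can fit the data almost equally well, which is precisely the underdetermination at issue. All data and names are synthetic and illustrative.

```python
# Toy illustration of the 'inverse' (system identification) problem:
# recover Vmax and Km from a noisy time series of [P].
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
t_obs = np.linspace(0, 5, 11)
y0 = [10.0, 0.0]                      # initial [S] and [P]

def simulate_P(Vmax, Km):
    """Simulate [P](t) for the irreversible Michaelis-Menten step S -> P."""
    def rhs(t, y):
        S, P = y
        v = Vmax * S / (Km + S)
        return [-v, v]
    sol = solve_ivp(rhs, (0, 5), y0, t_eval=t_obs, method='LSODA')
    return sol.y[1]

P_true = simulate_P(1.0, 0.5)                         # 'true' parameters
P_obs = P_true + rng.normal(0, 0.1, P_true.shape)     # noisy synthetic data

def residuals(log_params):
    Vmax, Km = np.exp(log_params)                     # keep parameters positive
    return simulate_P(Vmax, Km) - P_obs

# Fit from several starting guesses: distinct (Vmax, Km) pairs can fit the
# data almost equally well -- the underdetermination discussed above.
for guess in ([1.0, 1.0], [5.0, 20.0], [0.5, 0.1]):
    fit = least_squares(residuals, np.log(guess))
    Vmax, Km = np.exp(fit.x)
    print(f"start {guess} -> Vmax={Vmax:.2f}, Km={Km:.2f}, cost={fit.cost:.3f}")
```

In this toy case the substrate stays well above Km over the observation window, so the data constrain the ratio of the parameters far better than the parameters themselves – a miniature version of why serious network-inference methods need more than naive curve fitting.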

In these circumstances, a bake-off of competing methods allows a comparison of their strengths and weaknesses – much as the CASP competitions do for protein structure prediction – and this is what has been done by the DREAM team (where the acronym stands for Dialogue on Reverse Engineering Assessment and Methods). The results from two published competitions are too detailed to give here, but a number of lessons have already emerged. First, the best algorithm is not the same for all problems (as one might expect from the No Free Lunch theorem, proved and popularised more than ten years ago). Secondly, some algorithms do indeed perform remarkably well, and this is very encouraging. One might perhaps summarise the DREAM competition by saying Network Inferencing Gets Heavy Testing Mechanisms, And Results Excel Sometimes – or NIGHTMARES for short…

Last week’s travelogue included the launch of the new Blueprint (PDF) by Lord Drayson and others from the Office for Life Sciences, and a visit to the European Bioinformatics Institute to discuss infrastructure issues.

Finally, I blogged before about how even scientists have been known to copy stories or references without checking them independently. There is a hilarious story from the Times on the speculative semiotics of furniture names, based on the perceived relationships between Swedes and Danes. However, according to the Language Log, this turns out to have been a spoof. Another urban myth exploded.
