As highlighted in the latest issue of Nature Reviews Drug Discovery, the incubation phase of a new Open Access platform called Sage should begin on 1 July 2009. The aim of this initiative, led by Stephen Friend and Eric Schadt, who are leaving Merck to run it, is ‘to integrate large-scale biological information into models and then enable other scientists to leverage that information in an open access way’. It starts with a large chunk of data and information (especially expression profiling data) that they bring to the party.

The basic approach involves expression profiling of carefully designed samples, coupled to advanced numerical inferencing techniques, to predict the genetic or other networks underlying disease (or regulation more generally, e.g. in yeast). Examples include Bayesian approaches, which Manchester colleagues and I have used with success. Particular emphasis is laid on eQTLs (or, as I called it somewhat earlier, genotype-phenotype mapping). The Sage approach has already enjoyed a number of successes, as in the identification of biochemical links to obesity, inflammation and susceptibility to diabetes, the discovery of drug targets, and network-based drug discovery (see also a previous blog).
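To make the eQTL idea concrete, here is a toy sketch in Python. It is emphatically not the Sage method (which relies on far richer network inference); it simply regresses the expression level of one gene on the allele count (0, 1 or 2) at one marker across a handful of samples. The function name `eqtl_association` and the numbers in the usage note are invented for illustration.

```python
from math import sqrt
from statistics import mean


def eqtl_association(genotypes, expression):
    """Return (slope, r) for a simple linear fit of expression on allele count.

    genotypes:  allele counts per sample (0, 1 or 2) at a single marker
    expression: expression level of a single gene in the same samples
    Illustrative only: a real eQTL analysis would also handle covariates,
    population structure and multiple testing.
    """
    if len(genotypes) != len(expression):
        raise ValueError("genotypes and expression must have the same length")
    if len(genotypes) < 2:
        raise ValueError("need at least two samples")

    g_mean = mean(genotypes)
    e_mean = mean(expression)

    # Sums of squares and cross-products about the means.
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    syy = sum((e - e_mean) ** 2 for e in expression)
    sxy = sum((g - g_mean) * (e - e_mean) for g, e in zip(genotypes, expression))

    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or expression")

    slope = sxy / sxx           # change in expression per extra allele copy
    r = sxy / sqrt(sxx * syy)   # Pearson correlation between the two
    return slope, r
```

Calling `eqtl_association([0, 0, 1, 1, 2, 2], [1.0, 1.2, 2.0, 2.2, 3.0, 3.2])` on these made-up values gives a slope close to one expression unit per allele copy and a correlation close to 1, which is the kind of signal that would flag the marker as a candidate eQTL for that gene.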

Our own take includes the need to make models (such as the yeast metabolic network) available to the community in a semantically principled manner, preferably in SBML; to develop approaches using Taverna to integrate such SBML-specified networks with expression profiling data; and to recognise the absolute importance of metabolic transporters for moving drugs around the body.

A number of related initiatives recognise the power of collaborative projects, made that much easier by the net and the wikinomics culture of Web 2.0. One might here also mention the recent acquisition for Open Access (by the Wellcome Trust via the EBI) of a large chemogenomic dataset from Galapagos, the rise of Open Access chemistry, and internet-accessible chemical databases such as ZINC (which we have found very useful) and ChemSpider (recently acquired by the RSC). The future for integrative data analysis for understanding complex biochemical networks has been looming for quite a while; I am confident that Sage and related activities can give it a great boost.

On the swine flu front, we already have some ideas of the antigenic and inhibitor sensitivity properties of the 2009 strain, suggesting that in its present form it should be sensitive to the commonest existing neuraminidase inhibitors but that it may be resistant to existing antibodies/vaccines. In addition, the May 21st Nature has a couple of reflective articles, noting that while this (northern) summer’s incidence suggests that the strain(s) currently circulating are not yet especially fearsome, they serve as a timely reminder of the need for proper preparedness, and of the lessons of history about the need to communicate accurately. That need is even more acute in the era of the internet and especially of the Web.
