As well as attending many other individual meetings, including a monthly meeting of the RCUK Chief Executives (and our annual ‘awayday’ in the exotic location of Polaris House, Swindon), I had a thoroughly interesting and enjoyable visit to Cranfield University, where the Vice Chancellor is the former EPSRC Chief Executive, Professor Sir John O’Reilly.

Cranfield has a particular focus on applied problems, and an attendant culture. A number of the very interesting project areas that I saw could be characterized as the study of complex, multi-organism systems, including soils, water treatment plants, composts and so on. The National Soil Research Institute maintains definitive details and data on soils, including the World Soil Survey Archive and Catalogue (Wossac), and some of these ‘soilscapes’ can be viewed online. The opportunities for combining these kinds of data with metagenomics analyses (review) are considerable.

In addition, I had the honour of renaming ‘Building 53’ at Cranfield as the Bullock Building, after Professor Peter Bullock, a distinguished soil scientist whose achievements included the Presidency of the British Society of Soil Science, driving a variety of soil science surveys and initiatives, advising the then Royal Commission on Environmental Pollution, and membership of the team recognised for their contribution to the award of the 2007 Nobel Peace Prize to the Intergovernmental Panel on Climate Change. As well as housing scientific activities, the Bullock Building will be home to The Institute of Agricultural Engineers and the British Society of Soil Science. One of Peter’s last projects involved contributing to the Soil-net resource.

One approach to dealing with the complexity of biological systems involves studying them at just one level of organization (genome, transcriptome, proteome, metabolome, etc), and this leads naturally to the recognition that the next steps should involve integration of such data. Among many schemes proposed, a recent update from the Gaggle team is of interest, as it uses the idea of annotating the nominal fixed points of the genome for integration and understanding of cellular dynamics. Of course these data are typically distributed, and – especially for genomics data – increasing at rates far in excess of Moore’s Law. A general class of proposed solutions involves so-called cloud computing, something we are actively looking at; a recent review summarises some of the successes to date.

This week I am pleased to have been invited to attend the 2010 Sci Foo camp at the Googleplex, a follow-up to last year’s camp, about which I blogged. The cast list includes both previous and new attendees, and as it is an unconference it is hard to know what will be presented, save that it will undoubtedly be interesting and visionary. I am sure it will improve considerably my understanding of how to deal with the data deluge.

Our woes with avian predators continue. Last Sunday I awoke to see a very large heron standing by the side of our garden pond, looking carefully for any remaining fish it might have missed. Evidently, our success in clearing algae from the pond had merely made the fish more visible. Sometimes one manipulates ecosystems at one’s peril!

Bare JC, Koide T, Reiss DJ, Tenenbaum D, Baliga NS: Integration and visualization of systems biology data in context of the genome. BMC Bioinformatics 2010; 11:382. Full free text (pdf).

Schatz MC, Langmead B, Salzberg SL: Cloud computing and the DNA data race. Nat Biotechnol 2010; 28:691-693.

Wooley JC, Godzik A, Friedberg I: A primer on metagenomics. PLoS Comput Biol 2010; 6:e1000667. Full free text (pdf).
