Leiden University

Genes predict course of disease


Jelle Goeman: 'It takes some 20,000 gene measurements per patient. Classical statistics are not equipped for processing such large amounts of data.'

Wouldn't it be ideal if doctors could predict whether or not a tumour will spread simply by removing a small amount of tissue from a cancer patient? Statistician Jelle Goeman, who works at LUMC (Leiden University Medical Centre), has received a Veni grant to develop statistical techniques for analysing large numbers of gene measurements. His aim is to help doctors make better diagnoses, and to gain a better understanding of diseases by using knowledge of the functions of gene groups.

Microarray
How can you produce a gene profile? 'Scientists and medical specialists use microarrays, small chips on which they place pieces of genetic material,' Goeman explains. 'You can use these microarrays to show how active certain genes are. If this were to be done for a group of patients with metastasized (malignant) cancer and a group with non-metastasized (benign) cancer, you could then try to differentiate the two groups based on their gene profiles. Hopefully, such knowledge will enable you to predict at an earlier stage whether or not a tumour will spread.'

Gene functions
'Moreover, research on microarrays serves a fundamental scientific purpose,' Goeman explains. 'By examining which genes can predict whether or not a tumour will spread, we acquire more insight into the function of these genes.


Microarrays: small chips on which large numbers of DNA spots (pieces of genetic material) have been placed to examine how active certain genes are.

Genes which distinguish metastasized tumours from non-metastasized tumours apparently play, either directly or indirectly, a part in the metastasizing process.'

Statistical problem
Microarrays have proved useful in both applied and fundamental scientific research. There are, however, a number of statistical problems related to their use. In particular, the enormous number of measurements involved per patient presents the researcher with problems. Goeman: 'It takes some 20,000 gene measurements per patient. Classical statistics are not used to processing such large amounts of data. To be able to draw reliable conclusions in classical statistics on the genes' predictive value, you will need at least five patients for every gene that you examine. But in practice it is, of course, impossible to study 100,000 patients.'
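The arithmetic behind the 100,000 figure can be written out as a small sketch. The gene count and the five-patients-per-gene rule of thumb come from Goeman's quote; the function name and the figure of 200 gene groups are hypothetical, chosen only to show how the requirement shrinks when there are fewer variables.

```python
GENES_PER_PATIENT = 20_000   # figure quoted in the article
PATIENTS_PER_VARIABLE = 5    # rule of thumb quoted in the article


def patients_needed(n_variables: int, per_variable: int = PATIENTS_PER_VARIABLE) -> int:
    """Patients required when every variable in the model needs `per_variable` patients."""
    return n_variables * per_variable


print(patients_needed(GENES_PER_PATIENT))  # 100000

# Hypothetical: a model built on 200 gene groups instead of 20,000 single genes.
print(patients_needed(200))  # 1000
```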

Gene grouping
Within his Veni project Goeman will develop new statistical methods which will be better equipped to analyse large numbers of gene measurements. How will he go about that? 'I will make use of existing biological knowledge on genes. There are databases, such as Gene Ontology (GO), which contain information on genes, such as their different functions. Based on this information, you can subdivide genes into groups that are all involved with the same function: genes that regulate cell division, genes that are responsible for destroying unhealthy cells and so on. By using groups of genes as building blocks for your model, instead of individual genes, you can hopefully make do with smaller numbers of patients.'
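One simple way to picture groups as building blocks is to replace the measurements of individual genes with a single score per functional group. The sketch below is only an illustration under stated assumptions: the gene names, the group annotation and the choice of a plain average are all hypothetical, and the article does not say how Goeman's methods actually combine the genes within a group.

```python
from statistics import mean

# Hypothetical gene-to-function annotation, standing in for a database such as Gene Ontology.
GENE_GROUPS = {
    "cell division": ["geneA", "geneB"],
    "cell death": ["geneC", "geneD", "geneE"],
}


def group_scores(expression, gene_groups):
    """Average the measured activity of the genes in each functional group."""
    scores = {}
    for group, genes in gene_groups.items():
        measured = [expression[g] for g in genes if g in expression]
        if measured:  # skip groups with no measured genes
            scores[group] = mean(measured)
    return scores


# Made-up activity levels for one patient.
patient = {"geneA": 2.0, "geneB": 4.0, "geneC": 1.0, "geneD": 1.0, "geneE": 4.0}
print(group_scores(patient, GENE_GROUPS))
# {'cell division': 3.0, 'cell death': 2.0}
```

Five gene measurements are reduced to two group scores here; on a real microarray the reduction would be from thousands of genes to a much smaller number of groups.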


Gene Ontology: a hierarchical database in which information on the functions of groups of genes is stored (the lower in the hierarchical tree, the more specific the function becomes). The branches indicated in red correspond to gene functions in which differences have been found between cancer patients with a favourable disease course and patients whose disease progresses unfavourably.

Biological processes
Researching groups of genes not only has statistical advantages, it also serves another important purpose. 'As well as developing better instruments for prognosis, I would also like to gain more insight into the processes which are significant in the course of the patient's disease,' Goeman says. 'Which gene functions, for example, are involved in the spread of a tumour? By introducing functionally related groups one by one into your prediction model, you can see which of those groups have additional value for the prediction and consequently which groups play a part in the origins of the metastases. With this research my work will involve more than just building statistical models; I also want to try to make new biological insights possible.'
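Introducing groups one by one resembles what statisticians call forward selection. The sketch below shows that idea on made-up data and is not Goeman's method: the nearest-centroid classifier, the leave-one-out accuracy score, the function names and the patient values are all assumptions chosen to keep the example short.

```python
def loo_accuracy(rows, labels, groups):
    """Leave-one-out accuracy of a nearest-centroid classifier using only `groups`.

    Assumes every class still has at least one patient after one is left out.
    """
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        distances = {}
        for cls in sorted(set(labels)):
            others = [r for j, (r, l) in enumerate(zip(rows, labels))
                      if j != i and l == cls]
            centroid = {g: sum(r[g] for r in others) / len(others) for g in groups}
            distances[cls] = sum((row[g] - centroid[g]) ** 2 for g in groups)
        predicted = min(distances, key=distances.get)
        correct += predicted == label
    return correct / len(rows)


def forward_select(rows, labels, candidate_groups):
    """Add groups one at a time, keeping a group only if it improves the prediction."""
    selected = []
    # Baseline: always predict the most common outcome.
    best = max(labels.count(c) for c in set(labels)) / len(labels)
    remaining = list(candidate_groups)
    while remaining:
        scored = [(loo_accuracy(rows, labels, selected + [g]), g) for g in remaining]
        top_score, top_group = max(scored, key=lambda s: s[0])
        if top_score <= best:
            break  # no remaining group has additional value
        selected.append(top_group)
        remaining.remove(top_group)
        best = top_score
    return selected


# Made-up group scores for six patients; True means the tumour spread.
rows = [
    {"cell division": 5.0, "cell death": 1.0},
    {"cell division": 6.0, "cell death": 2.0},
    {"cell division": 5.5, "cell death": 1.5},
    {"cell division": 1.0, "cell death": 1.0},
    {"cell division": 2.0, "cell death": 2.0},
    {"cell division": 1.5, "cell death": 1.5},
]
labels = [True, True, True, False, False, False]

print(forward_select(rows, labels, ["cell division", "cell death"]))
# ['cell division']
```

In this toy data only the "cell division" scores differ between the two patient groups, so that group is kept and "cell death" is left out because it adds nothing to the prediction.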

(22 May 2007/Tristan Lavender)

       
 
   