Machine learning approaches for deciphering complex pathomechanisms in cancer

Eils, Roland; Schubert, Sigrid E.; Reusch, Bernd; Jesse, Norbert
2019-11-14
2002
ISBN 3-88579-349-0
https://dl.gi.de/handle/20.500.12116/30192

Recent years have seen a dramatic increase in the amount of genetic information stored in electronic format. It has been estimated that the amount of information in genomics and proteomics doubles every 20 months, and the size and number of databases are increasing even faster. It is widely accepted that a sophisticated exploration of such data is crucial in a variety of fields such as disease genetics and pharmacogenomics. While both corporate and institutional efforts have concentrated on the integration of heterogeneous data in genomics and proteomics, systematic data exploration is still at an early stage. Although data mining has celebrated many successes in business applications such as retail and marketing (see, e.g., [1]), its application to scientific and engineering data is not straightforward. Data sets in the life sciences are often significantly larger in volume and structurally more complex than traditional business data, and they often change rapidly over time. In contrast to business environments, the body of existing background knowledge in the life sciences is extensive. I will report on our recent efforts [2,3] to adapt data mining technology, in particular from the field of machine learning, for effective knowledge discovery in tumour genetics. To exemplify the power of this approach, I will describe how complex concepts such as survival or therapy response [4,5] can be learnt from heterogeneous clinical or molecular data.

Language: en
Text/Conference Paper
ISSN 1617-5468