Browsing by author "Selbig, Joachim"
1 - 10 of 21
- Conference paper: An application of latent topic document analysis to large-scale proteomics databases (German conference on bioinformatics – GCB 2007, 2007). Klie, Sebastian; Martens, Lennart; Vizcaino, Juan Antonio; Cote, Richard; Jones, Phil; Apweiler, Rolf; Hinneburg, Alexander; Hermjakob, Henning. Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the number of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses have been performed and published on this data, leaving the ultimate value of these projects far below their potential. In order to illustrate that these repositories should be considered sources of detailed knowledge instead of data graveyards, we here present a novel way of analyzing the information contained in proteomics experiments with a 'latent semantic analysis'. We apply this information retrieval approach to the peptide identification data contributed by the Plasma Proteome Project. Interestingly, this analysis is able to overcome the fundamental difficulties of analyzing such divergent and heterogeneous data emerging from large-scale proteomics studies employing a vast spectrum of different sample treatment and mass-spectrometry technologies. Moreover, it yields several concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in the experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data holds great promise and is currently underexploited.
- Conference paper: Are we overestimating the number of cell-cycling genes? The impact of background models for time series data (German conference on bioinformatics – GCB 2007, 2007). Futschik, Matthias E.; Herzel, Hanspeter. Periodic processes play fundamental roles in organisms. Prominent examples are the cell cycle and the circadian clock. Microarray technology has enabled us to screen complete sets of transcripts for possible association with such fundamental periodic processes on a system-wide level. Frequently, quite a large number of genes has been detected as periodically expressed. However, the small overlap of identified genes between different studies has cast considerable doubt on the reliability of the detected periodic expression. In this study, we show that a major reason for the lack of agreement is the use of an inadequate background model for the determination of significance. We demonstrate that the choice of background model has considerable impact on the statistical significance of periodic expression. For illustration, we reanalyzed two microarray studies of the yeast cell cycle. Our evaluation strongly indicates that the results of previous analyses might have been overoptimistic and that the use of a more suitable background model promises to give more realistic results.
- Conference paper: Control of translation: comparative genomics and mechanistic aspects (German conference on bioinformatics – GCB 2007, 2007). Pilpel, Yitzhak; Man, Orna. A major challenge in comparative genomics is to understand how phenotypic differences between species are encoded in their genomes. Phenotypic divergence may result from differential transcription of orthologous genes, yet less is known about the involvement of differential translation regulation in species phenotypic divergence. In order to assess translation effects on divergence, we analyzed approximately 2,800 orthologous genes in nine yeast genomes. For each gene in each species, we predicted translation efficiency, using a measure of the adaptation of its codons to the organism's tRNA pool. Mining this data set, we found hundreds of genes and gene modules with correlated patterns of translational efficiency across the species. One signal encompassed entire modules that are either needed for oxidative respiration or fermentation and are efficiently translated in aerobic or anaerobic species, respectively. In addition, the efficiency of translation of the mRNA splicing machinery strongly correlates with the number of introns in the various genomes. Altogether, we found extensive selection on synonymous codon usage that modulates translation according to gene function and organism phenotype. We conclude that, like factors such as transcription regulation, translation efficiency affects and is affected by the process of species divergence.
- Conference paper: Expected Gene Order Distances and Model Selection in Bacteria (German conference on bioinformatics – GCB 2007, 2007). Dalevi, Daniel; Eriksen, Niklas. The most parsimonious distances calculated in pairwise gene order comparisons cannot accurately reflect the true number of events separating two species unless the number of changes is small. It is better to use the expected distances. In this study we recapitulate previous results and derive new expected distances for models that have gained support in other studies, such as symmetrical reversal distances and short reversals. Further, we investigate the patterns of dotplots between species of bacteria with the purpose of model selection in gene order problems. We find several categories of data which can be explained by carefully weighing the contributions of reversals, transpositions, symmetric reversals, single gene transpositions, and single gene reversals.
- Conference paper: Finding relevant biotransformation routes in weighted metabolic networks using atom mapping rules (German conference on bioinformatics – GCB 2007, 2007). Blum, Torsten; Kohlbacher, Oliver. Computational analysis of pathways in metabolic networks has numerous applications in systems biology. While graph theory-based approaches have been presented that find biotransformation routes from one metabolite to another in these networks, most of these approaches suffer from finding too many routes, most of which are biologically infeasible or meaningless. We present a novel approach for finding relevant routes based on atom mapping rules (describing which educt atoms are mapped onto which product atoms in a chemical reaction). This leads to a reformulation of the problem as a lightest path search in a degree-weighted metabolic network. A key component of the approach is a new method of computing optimal atom mapping rules.
- Conference paper: A general paradigm for fast, adaptive clustering of biological sequences (German conference on bioinformatics – GCB 2007, 2007). Reinert, Knut; Bauer, Markus; Döring, Andreas; Klau, Gunnar W.; Halpern, Aaron L. There are numerous methods that compute clusterings of biological sequences based on pairwise distances. This necessitates the computation of O(n^2) sequence comparisons. Users usually want to apply the most sensitive distance measure, which normally is the most expensive in terms of runtime. This poses a problem if the number of sequences is large or the computation of the measure is slow. In this paper we present a general heuristic to speed up distance-based clustering methods considerably while compromising little on the accuracy of the results. The speedup comes from using fast comparison methods to perform an initial 'top-down' split into relatively homogeneous clusters, while the slower measures are used for smaller groups. Then profiles are computed for the final groups and the resulting profiles are used in a bottom-up phase to compute the final clustering. The algorithm is general in the sense that any sequence comparison method can be employed (e.g. for DNA, RNA or amino acids). We test our algorithm using a prototypical implementation for agglomerative RNA clustering and show its effectiveness.
- Edited book
- Conference paper: High-Precision Function Prediction using Conserved Interactions (German conference on bioinformatics – GCB 2007, 2007). Jaeger, S.; Leser, U. The recent availability of large data sets of protein-protein interactions (PPIs) from various species offers new opportunities for functional genomics and proteomics. We describe a method for exploiting conserved and connected subgraphs (CCSs) in the PPI networks of multiple species for the prediction of protein function. Structural conservation is combined with functional conservation using a GeneOntology-based scoring scheme. We applied our method to the PPI networks of five species, i.e., E. coli, D. melanogaster, M. musculus, H. sapiens and S. cerevisiae. We detected surprisingly large CCSs for groups of three species but not beyond. A manual analysis of the biological coherence of exemplary subgraphs strongly supports a close relationship between structural and functional conservation. Based on this observation, we devised an algorithm for function prediction based on CCSs. Using our method, for instance, we predict new functional annotations for human proteins based on mouse proteins with a precision of 70%.
- Conference paper: Identifying microRNAs and their targets (German conference on bioinformatics – GCB 2007, 2007). Rajewsky, Nikolaus. I will summarize what can be learned from predicting and analyzing microRNA targets. As an example, I will discuss the function of miR-150 in the immune system. Finally, I will present a new algorithm for the identification of microRNAs from deep sequencing data.
- Conference paper: Independent components analysis of starch deficient pgm mutants (German Conference on Bioinformatics 2004, GCB 2004, 2004). Scholz, Matthias; Gibon, Yves; Stitt, Mark; Selbig, Joachim. Changes in enzymatic activities in response to carbon starvation were investigated in Arabidopsis thaliana in two distinct experiments. One compares the Columbia ecotype (Col-0) and its starch deficient pgm mutant (plastidial phosphoglucomutase); the other investigates the enzymatic activities of Col-0 under extended night conditions. A classical technique for detecting and visualizing relevant information in the measured data is principal component analysis (PCA). We show that independent component analysis (ICA) is more suitable for our questions and that its results are more precise than those obtained with PCA. This higher informative power is only achieved when ICA is combined with suitable pre-processing and evaluation criteria. It is essential to first reduce the dimensionality of the data set using PCA. The number of principal components significantly determines the quality of ICA; therefore we propose a criterion for estimating the optimal dimension automatically. The measure of kurtosis is used to sort the extracted components. We found that ICA could detect, on the one hand, the time component of the extended night experiment and, on the other hand, a discriminating component in the pgm mutant experiment. In both components the most important enzymes were the same, confirming the carbon starvation phenotype in the mutant.
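
The abstract by Scholz et al. describes a three-step pipeline: reduce dimensionality with PCA, run ICA on the reduced data, and sort the extracted components by kurtosis. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a NumPy-only symmetric FastICA with a tanh nonlinearity (the abstract does not name the ICA algorithm), a fixed number of principal components `k` chosen by the caller (the automatic dimension criterion from the paper is not reproduced), and a descending sort by excess kurtosis. All function names are hypothetical.

```python
import numpy as np

def pca_whiten(X, k):
    """Center X (samples x variables) and return k whitened PCA scores."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    # Left singular vectors scaled by sqrt(n - 1) have unit sample variance.
    return U[:, :k] * np.sqrt(X.shape[0] - 1)

def sym_decorrelate(W):
    """Symmetric orthogonalisation: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fast_ica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA on whitened data Z (samples x k); assumed algorithm."""
    n, k = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = Z @ W.T                        # current component estimates
        G = np.tanh(Y)                     # nonlinearity g
        G_prime = 1.0 - G ** 2             # derivative g'
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w for each row w
        W_new = (G.T @ Z) / n - np.diag(G_prime.mean(axis=0)) @ W
        W_new = sym_decorrelate(W_new)
        delta = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return Z @ W.T

def excess_kurtosis(S):
    """Excess kurtosis of each column of S (0 for a Gaussian)."""
    Sc = S - S.mean(axis=0)
    return (Sc ** 4).mean(axis=0) / (Sc ** 2).mean(axis=0) ** 2 - 3.0

def pca_ica_sorted(X, k):
    """PCA reduction to k dimensions, ICA, then sort by kurtosis (descending)."""
    S = fast_ica(pca_whiten(X, k))
    kurt = excess_kurtosis(S)
    order = np.argsort(-kurt)
    return S[:, order], kurt[order]
```

Calling `pca_ica_sorted(X, k)` on a samples-by-variables matrix returns the components with the most heavy-tailed one first. The result depends on `k`, which is consistent with the abstract's remark that the number of principal components strongly affects ICA quality.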