Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison have greatly benefited from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from genome global analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, the prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has provided models based on communication systems theory to describe information transmission channels at the cell level and during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation with broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.

Keywords: information theory, alignment-free, Rényi entropy, sequence analysis, chaos game representation, genomic signature

INTRODUCTION

Information theory (IT) addresses the analysis of communication systems, which are usually defined as connected blocks representing a source of messages, an encoder, a (noisy) channel, a decoder and a receiver. IT, generally regarded as having been founded by Claude Shannon (1948), attempts to construct mathematical models for each of these components. IT has answered two essential questions: the ultimate limit of data compression, related to the entropy of a source, and the maximum possible transmission rate through a channel, associated with its capacity, computed from its statistical noise characteristics. The fundamental theorem of IT states that it is possible to transmit information through a noisy channel, at any rate less than the channel capacity, with an arbitrarily small probability of error. This was a surprising and counter-intuitive result. The key idea to achieve such transmission is to wait for several blocks of information and use code words, adding redundancy to the transmitted information. Although IT was first developed to study the transmission of messages over channels for communication engineering applications, it was later applied to many other fields of research. Nowadays, IT is not a mere subset of communication theory: it plays a key role in disciplines such as physics and thermodynamics, computer science (through its connections with Kolmogorov complexity), probability and statistics, and the life sciences. In fact, living organisms are able to process and transmit information at many levels, from genetic to ecological inheritance mechanisms, which frames IT as a broad research ground that crosses many disciplines. Over three decades ago, in a seminal book, Lila Gatlin explored the relation between IT and biology and the applicability of entropy concepts to DNA sequence analysis, following previous work in the 1960s. This was one of the first attempts to analyze DNA from an IT point of view, and it was further pursued during that decade.
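To make the link between source entropy and data compression concrete, the sketch below (illustrative only, not a method from the review) computes the Shannon entropy of the empirical symbol distribution of a DNA sequence; the function name `shannon_entropy` is our own choice:

```python
from collections import Counter
from math import log2

def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits/symbol) of the empirical symbol distribution of seq."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A sequence with all four bases equiprobable reaches the maximum
# log2(4) = 2 bits/symbol; a biased sequence has lower entropy and is,
# in the Shannon sense, more compressible.
print(shannon_entropy("ACGTACGTACGTACGT"))  # 2.0
print(shannon_entropy("AAAAAAAAAAAAAAAT"))  # well below 2.0
```

Entropy here is estimated from single-symbol frequencies; block-entropy estimators generalize this by counting words of length k instead of single bases.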
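The capacity notion invoked by the fundamental theorem can be illustrated with the standard textbook case of the binary symmetric channel, whose capacity is C = 1 − H(p) bits per channel use, where p is the bit-flip probability and H is the binary entropy function (a minimal sketch, not taken from the review):

```python
from math import log2

def binary_entropy(p: float) -> float:
    """H(p) in bits: entropy of a Bernoulli(p) random variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity (bits/use) of a binary symmetric channel with flip probability p."""
    return 1.0 - binary_entropy(p)

# A noiseless channel (p = 0) carries 1 bit per use; at p = 0.5 the output
# is independent of the input and the capacity drops to zero.
print(bsc_capacity(0.0))   # 1.0
print(bsc_capacity(0.11))
print(bsc_capacity(0.5))   # 0.0
```

The theorem guarantees that any rate below `bsc_capacity(p)` is achievable with arbitrarily small error probability, by coding over long blocks as described above.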