WED Feb 25 2004
Susana Vinga
"Rényi continuous entropy of DNA sequences."
Abstract: Entropy measures of DNA sequences estimate their randomness or, inversely, their repeatability. L-block Shannon discrete entropy accounts for the empirical distribution of all length-L words and has convergence problems for finite sequences. We propose a new entropy measure that extends Shannon's formalism. Rényi's quadratic entropy, calculated with the Parzen window density estimation method applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM) of DNA sequences, constitutes a novel technique to evaluate global sequence randomness without some of the drawbacks of the former method. We have analytically deduced some of the asymptotic behaviour of this new measure and have also calculated entropies for several synthetic and experimental biological sequences. This new technique can be very useful in the study of the complexity of DNA sequences and provides additional tools for DNA entropy estimation.
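
The abstract gives no formulas, so the following is only a minimal sketch of the kind of calculation it describes, not the speaker's implementation. It makes several assumptions: the CGR corner assignment (A, C, G, T at the corners of the unit square), a start at the centre of the square, an isotropic Gaussian Parzen kernel with variance sigma2, and integration of the squared density over the whole plane rather than the unit square. Under those assumptions the integral of the squared Parzen estimate reduces to an average of Gaussians with variance 2*sigma2 evaluated at all pairwise differences, and Rényi's quadratic entropy is minus its logarithm. The function names are hypothetical.

```python
import math
import random

# Assumed corner assignment for the Chaos Game Representation (illustrative).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Map a DNA string to CGR coordinates in the unit square.

    Starting from the centre, each base moves the current point
    halfway towards the corner assigned to that base.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi_quadratic_entropy(points, sigma2):
    """Rényi quadratic entropy of a 2-D Gaussian Parzen density estimate.

    Uses the closed form: the integral of the squared estimate equals
    (1/N^2) * sum_i sum_j G(x_i - x_j; 2*sigma2), with G a 2-D isotropic
    Gaussian. Cost is O(N^2), so this is only suitable for short sequences.
    """
    n = len(points)
    norm = 1.0 / (4.0 * math.pi * sigma2)  # 2-D Gaussian with variance 2*sigma2
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += norm * math.exp(-d2 / (4.0 * sigma2))
    return -math.log(total / (n * n))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the comparison is reproducible
    random_seq = "".join(rng.choice("ACGT") for _ in range(200))
    repeat_seq = "A" * 200
    for name, seq in (("random", random_seq), ("repetitive", repeat_seq)):
        h2 = renyi_quadratic_entropy(cgr_points(seq), sigma2=0.01)
        print(name, round(h2, 3))
```

With these choices a highly repetitive sequence, whose CGR points pile up near one corner, yields a lower entropy than a random sequence whose points spread over the square. The value of sigma2 is a free smoothing parameter here; the abstract does not say how it is chosen.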