Инд. авторы: Ryabko B., Reznikova Z., Druzyaka A., Panteleeva S.
Заглавие: Using Ideas of Kolmogorov Complexity for Studying Biological Texts
Библ. ссылка: Ryabko B., Reznikova Z., Druzyaka A., Panteleeva S. Using Ideas of Kolmogorov Complexity for Studying Biological Texts // Theory of Computing Systems. - 2012. - P.133-147. - ISSN 1432-4350. - EISSN 1433-0490.
Внешние системы: DOI: 10.1007/s00224-012-9403-6; РИНЦ: 23957499; РИНЦ: 21878008; SCOPUS: 2-s2.0-84871404997; WoS: 000316087100009;
Реферат: eng: Kolmogorov complexity furnishes many useful tools for studying different natural processes that can be expressed using sequences of symbols from a finite alphabet (texts), such as genetic texts, literary and music texts, animal communications, etc. Although Kolmogorov complexity is not algorithmically computable, in a certain sense it can be estimated by means of data compressors. Here we suggest a method of analysis of sequences based on ideas of Kolmogorov complexity and mathematical statistics, and apply this method to biological (ethological) "texts." A distinction of the suggested method from other approaches to the analysis of sequential data by means of Kolmogorov complexity is that it belongs to the framework of mathematical statistics, more specifically, that of hypothesis testing. This makes it a promising candidate for being included in the toolbox of standard biological methods of analysis of different natural texts, from DNA sequences to animal behavioural patterns (ethological "texts"). Two examples of analysis of ethological texts are considered in this paper. Theses examples show that the proposed method is a useful tool for distinguishing between stereotyped and flexible behaviours, which is important for behavioural and evolutionary studies
Ключевые слова: Kolmogorov complexity; hypothesis testing;
Издано: 2012
Физ. характеристика: с.133-147
Цитирование: 1. Anel, C., Sanderson, M. J.: Missing the forest for the trees: phylogenetic compression and its implications for inferring complex evolutionary histories. Syst. Biol. 54(1), 146-157 (2005). 2. Billingsley, P.: Ergodic Theory and Information. Wiley, New York (1965). 3. Cilibrasi, R., Vitanyi, P.: Clustering by compression. IEEE Trans. Inf. Theory 51(4), 1523-1545 (2005). 4. Cover, T. M., Thomas, J. A.: Elements of Information Theory. Wiley-Interscience, New York (2006). 5. Ferragina, P., Giancarlo, R., Greco, V., Manzini, G., Valiente, G.: Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment. BMC Bioinf. 8 (2007). 6. Fisher, R. A.: Statistical Methods, Experimental Design, and Scientific Inference. Oliver & Boyd, Edinburgh (1956). 7. Gallager, R. G.: Information Theory and Reliable Communication. Wiley, New York (1968). 8. Groothuis, T.: The influence of social experience on the development and fixation of the form of displays in the black-headed gull. Anim. Behav. 43(1), 1-14 (1992). 9. Hutter, M.: Universal Artificial Intelligence. Sequential Decisions Based on Algorithmic Probability. Springer, Berlin (2005). 10. Kendall, M. G., Stuart, A.: The Advanced Theory of Statistics, Inference and Relationship, vol. 2. Griffin, London (1961). 11. KGB archiver (v. 1. 2). http://www. softpedia. com/get/Compression-tools/KGB-Archiver. shtml. 12. Kieffer, J., Yang, E.: Grammar-based codes: a new class of universal lossless source codes. IEEE Trans. Inf. Theory 46, 737-754 (2000). 13. Li, M., Vitanyi, P.: An Introduction to Kolmogorov Complexity and Its Applications, 2nd edn. Springer, New York (1997). 14. Li, M., Badger, J., Chen, X., Kwong, S., Kearney, P., Zhang, H. Y.: An information based distance and its application to whole mitochondrial genome phylogeny. Bioinformatics (Oxford) 17, 149-154 (2001). 15. Li, M., Chen, X., Li, X., Ma, B., Vitanyi, P.: The similarity metric. IEEE Trans. Inf. Theory 50(12), 3250-3264 (2004). 16. Lloyd, E. (ed.): Handbook of Applied Statistics, vol. 2. Wiley-Interscience, New York (1984). 17. McCowan, B., Doyle, L. R., Hanser, S. F.: Using information theory to assess the diversity, complexity, and development of communicative repertoires. J. Comp. Psychol. 116(2), 166-172 (2002). 18. Oller, D. K., Griebel, U. (eds.): Evolution of Communicative Flexibility: Complexity, Creativity, and Adaptability in Human and Animal Communication. MIT Press, Cambridge (2008). 19. Panteleeva, S., Danzanov, Zh., Reznikova, Zh.: Estimate of complexity of behavioral patterns in ants: analysis of hunting behavior in myrmica rubra (hymenoptera, formicidae) as an example. Entomol. Rev. 91(2), 221-230 (2011). 20. Reznikova, Z.: Animal Intelligence: From Individual to Social Cognition. Cambridge University Press, Cambridge (2007). 21. Reznikova, Zh., Panteleeva, S.: An ant's eye view of culture: propagation of new traditions through triggering dormant behavioural patterns. Acta Ethol. 11, 73-80 (2008). 22. Rissanen, J.: Universal coding, information, prediction, and estimation. IEEE Trans. Inf. Theory 30(4), 629-636 (1984). 23. Ryabko, B.: Prediction of random sequences and universal coding. Probl. Inf. Transm. 24(2), 87-96 (1988). 24. Ryabko, B., Reznikova, Zh.: Using Shannon entropy and Kolmogorov complexity to study the communicative system and cognitive capacities in ants. Complexity 2, 37-42 (1996). 25. Ryabko, B., Reznikova, Z.: The use of ideas of information theory for studying "language" and intelligence in ants. Entropy 11, 836-853 (2009). 26. Ryabko, D., Schmidhuber, J.: Using data compressors to construct order tests for homogeneity and component independence. Appl. Math. Lett. 22(7), 1029-1032 (2009). 27. Ryabko, B., Astola, J., Gammerman, A.: Application of Kolmogorov complexity and universal codes to identity testing and nonparametric testing of serial independence for time series. Theor. Comput. Sci. 359, 440-448 (2006). 28. Tinbergen, N.: An objective study of the innate behaviour of animals. Bibl. Biotheor. 1, 39-98 (1942). 29. Tinbergen, N.: The Study of Instinct. Oxford University Press, London (1951). 30. Vitanyi, P. M. B.: Information distance in multiples. IEEE Trans. Inf. Theory 57(4), 2451-2456 (2011). 31. Yaglom, A. M., Yaglom, I. M.: Probability and Information. Theory and Decision Library. Springer, Berlin (1983). 32. Zvonkin, A. K., Levin, L. A.: The complexity of finite objects and concepts of information and randomness through the algorithm theory. Russ. Math. Surv. 25(6), 83-124 (1970).