Individual authors: | Ryabko B. |
Title: | Time-universal data compression and prediction |
Bibliographic reference: | Ryabko B. Time-universal data compression and prediction // IEEE International Symposium on Information Theory, ISIT (Paris, France, 07.07-12.07.2019). - 2019. - P.562-566. - ISBN 978-1-5386-9291-2. |
External systems: | DOI: 10.1109/ISIT.2019.8849224; RSCI: 41690874; SCOPUS: 2-s2.0-85073171866; WoS: 000489100300114; |
Abstract: | Suppose there is a large file which should be transmitted (or stored) and there are several (say, m) admissible data compressors. It seems natural to try all the compressors and then choose the best, i.e. the one that gives the shortest compressed file, and then transfer (or store) the index number of the best compressor (which requires ⌈log m⌉ bits) together with the compressed file. The only problem is the time, which increases substantially due to the need to compress the file m times (in order to find the best compressor). We propose a method that encodes the file with the optimal compressor but uses relatively little additional time: the ratio of this extra time to the total computation time can be bounded by an arbitrary positive constant. A similar situation occurs when forecasting time series. Generally speaking, in many situations it may be necessary to find the best data compressor (or predictor) out of a given set, which is often done by comparing them empirically. One of the goals of this work is to turn such a selection process into a part of the data compression method, automating and optimizing it. |
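As an illustration of the baseline scheme described in the abstract (not the time-universal method proposed in the paper), a minimal Python sketch follows; the particular candidate compressors and the whole-byte storage of the ⌈log m⌉-bit index are assumptions made for this example only.

    import bz2
    import lzma
    import math
    import zlib

    # Illustrative candidate compressors; the paper does not fix a particular set.
    COMPRESSORS = [zlib.compress, bz2.compress, lzma.compress]

    def encode_with_best(data: bytes) -> bytes:
        """Baseline scheme: compress with every candidate, keep the shortest
        output, and prepend the index of the winning compressor, which needs
        ceil(log2 m) bits (rounded up to whole bytes here for simplicity)."""
        m = len(COMPRESSORS)
        outputs = [c(data) for c in COMPRESSORS]      # m full compressions: the time overhead the paper reduces
        best = min(range(m), key=lambda i: len(outputs[i]))
        index_bits = max(1, math.ceil(math.log2(m)))  # ceil(log m) bits to name the best compressor
        index_bytes = (index_bits + 7) // 8
        return best.to_bytes(index_bytes, "big") + outputs[best]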
Keywords: | Universal data compression; Positive constant; Large files; Index numbers; Compression methods; Compressed files; Information theory; Compressors; Data compressor; Data compression; |
Published: | 2019 |
Physical description: | pp. 562-566 |
Conference: | Name: IEEE International Symposium on Information Theory; Abbreviation: ISIT; City: Paris; Country: France; Dates: 2019-07-07 to 2019-07-12 |