WSEAS Transactions on Computers


Print ISSN: 1109-2750
E-ISSN: 2224-2872

Volume 17, 2018

Notice: As of 2014 and for the forthcoming years, the publication frequency/periodicity of WSEAS Journals is adapted to the 'continuously updated' model. What this means is that instead of being separated into issues, new papers will be added on a continuous basis, allowing a more regular flow and shorter publication times. The papers will appear in reverse order, therefore the most recent one will be on top.



Optimizing the Performance of Text File Compression using a Combination of the Burrows-Wheeler Transform (BWT), Move-to-Front (MTF) and Shannon-Fano Algorithms

AUTHORS: Yayuk Anggraini, Teddy Mantoro, Media A. Ayu

Download as PDF

ABSTRACT: Compression in Information Technology is the way to minimize the file size. Performance of compression algorithm is measured by the speed of process and the compression ratio. The compression time will effect on memory allocation and CPU performance, while the low compression ratio will weakens the ability of the algorithm to compress the data. Huffman and ShannonFano are two compression algorithms have same ways to work, but both produced a different performance. The test results concluded that the Shannon-Fano performance has a percentage of 1,56% lower than Huffman. This problem can be solved by adding a reversible transformation algorithm to the data source. Burrows-Wheeler Transform (BWT) produces output that is more easily to be processed at a later stage, and Move-to-front (MTF) is an algorithm to transform the data unifying and reduce redundancies. This study discusses a combination of BWT + MTF + Shannon-Fano algorithm and compare it with other algorithms (Shannon-Fano, Huffman and LZ77) which were applied on text files. The test results have shown that the combination of BWT + MTF + ShannonFano has the most efficient compression ratio, which is 60.89% higher at around 0.37% compared to LZ77. On compression time aspect, LZ77 is the slowest, approximately 39391,11 ms, while a combination of BWT + MTF + Shannon-Fano performs at approximately 1237,95 ms. This study concluded that the combination of BWT + MTF + Shannon-Fano algorithm performs as the most optimal algorithm in compression time (speed) and compression ratio (size).

KEYWORDS: BWT, MTF, Shannon-Fano, Huffman, LZ77, text files, compression algorithm optimization.

REFERENCES:

[1] D. Solomon, Data Compression, 4th ed. London: Spriger, 2007.

[2] P. Yellamma and N. Challa, “Performance Analysis Of Different Data Compression Techniques On Text File,” Int. J. Eng. Res. Technol., vol. 1, no. 8, (Oktober, pp. 1–6, 2012.

[3] B. Souley, P. Das, and S. Tanko, “A Comparative Analysis of Data Compression Techniques,” Int. J. Appl. Sci., vol. 2, no. 10, (April, pp. 63–82, 2014.

[4] J. M. Silaen, “Studi Perbandingan Algoritma Huffman dan Shannon- Fano dalam Pemampatan File Teks,” Pelita Inform. Budi Darma, vol. 7, no. 1, (Juli, pp. 60–66, 2014.

[5] K. Rastogi, K. Sengar, and M. T. Scholar, “Analysis and Performance Comparison of Lossless Compression Techniques for Text Data,” Int. J. Eng. Comput. Res., vol. 2, no. 1, pp. 16–19, 2014.

[6] P. Jeyanthi and V. Anuratha, “Analysis of Lossless Reversible Transformation Algorithms to Enhance Data Compression,” J. Glob. Res. Comput. Sci., vol. 3, no. 8, (Agustus, pp. 56–62, 2012.

[7] R. Bastys, “Fibonacci Coding Within the Burrows-Wheeler Compression Scheme,” Electron. Electr. Eng., vol. 1, pp. 28–32, 2010.

[8] M. Burrows and D. Wheeler, “A Block-Sorting Lossless Data Compression Algorithm,” Algorithm, Data Compression, no. 124, p. 18, 1994.

[9] A. S. E. Campos, “Move to Front,” 1999.

[Online]. Available: https://www.arturocampos.com/ac_mtf.html.

[Accessed: 15-Mar2017].

[10] M. Dipperstein, “Burrows-Wheeler Transform Discussion and Implementation,” 2010.

[Online]. Available: https://michael.dipperstein.com/bwt/.

[Accessed: 15-Mar-2017].

[11] R. R. Baruah, V. Deka, and M. P. Bhuyan, “Enhancing Dictionary Based Preprocessing For Better Text Compression,” Int. J. Comput. Technol., vol. 9, no. 1, (Maret, pp. 4–9, 2014.

[12] R. Râdescu, “Transform Methods Used in Lossless Compression of Text Files,” Rom. J. Inf. Sci. Technol., vol. 12, no. 1, (November, pp. 101–115, 2009.

[13] I. M. Pu, Fundamental Data Compression. London: ButterworthHeinemann, 2006.

[14] P. Widhiartha, “Pengantar Kompresi Data (Introduction to Data Compression)” 2008, Available: https://www.ilmukomputer.org/ wp-content/uploads/2008/10/ widhiartha_kompresidata.pdf.

[Accessed: 1-Nov-2017].

[15] C. P. Nugraha and R. G. Santosa, “Perbandingan Metode LZ77, Metode Huffman dan Metode Deflate Terhadap Kompresi Data Teks,” Informatika, vol. 10, no. 2, pp. 80–91, 2014.

[16] M. Dipperstein, “Huffman Code Discussion and Implementation,” 2010.

[Online]. Available: https://michael.dipperstein.com/Huffman/index.html.

[Accessed: 16-Apr-2017].

[17] H. Altarawneh and M. Altarawneh, “Data Compression Techniques on Text Files: A Comparison Study,” Int. J. Comput. Appl., vol. 26, no. 5, (Juli, pp. 42–54, 2011.

[18] S. Shanmugasundaram and R. Lourdusamy, “A Comparative Study Of Text Compression Algorithms,” Int. J. Wisdom Based Comput., vol. 1, no. 4, (December, pp. 68–76, 2011.

[19] S. Senthil, S. J. Rexiline, and L. Robert, “RIDBE: A Lossless, Reversible Text Transformation Scheme for Better Compression,” Int. J. Comput. Appl., vol. 51, no. 12, (Agustus, pp. 35–40, 2012.

[20] P. Jeyanthi and V. Anuratha, “Bwt Based Lossless Reversible Transformation Algorithms – An Analysis,” Int. J. Eng. Res. Appl., vol. 2, no. 5, (September-Oktober, pp. 807–814, 2012.

[21] E. C. Cankaya and O. Darwish, “Improving Compression Performance with a Star Encoding Front End : A Linguistic Comparison,” Proceedings of the International Conference on Foundations of Computer Science (FCS); Athens : 1-7 2013.

WSEAS Transactions on Computers, ISSN / E-ISSN: 1109-2750 / 2224-2872, Volume 17, 2018, Art. #28, pp. 227-239


Copyright © 2018 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0

Bulletin Board

Currently:

The editorial board is accepting papers.


WSEAS Main Site