Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior results compared with generic data compression methods which are used for other digital data.[1]
Lossy and lossless image compression
Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes and often for medical imaging, technical drawings, clip art, or comics. Lossy compression methods, especially when used at low bit rates, introduce compression artifacts. Lossy methods are especially suitable for natural images such as photographs in applications where minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy compression that produces negligible differences may be called visually lossless.
Methods for lossy compression:
- Transform coding – This is the most commonly used method.
- Discrete Cosine Transform (DCT) – The most widely used form of lossy compression. It is a type of Fourier-related transform, and was originally developed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974.[2] The DCT is sometimes referred to as "DCT-II" in the context of a family of discrete cosine transforms (see discrete cosine transform). It is generally the most efficient form of image compression. A minimal code sketch of DCT-based transform coding is given after this list.
- The more recently developed wavelet transform is also used extensively, followed by quantization and entropy coding.
- Color quantization – Reducing the color space to a few "representative" colors in the image. The selected colors are specified in the color palette in the header of the compressed image. Each pixel just references the index of a color in the color palette. This method can be combined with dithering to avoid posterization.
- Chroma subsampling. This takes advantage of the fact that the human eye perceives spatial changes of brightness more sharply than those of color, by averaging or dropping some of the chrominance information in the image.
- Fractal compression.
- More recently, methods based on machine learning have been applied, using multilayer perceptrons, convolutional neural networks, generative adversarial networks[3] and diffusion models.[4] Implementations are available in OpenCV, TensorFlow, MATLAB's Image Processing Toolbox (IPT), and the High-Fidelity Generative Image Compression (HiFiC) open source project.[5]
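As a concrete illustration of the DCT-based transform coding mentioned above, the following is a minimal Python sketch (assuming NumPy and SciPy are available). It transforms one 8×8 block, applies a single uniform quantization step in place of JPEG's per-coefficient quantization tables, and omits the entropy coding stage, so it shows the idea rather than any particular standard.

```python
# Minimal sketch of DCT-based transform coding on one 8x8 block.
# Not the JPEG codec itself: a single uniform quantization step stands in
# for JPEG's per-coefficient quantization tables, and no entropy coding
# is performed here.
import numpy as np
from scipy.fft import dctn, idctn

block = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)

# Forward 2D DCT-II (orthonormal): level-shift, then transform.
coeffs = dctn(block - 128.0, type=2, norm='ortho')

# Lossy step: coarse uniform quantization of the coefficients.
step = 16.0
quantized = np.round(coeffs / step)

# Decoder side: dequantize and invert the transform.
reconstructed = idctn(quantized * step, type=2, norm='ortho') + 128.0

print("max absolute error:", np.abs(block - reconstructed).max())
```

Increasing the quantization step discards more high-frequency detail, which is where the bit-rate saving, and the loss, comes from.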
Methods for lossless compression:
- Run-length encoding – used as the default method in PCX and as one of the possible methods in BMP, TGA, TIFF; a minimal sketch is given after this list
- Predictive coding – used in DPCM
- Entropy encoding – the two most common entropy encoding techniques are arithmetic coding and Huffman coding
- Adaptive dictionary algorithms such as LZW – used in GIF and TIFF
- DEFLATE – used in PNG, MNG, and TIFF
- Chain codes
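To illustrate the run-length encoding entry above, here is a minimal Python sketch of the underlying idea. Actual formats such as PCX, BMP, TGA and TIFF each define their own packing of run lengths and literal values, so this is only a conceptual example.

```python
# Minimal sketch of run-length encoding of a row of pixel values.
def rle_encode(values):
    """Encode a sequence as (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back to the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out

row = [255, 255, 255, 0, 0, 17, 17, 17, 17]
encoded = rle_encode(row)          # [(255, 3), (0, 2), (17, 4)]
assert rle_decode(encoded) == row  # lossless round trip
```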
Other properties
The best image quality at a given compression rate (or bit rate) is the main goal of image compression; however, there are other important properties of image compression schemes:
Scalability generally refers to a quality reduction achieved by manipulation of the bitstream or file (without decompression and re-compression). Other names for scalability are progressive coding or embedded bitstreams. Despite its contrary nature, scalability also may be found in lossless codecs, usually in the form of coarse-to-fine pixel scans. Scalability is especially useful for previewing images while downloading them (e.g., in a web browser) or for providing variable quality access to e.g., databases. There are several types of scalability:
- Quality progressive or layer progressive: The bitstream successively refines the reconstructed image.
- Resolution progressive: First encode a lower image resolution; then encode the difference to higher resolutions.[6][7] A minimal sketch of this idea is given after this list.
- Component progressive: First encode the grey-scale version; then add full color.
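The following is a minimal Python sketch (assuming NumPy) of the resolution-progressive idea: a coarse base image is conceptually sent first, and a residual then refines it to full resolution. It uses simple nearest-neighbour resampling; actual codecs such as JPEG 2000 use wavelet filters and entropy-code both layers.

```python
# Minimal sketch of resolution-progressive coding: a low-resolution base
# is followed by a residual that refines it to full resolution
# (a two-level, Laplacian-pyramid-style decomposition).
import numpy as np

def downsample(img):
    return img[::2, ::2]

def upsample(img, shape):
    # Nearest-neighbour upsampling; real codecs use better filters.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]

rng = np.random.default_rng(1)
image = rng.integers(0, 256, (16, 16)).astype(float)

base = downsample(image)                        # sent first: coarse preview
residual = image - upsample(base, image.shape)  # sent second: refinement

# A decoder that has received both parts reconstructs the image exactly.
reconstructed = upsample(base, image.shape) + residual
assert np.array_equal(reconstructed, image)
```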
Region of interest coding. Certain parts of the image are encoded with higher quality than others. This may be combined with scalability (encode these parts first, others later).
Meta information. Compressed data may contain information about the image which may be used to categorize, search, or browse images. Such information may include color and texture statistics, small preview images, and author or copyright information.
Processing power. Compression algorithms require different amounts of processing power to encode and decode. Some high compression algorithms require high processing power.
The quality of a compression method is often measured by the peak signal-to-noise ratio (PSNR). It measures the amount of noise introduced through a lossy compression of the image; however, the subjective judgment of the viewer is also regarded as an important measure, perhaps the most important measure.
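PSNR is commonly computed as 10·log10(MAX² / MSE), where MAX is the peak pixel value (255 for 8-bit images) and MSE is the mean squared error between the original image and its reconstruction. A minimal Python sketch (assuming NumPy) follows.

```python
# Minimal sketch of peak signal-to-noise ratio (PSNR) between an original
# image and its lossy reconstruction, for 8-bit images (peak value 255).
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(2)
img = rng.integers(0, 256, (64, 64))
noisy = np.clip(img + rng.normal(0, 5, img.shape), 0, 255)
print(f"PSNR: {psnr(img, noisy):.2f} dB")  # higher means less distortion
```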
History
Entropy coding started in the late 1940s with the introduction of Shannon–Fano coding,[8] the basis for Huffman coding which was published in 1952.[9] Transform coding dates back to the late 1960s, with the introduction of fast Fourier transform (FFT) coding in 1968 and the Hadamard transform in 1969.[10]
An important development in image data compression was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1973.[11] JPEG was introduced by the Joint Photographic Experts Group (JPEG) in 1992.[12] JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format.[13] JPEG was largely responsible for the wide proliferation of digital images and digital photos,[14] with several billion JPEG images produced every day as of 2015.[15]
Lempel–Ziv–Welch (LZW) is a lossless compression algorithm developed by Abraham Lempel, Jacob Ziv and Terry Welch in 1984. It is used in the GIF format, introduced in 1987.[16] DEFLATE, a lossless compression algorithm developed by Phil Katz and specified in 1996, is used in the Portable Network Graphics (PNG) format.[17]
The JPEG 2000 standard was developed from 1997 to 2000 by a JPEG committee chaired by Touradj Ebrahimi (later the JPEG president).[18] In contrast to the DCT algorithm used by the original JPEG format, JPEG 2000 instead uses discrete wavelet transform (DWT) algorithms. It uses the CDF 9/7 wavelet transform (developed by Ingrid Daubechies in 1992) for its lossy compression algorithm,[19] and the Le Gall–Tabatabai (LGT) 5/3 wavelet transform[20][21] (developed by Didier Le Gall and Ali J. Tabatabai in 1988)[22] for its lossless compression algorithm.[19] JPEG 2000 technology, which includes the Motion JPEG 2000 extension, was selected as the video coding standard for digital cinema in 2004.[23]
Huffman coding
Huffman coding is a fundamental technique used in image compression algorithms to achieve efficient data representation. Named after its inventor David A. Huffman, this method is widely employed in various image compression standards such as JPEG and PNG.
Principle of Huffman coding
Huffman coding is a form of entropy encoding that assigns variable-length codes to input symbols based on their frequencies of occurrence. The basic principle is to assign shorter codes to more frequently occurring symbols and longer codes to less frequent symbols, thereby reducing the average code length compared to fixed-length codes.
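For example, for four symbols occurring with probabilities 0.5, 0.25, 0.125 and 0.125, a Huffman code assigns codewords of lengths 1, 2, 3 and 3 bits (e.g. 0, 10, 110 and 111), giving an average of 0.5·1 + 0.25·2 + 0.125·3 + 0.125·3 = 1.75 bits per symbol, compared with 2 bits per symbol for a fixed-length code over the same four symbols.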
Application in image compression
In image compression, Huffman coding is typically applied after other transformations such as the Discrete Cosine Transform (DCT) in the case of JPEG compression. After transforming the image data into a frequency domain representation, Huffman coding is used to encode the transformed coefficients efficiently.
Steps in Huffman coding for image compression
- Frequency Analysis: Calculate the frequency of occurrence of each symbol or symbol combination in the transformed image data.
- Constructing the Huffman Tree: Build a Huffman tree based on the symbol frequencies. The tree is built greedily by repeatedly combining the two nodes with the lowest frequencies until a single root node is formed.
- Assigning Codewords: Traverse the Huffman tree to assign variable-length codewords to each symbol, with shorter codewords assigned to more frequent symbols.
- Encoding: Replace the original symbols in the image data with their corresponding Huffman codewords to generate the compressed data stream.
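The following is a minimal Python sketch of these four steps on a toy stream of symbols (for instance, quantized transform coefficients). It is only illustrative: JPEG in practice uses code tables and bit-level packing defined by the standard.

```python
# Minimal sketch of Huffman coding: frequency analysis, tree construction,
# codeword assignment, and encoding of a toy symbol stream.
import heapq
from collections import Counter

def huffman_codes(symbols):
    # Step 1: frequency analysis.
    freqs = Counter(symbols)
    # Step 2: build the tree by repeatedly merging the two lowest-frequency
    # nodes. Heap entries are (freq, tie_breaker, node); a node is either a
    # symbol or a (left, right) pair of child nodes.
    heap = [(f, i, s) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:                      # degenerate one-symbol case
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    # Step 3: assign codewords by walking the tree (left = "0", right = "1").
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

# Step 4: encode the stream by concatenating codewords.
data = [0, 0, 0, 0, 5, 5, -3, 7]
codes = huffman_codes(data)
bitstream = "".join(codes[s] for s in data)
print(codes, bitstream)
```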
Benefits of Huffman coding in image compression
- Lossless Compression: Huffman coding is itself lossless and can be used within both lossy and lossless image compression techniques, providing flexibility in balancing compression ratio and image quality.
- Efficiency: By assigning shorter codes to frequently occurring symbols, Huffman coding reduces the average code length, resulting in efficient data representation and reduced storage requirements.
- Compatibility: Huffman coding is widely supported and can be seamlessly integrated into existing image compression standards and algorithms.
Conclusion
Huffman coding plays a crucial role in image compression by efficiently encoding image data into a compact representation. Its ability to adaptively assign variable-length codewords based on symbol frequencies makes it an essential component in modern image compression techniques, contributing to the reduction of storage space and transmission bandwidth while maintaining image quality.
Notes and references
edit- ^"Image Data Compression".
- ^Ahmed, N.; Natarajan, T.; Rao, K.R. (1974)."Discrete Cosine Transform"(PDF).IEEE Transactions on Computers:90–93.doi:10.1109/T-C.1974.223784.S2CID149806273.Archived fromthe original(PDF)on 2011-11-25.
- ^Gilad David Maayan (Nov 24, 2021)."AI-Based Image Compression: The State of the Art".Towards Data Science.Retrieved6 April2023.
- ^Bühlmann, Matthias (2022-09-28)."Stable Diffusion Based Image Compression".Medium.Retrieved2022-11-02.
- ^"High-Fidelity Generative Image Compression".Retrieved6 April2023.
- ^Burt, P.; Adelson, E. (1 April 1983). "The Laplacian Pyramid as a Compact Image Code".IEEE Transactions on Communications.31(4): 532–540.CiteSeerX10.1.1.54.299.doi:10.1109/TCOM.1983.1095851.S2CID8018433.
- ^Shao, Dan; Kropatsch, Walter G. (February 3–5, 2010). Špaček, Libor; Franc, Vojtěch (eds.)."Irregular Laplacian Graph Pyramid"(PDF).Computer Vision Winter Workshop 2010.Nové Hrady, Czech Republic: Czech Pattern Recognition Society.Archived(PDF)from the original on 2013-05-27.
- ^Claude Elwood Shannon(1948). Alcatel-Lucent (ed.)."A Mathematical Theory of Communication"(PDF).Bell System Technical Journal.27(3–4): 379–423, 623–656.doi:10.1002/j.1538-7305.1948.tb01338.x.hdl:11858/00-001M-0000-002C-4314-2.Archived(PDF)from the original on 2011-05-24.Retrieved2019-04-21.
- ^David Albert Huffman(September 1952),"A method for the construction of minimum-redundancy codes"(PDF),Proceedings of the IRE,vol. 40, no. 9, pp. 1098–1101,doi:10.1109/JRPROC.1952.273898,archived(PDF)from the original on 2005-10-08
- ^Pratt, W.K.; Kane, J.; Andrews, H.C. (1969). "Hadamard transform image coding".Proceedings of the IEEE.57:58–68.doi:10.1109/PROC.1969.6869.
- ^Ahmed, Nasir(January 1991)."How I Came Up With the Discrete Cosine Transform".Digital Signal Processing.1(1): 4–5.Bibcode:1991DSP.....1....4A.doi:10.1016/1051-2004(91)90086-Z.
- ^"T.81 – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES"(PDF).CCITT.September 1992.Archived(PDF)from the original on 2000-08-18.Retrieved12 July2019.
- ^"The JPEG image format explained".BT.BT Group.31 May 2018.Retrieved5 August2019.
- ^"What Is a JPEG? The Invisible Object You See Every Day".The Atlantic.24 September 2013.Retrieved13 September2019.
- ^Baraniuk, Chris (15 October 2015)."Copy protections could come to JPEGs".BBC News.BBC.Retrieved13 September2019.
- ^"The GIF Controversy: A Software Developer's Perspective".27 January 1995.Retrieved26 May2015.
- ^L. Peter Deutsch(May 1996).DEFLATE Compressed Data Format Specification version 1.3.IETF.p. 1. sec. Abstract.doi:10.17487/RFC1951.RFC1951.Retrieved2014-04-23.
- ^Taubman, David; Marcellin, Michael (2012).JPEG2000 Image Compression Fundamentals, Standards and Practice: Image Compression Fundamentals, Standards and Practice.Springer Science & Business Media.ISBN9781461507994.
- ^abUnser, M.; Blu, T. (2003)."Mathematical properties of the JPEG2000 wavelet filters"(PDF).IEEE Transactions on Image Processing.12(9): 1080–1090.Bibcode:2003ITIP...12.1080U.doi:10.1109/TIP.2003.812329.PMID18237979.S2CID2765169.Archived fromthe original(PDF)on 2019-10-13.
- ^Sullivan, Gary (8–12 December 2003)."General characteristics and design considerations for temporal subband video coding".ITU-T.Video Coding Experts Group.Retrieved13 September2019.
- ^Bovik, Alan C. (2009).The Essential Guide to Video Processing.Academic Press.p. 355.ISBN9780080922508.
- ^Le Gall, Didier; Tabatabai, Ali J. (1988). "Sub-band coding of digital images using symmetric short kernel filters and arithmetic coding techniques".ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.pp. 761–764 vol.2.doi:10.1109/ICASSP.1988.196696.S2CID109186495.
- ^Swartz, Charles S. (2005).Understanding Digital Cinema: A Professional Handbook.Taylor & Francis.p. 147.ISBN9780240806174.