How does the JPEG format work?
The JPEG compression process consists of several steps. In the first step, the image is converted from the RGB color space to the YUV color space (in JPEG files, strictly speaking, its digital variant YCbCr), which describes each pixel by its brightness and its color separately. All further work is done in this color space, and it is its properties that make such high degrees of compression possible.
What is so unusual about the YUV color representation compared to RGB? It is closer to the "natural" way of seeing: a person unconsciously separates how bright something is from what color it is. The Y component, or brightness, is closely tied to picture quality. More precisely, Y is the picture itself, only in black and white. The U and V components contain the color information and allow us to colorize the Y picture.
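The conversion itself is a fixed weighted sum per pixel. Here is a minimal sketch, assuming the full-range YCbCr constants commonly used by JFIF-style JPEG files (YCbCr is the digital relative of the YUV space described above). The function name `rgb_to_ycbcr` is made up for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style constants)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # brightness: the black-and-white picture
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128     # blue-difference color channel
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128      # red-difference color channel
    # Round and clamp each channel back into the 0..255 range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(255, 255, 255))  # white: full brightness, neutral color channels
print(rgb_to_ycbcr(255, 0, 0))      # pure red: low brightness, strong red-difference
```

Any gray pixel ends up with both color channels at the neutral value 128, which is exactly the sense in which Y alone is the black-and-white picture.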
In the next step after the conversion, the image is divided into square blocks of 8×8 pixels. Each block is then processed with the so-called discrete cosine transform (DCT). The transform analyzes each block and decomposes it into frequency components: coefficients that say how much of the block is smooth, slowly changing content (low frequencies) and how much is fine detail and sharp edges (high frequencies).
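The block split and the transform can be sketched in a few lines. This is a deliberately naive, slow illustration, assuming one color plane stored as a list of rows whose sides are multiples of 8. The helper names `split_into_blocks` and `dct_2d` are hypothetical, and real encoders use fast factorized transforms and subtract 128 from each sample first.

```python
import math

N = 8  # JPEG block side


def split_into_blocks(plane):
    """Yield 8x8 blocks from a 2-D list whose sides are multiples of 8."""
    height, width = len(plane), len(plane[0])
    for top in range(0, height, N):
        for left in range(0, width, N):
            yield [row[left:left + N] for row in plane[top:top + N]]


def dct_2d(block):
    """Naive 2-D DCT-II of one 8x8 block, with the scaling used by JPEG."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


flat = [[50] * N for _ in range(N)]   # a block with no detail at all
coeffs = dct_2d(flat)
print(round(coeffs[0][0]))            # all the energy sits in the top-left coefficient
```

For a perfectly flat block only the top-left coefficient (the block average, usually called DC) is non-zero, and the rest are zero up to floating-point error. Detail in the block shows up as non-zero values further right and further down.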
The human eye is built in such a way that it is most sensitive to the brightness component of the image (the Y component) and least sensitive to color. The reason for this lies in physiology. You probably remember that the lens of the eye focuses the image onto the retina at the back of the eye, which is covered with rods and cones. Rods are the sensors that perceive brightness, and cones perceive color. Moreover, rods outnumber cones by more than an order of magnitude, and they are much more sensitive to light. Remember the proverb “At night, all cats are gray.” Why is that? Why does everything lose color in the evening? Because the amount of light entering the eye is not enough to trigger a response from the cones. The sensitivity of the eye to different colors is not constant either: it responds most strongly to green and is noticeably less sensitive to blue, at the upper end of the visible spectrum. The JPG format takes exactly these features into account.
Once the transform has turned a block into frequency information, part of that information can be discarded right away, in the process of quantization. The components in the upper part of the block's frequency spectrum, which describe the finest detail, are dropped or stored very coarsely, and this has practically no effect on the visual perception of the image. A portion of the luminance information is also excluded. Roughly speaking, JPG simply discards half the useful signal from the luminance component and 3/4 from the color components. These figures are, of course, approximate, because the loss is graded and the real compression schemes are more complex.
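Quantization comes down to dividing each frequency coefficient by a step size and rounding. The sketch below assumes the example luminance table printed in Annex K of the JPEG standard. Real encoders scale such a table according to the quality setting, and the function names `quantize` and `dequantize` are hypothetical.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
# Steps grow toward the bottom-right, where the high frequencies live.
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=LUMA_Q):
    """Divide each coefficient by its step and round: this is where data is lost."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table=LUMA_Q):
    """Multiply back by the step; the rounding error cannot be recovered."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 400.0   # block average (lowest frequency)
coeffs[0][1] = -30.0   # coarse detail
coeffs[7][7] = 30.0    # the same amount of very fine detail

levels = quantize(coeffs)
restored = dequantize(levels)
print(levels[0][0], levels[0][1], levels[7][7])
print(restored[0][0], restored[0][1], restored[7][7])
```

The coarse detail survives with a small error (-30 comes back as -33), while the equally strong fine detail is rounded to zero and disappears.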
The amount of information excluded during compression depends on the required image quality. At the highest compression levels, the detail is erased completely and the block turns into a flat patch of a single averaged tone. At medium and low levels of compression, approximate color information for the block is stored in the file. The degree of this “approximation” depends directly on the degree of compression. You need to understand that, unlike conventional formats that store every pixel, JPG stores approximate colors. In more scientific terms, JPG stores the image as a cosine series (a close relative of the Fourier series) and, at high compression ratios, simply discards the higher-order terms of the series. Every time the image is displayed on the screen, the computer has to synthesize it again from these terms. This is fairly resource-intensive and can be noticeable on slow computers. One remark follows from this: once you have saved an image in JPG format, it is impossible to restore it to the last pixel! This is why the format is called “lossy,” and why it is not recommended to recompress JPG images, since each re-save generally makes them a little worse. And what if you do it 10 times?
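The effect of dropping higher-order terms is easy to see on a single row of 8 samples. This is a simplified one-dimensional sketch, not what a real encoder does: JPEG works on 8×8 blocks and coarsens coefficients by quantization rather than cutting them off at a fixed index. The function names and the sample row are made up for illustration.

```python
import math

N = 8


def dct_1d(samples):
    """Orthonormal DCT-II: turn 8 samples into 8 cosine-series terms."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * sum(s * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                               for n, s in enumerate(samples)))
    return out


def idct_1d(coeffs):
    """Inverse transform: synthesize the samples back from the series terms."""
    out = []
    for n in range(N):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
            total += scale * c * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        out.append(total)
    return out


def keep_low_terms(samples, kept):
    """Keep only the first `kept` terms of the series and reconstruct."""
    coeffs = dct_1d(samples)
    truncated = coeffs[:kept] + [0.0] * (N - kept)
    return [round(v) for v in idct_1d(truncated)]


row = [52, 55, 61, 66, 70, 61, 64, 73]
print(keep_low_terms(row, 8))  # all terms kept: the original row comes back
print(keep_low_terms(row, 4))  # half the terms: a smoothed approximation
print(keep_low_terms(row, 1))  # one term: every sample becomes the average (62.75, rounded to 63)
```

With only the first term left, the whole row collapses to its average, which is the one-dimensional version of a block turning into a flat patch.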
The brightness and color information is then encoded so that, for the average value of each block, only the difference from the previous block is saved. As a result, blocks are represented by strings of numbers that can be compressed further. Since the blocks contain many zeros after quantization, the last stage of coding, performed with the Huffman algorithm (similar to the one used in archivers), gives good results. Hence another small remark: compressing a JPG file with an archiver makes little sense, because it is already compressed. The resulting archive will be about the same size as the original photo and may even be slightly larger.
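The sketch below shows how a quantized block might be turned into such a string of numbers before Huffman coding. It is simplified: real JPEG caps zero runs at 15 and packs each pair together with a bit-size category, and the Huffman stage itself is omitted. The function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zigzag order, low frequencies first."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col of one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_length(ac):
    """Encode values as (zeros_before, value) pairs, then an end-of-block mark."""
    pairs, zeros = [], 0
    for v in ac:
        if v == 0:
            zeros += 1
        else:
            pairs.append((zeros, v))
            zeros = 0
    pairs.append("EOB")                             # everything after this point is zero
    return pairs


def dc_differences(dc_values):
    """Store each block average as the difference from the previous block."""
    prev, out = 0, []
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 25, -3, 2, 4

scan = [block[r][c] for r, c in zigzag_order()]
print(run_length(scan[1:]))           # the 63 non-average values collapse to a few pairs
print(dc_differences([25, 27, 26]))   # averages of three neighboring blocks
```

The zigzag scan pushes the zeros produced by quantization to the end of the string, so a single end-of-block mark replaces dozens of values.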
Thus, the original 24 bits per pixel, or 1536 bits (192 bytes) per 8×8 block, are converted into a handful of bits that describe the visual characteristics of that entire area of the image.
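The figures above are simple to check. In this small sketch the compressed size passed to `compression_ratio` is a made-up number for illustration, not a measurement.

```python
BLOCK_SIDE = 8
BITS_PER_PIXEL = 24          # 8 bits each for R, G and B

raw_bits = BLOCK_SIDE * BLOCK_SIDE * BITS_PER_PIXEL   # 64 pixels * 24 bits
raw_bytes = raw_bits // 8


def compression_ratio(compressed_bits):
    """How many times smaller the encoded block is than the raw one."""
    return raw_bits / compressed_bits


print(raw_bits, raw_bytes)       # 1536 192
print(compression_ratio(96))     # hypothetical block encoded in 96 bits
```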