Background
==========

libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
AVX2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression
on x86, x86-64, ARM, and PowerPC systems, as well as progressive JPEG
compression on x86 and x86-64 systems. On such systems, libjpeg-turbo is
generally 2-6x as fast as libjpeg, all else being equal. On other types of
systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by
virtue of its highly optimized Huffman coding routines. In many cases, the
performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.

libjpeg-turbo implements both the traditional libjpeg API and the less
powerful but more straightforward TurboJPEG API. libjpeg-turbo also features
colorspace extensions that allow it to compress from/decompress to 32-bit and
big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java
interface.

libjpeg-turbo was originally based on libjpeg/SIMD, an MMX-accelerated
derivative of libjpeg v6b developed by Miyasaka Masaru. The TigerVNC and
VirtualGL projects made numerous enhancements to the codec in 2009, and in
early 2010, libjpeg-turbo spun off into an independent project, with the goal
of making high-speed JPEG compression/decompression technology available to a
broader range of users and developers.

License
=======

libjpeg-turbo is covered by three compatible BSD-style open source licenses.
Refer to [LICENSE.md](LICENSE.md) for a roll-up of license terms.

Building libjpeg-turbo
======================

Refer to [BUILDING.md](BUILDING.md) for complete instructions.

Using libjpeg-turbo
===================

libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:

- **TurboJPEG API**<br>
  This API provides an easy-to-use interface for compressing and decompressing
  JPEG images in memory. It also provides some functionality that would not be
  straightforward to achieve using the underlying libjpeg API, such as
  generating planar YUV images and performing multiple simultaneous lossless
  transforms on an image. The Java interface for libjpeg-turbo is written on
  top of the TurboJPEG API. The TurboJPEG API is recommended for first-time
  users of libjpeg-turbo. Refer to [tjexample.c](tjexample.c) and
  [TJExample.java](java/TJExample.java) for examples of its usage and to
  <http://libjpeg-turbo.org/Documentation/Documentation> for API documentation.

- **libjpeg API**<br>
  This is the de facto industry-standard API for compressing and decompressing
  JPEG images. It is more difficult to use than the TurboJPEG API but also
  more powerful. The libjpeg API implementation in libjpeg-turbo is both
  API/ABI-compatible and mathematically compatible with libjpeg v6b. It can
  also optionally be configured to be API/ABI-compatible with libjpeg v7 and v8
  (see below). Refer to [cjpeg.c](cjpeg.c) and [djpeg.c](djpeg.c) for examples
  of its usage and to [libjpeg.txt](libjpeg.txt) for API documentation.

There is no significant performance advantage to either API when both are used
to perform similar operations.

Colorspace Extensions
---------------------

libjpeg-turbo includes extensions that allow JPEG images to be compressed
directly from (and decompressed directly to) buffers that use BGR, BGRX,
RGBX, XBGR, and XRGB pixel ordering. This is implemented with ten new
colorspace constants:

    JCS_EXT_RGB   /* red/green/blue */
    JCS_EXT_RGBX  /* red/green/blue/x */
    JCS_EXT_BGR   /* blue/green/red */
    JCS_EXT_BGRX  /* blue/green/red/x */
    JCS_EXT_XBGR  /* x/blue/green/red */
    JCS_EXT_XRGB  /* x/red/green/blue */
    JCS_EXT_RGBA  /* red/green/blue/alpha */
    JCS_EXT_BGRA  /* blue/green/red/alpha */
    JCS_EXT_ABGR  /* alpha/blue/green/red */
    JCS_EXT_ARGB  /* alpha/red/green/blue */

Setting `cinfo.in_color_space` (compression) or `cinfo.out_color_space`
(decompression) to one of these values will cause libjpeg-turbo to read the
red, green, and blue values from (or write them to) the appropriate position in
the pixel when compressing from/decompressing to an RGB buffer.

Your application can check for the existence of these extensions at compile
time with:

    #ifdef JCS_EXTENSIONS

At run time, attempting to use these extensions with a libjpeg implementation
that does not support them will result in a "Bogus input colorspace" error.
Applications can trap this error in order to test whether run-time support is
available for the colorspace extensions.

When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
can set that byte to whatever value it wishes. If an application expects the X
byte to be used as an alpha channel, then it should specify `JCS_EXT_RGBA`,
`JCS_EXT_BGRA`, `JCS_EXT_ABGR`, or `JCS_EXT_ARGB`. When these colorspace
constants are used, the X byte is guaranteed to be 0xFF, which is interpreted
as opaque.
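
The byte positions implied by the constants above can be sketched as a small
lookup table. This is only an illustration of the pixel layouts, not code from
libjpeg-turbo: the `pixel_layout` enum, the `layouts` table, and
`store_pixel()` are hypothetical names, and a real program would simply assign
a `JCS_EXT_*` constant to `cinfo.in_color_space` or `cinfo.out_color_space`.
The alpha variants (RGBA, BGRA, ABGR, ARGB) share the byte positions of their
X counterparts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout IDs for illustration only; a real program would use the
   JCS_EXT_* constants from jpeglib.h. */
typedef enum {
  LAYOUT_RGB, LAYOUT_RGBX, LAYOUT_BGR, LAYOUT_BGRX, LAYOUT_XBGR, LAYOUT_XRGB,
  LAYOUT_COUNT
} pixel_layout;

typedef struct {
  int size;     /* bytes per pixel */
  int r, g, b;  /* byte offsets of the color components within a pixel */
  int x;        /* byte offset of the X (or alpha) byte, -1 if there is none */
} layout_info;

const layout_info layouts[LAYOUT_COUNT] = {
  [LAYOUT_RGB]  = { 3, 0, 1, 2, -1 },
  [LAYOUT_RGBX] = { 4, 0, 1, 2, 3 },
  [LAYOUT_BGR]  = { 3, 2, 1, 0, -1 },
  [LAYOUT_BGRX] = { 4, 2, 1, 0, 3 },
  [LAYOUT_XBGR] = { 4, 3, 2, 1, 0 },
  [LAYOUT_XRGB] = { 4, 1, 2, 3, 0 }
};

/* Write one pixel in the given layout.  If opaque_alpha is nonzero and the
   layout has a fourth byte, that byte is set to 0xFF (the behavior described
   for the alpha variants); otherwise the fourth byte is left untouched,
   mirroring the "undefined X byte" case. */
void store_pixel(unsigned char *dst, pixel_layout layout,
                 unsigned char r, unsigned char g, unsigned char b,
                 int opaque_alpha)
{
  const layout_info *info;

  assert(dst != NULL);
  assert((int)layout >= 0 && layout < LAYOUT_COUNT);
  info = &layouts[layout];

  dst[info->r] = r;
  dst[info->g] = g;
  dst[info->b] = b;
  if (opaque_alpha && info->x >= 0)
    dst[info->x] = 0xFF;
}
```

A caller that needs a usable alpha channel would pass a nonzero
`opaque_alpha`, which corresponds to choosing one of the `JCS_EXT_*A*`
constants rather than the X variants.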

Your application can check for the existence of the alpha channel colorspace
extensions at compile time with:

    #ifdef JCS_ALPHA_EXTENSIONS

[jcstest.c](jcstest.c), located in the libjpeg-turbo source tree, demonstrates
how to check for the existence of the colorspace extensions at compile time and
run time.

libjpeg v7 and v8 API/ABI Emulation
-----------------------------------

With libjpeg v7 and v8, new features were added that necessitated extending the
compression and decompression structures. Unfortunately, due to the exposed
nature of those structures, extending them also necessitated breaking backward
ABI compatibility with previous libjpeg releases. Thus, programs that were
built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
based on the libjpeg v6b code base. Although libjpeg v7 and v8 are not
as widely used as v6b, enough programs (including a few Linux distros) made
the switch that there was a demand to emulate the libjpeg v7 and v8 ABIs
in libjpeg-turbo. It should be noted, however, that this feature was added
primarily so that applications that had already been compiled to use libjpeg
v7+ could take advantage of accelerated baseline JPEG encoding/decoding
without recompiling. libjpeg-turbo does not claim to support all of the
libjpeg v7+ features, nor to produce identical output to libjpeg v7+ in all
cases (see below).

By passing an argument of `--with-jpeg7` or `--with-jpeg8` to `configure`, or
an argument of `-DWITH_JPEG7=1` or `-DWITH_JPEG8=1` to `cmake`, you can build a
version of libjpeg-turbo that emulates the libjpeg v7 or v8 ABI, so that
programs that are built against libjpeg v7 or v8 can be run with libjpeg-turbo.
The following section describes which libjpeg v7+ features are supported and
which aren't.

### Support for libjpeg v7 and v8 Features

#### Fully supported

- **libjpeg: IDCT scaling extensions in decompressor**<br>
  libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
  1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
  and 1/2 are SIMD-accelerated).

- **libjpeg: Arithmetic coding**

- **libjpeg: In-memory source and destination managers**<br>
  See notes below.

- **cjpeg: Separate quality settings for luminance and chrominance**<br>
  Note that the libjpeg v7+ API was extended to accommodate this feature only
  for convenience purposes. It has always been possible to implement this
  feature with libjpeg v6b (see rdswitch.c for an example).

- **cjpeg: 32-bit BMP support**

- **cjpeg: `-rgb` option**

- **jpegtran: Lossless cropping**

- **jpegtran: `-perfect` option**

- **jpegtran: Forcing width/height when performing lossless crop**

- **rdjpgcom: `-raw` option**

- **rdjpgcom: Locale awareness**
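
The IDCT scaling factors listed at the top of this list are all of the form
N/8 for N from 1 to 16. The sketch below checks whether a fraction belongs to
that set and computes a scaled image dimension. It assumes the scaled dimension
is rounded up, which is how the `TJSCALED()` macro in turbojpeg.h computes it;
the function names here are hypothetical and are not part of either API.

```c
#include <assert.h>

/* Returns 1 if num/denom equals N/8 for some integer N in 1..16, which covers
   every scaling factor in the list above (plus the trivial factor 1/1). */
int is_listed_scaling_factor(unsigned int num, unsigned int denom)
{
  unsigned int eighths;

  if (denom == 0 || (num * 8) % denom != 0)
    return 0;
  eighths = (num * 8) / denom;
  return eighths >= 1 && eighths <= 16;
}

/* Scaled dimension, rounded up: ceil(dimension * num / denom). */
unsigned long scaled_dimension(unsigned long dimension, unsigned int num,
                               unsigned int denom)
{
  assert(denom != 0);
  return (dimension * num + denom - 1) / denom;
}
```

For instance, a 1920-pixel-wide image decompressed with a factor of 3/8 yields
`scaled_dimension(1920, 3, 8)`, which is 720.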

#### Not supported

NOTE: As of this writing, extensive research has been conducted into the
usefulness of DCT scaling as a means of data reduction and SmartScale as a
means of quality improvement. The reader is invited to peruse the research at
<http://www.libjpeg-turbo.org/About/SmartScale> and draw their own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

- **libjpeg: DCT scaling in compressor**<br>
  `cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
  There is no technical reason why DCT scaling could not be supported when
  emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
  below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
  8/9 would be available, which is of limited usefulness.

- **libjpeg: SmartScale**<br>
  `cinfo.block_size` is silently ignored.
  SmartScale is an extension to the JPEG format that allows for DCT block
  sizes other than 8x8. Providing support for this new format would be
  feasible (particularly without full acceleration). However, until/unless
  the format becomes either an official industry standard or, at minimum, an
  accepted solution in the community, we are hesitant to implement it, as
  there is no sense of whether or how it might change in the future. It is
  our belief that SmartScale has not demonstrated sufficient usefulness either
  as a lossless format or as a means of quality enhancement, and thus our
  primary interest in providing this feature would be as a means of supporting
  additional DCT scaling factors.

- **libjpeg: Fancy downsampling in compressor**<br>
  `cinfo.do_fancy_downsampling` is silently ignored.
  This requires the DCT scaling feature, which is not supported.

- **jpegtran: Scaling**<br>
  This requires both the DCT scaling and SmartScale features, which are not
  supported.

- **Lossless RGB JPEG files**<br>
  This requires the SmartScale feature, which is not supported.
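
As a piece of arithmetic, the eight compressor scaling factors quoted in the
DCT scaling item above (1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and 8/9) are
exactly the fractions 8/N for N from 9 to 16, written in lowest terms. The
sketch below reproduces that reduction; the `fraction` type and the function
names are hypothetical, and nothing here is taken from the libjpeg sources.

```c
#include <assert.h>

typedef struct {
  unsigned int num, denom;
} fraction;

/* Greatest common divisor (Euclid's algorithm). */
unsigned int gcd_uint(unsigned int a, unsigned int b)
{
  while (b != 0) {
    unsigned int t = a % b;
    a = b;
    b = t;
  }
  return a;
}

/* Returns 8/n reduced to lowest terms. */
fraction eight_over(unsigned int n)
{
  fraction f;
  unsigned int g;

  assert(n != 0);
  g = gcd_uint(8, n);
  f.num = 8 / g;
  f.denom = n / g;
  return f;
}
```

Calling `eight_over()` for each N from 16 down to 9 reproduces the list in the
order in which it is quoted above.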

### What About libjpeg v9?

libjpeg v9 introduced yet another field to the JPEG compression structure
(`color_transform`), thus making the ABI backward incompatible with that of
libjpeg v8. This new field was introduced solely for the purpose of supporting
lossless SmartScale encoding. Furthermore, there was actually no reason to
extend the API in this manner, as the color transform could have just as easily
been activated by way of a new JPEG colorspace constant, thus preserving
backward ABI compatibility.

Our research (see link above) has shown that lossless SmartScale does not
generally accomplish anything that can't already be accomplished better with
existing, standard lossless formats. Therefore, at this time it is our belief
that there is not sufficient technical justification for software projects to
upgrade from libjpeg v8 to libjpeg v9, and thus there is not sufficient
technical justification for us to emulate the libjpeg v9 ABI.

In-Memory Source/Destination Managers
-------------------------------------

By default, libjpeg-turbo 1.3 and later includes the `jpeg_mem_src()` and
`jpeg_mem_dest()` functions, even when not emulating the libjpeg v8 API/ABI.
Previously, it was necessary to build libjpeg-turbo from source with libjpeg v8
API/ABI emulation in order to use the in-memory source/destination managers,
but several projects requested that those functions be included when emulating
the libjpeg v6b API/ABI as well. This allows the use of those functions by
programs that need them, without breaking ABI compatibility for programs that
don't, and it allows those functions to be provided in the "official"
libjpeg-turbo binaries.

Those who are concerned about maintaining strict conformance with the libjpeg
v6b or v7 API can pass an argument of `--without-mem-srcdst` to `configure` or
an argument of `-DWITH_MEM_SRCDST=0` to `cmake` prior to building
libjpeg-turbo. This will restore the pre-1.3 behavior, in which
`jpeg_mem_src()` and `jpeg_mem_dest()` are only included when emulating the
libjpeg v8 API/ABI.

On Un*x systems, including the in-memory source/destination managers changes
the dynamic library version from 62.1.0 to 62.2.0 if using libjpeg v6b API/ABI
emulation and from 7.1.0 to 7.2.0 if using libjpeg v7 API/ABI emulation.

Note that, on most Un*x systems, the dynamic linker will not look for a
function in a library until that function is actually used. Thus, if a program
is built against libjpeg-turbo 1.3+ and uses `jpeg_mem_src()` or
`jpeg_mem_dest()`, that program will not fail if run against an older version
of libjpeg-turbo or against libjpeg v7 or earlier until the program actually
tries to call `jpeg_mem_src()` or `jpeg_mem_dest()`. Such is not the case on
Windows. If a program is built against the libjpeg-turbo 1.3+ DLL and uses
`jpeg_mem_src()` or `jpeg_mem_dest()`, then it must use the libjpeg-turbo 1.3+
DLL at run time.

Both cjpeg and djpeg have been extended to allow testing the in-memory
source/destination manager functions. See their respective man pages for more
details.

Mathematical Compatibility
==========================

For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b. The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:

- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
  slightly more accurate than the implementation in libjpeg v6b, but not by
  any amount perceptible to human vision (generally in the range of 0.01 to
  0.08 dB gain in PSNR).

- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
  (and slightly faster) floating point IDCT algorithm introduced in libjpeg
  v8a as opposed to the algorithm used in libjpeg v6b. It should be noted,
  however, that this algorithm basically brings the accuracy of the floating
  point IDCT in line with the accuracy of the slow integer IDCT. The floating
  point DCT/IDCT algorithms are mainly a legacy feature, and they do not
  produce significantly more accuracy than the slow integer algorithms (to put
  numbers on this, the typical difference in PSNR between the two algorithms
  is less than 0.10 dB, whereas changing the quality level by 1 in the upper
  range of the quality scale is typically more like a 1.0 dB difference).

- If the floating point algorithms in libjpeg-turbo are not implemented using
  SIMD instructions on a particular platform, then the accuracy of the
  floating point DCT/IDCT can depend on the compiler settings.
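
To put the decibel figures above in perspective, the following assumes the
usual definition of peak signal-to-noise ratio for 8-bit samples, where MSE is
the mean squared error against the original image. How the figures quoted
above were measured is not stated here, so treat this as an interpretation
aid rather than a description of the project's test method.

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}}\right)\ \text{dB},
\qquad
\Delta\mathrm{PSNR} = \mathrm{PSNR}_b - \mathrm{PSNR}_a
  = 10 \log_{10}\!\left(\frac{\mathrm{MSE}_a}{\mathrm{MSE}_b}\right)
\quad\Longrightarrow\quad
\frac{\mathrm{MSE}_a}{\mathrm{MSE}_b} = 10^{\Delta\mathrm{PSNR}/10}
```

Under that definition, a 0.10 dB difference corresponds to an MSE ratio of
10^0.01, or roughly 1.023 (about 2%), whereas a 1.0 dB difference corresponds
to a ratio of 10^0.1, or roughly 1.26 (about 26%).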

While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood it is
still using the same algorithms as libjpeg v6b, so there are several specific
cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:

- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
  implements those scaling algorithms differently than libjpeg v6b does, and
  libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.

- When using chrominance subsampling, because libjpeg v8 implements this
  with its DCT/IDCT scaling algorithms rather than with a separate
  downsampling/upsampling algorithm. In our testing, the subsampled/upsampled
  output of libjpeg v8 is less accurate than that of libjpeg v6b for this
  reason.

- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
  "non-smooth") chrominance upsampling, because libjpeg v8 does not support
  merged upsampling with scaling factors > 1.

Performance Pitfalls
====================

Restart Markers
---------------

The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
in a way that makes the rest of the libjpeg infrastructure happy, so it is
necessary to use the slow Huffman decoder when decompressing a JPEG image that
has restart markers. This can cause the decompression performance to drop by
as much as 20%, but the performance will still be much greater than that of
libjpeg. Many consumer packages, such as Photoshop, use restart markers when
generating JPEG images, so images generated by those programs will experience
this issue.
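
An application that wants to know in advance whether an image is likely to
take this slower path could inspect the JPEG header for a DRI (Define Restart
Interval, 0xFFDD) segment with a nonzero interval. The following is a minimal
sketch of such a check, not something libjpeg-turbo provides: the function
name is hypothetical, and it only examines the segments that precede the first
scan, so a restart interval defined later in a multi-scan file would be
missed.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the restart interval (in MCUs) declared before the first scan,
   0 if no restart interval is declared there, or -1 if the buffer does not
   look like a parsable JPEG header. */
long jpeg_restart_interval(const unsigned char *buf, size_t len)
{
  size_t pos = 2;
  long interval = 0;

  if (buf == NULL || len < 4 || buf[0] != 0xFF || buf[1] != 0xD8)
    return -1;                        /* missing SOI marker */

  while (pos + 4 <= len) {
    unsigned int marker;
    size_t seglen;

    if (buf[pos] != 0xFF)
      return -1;                      /* expected a marker here */
    marker = buf[pos + 1];
    if (marker == 0xFF) {             /* fill byte before a marker */
      pos++;
      continue;
    }
    if (marker == 0xDA || marker == 0xD9)
      return interval;                /* SOS or EOI: header is finished */
    if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7)) {
      pos += 2;                       /* standalone marker, no payload */
      continue;
    }

    /* All other segments carry a big-endian length that includes itself. */
    seglen = ((size_t)buf[pos + 2] << 8) | buf[pos + 3];
    if (seglen < 2 || pos + 2 + seglen > len)
      return -1;
    if (marker == 0xDD) {             /* DRI: 2-byte restart interval */
      if (seglen != 4)
        return -1;
      interval = ((long)buf[pos + 4] << 8) | buf[pos + 5];
    }
    pos += 2 + seglen;
  }
  return -1;                          /* ran out of data before the scan */
}
```

A positive return value suggests that the entropy-coded data contains restart
markers, in which case the performance drop described above may apply.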
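
Fast Integer Forward DCT at High Quality Levels
-----------------------------------------------

The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100. Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases. This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the slow integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.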
  269. function in those cases. This causes performance to drop by as much as 40%.
  270. It is therefore strongly advised that you use the slow integer forward DCT
  271. whenever encoding images with a JPEG quality of 98 or higher.