1 /* stb_image - v2.10 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
3
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
7
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc, realloc, and free.
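
A minimal sketch of replacing the allocator, assuming you want the hooks to
update some bookkeeping state. The macros receive only the arguments shown
(a size; a pointer and a new size; a pointer), so the state lives in a global
here. The names my_alloc_ctx, g_stbi_alloc_ctx and my_stbi_* are hypothetical
and not part of stb_image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical allocator context; stb_image knows nothing about this type.
typedef struct
{
    size_t live_blocks;   // blocks currently allocated through the hooks
    size_t total_calls;   // number of malloc/realloc requests seen
} my_alloc_ctx;

// The macros take no context argument, so the context is a global.
// (A thread-local would be the alternative when decoding on several threads.)
static my_alloc_ctx g_stbi_alloc_ctx;

static void *my_stbi_malloc(size_t size)
{
    void *p = malloc(size);
    g_stbi_alloc_ctx.total_calls++;
    if (p != NULL) g_stbi_alloc_ctx.live_blocks++;
    return p;
}

static void *my_stbi_realloc(void *ptr, size_t new_size)
{
    void *p;
    assert(new_size > 0);             // zero-size realloc is not handled in this sketch
    p = realloc(ptr, new_size);
    g_stbi_alloc_ctx.total_calls++;
    // A successful realloc of NULL behaves like malloc and creates a new block.
    if (ptr == NULL && p != NULL) g_stbi_alloc_ctx.live_blocks++;
    return p;
}

static void my_stbi_free(void *ptr)
{
    if (ptr != NULL) {
        assert(g_stbi_alloc_ctx.live_blocks > 0);
        g_stbi_alloc_ctx.live_blocks--;
    }
    free(ptr);
}

// Define all of these (or none) before creating the implementation.
#define STBI_MALLOC(sz)        my_stbi_malloc(sz)
#define STBI_REALLOC(p,newsz)  my_stbi_realloc(p,newsz)
#define STBI_FREE(p)           my_stbi_free(p)
```

In a real build these definitions would sit above the
#define STB_IMAGE_IMPLEMENTATION line in the one file that creates the
implementation.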
17
18
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
22
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
25
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
29
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
34
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
37
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
41
42 Full documentation under "DOCUMENTATION" below.
43
44
45 Revision 2.00 release notes:
46
47 - Progressive JPEG is now supported.
48
49 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
50
51 - x86 platforms now make use of SSE2 SIMD instructions for
52 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53 This work was done by Fabian "ryg" Giesen. SSE2 is used by
54 default, but NEON must be enabled explicitly; see docs.
55
56 With other JPEG optimizations included in this version, we see
57 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
58 on a JPEG on an ARM machine, relative to previous versions of this
59 library. The same results will not obtain for all JPGs and for all
60 x86/ARM machines. (Note that progressive JPEGs are significantly
61 slower to decode than regular JPEGs.) This doesn't mean that this
62 is the fastest JPEG decoder in the land; rather, it brings it
63 closer to parity with standard libraries. If you want the fastest
64 decode, look elsewhere. (See "Philosophy" section of docs below.)
65
66 See final bullet items below for more info on SIMD.
67
68 - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
69 the memory allocator. Unlike other STBI libraries, these macros don't
70 support a context parameter, so if you need to pass a context in to
71 the allocator, you'll have to store it in a global or a thread-local
72 variable.
73
74 - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
75 STBI_NO_LINEAR.
76 STBI_NO_HDR: suppress implementation of .hdr reader format
77 STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
78
79 - You can suppress implementation of any of the decoders to reduce
80 your code footprint by #defining one or more of the following
81 symbols before creating the implementation.
82
83 STBI_NO_JPEG
84 STBI_NO_PNG
85 STBI_NO_BMP
86 STBI_NO_PSD
87 STBI_NO_TGA
88 STBI_NO_GIF
89 STBI_NO_HDR
90 STBI_NO_PIC
91 STBI_NO_PNM (.ppm and .pgm)
92
93 - You can request *only* certain decoders and suppress all other ones
94 (this will be more forward-compatible, as addition of new decoders
95 doesn't require you to disable them explicitly):
96
97 STBI_ONLY_JPEG
98 STBI_ONLY_PNG
99 STBI_ONLY_BMP
100 STBI_ONLY_PSD
101 STBI_ONLY_TGA
102 STBI_ONLY_GIF
103 STBI_ONLY_HDR
104 STBI_ONLY_PIC
105 STBI_ONLY_PNM (.ppm and .pgm)
106
107 Note that you can define multiples of these, and you will get all
108 of them ("only x" and "only y" is interpreted to mean "only x&y").
109
110 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
112
113 - Compilation of all SIMD code can be suppressed with
114 #define STBI_NO_SIMD
115 It should not be necessary to disable SIMD unless you have issues
116 compiling (e.g. using an x86 compiler which doesn't support SSE
117 intrinsics or that doesn't support the method used to detect
118 SSE2 support at run-time), and even those can be reported as
119 bugs so I can refine the built-in compile-time checking to be
120 smarter.
121
122 - The old STBI_SIMD system which allowed installing a user-defined
123 IDCT etc. has been removed. If you need this, don't upgrade. My
124 assumption is that almost nobody was doing this, and those who
125 were will find the built-in SIMD more satisfactory anyway.
126
127 - RGB values computed for JPEG images are slightly different from
128 previous versions of stb_image. (This is due to using less
129 integer precision in SIMD.) The C code has been adjusted so
130 that the same RGB values will be computed regardless of whether
131 SIMD support is available, so your app should always produce
132 consistent results. But these results are slightly different from
133 previous versions. (Specifically, about 3% of available YCbCr values
134 will compute different RGB results from pre-1.49 versions by +-1;
135 most of the deviating values are one smaller in the G channel.)
136
137 - If you must produce consistent results with previous versions of
138 stb_image, #define STBI_JPEG_OLD and you will get the same results
139 you used to; however, you will not get the SIMD speedups for
140 the YCbCr-to-RGB conversion step (although you should still see
141 significant JPEG speedup from the other changes).
142
143 Please note that STBI_JPEG_OLD is a temporary feature; it will be
144 removed in future versions of the library. It is only intended for
145 near-term back-compatibility use.
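
The decoder-selection flags in the list above can be illustrated with a
cut-down, standalone copy of the selection logic. This is only a sketch
covering three formats; the demo_has_* names are hypothetical and exist just
to make the result visible at run time.

```c
// Request only two decoders; in a real build these go before the implementation.
#define STBI_ONLY_PNG
#define STBI_ONLY_JPEG

// If any STBI_ONLY_x is defined, every format that was not requested
// gets its STBI_NO_x flag. "only PNG" plus "only JPEG" means "PNG and JPEG".
#if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP)
   #ifndef STBI_ONLY_JPEG
   #define STBI_NO_JPEG
   #endif
   #ifndef STBI_ONLY_PNG
   #define STBI_NO_PNG
   #endif
   #ifndef STBI_ONLY_BMP
   #define STBI_NO_BMP
   #endif
#endif

// Turn the resulting flags into values that can be inspected.
#ifdef STBI_NO_JPEG
static const int demo_has_jpeg = 0;
#else
static const int demo_has_jpeg = 1;
#endif

#ifdef STBI_NO_PNG
static const int demo_has_png = 0;
#else
static const int demo_has_png = 1;
#endif

#ifdef STBI_NO_BMP
static const int demo_has_bmp = 0;
#else
static const int demo_has_bmp = 1;
#endif
```

With the two STBI_ONLY_ flags above, JPEG and PNG stay enabled and BMP is
suppressed without ever naming STBI_NO_BMP.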
146
147
148 Latest revision history:
149 2.10 (2016-01-22) avoid warning introduced in 2.09
150 2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
151 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
152 2.07 (2015-09-13) partial animated GIF support
153 limited 16-bit PSD support
154 minor bugs, code cleanup, and compiler warnings
155 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
156 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
157 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
158 2.03 (2015-04-12) additional corruption checking
159 stbi_set_flip_vertically_on_load
160 fix NEON support; fix mingw support
161 2.02 (2015-01-19) fix incorrect assert, fix warning
162 2.01 (2015-01-17) fix various warnings
163 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
164 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
165 progressive JPEG
166 PGM/PPM support
167 STBI_MALLOC,STBI_REALLOC,STBI_FREE
168 STBI_NO_*, STBI_ONLY_*
169 GIF bugfix
170 1.48 (2014-12-14) fix incorrectly-named assert()
171 1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
172 optimize PNG
173 fix bug in interlaced PNG with user-specified channel count
174
175 See end of file for full revision history.
176
177
178 ============================ Contributors =========================
179
180 Image formats Extensions, features
181 Sean Barrett (jpeg, png, bmp) Jetro Lauha (stbi_info)
182 Nicolas Schulz (hdr, psd) Martin "SpartanJ" Golini (stbi_info)
183 Jonathan Dummer (tga) James "moose2000" Brown (iPhone PNG)
184 Jean-Marc Lienher (gif) Ben "Disch" Wenger (io callbacks)
185 Tom Seddon (pic) Omar Cornut (1/2/4-bit PNG)
186 Thatcher Ulrich (psd) Nicolas Guillemot (vertical flip)
187 Ken Miller (pgm, ppm) Richard Mitton (16-bit PSD)
188 urraka@github (animated gif) Junggon Kim (PNM comments)
189 Daniel Gibson (16-bit TGA)
190
191 Optimizations & bugfixes
192 Fabian "ryg" Giesen
193 Arseny Kapoulkine
194
195 Bug & warning fixes
196 Marc LeBlanc David Woo Guillaume George Martins Mozeiko
197 Christpher Lloyd Martin Golini Jerry Jansson Joseph Thomson
198 Dave Moore Roy Eltham Hayaki Saito Phil Jordan
199 Won Chun Luke Graham Johan Duparc Nathan Reed
200 the Horde3D community Thomas Ruf Ronny Chevalier Nick Verigakis
201 Janez Zemva John Bartholomew Michal Cichon svdijk@github
202 Jonathan Blow Ken Hamada Tero Hanninen Baldur Karlsson
203 Laurent Gomila Cort Stratton Sergio Gonzalez romigrou@github
204 Aruelien Pocheville Thibault Reuille Cass Everitt
205 Ryamond Barbiero Paul Du Bois Engin Manap
206 Blazej Dariusz Roszkowski
207 Michaelangel007@github
208
209
210 LICENSE
211
212 This software is in the public domain. Where that dedication is not
213 recognized, you are granted a perpetual, irrevocable license to copy,
214 distribute, and modify this file as you see fit.
215
216 */
217
218 #ifndef STBI_INCLUDE_STB_IMAGE_H
219 #define STBI_INCLUDE_STB_IMAGE_H
220
221 // DOCUMENTATION
222 //
223 // Limitations:
224 // - no 16-bit-per-channel PNG
225 // - no 12-bit-per-channel JPEG
226 // - no JPEGs with arithmetic coding
227 // - no 1-bit BMP
228 // - GIF always returns *comp=4
229 //
230 // Basic usage (see HDR discussion below for HDR usage):
231 // int x,y,n;
232 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
233 // // ... process data if not NULL ...
234 // // ... x = width, y = height, n = # 8-bit components per pixel ...
235 // // ... replace '0' with '1'..'4' to force that many components per pixel
236 // // ... but 'n' will always be the number that it would have been if you said 0
237 // stbi_image_free(data)
238 //
239 // Standard parameters:
240 // int *x -- outputs image width in pixels
241 // int *y -- outputs image height in pixels
242 // int *comp -- outputs # of image components in image file
243 // int req_comp -- if non-zero, # of image components requested in result
244 //
245 // The return value from an image loader is an 'unsigned char *' which points
246 // to the pixel data, or NULL on an allocation failure or if the image is
247 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
248 // with each pixel consisting of N interleaved 8-bit components; the first
249 // pixel pointed to is top-left-most in the image. There is no padding between
250 // image scanlines or between pixels, regardless of format. The number of
251 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
252 // If req_comp is non-zero, *comp has the number of components that _would_
253 // have been output otherwise. E.g. if you set req_comp to 4, you will always
254 // get RGBA output, but you can check *comp to see if it's trivially opaque
255 // because e.g. there were only 3 channels in the source image.
256 //
257 // An output image with N components has the following components interleaved
258 // in this order in each pixel:
259 //
260 // N=#comp components
261 // 1 grey
262 // 2 grey, alpha
263 // 3 red, green, blue
264 // 4 red, green, blue, alpha
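/*
   The layout rules above (top-left origin, interleaved components, no
   padding) can be sketched as a few helpers. This is an illustration only;
   output_components, pixel_offset and alpha_is_trivially_opaque are
   hypothetical names, not part of the stb_image API.

```c
#include <assert.h>
#include <stddef.h>

// Number of components actually present in the returned buffer:
// req_comp if it is non-zero, otherwise the count reported through comp.
static int output_components(int comp, int req_comp)
{
    return req_comp != 0 ? req_comp : comp;
}

// Byte offset of component c of pixel (px, py) in a w-pixel-wide image with
// n interleaved 8-bit components. Row 0 is the top scanline.
static size_t pixel_offset(int w, int n, int px, int py, int c)
{
    assert(w > 0 && n >= 1 && n <= 4);
    assert(px >= 0 && px < w && py >= 0 && c >= 0 && c < n);
    return ((size_t)py * (size_t)w + (size_t)px) * (size_t)n + (size_t)c;
}

// True when the output has an alpha channel but the file did not,
// so every alpha byte in the result is fully opaque.
static int alpha_is_trivially_opaque(int comp, int req_comp)
{
    int n = output_components(comp, req_comp);
    int out_has_alpha = (n == 2 || n == 4);
    int src_has_alpha = (comp == 2 || comp == 4);
    return out_has_alpha && !src_has_alpha;
}
```

   For example, the green byte of pixel (1,2) in a 4-pixel-wide RGB image is
   data[pixel_offset(4, 3, 1, 2, 1)].
*/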
265 //
266 // If image loading fails for any reason, the return value will be NULL,
267 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
268 // can be queried for an extremely brief, end-user unfriendly explanation
269 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
270 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
271 // more user-friendly ones.
272 //
273 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
274 //
275 // ===========================================================================
276 //
277 // Philosophy
278 //
279 // stb libraries are designed with the following priorities:
280 //
281 // 1. easy to use
282 // 2. easy to maintain
283 // 3. good performance
284 //
285 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
286 // and for best performance I may provide less-easy-to-use APIs that give higher
287 // performance, in addition to the easy to use ones. Nevertheless, it's important
288 // to keep in mind that from the standpoint of you, a client of this library,
289 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
290 //
291 // Some secondary priorities arise directly from the first two, some of which
292 // make more explicit reasons why performance can't be emphasized.
293 //
294 // - Portable ("ease of use")
295 // - Small footprint ("easy to maintain")
296 // - No dependencies ("ease of use")
297 //
298 // ===========================================================================
299 //
300 // I/O callbacks
301 //
302 // I/O callbacks allow you to read from arbitrary sources, like packaged
303 // files or some other source. Data read from callbacks are processed
304 // through a small internal buffer (currently 128 bytes) to try to reduce
305 // overhead.
306 //
307 // The three functions you must define are "read" (reads some bytes of data),
308 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
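/*
   A minimal sketch of the three callbacks, reading from a block of memory.
   So that it compiles on its own, it declares demo_io_callbacks, a local
   mirror of the stbi_io_callbacks struct declared further down; mem_stream
   and the mem_* functions are hypothetical names. With the real header you
   would fill in a stbi_io_callbacks and pass it, together with the mem_stream
   pointer as 'user', to stbi_load_from_callbacks.

```c
#include <assert.h>
#include <string.h>

// Same three members, in the same order, as stbi_io_callbacks.
typedef struct
{
    int  (*read)(void *user, char *data, int size);
    void (*skip)(void *user, int n);
    int  (*eof)(void *user);
} demo_io_callbacks;

// Hypothetical stream state: a block of memory plus a read cursor.
typedef struct
{
    const unsigned char *bytes;
    int size;
    int pos;
} mem_stream;

// Fill 'data' with up to 'size' bytes; return the number actually copied.
static int mem_read(void *user, char *data, int size)
{
    mem_stream *s = (mem_stream *)user;
    int avail = s->size - s->pos;
    int n = size < avail ? size : avail;
    assert(size >= 0);
    if (n <= 0) return 0;
    memcpy(data, s->bytes + s->pos, (size_t)n);
    s->pos += n;
    return n;
}

// Skip forward n bytes, or step back -n bytes when n is negative.
static void mem_skip(void *user, int n)
{
    mem_stream *s = (mem_stream *)user;
    int target = s->pos + n;
    if (target < 0) target = 0;
    if (target > s->size) target = s->size;
    s->pos = target;
}

// Non-zero once every byte has been consumed.
static int mem_eof(void *user)
{
    const mem_stream *s = (const mem_stream *)user;
    return s->pos >= s->size;
}

static const demo_io_callbacks mem_callbacks = { mem_read, mem_skip, mem_eof };
```

   Reading from memory is only a stand-in here (stbi_load_from_memory already
   covers that case); the same shape applies to archives, sockets, or any
   other source.
*/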
309 //
310 // ===========================================================================
311 //
312 // SIMD support
313 //
314 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
315 // supported by the compiler. For ARM Neon support, you must explicitly
316 // request it.
317 //
318 // (The old do-it-yourself SIMD API is no longer supported in the current
319 // code.)
320 //
321 // On x86, SSE2 will automatically be used when available based on a run-time
322 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
323 // the typical path is to have separate builds for NEON and non-NEON devices
324 // (at least this is true for iOS and Android). Therefore, the NEON support is
325 // toggled by a build flag: define STBI_NEON to get NEON loops.
326 //
// The output of the JPEG decoder is slightly different from that of versions
// before SIMD support was introduced (that is, versions before 1.49). The
329 // difference is only +-1 in the 8-bit RGB channels, and only on a small
330 // fraction of pixels. You can force the pre-1.49 behavior by defining
331 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
332 // and hence cost some performance.
333 //
// If for some reason you do not want to use any of the SIMD code, or if
335 // you have issues compiling it, you can disable it entirely by
336 // defining STBI_NO_SIMD.
337 //
338 // ===========================================================================
339 //
340 // HDR image support (disable by defining STBI_NO_HDR)
341 //
// stb_image supports loading HDR images. The only HDR format currently
// implemented is Radiance .HDR, although the support is provided
// generically. You can still load any file through the existing interface;
345 // if you attempt to load an HDR file, it will be automatically remapped to
346 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
347 // both of these constants can be reconfigured through this interface:
348 //
349 // stbi_hdr_to_ldr_gamma(2.2f);
350 // stbi_hdr_to_ldr_scale(1.0f);
351 //
// (note, do not use _inverse_ constants; stb_image will invert them
353 // appropriately).
354 //
355 // Additionally, there is a new, parallel interface for loading files as
356 // (linear) floats to preserve the full dynamic range:
357 //
358 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
359 //
360 // If you load LDR images through this interface, those images will
361 // be promoted to floating point values, run through the inverse of
362 // constants corresponding to the above:
363 //
364 // stbi_ldr_to_hdr_scale(1.0f);
365 // stbi_ldr_to_hdr_gamma(2.2f);
366 //
367 // Finally, given a filename (or an open file or memory block--see header
368 // file for details) containing image data, you can query for the "most
369 // appropriate" interface to use (that is, whether the image is HDR or
370 // not), using:
371 //
// stbi_is_hdr(char const *filename);
373 //
374 // ===========================================================================
375 //
376 // iPhone PNG support:
377 //
378 // By default we convert iphone-formatted PNGs back to RGB, even though
379 // they are internally encoded differently. You can disable this conversion
// by calling stbi_convert_iphone_png_to_rgb(0), in which case
381 // you will always just get the native iphone "format" through (which
382 // is BGR stored in RGB).
383 //
384 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
385 // pixel to remove any premultiplied alpha *only* if the image file explicitly
386 // says there's premultiplied data (currently only happens in iPhone images,
387 // and only if iPhone convert-to-rgb processing is on).
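/*
   The two per-pixel operations described above can be sketched on a plain
   RGBA buffer. This is an illustration under stated assumptions, not the
   library's own code: swap_red_blue_rgba and unpremultiply_rgba are
   hypothetical helpers, and unlike the library (where overflow is
   undefined) this version clamps.

```c
#include <assert.h>
#include <stddef.h>

// Swap the first and third component of every 4-component pixel in place,
// turning BGRA-ordered data into RGBA (or back again).
static void swap_red_blue_rgba(unsigned char *pixels, size_t pixel_count)
{
    size_t i;
    assert(pixels != NULL || pixel_count == 0);
    for (i = 0; i < pixel_count; ++i) {
        unsigned char *p = pixels + i * 4;
        unsigned char t = p[0];
        p[0] = p[2];
        p[2] = t;
    }
}

// Undo premultiplied alpha with one divide per colour component.
// Pixels with alpha 0 are left alone, since the colour cannot be recovered.
static void unpremultiply_rgba(unsigned char *pixels, size_t pixel_count)
{
    size_t i;
    int c;
    assert(pixels != NULL || pixel_count == 0);
    for (i = 0; i < pixel_count; ++i) {
        unsigned char *p = pixels + i * 4;
        unsigned a = p[3];
        if (a == 0) continue;
        for (c = 0; c < 3; ++c) {
            unsigned v = p[c] * 255u / a;
            p[c] = (unsigned char)(v > 255u ? 255u : v);   // clamp malformed input
        }
    }
}
```

   A colour component can only exceed 255 after the divide when the stored
   value was larger than alpha, which valid premultiplied data never is.
*/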
388 //
389
390 #pragma GCC diagnostic push
391 #pragma GCC diagnostic ignored "-Wmisleading-indentation"
392 #pragma GCC diagnostic ignored "-Wshift-negative-value"
393 #pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
394
395 #ifndef STBI_NO_STDIO
396 #include <stdio.h>
397 #endif // STBI_NO_STDIO
398
399 #define STBI_VERSION 1
400
401 enum
402 {
403 STBI_default = 0, // only used for req_comp
404
405 STBI_grey = 1,
406 STBI_grey_alpha = 2,
407 STBI_rgb = 3,
408 STBI_rgb_alpha = 4
409 };
410
411 typedef unsigned char stbi_uc;
412
413 #ifdef __cplusplus
414 extern "C" {
415 #endif
416
417 #ifdef STB_IMAGE_STATIC
418 #define STBIDEF static
419 #else
420 #define STBIDEF extern
421 #endif
422
423 //////////////////////////////////////////////////////////////////////////////
424 //
425 // PRIMARY API - works on images of any type
426 //
427
428 //
429 // load image by filename, open file, or memory buffer
430 //
431
432 typedef struct
433 {
434 int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
435 void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
436 int (*eof) (void *user); // returns nonzero if we are at end of file/data
437 } stbi_io_callbacks;
438
439 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
440 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
441 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
442
443 #ifndef STBI_NO_STDIO
444 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
445 // for stbi_load_from_file, file pointer is left pointing immediately after image
446 #endif
447
448 #ifndef STBI_NO_LINEAR
449 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
450 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
451 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
452
453 #ifndef STBI_NO_STDIO
454 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
455 #endif
456 #endif
457
458 #ifndef STBI_NO_HDR
459 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
460 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
461 #endif // STBI_NO_HDR
462
463 #ifndef STBI_NO_LINEAR
464 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
465 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
466 #endif // STBI_NO_LINEAR
467
468 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
469 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
470 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
471 #ifndef STBI_NO_STDIO
472 STBIDEF int stbi_is_hdr (char const *filename);
473 STBIDEF int stbi_is_hdr_from_file(FILE *f);
474 #endif // STBI_NO_STDIO
475
476
477 // get a VERY brief reason for failure
478 // NOT THREADSAFE
479 STBIDEF const char *stbi_failure_reason (void);
480
481 // free the loaded image -- this is just free()
482 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
483
484 // get image dimensions & components without fully decoding
485 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
486 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
487
488 #ifndef STBI_NO_STDIO
489 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
490 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
491
492 #endif
493
494
495
496 // for image formats that explicitly notate that they have premultiplied alpha,
497 // we just return the colors as stored in the file. set this flag to force
// unpremultiplication. results are undefined if the unpremultiply overflows.
499 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
500
501 // indicate whether we should process iphone images back to canonical format,
502 // or just pass them through "as-is"
503 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
504
505 // flip the image vertically, so the first pixel in the output array is the bottom left
506 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
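/*
   What a vertical flip does to the returned buffer can be sketched as a
   row swap. This is a standalone illustration, assuming tightly packed
   rows of w*n bytes; flip_rows_in_place is a hypothetical helper and not
   how the library is required to implement the flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Flip an image in place so the bottom scanline comes first.
// w and h are in pixels; n is the number of 8-bit components per pixel.
static void flip_rows_in_place(unsigned char *pixels, int w, int h, int n)
{
    unsigned char tmp[256];            // rows are swapped in chunks of this size
    size_t stride;
    int top, bottom;

    assert(w >= 0 && h >= 0 && n >= 1);
    stride = (size_t)w * (size_t)n;

    for (top = 0, bottom = h - 1; top < bottom; ++top, --bottom) {
        unsigned char *a = pixels + (size_t)top * stride;
        unsigned char *b = pixels + (size_t)bottom * stride;
        size_t left = stride;
        while (left > 0) {
            size_t chunk = left < sizeof(tmp) ? left : sizeof(tmp);
            memcpy(tmp, a, chunk);
            memcpy(a, b, chunk);
            memcpy(b, tmp, chunk);
            a += chunk;
            b += chunk;
            left -= chunk;
        }
    }
}
```

   Swapping through a small fixed buffer avoids allocating a whole spare row.
*/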
507
508 // ZLIB client - used by PNG, available for other purposes
509
510 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
511 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
512 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
513 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
514
515 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
516 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
517
518
519 #ifdef __cplusplus
520 }
521 #endif
522
523 //
524 //
525 //// end header file /////////////////////////////////////////////////////
526 #endif // STBI_INCLUDE_STB_IMAGE_H
527
528 #ifdef STB_IMAGE_IMPLEMENTATION
529
530 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
531 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
532 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
533 || defined(STBI_ONLY_ZLIB)
534 #ifndef STBI_ONLY_JPEG
535 #define STBI_NO_JPEG
536 #endif
537 #ifndef STBI_ONLY_PNG
538 #define STBI_NO_PNG
539 #endif
540 #ifndef STBI_ONLY_BMP
541 #define STBI_NO_BMP
542 #endif
543 #ifndef STBI_ONLY_PSD
544 #define STBI_NO_PSD
545 #endif
546 #ifndef STBI_ONLY_TGA
547 #define STBI_NO_TGA
548 #endif
549 #ifndef STBI_ONLY_GIF
550 #define STBI_NO_GIF
551 #endif
552 #ifndef STBI_ONLY_HDR
553 #define STBI_NO_HDR
554 #endif
555 #ifndef STBI_ONLY_PIC
556 #define STBI_NO_PIC
557 #endif
558 #ifndef STBI_ONLY_PNM
559 #define STBI_NO_PNM
560 #endif
561 #endif
562
563 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
564 #define STBI_NO_ZLIB
565 #endif
566
567
568 #include <stdarg.h>
569 #include <stddef.h> // ptrdiff_t on osx
570 #include <stdlib.h>
571 #include <string.h>
572
573 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
574 #include <math.h> // ldexp
575 #endif
576
577 #ifndef STBI_NO_STDIO
578 #include <stdio.h>
579 #endif
580
581 #ifndef STBI_ASSERT
582 #include <assert.h>
583 #define STBI_ASSERT(x) assert(x)
584 #endif
585
586
587 #ifndef _MSC_VER
588 #ifdef __cplusplus
589 #define stbi_inline inline
590 #else
591 #define stbi_inline
592 #endif
593 #else
594 #define stbi_inline __forceinline
595 #endif
596
597
598 #ifdef _MSC_VER
599 typedef unsigned short stbi__uint16;
600 typedef signed short stbi__int16;
601 typedef unsigned int stbi__uint32;
602 typedef signed int stbi__int32;
603 #else
604 #include <stdint.h>
605 typedef uint16_t stbi__uint16;
606 typedef int16_t stbi__int16;
607 typedef uint32_t stbi__uint32;
608 typedef int32_t stbi__int32;
609 #endif
610
611 // should produce compiler error if size is wrong
612 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
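/*
   The typedef above is a compile-time assertion: an array type with a
   negative size does not compile, so the build stops if the condition is
   false. A generalized sketch of the same trick, with a hypothetical
   DEMO_STATIC_ASSERT macro that is not part of stb_image:

```c
#include <limits.h>

// A true condition declares a harmless 1-element array type;
// a false one asks for size -1 and is rejected by the compiler.
#define DEMO_STATIC_ASSERT(name, cond) typedef unsigned char name[(cond) ? 1 : -1]

DEMO_STATIC_ASSERT(demo_char_is_8_bits, CHAR_BIT == 8);
DEMO_STATIC_ASSERT(demo_int_at_least_16_bits, sizeof(int) * CHAR_BIT >= 16);

// Uncommenting the next line would fail to compile on every platform:
// DEMO_STATIC_ASSERT(demo_never_true, sizeof(char) == 2);
```

   Each assertion needs its own typedef name, which is why the name is a
   macro parameter.
*/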
613
614 #ifdef _MSC_VER
615 #define STBI_NOTUSED(v) (void)(v)
616 #else
617 #define STBI_NOTUSED(v) (void)sizeof(v)
618 #endif
619
620 #ifdef _MSC_VER
621 #define STBI_HAS_LROTL
622 #endif
623
624 #ifdef STBI_HAS_LROTL
625 #define stbi_lrot(x,y) _lrotl(x,y)
626 #else
627 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
628 #endif
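/*
   The fallback macro above is the classic shift-and-or rotate. It is only
   well defined when x is an unsigned 32-bit value and 0 < y < 32, because
   shifting a 32-bit value by 32 is undefined in C. A function-style sketch
   that masks the count to stay inside that range (rotl32 is a hypothetical
   name, and <stdint.h> is assumed to be available):

```c
#include <stdint.h>

// Rotate a 32-bit value left by y bits. The count is masked so that
// y == 0 or y == 32 never turns into a shift by 32.
static uint32_t rotl32(uint32_t x, unsigned y)
{
    y &= 31u;
    return (x << y) | (x >> ((32u - y) & 31u));
}
```

   For counts between 1 and 31 this computes the same value as the macro.
*/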
629
630 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
631 // ok
632 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
633 // ok
634 #else
635 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
636 #endif
637
638 #ifndef STBI_MALLOC
639 #define STBI_MALLOC(sz) malloc(sz)
640 #define STBI_REALLOC(p,newsz) realloc(p,newsz)
641 #define STBI_FREE(p) free(p)
642 #endif
643
644 #ifndef STBI_REALLOC_SIZED
645 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
646 #endif
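/*
   STBI_REALLOC_SIZED exists for allocators that cannot resize a block
   without being told its current size. A minimal sketch, assuming an
   allocator that offers only allocate and release; pool_alloc,
   pool_release and pool_realloc_sized are hypothetical names, backed by
   malloc/free here just so the example is self-contained.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical allocator with no realloc of its own.
static void *pool_alloc(size_t size) { return malloc(size); }
static void  pool_release(void *p)   { free(p); }

// Resize by allocating a new block and copying the part that fits.
// Knowing the old size is what makes this possible without realloc().
static void *pool_realloc_sized(void *p, size_t old_size, size_t new_size)
{
    void *q = pool_alloc(new_size);
    if (q == NULL) return NULL;        // like realloc: the old block stays valid
    if (p != NULL) {
        memcpy(q, p, old_size < new_size ? old_size : new_size);
        pool_release(p);
    }
    return q;
}

// MALLOC, FREE and one REALLOC flavour must all be defined together.
#define STBI_MALLOC(sz)                    pool_alloc(sz)
#define STBI_FREE(p)                       pool_release(p)
#define STBI_REALLOC_SIZED(p,oldsz,newsz)  pool_realloc_sized(p,oldsz,newsz)
```

   Defining STBI_REALLOC_SIZED in place of STBI_REALLOC satisfies the
   all-or-none check above.
*/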
647
648 // x86/x64 detection
649 #if defined(__x86_64__) || defined(_M_X64)
650 #define STBI__X64_TARGET
651 #elif defined(__i386) || defined(_M_IX86)
652 #define STBI__X86_TARGET
653 #endif
654
655 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
// NOTE: it is not clear whether we actually need this for the 64-bit path.
657 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
658 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
659 // this is just broken and gcc are jerks for not fixing it properly
660 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
661 #define STBI_NO_SIMD
662 #endif
663
664 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
665 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
666 //
667 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
668 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
669 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
670 // simultaneously enabling "-mstackrealign".
671 //
672 // See https://github.com/nothings/stb/issues/81 for more information.
673 //
674 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
675 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
676 #define STBI_NO_SIMD
677 #endif
678
#if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
680 #define STBI_SSE2
681 #include <emmintrin.h>
682
683 #ifdef _MSC_VER
684
685 #if _MSC_VER >= 1400 // not VC6
686 #include <intrin.h> // __cpuid
static int stbi__cpuid3(void)
{
   int info[4];
   __cpuid(info,1);
   return info[3];
}
693 #else
static int stbi__cpuid3(void)
{
   int res;
   __asm {
      mov  eax,1
      cpuid
      mov  res,edx
   }
   return res;
}
704 #endif
705
706 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
707
static int stbi__sse2_available()
{
   int info3 = stbi__cpuid3();
   return ((info3 >> 26) & 1) != 0;
}
713 #else // assume GCC-style if not VC++
714 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
715
static int stbi__sse2_available()
{
#if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
   // GCC 4.8+ has a nice way to do this
   return __builtin_cpu_supports("sse2");
#else
   // portable way to do this, preferably without using GCC inline ASM?
   // just bail for now.
   return 0;
#endif
}
727 #endif
728 #endif
729
730 // ARM NEON
731 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
732 #undef STBI_NEON
733 #endif
734
735 #ifdef STBI_NEON
736 #include <arm_neon.h>
737 // assume GCC or Clang on ARM targets
738 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
739 #endif
740
741 #ifndef STBI_SIMD_ALIGN
742 #define STBI_SIMD_ALIGN(type, name) type name
743 #endif
744
745 ///////////////////////////////////////////////
746 //
747 // stbi__context struct and start_xxx functions
748
749 // stbi__context structure is our basic context used by all images, so it
750 // contains all the IO context, plus some basic image information
751 typedef struct
752 {
753 stbi__uint32 img_x, img_y;
754 int img_n, img_out_n;
755
756 stbi_io_callbacks io;
757 void *io_user_data;
758
759 int read_from_callbacks;
760 int buflen;
761 stbi_uc buffer_start[128];
762
763 stbi_uc *img_buffer, *img_buffer_end;
764 stbi_uc *img_buffer_original, *img_buffer_original_end;
765 } stbi__context;
766
767
768 static void stbi__refill_buffer(stbi__context *s);
769
770 // initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
{
   s->io.read = NULL;
   s->read_from_callbacks = 0;
   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
}
778
779 // initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
{
   s->io = *c;
   s->io_user_data = user;
   s->buflen = sizeof(s->buffer_start);
   s->read_from_callbacks = 1;
   s->img_buffer_original = s->buffer_start;
   stbi__refill_buffer(s);
   s->img_buffer_original_end = s->img_buffer_end;
}
790
791 #ifndef STBI_NO_STDIO
792
static int stbi__stdio_read(void *user, char *data, int size)
{
   return (int) fread(data,1,size,(FILE*) user);
}

static void stbi__stdio_skip(void *user, int n)
{
   fseek((FILE*) user, n, SEEK_CUR);
}

static int stbi__stdio_eof(void *user)
{
   return feof((FILE*) user);
}
807
808 static stbi_io_callbacks stbi__stdio_callbacks =
809 {
810 stbi__stdio_read,
811 stbi__stdio_skip,
812 stbi__stdio_eof,
813 };
814
static void stbi__start_file(stbi__context *s, FILE *f)
{
   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
}
819
820 //static void stop_file(stbi__context *s) { }
821
822 #endif // !STBI_NO_STDIO
823
static void stbi__rewind(stbi__context *s)
{
   // conceptually rewind SHOULD rewind to the beginning of the stream,
   // but we just rewind to the beginning of the initial buffer, because
   // we only use it after doing 'test', which only ever looks at at most 92 bytes
   s->img_buffer = s->img_buffer_original;
   s->img_buffer_end = s->img_buffer_original_end;
}
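/*
   The test-then-rewind pattern that stbi__rewind supports can be sketched
   over a plain memory block: each format test consumes a few bytes, and
   the cursor is reset before the next test runs. The cursor type and the
   cursor_* and sniff_format names are hypothetical, and only two
   signatures (PNG and GIF) are checked in this illustration.

```c
#include <stddef.h>

// Minimal read cursor over a memory block, with a remembered start position.
typedef struct
{
    const unsigned char *start, *cur, *end;
} cursor;

static int cursor_get8(cursor *c)
{
    return c->cur < c->end ? *c->cur++ : 0;   // reads past the end yield 0
}

static void cursor_rewind(cursor *c)
{
    c->cur = c->start;
}

// Consume bytes while they match 'sig'; the caller rewinds afterwards.
static int cursor_matches(cursor *c, const unsigned char *sig, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        if (cursor_get8(c) != sig[i]) return 0;
    return 1;
}

// Try each format test in turn, rewinding after every attempt so the next
// test (or the real decoder) starts from the first byte again.
static const char *sniff_format(cursor *c)
{
    static const unsigned char png_sig[8] = { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };
    static const unsigned char gif_sig[4] = { 'G', 'I', 'F', '8' };
    int hit;

    hit = cursor_matches(c, png_sig, sizeof png_sig);
    cursor_rewind(c);
    if (hit) return "png";

    hit = cursor_matches(c, gif_sig, sizeof gif_sig);
    cursor_rewind(c);
    if (hit) return "gif";

    return NULL;
}
```

   In callback mode the real rewind only restores the first buffer fill,
   which is why the format tests must stay within that initial buffer.
*/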

#ifndef STBI_NO_JPEG
static int      stbi__jpeg_test(stbi__context *s);
static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNG
static int      stbi__png_test(stbi__context *s);
static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_BMP
static int      stbi__bmp_test(stbi__context *s);
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_TGA
static int      stbi__tga_test(stbi__context *s);
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PSD
static int      stbi__psd_test(stbi__context *s);
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_HDR
static int      stbi__hdr_test(stbi__context *s);
static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PIC
static int      stbi__pic_test(stbi__context *s);
static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_GIF
static int      stbi__gif_test(stbi__context *s);
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNM
static int      stbi__pnm_test(stbi__context *s);
static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
#endif

// this is not threadsafe
static const char *stbi__g_failure_reason;

STBIDEF const char *stbi_failure_reason(void)
{
   return stbi__g_failure_reason;
}

static int stbi__err(const char *str)
{
   stbi__g_failure_reason = str;
   return 0;
}

static void *stbi__malloc(size_t size)
{
   return STBI_MALLOC(size);
}

// stbi__err - error
// stbi__errpf - error returning pointer to float
// stbi__errpuc - error returning pointer to unsigned char

#ifdef STBI_NO_FAILURE_STRINGS
   #define stbi__err(x,y)  0
#elif defined(STBI_FAILURE_USERMSG)
   #define stbi__err(x,y)  stbi__err(y)
#else
   #define stbi__err(x,y)  stbi__err(x)
#endif

#define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
#define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))

STBIDEF void stbi_image_free(void *retval_from_stbi_load)
{
   STBI_FREE(retval_from_stbi_load);
}

#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
#endif

#ifndef STBI_NO_HDR
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
#endif

static int stbi__vertically_flip_on_load = 0;

STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
{
   stbi__vertically_flip_on_load = flag_true_if_should_flip;
}

static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNG
   if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_BMP
   if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_GIF
   if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PSD
   if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PIC
   if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNM
   if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp);
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
      // the load can fail; don't hand a NULL buffer to the converter
      if (hdr == NULL) return NULL;
      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   }
   #endif

   #ifndef STBI_NO_TGA
   // test tga last because it's a crappy test!
   if (stbi__tga_test(s))
      return stbi__tga_load(s,x,y,comp,req_comp);
   #endif

   return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
}

static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);

   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      stbi_uc temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }

   return result;
}
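/* Sketch for the @OPTIMIZE note in stbi__load_flip (an illustration, not part
   of the library): rather than swapping one byte at a time, whole rows could
   be swapped through a small temporary buffer with memcpy. The function name
   flip_rows_sketch and the 256-byte chunk size are assumptions made for this
   example only.

```c
#include <string.h>

// Swap row r with row h-1-r for the top half of the image, one chunk at a time.
static void flip_rows_sketch(unsigned char *pixels, int w, int h, int depth)
{
   unsigned char temp[256];                      // chunk size is arbitrary
   size_t bytes_per_row = (size_t) w * depth;
   int row;
   for (row = 0; row < (h>>1); row++) {
      unsigned char *top = pixels + row * bytes_per_row;
      unsigned char *bot = pixels + (h - row - 1) * bytes_per_row;
      size_t left = bytes_per_row;
      while (left) {
         size_t n = left < sizeof(temp) ? left : sizeof(temp);
         memcpy(temp, top, n);
         memcpy(top, bot, n);
         memcpy(bot, temp, n);
         top += n; bot += n; left -= n;
      }
   }
}
```

   For a 2x3 single-channel image {1,2, 3,4, 5,6} this produces
   {5,6, 3,4, 1,2}; the middle row of an odd-height image stays in place.
*/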

#ifndef STBI_NO_HDR
static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
{
   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      float temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }
}
#endif

#ifndef STBI_NO_STDIO

static FILE *stbi__fopen(char const *filename, char const *mode)
{
   FILE *f;
#if defined(_MSC_VER) && _MSC_VER >= 1400
   if (0 != fopen_s(&f, filename, mode))
      f=0;
#else
   f = fopen(filename, mode);
#endif
   return f;
}


STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   FILE *f = stbi__fopen(filename, "rb");
   unsigned char *result;
   if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   result = stbi_load_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result;
   stbi__context s;
   stbi__start_file(&s,f);
   result = stbi__load_flip(&s,x,y,comp,req_comp);
   if (result) {
      // need to 'unget' all the characters in the IO buffer
      fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   }
   return result;
}
#endif //!STBI_NO_STDIO

STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}

STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}

#ifndef STBI_NO_LINEAR
static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *data;
   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
      if (hdr_data)
         stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
      return hdr_data;
   }
   #endif
   data = stbi__load_flip(s, x, y, comp, req_comp);
   if (data)
      return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
}

STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   float *result;
   FILE *f = stbi__fopen(filename, "rb");
   if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}
#endif // !STBI_NO_STDIO

#endif // !STBI_NO_LINEAR

// the is-hdr-or-not functions are defined regardless of whether STBI_NO_HDR
// is defined, for API simplicity; if STBI_NO_HDR is defined, they always
// report false!

STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(buffer);
   STBI_NOTUSED(len);
   return 0;
   #endif
}

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename)
{
   FILE *f = stbi__fopen(filename, "rb");
   int result=0;
   if (f) {
      result = stbi_is_hdr_from_file(f);
      fclose(f);
   }
   return result;
}

STBIDEF int      stbi_is_hdr_from_file(FILE *f)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(f);
   return 0;
   #endif
}
#endif // !STBI_NO_STDIO

STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(clbk);
   STBI_NOTUSED(user);
   return 0;
   #endif
}

#ifndef STBI_NO_LINEAR
static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;

STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
#endif

static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;

STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }


//////////////////////////////////////////////////////////////////////////////
//
// Common code used by all image loaders
//

enum
{
   STBI__SCAN_load=0,
   STBI__SCAN_type,
   STBI__SCAN_header
};

static void stbi__refill_buffer(stbi__context *s)
{
   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   if (n == 0) {
      // at end of file, treat same as if from memory, but need to handle case
      // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
      s->read_from_callbacks = 0;
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start+1;
      *s->img_buffer = 0;
   } else {
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start + n;
   }
}

stbi_inline static stbi_uc stbi__get8(stbi__context *s)
{
   if (s->img_buffer < s->img_buffer_end)
      return *s->img_buffer++;
   if (s->read_from_callbacks) {
      stbi__refill_buffer(s);
      return *s->img_buffer++;
   }
   return 0;
}

stbi_inline static int stbi__at_eof(stbi__context *s)
{
   if (s->io.read) {
      if (!(s->io.eof)(s->io_user_data)) return 0;
      // if feof() is true, check if buffer = end
      // special case: we've only got the special 0 character at the end
      if (s->read_from_callbacks == 0) return 1;
   }

   return s->img_buffer >= s->img_buffer_end;
}

static void stbi__skip(stbi__context *s, int n)
{
   if (n < 0) {
      s->img_buffer = s->img_buffer_end;
      return;
   }
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         s->img_buffer = s->img_buffer_end;
         (s->io.skip)(s->io_user_data, n - blen);
         return;
      }
   }
   s->img_buffer += n;
}

static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
{
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         int res, count;

         memcpy(buffer, s->img_buffer, blen);

         count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
         res = (count == (n-blen));
         s->img_buffer = s->img_buffer_end;
         return res;
      }
   }

   if (s->img_buffer+n <= s->img_buffer_end) {
      memcpy(buffer, s->img_buffer, n);
      s->img_buffer += n;
      return 1;
   } else
      return 0;
}

static int stbi__get16be(stbi__context *s)
{
   int z = stbi__get8(s);
   return (z << 8) + stbi__get8(s);
}

static stbi__uint32 stbi__get32be(stbi__context *s)
{
   stbi__uint32 z = stbi__get16be(s);
   return (z << 16) + stbi__get16be(s);
}

#if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
// nothing
#else
static int stbi__get16le(stbi__context *s)
{
   int z = stbi__get8(s);
   return z + (stbi__get8(s) << 8);
}
#endif

#ifndef STBI_NO_BMP
static stbi__uint32 stbi__get32le(stbi__context *s)
{
   stbi__uint32 z = stbi__get16le(s);
   return z + (stbi__get16le(s) << 16);
}
#endif
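/* Illustration of the byte-order helpers above (a standalone sketch, not
   library code): the same composition applied to a plain byte array instead
   of a stbi__context. The read_* names are hypothetical, and a 32-bit
   unsigned int is assumed.

```c
// Big-endian: the first byte is the most significant.
static unsigned int read_be16(const unsigned char *p)
{
   return ((unsigned int) p[0] << 8) | p[1];
}

static unsigned int read_be32(const unsigned char *p)
{
   return (read_be16(p) << 16) | read_be16(p + 2);
}

// Little-endian: the first byte is the least significant.
static unsigned int read_le16(const unsigned char *p)
{
   return p[0] | ((unsigned int) p[1] << 8);
}

static unsigned int read_le32(const unsigned char *p)
{
   return read_le16(p) | (read_le16(p + 2) << 16);
}
```

   For the bytes 0x12 0x34 0x56 0x78 the big-endian readers give 0x1234 and
   0x12345678, while the little-endian readers give 0x3412 and 0x78563412.
*/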

#define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings


//////////////////////////////////////////////////////////////////////////////
//
//  generic converter from built-in img_n to req_comp
//    individual types do this automatically as much as possible (e.g. jpeg
//    does all cases internally since it needs to colorspace convert anyway,
//    and it never has alpha, so very few cases ). png can automatically
//    interleave an alpha=255 channel, but falls back to this for other cases
//
//  assume data buffer is malloced, so malloc a new one and free that one
//  only failure mode is malloc failing

static stbi_uc stbi__compute_y(int r, int g, int b)
{
   return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
}
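/* Note on the weights in stbi__compute_y (a sketch, not library code): 77,
   150 and 29 sum to 256, so the final >> 8 divides by the total weight. They
   look like the familiar 0.299/0.587/0.114 luma coefficients scaled by 256
   and rounded; that reading is an assumption, the source does not say so.
   The name luma_sketch is hypothetical.

```c
// Same integer arithmetic as stbi__compute_y, under a stand-in name.
static unsigned char luma_sketch(int r, int g, int b)
{
   return (unsigned char) ((r*77 + g*150 + b*29) >> 8);
}
```

   Because the weights sum to exactly 256, pure white (255,255,255) maps to
   255 and black maps to 0; the shift truncates, so pure red (255,0,0) gives
   76 rather than a rounded 77.
*/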

static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
{
   int i,j;
   unsigned char *good;

   if (req_comp == img_n) return data;
   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);

   good = (unsigned char *) stbi__malloc(req_comp * x * y);
   if (good == NULL) {
      STBI_FREE(data);
      return stbi__errpuc("outofmem", "Out of memory");
   }

   for (j=0; j < (int) y; ++j) {
      unsigned char *src  = data + j * x * img_n   ;
      unsigned char *dest = good + j * x * req_comp;

      #define COMBO(a,b)  ((a)*8+(b))
      #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
      // convert source image with img_n components to one with req_comp components;
      // avoid switch per pixel, so use switch per scanline and massive macros
      switch (COMBO(img_n, req_comp)) {
         CASE(1,2) dest[0]=src[0], dest[1]=255; break;
         CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
         CASE(2,1) dest[0]=src[0]; break;
         CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
         CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
         CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
         CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
         CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
         default: STBI_ASSERT(0);
      }
      #undef CASE
   }

   STBI_FREE(data);
   return good;
}

#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
{
   int i,k,n;
   float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
   if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
      }
      if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
   }
   STBI_FREE(data);
   return output;
}
#endif

#ifndef STBI_NO_HDR
#define stbi__float2int(x)   ((int) (x))
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
{
   int i,k,n;
   stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
   if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
      if (k < comp) {
         float z = data[i*comp+k] * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
   }
   STBI_FREE(data);
   return output;
}
#endif

//////////////////////////////////////////////////////////////////////////////
//
//  "baseline" JPEG/JFIF decoder
//
//    simple implementation
//      - doesn't support delayed output of y-dimension
//      - simple interface (only one output format: 8-bit interleaved RGB)
//      - doesn't try to recover corrupt jpegs
//      - doesn't allow partial loading, loading multiple at once
//      - still fast on x86 (copying globals into locals doesn't help x86)
//      - allocates lots of intermediate memory (full size of all components)
//        - non-interleaved case requires this anyway
//        - allows good upsampling (see next)
//    high-quality
//      - upsampled channels are bilinearly interpolated, even across blocks
//      - quality integer IDCT derived from IJG's 'slow'
//    performance
//      - fast huffman; reasonable integer IDCT
//      - some SIMD kernels for common paths on targets with SSE2/NEON
//      - uses a lot of intermediate memory, could cache poorly

#ifndef STBI_NO_JPEG

// huffman decoding acceleration
#define FAST_BITS   9  // larger handles more cases; smaller stomps less cache

typedef struct
{
   stbi_uc  fast[1 << FAST_BITS];
   // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   stbi__uint16 code[256];
   stbi_uc  values[256];
   stbi_uc  size[257];
   unsigned int maxcode[18];
   int    delta[17];   // old 'firstsymbol' - old 'firstcode'
} stbi__huffman;

typedef struct
{
   stbi__context *s;
   stbi__huffman huff_dc[4];
   stbi__huffman huff_ac[4];
   stbi_uc dequant[4][64];
   stbi__int16 fast_ac[4][1 << FAST_BITS];

// sizes for components, interleaved MCUs
   int img_h_max, img_v_max;
   int img_mcu_x, img_mcu_y;
   int img_mcu_w, img_mcu_h;

// definition of jpeg image component
   struct
   {
      int id;
      int h,v;
      int tq;
      int hd,ha;
      int dc_pred;

      int x,y,w2,h2;
      stbi_uc *data;
      void *raw_data, *raw_coeff;
      stbi_uc *linebuf;
      short   *coeff;   // progressive only
      int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
   } img_comp[4];

   stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   int            code_bits;   // number of valid bits
   unsigned char  marker;      // marker seen while filling entropy buffer
   int            nomore;      // flag if we saw a marker so must stop

   int            progressive;
   int            spec_start;
   int            spec_end;
   int            succ_high;
   int            succ_low;
   int            eob_run;

   int scan_n, order[4];
   int restart_interval, todo;

// kernels
   void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
} stbi__jpeg;

static int stbi__build_huffman(stbi__huffman *h, int *count)
{
   int i,j,k=0,code;
   // build size list for each symbol (from JPEG spec)
   for (i=0; i < 16; ++i)
      for (j=0; j < count[i]; ++j)
         h->size[k++] = (stbi_uc) (i+1);
   h->size[k] = 0;

   // compute actual symbols (from jpeg spec)
   code = 0;
   k = 0;
   for(j=1; j <= 16; ++j) {
      // compute delta to add to code to compute symbol id
      h->delta[j] = k - code;
      if (h->size[k] == j) {
         while (h->size[k] == j)
            h->code[k++] = (stbi__uint16) (code++);
         if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
      }
      // compute largest code + 1 for this size, preshifted as needed later
      h->maxcode[j] = code << (16-j);
      code <<= 1;
   }
   h->maxcode[j] = 0xffffffff;

   // build non-spec acceleration table; 255 is flag for not-accelerated
   memset(h->fast, 255, 1 << FAST_BITS);
   for (i=0; i < k; ++i) {
      int s = h->size[i];
      if (s <= FAST_BITS) {
         int c = h->code[i] << (FAST_BITS-s);
         int m = 1 << (FAST_BITS-s);
         for (j=0; j < m; ++j) {
            h->fast[c+j] = (stbi_uc) i;
         }
      }
   }
   return 1;
}
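/* Sketch of the code-assignment step in stbi__build_huffman (an illustration,
   not library code): codes are handed out in increasing order within each
   length, and the running code is shifted left by one bit when moving to the
   next length. The name assign_codes_sketch and its array parameters are
   hypothetical; the maxcode, delta and fast tables are left out.

```c
// count[i] is the number of codes of length i+1 (i = 0..15).
// Fills code[] and size[] per symbol, shortest codes first.
// Returns the number of symbols, or -1 if a length has too many codes.
static int assign_codes_sketch(const int count[16], unsigned short code[256], unsigned char size[256])
{
   int len, j, k = 0;
   unsigned int next = 0;                  // next unused code of the current length
   for (len = 1; len <= 16; ++len) {
      for (j = 0; j < count[len-1]; ++j) {
         if (k >= 256) return -1;          // more symbols than the tables hold
         code[k] = (unsigned short) next++;
         size[k] = (unsigned char) len;
         ++k;
      }
      if (next > (1u << len)) return -1;   // more codes than len bits can express
      next <<= 1;                          // codes of the next length are one bit longer
   }
   return k;
}
```

   With two 2-bit codes and two 3-bit codes the assigned codes are 00, 01,
   100 and 101; three 1-bit codes are rejected.
*/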

// build a table that decodes both magnitude and value of small ACs in
// one go.
static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
{
   int i;
   for (i=0; i < (1 << FAST_BITS); ++i) {
      stbi_uc fast = h->fast[i];
      fast_ac[i] = 0;
      if (fast < 255) {
         int rs = h->values[fast];
         int run = (rs >> 4) & 15;
         int magbits = rs & 15;
         int len = h->size[fast];

         if (magbits && len + magbits <= FAST_BITS) {
            // magnitude code followed by receive_extend code
            int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
            int m = 1 << (magbits - 1);
            // same value as (-1 << magbits) + 1, without left-shifting a negative int
            if (k < m) k -= (1 << magbits) - 1;
            // if the result is small enough, we can fit it in fast_ac table
            if (k >= -128 && k <= 127)
               fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
         }
      }
   }
}

static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
{
   do {
      int b = j->nomore ? 0 : stbi__get8(j->s);
      if (b == 0xff) {
         int c = stbi__get8(j->s);
         if (c != 0) {
            j->marker = (unsigned char) c;
            j->nomore = 1;
            return;
         }
      }
      j->code_buffer |= b << (24 - j->code_bits);
      j->code_bits += 8;
   } while (j->code_bits <= 24);
}

// (1 << n) - 1
static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};

// decode a jpeg huffman value from the bitstream
stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
{
   unsigned int temp;
   int c,k;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   // look at the top FAST_BITS and determine what symbol ID it is,
   // if the code is <= FAST_BITS
   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   k = h->fast[c];
   if (k < 255) {
      int s = h->size[k];
      if (s > j->code_bits)
         return -1;
      j->code_buffer <<= s;
      j->code_bits -= s;
      return h->values[k];
   }

   // naive test is to shift the code_buffer down so k bits are
   // valid, then test against maxcode. To speed this up, we've
   // preshifted maxcode left so that it has (16-k) 0s at the
   // end; in other words, regardless of the number of bits, it
   // wants to be compared against something shifted to have 16;
   // that way we don't need to shift inside the loop.
   temp = j->code_buffer >> 16;
   for (k=FAST_BITS+1 ; ; ++k)
      if (temp < h->maxcode[k])
         break;
   if (k == 17) {
      // error! code not found
      j->code_bits -= 16;
      return -1;
   }

   if (k > j->code_bits)
      return -1;

   // convert the huffman code to the symbol id
   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);

   // convert the id to a symbol
   j->code_bits -= k;
   j->code_buffer <<= k;
   return h->values[c];
}

// bias[n] = (-1<<n) + 1
static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};

// combined JPEG 'receive' and JPEG 'extend', since baseline
// always extends everything it receives.
stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
{
   unsigned int k;
   int sgn;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);

   sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   k = stbi_lrot(j->code_buffer, n);
   STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k + (stbi__jbias[n] & ~sgn);
}
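/* Sketch of the 'extend' half of stbi__extend_receive (an illustration, not
   library code): given the n raw bits already pulled from the stream, a
   leading 1 bit means the value is positive and used as-is, while a leading
   0 bit means it is negative and the bias -(2^n - 1) is added, which is what
   the stbi__jbias table holds. The name jpeg_extend_sketch is hypothetical
   and the branch replaces the branch-free mask used above.

```c
// bits holds n raw bits (bits < 2^n, 0 <= n <= 15).
static int jpeg_extend_sketch(unsigned int bits, int n)
{
   if (n == 0) return 0;
   if (bits >> (n-1)) return (int) bits;   // top bit set: positive, keep as-is
   return (int) bits - ((1 << n) - 1);     // top bit clear: negative, apply bias
}
```

   With n = 3 the raw values 4..7 decode to 4..7 and the raw values 0..3
   decode to -7..-4, so the magnitudes covered are exactly 4 through 7.
*/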

// get some unsigned bits
stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
{
   unsigned int k;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   k = stbi_lrot(j->code_buffer, n);
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k;
}

stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
{
   unsigned int k;
   if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   k = j->code_buffer;
   j->code_buffer <<= 1;
   --j->code_bits;
   return k & 0x80000000;
}

// given a value that's at position X in the zigzag stream,
// where does it appear in the 8x8 matrix coded as row-major?
static stbi_uc stbi__jpeg_dezigzag[64+15] =
{
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63,
   // let corrupt input sample past end
   63, 63, 63, 63, 63, 63, 63, 63,
   63, 63, 63, 63, 63, 63, 63
};
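/* Sketch showing where the first 64 entries of stbi__jpeg_dezigzag come from
   (an illustration, not library code): the 8x8 block is walked one
   anti-diagonal at a time, alternating direction, and each visited cell is
   recorded as a row-major index. The name build_zigzag_sketch is
   hypothetical; the 15 padding entries of 63 are not generated here.

```c
// Fill order[0..63] with row-major indices in zigzag visiting order.
static void build_zigzag_sketch(unsigned char order[64])
{
   int s, i, n = 0;
   for (s = 0; s <= 14; ++s) {        // s = row + col identifies one anti-diagonal
      int lo = s > 7 ? s - 7 : 0;     // smallest valid row on this diagonal
      int hi = s < 7 ? s : 7;         // largest valid row on this diagonal
      if (s & 1) {
         // odd diagonals run down-left: row increases, column decreases
         for (i = lo; i <= hi; ++i) order[n++] = (unsigned char) (i*8 + (s - i));
      } else {
         // even diagonals run up-right: row decreases, column increases
         for (i = hi; i >= lo; --i) order[n++] = (unsigned char) (i*8 + (s - i));
      }
   }
}
```

   The generated sequence starts 0, 1, 8, 16, 9, 2 and ends at 63, matching
   the table above.
*/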
1719
1720 // decode one 64-entry block--
stbi__jpeg_decode_block(stbi__jpeg * j,short data[64],stbi__huffman * hdc,stbi__huffman * hac,stbi__int16 * fac,int b,stbi_uc * dequant)1721 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1722 {
1723 int diff,dc,k;
1724 int t;
1725
1726 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1727 t = stbi__jpeg_huff_decode(j, hdc);
1728 if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1729
1730 // 0 all the ac values now so we can do it 32-bits at a time
1731 memset(data,0,64*sizeof(data[0]));
1732
1733 diff = t ? stbi__extend_receive(j, t) : 0;
1734 dc = j->img_comp[b].dc_pred + diff;
1735 j->img_comp[b].dc_pred = dc;
1736 data[0] = (short) (dc * dequant[0]);
1737
1738 // decode AC components, see JPEG spec
1739 k = 1;
1740 do {
1741 unsigned int zig;
1742 int c,r,s;
1743 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1744 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1745 r = fac[c];
1746 if (r) { // fast-AC path
1747 k += (r >> 4) & 15; // run
1748 s = r & 15; // combined length
1749 j->code_buffer <<= s;
1750 j->code_bits -= s;
1751 // decode into unzigzag'd location
1752 zig = stbi__jpeg_dezigzag[k++];
1753 data[zig] = (short) ((r >> 8) * dequant[zig]);
1754 } else {
1755 int rs = stbi__jpeg_huff_decode(j, hac);
1756 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1757 s = rs & 15;
1758 r = rs >> 4;
1759 if (s == 0) {
1760 if (rs != 0xf0) break; // end block
1761 k += 16;
1762 } else {
1763 k += r;
1764 // decode into unzigzag'd location
1765 zig = stbi__jpeg_dezigzag[k++];
1766 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1767 }
1768 }
1769 } while (k < 64);
1770 return 1;
1771 }
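/*
   EXAMPLE (illustrative sketch, not part of stb_image).
   The fast-AC path above reads three fields out of one 16-bit table entry:
   the coefficient value (r >> 8), the zero run ((r >> 4) & 15) and the
   total number of bits consumed (r & 15). The sketch below isolates that
   unpacking. The pack helper is a hypothetical inverse inferred from the
   decode side only; how the real table is filled is not shown in this
   section.

```c
// Split a packed fast-AC entry into its three fields, using the same
// shifts and masks as the fast path. Returns 0 for an empty entry (the
// decoder then falls back to the slow Huffman path), 1 otherwise.
// Assumes >> on a negative int is an arithmetic shift, as the decoder does.
static int example_unpack_fast_ac(short entry, int *value, int *run, int *len)
{
   int r = entry;
   if (r == 0) return 0;
   *value = r >> 8;          // signed coefficient value
   *run   = (r >> 4) & 15;   // zero coefficients to skip first
   *len   = r & 15;          // bits to drop from the code buffer
   return 1;
}

// Hypothetical inverse, for illustration only: pack the three fields.
static short example_pack_fast_ac(int value, int run, int len)
{
   return (short) (value * 256 + run * 16 + len);
}
```

   Because an all-zero entry doubles as "no fast decode available", a
   packed entry always needs a non-zero length field.
*/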
1772
static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
1774 {
1775 int diff,dc;
1776 int t;
1777 if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1778
1779 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1780
1781 if (j->succ_high == 0) {
1782 // first scan for DC coefficient, must be first
1783 memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
1784 t = stbi__jpeg_huff_decode(j, hdc);
1785 diff = t ? stbi__extend_receive(j, t) : 0;
1786
1787 dc = j->img_comp[b].dc_pred + diff;
1788 j->img_comp[b].dc_pred = dc;
1789 data[0] = (short) (dc << j->succ_low);
1790 } else {
1791 // refinement scan for DC coefficient
1792 if (stbi__jpeg_get_bit(j))
1793 data[0] += (short) (1 << j->succ_low);
1794 }
1795 return 1;
1796 }
1797
1798 // @OPTIMIZE: store non-zigzagged during the decode passes,
1799 // and only de-zigzag when dequantizing
static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1801 {
1802 int k;
1803 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1804
1805 if (j->succ_high == 0) {
1806 int shift = j->succ_low;
1807
1808 if (j->eob_run) {
1809 --j->eob_run;
1810 return 1;
1811 }
1812
1813 k = j->spec_start;
1814 do {
1815 unsigned int zig;
1816 int c,r,s;
1817 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1818 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1819 r = fac[c];
1820 if (r) { // fast-AC path
1821 k += (r >> 4) & 15; // run
1822 s = r & 15; // combined length
1823 j->code_buffer <<= s;
1824 j->code_bits -= s;
1825 zig = stbi__jpeg_dezigzag[k++];
1826 data[zig] = (short) ((r >> 8) << shift);
1827 } else {
1828 int rs = stbi__jpeg_huff_decode(j, hac);
1829 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1830 s = rs & 15;
1831 r = rs >> 4;
1832 if (s == 0) {
1833 if (r < 15) {
1834 j->eob_run = (1 << r);
1835 if (r)
1836 j->eob_run += stbi__jpeg_get_bits(j, r);
1837 --j->eob_run;
1838 break;
1839 }
1840 k += 16;
1841 } else {
1842 k += r;
1843 zig = stbi__jpeg_dezigzag[k++];
1844 data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1845 }
1846 }
1847 } while (k <= j->spec_end);
1848 } else {
1849 // refinement scan for these AC coefficients
1850
1851 short bit = (short) (1 << j->succ_low);
1852
1853 if (j->eob_run) {
1854 --j->eob_run;
1855 for (k = j->spec_start; k <= j->spec_end; ++k) {
1856 short *p = &data[stbi__jpeg_dezigzag[k]];
1857 if (*p != 0)
1858 if (stbi__jpeg_get_bit(j))
1859 if ((*p & bit)==0) {
1860 if (*p > 0)
1861 *p += bit;
1862 else
1863 *p -= bit;
1864 }
1865 }
1866 } else {
1867 k = j->spec_start;
1868 do {
1869 int r,s;
1870 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1871 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1872 s = rs & 15;
1873 r = rs >> 4;
1874 if (s == 0) {
1875 if (r < 15) {
1876 j->eob_run = (1 << r) - 1;
1877 if (r)
1878 j->eob_run += stbi__jpeg_get_bits(j, r);
1879 r = 64; // force end of block
1880 } else {
1881 // r=15 s=0 should write 16 0s, so we just do
1882 // a run of 15 0s and then write s (which is 0),
1883 // so we don't have to do anything special here
1884 }
1885 } else {
1886 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1887 // sign bit
1888 if (stbi__jpeg_get_bit(j))
1889 s = bit;
1890 else
1891 s = -bit;
1892 }
1893
1894 // advance by r
1895 while (k <= j->spec_end) {
1896 short *p = &data[stbi__jpeg_dezigzag[k++]];
1897 if (*p != 0) {
1898 if (stbi__jpeg_get_bit(j))
1899 if ((*p & bit)==0) {
1900 if (*p > 0)
1901 *p += bit;
1902 else
1903 *p -= bit;
1904 }
1905 } else {
1906 if (r == 0) {
1907 *p = (short) s;
1908 break;
1909 }
1910 --r;
1911 }
1912 }
1913 } while (k <= j->spec_end);
1914 }
1915 }
1916 return 1;
1917 }
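/*
   EXAMPLE (illustrative sketch, not part of stb_image).
   In the first AC scan above, a symbol with s == 0 and r < 15 ends a run
   of blocks: the run covers (1 << r) plus an r-bit extra value, and the
   current block counts as the first of them. The sketch below reproduces
   only that arithmetic; the function name and parameters are hypothetical.

```c
#include <assert.h>

// r is the high nibble of the end-of-band symbol (0..14); extra is the
// r-bit value read right after it. Returns how many blocks AFTER the
// current one are also ended by this symbol.
static int example_eob_run_remaining(int r, int extra)
{
   int run;
   assert(r >= 0 && r < 15);
   assert(extra >= 0 && extra < (1 << r));
   run = (1 << r) + extra;   // total blocks covered, including this one
   return run - 1;           // the current block is consumed right away
}
```

   With r == 0 the result is 0, which is the plain "end of this block"
   case with no following blocks skipped.
*/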
1918
// clamp an int to 0..255; the IDCT adds the +128 level shift before
// calling, so a nominal -128..127 sample arrives here as 0..255
stbi_inline static stbi_uc stbi__clamp(int x)
{
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (stbi_uc) x;
}
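/*
   EXAMPLE (illustrative sketch, not part of stb_image).
   The single-test trick works because a negative int becomes a very large
   value when cast to unsigned, so one "> 255" comparison catches both
   out-of-range sides. The sketch below compares it with a plain
   two-comparison clamp; the example_* names are hypothetical.

```c
// Range check with one comparison on the common (in-range) path.
static unsigned char example_clamp_fast(int x)
{
   if ((unsigned int) x > 255)
      return (unsigned char) (x < 0 ? 0 : 255);
   return (unsigned char) x;
}

// Straightforward two-comparison reference.
static unsigned char example_clamp_naive(int x)
{
   if (x < 0)   return 0;
   if (x > 255) return 255;
   return (unsigned char) x;
}

// Returns 1 if both versions agree on every value in [lo, hi].
static int example_clamps_agree(int lo, int hi)
{
   int x;
   for (x = lo; x <= hi; ++x)
      if (example_clamp_fast(x) != example_clamp_naive(x)) return 0;
   return 1;
}
```
*/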
1929
1930 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1931 #define stbi__fsh(x) ((x) << 12)
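/*
   EXAMPLE (illustrative sketch, not part of stb_image).
   The two macros above set up 12-bit fixed point: stbi__f2f turns a
   floating-point constant into an integer scaled by 4096, and stbi__fsh
   scales an integer sample by the same factor so the two can be added.
   The sketch below shows the conversion plus one multiply-and-descale
   step. The helper names are hypothetical, and the descale with rounding
   is a simplification of what the IDCT does at the end of each pass.

```c
// Same conversion as stbi__f2f: scale by 4096 (1 << 12), add 0.5, truncate.
// Truncation is toward zero, so -1.847759065 becomes -7567, not -7568.
static int example_f2f(double x)
{
   return (int) (x * 4096 + 0.5);
}

// Multiply an integer sample by a fixed-point constant and drop the 12
// fractional bits again, rounding to nearest. Assumes the product is
// non-negative so that >> is well defined.
static int example_fixed_mul(int sample, int constant)
{
   return (sample * constant + 2048) >> 12;
}
```

   For instance, 100 times the constant 0.5411961 is about 54.12, and the
   fixed-point version yields 54.
*/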
1932
1933 // derived from jidctint -- DCT_ISLOW
1934 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1935 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1936 p2 = s2; \
1937 p3 = s6; \
1938 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1939 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1940 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1941 p2 = s0; \
1942 p3 = s4; \
1943 t0 = stbi__fsh(p2+p3); \
1944 t1 = stbi__fsh(p2-p3); \
1945 x0 = t0+t3; \
1946 x3 = t0-t3; \
1947 x1 = t1+t2; \
1948 x2 = t1-t2; \
1949 t0 = s7; \
1950 t1 = s5; \
1951 t2 = s3; \
1952 t3 = s1; \
1953 p3 = t0+t2; \
1954 p4 = t1+t3; \
1955 p1 = t0+t3; \
1956 p2 = t1+t2; \
1957 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1958 t0 = t0*stbi__f2f( 0.298631336f); \
1959 t1 = t1*stbi__f2f( 2.053119869f); \
1960 t2 = t2*stbi__f2f( 3.072711026f); \
1961 t3 = t3*stbi__f2f( 1.501321110f); \
1962 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
1963 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
1964 p3 = p3*stbi__f2f(-1.961570560f); \
1965 p4 = p4*stbi__f2f(-0.390180644f); \
1966 t3 += p1+p4; \
1967 t2 += p2+p3; \
1968 t1 += p2+p4; \
1969 t0 += p1+p3;
1970
static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
1972 {
1973 int i,val[64],*v=val;
1974 stbi_uc *o;
1975 short *d = data;
1976
1977 // columns
1978 for (i=0; i < 8; ++i,++d, ++v) {
1979 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
1980 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
1981 && d[40]==0 && d[48]==0 && d[56]==0) {
1982 // no shortcut 0 seconds
1983 // (1|2|3|4|5|6|7)==0 0 seconds
1984 // all separate -0.047 seconds
1985 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
1986 int dcterm = d[0] << 2;
1987 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
1988 } else {
1989 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
1990 // constants scaled things up by 1<<12; let's bring them back
1991 // down, but keep 2 extra bits of precision
1992 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
1993 v[ 0] = (x0+t3) >> 10;
1994 v[56] = (x0-t3) >> 10;
1995 v[ 8] = (x1+t2) >> 10;
1996 v[48] = (x1-t2) >> 10;
1997 v[16] = (x2+t1) >> 10;
1998 v[40] = (x2-t1) >> 10;
1999 v[24] = (x3+t0) >> 10;
2000 v[32] = (x3-t0) >> 10;
2001 }
2002 }
2003
2004 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2005 // no fast case since the first 1D IDCT spread components out
2006 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2007 // constants scaled things up by 1<<12, plus we had 1<<2 from first
2008 // loop, plus horizontal and vertical each scale by sqrt(8) so together
2009 // we've got an extra 1<<3, so 1<<17 total we need to remove.
2010 // so we want to round that, which means adding 0.5 * 1<<17,
2011 // aka 65536. Also, we'll end up with -128 to 127 that we want
2012 // to encode as 0..255 by adding 128, so we'll add that before the shift
2013 x0 += 65536 + (128<<17);
2014 x1 += 65536 + (128<<17);
2015 x2 += 65536 + (128<<17);
2016 x3 += 65536 + (128<<17);
2017 // tried computing the shifts into temps, or'ing the temps to see
2018 // if any were out of range, but that was slower
2019 o[0] = stbi__clamp((x0+t3) >> 17);
2020 o[7] = stbi__clamp((x0-t3) >> 17);
2021 o[1] = stbi__clamp((x1+t2) >> 17);
2022 o[6] = stbi__clamp((x1-t2) >> 17);
2023 o[2] = stbi__clamp((x2+t1) >> 17);
2024 o[5] = stbi__clamp((x2-t1) >> 17);
2025 o[3] = stbi__clamp((x3+t0) >> 17);
2026 o[4] = stbi__clamp((x3-t0) >> 17);
2027 }
2028 }
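/*
   EXAMPLE (illustrative sketch, not part of stb_image).
   The scaling comments in stbi__idct_block are easiest to check on a block
   whose only non-zero coefficient is DC. The column pass takes its
   shortcut (d[0] << 2), the row pass sees that value in s0 with every
   other input zero, and all 64 output pixels come out equal. The sketch
   below follows that single path; the function name is hypothetical.

```c
// Pixel value produced for a block whose only non-zero coefficient is the
// (already dequantized) DC term. Assumes dc is large enough that the
// biased sum stays non-negative, so >> is well defined.
static unsigned char example_dc_only_pixel(int dc)
{
   int col = dc * 4;            // column pass shortcut: d[0] << 2
   int x   = col * 4096;        // row pass: stbi__fsh(), i.e. << 12
   x += 65536 + (128 << 17);    // rounding term plus the +128 level shift
   x >>= 17;                    // remove the 12 + 2 + 3 scale bits
   if (x < 0)   x = 0;          // same result as stbi__clamp
   if (x > 255) x = 255;
   return (unsigned char) x;
}
```

   In effect the pixel is 128 + (dc + 4) / 8 before clamping, so a DC term
   of 0 gives mid-gray 128 and each step of 8 moves the output by one level.
*/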
2029
2030 #ifdef STBI_SSE2
2031 // sse2 integer IDCT. not the fastest possible implementation but it
2032 // produces bit-identical results to the generic C version so it's
2033 // fully "transparent".
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2035 {
2036 // This is constructed to match our regular (generic) integer IDCT exactly.
2037 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2038 __m128i tmp;
2039
2040 // dot product constant: even elems=x, odd elems=y
2041 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2042
2043 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2044 // out(1) = c1[even]*x + c1[odd]*y
2045 #define dct_rot(out0,out1, x,y,c0,c1) \
2046 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2047 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2048 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2049 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2050 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2051 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2052
2053 // out = in << 12 (in 16-bit, out 32-bit)
2054 #define dct_widen(out, in) \
2055 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2056 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2057
2058 // wide add
2059 #define dct_wadd(out, a, b) \
2060 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2061 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2062
2063 // wide sub
2064 #define dct_wsub(out, a, b) \
2065 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2066 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2067
2068 // butterfly a/b, add bias, then shift by "s" and pack
2069 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2070 { \
2071 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2072 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2073 dct_wadd(sum, abiased, b); \
2074 dct_wsub(dif, abiased, b); \
2075 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2076 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2077 }
2078
2079 // 8-bit interleave step (for transposes)
2080 #define dct_interleave8(a, b) \
2081 tmp = a; \
2082 a = _mm_unpacklo_epi8(a, b); \
2083 b = _mm_unpackhi_epi8(tmp, b)
2084
2085 // 16-bit interleave step (for transposes)
2086 #define dct_interleave16(a, b) \
2087 tmp = a; \
2088 a = _mm_unpacklo_epi16(a, b); \
2089 b = _mm_unpackhi_epi16(tmp, b)
2090
2091 #define dct_pass(bias,shift) \
2092 { \
2093 /* even part */ \
2094 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2095 __m128i sum04 = _mm_add_epi16(row0, row4); \
2096 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2097 dct_widen(t0e, sum04); \
2098 dct_widen(t1e, dif04); \
2099 dct_wadd(x0, t0e, t3e); \
2100 dct_wsub(x3, t0e, t3e); \
2101 dct_wadd(x1, t1e, t2e); \
2102 dct_wsub(x2, t1e, t2e); \
2103 /* odd part */ \
2104 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2105 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2106 __m128i sum17 = _mm_add_epi16(row1, row7); \
2107 __m128i sum35 = _mm_add_epi16(row3, row5); \
2108 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2109 dct_wadd(x4, y0o, y4o); \
2110 dct_wadd(x5, y1o, y5o); \
2111 dct_wadd(x6, y2o, y5o); \
2112 dct_wadd(x7, y3o, y4o); \
2113 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2114 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2115 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2116 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2117 }
2118
2119 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2120 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2121 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2122 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2123 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2124 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2125 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2126 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2127
2128 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2129 __m128i bias_0 = _mm_set1_epi32(512);
2130 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2131
2132 // load
2133 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2134 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2135 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2136 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2137 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2138 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2139 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2140 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2141
2142 // column pass
2143 dct_pass(bias_0, 10);
2144
2145 {
2146 // 16bit 8x8 transpose pass 1
2147 dct_interleave16(row0, row4);
2148 dct_interleave16(row1, row5);
2149 dct_interleave16(row2, row6);
2150 dct_interleave16(row3, row7);
2151
2152 // transpose pass 2
2153 dct_interleave16(row0, row2);
2154 dct_interleave16(row1, row3);
2155 dct_interleave16(row4, row6);
2156 dct_interleave16(row5, row7);
2157
2158 // transpose pass 3
2159 dct_interleave16(row0, row1);
2160 dct_interleave16(row2, row3);
2161 dct_interleave16(row4, row5);
2162 dct_interleave16(row6, row7);
2163 }
2164
2165 // row pass
2166 dct_pass(bias_1, 17);
2167
2168 {
2169 // pack
2170 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2171 __m128i p1 = _mm_packus_epi16(row2, row3);
2172 __m128i p2 = _mm_packus_epi16(row4, row5);
2173 __m128i p3 = _mm_packus_epi16(row6, row7);
2174
2175 // 8bit 8x8 transpose pass 1
2176 dct_interleave8(p0, p2); // a0e0a1e1...
2177 dct_interleave8(p1, p3); // c0g0c1g1...
2178
2179 // transpose pass 2
2180 dct_interleave8(p0, p1); // a0c0e0g0...
2181 dct_interleave8(p2, p3); // b0d0f0h0...
2182
2183 // transpose pass 3
2184 dct_interleave8(p0, p2); // a0b0c0d0...
2185 dct_interleave8(p1, p3); // a4b4c4d4...
2186
2187 // store
2188 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2189 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2190 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2191 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2192 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2193 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2194 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2195 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2196 }
2197
2198 #undef dct_const
2199 #undef dct_rot
2200 #undef dct_widen
2201 #undef dct_wadd
2202 #undef dct_wsub
2203 #undef dct_bfly32o
2204 #undef dct_interleave8
2205 #undef dct_interleave16
2206 #undef dct_pass
2207 }
2208
2209 #endif // STBI_SSE2
2210
2211 #ifdef STBI_NEON
2212
2213 // NEON integer IDCT. should produce bit-identical
2214 // results to the generic C version.
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2216 {
2217 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2218
2219 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2220 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2221 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2222 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2223 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2224 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2225 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2226 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2227 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2228 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2229 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2230 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2231
2232 #define dct_long_mul(out, inq, coeff) \
2233 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2234 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2235
2236 #define dct_long_mac(out, acc, inq, coeff) \
2237 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2238 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2239
2240 #define dct_widen(out, inq) \
2241 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2242 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2243
2244 // wide add
2245 #define dct_wadd(out, a, b) \
2246 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2247 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2248
2249 // wide sub
2250 #define dct_wsub(out, a, b) \
2251 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2252 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2253
2254 // butterfly a/b, then shift using "shiftop" by "s" and pack
2255 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2256 { \
2257 dct_wadd(sum, a, b); \
2258 dct_wsub(dif, a, b); \
2259 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2260 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2261 }
2262
2263 #define dct_pass(shiftop, shift) \
2264 { \
2265 /* even part */ \
2266 int16x8_t sum26 = vaddq_s16(row2, row6); \
2267 dct_long_mul(p1e, sum26, rot0_0); \
2268 dct_long_mac(t2e, p1e, row6, rot0_1); \
2269 dct_long_mac(t3e, p1e, row2, rot0_2); \
2270 int16x8_t sum04 = vaddq_s16(row0, row4); \
2271 int16x8_t dif04 = vsubq_s16(row0, row4); \
2272 dct_widen(t0e, sum04); \
2273 dct_widen(t1e, dif04); \
2274 dct_wadd(x0, t0e, t3e); \
2275 dct_wsub(x3, t0e, t3e); \
2276 dct_wadd(x1, t1e, t2e); \
2277 dct_wsub(x2, t1e, t2e); \
2278 /* odd part */ \
2279 int16x8_t sum15 = vaddq_s16(row1, row5); \
2280 int16x8_t sum17 = vaddq_s16(row1, row7); \
2281 int16x8_t sum35 = vaddq_s16(row3, row5); \
2282 int16x8_t sum37 = vaddq_s16(row3, row7); \
2283 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2284 dct_long_mul(p5o, sumodd, rot1_0); \
2285 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2286 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2287 dct_long_mul(p3o, sum37, rot2_0); \
2288 dct_long_mul(p4o, sum15, rot2_1); \
2289 dct_wadd(sump13o, p1o, p3o); \
2290 dct_wadd(sump24o, p2o, p4o); \
2291 dct_wadd(sump23o, p2o, p3o); \
2292 dct_wadd(sump14o, p1o, p4o); \
2293 dct_long_mac(x4, sump13o, row7, rot3_0); \
2294 dct_long_mac(x5, sump24o, row5, rot3_1); \
2295 dct_long_mac(x6, sump23o, row3, rot3_2); \
2296 dct_long_mac(x7, sump14o, row1, rot3_3); \
2297 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2298 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2299 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2300 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2301 }
2302
2303 // load
2304 row0 = vld1q_s16(data + 0*8);
2305 row1 = vld1q_s16(data + 1*8);
2306 row2 = vld1q_s16(data + 2*8);
2307 row3 = vld1q_s16(data + 3*8);
2308 row4 = vld1q_s16(data + 4*8);
2309 row5 = vld1q_s16(data + 5*8);
2310 row6 = vld1q_s16(data + 6*8);
2311 row7 = vld1q_s16(data + 7*8);
2312
2313 // add DC bias
2314 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2315
2316 // column pass
2317 dct_pass(vrshrn_n_s32, 10);
2318
2319 // 16bit 8x8 transpose
2320 {
2321 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2322 // whether compilers actually get this is another story, sadly.
2323 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2324 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2325 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2326
2327 // pass 1
2328 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2329 dct_trn16(row2, row3);
2330 dct_trn16(row4, row5);
2331 dct_trn16(row6, row7);
2332
2333 // pass 2
2334 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2335 dct_trn32(row1, row3);
2336 dct_trn32(row4, row6);
2337 dct_trn32(row5, row7);
2338
2339 // pass 3
2340 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2341 dct_trn64(row1, row5);
2342 dct_trn64(row2, row6);
2343 dct_trn64(row3, row7);
2344
2345 #undef dct_trn16
2346 #undef dct_trn32
2347 #undef dct_trn64
2348 }
2349
2350 // row pass
2351 // vrshrn_n_s32 only supports shifts up to 16, we need
2352 // 17. so do a non-rounding shift of 16 first then follow
2353 // up with a rounding shift by 1.
2354 dct_pass(vshrn_n_s32, 16);
2355
2356 {
2357 // pack and round
2358 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2359 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2360 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2361 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2362 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2363 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2364 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2365 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2366
2367 // again, these can translate into one instruction, but often don't.
2368 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2369 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2370 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2371
2372 // sadly can't use interleaved stores here since we only write
2373 // 8 bytes to each scan line!
2374
2375 // 8x8 8-bit transpose pass 1
2376 dct_trn8_8(p0, p1);
2377 dct_trn8_8(p2, p3);
2378 dct_trn8_8(p4, p5);
2379 dct_trn8_8(p6, p7);
2380
2381 // pass 2
2382 dct_trn8_16(p0, p2);
2383 dct_trn8_16(p1, p3);
2384 dct_trn8_16(p4, p6);
2385 dct_trn8_16(p5, p7);
2386
2387 // pass 3
2388 dct_trn8_32(p0, p4);
2389 dct_trn8_32(p1, p5);
2390 dct_trn8_32(p2, p6);
2391 dct_trn8_32(p3, p7);
2392
2393 // store
2394 vst1_u8(out, p0); out += out_stride;
2395 vst1_u8(out, p1); out += out_stride;
2396 vst1_u8(out, p2); out += out_stride;
2397 vst1_u8(out, p3); out += out_stride;
2398 vst1_u8(out, p4); out += out_stride;
2399 vst1_u8(out, p5); out += out_stride;
2400 vst1_u8(out, p6); out += out_stride;
2401 vst1_u8(out, p7);
2402
2403 #undef dct_trn8_8
2404 #undef dct_trn8_16
2405 #undef dct_trn8_32
2406 }
2407
2408 #undef dct_long_mul
2409 #undef dct_long_mac
2410 #undef dct_widen
2411 #undef dct_wadd
2412 #undef dct_wsub
2413 #undef dct_bfly32o
2414 #undef dct_pass
2415 }
2416
2417 #endif // STBI_NEON
2418
2419 #define STBI__MARKER_none 0xff
2420 // if there's a pending marker from the entropy stream, return that
2421 // otherwise, fetch from the stream and get a marker. if there's no
2422 // marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
2424 {
2425 stbi_uc x;
2426 if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2427 x = stbi__get8(j->s);
2428 if (x != 0xff) return STBI__MARKER_none;
2429 while (x == 0xff)
2430 x = stbi__get8(j->s);
2431 return x;
2432 }
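/*
   EXAMPLE (illustrative sketch, not part of stb_image).
   The marker fetch above can be tried on a plain memory buffer. The sketch
   below keeps the same rule (a marker is one or more 0xff bytes followed
   by a code byte) but leaves out the pending-marker check. The buffer
   reader and all example_* names are hypothetical; returning 0 at end of
   input is an assumption made here so the fill-byte loop always stops.

```c
#include <stddef.h>

#define EXAMPLE_MARKER_NONE 0xff

// Read one byte, returning 0 once the input is exhausted.
static unsigned char example_get8(const unsigned char *buf, size_t len, size_t *pos)
{
   if (*pos < len) return buf[(*pos)++];
   return 0;
}

// Return the next marker code, or EXAMPLE_MARKER_NONE if the next byte
// does not start a marker. Any run of 0xff fill bytes is skipped.
static unsigned char example_next_marker(const unsigned char *buf, size_t len, size_t *pos)
{
   unsigned char x = example_get8(buf, len, pos);
   if (x != 0xff) return EXAMPLE_MARKER_NONE;
   while (x == 0xff)
      x = example_get8(buf, len, pos);
   return x;
}
```

   Note that a non-marker byte is still consumed, just as in the function
   above, so the caller cannot simply retry at the same position.
*/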
2433
2434 // in each scan, we'll have scan_n components, and the order
2435 // of the components is specified by order[]
2436 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2437
// after a restart interval, reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
{
   j->code_bits = 0;
   j->code_buffer = 0;
   j->nomore = 0;
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   j->eob_run = 0;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
}
2452
static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2454 {
2455 stbi__jpeg_reset(z);
2456 if (!z->progressive) {
2457 if (z->scan_n == 1) {
2458 int i,j;
2459 STBI_SIMD_ALIGN(short, data[64]);
2460 int n = z->order[0];
2461 // non-interleaved data, we just need to process one block at a time,
2462 // in trivial scanline order
2463 // number of blocks to do just depends on how many actual "pixels" this
2464 // component has, independent of interleaved MCU blocking and such
2465 int w = (z->img_comp[n].x+7) >> 3;
2466 int h = (z->img_comp[n].y+7) >> 3;
2467 for (j=0; j < h; ++j) {
2468 for (i=0; i < w; ++i) {
2469 int ha = z->img_comp[n].ha;
2470 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2471 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2472 // every data block is an MCU, so countdown the restart interval
2473 if (--z->todo <= 0) {
2474 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2475 // if it's NOT a restart, then just bail, so we get corrupt data
2476 // rather than no data
2477 if (!STBI__RESTART(z->marker)) return 1;
2478 stbi__jpeg_reset(z);
2479 }
2480 }
2481 }
2482 return 1;
2483 } else { // interleaved
2484 int i,j,k,x,y;
2485 STBI_SIMD_ALIGN(short, data[64]);
2486 for (j=0; j < z->img_mcu_y; ++j) {
2487 for (i=0; i < z->img_mcu_x; ++i) {
2488 // scan an interleaved mcu... process scan_n components in order
2489 for (k=0; k < z->scan_n; ++k) {
2490 int n = z->order[k];
2491 // scan out an mcu's worth of this component; that's just determined
2492 // by the basic H and V specified for the component
2493 for (y=0; y < z->img_comp[n].v; ++y) {
2494 for (x=0; x < z->img_comp[n].h; ++x) {
2495 int x2 = (i*z->img_comp[n].h + x)*8;
2496 int y2 = (j*z->img_comp[n].v + y)*8;
2497 int ha = z->img_comp[n].ha;
2498 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2499 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2500 }
2501 }
2502 }
2503 // after all interleaved components, that's an interleaved MCU,
2504 // so now count down the restart interval
2505 if (--z->todo <= 0) {
2506 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2507 if (!STBI__RESTART(z->marker)) return 1;
2508 stbi__jpeg_reset(z);
2509 }
2510 }
2511 }
2512 return 1;
2513 }
2514 } else {
2515 if (z->scan_n == 1) {
2516 int i,j;
2517 int n = z->order[0];
2518 // non-interleaved data, we just need to process one block at a time,
2519 // in trivial scanline order
2520 // number of blocks to do just depends on how many actual "pixels" this
2521 // component has, independent of interleaved MCU blocking and such
2522 int w = (z->img_comp[n].x+7) >> 3;
2523 int h = (z->img_comp[n].y+7) >> 3;
2524 for (j=0; j < h; ++j) {
2525 for (i=0; i < w; ++i) {
2526 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2527 if (z->spec_start == 0) {
2528 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2529 return 0;
2530 } else {
2531 int ha = z->img_comp[n].ha;
2532 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2533 return 0;
2534 }
2535 // every data block is an MCU, so countdown the restart interval
2536 if (--z->todo <= 0) {
2537 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2538 if (!STBI__RESTART(z->marker)) return 1;
2539 stbi__jpeg_reset(z);
2540 }
2541 }
2542 }
2543 return 1;
2544 } else { // interleaved
2545 int i,j,k,x,y;
2546 for (j=0; j < z->img_mcu_y; ++j) {
2547 for (i=0; i < z->img_mcu_x; ++i) {
2548 // scan an interleaved mcu... process scan_n components in order
2549 for (k=0; k < z->scan_n; ++k) {
2550 int n = z->order[k];
2551 // scan out an mcu's worth of this component; that's just determined
2552 // by the basic H and V specified for the component
2553 for (y=0; y < z->img_comp[n].v; ++y) {
2554 for (x=0; x < z->img_comp[n].h; ++x) {
2555 int x2 = (i*z->img_comp[n].h + x);
2556 int y2 = (j*z->img_comp[n].v + y);
2557 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2558 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2559 return 0;
2560 }
2561 }
2562 }
2563 // after all interleaved components, that's an interleaved MCU,
2564 // so now count down the restart interval
2565 if (--z->todo <= 0) {
2566 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2567 if (!STBI__RESTART(z->marker)) return 1;
2568 stbi__jpeg_reset(z);
2569 }
2570 }
2571 }
2572 return 1;
2573 }
2574 }
2575 }
2576
static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
{
   int i;
   for (i=0; i < 64; ++i)
      data[i] *= dequant[i];
}
2583
static void stbi__jpeg_finish(stbi__jpeg *z)
2585 {
2586 if (z->progressive) {
2587 // dequantize and idct the data
2588 int i,j,n;
2589 for (n=0; n < z->s->img_n; ++n) {
2590 int w = (z->img_comp[n].x+7) >> 3;
2591 int h = (z->img_comp[n].y+7) >> 3;
2592 for (j=0; j < h; ++j) {
2593 for (i=0; i < w; ++i) {
2594 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2595 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2596 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2597 }
2598 }
2599 }
2600 }
2601 }
2602
static int stbi__process_marker(stbi__jpeg *z, int m)
2604 {
2605 int L;
2606 switch (m) {
2607 case STBI__MARKER_none: // no marker found
2608 return stbi__err("expected marker","Corrupt JPEG");
2609
2610 case 0xDD: // DRI - specify restart interval
2611 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2612 z->restart_interval = stbi__get16be(z->s);
2613 return 1;
2614
2615 case 0xDB: // DQT - define quantization table
2616 L = stbi__get16be(z->s)-2;
2617 while (L > 0) {
2618 int q = stbi__get8(z->s);
2619 int p = q >> 4;
2620 int t = q & 15,i;
2621 if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2622 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2623 for (i=0; i < 64; ++i)
2624 z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2625 L -= 65;
2626 }
2627 return L==0;
2628
2629 case 0xC4: // DHT - define huffman table
2630 L = stbi__get16be(z->s)-2;
2631 while (L > 0) {
2632 stbi_uc *v;
2633 int sizes[16],i,n=0;
2634 int q = stbi__get8(z->s);
2635 int tc = q >> 4;
2636 int th = q & 15;
2637 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2638 for (i=0; i < 16; ++i) {
2639 sizes[i] = stbi__get8(z->s);
2640 n += sizes[i];
2641 }
2642 L -= 17;
2643 if (tc == 0) {
2644 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2645 v = z->huff_dc[th].values;
2646 } else {
2647 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2648 v = z->huff_ac[th].values;
2649 }
2650 for (i=0; i < n; ++i)
2651 v[i] = stbi__get8(z->s);
2652 if (tc != 0)
2653 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2654 L -= n;
2655 }
2656 return L==0;
2657 }
2658 // check for comment block or APP blocks
2659 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2660 stbi__skip(z->s, stbi__get16be(z->s)-2);
2661 return 1;
2662 }
2663 return 0;
2664 }
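/* Illustrative sketch (not part of stb_image): in the DHT case above, each table
   takes 1 class/id byte, 16 code-length counts, and then as many symbol bytes as
   those counts sum to, which is where "L -= 17" and "L -= n" come from. The name
   sketch_dht_table_bytes and its in-memory buffer interface are assumptions made
   for this example only.

```c
#include <stddef.h>

// Returns the number of bytes one Huffman table occupies inside a DHT
// segment payload, or -1 if the header is invalid or the buffer is too short.
static int sketch_dht_table_bytes(const unsigned char *p, size_t avail)
{
   int i, n = 0;
   if (avail < 17) return -1;                     // class/id byte + 16 counts
   if ((p[0] >> 4) > 1 || (p[0] & 15) > 3) return -1;
   for (i = 0; i < 16; ++i) n += p[1 + i];        // total number of symbols
   if (avail < (size_t) (17 + n)) return -1;
   return 17 + n;
}
```

   A caller would subtract the returned size from the remaining segment length
   and expect to land exactly on zero, mirroring the "return L==0" check.
*/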
2665
2666 // after we see SOS
2667 static int stbi__process_scan_header(stbi__jpeg *z)
2668 {
2669 int i;
2670 int Ls = stbi__get16be(z->s);
2671 z->scan_n = stbi__get8(z->s);
2672 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
2673 if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
2674 for (i=0; i < z->scan_n; ++i) {
2675 int id = stbi__get8(z->s), which;
2676 int q = stbi__get8(z->s);
2677 for (which = 0; which < z->s->img_n; ++which)
2678 if (z->img_comp[which].id == id)
2679 break;
2680 if (which == z->s->img_n) return 0; // no match
2681 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
2682 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
2683 z->order[i] = which;
2684 }
2685
2686 {
2687 int aa;
2688 z->spec_start = stbi__get8(z->s);
2689 z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2690 aa = stbi__get8(z->s);
2691 z->succ_high = (aa >> 4);
2692 z->succ_low = (aa & 15);
2693 if (z->progressive) {
2694 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2695 return stbi__err("bad SOS", "Corrupt JPEG");
2696 } else {
2697 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2698 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2699 z->spec_end = 63;
2700 }
2701 }
2702
2703 return 1;
2704 }
2705
2706 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2707 {
2708 stbi__context *s = z->s;
2709 int Lf,p,i,q, h_max=1,v_max=1,c;
2710 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2711 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2712 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2713 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2714 c = stbi__get8(s);
2715 if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2716 s->img_n = c;
2717 for (i=0; i < c; ++i) {
2718 z->img_comp[i].data = NULL;
2719 z->img_comp[i].linebuf = NULL;
2720 }
2721
2722 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2723
2724 for (i=0; i < s->img_n; ++i) {
2725 z->img_comp[i].id = stbi__get8(s);
2726 if (z->img_comp[i].id != i+1) // JFIF requires
2727 if (z->img_comp[i].id != i) // some version of jpegtran outputs non-JFIF-compliant files!
2728 return stbi__err("bad component ID","Corrupt JPEG");
2729 q = stbi__get8(s);
2730 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2731 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2732 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2733 }
2734
2735 if (scan != STBI__SCAN_load) return 1;
2736
2737 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2738
2739 for (i=0; i < s->img_n; ++i) {
2740 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2741 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2742 }
2743
2744 // compute interleaved mcu info
2745 z->img_h_max = h_max;
2746 z->img_v_max = v_max;
2747 z->img_mcu_w = h_max * 8;
2748 z->img_mcu_h = v_max * 8;
2749 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2750 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2751
2752 for (i=0; i < s->img_n; ++i) {
2753 // number of effective pixels (e.g. for non-interleaved MCU)
2754 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2755 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2756 // to simplify generation, we'll allocate enough memory to decode
2757 // the bogus oversized data from using interleaved MCUs and their
2758 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2759 // discard the extra data until colorspace conversion
2760 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2761 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2762 z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2763
2764 if (z->img_comp[i].raw_data == NULL) {
2765 for(--i; i >= 0; --i) {
2766 STBI_FREE(z->img_comp[i].raw_data);
2767 z->img_comp[i].raw_data = NULL;
2768 }
2769 return stbi__err("outofmem", "Out of memory");
2770 }
2771 // align blocks for idct using mmx/sse
2772 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2773 z->img_comp[i].linebuf = NULL;
2774 if (z->progressive) {
2775 z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2776 z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
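2777 z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15); if (z->img_comp[i].raw_coeff == NULL) return stbi__err("outofmem", "Out of memory"); // caller frees what was allocated via stbi__cleanup_jpeg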
2778 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2779 } else {
2780 z->img_comp[i].coeff = 0;
2781 z->img_comp[i].raw_coeff = 0;
2782 }
2783 }
2784
2785 return 1;
2786 }
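/* Illustrative sketch (not part of stb_image): the MCU bookkeeping above, pulled
   out into small helpers so the rounding can be checked by hand. All names
   (sketch_mcu_grid, sketch_mcu_layout, sketch_comp_samples, sketch_comp_padded)
   are hypothetical; only the arithmetic is taken from the function above.

```c
typedef struct {
   int mcu_w, mcu_h;   // size of one interleaved MCU in pixels
   int mcu_x, mcu_y;   // number of MCUs across and down (rounded up)
} sketch_mcu_grid;

static sketch_mcu_grid sketch_mcu_layout(int img_x, int img_y, int h_max, int v_max)
{
   sketch_mcu_grid g;
   g.mcu_w = h_max * 8;
   g.mcu_h = v_max * 8;
   g.mcu_x = (img_x + g.mcu_w - 1) / g.mcu_w;
   g.mcu_y = (img_y + g.mcu_h - 1) / g.mcu_h;
   return g;
}

// Effective number of samples along one axis for a component with the
// given sampling factor (corresponds to img_comp[i].x / .y above).
static int sketch_comp_samples(int img_dim, int factor, int factor_max)
{
   return (img_dim * factor + factor_max - 1) / factor_max;
}

// Padded buffer dimension that holds whole MCUs (corresponds to w2 / h2).
static int sketch_comp_padded(int mcu_count, int factor)
{
   return mcu_count * factor * 8;
}
```

   For the case mentioned in the comment above (16x16 iMCU, image width 33),
   sketch_mcu_layout gives 3 MCUs across, so a full-resolution component is
   decoded into rows of 48 samples even though only 33 are kept.
*/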
2787
2788 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2789 #define stbi__DNL(x) ((x) == 0xdc)
2790 #define stbi__SOI(x) ((x) == 0xd8)
2791 #define stbi__EOI(x) ((x) == 0xd9)
2792 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2793 #define stbi__SOS(x) ((x) == 0xda)
2794
2795 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2796
2797 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2798 {
2799 int m;
2800 z->marker = STBI__MARKER_none; // initialize cached marker to empty
2801 m = stbi__get_marker(z);
2802 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2803 if (scan == STBI__SCAN_type) return 1;
2804 m = stbi__get_marker(z);
2805 while (!stbi__SOF(m)) {
2806 if (!stbi__process_marker(z,m)) return 0;
2807 m = stbi__get_marker(z);
2808 while (m == STBI__MARKER_none) {
2809 // some files have extra padding after their blocks, so ok, we'll scan
2810 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2811 m = stbi__get_marker(z);
2812 }
2813 }
2814 z->progressive = stbi__SOF_progressive(m);
2815 if (!stbi__process_frame_header(z, scan)) return 0;
2816 return 1;
2817 }
2818
2819 // decode image to YCbCr format
2820 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2821 {
2822 int m;
2823 for (m = 0; m < 4; m++) {
2824 j->img_comp[m].raw_data = NULL;
2825 j->img_comp[m].raw_coeff = NULL;
2826 }
2827 j->restart_interval = 0;
2828 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2829 m = stbi__get_marker(j);
2830 while (!stbi__EOI(m)) {
2831 if (stbi__SOS(m)) {
2832 if (!stbi__process_scan_header(j)) return 0;
2833 if (!stbi__parse_entropy_coded_data(j)) return 0;
2834 if (j->marker == STBI__MARKER_none ) {
2835 // handle 0s at the end of image data from IP Kamera 9060
2836 while (!stbi__at_eof(j->s)) {
2837 int x = stbi__get8(j->s);
2838 if (x == 255) {
2839 j->marker = stbi__get8(j->s);
2840 break;
2841 } else if (x != 0) {
2842 return stbi__err("junk before marker", "Corrupt JPEG");
2843 }
2844 }
2845 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2846 }
2847 } else {
2848 if (!stbi__process_marker(j, m)) return 0;
2849 }
2850 m = stbi__get_marker(j);
2851 }
2852 if (j->progressive)
2853 stbi__jpeg_finish(j);
2854 return 1;
2855 }
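/* Illustrative sketch (not part of stb_image): the zero-padding workaround in
   the scan loop above, restated over a plain memory buffer. The function name
   and the return-code convention are assumptions for this example; unlike the
   code above, it reports a trailing 0xFF with no following byte as end of data.

```c
#include <stddef.h>

// Skips 0x00 padding starting at *pos and returns the marker byte that
// follows the next 0xFF. Returns -1 on non-zero junk, -2 if the data ends
// before a marker is found. *pos is advanced past everything consumed.
static int sketch_skip_zero_padding(const unsigned char *buf, size_t len, size_t *pos)
{
   while (*pos < len) {
      unsigned char x = buf[(*pos)++];
      if (x == 0xFF) {
         if (*pos >= len) return -2;   // 0xFF was the last byte
         return buf[(*pos)++];         // marker code
      }
      if (x != 0) return -1;           // junk before marker
   }
   return -2;
}
```
*/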
2856
2857 // static jfif-centered resampling (across block boundaries)
2858
2859 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2860 int w, int hs);
2861
2862 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2863
2864 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2865 {
2866 STBI_NOTUSED(out);
2867 STBI_NOTUSED(in_far);
2868 STBI_NOTUSED(w);
2869 STBI_NOTUSED(hs);
2870 return in_near;
2871 }
2872
2873 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2874 {
2875 // need to generate two samples vertically for every one in input
2876 int i;
2877 STBI_NOTUSED(hs);
2878 for (i=0; i < w; ++i)
2879 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2880 return out;
2881 }
2882
2883 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2884 {
2885 // need to generate two samples horizontally for every one in input
2886 int i;
2887 stbi_uc *input = in_near;
2888
2889 if (w == 1) {
2890 // if only one sample, can't do any interpolation
2891 out[0] = out[1] = input[0];
2892 return out;
2893 }
2894
2895 out[0] = input[0];
2896 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2897 for (i=1; i < w-1; ++i) {
2898 int n = 3*input[i]+2;
2899 out[i*2+0] = stbi__div4(n+input[i-1]);
2900 out[i*2+1] = stbi__div4(n+input[i+1]);
2901 }
2902 out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
2903 out[i*2+1] = input[w-1];
2904
2905 STBI_NOTUSED(in_far);
2906 STBI_NOTUSED(hs);
2907
2908 return out;
2909 }
2910
2911 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2912
2913 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2914 {
2915 // need to generate 2x2 samples for every one in input
2916 int i,t0,t1;
2917 if (w == 1) {
2918 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2919 return out;
2920 }
2921
2922 t1 = 3*in_near[0] + in_far[0];
2923 out[0] = stbi__div4(t1+2);
2924 for (i=1; i < w; ++i) {
2925 t0 = t1;
2926 t1 = 3*in_near[i]+in_far[i];
2927 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2928 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2929 }
2930 out[w*2-1] = stbi__div4(t1+2);
2931
2932 STBI_NOTUSED(hs);
2933
2934 return out;
2935 }
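/* Illustrative sketch (not part of stb_image): the 2x2 upsampling filter above,
   written with plain types so the weights are easy to check by hand. The
   vertical pass forms t = 3*near + far per column; the horizontal pass blends
   neighbouring t values 3:1 and divides by 16. The name sketch_upsample_hv2 and
   the parameter names are hypothetical.

```c
// Writes 2*w output samples from w input columns of two adjacent rows.
static void sketch_upsample_hv2(unsigned char *out,
                                const unsigned char *near_row,
                                const unsigned char *far_row, int w)
{
   int i, t0, t1;
   t1 = 3 * near_row[0] + far_row[0];                 // vertical pass, column 0
   if (w == 1) {
      out[0] = out[1] = (unsigned char) ((t1 + 2) >> 2);
      return;
   }
   out[0] = (unsigned char) ((t1 + 2) >> 2);          // left edge: no neighbour
   for (i = 1; i < w; ++i) {
      t0 = t1;
      t1 = 3 * near_row[i] + far_row[i];
      out[i*2 - 1] = (unsigned char) ((3*t0 + t1 + 8) >> 4);
      out[i*2    ] = (unsigned char) ((3*t1 + t0 + 8) >> 4);
   }
   out[w*2 - 1] = (unsigned char) ((t1 + 2) >> 2);    // right edge
}
```

   With both rows equal to {0, 16}, the four outputs are {0, 4, 12, 16}: the
   two interior samples sit a quarter and three quarters of the way between
   the inputs.
*/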
2936
2937 #if defined(STBI_SSE2) || defined(STBI_NEON)
2938 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2939 {
2940 // need to generate 2x2 samples for every one in input
2941 int i=0,t0,t1;
2942
2943 if (w == 1) {
2944 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2945 return out;
2946 }
2947
2948 t1 = 3*in_near[0] + in_far[0];
2949 // process groups of 8 pixels for as long as we can.
2950 // note we can't handle the last pixel in a row in this loop
2951 // because we need to handle the filter boundary conditions.
2952 for (; i < ((w-1) & ~7); i += 8) {
2953 #if defined(STBI_SSE2)
2954 // load and perform the vertical filtering pass
2955 // this uses 3*x + y = 4*x + (y - x)
2956 __m128i zero = _mm_setzero_si128();
2957 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2958 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2959 __m128i farw = _mm_unpacklo_epi8(farb, zero);
2960 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
2961 __m128i diff = _mm_sub_epi16(farw, nearw);
2962 __m128i nears = _mm_slli_epi16(nearw, 2);
2963 __m128i curr = _mm_add_epi16(nears, diff); // current row
2964
2965 // horizontal filter works the same based on shifted versions of current
2966 // row. "prev" is current row shifted right by 1 pixel; we need to
2967 // insert the previous pixel value (from t1).
2968 // "next" is current row shifted left by 1 pixel, with first pixel
2969 // of next block of 8 pixels added in.
2970 __m128i prv0 = _mm_slli_si128(curr, 2);
2971 __m128i nxt0 = _mm_srli_si128(curr, 2);
2972 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
2973 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
2974
2975 // horizontal filter, polyphase implementation since it's convenient:
2976 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
2977 // odd pixels = 3*cur + next = cur*4 + (next - cur)
2978 // note the shared term.
2979 __m128i bias = _mm_set1_epi16(8);
2980 __m128i curs = _mm_slli_epi16(curr, 2);
2981 __m128i prvd = _mm_sub_epi16(prev, curr);
2982 __m128i nxtd = _mm_sub_epi16(next, curr);
2983 __m128i curb = _mm_add_epi16(curs, bias);
2984 __m128i even = _mm_add_epi16(prvd, curb);
2985 __m128i odd = _mm_add_epi16(nxtd, curb);
2986
2987 // interleave even and odd pixels, then undo scaling.
2988 __m128i int0 = _mm_unpacklo_epi16(even, odd);
2989 __m128i int1 = _mm_unpackhi_epi16(even, odd);
2990 __m128i de0 = _mm_srli_epi16(int0, 4);
2991 __m128i de1 = _mm_srli_epi16(int1, 4);
2992
2993 // pack and write output
2994 __m128i outv = _mm_packus_epi16(de0, de1);
2995 _mm_storeu_si128((__m128i *) (out + i*2), outv);
2996 #elif defined(STBI_NEON)
2997 // load and perform the vertical filtering pass
2998 // this uses 3*x + y = 4*x + (y - x)
2999 uint8x8_t farb = vld1_u8(in_far + i);
3000 uint8x8_t nearb = vld1_u8(in_near + i);
3001 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3002 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3003 int16x8_t curr = vaddq_s16(nears, diff); // current row
3004
3005 // horizontal filter works the same based on shifted versions of current
3006 // row. "prev" is current row shifted right by 1 pixel; we need to
3007 // insert the previous pixel value (from t1).
3008 // "next" is current row shifted left by 1 pixel, with first pixel
3009 // of next block of 8 pixels added in.
3010 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3011 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3012 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3013 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3014
3015 // horizontal filter, polyphase implementation since it's convenient:
3016 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3017 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3018 // note the shared term.
3019 int16x8_t curs = vshlq_n_s16(curr, 2);
3020 int16x8_t prvd = vsubq_s16(prev, curr);
3021 int16x8_t nxtd = vsubq_s16(next, curr);
3022 int16x8_t even = vaddq_s16(curs, prvd);
3023 int16x8_t odd = vaddq_s16(curs, nxtd);
3024
3025 // undo scaling and round, then store with even/odd phases interleaved
3026 uint8x8x2_t o;
3027 o.val[0] = vqrshrun_n_s16(even, 4);
3028 o.val[1] = vqrshrun_n_s16(odd, 4);
3029 vst2_u8(out + i*2, o);
3030 #endif
3031
3032 // "previous" value for next iter
3033 t1 = 3*in_near[i+7] + in_far[i+7];
3034 }
3035
3036 t0 = t1;
3037 t1 = 3*in_near[i] + in_far[i];
3038 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3039
3040 for (++i; i < w; ++i) {
3041 t0 = t1;
3042 t1 = 3*in_near[i]+in_far[i];
3043 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3044 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3045 }
3046 out[w*2-1] = stbi__div4(t1+2);
3047
3048 STBI_NOTUSED(hs);
3049
3050 return out;
3051 }
3052 #endif
3053
3054 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3055 {
3056 // resample with nearest-neighbor
3057 int i,j;
3058 STBI_NOTUSED(in_far);
3059 for (i=0; i < w; ++i)
3060 for (j=0; j < hs; ++j)
3061 out[i*hs+j] = in_near[i];
3062 return out;
3063 }
3064
3065 #ifdef STBI_JPEG_OLD
3066 // this is the same YCbCr-to-RGB calculation that stb_image has used
3067 // historically before the algorithm changes in 1.49
3068 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
3069 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3070 {
3071 int i;
3072 for (i=0; i < count; ++i) {
3073 int y_fixed = (y[i] << 16) + 32768; // rounding
3074 int r,g,b;
3075 int cr = pcr[i] - 128;
3076 int cb = pcb[i] - 128;
3077 r = y_fixed + cr*float2fixed(1.40200f);
3078 g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3079 b = y_fixed + cb*float2fixed(1.77200f);
3080 r >>= 16;
3081 g >>= 16;
3082 b >>= 16;
3083 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3084 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3085 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3086 out[0] = (stbi_uc)r;
3087 out[1] = (stbi_uc)g;
3088 out[2] = (stbi_uc)b;
3089 out[3] = 255;
3090 out += step;
3091 }
3092 }
3093 #else
3094 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3095 // to make sure the code produces the same results in both SIMD and scalar
3096 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3097 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3098 {
3099 int i;
3100 for (i=0; i < count; ++i) {
3101 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3102 int r,g,b;
3103 int cr = pcr[i] - 128;
3104 int cb = pcb[i] - 128;
3105 r = y_fixed + cr* float2fixed(1.40200f);
3106 g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3107 b = y_fixed + cb* float2fixed(1.77200f);
3108 r >>= 20;
3109 g >>= 20;
3110 b >>= 20;
3111 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3112 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3113 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3114 out[0] = (stbi_uc)r;
3115 out[1] = (stbi_uc)g;
3116 out[2] = (stbi_uc)b;
3117 out[3] = 255;
3118 out += step;
3119 }
3120 }
3121 #endif
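/* Illustrative sketch (not part of stb_image): the reduced-precision conversion
   above applied to a single pixel. Coefficients are rounded to 12 fractional
   bits and then shifted up by 8, so all sums carry 20 fractional bits. The
   names sketch_rgb, sketch_f2f, sketch_clamp255 and sketch_ycbcr_to_rgb are
   hypothetical. Like the code above, this assumes a 32-bit int and an
   arithmetic right shift for negative values.

```c
typedef struct { unsigned char r, g, b; } sketch_rgb;

// Same scaling as the float2fixed macro in the non-OLD branch above.
static int sketch_f2f(float x) { return ((int) (x * 4096.0f + 0.5f)) << 8; }

static unsigned char sketch_clamp255(int v)
{
   return (unsigned char) (v < 0 ? 0 : v > 255 ? 255 : v);
}

static sketch_rgb sketch_ycbcr_to_rgb(unsigned char y, unsigned char cb_in, unsigned char cr_in)
{
   sketch_rgb p;
   int y_fixed = (y << 20) + (1 << 19);   // 20 fractional bits plus rounding
   int cr = cr_in - 128;
   int cb = cb_in - 128;
   int r = y_fixed + cr * sketch_f2f(1.40200f);
   // the mask drops the low 16 bits of the cb term, as in the code above
   int g = y_fixed + cr * -sketch_f2f(0.71414f)
         + (int) ((unsigned) (cb * -sketch_f2f(0.34414f)) & 0xffff0000u);
   int b = y_fixed + cb * sketch_f2f(1.77200f);
   p.r = sketch_clamp255(r >> 20);
   p.g = sketch_clamp255(g >> 20);
   p.b = sketch_clamp255(b >> 20);
   return p;
}
```

   Neutral chroma (cb == cr == 128) leaves all three channels equal to y, which
   is a quick sanity check for the bias and rounding terms.
*/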
3122
3123 #if defined(STBI_SSE2) || defined(STBI_NEON)
3124 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3125 {
3126 int i = 0;
3127
3128 #ifdef STBI_SSE2
3129 // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3130 // it's useful in practice (you wouldn't use it for textures, for example).
3131 // so just accelerate step == 4 case.
3132 if (step == 4) {
3133 // this is a fairly straightforward implementation and not super-optimized.
3134 __m128i signflip = _mm_set1_epi8(-0x80);
3135 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3136 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3137 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3138 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3139 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3140 __m128i xw = _mm_set1_epi16(255); // alpha channel
3141
3142 for (; i+7 < count; i += 8) {
3143 // load
3144 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3145 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3146 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3147 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3148 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3149
3150 // unpack to short (and left-shift cr, cb by 8)
3151 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3152 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3153 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3154
3155 // color transform
3156 __m128i yws = _mm_srli_epi16(yw, 4);
3157 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3158 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3159 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3160 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3161 __m128i rws = _mm_add_epi16(cr0, yws);
3162 __m128i gwt = _mm_add_epi16(cb0, yws);
3163 __m128i bws = _mm_add_epi16(yws, cb1);
3164 __m128i gws = _mm_add_epi16(gwt, cr1);
3165
3166 // descale
3167 __m128i rw = _mm_srai_epi16(rws, 4);
3168 __m128i bw = _mm_srai_epi16(bws, 4);
3169 __m128i gw = _mm_srai_epi16(gws, 4);
3170
3171 // back to byte, set up for transpose
3172 __m128i brb = _mm_packus_epi16(rw, bw);
3173 __m128i gxb = _mm_packus_epi16(gw, xw);
3174
3175 // transpose to interleave channels
3176 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3177 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3178 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3179 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3180
3181 // store
3182 _mm_storeu_si128((__m128i *) (out + 0), o0);
3183 _mm_storeu_si128((__m128i *) (out + 16), o1);
3184 out += 32;
3185 }
3186 }
3187 #endif
3188
3189 #ifdef STBI_NEON
3190 // in this version, step=3 support would be easy to add. but is there demand?
3191 if (step == 4) {
3192 // this is a fairly straightforward implementation and not super-optimized.
3193 uint8x8_t signflip = vdup_n_u8(0x80);
3194 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3195 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3196 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3197 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3198
3199 for (; i+7 < count; i += 8) {
3200 // load
3201 uint8x8_t y_bytes = vld1_u8(y + i);
3202 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3203 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3204 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3205 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3206
3207 // expand to s16
3208 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3209 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3210 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3211
3212 // color transform
3213 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3214 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3215 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3216 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3217 int16x8_t rws = vaddq_s16(yws, cr0);
3218 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3219 int16x8_t bws = vaddq_s16(yws, cb1);
3220
3221 // undo scaling, round, convert to byte
3222 uint8x8x4_t o;
3223 o.val[0] = vqrshrun_n_s16(rws, 4);
3224 o.val[1] = vqrshrun_n_s16(gws, 4);
3225 o.val[2] = vqrshrun_n_s16(bws, 4);
3226 o.val[3] = vdup_n_u8(255);
3227
3228 // store, interleaving r/g/b/a
3229 vst4_u8(out, o);
3230 out += 8*4;
3231 }
3232 }
3233 #endif
3234
3235 for (; i < count; ++i) {
3236 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3237 int r,g,b;
3238 int cr = pcr[i] - 128;
3239 int cb = pcb[i] - 128;
3240 r = y_fixed + cr* float2fixed(1.40200f);
3241 g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3242 b = y_fixed + cb* float2fixed(1.77200f);
3243 r >>= 20;
3244 g >>= 20;
3245 b >>= 20;
3246 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3247 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3248 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3249 out[0] = (stbi_uc)r;
3250 out[1] = (stbi_uc)g;
3251 out[2] = (stbi_uc)b;
3252 out[3] = 255;
3253 out += step;
3254 }
3255 }
3256 #endif
3257
3258 // set up the kernels
3259 static void stbi__setup_jpeg(stbi__jpeg *j)
3260 {
3261 j->idct_block_kernel = stbi__idct_block;
3262 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3263 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3264
3265 #ifdef STBI_SSE2
3266 if (stbi__sse2_available()) {
3267 j->idct_block_kernel = stbi__idct_simd;
3268 #ifndef STBI_JPEG_OLD
3269 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3270 #endif
3271 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3272 }
3273 #endif
3274
3275 #ifdef STBI_NEON
3276 j->idct_block_kernel = stbi__idct_simd;
3277 #ifndef STBI_JPEG_OLD
3278 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3279 #endif
3280 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3281 #endif
3282 }
3283
3284 // clean up the temporary component buffers
3285 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3286 {
3287 int i;
3288 for (i=0; i < j->s->img_n; ++i) {
3289 if (j->img_comp[i].raw_data) {
3290 STBI_FREE(j->img_comp[i].raw_data);
3291 j->img_comp[i].raw_data = NULL;
3292 j->img_comp[i].data = NULL;
3293 }
3294 if (j->img_comp[i].raw_coeff) {
3295 STBI_FREE(j->img_comp[i].raw_coeff);
3296 j->img_comp[i].raw_coeff = 0;
3297 j->img_comp[i].coeff = 0;
3298 }
3299 if (j->img_comp[i].linebuf) {
3300 STBI_FREE(j->img_comp[i].linebuf);
3301 j->img_comp[i].linebuf = NULL;
3302 }
3303 }
3304 }
3305
3306 typedef struct
3307 {
3308 resample_row_func resample;
3309 stbi_uc *line0,*line1;
3310 int hs,vs; // expansion factor in each axis
3311 int w_lores; // horizontal pixels pre-expansion
3312 int ystep; // how far through vertical expansion we are
3313 int ypos; // which pre-expansion row we're on
3314 } stbi__resample;
3315
3316 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3317 {
3318 int n, decode_n;
3319 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3320
3321 // validate req_comp
3322 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3323
3324 // load a jpeg image from whichever source, but leave in YCbCr format
3325 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3326
3327 // determine actual number of components to generate
3328 n = req_comp ? req_comp : z->s->img_n;
3329
3330 if (z->s->img_n == 3 && n < 3)
3331 decode_n = 1;
3332 else
3333 decode_n = z->s->img_n;
3334
3335 // resample and color-convert
3336 {
3337 int k;
3338 unsigned int i,j;
3339 stbi_uc *output;
3340 stbi_uc *coutput[4];
3341
3342 stbi__resample res_comp[4];
3343
3344 for (k=0; k < decode_n; ++k) {
3345 stbi__resample *r = &res_comp[k];
3346
3347 // allocate line buffer big enough for upsampling off the edges
3348 // with upsample factor of 4
3349 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3350 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3351
3352 r->hs = z->img_h_max / z->img_comp[k].h;
3353 r->vs = z->img_v_max / z->img_comp[k].v;
3354 r->ystep = r->vs >> 1;
3355 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3356 r->ypos = 0;
3357 r->line0 = r->line1 = z->img_comp[k].data;
3358
3359 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3360 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3361 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3362 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3363 else r->resample = stbi__resample_row_generic;
3364 }
3365
3366 // can't error after this, so this is safe
3367 output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3368 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3369
3370 // now go ahead and resample
3371 for (j=0; j < z->s->img_y; ++j) {
3372 stbi_uc *out = output + n * z->s->img_x * j;
3373 for (k=0; k < decode_n; ++k) {
3374 stbi__resample *r = &res_comp[k];
3375 int y_bot = r->ystep >= (r->vs >> 1);
3376 coutput[k] = r->resample(z->img_comp[k].linebuf,
3377 y_bot ? r->line1 : r->line0,
3378 y_bot ? r->line0 : r->line1,
3379 r->w_lores, r->hs);
3380 if (++r->ystep >= r->vs) {
3381 r->ystep = 0;
3382 r->line0 = r->line1;
3383 if (++r->ypos < z->img_comp[k].y)
3384 r->line1 += z->img_comp[k].w2;
3385 }
3386 }
3387 if (n >= 3) {
3388 stbi_uc *y = coutput[0];
3389 if (z->s->img_n == 3) {
3390 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3391 } else
3392 for (i=0; i < z->s->img_x; ++i) {
3393 out[0] = out[1] = out[2] = y[i];
3394 out[3] = 255; // not used if n==3
3395 out += n;
3396 }
3397 } else {
3398 stbi_uc *y = coutput[0];
3399 if (n == 1)
3400 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3401 else
3402 for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3403 }
3404 }
3405 stbi__cleanup_jpeg(z);
3406 *out_x = z->s->img_x;
3407 *out_y = z->s->img_y;
3408 if (comp) *comp = z->s->img_n; // report original components, not output
3409 return output;
3410 }
3411 }
3412
3413 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3414 {
3415 stbi__jpeg j;
3416 j.s = s;
3417 stbi__setup_jpeg(&j);
3418 return load_jpeg_image(&j, x,y,comp,req_comp);
3419 }
3420
3421 static int stbi__jpeg_test(stbi__context *s)
3422 {
3423 int r;
3424 stbi__jpeg j;
3425 j.s = s;
3426 stbi__setup_jpeg(&j);
3427 r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3428 stbi__rewind(s);
3429 return r;
3430 }
3431
3432 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3433 {
3434 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3435 stbi__rewind( j->s );
3436 return 0;
3437 }
3438 if (x) *x = j->s->img_x;
3439 if (y) *y = j->s->img_y;
3440 if (comp) *comp = j->s->img_n;
3441 return 1;
3442 }
3443
3444 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3445 {
3446 stbi__jpeg j;
3447 j.s = s;
3448 return stbi__jpeg_info_raw(&j, x, y, comp);
3449 }
3450 #endif
3451
3452 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3453 // simple implementation
3454 // - all input must be provided in an upfront buffer
3455 // - all output is written to a single output buffer (can malloc/realloc)
3456 // performance
3457 // - fast huffman
3458
3459 #ifndef STBI_NO_ZLIB
3460
3461 // fast-way is faster to check than jpeg huffman, but slow way is slower
3462 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3463 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3464
3465 // zlib-style huffman encoding
3466 // (jpeg packs from left, zlib from right, so can't share code)
3467 typedef struct
3468 {
3469 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3470 stbi__uint16 firstcode[16];
3471 int maxcode[17];
3472 stbi__uint16 firstsymbol[16];
3473 stbi_uc size[288];
3474 stbi__uint16 value[288];
3475 } stbi__zhuffman;
3476
3477 stbi_inline static int stbi__bitreverse16(int n)
3478 {
3479 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3480 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3481 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3482 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3483 return n;
3484 }
3485
3486 stbi_inline static int stbi__bit_reverse(int v, int bits)
3487 {
3488 STBI_ASSERT(bits <= 16);
3489 // to bit reverse n bits, reverse 16 and shift
3490 // e.g. 11 bits, bit reverse and shift away 5
3491 return stbi__bitreverse16(v) >> (16-bits);
3492 }
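/* Illustrative sketch (not part of stb_image): the "reverse 16 bits, then shift"
   trick from the two functions above, restated with plain names so a few values
   can be checked by hand. sketch_bitreverse16 and sketch_bit_reverse are
   hypothetical names; the masks and shifts are the same as above.

```c
#include <assert.h>

// Reverse all 16 bits by swapping adjacent bits, then pairs, nibbles, bytes.
static int sketch_bitreverse16(int n)
{
   n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
   n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
   n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
   n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
   return n;
}

// Reverse only the low 'bits' bits of v: after a full 16-bit reversal the
// wanted bits sit at the top, so shift the unused (16-bits) positions away.
static int sketch_bit_reverse(int v, int bits)
{
   assert(bits >= 1 && bits <= 16);
   return sketch_bitreverse16(v) >> (16 - bits);
}
```

   For example, the 3-bit value 110 (6) reverses to 011 (3), and the 11-bit
   value 1 reverses to 0x400, matching the "shift away 5" case in the comment
   above.
*/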
3493
3494 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3495 {
3496 int i,k=0;
3497 int code, next_code[16], sizes[17];
3498
3499 // DEFLATE spec for generating codes
3500 memset(sizes, 0, sizeof(sizes));
3501 memset(z->fast, 0, sizeof(z->fast));
3502 for (i=0; i < num; ++i)
3503 ++sizes[sizelist[i]];
3504 sizes[0] = 0;
3505 for (i=1; i < 16; ++i)
3506 if (sizes[i] > (1 << i))
3507 return stbi__err("bad sizes", "Corrupt PNG");
3508 code = 0;
3509 for (i=1; i < 16; ++i) {
3510 next_code[i] = code;
3511 z->firstcode[i] = (stbi__uint16) code;
3512 z->firstsymbol[i] = (stbi__uint16) k;
3513 code = (code + sizes[i]);
3514 if (sizes[i])
3515 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3516 z->maxcode[i] = code << (16-i); // preshift for inner loop
3517 code <<= 1;
3518 k += sizes[i];
3519 }
3520 z->maxcode[16] = 0x10000; // sentinel
3521 for (i=0; i < num; ++i) {
3522 int s = sizelist[i];
3523 if (s) {
3524 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3525 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3526 z->size [c] = (stbi_uc ) s;
3527 z->value[c] = (stbi__uint16) i;
3528 if (s <= STBI__ZFAST_BITS) {
3529 int j = stbi__bit_reverse(next_code[s],s);
3530 while (j < (1 << STBI__ZFAST_BITS)) {
3531 z->fast[j] = fastv;
3532 j += (1 << s);
3533 }
3534 }
3535 ++next_code[s];
3536 }
3537 }
3538 return 1;
3539 }
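// Illustration only (not part of stb_image): a minimal sketch of the
// canonical code assignment that the DEFLATE spec describes and that the
// function above builds on. It omits the fast lookup table and the
// oversubscription checks; example_assign_codes and EXAMPLE_MAX_BITS are
// hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define EXAMPLE_MAX_BITS 15

// Assign canonical Huffman codes from code lengths (0 = symbol unused).
// Returns 1 on success, 0 if any length exceeds EXAMPLE_MAX_BITS.
static int example_assign_codes(const unsigned char *lengths, int num, unsigned short *codes)
{
   int count[EXAMPLE_MAX_BITS + 1];
   int next_code[EXAMPLE_MAX_BITS + 1];
   int i, bits, code = 0;

   // step 1: count how many symbols use each code length
   memset(count, 0, sizeof(count));
   for (i = 0; i < num; ++i) {
      if (lengths[i] > EXAMPLE_MAX_BITS) return 0;
      ++count[lengths[i]];
   }
   count[0] = 0;
   next_code[0] = 0;

   // step 2: find the numerically smallest code for each length
   for (bits = 1; bits <= EXAMPLE_MAX_BITS; ++bits) {
      code = (code + count[bits - 1]) << 1;
      next_code[bits] = code;
   }

   // step 3: hand out consecutive codes to symbols in symbol order
   for (i = 0; i < num; ++i)
      codes[i] = lengths[i] ? (unsigned short) next_code[lengths[i]]++ : 0;
   return 1;
}
```

// With the lengths (3,3,3,3,3,2,4,4) this yields the codes
// 010, 011, 100, 101, 110, 00, 1110, 1111.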
3540
3541 // zlib-from-memory implementation for PNG reading
3542 // because PNG allows splitting the zlib stream arbitrarily,
3543 // and it's annoying structurally to have PNG call ZLIB call PNG,
3544 // we require PNG read all the IDATs and combine them into a single
3545 // memory buffer
3546
3547 typedef struct
3548 {
3549 stbi_uc *zbuffer, *zbuffer_end;
3550 int num_bits;
3551 stbi__uint32 code_buffer;
3552
3553 char *zout;
3554 char *zout_start;
3555 char *zout_end;
3556 int z_expandable;
3557
3558 stbi__zhuffman z_length, z_distance;
3559 } stbi__zbuf;
3560
3561 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3562 {
3563 if (z->zbuffer >= z->zbuffer_end) return 0;
3564 return *z->zbuffer++;
3565 }
3566
3567 static void stbi__fill_bits(stbi__zbuf *z)
3568 {
3569 do {
3570 STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3571 z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
3572 z->num_bits += 8;
3573 } while (z->num_bits <= 24);
3574 }
3575
3576 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3577 {
3578 unsigned int k;
3579 if (z->num_bits < n) stbi__fill_bits(z);
3580 k = z->code_buffer & ((1 << n) - 1);
3581 z->code_buffer >>= n;
3582 z->num_bits -= n;
3583 return k;
3584 }
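// Illustration only (not part of stb_image): a simplified sketch of an
// LSB-first bit reader in the spirit of stbi__fill_bits/stbi__zreceive.
// Unlike the code above it refills one byte at a time, only as needed.
// The example_bitreader type and example_bits_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
   const unsigned char *p, *end;
   unsigned int buffer;   // unread bits; the next bit to deliver is bit 0
   int count;             // number of valid bits in buffer
} example_bitreader;

static void example_bits_init(example_bitreader *r, const unsigned char *data, size_t len)
{
   r->p = data;
   r->end = data + len;
   r->buffer = 0;
   r->count = 0;
}

// Read n bits (1 <= n <= 16), least-significant bit first.
// Reading past the end of the input yields zero bits, as stbi__zget8 does.
static unsigned int example_bits_read(example_bitreader *r, int n)
{
   unsigned int k;
   assert(n >= 1 && n <= 16);
   while (r->count < n) {
      unsigned int byte = (r->p < r->end) ? *r->p++ : 0;
      r->buffer |= byte << r->count;   // new bytes go above the unread bits
      r->count += 8;
   }
   k = r->buffer & ((1u << n) - 1);
   r->buffer >>= n;
   r->count -= n;
   return k;
}
```

// Reading 1 bit and then 2 bits from the byte 0xB5 gives 1 and 2, which a
// DEFLATE decoder would interpret as "final block" and "dynamic Huffman".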
3585
3586 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3587 {
3588 int b,s,k;
3589 // not resolved by fast table, so compute it the slow way
3590 // use jpeg approach, which requires MSbits at top
3591 k = stbi__bit_reverse(a->code_buffer, 16);
3592 for (s=STBI__ZFAST_BITS+1; ; ++s)
3593 if (k < z->maxcode[s])
3594 break;
3595 if (s == 16) return -1; // invalid code!
3596 // code size is s, so:
3597 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3598 STBI_ASSERT(z->size[b] == s);
3599 a->code_buffer >>= s;
3600 a->num_bits -= s;
3601 return z->value[b];
3602 }
3603
3604 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3605 {
3606 int b,s;
3607 if (a->num_bits < 16) stbi__fill_bits(a);
3608 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3609 if (b) {
3610 s = b >> 9;
3611 a->code_buffer >>= s;
3612 a->num_bits -= s;
3613 return b & 511;
3614 }
3615 return stbi__zhuffman_decode_slowpath(a, z);
3616 }
3617
3618 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3619 {
3620 char *q;
3621 int cur, limit, old_limit;
3622 z->zout = zout;
3623 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3624 cur = (int) (z->zout - z->zout_start);
3625 limit = old_limit = (int) (z->zout_end - z->zout_start);
3626 while (cur + n > limit)
3627 limit *= 2;
3628 q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
3629 STBI_NOTUSED(old_limit);
3630 if (q == NULL) return stbi__err("outofmem", "Out of memory");
3631 z->zout_start = q;
3632 z->zout = q + cur;
3633 z->zout_end = q + limit;
3634 return 1;
3635 }
3636
3637 static int stbi__zlength_base[31] = {
3638 3,4,5,6,7,8,9,10,11,13,
3639 15,17,19,23,27,31,35,43,51,59,
3640 67,83,99,115,131,163,195,227,258,0,0 };
3641
3642 static int stbi__zlength_extra[31]=
3643 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3644
3645 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3646 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3647
3648 static int stbi__zdist_extra[32] =
3649 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3650
3651 static int stbi__parse_huffman_block(stbi__zbuf *a)
3652 {
3653 char *zout = a->zout;
3654 for(;;) {
3655 int z = stbi__zhuffman_decode(a, &a->z_length);
3656 if (z < 256) {
3657 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3658 if (zout >= a->zout_end) {
3659 if (!stbi__zexpand(a, zout, 1)) return 0;
3660 zout = a->zout;
3661 }
3662 *zout++ = (char) z;
3663 } else {
3664 stbi_uc *p;
3665 int len,dist;
3666 if (z == 256) {
3667 a->zout = zout;
3668 return 1;
3669 }
3670 z -= 257;
3671 len = stbi__zlength_base[z];
3672 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3673 z = stbi__zhuffman_decode(a, &a->z_distance);
3674 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3675 dist = stbi__zdist_base[z];
3676 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3677 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3678 if (zout + len > a->zout_end) {
3679 if (!stbi__zexpand(a, zout, len)) return 0;
3680 zout = a->zout;
3681 }
3682 p = (stbi_uc *) (zout - dist);
3683 if (dist == 1) { // run of one byte; common in images.
3684 stbi_uc v = *p;
3685 if (len) { do *zout++ = v; while (--len); }
3686 } else {
3687 if (len) { do *zout++ = *p++; while (--len); }
3688 }
3689 }
3690 }
3691 }
3692
3693 static int stbi__compute_huffman_codes(stbi__zbuf *a)
3694 {
3695 static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3696 stbi__zhuffman z_codelength;
3697 stbi_uc lencodes[286+32+137];//padding for maximum single op
3698 stbi_uc codelength_sizes[19];
3699 int i,n;
3700
3701 int hlit = stbi__zreceive(a,5) + 257;
3702 int hdist = stbi__zreceive(a,5) + 1;
3703 int hclen = stbi__zreceive(a,4) + 4;
3704
3705 memset(codelength_sizes, 0, sizeof(codelength_sizes));
3706 for (i=0; i < hclen; ++i) {
3707 int s = stbi__zreceive(a,3);
3708 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3709 }
3710 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3711
3712 n = 0;
3713 while (n < hlit + hdist) {
3714 int c = stbi__zhuffman_decode(a, &z_codelength);
3715 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3716 if (c < 16)
3717 lencodes[n++] = (stbi_uc) c;
3718 else if (c == 16) {
3719 c = stbi__zreceive(a,2)+3;
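3720 if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG"); memset(lencodes+n, lencodes[n-1], c); // code 16 repeats the previous length, so one must exist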
3721 n += c;
3722 } else if (c == 17) {
3723 c = stbi__zreceive(a,3)+3;
3724 memset(lencodes+n, 0, c);
3725 n += c;
3726 } else {
3727 STBI_ASSERT(c == 18);
3728 c = stbi__zreceive(a,7)+11;
3729 memset(lencodes+n, 0, c);
3730 n += c;
3731 }
3732 }
3733 if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3734 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3735 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3736 return 1;
3737 }
3738
3739 static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
3740 {
3741 stbi_uc header[4];
3742 int len,nlen,k;
3743 if (a->num_bits & 7)
3744 stbi__zreceive(a, a->num_bits & 7); // discard
3745 // drain the bit-packed data into header
3746 k = 0;
3747 while (a->num_bits > 0) {
3748 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3749 a->code_buffer >>= 8;
3750 a->num_bits -= 8;
3751 }
3752 STBI_ASSERT(a->num_bits == 0);
3753 // now fill header the normal way
3754 while (k < 4)
3755 header[k++] = stbi__zget8(a);
3756 len = header[1] * 256 + header[0];
3757 nlen = header[3] * 256 + header[2];
3758 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3759 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3760 if (a->zout + len > a->zout_end)
3761 if (!stbi__zexpand(a, a->zout, len)) return 0;
3762 memcpy(a->zout, a->zbuffer, len);
3763 a->zbuffer += len;
3764 a->zout += len;
3765 return 1;
3766 }
3767
3768 static int stbi__parse_zlib_header(stbi__zbuf *a)
3769 {
3770 int cmf = stbi__zget8(a);
3771 int cm = cmf & 15;
3772 /* int cinfo = cmf >> 4; */
3773 int flg = stbi__zget8(a);
3774 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3775 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3776 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3777 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3778 return 1;
3779 }
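// Illustration only (not part of stb_image): the same three header checks
// as above, written as a standalone predicate on the two header bytes.
// example_zlib_header_ok is a hypothetical name.

```c
#include <assert.h>

// Returns 1 if the two-byte zlib header (CMF, FLG) is acceptable for a PNG
// stream, 0 otherwise.
static int example_zlib_header_ok(unsigned char cmf, unsigned char flg)
{
   if ((cmf * 256 + flg) % 31 != 0) return 0; // FCHECK: 16-bit value must be a multiple of 31
   if (flg & 32) return 0;                    // FDICT: preset dictionary not allowed in PNG
   if ((cmf & 15) != 8) return 0;             // CM: compression method must be 8 (DEFLATE)
   return 1;
}
```

// The common header 0x78 0x9C passes all three checks; 0x78 0xBB passes
// the checksum but sets the preset-dictionary bit, so it is rejected.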
3780
3781 // @TODO: should statically initialize these for optimal thread safety
3782 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
3783 static void stbi__init_zdefaults(void)
3784 {
3785 int i; // use <= to match clearly with spec
3786 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3787 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3788 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3789 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3790
3791 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3792 }
3793
3794 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3795 {
3796 int final, type;
3797 if (parse_header)
3798 if (!stbi__parse_zlib_header(a)) return 0;
3799 a->num_bits = 0;
3800 a->code_buffer = 0;
3801 do {
3802 final = stbi__zreceive(a,1);
3803 type = stbi__zreceive(a,2);
3804 if (type == 0) {
3805 if (!stbi__parse_uncomperssed_block(a)) return 0;
3806 } else if (type == 3) {
3807 return 0;
3808 } else {
3809 if (type == 1) {
3810 // use fixed code lengths
3811 if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3812 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3813 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3814 } else {
3815 if (!stbi__compute_huffman_codes(a)) return 0;
3816 }
3817 if (!stbi__parse_huffman_block(a)) return 0;
3818 }
3819 } while (!final);
3820 return 1;
3821 }
3822
3823 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3824 {
3825 a->zout_start = obuf;
3826 a->zout = obuf;
3827 a->zout_end = obuf + olen;
3828 a->z_expandable = exp;
3829
3830 return stbi__parse_zlib(a, parse_header);
3831 }
3832
3833 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3834 {
3835 stbi__zbuf a;
3836 char *p = (char *) stbi__malloc(initial_size);
3837 if (p == NULL) return NULL;
3838 a.zbuffer = (stbi_uc *) buffer;
3839 a.zbuffer_end = (stbi_uc *) buffer + len;
3840 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3841 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3842 return a.zout_start;
3843 } else {
3844 STBI_FREE(a.zout_start);
3845 return NULL;
3846 }
3847 }
3848
3849 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3850 {
3851 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3852 }
3853
3854 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3855 {
3856 stbi__zbuf a;
3857 char *p = (char *) stbi__malloc(initial_size);
3858 if (p == NULL) return NULL;
3859 a.zbuffer = (stbi_uc *) buffer;
3860 a.zbuffer_end = (stbi_uc *) buffer + len;
3861 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3862 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3863 return a.zout_start;
3864 } else {
3865 STBI_FREE(a.zout_start);
3866 return NULL;
3867 }
3868 }
3869
3870 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3871 {
3872 stbi__zbuf a;
3873 a.zbuffer = (stbi_uc *) ibuffer;
3874 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3875 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3876 return (int) (a.zout - a.zout_start);
3877 else
3878 return -1;
3879 }
3880
3881 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3882 {
3883 stbi__zbuf a;
3884 char *p = (char *) stbi__malloc(16384);
3885 if (p == NULL) return NULL;
3886 a.zbuffer = (stbi_uc *) buffer;
3887 a.zbuffer_end = (stbi_uc *) buffer+len;
3888 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3889 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3890 return a.zout_start;
3891 } else {
3892 STBI_FREE(a.zout_start);
3893 return NULL;
3894 }
3895 }
3896
3897 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3898 {
3899 stbi__zbuf a;
3900 a.zbuffer = (stbi_uc *) ibuffer;
3901 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3902 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3903 return (int) (a.zout - a.zout_start);
3904 else
3905 return -1;
3906 }
3907 #endif
3908
3909 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3910 // simple implementation
3911 // - only 8-bit samples
3912 // - no CRC checking
3913 // - allocates lots of intermediate memory
3914 // - avoids problem of streaming data between subsystems
3915 // - avoids explicit window management
3916 // performance
3917 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3918
3919 #ifndef STBI_NO_PNG
3920 typedef struct
3921 {
3922 stbi__uint32 length;
3923 stbi__uint32 type;
3924 } stbi__pngchunk;
3925
3926 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3927 {
3928 stbi__pngchunk c;
3929 c.length = stbi__get32be(s);
3930 c.type = stbi__get32be(s);
3931 return c;
3932 }
3933
3934 static int stbi__check_png_header(stbi__context *s)
3935 {
3936 static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3937 int i;
3938 for (i=0; i < 8; ++i)
3939 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3940 return 1;
3941 }
3942
3943 typedef struct
3944 {
3945 stbi__context *s;
3946 stbi_uc *idata, *expanded, *out;
3947 } stbi__png;
3948
3949
3950 enum {
3951 STBI__F_none=0,
3952 STBI__F_sub=1,
3953 STBI__F_up=2,
3954 STBI__F_avg=3,
3955 STBI__F_paeth=4,
3956 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3957 STBI__F_avg_first,
3958 STBI__F_paeth_first
3959 };
3960
3961 static stbi_uc first_row_filter[5] =
3962 {
3963 STBI__F_none,
3964 STBI__F_sub,
3965 STBI__F_none,
3966 STBI__F_avg_first,
3967 STBI__F_paeth_first
3968 };
3969
3970 static int stbi__paeth(int a, int b, int c)
3971 {
3972 int p = a + b - c;
3973 int pa = abs(p-a);
3974 int pb = abs(p-b);
3975 int pc = abs(p-c);
3976 if (pa <= pb && pa <= pc) return a;
3977 if (pb <= pc) return b;
3978 return c;
3979 }
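// Illustration only (not part of stb_image): a standalone sketch of how the
// Paeth predictor is used to undo the PNG Paeth filter for a single byte.
// Here a is the byte to the left, b the byte above, and c the byte above
// and to the left. The example_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

// Pick whichever neighbour is closest to the linear estimate a + b - c,
// breaking ties in the order a, b, c.
static int example_paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p - a);
   int pb = abs(p - b);
   int pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}

// Reconstruct one byte: add the prediction to the filtered value, modulo 256.
static unsigned char example_unfilter_paeth(unsigned char raw, int a, int b, int c)
{
   return (unsigned char) ((raw + example_paeth(a, b, c)) & 255);
}
```

// With no left or upper-left neighbour (a = c = 0) the predictor returns b,
// which is why the first byte of a row effectively behaves like the "up"
// filter.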
3980
3981 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
3982
3983 // create the png data from post-deflated data
3984 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
3985 {
3986 stbi__context *s = a->s;
3987 stbi__uint32 i,j,stride = x*out_n;
3988 stbi__uint32 img_len, img_width_bytes;
3989 int k;
3990 int img_n = s->img_n; // copy it into a local for later
3991
3992 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
3993 a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
3994 if (!a->out) return stbi__err("outofmem", "Out of memory");
3995
3996 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
3997 img_len = (img_width_bytes + 1) * y;
3998 if (s->img_x == x && s->img_y == y) {
3999 if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4000 } else { // interlaced:
4001 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4002 }
4003
4004 for (j=0; j < y; ++j) {
4005 stbi_uc *cur = a->out + stride*j;
4006 stbi_uc *prior = cur - stride;
4007 int filter = *raw++;
4008 int filter_bytes = img_n;
4009 int width = x;
4010 if (filter > 4)
4011 return stbi__err("invalid filter","Corrupt PNG");
4012
4013 if (depth < 8) {
4014 STBI_ASSERT(img_width_bytes <= x);
4015 cur += x*out_n - img_width_bytes; // store output to the rightmost img_width_bytes bytes of the row, so we can decode in place
4016 filter_bytes = 1;
4017 width = img_width_bytes;
4018 }
4019
4020 // if first row, use special filter that doesn't sample previous row
4021 if (j == 0) filter = first_row_filter[filter];
4022
4023 // handle first byte explicitly
4024 for (k=0; k < filter_bytes; ++k) {
4025 switch (filter) {
4026 case STBI__F_none : cur[k] = raw[k]; break;
4027 case STBI__F_sub : cur[k] = raw[k]; break;
4028 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4029 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4030 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4031 case STBI__F_avg_first : cur[k] = raw[k]; break;
4032 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4033 }
4034 }
4035
4036 if (depth == 8) {
4037 if (img_n != out_n)
4038 cur[img_n] = 255; // first pixel
4039 raw += img_n;
4040 cur += out_n;
4041 prior += out_n;
4042 } else {
4043 raw += 1;
4044 cur += 1;
4045 prior += 1;
4046 }
4047
4048 // this is a little gross, so that we don't switch per-pixel or per-component
4049 if (depth < 8 || img_n == out_n) {
4050 int nk = (width - 1)*img_n;
4051 #define CASE(f) \
4052 case f: \
4053 for (k=0; k < nk; ++k)
4054 switch (filter) {
4055 // "none" filter turns into a memcpy here; make that explicit.
4056 case STBI__F_none: memcpy(cur, raw, nk); break;
4057 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4058 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4059 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4060 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4061 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4062 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4063 }
4064 #undef CASE
4065 raw += nk;
4066 } else {
4067 STBI_ASSERT(img_n+1 == out_n);
4068 #define CASE(f) \
4069 case f: \
4070 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
4071 for (k=0; k < img_n; ++k)
4072 switch (filter) {
4073 CASE(STBI__F_none) cur[k] = raw[k]; break;
4074 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
4075 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4076 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
4077 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
4078 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
4079 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
4080 }
4081 #undef CASE
4082 }
4083 }
4084
4085 // we make a separate pass to expand bits to pixels; for performance,
4086 // this could run two scanlines behind the above code, so it won't
4087 // interfere with filtering but will still be in the cache.
4088 if (depth < 8) {
4089 for (j=0; j < y; ++j) {
4090 stbi_uc *cur = a->out + stride*j;
4091 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4092 // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4093 // PNG guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4094 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4095
4096 // note that the final byte might overshoot and write more data than desired.
4097 // we can allocate enough data that this never writes out of memory, but it
4098 // could also overwrite the next scanline. can it overwrite non-empty data
4099 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4100 // so we need to explicitly clamp the final ones
4101
4102 if (depth == 4) {
4103 for (k=x*img_n; k >= 2; k-=2, ++in) {
4104 *cur++ = scale * ((*in >> 4) );
4105 *cur++ = scale * ((*in ) & 0x0f);
4106 }
4107 if (k > 0) *cur++ = scale * ((*in >> 4) );
4108 } else if (depth == 2) {
4109 for (k=x*img_n; k >= 4; k-=4, ++in) {
4110 *cur++ = scale * ((*in >> 6) );
4111 *cur++ = scale * ((*in >> 4) & 0x03);
4112 *cur++ = scale * ((*in >> 2) & 0x03);
4113 *cur++ = scale * ((*in ) & 0x03);
4114 }
4115 if (k > 0) *cur++ = scale * ((*in >> 6) );
4116 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4117 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4118 } else if (depth == 1) {
4119 for (k=x*img_n; k >= 8; k-=8, ++in) {
4120 *cur++ = scale * ((*in >> 7) );
4121 *cur++ = scale * ((*in >> 6) & 0x01);
4122 *cur++ = scale * ((*in >> 5) & 0x01);
4123 *cur++ = scale * ((*in >> 4) & 0x01);
4124 *cur++ = scale * ((*in >> 3) & 0x01);
4125 *cur++ = scale * ((*in >> 2) & 0x01);
4126 *cur++ = scale * ((*in >> 1) & 0x01);
4127 *cur++ = scale * ((*in ) & 0x01);
4128 }
4129 if (k > 0) *cur++ = scale * ((*in >> 7) );
4130 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4131 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4132 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4133 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4134 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4135 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4136 }
4137 if (img_n != out_n) {
4138 int q;
4139 // insert alpha = 255
4140 cur = a->out + stride*j;
4141 if (img_n == 1) {
4142 for (q=x-1; q >= 0; --q) {
4143 cur[q*2+1] = 255;
4144 cur[q*2+0] = cur[q];
4145 }
4146 } else {
4147 STBI_ASSERT(img_n == 3);
4148 for (q=x-1; q >= 0; --q) {
4149 cur[q*4+3] = 255;
4150 cur[q*4+2] = cur[q*3+2];
4151 cur[q*4+1] = cur[q*3+1];
4152 cur[q*4+0] = cur[q*3+0];
4153 }
4154 }
4155 }
4156 }
4157 }
4158
4159 return 1;
4160 }
4161
4162 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4163 {
4164 stbi_uc *final;
4165 int p;
4166 if (!interlaced)
4167 return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4168
4169 // de-interlacing
4170 final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n); if (final == NULL) return stbi__err("outofmem", "Out of memory");
4171 for (p=0; p < 7; ++p) {
4172 int xorig[] = { 0,4,0,2,0,1,0 };
4173 int yorig[] = { 0,0,4,0,2,0,1 };
4174 int xspc[] = { 8,8,4,4,2,2,1 };
4175 int yspc[] = { 8,8,8,4,4,2,2 };
4176 int i,j,x,y;
4177 // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4178 x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4179 y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4180 if (x && y) {
4181 stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4182 if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4183 STBI_FREE(final);
4184 return 0;
4185 }
4186 for (j=0; j < y; ++j) {
4187 for (i=0; i < x; ++i) {
4188 int out_y = j*yspc[p]+yorig[p];
4189 int out_x = i*xspc[p]+xorig[p];
4190 memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
4191 a->out + (j*x+i)*out_n, out_n);
4192 }
4193 }
4194 STBI_FREE(a->out);
4195 image_data += img_len;
4196 image_data_len -= img_len;
4197 }
4198 }
4199 a->out = final;
4200
4201 return 1;
4202 }
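// Illustration only (not part of stb_image): a sketch of the per-pass size
// computation used by the de-interlacing loop above, with the same origin
// and spacing tables. example_adam7_pass_size is a hypothetical name, and
// the explicit "w > xorig" guard is an addition for clarity.

```c
#include <assert.h>

// Compute the width and height of Adam7 pass p (0..6) for a w x h image.
// A pass that contains no pixels gets a zero width or height.
static void example_adam7_pass_size(unsigned int w, unsigned int h, int p,
                                    unsigned int *pw, unsigned int *ph)
{
   static const unsigned int xorig[7] = { 0,4,0,2,0,1,0 };
   static const unsigned int yorig[7] = { 0,0,4,0,2,0,1 };
   static const unsigned int xspc[7]  = { 8,8,4,4,2,2,1 };
   static const unsigned int yspc[7]  = { 8,8,8,4,4,2,2 };
   assert(p >= 0 && p < 7);
   // number of sample positions origin, origin+spacing, ... that fall inside the image
   *pw = (w > xorig[p]) ? (w - xorig[p] + xspc[p] - 1) / xspc[p] : 0;
   *ph = (h > yorig[p]) ? (h - yorig[p] + yspc[p] - 1) / yspc[p] : 0;
}
```

// For an 8x8 image the seven passes are 1x1, 1x1, 2x1, 2x2, 4x2, 4x4 and
// 8x4 pixels, which together cover all 64 pixels exactly once.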
4203
4204 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4205 {
4206 stbi__context *s = z->s;
4207 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4208 stbi_uc *p = z->out;
4209
4210 // compute color-based transparency, assuming we've
4211 // already got 255 as the alpha value in the output
4212 STBI_ASSERT(out_n == 2 || out_n == 4);
4213
4214 if (out_n == 2) {
4215 for (i=0; i < pixel_count; ++i) {
4216 p[1] = (p[0] == tc[0] ? 0 : 255);
4217 p += 2;
4218 }
4219 } else {
4220 for (i=0; i < pixel_count; ++i) {
4221 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4222 p[3] = 0;
4223 p += 4;
4224 }
4225 }
4226 return 1;
4227 }
4228
4229 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4230 {
4231 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4232 stbi_uc *p, *temp_out, *orig = a->out;
4233
4234 p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4235 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4236
4237 // between here and free(out) below, exiting would leak
4238 temp_out = p;
4239
4240 if (pal_img_n == 3) {
4241 for (i=0; i < pixel_count; ++i) {
4242 int n = orig[i]*4;
4243 p[0] = palette[n ];
4244 p[1] = palette[n+1];
4245 p[2] = palette[n+2];
4246 p += 3;
4247 }
4248 } else {
4249 for (i=0; i < pixel_count; ++i) {
4250 int n = orig[i]*4;
4251 p[0] = palette[n ];
4252 p[1] = palette[n+1];
4253 p[2] = palette[n+2];
4254 p[3] = palette[n+3];
4255 p += 4;
4256 }
4257 }
4258 STBI_FREE(a->out);
4259 a->out = temp_out;
4260
4261 STBI_NOTUSED(len);
4262
4263 return 1;
4264 }
4265
4266 static int stbi__unpremultiply_on_load = 0;
4267 static int stbi__de_iphone_flag = 0;
4268
4269 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4270 {
4271 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4272 }
4273
4274 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4275 {
4276 stbi__de_iphone_flag = flag_true_if_should_convert;
4277 }
4278
4279 static void stbi__de_iphone(stbi__png *z)
4280 {
4281 stbi__context *s = z->s;
4282 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4283 stbi_uc *p = z->out;
4284
4285 if (s->img_out_n == 3) { // convert bgr to rgb
4286 for (i=0; i < pixel_count; ++i) {
4287 stbi_uc t = p[0];
4288 p[0] = p[2];
4289 p[2] = t;
4290 p += 3;
4291 }
4292 } else {
4293 STBI_ASSERT(s->img_out_n == 4);
4294 if (stbi__unpremultiply_on_load) {
4295 // convert bgr to rgb and unpremultiply
4296 for (i=0; i < pixel_count; ++i) {
4297 stbi_uc a = p[3];
4298 stbi_uc t = p[0];
4299 if (a) {
4300 p[0] = p[2] * 255 / a;
4301 p[1] = p[1] * 255 / a;
4302 p[2] = t * 255 / a;
4303 } else {
4304 p[0] = p[2];
4305 p[2] = t;
4306 }
4307 p += 4;
4308 }
4309 } else {
4310 // convert bgr to rgb
4311 for (i=0; i < pixel_count; ++i) {
4312 stbi_uc t = p[0];
4313 p[0] = p[2];
4314 p[2] = t;
4315 p += 4;
4316 }
4317 }
4318 }
4319 }
4320
4321 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4322
4323 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4324 {
4325 stbi_uc palette[1024], pal_img_n=0;
4326 stbi_uc has_trans=0, tc[3];
4327 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4328 int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
4329 stbi__context *s = z->s;
4330
4331 z->expanded = NULL;
4332 z->idata = NULL;
4333 z->out = NULL;
4334
4335 if (!stbi__check_png_header(s)) return 0;
4336
4337 if (scan == STBI__SCAN_type) return 1;
4338
4339 for (;;) {
4340 stbi__pngchunk c = stbi__get_chunk_header(s);
4341 switch (c.type) {
4342 case STBI__PNG_TYPE('C','g','B','I'):
4343 is_iphone = 1;
4344 stbi__skip(s, c.length);
4345 break;
4346 case STBI__PNG_TYPE('I','H','D','R'): {
4347 int comp,filter;
4348 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4349 first = 0;
4350 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4351 s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4352 s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4353 depth = stbi__get8(s); if (depth != 1 && depth != 2 && depth != 4 && depth != 8) return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
4354 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4355 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4356 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4357 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4358 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4359 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4360 if (!pal_img_n) {
4361 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4362 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4363 if (scan == STBI__SCAN_header) return 1;
4364 } else {
4365 // if paletted, then pal_img_n is our final component count, and
4366 // img_n is # components to decompress/filter.
4367 s->img_n = 1;
4368 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4369 // if SCAN_header, have to scan to see if we have a tRNS
4370 }
4371 break;
4372 }
4373
4374 case STBI__PNG_TYPE('P','L','T','E'): {
4375 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4376 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4377 pal_len = c.length / 3;
4378 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4379 for (i=0; i < pal_len; ++i) {
4380 palette[i*4+0] = stbi__get8(s);
4381 palette[i*4+1] = stbi__get8(s);
4382 palette[i*4+2] = stbi__get8(s);
4383 palette[i*4+3] = 255;
4384 }
4385 break;
4386 }
4387
4388 case STBI__PNG_TYPE('t','R','N','S'): {
4389 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4390 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4391 if (pal_img_n) {
4392 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4393 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4394 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4395 pal_img_n = 4;
4396 for (i=0; i < c.length; ++i)
4397 palette[i*4+3] = stbi__get8(s);
4398 } else {
4399 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4400 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4401 has_trans = 1;
4402 for (k=0; k < s->img_n; ++k)
4403 tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
4404 }
4405 break;
4406 }
4407
4408 case STBI__PNG_TYPE('I','D','A','T'): {
4409 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4410 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4411 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4412 if ((int)(ioff + c.length) < (int)ioff) return 0;
4413 if (ioff + c.length > idata_limit) {
4414 stbi__uint32 idata_limit_old = idata_limit;
4415 stbi_uc *p;
4416 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4417 while (ioff + c.length > idata_limit)
4418 idata_limit *= 2;
4419 STBI_NOTUSED(idata_limit_old);
4420 p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4421 z->idata = p;
4422 }
4423 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4424 ioff += c.length;
4425 break;
4426 }
4427
4428 case STBI__PNG_TYPE('I','E','N','D'): {
4429 stbi__uint32 raw_len, bpl;
4430 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4431 if (scan != STBI__SCAN_load) return 1;
4432 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4433 // initial guess for decoded data size to avoid unnecessary reallocs
4434 bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
4435 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4436 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4437 if (z->expanded == NULL) return 0; // zlib should set error
4438 STBI_FREE(z->idata); z->idata = NULL;
4439 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4440 s->img_out_n = s->img_n+1;
4441 else
4442 s->img_out_n = s->img_n;
4443 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
4444 if (has_trans)
4445 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4446 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4447 stbi__de_iphone(z);
4448 if (pal_img_n) {
4449 // pal_img_n == 3 or 4
4450 s->img_n = pal_img_n; // record the actual colors we had
4451 s->img_out_n = pal_img_n;
4452 if (req_comp >= 3) s->img_out_n = req_comp;
4453 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4454 return 0;
4455 }
4456 STBI_FREE(z->expanded); z->expanded = NULL;
4457 return 1;
4458 }
4459
4460 default:
4461 // if critical, fail
4462 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4463 if ((c.type & (1 << 29)) == 0) {
4464 #ifndef STBI_NO_FAILURE_STRINGS
4465 // not threadsafe
4466 static char invalid_chunk[] = "XXXX PNG chunk not known";
4467 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4468 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4469 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4470 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4471 #endif
4472 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4473 }
4474 stbi__skip(s, c.length);
4475 break;
4476 }
4477 // end of PNG chunk, read and skip CRC
4478 stbi__get32be(s);
4479 }
4480 }
4481
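// Illustration only, not part of stb_image: the default case of the chunk switch above
// skips an unknown chunk only when bit 29 of its packed type is set. A minimal sketch
// of that test follows. It assumes the type is packed big-endian (first letter in the
// top byte), which is what the STBI__PNG_TYPE usage above suggests. The example_*
// names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: pack four chunk-type letters, first letter in the top byte.
   This mirrors how STBI__PNG_TYPE is assumed to work; it is not the macro itself. */
static uint32_t example_png_type(char a, char b, char c, char d)
{
   return ((uint32_t)(unsigned char)a << 24) | ((uint32_t)(unsigned char)b << 16) |
          ((uint32_t)(unsigned char)c << 8)  |  (uint32_t)(unsigned char)d;
}

/* Bit 29 of the packed type is bit 5 (0x20) of the first letter, i.e. the ASCII
   lowercase bit. A lowercase first letter marks an ancillary chunk, which a
   decoder may skip; an uppercase first letter marks a critical chunk. */
static int example_png_chunk_is_ancillary(uint32_t type)
{
   return (type & (1u << 29)) != 0;
}
```

// With these helpers, "tEXt" is reported as ancillary (skippable) and "IHDR" as critical.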
static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
{
   unsigned char *result=NULL;
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
      result = p->out;
      p->out = NULL;
      if (req_comp && req_comp != p->s->img_out_n) {
         result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
         p->s->img_out_n = req_comp;
         if (result == NULL) return result;
      }
      *x = p->s->img_x;
      *y = p->s->img_y;
      if (n) *n = p->s->img_out_n;
   }
   STBI_FREE(p->out);      p->out      = NULL;
   STBI_FREE(p->expanded); p->expanded = NULL;
   STBI_FREE(p->idata);    p->idata    = NULL;

   return result;
}

static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__png p;
   p.s = s;
   return stbi__do_png(&p, x,y,comp,req_comp);
}

static int stbi__png_test(stbi__context *s)
{
   int r;
   r = stbi__check_png_header(s);
   stbi__rewind(s);
   return r;
}

static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
{
   if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
      stbi__rewind( p->s );
      return 0;
   }
   if (x) *x = p->s->img_x;
   if (y) *y = p->s->img_y;
   if (comp) *comp = p->s->img_n;
   return 1;
}

static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__png p;
   p.s = s;
   return stbi__png_info_raw(&p, x, y, comp);
}
4538 #endif
4539
4540 // Microsoft/Windows BMP image
4541
4542 #ifndef STBI_NO_BMP
static int stbi__bmp_test_raw(stbi__context *s)
{
   int r;
   int sz;
   if (stbi__get8(s) != 'B') return 0;
   if (stbi__get8(s) != 'M') return 0;
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   stbi__get32le(s); // discard data offset
   sz = stbi__get32le(s);
   r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
   return r;
}

static int stbi__bmp_test(stbi__context *s)
{
   int r = stbi__bmp_test_raw(s);
   stbi__rewind(s);
   return r;
}
4564
4565
// returns 0..31 for the highest set bit, or -1 if z is 0
static int stbi__high_bit(unsigned int z)
{
   int n=0;
   if (z == 0) return -1;
   if (z >= 0x10000) n += 16, z >>= 16;
   if (z >= 0x00100) n +=  8, z >>=  8;
   if (z >= 0x00010) n +=  4, z >>=  4;
   if (z >= 0x00004) n +=  2, z >>=  2;
   if (z >= 0x00002) n +=  1, z >>=  1;
   return n;
}
4578
static int stbi__bitcount(unsigned int a)
{
   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   a = (a + (a >> 8)); // max 16 per 8 bits
   a = (a + (a >> 16)); // max 32 per 8 bits
   return a & 0xff;
}

static int stbi__shiftsigned(int v, int shift, int bits)
{
   int result;
   int z=0;

   if (shift < 0) v <<= -shift;
   else v >>= shift;
   result = v;

   z = bits;
   while (z < 8) {
      result += v >> z;
      z += bits;
   }
   return result;
}
4605
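// Illustration only, not part of stb_image: a sketch of how the three helpers above
// combine to widen one masked channel of a BMP pixel to 8 bits. The high bit of the
// mask is moved to bit 7, then the value is replicated into the low bits. The
// example_* functions are hypothetical re-statements written with plain loops so the
// sketch compiles alone. They are meant to return the same values as the originals
// for the masks shown.

```c
/* Index of the highest set bit, or -1 for zero (loop version of the helper above). */
static int example_high_bit(unsigned int z)
{
   int n = -1;
   while (z) { ++n; z >>= 1; }
   return n;
}

/* Number of set bits (loop version of the helper above). */
static int example_bitcount(unsigned int a)
{
   int n = 0;
   while (a) { n += (int)(a & 1u); a >>= 1; }
   return n;
}

/* Same logic as the shift-and-replicate helper above. */
static int example_shiftsigned(int v, int shift, int bits)
{
   int result, z;
   if (shift < 0) v <<= -shift;
   else           v >>= shift;
   result = v;
   for (z = bits; z < 8; z += bits)
      result += v >> z;   /* copy the top bits into the empty low bits */
   return result;
}

/* Extract the channel selected by mask from pixel v and widen it to 0..255. */
static int example_expand_channel(unsigned int v, unsigned int mask)
{
   int shift = example_high_bit(mask) - 7;   /* right shift that puts the high bit at bit 7 */
   int bits  = example_bitcount(mask);       /* channel width in bits */
   return example_shiftsigned((int)(v & mask), shift, bits);
}
```

// For a 5-6-5 pixel (masks 0xF800, 0x07E0, 0x001F), a fully saturated channel expands
// to 255, and the 5-bit red value 16 expands to 132.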
4606 typedef struct
4607 {
4608 int bpp, offset, hsz;
4609 unsigned int mr,mg,mb,ma, all_a;
4610 } stbi__bmp_data;
4611
4612 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
4613 {
4614 int hsz;
4615 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4616 stbi__get32le(s); // discard filesize
4617 stbi__get16le(s); // discard reserved
4618 stbi__get16le(s); // discard reserved
4619 info->offset = stbi__get32le(s);
4620 info->hsz = hsz = stbi__get32le(s);
4621
4622 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4623 if (hsz == 12) {
4624 s->img_x = stbi__get16le(s);
4625 s->img_y = stbi__get16le(s);
4626 } else {
4627 s->img_x = stbi__get32le(s);
4628 s->img_y = stbi__get32le(s);
4629 }
4630 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4631 info->bpp = stbi__get16le(s);
4632 if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4633 if (hsz != 12) {
4634 int compress = stbi__get32le(s);
4635 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4636 stbi__get32le(s); // discard sizeof
4637 stbi__get32le(s); // discard hres
4638 stbi__get32le(s); // discard vres
4639 stbi__get32le(s); // discard colorsused
4640 stbi__get32le(s); // discard max important
4641 if (hsz == 40 || hsz == 56) {
4642 if (hsz == 56) {
4643 stbi__get32le(s);
4644 stbi__get32le(s);
4645 stbi__get32le(s);
4646 stbi__get32le(s);
4647 }
4648 if (info->bpp == 16 || info->bpp == 32) {
4649 info->mr = info->mg = info->mb = 0;
4650 if (compress == 0) {
4651 if (info->bpp == 32) {
4652 info->mr = 0xffu << 16;
4653 info->mg = 0xffu << 8;
4654 info->mb = 0xffu << 0;
4655 info->ma = 0xffu << 24;
4656 info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
4657 } else {
4658 info->mr = 31u << 10;
4659 info->mg = 31u << 5;
4660 info->mb = 31u << 0;
4661 }
4662 } else if (compress == 3) {
4663 info->mr = stbi__get32le(s);
4664 info->mg = stbi__get32le(s);
4665 info->mb = stbi__get32le(s);
4666 // not documented, but generated by photoshop and handled by mspaint
4667 if (info->mr == info->mg && info->mg == info->mb) {
4668 // ?!?!?
4669 return stbi__errpuc("bad BMP", "bad BMP");
4670 }
4671 } else
4672 return stbi__errpuc("bad BMP", "bad BMP");
4673 }
4674 } else {
4675 int i;
4676 if (hsz != 108 && hsz != 124)
4677 return stbi__errpuc("bad BMP", "bad BMP");
4678 info->mr = stbi__get32le(s);
4679 info->mg = stbi__get32le(s);
4680 info->mb = stbi__get32le(s);
4681 info->ma = stbi__get32le(s);
4682 stbi__get32le(s); // discard color space
4683 for (i=0; i < 12; ++i)
4684 stbi__get32le(s); // discard color space parameters
4685 if (hsz == 124) {
4686 stbi__get32le(s); // discard rendering intent
4687 stbi__get32le(s); // discard offset of profile data
4688 stbi__get32le(s); // discard size of profile data
4689 stbi__get32le(s); // discard reserved
4690 }
4691 }
4692 }
4693 return (void *) 1;
4694 }
4695
4696
4697 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4698 {
4699 stbi_uc *out;
4700 unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
4701 stbi_uc pal[256][4];
4702 int psize=0,i,j,width;
4703 int flip_vertically, pad, target;
4704 stbi__bmp_data info;
4705
4706 info.all_a = 255;
4707 if (stbi__bmp_parse_header(s, &info) == NULL)
4708 return NULL; // error code already set
4709
4710 flip_vertically = ((int) s->img_y) > 0;
4711 s->img_y = abs((int) s->img_y);
4712
4713 mr = info.mr;
4714 mg = info.mg;
4715 mb = info.mb;
4716 ma = info.ma;
4717 all_a = info.all_a;
4718
4719 if (info.hsz == 12) {
4720 if (info.bpp < 24)
4721 psize = (info.offset - 14 - 24) / 3;
4722 } else {
4723 if (info.bpp < 16)
4724 psize = (info.offset - 14 - info.hsz) >> 2;
4725 }
4726
4727 s->img_n = ma ? 4 : 3;
4728 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4729 target = req_comp;
4730 else
4731 target = s->img_n; // if they want monochrome, we'll post-convert
4732
4733 out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4734 if (!out) return stbi__errpuc("outofmem", "Out of memory");
4735 if (info.bpp < 16) {
4736 int z=0;
4737 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4738 for (i=0; i < psize; ++i) {
4739 pal[i][2] = stbi__get8(s);
4740 pal[i][1] = stbi__get8(s);
4741 pal[i][0] = stbi__get8(s);
4742 if (info.hsz != 12) stbi__get8(s);
4743 pal[i][3] = 255;
4744 }
4745 stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
4746 if (info.bpp == 4) width = (s->img_x + 1) >> 1;
4747 else if (info.bpp == 8) width = s->img_x;
4748 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4749 pad = (-width)&3;
4750 for (j=0; j < (int) s->img_y; ++j) {
4751 for (i=0; i < (int) s->img_x; i += 2) {
4752 int v=stbi__get8(s),v2=0;
4753 if (info.bpp == 4) {
4754 v2 = v & 15;
4755 v >>= 4;
4756 }
4757 out[z++] = pal[v][0];
4758 out[z++] = pal[v][1];
4759 out[z++] = pal[v][2];
4760 if (target == 4) out[z++] = 255;
4761 if (i+1 == (int) s->img_x) break;
4762 v = (info.bpp == 8) ? stbi__get8(s) : v2;
4763 out[z++] = pal[v][0];
4764 out[z++] = pal[v][1];
4765 out[z++] = pal[v][2];
4766 if (target == 4) out[z++] = 255;
4767 }
4768 stbi__skip(s, pad);
4769 }
4770 } else {
4771 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4772 int z = 0;
4773 int easy=0;
4774 stbi__skip(s, info.offset - 14 - info.hsz);
4775 if (info.bpp == 24) width = 3 * s->img_x;
4776 else if (info.bpp == 16) width = 2*s->img_x;
4777 else /* bpp = 32 and pad = 0 */ width=0;
4778 pad = (-width) & 3;
4779 if (info.bpp == 24) {
4780 easy = 1;
4781 } else if (info.bpp == 32) {
4782 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4783 easy = 2;
4784 }
4785 if (!easy) {
4786 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4787 // right shift amt to put high bit in position #7
4788 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4789 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4790 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4791 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4792 }
4793 for (j=0; j < (int) s->img_y; ++j) {
4794 if (easy) {
4795 for (i=0; i < (int) s->img_x; ++i) {
4796 unsigned char a;
4797 out[z+2] = stbi__get8(s);
4798 out[z+1] = stbi__get8(s);
4799 out[z+0] = stbi__get8(s);
4800 z += 3;
4801 a = (easy == 2 ? stbi__get8(s) : 255);
4802 all_a |= a;
4803 if (target == 4) out[z++] = a;
4804 }
4805 } else {
4806 int bpp = info.bpp;
4807 for (i=0; i < (int) s->img_x; ++i) {
4808 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4809 int a;
4810 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4811 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4812 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4813 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4814 all_a |= a;
4815 if (target == 4) out[z++] = STBI__BYTECAST(a);
4816 }
4817 }
4818 stbi__skip(s, pad);
4819 }
4820 }
4821
4822 // if alpha channel is all 0s, replace with all 255s
4823 if (target == 4 && all_a == 0)
4824 for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4825 out[i] = 255;
4826
4827 if (flip_vertically) {
4828 stbi_uc t;
4829 for (j=0; j < (int) s->img_y>>1; ++j) {
4830 stbi_uc *p1 = out + j *s->img_x*target;
4831 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4832 for (i=0; i < (int) s->img_x*target; ++i) {
4833 t = p1[i], p1[i] = p2[i], p2[i] = t;
4834 }
4835 }
4836 }
4837
4838 if (req_comp && req_comp != target) {
4839 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4840 if (out == NULL) return out; // stbi__convert_format frees input on failure
4841 }
4842
4843 *x = s->img_x;
4844 *y = s->img_y;
4845 if (comp) *comp = s->img_n;
4846 return out;
4847 }
4848 #endif
4849
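// Illustration only, not part of stb_image: the BMP loader above computes row padding
// as (-width) & 3, where width is the number of data bytes in a row. A minimal sketch
// of that arithmetic follows. BMP rows are stored padded to a multiple of 4 bytes.
// The function name is hypothetical.

```c
/* Bytes of padding that round row_bytes up to the next multiple of 4.
   Negating and masking with 3 yields (4 - row_bytes % 4) % 4 without a branch. */
static int example_bmp_row_pad(int row_bytes)
{
   return (-row_bytes) & 3;
}
```

// Example: a 24-bit image 5 pixels wide has 15 data bytes per row, so each row is
// followed by 1 padding byte. A row of 16 data bytes needs none.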
4850 // Targa Truevision - TGA
4851 // by Jonathan Dummer
4852 #ifndef STBI_NO_TGA
4853 // returns STBI_rgb or whatever, 0 on error
static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
{
   // only RGB or RGBA (incl. 16bit) or grey allowed
   if(is_rgb16) *is_rgb16 = 0;
   switch(bits_per_pixel) {
      case 8:  return STBI_grey;
      case 16: if(is_grey) return STBI_grey_alpha;
            // else: fall-through
      case 15: if(is_rgb16) *is_rgb16 = 1;
            return STBI_rgb;
      case 24: // fall-through
      case 32: return bits_per_pixel/8;
      default: return 0;
   }
}
4869
4870 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4871 {
4872 int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
4873 int sz, tga_colormap_type;
4874 stbi__get8(s); // discard Offset
4875 tga_colormap_type = stbi__get8(s); // colormap type
4876 if( tga_colormap_type > 1 ) {
4877 stbi__rewind(s);
4878 return 0; // only RGB or indexed allowed
4879 }
4880 tga_image_type = stbi__get8(s); // image type
4881 if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
4882 if (tga_image_type != 1 && tga_image_type != 9) {
4883 stbi__rewind(s);
4884 return 0;
4885 }
4886 stbi__skip(s,4); // skip index of first colormap entry and number of entries
4887 sz = stbi__get8(s); // check bits per palette color entry
4888 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
4889 stbi__rewind(s);
4890 return 0;
4891 }
4892 stbi__skip(s,4); // skip image x and y origin
4893 tga_colormap_bpp = sz;
4894 } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
4895 if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
4896 stbi__rewind(s);
4897 return 0; // only RGB or grey allowed, +/- RLE
4898 }
4899 stbi__skip(s,9); // skip colormap specification and image x/y origin
4900 tga_colormap_bpp = 0;
4901 }
4902 tga_w = stbi__get16le(s);
4903 if( tga_w < 1 ) {
4904 stbi__rewind(s);
4905 return 0; // test width
4906 }
4907 tga_h = stbi__get16le(s);
4908 if( tga_h < 1 ) {
4909 stbi__rewind(s);
4910 return 0; // test height
4911 }
4912 tga_bits_per_pixel = stbi__get8(s); // bits per pixel
4913 stbi__get8(s); // ignore alpha bits
4914 if (tga_colormap_bpp != 0) {
4915 if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
4916 // when using a colormap, tga_bits_per_pixel is the size of the indexes
4917 // I don't think anything but 8 or 16bit indexes makes sense
4918 stbi__rewind(s);
4919 return 0;
4920 }
4921 tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
4922 } else {
4923 tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
4924 }
4925 if(!tga_comp) {
4926 stbi__rewind(s);
4927 return 0;
4928 }
4929 if (x) *x = tga_w;
4930 if (y) *y = tga_h;
4931 if (comp) *comp = tga_comp;
4932 return 1; // seems to have passed everything
4933 }
4934
4935 static int stbi__tga_test(stbi__context *s)
4936 {
4937 int res = 0;
4938 int sz, tga_color_type;
4939 stbi__get8(s); // discard Offset
4940 tga_color_type = stbi__get8(s); // color type
4941 if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed
4942 sz = stbi__get8(s); // image type
4943 if ( tga_color_type == 1 ) { // colormapped (paletted) image
4944 if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
4945 stbi__skip(s,4); // skip index of first colormap entry and number of entries
4946 sz = stbi__get8(s); // check bits per palette color entry
4947 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
4948 stbi__skip(s,4); // skip image x and y origin
4949 } else { // "normal" image w/o colormap
4950 if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
4951 stbi__skip(s,9); // skip colormap specification and image x/y origin
4952 }
4953 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width
4954 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height
4955 sz = stbi__get8(s); // bits per pixel
4956 if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
4957 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
4958
4959 res = 1; // if we got this far, everything's good and we can return 1 instead of 0
4960
4961 errorEnd:
4962 stbi__rewind(s);
4963 return res;
4964 }
4965
// read 16bit value and convert to 24bit RGB
// (declared static so the helper does not get external linkage from a header)
static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
{
   stbi__uint16 px = stbi__get16le(s);
   stbi__uint16 fiveBitMask = 31;
   // we have 3 channels with 5bits each
   int r = (px >> 10) & fiveBitMask;
   int g = (px >> 5) & fiveBitMask;
   int b = px & fiveBitMask;
   // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
   out[0] = (r * 255)/31;
   out[1] = (g * 255)/31;
   out[2] = (b * 255)/31;

   // some people claim that the most significant bit might be used for alpha
   // (possibly if an alpha-bit is set in the "image descriptor byte")
   // but that only made 16bit test images completely translucent..
   // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
}
4985
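// Illustration only, not part of stb_image: a sketch of the 5-bit to 8-bit scaling
// used by the 16-bit TGA reader above, applied to a pixel already held in memory
// instead of one read from a stbi__context. The function name is hypothetical. The
// top bit of the pixel is ignored, as in the reader above.

```c
#include <stdint.h>

/* Split a 15/16-bit pixel into three 5-bit channels and scale each from
   0..31 to 0..255 with (c * 255) / 31, using integer division. */
static void example_rgb555_to_rgb888(uint16_t px, unsigned char out[3])
{
   unsigned int r = (px >> 10) & 31u;
   unsigned int g = (px >>  5) & 31u;
   unsigned int b =  px        & 31u;
   out[0] = (unsigned char)((r * 255u) / 31u);
   out[1] = (unsigned char)((g * 255u) / 31u);
   out[2] = (unsigned char)((b * 255u) / 31u);
}
```

// Note that this scaling truncates (a 5-bit value of 16 becomes 131), whereas the bit
// replication used by the BMP path gives 132 for the same input.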
4986 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4987 {
4988 // read in the TGA header stuff
4989 int tga_offset = stbi__get8(s);
4990 int tga_indexed = stbi__get8(s);
4991 int tga_image_type = stbi__get8(s);
4992 int tga_is_RLE = 0;
4993 int tga_palette_start = stbi__get16le(s);
4994 int tga_palette_len = stbi__get16le(s);
4995 int tga_palette_bits = stbi__get8(s);
4996 int tga_x_origin = stbi__get16le(s);
4997 int tga_y_origin = stbi__get16le(s);
4998 int tga_width = stbi__get16le(s);
4999 int tga_height = stbi__get16le(s);
5000 int tga_bits_per_pixel = stbi__get8(s);
5001 int tga_comp, tga_rgb16=0;
5002 int tga_inverted = stbi__get8(s);
5003 // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5004 // image data
5005 unsigned char *tga_data;
5006 unsigned char *tga_palette = NULL;
5007 int i, j;
5008 unsigned char raw_data[4];
5009 int RLE_count = 0;
5010 int RLE_repeating = 0;
5011 int read_next_pixel = 1;
5012
5013 // do a tiny bit of processing
5014 if ( tga_image_type >= 8 )
5015 {
5016 tga_image_type -= 8;
5017 tga_is_RLE = 1;
5018 }
5019 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5020
5021 // If I'm paletted, then I'll use the number of bits from the palette
5022 if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5023 else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5024
5025 if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5026 return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5027
5028 // tga info
5029 *x = tga_width;
5030 *y = tga_height;
5031 if (comp) *comp = tga_comp;
5032
5033 tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
5034 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5035
5036 // skip to the data's starting position (offset usually = 0)
5037 stbi__skip(s, tga_offset );
5038
5039 if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5040 for (i=0; i < tga_height; ++i) {
5041 int row = tga_inverted ? tga_height -i - 1 : i;
5042 stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5043 stbi__getn(s, tga_row, tga_width * tga_comp);
5044 }
5045 } else {
5046 // do I need to load a palette?
5047 if ( tga_indexed)
5048 {
5049 // any data to skip? (offset usually = 0)
5050 stbi__skip(s, tga_palette_start );
5051 // load the palette
5052 tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
5053 if (!tga_palette) {
5054 STBI_FREE(tga_data);
5055 return stbi__errpuc("outofmem", "Out of memory");
5056 }
5057 if (tga_rgb16) {
5058 stbi_uc *pal_entry = tga_palette;
5059 STBI_ASSERT(tga_comp == STBI_rgb);
5060 for (i=0; i < tga_palette_len; ++i) {
5061 stbi__tga_read_rgb16(s, pal_entry);
5062 pal_entry += tga_comp;
5063 }
5064 } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5065 STBI_FREE(tga_data);
5066 STBI_FREE(tga_palette);
5067 return stbi__errpuc("bad palette", "Corrupt TGA");
5068 }
5069 }
5070 // load the data
5071 for (i=0; i < tga_width * tga_height; ++i)
5072 {
5073 // if I'm in RLE mode, do I need to get a RLE chunk?
5074 if ( tga_is_RLE )
5075 {
5076 if ( RLE_count == 0 )
5077 {
5078 // yep, get the next byte as a RLE command
5079 int RLE_cmd = stbi__get8(s);
5080 RLE_count = 1 + (RLE_cmd & 127);
5081 RLE_repeating = RLE_cmd >> 7;
5082 read_next_pixel = 1;
5083 } else if ( !RLE_repeating )
5084 {
5085 read_next_pixel = 1;
5086 }
5087 } else
5088 {
5089 read_next_pixel = 1;
5090 }
5091 // OK, if I need to read a pixel, do it now
5092 if ( read_next_pixel )
5093 {
5094 // load however much data we did have
5095 if ( tga_indexed )
5096 {
5097 // read in index, then perform the lookup
5098 int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5099 if ( pal_idx >= tga_palette_len ) {
5100 // invalid index
5101 pal_idx = 0;
5102 }
5103 pal_idx *= tga_comp;
5104 for (j = 0; j < tga_comp; ++j) {
5105 raw_data[j] = tga_palette[pal_idx+j];
5106 }
5107 } else if(tga_rgb16) {
5108 STBI_ASSERT(tga_comp == STBI_rgb);
5109 stbi__tga_read_rgb16(s, raw_data);
5110 } else {
5111 // read in the data raw
5112 for (j = 0; j < tga_comp; ++j) {
5113 raw_data[j] = stbi__get8(s);
5114 }
5115 }
5116 // clear the reading flag for the next pixel
5117 read_next_pixel = 0;
5118 } // end of reading a pixel
5119
5120 // copy data
5121 for (j = 0; j < tga_comp; ++j)
5122 tga_data[i*tga_comp+j] = raw_data[j];
5123
5124 // in case we're in RLE mode, keep counting down
5125 --RLE_count;
5126 }
5127 // do I need to invert the image?
5128 if ( tga_inverted )
5129 {
5130 for (j = 0; j*2 < tga_height; ++j)
5131 {
5132 int index1 = j * tga_width * tga_comp;
5133 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5134 for (i = tga_width * tga_comp; i > 0; --i)
5135 {
5136 unsigned char temp = tga_data[index1];
5137 tga_data[index1] = tga_data[index2];
5138 tga_data[index2] = temp;
5139 ++index1;
5140 ++index2;
5141 }
5142 }
5143 }
5144 // clear my palette, if I had one
5145 if ( tga_palette != NULL )
5146 {
5147 STBI_FREE( tga_palette );
5148 }
5149 }
5150
5151 // swap RGB - if the source data was RGB16, it already is in the right order
5152 if (tga_comp >= 3 && !tga_rgb16)
5153 {
5154 unsigned char* tga_pixel = tga_data;
5155 for (i=0; i < tga_width * tga_height; ++i)
5156 {
5157 unsigned char temp = tga_pixel[0];
5158 tga_pixel[0] = tga_pixel[2];
5159 tga_pixel[2] = temp;
5160 tga_pixel += tga_comp;
5161 }
5162 }
5163
5164 // convert to target component count
5165 if (req_comp && req_comp != tga_comp)
5166 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5167
5168 // the things I do to get rid of an error message, and yet keep
5169 // Microsoft's C compilers happy... [8^(
5170 tga_palette_start = tga_palette_len = tga_palette_bits =
5171 tga_x_origin = tga_y_origin = 0;
5172 // OK, done
5173 return tga_data;
5174 }
5175 #endif
5176
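// Illustration only, not part of stb_image: a sketch of how the TGA loader above
// interprets an RLE command byte. The low 7 bits hold the pixel count minus one. The
// top bit selects a run packet (one pixel repeated) or a raw packet (count distinct
// pixels). The type and function names are hypothetical.

```c
typedef struct
{
   int count;       /* number of pixels covered by this packet, 1..128 */
   int repeating;   /* 1 = run packet (one pixel repeated), 0 = raw packet */
} example_tga_rle_packet;

/* Decode a single RLE command byte into its count and packet kind. */
static example_tga_rle_packet example_tga_rle_decode(unsigned char cmd)
{
   example_tga_rle_packet p;
   p.count     = 1 + (cmd & 127);
   p.repeating = cmd >> 7;
   return p;
}
```

// Example: command byte 0x83 describes one pixel repeated 4 times. Command byte 0x02
// describes 3 pixels stored literally.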
5177 // *************************************************************************************************
5178 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5179
5180 #ifndef STBI_NO_PSD
static int stbi__psd_test(stbi__context *s)
{
   int r = (stbi__get32be(s) == 0x38425053);
   stbi__rewind(s);
   return r;
}
5187
5188 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5189 {
5190 int pixelCount;
5191 int channelCount, compression;
5192 int channel, i, count, len;
5193 int bitdepth;
5194 int w,h;
5195 stbi_uc *out;
5196
5197 // Check identifier
5198 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5199 return stbi__errpuc("not PSD", "Corrupt PSD image");
5200
5201 // Check file type version.
5202 if (stbi__get16be(s) != 1)
5203 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5204
5205 // Skip 6 reserved bytes.
5206 stbi__skip(s, 6 );
5207
5208 // Read the number of channels (R, G, B, A, etc).
5209 channelCount = stbi__get16be(s);
5210 if (channelCount < 0 || channelCount > 16)
5211 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5212
5213 // Read the rows and columns of the image.
5214 h = stbi__get32be(s);
5215 w = stbi__get32be(s);
5216
5217 // Make sure the depth is 8 or 16 bits.
5218 bitdepth = stbi__get16be(s);
5219 if (bitdepth != 8 && bitdepth != 16)
5220 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5221
5222 // Make sure the color mode is RGB.
5223 // Valid options are:
5224 // 0: Bitmap
5225 // 1: Grayscale
5226 // 2: Indexed color
5227 // 3: RGB color
5228 // 4: CMYK color
5229 // 7: Multichannel
5230 // 8: Duotone
5231 // 9: Lab color
5232 if (stbi__get16be(s) != 3)
5233 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5234
5235 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5236 stbi__skip(s,stbi__get32be(s) );
5237
5238 // Skip the image resources. (resolution, pen tool paths, etc)
5239 stbi__skip(s, stbi__get32be(s) );
5240
5241 // Skip the reserved data.
5242 stbi__skip(s, stbi__get32be(s) );
5243
5244 // Find out if the data is compressed.
5245 // Known values:
5246 // 0: no compression
5247 // 1: RLE compressed
5248 compression = stbi__get16be(s);
5249 if (compression > 1)
5250 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5251
5252 // Create the destination image.
5253 out = (stbi_uc *) stbi__malloc(4 * w*h);
5254 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5255 pixelCount = w*h;
5256
5257 // Initialize the data to zero.
5258 //memset( out, 0, pixelCount * 4 );
5259
5260 // Finally, the image data.
5261 if (compression) {
5262 // RLE as used by .PSD and .TIFF
5263 // Loop until you get the number of unpacked bytes you are expecting:
5264 // Read the next source byte into n.
5265 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5266 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5267 // Else if n is -128 (byte value 128), noop.
5268 // Endloop
5269
5270 // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
5271 // which we're going to just skip.
5272 stbi__skip(s, h * channelCount * 2 );
5273
5274 // Read the RLE data by channel.
5275 for (channel = 0; channel < 4; channel++) {
5276 stbi_uc *p;
5277
5278 p = out+channel;
5279 if (channel >= channelCount) {
5280 // Fill this channel with default data.
5281 for (i = 0; i < pixelCount; i++, p += 4)
5282 *p = (channel == 3 ? 255 : 0);
5283 } else {
5284 // Read the RLE data.
5285 count = 0;
5286 while (count < pixelCount) {
5287 len = stbi__get8(s);
5288 if (len == 128) {
5289 // No-op.
5290 } else if (len < 128) {
5291 // Copy next len+1 bytes literally.
5292 len++;
5293 count += len;
5294 while (len) {
5295 *p = stbi__get8(s);
5296 p += 4;
5297 len--;
5298 }
5299 } else if (len > 128) {
5300 stbi_uc val;
5301 // Next -len+1 bytes in the dest are replicated from next source byte.
5302 // (Interpret len as a negative 8-bit int.)
5303 len ^= 0x0FF;
5304 len += 2;
5305 val = stbi__get8(s);
5306 count += len;
5307 while (len) {
5308 *p = val;
5309 p += 4;
5310 len--;
5311 }
5312 }
5313 }
5314 }
5315 }
5316
5317 } else {
5318 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5319 // where each channel consists of an 8-bit value for each pixel in the image.
5320
5321 // Read the data by channel.
5322 for (channel = 0; channel < 4; channel++) {
5323 stbi_uc *p;
5324
5325 p = out + channel;
5326 if (channel >= channelCount) {
5327 // Fill this channel with default data.
5328 stbi_uc val = channel == 3 ? 255 : 0;
5329 for (i = 0; i < pixelCount; i++, p += 4)
5330 *p = val;
5331 } else {
5332 // Read the data.
5333 if (bitdepth == 16) {
5334 for (i = 0; i < pixelCount; i++, p += 4)
5335 *p = (stbi_uc) (stbi__get16be(s) >> 8);
5336 } else {
5337 for (i = 0; i < pixelCount; i++, p += 4)
5338 *p = stbi__get8(s);
5339 }
5340 }
5341 }
5342 }
5343
5344 if (req_comp && req_comp != 4) {
5345 out = stbi__convert_format(out, 4, req_comp, w, h);
5346 if (out == NULL) return out; // stbi__convert_format frees input on failure
5347 }
5348
5349 if (comp) *comp = 4;
5350 *y = h;
5351 *x = w;
5352
5353 return out;
5354 }
5355 #endif
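
// Illustrative sketch, not part of stb_image: the PackBits-style RLE scheme
// described in the PSD loader comments above, decoding from a memory buffer
// into a contiguous output buffer. The name example_packbits_decode and its
// signature are hypothetical. Unlike the loader, which writes one interleaved
// channel and reads through stbi__get8, this sketch bounds-checks both buffers.
#include <stddef.h>

// Returns the number of bytes written (dst_len on success), or (size_t)-1
// if the input is truncated or a run would overflow the output.
static size_t example_packbits_decode(const unsigned char *src, size_t src_len,
                                      unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;
   while (out < dst_len) {
      unsigned int n, run;
      if (in >= src_len) return (size_t) -1;
      n = src[in++];
      if (n == 128) continue;                 // no-op control byte
      if (n < 128) {                          // literal: copy the next n+1 bytes
         run = n + 1;
         if (run > src_len - in || run > dst_len - out) return (size_t) -1;
         while (run--) dst[out++] = src[in++];
      } else {                                // run: repeat the next byte 257-n times
         unsigned char v;
         run = 257 - n;                       // same as (n ^ 0xFF) + 2 in the loader
         if (in >= src_len || run > dst_len - out) return (size_t) -1;
         v = src[in++];
         while (run--) dst[out++] = v;
      }
   }
   return out;
}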
5356
5357 // *************************************************************************************************
5358 // Softimage PIC loader
5359 // by Tom Seddon
5360 //
5361 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5362 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5363
5364 #ifndef STBI_NO_PIC
static int stbi__pic_is4(stbi__context *s,const char *str)
5366 {
5367 int i;
5368 for (i=0; i<4; ++i)
5369 if (stbi__get8(s) != (stbi_uc)str[i])
5370 return 0;
5371
5372 return 1;
5373 }
5374
static int stbi__pic_test_core(stbi__context *s)
5376 {
5377 int i;
5378
5379 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5380 return 0;
5381
5382 for(i=0;i<84;++i)
5383 stbi__get8(s);
5384
5385 if (!stbi__pic_is4(s,"PICT"))
5386 return 0;
5387
5388 return 1;
5389 }
5390
5391 typedef struct
5392 {
5393 stbi_uc size,type,channel;
5394 } stbi__pic_packet;
5395
static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5397 {
5398 int mask=0x80, i;
5399
5400 for (i=0; i<4; ++i, mask>>=1) {
5401 if (channel & mask) {
5402 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5403 dest[i]=stbi__get8(s);
5404 }
5405 }
5406
5407 return dest;
5408 }
5409
static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5411 {
5412 int mask=0x80,i;
5413
5414 for (i=0;i<4; ++i, mask>>=1)
5415 if (channel&mask)
5416 dest[i]=src[i];
5417 }
5418
static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5420 {
5421 int act_comp=0,num_packets=0,y,chained;
5422 stbi__pic_packet packets[10];
5423
5424 // this will (should...) cater for even some bizarre stuff like having data
5425 // for the same channel in multiple packets.
5426 do {
5427 stbi__pic_packet *packet;
5428
5429 if (num_packets==sizeof(packets)/sizeof(packets[0]))
5430 return stbi__errpuc("bad format","too many packets");
5431
5432 packet = &packets[num_packets++];
5433
5434 chained = stbi__get8(s);
5435 packet->size = stbi__get8(s);
5436 packet->type = stbi__get8(s);
5437 packet->channel = stbi__get8(s);
5438
5439 act_comp |= packet->channel;
5440
5441 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5442 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5443 } while (chained);
5444
5445 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5446
5447 for(y=0; y<height; ++y) {
5448 int packet_idx;
5449
5450 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5451 stbi__pic_packet *packet = &packets[packet_idx];
5452 stbi_uc *dest = result+y*width*4;
5453
5454 switch (packet->type) {
5455 default:
5456 return stbi__errpuc("bad format","packet has bad compression type");
5457
5458 case 0: {//uncompressed
5459 int x;
5460
5461 for(x=0;x<width;++x, dest+=4)
5462 if (!stbi__readval(s,packet->channel,dest))
5463 return 0;
5464 break;
5465 }
5466
5467 case 1://Pure RLE
5468 {
5469 int left=width, i;
5470
5471 while (left>0) {
5472 stbi_uc count,value[4];
5473
5474 count=stbi__get8(s);
5475 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5476
5477 if (count > left)
5478 count = (stbi_uc) left;
5479
5480 if (!stbi__readval(s,packet->channel,value)) return 0;
5481
5482 for(i=0; i<count; ++i,dest+=4)
5483 stbi__copyval(packet->channel,dest,value);
5484 left -= count;
5485 }
5486 }
5487 break;
5488
5489 case 2: {//Mixed RLE
5490 int left=width;
5491 while (left>0) {
5492 int count = stbi__get8(s), i;
5493 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5494
5495 if (count >= 128) { // Repeated
5496 stbi_uc value[4];
5497
5498 if (count==128)
5499 count = stbi__get16be(s);
5500 else
5501 count -= 127;
5502 if (count > left)
5503 return stbi__errpuc("bad file","scanline overrun");
5504
5505 if (!stbi__readval(s,packet->channel,value))
5506 return 0;
5507
5508 for(i=0;i<count;++i, dest += 4)
5509 stbi__copyval(packet->channel,dest,value);
5510 } else { // Raw
5511 ++count;
5512 if (count>left) return stbi__errpuc("bad file","scanline overrun");
5513
5514 for(i=0;i<count;++i, dest+=4)
5515 if (!stbi__readval(s,packet->channel,dest))
5516 return 0;
5517 }
5518 left-=count;
5519 }
5520 break;
5521 }
5522 }
5523 }
5524 }
5525
5526 return result;
5527 }
5528
static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
{
   stbi_uc *result;
   int i, x,y;

   for (i=0; i<92; ++i)
      stbi__get8(s);

   x = stbi__get16be(s);
   y = stbi__get16be(s);
   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   // check x first so a zero width cannot cause a division by zero
   if (x != 0 && (1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");

   stbi__get32be(s); //skip `ratio'
   stbi__get16be(s); //skip `fields'
   stbi__get16be(s); //skip `pad'

   // intermediate buffer is RGBA
   result = (stbi_uc *) stbi__malloc(x*y*4);
   if (!result) return stbi__errpuc("outofmem", "Out of memory");
   memset(result, 0xff, x*y*4);

   if (!stbi__pic_load_core(s,x,y,comp, result)) {
      STBI_FREE(result);
      return 0; // failure reason already set; never pass the freed buffer to stbi__convert_format
   }
   *px = x;
   *py = y;
   if (req_comp == 0) req_comp = *comp;
   result=stbi__convert_format(result,4,req_comp,x,y);

   return result;
}
5561
static int stbi__pic_test(stbi__context *s)
5563 {
5564 int r = stbi__pic_test_core(s);
5565 stbi__rewind(s);
5566 return r;
5567 }
5568 #endif
5569
5570 // *************************************************************************************************
5571 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5572
5573 #ifndef STBI_NO_GIF
5574 typedef struct
5575 {
5576 stbi__int16 prefix;
5577 stbi_uc first;
5578 stbi_uc suffix;
5579 } stbi__gif_lzw;
5580
5581 typedef struct
5582 {
5583 int w,h;
5584 stbi_uc *out, *old_out; // output buffer (always 4 components)
5585 int flags, bgindex, ratio, transparent, eflags, delay;
5586 stbi_uc pal[256][4];
5587 stbi_uc lpal[256][4];
5588 stbi__gif_lzw codes[4096];
5589 stbi_uc *color_table;
5590 int parse, step;
5591 int lflags;
5592 int start_x, start_y;
5593 int max_x, max_y;
5594 int cur_x, cur_y;
5595 int line_size;
5596 } stbi__gif;
5597
static int stbi__gif_test_raw(stbi__context *s)
5599 {
5600 int sz;
5601 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5602 sz = stbi__get8(s);
5603 if (sz != '9' && sz != '7') return 0;
5604 if (stbi__get8(s) != 'a') return 0;
5605 return 1;
5606 }
5607
static int stbi__gif_test(stbi__context *s)
5609 {
5610 int r = stbi__gif_test_raw(s);
5611 stbi__rewind(s);
5612 return r;
5613 }
5614
static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5616 {
5617 int i;
5618 for (i=0; i < num_entries; ++i) {
5619 pal[i][2] = stbi__get8(s);
5620 pal[i][1] = stbi__get8(s);
5621 pal[i][0] = stbi__get8(s);
5622 pal[i][3] = transp == i ? 0 : 255;
5623 }
5624 }
5625
static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5627 {
5628 stbi_uc version;
5629 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5630 return stbi__err("not GIF", "Corrupt GIF");
5631
5632 version = stbi__get8(s);
5633 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5634 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5635
5636 stbi__g_failure_reason = "";
5637 g->w = stbi__get16le(s);
5638 g->h = stbi__get16le(s);
5639 g->flags = stbi__get8(s);
5640 g->bgindex = stbi__get8(s);
5641 g->ratio = stbi__get8(s);
5642 g->transparent = -1;
5643
5644 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5645
5646 if (is_info) return 1;
5647
5648 if (g->flags & 0x80)
5649 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5650
5651 return 1;
5652 }
5653
static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5655 {
5656 stbi__gif g;
5657 if (!stbi__gif_header(s, &g, comp, 1)) {
5658 stbi__rewind( s );
5659 return 0;
5660 }
5661 if (x) *x = g.w;
5662 if (y) *y = g.h;
5663 return 1;
5664 }
5665
static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5667 {
5668 stbi_uc *p, *c;
5669
5670 // recurse to decode the prefixes, since the linked-list is backwards,
5671 // and working backwards through an interleaved image would be nasty
5672 if (g->codes[code].prefix >= 0)
5673 stbi__out_gif_code(g, g->codes[code].prefix);
5674
5675 if (g->cur_y >= g->max_y) return;
5676
5677 p = &g->out[g->cur_x + g->cur_y];
5678 c = &g->color_table[g->codes[code].suffix * 4];
5679
5680 if (c[3] >= 128) {
5681 p[0] = c[2];
5682 p[1] = c[1];
5683 p[2] = c[0];
5684 p[3] = c[3];
5685 }
5686 g->cur_x += 4;
5687
5688 if (g->cur_x >= g->max_x) {
5689 g->cur_x = g->start_x;
5690 g->cur_y += g->step;
5691
5692 while (g->cur_y >= g->max_y && g->parse > 0) {
5693 g->step = (1 << g->parse) * g->line_size;
5694 g->cur_y = g->start_y + (g->step >> 1);
5695 --g->parse;
5696 }
5697 }
5698 }
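
// Illustrative sketch, not part of stb_image: the interlaced row order that
// the step/parse bookkeeping in stbi__out_gif_code walks through, expressed in
// whole rows instead of byte offsets. The name example_gif_interlace_order is
// hypothetical. It assumes the frame starts at row 0 and that order[] has room
// for height entries; order[k] receives the destination row of the k-th
// decoded row.
static void example_gif_interlace_order(int height, int *order)
{
   int step = 8, parse = 3, y = 0, n = 0;
   while (n < height) {
      order[n++] = y;
      y += step;
      // once a pass runs off the bottom, start the next pass:
      // starts at 4, 2, 1 with steps of 8, 4, 2
      while (y >= height && parse > 0) {
         step = 1 << parse;
         y = step >> 1;
         --parse;
      }
   }
}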
5699
static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5701 {
5702 stbi_uc lzw_cs;
5703 stbi__int32 len, init_code;
5704 stbi__uint32 first;
5705 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5706 stbi__gif_lzw *p;
5707
5708 lzw_cs = stbi__get8(s);
5709 if (lzw_cs > 12) return NULL;
5710 clear = 1 << lzw_cs;
5711 first = 1;
5712 codesize = lzw_cs + 1;
5713 codemask = (1 << codesize) - 1;
5714 bits = 0;
5715 valid_bits = 0;
5716 for (init_code = 0; init_code < clear; init_code++) {
5717 g->codes[init_code].prefix = -1;
5718 g->codes[init_code].first = (stbi_uc) init_code;
5719 g->codes[init_code].suffix = (stbi_uc) init_code;
5720 }
5721
5722 // support no starting clear code
5723 avail = clear+2;
5724 oldcode = -1;
5725
5726 len = 0;
5727 for(;;) {
5728 if (valid_bits < codesize) {
5729 if (len == 0) {
5730 len = stbi__get8(s); // start new block
5731 if (len == 0)
5732 return g->out;
5733 }
5734 --len;
5735 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5736 valid_bits += 8;
5737 } else {
5738 stbi__int32 code = bits & codemask;
5739 bits >>= codesize;
5740 valid_bits -= codesize;
5741 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5742 if (code == clear) { // clear code
5743 codesize = lzw_cs + 1;
5744 codemask = (1 << codesize) - 1;
5745 avail = clear + 2;
5746 oldcode = -1;
5747 first = 0;
5748 } else if (code == clear + 1) { // end of stream code
5749 stbi__skip(s, len);
5750 while ((len = stbi__get8(s)) > 0)
5751 stbi__skip(s,len);
5752 return g->out;
5753 } else if (code <= avail) {
5754 if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5755
5756 if (oldcode >= 0) {
5757 p = &g->codes[avail++];
5758 if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5759 p->prefix = (stbi__int16) oldcode;
5760 p->first = g->codes[oldcode].first;
5761 p->suffix = (code == avail) ? p->first : g->codes[code].first;
5762 } else if (code == avail)
5763 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5764
5765 stbi__out_gif_code(g, (stbi__uint16) code);
5766
5767 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5768 codesize++;
5769 codemask = (1 << codesize) - 1;
5770 }
5771
5772 oldcode = code;
5773 } else {
5774 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5775 }
5776 }
5777 }
5778 }
5779
static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5781 {
5782 int x, y;
5783 stbi_uc *c = g->pal[g->bgindex];
5784 for (y = y0; y < y1; y += 4 * g->w) {
5785 for (x = x0; x < x1; x += 4) {
5786 stbi_uc *p = &g->out[y + x];
5787 p[0] = c[2];
5788 p[1] = c[1];
5789 p[2] = c[0];
5790 p[3] = 0;
5791 }
5792 }
5793 }
5794
5795 // this function is designed to support animated gifs, although stb_image doesn't support it
static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5797 {
5798 int i;
5799 stbi_uc *prev_out = 0;
5800
5801 if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5802 return 0; // stbi__g_failure_reason set by stbi__gif_header
5803
5804 prev_out = g->out;
5805 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5806 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5807
5808 switch ((g->eflags & 0x1C) >> 2) {
5809 case 0: // unspecified (also always used on 1st frame)
5810 stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5811 break;
5812 case 1: // do not dispose
5813 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5814 g->old_out = prev_out;
5815 break;
5816 case 2: // dispose to background
5817 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5818 stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5819 break;
5820 case 3: // dispose to previous
5821 if (g->old_out) {
5822 for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5823 memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5824 }
5825 break;
5826 }
5827
5828 for (;;) {
5829 switch (stbi__get8(s)) {
5830 case 0x2C: /* Image Descriptor */
5831 {
5832 int prev_trans = -1;
5833 stbi__int32 x, y, w, h;
5834 stbi_uc *o;
5835
5836 x = stbi__get16le(s);
5837 y = stbi__get16le(s);
5838 w = stbi__get16le(s);
5839 h = stbi__get16le(s);
5840 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5841 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5842
5843 g->line_size = g->w * 4;
5844 g->start_x = x * 4;
5845 g->start_y = y * g->line_size;
5846 g->max_x = g->start_x + w * 4;
5847 g->max_y = g->start_y + h * g->line_size;
5848 g->cur_x = g->start_x;
5849 g->cur_y = g->start_y;
5850
5851 g->lflags = stbi__get8(s);
5852
5853 if (g->lflags & 0x40) {
5854 g->step = 8 * g->line_size; // first interlaced spacing
5855 g->parse = 3;
5856 } else {
5857 g->step = g->line_size;
5858 g->parse = 0;
5859 }
5860
5861 if (g->lflags & 0x80) {
5862 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5863 g->color_table = (stbi_uc *) g->lpal;
5864 } else if (g->flags & 0x80) {
5865 if (g->transparent >= 0 && (g->eflags & 0x01)) {
5866 prev_trans = g->pal[g->transparent][3];
5867 g->pal[g->transparent][3] = 0;
5868 }
5869 g->color_table = (stbi_uc *) g->pal;
5870 } else
5871 return stbi__errpuc("missing color table", "Corrupt GIF");
5872
5873 o = stbi__process_gif_raster(s, g);
5874 if (o == NULL) return NULL;
5875
5876 if (prev_trans != -1)
5877 g->pal[g->transparent][3] = (stbi_uc) prev_trans;
5878
5879 return o;
5880 }
5881
5882 case 0x21: // Comment Extension.
5883 {
5884 int len;
5885 if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
5886 len = stbi__get8(s);
5887 if (len == 4) {
5888 g->eflags = stbi__get8(s);
5889 g->delay = stbi__get16le(s);
5890 g->transparent = stbi__get8(s);
5891 } else {
5892 stbi__skip(s, len);
5893 break;
5894 }
5895 }
5896 while ((len = stbi__get8(s)) != 0)
5897 stbi__skip(s, len);
5898 break;
5899 }
5900
5901 case 0x3B: // gif stream termination code
5902 return (stbi_uc *) s; // using '1' causes warning on some compilers
5903
5904 default:
5905 return stbi__errpuc("unknown code", "Corrupt GIF");
5906 }
5907 }
5908
5909 STBI_NOTUSED(req_comp);
5910 }
5911
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5913 {
5914 stbi_uc *u = 0;
5915 stbi__gif g;
5916 memset(&g, 0, sizeof(g));
5917
5918 u = stbi__gif_load_next(s, &g, comp, req_comp);
5919 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
5920 if (u) {
5921 *x = g.w;
5922 *y = g.h;
5923 if (req_comp && req_comp != 4)
5924 u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
5925 }
5926 else if (g.out)
5927 STBI_FREE(g.out);
5928
5929 return u;
5930 }
5931
static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
5933 {
5934 return stbi__gif_info_raw(s,x,y,comp);
5935 }
5936 #endif
5937
5938 // *************************************************************************************************
5939 // Radiance RGBE HDR loader
5940 // originally by Nicolas Schulz
5941 #ifndef STBI_NO_HDR
static int stbi__hdr_test_core(stbi__context *s)
5943 {
5944 const char *signature = "#?RADIANCE\n";
5945 int i;
5946 for (i=0; signature[i]; ++i)
5947 if (stbi__get8(s) != signature[i])
5948 return 0;
5949 return 1;
5950 }
5951
static int stbi__hdr_test(stbi__context* s)
5953 {
5954 int r = stbi__hdr_test_core(s);
5955 stbi__rewind(s);
5956 return r;
5957 }
5958
5959 #define STBI__HDR_BUFLEN 1024
static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
5961 {
5962 int len=0;
5963 char c = '\0';
5964
5965 c = (char) stbi__get8(z);
5966
5967 while (!stbi__at_eof(z) && c != '\n') {
5968 buffer[len++] = c;
5969 if (len == STBI__HDR_BUFLEN-1) {
5970 // flush to end of line
5971 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
5972 ;
5973 break;
5974 }
5975 c = (char) stbi__get8(z);
5976 }
5977
5978 buffer[len] = 0;
5979 return buffer;
5980 }
5981
static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
5983 {
5984 if ( input[3] != 0 ) {
5985 float f1;
5986 // Exponent
5987 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
5988 if (req_comp <= 2)
5989 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
5990 else {
5991 output[0] = input[0] * f1;
5992 output[1] = input[1] * f1;
5993 output[2] = input[2] * f1;
5994 }
5995 if (req_comp == 2) output[1] = 1;
5996 if (req_comp == 4) output[3] = 1;
5997 } else {
5998 switch (req_comp) {
5999 case 4: output[3] = 1; /* fallthrough */
6000 case 3: output[0] = output[1] = output[2] = 0;
6001 break;
6002 case 2: output[1] = 1; /* fallthrough */
6003 case 1: output[0] = 0;
6004 break;
6005 }
6006 }
6007 }
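
// Illustrative sketch, not part of stb_image: the three-component case of the
// RGBE conversion above, isolated so the arithmetic is easy to check by hand.
// The name example_rgbe_to_float is hypothetical. Each mantissa byte is scaled
// by 2^(E - 136), i.e. 2^(E - 128) / 256, and an exponent byte of zero means
// black.
#include <math.h>

static void example_rgbe_to_float(const unsigned char rgbe[4], float rgb[3])
{
   if (rgbe[3] != 0) {
      float scale = (float) ldexp(1.0, (int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   } else {
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   }
}
// Worked case: bytes {128, 64, 0, 129} give scale 2^-7, so the pixel is
// (1.0, 0.5, 0.0).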
6008
static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6010 {
6011 char buffer[STBI__HDR_BUFLEN];
6012 char *token;
6013 int valid = 0;
6014 int width, height;
6015 stbi_uc *scanline;
6016 float *hdr_data;
6017 int len;
6018 unsigned char count, value;
6019 int i, j, k, c1,c2, z;
6020
6021
6022 // Check identifier
6023 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
6024 return stbi__errpf("not HDR", "Corrupt HDR image");
6025
6026 // Parse header
6027 for(;;) {
6028 token = stbi__hdr_gettoken(s,buffer);
6029 if (token[0] == 0) break;
6030 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6031 }
6032
6033 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
6034
6035 // Parse width and height
6036 // can't use sscanf() if we're not using stdio!
6037 token = stbi__hdr_gettoken(s,buffer);
6038 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6039 token += 3;
6040 height = (int) strtol(token, &token, 10);
6041 while (*token == ' ') ++token;
6042 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6043 token += 3;
6044 width = (int) strtol(token, NULL, 10);
6045
6046 *x = width;
6047 *y = height;
6048
6049 if (comp) *comp = 3;
6050 if (req_comp == 0) req_comp = 3;
6051
   // Read data
   hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
   if (!hdr_data) return stbi__errpf("outofmem", "Out of memory");
6054
6055 // Load image data
6056 // image data is stored as some number of sca
6057 if ( width < 8 || width >= 32768) {
6058 // Read flat data
6059 for (j=0; j < height; ++j) {
6060 for (i=0; i < width; ++i) {
6061 stbi_uc rgbe[4];
6062 main_decode_loop:
6063 stbi__getn(s, rgbe, 4);
6064 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6065 }
6066 }
6067 } else {
6068 // Read RLE-encoded data
6069 scanline = NULL;
6070
6071 for (j = 0; j < height; ++j) {
6072 c1 = stbi__get8(s);
6073 c2 = stbi__get8(s);
6074 len = stbi__get8(s);
6075 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6076 // not run-length encoded, so we have to actually use THIS data as a decoded
6077 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6078 stbi_uc rgbe[4];
6079 rgbe[0] = (stbi_uc) c1;
6080 rgbe[1] = (stbi_uc) c2;
6081 rgbe[2] = (stbi_uc) len;
6082 rgbe[3] = (stbi_uc) stbi__get8(s);
6083 stbi__hdr_convert(hdr_data, rgbe, req_comp);
6084 i = 1;
6085 j = 0;
6086 STBI_FREE(scanline);
6087 goto main_decode_loop; // yes, this makes no sense
6088 }
6089 len <<= 8;
6090 len |= stbi__get8(s);
6091 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
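         if (scanline == NULL) {
            scanline = (stbi_uc *) stbi__malloc(width * 4);
            if (!scanline) { STBI_FREE(hdr_data); return stbi__errpf("outofmem", "Out of memory"); }
         }

         for (k = 0; k < 4; ++k) {
            i = 0;
            while (i < width) {
               count = stbi__get8(s);
               if (count > 128) {
                  // Run
                  value = stbi__get8(s);
                  count -= 128;
                  // reject runs that would write past the end of the scanline
                  if (count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = value;
               } else {
                  // Dump
                  // a zero count would never advance i; an oversized count would overrun the scanline
                  if (count == 0 || count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = stbi__get8(s);
               }
            }
         }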
6111 for (i=0; i < width; ++i)
6112 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6113 }
6114 STBI_FREE(scanline);
6115 }
6116
6117 return hdr_data;
6118 }
6119
static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6121 {
6122 char buffer[STBI__HDR_BUFLEN];
6123 char *token;
6124 int valid = 0;
6125
6126 if (stbi__hdr_test(s) == 0) {
6127 stbi__rewind( s );
6128 return 0;
6129 }
6130
6131 for(;;) {
6132 token = stbi__hdr_gettoken(s,buffer);
6133 if (token[0] == 0) break;
6134 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6135 }
6136
6137 if (!valid) {
6138 stbi__rewind( s );
6139 return 0;
6140 }
6141 token = stbi__hdr_gettoken(s,buffer);
6142 if (strncmp(token, "-Y ", 3)) {
6143 stbi__rewind( s );
6144 return 0;
6145 }
6146 token += 3;
6147 *y = (int) strtol(token, &token, 10);
6148 while (*token == ' ') ++token;
6149 if (strncmp(token, "+X ", 3)) {
6150 stbi__rewind( s );
6151 return 0;
6152 }
6153 token += 3;
6154 *x = (int) strtol(token, NULL, 10);
6155 *comp = 3;
6156 return 1;
6157 }
6158 #endif // STBI_NO_HDR
6159
6160 #ifndef STBI_NO_BMP
static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6162 {
6163 void *p;
6164 stbi__bmp_data info;
6165
6166 info.all_a = 255;
6167 p = stbi__bmp_parse_header(s, &info);
6168 stbi__rewind( s );
6169 if (p == NULL)
6170 return 0;
6171 *x = s->img_x;
6172 *y = s->img_y;
6173 *comp = info.ma ? 4 : 3;
6174 return 1;
6175 }
6176 #endif
6177
6178 #ifndef STBI_NO_PSD
static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6180 {
6181 int channelCount;
6182 if (stbi__get32be(s) != 0x38425053) {
6183 stbi__rewind( s );
6184 return 0;
6185 }
6186 if (stbi__get16be(s) != 1) {
6187 stbi__rewind( s );
6188 return 0;
6189 }
6190 stbi__skip(s, 6);
6191 channelCount = stbi__get16be(s);
6192 if (channelCount < 0 || channelCount > 16) {
6193 stbi__rewind( s );
6194 return 0;
6195 }
6196 *y = stbi__get32be(s);
6197 *x = stbi__get32be(s);
6198 if (stbi__get16be(s) != 8) {
6199 stbi__rewind( s );
6200 return 0;
6201 }
6202 if (stbi__get16be(s) != 3) {
6203 stbi__rewind( s );
6204 return 0;
6205 }
6206 *comp = 4;
6207 return 1;
6208 }
6209 #endif
6210
6211 #ifndef STBI_NO_PIC
static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6213 {
6214 int act_comp=0,num_packets=0,chained;
6215 stbi__pic_packet packets[10];
6216
6217 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
6218 stbi__rewind(s);
6219 return 0;
6220 }
6221
6222 stbi__skip(s, 88);
6223
6224 *x = stbi__get16be(s);
6225 *y = stbi__get16be(s);
6226 if (stbi__at_eof(s)) {
6227 stbi__rewind( s);
6228 return 0;
6229 }
6230 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6231 stbi__rewind( s );
6232 return 0;
6233 }
6234
6235 stbi__skip(s, 8);
6236
6237 do {
6238 stbi__pic_packet *packet;
6239
6240 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6241 return 0;
6242
6243 packet = &packets[num_packets++];
6244 chained = stbi__get8(s);
6245 packet->size = stbi__get8(s);
6246 packet->type = stbi__get8(s);
6247 packet->channel = stbi__get8(s);
6248 act_comp |= packet->channel;
6249
6250 if (stbi__at_eof(s)) {
6251 stbi__rewind( s );
6252 return 0;
6253 }
6254 if (packet->size != 8) {
6255 stbi__rewind( s );
6256 return 0;
6257 }
6258 } while (chained);
6259
6260 *comp = (act_comp & 0x10 ? 4 : 3);
6261
6262 return 1;
6263 }
6264 #endif
6265
6266 // *************************************************************************************************
6267 // Portable Gray Map and Portable Pixel Map loader
6268 // by Ken Miller
6269 //
6270 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6271 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6272 //
6273 // Known limitations:
6274 // Does not support comments in the header section
6275 // Does not support ASCII image data (formats P2 and P3)
6276 // Does not support 16-bit-per-channel
6277
6278 #ifndef STBI_NO_PNM
6279
static int      stbi__pnm_test(stbi__context *s)
6281 {
6282 char p, t;
6283 p = (char) stbi__get8(s);
6284 t = (char) stbi__get8(s);
6285 if (p != 'P' || (t != '5' && t != '6')) {
6286 stbi__rewind( s );
6287 return 0;
6288 }
6289 return 1;
6290 }
6291
static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6293 {
6294 stbi_uc *out;
6295 if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6296 return 0;
6297 *x = s->img_x;
6298 *y = s->img_y;
6299 *comp = s->img_n;
6300
6301 out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6302 if (!out) return stbi__errpuc("outofmem", "Out of memory");
6303 stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6304
6305 if (req_comp && req_comp != s->img_n) {
6306 out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6307 if (out == NULL) return out; // stbi__convert_format frees input on failure
6308 }
6309 return out;
6310 }
6311
static int      stbi__pnm_isspace(char c)
6313 {
6314 return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6315 }
6316
static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6318 {
6319 for (;;) {
6320 while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6321 *c = (char) stbi__get8(s);
6322
6323 if (stbi__at_eof(s) || *c != '#')
6324 break;
6325
6326 while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
6327 *c = (char) stbi__get8(s);
6328 }
6329 }
6330
static int      stbi__pnm_isdigit(char c)
6332 {
6333 return c >= '0' && c <= '9';
6334 }
6335
static int      stbi__pnm_getinteger(stbi__context *s, char *c)
6337 {
6338 int value = 0;
6339
6340 while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6341 value = value*10 + (*c - '0');
6342 *c = (char) stbi__get8(s);
6343 }
6344
6345 return value;
6346 }
6347
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6349 {
6350 int maxv;
6351 char c, p, t;
6352
6353 stbi__rewind( s );
6354
6355 // Get identifier
6356 p = (char) stbi__get8(s);
6357 t = (char) stbi__get8(s);
6358 if (p != 'P' || (t != '5' && t != '6')) {
6359 stbi__rewind( s );
6360 return 0;
6361 }
6362
6363 *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
6364
6365 c = (char) stbi__get8(s);
6366 stbi__pnm_skip_whitespace(s, &c);
6367
6368 *x = stbi__pnm_getinteger(s, &c); // read width
6369 stbi__pnm_skip_whitespace(s, &c);
6370
6371 *y = stbi__pnm_getinteger(s, &c); // read height
6372 stbi__pnm_skip_whitespace(s, &c);
6373
6374 maxv = stbi__pnm_getinteger(s, &c); // read max value
6375
6376 if (maxv > 255)
6377 return stbi__err("max value > 255", "PPM image not 8-bit");
6378 else
6379 return 1;
6380 }
6381 #endif
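
// Illustrative sketch, not part of stb_image: the binary PGM/PPM header walk
// performed by stbi__pnm_info above (magic, then width, height and max value,
// each preceded by optional whitespace and '#' comments), written against a
// memory buffer. The name example_pnm_header and its signature are
// hypothetical, and like the loader it makes no attempt to detect integer
// overflow in the decimal fields.
#include <stddef.h>

// Returns the offset of the first raster byte, or 0 if the header is not a
// valid 8-bit P5/P6 header.
static size_t example_pnm_header(const unsigned char *buf, size_t len,
                                 int *w, int *h, int *comp)
{
   size_t pos = 2;
   int field, vals[3];
   if (len < 3 || buf[0] != 'P' || (buf[1] != '5' && buf[1] != '6')) return 0;
   for (field = 0; field < 3; ++field) {
      int v = 0, digits = 0;
      for (;;) {  // skip whitespace, then any '#' comment up to end of line
         while (pos < len && (buf[pos] == ' ' || buf[pos] == '\t' || buf[pos] == '\n' ||
                              buf[pos] == '\v' || buf[pos] == '\f' || buf[pos] == '\r'))
            ++pos;
         if (pos >= len || buf[pos] != '#') break;
         while (pos < len && buf[pos] != '\n' && buf[pos] != '\r') ++pos;
      }
      while (pos < len && buf[pos] >= '0' && buf[pos] <= '9') {
         v = v*10 + (buf[pos] - '0');
         ++pos;
         ++digits;
      }
      if (digits == 0) return 0;
      vals[field] = v;
   }
   if (vals[2] > 255) return 0;   // only 8-bit samples, as in the loader
   if (pos >= len) return 0;      // the separator byte after the max value is missing
   *w = vals[0];
   *h = vals[1];
   *comp = (buf[1] == '6') ? 3 : 1;
   return pos + 1;                // one separator byte precedes the raster
}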
6382
static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
6384 {
6385 #ifndef STBI_NO_JPEG
6386 if (stbi__jpeg_info(s, x, y, comp)) return 1;
6387 #endif
6388
6389 #ifndef STBI_NO_PNG
6390 if (stbi__png_info(s, x, y, comp)) return 1;
6391 #endif
6392
6393 #ifndef STBI_NO_GIF
6394 if (stbi__gif_info(s, x, y, comp)) return 1;
6395 #endif
6396
6397 #ifndef STBI_NO_BMP
6398 if (stbi__bmp_info(s, x, y, comp)) return 1;
6399 #endif
6400
6401 #ifndef STBI_NO_PSD
6402 if (stbi__psd_info(s, x, y, comp)) return 1;
6403 #endif
6404
6405 #ifndef STBI_NO_PIC
6406 if (stbi__pic_info(s, x, y, comp)) return 1;
6407 #endif
6408
6409 #ifndef STBI_NO_PNM
6410 if (stbi__pnm_info(s, x, y, comp)) return 1;
6411 #endif
6412
6413 #ifndef STBI_NO_HDR
6414 if (stbi__hdr_info(s, x, y, comp)) return 1;
6415 #endif
6416
6417 // test tga last because it's a crappy test!
6418 #ifndef STBI_NO_TGA
6419 if (stbi__tga_info(s, x, y, comp))
6420 return 1;
6421 #endif
6422 return stbi__err("unknown image type", "Image not of any known type, or corrupt");
6423 }
6424
6425 #ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
6427 {
6428 FILE *f = stbi__fopen(filename, "rb");
6429 int result;
6430 if (!f) return stbi__err("can't fopen", "Unable to open file");
6431 result = stbi_info_from_file(f, x, y, comp);
6432 fclose(f);
6433 return result;
6434 }
6435
stbi_info_from_file(FILE * f,int * x,int * y,int * comp)6436 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
6437 {
6438 int r;
6439 stbi__context s;
6440 long pos = ftell(f);
6441 stbi__start_file(&s, f);
6442 r = stbi__info_main(&s,x,y,comp);
6443 fseek(f,pos,SEEK_SET);
6444 return r;
6445 }
6446 #endif // !STBI_NO_STDIO
6447
STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);
}

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);
}

#pragma GCC diagnostic pop

#endif // STB_IMAGE_IMPLEMENTATION

/*
   revision history:
      2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
      2.09  (2016-01-16) allow comments in PNM files
                         16-bit-per-pixel TGA (not bit-per-component)
                         info() for TGA could break due to .hdr handling
                         info() for BMP now shares code instead of sloppy parse
                         can use STBI_REALLOC_SIZED if allocator doesn't support realloc
                         code cleanup
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) fix compiler warnings
                         partial animated GIF support
                         limited 16-bpc PSD support
                         #ifdef unused functions
                         bug with < 92 byte PIC,PNM,HDR,TGA
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) extra corruption checking (mmozeiko)
                         stbi_set_flip_vertically_on_load (nguillemot)
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                         progressive JPEG (stb)
                         PGM/PPM support (Ken Miller)
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         GIF bugfix -- seemingly never worked
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                         optimize PNG (ryg)
                         fix bug in interlaced PNG with user-specified channel count (stb)
      1.46  (2014-08-26)
              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
      1.45  (2014-08-16)
              fix MSVC-ARM internal compiler error by wrapping malloc
      1.44  (2014-08-07)
              various warning fixes from Ronny Chevalier
      1.43  (2014-07-15)
              fix MSVC-only compiler problem in code changed in 1.42
      1.42  (2014-07-09)
              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
              fixes to stbi__cleanup_jpeg path
              added STBI_ASSERT to avoid requiring assert.h
      1.41  (2014-06-25)
              fix search&replace from 1.36 that messed up comments/error messages
      1.40  (2014-06-22)
              fix gcc struct-initialization warning
      1.39  (2014-06-15)
              fix to TGA optimization when req_comp != number of components in TGA;
              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
              add support for BMP version 5 (more ignored fields)
      1.38  (2014-06-06)
              suppress MSVC warnings on integer casts truncating values
              fix accidental rename of 'skip' field of I/O
      1.37  (2014-06-04)
              remove duplicate typedef
      1.36  (2014-06-03)
              convert to header file single-file library
              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
      1.35  (2014-05-27)
              various warnings
              fix broken STBI_SIMD path
              fix bug where stbi_load_from_file no longer left file pointer in correct place
              fix broken non-easy path for 32-bit BMP (possibly never used)
              TGA optimization by Arseny Kapoulkine
      1.34  (unknown)
              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
      1.33  (2011-07-14)
              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
      1.32  (2011-07-13)
              support for "info" function for all supported filetypes (SpartanJ)
      1.31  (2011-06-20)
              a few more leak fixes, bug in PNG handling (SpartanJ)
      1.30  (2011-06-11)
              added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
              removed deprecated format-specific test/load functions
              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
              fix inefficiency in decoding 32-bit BMP (David Woo)
      1.29  (2010-08-16)
              various warning fixes from Aurelien Pocheville
      1.28  (2010-08-01)
              fix bug in GIF palette transparency (SpartanJ)
      1.27  (2010-08-01)
              cast-to-stbi_uc to fix warnings
      1.26  (2010-07-24)
              fix bug in file buffering for PNG reported by SpartanJ
      1.25  (2010-07-17)
              refix trans_data warning (Won Chun)
      1.24  (2010-07-12)
              perf improvements reading from files on platforms with lock-heavy fgetc()
              minor perf improvements for jpeg
              deprecated type-specific functions so we'll get feedback if they're needed
              attempt to fix trans_data warning (Won Chun)
      1.23    fixed bug in iPhone support
      1.22  (2010-07-10)
              removed image *writing* support
              stbi_info support from Jetro Lauha
              GIF support from Jean-Marc Lienher
              iPhone PNG-extensions from James Brown
              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
      1.20    added support for Softimage PIC, by Tom Seddon
      1.19    bug in interlaced PNG corruption check (found by ryg)
      1.18  (2008-08-02)
              fix a threading bug (local mutable static)
      1.17    support interlaced PNG
      1.16    major bugfix - stbi__convert_format converted one too many pixels
      1.15    initialize some fields for thread safety
      1.14    fix threadsafe conversion bug
              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.13    threadsafe
      1.12    const qualifiers in the API
      1.11    Support installable IDCT, colorspace conversion routines
      1.10    Fixes for 64-bit (don't use "unsigned long")
              optimized upsampling by Fabian "ryg" Giesen
      1.09    Fix format-conversion for PSD code (bad global variables!)
      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07    attempt to fix C++ warning/errors again
      1.06    attempt to fix C++ warning/errors again
      1.05    fix TGA loading to return correct *comp and use good luminance calc
      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02    support for (subset of) HDR files, float interface for preferred access to them
      1.01    fix bug: possible bug in handling right-side up bmps... not sure
              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00    interface to zlib that skips zlib header
      0.99    correct handling of alpha in palette
      0.98    TGA loader by lonesock; dynamically add loaders (untested)
      0.97    jpeg errors on too large a file; also catch another malloc failure
      0.96    fix detection of invalid v value - particleman@mollyrocket forum
      0.95    during header scan, seek to markers in case of padding
      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93    handle jpegtran output; verbose errors
      0.92    read 4,8,16,24,32-bit BMP files of several formats
      0.91    output 24-bit Windows 3.0 BMP files
      0.90    fix a few more warnings; bump version number to approach 1.0
      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60    fix compiling as c++
      0.59    fix warnings: merge Dave Moore's -Wall fixes
      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56    fix bug: zlib uncompressed mode len vs. nlen
      0.55    fix bug: restart_interval not initialized to 0
      0.54    allow NULL for 'int *comp'
      0.53    fix bug in png 3->4; speedup png decoding
      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
              on 'test' only check type, not whether we support this variant
      0.50  (2006-11-19)
              first released version
*/