/* stb_image - v2.10 - public domain image loader - http://nothings.org/stb_image.h
                                     no warranty implied; use at your own risk

   Do this:
      #define STB_IMAGE_IMPLEMENTATION
   before you include this file in *one* C or C++ file to create the implementation.

   // i.e. it should look like this:
   #include ...
   #include ...
   #include ...
   #define STB_IMAGE_IMPLEMENTATION
   #include "stb_image.h"

   You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
   And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
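
   // A minimal sketch of replacing the allocator, assuming your program
   // provides my_malloc, my_realloc, and my_free (hypothetical names) with
   // the same signatures as malloc, realloc, and free. Define all three
   // macros together, before creating the implementation:
   #define STBI_MALLOC(sz)        my_malloc(sz)
   #define STBI_REALLOC(p,newsz)  my_realloc(p,newsz)
   #define STBI_FREE(p)           my_free(p)
   #define STB_IMAGE_IMPLEMENTATION
   #include "stb_image.h"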


   QUICK NOTES:
      Primarily of interest to game developers and other people who can
          avoid problematic images and only need the trivial interface

      JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
      PNG 1/2/4/8-bit-per-channel (16 bpc not supported)

      TGA (not sure what subset, if a subset)
      BMP non-1bpp, non-RLE
      PSD (composited view only, no extra channels, 8/16 bit-per-channel)

      GIF (*comp always reports as 4-channel)
      HDR (radiance rgbE format)
      PIC (Softimage PIC)
      PNM (PPM and PGM binary only)

      Animated GIF still needs a proper API, but here's one way to do it:
          http://gist.github.com/urraka/685d9a6340b26b830d49

      - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
      - decode from arbitrary I/O callbacks
      - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)

   Full documentation under "DOCUMENTATION" below.


   Revision 2.00 release notes:

      - Progressive JPEG is now supported.

      - PPM and PGM binary formats are now supported, thanks to Ken Miller.

      - x86 platforms now make use of SSE2 SIMD instructions for
        JPEG decoding, and ARM platforms can use NEON SIMD if requested.
        This work was done by Fabian "ryg" Giesen. SSE2 is used by
        default, but NEON must be enabled explicitly; see docs.

        With other JPEG optimizations included in this version, we see
        2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
        on a JPEG on an ARM machine, relative to previous versions of this
        library. You will not see the same results for all JPEGs and for all
        x86/ARM machines. (Note that progressive JPEGs are significantly
        slower to decode than regular JPEGs.) This doesn't mean that this
        is the fastest JPEG decoder in the land; rather, it brings it
        closer to parity with standard libraries. If you want the fastest
        decode, look elsewhere. (See "Philosophy" section of docs below.)

        See final bullet items below for more info on SIMD.

      - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
        the memory allocator. Unlike other stb libraries, these macros don't
        support a context parameter, so if you need to pass a context in to
        the allocator, you'll have to store it in a global or a thread-local
        variable.

      - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
        STBI_NO_LINEAR.
            STBI_NO_HDR:     suppress implementation of .hdr reader format
            STBI_NO_LINEAR:  suppress high-dynamic-range light-linear float API

      - You can suppress implementation of any of the decoders to reduce
        your code footprint by #defining one or more of the following
        symbols before creating the implementation.

            STBI_NO_JPEG
            STBI_NO_PNG
            STBI_NO_BMP
            STBI_NO_PSD
            STBI_NO_TGA
            STBI_NO_GIF
            STBI_NO_HDR
            STBI_NO_PIC
            STBI_NO_PNM   (.ppm and .pgm)

      - You can request *only* certain decoders and suppress all other ones
        (this will be more forward-compatible, as addition of new decoders
        doesn't require you to disable them explicitly):

            STBI_ONLY_JPEG
            STBI_ONLY_PNG
            STBI_ONLY_BMP
            STBI_ONLY_PSD
            STBI_ONLY_TGA
            STBI_ONLY_GIF
            STBI_ONLY_HDR
            STBI_ONLY_PIC
            STBI_ONLY_PNM   (.ppm and .pgm)

        Note that you can define multiples of these, and you will get all
        of them ("only x" and "only y" is interpreted to mean "only x&y").

      - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
        want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB

      - Compilation of all SIMD code can be suppressed with
            #define STBI_NO_SIMD
        It should not be necessary to disable SIMD unless you have issues
        compiling (e.g. using an x86 compiler which doesn't support SSE
        intrinsics or that doesn't support the method used to detect
        SSE2 support at run-time), and even those can be reported as
        bugs so I can refine the built-in compile-time checking to be
        smarter.

      - The old STBI_SIMD system which allowed installing a user-defined
        IDCT etc. has been removed. If you need this, don't upgrade. My
        assumption is that almost nobody was doing this, and those who
        were will find the built-in SIMD more satisfactory anyway.

      - RGB values computed for JPEG images are slightly different from
        previous versions of stb_image. (This is due to using less
        integer precision in SIMD.) The C code has been adjusted so
        that the same RGB values will be computed regardless of whether
        SIMD support is available, so your app should always produce
        consistent results. But these results are slightly different from
        previous versions. (Specifically, about 3% of available YCbCr values
        will compute different RGB results from pre-1.49 versions by +-1;
        most of the deviating values are one smaller in the G channel.)

      - If you must produce consistent results with previous versions of
        stb_image, #define STBI_JPEG_OLD and you will get the same results
        you used to; however, you will not get the SIMD speedups for
        the YCbCr-to-RGB conversion step (although you should still see
        significant JPEG speedup from the other changes).

        Please note that STBI_JPEG_OLD is a temporary feature; it will be
        removed in future versions of the library. It is only intended for
        near-term back-compatibility use.


   Latest revision history:
      2.10  (2016-01-22) avoid warning introduced in 2.09
      2.09  (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) partial animated GIF support
                         limited 16-bit PSD support
                         minor bugs, code cleanup, and compiler warnings
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) additional corruption checking
                         stbi_set_flip_vertically_on_load
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
                         progressive JPEG
                         PGM/PPM support
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         STBI_NO_*, STBI_ONLY_*
                         GIF bugfix
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
                         optimize PNG
                         fix bug in interlaced PNG with user-specified channel count

   See end of file for full revision history.


 ============================    Contributors    =========================

 Image formats                          Extensions, features
    Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
    Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
    Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
    Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
    Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
    Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
    Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
    urraka@github (animated gif)           Junggon Kim (PNM comments)
                                           Daniel Gibson (16-bit TGA)

 Optimizations & bugfixes
    Fabian "ryg" Giesen
    Arseny Kapoulkine

 Bug & warning fixes
    Marc LeBlanc            David Woo          Guillaume George   Martins Mozeiko
    Christpher Lloyd        Martin Golini      Jerry Jansson      Joseph Thomson
    Dave Moore              Roy Eltham         Hayaki Saito       Phil Jordan
    Won Chun                Luke Graham        Johan Duparc       Nathan Reed
    the Horde3D community   Thomas Ruf         Ronny Chevalier    Nick Verigakis
    Janez Zemva             John Bartholomew   Michal Cichon      svdijk@github
    Jonathan Blow           Ken Hamada         Tero Hanninen      Baldur Karlsson
    Laurent Gomila          Cort Stratton      Sergio Gonzalez    romigrou@github
    Aruelien Pocheville     Thibault Reuille   Cass Everitt
    Ryamond Barbiero        Paul Du Bois       Engin Manap
    Blazej Dariusz Roszkowski
    Michaelangel007@github


LICENSE

This software is in the public domain. Where that dedication is not
recognized, you are granted a perpetual, irrevocable license to copy,
distribute, and modify this file as you see fit.

*/

#ifndef STBI_INCLUDE_STB_IMAGE_H
#define STBI_INCLUDE_STB_IMAGE_H

// DOCUMENTATION
//
// Limitations:
//    - no 16-bit-per-channel PNG
//    - no 12-bit-per-channel JPEG
//    - no JPEGs with arithmetic coding
//    - no 1-bit BMP
//    - GIF always returns *comp=4
//
// Basic usage (see HDR discussion below for HDR usage):
//    int x,y,n;
//    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
//    // ... process data if not NULL ...
//    // ... x = width, y = height, n = # 8-bit components per pixel ...
//    // ... replace '0' with '1'..'4' to force that many components per pixel
//    // ... but 'n' will always be the number that it would have been if you said 0
//    stbi_image_free(data);
//
// Standard parameters:
//    int *x       -- outputs image width in pixels
//    int *y       -- outputs image height in pixels
//    int *comp    -- outputs # of image components in image file
//    int req_comp -- if non-zero, # of image components requested in result
//
// The return value from an image loader is an 'unsigned char *' which points
// to the pixel data, or NULL on an allocation failure or if the image is
// corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
// with each pixel consisting of N interleaved 8-bit components; the first
// pixel pointed to is top-left-most in the image. There is no padding between
// image scanlines or between pixels, regardless of format. The number of
// components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
// If req_comp is non-zero, *comp has the number of components that _would_
// have been output otherwise. E.g. if you set req_comp to 4, you will always
// get RGBA output, but you can check *comp to see if it's trivially opaque
// because e.g. there were only 3 channels in the source image.
//
// An output image with N components has the following components interleaved
// in this order in each pixel:
//
//     N=#comp     components
//       1           grey
//       2           grey, alpha
//       3           red, green, blue
//       4           red, green, blue, alpha
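//
// As a sketch of this layout (pixel_at is a hypothetical helper, not part of
// stb_image; it assumes <stddef.h> for size_t), the first component of the
// pixel at column px, row py can be located as follows, where x is the image
// width and n is the number of components actually output (req_comp if it
// was non-zero, *comp otherwise):
//
//    static unsigned char *pixel_at(unsigned char *data, int x, int n, int px, int py)
//    {
//       // scanlines are packed, so each one occupies exactly x*n bytes
//       return data + ((size_t) py * x + px) * n;
//    }
//
//    // with n == 4, pixel_at(data,x,4,px,py)[3] is that pixel's alpha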
//
// If image loading fails for any reason, the return value will be NULL,
// and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
// can be queried for an extremely brief, end-user unfriendly explanation
// of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
// compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
// more user-friendly ones.
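//
// A minimal sketch of checking for failure, assuming <stdio.h> is available
// and the failure strings are compiled in (filename is a placeholder):
//
//    int x,y,n;
//    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
//    if (data == NULL)
//       fprintf(stderr, "can't load %s: %s\n", filename, stbi_failure_reason());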
//
// Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
//
// ===========================================================================
//
// Philosophy
//
// stb libraries are designed with the following priorities:
//
//    1. easy to use
//    2. easy to maintain
//    3. good performance
//
// Sometimes I let "good performance" creep up in priority over "easy to maintain",
// and for best performance I may provide less-easy-to-use APIs that give higher
// performance, in addition to the easy to use ones. Nevertheless, it's important
// to keep in mind that from the standpoint of you, a client of this library,
// all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
//
// Some secondary priorities arise directly from the first two, and some of
// these make it more explicit why performance can't be emphasized.
//
//    - Portable ("ease of use")
//    - Small footprint ("easy to maintain")
//    - No dependencies ("ease of use")
//
// ===========================================================================
//
// I/O callbacks
//
// I/O callbacks allow you to read from arbitrary sources, like packaged
// files or some other source. Data read from callbacks are processed
// through a small internal buffer (currently 128 bytes) to try to reduce
// overhead.
//
// The three functions you must define are "read" (reads some bytes of data),
// "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
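//
// As an illustration only, here is a sketch of the three callbacks reading
// from a block of memory. The mem_reader struct and the mem_* functions are
// hypothetical names, bytes/num_bytes stand for your own data, and memcpy
// needs <string.h>. (For real in-memory data, stbi_load_from_memory already
// does this for you.)
//
//    typedef struct { const unsigned char *data; int size, pos; } mem_reader;
//
//    static int mem_read(void *user, char *out, int size)
//    {
//       mem_reader *m = (mem_reader *) user;
//       int left = m->size - m->pos;
//       if (size > left) size = left;      // never read past the end
//       memcpy(out, m->data + m->pos, size);
//       m->pos += size;
//       return size;                       // number of bytes actually read
//    }
//
//    static void mem_skip(void *user, int n)
//    {
//       mem_reader *m = (mem_reader *) user;
//       m->pos += n;                       // n may be negative
//       if (m->pos < 0) m->pos = 0;
//       if (m->pos > m->size) m->pos = m->size;
//    }
//
//    static int mem_eof(void *user)
//    {
//       mem_reader *m = (mem_reader *) user;
//       return m->pos >= m->size;
//    }
//
//    int x,y,n;
//    stbi_io_callbacks cb = { mem_read, mem_skip, mem_eof };
//    mem_reader m = { bytes, num_bytes, 0 };
//    unsigned char *pixels = stbi_load_from_callbacks(&cb, &m, &x, &y, &n, 0);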
//
// ===========================================================================
//
// SIMD support
//
// The JPEG decoder will try to automatically use SIMD kernels on x86 when
// supported by the compiler. For ARM Neon support, you must explicitly
// request it.
//
// (The old do-it-yourself SIMD API is no longer supported in the current
// code.)
//
// On x86, SSE2 will automatically be used when available based on a run-time
// test; if not, the generic C versions are used as a fall-back. On ARM targets,
// the typical path is to have separate builds for NEON and non-NEON devices
// (at least this is true for iOS and Android). Therefore, the NEON support is
// toggled by a build flag: define STBI_NEON to get NEON loops.
//
// The output of the JPEG decoder is slightly different from versions before
// SIMD support was introduced (that is, from versions before 1.49). The
// difference is only +-1 in the 8-bit RGB channels, and only on a small
// fraction of pixels. You can force the pre-1.49 behavior by defining
// STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
// and hence cost some performance.
//
// If for some reason you do not want to use any of the SIMD code, or if
// you have issues compiling it, you can disable it entirely by
// defining STBI_NO_SIMD.
//
// ===========================================================================
//
// HDR image support   (disable by defining STBI_NO_HDR)
//
// stb_image now supports loading HDR images, currently through the
// Radiance .HDR file format, although the support is provided
// generically. You can still load any file through the existing interface;
// if you attempt to load an HDR file, it will be automatically remapped to
// LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
// both of these constants can be reconfigured through this interface:
//
//     stbi_hdr_to_ldr_gamma(2.2f);
//     stbi_hdr_to_ldr_scale(1.0f);
//
// (note, do not use _inverse_ constants; stb_image will invert them
// appropriately).
//
// Additionally, there is a new, parallel interface for loading files as
// (linear) floats to preserve the full dynamic range:
//
//    float *data = stbi_loadf(filename, &x, &y, &n, 0);
//
// If you load LDR images through this interface, those images will
// be promoted to floating point values, run through the inverse of
// constants corresponding to the above:
//
//     stbi_ldr_to_hdr_scale(1.0f);
//     stbi_ldr_to_hdr_gamma(2.2f);
//
// Finally, given a filename (or an open file or memory block--see header
// file for details) containing image data, you can query for the "most
// appropriate" interface to use (that is, whether the image is HDR or
// not), using:
//
//     stbi_is_hdr(char const *filename);
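//
// One way to combine these calls, as a sketch (filename is a placeholder,
// and stdio support is assumed to be compiled in):
//
//    int x,y,n;
//    if (stbi_is_hdr(filename)) {
//       float *fdata = stbi_loadf(filename, &x, &y, &n, 0);
//       // ... use the linear float data if not NULL ...
//       stbi_image_free(fdata);
//    } else {
//       unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
//       // ... use the 8-bit data if not NULL ...
//       stbi_image_free(data);
//    }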
//
// ===========================================================================
//
// iPhone PNG support:
//
// By default we convert iphone-formatted PNGs back to RGB, even though
// they are internally encoded differently. You can disable this conversion
// by calling stbi_convert_iphone_png_to_rgb(0), in which case
// you will always just get the native iphone "format" through (which
// is BGR stored in RGB).
//
// Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
// pixel to remove any premultiplied alpha *only* if the image file explicitly
// says there's premultiplied data (currently only happens in iPhone images,
// and only if iPhone convert-to-rgb processing is on).
//

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmisleading-indentation"
#pragma GCC diagnostic ignored "-Wshift-negative-value"
#pragma GCC diagnostic ignored "-Wimplicit-fallthrough"

#ifndef STBI_NO_STDIO
#include <stdio.h>
#endif // STBI_NO_STDIO

#define STBI_VERSION 1

enum
{
   STBI_default = 0, // only used for req_comp

   STBI_grey       = 1,
   STBI_grey_alpha = 2,
   STBI_rgb        = 3,
   STBI_rgb_alpha  = 4
};

typedef unsigned char stbi_uc;

#ifdef __cplusplus
extern "C" {
#endif

#ifdef STB_IMAGE_STATIC
#define STBIDEF static
#else
#define STBIDEF extern
#endif

//////////////////////////////////////////////////////////////////////////////
//
// PRIMARY API - works on images of any type
//

//
// load image by filename, open file, or memory buffer
//

typedef struct
{
   int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
   void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
   int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
} stbi_io_callbacks;

STBIDEF stbi_uc *stbi_load               (char              const *filename,           int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *comp, int req_comp);
STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *comp, int req_comp);

#ifndef STBI_NO_STDIO
STBIDEF stbi_uc *stbi_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
// for stbi_load_from_file, file pointer is left pointing immediately after image
#endif

#ifndef STBI_NO_LINEAR
   STBIDEF float *stbi_loadf                 (char const *filename,           int *x, int *y, int *comp, int req_comp);
   STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
   STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);

   #ifndef STBI_NO_STDIO
   STBIDEF float *stbi_loadf_from_file  (FILE *f,                int *x, int *y, int *comp, int req_comp);
   #endif
#endif

#ifndef STBI_NO_HDR
   STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
   STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
#endif // STBI_NO_HDR

#ifndef STBI_NO_LINEAR
   STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
   STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
#endif // STBI_NO_LINEAR

// stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename);
STBIDEF int      stbi_is_hdr_from_file(FILE *f);
#endif // STBI_NO_STDIO


// get a VERY brief reason for failure
// NOT THREADSAFE
STBIDEF const char *stbi_failure_reason  (void);

// free the loaded image -- this is just free()
STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);

// get image dimensions & components without fully decoding
STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
STBIDEF int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);

#endif



// for image formats that explicitly notate that they have premultiplied alpha,
// we just return the colors as stored in the file. set this flag to force
// unpremultiplication. results are undefined if the unpremultiply overflows.
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);

// indicate whether we should process iphone images back to canonical format,
// or just pass them through "as-is"
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);

// flip the image vertically, so the first pixel in the output array is the bottom left
STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);

// ZLIB client - used by PNG, available for other purposes

STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);

STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);


#ifdef __cplusplus
}
#endif

//
//
////   end header file   /////////////////////////////////////////////////////
#endif // STBI_INCLUDE_STB_IMAGE_H

#ifdef STB_IMAGE_IMPLEMENTATION

#if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
  || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
  || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
  || defined(STBI_ONLY_ZLIB)
   #ifndef STBI_ONLY_JPEG
   #define STBI_NO_JPEG
   #endif
   #ifndef STBI_ONLY_PNG
   #define STBI_NO_PNG
   #endif
   #ifndef STBI_ONLY_BMP
   #define STBI_NO_BMP
   #endif
   #ifndef STBI_ONLY_PSD
   #define STBI_NO_PSD
   #endif
   #ifndef STBI_ONLY_TGA
   #define STBI_NO_TGA
   #endif
   #ifndef STBI_ONLY_GIF
   #define STBI_NO_GIF
   #endif
   #ifndef STBI_ONLY_HDR
   #define STBI_NO_HDR
   #endif
   #ifndef STBI_ONLY_PIC
   #define STBI_NO_PIC
   #endif
   #ifndef STBI_ONLY_PNM
   #define STBI_NO_PNM
   #endif
#endif

#if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
#define STBI_NO_ZLIB
#endif


#include <stdarg.h>
#include <stddef.h> // ptrdiff_t on osx
#include <stdlib.h>
#include <string.h>

#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
#include <math.h>  // ldexp
#endif

#ifndef STBI_NO_STDIO
#include <stdio.h>
#endif

#ifndef STBI_ASSERT
#include <assert.h>
#define STBI_ASSERT(x) assert(x)
#endif


#ifndef _MSC_VER
   #ifdef __cplusplus
   #define stbi_inline inline
   #else
   #define stbi_inline
   #endif
#else
   #define stbi_inline __forceinline
#endif


#ifdef _MSC_VER
typedef unsigned short stbi__uint16;
typedef   signed short stbi__int16;
typedef unsigned int   stbi__uint32;
typedef   signed int   stbi__int32;
#else
#include <stdint.h>
typedef uint16_t stbi__uint16;
typedef int16_t  stbi__int16;
typedef uint32_t stbi__uint32;
typedef int32_t  stbi__int32;
#endif

// should produce compiler error if size is wrong
typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];

#ifdef _MSC_VER
#define STBI_NOTUSED(v)  (void)(v)
#else
#define STBI_NOTUSED(v)  (void)sizeof(v)
#endif

#ifdef _MSC_VER
#define STBI_HAS_LROTL
#endif

#ifdef STBI_HAS_LROTL
   #define stbi_lrot(x,y)  _lrotl(x,y)
#else
   #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
#endif

#if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
// ok
#elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
// ok
#else
#error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
#endif

#ifndef STBI_MALLOC
#define STBI_MALLOC(sz)           malloc(sz)
#define STBI_REALLOC(p,newsz)     realloc(p,newsz)
#define STBI_FREE(p)              free(p)
#endif

#ifndef STBI_REALLOC_SIZED
#define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
#endif

// x86/x64 detection
#if defined(__x86_64__) || defined(_M_X64)
#define STBI__X64_TARGET
#elif defined(__i386) || defined(_M_IX86)
#define STBI__X86_TARGET
#endif

#if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
// NOTE: it's not clear whether we actually need this for the 64-bit path.
// gcc doesn't support sse2 intrinsics unless you compile with -msse2,
// (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
// this is just broken and gcc are jerks for not fixing it properly
// http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
#define STBI_NO_SIMD
#endif

#if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
// Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
//
// 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
// Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
// As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
// simultaneously enabling "-mstackrealign".
//
// See https://github.com/nothings/stb/issues/81 for more information.
//
// So default to no SSE2 on 32-bit MinGW. If you've read this far and added
// -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
#define STBI_NO_SIMD
#endif

#if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
#define STBI_SSE2
#include <emmintrin.h>

#ifdef _MSC_VER

#if _MSC_VER >= 1400  // not VC6
#include <intrin.h> // __cpuid
static int stbi__cpuid3(void)
{
   int info[4];
   __cpuid(info,1);
   return info[3];
}
#else
static int stbi__cpuid3(void)
{
   int res;
   __asm {
      mov  eax,1
      cpuid
      mov  res,edx
   }
   return res;
}
#endif

#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name

static int stbi__sse2_available()
{
   int info3 = stbi__cpuid3();
   return ((info3 >> 26) & 1) != 0;
}
#else // assume GCC-style if not VC++
#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))

static int stbi__sse2_available()
{
#if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
   // GCC 4.8+ has a nice way to do this
   return __builtin_cpu_supports("sse2");
#else
   // portable way to do this, preferably without using GCC inline ASM?
   // just bail for now.
   return 0;
#endif
}
#endif
#endif

// ARM NEON
#if defined(STBI_NO_SIMD) && defined(STBI_NEON)
#undef STBI_NEON
#endif

#ifdef STBI_NEON
#include <arm_neon.h>
// assume GCC or Clang on ARM targets
#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
#endif

#ifndef STBI_SIMD_ALIGN
#define STBI_SIMD_ALIGN(type, name) type name
#endif

745 ///////////////////////////////////////////////
746 //
747 //  stbi__context struct and start_xxx functions
748 
749 // stbi__context structure is our basic context used by all images, so it
750 // contains all the IO context, plus some basic image information
751 typedef struct
752 {
753    stbi__uint32 img_x, img_y;
754    int img_n, img_out_n;
755 
756    stbi_io_callbacks io;
757    void *io_user_data;
758 
759    int read_from_callbacks;
760    int buflen;
761    stbi_uc buffer_start[128];
762 
763    stbi_uc *img_buffer, *img_buffer_end;
764    stbi_uc *img_buffer_original, *img_buffer_original_end;
765 } stbi__context;
766 
767 
768 static void stbi__refill_buffer(stbi__context *s);

// initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
{
   s->io.read = NULL;
   s->read_from_callbacks = 0;
   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
}

// initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
{
   s->io = *c;
   s->io_user_data = user;
   s->buflen = sizeof(s->buffer_start);
   s->read_from_callbacks = 1;
   s->img_buffer_original = s->buffer_start;
   stbi__refill_buffer(s);
   s->img_buffer_original_end = s->img_buffer_end;
}

#ifndef STBI_NO_STDIO

static int stbi__stdio_read(void *user, char *data, int size)
{
   return (int) fread(data,1,size,(FILE*) user);
}

static void stbi__stdio_skip(void *user, int n)
{
   fseek((FILE*) user, n, SEEK_CUR);
}

static int stbi__stdio_eof(void *user)
{
   return feof((FILE*) user);
}

static stbi_io_callbacks stbi__stdio_callbacks =
{
   stbi__stdio_read,
   stbi__stdio_skip,
   stbi__stdio_eof,
};

static void stbi__start_file(stbi__context *s, FILE *f)
{
   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
}

//static void stop_file(stbi__context *s) { }

#endif // !STBI_NO_STDIO

static void stbi__rewind(stbi__context *s)
{
   // conceptually rewind SHOULD rewind to the beginning of the stream,
   // but we just rewind to the beginning of the initial buffer, because
   // we only use it after doing 'test', which never reads more than 92 bytes
   s->img_buffer = s->img_buffer_original;
   s->img_buffer_end = s->img_buffer_original_end;
}

#ifndef STBI_NO_JPEG
static int      stbi__jpeg_test(stbi__context *s);
static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNG
static int      stbi__png_test(stbi__context *s);
static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_BMP
static int      stbi__bmp_test(stbi__context *s);
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_TGA
static int      stbi__tga_test(stbi__context *s);
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PSD
static int      stbi__psd_test(stbi__context *s);
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_HDR
static int      stbi__hdr_test(stbi__context *s);
static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PIC
static int      stbi__pic_test(stbi__context *s);
static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_GIF
static int      stbi__gif_test(stbi__context *s);
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNM
static int      stbi__pnm_test(stbi__context *s);
static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
#endif

// this is not threadsafe
static const char *stbi__g_failure_reason;

STBIDEF const char *stbi_failure_reason(void)
{
   return stbi__g_failure_reason;
}

static int stbi__err(const char *str)
{
   stbi__g_failure_reason = str;
   return 0;
}

static void *stbi__malloc(size_t size)
{
    return STBI_MALLOC(size);
}

// stbi__err - error
// stbi__errpf - error returning pointer to float
// stbi__errpuc - error returning pointer to unsigned char

#ifdef STBI_NO_FAILURE_STRINGS
   #define stbi__err(x,y)  0
#elif defined(STBI_FAILURE_USERMSG)
   #define stbi__err(x,y)  stbi__err(y)
#else
   #define stbi__err(x,y)  stbi__err(x)
#endif

#define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
#define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))

STBIDEF void stbi_image_free(void *retval_from_stbi_load)
{
   STBI_FREE(retval_from_stbi_load);
}

#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
#endif

#ifndef STBI_NO_HDR
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
#endif

static int stbi__vertically_flip_on_load = 0;

STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
{
    stbi__vertically_flip_on_load = flag_true_if_should_flip;
}
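
// Usage sketch (illustrative only): the flag is a file-level global, so it
// applies to every load that follows until it is changed again, and like the
// failure reason it is not threadsafe. A caller that wants the first row of
// the result to be the bottom of the image (as OpenGL texture uploads
// conventionally expect) might do:
//
//    stbi_set_flip_vertically_on_load(1);
//    // ... later stbi_load* calls now return rows in bottom-to-top order
//    stbi_set_flip_vertically_on_load(0);   // restore the default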

static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNG
   if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_BMP
   if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_GIF
   if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PSD
   if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PIC
   if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNM
   if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp);
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
      // the HDR load can fail; don't convert (or read *x, *y, *comp) in that case
      if (hdr == NULL) return NULL;
      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   }
   #endif

   #ifndef STBI_NO_TGA
   // test tga last because it's a crappy test!
   if (stbi__tga_test(s))
      return stbi__tga_load(s,x,y,comp,req_comp);
   #endif

   return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
}

static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);

   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      stbi_uc temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }

   return result;
}

#ifndef STBI_NO_HDR
static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
{
   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      float temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }
}
#endif

#ifndef STBI_NO_STDIO

static FILE *stbi__fopen(char const *filename, char const *mode)
{
   FILE *f;
#if defined(_MSC_VER) && _MSC_VER >= 1400
   if (0 != fopen_s(&f, filename, mode))
      f=0;
#else
   f = fopen(filename, mode);
#endif
   return f;
}


STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   FILE *f = stbi__fopen(filename, "rb");
   unsigned char *result;
   if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   result = stbi_load_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}
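
// Usage sketch (illustrative only; "image.png" is a placeholder filename and
// the error handling is one possible choice, not the required one):
//
//    int w, h, n;
//    unsigned char *pixels = stbi_load("image.png", &w, &h, &n, 4); // ask for 4 channels
//    if (pixels == NULL) {
//       const char *why = stbi_failure_reason();
//       fprintf(stderr, "load failed: %s\n", why ? why : "(no reason recorded)");
//    } else {
//       // pixels holds w*h pixels of 4 bytes each, because req_comp was 4
//       stbi_image_free(pixels);
//    }
//
// The NULL check on the reason string is there because stbi__g_failure_reason
// is never assigned when STBI_NO_FAILURE_STRINGS is defined.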

STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result;
   stbi__context s;
   stbi__start_file(&s,f);
   result = stbi__load_flip(&s,x,y,comp,req_comp);
   if (result) {
      // need to 'unget' all the characters in the IO buffer
      fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   }
   return result;
}
#endif //!STBI_NO_STDIO

STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}

STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}
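
// Usage sketch (illustrative only): one way a caller might implement the
// three callbacks, mirroring the stdio versions above. The names mem_reader,
// mem_read, mem_skip and mem_eof, and the variables bytes/nbytes, are
// hypothetical and not part of this library.
//
//    typedef struct { const unsigned char *data; int size, pos; } mem_reader;
//
//    static int mem_read(void *user, char *out, int n)
//    {
//       mem_reader *r = (mem_reader *) user;
//       int left = r->size - r->pos;
//       if (left < 0) left = 0;          // a skip may have moved past the end
//       if (n > left) n = left;
//       memcpy(out, r->data + r->pos, n);
//       r->pos += n;
//       return n;                        // 0 is treated as end of data
//    }
//    static void mem_skip(void *user, int n) { ((mem_reader *) user)->pos += n; }
//    static int  mem_eof (void *user)
//    {
//       mem_reader *r = (mem_reader *) user;
//       return r->pos >= r->size;        // nonzero once everything is consumed
//    }
//
//    stbi_io_callbacks cb = { mem_read, mem_skip, mem_eof };
//    mem_reader r = { bytes, nbytes, 0 };
//    int w, h, n;
//    unsigned char *pixels = stbi_load_from_callbacks(&cb, &r, &w, &h, &n, 0);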

#ifndef STBI_NO_LINEAR
static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *data;
   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
      if (hdr_data)
         stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
      return hdr_data;
   }
   #endif
   data = stbi__load_flip(s, x, y, comp, req_comp);
   if (data)
      return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
}

STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   float *result;
   FILE *f = stbi__fopen(filename, "rb");
   if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}
#endif // !STBI_NO_STDIO

#endif // !STBI_NO_LINEAR

// these is-hdr-or-not functions are defined regardless of whether STBI_NO_HDR
// is defined, for API simplicity; if STBI_NO_HDR is defined, they always
// report false!

STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(buffer);
   STBI_NOTUSED(len);
   return 0;
   #endif
}

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename)
{
   FILE *f = stbi__fopen(filename, "rb");
   int result=0;
   if (f) {
      result = stbi_is_hdr_from_file(f);
      fclose(f);
   }
   return result;
}

STBIDEF int      stbi_is_hdr_from_file(FILE *f)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(f);
   return 0;
   #endif
}
#endif // !STBI_NO_STDIO

STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(clbk);
   STBI_NOTUSED(user);
   return 0;
   #endif
}

#ifndef STBI_NO_LINEAR
static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;

STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
#endif

static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;

STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
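
// Usage sketch (illustrative only; the values are examples, not
// recommendations): the two hdr_to_ldr setters store reciprocals, so that
// stbi__hdr_to_ldr can multiply by the inverse scale and raise to the inverse
// gamma without dividing per pixel.
//
//    stbi_hdr_to_ldr_gamma(2.2f);   // stbi__h2l_gamma_i becomes 1/2.2
//    stbi_hdr_to_ldr_scale(2.0f);   // stbi__h2l_scale_i becomes 0.5
//
// Both setters divide by their argument, so a caller should avoid passing 0.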


//////////////////////////////////////////////////////////////////////////////
//
// Common code used by all image loaders
//

enum
{
   STBI__SCAN_load=0,
   STBI__SCAN_type,
   STBI__SCAN_header
};

static void stbi__refill_buffer(stbi__context *s)
{
   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   if (n == 0) {
      // at end of file, treat same as if from memory, but need to handle case
      // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
      s->read_from_callbacks = 0;
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start+1;
      *s->img_buffer = 0;
   } else {
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start + n;
   }
}

stbi_inline static stbi_uc stbi__get8(stbi__context *s)
{
   if (s->img_buffer < s->img_buffer_end)
      return *s->img_buffer++;
   if (s->read_from_callbacks) {
      stbi__refill_buffer(s);
      return *s->img_buffer++;
   }
   return 0;
}

stbi_inline static int stbi__at_eof(stbi__context *s)
{
   if (s->io.read) {
      if (!(s->io.eof)(s->io_user_data)) return 0;
      // if feof() is true, check if buffer = end
      // special case: we've only got the special 0 character at the end
      if (s->read_from_callbacks == 0) return 1;
   }

   return s->img_buffer >= s->img_buffer_end;
}

static void stbi__skip(stbi__context *s, int n)
{
   if (n < 0) {
      s->img_buffer = s->img_buffer_end;
      return;
   }
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         s->img_buffer = s->img_buffer_end;
         (s->io.skip)(s->io_user_data, n - blen);
         return;
      }
   }
   s->img_buffer += n;
}

static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
{
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         int res, count;

         memcpy(buffer, s->img_buffer, blen);

         count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
         res = (count == (n-blen));
         s->img_buffer = s->img_buffer_end;
         return res;
      }
   }

   if (s->img_buffer+n <= s->img_buffer_end) {
      memcpy(buffer, s->img_buffer, n);
      s->img_buffer += n;
      return 1;
   } else
      return 0;
}

static int stbi__get16be(stbi__context *s)
{
   int z = stbi__get8(s);
   return (z << 8) + stbi__get8(s);
}

static stbi__uint32 stbi__get32be(stbi__context *s)
{
   stbi__uint32 z = stbi__get16be(s);
   return (z << 16) + stbi__get16be(s);
}
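
// Worked example: if the next four bytes in the stream are 0x12 0x34 0x56 0x78,
// the first stbi__get16be call yields (0x12 << 8) + 0x34 = 0x1234, the second
// yields 0x5678, and stbi__get32be returns (0x1234 << 16) + 0x5678 = 0x12345678.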

#if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
// nothing
#else
static int stbi__get16le(stbi__context *s)
{
   int z = stbi__get8(s);
   return z + (stbi__get8(s) << 8);
}
#endif

#ifndef STBI_NO_BMP
static stbi__uint32 stbi__get32le(stbi__context *s)
{
   stbi__uint32 z = stbi__get16le(s);
   return z + (stbi__get16le(s) << 16);
}
#endif

#define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings


//////////////////////////////////////////////////////////////////////////////
//
//  generic converter from built-in img_n to req_comp
//    individual types do this automatically as much as possible (e.g. jpeg
//    does all cases internally since it needs to colorspace convert anyway,
//    and it never has alpha, so very few cases ). png can automatically
//    interleave an alpha=255 channel, but falls back to this for other cases
//
//  assume data buffer is malloced, so malloc a new one and free that one
//  only failure mode is malloc failing

static stbi_uc stbi__compute_y(int r, int g, int b)
{
   return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
}
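
// Worked example: the weights 77, 150 and 29 sum to 256, so they act as
// roughly 0.30, 0.59 and 0.11 in 8-bit fixed point, and the final >> 8
// divides by 256:
//    stbi__compute_y(255,255,255) = (255*256) >> 8 = 65280 >> 8 = 255
//    stbi__compute_y(255,  0,  0) = (255*77)  >> 8 = 19635 >> 8 =  76
//    stbi__compute_y(  0,255,  0) = (255*150) >> 8 = 38250 >> 8 = 149
//    stbi__compute_y(  0,  0,255) = (255*29)  >> 8 =  7395 >> 8 =  28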

static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
{
   int i,j;
   unsigned char *good;

   if (req_comp == img_n) return data;
   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);

   good = (unsigned char *) stbi__malloc(req_comp * x * y);
   if (good == NULL) {
      STBI_FREE(data);
      return stbi__errpuc("outofmem", "Out of memory");
   }

   for (j=0; j < (int) y; ++j) {
      unsigned char *src  = data + j * x * img_n   ;
      unsigned char *dest = good + j * x * req_comp;

      #define COMBO(a,b)  ((a)*8+(b))
      #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
      // convert source image with img_n components to one with req_comp components;
      // avoid switch per pixel, so use switch per scanline and massive macros
      switch (COMBO(img_n, req_comp)) {
         CASE(1,2) dest[0]=src[0], dest[1]=255; break;
         CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
         CASE(2,1) dest[0]=src[0]; break;
         CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
         CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
         CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
         CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
         CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
         default: STBI_ASSERT(0);
      }
      #undef CASE
   }

   STBI_FREE(data);
   return good;
}

#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
{
   int i,k,n;
   float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
   if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
      }
      if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
   }
   STBI_FREE(data);
   return output;
}
#endif

#ifndef STBI_NO_HDR
#define stbi__float2int(x)   ((int) (x))
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
{
   int i,k,n;
   stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
   if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
      if (k < comp) {
         float z = data[i*comp+k] * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
   }
   STBI_FREE(data);
   return output;
}
#endif

//////////////////////////////////////////////////////////////////////////////
//
//  "baseline" JPEG/JFIF decoder
//
//    simple implementation
//      - doesn't support delayed output of y-dimension
//      - simple interface (only one output format: 8-bit interleaved RGB)
//      - doesn't try to recover corrupt jpegs
//      - doesn't allow partial loading, loading multiple at once
//      - still fast on x86 (copying globals into locals doesn't help x86)
//      - allocates lots of intermediate memory (full size of all components)
//        - non-interleaved case requires this anyway
//        - allows good upsampling (see next)
//    high-quality
//      - upsampled channels are bilinearly interpolated, even across blocks
//      - quality integer IDCT derived from IJG's 'slow'
//    performance
//      - fast huffman; reasonable integer IDCT
//      - some SIMD kernels for common paths on targets with SSE2/NEON
//      - uses a lot of intermediate memory, could cache poorly

#ifndef STBI_NO_JPEG

// huffman decoding acceleration
#define FAST_BITS   9  // larger handles more cases; smaller stomps less cache

typedef struct
{
   stbi_uc  fast[1 << FAST_BITS];
   // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   stbi__uint16 code[256];
   stbi_uc  values[256];
   stbi_uc  size[257];
   unsigned int maxcode[18];
   int    delta[17];   // old 'firstsymbol' - old 'firstcode'
} stbi__huffman;

typedef struct
{
   stbi__context *s;
   stbi__huffman huff_dc[4];
   stbi__huffman huff_ac[4];
   stbi_uc dequant[4][64];
   stbi__int16 fast_ac[4][1 << FAST_BITS];

// sizes for components, interleaved MCUs
   int img_h_max, img_v_max;
   int img_mcu_x, img_mcu_y;
   int img_mcu_w, img_mcu_h;

// definition of jpeg image component
   struct
   {
      int id;
      int h,v;
      int tq;
      int hd,ha;
      int dc_pred;

      int x,y,w2,h2;
      stbi_uc *data;
      void *raw_data, *raw_coeff;
      stbi_uc *linebuf;
      short   *coeff;   // progressive only
      int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
   } img_comp[4];

   stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   int            code_bits;   // number of valid bits
   unsigned char  marker;      // marker seen while filling entropy buffer
   int            nomore;      // flag if we saw a marker so must stop

   int            progressive;
   int            spec_start;
   int            spec_end;
   int            succ_high;
   int            succ_low;
   int            eob_run;

   int scan_n, order[4];
   int restart_interval, todo;

// kernels
   void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
} stbi__jpeg;

static int stbi__build_huffman(stbi__huffman *h, int *count)
{
   int i,j,k=0,code;
   // build size list for each symbol (from JPEG spec)
   for (i=0; i < 16; ++i)
      for (j=0; j < count[i]; ++j)
         h->size[k++] = (stbi_uc) (i+1);
   h->size[k] = 0;

   // compute actual symbols (from jpeg spec)
   code = 0;
   k = 0;
   for(j=1; j <= 16; ++j) {
      // compute delta to add to code to compute symbol id
      h->delta[j] = k - code;
      if (h->size[k] == j) {
         while (h->size[k] == j)
            h->code[k++] = (stbi__uint16) (code++);
         if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
      }
      // compute largest code + 1 for this size, preshifted as needed later
      h->maxcode[j] = code << (16-j);
      code <<= 1;
   }
   h->maxcode[j] = 0xffffffff;

   // build non-spec acceleration table; 255 is flag for not-accelerated
   memset(h->fast, 255, 1 << FAST_BITS);
   for (i=0; i < k; ++i) {
      int s = h->size[i];
      if (s <= FAST_BITS) {
         int c = h->code[i] << (FAST_BITS-s);
         int m = 1 << (FAST_BITS-s);
         for (j=0; j < m; ++j) {
            h->fast[c+j] = (stbi_uc) i;
         }
      }
   }
   return 1;
}
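
// Worked example (a hand trace of the loops above on a made-up table, not
// taken from any real file): count[] = {0, 2, 1, 0, ...} describes two 2-bit
// codes and one 3-bit code.
//
//    size[]     = {2, 2, 3, 0}
//    code[]     = {0 (binary 00), 1 (binary 01), 4 (binary 100)}
//    delta[2]   = 0 - 0 =  0
//    delta[3]   = 2 - 4 = -2      so the 3-bit code 4 maps to symbol index 4 + (-2) = 2
//    maxcode[2] = 2 << 14 = 0x8000
//    maxcode[3] = 5 << 13 = 0xA000
//
// With FAST_BITS == 9 the acceleration table then holds index 0 in
// fast[0..127], index 1 in fast[128..255], index 2 in fast[256..319], and
// 255 (not accelerated) everywhere else.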

// build a table that decodes both magnitude and value of small ACs in
// one go.
static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
{
   int i;
   for (i=0; i < (1 << FAST_BITS); ++i) {
      stbi_uc fast = h->fast[i];
      fast_ac[i] = 0;
      if (fast < 255) {
         int rs = h->values[fast];
         int run = (rs >> 4) & 15;
         int magbits = rs & 15;
         int len = h->size[fast];

         if (magbits && len + magbits <= FAST_BITS) {
            // magnitude code followed by receive_extend code
            int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
            int m = 1 << (magbits - 1);
            // shift an unsigned all-ones value: left-shifting -1 is undefined behavior
            if (k < m) k += (~0U << magbits) + 1;
            // if the result is small enough, we can fit it in fast_ac table
            if (k >= -128 && k <= 127)
               fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
         }
      }
   }
}

static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
{
   do {
      int b = j->nomore ? 0 : stbi__get8(j->s);
      if (b == 0xff) {
         int c = stbi__get8(j->s);
         if (c != 0) {
            j->marker = (unsigned char) c;
            j->nomore = 1;
            return;
         }
      }
      j->code_buffer |= b << (24 - j->code_bits);
      j->code_bits += 8;
   } while (j->code_bits <= 24);
}

// (1 << n) - 1
static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};

// decode a jpeg huffman value from the bitstream
stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
{
   unsigned int temp;
   int c,k;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   // look at the top FAST_BITS and determine what symbol ID it is,
   // if the code is <= FAST_BITS
   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   k = h->fast[c];
   if (k < 255) {
      int s = h->size[k];
      if (s > j->code_bits)
         return -1;
      j->code_buffer <<= s;
      j->code_bits -= s;
      return h->values[k];
   }

   // naive test is to shift the code_buffer down so k bits are
   // valid, then test against maxcode. To speed this up, we've
   // preshifted maxcode left so that it has (16-k) 0s at the
   // end; in other words, regardless of the number of bits, it
   // wants to be compared against something shifted to have 16;
   // that way we don't need to shift inside the loop.
   temp = j->code_buffer >> 16;
   for (k=FAST_BITS+1 ; ; ++k)
      if (temp < h->maxcode[k])
         break;
   if (k == 17) {
      // error! code not found
      j->code_bits -= 16;
      return -1;
   }

   if (k > j->code_bits)
      return -1;

   // convert the huffman code to the symbol id
   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);

   // convert the id to a symbol
   j->code_bits -= k;
   j->code_buffer <<= k;
   return h->values[c];
}
1660 
// bias[n] = (-1<<n) + 1
static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};

// combined JPEG 'receive' and JPEG 'extend', since baseline
// always extends everything it receives.
stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
{
   unsigned int k;
   int sgn;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);

   sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   k = stbi_lrot(j->code_buffer, n);
   STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k + (stbi__jbias[n] & ~sgn);
}

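/* Illustrative sketch, not part of stb_image: the 'extend' half of the
   function above maps an n-bit unsigned field to a signed coefficient. A
   field whose leading bit is 0 encodes a negative value and gets bias[n]
   added; a field whose leading bit is 1 is already the positive value. The
   hypothetical helper below is a branchy reference version of that rule,
   without the bit-buffer handling.

```c
#include <assert.h>

// Interpret the n-bit unsigned field `bits` as a signed JPEG coefficient.
static int jpeg_extend(unsigned int bits, int n)
{
   assert(n >= 1 && n <= 15);
   assert(bits < (1u << n));
   if (bits < (1u << (n - 1)))               // leading bit 0: negative value
      return (int) bits - ((1 << n) - 1);    // same as bits + bias[n]
   return (int) bits;                        // leading bit 1: positive value
}
```

   For n = 3, the fields 0..3 decode to -7..-4 and the fields 4..7 decode to
   4..7, so every 3-bit magnitude class value is covered exactly once.
*/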
// get some unsigned bits
stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
{
   unsigned int k;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   k = stbi_lrot(j->code_buffer, n);
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k;
}

stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
{
   unsigned int k;
   if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   k = j->code_buffer;
   j->code_buffer <<= 1;
   --j->code_bits;
   return k & 0x80000000;
}

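/* Illustrative sketch, not part of stb_image: stbi__jpeg_get_bits above
   pulls n bits off the top of the bit buffer with one rotate and two masks.
   The hypothetical helpers below show the same idea on a bare 32-bit value;
   they assume unsigned int is 32 bits wide, and rotl32 stands in for
   stbi_lrot.

```c
#include <assert.h>

// Rotate a 32-bit value left by n bits (valid for 1 <= n <= 31).
static unsigned int rotl32(unsigned int x, int n)
{
   return ((x << n) | (x >> (32 - n))) & 0xffffffffu;
}

// Return the top n bits of *buffer as an unsigned value and leave the
// remaining bits shifted up to the top of *buffer.
static unsigned int take_top_bits(unsigned int *buffer, int n)
{
   unsigned int mask, k;
   assert(n >= 1 && n <= 16);
   mask = (1u << n) - 1;
   k = rotl32(*buffer, n);      // wanted bits are now the low n bits
   *buffer = k & ~mask;         // everything else is the new buffer
   return k & mask;
}
```
*/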
// given a value that's at position X in the zigzag stream,
// where does it appear in the 8x8 matrix coded as row-major?
static stbi_uc stbi__jpeg_dezigzag[64+15] =
{
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63,
   // let corrupt input sample past end
   63, 63, 63, 63, 63, 63, 63, 63,
   63, 63, 63, 63, 63, 63, 63
};

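/* Illustrative sketch, not part of stb_image: the first 64 entries of the
   table above can be regenerated by walking the 8x8 block one anti-diagonal
   at a time and alternating direction. The hypothetical function below does
   that; it is only meant to show where the numbers come from, and it does
   not produce the 15 padding entries.

```c
#include <assert.h>

// Fill order[0..63] with the row-major index of each coefficient in
// zigzag scan order.
static void build_zigzag(unsigned char order[64])
{
   int n = 0, d, i;
   for (d = 0; d < 15; ++d) {               // d = row + col, one anti-diagonal at a time
      int lo = d < 8 ? 0 : d - 7;           // smallest valid row on this diagonal
      int hi = d < 8 ? d : 7;               // largest valid row on this diagonal
      for (i = lo; i <= hi; ++i) {
         // odd diagonals run top-right to bottom-left, even ones the other way
         int row = (d & 1) ? i : lo + hi - i;
         int col = d - row;
         order[n++] = (unsigned char) (row * 8 + col);
      }
   }
   assert(n == 64);
}
```
*/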
// decode one 64-entry block--
static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
{
   int diff,dc,k;
   int t;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   t = stbi__jpeg_huff_decode(j, hdc);
   if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");

   // 0 all the ac values now so we can do it 32-bits at a time
   memset(data,0,64*sizeof(data[0]));

   diff = t ? stbi__extend_receive(j, t) : 0;
   dc = j->img_comp[b].dc_pred + diff;
   j->img_comp[b].dc_pred = dc;
   data[0] = (short) (dc * dequant[0]);

   // decode AC components, see JPEG spec
   k = 1;
   do {
      unsigned int zig;
      int c,r,s;
      if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
      c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
      r = fac[c];
      if (r) { // fast-AC path
         k += (r >> 4) & 15; // run
         s = r & 15; // combined length
         j->code_buffer <<= s;
         j->code_bits -= s;
         // decode into unzigzag'd location
         zig = stbi__jpeg_dezigzag[k++];
         data[zig] = (short) ((r >> 8) * dequant[zig]);
      } else {
         int rs = stbi__jpeg_huff_decode(j, hac);
         if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
         s = rs & 15;
         r = rs >> 4;
         if (s == 0) {
            if (rs != 0xf0) break; // end block
            k += 16;
         } else {
            k += r;
            // decode into unzigzag'd location
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
         }
      }
   } while (k < 64);
   return 1;
}

static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
{
   int diff,dc;
   int t;
   if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   if (j->succ_high == 0) {
      // first scan for DC coefficient, must be first
      memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
      t = stbi__jpeg_huff_decode(j, hdc);
      // a failed decode returns -1, which must not reach stbi__extend_receive
      if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
      diff = t ? stbi__extend_receive(j, t) : 0;

      dc = j->img_comp[b].dc_pred + diff;
      j->img_comp[b].dc_pred = dc;
      // multiply instead of shifting, since dc may be negative
      data[0] = (short) (dc * (1 << j->succ_low));
   } else {
      // refinement scan for DC coefficient
      if (stbi__jpeg_get_bit(j))
         data[0] += (short) (1 << j->succ_low);
   }
   return 1;
}

// @OPTIMIZE: store non-zigzagged during the decode passes,
// and only de-zigzag when dequantizing
static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
{
   int k;
   if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->succ_high == 0) {
      int shift = j->succ_low;

      if (j->eob_run) {
         --j->eob_run;
         return 1;
      }

      k = j->spec_start;
      do {
         unsigned int zig;
         int c,r,s;
         if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
         c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
         r = fac[c];
         if (r) { // fast-AC path
            k += (r >> 4) & 15; // run
            s = r & 15; // combined length
            j->code_buffer <<= s;
            j->code_bits -= s;
            zig = stbi__jpeg_dezigzag[k++];
            // multiply instead of shifting, since the value may be negative
            data[zig] = (short) ((r >> 8) * (1 << shift));
         } else {
            int rs = stbi__jpeg_huff_decode(j, hac);
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r);
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  --j->eob_run;
                  break;
               }
               k += 16;
            } else {
               k += r;
               zig = stbi__jpeg_dezigzag[k++];
               data[zig] = (short) (stbi__extend_receive(j,s) * (1 << shift));
            }
         }
      } while (k <= j->spec_end);
   } else {
      // refinement scan for these AC coefficients

      short bit = (short) (1 << j->succ_low);

      if (j->eob_run) {
         --j->eob_run;
         for (k = j->spec_start; k <= j->spec_end; ++k) {
            short *p = &data[stbi__jpeg_dezigzag[k]];
            if (*p != 0)
               if (stbi__jpeg_get_bit(j))
                  if ((*p & bit)==0) {
                     if (*p > 0)
                        *p += bit;
                     else
                        *p -= bit;
                  }
         }
      } else {
         k = j->spec_start;
         do {
            int r,s;
            int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r) - 1;
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  r = 64; // force end of block
               } else {
                  // r=15 s=0 should write 16 0s, so we just do
                  // a run of 15 0s and then write s (which is 0),
                  // so we don't have to do anything special here
               }
            } else {
               if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
               // sign bit
               if (stbi__jpeg_get_bit(j))
                  s = bit;
               else
                  s = -bit;
            }

            // advance by r
            while (k <= j->spec_end) {
               short *p = &data[stbi__jpeg_dezigzag[k++]];
               if (*p != 0) {
                  if (stbi__jpeg_get_bit(j))
                     if ((*p & bit)==0) {
                        if (*p > 0)
                           *p += bit;
                        else
                           *p -= bit;
                     }
               } else {
                  if (r == 0) {
                     *p = (short) s;
                     break;
                  }
                  --r;
               }
            }
         } while (k <= j->spec_end);
      }
   }
   return 1;
}

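/* Illustrative sketch, not part of stb_image: in the progressive AC scans
   above, a symbol with s == 0 and r < 15 is an end-of-band run. It covers
   (1 << r) + extra blocks, where extra is the r-bit value that follows, and
   that count includes the current block, which is why both branches store
   the count minus one in eob_run. The helper name below is hypothetical.

```c
#include <assert.h>

// Number of blocks still covered by an end-of-band run after the current
// block, given the run class r and the r extra bits read from the stream.
static int eob_blocks_remaining(int r, int extra_bits)
{
   assert(r >= 0 && r <= 14);
   assert(extra_bits >= 0 && extra_bits < (1 << r));
   return (1 << r) + extra_bits - 1;
}
```

   With r = 0 the run ends with the current block, so nothing remains for
   the following blocks.
*/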
// clamp an int to the 0..255 range and return it as a stbi_uc
stbi_inline static stbi_uc stbi__clamp(int x)
{
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (stbi_uc) x;
}

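/* Illustrative sketch, not part of stb_image: the single-test trick above
   works because converting a negative int to unsigned yields a value far
   above 255, so one unsigned comparison detects both kinds of out-of-range
   input. The hypothetical functions below compare that approach with a
   plain two-comparison clamp.

```c
#include <assert.h>

// Plain reference clamp: two comparisons on every call.
static unsigned char clamp_simple(int x)
{
   if (x < 0) return 0;
   if (x > 255) return 255;
   return (unsigned char) x;
}

// Same result, but in-range values pay for only one comparison.
static unsigned char clamp_one_test(int x)
{
   if ((unsigned int) x > 255)
      return x < 0 ? 0 : 255;
   return (unsigned char) x;
}

// Check that both versions agree on every value in [lo, hi].
static void check_clamp_range(int lo, int hi)
{
   int x;
   for (x = lo; x <= hi; ++x)
      assert(clamp_simple(x) == clamp_one_test(x));
}
```
*/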
#define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
#define stbi__fsh(x)  ((x) << 12)

// derived from jidctint -- DCT_ISLOW
#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   p2 = s2;                                    \
   p3 = s6;                                    \
   p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
   t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
   t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
   p2 = s0;                                    \
   p3 = s4;                                    \
   t0 = stbi__fsh(p2+p3);                      \
   t1 = stbi__fsh(p2-p3);                      \
   x0 = t0+t3;                                 \
   x3 = t0-t3;                                 \
   x1 = t1+t2;                                 \
   x2 = t1-t2;                                 \
   t0 = s7;                                    \
   t1 = s5;                                    \
   t2 = s3;                                    \
   t3 = s1;                                    \
   p3 = t0+t2;                                 \
   p4 = t1+t3;                                 \
   p1 = t0+t3;                                 \
   p2 = t1+t2;                                 \
   p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
   t0 = t0*stbi__f2f( 0.298631336f);           \
   t1 = t1*stbi__f2f( 2.053119869f);           \
   t2 = t2*stbi__f2f( 3.072711026f);           \
   t3 = t3*stbi__f2f( 1.501321110f);           \
   p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
   p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
   p3 = p3*stbi__f2f(-1.961570560f);           \
   p4 = p4*stbi__f2f(-0.390180644f);           \
   t3 += p1+p4;                                \
   t2 += p2+p3;                                \
   t1 += p2+p4;                                \
   t0 += p1+p3;

static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
{
   int i,val[64],*v=val;
   stbi_uc *o;
   short *d = data;

   // columns
   for (i=0; i < 8; ++i,++d, ++v) {
      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
           && d[40]==0 && d[48]==0 && d[56]==0) {
         //    no shortcut                 0     seconds
         //    (1|2|3|4|5|6|7)==0          0     seconds
         //    all separate               -0.047 seconds
         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
         // same as d[0] << 2, without left-shifting a negative value
         int dcterm = d[0] * 4;
         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
      } else {
         STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
         // constants scaled things up by 1<<12; let's bring them back
         // down, but keep 2 extra bits of precision
         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
         v[ 0] = (x0+t3) >> 10;
         v[56] = (x0-t3) >> 10;
         v[ 8] = (x1+t2) >> 10;
         v[48] = (x1-t2) >> 10;
         v[16] = (x2+t1) >> 10;
         v[40] = (x2-t1) >> 10;
         v[24] = (x3+t0) >> 10;
         v[32] = (x3-t0) >> 10;
      }
   }

   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
      // no fast case since the first 1D IDCT spread components out
      STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
      // constants scaled things up by 1<<12, plus we had 1<<2 from first
      // loop, plus horizontal and vertical each scale by sqrt(8) so together
      // we've got an extra 1<<3, so 1<<17 total we need to remove.
      // so we want to round that, which means adding 0.5 * 1<<17,
      // aka 65536. Also, we'll end up with -128 to 127 that we want
      // to encode as 0..255 by adding 128, so we'll add that before the shift
      x0 += 65536 + (128<<17);
      x1 += 65536 + (128<<17);
      x2 += 65536 + (128<<17);
      x3 += 65536 + (128<<17);
      // tried computing the shifts into temps, or'ing the temps to see
      // if any were out of range, but that was slower
      o[0] = stbi__clamp((x0+t3) >> 17);
      o[7] = stbi__clamp((x0-t3) >> 17);
      o[1] = stbi__clamp((x1+t2) >> 17);
      o[6] = stbi__clamp((x1-t2) >> 17);
      o[2] = stbi__clamp((x2+t1) >> 17);
      o[5] = stbi__clamp((x2-t1) >> 17);
      o[3] = stbi__clamp((x3+t0) >> 17);
      o[4] = stbi__clamp((x3-t0) >> 17);
   }
}

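/* Illustrative sketch, not part of stb_image: the IDCT above works in fixed
   point. Constants are scaled by 1 << 12 (stbi__f2f), and results are
   brought back down by adding half of the scale before the right shift so
   that the shift rounds instead of truncating. The names below are
   hypothetical, and the sketch only covers non-negative values.

```c
#include <assert.h>

#define FIX_BITS 12                                     // same scale as stbi__f2f / stbi__fsh
#define FIX(x)   ((int) ((x) * (1 << FIX_BITS) + 0.5))  // positive constant to fixed point

// Multiply an integer sample by a fixed-point constant, then round the
// product back to an integer.
static int fix_mul_round(int sample, int constant)
{
   int product = sample * constant;                     // carries FIX_BITS fractional bits
   assert(product >= 0);                                // sketch handles non-negative products only
   return (product + (1 << (FIX_BITS - 1))) >> FIX_BITS;
}
```

   The same pattern explains the biases in stbi__idct_block: 512 is half of
   1 << 10 for the column pass, and 65536 is half of 1 << 17 for the row
   pass.
*/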
#ifdef STBI_SSE2
// sse2 integer IDCT. not the fastest possible implementation but it
// produces bit-identical results to the generic C version so it's
// fully "transparent".
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
{
   // This is constructed to match our regular (generic) integer IDCT exactly.
   __m128i row0, row1, row2, row3, row4, row5, row6, row7;
   __m128i tmp;

   // dot product constant: even elems=x, odd elems=y
   #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))

   // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   // out(1) = c1[even]*x + c1[odd]*y
   #define dct_rot(out0,out1, x,y,c0,c1) \
      __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
      __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
      __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
      __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
      __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
      __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)

   // out = in << 12  (in 16-bit, out 32-bit)
   #define dct_widen(out, in) \
      __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
      __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)

   // wide add
   #define dct_wadd(out, a, b) \
      __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_add_epi32(a##_h, b##_h)

   // wide sub
   #define dct_wsub(out, a, b) \
      __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)

   // butterfly a/b, add bias, then shift by "s" and pack
   #define dct_bfly32o(out0, out1, a,b,bias,s) \
      { \
         __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
         __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
         dct_wadd(sum, abiased, b); \
         dct_wsub(dif, abiased, b); \
         out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
         out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
      }

   // 8-bit interleave step (for transposes)
   #define dct_interleave8(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi8(a, b); \
      b = _mm_unpackhi_epi8(tmp, b)

   // 16-bit interleave step (for transposes)
   #define dct_interleave16(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi16(a, b); \
      b = _mm_unpackhi_epi16(tmp, b)

   #define dct_pass(bias,shift) \
      { \
         /* even part */ \
         dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
         __m128i sum04 = _mm_add_epi16(row0, row4); \
         __m128i dif04 = _mm_sub_epi16(row0, row4); \
         dct_widen(t0e, sum04); \
         dct_widen(t1e, dif04); \
         dct_wadd(x0, t0e, t3e); \
         dct_wsub(x3, t0e, t3e); \
         dct_wadd(x1, t1e, t2e); \
         dct_wsub(x2, t1e, t2e); \
         /* odd part */ \
         dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
         dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
         __m128i sum17 = _mm_add_epi16(row1, row7); \
         __m128i sum35 = _mm_add_epi16(row3, row5); \
         dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
         dct_wadd(x4, y0o, y4o); \
         dct_wadd(x5, y1o, y5o); \
         dct_wadd(x6, y2o, y5o); \
         dct_wadd(x7, y3o, y4o); \
         dct_bfly32o(row0,row7, x0,x7,bias,shift); \
         dct_bfly32o(row1,row6, x1,x6,bias,shift); \
         dct_bfly32o(row2,row5, x2,x5,bias,shift); \
         dct_bfly32o(row3,row4, x3,x4,bias,shift); \
      }

   __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
   __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
   __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
   __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
   __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
   __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
   __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
   __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));

   // rounding biases in column/row passes, see stbi__idct_block for explanation.
   __m128i bias_0 = _mm_set1_epi32(512);
   __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));

   // load
   row0 = _mm_load_si128((const __m128i *) (data + 0*8));
   row1 = _mm_load_si128((const __m128i *) (data + 1*8));
   row2 = _mm_load_si128((const __m128i *) (data + 2*8));
   row3 = _mm_load_si128((const __m128i *) (data + 3*8));
   row4 = _mm_load_si128((const __m128i *) (data + 4*8));
   row5 = _mm_load_si128((const __m128i *) (data + 5*8));
   row6 = _mm_load_si128((const __m128i *) (data + 6*8));
   row7 = _mm_load_si128((const __m128i *) (data + 7*8));

   // column pass
   dct_pass(bias_0, 10);

   {
      // 16bit 8x8 transpose pass 1
      dct_interleave16(row0, row4);
      dct_interleave16(row1, row5);
      dct_interleave16(row2, row6);
      dct_interleave16(row3, row7);

      // transpose pass 2
      dct_interleave16(row0, row2);
      dct_interleave16(row1, row3);
      dct_interleave16(row4, row6);
      dct_interleave16(row5, row7);

      // transpose pass 3
      dct_interleave16(row0, row1);
      dct_interleave16(row2, row3);
      dct_interleave16(row4, row5);
      dct_interleave16(row6, row7);
   }

   // row pass
   dct_pass(bias_1, 17);

   {
      // pack
      __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
      __m128i p1 = _mm_packus_epi16(row2, row3);
      __m128i p2 = _mm_packus_epi16(row4, row5);
      __m128i p3 = _mm_packus_epi16(row6, row7);

      // 8bit 8x8 transpose pass 1
      dct_interleave8(p0, p2); // a0e0a1e1...
      dct_interleave8(p1, p3); // c0g0c1g1...

      // transpose pass 2
      dct_interleave8(p0, p1); // a0c0e0g0...
      dct_interleave8(p2, p3); // b0d0f0h0...

      // transpose pass 3
      dct_interleave8(p0, p2); // a0b0c0d0...
      dct_interleave8(p1, p3); // a4b4c4d4...

      // store
      _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
   }

#undef dct_const
#undef dct_rot
#undef dct_widen
#undef dct_wadd
#undef dct_wsub
#undef dct_bfly32o
#undef dct_interleave8
#undef dct_interleave16
#undef dct_pass
}

#endif // STBI_SSE2

#ifdef STBI_NEON

// NEON integer IDCT. should produce bit-identical
// results to the generic C version.
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
{
   int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;

   int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));

#define dct_long_mul(out, inq, coeff) \
   int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)

#define dct_long_mac(out, acc, inq, coeff) \
   int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)

#define dct_widen(out, inq) \
   int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)

// wide add
#define dct_wadd(out, a, b) \
   int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vaddq_s32(a##_h, b##_h)

// wide sub
#define dct_wsub(out, a, b) \
   int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vsubq_s32(a##_h, b##_h)

// butterfly a/b, then shift using "shiftop" by "s" and pack
#define dct_bfly32o(out0,out1, a,b,shiftop,s) \
   { \
      dct_wadd(sum, a, b); \
      dct_wsub(dif, a, b); \
      out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
      out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
   }

#define dct_pass(shiftop, shift) \
   { \
      /* even part */ \
      int16x8_t sum26 = vaddq_s16(row2, row6); \
      dct_long_mul(p1e, sum26, rot0_0); \
      dct_long_mac(t2e, p1e, row6, rot0_1); \
      dct_long_mac(t3e, p1e, row2, rot0_2); \
      int16x8_t sum04 = vaddq_s16(row0, row4); \
      int16x8_t dif04 = vsubq_s16(row0, row4); \
      dct_widen(t0e, sum04); \
      dct_widen(t1e, dif04); \
      dct_wadd(x0, t0e, t3e); \
      dct_wsub(x3, t0e, t3e); \
      dct_wadd(x1, t1e, t2e); \
      dct_wsub(x2, t1e, t2e); \
      /* odd part */ \
      int16x8_t sum15 = vaddq_s16(row1, row5); \
      int16x8_t sum17 = vaddq_s16(row1, row7); \
      int16x8_t sum35 = vaddq_s16(row3, row5); \
      int16x8_t sum37 = vaddq_s16(row3, row7); \
      int16x8_t sumodd = vaddq_s16(sum17, sum35); \
      dct_long_mul(p5o, sumodd, rot1_0); \
      dct_long_mac(p1o, p5o, sum17, rot1_1); \
      dct_long_mac(p2o, p5o, sum35, rot1_2); \
      dct_long_mul(p3o, sum37, rot2_0); \
      dct_long_mul(p4o, sum15, rot2_1); \
      dct_wadd(sump13o, p1o, p3o); \
      dct_wadd(sump24o, p2o, p4o); \
      dct_wadd(sump23o, p2o, p3o); \
      dct_wadd(sump14o, p1o, p4o); \
      dct_long_mac(x4, sump13o, row7, rot3_0); \
      dct_long_mac(x5, sump24o, row5, rot3_1); \
      dct_long_mac(x6, sump23o, row3, rot3_2); \
      dct_long_mac(x7, sump14o, row1, rot3_3); \
      dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
      dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
      dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
      dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
   }

2303    // load
2304    row0 = vld1q_s16(data + 0*8);
2305    row1 = vld1q_s16(data + 1*8);
2306    row2 = vld1q_s16(data + 2*8);
2307    row3 = vld1q_s16(data + 3*8);
2308    row4 = vld1q_s16(data + 4*8);
2309    row5 = vld1q_s16(data + 5*8);
2310    row6 = vld1q_s16(data + 6*8);
2311    row7 = vld1q_s16(data + 7*8);
2312 
2313    // add DC bias
2314    row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2315 
2316    // column pass
2317    dct_pass(vrshrn_n_s32, 10);
2318 
2319    // 16bit 8x8 transpose
2320    {
2321 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2322 // whether compilers actually get this is another story, sadly.
2323 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
#define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }

      // pass 1
      dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
      dct_trn16(row2, row3);
      dct_trn16(row4, row5);
      dct_trn16(row6, row7);

      // pass 2
      dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
      dct_trn32(row1, row3);
      dct_trn32(row4, row6);
      dct_trn32(row5, row7);

      // pass 3
      dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
      dct_trn64(row1, row5);
      dct_trn64(row2, row6);
      dct_trn64(row3, row7);

#undef dct_trn16
#undef dct_trn32
#undef dct_trn64
   }

   // row pass
   // vrshrn_n_s32 only supports shifts up to 16, we need
   // 17. so do a non-rounding shift of 16 first then follow
   // up with a rounding shift by 1.
   dct_pass(vshrn_n_s32, 16);

   {
      // pack and round
      uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
      uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
      uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
      uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
      uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
      uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
      uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
      uint8x8_t p7 = vqrshrun_n_s16(row7, 1);

      // again, these can translate into one instruction, but often don't.
#define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
#define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }

      // sadly can't use interleaved stores here since we only write
      // 8 bytes to each scan line!

      // 8x8 8-bit transpose pass 1
      dct_trn8_8(p0, p1);
      dct_trn8_8(p2, p3);
      dct_trn8_8(p4, p5);
      dct_trn8_8(p6, p7);

      // pass 2
      dct_trn8_16(p0, p2);
      dct_trn8_16(p1, p3);
      dct_trn8_16(p4, p6);
      dct_trn8_16(p5, p7);

      // pass 3
      dct_trn8_32(p0, p4);
      dct_trn8_32(p1, p5);
      dct_trn8_32(p2, p6);
      dct_trn8_32(p3, p7);

      // store
      vst1_u8(out, p0); out += out_stride;
      vst1_u8(out, p1); out += out_stride;
      vst1_u8(out, p2); out += out_stride;
      vst1_u8(out, p3); out += out_stride;
      vst1_u8(out, p4); out += out_stride;
      vst1_u8(out, p5); out += out_stride;
      vst1_u8(out, p6); out += out_stride;
      vst1_u8(out, p7);

#undef dct_trn8_8
#undef dct_trn8_16
#undef dct_trn8_32
   }

#undef dct_long_mul
#undef dct_long_mac
#undef dct_widen
#undef dct_wadd
#undef dct_wsub
#undef dct_bfly32o
#undef dct_pass
}

#endif // STBI_NEON

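/*
   Illustrative sketch, not part of stb_image: the NEON row pass above
   splits a rounding shift by 17 into a truncating shift by 16 followed
   by a rounding shift by 1. The scalar code below is a hypothetical
   stand-in (the sketch_ names are invented here) that shows the two
   forms agree for non-negative inputs. It ignores the narrowing and
   saturation that the NEON instructions also perform.

```c
#include <assert.h>
#include <stdint.h>

// Single rounding shift by 17: add half of 2^17, then shift.
static int32_t sketch_round_shift17(int32_t x)
{
   // restrict the sketch to inputs where the addition cannot overflow
   assert(x >= 0 && x <= INT32_MAX - 65536);
   return (x + 65536) >> 17;
}

// Two-step form: truncating shift by 16, then rounding shift by 1.
static int32_t sketch_shift16_then_round1(int32_t x)
{
   int32_t t;
   assert(x >= 0);
   t = x >> 16;            // drop the low 16 bits without rounding
   return (t + 1) >> 1;    // rounding shift by 1
}
```

   The two agree because flooring twice gives the same result as
   flooring once: floor((floor(x / 2^16) + 1) / 2) equals
   floor((x + 2^16) / 2^17).
*/
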
#define STBI__MARKER_none  0xff
// if there's a pending marker from the entropy stream, return that
// otherwise, fetch from the stream and get a marker. if there's no
// marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
{
   stbi_uc x;
   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   x = stbi__get8(j->s);
   if (x != 0xff) return STBI__MARKER_none;
   while (x == 0xff)
      x = stbi__get8(j->s);
   return x;
}

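/*
   Illustrative sketch, not part of stb_image: the marker scan above,
   restated over a plain memory buffer. sketch_reader and the sketch_
   functions are hypothetical; the real code reads through
   stbi__context, and the pending-marker cache is left out here. The
   reader is assumed to return 0 once the buffer is exhausted.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MARKER_NONE 0xff

// Hypothetical in-memory byte source.
typedef struct {
   const unsigned char *p;
   const unsigned char *end;
} sketch_reader;

static unsigned char sketch_get8(sketch_reader *r)
{
   assert(r != NULL);
   // assumption of this sketch: reads past the end yield 0
   return r->p < r->end ? *r->p++ : 0;
}

static unsigned char sketch_get_marker(sketch_reader *r)
{
   unsigned char x = sketch_get8(r);
   if (x != 0xff) return SKETCH_MARKER_NONE;  // not positioned at a marker
   while (x == 0xff)                          // skip any 0xff fill bytes
      x = sketch_get8(r);
   return x;                                  // the marker code itself
}
```

   With the bytes ff ff d8 12, the first call skips the fill byte and
   returns 0xd8 (SOI); the second call sees 0x12 and reports no marker.
*/
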
// in each scan, we'll have scan_n components, and the order
// of the components is specified by order[]
#define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)

// after a restart interval, reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
{
   j->code_bits = 0;
   j->code_buffer = 0;
   j->nomore = 0;
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   j->eob_run = 0;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
}

static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
{
   stbi__jpeg_reset(z);
   if (!z->progressive) {
      if (z->scan_n == 1) {
         int i,j;
         STBI_SIMD_ALIGN(short, data[64]);
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               int ha = z->img_comp[n].ha;
               if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
               // every data block is an MCU, so count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  // if it's NOT a restart, then just bail, so we get corrupt data
                  // rather than no data
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         STBI_SIMD_ALIGN(short, data[64]);
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x)*8;
                        int y2 = (j*z->img_comp[n].v + y)*8;
                        int ha = z->img_comp[n].ha;
                        if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
                        z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   } else {
      if (z->scan_n == 1) {
         int i,j;
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               if (z->spec_start == 0) {
                  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                     return 0;
               } else {
                  int ha = z->img_comp[n].ha;
                  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
                     return 0;
               }
               // every data block is an MCU, so count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x);
                        int y2 = (j*z->img_comp[n].v + y);
                        short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
                        if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                           return 0;
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   }
}

static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
{
   int i;
   for (i=0; i < 64; ++i)
      data[i] *= dequant[i];
}

static void stbi__jpeg_finish(stbi__jpeg *z)
{
   if (z->progressive) {
      // dequantize and idct the data
      int i,j,n;
      for (n=0; n < z->s->img_n; ++n) {
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
            }
         }
      }
   }
}

static int stbi__process_marker(stbi__jpeg *z, int m)
{
   int L;
   switch (m) {
      case STBI__MARKER_none: // no marker found
         return stbi__err("expected marker","Corrupt JPEG");

      case 0xDD: // DRI - specify restart interval
         if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
         z->restart_interval = stbi__get16be(z->s);
         return 1;

      case 0xDB: // DQT - define quantization table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            int q = stbi__get8(z->s);
            int p = q >> 4;
            int t = q & 15,i;
            if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
            if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
            for (i=0; i < 64; ++i)
               z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
            L -= 65;
         }
         return L==0;

      case 0xC4: // DHT - define huffman table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            stbi_uc *v;
            int sizes[16],i,n=0;
            int q = stbi__get8(z->s);
            int tc = q >> 4;
            int th = q & 15;
            if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
            for (i=0; i < 16; ++i) {
               sizes[i] = stbi__get8(z->s);
               n += sizes[i];
            }
            L -= 17;
            if (tc == 0) {
               if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
               v = z->huff_dc[th].values;
            } else {
               if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
               v = z->huff_ac[th].values;
            }
            for (i=0; i < n; ++i)
               v[i] = stbi__get8(z->s);
            if (tc != 0)
               stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
            L -= n;
         }
         return L==0;
   }
   // check for comment block or APP blocks
   if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
      stbi__skip(z->s, stbi__get16be(z->s)-2);
      return 1;
   }
   return 0;
}

// after we see SOS
static int stbi__process_scan_header(stbi__jpeg *z)
{
   int i;
   int Ls = stbi__get16be(z->s);
   z->scan_n = stbi__get8(z->s);
   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   for (i=0; i < z->scan_n; ++i) {
      int id = stbi__get8(z->s), which;
      int q = stbi__get8(z->s);
      for (which = 0; which < z->s->img_n; ++which)
         if (z->img_comp[which].id == id)
            break;
      if (which == z->s->img_n) return 0; // no match
      z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
      z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
      z->order[i] = which;
   }

   {
      int aa;
      z->spec_start = stbi__get8(z->s);
      z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
      aa = stbi__get8(z->s);
      z->succ_high = (aa >> 4);
      z->succ_low  = (aa & 15);
      if (z->progressive) {
         if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
            return stbi__err("bad SOS", "Corrupt JPEG");
      } else {
         if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
         if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
         z->spec_end = 63;
      }
   }

   return 1;
}

static int stbi__process_frame_header(stbi__jpeg *z, int scan)
{
   stbi__context *s = z->s;
   int Lf,p,i,q, h_max=1,v_max=1,c;
   Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   c = stbi__get8(s);
   if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG");    // JFIF requires
   s->img_n = c;
   for (i=0; i < c; ++i) {
      z->img_comp[i].data = NULL;
      z->img_comp[i].linebuf = NULL;
   }

   if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");

   for (i=0; i < s->img_n; ++i) {
      z->img_comp[i].id = stbi__get8(s);
      if (z->img_comp[i].id != i+1)   // JFIF requires
         if (z->img_comp[i].id != i)  // some version of jpegtran outputs non-JFIF-compliant files!
            return stbi__err("bad component ID","Corrupt JPEG");
      q = stbi__get8(s);
      z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
      z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
      z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   }

   if (scan != STBI__SCAN_load) return 1;

   if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");

   for (i=0; i < s->img_n; ++i) {
      if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
      if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   }

   // compute interleaved mcu info
   z->img_h_max = h_max;
   z->img_v_max = v_max;
   z->img_mcu_w = h_max * 8;
   z->img_mcu_h = v_max * 8;
   z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;

   for (i=0; i < s->img_n; ++i) {
      // number of effective pixels (e.g. for non-interleaved MCU)
      z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
      z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
      // to simplify generation, we'll allocate enough memory to decode
      // the bogus oversized data from using interleaved MCUs and their
      // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
      // discard the extra data until colorspace conversion
      z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
      z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
      z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);

      if (z->img_comp[i].raw_data == NULL) {
         for(--i; i >= 0; --i) {
            STBI_FREE(z->img_comp[i].raw_data);
            z->img_comp[i].raw_data = NULL;
         }
         return stbi__err("outofmem", "Out of memory");
      }
      // align blocks for idct using mmx/sse
      z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
      z->img_comp[i].linebuf = NULL;
      if (z->progressive) {
         z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
         z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
         z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
         z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
      } else {
         z->img_comp[i].coeff = 0;
         z->img_comp[i].raw_coeff = 0;
      }
   }

   return 1;
}

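/*
   Illustrative sketch, not part of stb_image: the MCU and plane-size
   arithmetic from the frame header, pulled out into two hypothetical
   helpers (all sketch_ names are invented here) so the "16x16 iMCU on
   an image of width 33" case can be worked through by hand.

```c
#include <assert.h>

typedef struct {
   int mcu_w, mcu_h;   // size of one interleaved MCU in pixels
   int mcu_x, mcu_y;   // number of MCUs across and down
} sketch_mcu_grid;

typedef struct {
   int x, y;           // effective samples in this component
   int w2, h2;         // padded plane size, a whole number of MCUs
} sketch_plane;

static sketch_mcu_grid sketch_grid(int img_x, int img_y, int h_max, int v_max)
{
   sketch_mcu_grid g;
   assert(img_x > 0 && img_y > 0 && h_max >= 1 && v_max >= 1);
   g.mcu_w = h_max * 8;
   g.mcu_h = v_max * 8;
   g.mcu_x = (img_x + g.mcu_w - 1) / g.mcu_w;   // round up
   g.mcu_y = (img_y + g.mcu_h - 1) / g.mcu_h;
   return g;
}

static sketch_plane sketch_plane_size(int img_x, int img_y, int h, int v, int h_max, int v_max)
{
   sketch_mcu_grid g = sketch_grid(img_x, img_y, h_max, v_max);
   sketch_plane p;
   p.x  = (img_x * h + h_max - 1) / h_max;      // samples actually needed
   p.y  = (img_y * v + v_max - 1) / v_max;
   p.w2 = g.mcu_x * h * 8;                      // samples actually decoded
   p.h2 = g.mcu_y * v * 8;
   return p;
}
```

   For a 33x17 image with 2x2 luma sampling and 1x1 chroma, the grid
   is 3x2 MCUs of 16x16 pixels. Luma needs 33x17 samples but decodes
   into a 48x32 plane; chroma needs 17x9 and decodes into 24x16.
*/
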
// use comparisons since in some cases we handle more than one case (e.g. SOF)
#define stbi__DNL(x)         ((x) == 0xdc)
#define stbi__SOI(x)         ((x) == 0xd8)
#define stbi__EOI(x)         ((x) == 0xd9)
#define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
#define stbi__SOS(x)         ((x) == 0xda)

#define stbi__SOF_progressive(x)   ((x) == 0xc2)

static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
{
   int m;
   z->marker = STBI__MARKER_none; // initialize cached marker to empty
   m = stbi__get_marker(z);
   if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   if (scan == STBI__SCAN_type) return 1;
   m = stbi__get_marker(z);
   while (!stbi__SOF(m)) {
      if (!stbi__process_marker(z,m)) return 0;
      m = stbi__get_marker(z);
      while (m == STBI__MARKER_none) {
         // some files have extra padding after their blocks, so ok, we'll scan
         if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
         m = stbi__get_marker(z);
      }
   }
   z->progressive = stbi__SOF_progressive(m);
   if (!stbi__process_frame_header(z, scan)) return 0;
   return 1;
}

// decode image to YCbCr format
static int stbi__decode_jpeg_image(stbi__jpeg *j)
{
   int m;
   for (m = 0; m < 4; m++) {
      j->img_comp[m].raw_data = NULL;
      j->img_comp[m].raw_coeff = NULL;
   }
   j->restart_interval = 0;
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   m = stbi__get_marker(j);
   while (!stbi__EOI(m)) {
      if (stbi__SOS(m)) {
         if (!stbi__process_scan_header(j)) return 0;
         if (!stbi__parse_entropy_coded_data(j)) return 0;
         if (j->marker == STBI__MARKER_none ) {
            // handle 0s at the end of image data from IP Kamera 9060
            while (!stbi__at_eof(j->s)) {
               int x = stbi__get8(j->s);
               if (x == 255) {
                  j->marker = stbi__get8(j->s);
                  break;
               } else if (x != 0) {
                  return stbi__err("junk before marker", "Corrupt JPEG");
               }
            }
            // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
         }
      } else {
         if (!stbi__process_marker(j, m)) return 0;
      }
      m = stbi__get_marker(j);
   }
   if (j->progressive)
      stbi__jpeg_finish(j);
   return 1;
}

// static jfif-centered resampling (across block boundaries)

typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
                                    int w, int hs);

#define stbi__div4(x) ((stbi_uc) ((x) >> 2))

static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   STBI_NOTUSED(out);
   STBI_NOTUSED(in_far);
   STBI_NOTUSED(w);
   STBI_NOTUSED(hs);
   return in_near;
}

static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples vertically for every one in input
   int i;
   STBI_NOTUSED(hs);
   for (i=0; i < w; ++i)
      out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   return out;
}

static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples horizontally for every one in input
   int i;
   stbi_uc *input = in_near;

   if (w == 1) {
      // if only one sample, can't do any interpolation
      out[0] = out[1] = input[0];
      return out;
   }

   out[0] = input[0];
   out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   for (i=1; i < w-1; ++i) {
      int n = 3*input[i]+2;
      out[i*2+0] = stbi__div4(n+input[i-1]);
      out[i*2+1] = stbi__div4(n+input[i+1]);
   }
   out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   out[i*2+1] = input[w-1];

   STBI_NOTUSED(in_far);
   STBI_NOTUSED(hs);

   return out;
}

#define stbi__div16(x) ((stbi_uc) ((x) >> 4))

static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i,t0,t1;
   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   out[0] = stbi__div4(t1+2);
   for (i=1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}

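/*
   Illustrative sketch, not part of stb_image: an interior output
   sample of the 2x2 upsampler above is a 9:3:3:1 blend of the four
   nearest input samples. The hypothetical helpers below (sketch_
   names are invented here) compute one such sample both directly and
   as the two separable passes used by the row function, so the
   equivalence can be checked.

```c
#include <assert.h>

// Direct form: weights 9, 3, 3, 1 out of 16, with +8 for rounding.
// near_cur/far_cur: same column in the near and far input rows
// near_adj/far_adj: the neighboring column the output leans toward
static unsigned char sketch_hv2_direct(int near_cur, int far_cur, int near_adj, int far_adj)
{
   assert(near_cur >= 0 && near_cur <= 255 && far_cur >= 0 && far_cur <= 255);
   assert(near_adj >= 0 && near_adj <= 255 && far_adj >= 0 && far_adj <= 255);
   return (unsigned char) ((9*near_cur + 3*far_cur + 3*near_adj + far_adj + 8) >> 4);
}

// Separable form: vertical pass (3*near + far) on each column,
// then horizontal pass (3*cur + adj) on the two column sums.
static unsigned char sketch_hv2_two_pass(int near_cur, int far_cur, int near_adj, int far_adj)
{
   int t_cur = 3*near_cur + far_cur;
   int t_adj = 3*near_adj + far_adj;
   return (unsigned char) ((3*t_cur + t_adj + 8) >> 4);
}
```

   With both rows equal to {100, 200}, the two interior outputs are
   125 and 175; the row function additionally emits the edge samples
   100 and 200 using the vertical pass alone.
*/
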
#if defined(STBI_SSE2) || defined(STBI_NEON)
static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i=0,t0,t1;

   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   // process groups of 8 pixels for as long as we can.
   // note we can't handle the last pixel in a row in this loop
   // because we need to handle the filter boundary conditions.
   for (; i < ((w-1) & ~7); i += 8) {
#if defined(STBI_SSE2)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      __m128i zero  = _mm_setzero_si128();
      __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
      __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
      __m128i farw  = _mm_unpacklo_epi8(farb, zero);
      __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
      __m128i diff  = _mm_sub_epi16(farw, nearw);
      __m128i nears = _mm_slli_epi16(nearw, 2);
      __m128i curr  = _mm_add_epi16(nears, diff); // current row

      // horizontal filter works the same based on shifted versions of
      // current row. "prev" is current row shifted right by 1 pixel; we
      // need to insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      __m128i prv0 = _mm_slli_si128(curr, 2);
      __m128i nxt0 = _mm_srli_si128(curr, 2);
      __m128i prev = _mm_insert_epi16(prv0, t1, 0);
      __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      __m128i bias  = _mm_set1_epi16(8);
      __m128i curs = _mm_slli_epi16(curr, 2);
      __m128i prvd = _mm_sub_epi16(prev, curr);
      __m128i nxtd = _mm_sub_epi16(next, curr);
      __m128i curb = _mm_add_epi16(curs, bias);
      __m128i even = _mm_add_epi16(prvd, curb);
      __m128i odd  = _mm_add_epi16(nxtd, curb);

      // interleave even and odd pixels, then undo scaling.
      __m128i int0 = _mm_unpacklo_epi16(even, odd);
      __m128i int1 = _mm_unpackhi_epi16(even, odd);
      __m128i de0  = _mm_srli_epi16(int0, 4);
      __m128i de1  = _mm_srli_epi16(int1, 4);

      // pack and write output
      __m128i outv = _mm_packus_epi16(de0, de1);
      _mm_storeu_si128((__m128i *) (out + i*2), outv);
#elif defined(STBI_NEON)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      uint8x8_t farb  = vld1_u8(in_far + i);
      uint8x8_t nearb = vld1_u8(in_near + i);
      int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
      int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
      int16x8_t curr  = vaddq_s16(nears, diff); // current row

      // horizontal filter works the same based on shifted versions of
      // current row. "prev" is current row shifted right by 1 pixel; we
      // need to insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      int16x8_t prv0 = vextq_s16(curr, curr, 7);
      int16x8_t nxt0 = vextq_s16(curr, curr, 1);
      int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
      int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      int16x8_t curs = vshlq_n_s16(curr, 2);
      int16x8_t prvd = vsubq_s16(prev, curr);
      int16x8_t nxtd = vsubq_s16(next, curr);
      int16x8_t even = vaddq_s16(curs, prvd);
      int16x8_t odd  = vaddq_s16(curs, nxtd);

      // undo scaling and round, then store with even/odd phases interleaved
      uint8x8x2_t o;
      o.val[0] = vqrshrun_n_s16(even, 4);
      o.val[1] = vqrshrun_n_s16(odd,  4);
      vst2_u8(out + i*2, o);
#endif

      // "previous" value for next iter
      t1 = 3*in_near[i+7] + in_far[i+7];
   }

   t0 = t1;
   t1 = 3*in_near[i] + in_far[i];
   out[i*2] = stbi__div16(3*t1 + t0 + 8);

   for (++i; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}
#endif

static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // resample with nearest-neighbor
   int i,j;
   STBI_NOTUSED(in_far);
   for (i=0; i < w; ++i)
      for (j=0; j < hs; ++j)
         out[i*hs+j] = in_near[i];
   return out;
}

3065 #ifdef STBI_JPEG_OLD
3066 // this is the same YCbCr-to-RGB calculation that stb_image has used
3067 // historically before the algorithm changes in 1.49
3068 #define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3069 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3070 {
3071    int i;
3072    for (i=0; i < count; ++i) {
3073       int y_fixed = (y[i] << 16) + 32768; // rounding
3074       int r,g,b;
3075       int cr = pcr[i] - 128;
3076       int cb = pcb[i] - 128;
3077       r = y_fixed + cr*float2fixed(1.40200f);
3078       g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3079       b = y_fixed                            + cb*float2fixed(1.77200f);
3080       r >>= 16;
3081       g >>= 16;
3082       b >>= 16;
3083       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3084       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3085       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3086       out[0] = (stbi_uc)r;
3087       out[1] = (stbi_uc)g;
3088       out[2] = (stbi_uc)b;
3089       out[3] = 255;
3090       out += step;
3091    }
3092 }
3093 #else
3094 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3095 // to make sure the code produces the same results in both SIMD and scalar
3096 #define float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3097 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3098 {
3099    int i;
3100    for (i=0; i < count; ++i) {
3101       int y_fixed = (y[i] << 20) + (1<<19); // rounding
3102       int r,g,b;
3103       int cr = pcr[i] - 128;
3104       int cb = pcb[i] - 128;
3105       r = y_fixed +  cr* float2fixed(1.40200f);
3106       g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3107       b = y_fixed                               +   cb* float2fixed(1.77200f);
3108       r >>= 20;
3109       g >>= 20;
3110       b >>= 20;
3111       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3112       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3113       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3114       out[0] = (stbi_uc)r;
3115       out[1] = (stbi_uc)g;
3116       out[2] = (stbi_uc)b;
3117       out[3] = 255;
3118       out += step;
3119    }
3120 }
3121 #endif
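/*
   Illustration only, not part of stb_image: a single-pixel sketch of the
   reduced-precision (20 fractional bits) YCbCr-to-RGB arithmetic used in the
   non-STBI_JPEG_OLD path above. The names EXAMPLE_F2F, example_clamp255 and
   example_ycbcr_to_rgb are hypothetical. Like the original, it assumes a
   32-bit int and an arithmetic right shift for negative values.

```c
// Coefficient scaled by 4096 (12 fractional bits), then shifted up by 8
// so that products carry 20 fractional bits.
#define EXAMPLE_F2F(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

// Clamp an int to the 0..255 range of an 8-bit sample.
static int example_clamp255(int v)
{
   return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Convert one pixel; y, cb_raw and cr_raw are 8-bit samples in 0..255.
static void example_ycbcr_to_rgb(unsigned char rgb[3], int y, int cb_raw, int cr_raw)
{
   int y_fixed = (y << 20) + (1 << 19);  // luma plus 0.5 for rounding
   int cr = cr_raw - 128;                // chroma is stored with a +128 bias
   int cb = cb_raw - 128;
   int r = y_fixed + cr *  EXAMPLE_F2F(1.40200f);
   int g = y_fixed + cr * -EXAMPLE_F2F(0.71414f)
                   + ((cb * -EXAMPLE_F2F(0.34414f)) & 0xffff0000);
   int b = y_fixed + cb *  EXAMPLE_F2F(1.77200f);
   rgb[0] = (unsigned char) example_clamp255(r >> 20);
   rgb[1] = (unsigned char) example_clamp255(g >> 20);
   rgb[2] = (unsigned char) example_clamp255(b >> 20);
}
```

   With neutral chroma (cb = cr = 128) the output is gray at the luma value;
   strongly saturated inputs are clamped to the 0..255 range.
*/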

#if defined(STBI_SSE2) || defined(STBI_NEON)
static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
{
   int i = 0;

#ifdef STBI_SSE2
   // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   // it's useful in practice (you wouldn't use it for textures, for example).
   // so just accelerate step == 4 case.
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      __m128i signflip  = _mm_set1_epi8(-0x80);
      __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
      __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
      __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
      __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
      __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
      __m128i xw = _mm_set1_epi16(255); // alpha channel

      for (; i+7 < count; i += 8) {
         // load
         __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
         __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
         __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
         __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
         __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128

         // unpack to short (and left-shift cr, cb by 8)
         __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
         __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
         __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);

         // color transform
         __m128i yws = _mm_srli_epi16(yw, 4);
         __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
         __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
         __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
         __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
         __m128i rws = _mm_add_epi16(cr0, yws);
         __m128i gwt = _mm_add_epi16(cb0, yws);
         __m128i bws = _mm_add_epi16(yws, cb1);
         __m128i gws = _mm_add_epi16(gwt, cr1);

         // descale
         __m128i rw = _mm_srai_epi16(rws, 4);
         __m128i bw = _mm_srai_epi16(bws, 4);
         __m128i gw = _mm_srai_epi16(gws, 4);

         // back to byte, set up for transpose
         __m128i brb = _mm_packus_epi16(rw, bw);
         __m128i gxb = _mm_packus_epi16(gw, xw);

         // transpose to interleave channels
         __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
         __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
         __m128i o0 = _mm_unpacklo_epi16(t0, t1);
         __m128i o1 = _mm_unpackhi_epi16(t0, t1);

         // store
         _mm_storeu_si128((__m128i *) (out + 0), o0);
         _mm_storeu_si128((__m128i *) (out + 16), o1);
         out += 32;
      }
   }
#endif

#ifdef STBI_NEON
   // in this version, step=3 support would be easy to add. but is there demand?
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      uint8x8_t signflip = vdup_n_u8(0x80);
      int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
      int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
      int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
      int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));

      for (; i+7 < count; i += 8) {
         // load
         uint8x8_t y_bytes  = vld1_u8(y + i);
         uint8x8_t cr_bytes = vld1_u8(pcr + i);
         uint8x8_t cb_bytes = vld1_u8(pcb + i);
         int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
         int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));

         // expand to s16
         int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
         int16x8_t crw = vshll_n_s8(cr_biased, 7);
         int16x8_t cbw = vshll_n_s8(cb_biased, 7);

         // color transform
         int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
         int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
         int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
         int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
         int16x8_t rws = vaddq_s16(yws, cr0);
         int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
         int16x8_t bws = vaddq_s16(yws, cb1);

         // undo scaling, round, convert to byte
         uint8x8x4_t o;
         o.val[0] = vqrshrun_n_s16(rws, 4);
         o.val[1] = vqrshrun_n_s16(gws, 4);
         o.val[2] = vqrshrun_n_s16(bws, 4);
         o.val[3] = vdup_n_u8(255);

         // store, interleaving r/g/b/a
         vst4_u8(out, o);
         out += 8*4;
      }
   }
#endif

   // scalar tail: handles leftover pixels, and every pixel when step != 4
   for (; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr* float2fixed(1.40200f);
      g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                             +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif

// set up the kernels
static void stbi__setup_jpeg(stbi__jpeg *j)
{
   j->idct_block_kernel = stbi__idct_block;
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;

#ifdef STBI_SSE2
   if (stbi__sse2_available()) {
      j->idct_block_kernel = stbi__idct_simd;
      #ifndef STBI_JPEG_OLD
      j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
      #endif
      j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   }
#endif

#ifdef STBI_NEON
   j->idct_block_kernel = stbi__idct_simd;
   #ifndef STBI_JPEG_OLD
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   #endif
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
#endif
}

// clean up the temporary component buffers
static void stbi__cleanup_jpeg(stbi__jpeg *j)
{
   int i;
   for (i=0; i < j->s->img_n; ++i) {
      if (j->img_comp[i].raw_data) {
         STBI_FREE(j->img_comp[i].raw_data);
         j->img_comp[i].raw_data = NULL;
         j->img_comp[i].data = NULL;
      }
      if (j->img_comp[i].raw_coeff) {
         STBI_FREE(j->img_comp[i].raw_coeff);
         j->img_comp[i].raw_coeff = 0;
         j->img_comp[i].coeff = 0;
      }
      if (j->img_comp[i].linebuf) {
         STBI_FREE(j->img_comp[i].linebuf);
         j->img_comp[i].linebuf = NULL;
      }
   }
}

typedef struct
{
   resample_row_func resample;
   stbi_uc *line0,*line1;
   int hs,vs;   // expansion factor in each axis
   int w_lores; // horizontal pixels pre-expansion
   int ystep;   // how far through vertical expansion we are
   int ypos;    // which pre-expansion row we're on
} stbi__resample;

static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
{
   int n, decode_n;
   z->s->img_n = 0; // make stbi__cleanup_jpeg safe

   // validate req_comp
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");

   // load a jpeg image from whichever source, but leave in YCbCr format
   if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }

   // determine actual number of components to generate
   n = req_comp ? req_comp : z->s->img_n;

   // a 3-channel image requested as grey (or grey+alpha) only needs the Y plane
   if (z->s->img_n == 3 && n < 3)
      decode_n = 1;
   else
      decode_n = z->s->img_n;

   // resample and color-convert
   {
      int k;
      unsigned int i,j;
      stbi_uc *output;
      stbi_uc *coutput[4];

      stbi__resample res_comp[4];

      for (k=0; k < decode_n; ++k) {
         stbi__resample *r = &res_comp[k];

         // allocate line buffer big enough for upsampling off the edges
         // with upsample factor of 4
         z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
         if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

         r->hs      = z->img_h_max / z->img_comp[k].h;
         r->vs      = z->img_v_max / z->img_comp[k].v;
         r->ystep   = r->vs >> 1;
         r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
         r->ypos    = 0;
         r->line0   = r->line1 = z->img_comp[k].data;

         if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
         else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
         else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
         else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
         else                               r->resample = stbi__resample_row_generic;
      }

      // can't error after this, so this is safe
      output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
      if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

      // now go ahead and resample
      for (j=0; j < z->s->img_y; ++j) {
         stbi_uc *out = output + n * z->s->img_x * j;
         for (k=0; k < decode_n; ++k) {
            stbi__resample *r = &res_comp[k];
            int y_bot = r->ystep >= (r->vs >> 1);
            coutput[k] = r->resample(z->img_comp[k].linebuf,
                                     y_bot ? r->line1 : r->line0,
                                     y_bot ? r->line0 : r->line1,
                                     r->w_lores, r->hs);
            if (++r->ystep >= r->vs) {
               r->ystep = 0;
               r->line0 = r->line1;
               if (++r->ypos < z->img_comp[k].y)
                  r->line1 += z->img_comp[k].w2;
            }
         }
         if (n >= 3) {
            stbi_uc *y = coutput[0];
            if (z->s->img_n == 3) {
               z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
            } else
               for (i=0; i < z->s->img_x; ++i) {
                  out[0] = out[1] = out[2] = y[i];
                  out[3] = 255; // not used if n==3
                  out += n;
               }
         } else {
            stbi_uc *y = coutput[0];
            if (n == 1)
               for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
            else
               for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
         }
      }
      stbi__cleanup_jpeg(z);
      *out_x = z->s->img_x;
      *out_y = z->s->img_y;
      if (comp) *comp  = z->s->img_n; // report original components, not output
      return output;
   }
}

static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   return load_jpeg_image(&j, x,y,comp,req_comp);
}

static int stbi__jpeg_test(stbi__context *s)
{
   int r;
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
   stbi__rewind(s);
   return r;
}

static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
{
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
      stbi__rewind( j->s );
      return 0;
   }
   if (x) *x = j->s->img_x;
   if (y) *y = j->s->img_y;
   if (comp) *comp = j->s->img_n;
   return 1;
}

static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__jpeg j;
   j.s = s;
   return stbi__jpeg_info_raw(&j, x, y, comp);
}
#endif

// public domain zlib decode    v0.2  Sean Barrett 2006-11-18
//    simple implementation
//      - all input must be provided in an upfront buffer
//      - all output is written to a single output buffer (can malloc/realloc)
//    performance
//      - fast huffman

#ifndef STBI_NO_ZLIB

// fast-way is faster to check than jpeg huffman, but slow way is slower
#define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
#define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)

// zlib-style huffman encoding
// (jpeg packs from left, zlib from right, so can't share code)
typedef struct
{
   stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   stbi__uint16 firstcode[16];
   int maxcode[17];
   stbi__uint16 firstsymbol[16];
   stbi_uc  size[288];
   stbi__uint16 value[288];
} stbi__zhuffman;

stbi_inline static int stbi__bitreverse16(int n)
{
  n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
  n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
  n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
  n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
  return n;
}

stbi_inline static int stbi__bit_reverse(int v, int bits)
{
   STBI_ASSERT(bits <= 16);
   // to bit reverse n bits, reverse 16 and shift
   // e.g. 11 bits, bit reverse and shift away 5
   return stbi__bitreverse16(v) >> (16-bits);
}
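/*
   Illustration only, not part of stb_image: a standalone sketch of the
   "reverse 16 bits, then shift away the unused low bits" trick described
   above. The example_ names are hypothetical; inputs are assumed to fit in
   16 bits and `bits` is assumed to be in 1..16.

```c
// Reverse the low 16 bits of n by swapping ever larger groups:
// adjacent bits, then pairs, then nibbles, then bytes.
static int example_bitreverse16(int n)
{
   n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
   n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
   n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
   n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
   return n;
}

// Reverse only the low `bits` bits of v: after a full 16-bit reversal the
// wanted bits sit at the top, so shift them back down.
static int example_bit_reverse(int v, int bits)
{
   return example_bitreverse16(v) >> (16 - bits);
}
```

   For instance, the 3-bit value 110 (6) reverses to 011 (3).
*/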

static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
{
   int i,k=0;
   int code, next_code[16], sizes[17];

   // DEFLATE spec for generating codes
   memset(sizes, 0, sizeof(sizes));
   memset(z->fast, 0, sizeof(z->fast));
   for (i=0; i < num; ++i)
      ++sizes[sizelist[i]];
   sizes[0] = 0;
   for (i=1; i < 16; ++i)
      if (sizes[i] > (1 << i))
         return stbi__err("bad sizes", "Corrupt PNG");
   code = 0;
   for (i=1; i < 16; ++i) {
      next_code[i] = code;
      z->firstcode[i] = (stbi__uint16) code;
      z->firstsymbol[i] = (stbi__uint16) k;
      code = (code + sizes[i]);
      if (sizes[i])
         if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
      z->maxcode[i] = code << (16-i); // preshift for inner loop
      code <<= 1;
      k += sizes[i];
   }
   z->maxcode[16] = 0x10000; // sentinel
   for (i=0; i < num; ++i) {
      int s = sizelist[i];
      if (s) {
         int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
         stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
         z->size [c] = (stbi_uc     ) s;
         z->value[c] = (stbi__uint16) i;
         if (s <= STBI__ZFAST_BITS) {
            int j = stbi__bit_reverse(next_code[s],s);
            while (j < (1 << STBI__ZFAST_BITS)) {
               z->fast[j] = fastv;
               j += (1 << s);
            }
         }
         ++next_code[s];
      }
   }
   return 1;
}
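/*
   Illustration only, not part of stb_image: a minimal sketch of the
   canonical code assignment from the DEFLATE spec that the builder above is
   based on. It only assigns code values and omits the fast-lookup and
   maxcode tables. The example_ names are hypothetical; code lengths are
   assumed to be in 0..15 and the length set is assumed to be valid.

```c
#define EXAMPLE_MAX_BITS 15

// Assign canonical Huffman codes: shorter codes come first, and codes of
// equal length are consecutive in symbol order. Unused symbols (length 0)
// get -1.
static void example_assign_codes(const unsigned char *lengths, int num, int *codes)
{
   int bl_count[EXAMPLE_MAX_BITS + 1] = { 0 };
   int next_code[EXAMPLE_MAX_BITS + 1] = { 0 };
   int i, bits, code = 0;

   // count how many symbols use each code length
   for (i = 0; i < num; ++i)
      ++bl_count[lengths[i]];
   bl_count[0] = 0;

   // smallest code value for each length
   for (bits = 1; bits <= EXAMPLE_MAX_BITS; ++bits) {
      code = (code + bl_count[bits - 1]) << 1;
      next_code[bits] = code;
   }

   // hand out consecutive values within each length
   for (i = 0; i < num; ++i)
      codes[i] = lengths[i] ? next_code[lengths[i]]++ : -1;
}
```

   For the lengths {3,3,3,3,3,2,4,4} this yields the codes
   010, 011, 100, 101, 110, 00, 1110, 1111.
*/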

// zlib-from-memory implementation for PNG reading
//    because PNG allows splitting the zlib stream arbitrarily,
//    and it's annoying structurally to have PNG call ZLIB call PNG,
//    we require PNG read all the IDATs and combine them into a single
//    memory buffer

typedef struct
{
   stbi_uc *zbuffer, *zbuffer_end;
   int num_bits;
   stbi__uint32 code_buffer;

   char *zout;
   char *zout_start;
   char *zout_end;
   int   z_expandable;

   stbi__zhuffman z_length, z_distance;
} stbi__zbuf;

stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
{
   if (z->zbuffer >= z->zbuffer_end) return 0;
   return *z->zbuffer++;
}

static void stbi__fill_bits(stbi__zbuf *z)
{
   do {
      STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
      z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
      z->num_bits += 8;
   } while (z->num_bits <= 24);
}

stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
{
   unsigned int k;
   if (z->num_bits < n) stbi__fill_bits(z);
   k = z->code_buffer & ((1 << n) - 1);
   z->code_buffer >>= n;
   z->num_bits -= n;
   return k;
}
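/*
   Illustration only, not part of stb_image: a simplified sketch of an
   LSB-first bit reader in the spirit of stbi__fill_bits / stbi__zreceive.
   Unlike the original it refills one byte at a time only as needed. The
   example_bitreader type and example_read_bits are hypothetical names; n is
   assumed to be in 1..16, and reads past the end of input yield zero bits.

```c
typedef struct
{
   const unsigned char *p, *end;  // current position and end of input
   unsigned int bit_buf;          // pending bits, next bit to read is bit 0
   int bit_count;                 // number of valid bits in bit_buf
} example_bitreader;

// Return the next n bits of the stream, least significant bit first.
static unsigned int example_read_bits(example_bitreader *br, int n)
{
   unsigned int v;
   while (br->bit_count < n) {
      unsigned int byte = (br->p < br->end) ? *br->p++ : 0;  // zero-fill past end
      br->bit_buf |= byte << br->bit_count;  // new byte goes above pending bits
      br->bit_count += 8;
   }
   v = br->bit_buf & ((1u << n) - 1);
   br->bit_buf >>= n;
   br->bit_count -= n;
   return v;
}
```

   Reading 1 bit and then 2 bits from the byte 0xB5 mirrors how a DEFLATE
   block header is parsed: the first value is the "final" flag, the second
   the block type.
*/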

static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s,k;
   // not resolved by fast table, so compute it the slow way
   // use jpeg approach, which requires MSbits at top
   k = stbi__bit_reverse(a->code_buffer, 16);
   for (s=STBI__ZFAST_BITS+1; ; ++s)
      if (k < z->maxcode[s])
         break;
   if (s == 16) return -1; // invalid code!
   // code size is s, so:
   b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   STBI_ASSERT(z->size[b] == s);
   a->code_buffer >>= s;
   a->num_bits -= s;
   return z->value[b];
}

stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
{
   int b,s;
   if (a->num_bits < 16) stbi__fill_bits(a);
   b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
   if (b) {
      s = b >> 9;
      a->code_buffer >>= s;
      a->num_bits -= s;
      return b & 511;
   }
   return stbi__zhuffman_decode_slowpath(a, z);
}

static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
{
   char *q;
   int cur, limit, old_limit;
   z->zout = zout;
   if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   cur   = (int) (z->zout     - z->zout_start);
   limit = old_limit = (int) (z->zout_end - z->zout_start);
   while (cur + n > limit)
      limit *= 2;
   q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
   STBI_NOTUSED(old_limit);
   if (q == NULL) return stbi__err("outofmem", "Out of memory");
   z->zout_start = q;
   z->zout       = q + cur;
   z->zout_end   = q + limit;
   return 1;
}
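/*
   Illustration only, not part of stb_image: a sketch of the capacity
   doubling used above, written as a standalone function. The overflow and
   zero-capacity guards are additions for this example and are not present
   in the code above. The name example_grow_limit is hypothetical.

```c
#include <limits.h>

// Return the smallest capacity obtained by repeatedly doubling `limit`
// that can hold cur + n bytes, or 0 if that cannot be computed safely.
static int example_grow_limit(int cur, int n, int limit)
{
   if (limit <= 0 || cur < 0 || n < 0) return 0;  // doubling 0 never grows
   if (cur > INT_MAX - n) return 0;               // cur + n would overflow
   while (cur + n > limit) {
      if (limit > INT_MAX / 2) return 0;          // limit * 2 would overflow
      limit *= 2;
   }
   return limit;
}
```

   A caller would pass the returned capacity to its realloc wrapper and
   treat 0 as an out-of-memory condition.
*/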

static int stbi__zlength_base[31] = {
   3,4,5,6,7,8,9,10,11,13,
   15,17,19,23,27,31,35,43,51,59,
   67,83,99,115,131,163,195,227,258,0,0 };

static int stbi__zlength_extra[31]=
{ 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };

static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};

static int stbi__zdist_extra[32] =
{ 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};

static int stbi__parse_huffman_block(stbi__zbuf *a)
{
   char *zout = a->zout;
   for(;;) {
      int z = stbi__zhuffman_decode(a, &a->z_length);
      if (z < 256) {
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
         if (zout >= a->zout_end) {
            if (!stbi__zexpand(a, zout, 1)) return 0;
            zout = a->zout;
         }
         *zout++ = (char) z;
      } else {
         stbi_uc *p;
         int len,dist;
         if (z == 256) {
            a->zout = zout;
            return 1;
         }
         z -= 257;
         len = stbi__zlength_base[z];
         if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
         z = stbi__zhuffman_decode(a, &a->z_distance);
         if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
         dist = stbi__zdist_base[z];
         if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
         if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
         if (zout + len > a->zout_end) {
            if (!stbi__zexpand(a, zout, len)) return 0;
            zout = a->zout;
         }
         p = (stbi_uc *) (zout - dist);
         if (dist == 1) { // run of one byte; common in images.
            stbi_uc v = *p;
            if (len) { do *zout++ = v; while (--len); }
         } else {
            if (len) { do *zout++ = *p++; while (--len); }
         }
      }
   }
}

static int stbi__compute_huffman_codes(stbi__zbuf *a)
{
   static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   stbi__zhuffman z_codelength;
   stbi_uc lencodes[286+32+137];//padding for maximum single op
   stbi_uc codelength_sizes[19];
   int i,n;

   int hlit  = stbi__zreceive(a,5) + 257;
   int hdist = stbi__zreceive(a,5) + 1;
   int hclen = stbi__zreceive(a,4) + 4;

   memset(codelength_sizes, 0, sizeof(codelength_sizes));
   for (i=0; i < hclen; ++i) {
      int s = stbi__zreceive(a,3);
      codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   }
   if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;

   n = 0;
   while (n < hlit + hdist) {
      int c = stbi__zhuffman_decode(a, &z_codelength);
      if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
      if (c < 16)
         lencodes[n++] = (stbi_uc) c;
      else if (c == 16) {
         // code 16 repeats the previous length, so one must exist;
         // without this check lencodes[n-1] reads before the array when n == 0
         if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
         c = stbi__zreceive(a,2)+3;
         memset(lencodes+n, lencodes[n-1], c);
         n += c;
      } else if (c == 17) {
         c = stbi__zreceive(a,3)+3;
         memset(lencodes+n, 0, c);
         n += c;
      } else {
         STBI_ASSERT(c == 18);
         c = stbi__zreceive(a,7)+11;
         memset(lencodes+n, 0, c);
         n += c;
      }
   }
   if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
   if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   return 1;
}

static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
{
   stbi_uc header[4];
   int len,nlen,k;
   if (a->num_bits & 7)
      stbi__zreceive(a, a->num_bits & 7); // discard
   // drain the bit-packed data into header
   k = 0;
   while (a->num_bits > 0) {
      header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
      a->code_buffer >>= 8;
      a->num_bits -= 8;
   }
   STBI_ASSERT(a->num_bits == 0);
   // now fill header the normal way
   while (k < 4)
      header[k++] = stbi__zget8(a);
   len  = header[1] * 256 + header[0];
   nlen = header[3] * 256 + header[2];
   if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
   if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
   if (a->zout + len > a->zout_end)
      if (!stbi__zexpand(a, a->zout, len)) return 0;
   memcpy(a->zout, a->zbuffer, len);
   a->zbuffer += len;
   a->zout += len;
   return 1;
}
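/*
   Illustration only, not part of stb_image: a sketch of the LEN/NLEN
   consistency check performed on a stored (uncompressed) DEFLATE block, as
   in the function above. The name example_stored_block_len is hypothetical.
   The four header bytes are assumed to be already byte-aligned.

```c
// header[0..1] hold LEN and header[2..3] hold NLEN, both little-endian.
// NLEN must be the one's complement of LEN. Returns LEN, or -1 if the
// header is inconsistent.
static int example_stored_block_len(const unsigned char header[4])
{
   int len  = header[1] * 256 + header[0];
   int nlen = header[3] * 256 + header[2];
   if (nlen != (len ^ 0xffff)) return -1;
   return len;
}
```

   For example, the bytes 05 00 FA FF describe a 5-byte stored block, since
   0x0005 ^ 0xffff == 0xfffa.
*/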

static int stbi__parse_zlib_header(stbi__zbuf *a)
{
   int cmf   = stbi__zget8(a);
   int cm    = cmf & 15;
   /* int cinfo = cmf >> 4; */
   int flg   = stbi__zget8(a);
   if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   return 1;
}
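/*
   Illustration only, not part of stb_image: the same three zlib header
   checks as above, written as a standalone predicate over the two header
   bytes. The name example_zlib_header_ok is hypothetical.

```c
// cmf is the first byte of a zlib stream (compression method and window
// info), flg the second (check bits, preset-dictionary flag, level).
static int example_zlib_header_ok(unsigned char cmf, unsigned char flg)
{
   if ((cmf * 256 + flg) % 31 != 0) return 0;  // 16-bit header must be a multiple of 31
   if (flg & 32) return 0;                     // preset dictionary is not accepted
   if ((cmf & 15) != 8) return 0;              // compression method must be DEFLATE
   return 1;
}
```

   The common header bytes 0x78 0x9C pass all three checks
   (0x789C == 31 * 996).
*/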

// @TODO: should statically initialize these for optimal thread safety
static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
static void stbi__init_zdefaults(void)
{
   int i;   // use <= to match clearly with spec
   for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;

   for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
}

static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
{
   int final, type;
   if (parse_header)
      if (!stbi__parse_zlib_header(a)) return 0;
   a->num_bits = 0;
   a->code_buffer = 0;
   do {
      final = stbi__zreceive(a,1);
      type = stbi__zreceive(a,2);
      if (type == 0) {
         if (!stbi__parse_uncomperssed_block(a)) return 0;
      } else if (type == 3) {
         return 0;
      } else {
         if (type == 1) {
            // use fixed code lengths
            if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
            if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , 288)) return 0;
            if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
         } else {
            if (!stbi__compute_huffman_codes(a)) return 0;
         }
         if (!stbi__parse_huffman_block(a)) return 0;
      }
   } while (!final);
   return 1;
}

static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
{
   a->zout_start = obuf;
   a->zout       = obuf;
   a->zout_end   = obuf + olen;
   a->z_expandable = exp;

   return stbi__parse_zlib(a, parse_header);
}
3832 
stbi_zlib_decode_malloc_guesssize(const char * buffer,int len,int initial_size,int * outlen)3833 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3834 {
3835    stbi__zbuf a;
3836    char *p = (char *) stbi__malloc(initial_size);
3837    if (p == NULL) return NULL;
3838    a.zbuffer = (stbi_uc *) buffer;
3839    a.zbuffer_end = (stbi_uc *) buffer + len;
3840    if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3841       if (outlen) *outlen = (int) (a.zout - a.zout_start);
3842       return a.zout_start;
3843    } else {
3844       STBI_FREE(a.zout_start);
3845       return NULL;
3846    }
3847 }
3848 
stbi_zlib_decode_malloc(char const * buffer,int len,int * outlen)3849 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3850 {
3851    return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3852 }
3853 
stbi_zlib_decode_malloc_guesssize_headerflag(const char * buffer,int len,int initial_size,int * outlen,int parse_header)3854 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3855 {
3856    stbi__zbuf a;
3857    char *p = (char *) stbi__malloc(initial_size);
3858    if (p == NULL) return NULL;
3859    a.zbuffer = (stbi_uc *) buffer;
3860    a.zbuffer_end = (stbi_uc *) buffer + len;
3861    if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3862       if (outlen) *outlen = (int) (a.zout - a.zout_start);
3863       return a.zout_start;
3864    } else {
3865       STBI_FREE(a.zout_start);
3866       return NULL;
3867    }
3868 }
3869 
stbi_zlib_decode_buffer(char * obuffer,int olen,char const * ibuffer,int ilen)3870 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3871 {
3872    stbi__zbuf a;
3873    a.zbuffer = (stbi_uc *) ibuffer;
3874    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3875    if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3876       return (int) (a.zout - a.zout_start);
3877    else
3878       return -1;
3879 }
3880 
STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
{
   stbi__zbuf a;
   char *p = (char *) stbi__malloc(16384);
   if (p == NULL) return NULL;
   a.zbuffer = (stbi_uc *) buffer;
   a.zbuffer_end = (stbi_uc *) buffer+len;
   if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
      if (outlen) *outlen = (int) (a.zout - a.zout_start);
      return a.zout_start;
   } else {
      STBI_FREE(a.zout_start);
      return NULL;
   }
}

STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
{
   stbi__zbuf a;
   a.zbuffer = (stbi_uc *) ibuffer;
   a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
   if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
      return (int) (a.zout - a.zout_start);
   else
      return -1;
}
#endif

// public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
//    simple implementation
//      - only 8-bit samples
//      - no CRC checking
//      - allocates lots of intermediate memory
//        - avoids problem of streaming data between subsystems
//        - avoids explicit window management
//    performance
//      - uses stb_zlib, a PD zlib implementation with fast huffman decoding

#ifndef STBI_NO_PNG
typedef struct
{
   stbi__uint32 length;
   stbi__uint32 type;
} stbi__pngchunk;

static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
{
   stbi__pngchunk c;
   c.length = stbi__get32be(s);
   c.type   = stbi__get32be(s);
   return c;
}

static int stbi__check_png_header(stbi__context *s)
{
   static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   int i;
   for (i=0; i < 8; ++i)
      if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   return 1;
}

typedef struct
{
   stbi__context *s;
   stbi_uc *idata, *expanded, *out;
} stbi__png;


enum {
   STBI__F_none=0,
   STBI__F_sub=1,
   STBI__F_up=2,
   STBI__F_avg=3,
   STBI__F_paeth=4,
   // synthetic filters used for first scanline to avoid needing a dummy row of 0s
   STBI__F_avg_first,
   STBI__F_paeth_first
};

static stbi_uc first_row_filter[5] =
{
   STBI__F_none,
   STBI__F_sub,
   STBI__F_none,
   STBI__F_avg_first,
   STBI__F_paeth_first
};

static int stbi__paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p-a);
   int pb = abs(p-b);
   int pc = abs(p-c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
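/* Illustrative sketch, not part of stb_image: how the Paeth predictor above
   is typically used to undo the Paeth filter on one scanline. The names
   example_paeth and example_unfilter_paeth_row are hypothetical, and the
   sketch assumes a non-NULL previous row and that bytes to the left of the
   first pixel count as 0.

```c
#include <assert.h>
#include <stdlib.h>

// Paeth predictor: of left (a), up (b) and upper-left (c), pick the one
// closest to the linear estimate a + b - c; ties prefer a, then b.
static int example_paeth(int a, int b, int c)
{
   int p  = a + b - c;
   int pa = abs(p - a);
   int pb = abs(p - b);
   int pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}

// Undo the Paeth filter for one row of n bytes with bpp bytes per pixel.
// prior is the already reconstructed previous row.
static void example_unfilter_paeth_row(unsigned char *cur, const unsigned char *raw,
                                       const unsigned char *prior, int n, int bpp)
{
   int k;
   assert(n >= 0 && bpp > 0);
   for (k = 0; k < n; ++k) {
      int left    = (k >= bpp) ? cur[k - bpp]   : 0;
      int up      = prior[k];
      int up_left = (k >= bpp) ? prior[k - bpp] : 0;
      cur[k] = (unsigned char) ((raw[k] + example_paeth(left, up, up_left)) & 255);
   }
}
```

   The decoder itself avoids the per-byte bounds test by handling the first
   pixel of each row separately before entering the main loop.
*/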

static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };

// create the png data from post-deflated data
static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
{
   stbi__context *s = a->s;
   stbi__uint32 i,j,stride = x*out_n;
   stbi__uint32 img_len, img_width_bytes;
   int k;
   int img_n = s->img_n; // copy it into a local for later

   STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
   a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
   if (!a->out) return stbi__err("outofmem", "Out of memory");

   img_width_bytes = (((img_n * x * depth) + 7) >> 3);
   img_len = (img_width_bytes + 1) * y;
   if (s->img_x == x && s->img_y == y) {
      if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
   } else { // interlaced:
      if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
   }

   for (j=0; j < y; ++j) {
      stbi_uc *cur = a->out + stride*j;
      stbi_uc *prior = cur - stride;
      int filter = *raw++;
      int filter_bytes = img_n;
      int width = x;
      if (filter > 4)
         return stbi__err("invalid filter","Corrupt PNG");

      if (depth < 8) {
         STBI_ASSERT(img_width_bytes <= x);
         cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
         filter_bytes = 1;
         width = img_width_bytes;
      }

      // if first row, use special filter that doesn't sample previous row
      if (j == 0) filter = first_row_filter[filter];

      // handle first byte explicitly
      for (k=0; k < filter_bytes; ++k) {
         switch (filter) {
            case STBI__F_none       : cur[k] = raw[k]; break;
            case STBI__F_sub        : cur[k] = raw[k]; break;
            case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
            case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
            case STBI__F_avg_first  : cur[k] = raw[k]; break;
            case STBI__F_paeth_first: cur[k] = raw[k]; break;
         }
      }

      if (depth == 8) {
         if (img_n != out_n)
            cur[img_n] = 255; // first pixel
         raw += img_n;
         cur += out_n;
         prior += out_n;
      } else {
         raw += 1;
         cur += 1;
         prior += 1;
      }

      // this is a little gross, so that we don't switch per-pixel or per-component
      if (depth < 8 || img_n == out_n) {
         int nk = (width - 1)*img_n;
         #define CASE(f) \
             case f:     \
                for (k=0; k < nk; ++k)
         switch (filter) {
            // "none" filter turns into a memcpy here; make that explicit.
            case STBI__F_none:         memcpy(cur, raw, nk); break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
         }
         #undef CASE
         raw += nk;
      } else {
         STBI_ASSERT(img_n+1 == out_n);
         #define CASE(f) \
             case f:     \
                for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
                   for (k=0; k < img_n; ++k)
         switch (filter) {
            CASE(STBI__F_none)         cur[k] = raw[k]; break;
            CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
            CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
            CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
            CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
            CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
            CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
         }
         #undef CASE
      }
   }

   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
   if (depth < 8) {
      for (j=0; j < y; ++j) {
         stbi_uc *cur = a->out + stride*j;
         stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // PNG guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
         stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range

         // note that the final byte might overshoot and write more data than desired.
         // we can allocate enough data that this never writes out of memory, but it
         // could also overwrite the next scanline. can it overwrite non-empty data
         // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
         // so we need to explicitly clamp the final ones

         if (depth == 4) {
            for (k=x*img_n; k >= 2; k-=2, ++in) {
               *cur++ = scale * ((*in >> 4)       );
               *cur++ = scale * ((*in     ) & 0x0f);
            }
            if (k > 0) *cur++ = scale * ((*in >> 4)       );
         } else if (depth == 2) {
            for (k=x*img_n; k >= 4; k-=4, ++in) {
               *cur++ = scale * ((*in >> 6)       );
               *cur++ = scale * ((*in >> 4) & 0x03);
               *cur++ = scale * ((*in >> 2) & 0x03);
               *cur++ = scale * ((*in     ) & 0x03);
            }
            if (k > 0) *cur++ = scale * ((*in >> 6)       );
            if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
            if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
         } else if (depth == 1) {
            for (k=x*img_n; k >= 8; k-=8, ++in) {
               *cur++ = scale * ((*in >> 7)       );
               *cur++ = scale * ((*in >> 6) & 0x01);
               *cur++ = scale * ((*in >> 5) & 0x01);
               *cur++ = scale * ((*in >> 4) & 0x01);
               *cur++ = scale * ((*in >> 3) & 0x01);
               *cur++ = scale * ((*in >> 2) & 0x01);
               *cur++ = scale * ((*in >> 1) & 0x01);
               *cur++ = scale * ((*in     ) & 0x01);
            }
            if (k > 0) *cur++ = scale * ((*in >> 7)       );
            if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
            if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
            if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
            if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
            if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
            if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
         }
         if (img_n != out_n) {
            int q;
            // insert alpha = 255
            cur = a->out + stride*j;
            if (img_n == 1) {
               for (q=x-1; q >= 0; --q) {
                  cur[q*2+1] = 255;
                  cur[q*2+0] = cur[q];
               }
            } else {
               STBI_ASSERT(img_n == 3);
               for (q=x-1; q >= 0; --q) {
                  cur[q*4+3] = 255;
                  cur[q*4+2] = cur[q*3+2];
                  cur[q*4+1] = cur[q*3+1];
                  cur[q*4+0] = cur[q*3+0];
               }
            }
         }
      }
   }

   return 1;
}
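/* Illustrative sketch, not part of stb_image: the 1/2/4-bit expansion pass
   above, written as one generic loop instead of the unrolled per-depth code.
   The name example_unpack_bits is hypothetical. It assumes samples are packed
   most-significant bits first and that the caller passes the grayscale scale
   factor (0xff, 0x55 or 0x11 for depths 1, 2, 4) or 1 for palette indices.

```c
#include <assert.h>

// Expand count packed samples of depth bits (1, 2 or 4) into one byte per
// sample, multiplying each by scale so grayscale values span 0..255.
static void example_unpack_bits(unsigned char *out, const unsigned char *in,
                                int count, int depth, int scale)
{
   int i, per_byte, mask;
   assert(depth == 1 || depth == 2 || depth == 4);
   per_byte = 8 / depth;          // samples stored in each input byte
   mask = (1 << depth) - 1;       // low depth bits set
   for (i = 0; i < count; ++i) {
      // the first sample of a byte sits in its highest bits
      int shift = 8 - depth * (i % per_byte + 1);
      out[i] = (unsigned char) (scale * ((in[i / per_byte] >> shift) & mask));
   }
}
```

   Because this version writes to a separate output buffer and stops after
   count samples, it needs no special handling for a partial final byte; the
   decoder expands in place, which is why it clamps the last few writes.
*/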

static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
{
   stbi_uc *final;
   int p;
   if (!interlaced)
      return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);

   // de-interlacing
   final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
   if (!final) return stbi__err("outofmem", "Out of memory");
   for (p=0; p < 7; ++p) {
      int xorig[] = { 0,4,0,2,0,1,0 };
      int yorig[] = { 0,0,4,0,2,0,1 };
      int xspc[]  = { 8,8,4,4,2,2,1 };
      int yspc[]  = { 8,8,8,4,4,2,2 };
      int i,j,x,y;
      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
      if (x && y) {
         stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
         if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
            STBI_FREE(final);
            return 0;
         }
         for (j=0; j < y; ++j) {
            for (i=0; i < x; ++i) {
               int out_y = j*yspc[p]+yorig[p];
               int out_x = i*xspc[p]+xorig[p];
               memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
                      a->out + (j*x+i)*out_n, out_n);
            }
         }
         STBI_FREE(a->out);
         image_data += img_len;
         image_data_len -= img_len;
      }
   }
   a->out = final;

   return 1;
}
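/* Illustrative sketch, not part of stb_image: the per-pass dimensions used
   by the Adam7 de-interlacing loop above, pulled out into a helper. The name
   example_adam7_pass_size is hypothetical; the origin and spacing tables are
   the same values as xorig/yorig/xspc/yspc in stbi__create_png_image.

```c
#include <assert.h>

// Width and height of Adam7 pass p (0..6) for an image of w x h pixels.
// A pass whose width or height comes out as 0 carries no data at all.
static void example_adam7_pass_size(int w, int h, int p, int *pw, int *ph)
{
   static const int xorig[7] = { 0,4,0,2,0,1,0 };
   static const int yorig[7] = { 0,0,4,0,2,0,1 };
   static const int xspc[7]  = { 8,8,4,4,2,2,1 };
   static const int yspc[7]  = { 8,8,8,4,4,2,2 };
   assert(p >= 0 && p < 7 && w > 0 && h > 0);
   // count the pixels at xorig, xorig+xspc, ... that fall inside the image
   *pw = (w - xorig[p] + xspc[p] - 1) / xspc[p];
   *ph = (h - yorig[p] + yspc[p] - 1) / yspc[p];
}
```

   Summing pw*ph over the seven passes gives w*h, since every pixel of the
   image belongs to exactly one pass.
*/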

static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
{
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;

   // compute color-based transparency, assuming we've
   // already got 255 as the alpha value in the output
   STBI_ASSERT(out_n == 2 || out_n == 4);

   if (out_n == 2) {
      for (i=0; i < pixel_count; ++i) {
         p[1] = (p[0] == tc[0] ? 0 : 255);
         p += 2;
      }
   } else {
      for (i=0; i < pixel_count; ++i) {
         if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
            p[3] = 0;
         p += 4;
      }
   }
   return 1;
}

static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
{
   stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
   stbi_uc *p, *temp_out, *orig = a->out;

   p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
   if (p == NULL) return stbi__err("outofmem", "Out of memory");

   // between here and free(out) below, exiting would leak
   temp_out = p;

   if (pal_img_n == 3) {
      for (i=0; i < pixel_count; ++i) {
         int n = orig[i]*4;
         p[0] = palette[n  ];
         p[1] = palette[n+1];
         p[2] = palette[n+2];
         p += 3;
      }
   } else {
      for (i=0; i < pixel_count; ++i) {
         int n = orig[i]*4;
         p[0] = palette[n  ];
         p[1] = palette[n+1];
         p[2] = palette[n+2];
         p[3] = palette[n+3];
         p += 4;
      }
   }
   STBI_FREE(a->out);
   a->out = temp_out;

   STBI_NOTUSED(len);

   return 1;
}

static int stbi__unpremultiply_on_load = 0;
static int stbi__de_iphone_flag = 0;

STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
{
   stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
}

STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
{
   stbi__de_iphone_flag = flag_true_if_should_convert;
}

static void stbi__de_iphone(stbi__png *z)
{
   stbi__context *s = z->s;
   stbi__uint32 i, pixel_count = s->img_x * s->img_y;
   stbi_uc *p = z->out;

   if (s->img_out_n == 3) {  // convert bgr to rgb
      for (i=0; i < pixel_count; ++i) {
         stbi_uc t = p[0];
         p[0] = p[2];
         p[2] = t;
         p += 3;
      }
   } else {
      STBI_ASSERT(s->img_out_n == 4);
      if (stbi__unpremultiply_on_load) {
         // convert bgr to rgb and unpremultiply
         for (i=0; i < pixel_count; ++i) {
            stbi_uc a = p[3];
            stbi_uc t = p[0];
            if (a) {
               p[0] = p[2] * 255 / a;
               p[1] = p[1] * 255 / a;
               p[2] =  t   * 255 / a;
            } else {
               p[0] = p[2];
               p[2] = t;
            }
            p += 4;
         }
      } else {
         // convert bgr to rgb
         for (i=0; i < pixel_count; ++i) {
            stbi_uc t = p[0];
            p[0] = p[2];
            p[2] = t;
            p += 4;
         }
      }
   }
}

#define STBI__PNG_TYPE(a,b,c,d)  (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
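/* Illustrative sketch, not part of stb_image: how a four-character chunk
   type packs into a 32-bit value, and why the parser below tests bit 29 to
   decide whether an unknown chunk can be skipped. All EXAMPLE_ and example_
   names are hypothetical. In PNG, a lowercase first letter (bit 5 of the
   first byte, which is bit 29 of the packed value) marks an ancillary chunk.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PNG_TYPE(a,b,c,d) \
   (((uint32_t)(a) << 24) + ((uint32_t)(b) << 16) + ((uint32_t)(c) << 8) + (uint32_t)(d))

// Nonzero if the chunk is ancillary, i.e. a decoder may ignore it.
static int example_chunk_is_ancillary(uint32_t type)
{
   return (type & (UINT32_C(1) << 29)) != 0;
}

// Write the four-character chunk name plus a terminating NUL into out[5].
static void example_chunk_name(uint32_t type, char out[5])
{
   assert(out != 0);
   out[0] = (char) ((type >> 24) & 255);
   out[1] = (char) ((type >> 16) & 255);
   out[2] = (char) ((type >>  8) & 255);
   out[3] = (char) ((type >>  0) & 255);
   out[4] = '\0';
}
```

   Packing the type this way lets the parser dispatch on chunk types with a
   plain switch over integer constants.
*/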

static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
{
   stbi_uc palette[1024], pal_img_n=0;
   stbi_uc has_trans=0, tc[3];
   stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
   int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
   stbi__context *s = z->s;

   z->expanded = NULL;
   z->idata = NULL;
   z->out = NULL;

   if (!stbi__check_png_header(s)) return 0;

   if (scan == STBI__SCAN_type) return 1;

   for (;;) {
      stbi__pngchunk c = stbi__get_chunk_header(s);
      switch (c.type) {
         case STBI__PNG_TYPE('C','g','B','I'):
            is_iphone = 1;
            stbi__skip(s, c.length);
            break;
         case STBI__PNG_TYPE('I','H','D','R'): {
            int comp,filter;
            if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
            first = 0;
            if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
            s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
            s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
            depth = stbi__get8(s);  if (depth != 1 && depth != 2 && depth != 4 && depth != 8)  return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
            color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
            if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
            comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
            filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
            interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
            if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
            if (!pal_img_n) {
               s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
               if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
               if (scan == STBI__SCAN_header) return 1;
            } else {
               // if paletted, then pal_img_n is our final number of components, and
               // img_n is # components to decompress/filter.
               s->img_n = 1;
               if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
               // if SCAN_header, have to scan to see if we have a tRNS
            }
            break;
         }

         case STBI__PNG_TYPE('P','L','T','E'):  {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
            pal_len = c.length / 3;
            if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
            for (i=0; i < pal_len; ++i) {
               palette[i*4+0] = stbi__get8(s);
               palette[i*4+1] = stbi__get8(s);
               palette[i*4+2] = stbi__get8(s);
               palette[i*4+3] = 255;
            }
            break;
         }

         case STBI__PNG_TYPE('t','R','N','S'): {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
            if (pal_img_n) {
               if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
               if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
               if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
               pal_img_n = 4;
               for (i=0; i < c.length; ++i)
                  palette[i*4+3] = stbi__get8(s);
            } else {
               if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
               if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
               has_trans = 1;
               for (k=0; k < s->img_n; ++k)
                  tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
            }
            break;
         }

         case STBI__PNG_TYPE('I','D','A','T'): {
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
            if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
            if ((int)(ioff + c.length) < (int)ioff) return 0;
            if (ioff + c.length > idata_limit) {
               stbi__uint32 idata_limit_old = idata_limit;
               stbi_uc *p;
               if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
               while (ioff + c.length > idata_limit)
                  idata_limit *= 2;
               STBI_NOTUSED(idata_limit_old);
               p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
               z->idata = p;
            }
            if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
            ioff += c.length;
            break;
         }

         case STBI__PNG_TYPE('I','E','N','D'): {
            stbi__uint32 raw_len, bpl;
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if (scan != STBI__SCAN_load) return 1;
            if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
            // initial guess for decoded data size to avoid unnecessary reallocs
            bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
            raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
            z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
            if (z->expanded == NULL) return 0; // zlib should set error
            STBI_FREE(z->idata); z->idata = NULL;
            if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
               s->img_out_n = s->img_n+1;
            else
               s->img_out_n = s->img_n;
            if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
            if (has_trans)
               if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
            if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
               stbi__de_iphone(z);
            if (pal_img_n) {
               // pal_img_n == 3 or 4
               s->img_n = pal_img_n; // record the actual colors we had
               s->img_out_n = pal_img_n;
               if (req_comp >= 3) s->img_out_n = req_comp;
               if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
                  return 0;
            }
            STBI_FREE(z->expanded); z->expanded = NULL;
            return 1;
         }

         default:
            // if critical, fail
            if (first) return stbi__err("first not IHDR", "Corrupt PNG");
            if ((c.type & (1 << 29)) == 0) {
               #ifndef STBI_NO_FAILURE_STRINGS
               // not threadsafe
               static char invalid_chunk[] = "XXXX PNG chunk not known";
               invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
               invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
               invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
               invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
               #endif
               return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
            }
            stbi__skip(s, c.length);
            break;
      }
      // end of PNG chunk, read and skip CRC
      stbi__get32be(s);
   }
}

static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
{
   unsigned char *result=NULL;
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
   if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
      result = p->out;
      p->out = NULL;
      if (req_comp && req_comp != p->s->img_out_n) {
         result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
         p->s->img_out_n = req_comp;
         if (result == NULL) return result;
      }
      *x = p->s->img_x;
      *y = p->s->img_y;
      if (n) *n = p->s->img_out_n;
   }
   STBI_FREE(p->out);      p->out      = NULL;
   STBI_FREE(p->expanded); p->expanded = NULL;
   STBI_FREE(p->idata);    p->idata    = NULL;

   return result;
}

static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__png p;
   p.s = s;
   return stbi__do_png(&p, x,y,comp,req_comp);
}

static int stbi__png_test(stbi__context *s)
{
   int r;
   r = stbi__check_png_header(s);
   stbi__rewind(s);
   return r;
}

static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
{
   if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
      stbi__rewind( p->s );
      return 0;
   }
   if (x) *x = p->s->img_x;
   if (y) *y = p->s->img_y;
   if (comp) *comp = p->s->img_n;
   return 1;
}

static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__png p;
   p.s = s;
   return stbi__png_info_raw(&p, x, y, comp);
}
#endif

// Microsoft/Windows BMP image

#ifndef STBI_NO_BMP
static int stbi__bmp_test_raw(stbi__context *s)
{
   int r;
   int sz;
   if (stbi__get8(s) != 'B') return 0;
   if (stbi__get8(s) != 'M') return 0;
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   stbi__get32le(s); // discard data offset
   sz = stbi__get32le(s);
   r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
   return r;
}

static int stbi__bmp_test(stbi__context *s)
{
   int r = stbi__bmp_test_raw(s);
   stbi__rewind(s);
   return r;
}


// returns 0..31 for the highest set bit, or -1 if z is 0
static int stbi__high_bit(unsigned int z)
{
   int n=0;
   if (z == 0) return -1;
   if (z >= 0x10000) n += 16, z >>= 16;
   if (z >= 0x00100) n +=  8, z >>=  8;
   if (z >= 0x00010) n +=  4, z >>=  4;
   if (z >= 0x00004) n +=  2, z >>=  2;
   if (z >= 0x00002) n +=  1, z >>=  1;
   return n;
}

static int stbi__bitcount(unsigned int a)
{
   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   a = (a + (a >> 8)); // max 16 per 8 bits
   a = (a + (a >> 16)); // max 32 per 8 bits
   return a & 0xff;
}
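/* Illustrative sketch, not part of stb_image: one way the highest-set-bit
   and bit-count helpers above can be combined to pull a color channel out
   of a masked BMP pixel and widen it to 8 bits. All example_ names are
   hypothetical, the helpers are written as simple loops rather than the
   branch-free forms above, and the mask is assumed to be a single
   contiguous run of at most 8 bits.

```c
#include <assert.h>

// Index (0..31) of the highest set bit of z, or -1 if z is 0.
static int example_high_bit(unsigned int z)
{
   int n = -1;
   while (z) { ++n; z >>= 1; }
   return n;
}

// Number of set bits in a.
static int example_bitcount(unsigned int a)
{
   int n = 0;
   while (a) { n += (int) (a & 1u); a >>= 1; }
   return n;
}

// Extract the channel selected by mask and widen it to 0..255 by
// left-aligning it in a byte and repeating its bits into the low end.
static int example_extract_channel(unsigned int pixel, unsigned int mask)
{
   int bits = example_bitcount(mask);
   int low  = example_high_bit(mask) + 1 - bits; // position of the lowest mask bit
   unsigned int v, result;
   int z;
   assert(mask != 0 && bits <= 8);
   v = ((pixel & mask) >> low) << (8 - bits);
   result = v;
   for (z = bits; z < 8; z += bits)
      result |= v >> z;
   return (int) result;
}
```

   Repeating the bits maps the largest channel value to exactly 255 and the
   smallest to 0, which a plain left shift would not.
*/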

static int stbi__shiftsigned(int v, int shift, int bits)
{
   int result;
   int z=0;

   if (shift < 0) v <<= -shift;
   else v >>= shift;
   result = v;

   z = bits;
   while (z < 8) {
      result += v >> z;
      z += bits;
   }
   return result;
}

typedef struct
{
   int bpp, offset, hsz;
   unsigned int mr,mg,mb,ma, all_a;
} stbi__bmp_data;

static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
{
   int hsz;
   if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   info->offset = stbi__get32le(s);
   info->hsz = hsz = stbi__get32le(s);

   if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
   if (hsz == 12) {
      s->img_x = stbi__get16le(s);
      s->img_y = stbi__get16le(s);
   } else {
      s->img_x = stbi__get32le(s);
      s->img_y = stbi__get32le(s);
   }
   if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
   info->bpp = stbi__get16le(s);
   if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
   if (hsz != 12) {
      int compress = stbi__get32le(s);
      if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
      stbi__get32le(s); // discard sizeof
      stbi__get32le(s); // discard hres
      stbi__get32le(s); // discard vres
      stbi__get32le(s); // discard colorsused
      stbi__get32le(s); // discard max important
      if (hsz == 40 || hsz == 56) {
         if (hsz == 56) {
            stbi__get32le(s);
            stbi__get32le(s);
            stbi__get32le(s);
            stbi__get32le(s);
         }
         if (info->bpp == 16 || info->bpp == 32) {
            info->mr = info->mg = info->mb = 0;
            if (compress == 0) {
               if (info->bpp == 32) {
                  info->mr = 0xffu << 16;
                  info->mg = 0xffu <<  8;
                  info->mb = 0xffu <<  0;
                  info->ma = 0xffu << 24;
                  info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
               } else {
                  info->mr = 31u << 10;
                  info->mg = 31u <<  5;
                  info->mb = 31u <<  0;
               }
            } else if (compress == 3) {
               info->mr = stbi__get32le(s);
               info->mg = stbi__get32le(s);
               info->mb = stbi__get32le(s);
               // not documented, but generated by photoshop and handled by mspaint
               if (info->mr == info->mg && info->mg == info->mb) {
                  // ?!?!?
4669                   return stbi__errpuc("bad BMP", "bad BMP");
4670                }
4671             } else
4672                return stbi__errpuc("bad BMP", "bad BMP");
4673          }
4674       } else {
4675          int i;
4676          if (hsz != 108 && hsz != 124)
4677             return stbi__errpuc("bad BMP", "bad BMP");
4678          info->mr = stbi__get32le(s);
4679          info->mg = stbi__get32le(s);
4680          info->mb = stbi__get32le(s);
4681          info->ma = stbi__get32le(s);
4682          stbi__get32le(s); // discard color space
4683          for (i=0; i < 12; ++i)
4684             stbi__get32le(s); // discard color space parameters
4685          if (hsz == 124) {
4686             stbi__get32le(s); // discard rendering intent
4687             stbi__get32le(s); // discard offset of profile data
4688             stbi__get32le(s); // discard size of profile data
4689             stbi__get32le(s); // discard reserved
4690          }
4691       }
4692    }
4693    return (void *) 1;
4694 }
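/*
   Illustrative sketch, not part of stb_image: reading the same leading
   header fields from an in-memory buffer instead of an stbi__context. It
   assumes the common 40-byte info header only; ex_bmp_info and the ex_*
   functions are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
   unsigned int data_offset, header_size;
   int width, height, bpp;
} ex_bmp_info;

static unsigned int ex_le16(const unsigned char *p)
{
   return (unsigned int) p[0] | ((unsigned int) p[1] << 8);
}

static unsigned int ex_le32(const unsigned char *p)
{
   return ex_le16(p) | (ex_le16(p + 2) << 16);
}

// Returns 1 on success, 0 if buf is not a BMP with a 40-byte info header.
static int ex_bmp_read_header(const unsigned char *buf, size_t len, ex_bmp_info *out)
{
   assert(buf != NULL && out != NULL);
   if (len < 14 + 40) return 0;
   if (buf[0] != 'B' || buf[1] != 'M') return 0;
   out->data_offset = ex_le32(buf + 10);   // bytes 2..9 are file size and reserved
   out->header_size = ex_le32(buf + 14);
   if (out->header_size != 40) return 0;
   out->width  = (int) ex_le32(buf + 18);
   out->height = (int) ex_le32(buf + 22);  // positive: rows are stored bottom-up
   if (ex_le16(buf + 26) != 1) return 0;   // plane count must be 1
   out->bpp = (int) ex_le16(buf + 28);
   return 1;
}
```

   The offsets mirror the order of the reads in stbi__bmp_parse_header: a
   14-byte file header followed by the info header.
*/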
4695 
4696 
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4698 {
4699    stbi_uc *out;
4700    unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
4701    stbi_uc pal[256][4];
4702    int psize=0,i,j,width;
4703    int flip_vertically, pad, target;
4704    stbi__bmp_data info;
4705 
4706    info.all_a = 255;
4707    if (stbi__bmp_parse_header(s, &info) == NULL)
4708       return NULL; // error code already set
4709 
4710    flip_vertically = ((int) s->img_y) > 0;
4711    s->img_y = abs((int) s->img_y);
4712 
4713    mr = info.mr;
4714    mg = info.mg;
4715    mb = info.mb;
4716    ma = info.ma;
4717    all_a = info.all_a;
4718 
4719    if (info.hsz == 12) {
4720       if (info.bpp < 24)
4721          psize = (info.offset - 14 - 24) / 3;
4722    } else {
4723       if (info.bpp < 16)
4724          psize = (info.offset - 14 - info.hsz) >> 2;
4725    }
4726 
4727    s->img_n = ma ? 4 : 3;
4728    if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4729       target = req_comp;
4730    else
4731       target = s->img_n; // if they want monochrome, we'll post-convert
4732 
4733    out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4734    if (!out) return stbi__errpuc("outofmem", "Out of memory");
4735    if (info.bpp < 16) {
4736       int z=0;
4737       if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4738       for (i=0; i < psize; ++i) {
4739          pal[i][2] = stbi__get8(s);
4740          pal[i][1] = stbi__get8(s);
4741          pal[i][0] = stbi__get8(s);
4742          if (info.hsz != 12) stbi__get8(s);
4743          pal[i][3] = 255;
4744       }
4745       stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
4746       if (info.bpp == 4) width = (s->img_x + 1) >> 1;
4747       else if (info.bpp == 8) width = s->img_x;
4748       else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4749       pad = (-width)&3;
4750       for (j=0; j < (int) s->img_y; ++j) {
4751          for (i=0; i < (int) s->img_x; i += 2) {
4752             int v=stbi__get8(s),v2=0;
4753             if (info.bpp == 4) {
4754                v2 = v & 15;
4755                v >>= 4;
4756             }
4757             out[z++] = pal[v][0];
4758             out[z++] = pal[v][1];
4759             out[z++] = pal[v][2];
4760             if (target == 4) out[z++] = 255;
4761             if (i+1 == (int) s->img_x) break;
4762             v = (info.bpp == 8) ? stbi__get8(s) : v2;
4763             out[z++] = pal[v][0];
4764             out[z++] = pal[v][1];
4765             out[z++] = pal[v][2];
4766             if (target == 4) out[z++] = 255;
4767          }
4768          stbi__skip(s, pad);
4769       }
4770    } else {
4771       int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4772       int z = 0;
4773       int easy=0;
4774       stbi__skip(s, info.offset - 14 - info.hsz);
4775       if (info.bpp == 24) width = 3 * s->img_x;
4776       else if (info.bpp == 16) width = 2*s->img_x;
4777       else /* bpp = 32 and pad = 0 */ width=0;
4778       pad = (-width) & 3;
4779       if (info.bpp == 24) {
4780          easy = 1;
4781       } else if (info.bpp == 32) {
4782          if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4783             easy = 2;
4784       }
4785       if (!easy) {
4786          if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4787          // right shift amt to put high bit in position #7
4788          rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4789          gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4790          bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4791          ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4792       }
4793       for (j=0; j < (int) s->img_y; ++j) {
4794          if (easy) {
4795             for (i=0; i < (int) s->img_x; ++i) {
4796                unsigned char a;
4797                out[z+2] = stbi__get8(s);
4798                out[z+1] = stbi__get8(s);
4799                out[z+0] = stbi__get8(s);
4800                z += 3;
4801                a = (easy == 2 ? stbi__get8(s) : 255);
4802                all_a |= a;
4803                if (target == 4) out[z++] = a;
4804             }
4805          } else {
4806             int bpp = info.bpp;
4807             for (i=0; i < (int) s->img_x; ++i) {
4808                stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4809                int a;
4810                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4811                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4812                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4813                a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4814                all_a |= a;
4815                if (target == 4) out[z++] = STBI__BYTECAST(a);
4816             }
4817          }
4818          stbi__skip(s, pad);
4819       }
4820    }
4821 
4822    // if alpha channel is all 0s, replace with all 255s
4823    if (target == 4 && all_a == 0)
4824       for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4825          out[i] = 255;
4826 
4827    if (flip_vertically) {
4828       stbi_uc t;
4829       for (j=0; j < (int) s->img_y>>1; ++j) {
4830          stbi_uc *p1 = out +      j     *s->img_x*target;
4831          stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4832          for (i=0; i < (int) s->img_x*target; ++i) {
4833             t = p1[i], p1[i] = p2[i], p2[i] = t;
4834          }
4835       }
4836    }
4837 
4838    if (req_comp && req_comp != target) {
4839       out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4840       if (out == NULL) return out; // stbi__convert_format frees input on failure
4841    }
4842 
4843    *x = s->img_x;
4844    *y = s->img_y;
4845    if (comp) *comp = s->img_n;
4846    return out;
4847 }
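/*
   Illustrative sketch, not part of stb_image: the row-padding arithmetic
   used in the loader above. BMP rows are padded to a multiple of 4 bytes,
   and (-n) & 3 gives the number of padding bytes after n payload bytes.
   ex_bmp_row_bytes and ex_bmp_row_pad are hypothetical helper names.

```c
#include <assert.h>

// Payload bytes per row for the bit depths the loader handles (4, 8, 16, 24, 32).
static int ex_bmp_row_bytes(int width, int bpp)
{
   assert(width >= 0);
   if (bpp == 4) return (width + 1) >> 1;   // two pixels per byte, rounded up
   return width * (bpp / 8);
}

// Padding bytes that follow a row of row_bytes payload bytes.
static int ex_bmp_row_pad(int row_bytes)
{
   assert(row_bytes >= 0);
   return (-row_bytes) & 3;
}
```

   A 5-pixel-wide 24-bit row holds 15 payload bytes and so needs 1 padding
   byte; 32-bit rows never need padding.
*/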
4848 #endif
4849 
4850 // Targa Truevision - TGA
4851 // by Jonathan Dummer
4852 #ifndef STBI_NO_TGA
4853 // returns STBI_rgb or whatever, 0 on error
static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
{
   // only RGB or RGBA (incl. 16bit) or grey allowed
   if(is_rgb16) *is_rgb16 = 0;
   switch(bits_per_pixel) {
      case 8:  return STBI_grey;
      case 16: if(is_grey) return STBI_grey_alpha;
            // else: fall-through
      case 15: if(is_rgb16) *is_rgb16 = 1;
            return STBI_rgb;
      case 24: // fall-through
      case 32: return bits_per_pixel/8;
      default: return 0;
   }
}
4869 
static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4871 {
4872     int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
4873     int sz, tga_colormap_type;
4874     stbi__get8(s);                   // discard Offset
4875     tga_colormap_type = stbi__get8(s); // colormap type
4876     if( tga_colormap_type > 1 ) {
4877         stbi__rewind(s);
4878         return 0;      // only RGB or indexed allowed
4879     }
4880     tga_image_type = stbi__get8(s); // image type
4881     if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
4882         if (tga_image_type != 1 && tga_image_type != 9) {
4883             stbi__rewind(s);
4884             return 0;
4885         }
4886         stbi__skip(s,4);       // skip index of first colormap entry and number of entries
4887         sz = stbi__get8(s);    //   check bits per palette color entry
4888         if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
4889             stbi__rewind(s);
4890             return 0;
4891         }
4892         stbi__skip(s,4);       // skip image x and y origin
4893         tga_colormap_bpp = sz;
4894     } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
4895         if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
4896             stbi__rewind(s);
4897             return 0; // only RGB or grey allowed, +/- RLE
4898         }
4899         stbi__skip(s,9); // skip colormap specification and image x/y origin
4900         tga_colormap_bpp = 0;
4901     }
4902     tga_w = stbi__get16le(s);
4903     if( tga_w < 1 ) {
4904         stbi__rewind(s);
4905         return 0;   // test width
4906     }
4907     tga_h = stbi__get16le(s);
4908     if( tga_h < 1 ) {
4909         stbi__rewind(s);
4910         return 0;   // test height
4911     }
4912     tga_bits_per_pixel = stbi__get8(s); // bits per pixel
4913     stbi__get8(s); // ignore alpha bits
4914     if (tga_colormap_bpp != 0) {
4915         if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
4916             // when using a colormap, tga_bits_per_pixel is the size of the indexes
4917             // I don't think anything but 8 or 16bit indexes makes sense
4918             stbi__rewind(s);
4919             return 0;
4920         }
4921         tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
4922     } else {
4923         tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
4924     }
4925     if(!tga_comp) {
4926       stbi__rewind(s);
4927       return 0;
4928     }
4929     if (x) *x = tga_w;
4930     if (y) *y = tga_h;
4931     if (comp) *comp = tga_comp;
4932     return 1;                   // seems to have passed everything
4933 }
4934 
static int stbi__tga_test(stbi__context *s)
4936 {
4937    int res = 0;
4938    int sz, tga_color_type;
4939    stbi__get8(s);      //   discard Offset
4940    tga_color_type = stbi__get8(s);   //   color type
4941    if ( tga_color_type > 1 ) goto errorEnd;   //   only RGB or indexed allowed
4942    sz = stbi__get8(s);   //   image type
4943    if ( tga_color_type == 1 ) { // colormapped (paletted) image
4944       if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
4945       stbi__skip(s,4);       // skip index of first colormap entry and number of entries
4946       sz = stbi__get8(s);    //   check bits per palette color entry
4947       if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
4948       stbi__skip(s,4);       // skip image x and y origin
4949    } else { // "normal" image w/o colormap
4950       if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
4951       stbi__skip(s,9); // skip colormap specification and image x/y origin
4952    }
4953    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test width
4954    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test height
4955    sz = stbi__get8(s);   //   bits per pixel
4956    if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
4957    if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
4958 
4959    res = 1; // if we got this far, everything's good and we can return 1 instead of 0
4960 
4961 errorEnd:
4962    stbi__rewind(s);
4963    return res;
4964 }
4965 
// read 16bit value and convert to 24bit RGB
static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
{
   stbi__uint16 px = stbi__get16le(s);
   stbi__uint16 fiveBitMask = 31;
   // we have 3 channels with 5bits each
   int r = (px >> 10) & fiveBitMask;
   int g = (px >> 5) & fiveBitMask;
   int b = px & fiveBitMask;
   // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
   out[0] = (r * 255)/31;
   out[1] = (g * 255)/31;
   out[2] = (b * 255)/31;

   // some people claim that the most significant bit might be used for alpha
   // (possibly if an alpha-bit is set in the "image descriptor byte")
   // but that only made 16bit test images completely translucent..
   // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
}
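/*
   Illustrative sketch, not part of stb_image: the same 5-bit to 8-bit
   scaling applied to a pixel value already held in memory.
   ex_rgb555_to_rgb888 is a hypothetical name for this example.

```c
#include <assert.h>

// Convert a 15/16-bit TGA pixel laid out as xRRRRRGGGGGBBBBB to 8-bit RGB.
// The top bit is ignored, matching the decision described above.
static void ex_rgb555_to_rgb888(unsigned int px, unsigned char out[3])
{
   unsigned int r, g, b;
   assert(px <= 0xFFFFu);
   r = (px >> 10) & 31u;
   g = (px >>  5) & 31u;
   b =  px        & 31u;
   out[0] = (unsigned char) ((r * 255u) / 31u);   // 31 -> 255, 0 -> 0
   out[1] = (unsigned char) ((g * 255u) / 31u);
   out[2] = (unsigned char) ((b * 255u) / 31u);
}
```

   The integer division truncates, so a channel value of 16 maps to 131.
*/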
4985 
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4987 {
4988    //   read in the TGA header stuff
4989    int tga_offset = stbi__get8(s);
4990    int tga_indexed = stbi__get8(s);
4991    int tga_image_type = stbi__get8(s);
4992    int tga_is_RLE = 0;
4993    int tga_palette_start = stbi__get16le(s);
4994    int tga_palette_len = stbi__get16le(s);
4995    int tga_palette_bits = stbi__get8(s);
4996    int tga_x_origin = stbi__get16le(s);
4997    int tga_y_origin = stbi__get16le(s);
4998    int tga_width = stbi__get16le(s);
4999    int tga_height = stbi__get16le(s);
5000    int tga_bits_per_pixel = stbi__get8(s);
5001    int tga_comp, tga_rgb16=0;
5002    int tga_inverted = stbi__get8(s);
5003    // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5004    //   image data
5005    unsigned char *tga_data;
5006    unsigned char *tga_palette = NULL;
5007    int i, j;
5008    unsigned char raw_data[4];
5009    int RLE_count = 0;
5010    int RLE_repeating = 0;
5011    int read_next_pixel = 1;
5012 
   //   do a tiny bit of preprocessing
5014    if ( tga_image_type >= 8 )
5015    {
5016       tga_image_type -= 8;
5017       tga_is_RLE = 1;
5018    }
5019    tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5020 
5021    //   If I'm paletted, then I'll use the number of bits from the palette
5022    if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5023    else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5024 
5025    if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5026       return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5027 
5028    //   tga info
5029    *x = tga_width;
5030    *y = tga_height;
5031    if (comp) *comp = tga_comp;
5032 
5033    tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
5034    if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5035 
5036    // skip to the data's starting position (offset usually = 0)
5037    stbi__skip(s, tga_offset );
5038 
5039    if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5040       for (i=0; i < tga_height; ++i) {
5041          int row = tga_inverted ? tga_height -i - 1 : i;
5042          stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5043          stbi__getn(s, tga_row, tga_width * tga_comp);
5044       }
5045    } else  {
5046       //   do I need to load a palette?
5047       if ( tga_indexed)
5048       {
5049          //   any data to skip? (offset usually = 0)
5050          stbi__skip(s, tga_palette_start );
5051          //   load the palette
5052          tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
5053          if (!tga_palette) {
5054             STBI_FREE(tga_data);
5055             return stbi__errpuc("outofmem", "Out of memory");
5056          }
5057          if (tga_rgb16) {
5058             stbi_uc *pal_entry = tga_palette;
5059             STBI_ASSERT(tga_comp == STBI_rgb);
5060             for (i=0; i < tga_palette_len; ++i) {
5061                stbi__tga_read_rgb16(s, pal_entry);
5062                pal_entry += tga_comp;
5063             }
5064          } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5065                STBI_FREE(tga_data);
5066                STBI_FREE(tga_palette);
5067                return stbi__errpuc("bad palette", "Corrupt TGA");
5068          }
5069       }
5070       //   load the data
5071       for (i=0; i < tga_width * tga_height; ++i)
5072       {
         //   if I'm in RLE mode, do I need to get a RLE chunk?
5074          if ( tga_is_RLE )
5075          {
5076             if ( RLE_count == 0 )
5077             {
5078                //   yep, get the next byte as a RLE command
5079                int RLE_cmd = stbi__get8(s);
5080                RLE_count = 1 + (RLE_cmd & 127);
5081                RLE_repeating = RLE_cmd >> 7;
5082                read_next_pixel = 1;
5083             } else if ( !RLE_repeating )
5084             {
5085                read_next_pixel = 1;
5086             }
5087          } else
5088          {
5089             read_next_pixel = 1;
5090          }
5091          //   OK, if I need to read a pixel, do it now
5092          if ( read_next_pixel )
5093          {
5094             //   load however much data we did have
5095             if ( tga_indexed )
5096             {
5097                // read in index, then perform the lookup
5098                int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5099                if ( pal_idx >= tga_palette_len ) {
5100                   // invalid index
5101                   pal_idx = 0;
5102                }
5103                pal_idx *= tga_comp;
5104                for (j = 0; j < tga_comp; ++j) {
5105                   raw_data[j] = tga_palette[pal_idx+j];
5106                }
5107             } else if(tga_rgb16) {
5108                STBI_ASSERT(tga_comp == STBI_rgb);
5109                stbi__tga_read_rgb16(s, raw_data);
5110             } else {
5111                //   read in the data raw
5112                for (j = 0; j < tga_comp; ++j) {
5113                   raw_data[j] = stbi__get8(s);
5114                }
5115             }
5116             //   clear the reading flag for the next pixel
5117             read_next_pixel = 0;
5118          } // end of reading a pixel
5119 
5120          // copy data
5121          for (j = 0; j < tga_comp; ++j)
5122            tga_data[i*tga_comp+j] = raw_data[j];
5123 
5124          //   in case we're in RLE mode, keep counting down
5125          --RLE_count;
5126       }
5127       //   do I need to invert the image?
5128       if ( tga_inverted )
5129       {
5130          for (j = 0; j*2 < tga_height; ++j)
5131          {
5132             int index1 = j * tga_width * tga_comp;
5133             int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5134             for (i = tga_width * tga_comp; i > 0; --i)
5135             {
5136                unsigned char temp = tga_data[index1];
5137                tga_data[index1] = tga_data[index2];
5138                tga_data[index2] = temp;
5139                ++index1;
5140                ++index2;
5141             }
5142          }
5143       }
5144       //   clear my palette, if I had one
5145       if ( tga_palette != NULL )
5146       {
5147          STBI_FREE( tga_palette );
5148       }
5149    }
5150 
5151    // swap RGB - if the source data was RGB16, it already is in the right order
5152    if (tga_comp >= 3 && !tga_rgb16)
5153    {
5154       unsigned char* tga_pixel = tga_data;
5155       for (i=0; i < tga_width * tga_height; ++i)
5156       {
5157          unsigned char temp = tga_pixel[0];
5158          tga_pixel[0] = tga_pixel[2];
5159          tga_pixel[2] = temp;
5160          tga_pixel += tga_comp;
5161       }
5162    }
5163 
5164    // convert to target component count
5165    if (req_comp && req_comp != tga_comp)
5166       tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5167 
5168    //   the things I do to get rid of an error message, and yet keep
5169    //   Microsoft's C compilers happy... [8^(
5170    tga_palette_start = tga_palette_len = tga_palette_bits =
5171          tga_x_origin = tga_y_origin = 0;
5172    //   OK, done
5173    return tga_data;
5174 }
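/*
   Illustrative sketch, not part of stb_image: TGA run-length decoding as a
   standalone buffer-to-buffer routine. The loader above decodes one pixel
   per loop iteration from a stream; this version works packet by packet
   and clamps packets that run past the end of the image, which is a choice
   made for this example. ex_tga_rle_decode is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Each packet starts with a header byte: (low 7 bits) + 1 is the pixel
// count; if the high bit is set, one pixel follows and is repeated,
// otherwise that many raw pixels follow.
// Returns the number of pixels written, or 0 if src ends early.
static size_t ex_tga_rle_decode(const unsigned char *src, size_t src_len,
                                unsigned char *dst, size_t pixel_count,
                                size_t bytes_per_pixel)
{
   size_t in = 0, out = 0, k;
   assert(bytes_per_pixel >= 1);
   while (out < pixel_count) {
      size_t count;
      int is_run;
      if (in >= src_len) return 0;
      count  = 1u + (src[in] & 127u);
      is_run = src[in] >> 7;
      ++in;
      if (count > pixel_count - out) count = pixel_count - out;
      if (is_run) {
         if (src_len - in < bytes_per_pixel) return 0;
         for (k = 0; k < count; ++k)
            memcpy(dst + (out + k) * bytes_per_pixel, src + in, bytes_per_pixel);
         in += bytes_per_pixel;
      } else {
         if (src_len - in < count * bytes_per_pixel) return 0;
         memcpy(dst + out * bytes_per_pixel, src + in, count * bytes_per_pixel);
         in += count * bytes_per_pixel;
      }
      out += count;
   }
   return out;
}
```

   With one byte per pixel, the input 0x82 0x07 0x01 0x01 0x02 decodes to
   the five pixels 7 7 7 1 2.
*/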
5175 #endif
5176 
5177 // *************************************************************************************************
5178 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5179 
5180 #ifndef STBI_NO_PSD
static int stbi__psd_test(stbi__context *s)
{
   int r = (stbi__get32be(s) == 0x38425053);   // "8BPS"
   stbi__rewind(s);
   return r;
}
5187 
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5189 {
5190    int   pixelCount;
5191    int channelCount, compression;
5192    int channel, i, count, len;
5193    int bitdepth;
5194    int w,h;
5195    stbi_uc *out;
5196 
5197    // Check identifier
5198    if (stbi__get32be(s) != 0x38425053)   // "8BPS"
5199       return stbi__errpuc("not PSD", "Corrupt PSD image");
5200 
5201    // Check file type version.
5202    if (stbi__get16be(s) != 1)
5203       return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5204 
5205    // Skip 6 reserved bytes.
5206    stbi__skip(s, 6 );
5207 
5208    // Read the number of channels (R, G, B, A, etc).
5209    channelCount = stbi__get16be(s);
5210    if (channelCount < 0 || channelCount > 16)
5211       return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5212 
5213    // Read the rows and columns of the image.
5214    h = stbi__get32be(s);
5215    w = stbi__get32be(s);
5216 
   // Make sure the depth is 8 or 16 bits.
5218    bitdepth = stbi__get16be(s);
5219    if (bitdepth != 8 && bitdepth != 16)
5220       return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5221 
5222    // Make sure the color mode is RGB.
5223    // Valid options are:
5224    //   0: Bitmap
5225    //   1: Grayscale
5226    //   2: Indexed color
5227    //   3: RGB color
5228    //   4: CMYK color
5229    //   7: Multichannel
5230    //   8: Duotone
5231    //   9: Lab color
5232    if (stbi__get16be(s) != 3)
5233       return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5234 
5235    // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
5236    stbi__skip(s,stbi__get32be(s) );
5237 
5238    // Skip the image resources.  (resolution, pen tool paths, etc)
5239    stbi__skip(s, stbi__get32be(s) );
5240 
5241    // Skip the reserved data.
5242    stbi__skip(s, stbi__get32be(s) );
5243 
5244    // Find out if the data is compressed.
5245    // Known values:
5246    //   0: no compression
5247    //   1: RLE compressed
5248    compression = stbi__get16be(s);
5249    if (compression > 1)
5250       return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5251 
5252    // Create the destination image.
5253    out = (stbi_uc *) stbi__malloc(4 * w*h);
5254    if (!out) return stbi__errpuc("outofmem", "Out of memory");
5255    pixelCount = w*h;
5256 
5257    // Initialize the data to zero.
5258    //memset( out, 0, pixelCount * 4 );
5259 
5260    // Finally, the image data.
5261    if (compression) {
5262       // RLE as used by .PSD and .TIFF
5263       // Loop until you get the number of unpacked bytes you are expecting:
5264       //     Read the next source byte into n.
5265       //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5266       //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
      //     Else if n is -128 (byte value 128), noop.
5268       // Endloop
5269 
      // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
5271       // which we're going to just skip.
5272       stbi__skip(s, h * channelCount * 2 );
5273 
5274       // Read the RLE data by channel.
5275       for (channel = 0; channel < 4; channel++) {
5276          stbi_uc *p;
5277 
5278          p = out+channel;
5279          if (channel >= channelCount) {
5280             // Fill this channel with default data.
5281             for (i = 0; i < pixelCount; i++, p += 4)
5282                *p = (channel == 3 ? 255 : 0);
5283          } else {
5284             // Read the RLE data.
5285             count = 0;
5286             while (count < pixelCount) {
5287                len = stbi__get8(s);
5288                if (len == 128) {
5289                   // No-op.
5290                } else if (len < 128) {
5291                   // Copy next len+1 bytes literally.
5292                   len++;
5293                   count += len;
5294                   while (len) {
5295                      *p = stbi__get8(s);
5296                      p += 4;
5297                      len--;
5298                   }
5299                } else if (len > 128) {
5300                   stbi_uc   val;
5301                   // Next -len+1 bytes in the dest are replicated from next source byte.
5302                   // (Interpret len as a negative 8-bit int.)
5303                   len ^= 0x0FF;
5304                   len += 2;
5305                   val = stbi__get8(s);
5306                   count += len;
5307                   while (len) {
5308                      *p = val;
5309                      p += 4;
5310                      len--;
5311                   }
5312                }
5313             }
5314          }
5315       }
5316 
5317    } else {
      // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
      // where each channel consists of an 8-bit or 16-bit value for each pixel in the image.
5320 
5321       // Read the data by channel.
5322       for (channel = 0; channel < 4; channel++) {
5323          stbi_uc *p;
5324 
5325          p = out + channel;
5326          if (channel >= channelCount) {
5327             // Fill this channel with default data.
5328             stbi_uc val = channel == 3 ? 255 : 0;
5329             for (i = 0; i < pixelCount; i++, p += 4)
5330                *p = val;
5331          } else {
5332             // Read the data.
5333             if (bitdepth == 16) {
5334                for (i = 0; i < pixelCount; i++, p += 4)
5335                   *p = (stbi_uc) (stbi__get16be(s) >> 8);
5336             } else {
5337                for (i = 0; i < pixelCount; i++, p += 4)
5338                   *p = stbi__get8(s);
5339             }
5340          }
5341       }
5342    }
5343 
5344    if (req_comp && req_comp != 4) {
5345       out = stbi__convert_format(out, 4, req_comp, w, h);
5346       if (out == NULL) return out; // stbi__convert_format frees input on failure
5347    }
5348 
5349    if (comp) *comp = 4;
5350    *y = h;
5351    *x = w;
5352 
5353    return out;
5354 }
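/*
   Illustrative sketch, not part of stb_image: the PackBits-style RLE scheme
   described in the comments above, as a standalone buffer-to-buffer
   decoder. The loader writes each channel with a stride of 4; this example
   writes contiguous bytes and rejects packets that overrun the output,
   both of which are choices made here. ex_packbits_decode is a
   hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Returns the number of bytes written (dst_len on success), or 0 if the
// input is truncated or a packet would overrun the output.
static size_t ex_packbits_decode(const unsigned char *src, size_t src_len,
                                 unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;
   assert(src != NULL && dst != NULL);
   while (out < dst_len) {
      unsigned int n;
      size_t len;
      if (in >= src_len) return 0;
      n = src[in++];
      if (n == 128) continue;            // no-op byte
      if (n < 128) {                     // literal: copy the next n+1 bytes
         len = n + 1u;
         if (len > dst_len - out || len > src_len - in) return 0;
         memcpy(dst + out, src + in, len);
         in += len;
      } else {                           // run: repeat the next byte 257-n times
         len = 257u - n;
         if (len > dst_len - out || in >= src_len) return 0;
         memset(dst + out, src[in++], len);
      }
      out += len;
   }
   return out;
}
```

   The run length 257-n is the same value the loader computes with
   len ^= 0xFF; len += 2.
*/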
5355 #endif
5356 
5357 // *************************************************************************************************
5358 // Softimage PIC loader
5359 // by Tom Seddon
5360 //
5361 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5362 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5363 
5364 #ifndef STBI_NO_PIC
// returns 1 if the next 4 bytes of the stream match str, 0 otherwise
static int stbi__pic_is4(stbi__context *s,const char *str)
{
   int i;
   for (i=0; i<4; ++i)
      if (stbi__get8(s) != (stbi_uc)str[i])
         return 0;

   return 1;
}
5374 
static int stbi__pic_test_core(stbi__context *s)
{
   int i;

   // 4-byte magic number
   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
      return 0;

   // skip the next 84 bytes of the header
   for(i=0;i<84;++i)
      stbi__get8(s);

   if (!stbi__pic_is4(s,"PICT"))
      return 0;

   return 1;
}
5390 
typedef struct
{
   stbi_uc size,type,channel;
} stbi__pic_packet;
5395 
// reads one byte into dest[i] for each of the 4 channels selected by the
// channel mask (0x80 selects dest[0], 0x40 dest[1], 0x20 dest[2], 0x10 dest[3])
static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
{
   int mask=0x80, i;

   for (i=0; i<4; ++i, mask>>=1) {
      if (channel & mask) {
         if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
         dest[i]=stbi__get8(s);
      }
   }

   return dest;
}
5409 
static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
{
   int mask=0x80,i;

   for (i=0;i<4; ++i, mask>>=1)
      if (channel&mask)
         dest[i]=src[i];
}
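/*
   Illustrative sketch, not part of stb_image: how a PIC channel mask picks
   components, using the same bit layout as the two helpers above (0x80 for
   the first component down to 0x10 for the fourth). The ex_* names are
   hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Copy only the components selected by channel_mask from src to dst.
static void ex_copy_channels(int channel_mask, unsigned char dst[4],
                             const unsigned char src[4])
{
   int mask = 0x80, i;
   assert(dst != NULL && src != NULL);
   for (i = 0; i < 4; ++i, mask >>= 1)
      if (channel_mask & mask)
         dst[i] = src[i];
}

// Number of components a mask selects, i.e. bytes stored per pixel.
static int ex_channel_count(int channel_mask)
{
   int mask = 0x80, i, n = 0;
   for (i = 0; i < 4; ++i, mask >>= 1)
      if (channel_mask & mask)
         ++n;
   return n;
}
```

   A packet with mask 0xE0 carries three bytes per pixel and leaves the
   fourth component of the destination untouched.
*/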
5418 
static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5420 {
5421    int act_comp=0,num_packets=0,y,chained;
5422    stbi__pic_packet packets[10];
5423 
   // this will (should...) cater for even some bizarre stuff like having data
   // for the same channel in multiple packets.
5426    do {
5427       stbi__pic_packet *packet;
5428 
5429       if (num_packets==sizeof(packets)/sizeof(packets[0]))
5430          return stbi__errpuc("bad format","too many packets");
5431 
5432       packet = &packets[num_packets++];
5433 
5434       chained = stbi__get8(s);
5435       packet->size    = stbi__get8(s);
5436       packet->type    = stbi__get8(s);
5437       packet->channel = stbi__get8(s);
5438 
5439       act_comp |= packet->channel;
5440 
5441       if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
5442       if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
5443    } while (chained);
5444 
5445    *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5446 
5447    for(y=0; y<height; ++y) {
5448       int packet_idx;
5449 
5450       for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5451          stbi__pic_packet *packet = &packets[packet_idx];
5452          stbi_uc *dest = result+y*width*4;
5453 
5454          switch (packet->type) {
5455             default:
5456                return stbi__errpuc("bad format","packet has bad compression type");
5457 
5458             case 0: {//uncompressed
5459                int x;
5460 
5461                for(x=0;x<width;++x, dest+=4)
5462                   if (!stbi__readval(s,packet->channel,dest))
5463                      return 0;
5464                break;
5465             }
5466 
5467             case 1://Pure RLE
5468                {
5469                   int left=width, i;
5470 
5471                   while (left>0) {
5472                      stbi_uc count,value[4];
5473 
5474                      count=stbi__get8(s);
5475                      if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
5476 
5477                      if (count > left)
5478                         count = (stbi_uc) left;
5479 
5480                      if (!stbi__readval(s,packet->channel,value))  return 0;
5481 
5482                      for(i=0; i<count; ++i,dest+=4)
5483                         stbi__copyval(packet->channel,dest,value);
5484                      left -= count;
5485                   }
5486                }
5487                break;
5488 
5489             case 2: {//Mixed RLE
5490                int left=width;
5491                while (left>0) {
5492                   int count = stbi__get8(s), i;
5493                   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
5494 
5495                   if (count >= 128) { // Repeated
5496                      stbi_uc value[4];
5497 
5498                      if (count==128)
5499                         count = stbi__get16be(s);
5500                      else
5501                         count -= 127;
5502                      if (count > left)
5503                         return stbi__errpuc("bad file","scanline overrun");
5504 
5505                      if (!stbi__readval(s,packet->channel,value))
5506                         return 0;
5507 
5508                      for(i=0;i<count;++i, dest += 4)
5509                         stbi__copyval(packet->channel,dest,value);
5510                   } else { // Raw
5511                      ++count;
5512                      if (count>left) return stbi__errpuc("bad file","scanline overrun");
5513 
5514                      for(i=0;i<count;++i, dest+=4)
5515                         if (!stbi__readval(s,packet->channel,dest))
5516                            return 0;
5517                   }
5518                   left-=count;
5519                }
5520                break;
5521             }
5522          }
5523       }
5524    }
5525 
5526    return result;
5527 }
5528 
static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5530 {
5531    stbi_uc *result;
5532    int i, x,y;
5533 
5534    for (i=0; i<92; ++i)
5535       stbi__get8(s);
5536 
5537    x = stbi__get16be(s);
5538    y = stbi__get16be(s);
5539    if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   // guard against x == 0 before dividing, as stbi__pic_info does
   if (x != 0 && (1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5541 
5542    stbi__get32be(s); //skip `ratio'
5543    stbi__get16be(s); //skip `fields'
5544    stbi__get16be(s); //skip `pad'
5545 
5546    // intermediate buffer is RGBA
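   result = (stbi_uc *) stbi__malloc(x*y*4);
   if (!result) return stbi__errpuc("outofmem", "Out of memory");
   memset(result, 0xff, x*y*4);

   if (!stbi__pic_load_core(s,x,y,comp, result)) {
      STBI_FREE(result);
      return 0; // don't pass a null buffer (and a possibly unset *comp) on to stbi__convert_format
   }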
5554    *px = x;
5555    *py = y;
5556    if (req_comp == 0) req_comp = *comp;
5557    result=stbi__convert_format(result,4,req_comp,x,y);
5558 
5559    return result;
5560 }
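
// Example (illustrative sketch only, not part of stb_image): the size check in
// stbi__pic_load keeps width*height at or below 2^28 pixels so the RGBA buffer
// size (width*height*4) still fits in a signed 32-bit int. Dividing instead of
// multiplying avoids overflowing while testing for overflow. The helper name
// example_pic_dims_ok is hypothetical; unlike the loader, it also rejects
// zero or negative dimensions.
#if 0
```c
/* Returns 1 when a width x height image holds at most 2^28 pixels,
   0 otherwise (including zero or negative dimensions). */
static int example_pic_dims_ok(int width, int height)
{
   if (width <= 0 || height <= 0)
      return 0;                          /* also avoids dividing by zero */
   return (1 << 28) / width >= height;   /* same test as the loader, inverted */
}
```
#endif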
5561 
static int stbi__pic_test(stbi__context *s)
5563 {
5564    int r = stbi__pic_test_core(s);
5565    stbi__rewind(s);
5566    return r;
5567 }
5568 #endif
5569 
5570 // *************************************************************************************************
5571 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5572 
5573 #ifndef STBI_NO_GIF
5574 typedef struct
5575 {
5576    stbi__int16 prefix;
5577    stbi_uc first;
5578    stbi_uc suffix;
5579 } stbi__gif_lzw;
5580 
5581 typedef struct
5582 {
5583    int w,h;
5584    stbi_uc *out, *old_out;             // output buffer (always 4 components)
5585    int flags, bgindex, ratio, transparent, eflags, delay;
5586    stbi_uc  pal[256][4];
5587    stbi_uc lpal[256][4];
5588    stbi__gif_lzw codes[4096];
5589    stbi_uc *color_table;
5590    int parse, step;
5591    int lflags;
5592    int start_x, start_y;
5593    int max_x, max_y;
5594    int cur_x, cur_y;
5595    int line_size;
5596 } stbi__gif;
5597 
static int stbi__gif_test_raw(stbi__context *s)
5599 {
5600    int sz;
5601    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5602    sz = stbi__get8(s);
5603    if (sz != '9' && sz != '7') return 0;
5604    if (stbi__get8(s) != 'a') return 0;
5605    return 1;
5606 }
5607 
static int stbi__gif_test(stbi__context *s)
5609 {
5610    int r = stbi__gif_test_raw(s);
5611    stbi__rewind(s);
5612    return r;
5613 }
5614 
static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5616 {
5617    int i;
5618    for (i=0; i < num_entries; ++i) {
5619       pal[i][2] = stbi__get8(s);
5620       pal[i][1] = stbi__get8(s);
5621       pal[i][0] = stbi__get8(s);
5622       pal[i][3] = transp == i ? 0 : 255;
5623    }
5624 }
5625 
static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5627 {
5628    stbi_uc version;
5629    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5630       return stbi__err("not GIF", "Corrupt GIF");
5631 
5632    version = stbi__get8(s);
5633    if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
5634    if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");
5635 
5636    stbi__g_failure_reason = "";
5637    g->w = stbi__get16le(s);
5638    g->h = stbi__get16le(s);
5639    g->flags = stbi__get8(s);
5640    g->bgindex = stbi__get8(s);
5641    g->ratio = stbi__get8(s);
5642    g->transparent = -1;
5643 
5644    if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
5645 
5646    if (is_info) return 1;
5647 
5648    if (g->flags & 0x80)
5649       stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5650 
5651    return 1;
5652 }
5653 
static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5655 {
5656    stbi__gif g;
5657    if (!stbi__gif_header(s, &g, comp, 1)) {
5658       stbi__rewind( s );
5659       return 0;
5660    }
5661    if (x) *x = g.w;
5662    if (y) *y = g.h;
5663    return 1;
5664 }
5665 
static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5667 {
5668    stbi_uc *p, *c;
5669 
5670    // recurse to decode the prefixes, since the linked-list is backwards,
5671    // and working backwards through an interleaved image would be nasty
5672    if (g->codes[code].prefix >= 0)
5673       stbi__out_gif_code(g, g->codes[code].prefix);
5674 
5675    if (g->cur_y >= g->max_y) return;
5676 
5677    p = &g->out[g->cur_x + g->cur_y];
5678    c = &g->color_table[g->codes[code].suffix * 4];
5679 
5680    if (c[3] >= 128) {
5681       p[0] = c[2];
5682       p[1] = c[1];
5683       p[2] = c[0];
5684       p[3] = c[3];
5685    }
5686    g->cur_x += 4;
5687 
5688    if (g->cur_x >= g->max_x) {
5689       g->cur_x = g->start_x;
5690       g->cur_y += g->step;
5691 
5692       while (g->cur_y >= g->max_y && g->parse > 0) {
5693          g->step = (1 << g->parse) * g->line_size;
5694          g->cur_y = g->start_y + (g->step >> 1);
5695          --g->parse;
5696       }
5697    }
5698 }
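
// Example (illustrative sketch only, not part of stb_image): stbi__out_gif_code
// handles interlaced GIFs by stepping cur_y in strides of 8, 8, 4 and 2 rows,
// using `parse` to count the remaining passes. The hypothetical helper below
// reproduces the same pass logic in plain row units (no byte offsets), which
// may make the resulting row order easier to see.
#if 0
```c
/* Writes the order in which an interlaced GIF stores its rows into `order`
   (which must have room for `height` ints) and returns the row count. */
static int example_gif_interlace_order(int height, int *order)
{
   int n = 0, row, step = 8, parse = 3;

   /* pass 1: every 8th row, starting at row 0 */
   for (row = 0; row < height; row += step)
      order[n++] = row;

   /* passes 2..4: stride 8 from row 4, stride 4 from row 2, stride 2 from row 1 */
   while (parse > 0) {
      step = 1 << parse;
      for (row = step >> 1; row < height; row += step)
         order[n++] = row;
      --parse;
   }
   return n;
}
```
#endif
// For a 10-row image this yields rows 0,8 then 4 then 2,6 then 1,3,5,7,9.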
5699 
static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5701 {
5702    stbi_uc lzw_cs;
5703    stbi__int32 len, init_code;
5704    stbi__uint32 first;
5705    stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5706    stbi__gif_lzw *p;
5707 
5708    lzw_cs = stbi__get8(s);
5709    if (lzw_cs > 12) return NULL;
5710    clear = 1 << lzw_cs;
5711    first = 1;
5712    codesize = lzw_cs + 1;
5713    codemask = (1 << codesize) - 1;
5714    bits = 0;
5715    valid_bits = 0;
5716    for (init_code = 0; init_code < clear; init_code++) {
5717       g->codes[init_code].prefix = -1;
5718       g->codes[init_code].first = (stbi_uc) init_code;
5719       g->codes[init_code].suffix = (stbi_uc) init_code;
5720    }
5721 
5722    // support no starting clear code
5723    avail = clear+2;
5724    oldcode = -1;
5725 
5726    len = 0;
5727    for(;;) {
5728       if (valid_bits < codesize) {
5729          if (len == 0) {
5730             len = stbi__get8(s); // start new block
5731             if (len == 0)
5732                return g->out;
5733          }
5734          --len;
5735          bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5736          valid_bits += 8;
5737       } else {
5738          stbi__int32 code = bits & codemask;
5739          bits >>= codesize;
5740          valid_bits -= codesize;
5741          // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5742          if (code == clear) {  // clear code
5743             codesize = lzw_cs + 1;
5744             codemask = (1 << codesize) - 1;
5745             avail = clear + 2;
5746             oldcode = -1;
5747             first = 0;
5748          } else if (code == clear + 1) { // end of stream code
5749             stbi__skip(s, len);
5750             while ((len = stbi__get8(s)) > 0)
5751                stbi__skip(s,len);
5752             return g->out;
5753          } else if (code <= avail) {
5754             if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5755 
5756             if (oldcode >= 0) {
5757                p = &g->codes[avail++];
5758                if (avail > 4096)        return stbi__errpuc("too many codes", "Corrupt GIF");
5759                p->prefix = (stbi__int16) oldcode;
5760                p->first = g->codes[oldcode].first;
5761                p->suffix = (code == avail) ? p->first : g->codes[code].first;
5762             } else if (code == avail)
5763                return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5764 
5765             stbi__out_gif_code(g, (stbi__uint16) code);
5766 
5767             if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5768                codesize++;
5769                codemask = (1 << codesize) - 1;
5770             }
5771 
5772             oldcode = code;
5773          } else {
5774             return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5775          }
5776       }
5777    }
5778 }
5779 
static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5781 {
5782    int x, y;
5783    stbi_uc *c = g->pal[g->bgindex];
5784    for (y = y0; y < y1; y += 4 * g->w) {
5785       for (x = x0; x < x1; x += 4) {
5786          stbi_uc *p  = &g->out[y + x];
5787          p[0] = c[2];
5788          p[1] = c[1];
5789          p[2] = c[0];
5790          p[3] = 0;
5791       }
5792    }
5793 }
5794 
5795 // this function is designed to support animated gifs, although stb_image doesn't support it
static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5797 {
5798    int i;
5799    stbi_uc *prev_out = 0;
5800 
5801    if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5802       return 0; // stbi__g_failure_reason set by stbi__gif_header
5803 
5804    prev_out = g->out;
5805    g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5806    if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5807 
5808    switch ((g->eflags & 0x1C) >> 2) {
5809       case 0: // unspecified (also always used on 1st frame)
5810          stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5811          break;
5812       case 1: // do not dispose
5813          if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5814          g->old_out = prev_out;
5815          break;
5816       case 2: // dispose to background
5817          if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5818          stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5819          break;
5820       case 3: // dispose to previous
5821          if (g->old_out) {
5822             for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5823                memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5824          }
5825          break;
5826    }
5827 
5828    for (;;) {
5829       switch (stbi__get8(s)) {
5830          case 0x2C: /* Image Descriptor */
5831          {
5832             int prev_trans = -1;
5833             stbi__int32 x, y, w, h;
5834             stbi_uc *o;
5835 
5836             x = stbi__get16le(s);
5837             y = stbi__get16le(s);
5838             w = stbi__get16le(s);
5839             h = stbi__get16le(s);
5840             if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5841                return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5842 
5843             g->line_size = g->w * 4;
5844             g->start_x = x * 4;
5845             g->start_y = y * g->line_size;
5846             g->max_x   = g->start_x + w * 4;
5847             g->max_y   = g->start_y + h * g->line_size;
5848             g->cur_x   = g->start_x;
5849             g->cur_y   = g->start_y;
5850 
5851             g->lflags = stbi__get8(s);
5852 
5853             if (g->lflags & 0x40) {
5854                g->step = 8 * g->line_size; // first interlaced spacing
5855                g->parse = 3;
5856             } else {
5857                g->step = g->line_size;
5858                g->parse = 0;
5859             }
5860 
5861             if (g->lflags & 0x80) {
5862                stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5863                g->color_table = (stbi_uc *) g->lpal;
5864             } else if (g->flags & 0x80) {
5865                if (g->transparent >= 0 && (g->eflags & 0x01)) {
5866                   prev_trans = g->pal[g->transparent][3];
5867                   g->pal[g->transparent][3] = 0;
5868                }
5869                g->color_table = (stbi_uc *) g->pal;
5870             } else
5871                return stbi__errpuc("missing color table", "Corrupt GIF");
5872 
5873             o = stbi__process_gif_raster(s, g);
5874             if (o == NULL) return NULL;
5875 
5876             if (prev_trans != -1)
5877                g->pal[g->transparent][3] = (stbi_uc) prev_trans;
5878 
5879             return o;
5880          }
5881 
5882          case 0x21: // Comment Extension.
5883          {
5884             int len;
5885             if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
5886                len = stbi__get8(s);
5887                if (len == 4) {
5888                   g->eflags = stbi__get8(s);
5889                   g->delay = stbi__get16le(s);
5890                   g->transparent = stbi__get8(s);
5891                } else {
5892                   stbi__skip(s, len);
5893                   break;
5894                }
5895             }
5896             while ((len = stbi__get8(s)) != 0)
5897                stbi__skip(s, len);
5898             break;
5899          }
5900 
5901          case 0x3B: // gif stream termination code
5902             return (stbi_uc *) s; // using '1' causes warning on some compilers
5903 
5904          default:
5905             return stbi__errpuc("unknown code", "Corrupt GIF");
5906       }
5907    }
5908 
5909    STBI_NOTUSED(req_comp);
5910 }
5911 
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5913 {
5914    stbi_uc *u = 0;
5915    stbi__gif g;
5916    memset(&g, 0, sizeof(g));
5917 
5918    u = stbi__gif_load_next(s, &g, comp, req_comp);
5919    if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
5920    if (u) {
5921       *x = g.w;
5922       *y = g.h;
5923       if (req_comp && req_comp != 4)
5924          u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
5925    }
5926    else if (g.out)
5927       STBI_FREE(g.out);
5928 
5929    return u;
5930 }
5931 
static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
5933 {
5934    return stbi__gif_info_raw(s,x,y,comp);
5935 }
5936 #endif
5937 
5938 // *************************************************************************************************
5939 // Radiance RGBE HDR loader
5940 // originally by Nicolas Schulz
5941 #ifndef STBI_NO_HDR
static int stbi__hdr_test_core(stbi__context *s)
5943 {
5944    const char *signature = "#?RADIANCE\n";
5945    int i;
5946    for (i=0; signature[i]; ++i)
5947       if (stbi__get8(s) != signature[i])
5948          return 0;
5949    return 1;
5950 }
5951 
static int stbi__hdr_test(stbi__context* s)
5953 {
5954    int r = stbi__hdr_test_core(s);
5955    stbi__rewind(s);
5956    return r;
5957 }
5958 
5959 #define STBI__HDR_BUFLEN  1024
static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
5961 {
5962    int len=0;
5963    char c = '\0';
5964 
5965    c = (char) stbi__get8(z);
5966 
5967    while (!stbi__at_eof(z) && c != '\n') {
5968       buffer[len++] = c;
5969       if (len == STBI__HDR_BUFLEN-1) {
5970          // flush to end of line
5971          while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
5972             ;
5973          break;
5974       }
5975       c = (char) stbi__get8(z);
5976    }
5977 
5978    buffer[len] = 0;
5979    return buffer;
5980 }
5981 
static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
5983 {
5984    if ( input[3] != 0 ) {
5985       float f1;
5986       // Exponent
5987       f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
5988       if (req_comp <= 2)
5989          output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
5990       else {
5991          output[0] = input[0] * f1;
5992          output[1] = input[1] * f1;
5993          output[2] = input[2] * f1;
5994       }
5995       if (req_comp == 2) output[1] = 1;
5996       if (req_comp == 4) output[3] = 1;
5997    } else {
5998       switch (req_comp) {
5999          case 4: output[3] = 1; /* fallthrough */
6000          case 3: output[0] = output[1] = output[2] = 0;
6001                  break;
6002          case 2: output[1] = 1; /* fallthrough */
6003          case 1: output[0] = 0;
6004                  break;
6005       }
6006    }
6007 }
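
// Example (illustrative sketch only, not part of stb_image): stbi__hdr_convert
// turns a 4-byte RGBE pixel into floats by scaling each mantissa byte with
// 2^(e - 136), i.e. 2^(e - 128) / 256, and treats a zero exponent byte as black.
// The hypothetical helper below shows only the 3-component case, without the
// req_comp handling of the real function.
#if 0
```c
#include <math.h>

/* Decode one Radiance RGBE pixel into three linear float components. */
static void example_rgbe_to_float(const unsigned char rgbe[4], float rgb[3])
{
   if (rgbe[3] == 0) {
      /* zero exponent byte: the pixel is black */
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   } else {
      /* shared exponent, biased by 128, plus 8 bits of mantissa scaling */
      float scale = (float) ldexp(1.0, (int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   }
}
```
#endif
// With exponent byte 129 the scale is 2^-7, so mantissas 128 and 64 decode to 1.0 and 0.5.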
6008 
static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6010 {
6011    char buffer[STBI__HDR_BUFLEN];
6012    char *token;
6013    int valid = 0;
6014    int width, height;
6015    stbi_uc *scanline;
6016    float *hdr_data;
6017    int len;
6018    unsigned char count, value;
6019    int i, j, k, c1,c2, z;
6020 
6021 
6022    // Check identifier
6023    if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
6024       return stbi__errpf("not HDR", "Corrupt HDR image");
6025 
6026    // Parse header
6027    for(;;) {
6028       token = stbi__hdr_gettoken(s,buffer);
6029       if (token[0] == 0) break;
6030       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6031    }
6032 
6033    if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");
6034 
6035    // Parse width and height
6036    // can't use sscanf() if we're not using stdio!
6037    token = stbi__hdr_gettoken(s,buffer);
6038    if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6039    token += 3;
6040    height = (int) strtol(token, &token, 10);
6041    while (*token == ' ') ++token;
6042    if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6043    token += 3;
6044    width = (int) strtol(token, NULL, 10);
6045 
6046    *x = width;
6047    *y = height;
6048 
6049    if (comp) *comp = 3;
6050    if (req_comp == 0) req_comp = 3;
6051 
6052    // Read data
   hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
   if (!hdr_data) return stbi__errpf("outofmem", "Out of memory");
6054 
6055    // Load image data
   // image data is stored as some number of scanlines
6057    if ( width < 8 || width >= 32768) {
6058       // Read flat data
6059       for (j=0; j < height; ++j) {
6060          for (i=0; i < width; ++i) {
6061             stbi_uc rgbe[4];
6062            main_decode_loop:
6063             stbi__getn(s, rgbe, 4);
6064             stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6065          }
6066       }
6067    } else {
6068       // Read RLE-encoded data
6069       scanline = NULL;
6070 
6071       for (j = 0; j < height; ++j) {
6072          c1 = stbi__get8(s);
6073          c2 = stbi__get8(s);
6074          len = stbi__get8(s);
6075          if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6076             // not run-length encoded, so we have to actually use THIS data as a decoded
6077             // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6078             stbi_uc rgbe[4];
6079             rgbe[0] = (stbi_uc) c1;
6080             rgbe[1] = (stbi_uc) c2;
6081             rgbe[2] = (stbi_uc) len;
6082             rgbe[3] = (stbi_uc) stbi__get8(s);
6083             stbi__hdr_convert(hdr_data, rgbe, req_comp);
6084             i = 1;
6085             j = 0;
6086             STBI_FREE(scanline);
6087             goto main_decode_loop; // yes, this makes no sense
6088          }
6089          len <<= 8;
6090          len |= stbi__get8(s);
6091          if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
         if (scanline == NULL) {
            scanline = (stbi_uc *) stbi__malloc(width * 4);
            if (!scanline) { STBI_FREE(hdr_data); return stbi__errpf("outofmem", "Out of memory"); }
         }
6093 
6094          for (k = 0; k < 4; ++k) {
6095             i = 0;
6096             while (i < width) {
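               count = stbi__get8(s);
               if (count > 128) {
                  // Run
                  value = stbi__get8(s);
                  count -= 128;
                  // a run must not extend past the end of the scanline
                  if (count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = value;
               } else {
                  // Dump
                  // a zero count would never advance i; an oversized one would overrun the scanline
                  if (count == 0 || count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = stbi__get8(s);
               }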
6109             }
6110          }
6111          for (i=0; i < width; ++i)
6112             stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6113       }
6114       STBI_FREE(scanline);
6115    }
6116 
6117    return hdr_data;
6118 }
6119 
static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6121 {
6122    char buffer[STBI__HDR_BUFLEN];
6123    char *token;
6124    int valid = 0;
6125 
6126    if (stbi__hdr_test(s) == 0) {
6127        stbi__rewind( s );
6128        return 0;
6129    }
6130 
6131    for(;;) {
6132       token = stbi__hdr_gettoken(s,buffer);
6133       if (token[0] == 0) break;
6134       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6135    }
6136 
6137    if (!valid) {
6138        stbi__rewind( s );
6139        return 0;
6140    }
6141    token = stbi__hdr_gettoken(s,buffer);
6142    if (strncmp(token, "-Y ", 3)) {
6143        stbi__rewind( s );
6144        return 0;
6145    }
6146    token += 3;
6147    *y = (int) strtol(token, &token, 10);
6148    while (*token == ' ') ++token;
6149    if (strncmp(token, "+X ", 3)) {
6150        stbi__rewind( s );
6151        return 0;
6152    }
6153    token += 3;
6154    *x = (int) strtol(token, NULL, 10);
6155    *comp = 3;
6156    return 1;
6157 }
6158 #endif // STBI_NO_HDR
6159 
6160 #ifndef STBI_NO_BMP
static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6162 {
6163    void *p;
6164    stbi__bmp_data info;
6165 
6166    info.all_a = 255;
6167    p = stbi__bmp_parse_header(s, &info);
6168    stbi__rewind( s );
6169    if (p == NULL)
6170       return 0;
6171    *x = s->img_x;
6172    *y = s->img_y;
6173    *comp = info.ma ? 4 : 3;
6174    return 1;
6175 }
6176 #endif
6177 
6178 #ifndef STBI_NO_PSD
static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6180 {
6181    int channelCount;
6182    if (stbi__get32be(s) != 0x38425053) {
6183        stbi__rewind( s );
6184        return 0;
6185    }
6186    if (stbi__get16be(s) != 1) {
6187        stbi__rewind( s );
6188        return 0;
6189    }
6190    stbi__skip(s, 6);
6191    channelCount = stbi__get16be(s);
6192    if (channelCount < 0 || channelCount > 16) {
6193        stbi__rewind( s );
6194        return 0;
6195    }
6196    *y = stbi__get32be(s);
6197    *x = stbi__get32be(s);
6198    if (stbi__get16be(s) != 8) {
6199        stbi__rewind( s );
6200        return 0;
6201    }
6202    if (stbi__get16be(s) != 3) {
6203        stbi__rewind( s );
6204        return 0;
6205    }
6206    *comp = 4;
6207    return 1;
6208 }
6209 #endif
6210 
6211 #ifndef STBI_NO_PIC
static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6213 {
6214    int act_comp=0,num_packets=0,chained;
6215    stbi__pic_packet packets[10];
6216 
6217    if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
6218       stbi__rewind(s);
6219       return 0;
6220    }
6221 
6222    stbi__skip(s, 88);
6223 
6224    *x = stbi__get16be(s);
6225    *y = stbi__get16be(s);
6226    if (stbi__at_eof(s)) {
6227       stbi__rewind( s);
6228       return 0;
6229    }
6230    if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6231       stbi__rewind( s );
6232       return 0;
6233    }
6234 
6235    stbi__skip(s, 8);
6236 
6237    do {
6238       stbi__pic_packet *packet;
6239 
      if (num_packets==sizeof(packets)/sizeof(packets[0])) {
         stbi__rewind( s ); // rewind like every other failure path so later format probes start at offset 0
         return 0;
      }
6242 
6243       packet = &packets[num_packets++];
6244       chained = stbi__get8(s);
6245       packet->size    = stbi__get8(s);
6246       packet->type    = stbi__get8(s);
6247       packet->channel = stbi__get8(s);
6248       act_comp |= packet->channel;
6249 
6250       if (stbi__at_eof(s)) {
6251           stbi__rewind( s );
6252           return 0;
6253       }
6254       if (packet->size != 8) {
6255           stbi__rewind( s );
6256           return 0;
6257       }
6258    } while (chained);
6259 
6260    *comp = (act_comp & 0x10 ? 4 : 3);
6261 
6262    return 1;
6263 }
6264 #endif
6265 
6266 // *************************************************************************************************
6267 // Portable Gray Map and Portable Pixel Map loader
6268 // by Ken Miller
6269 //
6270 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6271 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6272 //
6273 // Known limitations:
//    Ignores comments in the header section (they are skipped, see stbi__pnm_skip_whitespace)
6275 //    Does not support ASCII image data (formats P2 and P3)
6276 //    Does not support 16-bit-per-channel
6277 
6278 #ifndef STBI_NO_PNM
6279 
static int      stbi__pnm_test(stbi__context *s)
6281 {
6282    char p, t;
6283    p = (char) stbi__get8(s);
6284    t = (char) stbi__get8(s);
6285    if (p != 'P' || (t != '5' && t != '6')) {
6286        stbi__rewind( s );
6287        return 0;
6288    }
6289    return 1;
6290 }
6291 
static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6293 {
6294    stbi_uc *out;
6295    if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6296       return 0;
6297    *x = s->img_x;
6298    *y = s->img_y;
6299    *comp = s->img_n;
6300 
6301    out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6302    if (!out) return stbi__errpuc("outofmem", "Out of memory");
6303    stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6304 
6305    if (req_comp && req_comp != s->img_n) {
6306       out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6307       if (out == NULL) return out; // stbi__convert_format frees input on failure
6308    }
6309    return out;
6310 }
6311 
static int      stbi__pnm_isspace(char c)
6313 {
6314    return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6315 }
6316 
static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6318 {
6319    for (;;) {
6320       while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6321          *c = (char) stbi__get8(s);
6322 
6323       if (stbi__at_eof(s) || *c != '#')
6324          break;
6325 
6326       while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
6327          *c = (char) stbi__get8(s);
6328    }
6329 }
6330 
static int      stbi__pnm_isdigit(char c)
6332 {
6333    return c >= '0' && c <= '9';
6334 }
6335 
static int      stbi__pnm_getinteger(stbi__context *s, char *c)
6337 {
6338    int value = 0;
6339 
6340    while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6341       value = value*10 + (*c - '0');
6342       *c = (char) stbi__get8(s);
6343    }
6344 
6345    return value;
6346 }
6347 
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6349 {
6350    int maxv;
6351    char c, p, t;
6352 
6353    stbi__rewind( s );
6354 
6355    // Get identifier
6356    p = (char) stbi__get8(s);
6357    t = (char) stbi__get8(s);
6358    if (p != 'P' || (t != '5' && t != '6')) {
6359        stbi__rewind( s );
6360        return 0;
6361    }
6362 
6363    *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm
6364 
6365    c = (char) stbi__get8(s);
6366    stbi__pnm_skip_whitespace(s, &c);
6367 
6368    *x = stbi__pnm_getinteger(s, &c); // read width
6369    stbi__pnm_skip_whitespace(s, &c);
6370 
6371    *y = stbi__pnm_getinteger(s, &c); // read height
6372    stbi__pnm_skip_whitespace(s, &c);
6373 
6374    maxv = stbi__pnm_getinteger(s, &c);  // read max value
6375 
6376    if (maxv > 255)
6377       return stbi__err("max value > 255", "PPM image not 8-bit");
6378    else
6379       return 1;
6380 }
6381 #endif
6382 
static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_info(s, x, y, comp)) return 1;
   #endif

   #ifndef STBI_NO_PNG
   if (stbi__png_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_GIF
   if (stbi__gif_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_BMP
   if (stbi__bmp_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PSD
   if (stbi__psd_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PIC
   if (stbi__pic_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PNM
   if (stbi__pnm_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_info(s, x, y, comp))  return 1;
   #endif

   // test tga last because it's a crappy test!
   #ifndef STBI_NO_TGA
   if (stbi__tga_info(s, x, y, comp))
       return 1;
   #endif
   return stbi__err("unknown image type", "Image not of any known type, or corrupt");
}

#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
{
    FILE *f = stbi__fopen(filename, "rb");
    int result;
    if (!f) return stbi__err("can't fopen", "Unable to open file");
    result = stbi_info_from_file(f, x, y, comp);
    fclose(f);
    return result;
}

STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
{
   int r;
   stbi__context s;
   long pos = ftell(f);
   stbi__start_file(&s, f);
   r = stbi__info_main(&s,x,y,comp);
   fseek(f,pos,SEEK_SET);
   return r;
}
#endif // !STBI_NO_STDIO

STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);
}

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);
}

#pragma GCC diagnostic pop

#endif // STB_IMAGE_IMPLEMENTATION

/*
   revision history:
      2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
      2.09  (2016-01-16) allow comments in PNM files
                         16-bit-per-pixel TGA (not bit-per-component)
                         info() for TGA could break due to .hdr handling
                         info() for BMP now shares code instead of using a sloppy parse
                         can use STBI_REALLOC_SIZED if allocator doesn't support realloc
                         code cleanup
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) fix compiler warnings
                         partial animated GIF support
                         limited 16-bpc PSD support
                         #ifdef unused functions
                         bug with < 92 byte PIC,PNM,HDR,TGA
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) extra corruption checking (mmozeiko)
                         stbi_set_flip_vertically_on_load (nguillemot)
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                         progressive JPEG (stb)
                         PGM/PPM support (Ken Miller)
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         GIF bugfix -- seemingly never worked
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                         optimize PNG (ryg)
                         fix bug in interlaced PNG with user-specified channel count (stb)
      1.46  (2014-08-26)
              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
      1.45  (2014-08-16)
              fix MSVC-ARM internal compiler error by wrapping malloc
      1.44  (2014-08-07)
              various warning fixes from Ronny Chevalier
      1.43  (2014-07-15)
              fix MSVC-only compiler problem in code changed in 1.42
      1.42  (2014-07-09)
              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
              fixes to stbi__cleanup_jpeg path
              added STBI_ASSERT to avoid requiring assert.h
      1.41  (2014-06-25)
              fix search&replace from 1.36 that messed up comments/error messages
      1.40  (2014-06-22)
              fix gcc struct-initialization warning
      1.39  (2014-06-15)
              fix to TGA optimization when req_comp != number of components in TGA;
              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
              add support for BMP version 5 (more ignored fields)
      1.38  (2014-06-06)
              suppress MSVC warnings on integer casts truncating values
              fix accidental rename of 'skip' field of I/O
      1.37  (2014-06-04)
              remove duplicate typedef
      1.36  (2014-06-03)
              convert to header file single-file library
              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
      1.35  (2014-05-27)
              various warnings
              fix broken STBI_SIMD path
              fix bug where stbi_load_from_file no longer left file pointer in correct place
              fix broken non-easy path for 32-bit BMP (possibly never used)
              TGA optimization by Arseny Kapoulkine
      1.34  (unknown)
              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
      1.33  (2011-07-14)
              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
      1.32  (2011-07-13)
              support for "info" function for all supported filetypes (SpartanJ)
      1.31  (2011-06-20)
              a few more leak fixes, bug in PNG handling (SpartanJ)
      1.30  (2011-06-11)
              added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
              removed deprecated format-specific test/load functions
              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
              fix inefficiency in decoding 32-bit BMP (David Woo)
      1.29  (2010-08-16)
              various warning fixes from Aurelien Pocheville
      1.28  (2010-08-01)
              fix bug in GIF palette transparency (SpartanJ)
      1.27  (2010-08-01)
              cast-to-stbi_uc to fix warnings
      1.26  (2010-07-24)
              fix bug in file buffering for PNG reported by SpartanJ
      1.25  (2010-07-17)
              refix trans_data warning (Won Chun)
      1.24  (2010-07-12)
              perf improvements reading from files on platforms with lock-heavy fgetc()
              minor perf improvements for jpeg
              deprecated type-specific functions so we'll get feedback if they're needed
              attempt to fix trans_data warning (Won Chun)
      1.23    fixed bug in iPhone support
      1.22  (2010-07-10)
              removed image *writing* support
              stbi_info support from Jetro Lauha
              GIF support from Jean-Marc Lienher
              iPhone PNG-extensions from James Brown
              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
      1.20    added support for Softimage PIC, by Tom Seddon
      1.19    bug in interlaced PNG corruption check (found by ryg)
      1.18  (2008-08-02)
              fix a threading bug (local mutable static)
      1.17    support interlaced PNG
      1.16    major bugfix - stbi__convert_format converted one too many pixels
      1.15    initialize some fields for thread safety
      1.14    fix threadsafe conversion bug
              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.13    threadsafe
      1.12    const qualifiers in the API
      1.11    Support installable IDCT, colorspace conversion routines
      1.10    Fixes for 64-bit (don't use "unsigned long")
              optimized upsampling by Fabian "ryg" Giesen
      1.09    Fix format-conversion for PSD code (bad global variables!)
      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07    attempt to fix C++ warning/errors again
      1.06    attempt to fix C++ warning/errors again
      1.05    fix TGA loading to return correct *comp and use good luminance calc
      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02    support for (subset of) HDR files, float interface for preferred access to them
      1.01    fix bug: possible bug in handling right-side up bmps... not sure
              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00    interface to zlib that skips zlib header
      0.99    correct handling of alpha in palette
      0.98    TGA loader by lonesock; dynamically add loaders (untested)
      0.97    jpeg errors on too large a file; also catch another malloc failure
      0.96    fix detection of invalid v value - particleman@mollyrocket forum
      0.95    during header scan, seek to markers in case of padding
      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93    handle jpegtran output; verbose errors
      0.92    read 4,8,16,24,32-bit BMP files of several formats
      0.91    output 24-bit Windows 3.0 BMP files
      0.90    fix a few more warnings; bump version number to approach 1.0
      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60    fix compiling as c++
      0.59    fix warnings: merge Dave Moore's -Wall fixes
      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56    fix bug: zlib uncompressed mode len vs. nlen
      0.55    fix bug: restart_interval not initialized to 0
      0.54    allow NULL for 'int *comp'
      0.53    fix bug in png 3->4; speedup png decoding
      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
              on 'test' only check type, not whether we support this variant
      0.50  (2006-11-19)
              first released version
*/