1 /* stb_image - v2.12 - public domain image loader - http://nothings.org/stb_image.h
2                                      no warranty implied; use at your own risk
3 
4    Do this:
5       #define STB_IMAGE_IMPLEMENTATION
6    before you include this file in *one* C or C++ file to create the implementation.
7 
8    // i.e. it should look like this:
9    #include ...
10    #include ...
11    #include ...
12    #define STB_IMAGE_IMPLEMENTATION
13    #include "stb_image.h"
14 
15    You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16    And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc, realloc, and free.
17 
18 
19    QUICK NOTES:
20       Primarily of interest to game developers and other people who can
21           avoid problematic images and only need the trivial interface
22 
23       JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24       PNG 1/2/4/8-bit-per-channel (16 bpc is read, but only as 8-bit)
25 
26       TGA (not sure what subset, if a subset)
27       BMP non-1bpp, non-RLE
28       PSD (composited view only, no extra channels, 8/16 bit-per-channel)
29 
30       GIF (*comp always reports as 4-channel)
31       HDR (radiance rgbE format)
32       PIC (Softimage PIC)
33       PNM (PPM and PGM binary only)
34 
35       Animated GIF still needs a proper API, but here's one way to do it:
36           http://gist.github.com/urraka/685d9a6340b26b830d49
37 
38       - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39       - decode from arbitrary I/O callbacks
40       - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
41 
42    Full documentation under "DOCUMENTATION" below.
43 
44 
45    Revision 2.00 release notes:
46 
47       - Progressive JPEG is now supported.
48 
49       - PPM and PGM binary formats are now supported, thanks to Ken Miller.
50 
51       - x86 platforms now make use of SSE2 SIMD instructions for
52         JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53         This work was done by Fabian "ryg" Giesen. SSE2 is used by
54         default, but NEON must be enabled explicitly; see docs.
55 
56         With other JPEG optimizations included in this version, we see
57         2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
58         on a JPEG on an ARM machine, relative to previous versions of this
59         library. The same results will not obtain for all JPGs and for all
60         x86/ARM machines. (Note that progressive JPEGs are significantly
61         slower to decode than regular JPEGs.) This doesn't mean that this
62         is the fastest JPEG decoder in the land; rather, it brings it
63         closer to parity with standard libraries. If you want the fastest
64         decode, look elsewhere. (See "Philosophy" section of docs below.)
65 
66         See final bullet items below for more info on SIMD.
67 
68       - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
69         the memory allocator. Unlike other stb libraries, these macros don't
70         support a context parameter, so if you need to pass a context in to
71         the allocator, you'll have to store it in a global or a thread-local
72         variable.
73 
74       - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
75         STBI_NO_LINEAR.
76             STBI_NO_HDR:     suppress implementation of .hdr reader format
77             STBI_NO_LINEAR:  suppress high-dynamic-range light-linear float API
78 
79       - You can suppress implementation of any of the decoders to reduce
80         your code footprint by #defining one or more of the following
81         symbols before creating the implementation.
82 
83             STBI_NO_JPEG
84             STBI_NO_PNG
85             STBI_NO_BMP
86             STBI_NO_PSD
87             STBI_NO_TGA
88             STBI_NO_GIF
89             STBI_NO_HDR
90             STBI_NO_PIC
91             STBI_NO_PNM   (.ppm and .pgm)
92 
93       - You can request *only* certain decoders and suppress all other ones
94         (this will be more forward-compatible, as addition of new decoders
95         doesn't require you to disable them explicitly):
96 
97             STBI_ONLY_JPEG
98             STBI_ONLY_PNG
99             STBI_ONLY_BMP
100             STBI_ONLY_PSD
101             STBI_ONLY_TGA
102             STBI_ONLY_GIF
103             STBI_ONLY_HDR
104             STBI_ONLY_PIC
105             STBI_ONLY_PNM   (.ppm and .pgm)
106 
107         Note that you can define multiples of these, and you will get all
108         of them ("only x" and "only y" is interpreted to mean "only x&y").
109 
110       - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111         want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
112 
113       - Compilation of all SIMD code can be suppressed with
114             #define STBI_NO_SIMD
115         It should not be necessary to disable SIMD unless you have issues
116         compiling (e.g. using an x86 compiler which doesn't support SSE
117         intrinsics or that doesn't support the method used to detect
118         SSE2 support at run-time), and even those can be reported as
119         bugs so I can refine the built-in compile-time checking to be
120         smarter.
121 
122       - The old STBI_SIMD system which allowed installing a user-defined
123         IDCT etc. has been removed. If you need this, don't upgrade. My
124         assumption is that almost nobody was doing this, and those who
125         were will find the built-in SIMD more satisfactory anyway.
126 
127       - RGB values computed for JPEG images are slightly different from
128         previous versions of stb_image. (This is due to using less
129         integer precision in SIMD.) The C code has been adjusted so
130         that the same RGB values will be computed regardless of whether
131         SIMD support is available, so your app should always produce
132         consistent results. But these results are slightly different from
133         previous versions. (Specifically, about 3% of available YCbCr values
134         will compute different RGB results from pre-1.49 versions by +-1;
135         most of the deviating values are one smaller in the G channel.)
136 
137       - If you must produce consistent results with previous versions of
138         stb_image, #define STBI_JPEG_OLD and you will get the same results
139         you used to; however, you will not get the SIMD speedups for
140         the YCbCr-to-RGB conversion step (although you should still see
141         significant JPEG speedup from the other changes).
142 
143         Please note that STBI_JPEG_OLD is a temporary feature; it will be
144         removed in future versions of the library. It is only intended for
145         near-term back-compatibility use.
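
      As a sketch of how the options above combine (my_alloc, my_realloc, and
      my_free are hypothetical functions you would supply; they are not part
      of stb_image), a build that keeps only the PNG and JPEG decoders and
      routes allocation through custom functions might look like this. Note
      that the allocator macros must be defined all together or not at all:

          #define STBI_ONLY_PNG
          #define STBI_ONLY_JPEG
          #define STBI_MALLOC(sz)         my_alloc(sz)
          #define STBI_REALLOC(p,newsz)   my_realloc(p,newsz)
          #define STBI_FREE(p)            my_free(p)
          #define STB_IMAGE_IMPLEMENTATION
          #include "stb_image.h"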
146 
147 
148    Latest revision history:
149       2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
150       2.11  (2016-04-02) 16-bit PNGs; enable SSE2 in non-gcc x64;
151                          RGB-format JPEG; remove white matting in PSD;
152                          allocate large structures on the stack;
153                          correct channel count for PNG & BMP
154       2.10  (2016-01-22) avoid warning introduced in 2.09
155       2.09  (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
156       2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
157       2.07  (2015-09-13) partial animated GIF support
158                          limited 16-bit PSD support
159                          minor bugs, code cleanup, and compiler warnings
160       2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
161       2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
162       2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
163       2.03  (2015-04-12) additional corruption checking
164                          stbi_set_flip_vertically_on_load
165                          fix NEON support; fix mingw support
166       2.02  (2015-01-19) fix incorrect assert, fix warning
167       2.01  (2015-01-17) fix various warnings
168       2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
169       2.00  (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
170                          progressive JPEG
171                          PGM/PPM support
172                          STBI_MALLOC,STBI_REALLOC,STBI_FREE
173                          STBI_NO_*, STBI_ONLY_*
174                          GIF bugfix
175 
176    See end of file for full revision history.
177 
178 
179  ============================    Contributors    =========================
180 
181  Image formats                          Extensions, features
182     Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
183     Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
184     Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
185     Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
186     Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
187     Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
188     Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
189     urraka@github (animated gif)           Junggon Kim (PNM comments)
190                                            Daniel Gibson (16-bit TGA)
191 
192  Optimizations & bugfixes
193     Fabian "ryg" Giesen
194     Arseny Kapoulkine
195 
196  Bug & warning fixes
197     Marc LeBlanc            David Woo          Guillaume George   Martins Mozeiko
198     Christpher Lloyd        Martin Golini      Jerry Jansson      Joseph Thomson
199     Dave Moore              Roy Eltham         Hayaki Saito       Phil Jordan
200     Won Chun                Luke Graham        Johan Duparc       Nathan Reed
201     the Horde3D community   Thomas Ruf         Ronny Chevalier    Nick Verigakis
202     Janez Zemva             John Bartholomew   Michal Cichon      svdijk@github
203     Jonathan Blow           Ken Hamada         Tero Hanninen      Baldur Karlsson
204     Laurent Gomila          Cort Stratton      Sergio Gonzalez    romigrou@github
205     Aruelien Pocheville     Thibault Reuille   Cass Everitt       Matthew Gregan
206     Ryamond Barbiero        Paul Du Bois       Engin Manap        snagar@github
207     Michaelangel007@github  Oriol Ferrer Mesia socks-the-fox
208     Blazej Dariusz Roszkowski
209 
210 
211 LICENSE
212 
213 This software is dual-licensed to the public domain and under the following
214 license: you are granted a perpetual, irrevocable license to copy, modify,
215 publish, and distribute this file as you see fit.
216 
217 */
218 
219 #ifndef STBI_INCLUDE_STB_IMAGE_H
220 #define STBI_INCLUDE_STB_IMAGE_H
221 
222 // DOCUMENTATION
223 //
224 // Limitations:
225 //    - 16-bit-per-channel PNG is read only as 8-bit
226 //    - no 12-bit-per-channel JPEG
227 //    - no JPEGs with arithmetic coding
228 //    - no 1-bit BMP
229 //    - GIF always returns *comp=4
230 //
231 // Basic usage (see HDR discussion below for HDR usage):
232 //    int x,y,n;
233 //    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
234 //    // ... process data if not NULL ...
235 //    // ... x = width, y = height, n = # 8-bit components per pixel ...
236 //    // ... replace '0' with '1'..'4' to force that many components per pixel
237 //    // ... but 'n' will always be the number that it would have been if you said 0
238 //    stbi_image_free(data);
239 //
240 // Standard parameters:
241 //    int *x       -- outputs image width in pixels
242 //    int *y       -- outputs image height in pixels
243 //    int *comp    -- outputs # of image components in image file
244 //    int req_comp -- if non-zero, # of image components requested in result
245 //
246 // The return value from an image loader is an 'unsigned char *' which points
247 // to the pixel data, or NULL on an allocation failure or if the image is
248 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
249 // with each pixel consisting of N interleaved 8-bit components; the first
250 // pixel pointed to is top-left-most in the image. There is no padding between
251 // image scanlines or between pixels, regardless of format. The number of
252 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
253 // If req_comp is non-zero, *comp has the number of components that _would_
254 // have been output otherwise. E.g. if you set req_comp to 4, you will always
255 // get RGBA output, but you can check *comp to see if it's trivially opaque
256 // because e.g. there were only 3 channels in the source image.
257 //
258 // An output image with N components has the following components interleaved
259 // in this order in each pixel:
260 //
261 //     N=#comp     components
262 //       1           grey
263 //       2           grey, alpha
264 //       3           red, green, blue
265 //       4           red, green, blue, alpha
266 //
267 // If image loading fails for any reason, the return value will be NULL,
268 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
269 // can be queried for an extremely brief, end-user unfriendly explanation
270 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
271 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
272 // more user-friendly ones.
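//
// As a minimal sketch of the conventions above ("image.png" is a hypothetical
// path, and fprintf assumes <stdio.h> is included), the following forces RGBA
// output, reports a failed load, and reads the top-left pixel:
//
//    int x,y,n;
//    unsigned char *data = stbi_load("image.png", &x, &y, &n, 4);
//    if (data == NULL) {
//       fprintf(stderr, "load failed: %s\n", stbi_failure_reason());
//    } else {
//       // req_comp is 4, so pixel (col,row) starts at data[(row*x + col)*4]
//       unsigned char red = data[0], alpha = data[3];   // top-left pixel
//       // n == 1 or n == 3 means the file itself had no alpha channel
//       stbi_image_free(data);
//    }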
273 //
274 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
275 //
276 // ===========================================================================
277 //
278 // Philosophy
279 //
280 // stb libraries are designed with the following priorities:
281 //
282 //    1. easy to use
283 //    2. easy to maintain
284 //    3. good performance
285 //
286 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
287 // and for best performance I may provide less-easy-to-use APIs that give higher
288 // performance, in addition to the easy to use ones. Nevertheless, it's important
289 // to keep in mind that from the standpoint of you, a client of this library,
290 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
291 //
292 // Some secondary priorities arise directly from the first two, some of which
293 // make more explicit reasons why performance can't be emphasized.
294 //
295 //    - Portable ("ease of use")
296 //    - Small footprint ("easy to maintain")
297 //    - No dependencies ("ease of use")
298 //
299 // ===========================================================================
300 //
301 // I/O callbacks
302 //
303 // I/O callbacks allow you to read from arbitrary sources, like packaged
304 // files or some other source. Data read from callbacks are processed
305 // through a small internal buffer (currently 128 bytes) to try to reduce
306 // overhead.
307 //
308 // The three functions you must define are "read" (reads some bytes of data),
309 // "skip" (skips some bytes of data), and "eof" (reports if the stream is at the end).
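//
// As a rough sketch, callbacks over an in-memory buffer might look like the
// following. The names my_stream, my_read, my_skip, my_eof, file_bytes, and
// file_size are hypothetical and not part of stb_image; memcpy assumes
// <string.h> is included:
//
//    typedef struct { const unsigned char *data; int size, pos; } my_stream;
//
//    static int my_read(void *user, char *data, int size) {
//       my_stream *s = (my_stream *) user;
//       int left = s->size - s->pos;
//       if (size > left) size = left;          // never read past the end
//       memcpy(data, s->data + s->pos, size);
//       s->pos += size;
//       return size;                           // number of bytes actually read
//    }
//    static void my_skip(void *user, int n) {
//       my_stream *s = (my_stream *) user;
//       s->pos += n;                           // n may be negative ("unget")
//       if (s->pos < 0)       s->pos = 0;
//       if (s->pos > s->size) s->pos = s->size;
//    }
//    static int my_eof(void *user) {
//       my_stream *s = (my_stream *) user;
//       return s->pos >= s->size;
//    }
//
//    int x,y,n;
//    stbi_io_callbacks cb = { my_read, my_skip, my_eof };
//    my_stream s = { file_bytes, file_size, 0 };
//    unsigned char *pixels = stbi_load_from_callbacks(&cb, &s, &x, &y, &n, 0);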
310 //
311 // ===========================================================================
312 //
313 // SIMD support
314 //
315 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
316 // supported by the compiler. For ARM Neon support, you must explicitly
317 // request it.
318 //
319 // (The old do-it-yourself SIMD API is no longer supported in the current
320 // code.)
321 //
322 // On x86, SSE2 will automatically be used when available based on a run-time
323 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
324 // the typical path is to have separate builds for NEON and non-NEON devices
325 // (at least this is true for iOS and Android). Therefore, the NEON support is
326 // toggled by a build flag: define STBI_NEON to get NEON loops.
327 //
328 // The output of the JPEG decoder is slightly different from versions before
329 // SIMD support was introduced (that is, from versions before 1.49). The
330 // difference is only +-1 in the 8-bit RGB channels, and only on a small
331 // fraction of pixels. You can force the pre-1.49 behavior by defining
332 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
333 // and hence cost some performance.
334 //
335 // If for some reason you do not want to use any of the SIMD code, or if
336 // you have issues compiling it, you can disable it entirely by
337 // defining STBI_NO_SIMD.
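//
// For example, a NEON-enabled ARM build would define the flag ahead of the
// implementation include (a sketch; whether NEON is appropriate depends on
// your target devices):
//
//    #define STBI_NEON
//    #define STB_IMAGE_IMPLEMENTATION
//    #include "stb_image.h"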
338 //
339 // ===========================================================================
340 //
341 // HDR image support   (disable by defining STBI_NO_HDR)
342 //
343 // stb_image supports loading HDR images through a generic interface;
344 // currently the only HDR format implemented is the Radiance .HDR file
345 // format. You can still load any file through the existing interface;
346 // if you attempt to load an HDR file, it will be automatically remapped to
347 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
348 // both of these constants can be reconfigured through this interface:
349 //
350 //     stbi_hdr_to_ldr_gamma(2.2f);
351 //     stbi_hdr_to_ldr_scale(1.0f);
352 //
353 // (Note: do not pass _inverse_ constants; stb_image will invert them
354 // appropriately.)
355 //
356 // Additionally, there is a new, parallel interface for loading files as
357 // (linear) floats to preserve the full dynamic range:
358 //
359 //    float *data = stbi_loadf(filename, &x, &y, &n, 0);
360 //
361 // If you load LDR images through this interface, those images will
362 // be promoted to floating point values, run through the inverse of
363 // constants corresponding to the above:
364 //
365 //     stbi_ldr_to_hdr_scale(1.0f);
366 //     stbi_ldr_to_hdr_gamma(2.2f);
367 //
368 // Finally, given a filename (or an open file or memory block--see header
369 // file for details) containing image data, you can query for the "most
370 // appropriate" interface to use (that is, whether the image is HDR or
371 // not), using:
372 //
373 //     stbi_is_hdr(char const *filename);
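//
// Putting these together, a sketch of choosing the interface per file
// (filename is assumed to be a valid path supplied by the caller):
//
//    int x,y,n;
//    if (stbi_is_hdr(filename)) {
//       float *hdr = stbi_loadf(filename, &x, &y, &n, 0);
//       // ... use linear float data if not NULL ...
//       stbi_image_free(hdr);
//    } else {
//       unsigned char *ldr = stbi_load(filename, &x, &y, &n, 0);
//       // ... use 8-bit data if not NULL ...
//       stbi_image_free(ldr);
//    }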
374 //
375 // ===========================================================================
376 //
377 // iPhone PNG support:
378 //
379 // By default we convert iPhone-formatted PNGs back to RGB, even though
380 // they are internally encoded differently. You can disable this conversion
381 // by calling stbi_convert_iphone_png_to_rgb(0), in which case
382 // you will always get the native iPhone "format" passed through (which
383 // is BGR stored in RGB).
384 //
385 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
386 // pixel to remove any premultiplied alpha *only* if the image file explicitly
387 // says there's premultiplied data (currently only happens in iPhone images,
388 // and only if iPhone convert-to-rgb processing is on).
389 //
390 
391 
392 #ifndef STBI_NO_STDIO
393 #include <stdio.h>
394 #endif // STBI_NO_STDIO
395 
396 #define STBI_VERSION 1
397 
398 enum
399 {
400    STBI_default = 0, // only used for req_comp
401 
402    STBI_grey       = 1,
403    STBI_grey_alpha = 2,
404    STBI_rgb        = 3,
405    STBI_rgb_alpha  = 4
406 };
407 
408 typedef unsigned char stbi_uc;
409 
410 #ifdef __cplusplus
411 extern "C" {
412 #endif
413 
414 #ifdef STB_IMAGE_STATIC
415 #define STBIDEF static
416 #else
417 #define STBIDEF extern
418 #endif
419 
420 //////////////////////////////////////////////////////////////////////////////
421 //
422 // PRIMARY API - works on images of any type
423 //
424 
425 //
426 // load image by filename, open file, or memory buffer
427 //
428 
429 typedef struct
430 {
431    int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
432    void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
433    int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
434 } stbi_io_callbacks;
435 
436 STBIDEF stbi_uc *stbi_load               (char              const *filename,           int *x, int *y, int *comp, int req_comp);
437 STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *comp, int req_comp);
438 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *comp, int req_comp);
439 
440 #ifndef STBI_NO_STDIO
441 STBIDEF stbi_uc *stbi_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
442 // for stbi_load_from_file, file pointer is left pointing immediately after image
443 #endif
444 
445 #ifndef STBI_NO_LINEAR
446    STBIDEF float *stbi_loadf                 (char const *filename,           int *x, int *y, int *comp, int req_comp);
447    STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
448    STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
449 
450    #ifndef STBI_NO_STDIO
451    STBIDEF float *stbi_loadf_from_file  (FILE *f,                int *x, int *y, int *comp, int req_comp);
452    #endif
453 #endif
454 
455 #ifndef STBI_NO_HDR
456    STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
457    STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
458 #endif // STBI_NO_HDR
459 
460 #ifndef STBI_NO_LINEAR
461    STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
462    STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
463 #endif // STBI_NO_LINEAR
464 
465 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
466 STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
467 STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
468 #ifndef STBI_NO_STDIO
469 STBIDEF int      stbi_is_hdr          (char const *filename);
470 STBIDEF int      stbi_is_hdr_from_file(FILE *f);
471 #endif // STBI_NO_STDIO
472 
473 
474 // get a VERY brief reason for failure
475 // NOT THREADSAFE
476 STBIDEF const char *stbi_failure_reason  (void);
477 
478 // free the loaded image -- this is just free()
479 STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);
480 
481 // get image dimensions & components without fully decoding
482 STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
483 STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
484 
485 #ifndef STBI_NO_STDIO
486 STBIDEF int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
487 STBIDEF int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
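// A sketch of querying an image without decoding it ("image.png" is a
// hypothetical path); the info functions return nonzero on success:
//    int w, h, comp;
//    if (stbi_info("image.png", &w, &h, &comp)) { /* w, h, comp are valid */ }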
488 
489 #endif
490 
491 
492 
493 // for image formats that explicitly notate that they have premultiplied alpha,
494 // we just return the colors as stored in the file. set this flag to force
495 // unpremultiplication. results are undefined if the unpremultiply overflows.
496 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
497 
498 // indicate whether we should process iphone images back to canonical format,
499 // or just pass them through "as-is"
500 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
501 
502 // flip the image vertically, so the first pixel in the output array is the bottom left
503 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
504 
505 // ZLIB client - used by PNG, available for other purposes
506 
507 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
508 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
509 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
510 STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
511 
512 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
513 STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
514 
515 
516 #ifdef __cplusplus
517 }
518 #endif
519 
520 //
521 //
522 ////   end header file   /////////////////////////////////////////////////////
523 #endif // STBI_INCLUDE_STB_IMAGE_H
524 
525 #ifdef STB_IMAGE_IMPLEMENTATION
526 
527 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
528   || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
529   || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
530   || defined(STBI_ONLY_ZLIB)
531    #ifndef STBI_ONLY_JPEG
532    #define STBI_NO_JPEG
533    #endif
534    #ifndef STBI_ONLY_PNG
535    #define STBI_NO_PNG
536    #endif
537    #ifndef STBI_ONLY_BMP
538    #define STBI_NO_BMP
539    #endif
540    #ifndef STBI_ONLY_PSD
541    #define STBI_NO_PSD
542    #endif
543    #ifndef STBI_ONLY_TGA
544    #define STBI_NO_TGA
545    #endif
546    #ifndef STBI_ONLY_GIF
547    #define STBI_NO_GIF
548    #endif
549    #ifndef STBI_ONLY_HDR
550    #define STBI_NO_HDR
551    #endif
552    #ifndef STBI_ONLY_PIC
553    #define STBI_NO_PIC
554    #endif
555    #ifndef STBI_ONLY_PNM
556    #define STBI_NO_PNM
557    #endif
558 #endif
559 
560 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
561 #define STBI_NO_ZLIB
562 #endif
563 
564 
565 #include <stdarg.h>
566 #include <stddef.h> // ptrdiff_t on osx
567 #include <stdlib.h>
568 #include <string.h>
569 
570 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
571 #include <math.h>  // ldexp
572 #endif
573 
574 #ifndef STBI_NO_STDIO
575 #include <stdio.h>
576 #endif
577 
578 #ifndef STBI_ASSERT
579 #include <assert.h>
580 #define STBI_ASSERT(x) assert(x)
581 #endif
582 
583 
584 #ifndef _MSC_VER
585    #ifdef __cplusplus
586    #define stbi_inline inline
587    #else
588    #define stbi_inline
589    #endif
590 #else
591    #define stbi_inline __forceinline
592 #endif
593 
594 
595 #ifdef _MSC_VER
596 typedef unsigned short stbi__uint16;
597 typedef   signed short stbi__int16;
598 typedef unsigned int   stbi__uint32;
599 typedef   signed int   stbi__int32;
600 #else
601 #include <stdint.h>
602 typedef uint16_t stbi__uint16;
603 typedef int16_t  stbi__int16;
604 typedef uint32_t stbi__uint32;
605 typedef int32_t  stbi__int32;
606 #endif
607 
608 // should produce compiler error if size is wrong
609 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
610 
611 #ifdef _MSC_VER
612 #define STBI_NOTUSED(v)  (void)(v)
613 #else
614 #define STBI_NOTUSED(v)  (void)sizeof(v)
615 #endif
616 
617 #ifdef _MSC_VER
618 #define STBI_HAS_LROTL
619 #endif
620 
621 #ifdef STBI_HAS_LROTL
622    #define stbi_lrot(x,y)  _lrotl(x,y)
623 #else
624    #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
625 #endif
626 
627 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
628 // ok
629 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
630 // ok
631 #else
632 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
633 #endif
634 
635 #ifndef STBI_MALLOC
636 #define STBI_MALLOC(sz)           malloc(sz)
637 #define STBI_REALLOC(p,newsz)     realloc(p,newsz)
638 #define STBI_FREE(p)              free(p)
639 #endif
640 
641 #ifndef STBI_REALLOC_SIZED
642 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
643 #endif
644 
645 // x86/x64 detection
646 #if defined(__x86_64__) || defined(_M_X64)
647 #define STBI__X64_TARGET
648 #elif defined(__i386) || defined(_M_IX86)
649 #define STBI__X86_TARGET
650 #endif
651 
652 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
653 // NOTE: it is not clear whether we actually need this for the 64-bit path.
654 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
655 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
656 // this is just broken and gcc are jerks for not fixing it properly
657 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
658 #define STBI_NO_SIMD
659 #endif
660 
661 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
662 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
663 //
664 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
665 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
666 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
667 // simultaneously enabling "-mstackrealign".
668 //
669 // See https://github.com/nothings/stb/issues/81 for more information.
670 //
671 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
672 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
673 #define STBI_NO_SIMD
674 #endif
675 
676 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
677 #define STBI_SSE2
678 #include <emmintrin.h>
679 
680 #ifdef _MSC_VER
681 
682 #if _MSC_VER >= 1400  // not VC6
683 #include <intrin.h> // __cpuid
684 static int stbi__cpuid3(void)
685 {
686    int info[4];
687    __cpuid(info,1);
688    return info[3];
689 }
690 #else
691 static int stbi__cpuid3(void)
692 {
693    int res;
694    __asm {
695       mov  eax,1
696       cpuid
697       mov  res,edx
698    }
699    return res;
700 }
701 #endif
702 
703 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
704 
705 static int stbi__sse2_available()
706 {
707    int info3 = stbi__cpuid3();
708    return ((info3 >> 26) & 1) != 0;
709 }
710 #else // assume GCC-style if not VC++
711 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
712 
713 static int stbi__sse2_available()
714 {
715 #if defined(STBI__X64_TARGET)
716    // on x64, SSE2 can be assumed to be available.
717    return 1;
718 #elif defined(LOVE_STBI_SSE2_OVERRIDE)
719    return 1;
720 #else
721 #	warning "stb_image compiled without SSE2 support, define LOVE_STBI_SSE2_OVERRIDE to force SSE2 support"
722    // __builtin_cpu_supports is buggy on GCC 5 and above, causing problems if
723    // referenced in a shared object, giving missing __cpu_model hidden symbol errors.
724    // To get around that, just assume that SSE2 is not available on x86.
725    //
726    // See https://github.com/nothings/stb/issues/280 for more information.
727    return 0;
728 #endif
729 }
730 #endif
731 #endif
732 
733 // ARM NEON
734 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
735 #undef STBI_NEON
736 #endif
737 
738 #ifdef STBI_NEON
739 #include <arm_neon.h>
740 // assume GCC or Clang on ARM targets
741 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
742 #endif
743 
744 #ifndef STBI_SIMD_ALIGN
745 #define STBI_SIMD_ALIGN(type, name) type name
746 #endif
747 
748 ///////////////////////////////////////////////
749 //
750 //  stbi__context struct and start_xxx functions
751 
752 // stbi__context structure is our basic context used by all images, so it
753 // contains all the IO context, plus some basic image information
754 typedef struct
755 {
756    stbi__uint32 img_x, img_y;
757    int img_n, img_out_n;
758 
759    stbi_io_callbacks io;
760    void *io_user_data;
761 
762    int read_from_callbacks;
763    int buflen;
764    stbi_uc buffer_start[128];
765 
766    stbi_uc *img_buffer, *img_buffer_end;
767    stbi_uc *img_buffer_original, *img_buffer_original_end;
768 } stbi__context;
769 
770 
771 static void stbi__refill_buffer(stbi__context *s);
772 
773 // initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
775 {
776    s->io.read = NULL;
777    s->read_from_callbacks = 0;
778    s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
779    s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
780 }
781 
782 // initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
784 {
785    s->io = *c;
786    s->io_user_data = user;
787    s->buflen = sizeof(s->buffer_start);
788    s->read_from_callbacks = 1;
789    s->img_buffer_original = s->buffer_start;
790    stbi__refill_buffer(s);
791    s->img_buffer_original_end = s->img_buffer_end;
792 }
793 
794 #ifndef STBI_NO_STDIO
795 
static int stbi__stdio_read(void *user, char *data, int size)
797 {
798    return (int) fread(data,1,size,(FILE*) user);
799 }
800 
static void stbi__stdio_skip(void *user, int n)
802 {
803    fseek((FILE*) user, n, SEEK_CUR);
804 }
805 
static int stbi__stdio_eof(void *user)
807 {
808    return feof((FILE*) user);
809 }
810 
811 static stbi_io_callbacks stbi__stdio_callbacks =
812 {
813    stbi__stdio_read,
814    stbi__stdio_skip,
815    stbi__stdio_eof,
816 };
817 
static void stbi__start_file(stbi__context *s, FILE *f)
819 {
820    stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
821 }
822 
823 //static void stop_file(stbi__context *s) { }
824 
825 #endif // !STBI_NO_STDIO
826 
static void stbi__rewind(stbi__context *s)
{
   // conceptually, rewind SHOULD return to the beginning of the stream, but
   // we only rewind to the beginning of the initial buffer, because rewind is
   // only used after a 'test', and a test never reads more than 92 bytes
   s->img_buffer = s->img_buffer_original;
   s->img_buffer_end = s->img_buffer_original_end;
}
835 
836 #ifndef STBI_NO_JPEG
837 static int      stbi__jpeg_test(stbi__context *s);
838 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
839 static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
840 #endif
841 
842 #ifndef STBI_NO_PNG
843 static int      stbi__png_test(stbi__context *s);
844 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
845 static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
846 #endif
847 
848 #ifndef STBI_NO_BMP
849 static int      stbi__bmp_test(stbi__context *s);
850 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
851 static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
852 #endif
853 
854 #ifndef STBI_NO_TGA
855 static int      stbi__tga_test(stbi__context *s);
856 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
857 static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
858 #endif
859 
860 #ifndef STBI_NO_PSD
861 static int      stbi__psd_test(stbi__context *s);
862 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
863 static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
864 #endif
865 
866 #ifndef STBI_NO_HDR
867 static int      stbi__hdr_test(stbi__context *s);
868 static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
869 static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
870 #endif
871 
872 #ifndef STBI_NO_PIC
873 static int      stbi__pic_test(stbi__context *s);
874 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
875 static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
876 #endif
877 
878 #ifndef STBI_NO_GIF
879 static int      stbi__gif_test(stbi__context *s);
880 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
881 static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
882 #endif
883 
884 #ifndef STBI_NO_PNM
885 static int      stbi__pnm_test(stbi__context *s);
886 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
887 static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
888 #endif
889 
890 // this is not threadsafe
891 static const char *stbi__g_failure_reason;
892 
STBIDEF const char *stbi_failure_reason(void)
894 {
895    return stbi__g_failure_reason;
896 }
897 
static int stbi__err(const char *str)
899 {
900    stbi__g_failure_reason = str;
901    return 0;
902 }
903 
static void *stbi__malloc(size_t size)
905 {
906     return STBI_MALLOC(size);
907 }
908 
909 // stbi__err - error
910 // stbi__errpf - error returning pointer to float
911 // stbi__errpuc - error returning pointer to unsigned char
912 
913 #ifdef STBI_NO_FAILURE_STRINGS
914    #define stbi__err(x,y)  0
915 #elif defined(STBI_FAILURE_USERMSG)
916    #define stbi__err(x,y)  stbi__err(y)
917 #else
918    #define stbi__err(x,y)  stbi__err(x)
919 #endif
920 
921 #define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
922 #define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
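// A minimal sketch of the error-helper idiom described above, assuming
// nothing beyond standard C. All names prefixed sketch_ are hypothetical and
// are not part of stb_image. The helper records a reason and returns 0; the
// macro calls it only for that side effect and then yields a typed NULL, so
// a loader can fail with a single return statement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's global failure reason. */
static const char *sketch_failure_reason = NULL;

/* Record the reason and return 0, so callers can write
   `return sketch_err("...");` from functions that return int. */
static int sketch_err(const char *reason)
{
   assert(reason != NULL);
   sketch_failure_reason = reason;
   return 0;
}

/* Both arms of ?: are NULL; the condition exists only to force the call
   to sketch_err, after which the result is cast to the pointer type. */
#define sketch_errpuc(reason) \
   ((unsigned char *)(size_t) (sketch_err(reason) ? NULL : NULL))

/* Hypothetical loader: succeeds when ok is nonzero, otherwise fails
   through the macro and leaves a reason behind. */
static unsigned char *sketch_load(int ok)
{
   static unsigned char pixel[1] = { 42 };
   if (!ok) return sketch_errpuc("unknown image type");
   return pixel;
}
```

// Like the real global above, this sketch keeps the reason in one static
// variable, so it is not threadsafe either.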
923 
STBIDEF void stbi_image_free(void *retval_from_stbi_load)
925 {
926    STBI_FREE(retval_from_stbi_load);
927 }
928 
929 #ifndef STBI_NO_LINEAR
930 static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
931 #endif
932 
933 #ifndef STBI_NO_HDR
934 static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
935 #endif
936 
937 static int stbi__vertically_flip_on_load = 0;
938 
STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
940 {
941     stbi__vertically_flip_on_load = flag_true_if_should_flip;
942 }
943 
static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
945 {
946    #ifndef STBI_NO_JPEG
947    if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
948    #endif
949    #ifndef STBI_NO_PNG
950    if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
951    #endif
952    #ifndef STBI_NO_BMP
953    if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
954    #endif
955    #ifndef STBI_NO_GIF
956    if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
957    #endif
958    #ifndef STBI_NO_PSD
959    if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
960    #endif
961    #ifndef STBI_NO_PIC
962    if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);
963    #endif
964    #ifndef STBI_NO_PNM
965    if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp);
966    #endif
967 
968    #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
      // stbi__hdr_load returns NULL on failure; don't hand that to the converter
      if (hdr == NULL) return NULL;
      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   }
973    #endif
974 
975    #ifndef STBI_NO_TGA
976    // test tga last because it's a crappy test!
977    if (stbi__tga_test(s))
978       return stbi__tga_load(s,x,y,comp,req_comp);
979    #endif
980 
981    return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
982 }
983 
static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
985 {
986    unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
987 
988    if (stbi__vertically_flip_on_load && result != NULL) {
989       int w = *x, h = *y;
990       int depth = req_comp ? req_comp : *comp;
991       int row,col,z;
992       stbi_uc temp;
993 
994       // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
995       for (row = 0; row < (h>>1); row++) {
996          for (col = 0; col < w; col++) {
997             for (z = 0; z < depth; z++) {
998                temp = result[(row * w + col) * depth + z];
999                result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1000                result[((h - row - 1) * w + col) * depth + z] = temp;
1001             }
1002          }
1003       }
1004    }
1005 
1006    return result;
1007 }
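// One possible reading of the @OPTIMIZE note above, as a sketch rather than
// the library's implementation: swap whole rows through a scratch buffer with
// memcpy instead of swapping one component at a time. The name
// sketch_flip_rows and its return convention are assumptions made for this
// example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Flip an interleaved image vertically, one row at a time.
   pixels holds h rows of w pixels with depth bytes per pixel.
   Returns 1 on success, 0 if the scratch row cannot be allocated. */
static int sketch_flip_rows(unsigned char *pixels, int w, int h, int depth)
{
   size_t stride;
   unsigned char *scratch;
   int row;

   assert(pixels != NULL && w > 0 && h > 0 && depth > 0);

   stride = (size_t) w * (size_t) depth;   /* bytes per row */
   scratch = (unsigned char *) malloc(stride);
   if (scratch == NULL) return 0;

   /* Swap row r with row h-1-r; the middle row of an odd-height
      image stays where it is. */
   for (row = 0; row < (h >> 1); row++) {
      unsigned char *top    = pixels + (size_t) row * stride;
      unsigned char *bottom = pixels + (size_t) (h - row - 1) * stride;
      memcpy(scratch, top, stride);
      memcpy(top, bottom, stride);
      memcpy(bottom, scratch, stride);
   }

   free(scratch);
   return 1;
}
```

// Unlike the in-place loop above, this variant allocates, so it gains a
// failure mode that a caller would have to handle.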
1008 
1009 #ifndef STBI_NO_HDR
static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1011 {
1012    if (stbi__vertically_flip_on_load && result != NULL) {
1013       int w = *x, h = *y;
1014       int depth = req_comp ? req_comp : *comp;
1015       int row,col,z;
1016       float temp;
1017 
1018       // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1019       for (row = 0; row < (h>>1); row++) {
1020          for (col = 0; col < w; col++) {
1021             for (z = 0; z < depth; z++) {
1022                temp = result[(row * w + col) * depth + z];
1023                result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1024                result[((h - row - 1) * w + col) * depth + z] = temp;
1025             }
1026          }
1027       }
1028    }
1029 }
1030 #endif
1031 
1032 #ifndef STBI_NO_STDIO
1033 
static FILE *stbi__fopen(char const *filename, char const *mode)
1035 {
1036    FILE *f;
1037 #if defined(_MSC_VER) && _MSC_VER >= 1400
1038    if (0 != fopen_s(&f, filename, mode))
1039       f=0;
1040 #else
1041    f = fopen(filename, mode);
1042 #endif
1043    return f;
1044 }
1045 
1046 
STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1048 {
1049    FILE *f = stbi__fopen(filename, "rb");
1050    unsigned char *result;
1051    if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1052    result = stbi_load_from_file(f,x,y,comp,req_comp);
1053    fclose(f);
1054    return result;
1055 }
1056 
STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1058 {
1059    unsigned char *result;
1060    stbi__context s;
1061    stbi__start_file(&s,f);
1062    result = stbi__load_flip(&s,x,y,comp,req_comp);
1063    if (result) {
1064       // need to 'unget' all the characters in the IO buffer
1065       fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1066    }
1067    return result;
1068 }
1069 #endif //!STBI_NO_STDIO
1070 
STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1072 {
1073    stbi__context s;
1074    stbi__start_mem(&s,buffer,len);
1075    return stbi__load_flip(&s,x,y,comp,req_comp);
1076 }
1077 
STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1079 {
1080    stbi__context s;
1081    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1082    return stbi__load_flip(&s,x,y,comp,req_comp);
1083 }
1084 
1085 #ifndef STBI_NO_LINEAR
static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1087 {
1088    unsigned char *data;
1089    #ifndef STBI_NO_HDR
1090    if (stbi__hdr_test(s)) {
1091       float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1092       if (hdr_data)
1093          stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1094       return hdr_data;
1095    }
1096    #endif
1097    data = stbi__load_flip(s, x, y, comp, req_comp);
1098    if (data)
1099       return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1100    return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1101 }
1102 
STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1104 {
1105    stbi__context s;
1106    stbi__start_mem(&s,buffer,len);
1107    return stbi__loadf_main(&s,x,y,comp,req_comp);
1108 }
1109 
STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1111 {
1112    stbi__context s;
1113    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1114    return stbi__loadf_main(&s,x,y,comp,req_comp);
1115 }
1116 
1117 #ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1119 {
1120    float *result;
1121    FILE *f = stbi__fopen(filename, "rb");
1122    if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1123    result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1124    fclose(f);
1125    return result;
1126 }
1127 
STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1129 {
1130    stbi__context s;
1131    stbi__start_file(&s,f);
1132    return stbi__loadf_main(&s,x,y,comp,req_comp);
1133 }
1134 #endif // !STBI_NO_STDIO
1135 
1136 #endif // !STBI_NO_LINEAR
1137 
// the is-hdr-or-not queries are defined regardless of whether STBI_NO_LINEAR
// or STBI_NO_HDR is defined, for API simplicity; if STBI_NO_HDR is defined,
// they always report false!
1141 
STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1143 {
1144    #ifndef STBI_NO_HDR
1145    stbi__context s;
1146    stbi__start_mem(&s,buffer,len);
1147    return stbi__hdr_test(&s);
1148    #else
1149    STBI_NOTUSED(buffer);
1150    STBI_NOTUSED(len);
1151    return 0;
1152    #endif
1153 }
1154 
1155 #ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename)
1157 {
1158    FILE *f = stbi__fopen(filename, "rb");
1159    int result=0;
1160    if (f) {
1161       result = stbi_is_hdr_from_file(f);
1162       fclose(f);
1163    }
1164    return result;
1165 }
1166 
STBIDEF int      stbi_is_hdr_from_file(FILE *f)
1168 {
1169    #ifndef STBI_NO_HDR
1170    stbi__context s;
1171    stbi__start_file(&s,f);
1172    return stbi__hdr_test(&s);
1173    #else
1174    STBI_NOTUSED(f);
1175    return 0;
1176    #endif
1177 }
1178 #endif // !STBI_NO_STDIO
1179 
STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1181 {
1182    #ifndef STBI_NO_HDR
1183    stbi__context s;
1184    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1185    return stbi__hdr_test(&s);
1186    #else
1187    STBI_NOTUSED(clbk);
1188    STBI_NOTUSED(user);
1189    return 0;
1190    #endif
1191 }
1192 
1193 #ifndef STBI_NO_LINEAR
1194 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1195 
STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1198 #endif
1199 
1200 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1201 
STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1204 
1205 
1206 //////////////////////////////////////////////////////////////////////////////
1207 //
1208 // Common code used by all image loaders
1209 //
1210 
1211 enum
1212 {
1213    STBI__SCAN_load=0,
1214    STBI__SCAN_type,
1215    STBI__SCAN_header
1216 };
1217 
static void stbi__refill_buffer(stbi__context *s)
1219 {
1220    int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1221    if (n == 0) {
1222       // at end of file, treat same as if from memory, but need to handle case
1223       // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1224       s->read_from_callbacks = 0;
1225       s->img_buffer = s->buffer_start;
1226       s->img_buffer_end = s->buffer_start+1;
1227       *s->img_buffer = 0;
1228    } else {
1229       s->img_buffer = s->buffer_start;
1230       s->img_buffer_end = s->buffer_start + n;
1231    }
1232 }
1233 
stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1235 {
1236    if (s->img_buffer < s->img_buffer_end)
1237       return *s->img_buffer++;
1238    if (s->read_from_callbacks) {
1239       stbi__refill_buffer(s);
1240       return *s->img_buffer++;
1241    }
1242    return 0;
1243 }
1244 
stbi_inline static int stbi__at_eof(stbi__context *s)
1246 {
1247    if (s->io.read) {
1248       if (!(s->io.eof)(s->io_user_data)) return 0;
1249       // if feof() is true, check if buffer = end
1250       // special case: we've only got the special 0 character at the end
1251       if (s->read_from_callbacks == 0) return 1;
1252    }
1253 
1254    return s->img_buffer >= s->img_buffer_end;
1255 }
1256 
static void stbi__skip(stbi__context *s, int n)
1258 {
1259    if (n < 0) {
1260       s->img_buffer = s->img_buffer_end;
1261       return;
1262    }
1263    if (s->io.read) {
1264       int blen = (int) (s->img_buffer_end - s->img_buffer);
1265       if (blen < n) {
1266          s->img_buffer = s->img_buffer_end;
1267          (s->io.skip)(s->io_user_data, n - blen);
1268          return;
1269       }
1270    }
1271    s->img_buffer += n;
1272 }
1273 
static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1275 {
1276    if (s->io.read) {
1277       int blen = (int) (s->img_buffer_end - s->img_buffer);
1278       if (blen < n) {
1279          int res, count;
1280 
1281          memcpy(buffer, s->img_buffer, blen);
1282 
1283          count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1284          res = (count == (n-blen));
1285          s->img_buffer = s->img_buffer_end;
1286          return res;
1287       }
1288    }
1289 
1290    if (s->img_buffer+n <= s->img_buffer_end) {
1291       memcpy(buffer, s->img_buffer, n);
1292       s->img_buffer += n;
1293       return 1;
1294    } else
1295       return 0;
1296 }
1297 
static int stbi__get16be(stbi__context *s)
1299 {
1300    int z = stbi__get8(s);
1301    return (z << 8) + stbi__get8(s);
1302 }
1303 
1304 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
1305 // nothing
1306 #else
static stbi__uint32 stbi__get32be(stbi__context *s)
1308 {
1309    stbi__uint32 z = stbi__get16be(s);
1310    return (z << 16) + stbi__get16be(s);
1311 }
1312 #endif
1313 
1314 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1315 // nothing
1316 #else
static int stbi__get16le(stbi__context *s)
1318 {
1319    int z = stbi__get8(s);
1320    return z + (stbi__get8(s) << 8);
1321 }
1322 #endif
1323 
1324 #ifndef STBI_NO_BMP
static stbi__uint32 stbi__get32le(stbi__context *s)
{
   stbi__uint32 z = stbi__get16le(s);
   // cast before shifting so a high half >= 0x8000 can't overflow a signed int
   return z + ((stbi__uint32) stbi__get16le(s) << 16);
}
1330 #endif
1331 
1332 #define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
1333 
1334 
1335 //////////////////////////////////////////////////////////////////////////////
1336 //
1337 //  generic converter from built-in img_n to req_comp
1338 //    individual types do this automatically as much as possible (e.g. jpeg
1339 //    does all cases internally since it needs to colorspace convert anyway,
//    and it never has alpha, so very few cases). png can automatically
1341 //    interleave an alpha=255 channel, but falls back to this for other cases
1342 //
1343 //  assume data buffer is malloced, so malloc a new one and free that one
1344 //  only failure mode is malloc failing
1345 
static stbi_uc stbi__compute_y(int r, int g, int b)
1347 {
1348    return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
1349 }
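// A worked check of the weights used above, as a sketch with a hypothetical
// name. The weights 77, 150 and 29 sum to 256, so the >> 8 divides by the
// total weight and pure white maps back to 255. They approximate the common
// 0.299 / 0.587 / 0.114 luma coefficients (77/256 is about 0.301, 150/256
// about 0.586, 29/256 about 0.113).

```c
#include <assert.h>

/* Integer luma from 8-bit R, G, B using weights that sum to 256. */
static unsigned char sketch_luma(int r, int g, int b)
{
   assert(r >= 0 && r <= 255);
   assert(g >= 0 && g <= 255);
   assert(b >= 0 && b <= 255);
   return (unsigned char) ((r * 77 + g * 150 + b * 29) >> 8);
}
```

// With these weights, pure red, green and blue map to 76, 149 and 28.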
1350 
static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1352 {
1353    int i,j;
1354    unsigned char *good;
1355 
1356    if (req_comp == img_n) return data;
1357    STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1358 
1359    good = (unsigned char *) stbi__malloc(req_comp * x * y);
1360    if (good == NULL) {
1361       STBI_FREE(data);
1362       return stbi__errpuc("outofmem", "Out of memory");
1363    }
1364 
1365    for (j=0; j < (int) y; ++j) {
1366       unsigned char *src  = data + j * x * img_n   ;
1367       unsigned char *dest = good + j * x * req_comp;
1368 
1369       #define COMBO(a,b)  ((a)*8+(b))
1370       #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1371       // convert source image with img_n components to one with req_comp components;
1372       // avoid switch per pixel, so use switch per scanline and massive macros
1373       switch (COMBO(img_n, req_comp)) {
1374          CASE(1,2) dest[0]=src[0], dest[1]=255; break;
1375          CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1376          CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
1377          CASE(2,1) dest[0]=src[0]; break;
1378          CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1379          CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
1380          CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
1381          CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1382          CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
1383          CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1384          CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
1385          CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
1386          default: STBI_ASSERT(0);
1387       }
1388       #undef CASE
1389    }
1390 
1391    STBI_FREE(data);
1392    return good;
1393 }
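// The COMBO/CASE macros above pack the (source, destination) component
// counts into one integer so that a single switch per scanline selects the
// inner loop. A reduced sketch of that dispatch, with hypothetical names and
// only three of the twelve conversions, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Pack two small component counts (1..4) into one switch key. */
#define SKETCH_COMBO(a,b)  ((a)*8+(b))

/* Convert one row of w pixels from src_n to dest_n components.
   Returns 1 if the pair is handled by this sketch, 0 otherwise. */
static int sketch_convert_row(const unsigned char *src, int src_n,
                              unsigned char *dest, int dest_n, int w)
{
   int i;
   assert(src != NULL && dest != NULL && w >= 0);

   switch (SKETCH_COMBO(src_n, dest_n)) {
      case SKETCH_COMBO(1,4):   /* grey -> RGBA, opaque alpha */
         for (i = 0; i < w; ++i, src += 1, dest += 4) {
            dest[0] = dest[1] = dest[2] = src[0];
            dest[3] = 255;
         }
         return 1;
      case SKETCH_COMBO(3,4):   /* RGB -> RGBA, opaque alpha */
         for (i = 0; i < w; ++i, src += 3, dest += 4) {
            dest[0] = src[0]; dest[1] = src[1]; dest[2] = src[2];
            dest[3] = 255;
         }
         return 1;
      case SKETCH_COMBO(4,3):   /* RGBA -> RGB, alpha dropped */
         for (i = 0; i < w; ++i, src += 4, dest += 3) {
            dest[0] = src[0]; dest[1] = src[1]; dest[2] = src[2];
         }
         return 1;
      default:
         return 0;              /* pair not covered by this sketch */
   }
}
```

// The switch is evaluated once per row, not once per pixel, which is the
// point of the trick.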
1394 
1395 #ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1397 {
1398    int i,k,n;
1399    float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1400    if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1401    // compute number of non-alpha components
1402    if (comp & 1) n = comp; else n = comp-1;
1403    for (i=0; i < x*y; ++i) {
1404       for (k=0; k < n; ++k) {
1405          output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1406       }
1407       if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1408    }
1409    STBI_FREE(data);
1410    return output;
1411 }
1412 #endif
1413 
1414 #ifndef STBI_NO_HDR
1415 #define stbi__float2int(x)   ((int) (x))
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
1417 {
1418    int i,k,n;
1419    stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1420    if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1421    // compute number of non-alpha components
1422    if (comp & 1) n = comp; else n = comp-1;
1423    for (i=0; i < x*y; ++i) {
1424       for (k=0; k < n; ++k) {
1425          float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1426          if (z < 0) z = 0;
1427          if (z > 255) z = 255;
1428          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1429       }
1430       if (k < comp) {
1431          float z = data[i*comp+k] * 255 + 0.5f;
1432          if (z < 0) z = 0;
1433          if (z > 255) z = 255;
1434          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1435       }
1436    }
1437    STBI_FREE(data);
1438    return output;
1439 }
1440 #endif
1441 
1442 //////////////////////////////////////////////////////////////////////////////
1443 //
1444 //  "baseline" JPEG/JFIF decoder
1445 //
1446 //    simple implementation
1447 //      - doesn't support delayed output of y-dimension
1448 //      - simple interface (only one output format: 8-bit interleaved RGB)
1449 //      - doesn't try to recover corrupt jpegs
1450 //      - doesn't allow partial loading, loading multiple at once
1451 //      - still fast on x86 (copying globals into locals doesn't help x86)
1452 //      - allocates lots of intermediate memory (full size of all components)
1453 //        - non-interleaved case requires this anyway
1454 //        - allows good upsampling (see next)
1455 //    high-quality
1456 //      - upsampled channels are bilinearly interpolated, even across blocks
1457 //      - quality integer IDCT derived from IJG's 'slow'
1458 //    performance
1459 //      - fast huffman; reasonable integer IDCT
1460 //      - some SIMD kernels for common paths on targets with SSE2/NEON
1461 //      - uses a lot of intermediate memory, could cache poorly
1462 
1463 #ifndef STBI_NO_JPEG
1464 
1465 // huffman decoding acceleration
1466 #define FAST_BITS   9  // larger handles more cases; smaller stomps less cache
1467 
1468 typedef struct
1469 {
1470    stbi_uc  fast[1 << FAST_BITS];
1471    // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1472    stbi__uint16 code[256];
1473    stbi_uc  values[256];
1474    stbi_uc  size[257];
1475    unsigned int maxcode[18];
1476    int    delta[17];   // old 'firstsymbol' - old 'firstcode'
1477 } stbi__huffman;
1478 
1479 typedef struct
1480 {
1481    stbi__context *s;
1482    stbi__huffman huff_dc[4];
1483    stbi__huffman huff_ac[4];
1484    stbi_uc dequant[4][64];
1485    stbi__int16 fast_ac[4][1 << FAST_BITS];
1486 
1487 // sizes for components, interleaved MCUs
1488    int img_h_max, img_v_max;
1489    int img_mcu_x, img_mcu_y;
1490    int img_mcu_w, img_mcu_h;
1491 
1492 // definition of jpeg image component
1493    struct
1494    {
1495       int id;
1496       int h,v;
1497       int tq;
1498       int hd,ha;
1499       int dc_pred;
1500 
1501       int x,y,w2,h2;
1502       stbi_uc *data;
1503       void *raw_data, *raw_coeff;
1504       stbi_uc *linebuf;
1505       short   *coeff;   // progressive only
1506       int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
1507    } img_comp[4];
1508 
1509    stbi__uint32   code_buffer; // jpeg entropy-coded buffer
1510    int            code_bits;   // number of valid bits
1511    unsigned char  marker;      // marker seen while filling entropy buffer
1512    int            nomore;      // flag if we saw a marker so must stop
1513 
1514    int            progressive;
1515    int            spec_start;
1516    int            spec_end;
1517    int            succ_high;
1518    int            succ_low;
1519    int            eob_run;
1520    int            rgb;
1521 
1522    int scan_n, order[4];
1523    int restart_interval, todo;
1524 
1525 // kernels
1526    void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1527    void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1528    stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1529 } stbi__jpeg;
1530 
static int stbi__build_huffman(stbi__huffman *h, int *count)
1532 {
1533    int i,j,k=0,code;
1534    // build size list for each symbol (from JPEG spec)
1535    for (i=0; i < 16; ++i)
1536       for (j=0; j < count[i]; ++j)
1537          h->size[k++] = (stbi_uc) (i+1);
1538    h->size[k] = 0;
1539 
1540    // compute actual symbols (from jpeg spec)
1541    code = 0;
1542    k = 0;
1543    for(j=1; j <= 16; ++j) {
1544       // compute delta to add to code to compute symbol id
1545       h->delta[j] = k - code;
1546       if (h->size[k] == j) {
1547          while (h->size[k] == j)
1548             h->code[k++] = (stbi__uint16) (code++);
1549          if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1550       }
1551       // compute largest code + 1 for this size, preshifted as needed later
1552       h->maxcode[j] = code << (16-j);
1553       code <<= 1;
1554    }
1555    h->maxcode[j] = 0xffffffff;
1556 
1557    // build non-spec acceleration table; 255 is flag for not-accelerated
1558    memset(h->fast, 255, 1 << FAST_BITS);
1559    for (i=0; i < k; ++i) {
1560       int s = h->size[i];
1561       if (s <= FAST_BITS) {
1562          int c = h->code[i] << (FAST_BITS-s);
1563          int m = 1 << (FAST_BITS-s);
1564          for (j=0; j < m; ++j) {
1565             h->fast[c+j] = (stbi_uc) i;
1566          }
1567       }
1568    }
1569    return 1;
1570 }
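// The first half of stbi__build_huffman assigns canonical codes: codes of
// one length are consecutive integers, and the running code is doubled when
// moving on to the next length. A standalone sketch of only that step
// follows. The function name, the output arrays and the -1 error return are
// assumptions of this example, and the over-subscription check is placed
// before each assignment, which differs slightly from the check above.

```c
#include <assert.h>
#include <stddef.h>

/* count[i] is the number of codes of length i+1, for 16 lengths.
   Fills code_out/size_out in symbol order and returns the number of
   symbols, or -1 if the lengths do not fit the code space. */
static int sketch_assign_codes(const int count[16],
                               unsigned short code_out[256],
                               unsigned char size_out[256])
{
   int len, j, k = 0;
   unsigned int code = 0;

   assert(count != NULL && code_out != NULL && size_out != NULL);

   for (len = 1; len <= 16; ++len) {
      for (j = 0; j < count[len - 1]; ++j) {
         if (k >= 256) return -1;             /* too many symbols */
         if (code >= (1u << len)) return -1;  /* code needs more than len bits */
         code_out[k] = (unsigned short) code++;
         size_out[k] = (unsigned char) len;
         ++k;
      }
      code <<= 1;   /* next length: append a zero bit */
   }
   return k;
}
```

// For two codes of length 2 and one of length 3, this assigns the bit
// patterns 00, 01 and 100.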
1571 
1572 // build a table that decodes both magnitude and value of small ACs in
1573 // one go.
static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1575 {
1576    int i;
1577    for (i=0; i < (1 << FAST_BITS); ++i) {
1578       stbi_uc fast = h->fast[i];
1579       fast_ac[i] = 0;
1580       if (fast < 255) {
1581          int rs = h->values[fast];
1582          int run = (rs >> 4) & 15;
1583          int magbits = rs & 15;
1584          int len = h->size[fast];
1585 
1586          if (magbits && len + magbits <= FAST_BITS) {
1587             // magnitude code followed by receive_extend code
1588             int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1589             int m = 1 << (magbits - 1);
            // values below the midpoint encode negatives; avoid shifting a negative int
            if (k < m) k -= (1 << magbits) - 1;
            // if the result is small enough, we can fit it in fast_ac table
            if (k >= -128 && k <= 127)
               fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits));
1594          }
1595       }
1596    }
1597 }
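// The sign handling inside stbi__build_fast_ac follows JPEG's rule for
// magnitude bits: an n-bit field whose value is below 2^(n-1) stands for a
// negative number. A minimal sketch of only that rule, with a hypothetical
// function name, using subtraction so that no negative value is ever shifted:

```c
#include <assert.h>

/* Map an n-bit magnitude field to its signed value.
   bits must lie in [0, 2^n); n is limited to 1..15 here. */
static int sketch_extend(int bits, int n)
{
   assert(n >= 1 && n <= 15);
   assert(bits >= 0 && bits < (1 << n));
   if (bits < (1 << (n - 1)))
      return bits - ((1 << n) - 1);   /* lower half: negative values */
   return bits;                       /* upper half: value as-is */
}
```

// With n = 3, the fields 000..011 map to -7..-4 and 100..111 map to 4..7,
// so each magnitude category covers a symmetric pair of ranges.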
1598 
static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1600 {
1601    do {
1602       int b = j->nomore ? 0 : stbi__get8(j->s);
1603       if (b == 0xff) {
1604          int c = stbi__get8(j->s);
1605          if (c != 0) {
1606             j->marker = (unsigned char) c;
1607             j->nomore = 1;
1608             return;
1609          }
1610       }
1611       j->code_buffer |= b << (24 - j->code_bits);
1612       j->code_bits += 8;
1613    } while (j->code_bits <= 24);
1614 }
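
/*
   Illustration only, not part of the library. A minimal sketch of an
   MSB-first bit buffer with JPEG byte unstuffing, in the spirit of
   stbi__grow_buffer_unsafe above. It reads from a memory buffer instead
   of a stbi__context and assumes a 32-bit unsigned int. The bit_reader
   type and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
   const unsigned char *data;  // entropy-coded bytes
   size_t size, pos;
   unsigned int code_buffer;   // next bits to decode sit at the MSB end
   int code_bits;              // number of valid bits in code_buffer
   int marker;                 // first marker found, or 0xff if none yet
} bit_reader;

int next_byte(bit_reader *r) { return r->pos < r->size ? r->data[r->pos++] : 0; }

void refill(bit_reader *r)
{
   do {
      int b = (r->marker != 0xff) ? 0 : next_byte(r);
      if (b == 0xff) {
         int c = next_byte(r);
         if (c != 0) { r->marker = c; return; } // real marker: stop feeding data
         // otherwise 0xff 0x00 stands for a single 0xff data byte
      }
      r->code_buffer |= (unsigned int) b << (24 - r->code_bits);
      r->code_bits += 8;
   } while (r->code_bits <= 24);
}

// take the next n bits (1 <= n <= 16) from the top of the buffer
unsigned int take_bits(bit_reader *r, int n)
{
   unsigned int v;
   if (r->code_bits < n) refill(r);
   v = r->code_buffer >> (32 - n);
   r->code_buffer <<= n;
   r->code_bits -= n;
   return v;
}

void bit_reader_demo(void)
{
   static const unsigned char d[] = { 0xA5, 0xFF, 0x00, 0x3C };
   bit_reader r = { d, sizeof d, 0, 0, 0, 0xff };
   assert(take_bits(&r, 4) == 0xA);
   assert(take_bits(&r, 8) == 0x5F);  // low nibble of 0xA5, high nibble of stuffed 0xFF
   assert(take_bits(&r, 4) == 0xF);
   assert(take_bits(&r, 8) == 0x3C);
}
```
*/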

// (1 << n) - 1
static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};

// decode a jpeg huffman value from the bitstream
stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
{
   unsigned int temp;
   int c,k;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   // look at the top FAST_BITS and determine what symbol ID it is,
   // if the code length is <= FAST_BITS
   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   k = h->fast[c];
   if (k < 255) {
      int s = h->size[k];
      if (s > j->code_bits)
         return -1;
      j->code_buffer <<= s;
      j->code_bits -= s;
      return h->values[k];
   }

   // naive test is to shift the code_buffer down so k bits are
   // valid, then test against maxcode. To speed this up, we've
   // preshifted maxcode left so that it has (16-k) 0s at the
   // end; in other words, regardless of the number of bits, it
   // wants to be compared against something shifted to have 16;
   // that way we don't need to shift inside the loop.
   temp = j->code_buffer >> 16;
   for (k=FAST_BITS+1 ; ; ++k)
      if (temp < h->maxcode[k])
         break;
   if (k == 17) {
      // error! code not found
      j->code_bits -= 16;
      return -1;
   }

   if (k > j->code_bits)
      return -1;

   // convert the huffman code to the symbol id
   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);

   // convert the id to a symbol
   j->code_bits -= k;
   j->code_buffer <<= k;
   return h->values[c];
}

// bias[n] = (-1<<n) + 1
static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};

// combined JPEG 'receive' and JPEG 'extend', since baseline
// always extends everything it receives.
stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
{
   unsigned int k;
   int sgn;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);

   sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   k = stbi_lrot(j->code_buffer, n);
   STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k + (stbi__jbias[n] & ~sgn);
}
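
/*
   Illustration only, not part of the library. A minimal sketch of the
   JPEG 'extend' step on its own, written with a plain comparison instead
   of the branch-free sign mask used in stbi__extend_receive above. The
   names jpeg_extend and jpeg_extend_demo are hypothetical.

```c
#include <assert.h>

// v holds the n raw magnitude bits read from the stream, 1 <= n <= 15
int jpeg_extend(int v, int n)
{
   int half = 1 << (n - 1);
   // a leading 0 bit marks a negative value: add the bias -(2^n - 1)
   return v < half ? v - ((1 << n) - 1) : v;
}

void jpeg_extend_demo(void)
{
   assert(jpeg_extend(0, 1) == -1);
   assert(jpeg_extend(1, 1) ==  1);
   assert(jpeg_extend(0, 2) == -3);
   assert(jpeg_extend(1, 2) == -2);
   assert(jpeg_extend(2, 2) ==  2);
   assert(jpeg_extend(3, 2) ==  3);
}
```

   The bias -(2^n - 1) is the value stored in stbi__jbias[n].
*/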

// get some unsigned bits
stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
{
   unsigned int k;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   k = stbi_lrot(j->code_buffer, n);
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k;
}

// get a single bit; the result is nonzero (not necessarily 1) if it is set
stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
{
   unsigned int k;
   if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   k = j->code_buffer;
   j->code_buffer <<= 1;
   --j->code_bits;
   return k & 0x80000000;
}

// given a value that's at position X in the zigzag stream,
// where does it appear in the 8x8 matrix coded as row-major?
static stbi_uc stbi__jpeg_dezigzag[64+15] =
{
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63,
   // let corrupt input sample past end
   63, 63, 63, 63, 63, 63, 63, 63,
   63, 63, 63, 63, 63, 63, 63
};
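
/*
   Illustration only, not part of the library. A minimal sketch that
   derives the first 64 entries of the table above by walking the
   anti-diagonals of an 8x8 block, which can be handy for checking the
   table by hand. The names build_dezigzag and dezigzag_demo are
   hypothetical.

```c
#include <assert.h>

// fill order[0..63] with the row-major index visited at each zigzag step
void build_dezigzag(unsigned char order[64])
{
   int n = 0, d, i;
   for (d = 0; d < 15; ++d) {       // d = row + col of the anti-diagonal
      int lo = d < 8 ? 0 : d - 7;   // smallest row index on this diagonal
      int hi = d < 8 ? d : 7;       // largest row index on this diagonal
      for (i = lo; i <= hi; ++i) {
         // odd diagonals run top-right to bottom-left, even ones the reverse
         int row = (d & 1) ? i : hi + lo - i;
         order[n++] = (unsigned char) (row * 8 + (d - row));
      }
   }
}

void dezigzag_demo(void)
{
   static const unsigned char first[8] = { 0, 1, 8, 16, 9, 2, 3, 10 };
   unsigned char order[64];
   int i;
   build_dezigzag(order);
   for (i = 0; i < 8; ++i)
      assert(order[i] == first[i]);
   assert(order[63] == 63);
}
```
*/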

// decode one 64-entry block (baseline: DC and all AC coefficients)
static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
{
   int diff,dc,k;
   int t;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   t = stbi__jpeg_huff_decode(j, hdc);
   if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");

   // 0 all the ac values now so we can do it 32-bits at a time
   memset(data,0,64*sizeof(data[0]));

   diff = t ? stbi__extend_receive(j, t) : 0;
   dc = j->img_comp[b].dc_pred + diff;
   j->img_comp[b].dc_pred = dc;
   data[0] = (short) (dc * dequant[0]);

   // decode AC components, see JPEG spec
   k = 1;
   do {
      unsigned int zig;
      int c,r,s;
      if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
      c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
      r = fac[c];
      if (r) { // fast-AC path
         k += (r >> 4) & 15; // run
         s = r & 15; // combined length
         j->code_buffer <<= s;
         j->code_bits -= s;
         // decode into unzigzag'd location
         zig = stbi__jpeg_dezigzag[k++];
         data[zig] = (short) ((r >> 8) * dequant[zig]);
      } else {
         int rs = stbi__jpeg_huff_decode(j, hac);
         if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
         s = rs & 15;
         r = rs >> 4;
         if (s == 0) {
            if (rs != 0xf0) break; // end block
            k += 16;
         } else {
            k += r;
            // decode into unzigzag'd location
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
         }
      }
   } while (k < 64);
   return 1;
}

// decode the DC coefficient of one block in a progressive scan
static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
{
   int diff,dc;
   int t;
   if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   if (j->succ_high == 0) {
      // first scan for DC coefficient, must be first
      memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
      t = stbi__jpeg_huff_decode(j, hdc);
      // a failed decode returns -1, which must not reach stbi__extend_receive
      if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
      diff = t ? stbi__extend_receive(j, t) : 0;

      dc = j->img_comp[b].dc_pred + diff;
      j->img_comp[b].dc_pred = dc;
      // multiply instead of shifting, since dc may be negative
      data[0] = (short) (dc * (1 << j->succ_low));
   } else {
      // refinement scan for DC coefficient
      if (stbi__jpeg_get_bit(j))
         data[0] += (short) (1 << j->succ_low);
   }
   return 1;
}

// decode the AC coefficients of one block in a progressive scan
// @OPTIMIZE: store non-zigzagged during the decode passes,
// and only de-zigzag when dequantizing
static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
{
   int k;
   if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->succ_high == 0) {
      int shift = j->succ_low;

      if (j->eob_run) {
         --j->eob_run;
         return 1;
      }

      k = j->spec_start;
      do {
         unsigned int zig;
         int c,r,s;
         if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
         c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
         r = fac[c];
         if (r) { // fast-AC path
            k += (r >> 4) & 15; // run
            s = r & 15; // combined length
            j->code_buffer <<= s;
            j->code_bits -= s;
            zig = stbi__jpeg_dezigzag[k++];
            // multiply instead of shifting, since the value may be negative
            data[zig] = (short) ((r >> 8) * (1 << shift));
         } else {
            int rs = stbi__jpeg_huff_decode(j, hac);
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r);
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  --j->eob_run;
                  break;
               }
               k += 16;
            } else {
               k += r;
               zig = stbi__jpeg_dezigzag[k++];
               data[zig] = (short) (stbi__extend_receive(j,s) * (1 << shift));
            }
         }
      } while (k <= j->spec_end);
   } else {
      // refinement scan for these AC coefficients

      short bit = (short) (1 << j->succ_low);

      if (j->eob_run) {
         --j->eob_run;
         for (k = j->spec_start; k <= j->spec_end; ++k) {
            short *p = &data[stbi__jpeg_dezigzag[k]];
            if (*p != 0)
               if (stbi__jpeg_get_bit(j))
                  if ((*p & bit)==0) {
                     if (*p > 0)
                        *p += bit;
                     else
                        *p -= bit;
                  }
         }
      } else {
         k = j->spec_start;
         do {
            int r,s;
            int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r) - 1;
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  r = 64; // force end of block
               } else {
                  // r=15 s=0 should write 16 0s, so we just do
                  // a run of 15 0s and then write s (which is 0),
                  // so we don't have to do anything special here
               }
            } else {
               if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
               // sign bit
               if (stbi__jpeg_get_bit(j))
                  s = bit;
               else
                  s = -bit;
            }

            // advance by r
            while (k <= j->spec_end) {
               short *p = &data[stbi__jpeg_dezigzag[k++]];
               if (*p != 0) {
                  if (stbi__jpeg_get_bit(j))
                     if ((*p & bit)==0) {
                        if (*p > 0)
                           *p += bit;
                        else
                           *p -= bit;
                     }
               } else {
                  if (r == 0) {
                     *p = (short) s;
                     break;
                  }
                  --r;
               }
            }
         } while (k <= j->spec_end);
      }
   }
   return 1;
}

// clamp an int to the 0..255 range and convert it to stbi_uc
// (callers add the +128 level shift before calling this)
stbi_inline static stbi_uc stbi__clamp(int x)
{
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (stbi_uc) x;
}
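
/*
   Illustration only, not part of the library. A minimal sketch of the
   single-comparison clamp used by stbi__clamp above: converting a
   negative int to unsigned yields a very large value, so one unsigned
   comparison catches both out-of-range cases. The names clamp_u8 and
   clamp_u8_demo are hypothetical.

```c
#include <assert.h>

unsigned char clamp_u8(int x)
{
   // true for x < 0 (wraps to a huge unsigned value) and for x > 255
   if ((unsigned int) x > 255)
      return x < 0 ? 0 : 255;
   return (unsigned char) x;
}

void clamp_u8_demo(void)
{
   assert(clamp_u8(-5)  == 0);
   assert(clamp_u8(0)   == 0);
   assert(clamp_u8(128) == 128);
   assert(clamp_u8(255) == 255);
   assert(clamp_u8(300) == 255);
}
```

   In-range values, the common case, take only the one comparison.
*/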

// fixed-point helpers: 12 fractional bits. stbi__fsh multiplies rather than
// shifts because its argument may be negative.
#define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
#define stbi__fsh(x)  ((x) * 4096)

// derived from jidctint -- DCT_ISLOW
#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   p2 = s2;                                    \
   p3 = s6;                                    \
   p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
   t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
   t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
   p2 = s0;                                    \
   p3 = s4;                                    \
   t0 = stbi__fsh(p2+p3);                      \
   t1 = stbi__fsh(p2-p3);                      \
   x0 = t0+t3;                                 \
   x3 = t0-t3;                                 \
   x1 = t1+t2;                                 \
   x2 = t1-t2;                                 \
   t0 = s7;                                    \
   t1 = s5;                                    \
   t2 = s3;                                    \
   t3 = s1;                                    \
   p3 = t0+t2;                                 \
   p4 = t1+t3;                                 \
   p1 = t0+t3;                                 \
   p2 = t1+t2;                                 \
   p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
   t0 = t0*stbi__f2f( 0.298631336f);           \
   t1 = t1*stbi__f2f( 2.053119869f);           \
   t2 = t2*stbi__f2f( 3.072711026f);           \
   t3 = t3*stbi__f2f( 1.501321110f);           \
   p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
   p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
   p3 = p3*stbi__f2f(-1.961570560f);           \
   p4 = p4*stbi__f2f(-0.390180644f);           \
   t3 += p1+p4;                                \
   t2 += p2+p3;                                \
   t1 += p2+p4;                                \
   t0 += p1+p3;
// generic integer IDCT of one 8x8 block; writes 8 rows of 8 samples to out
static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
{
   int i,val[64],*v=val;
   stbi_uc *o;
   short *d = data;

   // columns
   for (i=0; i < 8; ++i,++d, ++v) {
      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
           && d[40]==0 && d[48]==0 && d[56]==0) {
         //    no shortcut                 0     seconds
         //    (1|2|3|4|5|6|7)==0          0     seconds
         //    all separate               -0.047 seconds
         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
         int dcterm = d[0]*4; // multiply, not shift: d[0] may be negative
         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
      } else {
         STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
         // constants scaled things up by 1<<12; let's bring them back
         // down, but keep 2 extra bits of precision
         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
         v[ 0] = (x0+t3) >> 10;
         v[56] = (x0-t3) >> 10;
         v[ 8] = (x1+t2) >> 10;
         v[48] = (x1-t2) >> 10;
         v[16] = (x2+t1) >> 10;
         v[40] = (x2-t1) >> 10;
         v[24] = (x3+t0) >> 10;
         v[32] = (x3-t0) >> 10;
      }
   }

   // rows
   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
      // no fast case since the first 1D IDCT spread components out
      STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
      // constants scaled things up by 1<<12, plus we had 1<<2 from first
      // loop, plus horizontal and vertical each scale by sqrt(8) so together
      // we've got an extra 1<<3, so 1<<17 total we need to remove.
      // so we want to round that, which means adding 0.5 * 1<<17,
      // aka 65536. Also, we'll end up with -128 to 127 that we want
      // to encode as 0..255 by adding 128, so we'll add that before the shift
      x0 += 65536 + (128<<17);
      x1 += 65536 + (128<<17);
      x2 += 65536 + (128<<17);
      x3 += 65536 + (128<<17);
      // tried computing the shifts into temps, or'ing the temps to see
      // if any were out of range, but that was slower
      o[0] = stbi__clamp((x0+t3) >> 17);
      o[7] = stbi__clamp((x0-t3) >> 17);
      o[1] = stbi__clamp((x1+t2) >> 17);
      o[6] = stbi__clamp((x1-t2) >> 17);
      o[2] = stbi__clamp((x2+t1) >> 17);
      o[5] = stbi__clamp((x2-t1) >> 17);
      o[3] = stbi__clamp((x3+t0) >> 17);
      o[4] = stbi__clamp((x3-t0) >> 17);
   }
}
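
/*
   Illustration only, not part of the library. A minimal sketch of the
   scaling arithmetic in stbi__idct_block above for the simplest input, a
   block whose only nonzero coefficient is the (already dequantized) DC
   term. In that case the column pass takes its shortcut, the odd part of
   the row pass is zero, and every output sample has the same value. The
   names dc_only_pixel and dc_only_demo are hypothetical.

```c
#include <assert.h>

// sample value produced for a DC-only block; assumes x stays non-negative
// or that >> on a negative int is arithmetic, as the library itself does
int dc_only_pixel(int dc)
{
   int v = dc * 4;            // column pass shortcut keeps 2 extra bits
   int x = v * 4096;          // row pass: stbi__fsh() of the lone term
   x += 65536 + (128 << 17);  // rounding bias plus the +128 level shift
   x >>= 17;                  // remove the accumulated 1<<17 scale
   return x < 0 ? 0 : (x > 255 ? 255 : x);
}

void dc_only_demo(void)
{
   assert(dc_only_pixel(0) == 128);      // zero DC is mid-gray
   assert(dc_only_pixel(8) == 129);      // 8 units of DC equal one gray level
   assert(dc_only_pixel(1016) == 255);
   assert(dc_only_pixel(-1024) == 0);
}
```

   In other words the result is 128 + dc/8, rounded to nearest and clamped
   to 0..255.
*/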

#ifdef STBI_SSE2
// sse2 integer IDCT. not the fastest possible implementation but it
// produces bit-identical results to the generic C version so it's
// fully "transparent".
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
{
   // This is constructed to match our regular (generic) integer IDCT exactly.
   __m128i row0, row1, row2, row3, row4, row5, row6, row7;
   __m128i tmp;

   // dot product constant: even elems=x, odd elems=y
   #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))

   // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   // out(1) = c1[even]*x + c1[odd]*y
   #define dct_rot(out0,out1, x,y,c0,c1) \
      __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
      __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
      __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
      __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
      __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
      __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)

   // out = in << 12  (in 16-bit, out 32-bit)
   #define dct_widen(out, in) \
      __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
      __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)

   // wide add
   #define dct_wadd(out, a, b) \
      __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_add_epi32(a##_h, b##_h)

   // wide sub
   #define dct_wsub(out, a, b) \
      __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)

   // butterfly a/b, add bias, then shift by "s" and pack
   #define dct_bfly32o(out0, out1, a,b,bias,s) \
      { \
         __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
         __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
         dct_wadd(sum, abiased, b); \
         dct_wsub(dif, abiased, b); \
         out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
         out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
      }

   // 8-bit interleave step (for transposes)
   #define dct_interleave8(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi8(a, b); \
      b = _mm_unpackhi_epi8(tmp, b)

   // 16-bit interleave step (for transposes)
   #define dct_interleave16(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi16(a, b); \
      b = _mm_unpackhi_epi16(tmp, b)

   #define dct_pass(bias,shift) \
      { \
         /* even part */ \
         dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
         __m128i sum04 = _mm_add_epi16(row0, row4); \
         __m128i dif04 = _mm_sub_epi16(row0, row4); \
         dct_widen(t0e, sum04); \
         dct_widen(t1e, dif04); \
         dct_wadd(x0, t0e, t3e); \
         dct_wsub(x3, t0e, t3e); \
         dct_wadd(x1, t1e, t2e); \
         dct_wsub(x2, t1e, t2e); \
         /* odd part */ \
         dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
         dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
         __m128i sum17 = _mm_add_epi16(row1, row7); \
         __m128i sum35 = _mm_add_epi16(row3, row5); \
         dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
         dct_wadd(x4, y0o, y4o); \
         dct_wadd(x5, y1o, y5o); \
         dct_wadd(x6, y2o, y5o); \
         dct_wadd(x7, y3o, y4o); \
         dct_bfly32o(row0,row7, x0,x7,bias,shift); \
         dct_bfly32o(row1,row6, x1,x6,bias,shift); \
         dct_bfly32o(row2,row5, x2,x5,bias,shift); \
         dct_bfly32o(row3,row4, x3,x4,bias,shift); \
      }

   __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
   __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
   __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
   __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
   __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
   __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
   __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
   __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));

   // rounding biases in column/row passes, see stbi__idct_block for explanation.
   __m128i bias_0 = _mm_set1_epi32(512);
   __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));

   // load
   row0 = _mm_load_si128((const __m128i *) (data + 0*8));
   row1 = _mm_load_si128((const __m128i *) (data + 1*8));
   row2 = _mm_load_si128((const __m128i *) (data + 2*8));
   row3 = _mm_load_si128((const __m128i *) (data + 3*8));
   row4 = _mm_load_si128((const __m128i *) (data + 4*8));
   row5 = _mm_load_si128((const __m128i *) (data + 5*8));
   row6 = _mm_load_si128((const __m128i *) (data + 6*8));
   row7 = _mm_load_si128((const __m128i *) (data + 7*8));

   // column pass
   dct_pass(bias_0, 10);

   {
      // 16bit 8x8 transpose pass 1
      dct_interleave16(row0, row4);
      dct_interleave16(row1, row5);
      dct_interleave16(row2, row6);
      dct_interleave16(row3, row7);

      // transpose pass 2
      dct_interleave16(row0, row2);
      dct_interleave16(row1, row3);
      dct_interleave16(row4, row6);
      dct_interleave16(row5, row7);

      // transpose pass 3
      dct_interleave16(row0, row1);
      dct_interleave16(row2, row3);
      dct_interleave16(row4, row5);
      dct_interleave16(row6, row7);
   }

   // row pass
   dct_pass(bias_1, 17);

   {
      // pack
      __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
      __m128i p1 = _mm_packus_epi16(row2, row3);
      __m128i p2 = _mm_packus_epi16(row4, row5);
      __m128i p3 = _mm_packus_epi16(row6, row7);

      // 8bit 8x8 transpose pass 1
      dct_interleave8(p0, p2); // a0e0a1e1...
      dct_interleave8(p1, p3); // c0g0c1g1...

      // transpose pass 2
      dct_interleave8(p0, p1); // a0c0e0g0...
      dct_interleave8(p2, p3); // b0d0f0h0...

      // transpose pass 3
      dct_interleave8(p0, p2); // a0b0c0d0...
      dct_interleave8(p1, p3); // a4b4c4d4...

      // store
      _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
   }

#undef dct_const
#undef dct_rot
#undef dct_widen
#undef dct_wadd
#undef dct_wsub
#undef dct_bfly32o
#undef dct_interleave8
#undef dct_interleave16
#undef dct_pass
}

#endif // STBI_SSE2

#ifdef STBI_NEON

// NEON integer IDCT. should produce bit-identical
// results to the generic C version.
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
{
   int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;

   int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));

#define dct_long_mul(out, inq, coeff) \
   int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)

#define dct_long_mac(out, acc, inq, coeff) \
   int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)

#define dct_widen(out, inq) \
   int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)

// wide add
#define dct_wadd(out, a, b) \
   int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vaddq_s32(a##_h, b##_h)

// wide sub
#define dct_wsub(out, a, b) \
   int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vsubq_s32(a##_h, b##_h)

// butterfly a/b, then shift using "shiftop" by "s" and pack
#define dct_bfly32o(out0,out1, a,b,shiftop,s) \
   { \
      dct_wadd(sum, a, b); \
      dct_wsub(dif, a, b); \
      out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
      out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
   }

#define dct_pass(shiftop, shift) \
   { \
      /* even part */ \
      int16x8_t sum26 = vaddq_s16(row2, row6); \
      dct_long_mul(p1e, sum26, rot0_0); \
      dct_long_mac(t2e, p1e, row6, rot0_1); \
      dct_long_mac(t3e, p1e, row2, rot0_2); \
      int16x8_t sum04 = vaddq_s16(row0, row4); \
      int16x8_t dif04 = vsubq_s16(row0, row4); \
      dct_widen(t0e, sum04); \
      dct_widen(t1e, dif04); \
      dct_wadd(x0, t0e, t3e); \
      dct_wsub(x3, t0e, t3e); \
      dct_wadd(x1, t1e, t2e); \
      dct_wsub(x2, t1e, t2e); \
      /* odd part */ \
      int16x8_t sum15 = vaddq_s16(row1, row5); \
      int16x8_t sum17 = vaddq_s16(row1, row7); \
      int16x8_t sum35 = vaddq_s16(row3, row5); \
      int16x8_t sum37 = vaddq_s16(row3, row7); \
      int16x8_t sumodd = vaddq_s16(sum17, sum35); \
      dct_long_mul(p5o, sumodd, rot1_0); \
      dct_long_mac(p1o, p5o, sum17, rot1_1); \
      dct_long_mac(p2o, p5o, sum35, rot1_2); \
      dct_long_mul(p3o, sum37, rot2_0); \
      dct_long_mul(p4o, sum15, rot2_1); \
      dct_wadd(sump13o, p1o, p3o); \
      dct_wadd(sump24o, p2o, p4o); \
      dct_wadd(sump23o, p2o, p3o); \
      dct_wadd(sump14o, p1o, p4o); \
      dct_long_mac(x4, sump13o, row7, rot3_0); \
      dct_long_mac(x5, sump24o, row5, rot3_1); \
      dct_long_mac(x6, sump23o, row3, rot3_2); \
      dct_long_mac(x7, sump14o, row1, rot3_3); \
      dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
      dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
      dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
      dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
   }

   // load
   row0 = vld1q_s16(data + 0*8);
   row1 = vld1q_s16(data + 1*8);
   row2 = vld1q_s16(data + 2*8);
   row3 = vld1q_s16(data + 3*8);
   row4 = vld1q_s16(data + 4*8);
   row5 = vld1q_s16(data + 5*8);
   row6 = vld1q_s16(data + 6*8);
   row7 = vld1q_s16(data + 7*8);

   // add DC bias
   row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));

   // column pass
   dct_pass(vrshrn_n_s32, 10);

   // 16bit 8x8 transpose
   {
// these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
// whether compilers actually get this is another story, sadly.
#define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
#define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }

      // pass 1
      dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
      dct_trn16(row2, row3);
      dct_trn16(row4, row5);
      dct_trn16(row6, row7);

      // pass 2
      dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
      dct_trn32(row1, row3);
      dct_trn32(row4, row6);
      dct_trn32(row5, row7);

      // pass 3
      dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
      dct_trn64(row1, row5);
      dct_trn64(row2, row6);
      dct_trn64(row3, row7);

#undef dct_trn16
#undef dct_trn32
#undef dct_trn64
   }

   // row pass
   // vrshrn_n_s32 only supports shifts up to 16, we need
   // 17. so do a non-rounding shift of 16 first then follow
   // up with a rounding shift by 1.
   dct_pass(vshrn_n_s32, 16);

   {
      // pack and round
      uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
      uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
      uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
      uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
      uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
      uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
      uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
      uint8x8_t p7 = vqrshrun_n_s16(row7, 1);

      // again, these can translate into one instruction, but often don't.
#define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
#define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }

      // sadly can't use interleaved stores here since we only write
      // 8 bytes to each scan line!

      // 8x8 8-bit transpose pass 1
      dct_trn8_8(p0, p1);
      dct_trn8_8(p2, p3);
      dct_trn8_8(p4, p5);
      dct_trn8_8(p6, p7);

      // pass 2
      dct_trn8_16(p0, p2);
      dct_trn8_16(p1, p3);
      dct_trn8_16(p4, p6);
      dct_trn8_16(p5, p7);

      // pass 3
      dct_trn8_32(p0, p4);
      dct_trn8_32(p1, p5);
      dct_trn8_32(p2, p6);
      dct_trn8_32(p3, p7);

      // store
      vst1_u8(out, p0); out += out_stride;
      vst1_u8(out, p1); out += out_stride;
      vst1_u8(out, p2); out += out_stride;
      vst1_u8(out, p3); out += out_stride;
      vst1_u8(out, p4); out += out_stride;
      vst1_u8(out, p5); out += out_stride;
      vst1_u8(out, p6); out += out_stride;
      vst1_u8(out, p7);

#undef dct_trn8_8
#undef dct_trn8_16
#undef dct_trn8_32
   }

#undef dct_long_mul
#undef dct_long_mac
#undef dct_widen
#undef dct_wadd
#undef dct_wsub
#undef dct_bfly32o
#undef dct_pass
}

#endif // STBI_NEON
2426 
#define STBI__MARKER_none  0xff
// if there's a pending marker from the entropy stream, return that
// otherwise, fetch from the stream and get a marker. if there's no
// marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
{
   stbi_uc x;
   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   x = stbi__get8(j->s);
   if (x != 0xff) return STBI__MARKER_none;
   while (x == 0xff)
      x = stbi__get8(j->s);
   return x;
}

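/*
   ILLUSTRATIVE SKETCH (not part of stb_image):
   The marker scan above can be tried out on a plain byte array. The sketch
   below is a minimal stand-in under stated assumptions: demo_get_marker and
   DEMO_MARKER_NONE are hypothetical names, there is no cached pending
   marker, and running out of data yields 0xff here instead of relying on
   the stream reader.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MARKER_NONE 0xff

// Hypothetical helper: read one marker from buf starting at *pos.
// Follows the same steps as the marker fetch above, minus the cached marker.
static unsigned char demo_get_marker(const unsigned char *buf, size_t len, size_t *pos)
{
   unsigned char x;
   assert(buf != NULL && pos != NULL);
   if (*pos >= len) return DEMO_MARKER_NONE;   // nothing left to read
   x = buf[(*pos)++];
   if (x != 0xff) return DEMO_MARKER_NONE;     // not positioned at a marker
   while (x == 0xff && *pos < len)             // skip 0xff fill bytes
      x = buf[(*pos)++];
   return x;                                   // still 0xff if data ran out
}
```

   Feeding it the bytes ff ff d8 returns 0xd8 (SOI) and leaves the position
   just past the marker byte; the doubled 0xff is treated as fill.
*/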
// in each scan, we'll have scan_n components, and the order
// of the components is specified by order[]
// restart markers RST0..RST7 occupy 0xd0..0xd7
#define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)

// after a restart interval, reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
{
   j->code_bits = 0;
   j->code_buffer = 0;
   j->nomore = 0;
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   j->eob_run = 0;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
}

static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
{
   stbi__jpeg_reset(z);
   if (!z->progressive) {
      if (z->scan_n == 1) {
         int i,j;
         STBI_SIMD_ALIGN(short, data[64]);
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               int ha = z->img_comp[n].ha;
               if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
               // every data block is an MCU, so count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  // if it's NOT a restart, then just bail, so we get corrupt data
                  // rather than no data
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         STBI_SIMD_ALIGN(short, data[64]);
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x)*8;
                        int y2 = (j*z->img_comp[n].v + y)*8;
                        int ha = z->img_comp[n].ha;
                        if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
                        z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   } else {
      if (z->scan_n == 1) {
         int i,j;
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               if (z->spec_start == 0) {
                  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                     return 0;
               } else {
                  int ha = z->img_comp[n].ha;
                  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
                     return 0;
               }
               // every data block is an MCU, so count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x);
                        int y2 = (j*z->img_comp[n].v + y);
                        short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
                        if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                           return 0;
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   }
}

static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
{
   int i;
   for (i=0; i < 64; ++i)
      data[i] *= dequant[i];
}

static void stbi__jpeg_finish(stbi__jpeg *z)
{
   if (z->progressive) {
      // dequantize and idct the data
      int i,j,n;
      for (n=0; n < z->s->img_n; ++n) {
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
            }
         }
      }
   }
}

static int stbi__process_marker(stbi__jpeg *z, int m)
{
   int L;
   switch (m) {
      case STBI__MARKER_none: // no marker found
         return stbi__err("expected marker","Corrupt JPEG");

      case 0xDD: // DRI - specify restart interval
         if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
         z->restart_interval = stbi__get16be(z->s);
         return 1;

      case 0xDB: // DQT - define quantization table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            int q = stbi__get8(z->s);
            int p = q >> 4;
            int t = q & 15,i;
            if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
            if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
            for (i=0; i < 64; ++i)
               z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
            L -= 65;
         }
         return L==0;

      case 0xC4: // DHT - define huffman table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            stbi_uc *v;
            int sizes[16],i,n=0;
            int q = stbi__get8(z->s);
            int tc = q >> 4;
            int th = q & 15;
            if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
            for (i=0; i < 16; ++i) {
               sizes[i] = stbi__get8(z->s);
               n += sizes[i];
            }
            L -= 17;
            if (tc == 0) {
               if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
               v = z->huff_dc[th].values;
            } else {
               if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
               v = z->huff_ac[th].values;
            }
            for (i=0; i < n; ++i)
               v[i] = stbi__get8(z->s);
            if (tc != 0)
               stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
            L -= n;
         }
         return L==0;
   }
   // check for comment block or APP blocks
   if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
      stbi__skip(z->s, stbi__get16be(z->s)-2);
      return 1;
   }
   return 0;
}

// after we see SOS
static int stbi__process_scan_header(stbi__jpeg *z)
{
   int i;
   int Ls = stbi__get16be(z->s);
   z->scan_n = stbi__get8(z->s);
   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   for (i=0; i < z->scan_n; ++i) {
      int id = stbi__get8(z->s), which;
      int q = stbi__get8(z->s);
      for (which = 0; which < z->s->img_n; ++which)
         if (z->img_comp[which].id == id)
            break;
      if (which == z->s->img_n) return 0; // no match
      z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
      z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
      z->order[i] = which;
   }

   {
      int aa;
      z->spec_start = stbi__get8(z->s);
      z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
      aa = stbi__get8(z->s);
      z->succ_high = (aa >> 4);
      z->succ_low  = (aa & 15);
      if (z->progressive) {
         if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
            return stbi__err("bad SOS", "Corrupt JPEG");
      } else {
         if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
         if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
         z->spec_end = 63;
      }
   }

   return 1;
}

static int stbi__process_frame_header(stbi__jpeg *z, int scan)
{
   stbi__context *s = z->s;
   int Lf,p,i,q, h_max=1,v_max=1,c;
   Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   c = stbi__get8(s);
   if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG");    // JFIF requires
   s->img_n = c;
   for (i=0; i < c; ++i) {
      z->img_comp[i].data = NULL;
      z->img_comp[i].linebuf = NULL;
   }

   if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");

   z->rgb = 0;
   for (i=0; i < s->img_n; ++i) {
      static unsigned char rgb[3] = { 'R', 'G', 'B' };
      z->img_comp[i].id = stbi__get8(s);
      if (z->img_comp[i].id != i+1)   // JFIF requires
         if (z->img_comp[i].id != i) {  // some version of jpegtran outputs non-JFIF-compliant files!
            // some things output this (see http://fileformats.archiveteam.org/wiki/JPEG#Color_format)
            if (z->img_comp[i].id != rgb[i])
               return stbi__err("bad component ID","Corrupt JPEG");
            ++z->rgb;
         }
      q = stbi__get8(s);
      z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
      z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
      z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   }

   if (scan != STBI__SCAN_load) return 1;

   if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");

   for (i=0; i < s->img_n; ++i) {
      if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
      if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   }

   // compute interleaved mcu info
   z->img_h_max = h_max;
   z->img_v_max = v_max;
   z->img_mcu_w = h_max * 8;
   z->img_mcu_h = v_max * 8;
   z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;

   for (i=0; i < s->img_n; ++i) {
      // number of effective pixels (e.g. for non-interleaved MCU)
      z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
      z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
      // to simplify generation, we'll allocate enough memory to decode
      // the bogus oversized data from using interleaved MCUs and their
      // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
      // discard the extra data until colorspace conversion
      z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
      z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
      z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);

      if (z->img_comp[i].raw_data == NULL) {
         for(--i; i >= 0; --i) {
            STBI_FREE(z->img_comp[i].raw_data);
            z->img_comp[i].raw_data = NULL;
         }
         return stbi__err("outofmem", "Out of memory");
      }
      // align blocks for idct using mmx/sse
      z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
      z->img_comp[i].linebuf = NULL;
      if (z->progressive) {
         z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
         z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
         z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
         z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
      } else {
         z->img_comp[i].coeff = 0;
         z->img_comp[i].raw_coeff = 0;
      }
   }

   return 1;
}

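/*
   ILLUSTRATIVE SKETCH (not part of stb_image):
   The MCU arithmetic in the frame header is easy to check by hand for the
   case its comment mentions, a 16x16 iMCU on an image of width 33. The
   helpers below restate the same formulas in isolation; demo_mcu_layout,
   demo_mcu_grid, demo_padded_width and demo_effective_width are
   hypothetical names introduced only for this example.

```c
#include <assert.h>

typedef struct { int mcu_w, mcu_h, mcu_x, mcu_y; } demo_mcu_layout;

// MCU size in pixels and MCU count per axis, rounding the count up.
static demo_mcu_layout demo_mcu_grid(int img_x, int img_y, int h_max, int v_max)
{
   demo_mcu_layout m;
   assert(img_x > 0 && img_y > 0 && h_max > 0 && v_max > 0);
   m.mcu_w = h_max * 8;
   m.mcu_h = v_max * 8;
   m.mcu_x = (img_x + m.mcu_w - 1) / m.mcu_w;
   m.mcu_y = (img_y + m.mcu_h - 1) / m.mcu_h;
   return m;
}

// Row width of the oversized decode buffer for a component with sampling factor h.
static int demo_padded_width(int mcu_x, int h)
{
   return mcu_x * h * 8;
}

// Number of effective pixels per row for a component with sampling factor h.
static int demo_effective_width(int img_x, int h, int h_max)
{
   return (img_x * h + h_max - 1) / h_max;
}
```

   For a 33x20 image with h_max = v_max = 2, the grid is 3x2 MCUs of 16x16
   pixels. A component with h = 2 gets a 48-sample padded row for 33
   effective samples; a component with h = 1 gets 24 padded for 17 effective.
*/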
// use comparisons since in some cases we handle more than one case (e.g. SOF)
#define stbi__DNL(x)         ((x) == 0xdc)
#define stbi__SOI(x)         ((x) == 0xd8)
#define stbi__EOI(x)         ((x) == 0xd9)
#define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
#define stbi__SOS(x)         ((x) == 0xda)

#define stbi__SOF_progressive(x)   ((x) == 0xc2)

static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
{
   int m;
   z->marker = STBI__MARKER_none; // initialize cached marker to empty
   m = stbi__get_marker(z);
   if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   if (scan == STBI__SCAN_type) return 1;
   m = stbi__get_marker(z);
   while (!stbi__SOF(m)) {
      if (!stbi__process_marker(z,m)) return 0;
      m = stbi__get_marker(z);
      while (m == STBI__MARKER_none) {
         // some files have extra padding after their blocks, so ok, we'll scan
         if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
         m = stbi__get_marker(z);
      }
   }
   z->progressive = stbi__SOF_progressive(m);
   if (!stbi__process_frame_header(z, scan)) return 0;
   return 1;
}

// decode image to YCbCr format
static int stbi__decode_jpeg_image(stbi__jpeg *j)
{
   int m;
   for (m = 0; m < 4; m++) {
      j->img_comp[m].raw_data = NULL;
      j->img_comp[m].raw_coeff = NULL;
   }
   j->restart_interval = 0;
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   m = stbi__get_marker(j);
   while (!stbi__EOI(m)) {
      if (stbi__SOS(m)) {
         if (!stbi__process_scan_header(j)) return 0;
         if (!stbi__parse_entropy_coded_data(j)) return 0;
         if (j->marker == STBI__MARKER_none ) {
            // handle 0s at the end of image data from IP Kamera 9060
            while (!stbi__at_eof(j->s)) {
               int x = stbi__get8(j->s);
               if (x == 255) {
                  j->marker = stbi__get8(j->s);
                  break;
               } else if (x != 0) {
                  return stbi__err("junk before marker", "Corrupt JPEG");
               }
            }
            // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
         }
      } else {
         if (!stbi__process_marker(j, m)) return 0;
      }
      m = stbi__get_marker(j);
   }
   if (j->progressive)
      stbi__jpeg_finish(j);
   return 1;
}

// static jfif-centered resampling (across block boundaries)

typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
                                    int w, int hs);

#define stbi__div4(x) ((stbi_uc) ((x) >> 2))

static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   STBI_NOTUSED(out);
   STBI_NOTUSED(in_far);
   STBI_NOTUSED(w);
   STBI_NOTUSED(hs);
   return in_near;
}

static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples vertically for every one in input
   int i;
   STBI_NOTUSED(hs);
   for (i=0; i < w; ++i)
      out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   return out;
}

static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples horizontally for every one in input
   int i;
   stbi_uc *input = in_near;

   if (w == 1) {
      // if only one sample, can't do any interpolation
      out[0] = out[1] = input[0];
      return out;
   }

   out[0] = input[0];
   out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   for (i=1; i < w-1; ++i) {
      int n = 3*input[i]+2;
      out[i*2+0] = stbi__div4(n+input[i-1]);
      out[i*2+1] = stbi__div4(n+input[i+1]);
   }
   out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   out[i*2+1] = input[w-1];

   STBI_NOTUSED(in_far);
   STBI_NOTUSED(hs);

   return out;
}

#define stbi__div16(x) ((stbi_uc) ((x) >> 4))

static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i,t0,t1;
   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   out[0] = stbi__div4(t1+2);
   for (i=1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}

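/*
   ILLUSTRATIVE SKETCH (not part of stb_image):
   The 2x2 upsampler above first blends the two input rows 3:1 vertically
   (t = 3*near + far), then blends neighbouring t values 3:1 horizontally,
   so interior outputs are divided by 16 and the two edge outputs by 4.
   The standalone copy below uses a hypothetical name, demo_resample_hv_2,
   and plain unsigned char so it can be compiled and poked at on its own.

```c
#include <assert.h>

// Hypothetical standalone copy of the scalar 2x2 upsampling row filter.
// out must have room for 2*w samples.
static void demo_resample_hv_2(unsigned char *out, const unsigned char *in_near,
                               const unsigned char *in_far, int w)
{
   int i, t0, t1;
   assert(w >= 1);
   if (w == 1) {
      out[0] = out[1] = (unsigned char) ((3*in_near[0] + in_far[0] + 2) >> 2);
      return;
   }
   t1 = 3*in_near[0] + in_far[0];               // vertical blend, scaled by 4
   out[0] = (unsigned char) ((t1 + 2) >> 2);    // left edge: no horizontal neighbour
   for (i = 1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i] + in_far[i];
      out[i*2-1] = (unsigned char) ((3*t0 + t1 + 8) >> 4);  // closer to sample i-1
      out[i*2  ] = (unsigned char) ((3*t1 + t0 + 8) >> 4);  // closer to sample i
   }
   out[w*2-1] = (unsigned char) ((t1 + 2) >> 2);  // right edge
}
```

   With both rows equal to {100, 200}, the four outputs are
   100, 125, 175, 200: the edges pass through and the interior samples sit
   one quarter and three quarters of the way between the inputs.
*/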
#if defined(STBI_SSE2) || defined(STBI_NEON)
static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i=0,t0,t1;

   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   // process groups of 8 pixels for as long as we can.
   // note we can't handle the last pixel in a row in this loop
   // because we need to handle the filter boundary conditions.
   for (; i < ((w-1) & ~7); i += 8) {
#if defined(STBI_SSE2)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      __m128i zero  = _mm_setzero_si128();
      __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
      __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
      __m128i farw  = _mm_unpacklo_epi8(farb, zero);
      __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
      __m128i diff  = _mm_sub_epi16(farw, nearw);
      __m128i nears = _mm_slli_epi16(nearw, 2);
      __m128i curr  = _mm_add_epi16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      __m128i prv0 = _mm_slli_si128(curr, 2);
      __m128i nxt0 = _mm_srli_si128(curr, 2);
      __m128i prev = _mm_insert_epi16(prv0, t1, 0);
      __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      __m128i bias  = _mm_set1_epi16(8);
      __m128i curs = _mm_slli_epi16(curr, 2);
      __m128i prvd = _mm_sub_epi16(prev, curr);
      __m128i nxtd = _mm_sub_epi16(next, curr);
      __m128i curb = _mm_add_epi16(curs, bias);
      __m128i even = _mm_add_epi16(prvd, curb);
      __m128i odd  = _mm_add_epi16(nxtd, curb);

      // interleave even and odd pixels, then undo scaling.
      __m128i int0 = _mm_unpacklo_epi16(even, odd);
      __m128i int1 = _mm_unpackhi_epi16(even, odd);
      __m128i de0  = _mm_srli_epi16(int0, 4);
      __m128i de1  = _mm_srli_epi16(int1, 4);

      // pack and write output
      __m128i outv = _mm_packus_epi16(de0, de1);
      _mm_storeu_si128((__m128i *) (out + i*2), outv);
#elif defined(STBI_NEON)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      uint8x8_t farb  = vld1_u8(in_far + i);
      uint8x8_t nearb = vld1_u8(in_near + i);
      int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
      int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
      int16x8_t curr  = vaddq_s16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      int16x8_t prv0 = vextq_s16(curr, curr, 7);
      int16x8_t nxt0 = vextq_s16(curr, curr, 1);
      int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
      int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      int16x8_t curs = vshlq_n_s16(curr, 2);
      int16x8_t prvd = vsubq_s16(prev, curr);
      int16x8_t nxtd = vsubq_s16(next, curr);
      int16x8_t even = vaddq_s16(curs, prvd);
      int16x8_t odd  = vaddq_s16(curs, nxtd);

      // undo scaling and round, then store with even/odd phases interleaved
      uint8x8x2_t o;
      o.val[0] = vqrshrun_n_s16(even, 4);
      o.val[1] = vqrshrun_n_s16(odd,  4);
      vst2_u8(out + i*2, o);
#endif

      // "previous" value for next iter
      t1 = 3*in_near[i+7] + in_far[i+7];
   }

   t0 = t1;
   t1 = 3*in_near[i] + in_far[i];
   out[i*2] = stbi__div16(3*t1 + t0 + 8);

   for (++i; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}
#endif

static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // resample with nearest-neighbor
   int i,j;
   STBI_NOTUSED(in_far);
   for (i=0; i < w; ++i)
      for (j=0; j < hs; ++j)
         out[i*hs+j] = in_near[i];
   return out;
}

3079 #ifdef STBI_JPEG_OLD
3080 // this is the same YCbCr-to-RGB calculation that stb_image has used
3081 // historically before the algorithm changes in 1.49
3082 #define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3083 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3084 {
3085    int i;
3086    for (i=0; i < count; ++i) {
3087       int y_fixed = (y[i] << 16) + 32768; // rounding
3088       int r,g,b;
3089       int cr = pcr[i] - 128;
3090       int cb = pcb[i] - 128;
3091       r = y_fixed + cr*float2fixed(1.40200f);
3092       g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3093       b = y_fixed                            + cb*float2fixed(1.77200f);
3094       r >>= 16;
3095       g >>= 16;
3096       b >>= 16;
3097       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3098       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3099       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3100       out[0] = (stbi_uc)r;
3101       out[1] = (stbi_uc)g;
3102       out[2] = (stbi_uc)b;
3103       out[3] = 255;
3104       out += step;
3105    }
3106 }
3107 #else
3108 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3109 // to make sure the code produces the same results in both SIMD and scalar
3110 #define float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3112 {
3113    int i;
3114    for (i=0; i < count; ++i) {
3115       int y_fixed = (y[i] << 20) + (1<<19); // rounding
3116       int r,g,b;
3117       int cr = pcr[i] - 128;
3118       int cb = pcb[i] - 128;
3119       r = y_fixed +  cr* float2fixed(1.40200f);
3120       g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3121       b = y_fixed                               +   cb* float2fixed(1.77200f);
3122       r >>= 20;
3123       g >>= 20;
3124       b >>= 20;
3125       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3126       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3127       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3128       out[0] = (stbi_uc)r;
3129       out[1] = (stbi_uc)g;
3130       out[2] = (stbi_uc)b;
3131       out[3] = 255;
3132       out += step;
3133    }
3134 }
3135 #endif
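
// The reduced-precision conversion above can be tried in isolation on a single
// pixel. This is only a sketch: SKETCH_F2F, sketch_clamp20 and
// sketch_ycbcr_to_rgb are hypothetical names, not part of stb_image, and a
// 32-bit int is assumed. It leaves out the 0xffff0000 mask on the Cb term of
// green, which the code above uses to match the SIMD path bit for bit, so green
// may differ from the real kernel in the lowest bit.
#if 0
```c
#include <assert.h>

// 12-bit coefficient, pre-shifted by 8 so that products carry 20 fractional bits
#define SKETCH_F2F(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

// drop the 20 fractional bits and clamp to 0..255 (negatives clamp before the shift)
static unsigned char sketch_clamp20(int v)
{
   if (v < 0) return 0;
   v >>= 20;
   return (unsigned char) (v > 255 ? 255 : v);
}

static void sketch_ycbcr_to_rgb(unsigned char rgb[3], int y, int cb, int cr)
{
   int y_fixed, r, g, b;
   assert(y  >= 0 && y  <= 255);
   assert(cb >= 0 && cb <= 255);
   assert(cr >= 0 && cr <= 255);
   y_fixed = (y << 20) + (1 << 19); // the added half rounds to nearest
   cr -= 128;
   cb -= 128;
   r = y_fixed + cr * SKETCH_F2F(1.40200f);
   g = y_fixed - cr * SKETCH_F2F(0.71414f) - cb * SKETCH_F2F(0.34414f);
   b = y_fixed + cb * SKETCH_F2F(1.77200f);
   rgb[0] = sketch_clamp20(r);
   rgb[1] = sketch_clamp20(g);
   rgb[2] = sketch_clamp20(b);
}
```
#endif
// With Cb == Cr == 128 the chroma terms vanish and the output is the grey level Y.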
3136 
3137 #if defined(STBI_SSE2) || defined(STBI_NEON)
static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3139 {
3140    int i = 0;
3141 
3142 #ifdef STBI_SSE2
3143    // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3144    // it's useful in practice (you wouldn't use it for textures, for example).
3145    // so just accelerate step == 4 case.
3146    if (step == 4) {
3147       // this is a fairly straightforward implementation and not super-optimized.
3148       __m128i signflip  = _mm_set1_epi8(-0x80);
3149       __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
3150       __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3151       __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3152       __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
3153       __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3154       __m128i xw = _mm_set1_epi16(255); // alpha channel
3155 
3156       for (; i+7 < count; i += 8) {
3157          // load
3158          __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3159          __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3160          __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3161          __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3162          __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3163 
3164          // unpack to short (and left-shift cr, cb by 8)
3165          __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
3166          __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3167          __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3168 
3169          // color transform
3170          __m128i yws = _mm_srli_epi16(yw, 4);
3171          __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3172          __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3173          __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3174          __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3175          __m128i rws = _mm_add_epi16(cr0, yws);
3176          __m128i gwt = _mm_add_epi16(cb0, yws);
3177          __m128i bws = _mm_add_epi16(yws, cb1);
3178          __m128i gws = _mm_add_epi16(gwt, cr1);
3179 
3180          // descale
3181          __m128i rw = _mm_srai_epi16(rws, 4);
3182          __m128i bw = _mm_srai_epi16(bws, 4);
3183          __m128i gw = _mm_srai_epi16(gws, 4);
3184 
3185          // back to byte, set up for transpose
3186          __m128i brb = _mm_packus_epi16(rw, bw);
3187          __m128i gxb = _mm_packus_epi16(gw, xw);
3188 
3189          // transpose to interleave channels
3190          __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3191          __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3192          __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3193          __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3194 
3195          // store
3196          _mm_storeu_si128((__m128i *) (out + 0), o0);
3197          _mm_storeu_si128((__m128i *) (out + 16), o1);
3198          out += 32;
3199       }
3200    }
3201 #endif
3202 
3203 #ifdef STBI_NEON
3204    // in this version, step=3 support would be easy to add. but is there demand?
3205    if (step == 4) {
3206       // this is a fairly straightforward implementation and not super-optimized.
3207       uint8x8_t signflip = vdup_n_u8(0x80);
3208       int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
3209       int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3210       int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3211       int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));
3212 
3213       for (; i+7 < count; i += 8) {
3214          // load
3215          uint8x8_t y_bytes  = vld1_u8(y + i);
3216          uint8x8_t cr_bytes = vld1_u8(pcr + i);
3217          uint8x8_t cb_bytes = vld1_u8(pcb + i);
3218          int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3219          int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3220 
3221          // expand to s16
3222          int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3223          int16x8_t crw = vshll_n_s8(cr_biased, 7);
3224          int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3225 
3226          // color transform
3227          int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3228          int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3229          int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3230          int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3231          int16x8_t rws = vaddq_s16(yws, cr0);
3232          int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3233          int16x8_t bws = vaddq_s16(yws, cb1);
3234 
3235          // undo scaling, round, convert to byte
3236          uint8x8x4_t o;
3237          o.val[0] = vqrshrun_n_s16(rws, 4);
3238          o.val[1] = vqrshrun_n_s16(gws, 4);
3239          o.val[2] = vqrshrun_n_s16(bws, 4);
3240          o.val[3] = vdup_n_u8(255);
3241 
3242          // store, interleaving r/g/b/a
3243          vst4_u8(out, o);
3244          out += 8*4;
3245       }
3246    }
3247 #endif
3248 
3249    for (; i < count; ++i) {
3250       int y_fixed = (y[i] << 20) + (1<<19); // rounding
3251       int r,g,b;
3252       int cr = pcr[i] - 128;
3253       int cb = pcb[i] - 128;
3254       r = y_fixed + cr* float2fixed(1.40200f);
3255       g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3256       b = y_fixed                             +   cb* float2fixed(1.77200f);
3257       r >>= 20;
3258       g >>= 20;
3259       b >>= 20;
3260       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3261       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3262       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3263       out[0] = (stbi_uc)r;
3264       out[1] = (stbi_uc)g;
3265       out[2] = (stbi_uc)b;
3266       out[3] = 255;
3267       out += step;
3268    }
3269 }
3270 #endif
3271 
3272 // set up the kernels
static void stbi__setup_jpeg(stbi__jpeg *j)
3274 {
3275    j->idct_block_kernel = stbi__idct_block;
3276    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3277    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3278 
3279 #ifdef STBI_SSE2
3280    if (stbi__sse2_available()) {
3281       j->idct_block_kernel = stbi__idct_simd;
3282       #ifndef STBI_JPEG_OLD
3283       j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3284       #endif
3285       j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3286    }
3287 #endif
3288 
3289 #ifdef STBI_NEON
3290    j->idct_block_kernel = stbi__idct_simd;
3291    #ifndef STBI_JPEG_OLD
3292    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3293    #endif
3294    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3295 #endif
3296 }
3297 
3298 // clean up the temporary component buffers
static void stbi__cleanup_jpeg(stbi__jpeg *j)
3300 {
3301    int i;
3302    for (i=0; i < j->s->img_n; ++i) {
3303       if (j->img_comp[i].raw_data) {
3304          STBI_FREE(j->img_comp[i].raw_data);
3305          j->img_comp[i].raw_data = NULL;
3306          j->img_comp[i].data = NULL;
3307       }
3308       if (j->img_comp[i].raw_coeff) {
3309          STBI_FREE(j->img_comp[i].raw_coeff);
3310          j->img_comp[i].raw_coeff = 0;
3311          j->img_comp[i].coeff = 0;
3312       }
3313       if (j->img_comp[i].linebuf) {
3314          STBI_FREE(j->img_comp[i].linebuf);
3315          j->img_comp[i].linebuf = NULL;
3316       }
3317    }
3318 }
3319 
3320 typedef struct
3321 {
3322    resample_row_func resample;
3323    stbi_uc *line0,*line1;
3324    int hs,vs;   // expansion factor in each axis
3325    int w_lores; // horizontal pixels pre-expansion
3326    int ystep;   // how far through vertical expansion we are
3327    int ypos;    // which pre-expansion row we're on
3328 } stbi__resample;
3329 
static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3331 {
3332    int n, decode_n;
3333    z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3334 
3335    // validate req_comp
3336    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3337 
3338    // load a jpeg image from whichever source, but leave in YCbCr format
3339    if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3340 
3341    // determine actual number of components to generate
3342    n = req_comp ? req_comp : z->s->img_n;
3343 
3344    if (z->s->img_n == 3 && n < 3)
3345       decode_n = 1;
3346    else
3347       decode_n = z->s->img_n;
3348 
3349    // resample and color-convert
3350    {
3351       int k;
3352       unsigned int i,j;
3353       stbi_uc *output;
3354       stbi_uc *coutput[4];
3355 
3356       stbi__resample res_comp[4];
3357 
3358       for (k=0; k < decode_n; ++k) {
3359          stbi__resample *r = &res_comp[k];
3360 
3361          // allocate line buffer big enough for upsampling off the edges
3362          // with upsample factor of 4
3363          z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3364          if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3365 
3366          r->hs      = z->img_h_max / z->img_comp[k].h;
3367          r->vs      = z->img_v_max / z->img_comp[k].v;
3368          r->ystep   = r->vs >> 1;
3369          r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3370          r->ypos    = 0;
3371          r->line0   = r->line1 = z->img_comp[k].data;
3372 
3373          if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3374          else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3375          else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3376          else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3377          else                               r->resample = stbi__resample_row_generic;
3378       }
3379 
      // nothing after this allocation can fail, so this is safe
3381       output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3382       if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3383 
3384       // now go ahead and resample
3385       for (j=0; j < z->s->img_y; ++j) {
3386          stbi_uc *out = output + n * z->s->img_x * j;
3387          for (k=0; k < decode_n; ++k) {
3388             stbi__resample *r = &res_comp[k];
3389             int y_bot = r->ystep >= (r->vs >> 1);
3390             coutput[k] = r->resample(z->img_comp[k].linebuf,
3391                                      y_bot ? r->line1 : r->line0,
3392                                      y_bot ? r->line0 : r->line1,
3393                                      r->w_lores, r->hs);
3394             if (++r->ystep >= r->vs) {
3395                r->ystep = 0;
3396                r->line0 = r->line1;
3397                if (++r->ypos < z->img_comp[k].y)
3398                   r->line1 += z->img_comp[k].w2;
3399             }
3400          }
3401          if (n >= 3) {
3402             stbi_uc *y = coutput[0];
3403             if (z->s->img_n == 3) {
3404                if (z->rgb == 3) {
3405                   for (i=0; i < z->s->img_x; ++i) {
3406                      out[0] = y[i];
3407                      out[1] = coutput[1][i];
3408                      out[2] = coutput[2][i];
3409                      out[3] = 255;
3410                      out += n;
3411                   }
3412                } else {
3413                   z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3414                }
3415             } else
3416                for (i=0; i < z->s->img_x; ++i) {
3417                   out[0] = out[1] = out[2] = y[i];
3418                   out[3] = 255; // not used if n==3
3419                   out += n;
3420                }
3421          } else {
3422             stbi_uc *y = coutput[0];
3423             if (n == 1)
3424                for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3425             else
3426                for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3427          }
3428       }
3429       stbi__cleanup_jpeg(z);
3430       *out_x = z->s->img_x;
3431       *out_y = z->s->img_y;
3432       if (comp) *comp  = z->s->img_n; // report original components, not output
3433       return output;
3434    }
3435 }
3436 
static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char* result;
   stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
   if (!j) return stbi__errpuc("outofmem", "Out of memory"); // don't dereference a failed allocation
   j->s = s;
   stbi__setup_jpeg(j);
   result = load_jpeg_image(j, x,y,comp,req_comp);
   STBI_FREE(j);
   return result;
}
3447 
static int stbi__jpeg_test(stbi__context *s)
3449 {
3450    int r;
3451    stbi__jpeg j;
3452    j.s = s;
3453    stbi__setup_jpeg(&j);
3454    r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3455    stbi__rewind(s);
3456    return r;
3457 }
3458 
static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3460 {
3461    if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3462       stbi__rewind( j->s );
3463       return 0;
3464    }
3465    if (x) *x = j->s->img_x;
3466    if (y) *y = j->s->img_y;
3467    if (comp) *comp = j->s->img_n;
3468    return 1;
3469 }
3470 
static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
{
   int result;
   stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
   if (!j) return stbi__err("outofmem", "Out of memory"); // don't dereference a failed allocation
   j->s = s;
   result = stbi__jpeg_info_raw(j, x, y, comp);
   STBI_FREE(j);
   return result;
}
3480 #endif
3481 
3482 // public domain zlib decode    v0.2  Sean Barrett 2006-11-18
3483 //    simple implementation
3484 //      - all input must be provided in an upfront buffer
3485 //      - all output is written to a single output buffer (can malloc/realloc)
3486 //    performance
3487 //      - fast huffman
3488 
3489 #ifndef STBI_NO_ZLIB
3490 
3491 // fast-way is faster to check than jpeg huffman, but slow way is slower
3492 #define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
3493 #define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)
3494 
3495 // zlib-style huffman encoding
// (JPEG packs bits from the left, zlib from the right, so they can't share code)
3497 typedef struct
3498 {
3499    stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3500    stbi__uint16 firstcode[16];
3501    int maxcode[17];
3502    stbi__uint16 firstsymbol[16];
3503    stbi_uc  size[288];
3504    stbi__uint16 value[288];
3505 } stbi__zhuffman;
3506 
stbi_inline static int stbi__bitreverse16(int n)
3508 {
3509   n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
3510   n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
3511   n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
3512   n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
3513   return n;
3514 }
3515 
stbi_inline static int stbi__bit_reverse(int v, int bits)
3517 {
3518    STBI_ASSERT(bits <= 16);
3519    // to bit reverse n bits, reverse 16 and shift
3520    // e.g. 11 bits, bit reverse and shift away 5
3521    return stbi__bitreverse16(v) >> (16-bits);
3522 }
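
// The two helpers above can be exercised on their own. A minimal sketch, using
// the hypothetical names sketch_reverse16 and sketch_reverse_bits (not part of
// stb_image) and unsigned arithmetic instead of int:
#if 0
```c
#include <assert.h>

// reverse all 16 bits by swapping ever larger groups: 1, 2, 4, then 8 bits
static unsigned int sketch_reverse16(unsigned int n)
{
   n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
   n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
   n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
   n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
   return n;
}

// reverse only the low `bits` bits of v: reverse 16, then shift the result down
static unsigned int sketch_reverse_bits(unsigned int v, int bits)
{
   assert(bits >= 1 && bits <= 16);
   return sketch_reverse16(v) >> (16 - bits);
}
```
#endif
// For example, the 3-bit value 011 reverses to 110, and the 4-bit value 0101 to 1010.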
3523 
static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3525 {
3526    int i,k=0;
3527    int code, next_code[16], sizes[17];
3528 
3529    // DEFLATE spec for generating codes
3530    memset(sizes, 0, sizeof(sizes));
3531    memset(z->fast, 0, sizeof(z->fast));
3532    for (i=0; i < num; ++i)
3533       ++sizes[sizelist[i]];
3534    sizes[0] = 0;
3535    for (i=1; i < 16; ++i)
3536       if (sizes[i] > (1 << i))
3537          return stbi__err("bad sizes", "Corrupt PNG");
3538    code = 0;
3539    for (i=1; i < 16; ++i) {
3540       next_code[i] = code;
3541       z->firstcode[i] = (stbi__uint16) code;
3542       z->firstsymbol[i] = (stbi__uint16) k;
3543       code = (code + sizes[i]);
3544       if (sizes[i])
3545          if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3546       z->maxcode[i] = code << (16-i); // preshift for inner loop
3547       code <<= 1;
3548       k += sizes[i];
3549    }
3550    z->maxcode[16] = 0x10000; // sentinel
3551    for (i=0; i < num; ++i) {
3552       int s = sizelist[i];
3553       if (s) {
3554          int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3555          stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3556          z->size [c] = (stbi_uc     ) s;
3557          z->value[c] = (stbi__uint16) i;
3558          if (s <= STBI__ZFAST_BITS) {
3559             int j = stbi__bit_reverse(next_code[s],s);
3560             while (j < (1 << STBI__ZFAST_BITS)) {
3561                z->fast[j] = fastv;
3562                j += (1 << s);
3563             }
3564          }
3565          ++next_code[s];
3566       }
3567    }
3568    return 1;
3569 }
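
// The code assignment that stbi__zbuild_huffman performs follows the canonical
// Huffman procedure of the DEFLATE specification (RFC 1951, section 3.2.2). The
// sketch below shows only that step, without the fast table or the maxcode
// array; sketch_assign_codes and SKETCH_MAX_BITS are hypothetical names, not
// part of stb_image.
#if 0
```c
#include <assert.h>

#define SKETCH_MAX_BITS 15

// lengths[i] is the code length of symbol i (0 = unused); codes[i] receives its code
static void sketch_assign_codes(const unsigned char *lengths, int n, unsigned short *codes)
{
   int count[SKETCH_MAX_BITS+1] = { 0 };
   int next_code[SKETCH_MAX_BITS+1] = { 0 };
   int code = 0, bits, i;

   // 1. count how many symbols use each code length
   for (i=0; i < n; ++i) {
      assert(lengths[i] <= SKETCH_MAX_BITS);
      ++count[lengths[i]];
   }
   count[0] = 0;

   // 2. the first code of each length follows the last code of the shorter length
   for (bits=1; bits <= SKETCH_MAX_BITS; ++bits) {
      code = (code + count[bits-1]) << 1;
      next_code[bits] = code;
   }

   // 3. hand out consecutive codes to symbols in symbol order
   for (i=0; i < n; ++i) {
      int len = lengths[i];
      codes[i] = (unsigned short) (len ? next_code[len]++ : 0);
   }
}
```
#endif
// With the lengths 3,3,3,3,3,2,4,4 from the RFC's worked example, symbol 5 gets
// code 00, symbols 0..4 get 010..110, and symbols 6 and 7 get 1110 and 1111.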
3570 
3571 // zlib-from-memory implementation for PNG reading
3572 //    because PNG allows splitting the zlib stream arbitrarily,
3573 //    and it's annoying structurally to have PNG call ZLIB call PNG,
3574 //    we require PNG read all the IDATs and combine them into a single
3575 //    memory buffer
3576 
3577 typedef struct
3578 {
3579    stbi_uc *zbuffer, *zbuffer_end;
3580    int num_bits;
3581    stbi__uint32 code_buffer;
3582 
3583    char *zout;
3584    char *zout_start;
3585    char *zout_end;
3586    int   z_expandable;
3587 
3588    stbi__zhuffman z_length, z_distance;
3589 } stbi__zbuf;
3590 
stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3592 {
3593    if (z->zbuffer >= z->zbuffer_end) return 0;
3594    return *z->zbuffer++;
3595 }
3596 
static void stbi__fill_bits(stbi__zbuf *z)
3598 {
3599    do {
3600       STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3601       z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
3602       z->num_bits += 8;
3603    } while (z->num_bits <= 24);
3604 }
3605 
stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3607 {
3608    unsigned int k;
3609    if (z->num_bits < n) stbi__fill_bits(z);
3610    k = z->code_buffer & ((1 << n) - 1);
3611    z->code_buffer >>= n;
3612    z->num_bits -= n;
3613    return k;
3614 }
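
// stbi__zget8, stbi__fill_bits and stbi__zreceive together form an LSB-first bit
// reader: new bytes are OR-ed in above the bits already held, and fields are
// taken from the bottom. A standalone sketch of the same idea, assuming the
// hypothetical names sketch_bits and sketch_get_bits (not part of stb_image) and
// refilling one byte at a time rather than up to 32 bits:
#if 0
```c
#include <assert.h>

typedef struct
{
   const unsigned char *p, *end; // unread input
   unsigned int buf;             // pending bits; the next bit to deliver is bit 0
   int count;                    // number of valid bits in buf
} sketch_bits;

static unsigned int sketch_get_bits(sketch_bits *b, int n)
{
   unsigned int k;
   assert(n >= 1 && n <= 16);
   while (b->count < n) {
      // place a whole byte above the bits already held; past the end, feed zeros
      unsigned int byte = (b->p < b->end) ? *b->p++ : 0;
      b->buf |= byte << b->count;
      b->count += 8;
   }
   k = b->buf & ((1u << n) - 1); // low n bits are the oldest
   b->buf >>= n;
   b->count -= n;
   return k;
}
```
#endif
// Reading 3 bits and then 5 bits from the byte 0xB5 (binary 10110101) yields
// 101 and then 10110, that is, the low bits come out first.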
3615 
static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3617 {
3618    int b,s,k;
3619    // not resolved by fast table, so compute it the slow way
3620    // use jpeg approach, which requires MSbits at top
3621    k = stbi__bit_reverse(a->code_buffer, 16);
3622    for (s=STBI__ZFAST_BITS+1; ; ++s)
3623       if (k < z->maxcode[s])
3624          break;
3625    if (s == 16) return -1; // invalid code!
3626    // code size is s, so:
3627    b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3628    STBI_ASSERT(z->size[b] == s);
3629    a->code_buffer >>= s;
3630    a->num_bits -= s;
3631    return z->value[b];
3632 }
3633 
stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3635 {
3636    int b,s;
3637    if (a->num_bits < 16) stbi__fill_bits(a);
3638    b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3639    if (b) {
3640       s = b >> 9;
3641       a->code_buffer >>= s;
3642       a->num_bits -= s;
3643       return b & 511;
3644    }
3645    return stbi__zhuffman_decode_slowpath(a, z);
3646 }
3647 
static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
3649 {
3650    char *q;
3651    int cur, limit, old_limit;
3652    z->zout = zout;
3653    if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3654    cur   = (int) (z->zout     - z->zout_start);
3655    limit = old_limit = (int) (z->zout_end - z->zout_start);
3656    while (cur + n > limit)
3657       limit *= 2;
3658    q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
3659    STBI_NOTUSED(old_limit);
3660    if (q == NULL) return stbi__err("outofmem", "Out of memory");
3661    z->zout_start = q;
3662    z->zout       = q + cur;
3663    z->zout_end   = q + limit;
3664    return 1;
3665 }
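
// The growth policy in stbi__zexpand is plain doubling until the request fits.
// The sketch below isolates that computation. sketch_grow_limit is a
// hypothetical name, not part of stb_image, and the two overflow guards are
// additions made for this example; the function above does not perform them.
#if 0
```c
#include <assert.h>
#include <limits.h>

// cur   = bytes already written, n = extra bytes needed, limit = current capacity
// returns the new capacity, or 0 if it cannot be represented in an int
static int sketch_grow_limit(int cur, int n, int limit)
{
   assert(cur >= 0 && n >= 0 && limit > 0 && cur <= limit);
   if (n > INT_MAX - cur) return 0;      // cur + n would overflow
   while (cur + n > limit) {
      if (limit > INT_MAX / 2) return 0; // doubling would overflow
      limit *= 2;
   }
   return limit;
}
```
#endif
// Doubling keeps the number of reallocations logarithmic in the final output
// size. Note that a starting capacity of 0 can never grow by doubling, which is
// why the sketch requires limit > 0.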
3666 
3667 static int stbi__zlength_base[31] = {
3668    3,4,5,6,7,8,9,10,11,13,
3669    15,17,19,23,27,31,35,43,51,59,
3670    67,83,99,115,131,163,195,227,258,0,0 };
3671 
3672 static int stbi__zlength_extra[31]=
3673 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3674 
3675 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3676 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3677 
3678 static int stbi__zdist_extra[32] =
3679 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3680 
static int stbi__parse_huffman_block(stbi__zbuf *a)
3682 {
3683    char *zout = a->zout;
3684    for(;;) {
3685       int z = stbi__zhuffman_decode(a, &a->z_length);
3686       if (z < 256) {
3687          if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3688          if (zout >= a->zout_end) {
3689             if (!stbi__zexpand(a, zout, 1)) return 0;
3690             zout = a->zout;
3691          }
3692          *zout++ = (char) z;
3693       } else {
3694          stbi_uc *p;
3695          int len,dist;
3696          if (z == 256) {
3697             a->zout = zout;
3698             return 1;
3699          }
3700          z -= 257;
3701          len = stbi__zlength_base[z];
3702          if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3703          z = stbi__zhuffman_decode(a, &a->z_distance);
3704          if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3705          dist = stbi__zdist_base[z];
3706          if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3707          if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3708          if (zout + len > a->zout_end) {
3709             if (!stbi__zexpand(a, zout, len)) return 0;
3710             zout = a->zout;
3711          }
3712          p = (stbi_uc *) (zout - dist);
3713          if (dist == 1) { // run of one byte; common in images.
3714             stbi_uc v = *p;
3715             if (len) { do *zout++ = v; while (--len); }
3716          } else {
3717             if (len) { do *zout++ = *p++; while (--len); }
3718          }
3719       }
3720    }
3721 }
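
// The length/distance branch above copies from the output already produced, and
// the source may overlap the bytes being written. A small sketch of just that
// copy, using the hypothetical name sketch_copy_match (not part of stb_image):
#if 0
```c
#include <assert.h>

// appends len bytes to out[0..pos), each taken from dist bytes back;
// returns the new length. The caller must have room for pos + len bytes.
static int sketch_copy_match(unsigned char *out, int pos, int dist, int len)
{
   assert(dist >= 1 && dist <= pos && len >= 0);
   // copy byte by byte: when len > dist the source overlaps the bytes just
   // written, which is what makes a short pattern repeat
   while (len-- > 0) {
      out[pos] = out[pos - dist];
      ++pos;
   }
   return pos;
}
```
#endif
// Starting from "ab", a match with dist 2 and len 4 produces "ababab"; a
// following match with dist 1 and len 3 repeats the last byte three times.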
3722 
static int stbi__compute_huffman_codes(stbi__zbuf *a)
3724 {
3725    static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3726    stbi__zhuffman z_codelength;
3727    stbi_uc lencodes[286+32+137];//padding for maximum single op
3728    stbi_uc codelength_sizes[19];
3729    int i,n;
3730 
3731    int hlit  = stbi__zreceive(a,5) + 257;
3732    int hdist = stbi__zreceive(a,5) + 1;
3733    int hclen = stbi__zreceive(a,4) + 4;
3734 
3735    memset(codelength_sizes, 0, sizeof(codelength_sizes));
3736    for (i=0; i < hclen; ++i) {
3737       int s = stbi__zreceive(a,3);
3738       codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3739    }
3740    if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3741 
3742    n = 0;
   while (n < hlit + hdist) {
      int c = stbi__zhuffman_decode(a, &z_codelength);
      if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
      if (c < 16)
         lencodes[n++] = (stbi_uc) c;
      else {
         stbi_uc fill = 0;
         if (c == 16) {
            // repeat the previous length 3..6 times; there must be a previous one
            if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
            c = stbi__zreceive(a,2)+3;
            fill = lencodes[n-1];
         } else if (c == 17) {
            c = stbi__zreceive(a,3)+3;   // run of 3..10 zero lengths
         } else {
            STBI_ASSERT(c == 18);
            c = stbi__zreceive(a,7)+11;  // run of 11..138 zero lengths
         }
         // a run may not extend past the declared number of code lengths
         if (hlit + hdist - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
         memset(lencodes+n, fill, c);
         n += c;
      }
   }
3763    if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3764    if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3765    if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3766    return 1;
3767 }
3768 
static int stbi__parse_uncompressed_block(stbi__zbuf *a)
3770 {
3771    stbi_uc header[4];
3772    int len,nlen,k;
3773    if (a->num_bits & 7)
3774       stbi__zreceive(a, a->num_bits & 7); // discard
3775    // drain the bit-packed data into header
3776    k = 0;
3777    while (a->num_bits > 0) {
3778       header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3779       a->code_buffer >>= 8;
3780       a->num_bits -= 8;
3781    }
3782    STBI_ASSERT(a->num_bits == 0);
3783    // now fill header the normal way
3784    while (k < 4)
3785       header[k++] = stbi__zget8(a);
3786    len  = header[1] * 256 + header[0];
3787    nlen = header[3] * 256 + header[2];
3788    if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3789    if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3790    if (a->zout + len > a->zout_end)
3791       if (!stbi__zexpand(a, a->zout, len)) return 0;
3792    memcpy(a->zout, a->zbuffer, len);
3793    a->zbuffer += len;
3794    a->zout += len;
3795    return 1;
3796 }
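
// The LEN/NLEN consistency test used above for stored blocks, in isolation. This
// is a sketch with a hypothetical name, sketch_stored_len, which is not part of
// stb_image:
#if 0
```c
#include <assert.h>

// returns LEN (0..65535) if the 4-byte stored-block header is consistent, else -1
static int sketch_stored_len(const unsigned char header[4])
{
   int len, nlen;
   assert(header != 0);
   len  = header[1] * 256 + header[0]; // both fields are little-endian
   nlen = header[3] * 256 + header[2];
   // NLEN must be the one's complement of LEN
   return (nlen == (len ^ 0xffff)) ? len : -1;
}
```
#endif
// A header of 05 00 FA FF announces 5 raw bytes; any NLEN other than FFFA for
// that LEN is rejected as corrupt.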
3797 
static int stbi__parse_zlib_header(stbi__zbuf *a)
3799 {
3800    int cmf   = stbi__zget8(a);
3801    int cm    = cmf & 15;
3802    /* int cinfo = cmf >> 4; */
3803    int flg   = stbi__zget8(a);
3804    if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3805    if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3806    if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3807    // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3808    return 1;
3809 }
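
// The three header checks above can be tried against real header bytes. A
// sketch, with the hypothetical name sketch_zlib_header_ok (not part of
// stb_image), that takes the two bytes directly instead of reading them from a
// stbi__zbuf:
#if 0
```c
#include <assert.h>

// returns 1 if cmf/flg form a zlib header that a PNG decoder can accept, else 0
static int sketch_zlib_header_ok(int cmf, int flg)
{
   assert(cmf >= 0 && cmf <= 255 && flg >= 0 && flg <= 255);
   if ((cmf * 256 + flg) % 31 != 0) return 0; // check bits: must be a multiple of 31
   if (flg & 32) return 0;                    // preset dictionary is not allowed in PNG
   if ((cmf & 15) != 8) return 0;             // compression method must be DEFLATE
   return 1;
}
```
#endif
// The common header 78 9C passes all three checks. 78 9D fails the multiple-of-31
// test, and 78 BB passes it but has the preset-dictionary bit set.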
3810 
3811 // @TODO: should statically initialize these for optimal thread safety
3812 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
static void stbi__init_zdefaults(void)
3814 {
3815    int i;   // use <= to match clearly with spec
3816    for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
3817    for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
3818    for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
3819    for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;
3820 
3821    for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
3822 }
3823 
static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3825 {
3826    int final, type;
3827    if (parse_header)
3828       if (!stbi__parse_zlib_header(a)) return 0;
3829    a->num_bits = 0;
3830    a->code_buffer = 0;
3831    do {
3832       final = stbi__zreceive(a,1);
3833       type = stbi__zreceive(a,2);
3834       if (type == 0) {
3835          if (!stbi__parse_uncompressed_block(a)) return 0;
3836       } else if (type == 3) {
3837          return 0;
3838       } else {
3839          if (type == 1) {
3840             // use fixed code lengths
3841             if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3842             if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , 288)) return 0;
3843             if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
3844          } else {
3845             if (!stbi__compute_huffman_codes(a)) return 0;
3846          }
3847          if (!stbi__parse_huffman_block(a)) return 0;
3848       }
3849    } while (!final);
3850    return 1;
3851 }
3852 
static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3854 {
3855    a->zout_start = obuf;
3856    a->zout       = obuf;
3857    a->zout_end   = obuf + olen;
3858    a->z_expandable = exp;
3859 
3860    return stbi__parse_zlib(a, parse_header);
3861 }
3862 
STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3864 {
3865    stbi__zbuf a;
3866    char *p = (char *) stbi__malloc(initial_size);
3867    if (p == NULL) return NULL;
3868    a.zbuffer = (stbi_uc *) buffer;
3869    a.zbuffer_end = (stbi_uc *) buffer + len;
3870    if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3871       if (outlen) *outlen = (int) (a.zout - a.zout_start);
3872       return a.zout_start;
3873    } else {
3874       STBI_FREE(a.zout_start);
3875       return NULL;
3876    }
3877 }
3878 
STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3880 {
3881    return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3882 }
3883 
stbi_zlib_decode_malloc_guesssize_headerflag(const char * buffer,int len,int initial_size,int * outlen,int parse_header)3884 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3885 {
3886    stbi__zbuf a;
3887    char *p = (char *) stbi__malloc(initial_size);
3888    if (p == NULL) return NULL;
3889    a.zbuffer = (stbi_uc *) buffer;
3890    a.zbuffer_end = (stbi_uc *) buffer + len;
3891    if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3892       if (outlen) *outlen = (int) (a.zout - a.zout_start);
3893       return a.zout_start;
3894    } else {
3895       STBI_FREE(a.zout_start);
3896       return NULL;
3897    }
3898 }
3899 
stbi_zlib_decode_buffer(char * obuffer,int olen,char const * ibuffer,int ilen)3900 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3901 {
3902    stbi__zbuf a;
3903    a.zbuffer = (stbi_uc *) ibuffer;
3904    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3905    if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3906       return (int) (a.zout - a.zout_start);
3907    else
3908       return -1;
3909 }
3910 
stbi_zlib_decode_noheader_malloc(char const * buffer,int len,int * outlen)3911 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3912 {
3913    stbi__zbuf a;
3914    char *p = (char *) stbi__malloc(16384);
3915    if (p == NULL) return NULL;
3916    a.zbuffer = (stbi_uc *) buffer;
3917    a.zbuffer_end = (stbi_uc *) buffer+len;
3918    if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3919       if (outlen) *outlen = (int) (a.zout - a.zout_start);
3920       return a.zout_start;
3921    } else {
3922       STBI_FREE(a.zout_start);
3923       return NULL;
3924    }
3925 }
3926 
stbi_zlib_decode_noheader_buffer(char * obuffer,int olen,const char * ibuffer,int ilen)3927 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3928 {
3929    stbi__zbuf a;
3930    a.zbuffer = (stbi_uc *) ibuffer;
3931    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3932    if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3933       return (int) (a.zout - a.zout_start);
3934    else
3935       return -1;
3936 }
3937 #endif
3938 
3939 // public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
3940 //    simple implementation
3941 //      - only 8-bit samples
3942 //      - no CRC checking
3943 //      - allocates lots of intermediate memory
3944 //        - avoids problem of streaming data between subsystems
3945 //        - avoids explicit window management
3946 //    performance
3947 //      - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3948 
3949 #ifndef STBI_NO_PNG
3950 typedef struct
3951 {
3952    stbi__uint32 length;
3953    stbi__uint32 type;
3954 } stbi__pngchunk;
3955 
static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
{
   stbi__pngchunk c;
   c.length = stbi__get32be(s);
   c.type   = stbi__get32be(s);
   return c;
}

static int stbi__check_png_header(stbi__context *s)
{
   static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   int i;
   for (i=0; i < 8; ++i)
      if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   return 1;
}
3972 
3973 typedef struct
3974 {
3975    stbi__context *s;
3976    stbi_uc *idata, *expanded, *out;
3977    int depth;
3978 } stbi__png;
3979 
3980 
3981 enum {
3982    STBI__F_none=0,
3983    STBI__F_sub=1,
3984    STBI__F_up=2,
3985    STBI__F_avg=3,
3986    STBI__F_paeth=4,
3987    // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3988    STBI__F_avg_first,
3989    STBI__F_paeth_first
3990 };
3991 
3992 static stbi_uc first_row_filter[5] =
3993 {
3994    STBI__F_none,
3995    STBI__F_sub,
3996    STBI__F_none,
3997    STBI__F_avg_first,
3998    STBI__F_paeth_first
3999 };
4000 
static int stbi__paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p-a);
   int pb = abs(p-b);
   int pc = abs(p-c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
4011 
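/* EXAMPLE (illustrative only, not part of stb_image). A minimal sketch of
   how the Paeth predictor above is used to undo the Paeth filter on one
   scanline, assuming 1-byte pixels and separate input/output buffers. The
   example_ names are hypothetical; the real decoder handles multi-byte
   pixels and works in place.

```c
#include <stddef.h>
#include <stdlib.h>

// Paeth predictor: pick whichever of left (a), above (b), upper-left (c)
// is closest to the linear estimate a + b - c; ties prefer a, then b.
static int example_paeth(int a, int b, int c)
{
   int p  = a + b - c;
   int pa = abs(p - a);
   int pb = abs(p - b);
   int pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}

// Undo the Paeth filter for one row of n 1-byte pixels.
// prior may be NULL for the first row, which is treated as all zeros.
static void example_unfilter_paeth_row(unsigned char *cur, const unsigned char *raw,
                                       const unsigned char *prior, size_t n)
{
   size_t k;
   for (k = 0; k < n; ++k) {
      int left    = k ? cur[k-1] : 0;
      int up      = prior ? prior[k] : 0;
      int up_left = (prior && k) ? prior[k-1] : 0;
      cur[k] = (unsigned char) ((raw[k] + example_paeth(left, up, up_left)) & 255);
   }
}
```

   With no prior row the predictor always returns the left neighbor, which
   is why the decoder can substitute a cheaper "first row" filter there.
*/
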
4012 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
4013 
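/* EXAMPLE (illustrative only, not part of stb_image). The non-zero entries
   of stbi__depth_scale_table appear to be 255 / (2^depth - 1), the factor
   that stretches a depth-bit gray sample over 0..255. The sketch below
   recomputes them; example_depth_scale is a hypothetical name.

```c
// Multiplier that maps a depth-bit gray sample (0 .. 2^depth - 1) onto 0..255.
// Only meaningful for depth 1, 2, 4 or 8; other depths return 0.
static unsigned char example_depth_scale(int depth)
{
   if (depth != 1 && depth != 2 && depth != 4 && depth != 8) return 0;
   return (unsigned char) (255 / ((1 << depth) - 1));
}
```

   For instance a 2-bit sample of 3 (full white) times 0x55 gives 255.
*/
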
4014 // create the png data from post-deflated data
static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4016 {
4017    int bytes = (depth == 16? 2 : 1);
4018    stbi__context *s = a->s;
4019    stbi__uint32 i,j,stride = x*out_n*bytes;
4020    stbi__uint32 img_len, img_width_bytes;
4021    int k;
4022    int img_n = s->img_n; // copy it into a local for later
4023 
4024    int output_bytes = out_n*bytes;
4025    int filter_bytes = img_n*bytes;
4026    int width = x;
4027 
4028    STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4029    a->out = (stbi_uc *) stbi__malloc(x * y * output_bytes); // extra bytes to write off the end into
4030    if (!a->out) return stbi__err("outofmem", "Out of memory");
4031 
4032    img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4033    img_len = (img_width_bytes + 1) * y;
4034    if (s->img_x == x && s->img_y == y) {
4035       if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4036    } else { // interlaced:
4037       if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4038    }
4039 
4040    for (j=0; j < y; ++j) {
4041       stbi_uc *cur = a->out + stride*j;
4042       stbi_uc *prior = cur - stride;
4043       int filter = *raw++;
4044 
4045       if (filter > 4)
4046          return stbi__err("invalid filter","Corrupt PNG");
4047 
4048       if (depth < 8) {
4049          STBI_ASSERT(img_width_bytes <= x);
4050          cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4051          filter_bytes = 1;
4052          width = img_width_bytes;
4053       }
4054 
4055       // if first row, use special filter that doesn't sample previous row
4056       if (j == 0) filter = first_row_filter[filter];
4057 
4058       // handle first byte explicitly
4059       for (k=0; k < filter_bytes; ++k) {
4060          switch (filter) {
4061             case STBI__F_none       : cur[k] = raw[k]; break;
4062             case STBI__F_sub        : cur[k] = raw[k]; break;
4063             case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4064             case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4065             case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4066             case STBI__F_avg_first  : cur[k] = raw[k]; break;
4067             case STBI__F_paeth_first: cur[k] = raw[k]; break;
4068          }
4069       }
4070 
4071       if (depth == 8) {
4072          if (img_n != out_n)
4073             cur[img_n] = 255; // first pixel
4074          raw += img_n;
4075          cur += out_n;
4076          prior += out_n;
4077       } else if (depth == 16) {
4078          if (img_n != out_n) {
4079             cur[filter_bytes]   = 255; // first pixel top byte
4080             cur[filter_bytes+1] = 255; // first pixel bottom byte
4081          }
4082          raw += filter_bytes;
4083          cur += output_bytes;
4084          prior += output_bytes;
4085       } else {
4086          raw += 1;
4087          cur += 1;
4088          prior += 1;
4089       }
4090 
4091       // this is a little gross, so that we don't switch per-pixel or per-component
4092       if (depth < 8 || img_n == out_n) {
4093          int nk = (width - 1)*filter_bytes;
4094          #define CASE(f) \
4095              case f:     \
4096                 for (k=0; k < nk; ++k)
4097          switch (filter) {
4098             // "none" filter turns into a memcpy here; make that explicit.
4099             case STBI__F_none:         memcpy(cur, raw, nk); break;
4100             CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4101             CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4102             CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4103             CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4104             CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4105             CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4106          }
4107          #undef CASE
4108          raw += nk;
4109       } else {
4110          STBI_ASSERT(img_n+1 == out_n);
4111          #define CASE(f) \
4112              case f:     \
4113                 for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
4114                    for (k=0; k < filter_bytes; ++k)
4115          switch (filter) {
4116             CASE(STBI__F_none)         cur[k] = raw[k]; break;
4117             CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); break;
4118             CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4119             CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); break;
4120             CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); break;
4121             CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); break;
4122             CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); break;
4123          }
4124          #undef CASE
4125 
4126          // the loop above sets the high byte of the pixels' alpha, but for
4127          // 16 bit png files we also need the low byte set. we'll do that here.
4128          if (depth == 16) {
4129             cur = a->out + stride*j; // start at the beginning of the row again
4130             for (i=0; i < x; ++i,cur+=output_bytes) {
4131                cur[filter_bytes+1] = 255;
4132             }
4133          }
4134       }
4135    }
4136 
   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
4140    if (depth < 8) {
4141       for (j=0; j < y; ++j) {
4142          stbi_uc *cur = a->out + stride*j;
4143          stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // PNG guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4146          stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4147 
4148          // note that the final byte might overshoot and write more data than desired.
4149          // we can allocate enough data that this never writes out of memory, but it
4150          // could also overwrite the next scanline. can it overwrite non-empty data
4151          // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4152          // so we need to explicitly clamp the final ones
4153 
4154          if (depth == 4) {
4155             for (k=x*img_n; k >= 2; k-=2, ++in) {
4156                *cur++ = scale * ((*in >> 4)       );
4157                *cur++ = scale * ((*in     ) & 0x0f);
4158             }
4159             if (k > 0) *cur++ = scale * ((*in >> 4)       );
4160          } else if (depth == 2) {
4161             for (k=x*img_n; k >= 4; k-=4, ++in) {
4162                *cur++ = scale * ((*in >> 6)       );
4163                *cur++ = scale * ((*in >> 4) & 0x03);
4164                *cur++ = scale * ((*in >> 2) & 0x03);
4165                *cur++ = scale * ((*in     ) & 0x03);
4166             }
4167             if (k > 0) *cur++ = scale * ((*in >> 6)       );
4168             if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4169             if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4170          } else if (depth == 1) {
4171             for (k=x*img_n; k >= 8; k-=8, ++in) {
4172                *cur++ = scale * ((*in >> 7)       );
4173                *cur++ = scale * ((*in >> 6) & 0x01);
4174                *cur++ = scale * ((*in >> 5) & 0x01);
4175                *cur++ = scale * ((*in >> 4) & 0x01);
4176                *cur++ = scale * ((*in >> 3) & 0x01);
4177                *cur++ = scale * ((*in >> 2) & 0x01);
4178                *cur++ = scale * ((*in >> 1) & 0x01);
4179                *cur++ = scale * ((*in     ) & 0x01);
4180             }
4181             if (k > 0) *cur++ = scale * ((*in >> 7)       );
4182             if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4183             if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4184             if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4185             if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4186             if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4187             if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4188          }
4189          if (img_n != out_n) {
4190             int q;
4191             // insert alpha = 255
4192             cur = a->out + stride*j;
4193             if (img_n == 1) {
4194                for (q=x-1; q >= 0; --q) {
4195                   cur[q*2+1] = 255;
4196                   cur[q*2+0] = cur[q];
4197                }
4198             } else {
4199                STBI_ASSERT(img_n == 3);
4200                for (q=x-1; q >= 0; --q) {
4201                   cur[q*4+3] = 255;
4202                   cur[q*4+2] = cur[q*3+2];
4203                   cur[q*4+1] = cur[q*3+1];
4204                   cur[q*4+0] = cur[q*3+0];
4205                }
4206             }
4207          }
4208       }
4209    } else if (depth == 16) {
4210       // force the image data from big-endian to platform-native.
4211       // this is done in a separate pass due to the decoding relying
4212       // on the data being untouched, but could probably be done
4213       // per-line during decode if care is taken.
4214       stbi_uc *cur = a->out;
4215       stbi__uint16 *cur16 = (stbi__uint16*)cur;
4216 
4217       for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
4218          *cur16 = (cur[0] << 8) | cur[1];
4219       }
4220    }
4221 
4222    return 1;
4223 }
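
/* EXAMPLE (illustrative only, not part of stb_image). A minimal sketch of
   the 1/2/4-bit expansion pass above, written as one generic loop instead
   of the unrolled per-depth loops. It assumes separate input and output
   buffers (the real code expands in place) and a depth of 1, 2 or 4;
   example_unpack_bits is a hypothetical name.

```c
#include <stddef.h>

// Unpack `count` samples of `depth` bits (1, 2 or 4) from MSB-first packed
// bytes into one byte per sample, multiplying each by `scale`.
static void example_unpack_bits(unsigned char *out, const unsigned char *in,
                                size_t count, int depth, unsigned char scale)
{
   size_t per_byte = (size_t) (8 / depth);
   unsigned mask = (1u << depth) - 1u;
   size_t i;
   for (i = 0; i < count; ++i) {
      unsigned shift = (unsigned) (8 - depth * (int) (i % per_byte + 1));
      out[i] = (unsigned char) (scale * ((in[i / per_byte] >> shift) & mask));
   }
}
```

   Because `count` bounds the loop, the trailing padding bits in the last
   byte of a row are never written to the output.
*/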
4224 
static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
{
   int bytes = (depth == 16 ? 2 : 1);
   int out_bytes = out_n * bytes; // bytes per output pixel (16-bit samples take two bytes each)
   stbi_uc *final;
   int p;
   if (!interlaced)
      return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);

   // de-interlacing
   final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_bytes);
   if (final == NULL) return stbi__err("outofmem", "Out of memory");
   for (p=0; p < 7; ++p) {
      int xorig[] = { 0,4,0,2,0,1,0 };
      int yorig[] = { 0,0,4,0,2,0,1 };
      int xspc[]  = { 8,8,4,4,2,2,1 };
      int yspc[]  = { 8,8,8,4,4,2,2 };
      int i,j,x,y;
      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
      if (x && y) {
         stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
         if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
            STBI_FREE(final);
            return 0;
         }
         // scatter this pass's pixels to their positions in the full image
         for (j=0; j < y; ++j) {
            for (i=0; i < x; ++i) {
               int out_y = j*yspc[p]+yorig[p];
               int out_x = i*xspc[p]+xorig[p];
               memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
                      a->out + (j*x+i)*out_bytes, out_bytes);
            }
         }
         STBI_FREE(a->out);
         image_data += img_len;
         image_data_len -= img_len;
      }
   }
   a->out = final;

   return 1;
}
4266 
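/* EXAMPLE (illustrative only, not part of stb_image). A minimal sketch of
   the per-pass size computation used by the de-interlacing loop above,
   using the same Adam7 origin/spacing tables. example_adam7_pass_size is a
   hypothetical name; w and h are assumed to be at least 1.

```c
// Size of Adam7 pass p (0..6) for an image of width w and height h (both >= 1).
// Either dimension can come out as 0, in which case the pass is empty.
static void example_adam7_pass_size(unsigned w, unsigned h, int p,
                                    unsigned *pass_w, unsigned *pass_h)
{
   static const unsigned xorig[7] = { 0,4,0,2,0,1,0 };
   static const unsigned yorig[7] = { 0,0,4,0,2,0,1 };
   static const unsigned xspc[7]  = { 8,8,4,4,2,2,1 };
   static const unsigned yspc[7]  = { 8,8,8,4,4,2,2 };
   *pass_w = (w - xorig[p] + xspc[p] - 1) / xspc[p];
   *pass_h = (h - yorig[p] + yspc[p] - 1) / yspc[p];
}
```

   Summed over all seven passes, pass_w * pass_h equals w * h, since every
   pixel belongs to exactly one pass.
*/
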
static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4268 {
4269    stbi__context *s = z->s;
4270    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4271    stbi_uc *p = z->out;
4272 
4273    // compute color-based transparency, assuming we've
4274    // already got 255 as the alpha value in the output
4275    STBI_ASSERT(out_n == 2 || out_n == 4);
4276 
4277    if (out_n == 2) {
4278       for (i=0; i < pixel_count; ++i) {
4279          p[1] = (p[0] == tc[0] ? 0 : 255);
4280          p += 2;
4281       }
4282    } else {
4283       for (i=0; i < pixel_count; ++i) {
4284          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4285             p[3] = 0;
4286          p += 4;
4287       }
4288    }
4289    return 1;
4290 }
4291 
static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
4293 {
4294    stbi__context *s = z->s;
4295    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4296    stbi__uint16 *p = (stbi__uint16*) z->out;
4297 
4298    // compute color-based transparency, assuming we've
4299    // already got 65535 as the alpha value in the output
4300    STBI_ASSERT(out_n == 2 || out_n == 4);
4301 
4302    if (out_n == 2) {
4303       for (i = 0; i < pixel_count; ++i) {
4304          p[1] = (p[0] == tc[0] ? 0 : 65535);
4305          p += 2;
4306       }
4307    } else {
4308       for (i = 0; i < pixel_count; ++i) {
4309          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4310             p[3] = 0;
4311          p += 4;
4312       }
4313    }
4314    return 1;
4315 }
4316 
static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4318 {
4319    stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4320    stbi_uc *p, *temp_out, *orig = a->out;
4321 
4322    p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4323    if (p == NULL) return stbi__err("outofmem", "Out of memory");
4324 
   // between here and free(out) below, exiting would leak
4326    temp_out = p;
4327 
4328    if (pal_img_n == 3) {
4329       for (i=0; i < pixel_count; ++i) {
4330          int n = orig[i]*4;
4331          p[0] = palette[n  ];
4332          p[1] = palette[n+1];
4333          p[2] = palette[n+2];
4334          p += 3;
4335       }
4336    } else {
4337       for (i=0; i < pixel_count; ++i) {
4338          int n = orig[i]*4;
4339          p[0] = palette[n  ];
4340          p[1] = palette[n+1];
4341          p[2] = palette[n+2];
4342          p[3] = palette[n+3];
4343          p += 4;
4344       }
4345    }
4346    STBI_FREE(a->out);
4347    a->out = temp_out;
4348 
4349    STBI_NOTUSED(len);
4350 
4351    return 1;
4352 }
4353 
static int stbi__reduce_png(stbi__png *p)
{
   int i;
   int img_len = p->s->img_x * p->s->img_y * p->s->img_out_n;
   stbi_uc *reduced;
   stbi__uint16 *orig = (stbi__uint16*)p->out;

   if (p->depth != 16) return 1; // don't need to do anything if not 16-bit data

   reduced = (stbi_uc *)stbi__malloc(img_len);
   if (reduced == NULL) return stbi__err("outofmem", "Out of memory");

   for (i = 0; i < img_len; ++i) reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top byte of each 16-bit sample is a decent approx of 16->8 bit scaling

   p->out = reduced;
   STBI_FREE(orig);

   return 1;
}
4373 
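/* EXAMPLE (illustrative only, not part of stb_image). A minimal sketch of
   the 16-bit to 8-bit reduction above on a plain array, assuming the
   samples are already in native byte order. example_reduce_16_to_8 is a
   hypothetical name.

```c
#include <stddef.h>

// Reduce native-endian 16-bit samples to 8 bits by keeping the high byte.
static void example_reduce_16_to_8(unsigned char *dst, const unsigned short *src, size_t count)
{
   size_t i;
   for (i = 0; i < count; ++i)
      dst[i] = (unsigned char) ((src[i] >> 8) & 0xFF);
}
```

   Keeping the high byte truncates rather than rounds, so 0x00FF maps to 0.
*/
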
4374 static int stbi__unpremultiply_on_load = 0;
4375 static int stbi__de_iphone_flag = 0;
4376 
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
{
   stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
}

STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
{
   stbi__de_iphone_flag = flag_true_if_should_convert;
}
4386 
static void stbi__de_iphone(stbi__png *z)
4388 {
4389    stbi__context *s = z->s;
4390    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4391    stbi_uc *p = z->out;
4392 
4393    if (s->img_out_n == 3) {  // convert bgr to rgb
4394       for (i=0; i < pixel_count; ++i) {
4395          stbi_uc t = p[0];
4396          p[0] = p[2];
4397          p[2] = t;
4398          p += 3;
4399       }
4400    } else {
4401       STBI_ASSERT(s->img_out_n == 4);
4402       if (stbi__unpremultiply_on_load) {
4403          // convert bgr to rgb and unpremultiply
4404          for (i=0; i < pixel_count; ++i) {
4405             stbi_uc a = p[3];
4406             stbi_uc t = p[0];
4407             if (a) {
4408                p[0] = p[2] * 255 / a;
4409                p[1] = p[1] * 255 / a;
4410                p[2] =  t   * 255 / a;
4411             } else {
4412                p[0] = p[2];
4413                p[2] = t;
4414             }
4415             p += 4;
4416          }
4417       } else {
4418          // convert bgr to rgb
4419          for (i=0; i < pixel_count; ++i) {
4420             stbi_uc t = p[0];
4421             p[0] = p[2];
4422             p[2] = t;
4423             p += 4;
4424          }
4425       }
4426    }
4427 }
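
/* EXAMPLE (illustrative only, not part of stb_image). A minimal sketch of
   the per-pixel work done in the unpremultiply branch above: swap B and R
   and divide each color channel by alpha. example_bgra_premul_to_rgba is a
   hypothetical name.

```c
// Convert one premultiplied BGRA pixel to straight-alpha RGBA in place.
// Uses truncating division like the loop above; alpha 0 only swaps channels.
static void example_bgra_premul_to_rgba(unsigned char p[4])
{
   unsigned char a = p[3];
   unsigned char t = p[0];
   if (a) {
      p[0] = (unsigned char) (p[2] * 255 / a);
      p[1] = (unsigned char) (p[1] * 255 / a);
      p[2] = (unsigned char) (t    * 255 / a);
   } else {
      p[0] = p[2];
      p[2] = t;
   }
}
```

   The result is only meaningful when each color channel is <= alpha, as it
   is for correctly premultiplied data; otherwise the quotient exceeds 255
   and wraps when narrowed to a byte.
*/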
4428 
4429 #define STBI__PNG_TYPE(a,b,c,d)  (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4430 
static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4432 {
4433    stbi_uc palette[1024], pal_img_n=0;
4434    stbi_uc has_trans=0, tc[3];
4435    stbi__uint16 tc16[3];
4436    stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4437    int first=1,k,interlace=0, color=0, is_iphone=0;
4438    stbi__context *s = z->s;
4439 
4440    z->expanded = NULL;
4441    z->idata = NULL;
4442    z->out = NULL;
4443 
4444    if (!stbi__check_png_header(s)) return 0;
4445 
4446    if (scan == STBI__SCAN_type) return 1;
4447 
4448    for (;;) {
4449       stbi__pngchunk c = stbi__get_chunk_header(s);
4450       switch (c.type) {
4451          case STBI__PNG_TYPE('C','g','B','I'):
4452             is_iphone = 1;
4453             stbi__skip(s, c.length);
4454             break;
4455          case STBI__PNG_TYPE('I','H','D','R'): {
4456             int comp,filter;
4457             if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4458             first = 0;
4459             if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4460             s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4461             s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4462             z->depth = stbi__get8(s);  if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16)  return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
4463             color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
            if (color == 3 && z->depth == 16)                  return stbi__err("bad ctype","Corrupt PNG");
4465             if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4466             comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
4467             filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
4468             interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4469             if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4470             if (!pal_img_n) {
4471                s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4472                if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4473                if (scan == STBI__SCAN_header) return 1;
4474             } else {
4475                // if paletted, then pal_n is our final components, and
4476                // img_n is # components to decompress/filter.
4477                s->img_n = 1;
4478                if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4479                // if SCAN_header, have to scan to see if we have a tRNS
4480             }
4481             break;
4482          }
4483 
4484          case STBI__PNG_TYPE('P','L','T','E'):  {
4485             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4486             if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4487             pal_len = c.length / 3;
4488             if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4489             for (i=0; i < pal_len; ++i) {
4490                palette[i*4+0] = stbi__get8(s);
4491                palette[i*4+1] = stbi__get8(s);
4492                palette[i*4+2] = stbi__get8(s);
4493                palette[i*4+3] = 255;
4494             }
4495             break;
4496          }
4497 
4498          case STBI__PNG_TYPE('t','R','N','S'): {
4499             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4500             if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4501             if (pal_img_n) {
4502                if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4503                if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4504                if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4505                pal_img_n = 4;
4506                for (i=0; i < c.length; ++i)
4507                   palette[i*4+3] = stbi__get8(s);
4508             } else {
4509                if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4510                if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4511                has_trans = 1;
4512                if (z->depth == 16) {
4513                   for (k = 0; k < s->img_n; ++k) tc16[k] = stbi__get16be(s); // copy the values as-is
4514                } else {
                  for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // scale sub-8-bit key values up to 0..255 to match the expanded pixels
4516                }
4517             }
4518             break;
4519          }
4520 
4521          case STBI__PNG_TYPE('I','D','A','T'): {
4522             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4523             if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4524             if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4525             if ((int)(ioff + c.length) < (int)ioff) return 0;
4526             if (ioff + c.length > idata_limit) {
4527                stbi__uint32 idata_limit_old = idata_limit;
4528                stbi_uc *p;
4529                if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4530                while (ioff + c.length > idata_limit)
4531                   idata_limit *= 2;
4532                STBI_NOTUSED(idata_limit_old);
4533                p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4534                z->idata = p;
4535             }
4536             if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4537             ioff += c.length;
4538             break;
4539          }
4540 
4541          case STBI__PNG_TYPE('I','E','N','D'): {
4542             stbi__uint32 raw_len, bpl;
4543             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4544             if (scan != STBI__SCAN_load) return 1;
4545             if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4546             // initial guess for decoded data size to avoid unnecessary reallocs
4547             bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
4548             raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4549             z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4550             if (z->expanded == NULL) return 0; // zlib should set error
4551             STBI_FREE(z->idata); z->idata = NULL;
4552             if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4553                s->img_out_n = s->img_n+1;
4554             else
4555                s->img_out_n = s->img_n;
4556             if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
4557             if (has_trans) {
4558                if (z->depth == 16) {
4559                   if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
4560                } else {
4561                   if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4562                }
4563             }
4564             if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4565                stbi__de_iphone(z);
4566             if (pal_img_n) {
4567                // pal_img_n == 3 or 4
4568                s->img_n = pal_img_n; // record the actual colors we had
4569                s->img_out_n = pal_img_n;
4570                if (req_comp >= 3) s->img_out_n = req_comp;
4571                if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4572                   return 0;
4573             }
4574             STBI_FREE(z->expanded); z->expanded = NULL;
4575             return 1;
4576          }
4577 
4578          default:
4579             // if critical, fail
4580             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4581             if ((c.type & (1 << 29)) == 0) {
4582                #ifndef STBI_NO_FAILURE_STRINGS
4583                // not threadsafe
4584                static char invalid_chunk[] = "XXXX PNG chunk not known";
4585                invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4586                invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4587                invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
4588                invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
4589                #endif
4590                return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4591             }
4592             stbi__skip(s, c.length);
4593             break;
4594       }
4595       // end of PNG chunk, read and skip CRC
4596       stbi__get32be(s);
4597    }
4598 }
4599 
4600 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4601 {
4602    unsigned char *result=NULL;
4603    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4604    if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4605       if (p->depth == 16) {
4606          if (!stbi__reduce_png(p)) {
4607             return result;
4608          }
4609       }
4610       result = p->out;
4611       p->out = NULL;
4612       if (req_comp && req_comp != p->s->img_out_n) {
4613          result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4614          p->s->img_out_n = req_comp;
4615          if (result == NULL) return result;
4616       }
4617       *x = p->s->img_x;
4618       *y = p->s->img_y;
4619       if (n) *n = p->s->img_n;
4620    }
4621    STBI_FREE(p->out);      p->out      = NULL;
4622    STBI_FREE(p->expanded); p->expanded = NULL;
4623    STBI_FREE(p->idata);    p->idata    = NULL;
4624 
4625    return result;
4626 }
4627 
4628 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4629 {
4630    stbi__png p;
4631    p.s = s;
4632    return stbi__do_png(&p, x,y,comp,req_comp);
4633 }
4634 
4635 static int stbi__png_test(stbi__context *s)
4636 {
4637    int r;
4638    r = stbi__check_png_header(s);
4639    stbi__rewind(s);
4640    return r;
4641 }
4642 
4643 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4644 {
4645    if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4646       stbi__rewind( p->s );
4647       return 0;
4648    }
4649    if (x) *x = p->s->img_x;
4650    if (y) *y = p->s->img_y;
4651    if (comp) *comp = p->s->img_n;
4652    return 1;
4653 }
4654 
4655 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4656 {
4657    stbi__png p;
4658    p.s = s;
4659    return stbi__png_info_raw(&p, x, y, comp);
4660 }
4661 #endif
4662 
4663 // Microsoft/Windows BMP image
4664 
4665 #ifndef STBI_NO_BMP
4666 static int stbi__bmp_test_raw(stbi__context *s)
4667 {
4668    int r;
4669    int sz;
4670    if (stbi__get8(s) != 'B') return 0;
4671    if (stbi__get8(s) != 'M') return 0;
4672    stbi__get32le(s); // discard filesize
4673    stbi__get16le(s); // discard reserved
4674    stbi__get16le(s); // discard reserved
4675    stbi__get32le(s); // discard data offset
4676    sz = stbi__get32le(s);
4677    r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4678    return r;
4679 }
4680 
4681 static int stbi__bmp_test(stbi__context *s)
4682 {
4683    int r = stbi__bmp_test_raw(s);
4684    stbi__rewind(s);
4685    return r;
4686 }
4687 
4688 
4689 // returns 0..31 for the highest set bit, or -1 if z is 0
4690 static int stbi__high_bit(unsigned int z)
4691 {
4692    int n=0;
4693    if (z == 0) return -1;
4694    if (z >= 0x10000) n += 16, z >>= 16;
4695    if (z >= 0x00100) n +=  8, z >>=  8;
4696    if (z >= 0x00010) n +=  4, z >>=  4;
4697    if (z >= 0x00004) n +=  2, z >>=  2;
4698    if (z >= 0x00002) n +=  1, z >>=  1;
4699    return n;
4700 }
4701 
4702 static int stbi__bitcount(unsigned int a)
4703 {
4704    a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
4705    a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
4706    a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4707    a = (a + (a >> 8)); // max 16 per 8 bits
4708    a = (a + (a >> 16)); // max 32 per 8 bits
4709    return a & 0xff;
4710 }
4711 
4712 static int stbi__shiftsigned(int v, int shift, int bits)
4713 {
4714    int result;
4715    int z=0;
4716 
4717    if (shift < 0) v <<= -shift;
4718    else v >>= shift;
4719    result = v;
4720 
4721    z = bits;
4722    while (z < 8) {
4723       result += v >> z;
4724       z += bits;
4725    }
4726    return result;
4727 }
4728 
4729 typedef struct
4730 {
4731    int bpp, offset, hsz;
4732    unsigned int mr,mg,mb,ma, all_a;
4733 } stbi__bmp_data;
4734 
4735 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
4736 {
4737    int hsz;
4738    if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4739    stbi__get32le(s); // discard filesize
4740    stbi__get16le(s); // discard reserved
4741    stbi__get16le(s); // discard reserved
4742    info->offset = stbi__get32le(s);
4743    info->hsz = hsz = stbi__get32le(s);
4744    info->mr = info->mg = info->mb = info->ma = 0;
4745 
4746    if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4747    if (hsz == 12) {
4748       s->img_x = stbi__get16le(s);
4749       s->img_y = stbi__get16le(s);
4750    } else {
4751       s->img_x = stbi__get32le(s);
4752       s->img_y = stbi__get32le(s);
4753    }
4754    if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4755    info->bpp = stbi__get16le(s);
4756    if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4757    if (hsz != 12) {
4758       int compress = stbi__get32le(s);
4759       if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4760       stbi__get32le(s); // discard sizeof
4761       stbi__get32le(s); // discard hres
4762       stbi__get32le(s); // discard vres
4763       stbi__get32le(s); // discard colorsused
4764       stbi__get32le(s); // discard max important
4765       if (hsz == 40 || hsz == 56) {
4766          if (hsz == 56) {
4767             stbi__get32le(s);
4768             stbi__get32le(s);
4769             stbi__get32le(s);
4770             stbi__get32le(s);
4771          }
4772          if (info->bpp == 16 || info->bpp == 32) {
4773             if (compress == 0) {
4774                if (info->bpp == 32) {
4775                   info->mr = 0xffu << 16;
4776                   info->mg = 0xffu <<  8;
4777                   info->mb = 0xffu <<  0;
4778                   info->ma = 0xffu << 24;
4779                   info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
4780                } else {
4781                   info->mr = 31u << 10;
4782                   info->mg = 31u <<  5;
4783                   info->mb = 31u <<  0;
4784                }
4785             } else if (compress == 3) {
4786                info->mr = stbi__get32le(s);
4787                info->mg = stbi__get32le(s);
4788                info->mb = stbi__get32le(s);
4789                // not documented, but generated by photoshop and handled by mspaint
4790                if (info->mr == info->mg && info->mg == info->mb) {
4791                   // ?!?!?
4792                   return stbi__errpuc("bad BMP", "bad BMP");
4793                }
4794             } else
4795                return stbi__errpuc("bad BMP", "bad BMP");
4796          }
4797       } else {
4798          int i;
4799          if (hsz != 108 && hsz != 124)
4800             return stbi__errpuc("bad BMP", "bad BMP");
4801          info->mr = stbi__get32le(s);
4802          info->mg = stbi__get32le(s);
4803          info->mb = stbi__get32le(s);
4804          info->ma = stbi__get32le(s);
4805          stbi__get32le(s); // discard color space
4806          for (i=0; i < 12; ++i)
4807             stbi__get32le(s); // discard color space parameters
4808          if (hsz == 124) {
4809             stbi__get32le(s); // discard rendering intent
4810             stbi__get32le(s); // discard offset of profile data
4811             stbi__get32le(s); // discard size of profile data
4812             stbi__get32le(s); // discard reserved
4813          }
4814       }
4815    }
4816    return (void *) 1;
4817 }
4818 
4819 
4820 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4821 {
4822    stbi_uc *out;
4823    unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
4824    stbi_uc pal[256][4];
4825    int psize=0,i,j,width;
4826    int flip_vertically, pad, target;
4827    stbi__bmp_data info;
4828 
4829    info.all_a = 255;
4830    if (stbi__bmp_parse_header(s, &info) == NULL)
4831       return NULL; // error code already set
4832 
4833    flip_vertically = ((int) s->img_y) > 0;
4834    s->img_y = abs((int) s->img_y);
4835 
4836    mr = info.mr;
4837    mg = info.mg;
4838    mb = info.mb;
4839    ma = info.ma;
4840    all_a = info.all_a;
4841 
4842    if (info.hsz == 12) {
4843       if (info.bpp < 24)
4844          psize = (info.offset - 14 - 24) / 3;
4845    } else {
4846       if (info.bpp < 16)
4847          psize = (info.offset - 14 - info.hsz) >> 2;
4848    }
4849 
4850    s->img_n = ma ? 4 : 3;
4851    if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4852       target = req_comp;
4853    else
4854       target = s->img_n; // if they want monochrome, we'll post-convert
4855 
4856    out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4857    if (!out) return stbi__errpuc("outofmem", "Out of memory");
4858    if (info.bpp < 16) {
4859       int z=0;
4860       if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4861       for (i=0; i < psize; ++i) {
4862          pal[i][2] = stbi__get8(s);
4863          pal[i][1] = stbi__get8(s);
4864          pal[i][0] = stbi__get8(s);
4865          if (info.hsz != 12) stbi__get8(s);
4866          pal[i][3] = 255;
4867       }
4868       stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
4869       if (info.bpp == 4) width = (s->img_x + 1) >> 1;
4870       else if (info.bpp == 8) width = s->img_x;
4871       else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4872       pad = (-width)&3;
4873       for (j=0; j < (int) s->img_y; ++j) {
4874          for (i=0; i < (int) s->img_x; i += 2) {
4875             int v=stbi__get8(s),v2=0;
4876             if (info.bpp == 4) {
4877                v2 = v & 15;
4878                v >>= 4;
4879             }
4880             out[z++] = pal[v][0];
4881             out[z++] = pal[v][1];
4882             out[z++] = pal[v][2];
4883             if (target == 4) out[z++] = 255;
4884             if (i+1 == (int) s->img_x) break;
4885             v = (info.bpp == 8) ? stbi__get8(s) : v2;
4886             out[z++] = pal[v][0];
4887             out[z++] = pal[v][1];
4888             out[z++] = pal[v][2];
4889             if (target == 4) out[z++] = 255;
4890          }
4891          stbi__skip(s, pad);
4892       }
4893    } else {
4894       int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4895       int z = 0;
4896       int easy=0;
4897       stbi__skip(s, info.offset - 14 - info.hsz);
4898       if (info.bpp == 24) width = 3 * s->img_x;
4899       else if (info.bpp == 16) width = 2*s->img_x;
4900       else /* bpp = 32 and pad = 0 */ width=0;
4901       pad = (-width) & 3;
4902       if (info.bpp == 24) {
4903          easy = 1;
4904       } else if (info.bpp == 32) {
4905          if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4906             easy = 2;
4907       }
4908       if (!easy) {
4909          if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4910          // right shift amt to put high bit in position #7
4911          rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4912          gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4913          bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4914          ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4915       }
4916       for (j=0; j < (int) s->img_y; ++j) {
4917          if (easy) {
4918             for (i=0; i < (int) s->img_x; ++i) {
4919                unsigned char a;
4920                out[z+2] = stbi__get8(s);
4921                out[z+1] = stbi__get8(s);
4922                out[z+0] = stbi__get8(s);
4923                z += 3;
4924                a = (easy == 2 ? stbi__get8(s) : 255);
4925                all_a |= a;
4926                if (target == 4) out[z++] = a;
4927             }
4928          } else {
4929             int bpp = info.bpp;
4930             for (i=0; i < (int) s->img_x; ++i) {
4931                stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4932                int a;
4933                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4934                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4935                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4936                a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4937                all_a |= a;
4938                if (target == 4) out[z++] = STBI__BYTECAST(a);
4939             }
4940          }
4941          stbi__skip(s, pad);
4942       }
4943    }
4944 
4945    // if alpha channel is all 0s, replace with all 255s
4946    if (target == 4 && all_a == 0)
4947       for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4948          out[i] = 255;
4949 
4950    if (flip_vertically) {
4951       stbi_uc t;
4952       for (j=0; j < (int) s->img_y>>1; ++j) {
4953          stbi_uc *p1 = out +      j     *s->img_x*target;
4954          stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4955          for (i=0; i < (int) s->img_x*target; ++i) {
4956             t = p1[i], p1[i] = p2[i], p2[i] = t;
4957          }
4958       }
4959    }
4960 
4961    if (req_comp && req_comp != target) {
4962       out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4963       if (out == NULL) return out; // stbi__convert_format frees input on failure
4964    }
4965 
4966    *x = s->img_x;
4967    *y = s->img_y;
4968    if (comp) *comp = s->img_n;
4969    return out;
4970 }
4971 #endif
4972 
4973 // Targa Truevision - TGA
4974 // by Jonathan Dummer
4975 #ifndef STBI_NO_TGA
4976 // returns the component count (STBI_grey .. STBI_rgb_alpha), 0 on error
4977 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
4978 {
4979    // only RGB or RGBA (incl. 16bit) or grey allowed
4980    if(is_rgb16) *is_rgb16 = 0;
4981    switch(bits_per_pixel) {
4982       case 8:  return STBI_grey;
4983       case 16: if(is_grey) return STBI_grey_alpha;
4984             // else: fall-through
4985       case 15: if(is_rgb16) *is_rgb16 = 1;
4986             return STBI_rgb;
4987       case 24: // fall-through
4988       case 32: return bits_per_pixel/8;
4989       default: return 0;
4990    }
4991 }
4992 
4993 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4994 {
4995     int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
4996     int sz, tga_colormap_type;
4997     stbi__get8(s);                   // discard Offset
4998     tga_colormap_type = stbi__get8(s); // colormap type
4999     if( tga_colormap_type > 1 ) {
5000         stbi__rewind(s);
5001         return 0;      // only RGB or indexed allowed
5002     }
5003     tga_image_type = stbi__get8(s); // image type
5004     if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
5005         if (tga_image_type != 1 && tga_image_type != 9) {
5006             stbi__rewind(s);
5007             return 0;
5008         }
5009         stbi__skip(s,4);       // skip index of first colormap entry and number of entries
5010         sz = stbi__get8(s);    //   check bits per palette color entry
5011         if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
5012             stbi__rewind(s);
5013             return 0;
5014         }
5015         stbi__skip(s,4);       // skip image x and y origin
5016         tga_colormap_bpp = sz;
5017     } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
5018         if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
5019             stbi__rewind(s);
5020             return 0; // only RGB or grey allowed, +/- RLE
5021         }
5022         stbi__skip(s,9); // skip colormap specification and image x/y origin
5023         tga_colormap_bpp = 0;
5024     }
5025     tga_w = stbi__get16le(s);
5026     if( tga_w < 1 ) {
5027         stbi__rewind(s);
5028         return 0;   // test width
5029     }
5030     tga_h = stbi__get16le(s);
5031     if( tga_h < 1 ) {
5032         stbi__rewind(s);
5033         return 0;   // test height
5034     }
5035     tga_bits_per_pixel = stbi__get8(s); // bits per pixel
5036     stbi__get8(s); // ignore alpha bits
5037     if (tga_colormap_bpp != 0) {
5038         if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
5039             // when using a colormap, tga_bits_per_pixel is the size of the indexes
5040             // I don't think anything but 8 or 16bit indexes makes sense
5041             stbi__rewind(s);
5042             return 0;
5043         }
5044         tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
5045     } else {
5046         tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
5047     }
5048     if(!tga_comp) {
5049       stbi__rewind(s);
5050       return 0;
5051     }
5052     if (x) *x = tga_w;
5053     if (y) *y = tga_h;
5054     if (comp) *comp = tga_comp;
5055     return 1;                   // seems to have passed everything
5056 }
5057 
5058 static int stbi__tga_test(stbi__context *s)
5059 {
5060    int res = 0;
5061    int sz, tga_color_type;
5062    stbi__get8(s);      //   discard Offset
5063    tga_color_type = stbi__get8(s);   //   color type
5064    if ( tga_color_type > 1 ) goto errorEnd;   //   only RGB or indexed allowed
5065    sz = stbi__get8(s);   //   image type
5066    if ( tga_color_type == 1 ) { // colormapped (paletted) image
5067       if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
5068       stbi__skip(s,4);       // skip index of first colormap entry and number of entries
5069       sz = stbi__get8(s);    //   check bits per palette color entry
5070       if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5071       stbi__skip(s,4);       // skip image x and y origin
5072    } else { // "normal" image w/o colormap
5073       if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
5074       stbi__skip(s,9); // skip colormap specification and image x/y origin
5075    }
5076    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test width
5077    if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test height
5078    sz = stbi__get8(s);   //   bits per pixel
5079    if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
5080    if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5081 
5082    res = 1; // if we got this far, everything's good and we can return 1 instead of 0
5083 
5084 errorEnd:
5085    stbi__rewind(s);
5086    return res;
5087 }
5088 
5089 // read 16bit value and convert to 24bit RGB
5090 static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
5091 {
5092    stbi__uint16 px = stbi__get16le(s);
5093    stbi__uint16 fiveBitMask = 31;
5094    // we have 3 channels with 5bits each
5095    int r = (px >> 10) & fiveBitMask;
5096    int g = (px >> 5) & fiveBitMask;
5097    int b = px & fiveBitMask;
5098    // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
5099    out[0] = (r * 255)/31;
5100    out[1] = (g * 255)/31;
5101    out[2] = (b * 255)/31;
5102 
5103    // some people claim that the most significant bit might be used for alpha
5104    // (possibly if an alpha-bit is set in the "image descriptor byte")
5105    // but that only made 16bit test images completely translucent..
5106    // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
5107 }
5108 
5109 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5110 {
5111    //   read in the TGA header stuff
5112    int tga_offset = stbi__get8(s);
5113    int tga_indexed = stbi__get8(s);
5114    int tga_image_type = stbi__get8(s);
5115    int tga_is_RLE = 0;
5116    int tga_palette_start = stbi__get16le(s);
5117    int tga_palette_len = stbi__get16le(s);
5118    int tga_palette_bits = stbi__get8(s);
5119    int tga_x_origin = stbi__get16le(s);
5120    int tga_y_origin = stbi__get16le(s);
5121    int tga_width = stbi__get16le(s);
5122    int tga_height = stbi__get16le(s);
5123    int tga_bits_per_pixel = stbi__get8(s);
5124    int tga_comp, tga_rgb16=0;
5125    int tga_inverted = stbi__get8(s);
5126    // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5127    //   image data
5128    unsigned char *tga_data;
5129    unsigned char *tga_palette = NULL;
5130    int i, j;
5131    unsigned char raw_data[4];
5132    int RLE_count = 0;
5133    int RLE_repeating = 0;
5134    int read_next_pixel = 1;
5135 
5136    //   do a tiny bit of preprocessing
5137    if ( tga_image_type >= 8 )
5138    {
5139       tga_image_type -= 8;
5140       tga_is_RLE = 1;
5141    }
5142    tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5143 
5144    //   If I'm paletted, then I'll use the number of bits from the palette
5145    if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5146    else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5147 
5148    if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5149       return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5150 
5151    //   tga info
5152    *x = tga_width;
5153    *y = tga_height;
5154    if (comp) *comp = tga_comp;
5155 
5156    tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
5157    if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5158 
5159    // skip to the data's starting position (offset usually = 0)
5160    stbi__skip(s, tga_offset );
5161 
5162    if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5163       for (i=0; i < tga_height; ++i) {
5164          int row = tga_inverted ? tga_height -i - 1 : i;
5165          stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5166          stbi__getn(s, tga_row, tga_width * tga_comp);
5167       }
5168    } else  {
5169       //   do I need to load a palette?
5170       if ( tga_indexed)
5171       {
5172          //   any data to skip? (offset usually = 0)
5173          stbi__skip(s, tga_palette_start );
5174          //   load the palette
5175          tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
5176          if (!tga_palette) {
5177             STBI_FREE(tga_data);
5178             return stbi__errpuc("outofmem", "Out of memory");
5179          }
5180          if (tga_rgb16) {
5181             stbi_uc *pal_entry = tga_palette;
5182             STBI_ASSERT(tga_comp == STBI_rgb);
5183             for (i=0; i < tga_palette_len; ++i) {
5184                stbi__tga_read_rgb16(s, pal_entry);
5185                pal_entry += tga_comp;
5186             }
5187          } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5188                STBI_FREE(tga_data);
5189                STBI_FREE(tga_palette);
5190                return stbi__errpuc("bad palette", "Corrupt TGA");
5191          }
5192       }
5193       //   load the data
5194       for (i=0; i < tga_width * tga_height; ++i)
5195       {
5196          //   if I'm in RLE mode, do I need to get an RLE chunk?
5197          if ( tga_is_RLE )
5198          {
5199             if ( RLE_count == 0 )
5200             {
5201                //   yep, get the next byte as a RLE command
5202                int RLE_cmd = stbi__get8(s);
5203                RLE_count = 1 + (RLE_cmd & 127);
5204                RLE_repeating = RLE_cmd >> 7;
5205                read_next_pixel = 1;
5206             } else if ( !RLE_repeating )
5207             {
5208                read_next_pixel = 1;
5209             }
5210          } else
5211          {
5212             read_next_pixel = 1;
5213          }
5214          //   OK, if I need to read a pixel, do it now
5215          if ( read_next_pixel )
5216          {
5217             //   load however much data we did have
5218             if ( tga_indexed )
5219             {
5220                // read in index, then perform the lookup
5221                int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5222                if ( pal_idx >= tga_palette_len ) {
5223                   // invalid index
5224                   pal_idx = 0;
5225                }
5226                pal_idx *= tga_comp;
5227                for (j = 0; j < tga_comp; ++j) {
5228                   raw_data[j] = tga_palette[pal_idx+j];
5229                }
5230             } else if(tga_rgb16) {
5231                STBI_ASSERT(tga_comp == STBI_rgb);
5232                stbi__tga_read_rgb16(s, raw_data);
5233             } else {
5234                //   read in the data raw
5235                for (j = 0; j < tga_comp; ++j) {
5236                   raw_data[j] = stbi__get8(s);
5237                }
5238             }
5239             //   clear the reading flag for the next pixel
5240             read_next_pixel = 0;
5241          } // end of reading a pixel
5242 
5243          // copy data
5244          for (j = 0; j < tga_comp; ++j)
5245            tga_data[i*tga_comp+j] = raw_data[j];
5246 
5247          //   in case we're in RLE mode, keep counting down
5248          --RLE_count;
5249       }
5250       //   do I need to invert the image?
5251       if ( tga_inverted )
5252       {
5253          for (j = 0; j*2 < tga_height; ++j)
5254          {
5255             int index1 = j * tga_width * tga_comp;
5256             int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5257             for (i = tga_width * tga_comp; i > 0; --i)
5258             {
5259                unsigned char temp = tga_data[index1];
5260                tga_data[index1] = tga_data[index2];
5261                tga_data[index2] = temp;
5262                ++index1;
5263                ++index2;
5264             }
5265          }
5266       }
5267       //   clear my palette, if I had one
5268       if ( tga_palette != NULL )
5269       {
5270          STBI_FREE( tga_palette );
5271       }
5272    }
5273 
5274    // swap RGB - if the source data was RGB16, it already is in the right order
5275    if (tga_comp >= 3 && !tga_rgb16)
5276    {
5277       unsigned char* tga_pixel = tga_data;
5278       for (i=0; i < tga_width * tga_height; ++i)
5279       {
5280          unsigned char temp = tga_pixel[0];
5281          tga_pixel[0] = tga_pixel[2];
5282          tga_pixel[2] = temp;
5283          tga_pixel += tga_comp;
5284       }
5285    }
5286 
5287    // convert to target component count
5288    if (req_comp && req_comp != tga_comp)
5289       tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5290 
5291    //   the things I do to get rid of an error message, and yet keep
5292    //   Microsoft's C compilers happy... [8^(
5293    tga_palette_start = tga_palette_len = tga_palette_bits =
5294          tga_x_origin = tga_y_origin = 0;
5295    //   OK, done
5296    return tga_data;
5297 }
5298 #endif
5299 
5300 // *************************************************************************************************
5301 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5302 
5303 #ifndef STBI_NO_PSD
5304 static int stbi__psd_test(stbi__context *s)
5305 {
5306    int r = (stbi__get32be(s) == 0x38425053);
5307    stbi__rewind(s);
5308    return r;
5309 }
5310 
5311 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5312 {
5313    int   pixelCount;
5314    int channelCount, compression;
5315    int channel, i, count, len;
5316    int bitdepth;
5317    int w,h;
5318    stbi_uc *out;
5319 
5320    // Check identifier
5321    if (stbi__get32be(s) != 0x38425053)   // "8BPS"
5322       return stbi__errpuc("not PSD", "Corrupt PSD image");
5323 
5324    // Check file type version.
5325    if (stbi__get16be(s) != 1)
5326       return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5327 
5328    // Skip 6 reserved bytes.
5329    stbi__skip(s, 6 );
5330 
5331    // Read the number of channels (R, G, B, A, etc).
5332    channelCount = stbi__get16be(s);
5333    if (channelCount < 0 || channelCount > 16)
5334       return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5335 
5336    // Read the rows and columns of the image.
5337    h = stbi__get32be(s);
5338    w = stbi__get32be(s);
5339 
5340    // Make sure the depth is 8 or 16 bits.
5341    bitdepth = stbi__get16be(s);
5342    if (bitdepth != 8 && bitdepth != 16)
5343       return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5344 
5345    // Make sure the color mode is RGB.
5346    // Valid options are:
5347    //   0: Bitmap
5348    //   1: Grayscale
5349    //   2: Indexed color
5350    //   3: RGB color
5351    //   4: CMYK color
5352    //   7: Multichannel
5353    //   8: Duotone
5354    //   9: Lab color
5355    if (stbi__get16be(s) != 3)
5356       return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5357 
5358    // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
5359    stbi__skip(s,stbi__get32be(s) );
5360 
5361    // Skip the image resources.  (resolution, pen tool paths, etc)
5362    stbi__skip(s, stbi__get32be(s) );
5363 
5364    // Skip the reserved data.
5365    stbi__skip(s, stbi__get32be(s) );
5366 
5367    // Find out if the data is compressed.
5368    // Known values:
5369    //   0: no compression
5370    //   1: RLE compressed
5371    compression = stbi__get16be(s);
5372    if (compression > 1)
5373       return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5374 
5375    // Create the destination image.
5376    out = (stbi_uc *) stbi__malloc(4 * w*h);
5377    if (!out) return stbi__errpuc("outofmem", "Out of memory");
5378    pixelCount = w*h;
5379 
5380    // Initialize the data to zero.
5381    //memset( out, 0, pixelCount * 4 );
5382 
5383    // Finally, the image data.
5384    if (compression) {
5385       // RLE as used by .PSD and .TIFF
5386       // Loop until you get the number of unpacked bytes you are expecting:
5387       //     Read the next source byte into n.
5388       //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5389       //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5390       //     Else if n is -128 (byte value 128), noop.
5391       // Endloop
5392 
5393       // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
5394       // which we're going to just skip.
5395       stbi__skip(s, h * channelCount * 2 );
5396 
5397       // Read the RLE data by channel.
5398       for (channel = 0; channel < 4; channel++) {
5399          stbi_uc *p;
5400 
5401          p = out+channel;
5402          if (channel >= channelCount) {
5403             // Fill this channel with default data.
5404             for (i = 0; i < pixelCount; i++, p += 4)
5405                *p = (channel == 3 ? 255 : 0);
5406          } else {
5407             // Read the RLE data.
5408             count = 0;
5409             while (count < pixelCount) {
5410                len = stbi__get8(s);
5411                if (len == 128) {
5412                   // No-op.
5413                } else if (len < 128) {
5414                   // Copy next len+1 bytes literally.
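                  len++;
                  // Reject runs that would write past the end of this channel.
                  if (len > pixelCount - count) { STBI_FREE(out); return stbi__errpuc("corrupt", "Bad PSD RLE data"); }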
5416                   count += len;
5417                   while (len) {
5418                      *p = stbi__get8(s);
5419                      p += 4;
5420                      len--;
5421                   }
5422                } else if (len > 128) {
5423                   stbi_uc   val;
5424                   // Next -len+1 bytes in the dest are replicated from next source byte.
5425                   // (Interpret len as a negative 8-bit int.)
5426                   len ^= 0x0FF;
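                  len += 2;
                  // Reject runs that would write past the end of this channel.
                  if (len > pixelCount - count) { STBI_FREE(out); return stbi__errpuc("corrupt", "Bad PSD RLE data"); }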
5428                   val = stbi__get8(s);
5429                   count += len;
5430                   while (len) {
5431                      *p = val;
5432                      p += 4;
5433                      len--;
5434                   }
5435                }
5436             }
5437          }
5438       }
5439 
5440    } else {
      // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
      // where each channel consists of an 8-bit (or 16-bit big-endian) value for each pixel in the image.
5443 
5444       // Read the data by channel.
5445       for (channel = 0; channel < 4; channel++) {
5446          stbi_uc *p;
5447 
5448          p = out + channel;
5449          if (channel >= channelCount) {
5450             // Fill this channel with default data.
5451             stbi_uc val = channel == 3 ? 255 : 0;
5452             for (i = 0; i < pixelCount; i++, p += 4)
5453                *p = val;
5454          } else {
5455             // Read the data.
5456             if (bitdepth == 16) {
5457                for (i = 0; i < pixelCount; i++, p += 4)
5458                   *p = (stbi_uc) (stbi__get16be(s) >> 8);
5459             } else {
5460                for (i = 0; i < pixelCount; i++, p += 4)
5461                   *p = stbi__get8(s);
5462             }
5463          }
5464       }
5465    }
5466 
5467    if (channelCount >= 4) {
5468       for (i=0; i < w*h; ++i) {
5469          unsigned char *pixel = out + 4*i;
5470          if (pixel[3] != 0 && pixel[3] != 255) {
5471             // remove weird white matte from PSD
5472             float a = pixel[3] / 255.0f;
5473             float ra = 1.0f / a;
5474             float inv_a = 255.0f * (1 - ra);
5475             pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
5476             pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
5477             pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
5478          }
5479       }
5480    }
5481 
5482    if (req_comp && req_comp != 4) {
5483       out = stbi__convert_format(out, 4, req_comp, w, h);
5484       if (out == NULL) return out; // stbi__convert_format frees input on failure
5485    }
5486 
5487    if (comp) *comp = 4;
5488    *y = h;
5489    *x = w;
5490 
5491    return out;
5492 }
5493 #endif
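
/* Illustrative sketch, not part of stb_image: the PackBits-style RLE scheme
   described in the PSD loader above, written as a standalone, bounds-checked
   decoder over a flat byte buffer. The name packbits_decode and its signature
   are assumptions made for this example; the loader itself decodes straight
   into a strided RGBA buffer instead.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: decodes PackBits-style RLE from src into dst.
// Returns the number of bytes written, or -1 if the input is truncated
// or a packet would overflow dst.
static long packbits_decode(const unsigned char *src, size_t src_len,
                            unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;
   assert(src != NULL && dst != NULL);
   while (out < dst_len && in < src_len) {
      unsigned int n = src[in++];
      if (n == 128) {
         // No-op.
      } else if (n < 128) {
         // Copy the next n+1 bytes literally.
         size_t len = n + 1, i;
         if (len > src_len - in || len > dst_len - out) return -1;
         for (i = 0; i < len; ++i) dst[out++] = src[in++];
      } else {
         // n in 129..255: replicate the next byte 257-n times (128..2 copies).
         size_t len = 257 - n, i;
         unsigned char val;
         if (in >= src_len || len > dst_len - out) return -1;
         val = src[in++];
         for (i = 0; i < len; ++i) dst[out++] = val;
      }
   }
   return (long) out;
}
```

   A caller would size dst to the expected unpacked length (for PSD, one
   channel's worth of pixels) and treat a -1 return as corrupt data.
*/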
5494 
5495 // *************************************************************************************************
5496 // Softimage PIC loader
5497 // by Tom Seddon
5498 //
5499 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5500 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5501 
5502 #ifndef STBI_NO_PIC
static int stbi__pic_is4(stbi__context *s,const char *str)
5504 {
5505    int i;
5506    for (i=0; i<4; ++i)
5507       if (stbi__get8(s) != (stbi_uc)str[i])
5508          return 0;
5509 
5510    return 1;
5511 }
5512 
static int stbi__pic_test_core(stbi__context *s)
5514 {
5515    int i;
5516 
5517    if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5518       return 0;
5519 
5520    for(i=0;i<84;++i)
5521       stbi__get8(s);
5522 
5523    if (!stbi__pic_is4(s,"PICT"))
5524       return 0;
5525 
5526    return 1;
5527 }
5528 
5529 typedef struct
5530 {
5531    stbi_uc size,type,channel;
5532 } stbi__pic_packet;
5533 
static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5535 {
5536    int mask=0x80, i;
5537 
5538    for (i=0; i<4; ++i, mask>>=1) {
5539       if (channel & mask) {
5540          if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5541          dest[i]=stbi__get8(s);
5542       }
5543    }
5544 
5545    return dest;
5546 }
5547 
static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5549 {
5550    int mask=0x80,i;
5551 
5552    for (i=0;i<4; ++i, mask>>=1)
5553       if (channel&mask)
5554          dest[i]=src[i];
5555 }
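
/* Illustrative sketch, not part of stb_image: how the PIC channel mask used by
   stbi__readval and stbi__copyval selects RGBA components. Bit 0x80 maps to
   component 0 (red), 0x40 to green, 0x20 to blue and 0x10 to alpha. The helper
   names below are hypothetical and exist only for this example.

```c
#include <assert.h>

// Hypothetical helper: number of bytes stored per pixel for a packet with
// this channel mask (one byte per selected component).
static int pic_bytes_per_pixel(int channel_mask)
{
   int mask = 0x80, i, n = 0;
   for (i = 0; i < 4; ++i, mask >>= 1)
      if (channel_mask & mask)
         ++n;
   return n;
}

// Hypothetical helper mirroring stbi__copyval: overwrite only the selected
// components of an RGBA pixel and leave the others untouched.
static void pic_copy_masked(int channel_mask, unsigned char *dest, const unsigned char *src)
{
   int mask = 0x80, i;
   assert(dest != 0 && src != 0);
   for (i = 0; i < 4; ++i, mask >>= 1)
      if (channel_mask & mask)
         dest[i] = src[i];
}
```

   With mask 0xE0 a packet carries three bytes per pixel (RGB); a separate
   packet with mask 0x10 can then fill in alpha without disturbing the color.
*/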
5556 
static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5558 {
5559    int act_comp=0,num_packets=0,y,chained;
5560    stbi__pic_packet packets[10];
5561 
   // this will (should...) cater for even some bizarre stuff like having data
   // for the same channel in multiple packets.
5564    do {
5565       stbi__pic_packet *packet;
5566 
5567       if (num_packets==sizeof(packets)/sizeof(packets[0]))
5568          return stbi__errpuc("bad format","too many packets");
5569 
5570       packet = &packets[num_packets++];
5571 
5572       chained = stbi__get8(s);
5573       packet->size    = stbi__get8(s);
5574       packet->type    = stbi__get8(s);
5575       packet->channel = stbi__get8(s);
5576 
5577       act_comp |= packet->channel;
5578 
5579       if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
5580       if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
5581    } while (chained);
5582 
5583    *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5584 
5585    for(y=0; y<height; ++y) {
5586       int packet_idx;
5587 
5588       for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5589          stbi__pic_packet *packet = &packets[packet_idx];
5590          stbi_uc *dest = result+y*width*4;
5591 
5592          switch (packet->type) {
5593             default:
5594                return stbi__errpuc("bad format","packet has bad compression type");
5595 
5596             case 0: {//uncompressed
5597                int x;
5598 
5599                for(x=0;x<width;++x, dest+=4)
5600                   if (!stbi__readval(s,packet->channel,dest))
5601                      return 0;
5602                break;
5603             }
5604 
5605             case 1://Pure RLE
5606                {
5607                   int left=width, i;
5608 
5609                   while (left>0) {
5610                      stbi_uc count,value[4];
5611 
5612                      count=stbi__get8(s);
5613                      if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
5614 
5615                      if (count > left)
5616                         count = (stbi_uc) left;
5617 
5618                      if (!stbi__readval(s,packet->channel,value))  return 0;
5619 
5620                      for(i=0; i<count; ++i,dest+=4)
5621                         stbi__copyval(packet->channel,dest,value);
5622                      left -= count;
5623                   }
5624                }
5625                break;
5626 
5627             case 2: {//Mixed RLE
5628                int left=width;
5629                while (left>0) {
5630                   int count = stbi__get8(s), i;
5631                   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
5632 
5633                   if (count >= 128) { // Repeated
5634                      stbi_uc value[4];
5635 
5636                      if (count==128)
5637                         count = stbi__get16be(s);
5638                      else
5639                         count -= 127;
5640                      if (count > left)
5641                         return stbi__errpuc("bad file","scanline overrun");
5642 
5643                      if (!stbi__readval(s,packet->channel,value))
5644                         return 0;
5645 
5646                      for(i=0;i<count;++i, dest += 4)
5647                         stbi__copyval(packet->channel,dest,value);
5648                   } else { // Raw
5649                      ++count;
5650                      if (count>left) return stbi__errpuc("bad file","scanline overrun");
5651 
5652                      for(i=0;i<count;++i, dest+=4)
5653                         if (!stbi__readval(s,packet->channel,dest))
5654                            return 0;
5655                   }
5656                   left-=count;
5657                }
5658                break;
5659             }
5660          }
5661       }
5662    }
5663 
5664    return result;
5665 }
5666 
static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5668 {
5669    stbi_uc *result;
5670    int i, x,y;
5671 
5672    for (i=0; i<92; ++i)
5673       stbi__get8(s);
5674 
5675    x = stbi__get16be(s);
5676    y = stbi__get16be(s);
5677    if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
   // guard against division by zero when the header reports a zero width
   if (x != 0 && (1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5679 
5680    stbi__get32be(s); //skip `ratio'
5681    stbi__get16be(s); //skip `fields'
5682    stbi__get16be(s); //skip `pad'
5683 
5684    // intermediate buffer is RGBA
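   result = (stbi_uc *) stbi__malloc(x*y*4);
   if (!result) return stbi__errpuc("outofmem", "Out of memory");
   memset(result, 0xff, x*y*4);

   if (!stbi__pic_load_core(s,x,y,comp, result)) {
      STBI_FREE(result);
      return 0; // failure reason already set; don't pass NULL on to stbi__convert_format
   }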
5692    *px = x;
5693    *py = y;
5694    if (req_comp == 0) req_comp = *comp;
5695    result=stbi__convert_format(result,4,req_comp,x,y);
5696 
5697    return result;
5698 }
5699 
static int stbi__pic_test(stbi__context *s)
5701 {
5702    int r = stbi__pic_test_core(s);
5703    stbi__rewind(s);
5704    return r;
5705 }
5706 #endif
5707 
5708 // *************************************************************************************************
5709 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5710 
5711 #ifndef STBI_NO_GIF
5712 typedef struct
5713 {
5714    stbi__int16 prefix;
5715    stbi_uc first;
5716    stbi_uc suffix;
5717 } stbi__gif_lzw;
5718 
5719 typedef struct
5720 {
5721    int w,h;
5722    stbi_uc *out, *old_out;             // output buffer (always 4 components)
5723    int flags, bgindex, ratio, transparent, eflags, delay;
5724    stbi_uc  pal[256][4];
5725    stbi_uc lpal[256][4];
5726    stbi__gif_lzw codes[4096];
5727    stbi_uc *color_table;
5728    int parse, step;
5729    int lflags;
5730    int start_x, start_y;
5731    int max_x, max_y;
5732    int cur_x, cur_y;
5733    int line_size;
5734 } stbi__gif;
5735 
static int stbi__gif_test_raw(stbi__context *s)
5737 {
5738    int sz;
5739    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5740    sz = stbi__get8(s);
5741    if (sz != '9' && sz != '7') return 0;
5742    if (stbi__get8(s) != 'a') return 0;
5743    return 1;
5744 }
5745 
static int stbi__gif_test(stbi__context *s)
5747 {
5748    int r = stbi__gif_test_raw(s);
5749    stbi__rewind(s);
5750    return r;
5751 }
5752 
static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5754 {
5755    int i;
5756    for (i=0; i < num_entries; ++i) {
5757       pal[i][2] = stbi__get8(s);
5758       pal[i][1] = stbi__get8(s);
5759       pal[i][0] = stbi__get8(s);
5760       pal[i][3] = transp == i ? 0 : 255;
5761    }
5762 }
5763 
static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5765 {
5766    stbi_uc version;
5767    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5768       return stbi__err("not GIF", "Corrupt GIF");
5769 
5770    version = stbi__get8(s);
5771    if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
5772    if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");
5773 
5774    stbi__g_failure_reason = "";
5775    g->w = stbi__get16le(s);
5776    g->h = stbi__get16le(s);
5777    g->flags = stbi__get8(s);
5778    g->bgindex = stbi__get8(s);
5779    g->ratio = stbi__get8(s);
5780    g->transparent = -1;
5781 
5782    if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
5783 
5784    if (is_info) return 1;
5785 
5786    if (g->flags & 0x80)
5787       stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5788 
5789    return 1;
5790 }
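
/* Illustrative sketch, not part of stb_image: the expression
   2 << (flags & 7) used above turns the low three bits of a GIF packed-flags
   byte into a color table size of 2^(N+1) entries, and bit 0x80 says whether
   a table is present at all. The helper name is hypothetical.

```c
#include <assert.h>

// Hypothetical helper: number of color table entries announced by a GIF
// packed-flags byte, or 0 when the table-present bit (0x80) is clear.
static int gif_color_table_entries(int flags)
{
   assert(flags >= 0 && flags <= 255);
   if (!(flags & 0x80))
      return 0;
   return 2 << (flags & 7);
}
```

   Each entry is three bytes in the file, so a full 256-entry table occupies
   768 bytes of the stream.
*/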
5791 
static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5793 {
   stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
   if (!g) return stbi__err("outofmem", "Out of memory");
5795    if (!stbi__gif_header(s, g, comp, 1)) {
5796       STBI_FREE(g);
5797       stbi__rewind( s );
5798       return 0;
5799    }
5800    if (x) *x = g->w;
5801    if (y) *y = g->h;
5802    STBI_FREE(g);
5803    return 1;
5804 }
5805 
static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5807 {
5808    stbi_uc *p, *c;
5809 
5810    // recurse to decode the prefixes, since the linked-list is backwards,
5811    // and working backwards through an interleaved image would be nasty
5812    if (g->codes[code].prefix >= 0)
5813       stbi__out_gif_code(g, g->codes[code].prefix);
5814 
5815    if (g->cur_y >= g->max_y) return;
5816 
5817    p = &g->out[g->cur_x + g->cur_y];
5818    c = &g->color_table[g->codes[code].suffix * 4];
5819 
5820    if (c[3] >= 128) {
5821       p[0] = c[2];
5822       p[1] = c[1];
5823       p[2] = c[0];
5824       p[3] = c[3];
5825    }
5826    g->cur_x += 4;
5827 
5828    if (g->cur_x >= g->max_x) {
5829       g->cur_x = g->start_x;
5830       g->cur_y += g->step;
5831 
5832       while (g->cur_y >= g->max_y && g->parse > 0) {
5833          g->step = (1 << g->parse) * g->line_size;
5834          g->cur_y = g->start_y + (g->step >> 1);
5835          --g->parse;
5836       }
5837    }
5838 }
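
/* Illustrative sketch, not part of stb_image: the step/parse bookkeeping at
   the end of stbi__out_gif_code walks an interlaced GIF in four passes. The
   version below reproduces that order in plain row numbers instead of byte
   offsets; the function name and the rows output array are assumptions made
   for this example.

```c
#include <assert.h>

// Hypothetical helper: fills rows[0..height-1] with the order in which an
// interlaced GIF stores its scanlines. Returns the number of rows written.
static int gif_interlaced_rows(int height, int *rows)
{
   int step = 8, parse = 3, y = 0, n = 0;
   assert(height > 0 && rows != 0);
   while (n < height) {
      rows[n++] = y;
      y += step;
      // Ran off the bottom: start the next pass with a finer step.
      while (y >= height && parse > 0) {
         step = 1 << parse;
         y = step >> 1;
         --parse;
      }
   }
   return n;
}
```

   For a 10-row image this yields rows 0, 8, then 4, then 2, 6, then
   1, 3, 5, 7, 9.
*/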
5839 
static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5841 {
5842    stbi_uc lzw_cs;
5843    stbi__int32 len, init_code;
5844    stbi__uint32 first;
5845    stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5846    stbi__gif_lzw *p;
5847 
5848    lzw_cs = stbi__get8(s);
5849    if (lzw_cs > 12) return NULL;
5850    clear = 1 << lzw_cs;
5851    first = 1;
5852    codesize = lzw_cs + 1;
5853    codemask = (1 << codesize) - 1;
5854    bits = 0;
5855    valid_bits = 0;
5856    for (init_code = 0; init_code < clear; init_code++) {
5857       g->codes[init_code].prefix = -1;
5858       g->codes[init_code].first = (stbi_uc) init_code;
5859       g->codes[init_code].suffix = (stbi_uc) init_code;
5860    }
5861 
5862    // support no starting clear code
5863    avail = clear+2;
5864    oldcode = -1;
5865 
5866    len = 0;
5867    for(;;) {
5868       if (valid_bits < codesize) {
5869          if (len == 0) {
5870             len = stbi__get8(s); // start new block
5871             if (len == 0)
5872                return g->out;
5873          }
5874          --len;
5875          bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5876          valid_bits += 8;
5877       } else {
5878          stbi__int32 code = bits & codemask;
5879          bits >>= codesize;
5880          valid_bits -= codesize;
5881          // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5882          if (code == clear) {  // clear code
5883             codesize = lzw_cs + 1;
5884             codemask = (1 << codesize) - 1;
5885             avail = clear + 2;
5886             oldcode = -1;
5887             first = 0;
5888          } else if (code == clear + 1) { // end of stream code
5889             stbi__skip(s, len);
5890             while ((len = stbi__get8(s)) > 0)
5891                stbi__skip(s,len);
5892             return g->out;
5893          } else if (code <= avail) {
5894             if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5895 
5896             if (oldcode >= 0) {
5897                p = &g->codes[avail++];
5898                if (avail > 4096)        return stbi__errpuc("too many codes", "Corrupt GIF");
5899                p->prefix = (stbi__int16) oldcode;
5900                p->first = g->codes[oldcode].first;
5901                p->suffix = (code == avail) ? p->first : g->codes[code].first;
5902             } else if (code == avail)
5903                return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5904 
5905             stbi__out_gif_code(g, (stbi__uint16) code);
5906 
5907             if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5908                codesize++;
5909                codemask = (1 << codesize) - 1;
5910             }
5911 
5912             oldcode = code;
5913          } else {
5914             return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5915          }
5916       }
5917    }
5918 }
5919 
static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5921 {
5922    int x, y;
5923    stbi_uc *c = g->pal[g->bgindex];
5924    for (y = y0; y < y1; y += 4 * g->w) {
5925       for (x = x0; x < x1; x += 4) {
5926          stbi_uc *p  = &g->out[y + x];
5927          p[0] = c[2];
5928          p[1] = c[1];
5929          p[2] = c[0];
5930          p[3] = 0;
5931       }
5932    }
5933 }
5934 
5935 // this function is designed to support animated gifs, although stb_image doesn't support it
static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5937 {
5938    int i;
5939    stbi_uc *prev_out = 0;
5940 
5941    if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5942       return 0; // stbi__g_failure_reason set by stbi__gif_header
5943 
5944    prev_out = g->out;
5945    g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5946    if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5947 
5948    switch ((g->eflags & 0x1C) >> 2) {
5949       case 0: // unspecified (also always used on 1st frame)
5950          stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5951          break;
5952       case 1: // do not dispose
5953          if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5954          g->old_out = prev_out;
5955          break;
5956       case 2: // dispose to background
5957          if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5958          stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5959          break;
5960       case 3: // dispose to previous
5961          if (g->old_out) {
5962             for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5963                memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5964          }
5965          break;
5966    }
5967 
5968    for (;;) {
5969       switch (stbi__get8(s)) {
5970          case 0x2C: /* Image Descriptor */
5971          {
5972             int prev_trans = -1;
5973             stbi__int32 x, y, w, h;
5974             stbi_uc *o;
5975 
5976             x = stbi__get16le(s);
5977             y = stbi__get16le(s);
5978             w = stbi__get16le(s);
5979             h = stbi__get16le(s);
5980             if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5981                return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5982 
5983             g->line_size = g->w * 4;
5984             g->start_x = x * 4;
5985             g->start_y = y * g->line_size;
5986             g->max_x   = g->start_x + w * 4;
5987             g->max_y   = g->start_y + h * g->line_size;
5988             g->cur_x   = g->start_x;
5989             g->cur_y   = g->start_y;
5990 
5991             g->lflags = stbi__get8(s);
5992 
5993             if (g->lflags & 0x40) {
5994                g->step = 8 * g->line_size; // first interlaced spacing
5995                g->parse = 3;
5996             } else {
5997                g->step = g->line_size;
5998                g->parse = 0;
5999             }
6000 
6001             if (g->lflags & 0x80) {
6002                stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
6003                g->color_table = (stbi_uc *) g->lpal;
6004             } else if (g->flags & 0x80) {
6005                if (g->transparent >= 0 && (g->eflags & 0x01)) {
6006                   prev_trans = g->pal[g->transparent][3];
6007                   g->pal[g->transparent][3] = 0;
6008                }
6009                g->color_table = (stbi_uc *) g->pal;
6010             } else
6011                return stbi__errpuc("missing color table", "Corrupt GIF");
6012 
6013             o = stbi__process_gif_raster(s, g);
6014             if (o == NULL) return NULL;
6015 
6016             if (prev_trans != -1)
6017                g->pal[g->transparent][3] = (stbi_uc) prev_trans;
6018 
6019             return o;
6020          }
6021 
         case 0x21: // Extension Introducer (Graphic Control, Comment, etc.)
6023          {
6024             int len;
6025             if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
6026                len = stbi__get8(s);
6027                if (len == 4) {
6028                   g->eflags = stbi__get8(s);
6029                   g->delay = stbi__get16le(s);
6030                   g->transparent = stbi__get8(s);
6031                } else {
6032                   stbi__skip(s, len);
6033                   break;
6034                }
6035             }
6036             while ((len = stbi__get8(s)) != 0)
6037                stbi__skip(s, len);
6038             break;
6039          }
6040 
6041          case 0x3B: // gif stream termination code
6042             return (stbi_uc *) s; // using '1' causes warning on some compilers
6043 
6044          default:
6045             return stbi__errpuc("unknown code", "Corrupt GIF");
6046       }
6047    }
6048 
6049    STBI_NOTUSED(req_comp);
6050 }
6051 
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6053 {
6054    stbi_uc *u = 0;
   stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
   if (g == 0) return stbi__errpuc("outofmem", "Out of memory");
   memset(g, 0, sizeof(*g));
6057 
6058    u = stbi__gif_load_next(s, g, comp, req_comp);
6059    if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
6060    if (u) {
6061       *x = g->w;
6062       *y = g->h;
6063       if (req_comp && req_comp != 4)
6064          u = stbi__convert_format(u, 4, req_comp, g->w, g->h);
6065    }
6066    else if (g->out)
6067       STBI_FREE(g->out);
6068    STBI_FREE(g);
6069    return u;
6070 }
6071 
static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
6073 {
6074    return stbi__gif_info_raw(s,x,y,comp);
6075 }
6076 #endif
6077 
6078 // *************************************************************************************************
6079 // Radiance RGBE HDR loader
6080 // originally by Nicolas Schulz
6081 #ifndef STBI_NO_HDR
static int stbi__hdr_test_core(stbi__context *s)
6083 {
6084    const char *signature = "#?RADIANCE\n";
6085    int i;
6086    for (i=0; signature[i]; ++i)
6087       if (stbi__get8(s) != signature[i])
6088          return 0;
6089    return 1;
6090 }
6091 
static int stbi__hdr_test(stbi__context* s)
6093 {
6094    int r = stbi__hdr_test_core(s);
6095    stbi__rewind(s);
6096    return r;
6097 }
6098 
6099 #define STBI__HDR_BUFLEN  1024
static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
6101 {
6102    int len=0;
6103    char c = '\0';
6104 
6105    c = (char) stbi__get8(z);
6106 
6107    while (!stbi__at_eof(z) && c != '\n') {
6108       buffer[len++] = c;
6109       if (len == STBI__HDR_BUFLEN-1) {
6110          // flush to end of line
6111          while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
6112             ;
6113          break;
6114       }
6115       c = (char) stbi__get8(z);
6116    }
6117 
6118    buffer[len] = 0;
6119    return buffer;
6120 }
6121 
static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
6123 {
6124    if ( input[3] != 0 ) {
6125       float f1;
6126       // Exponent
6127       f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
6128       if (req_comp <= 2)
6129          output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
6130       else {
6131          output[0] = input[0] * f1;
6132          output[1] = input[1] * f1;
6133          output[2] = input[2] * f1;
6134       }
6135       if (req_comp == 2) output[1] = 1;
6136       if (req_comp == 4) output[3] = 1;
6137    } else {
6138       switch (req_comp) {
6139          case 4: output[3] = 1; /* fallthrough */
6140          case 3: output[0] = output[1] = output[2] = 0;
6141                  break;
6142          case 2: output[1] = 1; /* fallthrough */
6143          case 1: output[0] = 0;
6144                  break;
6145       }
6146    }
6147 }
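
/* Illustrative sketch, not part of stb_image: the RGBE decode performed by
   stbi__hdr_convert for the 3-component case, as a standalone function. Each
   mantissa byte is scaled by 2^(E - 136), where 136 is the exponent bias 128
   plus 8 bits of mantissa; an exponent byte of 0 means black. The function
   name is hypothetical.

```c
#include <assert.h>
#include <math.h>

// Hypothetical helper: converts one 4-byte RGBE pixel to linear float RGB.
static void rgbe_to_float(const unsigned char rgbe[4], float rgb[3])
{
   assert(rgbe != 0 && rgb != 0);
   if (rgbe[3] != 0) {
      // Shared exponent: scale = 2^(E - (128 + 8)).
      float scale = (float) ldexp(1.0, (int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   } else {
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   }
}
```

   For example, the bytes {128, 64, 0, 129} share the scale 2^-7, so they
   decode to (1.0, 0.5, 0.0).
*/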
6148 
static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6150 {
6151    char buffer[STBI__HDR_BUFLEN];
6152    char *token;
6153    int valid = 0;
6154    int width, height;
6155    stbi_uc *scanline;
6156    float *hdr_data;
6157    int len;
6158    unsigned char count, value;
6159    int i, j, k, c1,c2, z;
6160 
6161 
6162    // Check identifier
6163    if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
6164       return stbi__errpf("not HDR", "Corrupt HDR image");
6165 
6166    // Parse header
6167    for(;;) {
6168       token = stbi__hdr_gettoken(s,buffer);
6169       if (token[0] == 0) break;
6170       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6171    }
6172 
6173    if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");
6174 
6175    // Parse width and height
6176    // can't use sscanf() if we're not using stdio!
6177    token = stbi__hdr_gettoken(s,buffer);
6178    if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6179    token += 3;
6180    height = (int) strtol(token, &token, 10);
6181    while (*token == ' ') ++token;
6182    if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6183    token += 3;
6184    width = (int) strtol(token, NULL, 10);
6185 
6186    *x = width;
6187    *y = height;
6188 
6189    if (comp) *comp = 3;
6190    if (req_comp == 0) req_comp = 3;
6191 
6192    // Read data
   hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
   if (!hdr_data) return stbi__errpf("outofmem", "Out of memory");
6194 
6195    // Load image data
   // image data is stored as some number of scanlines
6197    if ( width < 8 || width >= 32768) {
6198       // Read flat data
6199       for (j=0; j < height; ++j) {
6200          for (i=0; i < width; ++i) {
6201             stbi_uc rgbe[4];
6202            main_decode_loop:
6203             stbi__getn(s, rgbe, 4);
6204             stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6205          }
6206       }
6207    } else {
6208       // Read RLE-encoded data
6209       scanline = NULL;
6210 
6211       for (j = 0; j < height; ++j) {
6212          c1 = stbi__get8(s);
6213          c2 = stbi__get8(s);
6214          len = stbi__get8(s);
6215          if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6216             // not run-length encoded, so we have to actually use THIS data as a decoded
6217             // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6218             stbi_uc rgbe[4];
6219             rgbe[0] = (stbi_uc) c1;
6220             rgbe[1] = (stbi_uc) c2;
6221             rgbe[2] = (stbi_uc) len;
6222             rgbe[3] = (stbi_uc) stbi__get8(s);
6223             stbi__hdr_convert(hdr_data, rgbe, req_comp);
6224             i = 1;
6225             j = 0;
6226             STBI_FREE(scanline);
6227             goto main_decode_loop; // yes, this makes no sense
6228          }
6229          len <<= 8;
6230          len |= stbi__get8(s);
6231          if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
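         if (scanline == NULL) {
            scanline = (stbi_uc *) stbi__malloc(width * 4);
            if (!scanline) { STBI_FREE(hdr_data); return stbi__errpf("outofmem", "Out of memory"); }
         }

         for (k = 0; k < 4; ++k) {
            int nleft;
            i = 0;
            while ((nleft = width - i) > 0) {
               count = stbi__get8(s);
               if (count > 128) {
                  // Run
                  value = stbi__get8(s);
                  count -= 128;
                  // a run must not extend past the end of the scanline
                  if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = value;
               } else {
                  // Dump
                  if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = stbi__get8(s);
               }
            }
         }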
6251          for (i=0; i < width; ++i)
6252             stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6253       }
6254       STBI_FREE(scanline);
6255    }
6256 
6257    return hdr_data;
6258 }
6259 
static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6261 {
6262    char buffer[STBI__HDR_BUFLEN];
6263    char *token;
6264    int valid = 0;
6265 
6266    if (stbi__hdr_test(s) == 0) {
6267        stbi__rewind( s );
6268        return 0;
6269    }
6270 
6271    for(;;) {
6272       token = stbi__hdr_gettoken(s,buffer);
6273       if (token[0] == 0) break;
6274       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6275    }
6276 
6277    if (!valid) {
6278        stbi__rewind( s );
6279        return 0;
6280    }
6281    token = stbi__hdr_gettoken(s,buffer);
6282    if (strncmp(token, "-Y ", 3)) {
6283        stbi__rewind( s );
6284        return 0;
6285    }
6286    token += 3;
6287    *y = (int) strtol(token, &token, 10);
6288    while (*token == ' ') ++token;
6289    if (strncmp(token, "+X ", 3)) {
6290        stbi__rewind( s );
6291        return 0;
6292    }
6293    token += 3;
6294    *x = (int) strtol(token, NULL, 10);
6295    *comp = 3;
6296    return 1;
6297 }
6298 #endif // STBI_NO_HDR
6299 
#ifndef STBI_NO_BMP
static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
{
   void *p;
   stbi__bmp_data info;

   info.all_a = 255;
   p = stbi__bmp_parse_header(s, &info);
   stbi__rewind( s );
   if (p == NULL)
      return 0;
   *x = s->img_x;
   *y = s->img_y;
   *comp = info.ma ? 4 : 3;
   return 1;
}
#endif

#ifndef STBI_NO_PSD
static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
{
   int channelCount, depth;
   if (stbi__get32be(s) != 0x38425053) {   // "8BPS" signature
       stbi__rewind( s );
       return 0;
   }
   if (stbi__get16be(s) != 1) {            // file version
       stbi__rewind( s );
       return 0;
   }
   stbi__skip(s, 6);
   channelCount = stbi__get16be(s);
   if (channelCount < 0 || channelCount > 16) {
       stbi__rewind( s );
       return 0;
   }
   *y = stbi__get32be(s);
   *x = stbi__get32be(s);
   depth = stbi__get16be(s);
   if (depth != 8 && depth != 16) {        // 8 and 16 bits per channel are both supported (see notes at top of file)
       stbi__rewind( s );
       return 0;
   }
   if (stbi__get16be(s) != 3) {            // color mode must be RGB
       stbi__rewind( s );
       return 0;
   }
   *comp = 4;
   return 1;
}
#endif

#ifndef STBI_NO_PIC
static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
{
   int act_comp=0,num_packets=0,chained;
   stbi__pic_packet packets[10];

   if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
      stbi__rewind(s);
      return 0;
   }

   stbi__skip(s, 88);

   *x = stbi__get16be(s);
   *y = stbi__get16be(s);
   if (stbi__at_eof(s)) {
      stbi__rewind( s);
      return 0;
   }
   // reject dimensions whose product would exceed 1 << 28, without multiplying
   if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
      stbi__rewind( s );
      return 0;
   }

   stbi__skip(s, 8);

   do {
      stbi__pic_packet *packet;

      // too many packets: rewind like every other failure path so the next format test starts at offset 0
      if (num_packets==sizeof(packets)/sizeof(packets[0])) {
         stbi__rewind( s );
         return 0;
      }

      packet = &packets[num_packets++];
      chained = stbi__get8(s);
      packet->size    = stbi__get8(s);
      packet->type    = stbi__get8(s);
      packet->channel = stbi__get8(s);
      act_comp |= packet->channel;

      if (stbi__at_eof(s)) {
          stbi__rewind( s );
          return 0;
      }
      if (packet->size != 8) {
          stbi__rewind( s );
          return 0;
      }
   } while (chained);

   *comp = (act_comp & 0x10 ? 4 : 3);

   return 1;
}
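
/*
   Illustrative sketch only, not part of stb_image: the size test in
   stbi__pic_info compares a quotient against the height instead of
   multiplying width by height, so the check itself cannot overflow.
   The helper name dims_within_limit and its parameters are assumptions
   made for this example.

```c
// Hypothetical helper: returns 1 if width * height <= limit, 0 otherwise.
// The product is never computed, so no intermediate value can overflow.
static int dims_within_limit(int width, int height, int limit)
{
   if (width < 0 || height < 0 || limit < 0) return 0;
   if (width == 0) return 1;            // zero area always fits
   return limit / width >= height;      // same as width * height <= limit
}
```

   With limit = 1 << 28, a 16384x16384 image is accepted and a 65535x65535
   image is rejected.
*/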
#endif

// *************************************************************************************************
// Portable Gray Map and Portable Pixel Map loader
// by Ken Miller
//
// PGM: http://netpbm.sourceforge.net/doc/pgm.html
// PPM: http://netpbm.sourceforge.net/doc/ppm.html
//
// Known limitations:
//    Does not support ASCII image data (formats P2 and P3)
//    Does not support 16-bit-per-channel

#ifndef STBI_NO_PNM

static int      stbi__pnm_test(stbi__context *s)
{
   char p, t;
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {
       stbi__rewind( s );
       return 0;
   }
   return 1;
}

static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi_uc *out;
   if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
      return 0;
   *x = s->img_x;
   *y = s->img_y;
   *comp = s->img_n;

   out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   stbi__getn(s, out, s->img_n * s->img_x * s->img_y);

   if (req_comp && req_comp != s->img_n) {
      out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   }
   return out;
}

static int      stbi__pnm_isspace(char c)
{
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
{
   for (;;) {
      while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
         *c = (char) stbi__get8(s);

      if (stbi__at_eof(s) || *c != '#')
         break;

      while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
         *c = (char) stbi__get8(s);
   }
}

static int      stbi__pnm_isdigit(char c)
{
   return c >= '0' && c <= '9';
}

static int      stbi__pnm_getinteger(stbi__context *s, char *c)
{
   int value = 0;

   while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
      value = value*10 + (*c - '0');
      *c = (char) stbi__get8(s);
   }

   return value;
}

static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
{
   int maxv;
   char c, p, t;

   stbi__rewind( s );

   // Get identifier
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {
       stbi__rewind( s );
       return 0;
   }

   *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm

   c = (char) stbi__get8(s);
   stbi__pnm_skip_whitespace(s, &c);

   *x = stbi__pnm_getinteger(s, &c); // read width
   stbi__pnm_skip_whitespace(s, &c);

   *y = stbi__pnm_getinteger(s, &c); // read height
   stbi__pnm_skip_whitespace(s, &c);

   maxv = stbi__pnm_getinteger(s, &c);  // read max value

   if (maxv > 255)
      return stbi__err("max value > 255", "PPM image not 8-bit");
   else
      return 1;
}
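
/*
   Illustrative sketch only, not part of stb_image: the header walk performed
   by stbi__pnm_info (magic number, then width, height and max value, each
   preceded by optional whitespace and '#' comments), written against a plain
   memory buffer so it can be tried in isolation. The names pnm_is_space and
   pnm_parse_header, the signatures, and the added overflow and missing-digit
   checks are assumptions made for this example.

```c
#include <stddef.h>

static int pnm_is_space(char c)
{
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

// Hypothetical helper: parses a binary PGM/PPM header from buf[0..len).
// Returns 1 and fills w, h, maxv, comp for an 8-bit P5/P6 header, else 0.
static int pnm_parse_header(const char *buf, size_t len,
                            int *w, int *h, int *maxv, int *comp)
{
   int *fields[3];
   size_t pos = 2;
   int f;

   if (len < 3 || buf[0] != 'P' || (buf[1] != '5' && buf[1] != '6'))
      return 0;
   *comp = (buf[1] == '6') ? 3 : 1;   // P5 is grayscale, P6 is RGB

   fields[0] = w; fields[1] = h; fields[2] = maxv;
   for (f = 0; f < 3; ++f) {
      int value = 0, digits = 0;
      // skip whitespace and any number of '#' comment lines
      for (;;) {
         while (pos < len && pnm_is_space(buf[pos])) ++pos;
         if (pos >= len || buf[pos] != '#') break;
         while (pos < len && buf[pos] != '\n' && buf[pos] != '\r') ++pos;
      }
      while (pos < len && buf[pos] >= '0' && buf[pos] <= '9') {
         if (value > 100000000) return 0;   // next step could overflow int
         value = value*10 + (buf[pos] - '0');
         ++pos; ++digits;
      }
      if (digits == 0) return 0;            // missing field
      *fields[f] = value;
   }
   return *maxv <= 255;                     // only 8-bit samples are accepted
}
```

   A header such as "P6", a comment line, "640 480" and "255" yields
   w=640, h=480, comp=3; a max value of 65535 is rejected.
*/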
#endif

static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_info(s, x, y, comp)) return 1;
   #endif

   #ifndef STBI_NO_PNG
   if (stbi__png_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_GIF
   if (stbi__gif_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_BMP
   if (stbi__bmp_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PSD
   if (stbi__psd_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PIC
   if (stbi__pic_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PNM
   if (stbi__pnm_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_info(s, x, y, comp))  return 1;
   #endif

   // test tga last because it's a crappy test!
   #ifndef STBI_NO_TGA
   if (stbi__tga_info(s, x, y, comp))
       return 1;
   #endif
   return stbi__err("unknown image type", "Image not of any known type, or corrupt");
}

#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
{
    FILE *f = stbi__fopen(filename, "rb");
    int result;
    if (!f) return stbi__err("can't fopen", "Unable to open file");
    result = stbi_info_from_file(f, x, y, comp);
    fclose(f);
    return result;
}

STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
{
   int r;
   stbi__context s;
   long pos = ftell(f);
   stbi__start_file(&s, f);
   r = stbi__info_main(&s,x,y,comp);
   fseek(f,pos,SEEK_SET);
   return r;
}
#endif // !STBI_NO_STDIO

STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);
}

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);
}

#endif // STB_IMAGE_IMPLEMENTATION

/*
   revision history:
      2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
      2.11  (2016-04-02) allocate large structures on the stack
                         remove white matting for transparent PSD
                         fix reported channel count for PNG & BMP
                         re-enable SSE2 in non-gcc 64-bit
                         support RGB-formatted JPEG
                         read 16-bit PNGs (only as 8-bit)
      2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
      2.09  (2016-01-16) allow comments in PNM files
                         16-bit-per-pixel TGA (not bit-per-component)
                         info() for TGA could break due to .hdr handling
                         info() for BMP shares code instead of sloppy parse
                         can use STBI_REALLOC_SIZED if allocator doesn't support realloc
                         code cleanup
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) fix compiler warnings
                         partial animated GIF support
                         limited 16-bpc PSD support
                         #ifdef unused functions
                         bug with < 92 byte PIC,PNM,HDR,TGA
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) extra corruption checking (mmozeiko)
                         stbi_set_flip_vertically_on_load (nguillemot)
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                         progressive JPEG (stb)
                         PGM/PPM support (Ken Miller)
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         GIF bugfix -- seemingly never worked
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                         optimize PNG (ryg)
                         fix bug in interlaced PNG with user-specified channel count (stb)
      1.46  (2014-08-26)
              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
      1.45  (2014-08-16)
              fix MSVC-ARM internal compiler error by wrapping malloc
      1.44  (2014-08-07)
              various warning fixes from Ronny Chevalier
      1.43  (2014-07-15)
              fix MSVC-only compiler problem in code changed in 1.42
      1.42  (2014-07-09)
              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
              fixes to stbi__cleanup_jpeg path
              added STBI_ASSERT to avoid requiring assert.h
      1.41  (2014-06-25)
              fix search&replace from 1.36 that messed up comments/error messages
      1.40  (2014-06-22)
              fix gcc struct-initialization warning
      1.39  (2014-06-15)
              fix to TGA optimization when req_comp != number of components in TGA;
              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
              add support for BMP version 5 (more ignored fields)
      1.38  (2014-06-06)
              suppress MSVC warnings on integer casts truncating values
              fix accidental rename of 'skip' field of I/O
      1.37  (2014-06-04)
              remove duplicate typedef
      1.36  (2014-06-03)
              convert to header file single-file library
              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
      1.35  (2014-05-27)
              various warnings
              fix broken STBI_SIMD path
              fix bug where stbi_load_from_file no longer left file pointer in correct place
              fix broken non-easy path for 32-bit BMP (possibly never used)
              TGA optimization by Arseny Kapoulkine
      1.34  (unknown)
              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
      1.33  (2011-07-14)
              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
      1.32  (2011-07-13)
              support for "info" function for all supported filetypes (SpartanJ)
      1.31  (2011-06-20)
              a few more leak fixes, bug in PNG handling (SpartanJ)
      1.30  (2011-06-11)
              added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
              removed deprecated format-specific test/load functions
              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
              fix inefficiency in decoding 32-bit BMP (David Woo)
      1.29  (2010-08-16)
              various warning fixes from Aurelien Pocheville
      1.28  (2010-08-01)
              fix bug in GIF palette transparency (SpartanJ)
      1.27  (2010-08-01)
              cast-to-stbi_uc to fix warnings
      1.26  (2010-07-24)
              fix bug in file buffering for PNG reported by SpartanJ
      1.25  (2010-07-17)
              refix trans_data warning (Won Chun)
      1.24  (2010-07-12)
              perf improvements reading from files on platforms with lock-heavy fgetc()
              minor perf improvements for jpeg
              deprecated type-specific functions so we'll get feedback if they're needed
              attempt to fix trans_data warning (Won Chun)
      1.23    fixed bug in iPhone support
      1.22  (2010-07-10)
              removed image *writing* support
              stbi_info support from Jetro Lauha
              GIF support from Jean-Marc Lienher
              iPhone PNG-extensions from James Brown
              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez Žemva)
      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
      1.20    added support for Softimage PIC, by Tom Seddon
      1.19    bug in interlaced PNG corruption check (found by ryg)
      1.18  (2008-08-02)
              fix a threading bug (local mutable static)
      1.17    support interlaced PNG
      1.16    major bugfix - stbi__convert_format converted one too many pixels
      1.15    initialize some fields for thread safety
      1.14    fix threadsafe conversion bug
              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.13    threadsafe
      1.12    const qualifiers in the API
      1.11    Support installable IDCT, colorspace conversion routines
      1.10    Fixes for 64-bit (don't use "unsigned long")
              optimized upsampling by Fabian "ryg" Giesen
      1.09    Fix format-conversion for PSD code (bad global variables!)
      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07    attempt to fix C++ warning/errors again
      1.06    attempt to fix C++ warning/errors again
      1.05    fix TGA loading to return correct *comp and use good luminance calc
      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02    support for (subset of) HDR files, float interface for preferred access to them
      1.01    fix bug: possible bug in handling right-side up bmps... not sure
              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00    interface to zlib that skips zlib header
      0.99    correct handling of alpha in palette
      0.98    TGA loader by lonesock; dynamically add loaders (untested)
      0.97    jpeg errors on too large a file; also catch another malloc failure
      0.96    fix detection of invalid v value - particleman@mollyrocket forum
      0.95    during header scan, seek to markers in case of padding
      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93    handle jpegtran output; verbose errors
      0.92    read 4,8,16,24,32-bit BMP files of several formats
      0.91    output 24-bit Windows 3.0 BMP files
      0.90    fix a few more warnings; bump version number to approach 1.0
      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60    fix compiling as c++
      0.59    fix warnings: merge Dave Moore's -Wall fixes
      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56    fix bug: zlib uncompressed mode len vs. nlen
      0.55    fix bug: restart_interval not initialized to 0
      0.54    allow NULL for 'int *comp'
      0.53    fix bug in png 3->4; speedup png decoding
      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
              on 'test' only check type, not whether we support this variant
      0.50  (2006-11-19)
              first released version
*/
