1 /* stb_image - v2.05 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
3
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
7
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17
18
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
22
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
25
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels)
29
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
34
35 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
36 - decode from arbitrary I/O callbacks
37 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
38
39 Full documentation under "DOCUMENTATION" below.
40
41
42 Revision 2.00 release notes:
43
44 - Progressive JPEG is now supported.
45
46 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
47
48 - x86 platforms now make use of SSE2 SIMD instructions for
49 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
50 This work was done by Fabian "ryg" Giesen. SSE2 is used by
51 default, but NEON must be enabled explicitly; see docs.
52
53 With other JPEG optimizations included in this version, we see
54 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
55 on a JPEG on an ARM machine, relative to previous versions of this
56 library. The same results will not obtain for all JPGs and for all
57 x86/ARM machines. (Note that progressive JPEGs are significantly
58 slower to decode than regular JPEGs.) This doesn't mean that this
59 is the fastest JPEG decoder in the land; rather, it brings it
60 closer to parity with standard libraries. If you want the fastest
61 decode, look elsewhere. (See "Philosophy" section of docs below.)
62
63 See final bullet items below for more info on SIMD.
64
65 - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
66 the memory allocator. Unlike other STBI libraries, these macros don't
67 support a context parameter, so if you need to pass a context in to
68 the allocator, you'll have to store it in a global or a thread-local
69 variable.
70
71 - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
72 STBI_NO_LINEAR.
73 STBI_NO_HDR: suppress implementation of .hdr reader format
74 STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
75
76 - You can suppress implementation of any of the decoders to reduce
77 your code footprint by #defining one or more of the following
78 symbols before creating the implementation.
79
80 STBI_NO_JPEG
81 STBI_NO_PNG
82 STBI_NO_BMP
83 STBI_NO_PSD
84 STBI_NO_TGA
85 STBI_NO_GIF
86 STBI_NO_HDR
87 STBI_NO_PIC
88 STBI_NO_PNM (.ppm and .pgm)
89
90 - You can request *only* certain decoders and suppress all other ones
91 (this will be more forward-compatible, as addition of new decoders
92 doesn't require you to disable them explicitly):
93
94 STBI_ONLY_JPEG
95 STBI_ONLY_PNG
96 STBI_ONLY_BMP
97 STBI_ONLY_PSD
98 STBI_ONLY_TGA
99 STBI_ONLY_GIF
100 STBI_ONLY_HDR
101 STBI_ONLY_PIC
102 STBI_ONLY_PNM (.ppm and .pgm)
103
104 Note that you can define multiples of these, and you will get all
105 of them ("only x" and "only y" is interpreted to mean "only x&y").
106
107 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
108 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
109
110 - Compilation of all SIMD code can be suppressed with
111 #define STBI_NO_SIMD
112 It should not be necessary to disable SIMD unless you have issues
113 compiling (e.g. using an x86 compiler which doesn't support SSE
114 intrinsics or that doesn't support the method used to detect
115 SSE2 support at run-time), and even those can be reported as
116 bugs so I can refine the built-in compile-time checking to be
117 smarter.
118
119 - The old STBI_SIMD system which allowed installing a user-defined
120 IDCT etc. has been removed. If you need this, don't upgrade. My
121 assumption is that almost nobody was doing this, and those who
122 were will find the built-in SIMD more satisfactory anyway.
123
124 - RGB values computed for JPEG images are slightly different from
125 previous versions of stb_image. (This is due to using less
126 integer precision in SIMD.) The C code has been adjusted so
127 that the same RGB values will be computed regardless of whether
128 SIMD support is available, so your app should always produce
129 consistent results. But these results are slightly different from
130 previous versions. (Specifically, about 3% of available YCbCr values
131 will compute different RGB results from pre-1.49 versions by +-1;
132 most of the deviating values are one smaller in the G channel.)
133
134 - If you must produce consistent results with previous versions of
135 stb_image, #define STBI_JPEG_OLD and you will get the same results
136 you used to; however, you will not get the SIMD speedups for
137 the YCbCr-to-RGB conversion step (although you should still see
138 significant JPEG speedup from the other changes).
139
140 Please note that STBI_JPEG_OLD is a temporary feature; it will be
141 removed in future versions of the library. It is only intended for
142 near-term back-compatibility use.
143
144
145 Latest revision history:
146 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
147 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
148 2.03 (2015-04-12) additional corruption checking
149 stbi_set_flip_vertically_on_load
150 fix NEON support; fix mingw support
151 2.02 (2015-01-19) fix incorrect assert, fix warning
152 2.01 (2015-01-17) fix various warnings
153 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
154 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
155 progressive JPEG
156 PGM/PPM support
157 STBI_MALLOC,STBI_REALLOC,STBI_FREE
158 STBI_NO_*, STBI_ONLY_*
159 GIF bugfix
160 1.48 (2014-12-14) fix incorrectly-named assert()
161 1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
162 optimize PNG
163 fix bug in interlaced PNG with user-specified channel count
164
165 See end of file for full revision history.
166
167
168 ============================ Contributors =========================
169
170 Image formats Bug fixes & warning fixes
171 Sean Barrett (jpeg, png, bmp) Marc LeBlanc
172 Nicolas Schulz (hdr, psd) Christpher Lloyd
173 Jonathan Dummer (tga) Dave Moore
174 Jean-Marc Lienher (gif) Won Chun
175 Tom Seddon (pic) the Horde3D community
176 Thatcher Ulrich (psd) Janez Zemva
177 Ken Miller (pgm, ppm) Jonathan Blow
178 Laurent Gomila
179 Aruelien Pocheville
180 Extensions, features Ryamond Barbiero
181 Jetro Lauha (stbi_info) David Woo
182 Martin "SpartanJ" Golini (stbi_info) Martin Golini
183 James "moose2000" Brown (iPhone PNG) Roy Eltham
184 Ben "Disch" Wenger (io callbacks) Luke Graham
185 Omar Cornut (1/2/4-bit PNG) Thomas Ruf
186 Nicolas Guillemot (vertical flip) John Bartholomew
187 Ken Hamada
188 Optimizations & bugfixes Cort Stratton
189 Fabian "ryg" Giesen Blazej Dariusz Roszkowski
190 Arseny Kapoulkine Thibault Reuille
191 Paul Du Bois
192 Guillaume George
193 If your name should be here but Jerry Jansson
194 isn't, let Sean know. Hayaki Saito
195 Johan Duparc
196 Ronny Chevalier
197 Michal Cichon
198 Tero Hanninen
199 Sergio Gonzalez
200 Cass Everitt
201 Engin Manap
202 Martins Mozeiko
203 Joseph Thomson
204 Phil Jordan
205
206 License:
207 This software is in the public domain. Where that dedication is not
208 recognized, you are granted a perpetual, irrevocable license to copy
209 and modify this file however you want.
210
211 */
212
213 #ifndef STBI_INCLUDE_STB_IMAGE_H
214 #define STBI_INCLUDE_STB_IMAGE_H
215
216 // DOCUMENTATION
217 //
218 // Limitations:
219 // - no 16-bit-per-channel PNG
220 // - no 12-bit-per-channel JPEG
221 // - no JPEGs with arithmetic coding
222 // - no 1-bit BMP
223 // - GIF always returns *comp=4
224 //
225 // Basic usage (see HDR discussion below for HDR usage):
226 // int x,y,n;
227 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
228 // // ... process data if not NULL ...
229 // // ... x = width, y = height, n = # 8-bit components per pixel ...
230 // // ... replace '0' with '1'..'4' to force that many components per pixel
231 // // ... but 'n' will always be the number that it would have been if you said 0
232 // stbi_image_free(data)
233 //
234 // Standard parameters:
235 // int *x -- outputs image width in pixels
236 // int *y -- outputs image height in pixels
237 // int *comp -- outputs # of image components in image file
238 // int req_comp -- if non-zero, # of image components requested in result
239 //
240 // The return value from an image loader is an 'unsigned char *' which points
241 // to the pixel data, or NULL on an allocation failure or if the image is
242 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
243 // with each pixel consisting of N interleaved 8-bit components; the first
244 // pixel pointed to is top-left-most in the image. There is no padding between
245 // image scanlines or between pixels, regardless of format. The number of
246 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
247 // If req_comp is non-zero, *comp has the number of components that _would_
248 // have been output otherwise. E.g. if you set req_comp to 4, you will always
249 // get RGBA output, but you can check *comp to see if it's trivially opaque
250 // because e.g. there were only 3 channels in the source image.
251 //
252 // An output image with N components has the following components interleaved
253 // in this order in each pixel:
254 //
255 // N=#comp components
256 // 1 grey
257 // 2 grey, alpha
258 // 3 red, green, blue
259 // 4 red, green, blue, alpha
260 //
261 // If image loading fails for any reason, the return value will be NULL,
262 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
263 // can be queried for an extremely brief, end-user unfriendly explanation
264 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
265 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
266 // more user-friendly ones.
267 //
268 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
269 //
270 // ===========================================================================
271 //
272 // Philosophy
273 //
274 // stb libraries are designed with the following priorities:
275 //
276 // 1. easy to use
277 // 2. easy to maintain
278 // 3. good performance
279 //
280 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
281 // and for best performance I may provide less-easy-to-use APIs that give higher
282 // performance, in addition to the easy to use ones. Nevertheless, it's important
283 // to keep in mind that from the standpoint of you, a client of this library,
284 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
285 //
286 // Some secondary priorities arise directly from the first two, some of which
287 // make more explicit reasons why performance can't be emphasized.
288 //
289 // - Portable ("ease of use")
290 // - Small footprint ("easy to maintain")
291 // - No dependencies ("ease of use")
292 //
293 // ===========================================================================
294 //
295 // I/O callbacks
296 //
297 // I/O callbacks allow you to read from arbitrary sources, like packaged
298 // files or some other source. Data read from callbacks are processed
299 // through a small internal buffer (currently 128 bytes) to try to reduce
300 // overhead.
301 //
302 // The three functions you must define are "read" (reads some bytes of data),
303 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
304 //
305 // ===========================================================================
306 //
307 // SIMD support
308 //
309 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
310 // supported by the compiler. For ARM Neon support, you must explicitly
311 // request it.
312 //
313 // (The old do-it-yourself SIMD API is no longer supported in the current
314 // code.)
315 //
316 // On x86, SSE2 will automatically be used when available based on a run-time
317 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
318 // the typical path is to have separate builds for NEON and non-NEON devices
319 // (at least this is true for iOS and Android). Therefore, the NEON support is
320 // toggled by a build flag: define STBI_NEON to get NEON loops.
321 //
322 // The output of the JPEG decoder is slightly different from versions where
323 // SIMD support was introduced (that is, for versions before 1.49). The
324 // difference is only +-1 in the 8-bit RGB channels, and only on a small
325 // fraction of pixels. You can force the pre-1.49 behavior by defining
326 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
327 // and hence cost some performance.
328 //
329 // If for some reason you do not want to use any of SIMD code, or if
330 // you have issues compiling it, you can disable it entirely by
331 // defining STBI_NO_SIMD.
332 //
333 // ===========================================================================
334 //
335 // HDR image support (disable by defining STBI_NO_HDR)
336 //
337 // stb_image now supports loading HDR images in general, and currently
338 // the Radiance .HDR file format, although the support is provided
339 // generically. You can still load any file through the existing interface;
340 // if you attempt to load an HDR file, it will be automatically remapped to
341 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
342 // both of these constants can be reconfigured through this interface:
343 //
344 // stbi_hdr_to_ldr_gamma(2.2f);
345 // stbi_hdr_to_ldr_scale(1.0f);
346 //
347 // (note, do not use _inverse_ constants; stbi_image will invert them
348 // appropriately).
349 //
350 // Additionally, there is a new, parallel interface for loading files as
351 // (linear) floats to preserve the full dynamic range:
352 //
353 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
354 //
355 // If you load LDR images through this interface, those images will
356 // be promoted to floating point values, run through the inverse of
357 // constants corresponding to the above:
358 //
359 // stbi_ldr_to_hdr_scale(1.0f);
360 // stbi_ldr_to_hdr_gamma(2.2f);
361 //
362 // Finally, given a filename (or an open file or memory block--see header
363 // file for details) containing image data, you can query for the "most
364 // appropriate" interface to use (that is, whether the image is HDR or
365 // not), using:
366 //
367 // stbi_is_hdr(char *filename);
368 //
369 // ===========================================================================
370 //
371 // iPhone PNG support:
372 //
373 // By default we convert iphone-formatted PNGs back to RGB, even though
374 // they are internally encoded differently. You can disable this conversion
375 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
376 // you will always just get the native iphone "format" through (which
377 // is BGR stored in RGB).
378 //
379 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
380 // pixel to remove any premultiplied alpha *only* if the image file explicitly
381 // says there's premultiplied data (currently only happens in iPhone images,
382 // and only if iPhone convert-to-rgb processing is on).
383 //
384
385
386 #ifndef STBI_NO_STDIO
387 #include <stdio.h>
388 #endif // STBI_NO_STDIO
389
390 #define STBI_VERSION 1
391
392 enum
393 {
394 STBI_default = 0, // only used for req_comp
395
396 STBI_grey = 1,
397 STBI_grey_alpha = 2,
398 STBI_rgb = 3,
399 STBI_rgb_alpha = 4
400 };
401
402 typedef unsigned char stbi_uc;
403
404 #ifdef __cplusplus
405 extern "C" {
406 #endif
407
408 #ifdef STB_IMAGE_STATIC
409 #define STBIDEF static
410 #else
411 #define STBIDEF extern
412 #endif
413
414 //////////////////////////////////////////////////////////////////////////////
415 //
416 // PRIMARY API - works on images of any type
417 //
418
419 //
420 // load image by filename, open file, or memory buffer
421 //
422
423 typedef struct
424 {
425 int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
426 void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
427 int (*eof) (void *user); // returns nonzero if we are at end of file/data
428 } stbi_io_callbacks;
429
430 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
431 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
432 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
433
434 #ifndef STBI_NO_STDIO
435 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
436 // for stbi_load_from_file, file pointer is left pointing immediately after image
437 #endif
438
439 #ifndef STBI_NO_LINEAR
440 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
441 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
442 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
443
444 #ifndef STBI_NO_STDIO
445 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
446 #endif
447 #endif
448
449 #ifndef STBI_NO_HDR
450 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
451 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
452 #endif
453
454 #ifndef STBI_NO_LINEAR
455 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
456 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
457 #endif // STBI_NO_HDR
458
459 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
460 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
461 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
462 #ifndef STBI_NO_STDIO
463 STBIDEF int stbi_is_hdr (char const *filename);
464 STBIDEF int stbi_is_hdr_from_file(FILE *f);
465 #endif // STBI_NO_STDIO
466
467
468 // get a VERY brief reason for failure
469 // NOT THREADSAFE
470 STBIDEF const char *stbi_failure_reason (void);
471
472 // free the loaded image -- this is just free()
473 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
474
475 // get image dimensions & components without fully decoding
476 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
477 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
478
479 #ifndef STBI_NO_STDIO
480 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
481 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
482
483 #endif
484
485
486
487 // for image formats that explicitly notate that they have premultiplied alpha,
488 // we just return the colors as stored in the file. set this flag to force
489 // unpremultiplication. results are undefined if the unpremultiply overflow.
490 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
491
492 // indicate whether we should process iphone images back to canonical format,
493 // or just pass them through "as-is"
494 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
495
496 // flip the image vertically, so the first pixel in the output array is the bottom left
497 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
498
499 // ZLIB client - used by PNG, available for other purposes
500
501 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
502 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
503 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
504 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
505
506 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
507 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
508
509 #ifndef STBI_NO_DDS
510 #include "stbi_DDS.h"
511 #endif
512
513 #ifndef STBI_NO_PVR
514 #include "stbi_pvr.h"
515 #endif
516
517 #ifndef STBI_NO_PKM
518 #include "stbi_pkm.h"
519 #endif
520
521 #ifndef STBI_NO_EXT
522 #include "stbi_ext.h"
523 #endif
524
525 #ifdef __cplusplus
526 }
527 #endif
528
529 //
530 //
531 //// end header file /////////////////////////////////////////////////////
532 #endif // STBI_INCLUDE_STB_IMAGE_H
533
534 #ifdef STB_IMAGE_IMPLEMENTATION
535
536 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
537 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
538 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
539 || defined(STBI_ONLY_ZLIB)
540 #ifndef STBI_ONLY_JPEG
541 #define STBI_NO_JPEG
542 #endif
543 #ifndef STBI_ONLY_PNG
544 #define STBI_NO_PNG
545 #endif
546 #ifndef STBI_ONLY_BMP
547 #define STBI_NO_BMP
548 #endif
549 #ifndef STBI_ONLY_PSD
550 #define STBI_NO_PSD
551 #endif
552 #ifndef STBI_ONLY_TGA
553 #define STBI_NO_TGA
554 #endif
555 #ifndef STBI_ONLY_GIF
556 #define STBI_NO_GIF
557 #endif
558 #ifndef STBI_ONLY_HDR
559 #define STBI_NO_HDR
560 #endif
561 #ifndef STBI_ONLY_PIC
562 #define STBI_NO_PIC
563 #endif
564 #ifndef STBI_ONLY_PNM
565 #define STBI_NO_PNM
566 #endif
567 #endif
568
569 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
570 #define STBI_NO_ZLIB
571 #endif
572
573
574 #include <stdarg.h>
575 #include <stddef.h> // ptrdiff_t on osx
576 #include <stdlib.h>
577 #include <string.h>
578
579 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
580 #include <math.h> // ldexp
581 #endif
582
583 #ifndef STBI_NO_STDIO
584 #include <stdio.h>
585 #endif
586
587 #ifndef STBI_ASSERT
588 #include <assert.h>
589 #define STBI_ASSERT(x) assert(x)
590 #endif
591
592
593 #ifndef _MSC_VER
594 #ifdef __cplusplus
595 #define stbi_inline inline
596 #else
597 #define stbi_inline
598 #endif
599 #else
600 #define stbi_inline __forceinline
601 #endif
602
603
604 #ifdef _MSC_VER
605 typedef unsigned short stbi__uint16;
606 typedef signed short stbi__int16;
607 typedef unsigned int stbi__uint32;
608 typedef signed int stbi__int32;
609 #else
610 #include <stdint.h>
611 typedef uint16_t stbi__uint16;
612 typedef int16_t stbi__int16;
613 typedef uint32_t stbi__uint32;
614 typedef int32_t stbi__int32;
615 #endif
616
617 // should produce compiler error if size is wrong
618 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
619
620 #ifdef _MSC_VER
621 #define STBI_NOTUSED(v) (void)(v)
622 #else
623 #define STBI_NOTUSED(v) (void)sizeof(v)
624 #endif
625
626 #ifdef _MSC_VER
627 #define STBI_HAS_LROTL
628 #endif
629
630 #ifdef STBI_HAS_LROTL
631 #define stbi_lrot(x,y) _lrotl(x,y)
632 #else
633 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
634 #endif
635
636 #if defined(STBI_MALLOC) && defined(STBI_FREE) && defined(STBI_REALLOC)
637 // ok
638 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC)
639 // ok
640 #else
641 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC."
642 #endif
643
644 #ifndef STBI_MALLOC
645 #define STBI_MALLOC(sz) malloc(sz)
646 #define STBI_REALLOC(p,sz) realloc(p,sz)
647 #define STBI_FREE(p) free(p)
648 #endif
649
650 // x86/x64 detection
651 #if defined(__x86_64__) || defined(_M_X64)
652 #define STBI__X64_TARGET
653 #elif defined(__i386) || defined(_M_IX86)
654 #define STBI__X86_TARGET
655 #endif
656
657 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
658 // NOTE: not clear do we actually need this for the 64-bit path?
659 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
660 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
661 // this is just broken and gcc are jerks for not fixing it properly
662 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
663 #define STBI_NO_SIMD
664 #endif
665
666 #if defined(__MINGW32__) && !defined(__x86_64__) && !defined(STBI_NO_SIMD)
667 #define STBI_MINGW_ENABLE_SSE2
668 #define STBI_FORCE_STACK_ALIGN __attribute__((force_align_arg_pointer))
669 #else
670 #define STBI_FORCE_STACK_ALIGN
671 #endif
672
673 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
674 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
675 //
676 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
677 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
678 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
679 // simultaneously enabling "-mstackrealign".
680 //
681 // See https://github.com/nothings/stb/issues/81 for more information.
682 //
683 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
684 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
685 #define STBI_NO_SIMD
686 #endif
687
688 #if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
689 #define STBI_SSE2
690 #include <emmintrin.h>
691
692 #ifdef _MSC_VER
693
694 #if _MSC_VER >= 1400 // not VC6
695 #include <intrin.h> // __cpuid
stbi__cpuid3(void)696 static int stbi__cpuid3(void)
697 {
698 int info[4];
699 __cpuid(info,1);
700 return info[3];
701 }
702 #else
stbi__cpuid3(void)703 static int stbi__cpuid3(void)
704 {
705 int res;
706 __asm {
707 mov eax,1
708 cpuid
709 mov res,edx
710 }
711 return res;
712 }
713 #endif
714
715 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
716
stbi__sse2_available()717 static int stbi__sse2_available()
718 {
719 int info3 = stbi__cpuid3();
720 return ((info3 >> 26) & 1) != 0;
721 }
722 #else // assume GCC-style if not VC++
723 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
724
stbi__sse2_available()725 static int stbi__sse2_available()
726 {
727 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
728 // GCC 4.8+ has a nice way to do this
729 return __builtin_cpu_supports("sse2");
730 #else
731 // portable way to do this, preferably without using GCC inline ASM?
732 // just bail for now.
733 return 0;
734 #endif
735 }
736 #endif
737 #endif
738
739 // ARM NEON
740 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
741 #undef STBI_NEON
742 #endif
743
744 #ifdef STBI_NEON
745 #include <arm_neon.h>
746 // assume GCC or Clang on ARM targets
747 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
748 #endif
749
750 #ifndef STBI_SIMD_ALIGN
751 #define STBI_SIMD_ALIGN(type, name) type name
752 #endif
753
754 ///////////////////////////////////////////////
755 //
756 // stbi__context struct and start_xxx functions
757
758 // stbi__context structure is our basic context used by all images, so it
759 // contains all the IO context, plus some basic image information
760 typedef struct
761 {
762 stbi__uint32 img_x, img_y;
763 int img_n, img_out_n;
764
765 stbi_io_callbacks io;
766 void *io_user_data;
767
768 int read_from_callbacks;
769 int buflen;
770 stbi_uc buffer_start[128];
771
772 stbi_uc *img_buffer, *img_buffer_end;
773 stbi_uc *img_buffer_original;
774 } stbi__context;
775
776
777 static void stbi__refill_buffer(stbi__context *s);
778
779 // initialize a memory-decode context
stbi__start_mem(stbi__context * s,stbi_uc const * buffer,int len)780 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
781 {
782 s->io.read = NULL;
783 s->read_from_callbacks = 0;
784 s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
785 s->img_buffer_end = (stbi_uc *) buffer+len;
786 }
787
788 // initialize a callback-based context
stbi__start_callbacks(stbi__context * s,stbi_io_callbacks * c,void * user)789 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
790 {
791 s->io = *c;
792 s->io_user_data = user;
793 s->buflen = sizeof(s->buffer_start);
794 s->read_from_callbacks = 1;
795 s->img_buffer_original = s->buffer_start;
796 stbi__refill_buffer(s);
797 }
798
799 #ifndef STBI_NO_STDIO
800
stbi__stdio_read(void * user,char * data,int size)801 static int stbi__stdio_read(void *user, char *data, int size)
802 {
803 return (int) fread(data,1,size,(FILE*) user);
804 }
805
stbi__stdio_skip(void * user,int n)806 static void stbi__stdio_skip(void *user, int n)
807 {
808 fseek((FILE*) user, n, SEEK_CUR);
809 }
810
stbi__stdio_eof(void * user)811 static int stbi__stdio_eof(void *user)
812 {
813 return feof((FILE*) user);
814 }
815
816 static stbi_io_callbacks stbi__stdio_callbacks =
817 {
818 stbi__stdio_read,
819 stbi__stdio_skip,
820 stbi__stdio_eof,
821 };
822
stbi__start_file(stbi__context * s,FILE * f)823 static void stbi__start_file(stbi__context *s, FILE *f)
824 {
825 stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
826 }
827
828 //static void stop_file(stbi__context *s) { }
829
830 #endif // !STBI_NO_STDIO
831
stbi__rewind(stbi__context * s)832 static void stbi__rewind(stbi__context *s)
833 {
834 // conceptually rewind SHOULD rewind to the beginning of the stream,
835 // but we just rewind to the beginning of the initial buffer, because
836 // we only use it after doing 'test', which only ever looks at at most 92 bytes
837 s->img_buffer = s->img_buffer_original;
838 }
839
840 #ifndef STBI_NO_JPEG
841 static int stbi__jpeg_test(stbi__context *s);
842 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
843 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
844 #endif
845
846 #ifndef STBI_NO_PNG
847 static int stbi__png_test(stbi__context *s);
848 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
849 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
850 #endif
851
852 #ifndef STBI_NO_BMP
853 static int stbi__bmp_test(stbi__context *s);
854 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
855 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
856 #endif
857
858 #ifndef STBI_NO_TGA
859 static int stbi__tga_test(stbi__context *s);
860 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
861 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
862 #endif
863
864 #ifndef STBI_NO_PSD
865 static int stbi__psd_test(stbi__context *s);
866 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
867 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
868 #endif
869
870 #ifndef STBI_NO_HDR
871 static int stbi__hdr_test(stbi__context *s);
872 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
873 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
874 #endif
875
876 #ifndef STBI_NO_PIC
877 static int stbi__pic_test(stbi__context *s);
878 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
879 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
880 #endif
881
882 #ifndef STBI_NO_GIF
883 static int stbi__gif_test(stbi__context *s);
884 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
885 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
886 #endif
887
888 #ifndef STBI_NO_PNM
889 static int stbi__pnm_test(stbi__context *s);
890 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
891 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
892 #endif
893
894 #ifndef STBI_NO_PNM
895 static int stbi__pnm_test(stbi__context *s);
896 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
897 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
898 #endif
899
900 #ifndef STBI_NO_DDS
901 #include "stbi_DDS.h"
902 static int stbi__dds_test(stbi__context *s);
903 static stbi_uc *stbi__dds_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
904 #endif
905
906 #ifndef STBI_NO_PVR
907 #include "stbi_pvr.h"
908 static int stbi__pvr_test(stbi__context *s);
909 static stbi_uc *stbi__pvr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
910 #endif
911
912 #ifndef STBI_NO_PKM
913 #include "stbi_pkm.h"
914 static int stbi__pkm_test(stbi__context *s);
915 static stbi_uc *stbi__pkm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
916 #endif
917
918 // this is not threadsafe
919 static const char *stbi__g_failure_reason;
920
stbi_failure_reason(void)921 STBIDEF const char *stbi_failure_reason(void)
922 {
923 return stbi__g_failure_reason;
924 }
925
stbi__err(const char * str)926 static int stbi__err(const char *str)
927 {
928 stbi__g_failure_reason = str;
929 return 0;
930 }
931
stbi__malloc(size_t size)932 static void *stbi__malloc(size_t size)
933 {
934 return STBI_MALLOC(size);
935 }
936
937 // stbi__err - error
938 // stbi__errpf - error returning pointer to float
939 // stbi__errpuc - error returning pointer to unsigned char
940
941 #ifdef STBI_NO_FAILURE_STRINGS
942 #define stbi__err(x,y) 0
943 #elif defined(STBI_FAILURE_USERMSG)
944 #define stbi__err(x,y) stbi__err(y)
945 #else
946 #define stbi__err(x,y) stbi__err(x)
947 #endif
948
949 #define stbi__errpf(x,y) ((float *) (stbi__err(x,y)?NULL:NULL))
950 #define stbi__errpuc(x,y) ((unsigned char *) (stbi__err(x,y)?NULL:NULL))
951
stbi_image_free(void * retval_from_stbi_load)952 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
953 {
954 STBI_FREE(retval_from_stbi_load);
955 }
956
957 #ifndef STBI_NO_LINEAR
958 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
959 #endif
960
961 #ifndef STBI_NO_HDR
962 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
963 #endif
964
965 static int stbi__vertically_flip_on_load = 0;
966
stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)967 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
968 {
969 stbi__vertically_flip_on_load = flag_true_if_should_flip;
970 }
971
stbi__load_main(stbi__context * s,int * x,int * y,int * comp,int req_comp)972 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
973 {
974 #ifndef STBI_NO_JPEG
975 if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
976 #endif
977 #ifndef STBI_NO_PNG
978 if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp);
979 #endif
980 #ifndef STBI_NO_BMP
981 if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp);
982 #endif
983 #ifndef STBI_NO_GIF
984 if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp);
985 #endif
986 #ifndef STBI_NO_PSD
987 if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp);
988 #endif
989 #ifndef STBI_NO_PIC
990 if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp);
991 #endif
992 #ifndef STBI_NO_PNM
993 if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp);
994 #endif
995 #ifndef STBI_NO_DDS
996 if (stbi__dds_test(s)) return stbi__dds_load(s,x,y,comp,req_comp);
997 #endif
998 #ifndef STBI_NO_PVR
999 if (stbi__pvr_test(s)) return stbi__pvr_load(s,x,y,comp,req_comp);
1000 #endif
1001 #ifndef STBI_NO_PKM
1002 if (stbi__pkm_test(s)) return stbi__pkm_load(s,x,y,comp,req_comp);
1003 #endif
1004 #ifndef STBI_NO_HDR
1005 if (stbi__hdr_test(s)) {
1006 float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
1007 return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
1008 }
1009 #endif
1010
1011 #ifndef STBI_NO_TGA
1012 // test tga last because it's a crappy test!
1013 if (stbi__tga_test(s))
1014 return stbi__tga_load(s,x,y,comp,req_comp);
1015 #endif
1016
1017 return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
1018 }
1019
stbi__load_flip(stbi__context * s,int * x,int * y,int * comp,int req_comp)1020 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1021 {
1022 unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
1023
1024 if (stbi__vertically_flip_on_load && result != NULL) {
1025 int w = *x, h = *y;
1026 int depth = req_comp ? req_comp : *comp;
1027 int row,col,z;
1028 stbi_uc temp;
1029
1030 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1031 for (row = 0; row < (h>>1); row++) {
1032 for (col = 0; col < w; col++) {
1033 for (z = 0; z < depth; z++) {
1034 temp = result[(row * w + col) * depth + z];
1035 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1036 result[((h - row - 1) * w + col) * depth + z] = temp;
1037 }
1038 }
1039 }
1040 }
1041
1042 return result;
1043 }
1044
stbi__float_postprocess(float * result,int * x,int * y,int * comp,int req_comp)1045 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1046 {
1047 if (stbi__vertically_flip_on_load && result != NULL) {
1048 int w = *x, h = *y;
1049 int depth = req_comp ? req_comp : *comp;
1050 int row,col,z;
1051 float temp;
1052
1053 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1054 for (row = 0; row < (h>>1); row++) {
1055 for (col = 0; col < w; col++) {
1056 for (z = 0; z < depth; z++) {
1057 temp = result[(row * w + col) * depth + z];
1058 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1059 result[((h - row - 1) * w + col) * depth + z] = temp;
1060 }
1061 }
1062 }
1063 }
1064 }
1065
1066
1067 #ifndef STBI_NO_STDIO
1068
stbi__fopen(char const * filename,char const * mode)1069 static FILE *stbi__fopen(char const *filename, char const *mode)
1070 {
1071 FILE *f;
1072 #if defined(_MSC_VER) && _MSC_VER >= 1400
1073 if (0 != fopen_s(&f, filename, mode))
1074 f=0;
1075 #else
1076 f = fopen(filename, mode);
1077 #endif
1078 return f;
1079 }
1080
1081
stbi_load(char const * filename,int * x,int * y,int * comp,int req_comp)1082 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1083 {
1084 FILE *f = stbi__fopen(filename, "rb");
1085 unsigned char *result;
1086 if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1087 result = stbi_load_from_file(f,x,y,comp,req_comp);
1088 fclose(f);
1089 return result;
1090 }
1091
stbi_load_from_file(FILE * f,int * x,int * y,int * comp,int req_comp)1092 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1093 {
1094 unsigned char *result;
1095 stbi__context s;
1096 stbi__start_file(&s,f);
1097 result = stbi__load_flip(&s,x,y,comp,req_comp);
1098 if (result) {
1099 // need to 'unget' all the characters in the IO buffer
1100 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1101 }
1102 return result;
1103 }
1104 #endif //!STBI_NO_STDIO
1105
stbi_load_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp,int req_comp)1106 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1107 {
1108 stbi__context s;
1109 stbi__start_mem(&s,buffer,len);
1110 return stbi__load_flip(&s,x,y,comp,req_comp);
1111 }
1112
stbi_load_from_callbacks(stbi_io_callbacks const * clbk,void * user,int * x,int * y,int * comp,int req_comp)1113 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1114 {
1115 stbi__context s;
1116 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1117 return stbi__load_flip(&s,x,y,comp,req_comp);
1118 }
1119
1120 #ifndef STBI_NO_LINEAR
stbi__loadf_main(stbi__context * s,int * x,int * y,int * comp,int req_comp)1121 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1122 {
1123 unsigned char *data;
1124 #ifndef STBI_NO_HDR
1125 if (stbi__hdr_test(s)) {
1126 float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1127 if (hdr_data)
1128 stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1129 return hdr_data;
1130 }
1131 #endif
1132 data = stbi__load_flip(s, x, y, comp, req_comp);
1133 if (data)
1134 return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1135 return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1136 }
1137
stbi_loadf_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp,int req_comp)1138 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1139 {
1140 stbi__context s;
1141 stbi__start_mem(&s,buffer,len);
1142 return stbi__loadf_main(&s,x,y,comp,req_comp);
1143 }
1144
stbi_loadf_from_callbacks(stbi_io_callbacks const * clbk,void * user,int * x,int * y,int * comp,int req_comp)1145 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1146 {
1147 stbi__context s;
1148 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1149 return stbi__loadf_main(&s,x,y,comp,req_comp);
1150 }
1151
1152 #ifndef STBI_NO_STDIO
stbi_loadf(char const * filename,int * x,int * y,int * comp,int req_comp)1153 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1154 {
1155 float *result;
1156 FILE *f = stbi__fopen(filename, "rb");
1157 if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1158 result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1159 fclose(f);
1160 return result;
1161 }
1162
stbi_loadf_from_file(FILE * f,int * x,int * y,int * comp,int req_comp)1163 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1164 {
1165 stbi__context s;
1166 stbi__start_file(&s,f);
1167 return stbi__loadf_main(&s,x,y,comp,req_comp);
1168 }
1169 #endif // !STBI_NO_STDIO
1170
1171 #endif // !STBI_NO_LINEAR
1172
1173 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1174 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1175 // reports false!
1176
stbi_is_hdr_from_memory(stbi_uc const * buffer,int len)1177 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1178 {
1179 #ifndef STBI_NO_HDR
1180 stbi__context s;
1181 stbi__start_mem(&s,buffer,len);
1182 return stbi__hdr_test(&s);
1183 #else
1184 STBI_NOTUSED(buffer);
1185 STBI_NOTUSED(len);
1186 return 0;
1187 #endif
1188 }
1189
1190 #ifndef STBI_NO_STDIO
stbi_is_hdr(char const * filename)1191 STBIDEF int stbi_is_hdr (char const *filename)
1192 {
1193 FILE *f = stbi__fopen(filename, "rb");
1194 int result=0;
1195 if (f) {
1196 result = stbi_is_hdr_from_file(f);
1197 fclose(f);
1198 }
1199 return result;
1200 }
1201
stbi_is_hdr_from_file(FILE * f)1202 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1203 {
1204 #ifndef STBI_NO_HDR
1205 stbi__context s;
1206 stbi__start_file(&s,f);
1207 return stbi__hdr_test(&s);
1208 #else
1209 return 0;
1210 #endif
1211 }
1212 #endif // !STBI_NO_STDIO
1213
stbi_is_hdr_from_callbacks(stbi_io_callbacks const * clbk,void * user)1214 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1215 {
1216 #ifndef STBI_NO_HDR
1217 stbi__context s;
1218 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1219 return stbi__hdr_test(&s);
1220 #else
1221 return 0;
1222 #endif
1223 }
1224
1225 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1226 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1227
1228 #ifndef STBI_NO_LINEAR
stbi_ldr_to_hdr_gamma(float gamma)1229 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
stbi_ldr_to_hdr_scale(float scale)1230 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1231 #endif
1232
stbi_hdr_to_ldr_gamma(float gamma)1233 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
stbi_hdr_to_ldr_scale(float scale)1234 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1235
1236
1237 //////////////////////////////////////////////////////////////////////////////
1238 //
1239 // Common code used by all image loaders
1240 //
1241
1242 enum
1243 {
1244 STBI__SCAN_load=0,
1245 STBI__SCAN_type,
1246 STBI__SCAN_header
1247 };
1248
stbi__refill_buffer(stbi__context * s)1249 static void stbi__refill_buffer(stbi__context *s)
1250 {
1251 int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1252 if (n == 0) {
1253 // at end of file, treat same as if from memory, but need to handle case
1254 // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1255 s->read_from_callbacks = 0;
1256 s->img_buffer = s->buffer_start;
1257 s->img_buffer_end = s->buffer_start+1;
1258 *s->img_buffer = 0;
1259 } else {
1260 s->img_buffer = s->buffer_start;
1261 s->img_buffer_end = s->buffer_start + n;
1262 }
1263 }
1264
stbi__get8(stbi__context * s)1265 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1266 {
1267 if (s->img_buffer < s->img_buffer_end)
1268 return *s->img_buffer++;
1269 if (s->read_from_callbacks) {
1270 stbi__refill_buffer(s);
1271 return *s->img_buffer++;
1272 }
1273 return 0;
1274 }
1275
stbi__at_eof(stbi__context * s)1276 stbi_inline static int stbi__at_eof(stbi__context *s)
1277 {
1278 if (s->io.read) {
1279 if (!(s->io.eof)(s->io_user_data)) return 0;
1280 // if feof() is true, check if buffer = end
1281 // special case: we've only got the special 0 character at the end
1282 if (s->read_from_callbacks == 0) return 1;
1283 }
1284
1285 return s->img_buffer >= s->img_buffer_end;
1286 }
1287
stbi__skip(stbi__context * s,int n)1288 static void stbi__skip(stbi__context *s, int n)
1289 {
1290 if (n < 0) {
1291 s->img_buffer = s->img_buffer_end;
1292 return;
1293 }
1294 if (s->io.read) {
1295 int blen = (int) (s->img_buffer_end - s->img_buffer);
1296 if (blen < n) {
1297 s->img_buffer = s->img_buffer_end;
1298 (s->io.skip)(s->io_user_data, n - blen);
1299 return;
1300 }
1301 }
1302 s->img_buffer += n;
1303 }
1304
stbi__getn(stbi__context * s,stbi_uc * buffer,int n)1305 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1306 {
1307 if (s->io.read) {
1308 int blen = (int) (s->img_buffer_end - s->img_buffer);
1309 if (blen < n) {
1310 int res, count;
1311
1312 memcpy(buffer, s->img_buffer, blen);
1313
1314 count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1315 res = (count == (n-blen));
1316 s->img_buffer = s->img_buffer_end;
1317 return res;
1318 }
1319 }
1320
1321 if (s->img_buffer+n <= s->img_buffer_end) {
1322 memcpy(buffer, s->img_buffer, n);
1323 s->img_buffer += n;
1324 return 1;
1325 } else
1326 return 0;
1327 }
1328
stbi__get16be(stbi__context * s)1329 static int stbi__get16be(stbi__context *s)
1330 {
1331 int z = stbi__get8(s);
1332 return (z << 8) + stbi__get8(s);
1333 }
1334
stbi__get32be(stbi__context * s)1335 static stbi__uint32 stbi__get32be(stbi__context *s)
1336 {
1337 stbi__uint32 z = stbi__get16be(s);
1338 return (z << 16) + stbi__get16be(s);
1339 }
1340
stbi__get16le(stbi__context * s)1341 static int stbi__get16le(stbi__context *s)
1342 {
1343 int z = stbi__get8(s);
1344 return z + (stbi__get8(s) << 8);
1345 }
1346
stbi__get32le(stbi__context * s)1347 static stbi__uint32 stbi__get32le(stbi__context *s)
1348 {
1349 stbi__uint32 z = stbi__get16le(s);
1350 return z + (stbi__get16le(s) << 16);
1351 }
1352
1353 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1354
1355
1356 //////////////////////////////////////////////////////////////////////////////
1357 //
1358 // generic converter from built-in img_n to req_comp
1359 // individual types do this automatically as much as possible (e.g. jpeg
1360 // does all cases internally since it needs to colorspace convert anyway,
1361 // and it never has alpha, so very few cases ). png can automatically
1362 // interleave an alpha=255 channel, but falls back to this for other cases
1363 //
1364 // assume data buffer is malloced, so malloc a new one and free that one
1365 // only failure mode is malloc failing
1366
stbi__compute_y(int r,int g,int b)1367 static stbi_uc stbi__compute_y(int r, int g, int b)
1368 {
1369 return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1370 }
1371
stbi__convert_format(unsigned char * data,int img_n,int req_comp,unsigned int x,unsigned int y)1372 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1373 {
1374 int i,j;
1375 unsigned char *good;
1376
1377 if (req_comp == img_n) return data;
1378 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1379
1380 good = (unsigned char *) stbi__malloc(req_comp * x * y);
1381 if (good == NULL) {
1382 STBI_FREE(data);
1383 return stbi__errpuc("outofmem", "Out of memory");
1384 }
1385
1386 for (j=0; j < (int) y; ++j) {
1387 unsigned char *src = data + j * x * img_n ;
1388 unsigned char *dest = good + j * x * req_comp;
1389
1390 #define COMBO(a,b) ((a)*8+(b))
1391 #define CASE(a,b) case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1392 // convert source image with img_n components to one with req_comp components;
1393 // avoid switch per pixel, so use switch per scanline and massive macros
1394 switch (COMBO(img_n, req_comp)) {
1395 CASE(1,2) dest[0]=src[0], dest[1]=255;
1396 break;
1397 CASE(1,3) dest[0]=dest[1]=dest[2]=src[0];
1398 break;
1399 CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255;
1400 break;
1401 CASE(2,1) dest[0]=src[0];
1402 break;
1403 CASE(2,3) dest[0]=dest[1]=dest[2]=src[0];
1404 break;
1405 CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1];
1406 break;
1407 CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255;
1408 break;
1409 CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]);
1410 break;
1411 CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255;
1412 break;
1413 CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]);
1414 break;
1415 CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3];
1416 break;
1417 CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2];
1418 break;
1419 default: STBI_ASSERT(0);
1420 }
1421 #undef CASE
1422 }
1423
1424 STBI_FREE(data);
1425 return good;
1426 }
1427
1428 #ifndef STBI_NO_LINEAR
stbi__ldr_to_hdr(stbi_uc * data,int x,int y,int comp)1429 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1430 {
1431 int i,k,n;
1432 float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1433 if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1434 // compute number of non-alpha components
1435 if (comp & 1) n = comp; else n = comp-1;
1436 for (i=0; i < x*y; ++i) {
1437 for (k=0; k < n; ++k) {
1438 output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1439 }
1440 if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1441 }
1442 STBI_FREE(data);
1443 return output;
1444 }
1445 #endif
1446
1447 #ifndef STBI_NO_HDR
1448 #define stbi__float2int(x) ((int) (x))
stbi__hdr_to_ldr(float * data,int x,int y,int comp)1449 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1450 {
1451 int i,k,n;
1452 stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1453 if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1454 // compute number of non-alpha components
1455 if (comp & 1) n = comp; else n = comp-1;
1456 for (i=0; i < x*y; ++i) {
1457 for (k=0; k < n; ++k) {
1458 float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1459 if (z < 0) z = 0;
1460 if (z > 255) z = 255;
1461 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1462 }
1463 if (k < comp) {
1464 float z = data[i*comp+k] * 255 + 0.5f;
1465 if (z < 0) z = 0;
1466 if (z > 255) z = 255;
1467 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1468 }
1469 }
1470 STBI_FREE(data);
1471 return output;
1472 }
1473 #endif
1474
1475 //////////////////////////////////////////////////////////////////////////////
1476 //
1477 // "baseline" JPEG/JFIF decoder
1478 //
1479 // simple implementation
1480 // - doesn't support delayed output of y-dimension
1481 // - simple interface (only one output format: 8-bit interleaved RGB)
1482 // - doesn't try to recover corrupt jpegs
1483 // - doesn't allow partial loading, loading multiple at once
1484 // - still fast on x86 (copying globals into locals doesn't help x86)
1485 // - allocates lots of intermediate memory (full size of all components)
1486 // - non-interleaved case requires this anyway
1487 // - allows good upsampling (see next)
1488 // high-quality
1489 // - upsampled channels are bilinearly interpolated, even across blocks
1490 // - quality integer IDCT derived from IJG's 'slow'
1491 // performance
1492 // - fast huffman; reasonable integer IDCT
1493 // - some SIMD kernels for common paths on targets with SSE2/NEON
1494 // - uses a lot of intermediate memory, could cache poorly
1495
1496 #ifndef STBI_NO_JPEG
1497
1498 // huffman decoding acceleration
1499 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1500
1501 typedef struct
1502 {
1503 stbi_uc fast[1 << FAST_BITS];
1504 // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1505 stbi__uint16 code[256];
1506 stbi_uc values[256];
1507 stbi_uc size[257];
1508 unsigned int maxcode[18];
1509 int delta[17]; // old 'firstsymbol' - old 'firstcode'
1510 } stbi__huffman;
1511
1512 typedef struct
1513 {
1514 stbi__context *s;
1515 stbi__huffman huff_dc[4];
1516 stbi__huffman huff_ac[4];
1517 stbi_uc dequant[4][64];
1518 stbi__int16 fast_ac[4][1 << FAST_BITS];
1519
1520 // sizes for components, interleaved MCUs
1521 int img_h_max, img_v_max;
1522 int img_mcu_x, img_mcu_y;
1523 int img_mcu_w, img_mcu_h;
1524
1525 // definition of jpeg image component
1526 struct
1527 {
1528 int id;
1529 int h,v;
1530 int tq;
1531 int hd,ha;
1532 int dc_pred;
1533
1534 int x,y,w2,h2;
1535 stbi_uc *data;
1536 void *raw_data, *raw_coeff;
1537 stbi_uc *linebuf;
1538 short *coeff; // progressive only
1539 int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1540 } img_comp[4];
1541
1542 stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1543 int code_bits; // number of valid bits
1544 unsigned char marker; // marker seen while filling entropy buffer
1545 int nomore; // flag if we saw a marker so must stop
1546
1547 int progressive;
1548 int spec_start;
1549 int spec_end;
1550 int succ_high;
1551 int succ_low;
1552 int eob_run;
1553
1554 int scan_n, order[4];
1555 int restart_interval, todo;
1556
1557 // kernels
1558 void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1559 void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1560 stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1561 } stbi__jpeg;
1562
stbi__build_huffman(stbi__huffman * h,int * count)1563 static int stbi__build_huffman(stbi__huffman *h, int *count)
1564 {
1565 int i,j,k=0,code;
1566 // build size list for each symbol (from JPEG spec)
1567 for (i=0; i < 16; ++i)
1568 for (j=0; j < count[i]; ++j)
1569 h->size[k++] = (stbi_uc) (i+1);
1570 h->size[k] = 0;
1571
1572 // compute actual symbols (from jpeg spec)
1573 code = 0;
1574 k = 0;
1575 for(j=1; j <= 16; ++j) {
1576 // compute delta to add to code to compute symbol id
1577 h->delta[j] = k - code;
1578 if (h->size[k] == j) {
1579 while (h->size[k] == j)
1580 h->code[k++] = (stbi__uint16) (code++);
1581 if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1582 }
1583 // compute largest code + 1 for this size, preshifted as needed later
1584 h->maxcode[j] = code << (16-j);
1585 code <<= 1;
1586 }
1587 h->maxcode[j] = 0xffffffff;
1588
1589 // build non-spec acceleration table; 255 is flag for not-accelerated
1590 memset(h->fast, 255, 1 << FAST_BITS);
1591 for (i=0; i < k; ++i) {
1592 int s = h->size[i];
1593 if (s <= FAST_BITS) {
1594 int c = h->code[i] << (FAST_BITS-s);
1595 int m = 1 << (FAST_BITS-s);
1596 for (j=0; j < m; ++j) {
1597 h->fast[c+j] = (stbi_uc) i;
1598 }
1599 }
1600 }
1601 return 1;
1602 }
1603
1604 // build a table that decodes both magnitude and value of small ACs in
1605 // one go.
stbi__build_fast_ac(stbi__int16 * fast_ac,stbi__huffman * h)1606 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1607 {
1608 int i;
1609 for (i=0; i < (1 << FAST_BITS); ++i) {
1610 stbi_uc fast = h->fast[i];
1611 fast_ac[i] = 0;
1612 if (fast < 255) {
1613 int rs = h->values[fast];
1614 int run = (rs >> 4) & 15;
1615 int magbits = rs & 15;
1616 int len = h->size[fast];
1617
1618 if (magbits && len + magbits <= FAST_BITS) {
1619 // magnitude code followed by receive_extend code
1620 int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1621 int m = 1 << (magbits - 1);
1622 if (k < m) k += (-1 << magbits) + 1;
1623 // if the result is small enough, we can fit it in fast_ac table
1624 if (k >= -128 && k <= 127)
1625 fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
1626 }
1627 }
1628 }
1629 }
1630
stbi__grow_buffer_unsafe(stbi__jpeg * j)1631 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1632 {
1633 do {
1634 int b = j->nomore ? 0 : stbi__get8(j->s);
1635 if (b == 0xff) {
1636 int c = stbi__get8(j->s);
1637 if (c != 0) {
1638 j->marker = (unsigned char) c;
1639 j->nomore = 1;
1640 return;
1641 }
1642 }
1643 j->code_buffer |= b << (24 - j->code_bits);
1644 j->code_bits += 8;
1645 } while (j->code_bits <= 24);
1646 }
1647
1648 // (1 << n) - 1
1649 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
1650
1651 // decode a jpeg huffman value from the bitstream
stbi__jpeg_huff_decode(stbi__jpeg * j,stbi__huffman * h)1652 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
1653 {
1654 unsigned int temp;
1655 int c,k;
1656
1657 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1658
1659 // look at the top FAST_BITS and determine what symbol ID it is,
1660 // if the code is <= FAST_BITS
1661 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1662 k = h->fast[c];
1663 if (k < 255) {
1664 int s = h->size[k];
1665 if (s > j->code_bits)
1666 return -1;
1667 j->code_buffer <<= s;
1668 j->code_bits -= s;
1669 return h->values[k];
1670 }
1671
1672 // naive test is to shift the code_buffer down so k bits are
1673 // valid, then test against maxcode. To speed this up, we've
1674 // preshifted maxcode left so that it has (16-k) 0s at the
1675 // end; in other words, regardless of the number of bits, it
1676 // wants to be compared against something shifted to have 16;
1677 // that way we don't need to shift inside the loop.
1678 temp = j->code_buffer >> 16;
1679 for (k=FAST_BITS+1 ; ; ++k)
1680 if (temp < h->maxcode[k])
1681 break;
1682 if (k == 17) {
1683 // error! code not found
1684 j->code_bits -= 16;
1685 return -1;
1686 }
1687
1688 if (k > j->code_bits)
1689 return -1;
1690
1691 // convert the huffman code to the symbol id
1692 c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1693 STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1694
1695 // convert the id to a symbol
1696 j->code_bits -= k;
1697 j->code_buffer <<= k;
1698 return h->values[c];
1699 }
1700
1701 // bias[n] = (-1<<n) + 1
1702 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
1703
1704 // combined JPEG 'receive' and JPEG 'extend', since baseline
1705 // always extends everything it receives.
stbi__extend_receive(stbi__jpeg * j,int n)1706 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
1707 {
1708 unsigned int k;
1709 int sgn;
1710 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1711
1712 sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1713 k = stbi_lrot(j->code_buffer, n);
1714 STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
1715 j->code_buffer = k & ~stbi__bmask[n];
1716 k &= stbi__bmask[n];
1717 j->code_bits -= n;
1718 return k + (stbi__jbias[n] & ~sgn);
1719 }
1720
1721 // get some unsigned bits
stbi__jpeg_get_bits(stbi__jpeg * j,int n)1722 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
1723 {
1724 unsigned int k;
1725 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1726 k = stbi_lrot(j->code_buffer, n);
1727 j->code_buffer = k & ~stbi__bmask[n];
1728 k &= stbi__bmask[n];
1729 j->code_bits -= n;
1730 return k;
1731 }
1732
stbi__jpeg_get_bit(stbi__jpeg * j)1733 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
1734 {
1735 unsigned int k;
1736 if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1737 k = j->code_buffer;
1738 j->code_buffer <<= 1;
1739 --j->code_bits;
1740 return k & 0x80000000;
1741 }
1742
1743 // given a value that's at position X in the zigzag stream,
1744 // where does it appear in the 8x8 matrix coded as row-major?
1745 static stbi_uc stbi__jpeg_dezigzag[64+15] =
1746 {
1747 0, 1, 8, 16, 9, 2, 3, 10,
1748 17, 24, 32, 25, 18, 11, 4, 5,
1749 12, 19, 26, 33, 40, 48, 41, 34,
1750 27, 20, 13, 6, 7, 14, 21, 28,
1751 35, 42, 49, 56, 57, 50, 43, 36,
1752 29, 22, 15, 23, 30, 37, 44, 51,
1753 58, 59, 52, 45, 38, 31, 39, 46,
1754 53, 60, 61, 54, 47, 55, 62, 63,
1755 // let corrupt input sample past end
1756 63, 63, 63, 63, 63, 63, 63, 63,
1757 63, 63, 63, 63, 63, 63, 63
1758 };
1759
1760 // decode one 64-entry block--
stbi__jpeg_decode_block(stbi__jpeg * j,short data[64],stbi__huffman * hdc,stbi__huffman * hac,stbi__int16 * fac,int b,stbi_uc * dequant)1761 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1762 {
1763 int diff,dc,k;
1764 int t;
1765
1766 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1767 t = stbi__jpeg_huff_decode(j, hdc);
1768 if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1769
1770 // 0 all the ac values now so we can do it 32-bits at a time
1771 memset(data,0,64*sizeof(data[0]));
1772
1773 diff = t ? stbi__extend_receive(j, t) : 0;
1774 dc = j->img_comp[b].dc_pred + diff;
1775 j->img_comp[b].dc_pred = dc;
1776 data[0] = (short) (dc * dequant[0]);
1777
1778 // decode AC components, see JPEG spec
1779 k = 1;
1780 do {
1781 unsigned int zig;
1782 int c,r,s;
1783 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1784 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1785 r = fac[c];
1786 if (r) { // fast-AC path
1787 k += (r >> 4) & 15; // run
1788 s = r & 15; // combined length
1789 j->code_buffer <<= s;
1790 j->code_bits -= s;
1791 // decode into unzigzag'd location
1792 zig = stbi__jpeg_dezigzag[k++];
1793 data[zig] = (short) ((r >> 8) * dequant[zig]);
1794 } else {
1795 int rs = stbi__jpeg_huff_decode(j, hac);
1796 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1797 s = rs & 15;
1798 r = rs >> 4;
1799 if (s == 0) {
1800 if (rs != 0xf0) break; // end block
1801 k += 16;
1802 } else {
1803 k += r;
1804 // decode into unzigzag'd location
1805 zig = stbi__jpeg_dezigzag[k++];
1806 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1807 }
1808 }
1809 } while (k < 64);
1810 return 1;
1811 }
1812
stbi__jpeg_decode_block_prog_dc(stbi__jpeg * j,short data[64],stbi__huffman * hdc,int b)1813 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
1814 {
1815 int diff,dc;
1816 int t;
1817 if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1818
1819 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1820
1821 if (j->succ_high == 0) {
1822 // first scan for DC coefficient, must be first
1823 memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
1824 t = stbi__jpeg_huff_decode(j, hdc);
1825 diff = t ? stbi__extend_receive(j, t) : 0;
1826
1827 dc = j->img_comp[b].dc_pred + diff;
1828 j->img_comp[b].dc_pred = dc;
1829 data[0] = (short) (dc << j->succ_low);
1830 } else {
1831 // refinement scan for DC coefficient
1832 if (stbi__jpeg_get_bit(j))
1833 data[0] += (short) (1 << j->succ_low);
1834 }
1835 return 1;
1836 }
1837
1838 // @OPTIMIZE: store non-zigzagged during the decode passes,
1839 // and only de-zigzag when dequantizing
stbi__jpeg_decode_block_prog_ac(stbi__jpeg * j,short data[64],stbi__huffman * hac,stbi__int16 * fac)1840 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1841 {
1842 int k;
1843 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1844
1845 if (j->succ_high == 0) {
1846 int shift = j->succ_low;
1847
1848 if (j->eob_run) {
1849 --j->eob_run;
1850 return 1;
1851 }
1852
1853 k = j->spec_start;
1854 do {
1855 unsigned int zig;
1856 int c,r,s;
1857 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1858 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1859 r = fac[c];
1860 if (r) { // fast-AC path
1861 k += (r >> 4) & 15; // run
1862 s = r & 15; // combined length
1863 j->code_buffer <<= s;
1864 j->code_bits -= s;
1865 zig = stbi__jpeg_dezigzag[k++];
1866 data[zig] = (short) ((r >> 8) << shift);
1867 } else {
1868 int rs = stbi__jpeg_huff_decode(j, hac);
1869 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1870 s = rs & 15;
1871 r = rs >> 4;
1872 if (s == 0) {
1873 if (r < 15) {
1874 j->eob_run = (1 << r);
1875 if (r)
1876 j->eob_run += stbi__jpeg_get_bits(j, r);
1877 --j->eob_run;
1878 break;
1879 }
1880 k += 16;
1881 } else {
1882 k += r;
1883 zig = stbi__jpeg_dezigzag[k++];
1884 data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1885 }
1886 }
1887 } while (k <= j->spec_end);
1888 } else {
1889 // refinement scan for these AC coefficients
1890
1891 short bit = (short) (1 << j->succ_low);
1892
1893 if (j->eob_run) {
1894 --j->eob_run;
1895 for (k = j->spec_start; k <= j->spec_end; ++k) {
1896 short *p = &data[stbi__jpeg_dezigzag[k]];
1897 if (*p != 0)
1898 if (stbi__jpeg_get_bit(j))
1899 if ((*p & bit)==0) {
1900 if (*p > 0)
1901 *p += bit;
1902 else
1903 *p -= bit;
1904 }
1905 }
1906 } else {
1907 k = j->spec_start;
1908 do {
1909 int r,s;
1910 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1911 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1912 s = rs & 15;
1913 r = rs >> 4;
1914 if (s == 0) {
1915 if (r < 15) {
1916 j->eob_run = (1 << r) - 1;
1917 if (r)
1918 j->eob_run += stbi__jpeg_get_bits(j, r);
1919 r = 64; // force end of block
1920 } else {
1921 // r=15 s=0 should write 16 0s, so we just do
1922 // a run of 15 0s and then write s (which is 0),
1923 // so we don't have to do anything special here
1924 }
1925 } else {
1926 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1927 // sign bit
1928 if (stbi__jpeg_get_bit(j))
1929 s = bit;
1930 else
1931 s = -bit;
1932 }
1933
1934 // advance by r
1935 while (k <= j->spec_end) {
1936 short *p = &data[stbi__jpeg_dezigzag[k++]];
1937 if (*p != 0) {
1938 if (stbi__jpeg_get_bit(j))
1939 if ((*p & bit)==0) {
1940 if (*p > 0)
1941 *p += bit;
1942 else
1943 *p -= bit;
1944 }
1945 } else {
1946 if (r == 0) {
1947 *p = (short) s;
1948 break;
1949 }
1950 --r;
1951 }
1952 }
1953 } while (k <= j->spec_end);
1954 }
1955 }
1956 return 1;
1957 }
1958
1959 // take a -128..127 value and stbi__clamp it and convert to 0..255
stbi__clamp(int x)1960 stbi_inline static stbi_uc stbi__clamp(int x)
1961 {
1962 // trick to use a single test to catch both cases
1963 if ((unsigned int) x > 255) {
1964 if (x < 0) return 0;
1965 if (x > 255) return 255;
1966 }
1967 return (stbi_uc) x;
1968 }
1969
1970 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1971 #define stbi__fsh(x) ((x) << 12)
1972
1973 // derived from jidctint -- DCT_ISLOW
1974 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1975 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1976 p2 = s2; \
1977 p3 = s6; \
1978 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1979 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1980 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1981 p2 = s0; \
1982 p3 = s4; \
1983 t0 = stbi__fsh(p2+p3); \
1984 t1 = stbi__fsh(p2-p3); \
1985 x0 = t0+t3; \
1986 x3 = t0-t3; \
1987 x1 = t1+t2; \
1988 x2 = t1-t2; \
1989 t0 = s7; \
1990 t1 = s5; \
1991 t2 = s3; \
1992 t3 = s1; \
1993 p3 = t0+t2; \
1994 p4 = t1+t3; \
1995 p1 = t0+t3; \
1996 p2 = t1+t2; \
1997 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1998 t0 = t0*stbi__f2f( 0.298631336f); \
1999 t1 = t1*stbi__f2f( 2.053119869f); \
2000 t2 = t2*stbi__f2f( 3.072711026f); \
2001 t3 = t3*stbi__f2f( 1.501321110f); \
2002 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
2003 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
2004 p3 = p3*stbi__f2f(-1.961570560f); \
2005 p4 = p4*stbi__f2f(-0.390180644f); \
2006 t3 += p1+p4; \
2007 t2 += p2+p3; \
2008 t1 += p2+p4; \
2009 t0 += p1+p3;
2010
stbi__idct_block(stbi_uc * out,int out_stride,short data[64])2011 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
2012 {
2013 int i,val[64],*v=val;
2014 stbi_uc *o;
2015 short *d = data;
2016
2017 // columns
2018 for (i=0; i < 8; ++i,++d, ++v) {
2019 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
2020 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
2021 && d[40]==0 && d[48]==0 && d[56]==0) {
2022 // no shortcut 0 seconds
2023 // (1|2|3|4|5|6|7)==0 0 seconds
2024 // all separate -0.047 seconds
2025 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
2026 int dcterm = d[0] << 2;
2027 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
2028 } else {
2029 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
2030 // constants scaled things up by 1<<12; let's bring them back
2031 // down, but keep 2 extra bits of precision
2032 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
2033 v[ 0] = (x0+t3) >> 10;
2034 v[56] = (x0-t3) >> 10;
2035 v[ 8] = (x1+t2) >> 10;
2036 v[48] = (x1-t2) >> 10;
2037 v[16] = (x2+t1) >> 10;
2038 v[40] = (x2-t1) >> 10;
2039 v[24] = (x3+t0) >> 10;
2040 v[32] = (x3-t0) >> 10;
2041 }
2042 }
2043
2044 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2045 // no fast case since the first 1D IDCT spread components out
2046 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2047 // constants scaled things up by 1<<12, plus we had 1<<2 from first
2048 // loop, plus horizontal and vertical each scale by sqrt(8) so together
2049 // we've got an extra 1<<3, so 1<<17 total we need to remove.
2050 // so we want to round that, which means adding 0.5 * 1<<17,
2051 // aka 65536. Also, we'll end up with -128 to 127 that we want
2052 // to encode as 0..255 by adding 128, so we'll add that before the shift
2053 x0 += 65536 + (128<<17);
2054 x1 += 65536 + (128<<17);
2055 x2 += 65536 + (128<<17);
2056 x3 += 65536 + (128<<17);
2057 // tried computing the shifts into temps, or'ing the temps to see
2058 // if any were out of range, but that was slower
2059 o[0] = stbi__clamp((x0+t3) >> 17);
2060 o[7] = stbi__clamp((x0-t3) >> 17);
2061 o[1] = stbi__clamp((x1+t2) >> 17);
2062 o[6] = stbi__clamp((x1-t2) >> 17);
2063 o[2] = stbi__clamp((x2+t1) >> 17);
2064 o[5] = stbi__clamp((x2-t1) >> 17);
2065 o[3] = stbi__clamp((x3+t0) >> 17);
2066 o[4] = stbi__clamp((x3-t0) >> 17);
2067 }
2068 }
2069
2070 #ifdef STBI_SSE2
2071 // sse2 integer IDCT. not the fastest possible implementation but it
2072 // produces bit-identical results to the generic C version so it's
2073 // fully "transparent".
stbi__idct_simd(stbi_uc * out,int out_stride,short data[64])2074 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2075 {
2076 // This is constructed to match our regular (generic) integer IDCT exactly.
2077 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2078 __m128i tmp;
2079
2080 // dot product constant: even elems=x, odd elems=y
2081 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2082
2083 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2084 // out(1) = c1[even]*x + c1[odd]*y
2085 #define dct_rot(out0,out1, x,y,c0,c1) \
2086 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2087 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2088 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2089 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2090 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2091 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2092
2093 // out = in << 12 (in 16-bit, out 32-bit)
2094 #define dct_widen(out, in) \
2095 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2096 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2097
2098 // wide add
2099 #define dct_wadd(out, a, b) \
2100 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2101 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2102
2103 // wide sub
2104 #define dct_wsub(out, a, b) \
2105 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2106 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2107
2108 // butterfly a/b, add bias, then shift by "s" and pack
2109 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2110 { \
2111 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2112 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2113 dct_wadd(sum, abiased, b); \
2114 dct_wsub(dif, abiased, b); \
2115 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2116 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2117 }
2118
2119 // 8-bit interleave step (for transposes)
2120 #define dct_interleave8(a, b) \
2121 tmp = a; \
2122 a = _mm_unpacklo_epi8(a, b); \
2123 b = _mm_unpackhi_epi8(tmp, b)
2124
2125 // 16-bit interleave step (for transposes)
2126 #define dct_interleave16(a, b) \
2127 tmp = a; \
2128 a = _mm_unpacklo_epi16(a, b); \
2129 b = _mm_unpackhi_epi16(tmp, b)
2130
2131 #define dct_pass(bias,shift) \
2132 { \
2133 /* even part */ \
2134 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2135 __m128i sum04 = _mm_add_epi16(row0, row4); \
2136 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2137 dct_widen(t0e, sum04); \
2138 dct_widen(t1e, dif04); \
2139 dct_wadd(x0, t0e, t3e); \
2140 dct_wsub(x3, t0e, t3e); \
2141 dct_wadd(x1, t1e, t2e); \
2142 dct_wsub(x2, t1e, t2e); \
2143 /* odd part */ \
2144 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2145 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2146 __m128i sum17 = _mm_add_epi16(row1, row7); \
2147 __m128i sum35 = _mm_add_epi16(row3, row5); \
2148 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2149 dct_wadd(x4, y0o, y4o); \
2150 dct_wadd(x5, y1o, y5o); \
2151 dct_wadd(x6, y2o, y5o); \
2152 dct_wadd(x7, y3o, y4o); \
2153 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2154 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2155 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2156 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2157 }
2158
2159 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2160 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2161 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2162 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2163 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2164 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2165 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2166 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2167
2168 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2169 __m128i bias_0 = _mm_set1_epi32(512);
2170 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2171
2172 // load
2173 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2174 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2175 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2176 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2177 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2178 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2179 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2180 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2181
2182 // column pass
2183 dct_pass(bias_0, 10);
2184
2185 {
2186 // 16bit 8x8 transpose pass 1
2187 dct_interleave16(row0, row4);
2188 dct_interleave16(row1, row5);
2189 dct_interleave16(row2, row6);
2190 dct_interleave16(row3, row7);
2191
2192 // transpose pass 2
2193 dct_interleave16(row0, row2);
2194 dct_interleave16(row1, row3);
2195 dct_interleave16(row4, row6);
2196 dct_interleave16(row5, row7);
2197
2198 // transpose pass 3
2199 dct_interleave16(row0, row1);
2200 dct_interleave16(row2, row3);
2201 dct_interleave16(row4, row5);
2202 dct_interleave16(row6, row7);
2203 }
2204
2205 // row pass
2206 dct_pass(bias_1, 17);
2207
2208 {
2209 // pack
2210 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2211 __m128i p1 = _mm_packus_epi16(row2, row3);
2212 __m128i p2 = _mm_packus_epi16(row4, row5);
2213 __m128i p3 = _mm_packus_epi16(row6, row7);
2214
2215 // 8bit 8x8 transpose pass 1
2216 dct_interleave8(p0, p2); // a0e0a1e1...
2217 dct_interleave8(p1, p3); // c0g0c1g1...
2218
2219 // transpose pass 2
2220 dct_interleave8(p0, p1); // a0c0e0g0...
2221 dct_interleave8(p2, p3); // b0d0f0h0...
2222
2223 // transpose pass 3
2224 dct_interleave8(p0, p2); // a0b0c0d0...
2225 dct_interleave8(p1, p3); // a4b4c4d4...
2226
2227 // store
2228 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2229 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2230 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2231 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2232 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2233 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2234 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2235 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2236 }
2237
2238 #undef dct_const
2239 #undef dct_rot
2240 #undef dct_widen
2241 #undef dct_wadd
2242 #undef dct_wsub
2243 #undef dct_bfly32o
2244 #undef dct_interleave8
2245 #undef dct_interleave16
2246 #undef dct_pass
2247 }
2248
2249 #endif // STBI_SSE2
2250
2251 #ifdef STBI_NEON
2252
2253 // NEON integer IDCT. should produce bit-identical
2254 // results to the generic C version.
stbi__idct_simd(stbi_uc * out,int out_stride,short data[64])2255 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2256 {
2257 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2258
2259 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2260 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2261 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2262 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2263 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2264 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2265 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2266 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2267 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2268 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2269 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2270 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2271
2272 #define dct_long_mul(out, inq, coeff) \
2273 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2274 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2275
2276 #define dct_long_mac(out, acc, inq, coeff) \
2277 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2278 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2279
2280 #define dct_widen(out, inq) \
2281 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2282 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2283
2284 // wide add
2285 #define dct_wadd(out, a, b) \
2286 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2287 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2288
2289 // wide sub
2290 #define dct_wsub(out, a, b) \
2291 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2292 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2293
2294 // butterfly a/b, then shift using "shiftop" by "s" and pack
2295 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2296 { \
2297 dct_wadd(sum, a, b); \
2298 dct_wsub(dif, a, b); \
2299 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2300 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2301 }
2302
2303 #define dct_pass(shiftop, shift) \
2304 { \
2305 /* even part */ \
2306 int16x8_t sum26 = vaddq_s16(row2, row6); \
2307 dct_long_mul(p1e, sum26, rot0_0); \
2308 dct_long_mac(t2e, p1e, row6, rot0_1); \
2309 dct_long_mac(t3e, p1e, row2, rot0_2); \
2310 int16x8_t sum04 = vaddq_s16(row0, row4); \
2311 int16x8_t dif04 = vsubq_s16(row0, row4); \
2312 dct_widen(t0e, sum04); \
2313 dct_widen(t1e, dif04); \
2314 dct_wadd(x0, t0e, t3e); \
2315 dct_wsub(x3, t0e, t3e); \
2316 dct_wadd(x1, t1e, t2e); \
2317 dct_wsub(x2, t1e, t2e); \
2318 /* odd part */ \
2319 int16x8_t sum15 = vaddq_s16(row1, row5); \
2320 int16x8_t sum17 = vaddq_s16(row1, row7); \
2321 int16x8_t sum35 = vaddq_s16(row3, row5); \
2322 int16x8_t sum37 = vaddq_s16(row3, row7); \
2323 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2324 dct_long_mul(p5o, sumodd, rot1_0); \
2325 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2326 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2327 dct_long_mul(p3o, sum37, rot2_0); \
2328 dct_long_mul(p4o, sum15, rot2_1); \
2329 dct_wadd(sump13o, p1o, p3o); \
2330 dct_wadd(sump24o, p2o, p4o); \
2331 dct_wadd(sump23o, p2o, p3o); \
2332 dct_wadd(sump14o, p1o, p4o); \
2333 dct_long_mac(x4, sump13o, row7, rot3_0); \
2334 dct_long_mac(x5, sump24o, row5, rot3_1); \
2335 dct_long_mac(x6, sump23o, row3, rot3_2); \
2336 dct_long_mac(x7, sump14o, row1, rot3_3); \
2337 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2338 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2339 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2340 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2341 }
2342
2343 // load
2344 row0 = vld1q_s16(data + 0*8);
2345 row1 = vld1q_s16(data + 1*8);
2346 row2 = vld1q_s16(data + 2*8);
2347 row3 = vld1q_s16(data + 3*8);
2348 row4 = vld1q_s16(data + 4*8);
2349 row5 = vld1q_s16(data + 5*8);
2350 row6 = vld1q_s16(data + 6*8);
2351 row7 = vld1q_s16(data + 7*8);
2352
2353 // add DC bias
2354 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2355
2356 // column pass
2357 dct_pass(vrshrn_n_s32, 10);
2358
2359 // 16bit 8x8 transpose
2360 {
2361 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2362 // whether compilers actually get this is another story, sadly.
2363 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2364 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2365 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2366
2367 // pass 1
2368 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2369 dct_trn16(row2, row3);
2370 dct_trn16(row4, row5);
2371 dct_trn16(row6, row7);
2372
2373 // pass 2
2374 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2375 dct_trn32(row1, row3);
2376 dct_trn32(row4, row6);
2377 dct_trn32(row5, row7);
2378
2379 // pass 3
2380 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2381 dct_trn64(row1, row5);
2382 dct_trn64(row2, row6);
2383 dct_trn64(row3, row7);
2384
2385 #undef dct_trn16
2386 #undef dct_trn32
2387 #undef dct_trn64
2388 }
2389
2390 // row pass
2391 // vrshrn_n_s32 only supports shifts up to 16, we need
2392 // 17. so do a non-rounding shift of 16 first then follow
2393 // up with a rounding shift by 1.
2394 dct_pass(vshrn_n_s32, 16);
2395
2396 {
2397 // pack and round
2398 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2399 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2400 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2401 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2402 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2403 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2404 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2405 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2406
2407 // again, these can translate into one instruction, but often don't.
2408 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2409 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2410 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2411
2412 // sadly can't use interleaved stores here since we only write
2413 // 8 bytes to each scan line!
2414
2415 // 8x8 8-bit transpose pass 1
2416 dct_trn8_8(p0, p1);
2417 dct_trn8_8(p2, p3);
2418 dct_trn8_8(p4, p5);
2419 dct_trn8_8(p6, p7);
2420
2421 // pass 2
2422 dct_trn8_16(p0, p2);
2423 dct_trn8_16(p1, p3);
2424 dct_trn8_16(p4, p6);
2425 dct_trn8_16(p5, p7);
2426
2427 // pass 3
2428 dct_trn8_32(p0, p4);
2429 dct_trn8_32(p1, p5);
2430 dct_trn8_32(p2, p6);
2431 dct_trn8_32(p3, p7);
2432
2433 // store
2434 vst1_u8(out, p0); out += out_stride;
2435 vst1_u8(out, p1); out += out_stride;
2436 vst1_u8(out, p2); out += out_stride;
2437 vst1_u8(out, p3); out += out_stride;
2438 vst1_u8(out, p4); out += out_stride;
2439 vst1_u8(out, p5); out += out_stride;
2440 vst1_u8(out, p6); out += out_stride;
2441 vst1_u8(out, p7);
2442
2443 #undef dct_trn8_8
2444 #undef dct_trn8_16
2445 #undef dct_trn8_32
2446 }
2447
2448 #undef dct_long_mul
2449 #undef dct_long_mac
2450 #undef dct_widen
2451 #undef dct_wadd
2452 #undef dct_wsub
2453 #undef dct_bfly32o
2454 #undef dct_pass
2455 }
2456
2457 #endif // STBI_NEON
2458
2459 #define STBI__MARKER_none 0xff
2460 // if there's a pending marker from the entropy stream, return that
2461 // otherwise, fetch from the stream and get a marker. if there's no
2462 // marker, return 0xff, which is never a valid marker value
stbi__get_marker(stbi__jpeg * j)2463 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2464 {
2465 stbi_uc x;
2466 if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2467 x = stbi__get8(j->s);
2468 if (x != 0xff) return STBI__MARKER_none;
2469 while (x == 0xff)
2470 x = stbi__get8(j->s);
2471 return x;
2472 }
2473
2474 // in each scan, we'll have scan_n components, and the order
2475 // of the components is specified by order[]
2476 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2477
2478 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2479 // the dc prediction
stbi__jpeg_reset(stbi__jpeg * j)2480 static void stbi__jpeg_reset(stbi__jpeg *j)
2481 {
2482 j->code_bits = 0;
2483 j->code_buffer = 0;
2484 j->nomore = 0;
2485 j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
2486 j->marker = STBI__MARKER_none;
2487 j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2488 j->eob_run = 0;
2489 // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2490 // since we don't even allow 1<<30 pixels
2491 }
2492
stbi__parse_entropy_coded_data(stbi__jpeg * z)2493 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2494 {
2495 stbi__jpeg_reset(z);
2496 if (!z->progressive) {
2497 if (z->scan_n == 1) {
2498 int i,j;
2499 STBI_SIMD_ALIGN(short, data[64]);
2500 int n = z->order[0];
2501 // non-interleaved data, we just need to process one block at a time,
2502 // in trivial scanline order
2503 // number of blocks to do just depends on how many actual "pixels" this
2504 // component has, independent of interleaved MCU blocking and such
2505 int w = (z->img_comp[n].x+7) >> 3;
2506 int h = (z->img_comp[n].y+7) >> 3;
2507 for (j=0; j < h; ++j) {
2508 for (i=0; i < w; ++i) {
2509 int ha = z->img_comp[n].ha;
2510 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2511 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2512 // every data block is an MCU, so countdown the restart interval
2513 if (--z->todo <= 0) {
2514 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2515 // if it's NOT a restart, then just bail, so we get corrupt data
2516 // rather than no data
2517 if (!STBI__RESTART(z->marker)) return 1;
2518 stbi__jpeg_reset(z);
2519 }
2520 }
2521 }
2522 return 1;
2523 } else { // interleaved
2524 int i,j,k,x,y;
2525 STBI_SIMD_ALIGN(short, data[64]);
2526 for (j=0; j < z->img_mcu_y; ++j) {
2527 for (i=0; i < z->img_mcu_x; ++i) {
2528 // scan an interleaved mcu... process scan_n components in order
2529 for (k=0; k < z->scan_n; ++k) {
2530 int n = z->order[k];
2531 // scan out an mcu's worth of this component; that's just determined
2532 // by the basic H and V specified for the component
2533 for (y=0; y < z->img_comp[n].v; ++y) {
2534 for (x=0; x < z->img_comp[n].h; ++x) {
2535 int x2 = (i*z->img_comp[n].h + x)*8;
2536 int y2 = (j*z->img_comp[n].v + y)*8;
2537 int ha = z->img_comp[n].ha;
2538 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2539 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2540 }
2541 }
2542 }
2543 // after all interleaved components, that's an interleaved MCU,
2544 // so now count down the restart interval
2545 if (--z->todo <= 0) {
2546 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2547 if (!STBI__RESTART(z->marker)) return 1;
2548 stbi__jpeg_reset(z);
2549 }
2550 }
2551 }
2552 return 1;
2553 }
2554 } else {
2555 if (z->scan_n == 1) {
2556 int i,j;
2557 int n = z->order[0];
2558 // non-interleaved data, we just need to process one block at a time,
2559 // in trivial scanline order
2560 // number of blocks to do just depends on how many actual "pixels" this
2561 // component has, independent of interleaved MCU blocking and such
2562 int w = (z->img_comp[n].x+7) >> 3;
2563 int h = (z->img_comp[n].y+7) >> 3;
2564 for (j=0; j < h; ++j) {
2565 for (i=0; i < w; ++i) {
2566 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2567 if (z->spec_start == 0) {
2568 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2569 return 0;
2570 } else {
2571 int ha = z->img_comp[n].ha;
2572 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2573 return 0;
2574 }
2575 // every data block is an MCU, so countdown the restart interval
2576 if (--z->todo <= 0) {
2577 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2578 if (!STBI__RESTART(z->marker)) return 1;
2579 stbi__jpeg_reset(z);
2580 }
2581 }
2582 }
2583 return 1;
2584 } else { // interleaved
2585 int i,j,k,x,y;
2586 for (j=0; j < z->img_mcu_y; ++j) {
2587 for (i=0; i < z->img_mcu_x; ++i) {
2588 // scan an interleaved mcu... process scan_n components in order
2589 for (k=0; k < z->scan_n; ++k) {
2590 int n = z->order[k];
2591 // scan out an mcu's worth of this component; that's just determined
2592 // by the basic H and V specified for the component
2593 for (y=0; y < z->img_comp[n].v; ++y) {
2594 for (x=0; x < z->img_comp[n].h; ++x) {
2595 int x2 = (i*z->img_comp[n].h + x);
2596 int y2 = (j*z->img_comp[n].v + y);
2597 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2598 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2599 return 0;
2600 }
2601 }
2602 }
2603 // after all interleaved components, that's an interleaved MCU,
2604 // so now count down the restart interval
2605 if (--z->todo <= 0) {
2606 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2607 if (!STBI__RESTART(z->marker)) return 1;
2608 stbi__jpeg_reset(z);
2609 }
2610 }
2611 }
2612 return 1;
2613 }
2614 }
2615 }
2616
stbi__jpeg_dequantize(short * data,stbi_uc * dequant)2617 static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
2618 {
2619 int i;
2620 for (i=0; i < 64; ++i)
2621 data[i] *= dequant[i];
2622 }
2623
stbi__jpeg_finish(stbi__jpeg * z)2624 static void stbi__jpeg_finish(stbi__jpeg *z)
2625 {
2626 if (z->progressive) {
2627 // dequantize and idct the data
2628 int i,j,n;
2629 for (n=0; n < z->s->img_n; ++n) {
2630 int w = (z->img_comp[n].x+7) >> 3;
2631 int h = (z->img_comp[n].y+7) >> 3;
2632 for (j=0; j < h; ++j) {
2633 for (i=0; i < w; ++i) {
2634 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2635 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2636 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2637 }
2638 }
2639 }
2640 }
2641 }
2642
stbi__process_marker(stbi__jpeg * z,int m)2643 static int stbi__process_marker(stbi__jpeg *z, int m)
2644 {
2645 int L;
2646 switch (m) {
2647 case STBI__MARKER_none: // no marker found
2648 return stbi__err("expected marker","Corrupt JPEG");
2649
2650 case 0xDD: // DRI - specify restart interval
2651 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2652 z->restart_interval = stbi__get16be(z->s);
2653 return 1;
2654
2655 case 0xDB: // DQT - define quantization table
2656 L = stbi__get16be(z->s)-2;
2657 while (L > 0) {
2658 int q = stbi__get8(z->s);
2659 int p = q >> 4;
2660 int t = q & 15,i;
2661 if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2662 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2663 for (i=0; i < 64; ++i)
2664 z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2665 L -= 65;
2666 }
2667 return L==0;
2668
2669 case 0xC4: // DHT - define huffman table
2670 L = stbi__get16be(z->s)-2;
2671 while (L > 0) {
2672 stbi_uc *v;
2673 int sizes[16],i,n=0;
2674 int q = stbi__get8(z->s);
2675 int tc = q >> 4;
2676 int th = q & 15;
2677 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2678 for (i=0; i < 16; ++i) {
2679 sizes[i] = stbi__get8(z->s);
2680 n += sizes[i];
2681 }
2682 L -= 17;
2683 if (tc == 0) {
2684 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2685 v = z->huff_dc[th].values;
2686 } else {
2687 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2688 v = z->huff_ac[th].values;
2689 }
2690 for (i=0; i < n; ++i)
2691 v[i] = stbi__get8(z->s);
2692 if (tc != 0)
2693 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2694 L -= n;
2695 }
2696 return L==0;
2697 }
2698 // check for comment block or APP blocks
2699 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2700 stbi__skip(z->s, stbi__get16be(z->s)-2);
2701 return 1;
2702 }
2703 return 0;
2704 }
2705
2706 // after we see SOS
stbi__process_scan_header(stbi__jpeg * z)2707 static int stbi__process_scan_header(stbi__jpeg *z)
2708 {
2709 int i;
2710 int Ls = stbi__get16be(z->s);
2711 z->scan_n = stbi__get8(z->s);
2712 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
2713 if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
2714 for (i=0; i < z->scan_n; ++i) {
2715 int id = stbi__get8(z->s), which;
2716 int q = stbi__get8(z->s);
2717 for (which = 0; which < z->s->img_n; ++which)
2718 if (z->img_comp[which].id == id)
2719 break;
2720 if (which == z->s->img_n) return 0; // no match
2721 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
2722 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
2723 z->order[i] = which;
2724 }
2725
2726 {
2727 int aa;
2728 z->spec_start = stbi__get8(z->s);
2729 z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2730 aa = stbi__get8(z->s);
2731 z->succ_high = (aa >> 4);
2732 z->succ_low = (aa & 15);
2733 if (z->progressive) {
2734 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2735 return stbi__err("bad SOS", "Corrupt JPEG");
2736 } else {
2737 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2738 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2739 z->spec_end = 63;
2740 }
2741 }
2742
2743 return 1;
2744 }
2745
stbi__process_frame_header(stbi__jpeg * z,int scan)2746 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2747 {
2748 stbi__context *s = z->s;
2749 int Lf,p,i,q, h_max=1,v_max=1,c;
2750 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2751 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2752 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2753 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2754 c = stbi__get8(s);
2755 if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2756 s->img_n = c;
2757 for (i=0; i < c; ++i) {
2758 z->img_comp[i].data = NULL;
2759 z->img_comp[i].linebuf = NULL;
2760 }
2761
2762 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2763
2764 for (i=0; i < s->img_n; ++i) {
2765 z->img_comp[i].id = stbi__get8(s);
2766 if (z->img_comp[i].id != i+1) // JFIF requires
2767 if (z->img_comp[i].id != i) // some version of jpegtran outputs non-JFIF-compliant files!
2768 return stbi__err("bad component ID","Corrupt JPEG");
2769 q = stbi__get8(s);
2770 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2771 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2772 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2773 }
2774
2775 if (scan != STBI__SCAN_load) return 1;
2776
2777 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2778
2779 for (i=0; i < s->img_n; ++i) {
2780 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2781 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2782 }
2783
2784 // compute interleaved mcu info
2785 z->img_h_max = h_max;
2786 z->img_v_max = v_max;
2787 z->img_mcu_w = h_max * 8;
2788 z->img_mcu_h = v_max * 8;
2789 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2790 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2791
2792 for (i=0; i < s->img_n; ++i) {
2793 // number of effective pixels (e.g. for non-interleaved MCU)
2794 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2795 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2796 // to simplify generation, we'll allocate enough memory to decode
2797 // the bogus oversized data from using interleaved MCUs and their
2798 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2799 // discard the extra data until colorspace conversion
2800 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2801 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2802 z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2803
2804 if (z->img_comp[i].raw_data == NULL) {
2805 for(--i; i >= 0; --i) {
2806 STBI_FREE(z->img_comp[i].raw_data);
2807 z->img_comp[i].data = NULL;
2808 }
2809 return stbi__err("outofmem", "Out of memory");
2810 }
2811 // align blocks for idct using mmx/sse
2812 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2813 z->img_comp[i].linebuf = NULL;
2814 if (z->progressive) {
2815 z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2816 z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
2817 z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
2818 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2819 } else {
2820 z->img_comp[i].coeff = 0;
2821 z->img_comp[i].raw_coeff = 0;
2822 }
2823 }
2824
2825 return 1;
2826 }
2827
2828 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2829 #define stbi__DNL(x) ((x) == 0xdc)
2830 #define stbi__SOI(x) ((x) == 0xd8)
2831 #define stbi__EOI(x) ((x) == 0xd9)
2832 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2833 #define stbi__SOS(x) ((x) == 0xda)
2834
2835 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2836
stbi__decode_jpeg_header(stbi__jpeg * z,int scan)2837 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2838 {
2839 int m;
2840 z->marker = STBI__MARKER_none; // initialize cached marker to empty
2841 m = stbi__get_marker(z);
2842 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2843 if (scan == STBI__SCAN_type) return 1;
2844 m = stbi__get_marker(z);
2845 while (!stbi__SOF(m)) {
2846 if (!stbi__process_marker(z,m)) return 0;
2847 m = stbi__get_marker(z);
2848 while (m == STBI__MARKER_none) {
2849 // some files have extra padding after their blocks, so ok, we'll scan
2850 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2851 m = stbi__get_marker(z);
2852 }
2853 }
2854 z->progressive = stbi__SOF_progressive(m);
2855 if (!stbi__process_frame_header(z, scan)) return 0;
2856 return 1;
2857 }
2858
2859 // decode image to YCbCr format
stbi__decode_jpeg_image(stbi__jpeg * j)2860 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2861 {
2862 int m;
2863 for (m = 0; m < 4; m++) {
2864 j->img_comp[m].raw_data = NULL;
2865 j->img_comp[m].raw_coeff = NULL;
2866 }
2867 j->restart_interval = 0;
2868 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2869 m = stbi__get_marker(j);
2870 while (!stbi__EOI(m)) {
2871 if (stbi__SOS(m)) {
2872 if (!stbi__process_scan_header(j)) return 0;
2873 if (!stbi__parse_entropy_coded_data(j)) return 0;
2874 if (j->marker == STBI__MARKER_none ) {
2875 // handle 0s at the end of image data from IP Kamera 9060
2876 while (!stbi__at_eof(j->s)) {
2877 int x = stbi__get8(j->s);
2878 if (x == 255) {
2879 j->marker = stbi__get8(j->s);
2880 break;
2881 } else if (x != 0) {
2882 return stbi__err("junk before marker", "Corrupt JPEG");
2883 }
2884 }
2885 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2886 }
2887 } else {
2888 if (!stbi__process_marker(j, m)) return 0;
2889 }
2890 m = stbi__get_marker(j);
2891 }
2892 if (j->progressive)
2893 stbi__jpeg_finish(j);
2894 return 1;
2895 }
2896
2897 // static jfif-centered resampling (across block boundaries)
2898
2899 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2900 int w, int hs);
2901
2902 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2903
resample_row_1(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2904 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2905 {
2906 STBI_NOTUSED(out);
2907 STBI_NOTUSED(in_far);
2908 STBI_NOTUSED(w);
2909 STBI_NOTUSED(hs);
2910 return in_near;
2911 }
2912
stbi__resample_row_v_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2913 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2914 {
2915 // need to generate two samples vertically for every one in input
2916 int i;
2917 STBI_NOTUSED(hs);
2918 for (i=0; i < w; ++i)
2919 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2920 return out;
2921 }
2922
stbi__resample_row_h_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2923 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2924 {
2925 // need to generate two samples horizontally for every one in input
2926 int i;
2927 stbi_uc *input = in_near;
2928
2929 if (w == 1) {
2930 // if only one sample, can't do any interpolation
2931 out[0] = out[1] = input[0];
2932 return out;
2933 }
2934
2935 out[0] = input[0];
2936 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2937 for (i=1; i < w-1; ++i) {
2938 int n = 3*input[i]+2;
2939 out[i*2+0] = stbi__div4(n+input[i-1]);
2940 out[i*2+1] = stbi__div4(n+input[i+1]);
2941 }
2942 out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
2943 out[i*2+1] = input[w-1];
2944
2945 STBI_NOTUSED(in_far);
2946 STBI_NOTUSED(hs);
2947
2948 return out;
2949 }
2950
2951 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2952
stbi__resample_row_hv_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2953 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2954 {
2955 // need to generate 2x2 samples for every one in input
2956 int i,t0,t1;
2957 if (w == 1) {
2958 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2959 return out;
2960 }
2961
2962 t1 = 3*in_near[0] + in_far[0];
2963 out[0] = stbi__div4(t1+2);
2964 for (i=1; i < w; ++i) {
2965 t0 = t1;
2966 t1 = 3*in_near[i]+in_far[i];
2967 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2968 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2969 }
2970 out[w*2-1] = stbi__div4(t1+2);
2971
2972 STBI_NOTUSED(hs);
2973
2974 return out;
2975 }
2976
2977 #if defined(STBI_SSE2) || defined(STBI_NEON)
stbi__resample_row_hv_2_simd(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2978 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2979 {
2980 // need to generate 2x2 samples for every one in input
2981 int i=0,t0,t1;
2982
2983 if (w == 1) {
2984 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2985 return out;
2986 }
2987
2988 t1 = 3*in_near[0] + in_far[0];
2989 // process groups of 8 pixels for as long as we can.
2990 // note we can't handle the last pixel in a row in this loop
2991 // because we need to handle the filter boundary conditions.
2992 for (; i < ((w-1) & ~7); i += 8) {
2993 #if defined(STBI_SSE2)
2994 // load and perform the vertical filtering pass
2995 // this uses 3*x + y = 4*x + (y - x)
2996 __m128i zero = _mm_setzero_si128();
2997 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2998 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2999 __m128i farw = _mm_unpacklo_epi8(farb, zero);
3000 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
3001 __m128i diff = _mm_sub_epi16(farw, nearw);
3002 __m128i nears = _mm_slli_epi16(nearw, 2);
3003 __m128i curr = _mm_add_epi16(nears, diff); // current row
3004
3005 // horizontal filter works the same based on shifted vers of current
3006 // row. "prev" is current row shifted right by 1 pixel; we need to
3007 // insert the previous pixel value (from t1).
3008 // "next" is current row shifted left by 1 pixel, with first pixel
3009 // of next block of 8 pixels added in.
3010 __m128i prv0 = _mm_slli_si128(curr, 2);
3011 __m128i nxt0 = _mm_srli_si128(curr, 2);
3012 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
3013 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
3014
3015 // horizontal filter, polyphase implementation since it's convenient:
3016 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3017 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3018 // note the shared term.
3019 __m128i bias = _mm_set1_epi16(8);
3020 __m128i curs = _mm_slli_epi16(curr, 2);
3021 __m128i prvd = _mm_sub_epi16(prev, curr);
3022 __m128i nxtd = _mm_sub_epi16(next, curr);
3023 __m128i curb = _mm_add_epi16(curs, bias);
3024 __m128i even = _mm_add_epi16(prvd, curb);
3025 __m128i odd = _mm_add_epi16(nxtd, curb);
3026
3027 // interleave even and odd pixels, then undo scaling.
3028 __m128i int0 = _mm_unpacklo_epi16(even, odd);
3029 __m128i int1 = _mm_unpackhi_epi16(even, odd);
3030 __m128i de0 = _mm_srli_epi16(int0, 4);
3031 __m128i de1 = _mm_srli_epi16(int1, 4);
3032
3033 // pack and write output
3034 __m128i outv = _mm_packus_epi16(de0, de1);
3035 _mm_storeu_si128((__m128i *) (out + i*2), outv);
3036 #elif defined(STBI_NEON)
3037 // load and perform the vertical filtering pass
3038 // this uses 3*x + y = 4*x + (y - x)
3039 uint8x8_t farb = vld1_u8(in_far + i);
3040 uint8x8_t nearb = vld1_u8(in_near + i);
3041 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3042 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3043 int16x8_t curr = vaddq_s16(nears, diff); // current row
3044
3045 // horizontal filter works the same based on shifted vers of current
3046 // row. "prev" is current row shifted right by 1 pixel; we need to
3047 // insert the previous pixel value (from t1).
3048 // "next" is current row shifted left by 1 pixel, with first pixel
3049 // of next block of 8 pixels added in.
3050 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3051 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3052 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3053 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3054
3055 // horizontal filter, polyphase implementation since it's convenient:
3056 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3057 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3058 // note the shared term.
3059 int16x8_t curs = vshlq_n_s16(curr, 2);
3060 int16x8_t prvd = vsubq_s16(prev, curr);
3061 int16x8_t nxtd = vsubq_s16(next, curr);
3062 int16x8_t even = vaddq_s16(curs, prvd);
3063 int16x8_t odd = vaddq_s16(curs, nxtd);
3064
3065 // undo scaling and round, then store with even/odd phases interleaved
3066 uint8x8x2_t o;
3067 o.val[0] = vqrshrun_n_s16(even, 4);
3068 o.val[1] = vqrshrun_n_s16(odd, 4);
3069 vst2_u8(out + i*2, o);
3070 #endif
3071
3072 // "previous" value for next iter
3073 t1 = 3*in_near[i+7] + in_far[i+7];
3074 }
3075
3076 t0 = t1;
3077 t1 = 3*in_near[i] + in_far[i];
3078 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3079
3080 for (++i; i < w; ++i) {
3081 t0 = t1;
3082 t1 = 3*in_near[i]+in_far[i];
3083 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3084 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3085 }
3086 out[w*2-1] = stbi__div4(t1+2);
3087
3088 STBI_NOTUSED(hs);
3089
3090 return out;
3091 }
3092 #endif
3093
stbi__resample_row_generic(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)3094 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3095 {
3096 // resample with nearest-neighbor
3097 int i,j;
3098 STBI_NOTUSED(in_far);
3099 for (i=0; i < w; ++i)
3100 for (j=0; j < hs; ++j)
3101 out[i*hs+j] = in_near[i];
3102 return out;
3103 }
3104
3105 #ifdef STBI_JPEG_OLD
3106 // this is the same YCbCr-to-RGB calculation that stb_image has used
3107 // historically before the algorithm changes in 1.49
3108 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3109 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3110 {
3111 int i;
3112 for (i=0; i < count; ++i) {
3113 int y_fixed = (y[i] << 16) + 32768; // rounding
3114 int r,g,b;
3115 int cr = pcr[i] - 128;
3116 int cb = pcb[i] - 128;
3117 r = y_fixed + cr*float2fixed(1.40200f);
3118 g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3119 b = y_fixed + cb*float2fixed(1.77200f);
3120 r >>= 16;
3121 g >>= 16;
3122 b >>= 16;
3123 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3124 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3125 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3126 out[0] = (stbi_uc)r;
3127 out[1] = (stbi_uc)g;
3128 out[2] = (stbi_uc)b;
3129 out[3] = 255;
3130 out += step;
3131 }
3132 }
3133 #else
3134 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3135 // to make sure the code produces the same results in both SIMD and scalar
3136 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3137 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3138 {
3139 int i;
3140 for (i=0; i < count; ++i) {
3141 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3142 int r,g,b;
3143 int cr = pcr[i] - 128;
3144 int cb = pcb[i] - 128;
3145 r = y_fixed + cr* float2fixed(1.40200f);
3146 g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3147 b = y_fixed + cb* float2fixed(1.77200f);
3148 r >>= 20;
3149 g >>= 20;
3150 b >>= 20;
3151 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3152 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3153 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3154 out[0] = (stbi_uc)r;
3155 out[1] = (stbi_uc)g;
3156 out[2] = (stbi_uc)b;
3157 out[3] = 255;
3158 out += step;
3159 }
3160 }
3161 #endif
3162
3163 #if defined(STBI_SSE2) || defined(STBI_NEON)
stbi__YCbCr_to_RGB_simd(stbi_uc * out,stbi_uc const * y,stbi_uc const * pcb,stbi_uc const * pcr,int count,int step)3164 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3165 {
3166 int i = 0;
3167
3168 #ifdef STBI_SSE2
3169 // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3170 // it's useful in practice (you wouldn't use it for textures, for example).
3171 // so just accelerate step == 4 case.
3172 if (step == 4) {
3173 // this is a fairly straightforward implementation and not super-optimized.
3174 __m128i signflip = _mm_set1_epi8(-0x80);
3175 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3176 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3177 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3178 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3179 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3180 __m128i xw = _mm_set1_epi16(255); // alpha channel
3181
3182 for (; i+7 < count; i += 8) {
3183 // load
3184 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3185 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3186 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3187 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3188 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3189
3190 // unpack to short (and left-shift cr, cb by 8)
3191 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3192 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3193 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3194
3195 // color transform
3196 __m128i yws = _mm_srli_epi16(yw, 4);
3197 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3198 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3199 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3200 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3201 __m128i rws = _mm_add_epi16(cr0, yws);
3202 __m128i gwt = _mm_add_epi16(cb0, yws);
3203 __m128i bws = _mm_add_epi16(yws, cb1);
3204 __m128i gws = _mm_add_epi16(gwt, cr1);
3205
3206 // descale
3207 __m128i rw = _mm_srai_epi16(rws, 4);
3208 __m128i bw = _mm_srai_epi16(bws, 4);
3209 __m128i gw = _mm_srai_epi16(gws, 4);
3210
3211 // back to byte, set up for transpose
3212 __m128i brb = _mm_packus_epi16(rw, bw);
3213 __m128i gxb = _mm_packus_epi16(gw, xw);
3214
3215 // transpose to interleave channels
3216 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3217 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3218 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3219 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3220
3221 // store
3222 _mm_storeu_si128((__m128i *) (out + 0), o0);
3223 _mm_storeu_si128((__m128i *) (out + 16), o1);
3224 out += 32;
3225 }
3226 }
3227 #endif
3228
3229 #ifdef STBI_NEON
3230 // in this version, step=3 support would be easy to add. but is there demand?
3231 if (step == 4) {
3232 // this is a fairly straightforward implementation and not super-optimized.
3233 uint8x8_t signflip = vdup_n_u8(0x80);
3234 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3235 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3236 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3237 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3238
3239 for (; i+7 < count; i += 8) {
3240 // load
3241 uint8x8_t y_bytes = vld1_u8(y + i);
3242 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3243 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3244 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3245 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3246
3247 // expand to s16
3248 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3249 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3250 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3251
3252 // color transform
3253 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3254 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3255 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3256 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3257 int16x8_t rws = vaddq_s16(yws, cr0);
3258 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3259 int16x8_t bws = vaddq_s16(yws, cb1);
3260
3261 // undo scaling, round, convert to byte
3262 uint8x8x4_t o;
3263 o.val[0] = vqrshrun_n_s16(rws, 4);
3264 o.val[1] = vqrshrun_n_s16(gws, 4);
3265 o.val[2] = vqrshrun_n_s16(bws, 4);
3266 o.val[3] = vdup_n_u8(255);
3267
3268 // store, interleaving r/g/b/a
3269 vst4_u8(out, o);
3270 out += 8*4;
3271 }
3272 }
3273 #endif
3274
3275 for (; i < count; ++i) {
3276 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3277 int r,g,b;
3278 int cr = pcr[i] - 128;
3279 int cb = pcb[i] - 128;
3280 r = y_fixed + cr* float2fixed(1.40200f);
3281 g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3282 b = y_fixed + cb* float2fixed(1.77200f);
3283 r >>= 20;
3284 g >>= 20;
3285 b >>= 20;
3286 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3287 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3288 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3289 out[0] = (stbi_uc)r;
3290 out[1] = (stbi_uc)g;
3291 out[2] = (stbi_uc)b;
3292 out[3] = 255;
3293 out += step;
3294 }
3295 }
3296 #endif
3297
3298 // set up the kernels
stbi__setup_jpeg(stbi__jpeg * j)3299 static void stbi__setup_jpeg(stbi__jpeg *j)
3300 {
3301 j->idct_block_kernel = stbi__idct_block;
3302 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3303 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3304
3305 #ifdef STBI_SSE2
3306 if (stbi__sse2_available()) {
3307 j->idct_block_kernel = stbi__idct_simd;
3308 #ifndef STBI_JPEG_OLD
3309 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3310 #endif
3311 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3312 }
3313 #endif
3314
3315 #ifdef STBI_NEON
3316 j->idct_block_kernel = stbi__idct_simd;
3317 #ifndef STBI_JPEG_OLD
3318 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3319 #endif
3320 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3321 #endif
3322 }
3323
3324 // clean up the temporary component buffers
stbi__cleanup_jpeg(stbi__jpeg * j)3325 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3326 {
3327 int i;
3328 for (i=0; i < j->s->img_n; ++i) {
3329 if (j->img_comp[i].raw_data) {
3330 STBI_FREE(j->img_comp[i].raw_data);
3331 j->img_comp[i].raw_data = NULL;
3332 j->img_comp[i].data = NULL;
3333 }
3334 if (j->img_comp[i].raw_coeff) {
3335 STBI_FREE(j->img_comp[i].raw_coeff);
3336 j->img_comp[i].raw_coeff = 0;
3337 j->img_comp[i].coeff = 0;
3338 }
3339 if (j->img_comp[i].linebuf) {
3340 STBI_FREE(j->img_comp[i].linebuf);
3341 j->img_comp[i].linebuf = NULL;
3342 }
3343 }
3344 }
3345
3346 typedef struct
3347 {
3348 resample_row_func resample;
3349 stbi_uc *line0,*line1;
3350 int hs,vs; // expansion factor in each axis
3351 int w_lores; // horizontal pixels pre-expansion
3352 int ystep; // how far through vertical expansion we are
3353 int ypos; // which pre-expansion row we're on
3354 } stbi__resample;
3355
load_jpeg_image(stbi__jpeg * z,int * out_x,int * out_y,int * comp,int req_comp)3356 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3357 {
3358 int n, decode_n;
3359 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3360
3361 // validate req_comp
3362 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3363
3364 // load a jpeg image from whichever source, but leave in YCbCr format
3365 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3366
3367 // determine actual number of components to generate
3368 n = req_comp ? req_comp : z->s->img_n;
3369
3370 if (z->s->img_n == 3 && n < 3)
3371 decode_n = 1;
3372 else
3373 decode_n = z->s->img_n;
3374
3375 // resample and color-convert
3376 {
3377 int k;
3378 unsigned int i,j;
3379 stbi_uc *output;
3380 stbi_uc *coutput[4];
3381
3382 stbi__resample res_comp[4];
3383
3384 for (k=0; k < decode_n; ++k) {
3385 stbi__resample *r = &res_comp[k];
3386
3387 // allocate line buffer big enough for upsampling off the edges
3388 // with upsample factor of 4
3389 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3390 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3391
3392 r->hs = z->img_h_max / z->img_comp[k].h;
3393 r->vs = z->img_v_max / z->img_comp[k].v;
3394 r->ystep = r->vs >> 1;
3395 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3396 r->ypos = 0;
3397 r->line0 = r->line1 = z->img_comp[k].data;
3398
3399 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3400 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3401 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3402 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3403 else r->resample = stbi__resample_row_generic;
3404 }
3405
3406 // can't error after this so, this is safe
3407 output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3408 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3409
3410 // now go ahead and resample
3411 for (j=0; j < z->s->img_y; ++j) {
3412 stbi_uc *out = output + n * z->s->img_x * j;
3413 for (k=0; k < decode_n; ++k) {
3414 stbi__resample *r = &res_comp[k];
3415 int y_bot = r->ystep >= (r->vs >> 1);
3416 coutput[k] = r->resample(z->img_comp[k].linebuf,
3417 y_bot ? r->line1 : r->line0,
3418 y_bot ? r->line0 : r->line1,
3419 r->w_lores, r->hs);
3420 if (++r->ystep >= r->vs) {
3421 r->ystep = 0;
3422 r->line0 = r->line1;
3423 if (++r->ypos < z->img_comp[k].y)
3424 r->line1 += z->img_comp[k].w2;
3425 }
3426 }
3427 if (n >= 3) {
3428 stbi_uc *y = coutput[0];
3429 if (z->s->img_n == 3) {
3430 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3431 } else
3432 for (i=0; i < z->s->img_x; ++i) {
3433 out[0] = out[1] = out[2] = y[i];
3434 out[3] = 255; // not used if n==3
3435 out += n;
3436 }
3437 } else {
3438 stbi_uc *y = coutput[0];
3439 if (n == 1)
3440 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3441 else
3442 for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3443 }
3444 }
3445 stbi__cleanup_jpeg(z);
3446 *out_x = z->s->img_x;
3447 *out_y = z->s->img_y;
3448 if (comp) *comp = z->s->img_n; // report original components, not output
3449 return output;
3450 }
3451 }
3452
stbi__jpeg_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)3453 static unsigned char * STBI_FORCE_STACK_ALIGN stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3454 {
3455 stbi__jpeg j;
3456 j.s = s;
3457 stbi__setup_jpeg(&j);
3458 return load_jpeg_image(&j, x,y,comp,req_comp);
3459 }
3460
stbi__jpeg_test(stbi__context * s)3461 static int stbi__jpeg_test(stbi__context *s)
3462 {
3463 int r;
3464 stbi__jpeg j;
3465 j.s = s;
3466 stbi__setup_jpeg(&j);
3467 r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3468 stbi__rewind(s);
3469 return r;
3470 }
3471
stbi__jpeg_info_raw(stbi__jpeg * j,int * x,int * y,int * comp)3472 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3473 {
3474 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3475 stbi__rewind( j->s );
3476 return 0;
3477 }
3478 if (x) *x = j->s->img_x;
3479 if (y) *y = j->s->img_y;
3480 if (comp) *comp = j->s->img_n;
3481 return 1;
3482 }
3483
stbi__jpeg_info(stbi__context * s,int * x,int * y,int * comp)3484 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3485 {
3486 stbi__jpeg j;
3487 j.s = s;
3488 return stbi__jpeg_info_raw(&j, x, y, comp);
3489 }
3490 #endif
3491
3492 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3493 // simple implementation
3494 // - all input must be provided in an upfront buffer
3495 // - all output is written to a single output buffer (can malloc/realloc)
3496 // performance
3497 // - fast huffman
3498
3499 #ifndef STBI_NO_ZLIB
3500
3501 // fast-way is faster to check than jpeg huffman, but slow way is slower
3502 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3503 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3504
3505 // zlib-style huffman encoding
3506 // (jpegs packs from left, zlib from right, so can't share code)
3507 typedef struct
3508 {
3509 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3510 stbi__uint16 firstcode[16];
3511 int maxcode[17];
3512 stbi__uint16 firstsymbol[16];
3513 stbi_uc size[288];
3514 stbi__uint16 value[288];
3515 } stbi__zhuffman;
3516
stbi__bitreverse16(int n)3517 stbi_inline static int stbi__bitreverse16(int n)
3518 {
3519 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3520 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3521 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3522 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3523 return n;
3524 }
3525
stbi__bit_reverse(int v,int bits)3526 stbi_inline static int stbi__bit_reverse(int v, int bits)
3527 {
3528 STBI_ASSERT(bits <= 16);
3529 // to bit reverse n bits, reverse 16 and shift
3530 // e.g. 11 bits, bit reverse and shift away 5
3531 return stbi__bitreverse16(v) >> (16-bits);
3532 }
3533
stbi__zbuild_huffman(stbi__zhuffman * z,stbi_uc * sizelist,int num)3534 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3535 {
3536 int i,k=0;
3537 int code, next_code[16], sizes[17];
3538
3539 // DEFLATE spec for generating codes
3540 memset(sizes, 0, sizeof(sizes));
3541 memset(z->fast, 0, sizeof(z->fast));
3542 for (i=0; i < num; ++i)
3543 ++sizes[sizelist[i]];
3544 sizes[0] = 0;
3545 for (i=1; i < 16; ++i)
3546 if (sizes[i] > (1 << i))
3547 return stbi__err("bad sizes", "Corrupt PNG");
3548 code = 0;
3549 for (i=1; i < 16; ++i) {
3550 next_code[i] = code;
3551 z->firstcode[i] = (stbi__uint16) code;
3552 z->firstsymbol[i] = (stbi__uint16) k;
3553 code = (code + sizes[i]);
3554 if (sizes[i])
3555 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3556 z->maxcode[i] = code << (16-i); // preshift for inner loop
3557 code <<= 1;
3558 k += sizes[i];
3559 }
3560 z->maxcode[16] = 0x10000; // sentinel
3561 for (i=0; i < num; ++i) {
3562 int s = sizelist[i];
3563 if (s) {
3564 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3565 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3566 z->size [c] = (stbi_uc ) s;
3567 z->value[c] = (stbi__uint16) i;
3568 if (s <= STBI__ZFAST_BITS) {
3569 int k2 = stbi__bit_reverse(next_code[s],s);
3570 while (k2 < (1 << STBI__ZFAST_BITS)) {
3571 z->fast[k2] = fastv;
3572 k2 += (1 << s);
3573 }
3574 }
3575 ++next_code[s];
3576 }
3577 }
3578 return 1;
3579 }
3580
3581 // zlib-from-memory implementation for PNG reading
3582 // because PNG allows splitting the zlib stream arbitrarily,
3583 // and it's annoying structurally to have PNG call ZLIB call PNG,
3584 // we require PNG read all the IDATs and combine them into a single
3585 // memory buffer
3586
3587 typedef struct
3588 {
3589 stbi_uc *zbuffer, *zbuffer_end;
3590 int num_bits;
3591 stbi__uint32 code_buffer;
3592
3593 char *zout;
3594 char *zout_start;
3595 char *zout_end;
3596 int z_expandable;
3597
3598 stbi__zhuffman z_length, z_distance;
3599 } stbi__zbuf;
3600
stbi__zget8(stbi__zbuf * z)3601 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3602 {
3603 if (z->zbuffer >= z->zbuffer_end) return 0;
3604 return *z->zbuffer++;
3605 }
3606
stbi__fill_bits(stbi__zbuf * z)3607 static void stbi__fill_bits(stbi__zbuf *z)
3608 {
3609 do {
3610 STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3611 z->code_buffer |= stbi__zget8(z) << z->num_bits;
3612 z->num_bits += 8;
3613 } while (z->num_bits <= 24);
3614 }
3615
stbi__zreceive(stbi__zbuf * z,int n)3616 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3617 {
3618 unsigned int k;
3619 if (z->num_bits < n) stbi__fill_bits(z);
3620 k = z->code_buffer & ((1 << n) - 1);
3621 z->code_buffer >>= n;
3622 z->num_bits -= n;
3623 return k;
3624 }
3625
stbi__zhuffman_decode_slowpath(stbi__zbuf * a,stbi__zhuffman * z)3626 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3627 {
3628 int b,s,k;
3629 // not resolved by fast table, so compute it the slow way
3630 // use jpeg approach, which requires MSbits at top
3631 k = stbi__bit_reverse(a->code_buffer, 16);
3632 for (s=STBI__ZFAST_BITS+1; ; ++s)
3633 if (k < z->maxcode[s])
3634 break;
3635 if (s == 16) return -1; // invalid code!
3636 // code size is s, so:
3637 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3638 STBI_ASSERT(z->size[b] == s);
3639 a->code_buffer >>= s;
3640 a->num_bits -= s;
3641 return z->value[b];
3642 }
3643
stbi__zhuffman_decode(stbi__zbuf * a,stbi__zhuffman * z)3644 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3645 {
3646 int b,s;
3647 if (a->num_bits < 16) stbi__fill_bits(a);
3648 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3649 if (b) {
3650 s = b >> 9;
3651 a->code_buffer >>= s;
3652 a->num_bits -= s;
3653 return b & 511;
3654 }
3655 return stbi__zhuffman_decode_slowpath(a, z);
3656 }
3657
stbi__zexpand(stbi__zbuf * z,char * zout,int n)3658 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3659 {
3660 char *q;
3661 int cur, limit;
3662 z->zout = zout;
3663 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3664 cur = (int) (z->zout - z->zout_start);
3665 limit = (int) (z->zout_end - z->zout_start);
3666 while (cur + n > limit)
3667 limit *= 2;
3668 q = (char *) STBI_REALLOC(z->zout_start, limit);
3669 if (q == NULL) return stbi__err("outofmem", "Out of memory");
3670 z->zout_start = q;
3671 z->zout = q + cur;
3672 z->zout_end = q + limit;
3673 return 1;
3674 }
3675
3676 static int stbi__zlength_base[31] = {
3677 3,4,5,6,7,8,9,10,11,13,
3678 15,17,19,23,27,31,35,43,51,59,
3679 67,83,99,115,131,163,195,227,258,0,0 };
3680
3681 static int stbi__zlength_extra[31]=
3682 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3683
3684 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3685 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3686
3687 static int stbi__zdist_extra[32] =
3688 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3689
stbi__parse_huffman_block(stbi__zbuf * a)3690 static int stbi__parse_huffman_block(stbi__zbuf *a)
3691 {
3692 char *zout = a->zout;
3693 for(;;) {
3694 int z = stbi__zhuffman_decode(a, &a->z_length);
3695 if (z < 256) {
3696 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3697 if (zout >= a->zout_end) {
3698 if (!stbi__zexpand(a, zout, 1)) return 0;
3699 zout = a->zout;
3700 }
3701 *zout++ = (char) z;
3702 } else {
3703 stbi_uc *p;
3704 int len,dist;
3705 if (z == 256) {
3706 a->zout = zout;
3707 return 1;
3708 }
3709 z -= 257;
3710 len = stbi__zlength_base[z];
3711 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3712 z = stbi__zhuffman_decode(a, &a->z_distance);
3713 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3714 dist = stbi__zdist_base[z];
3715 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3716 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3717 if (zout + len > a->zout_end) {
3718 if (!stbi__zexpand(a, zout, len)) return 0;
3719 zout = a->zout;
3720 }
3721 p = (stbi_uc *) (zout - dist);
3722 if (dist == 1) { // run of one byte; common in images.
3723 stbi_uc v = *p;
3724 if (len) { do *zout++ = v; while (--len); }
3725 } else {
3726 if (len) { do *zout++ = *p++; while (--len); }
3727 }
3728 }
3729 }
3730 }
3731
stbi__compute_huffman_codes(stbi__zbuf * a)3732 static int stbi__compute_huffman_codes(stbi__zbuf *a)
3733 {
3734 static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3735 stbi__zhuffman z_codelength;
3736 stbi_uc lencodes[286+32+137];//padding for maximum single op
3737 stbi_uc codelength_sizes[19];
3738 int i,n;
3739
3740 int hlit = stbi__zreceive(a,5) + 257;
3741 int hdist = stbi__zreceive(a,5) + 1;
3742 int hclen = stbi__zreceive(a,4) + 4;
3743
3744 memset(codelength_sizes, 0, sizeof(codelength_sizes));
3745 for (i=0; i < hclen; ++i) {
3746 int s = stbi__zreceive(a,3);
3747 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3748 }
3749 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3750
3751 n = 0;
3752 while (n < hlit + hdist) {
3753 int c = stbi__zhuffman_decode(a, &z_codelength);
3754 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3755 if (c < 16)
3756 lencodes[n++] = (stbi_uc) c;
3757 else if (c == 16) {
3758 c = stbi__zreceive(a,2)+3;
3759 memset(lencodes+n, lencodes[n-1], c);
3760 n += c;
3761 } else if (c == 17) {
3762 c = stbi__zreceive(a,3)+3;
3763 memset(lencodes+n, 0, c);
3764 n += c;
3765 } else {
3766 STBI_ASSERT(c == 18);
3767 c = stbi__zreceive(a,7)+11;
3768 memset(lencodes+n, 0, c);
3769 n += c;
3770 }
3771 }
3772 if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3773 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3774 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3775 return 1;
3776 }
3777
stbi__parse_uncomperssed_block(stbi__zbuf * a)3778 static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
3779 {
3780 stbi_uc header[4];
3781 int len,nlen,k;
3782 if (a->num_bits & 7)
3783 stbi__zreceive(a, a->num_bits & 7); // discard
3784 // drain the bit-packed data into header
3785 k = 0;
3786 while (a->num_bits > 0) {
3787 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3788 a->code_buffer >>= 8;
3789 a->num_bits -= 8;
3790 }
3791 STBI_ASSERT(a->num_bits == 0);
3792 // now fill header the normal way
3793 while (k < 4)
3794 header[k++] = stbi__zget8(a);
3795 len = header[1] * 256 + header[0];
3796 nlen = header[3] * 256 + header[2];
3797 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3798 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3799 if (a->zout + len > a->zout_end)
3800 if (!stbi__zexpand(a, a->zout, len)) return 0;
3801 memcpy(a->zout, a->zbuffer, len);
3802 a->zbuffer += len;
3803 a->zout += len;
3804 return 1;
3805 }
3806
stbi__parse_zlib_header(stbi__zbuf * a)3807 static int stbi__parse_zlib_header(stbi__zbuf *a)
3808 {
3809 int cmf = stbi__zget8(a);
3810 int cm = cmf & 15;
3811 /* int cinfo = cmf >> 4; */
3812 int flg = stbi__zget8(a);
3813 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3814 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3815 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3816 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3817 return 1;
3818 }
3819
3820 // @TODO: should statically initialize these for optimal thread safety
3821 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
stbi__init_zdefaults(void)3822 static void stbi__init_zdefaults(void)
3823 {
3824 int i; // use <= to match clearly with spec
3825 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3826 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3827 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3828 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3829
3830 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3831 }
3832
stbi__parse_zlib(stbi__zbuf * a,int parse_header)3833 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3834 {
3835 int final, type;
3836 if (parse_header)
3837 if (!stbi__parse_zlib_header(a)) return 0;
3838 a->num_bits = 0;
3839 a->code_buffer = 0;
3840 do {
3841 final = stbi__zreceive(a,1);
3842 type = stbi__zreceive(a,2);
3843 if (type == 0) {
3844 if (!stbi__parse_uncomperssed_block(a)) return 0;
3845 } else if (type == 3) {
3846 return 0;
3847 } else {
3848 if (type == 1) {
3849 // use fixed code lengths
3850 if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3851 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3852 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3853 } else {
3854 if (!stbi__compute_huffman_codes(a)) return 0;
3855 }
3856 if (!stbi__parse_huffman_block(a)) return 0;
3857 }
3858 } while (!final);
3859 return 1;
3860 }
3861
stbi__do_zlib(stbi__zbuf * a,char * obuf,int olen,int exp,int parse_header)3862 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3863 {
3864 a->zout_start = obuf;
3865 a->zout = obuf;
3866 a->zout_end = obuf + olen;
3867 a->z_expandable = exp;
3868
3869 return stbi__parse_zlib(a, parse_header);
3870 }
3871
stbi_zlib_decode_malloc_guesssize(const char * buffer,int len,int initial_size,int * outlen)3872 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3873 {
3874 stbi__zbuf a;
3875 char *p = (char *) stbi__malloc(initial_size);
3876 if (p == NULL) return NULL;
3877 a.zbuffer = (stbi_uc *) buffer;
3878 a.zbuffer_end = (stbi_uc *) buffer + len;
3879 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3880 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3881 return a.zout_start;
3882 } else {
3883 STBI_FREE(a.zout_start);
3884 return NULL;
3885 }
3886 }
3887
stbi_zlib_decode_malloc(char const * buffer,int len,int * outlen)3888 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3889 {
3890 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3891 }
3892
stbi_zlib_decode_malloc_guesssize_headerflag(const char * buffer,int len,int initial_size,int * outlen,int parse_header)3893 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3894 {
3895 stbi__zbuf a;
3896 char *p = (char *) stbi__malloc(initial_size);
3897 if (p == NULL) return NULL;
3898 a.zbuffer = (stbi_uc *) buffer;
3899 a.zbuffer_end = (stbi_uc *) buffer + len;
3900 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3901 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3902 return a.zout_start;
3903 } else {
3904 STBI_FREE(a.zout_start);
3905 return NULL;
3906 }
3907 }
3908
stbi_zlib_decode_buffer(char * obuffer,int olen,char const * ibuffer,int ilen)3909 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3910 {
3911 stbi__zbuf a;
3912 a.zbuffer = (stbi_uc *) ibuffer;
3913 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3914 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3915 return (int) (a.zout - a.zout_start);
3916 else
3917 return -1;
3918 }
3919
stbi_zlib_decode_noheader_malloc(char const * buffer,int len,int * outlen)3920 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3921 {
3922 stbi__zbuf a;
3923 char *p = (char *) stbi__malloc(16384);
3924 if (p == NULL) return NULL;
3925 a.zbuffer = (stbi_uc *) buffer;
3926 a.zbuffer_end = (stbi_uc *) buffer+len;
3927 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3928 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3929 return a.zout_start;
3930 } else {
3931 STBI_FREE(a.zout_start);
3932 return NULL;
3933 }
3934 }
3935
stbi_zlib_decode_noheader_buffer(char * obuffer,int olen,const char * ibuffer,int ilen)3936 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3937 {
3938 stbi__zbuf a;
3939 a.zbuffer = (stbi_uc *) ibuffer;
3940 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3941 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3942 return (int) (a.zout - a.zout_start);
3943 else
3944 return -1;
3945 }
3946 #endif
3947
3948 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3949 // simple implementation
3950 // - only 8-bit samples
3951 // - no CRC checking
3952 // - allocates lots of intermediate memory
3953 // - avoids problem of streaming data between subsystems
3954 // - avoids explicit window management
3955 // performance
3956 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3957
3958 #ifndef STBI_NO_PNG
3959 typedef struct
3960 {
3961 stbi__uint32 length;
3962 stbi__uint32 type;
3963 } stbi__pngchunk;
3964
stbi__get_chunk_header(stbi__context * s)3965 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3966 {
3967 stbi__pngchunk c;
3968 c.length = stbi__get32be(s);
3969 c.type = stbi__get32be(s);
3970 return c;
3971 }
3972
stbi__check_png_header(stbi__context * s)3973 static int stbi__check_png_header(stbi__context *s)
3974 {
3975 static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3976 int i;
3977 for (i=0; i < 8; ++i)
3978 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3979 return 1;
3980 }
3981
3982 typedef struct
3983 {
3984 stbi__context *s;
3985 stbi_uc *idata, *expanded, *out;
3986 } stbi__png;
3987
3988
3989 enum {
3990 STBI__F_none=0,
3991 STBI__F_sub=1,
3992 STBI__F_up=2,
3993 STBI__F_avg=3,
3994 STBI__F_paeth=4,
3995 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3996 STBI__F_avg_first,
3997 STBI__F_paeth_first
3998 };
3999
4000 static stbi_uc first_row_filter[5] =
4001 {
4002 STBI__F_none,
4003 STBI__F_sub,
4004 STBI__F_none,
4005 STBI__F_avg_first,
4006 STBI__F_paeth_first
4007 };
4008
stbi__paeth(int a,int b,int c)4009 static int stbi__paeth(int a, int b, int c)
4010 {
4011 int p = a + b - c;
4012 int pa = abs(p-a);
4013 int pb = abs(p-b);
4014 int pc = abs(p-c);
4015 if (pa <= pb && pa <= pc) return a;
4016 if (pb <= pc) return b;
4017 return c;
4018 }
4019
4020 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
4021
4022 // create the png data from post-deflated data
stbi__create_png_image_raw(stbi__png * a,stbi_uc * raw,stbi__uint32 raw_len,int out_n,stbi__uint32 x,stbi__uint32 y,int depth,int color)4023 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4024 {
4025 stbi__context *s = a->s;
4026 stbi__uint32 i,j,stride = x*out_n;
4027 stbi__uint32 img_len, img_width_bytes;
4028 int k;
4029 int img_n = s->img_n; // copy it into a local for later
4030
4031 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4032 a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
4033 if (!a->out) return stbi__err("outofmem", "Out of memory");
4034
4035 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4036 img_len = (img_width_bytes + 1) * y;
4037 if (s->img_x == x && s->img_y == y) {
4038 if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4039 } else { // interlaced:
4040 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4041 }
4042
4043 for (j=0; j < y; ++j) {
4044 stbi_uc *cur = a->out + stride*j;
4045 stbi_uc *prior = cur - stride;
4046 int filter = *raw++;
4047 int filter_bytes = img_n;
4048 int width = x;
4049 if (filter > 4)
4050 return stbi__err("invalid filter","Corrupt PNG");
4051
4052 if (depth < 8) {
4053 STBI_ASSERT(img_width_bytes <= x);
4054 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4055 filter_bytes = 1;
4056 width = img_width_bytes;
4057 }
4058
4059 // if first row, use special filter that doesn't sample previous row
4060 if (j == 0) filter = first_row_filter[filter];
4061
4062 // handle first byte explicitly
4063 for (k=0; k < filter_bytes; ++k) {
4064 switch (filter) {
4065 case STBI__F_none : cur[k] = raw[k]; break;
4066 case STBI__F_sub : cur[k] = raw[k]; break;
4067 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4068 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4069 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4070 case STBI__F_avg_first : cur[k] = raw[k]; break;
4071 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4072 }
4073 }
4074
4075 if (depth == 8) {
4076 if (img_n != out_n)
4077 cur[img_n] = 255; // first pixel
4078 raw += img_n;
4079 cur += out_n;
4080 prior += out_n;
4081 } else {
4082 raw += 1;
4083 cur += 1;
4084 prior += 1;
4085 }
4086
4087 // this is a little gross, so that we don't switch per-pixel or per-component
4088 if (depth < 8 || img_n == out_n) {
4089 int nk = (width - 1)*img_n;
4090 #define CASE(f) \
4091 case f: \
4092 for (k=0; k < nk; ++k)
4093 switch (filter) {
4094 // "none" filter turns into a memcpy here; make that explicit.
4095 case STBI__F_none: memcpy(cur, raw, nk);
4096 break;
4097 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]);
4098 break;
4099 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]);
4100 break;
4101 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1));
4102 break;
4103 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes]));
4104 break;
4105 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1));
4106 break;
4107 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0));
4108 break;
4109 }
4110 #undef CASE
4111 raw += nk;
4112 } else {
4113 STBI_ASSERT(img_n+1 == out_n);
4114 #define CASE(f) \
4115 case f: \
4116 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
4117 for (k=0; k < img_n; ++k)
4118 switch (filter) {
4119 CASE(STBI__F_none) cur[k] = raw[k];
4120 break;
4121 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]);
4122 break;
4123 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]);
4124 break;
4125 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1));
4126 break;
4127 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n]));
4128 break;
4129 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1));
4130 break;
4131 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0));
4132 break;
4133 }
4134 #undef CASE
4135 }
4136 }
4137
4138 // we make a separate pass to expand bits to pixels; for performance,
4139 // this could run two scanlines behind the above code, so it won't
4140 // intefere with filtering but will still be in the cache.
4141 if (depth < 8) {
4142 for (j=0; j < y; ++j) {
4143 stbi_uc *cur = a->out + stride*j;
4144 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4145 // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4146 // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4147 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4148
4149 // note that the final byte might overshoot and write more data than desired.
4150 // we can allocate enough data that this never writes out of memory, but it
4151 // could also overwrite the next scanline. can it overwrite non-empty data
4152 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4153 // so we need to explicitly clamp the final ones
4154
4155 if (depth == 4) {
4156 for (k=x*img_n; k >= 2; k-=2, ++in) {
4157 *cur++ = scale * ((*in >> 4) );
4158 *cur++ = scale * ((*in ) & 0x0f);
4159 }
4160 if (k > 0) *cur++ = scale * ((*in >> 4) );
4161 } else if (depth == 2) {
4162 for (k=x*img_n; k >= 4; k-=4, ++in) {
4163 *cur++ = scale * ((*in >> 6) );
4164 *cur++ = scale * ((*in >> 4) & 0x03);
4165 *cur++ = scale * ((*in >> 2) & 0x03);
4166 *cur++ = scale * ((*in ) & 0x03);
4167 }
4168 if (k > 0) *cur++ = scale * ((*in >> 6) );
4169 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4170 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4171 } else if (depth == 1) {
4172 for (k=x*img_n; k >= 8; k-=8, ++in) {
4173 *cur++ = scale * ((*in >> 7) );
4174 *cur++ = scale * ((*in >> 6) & 0x01);
4175 *cur++ = scale * ((*in >> 5) & 0x01);
4176 *cur++ = scale * ((*in >> 4) & 0x01);
4177 *cur++ = scale * ((*in >> 3) & 0x01);
4178 *cur++ = scale * ((*in >> 2) & 0x01);
4179 *cur++ = scale * ((*in >> 1) & 0x01);
4180 *cur++ = scale * ((*in ) & 0x01);
4181 }
4182 if (k > 0) *cur++ = scale * ((*in >> 7) );
4183 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4184 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4185 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4186 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4187 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4188 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4189 }
4190 if (img_n != out_n) {
4191 // insert alpha = 255
4192 stbi_uc *cur2 = a->out + stride*j;
4193 int i2;
4194 if (img_n == 1) {
4195 for (i2=x-1; i2 >= 0; --i2) {
4196 cur2[i2*2+1] = 255;
4197 cur2[i2*2+0] = cur2[i2];
4198 }
4199 } else {
4200 STBI_ASSERT(img_n == 3);
4201 for (i2=x-1; i2 >= 0; --i2) {
4202 cur2[i2*4+3] = 255;
4203 cur2[i2*4+2] = cur2[i2*3+2];
4204 cur2[i2*4+1] = cur2[i2*3+1];
4205 cur2[i2*4+0] = cur2[i2*3+0];
4206 }
4207 }
4208 }
4209 }
4210 }
4211
4212 return 1;
4213 }
4214
stbi__create_png_image(stbi__png * a,stbi_uc * image_data,stbi__uint32 image_data_len,int out_n,int depth,int color,int interlaced)4215 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4216 {
4217 stbi_uc *final;
4218 int p;
4219 if (!interlaced)
4220 return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4221
4222 // de-interlacing
4223 final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
4224 for (p=0; p < 7; ++p) {
4225 int xorig[] = { 0,4,0,2,0,1,0 };
4226 int yorig[] = { 0,0,4,0,2,0,1 };
4227 int xspc[] = { 8,8,4,4,2,2,1 };
4228 int yspc[] = { 8,8,8,4,4,2,2 };
4229 int i,j,x,y;
4230 // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4231 x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4232 y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4233 if (x && y) {
4234 stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4235 if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4236 STBI_FREE(final);
4237 return 0;
4238 }
4239 for (j=0; j < y; ++j) {
4240 for (i=0; i < x; ++i) {
4241 int out_y = j*yspc[p]+yorig[p];
4242 int out_x = i*xspc[p]+xorig[p];
4243 memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
4244 a->out + (j*x+i)*out_n, out_n);
4245 }
4246 }
4247 STBI_FREE(a->out);
4248 image_data += img_len;
4249 image_data_len -= img_len;
4250 }
4251 }
4252 a->out = final;
4253
4254 return 1;
4255 }
4256
stbi__compute_transparency(stbi__png * z,stbi_uc tc[3],int out_n)4257 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4258 {
4259 stbi__context *s = z->s;
4260 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4261 stbi_uc *p = z->out;
4262
4263 // compute color-based transparency, assuming we've
4264 // already got 255 as the alpha value in the output
4265 STBI_ASSERT(out_n == 2 || out_n == 4);
4266
4267 if (out_n == 2) {
4268 for (i=0; i < pixel_count; ++i) {
4269 p[1] = (p[0] == tc[0] ? 0 : 255);
4270 p += 2;
4271 }
4272 } else {
4273 for (i=0; i < pixel_count; ++i) {
4274 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4275 p[3] = 0;
4276 p += 4;
4277 }
4278 }
4279 return 1;
4280 }
4281
stbi__expand_png_palette(stbi__png * a,stbi_uc * palette,int len,int pal_img_n)4282 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4283 {
4284 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4285 stbi_uc *p, *temp_out, *orig = a->out;
4286
4287 p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4288 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4289
4290 // between here and free(out) below, exitting would leak
4291 temp_out = p;
4292
4293 if (pal_img_n == 3) {
4294 for (i=0; i < pixel_count; ++i) {
4295 int n = orig[i]*4;
4296 p[0] = palette[n ];
4297 p[1] = palette[n+1];
4298 p[2] = palette[n+2];
4299 p += 3;
4300 }
4301 } else {
4302 for (i=0; i < pixel_count; ++i) {
4303 int n = orig[i]*4;
4304 p[0] = palette[n ];
4305 p[1] = palette[n+1];
4306 p[2] = palette[n+2];
4307 p[3] = palette[n+3];
4308 p += 4;
4309 }
4310 }
4311 STBI_FREE(a->out);
4312 a->out = temp_out;
4313
4314 STBI_NOTUSED(len);
4315
4316 return 1;
4317 }
4318
4319 static int stbi__unpremultiply_on_load = 0;
4320 static int stbi__de_iphone_flag = 0;
4321
stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)4322 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4323 {
4324 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4325 }
4326
stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)4327 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4328 {
4329 stbi__de_iphone_flag = flag_true_if_should_convert;
4330 }
4331
stbi__de_iphone(stbi__png * z)4332 static void stbi__de_iphone(stbi__png *z)
4333 {
4334 stbi__context *s = z->s;
4335 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4336 stbi_uc *p = z->out;
4337
4338 if (s->img_out_n == 3) { // convert bgr to rgb
4339 for (i=0; i < pixel_count; ++i) {
4340 stbi_uc t = p[0];
4341 p[0] = p[2];
4342 p[2] = t;
4343 p += 3;
4344 }
4345 } else {
4346 STBI_ASSERT(s->img_out_n == 4);
4347 if (stbi__unpremultiply_on_load) {
4348 // convert bgr to rgb and unpremultiply
4349 for (i=0; i < pixel_count; ++i) {
4350 stbi_uc a = p[3];
4351 stbi_uc t = p[0];
4352 if (a) {
4353 p[0] = p[2] * 255 / a;
4354 p[1] = p[1] * 255 / a;
4355 p[2] = t * 255 / a;
4356 } else {
4357 p[0] = p[2];
4358 p[2] = t;
4359 }
4360 p += 4;
4361 }
4362 } else {
4363 // convert bgr to rgb
4364 for (i=0; i < pixel_count; ++i) {
4365 stbi_uc t = p[0];
4366 p[0] = p[2];
4367 p[2] = t;
4368 p += 4;
4369 }
4370 }
4371 }
4372 }
4373
4374 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4375
stbi__parse_png_file(stbi__png * z,int scan,int req_comp)4376 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4377 {
4378 stbi_uc palette[1024], pal_img_n=0;
4379 stbi_uc has_trans=0, tc[3];
4380 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4381 int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
4382 stbi__context *s = z->s;
4383
4384 z->expanded = NULL;
4385 z->idata = NULL;
4386 z->out = NULL;
4387
4388 if (!stbi__check_png_header(s)) return 0;
4389
4390 if (scan == STBI__SCAN_type) return 1;
4391
4392 for (;;) {
4393 stbi__pngchunk c = stbi__get_chunk_header(s);
4394 switch (c.type) {
4395 case STBI__PNG_TYPE('C','g','B','I'):
4396 is_iphone = 1;
4397 stbi__skip(s, c.length);
4398 break;
4399 case STBI__PNG_TYPE('I','H','D','R'): {
4400 int comp,filter;
4401 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4402 first = 0;
4403 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4404 s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4405 s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4406 depth = stbi__get8(s); if (depth != 1 && depth != 2 && depth != 4 && depth != 8) return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
4407 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4408 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4409 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4410 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4411 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4412 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4413 if (!pal_img_n) {
4414 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4415 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4416 if (scan == STBI__SCAN_header) return 1;
4417 } else {
4418 // if paletted, then pal_n is our final components, and
4419 // img_n is # components to decompress/filter.
4420 s->img_n = 1;
4421 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4422 // if SCAN_header, have to scan to see if we have a tRNS
4423 }
4424 break;
4425 }
4426
4427 case STBI__PNG_TYPE('P','L','T','E'): {
4428 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4429 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4430 pal_len = c.length / 3;
4431 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4432 for (i=0; i < pal_len; ++i) {
4433 palette[i*4+0] = stbi__get8(s);
4434 palette[i*4+1] = stbi__get8(s);
4435 palette[i*4+2] = stbi__get8(s);
4436 palette[i*4+3] = 255;
4437 }
4438 break;
4439 }
4440
4441 case STBI__PNG_TYPE('t','R','N','S'): {
4442 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4443 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4444 if (pal_img_n) {
4445 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4446 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4447 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4448 pal_img_n = 4;
4449 for (i=0; i < c.length; ++i)
4450 palette[i*4+3] = stbi__get8(s);
4451 } else {
4452 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4453 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4454 has_trans = 1;
4455 for (k=0; k < s->img_n; ++k)
4456 tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
4457 }
4458 break;
4459 }
4460
4461 case STBI__PNG_TYPE('I','D','A','T'): {
4462 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4463 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4464 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4465 if ((int)(ioff + c.length) < (int)ioff) return 0;
4466 if (ioff + c.length > idata_limit) {
4467 stbi_uc *p;
4468 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4469 while (ioff + c.length > idata_limit)
4470 idata_limit *= 2;
4471 p = (stbi_uc *) STBI_REALLOC(z->idata, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4472 z->idata = p;
4473 }
4474 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4475 ioff += c.length;
4476 break;
4477 }
4478
4479 case STBI__PNG_TYPE('I','E','N','D'): {
4480 stbi__uint32 raw_len, bpl;
4481 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4482 if (scan != STBI__SCAN_load) return 1;
4483 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4484 // initial guess for decoded data size to avoid unnecessary reallocs
4485 bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
4486 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4487 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4488 if (z->expanded == NULL) return 0; // zlib should set error
4489 STBI_FREE(z->idata); z->idata = NULL;
4490 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4491 s->img_out_n = s->img_n+1;
4492 else
4493 s->img_out_n = s->img_n;
4494 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
4495 if (has_trans)
4496 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4497 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4498 stbi__de_iphone(z);
4499 if (pal_img_n) {
4500 // pal_img_n == 3 or 4
4501 s->img_n = pal_img_n; // record the actual colors we had
4502 s->img_out_n = pal_img_n;
4503 if (req_comp >= 3) s->img_out_n = req_comp;
4504 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4505 return 0;
4506 }
4507 STBI_FREE(z->expanded); z->expanded = NULL;
4508 return 1;
4509 }
4510
4511 default:
4512 // if critical, fail
4513 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4514 if ((c.type & (1 << 29)) == 0) {
4515 #ifndef STBI_NO_FAILURE_STRINGS
4516 // not threadsafe
4517 static char invalid_chunk[] = "XXXX PNG chunk not known";
4518 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4519 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4520 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4521 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4522 #endif
4523 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4524 }
4525 stbi__skip(s, c.length);
4526 break;
4527 }
4528 // end of PNG chunk, read and skip CRC
4529 stbi__get32be(s);
4530 }
4531 }
4532
stbi__do_png(stbi__png * p,int * x,int * y,int * n,int req_comp)4533 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4534 {
4535 unsigned char *result=NULL;
4536 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4537 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4538 result = p->out;
4539 p->out = NULL;
4540 if (req_comp && req_comp != p->s->img_out_n) {
4541 result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4542 p->s->img_out_n = req_comp;
4543 if (result == NULL) return result;
4544 }
4545 *x = p->s->img_x;
4546 *y = p->s->img_y;
4547 if (n) *n = p->s->img_out_n;
4548 }
4549 STBI_FREE(p->out); p->out = NULL;
4550 STBI_FREE(p->expanded); p->expanded = NULL;
4551 STBI_FREE(p->idata); p->idata = NULL;
4552
4553 return result;
4554 }
4555
stbi__png_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4556 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4557 {
4558 stbi__png p;
4559 p.s = s;
4560 return stbi__do_png(&p, x,y,comp,req_comp);
4561 }
4562
stbi__png_test(stbi__context * s)4563 static int stbi__png_test(stbi__context *s)
4564 {
4565 int r;
4566 r = stbi__check_png_header(s);
4567 stbi__rewind(s);
4568 return r;
4569 }
4570
stbi__png_info_raw(stbi__png * p,int * x,int * y,int * comp)4571 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4572 {
4573 if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4574 stbi__rewind( p->s );
4575 return 0;
4576 }
4577 if (x) *x = p->s->img_x;
4578 if (y) *y = p->s->img_y;
4579 if (comp) *comp = p->s->img_n;
4580 return 1;
4581 }
4582
stbi__png_info(stbi__context * s,int * x,int * y,int * comp)4583 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4584 {
4585 stbi__png p;
4586 p.s = s;
4587 return stbi__png_info_raw(&p, x, y, comp);
4588 }
4589 #endif
4590
4591 // Microsoft/Windows BMP image
4592
4593 #ifndef STBI_NO_BMP
stbi__bmp_test_raw(stbi__context * s)4594 static int stbi__bmp_test_raw(stbi__context *s)
4595 {
4596 int r;
4597 int sz;
4598 if (stbi__get8(s) != 'B') return 0;
4599 if (stbi__get8(s) != 'M') return 0;
4600 stbi__get32le(s); // discard filesize
4601 stbi__get16le(s); // discard reserved
4602 stbi__get16le(s); // discard reserved
4603 stbi__get32le(s); // discard data offset
4604 sz = stbi__get32le(s);
4605 r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4606 return r;
4607 }
4608
stbi__bmp_test(stbi__context * s)4609 static int stbi__bmp_test(stbi__context *s)
4610 {
4611 int r = stbi__bmp_test_raw(s);
4612 stbi__rewind(s);
4613 return r;
4614 }
4615
4616
4617 // returns 0..31 for the highest set bit
stbi__high_bit(unsigned int z)4618 static int stbi__high_bit(unsigned int z)
4619 {
4620 int n=0;
4621 if (z == 0) return -1;
4622 if (z >= 0x10000) n += 16, z >>= 16;
4623 if (z >= 0x00100) n += 8, z >>= 8;
4624 if (z >= 0x00010) n += 4, z >>= 4;
4625 if (z >= 0x00004) n += 2, z >>= 2;
4626 if (z >= 0x00002) n += 1, z >>= 1;
4627 return n;
4628 }
4629
stbi__bitcount(unsigned int a)4630 static int stbi__bitcount(unsigned int a)
4631 {
4632 a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
4633 a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
4634 a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4635 a = (a + (a >> 8)); // max 16 per 8 bits
4636 a = (a + (a >> 16)); // max 32 per 8 bits
4637 return a & 0xff;
4638 }
4639
stbi__shiftsigned(int v,int shift,int bits)4640 static int stbi__shiftsigned(int v, int shift, int bits)
4641 {
4642 int result;
4643 int z=0;
4644
4645 if (shift < 0) v <<= -shift;
4646 else v >>= shift;
4647 result = v;
4648
4649 z = bits;
4650 while (z < 8) {
4651 result += v >> z;
4652 z += bits;
4653 }
4654 return result;
4655 }
4656
stbi__bmp_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4657 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4658 {
4659 stbi_uc *out;
4660 unsigned int mr=0,mg=0,mb=0,ma=0, fake_a=0;
4661 stbi_uc pal[256][4];
4662 int psize=0,i,j,compress=0,width;
4663 int bpp, flip_vertically, pad, target, offset, hsz;
4664 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4665 stbi__get32le(s); // discard filesize
4666 stbi__get16le(s); // discard reserved
4667 stbi__get16le(s); // discard reserved
4668 offset = stbi__get32le(s);
4669 hsz = stbi__get32le(s);
4670 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4671 if (hsz == 12) {
4672 s->img_x = stbi__get16le(s);
4673 s->img_y = stbi__get16le(s);
4674 } else {
4675 s->img_x = stbi__get32le(s);
4676 s->img_y = stbi__get32le(s);
4677 }
4678 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4679 bpp = stbi__get16le(s);
4680 if (bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4681 flip_vertically = ((int) s->img_y) > 0;
4682 s->img_y = abs((int) s->img_y);
4683 if (hsz == 12) {
4684 if (bpp < 24)
4685 psize = (offset - 14 - 24) / 3;
4686 } else {
4687 compress = stbi__get32le(s);
4688 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4689 stbi__get32le(s); // discard sizeof
4690 stbi__get32le(s); // discard hres
4691 stbi__get32le(s); // discard vres
4692 stbi__get32le(s); // discard colorsused
4693 stbi__get32le(s); // discard max important
4694 if (hsz == 40 || hsz == 56) {
4695 if (hsz == 56) {
4696 stbi__get32le(s);
4697 stbi__get32le(s);
4698 stbi__get32le(s);
4699 stbi__get32le(s);
4700 }
4701 if (bpp == 16 || bpp == 32) {
4702 mr = mg = mb = 0;
4703 if (compress == 0) {
4704 if (bpp == 32) {
4705 mr = 0xffu << 16;
4706 mg = 0xffu << 8;
4707 mb = 0xffu << 0;
4708 ma = 0xffu << 24;
4709 fake_a = 1; // @TODO: check for cases like alpha value is all 0 and switch it to 255
4710 STBI_NOTUSED(fake_a);
4711 } else {
4712 mr = 31u << 10;
4713 mg = 31u << 5;
4714 mb = 31u << 0;
4715 }
4716 } else if (compress == 3) {
4717 mr = stbi__get32le(s);
4718 mg = stbi__get32le(s);
4719 mb = stbi__get32le(s);
4720 // not documented, but generated by photoshop and handled by mspaint
4721 if (mr == mg && mg == mb) {
4722 // ?!?!?
4723 return stbi__errpuc("bad BMP", "bad BMP");
4724 }
4725 } else
4726 return stbi__errpuc("bad BMP", "bad BMP");
4727 }
4728 } else {
4729 STBI_ASSERT(hsz == 108 || hsz == 124);
4730 mr = stbi__get32le(s);
4731 mg = stbi__get32le(s);
4732 mb = stbi__get32le(s);
4733 ma = stbi__get32le(s);
4734 stbi__get32le(s); // discard color space
4735 for (i=0; i < 12; ++i)
4736 stbi__get32le(s); // discard color space parameters
4737 if (hsz == 124) {
4738 stbi__get32le(s); // discard rendering intent
4739 stbi__get32le(s); // discard offset of profile data
4740 stbi__get32le(s); // discard size of profile data
4741 stbi__get32le(s); // discard reserved
4742 }
4743 }
4744 if (bpp < 16)
4745 psize = (offset - 14 - hsz) >> 2;
4746 }
4747 s->img_n = ma ? 4 : 3;
4748 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4749 target = req_comp;
4750 else
4751 target = s->img_n; // if they want monochrome, we'll post-convert
4752 out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4753 if (!out) return stbi__errpuc("outofmem", "Out of memory");
4754 if (bpp < 16) {
4755 int z=0;
4756 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4757 for (i=0; i < psize; ++i) {
4758 pal[i][2] = stbi__get8(s);
4759 pal[i][1] = stbi__get8(s);
4760 pal[i][0] = stbi__get8(s);
4761 if (hsz != 12) stbi__get8(s);
4762 pal[i][3] = 255;
4763 }
4764 stbi__skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
4765 if (bpp == 4) width = (s->img_x + 1) >> 1;
4766 else if (bpp == 8) width = s->img_x;
4767 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4768 pad = (-width)&3;
4769 for (j=0; j < (int) s->img_y; ++j) {
4770 for (i=0; i < (int) s->img_x; i += 2) {
4771 int v=stbi__get8(s),v2=0;
4772 if (bpp == 4) {
4773 v2 = v & 15;
4774 v >>= 4;
4775 }
4776 out[z++] = pal[v][0];
4777 out[z++] = pal[v][1];
4778 out[z++] = pal[v][2];
4779 if (target == 4) out[z++] = 255;
4780 if (i+1 == (int) s->img_x) break;
4781 v = (bpp == 8) ? stbi__get8(s) : v2;
4782 out[z++] = pal[v][0];
4783 out[z++] = pal[v][1];
4784 out[z++] = pal[v][2];
4785 if (target == 4) out[z++] = 255;
4786 }
4787 stbi__skip(s, pad);
4788 }
4789 } else {
4790 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4791 int z = 0;
4792 int easy=0;
4793 stbi__skip(s, offset - 14 - hsz);
4794 if (bpp == 24) width = 3 * s->img_x;
4795 else if (bpp == 16) width = 2*s->img_x;
4796 else /* bpp = 32 and pad = 0 */ width=0;
4797 pad = (-width) & 3;
4798 if (bpp == 24) {
4799 easy = 1;
4800 } else if (bpp == 32) {
4801 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4802 easy = 2;
4803 }
4804 if (!easy) {
4805 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4806 // right shift amt to put high bit in position #7
4807 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4808 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4809 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4810 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4811 }
4812 for (j=0; j < (int) s->img_y; ++j) {
4813 if (easy) {
4814 for (i=0; i < (int) s->img_x; ++i) {
4815 unsigned char a;
4816 out[z+2] = stbi__get8(s);
4817 out[z+1] = stbi__get8(s);
4818 out[z+0] = stbi__get8(s);
4819 z += 3;
4820 a = (easy == 2 ? stbi__get8(s) : 255);
4821 if (target == 4) out[z++] = a;
4822 }
4823 } else {
4824 for (i=0; i < (int) s->img_x; ++i) {
4825 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4826 int a;
4827 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4828 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4829 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4830 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4831 if (target == 4) out[z++] = STBI__BYTECAST(a);
4832 }
4833 }
4834 stbi__skip(s, pad);
4835 }
4836 }
4837 if (flip_vertically) {
4838 stbi_uc t;
4839 for (j=0; j < (int) s->img_y>>1; ++j) {
4840 stbi_uc *p1 = out + j *s->img_x*target;
4841 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4842 for (i=0; i < (int) s->img_x*target; ++i) {
4843 t = p1[i], p1[i] = p2[i], p2[i] = t;
4844 }
4845 }
4846 }
4847
4848 if (req_comp && req_comp != target) {
4849 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4850 if (out == NULL) return out; // stbi__convert_format frees input on failure
4851 }
4852
4853 *x = s->img_x;
4854 *y = s->img_y;
4855 if (comp) *comp = s->img_n;
4856 return out;
4857 }
4858 #endif
4859
4860 // Targa Truevision - TGA
4861 // by Jonathan Dummer
4862 #ifndef STBI_NO_TGA
stbi__tga_info(stbi__context * s,int * x,int * y,int * comp)4863 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4864 {
4865 int tga_w, tga_h, tga_comp;
4866 int sz;
4867 stbi__get8(s); // discard Offset
4868 sz = stbi__get8(s); // color type
4869 if( sz > 1 ) {
4870 stbi__rewind(s);
4871 return 0; // only RGB or indexed allowed
4872 }
4873 sz = stbi__get8(s); // image type
4874 // only RGB or grey allowed, +/- RLE
4875 if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
4876 stbi__skip(s,9);
4877 tga_w = stbi__get16le(s);
4878 if( tga_w < 1 ) {
4879 stbi__rewind(s);
4880 return 0; // test width
4881 }
4882 tga_h = stbi__get16le(s);
4883 if( tga_h < 1 ) {
4884 stbi__rewind(s);
4885 return 0; // test height
4886 }
4887 sz = stbi__get8(s); // bits per pixel
4888 // only RGB or RGBA or grey allowed
4889 if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
4890 stbi__rewind(s);
4891 return 0;
4892 }
4893 tga_comp = sz;
4894 if (x) *x = tga_w;
4895 if (y) *y = tga_h;
4896 if (comp) *comp = tga_comp / 8;
4897 return 1; // seems to have passed everything
4898 }
4899
stbi__tga_test(stbi__context * s)4900 static int stbi__tga_test(stbi__context *s)
4901 {
4902 int res;
4903 int sz;
4904 stbi__get8(s); // discard Offset
4905 sz = stbi__get8(s); // color type
4906 if ( sz > 1 ) return 0; // only RGB or indexed allowed
4907 sz = stbi__get8(s); // image type
4908 if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0; // only RGB or grey allowed, +/- RLE
4909 stbi__get16be(s); // discard palette start
4910 stbi__get16be(s); // discard palette length
4911 stbi__get8(s); // discard bits per palette color entry
4912 stbi__get16be(s); // discard x origin
4913 stbi__get16be(s); // discard y origin
4914 if ( stbi__get16be(s) < 1 ) return 0; // test width
4915 if ( stbi__get16be(s) < 1 ) return 0; // test height
4916 sz = stbi__get8(s); // bits per pixel
4917 if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) )
4918 res = 0;
4919 else
4920 res = 1;
4921 stbi__rewind(s);
4922 return res;
4923 }
4924
stbi__tga_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4925 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4926 {
4927 // read in the TGA header stuff
4928 int tga_offset = stbi__get8(s);
4929 int tga_indexed = stbi__get8(s);
4930 int tga_image_type = stbi__get8(s);
4931 int tga_is_RLE = 0;
4932 int tga_palette_start = stbi__get16le(s);
4933 int tga_palette_len = stbi__get16le(s);
4934 int tga_palette_bits = stbi__get8(s);
4935 int tga_x_origin = stbi__get16le(s);
4936 int tga_y_origin = stbi__get16le(s);
4937 int tga_width = stbi__get16le(s);
4938 int tga_height = stbi__get16le(s);
4939 int tga_bits_per_pixel = stbi__get8(s);
4940 int tga_comp = tga_bits_per_pixel / 8;
4941 int tga_inverted = stbi__get8(s);
4942 // image data
4943 unsigned char *tga_data;
4944 unsigned char *tga_palette = NULL;
4945 int i, j;
4946 unsigned char raw_data[4];
4947 int RLE_count = 0;
4948 int RLE_repeating = 0;
4949 int read_next_pixel = 1;
4950
4951 // do a tiny bit of precessing
4952 if ( tga_image_type >= 8 )
4953 {
4954 tga_image_type -= 8;
4955 tga_is_RLE = 1;
4956 }
4957 /* int tga_alpha_bits = tga_inverted & 15; */
4958 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
4959
4960 // error check
4961 if ( //(tga_indexed) ||
4962 (tga_width < 1) || (tga_height < 1) ||
4963 (tga_image_type < 1) || (tga_image_type > 3) ||
4964 ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
4965 (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
4966 )
4967 {
4968 return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
4969 }
4970
4971 // If I'm paletted, then I'll use the number of bits from the palette
4972 if ( tga_indexed )
4973 {
4974 tga_comp = tga_palette_bits / 8;
4975 }
4976
4977 // tga info
4978 *x = tga_width;
4979 *y = tga_height;
4980 if (comp) *comp = tga_comp;
4981
4982 tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
4983 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
4984
4985 // skip to the data's starting position (offset usually = 0)
4986 stbi__skip(s, tga_offset );
4987
4988 if ( !tga_indexed && !tga_is_RLE) {
4989 for (i=0; i < tga_height; ++i) {
4990 int y2 = tga_inverted ? tga_height -i - 1 : i;
4991 stbi_uc *tga_row = tga_data + y2*tga_width*tga_comp;
4992 stbi__getn(s, tga_row, tga_width * tga_comp);
4993 }
4994 } else {
4995 // do I need to load a palette?
4996 if ( tga_indexed)
4997 {
4998 // any data to skip? (offset usually = 0)
4999 stbi__skip(s, tga_palette_start );
5000 // load the palette
5001 tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_palette_bits / 8 );
5002 if (!tga_palette) {
5003 STBI_FREE(tga_data);
5004 return stbi__errpuc("outofmem", "Out of memory");
5005 }
5006 if (!stbi__getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
5007 STBI_FREE(tga_data);
5008 STBI_FREE(tga_palette);
5009 return stbi__errpuc("bad palette", "Corrupt TGA");
5010 }
5011 }
5012 // load the data
5013 for (i=0; i < tga_width * tga_height; ++i)
5014 {
5015 // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
5016 if ( tga_is_RLE )
5017 {
5018 if ( RLE_count == 0 )
5019 {
5020 // yep, get the next byte as a RLE command
5021 int RLE_cmd = stbi__get8(s);
5022 RLE_count = 1 + (RLE_cmd & 127);
5023 RLE_repeating = RLE_cmd >> 7;
5024 read_next_pixel = 1;
5025 } else if ( !RLE_repeating )
5026 {
5027 read_next_pixel = 1;
5028 }
5029 } else
5030 {
5031 read_next_pixel = 1;
5032 }
5033 // OK, if I need to read a pixel, do it now
5034 if ( read_next_pixel )
5035 {
5036 // load however much data we did have
5037 if ( tga_indexed )
5038 {
5039 // read in 1 byte, then perform the lookup
5040 int pal_idx = stbi__get8(s);
5041 if ( pal_idx >= tga_palette_len )
5042 {
5043 // invalid index
5044 pal_idx = 0;
5045 }
5046 pal_idx *= tga_bits_per_pixel / 8;
5047 for (j = 0; j*8 < tga_bits_per_pixel; ++j)
5048 {
5049 raw_data[j] = tga_palette[pal_idx+j];
5050 }
5051 } else
5052 {
5053 // read in the data raw
5054 for (j = 0; j*8 < tga_bits_per_pixel; ++j)
5055 {
5056 raw_data[j] = stbi__get8(s);
5057 }
5058 }
5059 // clear the reading flag for the next pixel
5060 read_next_pixel = 0;
5061 } // end of reading a pixel
5062
5063 // copy data
5064 for (j = 0; j < tga_comp; ++j)
5065 tga_data[i*tga_comp+j] = raw_data[j];
5066
5067 // in case we're in RLE mode, keep counting down
5068 --RLE_count;
5069 }
5070 // do I need to invert the image?
5071 if ( tga_inverted )
5072 {
5073 for (j = 0; j*2 < tga_height; ++j)
5074 {
5075 int index1 = j * tga_width * tga_comp;
5076 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5077 for (i = tga_width * tga_comp; i > 0; --i)
5078 {
5079 unsigned char temp = tga_data[index1];
5080 tga_data[index1] = tga_data[index2];
5081 tga_data[index2] = temp;
5082 ++index1;
5083 ++index2;
5084 }
5085 }
5086 }
5087 // clear my palette, if I had one
5088 if ( tga_palette != NULL )
5089 {
5090 STBI_FREE( tga_palette );
5091 }
5092 }
5093
5094 // swap RGB
5095 if (tga_comp >= 3)
5096 {
5097 unsigned char* tga_pixel = tga_data;
5098 for (i=0; i < tga_width * tga_height; ++i)
5099 {
5100 unsigned char temp = tga_pixel[0];
5101 tga_pixel[0] = tga_pixel[2];
5102 tga_pixel[2] = temp;
5103 tga_pixel += tga_comp;
5104 }
5105 }
5106
5107 // convert to target component count
5108 if (req_comp && req_comp != tga_comp)
5109 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5110
5111 // the things I do to get rid of an error message, and yet keep
5112 // Microsoft's C compilers happy... [8^(
5113 tga_palette_start = tga_palette_len = tga_palette_bits =
5114 tga_x_origin = tga_y_origin = 0;
5115 // OK, done
5116 return tga_data;
5117 }
5118 #endif
5119
5120 // *************************************************************************************************
5121 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5122
5123 #ifndef STBI_NO_PSD
stbi__psd_test(stbi__context * s)5124 static int stbi__psd_test(stbi__context *s)
5125 {
5126 int r = (stbi__get32be(s) == 0x38425053);
5127 stbi__rewind(s);
5128 return r;
5129 }
5130
stbi__psd_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5131 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5132 {
5133 int pixelCount;
5134 int channelCount, compression;
5135 int channel, i, count, len;
5136 int w,h;
5137 stbi_uc *out;
5138
5139 // Check identifier
5140 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5141 return stbi__errpuc("not PSD", "Corrupt PSD image");
5142
5143 // Check file type version.
5144 if (stbi__get16be(s) != 1)
5145 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5146
5147 // Skip 6 reserved bytes.
5148 stbi__skip(s, 6 );
5149
5150 // Read the number of channels (R, G, B, A, etc).
5151 channelCount = stbi__get16be(s);
5152 if (channelCount < 0 || channelCount > 16)
5153 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5154
5155 // Read the rows and columns of the image.
5156 h = stbi__get32be(s);
5157 w = stbi__get32be(s);
5158
5159 // Make sure the depth is 8 bits.
5160 if (stbi__get16be(s) != 8)
5161 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 bit");
5162
5163 // Make sure the color mode is RGB.
5164 // Valid options are:
5165 // 0: Bitmap
5166 // 1: Grayscale
5167 // 2: Indexed color
5168 // 3: RGB color
5169 // 4: CMYK color
5170 // 7: Multichannel
5171 // 8: Duotone
5172 // 9: Lab color
5173 if (stbi__get16be(s) != 3)
5174 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5175
5176 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5177 stbi__skip(s,stbi__get32be(s) );
5178
5179 // Skip the image resources. (resolution, pen tool paths, etc)
5180 stbi__skip(s, stbi__get32be(s) );
5181
5182 // Skip the reserved data.
5183 stbi__skip(s, stbi__get32be(s) );
5184
5185 // Find out if the data is compressed.
5186 // Known values:
5187 // 0: no compression
5188 // 1: RLE compressed
5189 compression = stbi__get16be(s);
5190 if (compression > 1)
5191 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5192
5193 // Create the destination image.
5194 out = (stbi_uc *) stbi__malloc(channelCount * w*h);
5195 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5196 pixelCount = w*h;
5197
5198 // Initialize the data to zero.
5199 //memset( out, 0, pixelCount * 4 );
5200
5201 // Finally, the image data.
5202 if (compression) {
5203 // RLE as used by .PSD and .TIFF
5204 // Loop until you get the number of unpacked bytes you are expecting:
5205 // Read the next source byte into n.
5206 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5207 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5208 // Else if n is 128, noop.
5209 // Endloop
5210
5211 // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
5212 // which we're going to just skip.
5213 stbi__skip(s, h * channelCount * 2 );
5214
5215 // Read the RLE data by channel.
5216 for (channel = 0; channel < channelCount; channel++) {
5217 stbi_uc *p;
5218
5219 p = out+channel;
5220 if (channel >= channelCount) {
5221 // Fill this channel with default data.
5222 for (i = 0; i < pixelCount; i++) *p = (channel == 3 ? 255 : 0), p += channelCount;
5223 } else {
5224 // Read the RLE data.
5225 count = 0;
5226 while (count < pixelCount) {
5227 len = stbi__get8(s);
5228 if (len == 128) {
5229 // No-op.
5230 } else if (len < 128) {
5231 // Copy next len+1 bytes literally.
5232 len++;
5233 count += len;
5234 while (len) {
5235 *p = stbi__get8(s);
5236 p += channelCount;
5237 len--;
5238 }
5239 } else if (len > 128) {
5240 stbi_uc val;
5241 // Next -len+1 bytes in the dest are replicated from next source byte.
5242 // (Interpret len as a negative 8-bit int.)
5243 len ^= 0x0FF;
5244 len += 2;
5245 val = stbi__get8(s);
5246 count += len;
5247 while (len) {
5248 *p = val;
5249 p += channelCount;
5250 len--;
5251 }
5252 }
5253 }
5254 }
5255 }
5256
5257 } else {
5258 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5259 // where each channel consists of an 8-bit value for each pixel in the image.
5260
5261 // Read the data by channel.
5262 for (channel = 0; channel < channelCount; channel++) {
5263 stbi_uc *p;
5264
5265 p = out + channel;
5266 if (channel > channelCount) {
5267 // Fill this channel with default data.
5268 for (i = 0; i < pixelCount; i++) *p = channel == 3 ? 255 : 0, p += channelCount;
5269 } else {
5270 // Read the data.
5271 for (i = 0; i < pixelCount; i++)
5272 *p = stbi__get8(s), p += channelCount;
5273 }
5274 }
5275 }
5276
5277 if (req_comp && req_comp != channelCount) {
5278 out = stbi__convert_format(out, channelCount, req_comp, w, h);
5279 if (out == NULL) return out; // stbi__convert_format frees input on failure
5280 }
5281
5282 if (comp) *comp = channelCount;
5283 *y = h;
5284 *x = w;
5285
5286 return out;
5287 }
5288 #endif
5289
5290 // *************************************************************************************************
5291 // Softimage PIC loader
5292 // by Tom Seddon
5293 //
5294 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5295 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5296
5297 #ifndef STBI_NO_PIC
stbi__pic_is4(stbi__context * s,const char * str)5298 static int stbi__pic_is4(stbi__context *s,const char *str)
5299 {
5300 int i;
5301 for (i=0; i<4; ++i)
5302 if (stbi__get8(s) != (stbi_uc)str[i])
5303 return 0;
5304
5305 return 1;
5306 }
5307
stbi__pic_test_core(stbi__context * s)5308 static int stbi__pic_test_core(stbi__context *s)
5309 {
5310 int i;
5311
5312 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5313 return 0;
5314
5315 for(i=0;i<84;++i)
5316 stbi__get8(s);
5317
5318 if (!stbi__pic_is4(s,"PICT"))
5319 return 0;
5320
5321 return 1;
5322 }
5323
5324 typedef struct
5325 {
5326 stbi_uc size,type,channel;
5327 } stbi__pic_packet;
5328
stbi__readval(stbi__context * s,int channel,stbi_uc * dest)5329 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5330 {
5331 int mask=0x80, i;
5332
5333 for (i=0; i<4; ++i, mask>>=1) {
5334 if (channel & mask) {
5335 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5336 dest[i]=stbi__get8(s);
5337 }
5338 }
5339
5340 return dest;
5341 }
5342
stbi__copyval(int channel,stbi_uc * dest,const stbi_uc * src)5343 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5344 {
5345 int mask=0x80,i;
5346
5347 for (i=0;i<4; ++i, mask>>=1)
5348 if (channel&mask)
5349 dest[i]=src[i];
5350 }
5351
stbi__pic_load_core(stbi__context * s,int width,int height,int * comp,stbi_uc * result)5352 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5353 {
5354 int act_comp=0,num_packets=0,y,chained;
5355 stbi__pic_packet packets[10];
5356
5357 // this will (should...) cater for even some bizarre stuff like having data
5358 // for the same channel in multiple packets.
5359 do {
5360 stbi__pic_packet *packet;
5361
5362 if (num_packets==sizeof(packets)/sizeof(packets[0]))
5363 return stbi__errpuc("bad format","too many packets");
5364
5365 packet = &packets[num_packets++];
5366
5367 chained = stbi__get8(s);
5368 packet->size = stbi__get8(s);
5369 packet->type = stbi__get8(s);
5370 packet->channel = stbi__get8(s);
5371
5372 act_comp |= packet->channel;
5373
5374 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5375 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5376 } while (chained);
5377
5378 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5379
5380 for(y=0; y<height; ++y) {
5381 int packet_idx;
5382
5383 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5384 stbi__pic_packet *packet = &packets[packet_idx];
5385 stbi_uc *dest = result+y*width*4;
5386
5387 switch (packet->type) {
5388 default:
5389 return stbi__errpuc("bad format","packet has bad compression type");
5390
5391 case 0: {//uncompressed
5392 int x;
5393
5394 for(x=0;x<width;++x, dest+=4)
5395 if (!stbi__readval(s,packet->channel,dest))
5396 return 0;
5397 break;
5398 }
5399
5400 case 1://Pure RLE
5401 {
5402 int left=width, i;
5403
5404 while (left>0) {
5405 stbi_uc count,value[4];
5406
5407 count=stbi__get8(s);
5408 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5409
5410 if (count > left)
5411 count = (stbi_uc) left;
5412
5413 if (!stbi__readval(s,packet->channel,value)) return 0;
5414
5415 for(i=0; i<count; ++i,dest+=4)
5416 stbi__copyval(packet->channel,dest,value);
5417 left -= count;
5418 }
5419 }
5420 break;
5421
5422 case 2: {//Mixed RLE
5423 int left=width;
5424 while (left>0) {
5425 int count = stbi__get8(s), i;
5426 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5427
5428 if (count >= 128) { // Repeated
5429 stbi_uc value[4];
5430
5431 if (count==128)
5432 count = stbi__get16be(s);
5433 else
5434 count -= 127;
5435 if (count > left)
5436 return stbi__errpuc("bad file","scanline overrun");
5437
5438 if (!stbi__readval(s,packet->channel,value))
5439 return 0;
5440
5441 for(i=0;i<count;++i, dest += 4)
5442 stbi__copyval(packet->channel,dest,value);
5443 } else { // Raw
5444 ++count;
5445 if (count>left) return stbi__errpuc("bad file","scanline overrun");
5446
5447 for(i=0;i<count;++i, dest+=4)
5448 if (!stbi__readval(s,packet->channel,dest))
5449 return 0;
5450 }
5451 left-=count;
5452 }
5453 break;
5454 }
5455 }
5456 }
5457 }
5458
5459 return result;
5460 }
5461
stbi__pic_load(stbi__context * s,int * px,int * py,int * comp,int req_comp)5462 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5463 {
5464 stbi_uc *result;
5465 int i, x,y;
5466
5467 for (i=0; i<92; ++i)
5468 stbi__get8(s);
5469
5470 x = stbi__get16be(s);
5471 y = stbi__get16be(s);
5472 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
5473 if ((1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5474
5475 stbi__get32be(s); //skip `ratio'
5476 stbi__get16be(s); //skip `fields'
5477 stbi__get16be(s); //skip `pad'
5478
5479 // intermediate buffer is RGBA
5480 result = (stbi_uc *) stbi__malloc(x*y*4);
5481 memset(result, 0xff, x*y*4);
5482
5483 if (!stbi__pic_load_core(s,x,y,comp, result)) {
5484 STBI_FREE(result);
5485 result=0;
5486 }
5487 *px = x;
5488 *py = y;
5489 if (req_comp == 0) req_comp = *comp;
5490 result=stbi__convert_format(result,4,req_comp,x,y);
5491
5492 return result;
5493 }
5494
stbi__pic_test(stbi__context * s)5495 static int stbi__pic_test(stbi__context *s)
5496 {
5497 int r = stbi__pic_test_core(s);
5498 stbi__rewind(s);
5499 return r;
5500 }
5501 #endif
5502
5503 // *************************************************************************************************
5504 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5505
5506 #ifndef STBI_NO_GIF
5507 typedef struct
5508 {
5509 stbi__int16 prefix;
5510 stbi_uc first;
5511 stbi_uc suffix;
5512 } stbi__gif_lzw;
5513
5514 typedef struct
5515 {
5516 int w,h;
5517 stbi_uc *out; // output buffer (always 4 components)
5518 int flags, bgindex, ratio, transparent, eflags;
5519 stbi_uc pal[256][4];
5520 stbi_uc lpal[256][4];
5521 stbi__gif_lzw codes[4096];
5522 stbi_uc *color_table;
5523 int parse, step;
5524 int lflags;
5525 int start_x, start_y;
5526 int max_x, max_y;
5527 int cur_x, cur_y;
5528 int line_size;
5529 } stbi__gif;
5530
stbi__gif_test_raw(stbi__context * s)5531 static int stbi__gif_test_raw(stbi__context *s)
5532 {
5533 int sz;
5534 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5535 sz = stbi__get8(s);
5536 if (sz != '9' && sz != '7') return 0;
5537 if (stbi__get8(s) != 'a') return 0;
5538 return 1;
5539 }
5540
stbi__gif_test(stbi__context * s)5541 static int stbi__gif_test(stbi__context *s)
5542 {
5543 int r = stbi__gif_test_raw(s);
5544 stbi__rewind(s);
5545 return r;
5546 }
5547
stbi__gif_parse_colortable(stbi__context * s,stbi_uc pal[256][4],int num_entries,int transp)5548 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5549 {
5550 int i;
5551 for (i=0; i < num_entries; ++i) {
5552 pal[i][2] = stbi__get8(s);
5553 pal[i][1] = stbi__get8(s);
5554 pal[i][0] = stbi__get8(s);
5555 pal[i][3] = transp == i ? 0 : 255;
5556 }
5557 }
5558
stbi__gif_header(stbi__context * s,stbi__gif * g,int * comp,int is_info)5559 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5560 {
5561 stbi_uc version;
5562 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5563 return stbi__err("not GIF", "Corrupt GIF");
5564
5565 version = stbi__get8(s);
5566 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5567 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5568
5569 stbi__g_failure_reason = "";
5570 g->w = stbi__get16le(s);
5571 g->h = stbi__get16le(s);
5572 g->flags = stbi__get8(s);
5573 g->bgindex = stbi__get8(s);
5574 g->ratio = stbi__get8(s);
5575 g->transparent = -1;
5576
5577 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5578
5579 if (is_info) return 1;
5580
5581 if (g->flags & 0x80)
5582 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5583
5584 return 1;
5585 }
5586
stbi__gif_info_raw(stbi__context * s,int * x,int * y,int * comp)5587 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5588 {
5589 stbi__gif g;
5590 if (!stbi__gif_header(s, &g, comp, 1)) {
5591 stbi__rewind( s );
5592 return 0;
5593 }
5594 if (x) *x = g.w;
5595 if (y) *y = g.h;
5596 return 1;
5597 }
5598
stbi__out_gif_code(stbi__gif * g,stbi__uint16 code)5599 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5600 {
5601 stbi_uc *p, *c;
5602
5603 // recurse to decode the prefixes, since the linked-list is backwards,
5604 // and working backwards through an interleaved image would be nasty
5605 if (g->codes[code].prefix >= 0)
5606 stbi__out_gif_code(g, g->codes[code].prefix);
5607
5608 if (g->cur_y >= g->max_y) return;
5609
5610 p = &g->out[g->cur_x + g->cur_y];
5611 c = &g->color_table[g->codes[code].suffix * 4];
5612
5613 if (c[3] >= 128) {
5614 p[0] = c[2];
5615 p[1] = c[1];
5616 p[2] = c[0];
5617 p[3] = c[3];
5618 }
5619 g->cur_x += 4;
5620
5621 if (g->cur_x >= g->max_x) {
5622 g->cur_x = g->start_x;
5623 g->cur_y += g->step;
5624
5625 while (g->cur_y >= g->max_y && g->parse > 0) {
5626 g->step = (1 << g->parse) * g->line_size;
5627 g->cur_y = g->start_y + (g->step >> 1);
5628 --g->parse;
5629 }
5630 }
5631 }
5632
stbi__process_gif_raster(stbi__context * s,stbi__gif * g)5633 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5634 {
5635 stbi_uc lzw_cs;
5636 stbi__int32 len, code;
5637 stbi__uint32 first;
5638 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5639 stbi__gif_lzw *p;
5640
5641 lzw_cs = stbi__get8(s);
5642 if (lzw_cs > 12) return NULL;
5643 clear = 1 << lzw_cs;
5644 first = 1;
5645 codesize = lzw_cs + 1;
5646 codemask = (1 << codesize) - 1;
5647 bits = 0;
5648 valid_bits = 0;
5649 for (code = 0; code < clear; code++) {
5650 g->codes[code].prefix = -1;
5651 g->codes[code].first = (stbi_uc) code;
5652 g->codes[code].suffix = (stbi_uc) code;
5653 }
5654
5655 // support no starting clear code
5656 avail = clear+2;
5657 oldcode = -1;
5658
5659 len = 0;
5660 for(;;) {
5661 if (valid_bits < codesize) {
5662 if (len == 0) {
5663 len = stbi__get8(s); // start new block
5664 if (len == 0)
5665 return g->out;
5666 }
5667 --len;
5668 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5669 valid_bits += 8;
5670 } else {
5671 stbi__int32 code2 = bits & codemask;
5672 bits >>= codesize;
5673 valid_bits -= codesize;
5674 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5675 if (code2 == clear) { // clear code
5676 codesize = lzw_cs + 1;
5677 codemask = (1 << codesize) - 1;
5678 avail = clear + 2;
5679 oldcode = -1;
5680 first = 0;
5681 } else if (code2 == clear + 1) { // end of stream code
5682 stbi__skip(s, len);
5683 while ((len = stbi__get8(s)) > 0)
5684 stbi__skip(s,len);
5685 return g->out;
5686 } else if (code2 <= avail) {
5687 if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5688
5689 if (oldcode >= 0) {
5690 p = &g->codes[avail++];
5691 if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5692 p->prefix = (stbi__int16) oldcode;
5693 p->first = g->codes[oldcode].first;
5694 p->suffix = (code2 == avail) ? p->first : g->codes[code2].first;
5695 } else if (code2 == avail)
5696 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5697
5698 stbi__out_gif_code(g, (stbi__uint16) code2);
5699
5700 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5701 codesize++;
5702 codemask = (1 << codesize) - 1;
5703 }
5704
5705 oldcode = code2;
5706 } else {
5707 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5708 }
5709 }
5710 }
5711 }
5712
stbi__fill_gif_background(stbi__gif * g)5713 static void stbi__fill_gif_background(stbi__gif *g)
5714 {
5715 int i;
5716 stbi_uc *c = g->pal[g->bgindex];
5717 // @OPTIMIZE: write a dword at a time
5718 for (i = 0; i < g->w * g->h * 4; i += 4) {
5719 stbi_uc *p = &g->out[i];
5720 p[0] = c[2];
5721 p[1] = c[1];
5722 p[2] = c[0];
5723 p[3] = c[3];
5724 }
5725 }
5726
5727 // this function is designed to support animated gifs, although stb_image doesn't support it
stbi__gif_load_next(stbi__context * s,stbi__gif * g,int * comp,int req_comp)5728 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5729 {
5730 int i;
5731 stbi_uc *old_out = 0;
5732
5733 if (g->out == 0) {
5734 if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
5735 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5736 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5737 stbi__fill_gif_background(g);
5738 } else {
5739 // animated-gif-only path
5740 if (((g->eflags & 0x1C) >> 2) == 3) {
5741 old_out = g->out;
5742 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5743 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5744 memcpy(g->out, old_out, g->w*g->h*4);
5745 }
5746 }
5747
5748 for (;;) {
5749 switch (stbi__get8(s)) {
5750 case 0x2C: /* Image Descriptor */
5751 {
5752 stbi__int32 x, y, w, h;
5753 stbi_uc *o;
5754
5755 x = stbi__get16le(s);
5756 y = stbi__get16le(s);
5757 w = stbi__get16le(s);
5758 h = stbi__get16le(s);
5759 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5760 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5761
5762 g->line_size = g->w * 4;
5763 g->start_x = x * 4;
5764 g->start_y = y * g->line_size;
5765 g->max_x = g->start_x + w * 4;
5766 g->max_y = g->start_y + h * g->line_size;
5767 g->cur_x = g->start_x;
5768 g->cur_y = g->start_y;
5769
5770 g->lflags = stbi__get8(s);
5771
5772 if (g->lflags & 0x40) {
5773 g->step = 8 * g->line_size; // first interlaced spacing
5774 g->parse = 3;
5775 } else {
5776 g->step = g->line_size;
5777 g->parse = 0;
5778 }
5779
5780 if (g->lflags & 0x80) {
5781 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5782 g->color_table = (stbi_uc *) g->lpal;
5783 } else if (g->flags & 0x80) {
5784 for (i=0; i < 256; ++i) // @OPTIMIZE: stbi__jpeg_reset only the previous transparent
5785 g->pal[i][3] = 255;
5786 if (g->transparent >= 0 && (g->eflags & 0x01))
5787 g->pal[g->transparent][3] = 0;
5788 g->color_table = (stbi_uc *) g->pal;
5789 } else
5790 return stbi__errpuc("missing color table", "Corrupt GIF");
5791
5792 o = stbi__process_gif_raster(s, g);
5793 if (o == NULL) return NULL;
5794
5795 if (req_comp && req_comp != 4)
5796 o = stbi__convert_format(o, 4, req_comp, g->w, g->h);
5797 return o;
5798 }
5799
5800 case 0x21: // Comment Extension.
5801 {
5802 int len;
5803 if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
5804 len = stbi__get8(s);
5805 if (len == 4) {
5806 g->eflags = stbi__get8(s);
5807 stbi__get16le(s); // delay
5808 g->transparent = stbi__get8(s);
5809 } else {
5810 stbi__skip(s, len);
5811 break;
5812 }
5813 }
5814 while ((len = stbi__get8(s)) != 0)
5815 stbi__skip(s, len);
5816 break;
5817 }
5818
5819 case 0x3B: // gif stream termination code
5820 return (stbi_uc *) s; // using '1' causes warning on some compilers
5821
5822 default:
5823 return stbi__errpuc("unknown code", "Corrupt GIF");
5824 }
5825 }
5826 }
5827
stbi__gif_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5828 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5829 {
5830 stbi_uc *u = 0;
5831 stbi__gif g;
5832 memset(&g, 0, sizeof(g));
5833
5834 u = stbi__gif_load_next(s, &g, comp, req_comp);
5835 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
5836 if (u) {
5837 *x = g.w;
5838 *y = g.h;
5839 }
5840
5841 return u;
5842 }
5843
stbi__gif_info(stbi__context * s,int * x,int * y,int * comp)5844 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
5845 {
5846 return stbi__gif_info_raw(s,x,y,comp);
5847 }
5848 #endif
5849
5850 // *************************************************************************************************
5851 // Radiance RGBE HDR loader
5852 // originally by Nicolas Schulz
5853 #ifndef STBI_NO_HDR
stbi__hdr_test_core(stbi__context * s)5854 static int stbi__hdr_test_core(stbi__context *s)
5855 {
5856 const char *signature = "#?RADIANCE\n";
5857 int i;
5858 for (i=0; signature[i]; ++i)
5859 if (stbi__get8(s) != signature[i])
5860 return 0;
5861 return 1;
5862 }
5863
stbi__hdr_test(stbi__context * s)5864 static int stbi__hdr_test(stbi__context* s)
5865 {
5866 int r = stbi__hdr_test_core(s);
5867 stbi__rewind(s);
5868 return r;
5869 }
5870
5871 #define STBI__HDR_BUFLEN 1024
stbi__hdr_gettoken(stbi__context * z,char * buffer)5872 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
5873 {
5874 int len=0;
5875 char c = '\0';
5876
5877 c = (char) stbi__get8(z);
5878
5879 while (!stbi__at_eof(z) && c != '\n') {
5880 buffer[len++] = c;
5881 if (len == STBI__HDR_BUFLEN-1) {
5882 // flush to end of line
5883 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
5884 ;
5885 break;
5886 }
5887 c = (char) stbi__get8(z);
5888 }
5889
5890 buffer[len] = 0;
5891 return buffer;
5892 }
5893
stbi__hdr_convert(float * output,stbi_uc * input,int req_comp)5894 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
5895 {
5896 if ( input[3] != 0 ) {
5897 float f1;
5898 // Exponent
5899 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
5900 if (req_comp <= 2)
5901 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
5902 else {
5903 output[0] = input[0] * f1;
5904 output[1] = input[1] * f1;
5905 output[2] = input[2] * f1;
5906 }
5907 if (req_comp == 2) output[1] = 1;
5908 if (req_comp == 4) output[3] = 1;
5909 } else {
5910 switch (req_comp) {
5911 case 4: output[3] = 1; /* fallthrough */
5912 case 3: output[0] = output[1] = output[2] = 0;
5913 break;
5914 case 2: output[1] = 1; /* fallthrough */
5915 case 1: output[0] = 0;
5916 break;
5917 }
5918 }
5919 }
5920
stbi__hdr_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5921 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5922 {
5923 char buffer[STBI__HDR_BUFLEN];
5924 char *token;
5925 int valid = 0;
5926 int width, height;
5927 stbi_uc *scanline;
5928 float *hdr_data;
5929 int len;
5930 unsigned char count, value;
5931 int i, j, k, c1,c2, z;
5932
5933
5934 // Check identifier
5935 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
5936 return stbi__errpf("not HDR", "Corrupt HDR image");
5937
5938 // Parse header
5939 for(;;) {
5940 token = stbi__hdr_gettoken(s,buffer);
5941 if (token[0] == 0) break;
5942 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
5943 }
5944
5945 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
5946
5947 // Parse width and height
5948 // can't use sscanf() if we're not using stdio!
5949 token = stbi__hdr_gettoken(s,buffer);
5950 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
5951 token += 3;
5952 height = (int) strtol(token, &token, 10);
5953 while (*token == ' ') ++token;
5954 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
5955 token += 3;
5956 width = (int) strtol(token, NULL, 10);
5957
5958 *x = width;
5959 *y = height;
5960
5961 if (comp) *comp = 3;
5962 if (req_comp == 0) req_comp = 3;
5963
5964 // Read data
5965 hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
5966
5967 // Load image data
5968 // image data is stored as some number of sca
5969 if ( width < 8 || width >= 32768) {
5970 // Read flat data
5971 for (j=0; j < height; ++j) {
5972 for (i=0; i < width; ++i) {
5973 stbi_uc rgbe[4];
5974 main_decode_loop:
5975 stbi__getn(s, rgbe, 4);
5976 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
5977 }
5978 }
5979 } else {
5980 // Read RLE-encoded data
5981 scanline = NULL;
5982
5983 for (j = 0; j < height; ++j) {
5984 c1 = stbi__get8(s);
5985 c2 = stbi__get8(s);
5986 len = stbi__get8(s);
5987 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
5988 // not run-length encoded, so we have to actually use THIS data as a decoded
5989 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
5990 stbi_uc rgbe[4];
5991 rgbe[0] = (stbi_uc) c1;
5992 rgbe[1] = (stbi_uc) c2;
5993 rgbe[2] = (stbi_uc) len;
5994 rgbe[3] = (stbi_uc) stbi__get8(s);
5995 stbi__hdr_convert(hdr_data, rgbe, req_comp);
5996 i = 1;
5997 j = 0;
5998 STBI_FREE(scanline);
5999 goto main_decode_loop; // yes, this makes no sense
6000 }
6001 len <<= 8;
6002 len |= stbi__get8(s);
6003 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
6004 if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
6005
6006 for (k = 0; k < 4; ++k) {
6007 i = 0;
6008 while (i < width) {
6009 count = stbi__get8(s);
6010 if (count > 128) {
6011 // Run
6012 value = stbi__get8(s);
6013 count -= 128;
6014 for (z = 0; z < count; ++z)
6015 scanline[i++ * 4 + k] = value;
6016 } else {
6017 // Dump
6018 for (z = 0; z < count; ++z)
6019 scanline[i++ * 4 + k] = stbi__get8(s);
6020 }
6021 }
6022 }
6023 for (i=0; i < width; ++i)
6024 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6025 }
6026 STBI_FREE(scanline);
6027 }
6028
6029 return hdr_data;
6030 }
6031
stbi__hdr_info(stbi__context * s,int * x,int * y,int * comp)6032 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6033 {
6034 char buffer[STBI__HDR_BUFLEN];
6035 char *token;
6036 int valid = 0;
6037
6038 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
6039 stbi__rewind( s );
6040 return 0;
6041 }
6042
6043 for(;;) {
6044 token = stbi__hdr_gettoken(s,buffer);
6045 if (token[0] == 0) break;
6046 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6047 }
6048
6049 if (!valid) {
6050 stbi__rewind( s );
6051 return 0;
6052 }
6053 token = stbi__hdr_gettoken(s,buffer);
6054 if (strncmp(token, "-Y ", 3)) {
6055 stbi__rewind( s );
6056 return 0;
6057 }
6058 token += 3;
6059 *y = (int) strtol(token, &token, 10);
6060 while (*token == ' ') ++token;
6061 if (strncmp(token, "+X ", 3)) {
6062 stbi__rewind( s );
6063 return 0;
6064 }
6065 token += 3;
6066 *x = (int) strtol(token, NULL, 10);
6067 *comp = 3;
6068 return 1;
6069 }
6070 #endif // STBI_NO_HDR
6071
6072 #ifndef STBI_NO_BMP
stbi__bmp_info(stbi__context * s,int * x,int * y,int * comp)6073 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6074 {
6075 int hsz;
6076 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') {
6077 stbi__rewind( s );
6078 return 0;
6079 }
6080 stbi__skip(s,12);
6081 hsz = stbi__get32le(s);
6082 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) {
6083 stbi__rewind( s );
6084 return 0;
6085 }
6086 if (hsz == 12) {
6087 *x = stbi__get16le(s);
6088 *y = stbi__get16le(s);
6089 } else {
6090 *x = stbi__get32le(s);
6091 *y = stbi__get32le(s);
6092 }
6093 if (stbi__get16le(s) != 1) {
6094 stbi__rewind( s );
6095 return 0;
6096 }
6097 *comp = stbi__get16le(s) / 8;
6098 return 1;
6099 }
6100 #endif
6101
6102 #ifndef STBI_NO_PSD
stbi__psd_info(stbi__context * s,int * x,int * y,int * comp)6103 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6104 {
6105 int channelCount;
6106 if (stbi__get32be(s) != 0x38425053) {
6107 stbi__rewind( s );
6108 return 0;
6109 }
6110 if (stbi__get16be(s) != 1) {
6111 stbi__rewind( s );
6112 return 0;
6113 }
6114 stbi__skip(s, 6);
6115 channelCount = stbi__get16be(s);
6116 if (channelCount < 0 || channelCount > 16) {
6117 stbi__rewind( s );
6118 return 0;
6119 }
6120 *y = stbi__get32be(s);
6121 *x = stbi__get32be(s);
6122 if (stbi__get16be(s) != 8) {
6123 stbi__rewind( s );
6124 return 0;
6125 }
6126 if (stbi__get16be(s) != 3) {
6127 stbi__rewind( s );
6128 return 0;
6129 }
6130 *comp = 4;
6131 return 1;
6132 }
6133 #endif
6134
6135 #ifndef STBI_NO_PIC
stbi__pic_info(stbi__context * s,int * x,int * y,int * comp)6136 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6137 {
6138 int act_comp=0,num_packets=0,chained;
6139 stbi__pic_packet packets[10];
6140
6141 stbi__skip(s, 92);
6142
6143 *x = stbi__get16be(s);
6144 *y = stbi__get16be(s);
6145 if (stbi__at_eof(s)) return 0;
6146 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6147 stbi__rewind( s );
6148 return 0;
6149 }
6150
6151 stbi__skip(s, 8);
6152
6153 do {
6154 stbi__pic_packet *packet;
6155
6156 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6157 return 0;
6158
6159 packet = &packets[num_packets++];
6160 chained = stbi__get8(s);
6161 packet->size = stbi__get8(s);
6162 packet->type = stbi__get8(s);
6163 packet->channel = stbi__get8(s);
6164 act_comp |= packet->channel;
6165
6166 if (stbi__at_eof(s)) {
6167 stbi__rewind( s );
6168 return 0;
6169 }
6170 if (packet->size != 8) {
6171 stbi__rewind( s );
6172 return 0;
6173 }
6174 } while (chained);
6175
6176 *comp = (act_comp & 0x10 ? 4 : 3);
6177
6178 return 1;
6179 }
6180 #endif
6181
6182 // *************************************************************************************************
6183 // Portable Gray Map and Portable Pixel Map loader
6184 // by Ken Miller
6185 //
6186 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6187 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6188 //
6189 // Known limitations:
6190 // Does not support comments in the header section
6191 // Does not support ASCII image data (formats P2 and P3)
6192 // Does not support 16-bit-per-channel
6193
6194 #ifndef STBI_NO_PNM
6195
stbi__pnm_test(stbi__context * s)6196 static int stbi__pnm_test(stbi__context *s)
6197 {
6198 char p, t;
6199 p = (char) stbi__get8(s);
6200 t = (char) stbi__get8(s);
6201 if (p != 'P' || (t != '5' && t != '6')) {
6202 stbi__rewind( s );
6203 return 0;
6204 }
6205 return 1;
6206 }
6207
stbi__pnm_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)6208 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6209 {
6210 stbi_uc *out;
6211 if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6212 return 0;
6213 *x = s->img_x;
6214 *y = s->img_y;
6215 *comp = s->img_n;
6216
6217 out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6218 if (!out) return stbi__errpuc("outofmem", "Out of memory");
6219 stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6220
6221 if (req_comp && req_comp != s->img_n) {
6222 out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6223 if (out == NULL) return out; // stbi__convert_format frees input on failure
6224 }
6225 return out;
6226 }
6227
stbi__pnm_isspace(char c)6228 static int stbi__pnm_isspace(char c)
6229 {
6230 return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6231 }
6232
stbi__pnm_skip_whitespace(stbi__context * s,char * c)6233 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6234 {
6235 while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6236 *c = (char) stbi__get8(s);
6237 }
6238
stbi__pnm_isdigit(char c)6239 static int stbi__pnm_isdigit(char c)
6240 {
6241 return c >= '0' && c <= '9';
6242 }
6243
stbi__pnm_getinteger(stbi__context * s,char * c)6244 static int stbi__pnm_getinteger(stbi__context *s, char *c)
6245 {
6246 int value = 0;
6247
6248 while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6249 value = value*10 + (*c - '0');
6250 *c = (char) stbi__get8(s);
6251 }
6252
6253 return value;
6254 }
6255
stbi__pnm_info(stbi__context * s,int * x,int * y,int * comp)6256 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6257 {
6258 int maxv;
6259 char c, p, t;
6260
6261 stbi__rewind( s );
6262
6263 // Get identifier
6264 p = (char) stbi__get8(s);
6265 t = (char) stbi__get8(s);
6266 if (p != 'P' || (t != '5' && t != '6')) {
6267 stbi__rewind( s );
6268 return 0;
6269 }
6270
6271 *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
6272
6273 c = (char) stbi__get8(s);
6274 stbi__pnm_skip_whitespace(s, &c);
6275
6276 *x = stbi__pnm_getinteger(s, &c); // read width
6277 stbi__pnm_skip_whitespace(s, &c);
6278
6279 *y = stbi__pnm_getinteger(s, &c); // read height
6280 stbi__pnm_skip_whitespace(s, &c);
6281
6282 maxv = stbi__pnm_getinteger(s, &c); // read max value
6283
6284 if (maxv > 255)
6285 return stbi__err("max value > 255", "PPM image not 8-bit");
6286 else
6287 return 1;
6288 }
6289 #endif
6290
stbi__info_main(stbi__context * s,int * x,int * y,int * comp)6291 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
6292 {
6293 #ifndef STBI_NO_JPEG
6294 if (stbi__jpeg_info(s, x, y, comp)) return 1;
6295 #endif
6296
6297 #ifndef STBI_NO_PNG
6298 if (stbi__png_info(s, x, y, comp)) return 1;
6299 #endif
6300
6301 #ifndef STBI_NO_GIF
6302 if (stbi__gif_info(s, x, y, comp)) return 1;
6303 #endif
6304
6305 #ifndef STBI_NO_BMP
6306 if (stbi__bmp_info(s, x, y, comp)) return 1;
6307 #endif
6308
6309 #ifndef STBI_NO_PSD
6310 if (stbi__psd_info(s, x, y, comp)) return 1;
6311 #endif
6312
6313 #ifndef STBI_NO_PIC
6314 if (stbi__pic_info(s, x, y, comp)) return 1;
6315 #endif
6316
6317 #ifndef STBI_NO_PNM
6318 if (stbi__pnm_info(s, x, y, comp)) return 1;
6319 #endif
6320
6321 #ifndef STBI_NO_HDR
6322 if (stbi__hdr_info(s, x, y, comp)) return 1;
6323 #endif
6324
6325 // test tga last because it's a crappy test!
6326 #ifndef STBI_NO_TGA
6327 if (stbi__tga_info(s, x, y, comp))
6328 return 1;
6329 #endif
6330 return stbi__err("unknown image type", "Image not of any known type, or corrupt");
6331 }
6332
6333 #ifndef STBI_NO_STDIO
stbi_info(char const * filename,int * x,int * y,int * comp)6334 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
6335 {
6336 FILE *f = stbi__fopen(filename, "rb");
6337 int result;
6338 if (!f) return stbi__err("can't fopen", "Unable to open file");
6339 result = stbi_info_from_file(f, x, y, comp);
6340 fclose(f);
6341 return result;
6342 }
6343
stbi_info_from_file(FILE * f,int * x,int * y,int * comp)6344 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
6345 {
6346 int r;
6347 stbi__context s;
6348 long pos = ftell(f);
6349 stbi__start_file(&s, f);
6350 r = stbi__info_main(&s,x,y,comp);
6351 fseek(f,pos,SEEK_SET);
6352 return r;
6353 }
6354 #endif // !STBI_NO_STDIO
6355
stbi_info_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp)6356 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
6357 {
6358 stbi__context s;
6359 stbi__start_mem(&s,buffer,len);
6360 return stbi__info_main(&s,x,y,comp);
6361 }
6362
stbi_info_from_callbacks(stbi_io_callbacks const * c,void * user,int * x,int * y,int * comp)6363 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
6364 {
6365 stbi__context s;
6366 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
6367 return stbi__info_main(&s,x,y,comp);
6368 }
6369
6370 // add in my DDS loading support
6371 #ifndef STBI_NO_DDS
6372 #include "stbi_DDS_c.h"
6373 #endif
6374
6375 // add in my pvr loading support
6376 #ifndef STBI_NO_PVR
6377 #include "stbi_pvr_c.h"
6378 #endif
6379
6380 // add in my pkm ( ETC1 ) loading support
6381 #ifndef STBI_NO_PKM
6382 #include "stbi_pkm_c.h"
6383 #endif
6384
6385 #ifndef STBI_NO_EXT
6386 #include "stbi_ext_c.h"
6387 #endif
6388
6389 #endif // STB_IMAGE_IMPLEMENTATION
6390
6391 /*
6392 revision history:
6393 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
6394 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
6395 2.03 (2015-04-12) extra corruption checking (mmozeiko)
6396 stbi_set_flip_vertically_on_load (nguillemot)
6397 fix NEON support; fix mingw support
6398 2.02 (2015-01-19) fix incorrect assert, fix warning
6399 2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
6400 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
6401 2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
6402 progressive JPEG (stb)
6403 PGM/PPM support (Ken Miller)
6404 STBI_MALLOC,STBI_REALLOC,STBI_FREE
6405 GIF bugfix -- seemingly never worked
6406 STBI_NO_*, STBI_ONLY_*
6407 1.48 (2014-12-14) fix incorrectly-named assert()
6408 1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
6409 optimize PNG (ryg)
6410 fix bug in interlaced PNG with user-specified channel count (stb)
6411 1.46 (2014-08-26)
6412 fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
6413 1.45 (2014-08-16)
6414 fix MSVC-ARM internal compiler error by wrapping malloc
6415 1.44 (2014-08-07)
6416 various warning fixes from Ronny Chevalier
6417 1.43 (2014-07-15)
6418 fix MSVC-only compiler problem in code changed in 1.42
6419 1.42 (2014-07-09)
6420 don't define _CRT_SECURE_NO_WARNINGS (affects user code)
6421 fixes to stbi__cleanup_jpeg path
6422 added STBI_ASSERT to avoid requiring assert.h
6423 1.41 (2014-06-25)
6424 fix search&replace from 1.36 that messed up comments/error messages
6425 1.40 (2014-06-22)
6426 fix gcc struct-initialization warning
6427 1.39 (2014-06-15)
6428 fix to TGA optimization when req_comp != number of components in TGA;
6429 fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
6430 add support for BMP version 5 (more ignored fields)
6431 1.38 (2014-06-06)
6432 suppress MSVC warnings on integer casts truncating values
6433 fix accidental rename of 'skip' field of I/O
6434 1.37 (2014-06-04)
6435 remove duplicate typedef
6436 1.36 (2014-06-03)
6437 convert to header file single-file library
6438 if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
6439 1.35 (2014-05-27)
6440 various warnings
6441 fix broken STBI_SIMD path
6442 fix bug where stbi_load_from_file no longer left file pointer in correct place
6443 fix broken non-easy path for 32-bit BMP (possibly never used)
6444 TGA optimization by Arseny Kapoulkine
6445 1.34 (unknown)
6446 use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
6447 1.33 (2011-07-14)
6448 make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
6449 1.32 (2011-07-13)
6450 support for "info" function for all supported filetypes (SpartanJ)
6451 1.31 (2011-06-20)
6452 a few more leak fixes, bug in PNG handling (SpartanJ)
6453 1.30 (2011-06-11)
6454 added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
6455 removed deprecated format-specific test/load functions
6456 removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
6457 error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
6458 fix inefficiency in decoding 32-bit BMP (David Woo)
6459 1.29 (2010-08-16)
6460 various warning fixes from Aurelien Pocheville
6461 1.28 (2010-08-01)
6462 fix bug in GIF palette transparency (SpartanJ)
6463 1.27 (2010-08-01)
6464 cast-to-stbi_uc to fix warnings
6465 1.26 (2010-07-24)
6466 fix bug in file buffering for PNG reported by SpartanJ
6467 1.25 (2010-07-17)
6468 refix trans_data warning (Won Chun)
6469 1.24 (2010-07-12)
6470 perf improvements reading from files on platforms with lock-heavy fgetc()
6471 minor perf improvements for jpeg
6472 deprecated type-specific functions so we'll get feedback if they're needed
6473 attempt to fix trans_data warning (Won Chun)
6474 1.23 fixed bug in iPhone support
6475 1.22 (2010-07-10)
6476 removed image *writing* support
6477 stbi_info support from Jetro Lauha
6478 GIF support from Jean-Marc Lienher
6479 iPhone PNG-extensions from James Brown
6480 warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
6481 1.21 fix use of 'stbi_uc' in header (reported by jon blow)
6482 1.20 added support for Softimage PIC, by Tom Seddon
6483 1.19 bug in interlaced PNG corruption check (found by ryg)
6484 1.18 (2008-08-02)
6485 fix a threading bug (local mutable static)
6486 1.17 support interlaced PNG
6487 1.16 major bugfix - stbi__convert_format converted one too many pixels
6488 1.15 initialize some fields for thread safety
6489 1.14 fix threadsafe conversion bug
6490 header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
6491 1.13 threadsafe
6492 1.12 const qualifiers in the API
6493 1.11 Support installable IDCT, colorspace conversion routines
6494 1.10 Fixes for 64-bit (don't use "unsigned long")
6495 optimized upsampling by Fabian "ryg" Giesen
6496 1.09 Fix format-conversion for PSD code (bad global variables!)
6497 1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
6498 1.07 attempt to fix C++ warning/errors again
6499 1.06 attempt to fix C++ warning/errors again
6500 1.05 fix TGA loading to return correct *comp and use good luminance calc
6501 1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
6502 1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
6503 1.02 support for (subset of) HDR files, float interface for preferred access to them
6504 1.01 fix bug: possible bug in handling right-side up bmps... not sure
6505 fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
6506 1.00 interface to zlib that skips zlib header
6507 0.99 correct handling of alpha in palette
6508 0.98 TGA loader by lonesock; dynamically add loaders (untested)
6509 0.97 jpeg errors on too large a file; also catch another malloc failure
6510 0.96 fix detection of invalid v value - particleman@mollyrocket forum
6511 0.95 during header scan, seek to markers in case of padding
6512 0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
6513 0.93 handle jpegtran output; verbose errors
6514 0.92 read 4,8,16,24,32-bit BMP files of several formats
6515 0.91 output 24-bit Windows 3.0 BMP files
6516 0.90 fix a few more warnings; bump version number to approach 1.0
6517 0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
6518 0.60 fix compiling as c++
6519 0.59 fix warnings: merge Dave Moore's -Wall fixes
6520 0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
6521 0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
6522 0.56 fix bug: zlib uncompressed mode len vs. nlen
6523 0.55 fix bug: restart_interval not initialized to 0
6524 0.54 allow NULL for 'int *comp'
6525 0.53 fix bug in png 3->4; speedup png decoding
6526 0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
6527 0.51 obey req_comp requests, 1-component jpegs return as 1-component,
6528 on 'test' only check type, not whether we support this variant
6529 0.50 (2006-11-19)
6530 first released version
6531 */
6532