1 /* stb_image - v2.12 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
3
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
7
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17
18
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
22
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
PNG 1/2/4/8-bit-per-channel (16 bpc: see the 2.11 entry in the revision history)
25
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
29
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
34
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
37
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
41
42 Full documentation under "DOCUMENTATION" below.
43
44
45 Revision 2.00 release notes:
46
47 - Progressive JPEG is now supported.
48
49 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
50
51 - x86 platforms now make use of SSE2 SIMD instructions for
52 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53 This work was done by Fabian "ryg" Giesen. SSE2 is used by
54 default, but NEON must be enabled explicitly; see docs.
55
56 With other JPEG optimizations included in this version, we see
57 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
58 on a JPEG on an ARM machine, relative to previous versions of this
59 library. The same results will not obtain for all JPGs and for all
60 x86/ARM machines. (Note that progressive JPEGs are significantly
61 slower to decode than regular JPEGs.) This doesn't mean that this
62 is the fastest JPEG decoder in the land; rather, it brings it
63 closer to parity with standard libraries. If you want the fastest
64 decode, look elsewhere. (See "Philosophy" section of docs below.)
65
66 See final bullet items below for more info on SIMD.
67
68 - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
69 the memory allocator. Unlike other STBI libraries, these macros don't
70 support a context parameter, so if you need to pass a context in to
71 the allocator, you'll have to store it in a global or a thread-local
72 variable.
73
74 - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
75 STBI_NO_LINEAR.
76 STBI_NO_HDR: suppress implementation of .hdr reader format
77 STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
78
79 - You can suppress implementation of any of the decoders to reduce
80 your code footprint by #defining one or more of the following
81 symbols before creating the implementation.
82
83 STBI_NO_JPEG
84 STBI_NO_PNG
85 STBI_NO_BMP
86 STBI_NO_PSD
87 STBI_NO_TGA
88 STBI_NO_GIF
89 STBI_NO_HDR
90 STBI_NO_PIC
91 STBI_NO_PNM (.ppm and .pgm)
92
93 - You can request *only* certain decoders and suppress all other ones
94 (this will be more forward-compatible, as addition of new decoders
95 doesn't require you to disable them explicitly):
96
97 STBI_ONLY_JPEG
98 STBI_ONLY_PNG
99 STBI_ONLY_BMP
100 STBI_ONLY_PSD
101 STBI_ONLY_TGA
102 STBI_ONLY_GIF
103 STBI_ONLY_HDR
104 STBI_ONLY_PIC
105 STBI_ONLY_PNM (.ppm and .pgm)
106
107 Note that you can define multiples of these, and you will get all
108 of them ("only x" and "only y" is interpreted to mean "only x&y").
109
110 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
112
113 - Compilation of all SIMD code can be suppressed with
114 #define STBI_NO_SIMD
115 It should not be necessary to disable SIMD unless you have issues
116 compiling (e.g. using an x86 compiler which doesn't support SSE
117 intrinsics or that doesn't support the method used to detect
118 SSE2 support at run-time), and even those can be reported as
119 bugs so I can refine the built-in compile-time checking to be
120 smarter.
121
122 - The old STBI_SIMD system which allowed installing a user-defined
123 IDCT etc. has been removed. If you need this, don't upgrade. My
124 assumption is that almost nobody was doing this, and those who
125 were will find the built-in SIMD more satisfactory anyway.
126
127 - RGB values computed for JPEG images are slightly different from
128 previous versions of stb_image. (This is due to using less
129 integer precision in SIMD.) The C code has been adjusted so
130 that the same RGB values will be computed regardless of whether
131 SIMD support is available, so your app should always produce
132 consistent results. But these results are slightly different from
133 previous versions. (Specifically, about 3% of available YCbCr values
134 will compute different RGB results from pre-1.49 versions by +-1;
135 most of the deviating values are one smaller in the G channel.)
136
137 - If you must produce consistent results with previous versions of
138 stb_image, #define STBI_JPEG_OLD and you will get the same results
139 you used to; however, you will not get the SIMD speedups for
140 the YCbCr-to-RGB conversion step (although you should still see
141 significant JPEG speedup from the other changes).
142
143 Please note that STBI_JPEG_OLD is a temporary feature; it will be
144 removed in future versions of the library. It is only intended for
145 near-term back-compatibility use.
146
147
148 Latest revision history:
149 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
150 2.11 (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
151 RGB-format JPEG; remove white matting in PSD;
152 allocate large structures on the stack;
153 correct channel count for PNG & BMP
154 2.10 (2016-01-22) avoid warning introduced in 2.09
155 2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
156 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
157 2.07 (2015-09-13) partial animated GIF support
158 limited 16-bit PSD support
159 minor bugs, code cleanup, and compiler warnings
160 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
161 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
162 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
163 2.03 (2015-04-12) additional corruption checking
164 stbi_set_flip_vertically_on_load
165 fix NEON support; fix mingw support
166 2.02 (2015-01-19) fix incorrect assert, fix warning
167 2.01 (2015-01-17) fix various warnings
168 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
169 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
170 progressive JPEG
171 PGM/PPM support
172 STBI_MALLOC,STBI_REALLOC,STBI_FREE
173 STBI_NO_*, STBI_ONLY_*
174 GIF bugfix
175
176 See end of file for full revision history.
177
178
179 ============================ Contributors =========================
180
 Image formats                          Extensions, features
    Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
    Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
    Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
    Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
    Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
    Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
    Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
    urraka@github (animated gif)           Junggon Kim (PNM comments)
                                           Daniel Gibson (16-bit TGA)
191
192 Optimizations & bugfixes
193 Fabian "ryg" Giesen
194 Arseny Kapoulkine
195
 Bug & warning fixes
    Marc LeBlanc            David Woo          Guillaume George   Martins Mozeiko
    Christpher Lloyd        Martin Golini      Jerry Jansson      Joseph Thomson
    Dave Moore              Roy Eltham         Hayaki Saito       Phil Jordan
    Won Chun                Luke Graham        Johan Duparc       Nathan Reed
    the Horde3D community   Thomas Ruf         Ronny Chevalier    Nick Verigakis
    Janez Zemva             John Bartholomew   Michal Cichon      svdijk@github
    Jonathan Blow           Ken Hamada         Tero Hanninen      Baldur Karlsson
    Laurent Gomila          Cort Stratton      Sergio Gonzalez    romigrou@github
    Aruelien Pocheville     Thibault Reuille   Cass Everitt       Matthew Gregan
    Ryamond Barbiero        Paul Du Bois       Engin Manap        snagar@github
    Michaelangel007@github  Oriol Ferrer Mesia socks-the-fox
    Blazej Dariusz Roszkowski
209
210
211 LICENSE
212
213 This software is dual-licensed to the public domain and under the following
214 license: you are granted a perpetual, irrevocable license to copy, modify,
215 publish, and distribute this file as you see fit.
216
217 */
218
219 #ifndef STBI_INCLUDE_STB_IMAGE_H
220 #define STBI_INCLUDE_STB_IMAGE_H
221
222 // DOCUMENTATION
223 //
224 // Limitations:
//    - 16-bit-per-channel PNG only as of 2.11 (see revision history)
226 // - no 12-bit-per-channel JPEG
227 // - no JPEGs with arithmetic coding
228 // - no 1-bit BMP
229 // - GIF always returns *comp=4
230 //
231 // Basic usage (see HDR discussion below for HDR usage):
232 // int x,y,n;
233 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
234 // // ... process data if not NULL ...
235 // // ... x = width, y = height, n = # 8-bit components per pixel ...
236 // // ... replace '0' with '1'..'4' to force that many components per pixel
237 // // ... but 'n' will always be the number that it would have been if you said 0
//    stbi_image_free(data);
239 //
240 // Standard parameters:
241 // int *x -- outputs image width in pixels
242 // int *y -- outputs image height in pixels
243 // int *comp -- outputs # of image components in image file
244 // int req_comp -- if non-zero, # of image components requested in result
245 //
246 // The return value from an image loader is an 'unsigned char *' which points
247 // to the pixel data, or NULL on an allocation failure or if the image is
248 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
249 // with each pixel consisting of N interleaved 8-bit components; the first
250 // pixel pointed to is top-left-most in the image. There is no padding between
251 // image scanlines or between pixels, regardless of format. The number of
252 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
253 // If req_comp is non-zero, *comp has the number of components that _would_
254 // have been output otherwise. E.g. if you set req_comp to 4, you will always
255 // get RGBA output, but you can check *comp to see if it's trivially opaque
256 // because e.g. there were only 3 channels in the source image.
257 //
258 // An output image with N components has the following components interleaved
259 // in this order in each pixel:
260 //
261 // N=#comp components
262 // 1 grey
263 // 2 grey, alpha
264 // 3 red, green, blue
265 // 4 red, green, blue, alpha
266 //
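/*
The layout described above (no padding, N interleaved 8-bit components,
top-left pixel first) can be sketched as a pair of indexing helpers. The
names demo_pixel_offset and demo_get_component are illustrative only and
are not part of stb_image.

```c
#include <assert.h>
#include <stddef.h>

// Byte offset of component c of pixel (x, y) in a tightly packed image that
// is `width` pixels wide with `n` interleaved 8-bit components per pixel.
static size_t demo_pixel_offset(int width, int n, int x, int y, int c)
{
    assert(width > 0 && n >= 1 && n <= 4);
    assert(x >= 0 && x < width && y >= 0 && c >= 0 && c < n);
    return ((size_t) y * (size_t) width + (size_t) x) * (size_t) n + (size_t) c;
}

// Read one component; for n == 3 or 4, c = 0, 1, 2 selects red, green, blue.
static unsigned char demo_get_component(const unsigned char *data, int width,
                                        int n, int x, int y, int c)
{
    return data[demo_pixel_offset(width, n, x, y, c)];
}
```

With a buffer returned by a loader, `width` would be *x and `n` would be
req_comp if that was non-zero, or *comp otherwise.
*/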
267 // If image loading fails for any reason, the return value will be NULL,
268 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
269 // can be queried for an extremely brief, end-user unfriendly explanation
270 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
271 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
272 // more user-friendly ones.
273 //
274 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
275 //
276 // ===========================================================================
277 //
278 // Philosophy
279 //
280 // stb libraries are designed with the following priorities:
281 //
282 // 1. easy to use
283 // 2. easy to maintain
284 // 3. good performance
285 //
286 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
287 // and for best performance I may provide less-easy-to-use APIs that give higher
288 // performance, in addition to the easy to use ones. Nevertheless, it's important
289 // to keep in mind that from the standpoint of you, a client of this library,
290 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
291 //
// Some secondary priorities arise directly from the first two, and some of
// them make explicit why performance can't be emphasized.
294 //
295 // - Portable ("ease of use")
296 // - Small footprint ("easy to maintain")
297 // - No dependencies ("ease of use")
298 //
299 // ===========================================================================
300 //
301 // I/O callbacks
302 //
303 // I/O callbacks allow you to read from arbitrary sources, like packaged
304 // files or some other source. Data read from callbacks are processed
305 // through a small internal buffer (currently 128 bytes) to try to reduce
306 // overhead.
307 //
// The three functions you must define are "read" (reads some bytes of data),
// "skip" (skips some bytes of data), and "eof" (reports if the stream is at the end).
310 //
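/*
As a rough sketch of what the three callbacks might look like, here is a
reader over an in-memory blob. The demo_mem_stream type and the demo_*
function names are made up for this example; only the three signatures
follow the read/skip/eof members of stbi_io_callbacks.

```c
#include <assert.h>
#include <string.h>

// Hypothetical cursor over an in-memory blob; passed as the `user` pointer.
typedef struct
{
    const unsigned char *data;
    int size;
    int pos;
} demo_mem_stream;

// "read": copy up to `size` bytes into `data`, return how many were copied.
static int demo_read(void *user, char *data, int size)
{
    demo_mem_stream *s = (demo_mem_stream *) user;
    int remaining = s->size - s->pos;
    int n = size < remaining ? size : remaining;
    if (n < 0) n = 0;
    memcpy(data, s->data + s->pos, (size_t) n);
    s->pos += n;
    return n;
}

// "skip": move forward n bytes, or backward if n is negative; clamp to the blob.
static void demo_skip(void *user, int n)
{
    demo_mem_stream *s = (demo_mem_stream *) user;
    s->pos += n;
    if (s->pos < 0) s->pos = 0;
    if (s->pos > s->size) s->pos = s->size;
}

// "eof": nonzero once every byte has been consumed.
static int demo_eof(void *user)
{
    demo_mem_stream *s = (demo_mem_stream *) user;
    return s->pos >= s->size;
}
```

In real use the three functions would be stored in a stbi_io_callbacks
struct and handed, together with a pointer to the stream, to
stbi_load_from_callbacks.
*/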
311 // ===========================================================================
312 //
313 // SIMD support
314 //
315 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
316 // supported by the compiler. For ARM Neon support, you must explicitly
317 // request it.
318 //
319 // (The old do-it-yourself SIMD API is no longer supported in the current
320 // code.)
321 //
322 // On x86, SSE2 will automatically be used when available based on a run-time
323 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
324 // the typical path is to have separate builds for NEON and non-NEON devices
325 // (at least this is true for iOS and Android). Therefore, the NEON support is
326 // toggled by a build flag: define STBI_NEON to get NEON loops.
327 //
// The output of the JPEG decoder is slightly different from versions before
// SIMD support was introduced (that is, versions before 1.49). The
// difference is only +-1 in the 8-bit RGB channels, and only on a small
// fraction of pixels. You can force the pre-1.49 behavior by defining
// STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
// and hence cost some performance.
334 //
// If for some reason you do not want to use any of the SIMD code, or if
// you have issues compiling it, you can disable it entirely by
// defining STBI_NO_SIMD.
338 //
339 // ===========================================================================
340 //
341 // HDR image support (disable by defining STBI_NO_HDR)
342 //
// stb_image supports loading HDR images. Currently only the Radiance .HDR
// file format is read, but the interface itself is format-agnostic.
// You can still load any file through the existing interface;
// if you attempt to load an HDR file, it will be automatically remapped to
// LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
// both of these constants can be reconfigured through this interface:
349 //
350 // stbi_hdr_to_ldr_gamma(2.2f);
351 // stbi_hdr_to_ldr_scale(1.0f);
352 //
// (note, do not use _inverse_ constants; stb_image will invert them
// appropriately).
355 //
356 // Additionally, there is a new, parallel interface for loading files as
357 // (linear) floats to preserve the full dynamic range:
358 //
359 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
360 //
// If you load LDR images through this interface, those images will
// be promoted to floating point values and run through the inverse of
// the mapping above, controlled by these constants:
364 //
365 // stbi_ldr_to_hdr_scale(1.0f);
366 // stbi_ldr_to_hdr_gamma(2.2f);
367 //
368 // Finally, given a filename (or an open file or memory block--see header
369 // file for details) containing image data, you can query for the "most
370 // appropriate" interface to use (that is, whether the image is HDR or
371 // not), using:
372 //
373 // stbi_is_hdr(char *filename);
374 //
375 // ===========================================================================
376 //
377 // iPhone PNG support:
378 //
// By default we convert iPhone-formatted PNGs back to RGB, even though
// they are internally encoded differently. You can disable this conversion
// by calling stbi_convert_iphone_png_to_rgb(0), in which case
// you will always get the native iPhone "format" passed through (which
// is BGR stored in RGB).
384 //
385 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
386 // pixel to remove any premultiplied alpha *only* if the image file explicitly
387 // says there's premultiplied data (currently only happens in iPhone images,
388 // and only if iPhone convert-to-rgb processing is on).
389 //
390
391
392 #ifndef STBI_NO_STDIO
393 #include <stdio.h>
394 #endif // STBI_NO_STDIO
395
396 #define STBI_VERSION 1
397
398 enum
399 {
400 STBI_default = 0, // only used for req_comp
401
402 STBI_grey = 1,
403 STBI_grey_alpha = 2,
404 STBI_rgb = 3,
405 STBI_rgb_alpha = 4
406 };
407
408 typedef unsigned char stbi_uc;
409
410 #ifdef __cplusplus
411 extern "C" {
412 #endif
413
414 #ifdef STB_IMAGE_STATIC
415 #define STBIDEF static
416 #else
417 #define STBIDEF extern
418 #endif
419
420 //////////////////////////////////////////////////////////////////////////////
421 //
422 // PRIMARY API - works on images of any type
423 //
424
425 //
426 // load image by filename, open file, or memory buffer
427 //
428
429 typedef struct
430 {
431 int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
432 void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
433 int (*eof) (void *user); // returns nonzero if we are at end of file/data
434 } stbi_io_callbacks;
435
436 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
437 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
438 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
439
440 #ifndef STBI_NO_STDIO
441 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
442 // for stbi_load_from_file, file pointer is left pointing immediately after image
443 #endif
444
445 #ifndef STBI_NO_LINEAR
446 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
447 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
448 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
449
450 #ifndef STBI_NO_STDIO
451 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
452 #endif
453 #endif
454
455 #ifndef STBI_NO_HDR
456 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
457 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
458 #endif // STBI_NO_HDR
459
460 #ifndef STBI_NO_LINEAR
461 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
462 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
463 #endif // STBI_NO_LINEAR
464
465 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
466 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
467 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
468 #ifndef STBI_NO_STDIO
469 STBIDEF int stbi_is_hdr (char const *filename);
470 STBIDEF int stbi_is_hdr_from_file(FILE *f);
471 #endif // STBI_NO_STDIO
472
473
474 // get a VERY brief reason for failure
475 // NOT THREADSAFE
476 STBIDEF const char *stbi_failure_reason (void);
477
478 // free the loaded image -- this is just free()
479 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
480
481 // get image dimensions & components without fully decoding
482 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
483 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
484
485 #ifndef STBI_NO_STDIO
486 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
487 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
488
489 #endif
490
491
492
// for image formats that explicitly notate that they have premultiplied alpha,
// we just return the colors as stored in the file. set this flag to force
// unpremultiplication. results are undefined if the unpremultiply overflows.
496 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
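/*
To make "a divide per pixel" concrete, here is a sketch of unpremultiplying
a single 8-bit colour component. This is an illustration, not the code the
library runs: the helper name is invented, and the clamp and the a == 0
handling are choices made for this example (the library itself leaves the
overflow case undefined).

```c
#include <assert.h>

// Undo premultiplied alpha for one 8-bit colour component: c' = c * 255 / a.
// A fully transparent pixel (a == 0) carries no colour, so it is left as stored.
static unsigned char demo_unpremultiply(unsigned char c, unsigned char a)
{
    unsigned int v;
    if (a == 0) return c;
    v = (unsigned int) c * 255u / a;
    return (unsigned char) (v > 255u ? 255u : v);   // clamp instead of overflowing
}
```

A colour value larger than its alpha is not valid premultiplied data; that
is the case the clamp catches.
*/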
497
498 // indicate whether we should process iphone images back to canonical format,
499 // or just pass them through "as-is"
500 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
501
502 // flip the image vertically, so the first pixel in the output array is the bottom left
503 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
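/*
What the flip does to the output array can be sketched as an in-place swap
of scanlines. This is a stand-alone illustration with a made-up name, under
the assumption of tightly packed rows; it is not claimed to be how the
library implements the flag.

```c
#include <assert.h>
#include <string.h>

// Swap scanlines top-to-bottom in place. `w` and `h` are in pixels and `n`
// is the number of 8-bit components per pixel; rows are tightly packed.
static void demo_flip_vertically(unsigned char *data, int w, int h, int n)
{
    size_t stride = (size_t) w * (size_t) n;
    unsigned char tmp[64];               // small chunk buffer, no heap allocation
    int row;
    assert(w > 0 && h > 0 && n > 0);
    for (row = 0; row < h / 2; ++row) {
        unsigned char *top = data + (size_t) row * stride;
        unsigned char *bot = data + (size_t) (h - 1 - row) * stride;
        size_t left = stride;
        while (left > 0) {
            size_t chunk = left < sizeof(tmp) ? left : sizeof(tmp);
            memcpy(tmp, top, chunk);
            memcpy(top, bot, chunk);
            memcpy(bot, tmp, chunk);
            top += chunk;
            bot += chunk;
            left -= chunk;
        }
    }
}
```

After the call, the first pixel in the array is the one that used to be at
the bottom left. For an odd height the middle row stays where it is.
*/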
504
505 // ZLIB client - used by PNG, available for other purposes
506
507 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
508 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
509 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
510 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
511
512 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
513 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
514
515
516 #ifdef __cplusplus
517 }
518 #endif
519
520 //
521 //
522 //// end header file /////////////////////////////////////////////////////
523 #endif // STBI_INCLUDE_STB_IMAGE_H
524
525 #ifdef STB_IMAGE_IMPLEMENTATION
526
527 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
528 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
529 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
530 || defined(STBI_ONLY_ZLIB)
531 #ifndef STBI_ONLY_JPEG
532 #define STBI_NO_JPEG
533 #endif
534 #ifndef STBI_ONLY_PNG
535 #define STBI_NO_PNG
536 #endif
537 #ifndef STBI_ONLY_BMP
538 #define STBI_NO_BMP
539 #endif
540 #ifndef STBI_ONLY_PSD
541 #define STBI_NO_PSD
542 #endif
543 #ifndef STBI_ONLY_TGA
544 #define STBI_NO_TGA
545 #endif
546 #ifndef STBI_ONLY_GIF
547 #define STBI_NO_GIF
548 #endif
549 #ifndef STBI_ONLY_HDR
550 #define STBI_NO_HDR
551 #endif
552 #ifndef STBI_ONLY_PIC
553 #define STBI_NO_PIC
554 #endif
555 #ifndef STBI_ONLY_PNM
556 #define STBI_NO_PNM
557 #endif
558 #endif
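/*
The "only x" and "only y" means "only x&y" rule falls out of the block above:
any ONLY flag switches the mechanism on, and every format without its own
ONLY flag gets a NO flag. A cut-down sketch with three formats, using DEMO_
stand-in names so it compiles without this header:

```c
#include <assert.h>

// Stand-in names mirror the ONLY and NO flag pattern for three formats.
#define DEMO_ONLY_JPEG
#define DEMO_ONLY_PNG

#if defined(DEMO_ONLY_JPEG) || defined(DEMO_ONLY_PNG) || defined(DEMO_ONLY_BMP)
   #ifndef DEMO_ONLY_JPEG
   #define DEMO_NO_JPEG
   #endif
   #ifndef DEMO_ONLY_PNG
   #define DEMO_NO_PNG
   #endif
   #ifndef DEMO_ONLY_BMP
   #define DEMO_NO_BMP
   #endif
#endif

// Bit mask of the decoders that survived: 1 = JPEG, 2 = PNG, 4 = BMP.
static int demo_enabled_decoders(void)
{
    int mask = 0;
#ifndef DEMO_NO_JPEG
    mask |= 1;
#endif
#ifndef DEMO_NO_PNG
    mask |= 2;
#endif
#ifndef DEMO_NO_BMP
    mask |= 4;
#endif
    assert(mask != 0);   // at least one decoder should remain
    return mask;
}
```

With the two ONLY flags defined as shown, JPEG and PNG stay enabled and BMP
is suppressed.
*/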
559
560 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
561 #define STBI_NO_ZLIB
562 #endif
563
564
565 #include <stdarg.h>
566 #include <stddef.h> // ptrdiff_t on osx
567 #include <stdlib.h>
568 #include <string.h>
569
570 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
571 #include <math.h> // ldexp
572 #endif
573
574 #ifndef STBI_NO_STDIO
575 #include <stdio.h>
576 #endif
577
578 #ifndef STBI_ASSERT
579 #include <assert.h>
580 #define STBI_ASSERT(x) assert(x)
581 #endif
582
583
584 #ifndef _MSC_VER
585 #ifdef __cplusplus
586 #define stbi_inline inline
587 #else
588 #define stbi_inline
589 #endif
590 #else
591 #define stbi_inline __forceinline
592 #endif
593
594
595 #ifdef _MSC_VER
596 typedef unsigned short stbi__uint16;
597 typedef signed short stbi__int16;
598 typedef unsigned int stbi__uint32;
599 typedef signed int stbi__int32;
600 #else
601 #include <stdint.h>
602 typedef uint16_t stbi__uint16;
603 typedef int16_t stbi__int16;
604 typedef uint32_t stbi__uint32;
605 typedef int32_t stbi__int32;
606 #endif
607
608 // should produce compiler error if size is wrong
609 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
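/*
The typedef above is a compile-time assertion: the array size is 1 when the
condition holds and -1, which no compiler accepts, when it does not. The
same trick wrapped in a macro might look like this (DEMO_STATIC_CHECK and the
typedef names are invented for the example):

```c
#include <assert.h>

// Array size is 1 when `cond` is true and -1 (a compile error) when false.
#define DEMO_STATIC_CHECK(name, cond) typedef unsigned char name[(cond) ? 1 : -1]

DEMO_STATIC_CHECK(demo_uchar_is_one_byte, sizeof(unsigned char) == 1);
DEMO_STATIC_CHECK(demo_int_not_smaller_than_short, sizeof(int) >= sizeof(short));

static int demo_checks_passed(void)
{
    // Reaching run time at all means every check above held.
    assert(sizeof(demo_uchar_is_one_byte) == 1);
    return 1;
}
```

Compilers that support C11 can use _Static_assert for the same purpose; the
typedef form also works on older compilers.
*/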
610
611 #ifdef _MSC_VER
612 #define STBI_NOTUSED(v) (void)(v)
613 #else
614 #define STBI_NOTUSED(v) (void)sizeof(v)
615 #endif
616
617 #ifdef _MSC_VER
618 #define STBI_HAS_LROTL
619 #endif
620
621 #ifdef STBI_HAS_LROTL
622 #define stbi_lrot(x,y) _lrotl(x,y)
623 #else
624 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
625 #endif
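/*
The fallback macro is the usual shift-and-or rotate. One caveat worth
showing: for y == 0 it shifts a 32-bit value right by 32, which C leaves
undefined. A function form that masks the counts, as a sketch (demo_rotl32
is a made-up name):

```c
#include <assert.h>
#include <stdint.h>

// Rotate a 32-bit value left by y bits. Masking keeps both shift counts in
// 0..31, so y == 0 (and y >= 32) never produce an out-of-range shift.
static uint32_t demo_rotl32(uint32_t x, unsigned y)
{
    y &= 31u;
    return (x << y) | (x >> ((32u - y) & 31u));
}
```

The macro is fine as long as callers only pass counts from 1 to 31.
*/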
626
627 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
628 // ok
629 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
630 // ok
631 #else
632 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
633 #endif
634
635 #ifndef STBI_MALLOC
636 #define STBI_MALLOC(sz) malloc(sz)
637 #define STBI_REALLOC(p,newsz) realloc(p,newsz)
638 #define STBI_FREE(p) free(p)
639 #endif
640
641 #ifndef STBI_REALLOC_SIZED
642 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
643 #endif
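/*
STBI_REALLOC_SIZED exists for allocators that need the old size to resize a
block. A sketch of such an allocator, assuming one that has no realloc of
its own; the demo_* functions are hypothetical and simply forward to
malloc/free so the example stays runnable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical allocator without realloc; counts live blocks for the demo.
static int demo_live_blocks = 0;

static void *demo_malloc(size_t sz)
{
    void *p = malloc(sz);
    if (p) ++demo_live_blocks;
    return p;
}

static void demo_free(void *p)
{
    if (p) --demo_live_blocks;
    free(p);
}

// Resize by allocate-copy-free; knowing the old size is what makes this
// possible without a native realloc.
static void *demo_realloc_sized(void *p, size_t oldsz, size_t newsz)
{
    void *q = demo_malloc(newsz);
    if (q == NULL) return NULL;          // like realloc, leave p valid on failure
    if (p != NULL) {
        memcpy(q, p, oldsz < newsz ? oldsz : newsz);
        demo_free(p);
    }
    return q;
}

// These go before the implementation include (all of them, or none).
#define STBI_MALLOC(sz)                    demo_malloc(sz)
#define STBI_REALLOC_SIZED(p,oldsz,newsz)  demo_realloc_sized(p,oldsz,newsz)
#define STBI_FREE(p)                       demo_free(p)
```

Defining STBI_MALLOC and STBI_FREE together with either STBI_REALLOC or
STBI_REALLOC_SIZED satisfies the all-or-none check above.
*/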
644
645 // x86/x64 detection
646 #if defined(__x86_64__) || defined(_M_X64)
647 #define STBI__X64_TARGET
648 #elif defined(__i386) || defined(_M_IX86)
649 #define STBI__X86_TARGET
650 #endif
651
652 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
// NOTE: it is not clear whether we actually need this for the 64-bit path.
// gcc doesn't support sse2 intrinsics unless you compile with -msse2
// (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
// this is just broken and gcc are jerks for not fixing it properly;
// see http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
658 #define STBI_NO_SIMD
659 #endif
660
661 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
662 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
663 //
664 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
665 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
666 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
667 // simultaneously enabling "-mstackrealign".
668 //
669 // See https://github.com/nothings/stb/issues/81 for more information.
670 //
671 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
672 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
673 #define STBI_NO_SIMD
674 #endif
675
676 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
677 #define STBI_SSE2
678 #include <emmintrin.h>
679
680 #ifdef _MSC_VER
681
682 #if _MSC_VER >= 1400 // not VC6
683 #include <intrin.h> // __cpuid
static int stbi__cpuid3(void)
{
   int info[4];
   __cpuid(info,1);
   return info[3];
}
690 #else
static int stbi__cpuid3(void)
{
   int res;
   __asm {
      mov  eax,1
      cpuid
      mov  res,edx
   }
   return res;
}
701 #endif
702
703 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
704
static int stbi__sse2_available()
{
   int info3 = stbi__cpuid3();
   return ((info3 >> 26) & 1) != 0;
}
710 #else // assume GCC-style if not VC++
711 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
712
static int stbi__sse2_available()
{
#if defined(STBI__X64_TARGET)
   // on x64, SSE2 can be assumed to be available.
   return 1;
#elif defined(LOVE_STBI_SSE2_OVERRIDE)
   return 1;
#else
# warning "stb_image compiled without SSE2 support, define LOVE_STBI_SSE2_OVERRIDE to force SSE2 support"
   // __builtin_cpu_supports is buggy on GCC 5 and above, causing problems if
   // referenced in a shared object, giving missing __cpu_model hidden symbol errors.
   // To get around that, just assume that SSE2 is not available on x86.
   //
   // See https://github.com/nothings/stb/issues/280 for more information.
   return 0;
#endif
}
730 #endif
731 #endif
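/*
The run-time test in the MSVC path reads the EDX word returned by CPUID
leaf 1 and checks bit 26, which is the SSE2 feature flag (bit 25 is plain
SSE). Isolated from the CPUID call itself, the check is just a bit test; the
names below are illustrative.

```c
#include <assert.h>

// Bit 26 of the EDX value returned by CPUID leaf 1 is the SSE2 feature flag.
#define DEMO_CPUID1_EDX_SSE2_BIT 26

static int demo_edx_has_sse2(unsigned int edx)
{
    return ((edx >> DEMO_CPUID1_EDX_SSE2_BIT) & 1u) != 0;
}
```

Taking the EDX word as a parameter keeps the sketch portable; obtaining it
still needs a compiler-specific CPUID intrinsic or inline assembly.
*/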
732
733 // ARM NEON
734 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
735 #undef STBI_NEON
736 #endif
737
738 #ifdef STBI_NEON
739 #include <arm_neon.h>
740 // assume GCC or Clang on ARM targets
741 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
742 #endif
743
744 #ifndef STBI_SIMD_ALIGN
745 #define STBI_SIMD_ALIGN(type, name) type name
746 #endif
747
748 ///////////////////////////////////////////////
749 //
750 // stbi__context struct and start_xxx functions
751
752 // stbi__context structure is our basic context used by all images, so it
753 // contains all the IO context, plus some basic image information
754 typedef struct
755 {
756 stbi__uint32 img_x, img_y;
757 int img_n, img_out_n;
758
759 stbi_io_callbacks io;
760 void *io_user_data;
761
762 int read_from_callbacks;
763 int buflen;
764 stbi_uc buffer_start[128];
765
766 stbi_uc *img_buffer, *img_buffer_end;
767 stbi_uc *img_buffer_original, *img_buffer_original_end;
768 } stbi__context;
769
770
771 static void stbi__refill_buffer(stbi__context *s);
772
773 // initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
{
   s->io.read = NULL;
   s->read_from_callbacks = 0;
   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
}
781
782 // initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
{
   s->io = *c;
   s->io_user_data = user;
   s->buflen = sizeof(s->buffer_start);
   s->read_from_callbacks = 1;
   s->img_buffer_original = s->buffer_start;
   stbi__refill_buffer(s);
   s->img_buffer_original_end = s->img_buffer_end;
}
793
794 #ifndef STBI_NO_STDIO
795
static int stbi__stdio_read(void *user, char *data, int size)
{
   return (int) fread(data,1,size,(FILE*) user);
}
800
static void stbi__stdio_skip(void *user, int n)
{
   fseek((FILE*) user, n, SEEK_CUR);
}
805
static int stbi__stdio_eof(void *user)
{
   return feof((FILE*) user);
}
810
811 static stbi_io_callbacks stbi__stdio_callbacks =
812 {
813 stbi__stdio_read,
814 stbi__stdio_skip,
815 stbi__stdio_eof,
816 };
817
static void stbi__start_file(stbi__context *s, FILE *f)
{
   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
}
822
823 //static void stop_file(stbi__context *s) { }
824
825 #endif // !STBI_NO_STDIO
826
static void stbi__rewind(stbi__context *s)
{
   // conceptually rewind SHOULD rewind to the beginning of the stream,
   // but we just rewind to the beginning of the initial buffer, because
   // we only use it after doing 'test', which only ever looks at at most 92 bytes
   s->img_buffer = s->img_buffer_original;
   s->img_buffer_end = s->img_buffer_original_end;
}
835
#ifndef STBI_NO_JPEG
static int      stbi__jpeg_test(stbi__context *s);
static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNG
static int      stbi__png_test(stbi__context *s);
static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_BMP
static int      stbi__bmp_test(stbi__context *s);
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_TGA
static int      stbi__tga_test(stbi__context *s);
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PSD
static int      stbi__psd_test(stbi__context *s);
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_HDR
static int      stbi__hdr_test(stbi__context *s);
static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PIC
static int      stbi__pic_test(stbi__context *s);
static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_GIF
static int      stbi__gif_test(stbi__context *s);
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
#endif

#ifndef STBI_NO_PNM
static int      stbi__pnm_test(stbi__context *s);
static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
#endif

// this is not threadsafe
static const char *stbi__g_failure_reason;

STBIDEF const char *stbi_failure_reason(void)
{
   return stbi__g_failure_reason;
}

static int stbi__err(const char *str)
{
   stbi__g_failure_reason = str;
   return 0;
}

static void *stbi__malloc(size_t size)
{
    return STBI_MALLOC(size);
}

// stbi__err - error
// stbi__errpf - error returning pointer to float
// stbi__errpuc - error returning pointer to unsigned char

#ifdef STBI_NO_FAILURE_STRINGS
   #define stbi__err(x,y)  0
#elif defined(STBI_FAILURE_USERMSG)
   #define stbi__err(x,y)  stbi__err(y)
#else
   #define stbi__err(x,y)  stbi__err(x)
#endif

#define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
#define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))

STBIDEF void stbi_image_free(void *retval_from_stbi_load)
{
   STBI_FREE(retval_from_stbi_load);
}

#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
#endif

#ifndef STBI_NO_HDR
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
#endif

static int stbi__vertically_flip_on_load = 0;

STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
{
    stbi__vertically_flip_on_load = flag_true_if_should_flip;
}

static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNG
   if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_BMP
   if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_GIF
   if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PSD
   if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PIC
   if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);
   #endif
   #ifndef STBI_NO_PNM
   if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp);
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
      // the converter below reads and frees its input, so never hand it NULL
      if (hdr == NULL) return NULL;
      return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
   }
   #endif

   #ifndef STBI_NO_TGA
   // test tga last because it's a crappy test!
   if (stbi__tga_test(s))
      return stbi__tga_load(s,x,y,comp,req_comp);
   #endif

   return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
}

static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);

   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      stbi_uc temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }

   return result;
}
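
// Illustrative sketch only (not part of stb_image): one possible way to act on
// the @OPTIMIZE note in the flip loop is to swap whole rows with memcpy through
// a row-sized temporary instead of swapping byte by byte. The function name
// flip_rows_example and its caller-supplied scratch buffer are assumptions made
// for this example; the block is compiled out with #if 0.
#if 0
```c
#include <string.h>

// Flip an image of h rows in place; each row is row_bytes bytes long.
// scratch must point to at least row_bytes bytes of writable memory.
static void flip_rows_example(unsigned char *pixels, int h, size_t row_bytes,
                              unsigned char *scratch)
{
   int row;
   for (row = 0; row < (h >> 1); row++) {
      unsigned char *top    = pixels + (size_t) row * row_bytes;
      unsigned char *bottom = pixels + (size_t) (h - row - 1) * row_bytes;
      memcpy(scratch, top, row_bytes);     // save the upper row
      memcpy(top, bottom, row_bytes);      // move the lower row up
      memcpy(bottom, scratch, row_bytes);  // put the saved row below
   }
}
```
#endif
// For an 8-bit image, row_bytes would be w * depth; for float data it would
// additionally be multiplied by sizeof(float).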

#ifndef STBI_NO_HDR
static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
{
   if (stbi__vertically_flip_on_load && result != NULL) {
      int w = *x, h = *y;
      int depth = req_comp ? req_comp : *comp;
      int row,col,z;
      float temp;

      // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
      for (row = 0; row < (h>>1); row++) {
         for (col = 0; col < w; col++) {
            for (z = 0; z < depth; z++) {
               temp = result[(row * w + col) * depth + z];
               result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
               result[((h - row - 1) * w + col) * depth + z] = temp;
            }
         }
      }
   }
}
#endif

#ifndef STBI_NO_STDIO

static FILE *stbi__fopen(char const *filename, char const *mode)
{
   FILE *f;
#if defined(_MSC_VER) && _MSC_VER >= 1400
   if (0 != fopen_s(&f, filename, mode))
      f=0;
#else
   f = fopen(filename, mode);
#endif
   return f;
}


STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   FILE *f = stbi__fopen(filename, "rb");
   unsigned char *result;
   if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
   result = stbi_load_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *result;
   stbi__context s;
   stbi__start_file(&s,f);
   result = stbi__load_flip(&s,x,y,comp,req_comp);
   if (result) {
      // need to 'unget' all the characters in the IO buffer
      fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
   }
   return result;
}
#endif //!STBI_NO_STDIO

STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}

STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__load_flip(&s,x,y,comp,req_comp);
}
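
// Illustrative sketch only (not part of stb_image): a caller-side set of I/O
// callbacks with the same three signatures as the stdio callbacks above
// (read, skip, eof), here backed by a byte array. The type mem_stream_example
// and the three function names are hypothetical; the clamping in the skip
// callback is a choice made for this example. Compiled out with #if 0.
#if 0
```c
#include <string.h>

// Hypothetical in-memory stream; any user-defined state can serve as 'user'.
typedef struct
{
   const unsigned char *data;
   int size;
   int pos;
} mem_stream_example;

// read: copy up to 'size' bytes into 'data', return how many were copied
static int mem_read_example(void *user, char *data, int size)
{
   mem_stream_example *m = (mem_stream_example *) user;
   int left = m->size - m->pos;
   if (size > left) size = left;
   memcpy(data, m->data + m->pos, (size_t) size);
   m->pos += size;
   return size;
}

// skip: move the read position by n bytes, clamped to the stream bounds
static void mem_skip_example(void *user, int n)
{
   mem_stream_example *m = (mem_stream_example *) user;
   m->pos += n;
   if (m->pos > m->size) m->pos = m->size;
   if (m->pos < 0) m->pos = 0;
}

// eof: nonzero once every byte has been consumed
static int mem_eof_example(void *user)
{
   mem_stream_example *m = (mem_stream_example *) user;
   return m->pos >= m->size;
}
```
#endif
// With stb_image.h included, these three would initialize a stbi_io_callbacks
// in the order read, skip, eof (the order used by stbi__stdio_callbacks), and a
// pointer to the stream would be passed as 'user' to stbi_load_from_callbacks.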

#ifndef STBI_NO_LINEAR
static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char *data;
   #ifndef STBI_NO_HDR
   if (stbi__hdr_test(s)) {
      float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
      if (hdr_data)
         stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
      return hdr_data;
   }
   #endif
   data = stbi__load_flip(s, x, y, comp, req_comp);
   if (data)
      return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
   return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
}

STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
{
   float *result;
   FILE *f = stbi__fopen(filename, "rb");
   if (!f) return stbi__errpf("can't fopen", "Unable to open file");
   result = stbi_loadf_from_file(f,x,y,comp,req_comp);
   fclose(f);
   return result;
}

STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
{
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__loadf_main(&s,x,y,comp,req_comp);
}
#endif // !STBI_NO_STDIO

#endif // !STBI_NO_LINEAR

// the is-hdr-or-not functions are defined regardless of whether STBI_NO_HDR
// is defined, for API simplicity; if STBI_NO_HDR is defined, they always
// report false!

STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(buffer);
   STBI_NOTUSED(len);
   return 0;
   #endif
}

#ifndef STBI_NO_STDIO
STBIDEF int      stbi_is_hdr          (char const *filename)
{
   FILE *f = stbi__fopen(filename, "rb");
   int result=0;
   if (f) {
      result = stbi_is_hdr_from_file(f);
      fclose(f);
   }
   return result;
}

STBIDEF int      stbi_is_hdr_from_file(FILE *f)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_file(&s,f);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(f);
   return 0;
   #endif
}
#endif // !STBI_NO_STDIO

STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
{
   #ifndef STBI_NO_HDR
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
   return stbi__hdr_test(&s);
   #else
   STBI_NOTUSED(clbk);
   STBI_NOTUSED(user);
   return 0;
   #endif
}

#ifndef STBI_NO_LINEAR
static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;

STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
#endif

static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;

STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }


//////////////////////////////////////////////////////////////////////////////
//
// Common code used by all image loaders
//

enum
{
   STBI__SCAN_load=0,
   STBI__SCAN_type,
   STBI__SCAN_header
};

static void stbi__refill_buffer(stbi__context *s)
{
   int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
   if (n == 0) {
      // at end of file, treat same as if from memory, but need to handle case
      // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
      s->read_from_callbacks = 0;
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start+1;
      *s->img_buffer = 0;
   } else {
      s->img_buffer = s->buffer_start;
      s->img_buffer_end = s->buffer_start + n;
   }
}

stbi_inline static stbi_uc stbi__get8(stbi__context *s)
{
   if (s->img_buffer < s->img_buffer_end)
      return *s->img_buffer++;
   if (s->read_from_callbacks) {
      stbi__refill_buffer(s);
      return *s->img_buffer++;
   }
   return 0;
}

stbi_inline static int stbi__at_eof(stbi__context *s)
{
   if (s->io.read) {
      if (!(s->io.eof)(s->io_user_data)) return 0;
      // if feof() is true, check if buffer = end
      // special case: we've only got the special 0 character at the end
      if (s->read_from_callbacks == 0) return 1;
   }

   return s->img_buffer >= s->img_buffer_end;
}

static void stbi__skip(stbi__context *s, int n)
{
   if (n < 0) {
      s->img_buffer = s->img_buffer_end;
      return;
   }
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         s->img_buffer = s->img_buffer_end;
         (s->io.skip)(s->io_user_data, n - blen);
         return;
      }
   }
   s->img_buffer += n;
}

static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
{
   if (s->io.read) {
      int blen = (int) (s->img_buffer_end - s->img_buffer);
      if (blen < n) {
         int res, count;

         memcpy(buffer, s->img_buffer, blen);

         count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
         res = (count == (n-blen));
         s->img_buffer = s->img_buffer_end;
         return res;
      }
   }

   if (s->img_buffer+n <= s->img_buffer_end) {
      memcpy(buffer, s->img_buffer, n);
      s->img_buffer += n;
      return 1;
   } else
      return 0;
}

static int stbi__get16be(stbi__context *s)
{
   int z = stbi__get8(s);
   return (z << 8) + stbi__get8(s);
}

#if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
// nothing
#else
static stbi__uint32 stbi__get32be(stbi__context *s)
{
   stbi__uint32 z = stbi__get16be(s);
   return (z << 16) + stbi__get16be(s);
}
#endif

#if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
// nothing
#else
static int stbi__get16le(stbi__context *s)
{
   int z = stbi__get8(s);
   return z + (stbi__get8(s) << 8);
}
#endif

#ifndef STBI_NO_BMP
static stbi__uint32 stbi__get32le(stbi__context *s)
{
   stbi__uint32 z = stbi__get16le(s);
   // cast before shifting: shifting a signed int >= 0x8000 left by 16 overflows
   return z + ((stbi__uint32) stbi__get16le(s) << 16);
}
#endif
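
// Illustrative sketch only (not part of stb_image): the same big-endian and
// little-endian assembly as the stbi__get16/32 helpers above, but reading from
// a plain byte pointer so the byte order is easy to see. The *_example names
// are hypothetical. Compiled out with #if 0.
#if 0
```c
#include <stdint.h>

// big-endian: most significant byte comes first
static uint32_t read16be_example(const unsigned char *p)
{
   return ((uint32_t) p[0] << 8) | p[1];
}

static uint32_t read32be_example(const unsigned char *p)
{
   return (read16be_example(p) << 16) | read16be_example(p + 2);
}

// little-endian: least significant byte comes first
static uint32_t read16le_example(const unsigned char *p)
{
   return p[0] | ((uint32_t) p[1] << 8);
}

static uint32_t read32le_example(const unsigned char *p)
{
   return read16le_example(p) | (read16le_example(p + 2) << 16);
}
```
#endif
// Working in uint32_t throughout keeps every shift in unsigned arithmetic.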

#define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings


//////////////////////////////////////////////////////////////////////////////
//
//  generic converter from built-in img_n to req_comp
//    individual types do this automatically as much as possible (e.g. jpeg
//    does all cases internally since it needs to colorspace convert anyway,
//    and it never has alpha, so very few cases ). png can automatically
//    interleave an alpha=255 channel, but falls back to this for other cases
//
//  assume data buffer is malloced, so malloc a new one and free that one
//  only failure mode is malloc failing

static stbi_uc stbi__compute_y(int r, int g, int b)
{
   return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
}
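
// Illustrative sketch only (not part of stb_image): the integer weights in
// stbi__compute_y sum to 77 + 150 + 29 = 256, so the >> 8 renormalizes the
// weighted sum to 0..255. As fractions (77/256, 150/256, 29/256) they are
// close to the familiar 0.299/0.587/0.114 luma weights; that reading is an
// interpretation, not something the source states. luma_example is a
// hypothetical name. Compiled out with #if 0.
#if 0
```c
// 8.8 fixed-point luma: the weights sum to 256, so white maps to exactly 255
static unsigned char luma_example(int r, int g, int b)
{
   return (unsigned char) ((r*77 + g*150 + b*29) >> 8);
}
```
#endif
// The shift truncates rather than rounds, e.g. pure red gives 19635 >> 8 = 76.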

static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
{
   int i,j;
   unsigned char *good;

   if (req_comp == img_n) return data;
   STBI_ASSERT(req_comp >= 1 && req_comp <= 4);

   good = (unsigned char *) stbi__malloc(req_comp * x * y);
   if (good == NULL) {
      STBI_FREE(data);
      return stbi__errpuc("outofmem", "Out of memory");
   }

   for (j=0; j < (int) y; ++j) {
      unsigned char *src  = data + j * x * img_n   ;
      unsigned char *dest = good + j * x * req_comp;

      #define COMBO(a,b)  ((a)*8+(b))
      #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
      // convert source image with img_n components to one with req_comp components;
      // avoid switch per pixel, so use switch per scanline and massive macros
      switch (COMBO(img_n, req_comp)) {
         CASE(1,2) dest[0]=src[0], dest[1]=255; break;
         CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
         CASE(2,1) dest[0]=src[0]; break;
         CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
         CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
         CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
         CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
         CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
         CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
         CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
         default: STBI_ASSERT(0);
      }
      #undef CASE
   }

   STBI_FREE(data);
   return good;
}

#ifndef STBI_NO_LINEAR
static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
{
   int i,k,n;
   float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
   if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
      }
      if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
   }
   STBI_FREE(data);
   return output;
}
#endif

#ifndef STBI_NO_HDR
#define stbi__float2int(x)   ((int) (x))
static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
{
   int i,k,n;
   stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
   if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
   // compute number of non-alpha components
   if (comp & 1) n = comp; else n = comp-1;
   for (i=0; i < x*y; ++i) {
      for (k=0; k < n; ++k) {
         float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
      if (k < comp) {
         float z = data[i*comp+k] * 255 + 0.5f;
         if (z < 0) z = 0;
         if (z > 255) z = 255;
         output[i*comp + k] = (stbi_uc) stbi__float2int(z);
      }
   }
   STBI_FREE(data);
   return output;
}
#endif

//////////////////////////////////////////////////////////////////////////////
//
//  "baseline" JPEG/JFIF decoder
//
//    simple implementation
//      - doesn't support delayed output of y-dimension
//      - simple interface (only one output format: 8-bit interleaved RGB)
//      - doesn't try to recover corrupt jpegs
//      - doesn't allow partial loading, loading multiple at once
//      - still fast on x86 (copying globals into locals doesn't help x86)
//      - allocates lots of intermediate memory (full size of all components)
//        - non-interleaved case requires this anyway
//        - allows good upsampling (see next)
//    high-quality
//      - upsampled channels are bilinearly interpolated, even across blocks
//      - quality integer IDCT derived from IJG's 'slow'
//    performance
//      - fast huffman; reasonable integer IDCT
//      - some SIMD kernels for common paths on targets with SSE2/NEON
//      - uses a lot of intermediate memory, could cache poorly

#ifndef STBI_NO_JPEG

// huffman decoding acceleration
#define FAST_BITS   9  // larger handles more cases; smaller stomps less cache

typedef struct
{
   stbi_uc  fast[1 << FAST_BITS];
   // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   stbi__uint16 code[256];
   stbi_uc  values[256];
   stbi_uc  size[257];
   unsigned int maxcode[18];
   int    delta[17];   // old 'firstsymbol' - old 'firstcode'
} stbi__huffman;

typedef struct
{
   stbi__context *s;
   stbi__huffman huff_dc[4];
   stbi__huffman huff_ac[4];
   stbi_uc dequant[4][64];
   stbi__int16 fast_ac[4][1 << FAST_BITS];

// sizes for components, interleaved MCUs
   int img_h_max, img_v_max;
   int img_mcu_x, img_mcu_y;
   int img_mcu_w, img_mcu_h;

// definition of jpeg image component
   struct
   {
      int id;
      int h,v;
      int tq;
      int hd,ha;
      int dc_pred;

      int x,y,w2,h2;
      stbi_uc *data;
      void *raw_data, *raw_coeff;
      stbi_uc *linebuf;
      short   *coeff;   // progressive only
      int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
   } img_comp[4];

   stbi__uint32   code_buffer; // jpeg entropy-coded buffer
   int            code_bits;   // number of valid bits
   unsigned char  marker;      // marker seen while filling entropy buffer
   int            nomore;      // flag if we saw a marker so must stop

   int            progressive;
   int            spec_start;
   int            spec_end;
   int            succ_high;
   int            succ_low;
   int            eob_run;
   int            rgb;

   int scan_n, order[4];
   int restart_interval, todo;

// kernels
   void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
   void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
   stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
} stbi__jpeg;

static int stbi__build_huffman(stbi__huffman *h, int *count)
{
   int i,j,k=0,code;
   // build size list for each symbol (from JPEG spec)
   for (i=0; i < 16; ++i)
      for (j=0; j < count[i]; ++j)
         h->size[k++] = (stbi_uc) (i+1);
   h->size[k] = 0;

   // compute actual symbols (from jpeg spec)
   code = 0;
   k = 0;
   for(j=1; j <= 16; ++j) {
      // compute delta to add to code to compute symbol id
      h->delta[j] = k - code;
      if (h->size[k] == j) {
         while (h->size[k] == j)
            h->code[k++] = (stbi__uint16) (code++);
         if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
      }
      // compute largest code + 1 for this size, preshifted as needed later
      h->maxcode[j] = code << (16-j);
      code <<= 1;
   }
   h->maxcode[j] = 0xffffffff;

   // build non-spec acceleration table; 255 is flag for not-accelerated
   memset(h->fast, 255, 1 << FAST_BITS);
   for (i=0; i < k; ++i) {
      int s = h->size[i];
      if (s <= FAST_BITS) {
         int c = h->code[i] << (FAST_BITS-s);
         int m = 1 << (FAST_BITS-s);
         for (j=0; j < m; ++j) {
            h->fast[c+j] = (stbi_uc) i;
         }
      }
   }
   return 1;
}
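
// Illustrative sketch only (not part of stb_image): the code-assignment step
// of stbi__build_huffman in isolation. Given how many codes exist at each bit
// length 1..16, codes are handed out in increasing order within a length, and
// the running code is doubled when moving to the next length. The name
// assign_codes_example, its output arrays, and the -1 error return are
// assumptions made for this example. Compiled out with #if 0.
#if 0
```c
// count[i] = number of codes of length i+1.
// Fills sizes[k] and codes[k] for each symbol slot k in order.
// Returns the number of codes assigned, or -1 if the lengths are invalid.
static int assign_codes_example(const int count[16], unsigned char sizes[256],
                                unsigned short codes[256])
{
   int len, j, k = 0;
   unsigned int code = 0;
   for (len = 1; len <= 16; ++len) {
      for (j = 0; j < count[len-1]; ++j) {
         if (k >= 256) return -1;            // more symbols than slots
         sizes[k] = (unsigned char) len;
         codes[k++] = (unsigned short) code++;
      }
      if (code > (1u << len)) return -1;     // more codes than len bits can hold
      code <<= 1;                            // next length appends one bit
   }
   return k;
}
```
#endif
// For the sample counts {0,1,5,1,1,1,1,1,1,0,...} the first codes come out as
// 00, 010, 011, 100, 101, 110, 1110, 11110: shorter codes are numeric prefixes
// of nothing longer, which is what makes the maxcode comparison in the decoder
// work.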

// build a table that decodes both magnitude and value of small ACs in
// one go.
static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
{
   int i;
   for (i=0; i < (1 << FAST_BITS); ++i) {
      stbi_uc fast = h->fast[i];
      fast_ac[i] = 0;
      if (fast < 255) {
         int rs = h->values[fast];
         int run = (rs >> 4) & 15;
         int magbits = rs & 15;
         int len = h->size[fast];

         if (magbits && len + magbits <= FAST_BITS) {
            // magnitude code followed by receive_extend code
            int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
            int m = 1 << (magbits - 1);
            // unsigned shift: left-shifting a negative int is undefined behavior
            if (k < m) k += (~0U << magbits) + 1;
            // if the result is small enough, we can fit it in fast_ac table
            if (k >= -128 && k <= 127)
               fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
         }
      }
   }
}

static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
{
   do {
      int b = j->nomore ? 0 : stbi__get8(j->s);
      if (b == 0xff) {
         int c = stbi__get8(j->s);
         if (c != 0) {
            j->marker = (unsigned char) c;
            j->nomore = 1;
            return;
         }
      }
      j->code_buffer |= b << (24 - j->code_bits);
      j->code_bits += 8;
   } while (j->code_bits <= 24);
}

// (1 << n) - 1
static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};

// decode a jpeg huffman value from the bitstream
stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
{
   unsigned int temp;
   int c,k;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   // look at the top FAST_BITS and determine what symbol ID it is,
   // if the code is <= FAST_BITS
   c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   k = h->fast[c];
   if (k < 255) {
      int s = h->size[k];
      if (s > j->code_bits)
         return -1;
      j->code_buffer <<= s;
      j->code_bits -= s;
      return h->values[k];
   }

   // naive test is to shift the code_buffer down so k bits are
   // valid, then test against maxcode. To speed this up, we've
   // preshifted maxcode left so that it has (16-k) 0s at the
   // end; in other words, regardless of the number of bits, it
   // wants to be compared against something shifted to have 16;
   // that way we don't need to shift inside the loop.
   temp = j->code_buffer >> 16;
   for (k=FAST_BITS+1 ; ; ++k)
      if (temp < h->maxcode[k])
         break;
   if (k == 17) {
      // error! code not found
      j->code_bits -= 16;
      return -1;
   }

   if (k > j->code_bits)
      return -1;

   // convert the huffman code to the symbol id
   c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
   STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);

   // convert the id to a symbol
   j->code_bits -= k;
   j->code_buffer <<= k;
   return h->values[c];
}

// bias[n] = (-1<<n) + 1
static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};

// combined JPEG 'receive' and JPEG 'extend', since baseline
// always extends everything it receives.
stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
{
   unsigned int k;
   int sgn;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);

   sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
   k = stbi_lrot(j->code_buffer, n);
   STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k + (stbi__jbias[n] & ~sgn);
}
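
// Illustrative sketch only (not part of stb_image): the 'extend' half of
// stbi__extend_receive written with a branch instead of the sign-mask trick.
// An n-bit field whose leading bit is 1 is a positive value as-is; a field
// whose leading bit is 0 encodes a negative value, recovered by adding the
// bias (-1<<n) + 1 from the table above, written here as 1 - (1 << n).
// extend_example is a hypothetical name. Compiled out with #if 0.
#if 0
```c
// bits holds an n-bit unsigned field, 1 <= n <= 15
static int extend_example(unsigned int bits, int n)
{
   if (bits < (1u << (n - 1)))              // leading bit clear: negative
      return (int) bits + 1 - (1 << n);
   return (int) bits;                       // leading bit set: positive
}
```
#endif
// With n = 3, the fields 100..111 decode to 4..7 and 000..011 decode to -7..-4,
// so every magnitude class covers a symmetric range with no zero.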

// get some unsigned bits
stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
{
   unsigned int k;
   if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
   k = stbi_lrot(j->code_buffer, n);
   j->code_buffer = k & ~stbi__bmask[n];
   k &= stbi__bmask[n];
   j->code_bits -= n;
   return k;
}
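
// Illustrative sketch only (not part of stb_image): a stripped-down MSB-first
// bit reader in the style of code_buffer/code_bits above. Valid bits are kept
// at the top of a 32-bit word, new bytes are OR-ed in below them, and reading
// n bits takes the top n and shifts the rest up. Unlike the real decoder it
// uses a plain shift rather than stbi_lrot and does no 0xFF marker handling;
// it simply feeds zero bytes once the input runs out. All names here are
// hypothetical. Compiled out with #if 0.
#if 0
```c
#include <stdint.h>

typedef struct
{
   const unsigned char *p, *end;  // remaining input
   uint32_t buf;                  // valid bits are left-aligned
   int bits;                      // number of valid bits in buf
} bitreader_example;

// top up the buffer one byte at a time until more than 24 bits are valid
static void fill_example(bitreader_example *b)
{
   while (b->bits <= 24) {
      uint32_t byte = (b->p < b->end) ? *b->p++ : 0;
      b->buf |= byte << (24 - b->bits);
      b->bits += 8;
   }
}

// read n bits, 1 <= n <= 16, most significant bit first
static unsigned int get_bits_example(bitreader_example *b, int n)
{
   unsigned int v;
   if (b->bits < n) fill_example(b);
   v = b->buf >> (32 - n);
   b->buf <<= n;
   b->bits -= n;
   return v;
}
```
#endif
// n = 0 is excluded because shifting a 32-bit value by 32 is undefined in C.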

stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
{
   unsigned int k;
   if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
   k = j->code_buffer;
   j->code_buffer <<= 1;
   --j->code_bits;
   return k & 0x80000000;
}

// given a value that's at position X in the zigzag stream,
// where does it appear in the 8x8 matrix coded as row-major?
static stbi_uc stbi__jpeg_dezigzag[64+15] =
{
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63,
   // let corrupt input sample past end
   63, 63, 63, 63, 63, 63, 63, 63,
   63, 63, 63, 63, 63, 63, 63
};
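
// Illustrative sketch only (not part of stb_image): the first 64 entries of
// the dezigzag table can be regenerated by walking the anti-diagonals of an
// 8x8 grid and alternating direction on each one. This is a cross-check of
// the table, not how the library builds it (the table is a literal).
// zigzag_order_example is a hypothetical name. Compiled out with #if 0.
#if 0
```c
// out[n] = row-major index (row*8 + col) of the n-th coefficient in zigzag order
static void zigzag_order_example(unsigned char out[64])
{
   int d, t, n = 0;
   for (d = 0; d < 15; ++d) {              // d = row + col, one anti-diagonal
      int lo = d < 8 ? 0 : d - 7;          // smallest row on this diagonal
      int hi = d < 8 ? d : 7;              // largest row on this diagonal
      for (t = lo; t <= hi; ++t) {
         // odd diagonals run top-right to bottom-left (row increasing),
         // even diagonals run the other way (row decreasing)
         int row = (d & 1) ? t : (lo + hi - t);
         int col = d - row;
         out[n++] = (unsigned char) (row * 8 + col);
      }
   }
}
```
#endif
// The 15 trailing 63s in the real table have no counterpart here; per the
// comment in the table they exist so corrupt input can index past entry 63.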

// decode one 64-entry block--
static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
{
   int diff,dc,k;
   int t;

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
   t = stbi__jpeg_huff_decode(j, hdc);
   if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");

   // 0 all the ac values now so we can do it 32-bits at a time
   memset(data,0,64*sizeof(data[0]));

   diff = t ? stbi__extend_receive(j, t) : 0;
   dc = j->img_comp[b].dc_pred + diff;
   j->img_comp[b].dc_pred = dc;
   data[0] = (short) (dc * dequant[0]);

   // decode AC components, see JPEG spec
   k = 1;
   do {
      unsigned int zig;
      int c,r,s;
      if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
      c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
      r = fac[c];
      if (r) { // fast-AC path
         k += (r >> 4) & 15; // run
         s = r & 15; // combined length
         j->code_buffer <<= s;
         j->code_bits -= s;
         // decode into unzigzag'd location
         zig = stbi__jpeg_dezigzag[k++];
         data[zig] = (short) ((r >> 8) * dequant[zig]);
      } else {
         int rs = stbi__jpeg_huff_decode(j, hac);
         if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
         s = rs & 15;
         r = rs >> 4;
         if (s == 0) {
            if (rs != 0xf0) break; // end block
            k += 16;
         } else {
            k += r;
            // decode into unzigzag'd location
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
         }
      }
   } while (k < 64);
   return 1;
}

static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
{
   int diff,dc;
   int t;
   if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   if (j->succ_high == 0) {
      // first scan for DC coefficient, must be first
      memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
      t = stbi__jpeg_huff_decode(j, hdc);
      // reject a failed decode before using t as a bit count,
      // matching the check in stbi__jpeg_decode_block
      if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
      diff = t ? stbi__extend_receive(j, t) : 0;

      dc = j->img_comp[b].dc_pred + diff;
      j->img_comp[b].dc_pred = dc;
      data[0] = (short) (dc << j->succ_low);
   } else {
      // refinement scan for DC coefficient
      if (stbi__jpeg_get_bit(j))
         data[0] += (short) (1 << j->succ_low);
   }
   return 1;
}

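/* Illustration only, not part of stb_image: a minimal sketch of how the
   two DC scan types above rebuild one coefficient under successive
   approximation. example_dc_first and example_dc_refine are hypothetical
   names; Huffman decoding and DC prediction are left out, and the sketch
   only handles non-negative values.

```c
#include <assert.h>

// First DC scan: the stream carries the coefficient without its low
// succ_low bits, so shift the decoded value back up into place.
static short example_dc_first(int dc, int succ_low)
{
   assert(dc >= 0);   // keep the sketch clear of shifting negative values
   return (short) (dc << succ_low);
}

// Refinement scan: one extra bit per block fills in bit number succ_low.
static short example_dc_refine(short coeff, int bit, int succ_low)
{
   if (bit)
      coeff = (short) (coeff + (1 << succ_low));
   return coeff;
}
```

   For a true coefficient of 5 (binary 101), a first scan with succ_low=1
   carries 2 and stores 4; a refinement scan with succ_low=0 and a set bit
   then produces 5.
*/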
// @OPTIMIZE: store non-zigzagged during the decode passes,
// and only de-zigzag when dequantizing
static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
{
   int k;
   if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->succ_high == 0) {
      int shift = j->succ_low;

      if (j->eob_run) {
         --j->eob_run;
         return 1;
      }

      k = j->spec_start;
      do {
         unsigned int zig;
         int c,r,s;
         if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
         c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
         r = fac[c];
         if (r) { // fast-AC path
            k += (r >> 4) & 15; // run
            s = r & 15; // combined length
            j->code_buffer <<= s;
            j->code_bits -= s;
            zig = stbi__jpeg_dezigzag[k++];
            data[zig] = (short) ((r >> 8) << shift);
         } else {
            int rs = stbi__jpeg_huff_decode(j, hac);
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r);
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  --j->eob_run;
                  break;
               }
               k += 16;
            } else {
               k += r;
               zig = stbi__jpeg_dezigzag[k++];
               data[zig] = (short) (stbi__extend_receive(j,s) << shift);
            }
         }
      } while (k <= j->spec_end);
   } else {
      // refinement scan for these AC coefficients

      short bit = (short) (1 << j->succ_low);

      if (j->eob_run) {
         --j->eob_run;
         for (k = j->spec_start; k <= j->spec_end; ++k) {
            short *p = &data[stbi__jpeg_dezigzag[k]];
            if (*p != 0)
               if (stbi__jpeg_get_bit(j))
                  if ((*p & bit)==0) {
                     if (*p > 0)
                        *p += bit;
                     else
                        *p -= bit;
                  }
         }
      } else {
         k = j->spec_start;
         do {
            int r,s;
            int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
            if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
            s = rs & 15;
            r = rs >> 4;
            if (s == 0) {
               if (r < 15) {
                  j->eob_run = (1 << r) - 1;
                  if (r)
                     j->eob_run += stbi__jpeg_get_bits(j, r);
                  r = 64; // force end of block
               } else {
                  // r=15 s=0 should write 16 0s, so we just do
                  // a run of 15 0s and then write s (which is 0),
                  // so we don't have to do anything special here
               }
            } else {
               if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
               // sign bit
               if (stbi__jpeg_get_bit(j))
                  s = bit;
               else
                  s = -bit;
            }

            // advance by r
            while (k <= j->spec_end) {
               short *p = &data[stbi__jpeg_dezigzag[k++]];
               if (*p != 0) {
                  if (stbi__jpeg_get_bit(j))
                     if ((*p & bit)==0) {
                        if (*p > 0)
                           *p += bit;
                        else
                           *p -= bit;
                     }
               } else {
                  if (r == 0) {
                     *p = (short) s;
                     break;
                  }
                  --r;
               }
            }
         } while (k <= j->spec_end);
      }
   }
   return 1;
}

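/* Illustration only, not part of stb_image: both AC paths above compute
   the end-of-band run the same way when they see s == 0 and r < 15. The
   sketch below isolates that arithmetic; example_eob_remaining is a
   hypothetical name, and "extra" stands for the r-bit value that the
   decoder reads with stbi__jpeg_get_bits.

```c
#include <assert.h>

// Number of blocks, after the current one, that the end-of-band code
// marks as having no further coefficients in this band.
static int example_eob_remaining(int r, int extra)
{
   assert(r >= 0 && r < 15);
   assert(extra >= 0 && extra < (1 << r));
   // the code covers (1 << r) + extra blocks including the current one
   return (1 << r) + extra - 1;
}
```

   With r=0 only the current block ends, so nothing remains; with r=1 and
   extra=1 the run covers three blocks, leaving two after the current one.
*/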
// clamp an int to the range 0..255 and return it as a stbi_uc
stbi_inline static stbi_uc stbi__clamp(int x)
{
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (stbi_uc) x;
}

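/* Illustration only, not part of stb_image: a minimal sketch of the
   single-comparison trick used by stbi__clamp above. example_clamp is a
   hypothetical name and plain unsigned char stands in for stbi_uc.

```c
#include <assert.h>

// Clamp an int to 0..255.
static unsigned char example_clamp(int x)
{
   // a negative int converts to a very large unsigned value, so this one
   // comparison is true for both x < 0 and x > 255
   if ((unsigned int) x > 255)
      return (unsigned char) (x < 0 ? 0 : 255);
   return (unsigned char) x;
}

// Check the edges of the valid range and one value on each side of it.
static void example_check_clamp(void)
{
   assert(example_clamp(-1) == 0);
   assert(example_clamp(0) == 0);
   assert(example_clamp(255) == 255);
   assert(example_clamp(256) == 255);
}
```

   In-range values take only the one comparison; the sign test runs just
   for values that are already known to be out of range.
*/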
#define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
#define stbi__fsh(x)  ((x) << 12)

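/* Illustration only, not part of stb_image: the two macros above set up
   fixed-point arithmetic with 12 fractional bits. The sketch below copies
   them under hypothetical EXAMPLE_ names so it compiles on its own, and
   adds a hypothetical example_scale helper to show a multiply followed by
   the shift back down.

```c
#include <assert.h>

// Local copies of the two macros above.
#define EXAMPLE_F2F(x)  ((int) (((x) * 4096 + 0.5)))
#define EXAMPLE_FSH(x)  ((x) << 12)

// Multiply an integer sample by a fixed-point constant and drop the
// 12 fractional bits again.
static int example_scale(int sample, int fixed_constant)
{
   assert(sample >= 0 && fixed_constant >= 0);   // keep the shift well defined
   return (sample * fixed_constant) >> 12;
}
```

   Here 0.5411961 becomes 2217, so scaling 100 gives 54, the integer part
   of 54.12. The cast truncates toward zero after 0.5 is added, so a
   negative constant such as -1.847759065 becomes -7567 rather than the
   nearest integer, -7568.
*/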
// derived from jidctint -- DCT_ISLOW
#define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
   int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   p2 = s2; \
   p3 = s6; \
   p1 = (p2+p3) * stbi__f2f(0.5411961f); \
   t2 = p1 + p3*stbi__f2f(-1.847759065f); \
   t3 = p1 + p2*stbi__f2f( 0.765366865f); \
   p2 = s0; \
   p3 = s4; \
   t0 = stbi__fsh(p2+p3); \
   t1 = stbi__fsh(p2-p3); \
   x0 = t0+t3; \
   x3 = t0-t3; \
   x1 = t1+t2; \
   x2 = t1-t2; \
   t0 = s7; \
   t1 = s5; \
   t2 = s3; \
   t3 = s1; \
   p3 = t0+t2; \
   p4 = t1+t3; \
   p1 = t0+t3; \
   p2 = t1+t2; \
   p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
   t0 = t0*stbi__f2f( 0.298631336f); \
   t1 = t1*stbi__f2f( 2.053119869f); \
   t2 = t2*stbi__f2f( 3.072711026f); \
   t3 = t3*stbi__f2f( 1.501321110f); \
   p1 = p5 + p1*stbi__f2f(-0.899976223f); \
   p2 = p5 + p2*stbi__f2f(-2.562915447f); \
   p3 = p3*stbi__f2f(-1.961570560f); \
   p4 = p4*stbi__f2f(-0.390180644f); \
   t3 += p1+p4; \
   t2 += p2+p3; \
   t1 += p2+p4; \
   t0 += p1+p3;

static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
{
   int i,val[64],*v=val;
   stbi_uc *o;
   short *d = data;

   // columns
   for (i=0; i < 8; ++i,++d, ++v) {
      // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
      if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
           && d[40]==0 && d[48]==0 && d[56]==0) {
         //    no shortcut                 0     seconds
         //    (1|2|3|4|5|6|7)==0          0     seconds
         //    all separate               -0.047 seconds
         //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
         int dcterm = d[0] << 2;
         v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
      } else {
         STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
         // constants scaled things up by 1<<12; let's bring them back
         // down, but keep 2 extra bits of precision
         x0 += 512; x1 += 512; x2 += 512; x3 += 512;
         v[ 0] = (x0+t3) >> 10;
         v[56] = (x0-t3) >> 10;
         v[ 8] = (x1+t2) >> 10;
         v[48] = (x1-t2) >> 10;
         v[16] = (x2+t1) >> 10;
         v[40] = (x2-t1) >> 10;
         v[24] = (x3+t0) >> 10;
         v[32] = (x3-t0) >> 10;
      }
   }

   for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
      // no fast case since the first 1D IDCT spread components out
      STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
      // constants scaled things up by 1<<12, plus we had 1<<2 from first
      // loop, plus horizontal and vertical each scale by sqrt(8) so together
      // we've got an extra 1<<3, so 1<<17 total we need to remove.
      // so we want to round that, which means adding 0.5 * 1<<17,
      // aka 65536. Also, we'll end up with -128 to 127 that we want
      // to encode as 0..255 by adding 128, so we'll add that before the shift
      x0 += 65536 + (128<<17);
      x1 += 65536 + (128<<17);
      x2 += 65536 + (128<<17);
      x3 += 65536 + (128<<17);
      // tried computing the shifts into temps, or'ing the temps to see
      // if any were out of range, but that was slower
      o[0] = stbi__clamp((x0+t3) >> 17);
      o[7] = stbi__clamp((x0-t3) >> 17);
      o[1] = stbi__clamp((x1+t2) >> 17);
      o[6] = stbi__clamp((x1-t2) >> 17);
      o[2] = stbi__clamp((x2+t1) >> 17);
      o[5] = stbi__clamp((x2-t1) >> 17);
      o[3] = stbi__clamp((x3+t0) >> 17);
      o[4] = stbi__clamp((x3-t0) >> 17);
   }
}

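/* Illustration only, not part of stb_image: a minimal sketch of the
   rounding and level-shift bias that the row pass of stbi__idct_block
   adds before shifting right by 17. example_descale is a hypothetical
   name. The sketch asserts that the biased value is non-negative, because
   right-shifting a negative int is implementation-defined in C; the
   function above has no such check and passes the shifted result to
   stbi__clamp.

```c
#include <assert.h>

// x is a sample scaled up by 1<<17 and centered on zero.
static int example_descale(int x)
{
   // 65536 is half of 1<<17 and rounds to nearest; 128<<17 moves the
   // -128..127 range onto 0..255 before the shift
   int biased = x + 65536 + (128 << 17);
   assert(biased >= 0);
   return biased >> 17;
}
```

   A zero input maps to 128, exactly half a step (65536) rounds up to 129,
   and the most negative in-range input, -(128<<17), maps to 0.
*/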
#ifdef STBI_SSE2
// sse2 integer IDCT. not the fastest possible implementation but it
// produces bit-identical results to the generic C version so it's
// fully "transparent".
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
{
   // This is constructed to match our regular (generic) integer IDCT exactly.
   __m128i row0, row1, row2, row3, row4, row5, row6, row7;
   __m128i tmp;

   // dot product constant: even elems=x, odd elems=y
   #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))

   // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
   // out(1) = c1[even]*x + c1[odd]*y
   #define dct_rot(out0,out1, x,y,c0,c1) \
      __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
      __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
      __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
      __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
      __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
      __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)

   // out = in << 12  (in 16-bit, out 32-bit)
   #define dct_widen(out, in) \
      __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
      __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)

   // wide add
   #define dct_wadd(out, a, b) \
      __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_add_epi32(a##_h, b##_h)

   // wide sub
   #define dct_wsub(out, a, b) \
      __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
      __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)

   // butterfly a/b, add bias, then shift by "s" and pack
   #define dct_bfly32o(out0, out1, a,b,bias,s) \
      { \
         __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
         __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
         dct_wadd(sum, abiased, b); \
         dct_wsub(dif, abiased, b); \
         out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
         out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
      }

   // 8-bit interleave step (for transposes)
   #define dct_interleave8(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi8(a, b); \
      b = _mm_unpackhi_epi8(tmp, b)

   // 16-bit interleave step (for transposes)
   #define dct_interleave16(a, b) \
      tmp = a; \
      a = _mm_unpacklo_epi16(a, b); \
      b = _mm_unpackhi_epi16(tmp, b)

   #define dct_pass(bias,shift) \
      { \
         /* even part */ \
         dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
         __m128i sum04 = _mm_add_epi16(row0, row4); \
         __m128i dif04 = _mm_sub_epi16(row0, row4); \
         dct_widen(t0e, sum04); \
         dct_widen(t1e, dif04); \
         dct_wadd(x0, t0e, t3e); \
         dct_wsub(x3, t0e, t3e); \
         dct_wadd(x1, t1e, t2e); \
         dct_wsub(x2, t1e, t2e); \
         /* odd part */ \
         dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
         dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
         __m128i sum17 = _mm_add_epi16(row1, row7); \
         __m128i sum35 = _mm_add_epi16(row3, row5); \
         dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
         dct_wadd(x4, y0o, y4o); \
         dct_wadd(x5, y1o, y5o); \
         dct_wadd(x6, y2o, y5o); \
         dct_wadd(x7, y3o, y4o); \
         dct_bfly32o(row0,row7, x0,x7,bias,shift); \
         dct_bfly32o(row1,row6, x1,x6,bias,shift); \
         dct_bfly32o(row2,row5, x2,x5,bias,shift); \
         dct_bfly32o(row3,row4, x3,x4,bias,shift); \
      }

   __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
   __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
   __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
   __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
   __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
   __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
   __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
   __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));

   // rounding biases in column/row passes, see stbi__idct_block for explanation.
   __m128i bias_0 = _mm_set1_epi32(512);
   __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));

   // load
   row0 = _mm_load_si128((const __m128i *) (data + 0*8));
   row1 = _mm_load_si128((const __m128i *) (data + 1*8));
   row2 = _mm_load_si128((const __m128i *) (data + 2*8));
   row3 = _mm_load_si128((const __m128i *) (data + 3*8));
   row4 = _mm_load_si128((const __m128i *) (data + 4*8));
   row5 = _mm_load_si128((const __m128i *) (data + 5*8));
   row6 = _mm_load_si128((const __m128i *) (data + 6*8));
   row7 = _mm_load_si128((const __m128i *) (data + 7*8));

   // column pass
   dct_pass(bias_0, 10);

   {
      // 16bit 8x8 transpose pass 1
      dct_interleave16(row0, row4);
      dct_interleave16(row1, row5);
      dct_interleave16(row2, row6);
      dct_interleave16(row3, row7);

      // transpose pass 2
      dct_interleave16(row0, row2);
      dct_interleave16(row1, row3);
      dct_interleave16(row4, row6);
      dct_interleave16(row5, row7);

      // transpose pass 3
      dct_interleave16(row0, row1);
      dct_interleave16(row2, row3);
      dct_interleave16(row4, row5);
      dct_interleave16(row6, row7);
   }

   // row pass
   dct_pass(bias_1, 17);

   {
      // pack
      __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
      __m128i p1 = _mm_packus_epi16(row2, row3);
      __m128i p2 = _mm_packus_epi16(row4, row5);
      __m128i p3 = _mm_packus_epi16(row6, row7);

      // 8bit 8x8 transpose pass 1
      dct_interleave8(p0, p2); // a0e0a1e1...
      dct_interleave8(p1, p3); // c0g0c1g1...

      // transpose pass 2
      dct_interleave8(p0, p1); // a0c0e0g0...
      dct_interleave8(p2, p3); // b0d0f0h0...

      // transpose pass 3
      dct_interleave8(p0, p2); // a0b0c0d0...
      dct_interleave8(p1, p3); // a4b4c4d4...

      // store
      _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
      _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
      _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
   }

#undef dct_const
#undef dct_rot
#undef dct_widen
#undef dct_wadd
#undef dct_wsub
#undef dct_bfly32o
#undef dct_interleave8
#undef dct_interleave16
#undef dct_pass
}

#endif // STBI_SSE2

#ifdef STBI_NEON

// NEON integer IDCT. should produce bit-identical
// results to the generic C version.
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
{
   int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;

   int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
   int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
   int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
   int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
   int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
   int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
   int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
   int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
   int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
   int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
   int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
   int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));

#define dct_long_mul(out, inq, coeff) \
   int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)

#define dct_long_mac(out, acc, inq, coeff) \
   int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
   int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)

#define dct_widen(out, inq) \
   int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
   int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)

// wide add
#define dct_wadd(out, a, b) \
   int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vaddq_s32(a##_h, b##_h)

// wide sub
#define dct_wsub(out, a, b) \
   int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
   int32x4_t out##_h = vsubq_s32(a##_h, b##_h)

// butterfly a/b, then shift using "shiftop" by "s" and pack
#define dct_bfly32o(out0,out1, a,b,shiftop,s) \
   { \
      dct_wadd(sum, a, b); \
      dct_wsub(dif, a, b); \
      out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
      out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
   }

#define dct_pass(shiftop, shift) \
   { \
      /* even part */ \
      int16x8_t sum26 = vaddq_s16(row2, row6); \
      dct_long_mul(p1e, sum26, rot0_0); \
      dct_long_mac(t2e, p1e, row6, rot0_1); \
      dct_long_mac(t3e, p1e, row2, rot0_2); \
      int16x8_t sum04 = vaddq_s16(row0, row4); \
      int16x8_t dif04 = vsubq_s16(row0, row4); \
      dct_widen(t0e, sum04); \
      dct_widen(t1e, dif04); \
      dct_wadd(x0, t0e, t3e); \
      dct_wsub(x3, t0e, t3e); \
      dct_wadd(x1, t1e, t2e); \
      dct_wsub(x2, t1e, t2e); \
      /* odd part */ \
      int16x8_t sum15 = vaddq_s16(row1, row5); \
      int16x8_t sum17 = vaddq_s16(row1, row7); \
      int16x8_t sum35 = vaddq_s16(row3, row5); \
      int16x8_t sum37 = vaddq_s16(row3, row7); \
      int16x8_t sumodd = vaddq_s16(sum17, sum35); \
      dct_long_mul(p5o, sumodd, rot1_0); \
      dct_long_mac(p1o, p5o, sum17, rot1_1); \
      dct_long_mac(p2o, p5o, sum35, rot1_2); \
      dct_long_mul(p3o, sum37, rot2_0); \
      dct_long_mul(p4o, sum15, rot2_1); \
      dct_wadd(sump13o, p1o, p3o); \
      dct_wadd(sump24o, p2o, p4o); \
      dct_wadd(sump23o, p2o, p3o); \
      dct_wadd(sump14o, p1o, p4o); \
      dct_long_mac(x4, sump13o, row7, rot3_0); \
      dct_long_mac(x5, sump24o, row5, rot3_1); \
      dct_long_mac(x6, sump23o, row3, rot3_2); \
      dct_long_mac(x7, sump14o, row1, rot3_3); \
      dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
      dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
      dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
      dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
   }

   // load
   row0 = vld1q_s16(data + 0*8);
   row1 = vld1q_s16(data + 1*8);
   row2 = vld1q_s16(data + 2*8);
   row3 = vld1q_s16(data + 3*8);
   row4 = vld1q_s16(data + 4*8);
   row5 = vld1q_s16(data + 5*8);
   row6 = vld1q_s16(data + 6*8);
   row7 = vld1q_s16(data + 7*8);

   // add DC bias
   row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));

   // column pass
   dct_pass(vrshrn_n_s32, 10);

   // 16bit 8x8 transpose
   {
// these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
// whether compilers actually get this is another story, sadly.
#define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
#define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }

      // pass 1
      dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
      dct_trn16(row2, row3);
      dct_trn16(row4, row5);
      dct_trn16(row6, row7);

      // pass 2
      dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
      dct_trn32(row1, row3);
      dct_trn32(row4, row6);
      dct_trn32(row5, row7);

      // pass 3
      dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
      dct_trn64(row1, row5);
      dct_trn64(row2, row6);
      dct_trn64(row3, row7);

#undef dct_trn16
#undef dct_trn32
#undef dct_trn64
   }

   // row pass
   // vrshrn_n_s32 only supports shifts up to 16, we need
   // 17. so do a non-rounding shift of 16 first then follow
   // up with a rounding shift by 1.
   dct_pass(vshrn_n_s32, 16);

   {
      // pack and round
      uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
      uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
      uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
      uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
      uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
      uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
      uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
      uint8x8_t p7 = vqrshrun_n_s16(row7, 1);

      // again, these can translate into one instruction, but often don't.
#define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
#define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
#define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }

      // sadly can't use interleaved stores here since we only write
      // 8 bytes to each scan line!

      // 8x8 8-bit transpose pass 1
      dct_trn8_8(p0, p1);
      dct_trn8_8(p2, p3);
      dct_trn8_8(p4, p5);
      dct_trn8_8(p6, p7);

      // pass 2
      dct_trn8_16(p0, p2);
      dct_trn8_16(p1, p3);
      dct_trn8_16(p4, p6);
      dct_trn8_16(p5, p7);

      // pass 3
      dct_trn8_32(p0, p4);
      dct_trn8_32(p1, p5);
      dct_trn8_32(p2, p6);
      dct_trn8_32(p3, p7);

      // store
      vst1_u8(out, p0); out += out_stride;
      vst1_u8(out, p1); out += out_stride;
      vst1_u8(out, p2); out += out_stride;
      vst1_u8(out, p3); out += out_stride;
      vst1_u8(out, p4); out += out_stride;
      vst1_u8(out, p5); out += out_stride;
      vst1_u8(out, p6); out += out_stride;
      vst1_u8(out, p7);

#undef dct_trn8_8
#undef dct_trn8_16
#undef dct_trn8_32
   }

#undef dct_long_mul
#undef dct_long_mac
#undef dct_widen
#undef dct_wadd
#undef dct_wsub
#undef dct_bfly32o
#undef dct_pass
}

#endif // STBI_NEON

#define STBI__MARKER_none  0xff
// if there's a pending marker from the entropy stream, return that
// otherwise, fetch from the stream and get a marker. if there's no
// marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
{
   stbi_uc x;
   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   x = stbi__get8(j->s);
   if (x != 0xff) return STBI__MARKER_none;
   while (x == 0xff)
      x = stbi__get8(j->s);
   return x;
}

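/* Illustration only, not part of stb_image: a minimal sketch of the
   marker scan in stbi__get_marker above, run over a plain byte buffer.
   example_stream, example_get8 and example_get_marker are hypothetical
   names. The sketch leaves out the pending-marker check and assumes the
   byte reader returns 0 once the input is exhausted.

```c
#include <assert.h>

// Hypothetical in-memory byte source.
typedef struct {
   const unsigned char *p;    // next byte to read
   const unsigned char *end;  // one past the last byte
} example_stream;

// Read one byte, or 0 when no input is left (an assumption of this sketch).
static unsigned char example_get8(example_stream *s)
{
   return s->p < s->end ? *s->p++ : 0;
}

// Return the marker code that follows one or more 0xff bytes,
// or 0xff if the next byte does not start a marker.
static unsigned char example_get_marker(example_stream *s)
{
   unsigned char x = example_get8(s);
   if (x != 0xff) return 0xff;       // no marker here
   while (x == 0xff)                 // skip any 0xff fill bytes
      x = example_get8(s);
   assert(x != 0xff);
   return x;
}
```

   On the bytes ff ff d8 12 ff c4, three calls return 0xd8, then 0xff (the
   byte 0x12 is consumed and is not a marker), then 0xc4.
*/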
// in each scan, we'll have scan_n components, and the order
// of the components is specified by order[]
#define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)

// after a restart interval, reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
{
   j->code_bits = 0;
   j->code_buffer = 0;
   j->nomore = 0;
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   j->eob_run = 0;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
}

static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
{
   stbi__jpeg_reset(z);
   if (!z->progressive) {
      if (z->scan_n == 1) {
         int i,j;
         STBI_SIMD_ALIGN(short, data[64]);
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               int ha = z->img_comp[n].ha;
               if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
               // every data block is an MCU, so count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  // if it's NOT a restart, then just bail, so we get corrupt data
                  // rather than no data
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         STBI_SIMD_ALIGN(short, data[64]);
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x)*8;
                        int y2 = (j*z->img_comp[n].v + y)*8;
                        int ha = z->img_comp[n].ha;
                        if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
                        z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   } else {
      if (z->scan_n == 1) {
         int i,j;
         int n = z->order[0];
         // non-interleaved data, we just need to process one block at a time,
         // in trivial scanline order
         // number of blocks to do just depends on how many actual "pixels" this
         // component has, independent of interleaved MCU blocking and such
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               if (z->spec_start == 0) {
                  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                     return 0;
               } else {
                  int ha = z->img_comp[n].ha;
                  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
                     return 0;
               }
               // every data block is an MCU, so count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      } else { // interleaved
         int i,j,k,x,y;
         for (j=0; j < z->img_mcu_y; ++j) {
            for (i=0; i < z->img_mcu_x; ++i) {
               // scan an interleaved mcu... process scan_n components in order
               for (k=0; k < z->scan_n; ++k) {
                  int n = z->order[k];
                  // scan out an mcu's worth of this component; that's just determined
                  // by the basic H and V specified for the component
                  for (y=0; y < z->img_comp[n].v; ++y) {
                     for (x=0; x < z->img_comp[n].h; ++x) {
                        int x2 = (i*z->img_comp[n].h + x);
                        int y2 = (j*z->img_comp[n].v + y);
                        short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
                        if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
                           return 0;
                     }
                  }
               }
               // after all interleaved components, that's an interleaved MCU,
               // so now count down the restart interval
               if (--z->todo <= 0) {
                  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
                  if (!STBI__RESTART(z->marker)) return 1;
                  stbi__jpeg_reset(z);
               }
            }
         }
         return 1;
      }
   }
}

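/* Illustration only, not part of stb_image: a minimal sketch of the block
   arithmetic used by the non-interleaved loops above. The helper names
   are hypothetical; w2 stands for the row stride of a component plane, as
   in img_comp[n].w2.

```c
#include <assert.h>

// Number of 8-pixel blocks needed to cover a run of pixels, rounding up.
static int example_blocks_across(int pixels)
{
   assert(pixels > 0);
   return (pixels + 7) >> 3;
}

// Byte offset of the top-left pixel of block (i, j) in a plane whose
// rows are w2 bytes apart.
static int example_block_offset(int w2, int i, int j)
{
   return w2 * j * 8 + i * 8;
}
```

   A component 17 pixels wide therefore needs three blocks per row, and
   with a stride of 32 the block at column 1, row 2 starts at byte 520.
*/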
static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
{
   int i;
   for (i=0; i < 64; ++i)
      data[i] *= dequant[i];
}

static void stbi__jpeg_finish(stbi__jpeg *z)
{
   if (z->progressive) {
      // dequantize and idct the data
      int i,j,n;
      for (n=0; n < z->s->img_n; ++n) {
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
            }
         }
      }
   }
}

static int stbi__process_marker(stbi__jpeg *z, int m)
{
   int L;
   switch (m) {
      case STBI__MARKER_none: // no marker found
         return stbi__err("expected marker","Corrupt JPEG");

      case 0xDD: // DRI - specify restart interval
         if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
         z->restart_interval = stbi__get16be(z->s);
         return 1;

      case 0xDB: // DQT - define quantization table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            int q = stbi__get8(z->s);
            int p = q >> 4;
            int t = q & 15,i;
            if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
            if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
            for (i=0; i < 64; ++i)
               z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
            L -= 65;
         }
         return L==0;

      case 0xC4: // DHT - define huffman table
         L = stbi__get16be(z->s)-2;
         while (L > 0) {
            stbi_uc *v;
            int sizes[16],i,n=0;
            int q = stbi__get8(z->s);
            int tc = q >> 4;
            int th = q & 15;
            if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
            for (i=0; i < 16; ++i) {
               sizes[i] = stbi__get8(z->s);
               n += sizes[i];
            }
            L -= 17;
            if (tc == 0) {
               if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
               v = z->huff_dc[th].values;
            } else {
               if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
               v = z->huff_ac[th].values;
            }
            for (i=0; i < n; ++i)
               v[i] = stbi__get8(z->s);
            if (tc != 0)
               stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
            L -= n;
         }
         return L==0;
   }
   // check for comment block or APP blocks
   if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
      stbi__skip(z->s, stbi__get16be(z->s)-2);
      return 1;
   }
   return 0;
}

// after we see SOS
static int stbi__process_scan_header(stbi__jpeg *z)
{
   int i;
   int Ls = stbi__get16be(z->s);
   z->scan_n = stbi__get8(z->s);
   if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
   if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
   for (i=0; i < z->scan_n; ++i) {
      int id = stbi__get8(z->s), which;
      int q = stbi__get8(z->s);
      for (which = 0; which < z->s->img_n; ++which)
         if (z->img_comp[which].id == id)
            break;
      if (which == z->s->img_n) return 0; // no match
      z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
      z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
      z->order[i] = which;
   }

   {
      int aa;
      z->spec_start = stbi__get8(z->s);
      z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
      aa = stbi__get8(z->s);
      z->succ_high = (aa >> 4);
      z->succ_low  = (aa & 15);
      if (z->progressive) {
         if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
            return stbi__err("bad SOS", "Corrupt JPEG");
      } else {
         if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
         if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
         z->spec_end = 63;
      }
   }

   return 1;
}

static int stbi__process_frame_header(stbi__jpeg *z, int scan)
{
   stbi__context *s = z->s;
   int Lf,p,i,q, h_max=1,v_max=1,c;
   Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
   p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
   c = stbi__get8(s);
   if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG");    // JFIF requires
   s->img_n = c;
   for (i=0; i < c; ++i) {
      z->img_comp[i].data = NULL;
      z->img_comp[i].linebuf = NULL;
   }

   if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");

   z->rgb = 0;
   for (i=0; i < s->img_n; ++i) {
      static unsigned char rgb[3] = { 'R', 'G', 'B' };
      z->img_comp[i].id = stbi__get8(s);
      if (z->img_comp[i].id != i+1)   // JFIF requires
         if (z->img_comp[i].id != i) {  // some version of jpegtran outputs non-JFIF-compliant files!
            // some things output this (see http://fileformats.archiveteam.org/wiki/JPEG#Color_format)
            if (z->img_comp[i].id != rgb[i])
               return stbi__err("bad component ID","Corrupt JPEG");
            ++z->rgb;
         }
      q = stbi__get8(s);
      z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
      z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
      z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
   }

   if (scan != STBI__SCAN_load) return 1;

   if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");

   for (i=0; i < s->img_n; ++i) {
      if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
      if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   }

   // compute interleaved mcu info
   z->img_h_max = h_max;
   z->img_v_max = v_max;
   z->img_mcu_w = h_max * 8;
   z->img_mcu_h = v_max * 8;
   z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;

   for (i=0; i < s->img_n; ++i) {
      // number of effective pixels (e.g. for non-interleaved MCU)
      z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
      z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
      // to simplify generation, we'll allocate enough memory to decode
      // the bogus oversized data from using interleaved MCUs and their
      // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
      // discard the extra data until colorspace conversion
      z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
      z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
      z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);

      if (z->img_comp[i].raw_data == NULL) {
         for(--i; i >= 0; --i) {
            STBI_FREE(z->img_comp[i].raw_data);
            z->img_comp[i].raw_data = NULL;
         }
         return stbi__err("outofmem", "Out of memory");
      }
      // align blocks for idct using mmx/sse
      z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
      z->img_comp[i].linebuf = NULL;
      if (z->progressive) {
         z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
         z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
         z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
         // don't derive an aligned pointer from a failed allocation;
         // buffers allocated so far are released by stbi__cleanup_jpeg in the caller
         if (z->img_comp[i].raw_coeff == NULL) return stbi__err("outofmem", "Out of memory");
         z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
      } else {
         z->img_comp[i].coeff = 0;
         z->img_comp[i].raw_coeff = 0;
      }
   }

   return 1;
}
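
/* Illustrative sketch only, not part of the library: the padded-plane
   arithmetic above can be checked in isolation. The helper names
   demo_padded_width and demo_effective_width are hypothetical; they mirror
   the w2 and x computations for one component, assuming a horizontal
   sampling factor h and a maximum factor h_max taken from the frame header.

```c
#include <assert.h>

// Padded width of one component plane: whole MCUs per row, rounded up,
// times this component's blocks per MCU, times 8 pixels per block.
int demo_padded_width(int img_x, int h, int h_max)
{
   int mcu_w, mcu_x;
   assert(img_x > 0 && h >= 1 && h <= h_max);
   mcu_w = h_max * 8;                      // MCU width in pixels
   mcu_x = (img_x + mcu_w - 1) / mcu_w;    // MCUs per row, rounded up
   return mcu_x * h * 8;                   // same formula as w2
}

// Effective (unpadded) width of the same plane, as for img_comp[i].x.
int demo_effective_width(int img_x, int h, int h_max)
{
   assert(img_x > 0 && h >= 1 && h <= h_max);
   return (img_x * h + h_max - 1) / h_max;
}
```

   For the width-33 case mentioned in the comment above, with h_max == 2,
   the full-resolution plane (h == 2) is padded to 48 pixels and the
   subsampled plane (h == 1) to 24, while their effective widths are 33
   and 17.
*/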

// use comparisons since in some cases we handle more than one case (e.g. SOF)
#define stbi__DNL(x)         ((x) == 0xdc)
#define stbi__SOI(x)         ((x) == 0xd8)
#define stbi__EOI(x)         ((x) == 0xd9)
#define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
#define stbi__SOS(x)         ((x) == 0xda)

#define stbi__SOF_progressive(x)   ((x) == 0xc2)

static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
{
   int m;
   z->marker = STBI__MARKER_none; // initialize cached marker to empty
   m = stbi__get_marker(z);
   if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
   if (scan == STBI__SCAN_type) return 1;
   m = stbi__get_marker(z);
   while (!stbi__SOF(m)) {
      if (!stbi__process_marker(z,m)) return 0;
      m = stbi__get_marker(z);
      while (m == STBI__MARKER_none) {
         // some files have extra padding after their blocks, so ok, we'll scan
         if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
         m = stbi__get_marker(z);
      }
   }
   z->progressive = stbi__SOF_progressive(m);
   if (!stbi__process_frame_header(z, scan)) return 0;
   return 1;
}

// decode image to YCbCr format
static int stbi__decode_jpeg_image(stbi__jpeg *j)
{
   int m;
   for (m = 0; m < 4; m++) {
      j->img_comp[m].raw_data = NULL;
      j->img_comp[m].raw_coeff = NULL;
   }
   j->restart_interval = 0;
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
   m = stbi__get_marker(j);
   while (!stbi__EOI(m)) {
      if (stbi__SOS(m)) {
         if (!stbi__process_scan_header(j)) return 0;
         if (!stbi__parse_entropy_coded_data(j)) return 0;
         if (j->marker == STBI__MARKER_none ) {
            // handle 0s at the end of image data from IP Kamera 9060
            while (!stbi__at_eof(j->s)) {
               int x = stbi__get8(j->s);
               if (x == 255) {
                  j->marker = stbi__get8(j->s);
                  break;
               } else if (x != 0) {
                  return stbi__err("junk before marker", "Corrupt JPEG");
               }
            }
            // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
         }
      } else {
         if (!stbi__process_marker(j, m)) return 0;
      }
      m = stbi__get_marker(j);
   }
   if (j->progressive)
      stbi__jpeg_finish(j);
   return 1;
}

// static jfif-centered resampling (across block boundaries)

typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
                                    int w, int hs);

#define stbi__div4(x) ((stbi_uc) ((x) >> 2))

static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   STBI_NOTUSED(out);
   STBI_NOTUSED(in_far);
   STBI_NOTUSED(w);
   STBI_NOTUSED(hs);
   return in_near;
}

static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples vertically for every one in input
   int i;
   STBI_NOTUSED(hs);
   for (i=0; i < w; ++i)
      out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
   return out;
}
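
/* Illustrative sketch only, not part of the library: the vertical 2x filter
   above weights the nearer source row 3/4 and the farther one 1/4, with +2
   for rounding before the divide by 4. The standalone function
   demo_upsample_v2_row is a hypothetical copy that uses plain unsigned char
   in place of stbi_uc so it can be compiled on its own.

```c
#include <assert.h>

// One output row of 2x vertical upsampling:
// out = (3*near + far + 2) / 4, computed per sample.
void demo_upsample_v2_row(unsigned char *out,
                          const unsigned char *near_row,
                          const unsigned char *far_row, int w)
{
   int i;
   assert(w >= 0);
   for (i = 0; i < w; ++i)
      out[i] = (unsigned char) ((3*near_row[i] + far_row[i] + 2) >> 2);
}
```

   With near_row = {100, 10} and far_row = {0, 20} the output is {75, 13}.
   Calling it again with the two rows swapped gives the other output row
   of the pair.
*/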

static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate two samples horizontally for every one in input
   int i;
   stbi_uc *input = in_near;

   if (w == 1) {
      // if only one sample, can't do any interpolation
      out[0] = out[1] = input[0];
      return out;
   }

   out[0] = input[0];
   out[1] = stbi__div4(input[0]*3 + input[1] + 2);
   for (i=1; i < w-1; ++i) {
      int n = 3*input[i]+2;
      out[i*2+0] = stbi__div4(n+input[i-1]);
      out[i*2+1] = stbi__div4(n+input[i+1]);
   }
   out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
   out[i*2+1] = input[w-1];

   STBI_NOTUSED(in_far);
   STBI_NOTUSED(hs);

   return out;
}

#define stbi__div16(x) ((stbi_uc) ((x) >> 4))

static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i,t0,t1;
   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   out[0] = stbi__div4(t1+2);
   for (i=1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}

#if defined(STBI_SSE2) || defined(STBI_NEON)
static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // need to generate 2x2 samples for every one in input
   int i=0,t0,t1;

   if (w == 1) {
      out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
      return out;
   }

   t1 = 3*in_near[0] + in_far[0];
   // process groups of 8 pixels for as long as we can.
   // note we can't handle the last pixel in a row in this loop
   // because we need to handle the filter boundary conditions.
   for (; i < ((w-1) & ~7); i += 8) {
#if defined(STBI_SSE2)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      __m128i zero = _mm_setzero_si128();
      __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
      __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
      __m128i farw = _mm_unpacklo_epi8(farb, zero);
      __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
      __m128i diff = _mm_sub_epi16(farw, nearw);
      __m128i nears = _mm_slli_epi16(nearw, 2);
      __m128i curr = _mm_add_epi16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      __m128i prv0 = _mm_slli_si128(curr, 2);
      __m128i nxt0 = _mm_srli_si128(curr, 2);
      __m128i prev = _mm_insert_epi16(prv0, t1, 0);
      __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      __m128i bias = _mm_set1_epi16(8);
      __m128i curs = _mm_slli_epi16(curr, 2);
      __m128i prvd = _mm_sub_epi16(prev, curr);
      __m128i nxtd = _mm_sub_epi16(next, curr);
      __m128i curb = _mm_add_epi16(curs, bias);
      __m128i even = _mm_add_epi16(prvd, curb);
      __m128i odd = _mm_add_epi16(nxtd, curb);

      // interleave even and odd pixels, then undo scaling.
      __m128i int0 = _mm_unpacklo_epi16(even, odd);
      __m128i int1 = _mm_unpackhi_epi16(even, odd);
      __m128i de0 = _mm_srli_epi16(int0, 4);
      __m128i de1 = _mm_srli_epi16(int1, 4);

      // pack and write output
      __m128i outv = _mm_packus_epi16(de0, de1);
      _mm_storeu_si128((__m128i *) (out + i*2), outv);
#elif defined(STBI_NEON)
      // load and perform the vertical filtering pass
      // this uses 3*x + y = 4*x + (y - x)
      uint8x8_t farb = vld1_u8(in_far + i);
      uint8x8_t nearb = vld1_u8(in_near + i);
      int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
      int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
      int16x8_t curr = vaddq_s16(nears, diff); // current row

      // horizontal filter works the same based on shifted vers of current
      // row. "prev" is current row shifted right by 1 pixel; we need to
      // insert the previous pixel value (from t1).
      // "next" is current row shifted left by 1 pixel, with first pixel
      // of next block of 8 pixels added in.
      int16x8_t prv0 = vextq_s16(curr, curr, 7);
      int16x8_t nxt0 = vextq_s16(curr, curr, 1);
      int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
      int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);

      // horizontal filter, polyphase implementation since it's convenient:
      // even pixels = 3*cur + prev = cur*4 + (prev - cur)
      // odd  pixels = 3*cur + next = cur*4 + (next - cur)
      // note the shared term.
      int16x8_t curs = vshlq_n_s16(curr, 2);
      int16x8_t prvd = vsubq_s16(prev, curr);
      int16x8_t nxtd = vsubq_s16(next, curr);
      int16x8_t even = vaddq_s16(curs, prvd);
      int16x8_t odd = vaddq_s16(curs, nxtd);

      // undo scaling and round, then store with even/odd phases interleaved
      uint8x8x2_t o;
      o.val[0] = vqrshrun_n_s16(even, 4);
      o.val[1] = vqrshrun_n_s16(odd, 4);
      vst2_u8(out + i*2, o);
#endif

      // "previous" value for next iter
      t1 = 3*in_near[i+7] + in_far[i+7];
   }

   t0 = t1;
   t1 = 3*in_near[i] + in_far[i];
   out[i*2] = stbi__div16(3*t1 + t0 + 8);

   for (++i; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i]+in_far[i];
      out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
      out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = stbi__div4(t1+2);

   STBI_NOTUSED(hs);

   return out;
}
#endif

static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
{
   // resample with nearest-neighbor
   int i,j;
   STBI_NOTUSED(in_far);
   for (i=0; i < w; ++i)
      for (j=0; j < hs; ++j)
         out[i*hs+j] = in_near[i];
   return out;
}

#ifdef STBI_JPEG_OLD
// this is the same YCbCr-to-RGB calculation that stb_image has used
// historically before the algorithm changes in 1.49
#define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
{
   int i;
   for (i=0; i < count; ++i) {
      int y_fixed = (y[i] << 16) + 32768; // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr*float2fixed(1.40200f);
      g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
      b = y_fixed + cb*float2fixed(1.77200f);
      r >>= 16;
      g >>= 16;
      b >>= 16;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#else
// this is a reduced-precision calculation of YCbCr-to-RGB introduced
// to make sure the code produces the same results in both SIMD and scalar
#define float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
{
   int i;
   for (i=0; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed +  cr* float2fixed(1.40200f);
      g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                               +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif
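
/* Illustrative sketch only, not part of the library: a single-pixel version
   of the reduced-precision conversion above (the non-STBI_JPEG_OLD path).
   The names DEMO_FLOAT2FIXED, demo_clamp255 and demo_ycbcr_to_rgb are
   hypothetical. The clamp is written as a plain range check rather than
   the unsigned-compare trick, which should give the same result, and the
   code assumes a 32-bit int with arithmetic right shift, as the library
   code does.

```c
#include <assert.h>

// 12-bit coefficient, pre-shifted by 8 so products line up with y << 20.
#define DEMO_FLOAT2FIXED(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

static int demo_clamp255(int v)
{
   return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Convert one YCbCr sample (each 0..255) to RGB in 12.20 fixed point.
void demo_ycbcr_to_rgb(unsigned char rgb[3], int y, int cb, int cr)
{
   int y_fixed, r, g, b;
   assert(y >= 0 && y <= 255 && cb >= 0 && cb <= 255 && cr >= 0 && cr <= 255);
   y_fixed = (y << 20) + (1 << 19);   // luma plus 0.5 for rounding
   cr -= 128;                         // chroma is stored with a +128 bias
   cb -= 128;
   r = y_fixed + cr *  DEMO_FLOAT2FIXED(1.40200f);
   g = y_fixed + cr * -DEMO_FLOAT2FIXED(0.71414f)
               + ((cb * -DEMO_FLOAT2FIXED(0.34414f)) & 0xffff0000);
   b = y_fixed + cb *  DEMO_FLOAT2FIXED(1.77200f);
   rgb[0] = (unsigned char) demo_clamp255(r >> 20);
   rgb[1] = (unsigned char) demo_clamp255(g >> 20);
   rgb[2] = (unsigned char) demo_clamp255(b >> 20);
}
```

   Neutral chroma (cb == cr == 128) leaves the luma value unchanged in all
   three channels, and a strongly red sample such as (128, 128, 255)
   saturates the red channel at 255.
*/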

#if defined(STBI_SSE2) || defined(STBI_NEON)
static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
{
   int i = 0;

#ifdef STBI_SSE2
   // step == 3 is pretty ugly on the final interleave, and i'm not convinced
   // it's useful in practice (you wouldn't use it for textures, for example).
   // so just accelerate step == 4 case.
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      __m128i signflip = _mm_set1_epi8(-0x80);
      __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
      __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
      __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
      __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
      __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
      __m128i xw = _mm_set1_epi16(255); // alpha channel

      for (; i+7 < count; i += 8) {
         // load
         __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
         __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
         __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
         __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
         __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128

         // unpack to short (and left-shift cr, cb by 8)
         __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
         __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
         __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);

         // color transform
         __m128i yws = _mm_srli_epi16(yw, 4);
         __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
         __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
         __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
         __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
         __m128i rws = _mm_add_epi16(cr0, yws);
         __m128i gwt = _mm_add_epi16(cb0, yws);
         __m128i bws = _mm_add_epi16(yws, cb1);
         __m128i gws = _mm_add_epi16(gwt, cr1);

         // descale
         __m128i rw = _mm_srai_epi16(rws, 4);
         __m128i bw = _mm_srai_epi16(bws, 4);
         __m128i gw = _mm_srai_epi16(gws, 4);

         // back to byte, set up for transpose
         __m128i brb = _mm_packus_epi16(rw, bw);
         __m128i gxb = _mm_packus_epi16(gw, xw);

         // transpose to interleave channels
         __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
         __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
         __m128i o0 = _mm_unpacklo_epi16(t0, t1);
         __m128i o1 = _mm_unpackhi_epi16(t0, t1);

         // store
         _mm_storeu_si128((__m128i *) (out + 0), o0);
         _mm_storeu_si128((__m128i *) (out + 16), o1);
         out += 32;
      }
   }
#endif

#ifdef STBI_NEON
   // in this version, step=3 support would be easy to add. but is there demand?
   if (step == 4) {
      // this is a fairly straightforward implementation and not super-optimized.
      uint8x8_t signflip = vdup_n_u8(0x80);
      int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
      int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
      int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
      int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));

      for (; i+7 < count; i += 8) {
         // load
         uint8x8_t y_bytes = vld1_u8(y + i);
         uint8x8_t cr_bytes = vld1_u8(pcr + i);
         uint8x8_t cb_bytes = vld1_u8(pcb + i);
         int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
         int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));

         // expand to s16
         int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
         int16x8_t crw = vshll_n_s8(cr_biased, 7);
         int16x8_t cbw = vshll_n_s8(cb_biased, 7);

         // color transform
         int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
         int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
         int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
         int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
         int16x8_t rws = vaddq_s16(yws, cr0);
         int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
         int16x8_t bws = vaddq_s16(yws, cb1);

         // undo scaling, round, convert to byte
         uint8x8x4_t o;
         o.val[0] = vqrshrun_n_s16(rws, 4);
         o.val[1] = vqrshrun_n_s16(gws, 4);
         o.val[2] = vqrshrun_n_s16(bws, 4);
         o.val[3] = vdup_n_u8(255);

         // store, interleaving r/g/b/a
         vst4_u8(out, o);
         out += 8*4;
      }
   }
#endif

   for (; i < count; ++i) {
      int y_fixed = (y[i] << 20) + (1<<19); // rounding
      int r,g,b;
      int cr = pcr[i] - 128;
      int cb = pcb[i] - 128;
      r = y_fixed + cr* float2fixed(1.40200f);
      g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
      b = y_fixed                             +   cb* float2fixed(1.77200f);
      r >>= 20;
      g >>= 20;
      b >>= 20;
      if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
      if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
      if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
      out[0] = (stbi_uc)r;
      out[1] = (stbi_uc)g;
      out[2] = (stbi_uc)b;
      out[3] = 255;
      out += step;
   }
}
#endif

// set up the kernels
static void stbi__setup_jpeg(stbi__jpeg *j)
{
   j->idct_block_kernel = stbi__idct_block;
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;

#ifdef STBI_SSE2
   if (stbi__sse2_available()) {
      j->idct_block_kernel = stbi__idct_simd;
      #ifndef STBI_JPEG_OLD
      j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
      #endif
      j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
   }
#endif

#ifdef STBI_NEON
   j->idct_block_kernel = stbi__idct_simd;
   #ifndef STBI_JPEG_OLD
   j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
   #endif
   j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
#endif
}

// clean up the temporary component buffers
static void stbi__cleanup_jpeg(stbi__jpeg *j)
{
   int i;
   for (i=0; i < j->s->img_n; ++i) {
      if (j->img_comp[i].raw_data) {
         STBI_FREE(j->img_comp[i].raw_data);
         j->img_comp[i].raw_data = NULL;
         j->img_comp[i].data = NULL;
      }
      if (j->img_comp[i].raw_coeff) {
         STBI_FREE(j->img_comp[i].raw_coeff);
         j->img_comp[i].raw_coeff = 0;
         j->img_comp[i].coeff = 0;
      }
      if (j->img_comp[i].linebuf) {
         STBI_FREE(j->img_comp[i].linebuf);
         j->img_comp[i].linebuf = NULL;
      }
   }
}

typedef struct
{
   resample_row_func resample;
   stbi_uc *line0,*line1;
   int hs,vs;   // expansion factor in each axis
   int w_lores; // horizontal pixels pre-expansion
   int ystep;   // how far through vertical expansion we are
   int ypos;    // which pre-expansion row we're on
} stbi__resample;

static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
{
   int n, decode_n;
   z->s->img_n = 0; // make stbi__cleanup_jpeg safe

   // validate req_comp
   if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");

   // load a jpeg image from whichever source, but leave in YCbCr format
   if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }

   // determine actual number of components to generate
   n = req_comp ? req_comp : z->s->img_n;

   if (z->s->img_n == 3 && n < 3)
      decode_n = 1;
   else
      decode_n = z->s->img_n;

   // resample and color-convert
   {
      int k;
      unsigned int i,j;
      stbi_uc *output;
      stbi_uc *coutput[4];

      stbi__resample res_comp[4];

      for (k=0; k < decode_n; ++k) {
         stbi__resample *r = &res_comp[k];

         // allocate line buffer big enough for upsampling off the edges
         // with upsample factor of 4
         z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
         if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

         r->hs      = z->img_h_max / z->img_comp[k].h;
         r->vs      = z->img_v_max / z->img_comp[k].v;
         r->ystep   = r->vs >> 1;
         r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
         r->ypos    = 0;
         r->line0   = r->line1 = z->img_comp[k].data;

         if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
         else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
         else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
         else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
         else                               r->resample = stbi__resample_row_generic;
      }

      // can't error after this so, this is safe
      output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
      if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }

      // now go ahead and resample
      for (j=0; j < z->s->img_y; ++j) {
         stbi_uc *out = output + n * z->s->img_x * j;
         for (k=0; k < decode_n; ++k) {
            stbi__resample *r = &res_comp[k];
            int y_bot = r->ystep >= (r->vs >> 1);
            coutput[k] = r->resample(z->img_comp[k].linebuf,
                                     y_bot ? r->line1 : r->line0,
                                     y_bot ? r->line0 : r->line1,
                                     r->w_lores, r->hs);
            if (++r->ystep >= r->vs) {
               r->ystep = 0;
               r->line0 = r->line1;
               if (++r->ypos < z->img_comp[k].y)
                  r->line1 += z->img_comp[k].w2;
            }
         }
         if (n >= 3) {
            stbi_uc *y = coutput[0];
            if (z->s->img_n == 3) {
               if (z->rgb == 3) {
                  for (i=0; i < z->s->img_x; ++i) {
                     out[0] = y[i];
                     out[1] = coutput[1][i];
                     out[2] = coutput[2][i];
                     out[3] = 255;
                     out += n;
                  }
               } else {
                  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
               }
            } else
               for (i=0; i < z->s->img_x; ++i) {
                  out[0] = out[1] = out[2] = y[i];
                  out[3] = 255; // not used if n==3
                  out += n;
               }
         } else {
            stbi_uc *y = coutput[0];
            if (n == 1)
               for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
            else
               for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
         }
      }
      stbi__cleanup_jpeg(z);
      *out_x = z->s->img_x;
      *out_y = z->s->img_y;
      if (comp) *comp = z->s->img_n; // report original components, not output
      return output;
   }
}

static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   unsigned char* result;
   stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
   if (!j) return stbi__errpuc("outofmem", "Out of memory"); // don't dereference a failed allocation
   j->s = s;
   stbi__setup_jpeg(j);
   result = load_jpeg_image(j, x,y,comp,req_comp);
   STBI_FREE(j);
   return result;
}

static int stbi__jpeg_test(stbi__context *s)
{
   int r;
   stbi__jpeg j;
   j.s = s;
   stbi__setup_jpeg(&j);
   r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
   stbi__rewind(s);
   return r;
}

static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
{
   if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
      stbi__rewind( j->s );
      return 0;
   }
   if (x) *x = j->s->img_x;
   if (y) *y = j->s->img_y;
   if (comp) *comp = j->s->img_n;
   return 1;
}

static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
{
   int result;
   stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
   if (!j) return stbi__err("outofmem", "Out of memory"); // don't dereference a failed allocation
   j->s = s;
   result = stbi__jpeg_info_raw(j, x, y, comp);
   STBI_FREE(j);
   return result;
}
#endif

// public domain zlib decode    v0.2  Sean Barrett 2006-11-18
//    simple implementation
//      - all input must be provided in an upfront buffer
//      - all output is written to a single output buffer (can malloc/realloc)
//    performance
//      - fast huffman

#ifndef STBI_NO_ZLIB

// fast-way is faster to check than jpeg huffman, but slow way is slower
#define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
#define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)

// zlib-style huffman encoding
// (jpeg packs from left, zlib from right, so can't share code)
typedef struct
{
   stbi__uint16 fast[1 << STBI__ZFAST_BITS];
   stbi__uint16 firstcode[16];
   int maxcode[17];
   stbi__uint16 firstsymbol[16];
   stbi_uc  size[288];
   stbi__uint16 value[288];
} stbi__zhuffman;

stbi_inline static int stbi__bitreverse16(int n)
3508 {
3509 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3510 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3511 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3512 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3513 return n;
3514 }
3515
stbi_inline static int stbi__bit_reverse(int v, int bits)
3517 {
3518 STBI_ASSERT(bits <= 16);
3519 // to bit reverse n bits, reverse 16 and shift
3520 // e.g. 11 bits, bit reverse and shift away 5
3521 return stbi__bitreverse16(v) >> (16-bits);
3522 }
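
/* Worked example (illustrative only, not part of stb_image): the two helpers
above reverse an n-bit code by reversing all 16 bits and shifting the unused
low bits away. The sketch below restates that idea with hypothetical
example_ names and unsigned arithmetic so it can be compiled on its own.

```c
#include <assert.h>

// Reverse all 16 bits by swapping progressively larger groups:
// adjacent bits, then pairs, then nibbles, then bytes.
static unsigned example_bitreverse16(unsigned n)
{
   n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
   n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
   n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
   n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
   return n;
}

// Reverse only the low `bits` bits of v (1..16).
static unsigned example_bit_reverse(unsigned v, int bits)
{
   assert(bits >= 1 && bits <= 16);
   return example_bitreverse16(v) >> (16 - bits);
}
```

For instance, the 3-bit code 110 (0x6) reverses to 011 (0x3), and the 11-bit
value 0x005 reverses to 0x500.
*/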
3523
static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3525 {
3526 int i,k=0;
3527 int code, next_code[16], sizes[17];
3528
3529 // DEFLATE spec for generating codes
3530 memset(sizes, 0, sizeof(sizes));
3531 memset(z->fast, 0, sizeof(z->fast));
3532 for (i=0; i < num; ++i)
3533 ++sizes[sizelist[i]];
3534 sizes[0] = 0;
3535 for (i=1; i < 16; ++i)
3536 if (sizes[i] > (1 << i))
3537 return stbi__err("bad sizes", "Corrupt PNG");
3538 code = 0;
3539 for (i=1; i < 16; ++i) {
3540 next_code[i] = code;
3541 z->firstcode[i] = (stbi__uint16) code;
3542 z->firstsymbol[i] = (stbi__uint16) k;
3543 code = (code + sizes[i]);
3544 if (sizes[i])
3545 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3546 z->maxcode[i] = code << (16-i); // preshift for inner loop
3547 code <<= 1;
3548 k += sizes[i];
3549 }
3550 z->maxcode[16] = 0x10000; // sentinel
3551 for (i=0; i < num; ++i) {
3552 int s = sizelist[i];
3553 if (s) {
3554 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3555 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3556 z->size [c] = (stbi_uc ) s;
3557 z->value[c] = (stbi__uint16) i;
3558 if (s <= STBI__ZFAST_BITS) {
3559 int j = stbi__bit_reverse(next_code[s],s);
3560 while (j < (1 << STBI__ZFAST_BITS)) {
3561 z->fast[j] = fastv;
3562 j += (1 << s);
3563 }
3564 }
3565 ++next_code[s];
3566 }
3567 }
3568 return 1;
3569 }
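
/* Worked example (illustrative only, not part of stb_image): the code
assignment at the heart of stbi__zbuild_huffman follows the canonical
Huffman rule from the DEFLATE spec (RFC 1951, section 3.2.2). The sketch
below shows only that step, without the fast lookup table; the function
name and signature are hypothetical.

```c
#include <assert.h>
#include <string.h>

// Assign canonical codes to `num` symbols.
// lengths[i] is the code length of symbol i (0 = unused, at most 15).
// codes[i] is written only for symbols with a nonzero length.
// Returns 1 on success, 0 if the lengths oversubscribe the code space.
static int example_assign_codes(const unsigned char *lengths, int num, unsigned short *codes)
{
   int count[16], next_code[16], i, code = 0;
   memset(count, 0, sizeof(count));
   for (i = 0; i < num; ++i) {
      assert(lengths[i] <= 15);
      ++count[lengths[i]];
   }
   count[0] = 0;
   for (i = 1; i < 16; ++i) {
      next_code[i] = code;            // first code of length i
      code += count[i];
      if (code > (1 << i)) return 0;  // more codes of length i than fit
      code <<= 1;
   }
   for (i = 0; i < num; ++i)
      if (lengths[i]) codes[i] = (unsigned short) next_code[lengths[i]]++;
   return 1;
}
```

With the lengths from the RFC's own example, {3,3,3,3,3,2,4,4}, this yields
the codes 010, 011, 100, 101, 110, 00, 1110, 1111.
*/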
3570
3571 // zlib-from-memory implementation for PNG reading
3572 // because PNG allows splitting the zlib stream arbitrarily,
3573 // and it's annoying structurally to have PNG call ZLIB call PNG,
3574 // we require PNG read all the IDATs and combine them into a single
3575 // memory buffer
3576
3577 typedef struct
3578 {
3579 stbi_uc *zbuffer, *zbuffer_end;
3580 int num_bits;
3581 stbi__uint32 code_buffer;
3582
3583 char *zout;
3584 char *zout_start;
3585 char *zout_end;
3586 int z_expandable;
3587
3588 stbi__zhuffman z_length, z_distance;
3589 } stbi__zbuf;
3590
stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3592 {
3593 if (z->zbuffer >= z->zbuffer_end) return 0;
3594 return *z->zbuffer++;
3595 }
3596
static void stbi__fill_bits(stbi__zbuf *z)
3598 {
3599 do {
3600 STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3601 z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
3602 z->num_bits += 8;
3603 } while (z->num_bits <= 24);
3604 }
3605
stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3607 {
3608 unsigned int k;
3609 if (z->num_bits < n) stbi__fill_bits(z);
3610 k = z->code_buffer & ((1 << n) - 1);
3611 z->code_buffer >>= n;
3612 z->num_bits -= n;
3613 return k;
3614 }
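
/* Worked example (illustrative only, not part of stb_image): zlib packs bits
starting from the least significant bit of each byte, which is why
stbi__zreceive masks off the low n bits and then shifts the buffer right.
The sketch below is a stripped-down reader built on the same idea; the
struct and function names are hypothetical, and it assumes unsigned int has
at least 32 bits.

```c
#include <assert.h>

typedef struct
{
   const unsigned char *p, *end; // remaining input
   unsigned buf;                 // unread bits, next bit in bit 0
   int nbits;                    // number of valid bits in buf
} example_bits;

// Read n bits (1..16), least significant bit first.
static unsigned example_receive(example_bits *b, int n)
{
   unsigned k;
   assert(n >= 1 && n <= 16);
   while (b->nbits < n) {
      unsigned byte = (b->p < b->end) ? *b->p++ : 0; // reads past the end yield 0
      b->buf |= byte << b->nbits;
      b->nbits += 8;
   }
   k = b->buf & ((1u << n) - 1);
   b->buf >>= n;
   b->nbits -= n;
   return k;
}
```

Fed the single byte 0x05, the first 1-bit read returns 1 (the "final block"
flag) and the following 2-bit read returns 2 (dynamic Huffman block type).
*/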
3615
static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3617 {
3618 int b,s,k;
3619 // not resolved by fast table, so compute it the slow way
3620 // use jpeg approach, which requires MSbits at top
3621 k = stbi__bit_reverse(a->code_buffer, 16);
3622 for (s=STBI__ZFAST_BITS+1; ; ++s)
3623 if (k < z->maxcode[s])
3624 break;
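   if (s >= 16) return -1; // invalid code!
   // code size is s, so:
   b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   if (b < 0 || b >= (int) sizeof(z->size)) return -1; // corrupt data can produce an index outside the tables
   if (z->size[b] != s) return -1; // report failure instead of asserting; corrupt data can reach this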
3629 a->code_buffer >>= s;
3630 a->num_bits -= s;
3631 return z->value[b];
3632 }
3633
stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3635 {
3636 int b,s;
3637 if (a->num_bits < 16) stbi__fill_bits(a);
3638 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3639 if (b) {
3640 s = b >> 9;
3641 a->code_buffer >>= s;
3642 a->num_bits -= s;
3643 return b & 511;
3644 }
3645 return stbi__zhuffman_decode_slowpath(a, z);
3646 }
3647
static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3649 {
3650 char *q;
3651 int cur, limit, old_limit;
3652 z->zout = zout;
3653 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3654 cur = (int) (z->zout - z->zout_start);
3655 limit = old_limit = (int) (z->zout_end - z->zout_start);
3656 while (cur + n > limit)
3657 limit *= 2;
3658 q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
3659 STBI_NOTUSED(old_limit);
3660 if (q == NULL) return stbi__err("outofmem", "Out of memory");
3661 z->zout_start = q;
3662 z->zout = q + cur;
3663 z->zout_end = q + limit;
3664 return 1;
3665 }
3666
3667 static int stbi__zlength_base[31] = {
3668 3,4,5,6,7,8,9,10,11,13,
3669 15,17,19,23,27,31,35,43,51,59,
3670 67,83,99,115,131,163,195,227,258,0,0 };
3671
3672 static int stbi__zlength_extra[31]=
3673 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3674
3675 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3676 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3677
3678 static int stbi__zdist_extra[32] =
3679 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3680
static int stbi__parse_huffman_block(stbi__zbuf *a)
3682 {
3683 char *zout = a->zout;
3684 for(;;) {
3685 int z = stbi__zhuffman_decode(a, &a->z_length);
3686 if (z < 256) {
3687 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3688 if (zout >= a->zout_end) {
3689 if (!stbi__zexpand(a, zout, 1)) return 0;
3690 zout = a->zout;
3691 }
3692 *zout++ = (char) z;
3693 } else {
3694 stbi_uc *p;
3695 int len,dist;
3696 if (z == 256) {
3697 a->zout = zout;
3698 return 1;
3699 }
3700 z -= 257;
3701 len = stbi__zlength_base[z];
3702 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3703 z = stbi__zhuffman_decode(a, &a->z_distance);
3704 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3705 dist = stbi__zdist_base[z];
3706 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3707 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3708 if (zout + len > a->zout_end) {
3709 if (!stbi__zexpand(a, zout, len)) return 0;
3710 zout = a->zout;
3711 }
3712 p = (stbi_uc *) (zout - dist);
3713 if (dist == 1) { // run of one byte; common in images.
3714 stbi_uc v = *p;
3715 if (len) { do *zout++ = v; while (--len); }
3716 } else {
3717 if (len) { do *zout++ = *p++; while (--len); }
3718 }
3719 }
3720 }
3721 }
3722
static int stbi__compute_huffman_codes(stbi__zbuf *a)
3724 {
3725 static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3726 stbi__zhuffman z_codelength;
3727 stbi_uc lencodes[286+32+137];//padding for maximum single op
3728 stbi_uc codelength_sizes[19];
3729 int i,n;
3730
3731 int hlit = stbi__zreceive(a,5) + 257;
3732 int hdist = stbi__zreceive(a,5) + 1;
3733 int hclen = stbi__zreceive(a,4) + 4;
3734
3735 memset(codelength_sizes, 0, sizeof(codelength_sizes));
3736 for (i=0; i < hclen; ++i) {
3737 int s = stbi__zreceive(a,3);
3738 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3739 }
3740 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3741
3742 n = 0;
3743 while (n < hlit + hdist) {
3744 int c = stbi__zhuffman_decode(a, &z_codelength);
3745 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3746 if (c < 16)
3747 lencodes[n++] = (stbi_uc) c;
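      else if (c == 16) {
         c = stbi__zreceive(a,2)+3;
         // code 16 repeats the previous length, so there must be one, and
         // the run must not extend past the hlit+hdist entries we expect
         if (n == 0 || c > hlit+hdist-n) return stbi__err("bad codelengths", "Corrupt PNG");
         memset(lencodes+n, lencodes[n-1], c);
         n += c;
      } else if (c == 17) {
         c = stbi__zreceive(a,3)+3;
         if (c > hlit+hdist-n) return stbi__err("bad codelengths", "Corrupt PNG");
         memset(lencodes+n, 0, c);
         n += c;
      } else {
         STBI_ASSERT(c == 18);
         c = stbi__zreceive(a,7)+11;
         if (c > hlit+hdist-n) return stbi__err("bad codelengths", "Corrupt PNG");
         memset(lencodes+n, 0, c);
         n += c;
      }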
3762 }
3763 if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3764 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3765 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3766 return 1;
3767 }
3768
static int stbi__parse_uncompressed_block(stbi__zbuf *a)
3770 {
3771 stbi_uc header[4];
3772 int len,nlen,k;
3773 if (a->num_bits & 7)
3774 stbi__zreceive(a, a->num_bits & 7); // discard
3775 // drain the bit-packed data into header
3776 k = 0;
3777 while (a->num_bits > 0) {
3778 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3779 a->code_buffer >>= 8;
3780 a->num_bits -= 8;
3781 }
3782 STBI_ASSERT(a->num_bits == 0);
3783 // now fill header the normal way
3784 while (k < 4)
3785 header[k++] = stbi__zget8(a);
3786 len = header[1] * 256 + header[0];
3787 nlen = header[3] * 256 + header[2];
3788 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3789 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3790 if (a->zout + len > a->zout_end)
3791 if (!stbi__zexpand(a, a->zout, len)) return 0;
3792 memcpy(a->zout, a->zbuffer, len);
3793 a->zbuffer += len;
3794 a->zout += len;
3795 return 1;
3796 }
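
/* Worked example (illustrative only, not part of stb_image): a stored
(uncompressed) DEFLATE block starts with a 16-bit little-endian length LEN
followed by its one's complement NLEN, which is what the check against
(len ^ 0xffff) above verifies. A minimal sketch with a hypothetical helper:

```c
#include <assert.h>

// Returns the payload length encoded in a 4-byte stored-block header,
// or -1 if NLEN is not the one's complement of LEN.
static int example_stored_block_len(const unsigned char header[4])
{
   int len  = header[1] * 256 + header[0];
   int nlen = header[3] * 256 + header[2];
   if (nlen != (len ^ 0xffff)) return -1;
   return len;
}
```

The header bytes 05 00 FA FF describe a 5-byte payload, since
0x0005 ^ 0xffff == 0xfffa.
*/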
3797
static int stbi__parse_zlib_header(stbi__zbuf *a)
3799 {
3800 int cmf = stbi__zget8(a);
3801 int cm = cmf & 15;
3802 /* int cinfo = cmf >> 4; */
3803 int flg = stbi__zget8(a);
3804 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3805 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3806 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3807 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3808 return 1;
3809 }
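
/* Worked example (illustrative only, not part of stb_image): the three
header checks above can be tried on concrete byte pairs. The helper below is
hypothetical and simply mirrors those checks without touching a stream.

```c
#include <assert.h>

// Returns 1 if (cmf, flg) is a zlib header that a PNG stream may use.
static int example_zlib_header_ok(unsigned char cmf, unsigned char flg)
{
   if ((cmf * 256 + flg) % 31 != 0) return 0; // FCHECK: header must be a multiple of 31
   if (flg & 32) return 0;                    // preset dictionary not allowed in PNG
   if ((cmf & 15) != 8) return 0;             // compression method must be DEFLATE
   return 1;
}
```

The common header 78 9C passes (0x789C == 31 * 996), 78 9D fails the
multiple-of-31 check, and 78 BB passes that check but is rejected because
the preset-dictionary bit is set.
*/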
3810
3811 // @TODO: should statically initialize these for optimal thread safety
3812 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
static void stbi__init_zdefaults(void)
3814 {
3815 int i; // use <= to match clearly with spec
3816 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3817 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3818 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3819 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3820
3821 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3822 }
3823
static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3825 {
3826 int final, type;
3827 if (parse_header)
3828 if (!stbi__parse_zlib_header(a)) return 0;
3829 a->num_bits = 0;
3830 a->code_buffer = 0;
3831 do {
3832 final = stbi__zreceive(a,1);
3833 type = stbi__zreceive(a,2);
3834 if (type == 0) {
3835 if (!stbi__parse_uncompressed_block(a)) return 0;
3836 } else if (type == 3) {
3837 return 0;
3838 } else {
3839 if (type == 1) {
3840 // use fixed code lengths
3841 if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3842 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3843 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3844 } else {
3845 if (!stbi__compute_huffman_codes(a)) return 0;
3846 }
3847 if (!stbi__parse_huffman_block(a)) return 0;
3848 }
3849 } while (!final);
3850 return 1;
3851 }
3852
static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3854 {
3855 a->zout_start = obuf;
3856 a->zout = obuf;
3857 a->zout_end = obuf + olen;
3858 a->z_expandable = exp;
3859
3860 return stbi__parse_zlib(a, parse_header);
3861 }
3862
STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3864 {
3865 stbi__zbuf a;
3866 char *p = (char *) stbi__malloc(initial_size);
3867 if (p == NULL) return NULL;
3868 a.zbuffer = (stbi_uc *) buffer;
3869 a.zbuffer_end = (stbi_uc *) buffer + len;
3870 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3871 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3872 return a.zout_start;
3873 } else {
3874 STBI_FREE(a.zout_start);
3875 return NULL;
3876 }
3877 }
3878
STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3880 {
3881 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3882 }
3883
STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3885 {
3886 stbi__zbuf a;
3887 char *p = (char *) stbi__malloc(initial_size);
3888 if (p == NULL) return NULL;
3889 a.zbuffer = (stbi_uc *) buffer;
3890 a.zbuffer_end = (stbi_uc *) buffer + len;
3891 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3892 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3893 return a.zout_start;
3894 } else {
3895 STBI_FREE(a.zout_start);
3896 return NULL;
3897 }
3898 }
3899
STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3901 {
3902 stbi__zbuf a;
3903 a.zbuffer = (stbi_uc *) ibuffer;
3904 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3905 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3906 return (int) (a.zout - a.zout_start);
3907 else
3908 return -1;
3909 }
3910
STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3912 {
3913 stbi__zbuf a;
3914 char *p = (char *) stbi__malloc(16384);
3915 if (p == NULL) return NULL;
3916 a.zbuffer = (stbi_uc *) buffer;
3917 a.zbuffer_end = (stbi_uc *) buffer+len;
3918 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3919 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3920 return a.zout_start;
3921 } else {
3922 STBI_FREE(a.zout_start);
3923 return NULL;
3924 }
3925 }
3926
STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3928 {
3929 stbi__zbuf a;
3930 a.zbuffer = (stbi_uc *) ibuffer;
3931 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3932 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3933 return (int) (a.zout - a.zout_start);
3934 else
3935 return -1;
3936 }
3937 #endif
3938
3939 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3940 // simple implementation
3941 // - only 8-bit samples
3942 // - no CRC checking
3943 // - allocates lots of intermediate memory
3944 // - avoids problem of streaming data between subsystems
3945 // - avoids explicit window management
3946 // performance
3947 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3948
3949 #ifndef STBI_NO_PNG
3950 typedef struct
3951 {
3952 stbi__uint32 length;
3953 stbi__uint32 type;
3954 } stbi__pngchunk;
3955
static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3957 {
3958 stbi__pngchunk c;
3959 c.length = stbi__get32be(s);
3960 c.type = stbi__get32be(s);
3961 return c;
3962 }
3963
static int stbi__check_png_header(stbi__context *s)
3965 {
3966 static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3967 int i;
3968 for (i=0; i < 8; ++i)
3969 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3970 return 1;
3971 }
3972
3973 typedef struct
3974 {
3975 stbi__context *s;
3976 stbi_uc *idata, *expanded, *out;
3977 int depth;
3978 } stbi__png;
3979
3980
3981 enum {
3982 STBI__F_none=0,
3983 STBI__F_sub=1,
3984 STBI__F_up=2,
3985 STBI__F_avg=3,
3986 STBI__F_paeth=4,
3987 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3988 STBI__F_avg_first,
3989 STBI__F_paeth_first
3990 };
3991
3992 static stbi_uc first_row_filter[5] =
3993 {
3994 STBI__F_none,
3995 STBI__F_sub,
3996 STBI__F_none,
3997 STBI__F_avg_first,
3998 STBI__F_paeth_first
3999 };
4000
static int stbi__paeth(int a, int b, int c)
4002 {
4003 int p = a + b - c;
4004 int pa = abs(p-a);
4005 int pb = abs(p-b);
4006 int pc = abs(p-c);
4007 if (pa <= pb && pa <= pc) return a;
4008 if (pb <= pc) return b;
4009 return c;
4010 }
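
/* Worked example (illustrative only, not part of stb_image): the Paeth
predictor picks whichever of left (a), above (b), or upper-left (c) is
closest to a + b - c, with ties resolved in that order. The copy below uses
a hypothetical example_ name so it can be compiled on its own.

```c
#include <assert.h>
#include <stdlib.h>

static int example_paeth(int a, int b, int c)
{
   int p  = a + b - c;   // initial estimate
   int pa = abs(p - a);  // distance to left
   int pb = abs(p - b);  // distance to above
   int pc = abs(p - c);  // distance to upper-left
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
```

With a=10, b=20, c=15 the estimate is 15, so c is returned. With no left
neighbours, example_paeth(0, b, 0) returns b for any byte value b, and with
no row above, example_paeth(a, 0, 0) returns a; those are the two
degenerate forms the filter code uses at the image edges.
*/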
4011
4012 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
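
/* Worked example (illustrative only, not part of stb_image): the table
above holds, for each bit depth, the multiplier that stretches a grayscale
sample to the 0..255 range (0xff for 1-bit, 0x55 for 2-bit, 0x11 for 4-bit,
1 for 8-bit). The helper below is hypothetical and copies the table values.

```c
#include <assert.h>

// Expand a `depth`-bit grayscale sample v to 8 bits.
static unsigned char example_expand_gray(unsigned v, int depth)
{
   static const unsigned char scale[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
   assert(depth == 1 || depth == 2 || depth == 4 || depth == 8);
   assert(v < (1u << depth));
   return (unsigned char) (v * scale[depth]);
}
```

The four 2-bit levels 0, 1, 2, 3 map to 0, 85, 170, 255, and the maximum
sample at every depth maps to 255.
*/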
4013
4014 // create the png data from post-deflated data
static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4016 {
4017 int bytes = (depth == 16? 2 : 1);
4018 stbi__context *s = a->s;
4019 stbi__uint32 i,j,stride = x*out_n*bytes;
4020 stbi__uint32 img_len, img_width_bytes;
4021 int k;
4022 int img_n = s->img_n; // copy it into a local for later
4023
4024 int output_bytes = out_n*bytes;
4025 int filter_bytes = img_n*bytes;
4026 int width = x;
4027
4028 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4029 a->out = (stbi_uc *) stbi__malloc(x * y * output_bytes); // extra bytes to write off the end into
4030 if (!a->out) return stbi__err("outofmem", "Out of memory");
4031
4032 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4033 img_len = (img_width_bytes + 1) * y;
4034 if (s->img_x == x && s->img_y == y) {
4035 if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4036 } else { // interlaced:
4037 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4038 }
4039
4040 for (j=0; j < y; ++j) {
4041 stbi_uc *cur = a->out + stride*j;
4042 stbi_uc *prior = cur - stride;
4043 int filter = *raw++;
4044
4045 if (filter > 4)
4046 return stbi__err("invalid filter","Corrupt PNG");
4047
4048 if (depth < 8) {
4049 STBI_ASSERT(img_width_bytes <= x);
4050 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4051 filter_bytes = 1;
4052 width = img_width_bytes;
4053 }
4054
4055 // if first row, use special filter that doesn't sample previous row
4056 if (j == 0) filter = first_row_filter[filter];
4057
4058 // handle first byte explicitly
4059 for (k=0; k < filter_bytes; ++k) {
4060 switch (filter) {
4061 case STBI__F_none : cur[k] = raw[k]; break;
4062 case STBI__F_sub : cur[k] = raw[k]; break;
4063 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4064 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4065 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4066 case STBI__F_avg_first : cur[k] = raw[k]; break;
4067 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4068 }
4069 }
4070
4071 if (depth == 8) {
4072 if (img_n != out_n)
4073 cur[img_n] = 255; // first pixel
4074 raw += img_n;
4075 cur += out_n;
4076 prior += out_n;
4077 } else if (depth == 16) {
4078 if (img_n != out_n) {
4079 cur[filter_bytes] = 255; // first pixel top byte
4080 cur[filter_bytes+1] = 255; // first pixel bottom byte
4081 }
4082 raw += filter_bytes;
4083 cur += output_bytes;
4084 prior += output_bytes;
4085 } else {
4086 raw += 1;
4087 cur += 1;
4088 prior += 1;
4089 }
4090
4091 // this is a little gross, so that we don't switch per-pixel or per-component
4092 if (depth < 8 || img_n == out_n) {
4093 int nk = (width - 1)*filter_bytes;
4094 #define CASE(f) \
4095 case f: \
4096 for (k=0; k < nk; ++k)
4097 switch (filter) {
4098 // "none" filter turns into a memcpy here; make that explicit.
4099 case STBI__F_none: memcpy(cur, raw, nk); break;
4100 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4101 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4102 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4103 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4104 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4105 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4106 }
4107 #undef CASE
4108 raw += nk;
4109 } else {
4110 STBI_ASSERT(img_n+1 == out_n);
4111 #define CASE(f) \
4112 case f: \
4113 for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
4114 for (k=0; k < filter_bytes; ++k)
4115 switch (filter) {
4116 CASE(STBI__F_none) cur[k] = raw[k]; break;
4117 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); break;
4118 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4119 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); break;
4120 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); break;
4121 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); break;
4122 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); break;
4123 }
4124 #undef CASE
4125
4126 // the loop above sets the high byte of the pixels' alpha, but for
4127 // 16 bit png files we also need the low byte set. we'll do that here.
4128 if (depth == 16) {
4129 cur = a->out + stride*j; // start at the beginning of the row again
4130 for (i=0; i < x; ++i,cur+=output_bytes) {
4131 cur[filter_bytes+1] = 255;
4132 }
4133 }
4134 }
4135 }
4136
   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
4140 if (depth < 8) {
4141 for (j=0; j < y; ++j) {
4142 stbi_uc *cur = a->out + stride*j;
4143 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // png guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4146 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4147
4148 // note that the final byte might overshoot and write more data than desired.
4149 // we can allocate enough data that this never writes out of memory, but it
4150 // could also overwrite the next scanline. can it overwrite non-empty data
4151 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4152 // so we need to explicitly clamp the final ones
4153
4154 if (depth == 4) {
4155 for (k=x*img_n; k >= 2; k-=2, ++in) {
4156 *cur++ = scale * ((*in >> 4) );
4157 *cur++ = scale * ((*in ) & 0x0f);
4158 }
4159 if (k > 0) *cur++ = scale * ((*in >> 4) );
4160 } else if (depth == 2) {
4161 for (k=x*img_n; k >= 4; k-=4, ++in) {
4162 *cur++ = scale * ((*in >> 6) );
4163 *cur++ = scale * ((*in >> 4) & 0x03);
4164 *cur++ = scale * ((*in >> 2) & 0x03);
4165 *cur++ = scale * ((*in ) & 0x03);
4166 }
4167 if (k > 0) *cur++ = scale * ((*in >> 6) );
4168 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4169 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4170 } else if (depth == 1) {
4171 for (k=x*img_n; k >= 8; k-=8, ++in) {
4172 *cur++ = scale * ((*in >> 7) );
4173 *cur++ = scale * ((*in >> 6) & 0x01);
4174 *cur++ = scale * ((*in >> 5) & 0x01);
4175 *cur++ = scale * ((*in >> 4) & 0x01);
4176 *cur++ = scale * ((*in >> 3) & 0x01);
4177 *cur++ = scale * ((*in >> 2) & 0x01);
4178 *cur++ = scale * ((*in >> 1) & 0x01);
4179 *cur++ = scale * ((*in ) & 0x01);
4180 }
4181 if (k > 0) *cur++ = scale * ((*in >> 7) );
4182 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4183 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4184 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4185 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4186 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4187 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4188 }
4189 if (img_n != out_n) {
4190 int q;
4191 // insert alpha = 255
4192 cur = a->out + stride*j;
4193 if (img_n == 1) {
4194 for (q=x-1; q >= 0; --q) {
4195 cur[q*2+1] = 255;
4196 cur[q*2+0] = cur[q];
4197 }
4198 } else {
4199 STBI_ASSERT(img_n == 3);
4200 for (q=x-1; q >= 0; --q) {
4201 cur[q*4+3] = 255;
4202 cur[q*4+2] = cur[q*3+2];
4203 cur[q*4+1] = cur[q*3+1];
4204 cur[q*4+0] = cur[q*3+0];
4205 }
4206 }
4207 }
4208 }
4209 } else if (depth == 16) {
4210 // force the image data from big-endian to platform-native.
4211 // this is done in a separate pass due to the decoding relying
4212 // on the data being untouched, but could probably be done
4213 // per-line during decode if care is taken.
4214 stbi_uc *cur = a->out;
4215 stbi__uint16 *cur16 = (stbi__uint16*)cur;
4216
4217 for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
4218 *cur16 = (cur[0] << 8) | cur[1];
4219 }
4220 }
4221
4222 return 1;
4223 }
4224
static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
{
   int bytes = (depth == 16 ? 2 : 1);
   int out_bytes = out_n * bytes; // bytes per output pixel; 16-bit samples take two bytes each
   stbi_uc *final;
   int p;
   if (!interlaced)
      return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);

   // de-interlacing
   final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_bytes);
   if (!final) return stbi__err("outofmem", "Out of memory");
   for (p=0; p < 7; ++p) {
      int xorig[] = { 0,4,0,2,0,1,0 };
      int yorig[] = { 0,0,4,0,2,0,1 };
      int xspc[]  = { 8,8,4,4,2,2,1 };
      int yspc[]  = { 8,8,8,4,4,2,2 };
      int i,j,x,y;
      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
      if (x && y) {
         stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
         if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
            STBI_FREE(final);
            return 0;
         }
         for (j=0; j < y; ++j) {
            for (i=0; i < x; ++i) {
               int out_y = j*yspc[p]+yorig[p];
               int out_x = i*xspc[p]+xorig[p];
               memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
                      a->out + (j*x+i)*out_bytes, out_bytes);
            }
         }
         STBI_FREE(a->out);
         image_data += img_len;
         image_data_len -= img_len;
      }
   }
   a->out = final;

   return 1;
}
4266
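/* Worked example (illustrative only, not part of stb_image): each of the
seven Adam7 passes in stbi__create_png_image covers a sub-grid of the image,
and its dimensions follow from the origin and spacing tables by a rounded-up
division. The helper below is hypothetical and reuses those table values.

```c
#include <assert.h>

// Compute the width and height of Adam7 pass `pass` (0..6) for an image
// of the given size (both at least 1). Either result may be 0 for small images.
static void example_adam7_pass_size(int width, int height, int pass, int *pw, int *ph)
{
   static const int xorig[7] = { 0,4,0,2,0,1,0 };
   static const int yorig[7] = { 0,0,4,0,2,0,1 };
   static const int xspc[7]  = { 8,8,4,4,2,2,1 };
   static const int yspc[7]  = { 8,8,8,4,4,2,2 };
   assert(pass >= 0 && pass < 7);
   assert(width >= 1 && height >= 1);
   *pw = (width  - xorig[pass] + xspc[pass] - 1) / xspc[pass];
   *ph = (height - yorig[pass] + yspc[pass] - 1) / yspc[pass];
}
```

For an 8x8 image the passes are 1x1, 1x1, 2x1, 2x2, 4x2, 4x4 and 8x4, which
add up to all 64 pixels. For a 1x1 image only the first pass is non-empty,
which is why the decoder skips passes where x or y is zero.
*/
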
static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4268 {
4269 stbi__context *s = z->s;
4270 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4271 stbi_uc *p = z->out;
4272
4273 // compute color-based transparency, assuming we've
4274 // already got 255 as the alpha value in the output
4275 STBI_ASSERT(out_n == 2 || out_n == 4);
4276
4277 if (out_n == 2) {
4278 for (i=0; i < pixel_count; ++i) {
4279 p[1] = (p[0] == tc[0] ? 0 : 255);
4280 p += 2;
4281 }
4282 } else {
4283 for (i=0; i < pixel_count; ++i) {
4284 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4285 p[3] = 0;
4286 p += 4;
4287 }
4288 }
4289 return 1;
4290 }
4291
static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
4293 {
4294 stbi__context *s = z->s;
4295 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4296 stbi__uint16 *p = (stbi__uint16*) z->out;
4297
4298 // compute color-based transparency, assuming we've
4299 // already got 65535 as the alpha value in the output
4300 STBI_ASSERT(out_n == 2 || out_n == 4);
4301
4302 if (out_n == 2) {
4303 for (i = 0; i < pixel_count; ++i) {
4304 p[1] = (p[0] == tc[0] ? 0 : 65535);
4305 p += 2;
4306 }
4307 } else {
4308 for (i = 0; i < pixel_count; ++i) {
4309 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4310 p[3] = 0;
4311 p += 4;
4312 }
4313 }
4314 return 1;
4315 }
4316
static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4318 {
4319 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4320 stbi_uc *p, *temp_out, *orig = a->out;
4321
4322 p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4323 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4324
   // between here and free(out) below, exiting would leak
4326 temp_out = p;
4327
4328 if (pal_img_n == 3) {
4329 for (i=0; i < pixel_count; ++i) {
4330 int n = orig[i]*4;
4331 p[0] = palette[n ];
4332 p[1] = palette[n+1];
4333 p[2] = palette[n+2];
4334 p += 3;
4335 }
4336 } else {
4337 for (i=0; i < pixel_count; ++i) {
4338 int n = orig[i]*4;
4339 p[0] = palette[n ];
4340 p[1] = palette[n+1];
4341 p[2] = palette[n+2];
4342 p[3] = palette[n+3];
4343 p += 4;
4344 }
4345 }
4346 STBI_FREE(a->out);
4347 a->out = temp_out;
4348
4349 STBI_NOTUSED(len);
4350
4351 return 1;
4352 }
4353
static int stbi__reduce_png(stbi__png *p)
{
   int i;
   int img_len = p->s->img_x * p->s->img_y * p->s->img_out_n;
   stbi_uc *reduced;
   stbi__uint16 *orig = (stbi__uint16*)p->out;

   if (p->depth != 16) return 1; // don't need to do anything if not 16-bit data

   reduced = (stbi_uc *)stbi__malloc(img_len);
   if (reduced == NULL) return stbi__err("outofmem", "Out of memory");

   for (i = 0; i < img_len; ++i) reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top byte of each 16-bit sample is a decent approx of 16->8 bit scaling

   p->out = reduced;
   STBI_FREE(orig);

   return 1;
}
4373
4374 static int stbi__unpremultiply_on_load = 0;
4375 static int stbi__de_iphone_flag = 0;
4376
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4378 {
4379 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4380 }
4381
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4383 {
4384 stbi__de_iphone_flag = flag_true_if_should_convert;
4385 }
4386
static void stbi__de_iphone(stbi__png *z)
4388 {
4389 stbi__context *s = z->s;
4390 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4391 stbi_uc *p = z->out;
4392
4393 if (s->img_out_n == 3) { // convert bgr to rgb
4394 for (i=0; i < pixel_count; ++i) {
4395 stbi_uc t = p[0];
4396 p[0] = p[2];
4397 p[2] = t;
4398 p += 3;
4399 }
4400 } else {
4401 STBI_ASSERT(s->img_out_n == 4);
4402 if (stbi__unpremultiply_on_load) {
4403 // convert bgr to rgb and unpremultiply
4404 for (i=0; i < pixel_count; ++i) {
4405 stbi_uc a = p[3];
4406 stbi_uc t = p[0];
4407 if (a) {
4408 p[0] = p[2] * 255 / a;
4409 p[1] = p[1] * 255 / a;
4410 p[2] = t * 255 / a;
4411 } else {
4412 p[0] = p[2];
4413 p[2] = t;
4414 }
4415 p += 4;
4416 }
4417 } else {
4418 // convert bgr to rgb
4419 for (i=0; i < pixel_count; ++i) {
4420 stbi_uc t = p[0];
4421 p[0] = p[2];
4422 p[2] = t;
4423 p += 4;
4424 }
4425 }
4426 }
4427 }
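/* Example sketch (not part of stb_image): the per-pixel work done by the
   unpremultiply branch above, isolated for a single BGRA pixel. The function
   name is hypothetical. It assumes a well-formed premultiplied pixel, i.e.
   every color channel is <= alpha; otherwise the scaled value exceeds 255
   and is truncated by the cast.

```c
#include <stdint.h>

// Hypothetical helper: convert one premultiplied BGRA pixel to straight
// (non-premultiplied) RGBA, in place.
static void example_bgra_premul_to_rgba(uint8_t *p)
{
   uint8_t a = p[3];
   uint8_t b = p[0];                       // save blue before overwriting slot 0
   if (a) {
      p[0] = (uint8_t) (p[2] * 255 / a);   // red goes to slot 0
      p[1] = (uint8_t) (p[1] * 255 / a);   // green stays in place
      p[2] = (uint8_t) (b * 255 / a);      // blue goes to slot 2
   } else {
      p[0] = p[2];                         // alpha 0: just swap, avoid dividing by 0
      p[2] = b;
   }
}
```

   The alpha == 0 branch only swaps channels, since the original color cannot
   be recovered from a fully transparent premultiplied pixel.
*/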
4428
4429 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
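/* Example sketch (not part of stb_image): how a four-character chunk name
   packs into one 32-bit value, and why testing bit 29 of that value tells
   critical chunks from ancillary ones. In PNG, a lowercase first letter marks
   a chunk as ancillary; lowercase ASCII has bit 5 set, and bit 5 of the top
   byte is bit 29 of the packed value. The names below are hypothetical.

```c
#include <stdint.h>

// Same packing as STBI__PNG_TYPE, with explicit unsigned casts.
#define EXAMPLE_PNG_TYPE(a,b,c,d) \
   (((uint32_t)(a) << 24) + ((uint32_t)(b) << 16) + ((uint32_t)(c) << 8) + (uint32_t)(d))

// Returns nonzero if the chunk may be skipped safely by a decoder that
// does not understand it.
static int example_chunk_is_ancillary(uint32_t type)
{
   return (type & (UINT32_C(1) << 29)) != 0;
}
```

   Packing the name this way lets the parser dispatch on chunk type with a
   plain switch statement instead of string comparisons.
*/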
4430
static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4432 {
4433 stbi_uc palette[1024], pal_img_n=0;
4434 stbi_uc has_trans=0, tc[3];
4435 stbi__uint16 tc16[3];
4436 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4437 int first=1,k,interlace=0, color=0, is_iphone=0;
4438 stbi__context *s = z->s;
4439
4440 z->expanded = NULL;
4441 z->idata = NULL;
4442 z->out = NULL;
4443
4444 if (!stbi__check_png_header(s)) return 0;
4445
4446 if (scan == STBI__SCAN_type) return 1;
4447
4448 for (;;) {
4449 stbi__pngchunk c = stbi__get_chunk_header(s);
4450 switch (c.type) {
4451 case STBI__PNG_TYPE('C','g','B','I'):
4452 is_iphone = 1;
4453 stbi__skip(s, c.length);
4454 break;
4455 case STBI__PNG_TYPE('I','H','D','R'): {
4456 int comp,filter;
4457 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4458 first = 0;
4459 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4460 s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4461 s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4462 z->depth = stbi__get8(s); if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16) return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
4463 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4464 if (color == 3 && z->depth == 16) return stbi__err("bad ctype","Corrupt PNG");
4465 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4466 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4467 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4468 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4469 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4470 if (!pal_img_n) {
4471 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4472 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4473 if (scan == STBI__SCAN_header) return 1;
4474 } else {
// if paletted, then pal_img_n is our final component count, and
// img_n is # components to decompress/filter.
4477 s->img_n = 1;
4478 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4479 // if SCAN_header, have to scan to see if we have a tRNS
4480 }
4481 break;
4482 }
4483
4484 case STBI__PNG_TYPE('P','L','T','E'): {
4485 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4486 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4487 pal_len = c.length / 3;
4488 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4489 for (i=0; i < pal_len; ++i) {
4490 palette[i*4+0] = stbi__get8(s);
4491 palette[i*4+1] = stbi__get8(s);
4492 palette[i*4+2] = stbi__get8(s);
4493 palette[i*4+3] = 255;
4494 }
4495 break;
4496 }
4497
4498 case STBI__PNG_TYPE('t','R','N','S'): {
4499 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4500 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4501 if (pal_img_n) {
4502 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4503 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4504 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4505 pal_img_n = 4;
4506 for (i=0; i < c.length; ++i)
4507 palette[i*4+3] = stbi__get8(s);
4508 } else {
4509 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4510 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4511 has_trans = 1;
4512 if (z->depth == 16) {
4513 for (k = 0; k < s->img_n; ++k) tc16[k] = stbi__get16be(s); // copy the values as-is
4514 } else {
4515 for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
4516 }
4517 }
4518 break;
4519 }
4520
4521 case STBI__PNG_TYPE('I','D','A','T'): {
4522 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4523 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4524 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4525 if ((int)(ioff + c.length) < (int)ioff) return 0;
4526 if (ioff + c.length > idata_limit) {
4527 stbi__uint32 idata_limit_old = idata_limit;
4528 stbi_uc *p;
4529 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4530 while (ioff + c.length > idata_limit)
4531 idata_limit *= 2;
4532 STBI_NOTUSED(idata_limit_old);
4533 p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4534 z->idata = p;
4535 }
4536 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4537 ioff += c.length;
4538 break;
4539 }
4540
4541 case STBI__PNG_TYPE('I','E','N','D'): {
4542 stbi__uint32 raw_len, bpl;
4543 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4544 if (scan != STBI__SCAN_load) return 1;
4545 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4546 // initial guess for decoded data size to avoid unnecessary reallocs
4547 bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
4548 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4549 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4550 if (z->expanded == NULL) return 0; // zlib should set error
4551 STBI_FREE(z->idata); z->idata = NULL;
4552 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4553 s->img_out_n = s->img_n+1;
4554 else
4555 s->img_out_n = s->img_n;
4556 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
4557 if (has_trans) {
4558 if (z->depth == 16) {
4559 if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
4560 } else {
4561 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4562 }
4563 }
4564 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4565 stbi__de_iphone(z);
4566 if (pal_img_n) {
4567 // pal_img_n == 3 or 4
4568 s->img_n = pal_img_n; // record the actual colors we had
4569 s->img_out_n = pal_img_n;
4570 if (req_comp >= 3) s->img_out_n = req_comp;
4571 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4572 return 0;
4573 }
4574 STBI_FREE(z->expanded); z->expanded = NULL;
4575 return 1;
4576 }
4577
4578 default:
4579 // if critical, fail
4580 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4581 if ((c.type & (1 << 29)) == 0) {
4582 #ifndef STBI_NO_FAILURE_STRINGS
4583 // not threadsafe
4584 static char invalid_chunk[] = "XXXX PNG chunk not known";
4585 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4586 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4587 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4588 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4589 #endif
4590 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4591 }
4592 stbi__skip(s, c.length);
4593 break;
4594 }
4595 // end of PNG chunk, read and skip CRC
4596 stbi__get32be(s);
4597 }
4598 }
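/* Example sketch (not part of stb_image): the IHDR handler above rejects
   images whose decoded size would exceed 1 << 30 bytes by dividing the limit
   instead of multiplying the dimensions, so the test itself cannot overflow.
   The helper name is hypothetical, and treating a zero dimension as "reject"
   is an assumption of this sketch (the parser reports 0-pixel images with a
   separate error).

```c
// Hypothetical helper: returns 1 if x * y * n would exceed 2^30 bytes.
static int example_png_too_large(unsigned int x, unsigned int y, unsigned int n)
{
   if (x == 0 || n == 0) return 1;     // avoid dividing by zero
   return ((1u << 30) / x / n) < y;    // same shape as the check in IHDR
}
```

   A 16384 x 16384 RGBA image is exactly 2^30 bytes and still passes; anything
   larger is refused before memory is allocated.
*/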
4599
static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4601 {
4602 unsigned char *result=NULL;
4603 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4604 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4605 if (p->depth == 16) {
4606 if (!stbi__reduce_png(p)) {
4607 return result;
4608 }
4609 }
4610 result = p->out;
4611 p->out = NULL;
4612 if (req_comp && req_comp != p->s->img_out_n) {
4613 result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4614 p->s->img_out_n = req_comp;
4615 if (result == NULL) return result;
4616 }
4617 *x = p->s->img_x;
4618 *y = p->s->img_y;
4619 if (n) *n = p->s->img_n;
4620 }
4621 STBI_FREE(p->out); p->out = NULL;
4622 STBI_FREE(p->expanded); p->expanded = NULL;
4623 STBI_FREE(p->idata); p->idata = NULL;
4624
4625 return result;
4626 }
4627
static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4629 {
4630 stbi__png p;
4631 p.s = s;
4632 return stbi__do_png(&p, x,y,comp,req_comp);
4633 }
4634
static int stbi__png_test(stbi__context *s)
4636 {
4637 int r;
4638 r = stbi__check_png_header(s);
4639 stbi__rewind(s);
4640 return r;
4641 }
4642
static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4644 {
4645 if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4646 stbi__rewind( p->s );
4647 return 0;
4648 }
4649 if (x) *x = p->s->img_x;
4650 if (y) *y = p->s->img_y;
4651 if (comp) *comp = p->s->img_n;
4652 return 1;
4653 }
4654
static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4656 {
4657 stbi__png p;
4658 p.s = s;
4659 return stbi__png_info_raw(&p, x, y, comp);
4660 }
4661 #endif
4662
4663 // Microsoft/Windows BMP image
4664
4665 #ifndef STBI_NO_BMP
static int stbi__bmp_test_raw(stbi__context *s)
4667 {
4668 int r;
4669 int sz;
4670 if (stbi__get8(s) != 'B') return 0;
4671 if (stbi__get8(s) != 'M') return 0;
4672 stbi__get32le(s); // discard filesize
4673 stbi__get16le(s); // discard reserved
4674 stbi__get16le(s); // discard reserved
4675 stbi__get32le(s); // discard data offset
4676 sz = stbi__get32le(s);
4677 r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4678 return r;
4679 }
4680
static int stbi__bmp_test(stbi__context *s)
4682 {
4683 int r = stbi__bmp_test_raw(s);
4684 stbi__rewind(s);
4685 return r;
4686 }
4687
4688
// returns 0..31 for the highest set bit, or -1 if z is 0
static int stbi__high_bit(unsigned int z)
4691 {
4692 int n=0;
4693 if (z == 0) return -1;
4694 if (z >= 0x10000) n += 16, z >>= 16;
4695 if (z >= 0x00100) n += 8, z >>= 8;
4696 if (z >= 0x00010) n += 4, z >>= 4;
4697 if (z >= 0x00004) n += 2, z >>= 2;
4698 if (z >= 0x00002) n += 1, z >>= 1;
4699 return n;
4700 }
4701
static int stbi__bitcount(unsigned int a)
4703 {
4704 a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
4705 a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
4706 a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4707 a = (a + (a >> 8)); // max 16 per 8 bits
4708 a = (a + (a >> 16)); // max 32 per 8 bits
4709 return a & 0xff;
4710 }
4711
static int stbi__shiftsigned(int v, int shift, int bits)
4713 {
4714 int result;
4715 int z=0;
4716
4717 if (shift < 0) v <<= -shift;
4718 else v >>= shift;
4719 result = v;
4720
4721 z = bits;
4722 while (z < 8) {
4723 result += v >> z;
4724 z += bits;
4725 }
4726 return result;
4727 }
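/* Example sketch (not part of stb_image): how the three helpers above combine
   to pull one channel out of a bitfield-masked BMP pixel and widen it to
   8 bits. The shift moves the mask's top bit to bit 7, and the loop
   replicates the value's high bits into the low bits so that a full-scale
   5-bit value becomes 255 rather than 248. All names are hypothetical, and
   the loop-based bit helpers are simpler stand-ins for the branch-free
   versions above.

```c
#include <stdint.h>

// Index of the highest set bit, or -1 if z is 0.
static int example_high_bit(uint32_t z)
{
   int n = -1;
   while (z) { ++n; z >>= 1; }
   return n;
}

// Number of set bits.
static int example_bitcount(uint32_t a)
{
   int n = 0;
   while (a) { n += (int) (a & 1); a >>= 1; }
   return n;
}

// Extract the channel selected by mask and widen it to 8 bits.
static uint8_t example_extract_channel(uint32_t pixel, uint32_t mask)
{
   int shift, bits, z;
   uint32_t v, result;
   if (mask == 0) return 0;
   shift = example_high_bit(mask) - 7;   // right shift that puts the top mask bit at bit 7
   bits  = example_bitcount(mask);       // channel width in bits
   v = pixel & mask;
   if (shift < 0) v <<= -shift; else v >>= shift;
   result = v;
   for (z = bits; z < 8; z += bits)      // replicate high bits into the low bits
      result += v >> z;
   return (uint8_t) result;
}
```

   With the 16-bit mask 31 << 10, a red value of 31 maps to 255 and a red
   value of 16 maps to 132, close to the exact 16 * 255 / 31.
*/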
4728
4729 typedef struct
4730 {
4731 int bpp, offset, hsz;
4732 unsigned int mr,mg,mb,ma, all_a;
4733 } stbi__bmp_data;
4734
static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
4736 {
4737 int hsz;
4738 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4739 stbi__get32le(s); // discard filesize
4740 stbi__get16le(s); // discard reserved
4741 stbi__get16le(s); // discard reserved
4742 info->offset = stbi__get32le(s);
4743 info->hsz = hsz = stbi__get32le(s);
4744 info->mr = info->mg = info->mb = info->ma = 0;
4745
4746 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4747 if (hsz == 12) {
4748 s->img_x = stbi__get16le(s);
4749 s->img_y = stbi__get16le(s);
4750 } else {
4751 s->img_x = stbi__get32le(s);
4752 s->img_y = stbi__get32le(s);
4753 }
4754 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4755 info->bpp = stbi__get16le(s);
4756 if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4757 if (hsz != 12) {
4758 int compress = stbi__get32le(s);
4759 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4760 stbi__get32le(s); // discard sizeof
4761 stbi__get32le(s); // discard hres
4762 stbi__get32le(s); // discard vres
4763 stbi__get32le(s); // discard colorsused
4764 stbi__get32le(s); // discard max important
4765 if (hsz == 40 || hsz == 56) {
4766 if (hsz == 56) {
4767 stbi__get32le(s);
4768 stbi__get32le(s);
4769 stbi__get32le(s);
4770 stbi__get32le(s);
4771 }
4772 if (info->bpp == 16 || info->bpp == 32) {
4773 if (compress == 0) {
4774 if (info->bpp == 32) {
4775 info->mr = 0xffu << 16;
4776 info->mg = 0xffu << 8;
4777 info->mb = 0xffu << 0;
4778 info->ma = 0xffu << 24;
4779 info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
4780 } else {
4781 info->mr = 31u << 10;
4782 info->mg = 31u << 5;
4783 info->mb = 31u << 0;
4784 }
4785 } else if (compress == 3) {
4786 info->mr = stbi__get32le(s);
4787 info->mg = stbi__get32le(s);
4788 info->mb = stbi__get32le(s);
4789 // not documented, but generated by photoshop and handled by mspaint
4790 if (info->mr == info->mg && info->mg == info->mb) {
4791 // ?!?!?
4792 return stbi__errpuc("bad BMP", "bad BMP");
4793 }
4794 } else
4795 return stbi__errpuc("bad BMP", "bad BMP");
4796 }
4797 } else {
4798 int i;
4799 if (hsz != 108 && hsz != 124)
4800 return stbi__errpuc("bad BMP", "bad BMP");
4801 info->mr = stbi__get32le(s);
4802 info->mg = stbi__get32le(s);
4803 info->mb = stbi__get32le(s);
4804 info->ma = stbi__get32le(s);
4805 stbi__get32le(s); // discard color space
4806 for (i=0; i < 12; ++i)
4807 stbi__get32le(s); // discard color space parameters
4808 if (hsz == 124) {
4809 stbi__get32le(s); // discard rendering intent
4810 stbi__get32le(s); // discard offset of profile data
4811 stbi__get32le(s); // discard size of profile data
4812 stbi__get32le(s); // discard reserved
4813 }
4814 }
4815 }
4816 return (void *) 1;
4817 }
4818
4819
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4821 {
4822 stbi_uc *out;
4823 unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
4824 stbi_uc pal[256][4];
4825 int psize=0,i,j,width;
4826 int flip_vertically, pad, target;
4827 stbi__bmp_data info;
4828
4829 info.all_a = 255;
4830 if (stbi__bmp_parse_header(s, &info) == NULL)
4831 return NULL; // error code already set
4832
4833 flip_vertically = ((int) s->img_y) > 0;
4834 s->img_y = abs((int) s->img_y);
4835
4836 mr = info.mr;
4837 mg = info.mg;
4838 mb = info.mb;
4839 ma = info.ma;
4840 all_a = info.all_a;
4841
4842 if (info.hsz == 12) {
4843 if (info.bpp < 24)
4844 psize = (info.offset - 14 - 24) / 3;
4845 } else {
4846 if (info.bpp < 16)
4847 psize = (info.offset - 14 - info.hsz) >> 2;
4848 }
4849
4850 s->img_n = ma ? 4 : 3;
4851 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4852 target = req_comp;
4853 else
4854 target = s->img_n; // if they want monochrome, we'll post-convert
4855
4856 out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4857 if (!out) return stbi__errpuc("outofmem", "Out of memory");
4858 if (info.bpp < 16) {
4859 int z=0;
4860 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4861 for (i=0; i < psize; ++i) {
4862 pal[i][2] = stbi__get8(s);
4863 pal[i][1] = stbi__get8(s);
4864 pal[i][0] = stbi__get8(s);
4865 if (info.hsz != 12) stbi__get8(s);
4866 pal[i][3] = 255;
4867 }
4868 stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
4869 if (info.bpp == 4) width = (s->img_x + 1) >> 1;
4870 else if (info.bpp == 8) width = s->img_x;
4871 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4872 pad = (-width)&3;
4873 for (j=0; j < (int) s->img_y; ++j) {
4874 for (i=0; i < (int) s->img_x; i += 2) {
4875 int v=stbi__get8(s),v2=0;
4876 if (info.bpp == 4) {
4877 v2 = v & 15;
4878 v >>= 4;
4879 }
4880 out[z++] = pal[v][0];
4881 out[z++] = pal[v][1];
4882 out[z++] = pal[v][2];
4883 if (target == 4) out[z++] = 255;
4884 if (i+1 == (int) s->img_x) break;
4885 v = (info.bpp == 8) ? stbi__get8(s) : v2;
4886 out[z++] = pal[v][0];
4887 out[z++] = pal[v][1];
4888 out[z++] = pal[v][2];
4889 if (target == 4) out[z++] = 255;
4890 }
4891 stbi__skip(s, pad);
4892 }
4893 } else {
4894 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4895 int z = 0;
4896 int easy=0;
4897 stbi__skip(s, info.offset - 14 - info.hsz);
4898 if (info.bpp == 24) width = 3 * s->img_x;
4899 else if (info.bpp == 16) width = 2*s->img_x;
4900 else /* bpp = 32 and pad = 0 */ width=0;
4901 pad = (-width) & 3;
4902 if (info.bpp == 24) {
4903 easy = 1;
4904 } else if (info.bpp == 32) {
4905 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4906 easy = 2;
4907 }
4908 if (!easy) {
4909 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4910 // right shift amt to put high bit in position #7
4911 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4912 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4913 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4914 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4915 }
4916 for (j=0; j < (int) s->img_y; ++j) {
4917 if (easy) {
4918 for (i=0; i < (int) s->img_x; ++i) {
4919 unsigned char a;
4920 out[z+2] = stbi__get8(s);
4921 out[z+1] = stbi__get8(s);
4922 out[z+0] = stbi__get8(s);
4923 z += 3;
4924 a = (easy == 2 ? stbi__get8(s) : 255);
4925 all_a |= a;
4926 if (target == 4) out[z++] = a;
4927 }
4928 } else {
4929 int bpp = info.bpp;
4930 for (i=0; i < (int) s->img_x; ++i) {
4931 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4932 int a;
4933 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4934 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4935 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4936 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4937 all_a |= a;
4938 if (target == 4) out[z++] = STBI__BYTECAST(a);
4939 }
4940 }
4941 stbi__skip(s, pad);
4942 }
4943 }
4944
4945 // if alpha channel is all 0s, replace with all 255s
4946 if (target == 4 && all_a == 0)
4947 for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4948 out[i] = 255;
4949
4950 if (flip_vertically) {
4951 stbi_uc t;
4952 for (j=0; j < (int) s->img_y>>1; ++j) {
4953 stbi_uc *p1 = out + j *s->img_x*target;
4954 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4955 for (i=0; i < (int) s->img_x*target; ++i) {
4956 t = p1[i], p1[i] = p2[i], p2[i] = t;
4957 }
4958 }
4959 }
4960
4961 if (req_comp && req_comp != target) {
4962 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4963 if (out == NULL) return out; // stbi__convert_format frees input on failure
4964 }
4965
4966 *x = s->img_x;
4967 *y = s->img_y;
4968 if (comp) *comp = s->img_n;
4969 return out;
4970 }
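/* Example sketch (not part of stb_image): BMP rows are padded to a multiple
   of 4 bytes, and the loader above computes the padding as (-width) & 3,
   where width is the number of data bytes in a row. The helpers below are
   hypothetical and simply restate that arithmetic, including the row size
   used for 4-bit paletted images (two pixels per byte, rounded up).

```c
// Hypothetical helper: bytes of padding after a row of row_bytes data bytes.
static int example_bmp_row_pad(int row_bytes)
{
   return (-row_bytes) & 3;   // distance to the next multiple of 4
}

// Hypothetical helper: data bytes per row for a 4-bit paletted image.
static int example_bmp_row_bytes_4bpp(int pixels)
{
   return (pixels + 1) >> 1;  // two pixels per byte, rounded up
}
```

   For instance, a 24-bit image that is 1 pixel wide has 3 data bytes per row
   and therefore 1 byte of padding.
*/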
4971 #endif
4972
4973 // Targa Truevision - TGA
4974 // by Jonathan Dummer
4975 #ifndef STBI_NO_TGA
4976 // returns STBI_rgb or whatever, 0 on error
static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
4978 {
4979 // only RGB or RGBA (incl. 16bit) or grey allowed
4980 if(is_rgb16) *is_rgb16 = 0;
4981 switch(bits_per_pixel) {
4982 case 8: return STBI_grey;
4983 case 16: if(is_grey) return STBI_grey_alpha;
4984 // else: fall-through
4985 case 15: if(is_rgb16) *is_rgb16 = 1;
4986 return STBI_rgb;
4987 case 24: // fall-through
4988 case 32: return bits_per_pixel/8;
4989 default: return 0;
4990 }
4991 }
4992
static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4994 {
4995 int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
4996 int sz, tga_colormap_type;
4997 stbi__get8(s); // discard Offset
4998 tga_colormap_type = stbi__get8(s); // colormap type
4999 if( tga_colormap_type > 1 ) {
5000 stbi__rewind(s);
5001 return 0; // only RGB or indexed allowed
5002 }
5003 tga_image_type = stbi__get8(s); // image type
5004 if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
5005 if (tga_image_type != 1 && tga_image_type != 9) {
5006 stbi__rewind(s);
5007 return 0;
5008 }
5009 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5010 sz = stbi__get8(s); // check bits per palette color entry
5011 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
5012 stbi__rewind(s);
5013 return 0;
5014 }
5015 stbi__skip(s,4); // skip image x and y origin
5016 tga_colormap_bpp = sz;
5017 } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
5018 if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
5019 stbi__rewind(s);
5020 return 0; // only RGB or grey allowed, +/- RLE
5021 }
5022 stbi__skip(s,9); // skip colormap specification and image x/y origin
5023 tga_colormap_bpp = 0;
5024 }
5025 tga_w = stbi__get16le(s);
5026 if( tga_w < 1 ) {
5027 stbi__rewind(s);
5028 return 0; // test width
5029 }
5030 tga_h = stbi__get16le(s);
5031 if( tga_h < 1 ) {
5032 stbi__rewind(s);
5033 return 0; // test height
5034 }
5035 tga_bits_per_pixel = stbi__get8(s); // bits per pixel
5036 stbi__get8(s); // ignore alpha bits
5037 if (tga_colormap_bpp != 0) {
5038 if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
5039 // when using a colormap, tga_bits_per_pixel is the size of the indexes
5040 // I don't think anything but 8 or 16bit indexes makes sense
5041 stbi__rewind(s);
5042 return 0;
5043 }
5044 tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
5045 } else {
5046 tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
5047 }
5048 if(!tga_comp) {
5049 stbi__rewind(s);
5050 return 0;
5051 }
5052 if (x) *x = tga_w;
5053 if (y) *y = tga_h;
5054 if (comp) *comp = tga_comp;
5055 return 1; // seems to have passed everything
5056 }
5057
static int stbi__tga_test(stbi__context *s)
5059 {
5060 int res = 0;
5061 int sz, tga_color_type;
5062 stbi__get8(s); // discard Offset
5063 tga_color_type = stbi__get8(s); // color type
5064 if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed
5065 sz = stbi__get8(s); // image type
5066 if ( tga_color_type == 1 ) { // colormapped (paletted) image
5067 if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
5068 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5069 sz = stbi__get8(s); // check bits per palette color entry
5070 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5071 stbi__skip(s,4); // skip image x and y origin
5072 } else { // "normal" image w/o colormap
5073 if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
5074 stbi__skip(s,9); // skip colormap specification and image x/y origin
5075 }
5076 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width
5077 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height
5078 sz = stbi__get8(s); // bits per pixel
5079 if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
5080 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5081
5082 res = 1; // if we got this far, everything's good and we can return 1 instead of 0
5083
5084 errorEnd:
5085 stbi__rewind(s);
5086 return res;
5087 }
5088
5089 // read 16bit value and convert to 24bit RGB
static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
5091 {
5092 stbi__uint16 px = stbi__get16le(s);
5093 stbi__uint16 fiveBitMask = 31;
5094 // we have 3 channels with 5bits each
5095 int r = (px >> 10) & fiveBitMask;
5096 int g = (px >> 5) & fiveBitMask;
5097 int b = px & fiveBitMask;
5098 // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
5099 out[0] = (r * 255)/31;
5100 out[1] = (g * 255)/31;
5101 out[2] = (b * 255)/31;
5102
5103 // some people claim that the most significant bit might be used for alpha
5104 // (possibly if an alpha-bit is set in the "image descriptor byte")
5105 // but that only made 16bit test images completely translucent..
5106 // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
5107 }
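/* Example sketch (not part of stb_image): the 5-bit to 8-bit scaling used by
   the function above, applied to a pixel value that is already in memory
   rather than read from a stbi__context. The function name is hypothetical.
   As in the loader, the top bit of the 16-bit value is ignored.

```c
#include <stdint.h>

// Hypothetical helper: expand a 15/16-bit TGA pixel (5 bits per channel,
// red in the high bits) to 8-bit R, G, B.
static void example_rgb555_to_rgb888(uint16_t px, uint8_t out[3])
{
   int r = (px >> 10) & 31;
   int g = (px >> 5) & 31;
   int b = px & 31;
   out[0] = (uint8_t) ((r * 255) / 31);   // 31 maps to 255, 0 maps to 0
   out[1] = (uint8_t) ((g * 255) / 31);
   out[2] = (uint8_t) ((b * 255) / 31);
}
```

   Scaling by 255/31 rather than shifting left by 3 makes full-intensity
   channels come out as 255 instead of 248.
*/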
5108
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5110 {
5111 // read in the TGA header stuff
5112 int tga_offset = stbi__get8(s);
5113 int tga_indexed = stbi__get8(s);
5114 int tga_image_type = stbi__get8(s);
5115 int tga_is_RLE = 0;
5116 int tga_palette_start = stbi__get16le(s);
5117 int tga_palette_len = stbi__get16le(s);
5118 int tga_palette_bits = stbi__get8(s);
5119 int tga_x_origin = stbi__get16le(s);
5120 int tga_y_origin = stbi__get16le(s);
5121 int tga_width = stbi__get16le(s);
5122 int tga_height = stbi__get16le(s);
5123 int tga_bits_per_pixel = stbi__get8(s);
5124 int tga_comp, tga_rgb16=0;
5125 int tga_inverted = stbi__get8(s);
5126 // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5127 // image data
5128 unsigned char *tga_data;
5129 unsigned char *tga_palette = NULL;
5130 int i, j;
5131 unsigned char raw_data[4];
5132 int RLE_count = 0;
5133 int RLE_repeating = 0;
5134 int read_next_pixel = 1;
5135
// do a tiny bit of processing
5137 if ( tga_image_type >= 8 )
5138 {
5139 tga_image_type -= 8;
5140 tga_is_RLE = 1;
5141 }
5142 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5143
5144 // If I'm paletted, then I'll use the number of bits from the palette
5145 if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5146 else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5147
5148 if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5149 return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5150
5151 // tga info
5152 *x = tga_width;
5153 *y = tga_height;
5154 if (comp) *comp = tga_comp;
5155
5156 tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
5157 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5158
5159 // skip to the data's starting position (offset usually = 0)
5160 stbi__skip(s, tga_offset );
5161
5162 if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5163 for (i=0; i < tga_height; ++i) {
5164 int row = tga_inverted ? tga_height -i - 1 : i;
5165 stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5166 stbi__getn(s, tga_row, tga_width * tga_comp);
5167 }
5168 } else {
5169 // do I need to load a palette?
5170 if ( tga_indexed)
5171 {
5172 // any data to skip? (offset usually = 0)
5173 stbi__skip(s, tga_palette_start );
5174 // load the palette
5175 tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
5176 if (!tga_palette) {
5177 STBI_FREE(tga_data);
5178 return stbi__errpuc("outofmem", "Out of memory");
5179 }
5180 if (tga_rgb16) {
5181 stbi_uc *pal_entry = tga_palette;
5182 STBI_ASSERT(tga_comp == STBI_rgb);
5183 for (i=0; i < tga_palette_len; ++i) {
5184 stbi__tga_read_rgb16(s, pal_entry);
5185 pal_entry += tga_comp;
5186 }
5187 } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5188 STBI_FREE(tga_data);
5189 STBI_FREE(tga_palette);
5190 return stbi__errpuc("bad palette", "Corrupt TGA");
5191 }
5192 }
5193 // load the data
5194 for (i=0; i < tga_width * tga_height; ++i)
5195 {
// if I'm in RLE mode, do I need to get a RLE chunk?
5197 if ( tga_is_RLE )
5198 {
5199 if ( RLE_count == 0 )
5200 {
5201 // yep, get the next byte as a RLE command
5202 int RLE_cmd = stbi__get8(s);
5203 RLE_count = 1 + (RLE_cmd & 127);
5204 RLE_repeating = RLE_cmd >> 7;
5205 read_next_pixel = 1;
5206 } else if ( !RLE_repeating )
5207 {
5208 read_next_pixel = 1;
5209 }
5210 } else
5211 {
5212 read_next_pixel = 1;
5213 }
5214 // OK, if I need to read a pixel, do it now
5215 if ( read_next_pixel )
5216 {
5217 // load however much data we did have
5218 if ( tga_indexed )
5219 {
5220 // read in index, then perform the lookup
5221 int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5222 if ( pal_idx >= tga_palette_len ) {
5223 // invalid index
5224 pal_idx = 0;
5225 }
5226 pal_idx *= tga_comp;
5227 for (j = 0; j < tga_comp; ++j) {
5228 raw_data[j] = tga_palette[pal_idx+j];
5229 }
5230 } else if(tga_rgb16) {
5231 STBI_ASSERT(tga_comp == STBI_rgb);
5232 stbi__tga_read_rgb16(s, raw_data);
5233 } else {
5234 // read in the data raw
5235 for (j = 0; j < tga_comp; ++j) {
5236 raw_data[j] = stbi__get8(s);
5237 }
5238 }
5239 // clear the reading flag for the next pixel
5240 read_next_pixel = 0;
5241 } // end of reading a pixel
5242
5243 // copy data
5244 for (j = 0; j < tga_comp; ++j)
5245 tga_data[i*tga_comp+j] = raw_data[j];
5246
5247 // in case we're in RLE mode, keep counting down
5248 --RLE_count;
5249 }
5250 // do I need to invert the image?
5251 if ( tga_inverted )
5252 {
5253 for (j = 0; j*2 < tga_height; ++j)
5254 {
5255 int index1 = j * tga_width * tga_comp;
5256 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5257 for (i = tga_width * tga_comp; i > 0; --i)
5258 {
5259 unsigned char temp = tga_data[index1];
5260 tga_data[index1] = tga_data[index2];
5261 tga_data[index2] = temp;
5262 ++index1;
5263 ++index2;
5264 }
5265 }
5266 }
5267 // clear my palette, if I had one
5268 if ( tga_palette != NULL )
5269 {
5270 STBI_FREE( tga_palette );
5271 }
5272 }
5273
5274 // swap RGB - if the source data was RGB16, it already is in the right order
5275 if (tga_comp >= 3 && !tga_rgb16)
5276 {
5277 unsigned char* tga_pixel = tga_data;
5278 for (i=0; i < tga_width * tga_height; ++i)
5279 {
5280 unsigned char temp = tga_pixel[0];
5281 tga_pixel[0] = tga_pixel[2];
5282 tga_pixel[2] = temp;
5283 tga_pixel += tga_comp;
5284 }
5285 }
5286
5287 // convert to target component count
5288 if (req_comp && req_comp != tga_comp)
5289 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5290
5291 // the things I do to get rid of an error message, and yet keep
5292 // Microsoft's C compilers happy... [8^(
5293 tga_palette_start = tga_palette_len = tga_palette_bits =
5294 tga_x_origin = tga_y_origin = 0;
5295 // OK, done
5296 return tga_data;
5297 }
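/* Example sketch (not part of stb_image): the RLE scheme handled by the
   pixel loop above, written as a standalone decoder over a memory buffer.
   Each packet starts with a command byte: the low 7 bits plus one give the
   pixel count, and the high bit selects between a run packet (one pixel
   repeated) and a raw packet (that many literal pixels). The function name,
   parameters and the bounds handling are assumptions of this sketch; the
   loader itself reads through a stbi__context instead.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: decode TGA RLE data into out, which has room for
// max_pixels pixels of bpp bytes each. Returns the number of pixels written;
// stops early if the input runs out.
static size_t example_tga_rle_decode(const unsigned char *in, size_t in_len,
                                     unsigned char *out, size_t max_pixels, size_t bpp)
{
   size_t ip = 0, written = 0;
   while (written < max_pixels && ip < in_len) {
      unsigned char cmd = in[ip++];
      size_t count = 1 + (size_t) (cmd & 127);
      size_t k;
      if (cmd & 128) {                        // run packet: one pixel, repeated
         if (in_len - ip < bpp) break;
         for (k = 0; k < count && written < max_pixels; ++k, ++written)
            memcpy(out + written * bpp, in + ip, bpp);
         ip += bpp;
      } else {                                // raw packet: count literal pixels
         for (k = 0; k < count && written < max_pixels; ++k, ++written) {
            if (in_len - ip < bpp) return written;
            memcpy(out + written * bpp, in + ip, bpp);
            ip += bpp;
         }
      }
   }
   return written;
}
```

   For 8-bit data, the bytes 0x82 0xAA 0x01 0x10 0x20 decode to three copies
   of 0xAA followed by 0x10 and 0x20.
*/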
5298 #endif
5299
5300 // *************************************************************************************************
5301 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5302
5303 #ifndef STBI_NO_PSD
static int stbi__psd_test(stbi__context *s)
5305 {
5306 int r = (stbi__get32be(s) == 0x38425053);
5307 stbi__rewind(s);
5308 return r;
5309 }
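/* Illustrative sketch, not part of stb_image: the constant 0x38425053 tested
   above is the four ASCII bytes '8','B','P','S' read as one big-endian 32-bit
   value. The helpers sketch_read_u32be and sketch_is_psd are hypothetical
   names used only here; they work on a memory buffer instead of a
   stbi__context.

```c
#include <assert.h>
#include <stdint.h>

// Assemble a big-endian 32-bit value from four consecutive bytes.
static uint32_t sketch_read_u32be(const unsigned char *p)
{
   assert(p != 0);
   return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
          ((uint32_t) p[2] <<  8) |  (uint32_t) p[3];
}

// Returns 1 if the buffer starts with the PSD signature "8BPS".
static int sketch_is_psd(const unsigned char *p)
{
   return sketch_read_u32be(p) == 0x38425053u;
}
```
*/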
5310
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5312 {
5313 int pixelCount;
5314 int channelCount, compression;
5315 int channel, i, count, len;
5316 int bitdepth;
5317 int w,h;
5318 stbi_uc *out;
5319
5320 // Check identifier
5321 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5322 return stbi__errpuc("not PSD", "Corrupt PSD image");
5323
5324 // Check file type version.
5325 if (stbi__get16be(s) != 1)
5326 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5327
5328 // Skip 6 reserved bytes.
5329 stbi__skip(s, 6 );
5330
5331 // Read the number of channels (R, G, B, A, etc).
5332 channelCount = stbi__get16be(s);
5333 if (channelCount < 0 || channelCount > 16)
5334 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5335
5336 // Read the rows and columns of the image.
5337 h = stbi__get32be(s);
5338 w = stbi__get32be(s);
5339
   // Make sure the depth is 8 or 16 bits.
5341 bitdepth = stbi__get16be(s);
5342 if (bitdepth != 8 && bitdepth != 16)
5343 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5344
5345 // Make sure the color mode is RGB.
5346 // Valid options are:
5347 // 0: Bitmap
5348 // 1: Grayscale
5349 // 2: Indexed color
5350 // 3: RGB color
5351 // 4: CMYK color
5352 // 7: Multichannel
5353 // 8: Duotone
5354 // 9: Lab color
5355 if (stbi__get16be(s) != 3)
5356 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5357
5358 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5359 stbi__skip(s,stbi__get32be(s) );
5360
5361 // Skip the image resources. (resolution, pen tool paths, etc)
5362 stbi__skip(s, stbi__get32be(s) );
5363
5364 // Skip the reserved data.
5365 stbi__skip(s, stbi__get32be(s) );
5366
5367 // Find out if the data is compressed.
5368 // Known values:
5369 // 0: no compression
5370 // 1: RLE compressed
5371 compression = stbi__get16be(s);
5372 if (compression > 1)
5373 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5374
5375 // Create the destination image.
5376 out = (stbi_uc *) stbi__malloc(4 * w*h);
5377 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5378 pixelCount = w*h;
5379
5380 // Initialize the data to zero.
5381 //memset( out, 0, pixelCount * 4 );
5382
5383 // Finally, the image data.
5384 if (compression) {
5385 // RLE as used by .PSD and .TIFF
5386 // Loop until you get the number of unpacked bytes you are expecting:
5387 // Read the next source byte into n.
5388 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5389 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
      //         Else if n is -128 (read here as the unsigned byte 128), noop.
5391 // Endloop
5392
      // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
      // which we're going to just skip.
5395 stbi__skip(s, h * channelCount * 2 );
5396
5397 // Read the RLE data by channel.
5398 for (channel = 0; channel < 4; channel++) {
5399 stbi_uc *p;
5400
5401 p = out+channel;
5402 if (channel >= channelCount) {
5403 // Fill this channel with default data.
5404 for (i = 0; i < pixelCount; i++, p += 4)
5405 *p = (channel == 3 ? 255 : 0);
5406 } else {
5407 // Read the RLE data.
5408 count = 0;
5409 while (count < pixelCount) {
5410 len = stbi__get8(s);
5411 if (len == 128) {
5412 // No-op.
               } else if (len < 128) {
                  // Copy next len+1 bytes literally.
                  len++;
                  // Reject runs that would write past the end of this channel.
                  if (len > pixelCount - count) {
                     STBI_FREE(out);
                     return stbi__errpuc("corrupt", "bad RLE data");
                  }
                  count += len;
                  while (len) {
                     *p = stbi__get8(s);
                     p += 4;
                     len--;
                  }
               } else if (len > 128) {
                  stbi_uc val;
                  // Next -len+1 bytes in the dest are replicated from next source byte.
                  // (Interpret len as a negative 8-bit int.)
                  len ^= 0x0FF;
                  len += 2;
                  // Reject runs that would write past the end of this channel.
                  if (len > pixelCount - count) {
                     STBI_FREE(out);
                     return stbi__errpuc("corrupt", "bad RLE data");
                  }
                  val = stbi__get8(s);
                  count += len;
                  while (len) {
                     *p = val;
                     p += 4;
                     len--;
                  }
5435 }
5436 }
5437 }
5438 }
5439
5440 } else {
5441 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5442 // where each channel consists of an 8-bit value for each pixel in the image.
5443
5444 // Read the data by channel.
5445 for (channel = 0; channel < 4; channel++) {
5446 stbi_uc *p;
5447
5448 p = out + channel;
5449 if (channel >= channelCount) {
5450 // Fill this channel with default data.
5451 stbi_uc val = channel == 3 ? 255 : 0;
5452 for (i = 0; i < pixelCount; i++, p += 4)
5453 *p = val;
5454 } else {
5455 // Read the data.
5456 if (bitdepth == 16) {
5457 for (i = 0; i < pixelCount; i++, p += 4)
5458 *p = (stbi_uc) (stbi__get16be(s) >> 8);
5459 } else {
5460 for (i = 0; i < pixelCount; i++, p += 4)
5461 *p = stbi__get8(s);
5462 }
5463 }
5464 }
5465 }
5466
5467 if (channelCount >= 4) {
5468 for (i=0; i < w*h; ++i) {
5469 unsigned char *pixel = out + 4*i;
5470 if (pixel[3] != 0 && pixel[3] != 255) {
5471 // remove weird white matte from PSD
5472 float a = pixel[3] / 255.0f;
5473 float ra = 1.0f / a;
5474 float inv_a = 255.0f * (1 - ra);
5475 pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
5476 pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
5477 pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
5478 }
5479 }
5480 }
5481
5482 if (req_comp && req_comp != 4) {
5483 out = stbi__convert_format(out, 4, req_comp, w, h);
5484 if (out == NULL) return out; // stbi__convert_format frees input on failure
5485 }
5486
5487 if (comp) *comp = 4;
5488 *y = h;
5489 *x = w;
5490
5491 return out;
5492 }
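/* Illustrative sketch, not part of stb_image: the RLE scheme described in
   stbi__psd_load is the PackBits encoding. The decoder below works on plain
   memory buffers and checks both the input and output bounds. The function
   name sketch_unpackbits and its return convention are assumptions made for
   this example only.

```c
#include <assert.h>
#include <stddef.h>

// Decode PackBits data until dst_len bytes have been produced.
// Returns the number of bytes written, or -1 if the input is truncated
// or a run would overflow the output buffer.
static long sketch_unpackbits(const unsigned char *src, size_t src_len,
                              unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;
   assert(src != 0 && dst != 0);
   while (out < dst_len) {
      unsigned n;
      size_t run, k;
      if (in >= src_len) return -1;
      n = src[in++];
      if (n == 128) continue;               // no-op marker
      if (n < 128) {                        // literal run of n+1 bytes
         run = (size_t) n + 1;
         if (run > dst_len - out || run > src_len - in) return -1;
         for (k = 0; k < run; ++k) dst[out++] = src[in++];
      } else {                              // repeat the next byte 257-n times
         run = 257 - (size_t) n;
         if (run > dst_len - out || in >= src_len) return -1;
         for (k = 0; k < run; ++k) dst[out++] = src[in];
         ++in;
      }
   }
   return (long) out;
}
```

   The expression 257-n is the same quantity the loader computes with
   "len ^= 0x0FF; len += 2;" for byte values above 128.
*/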
5493 #endif
5494
5495 // *************************************************************************************************
5496 // Softimage PIC loader
5497 // by Tom Seddon
5498 //
5499 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5500 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5501
5502 #ifndef STBI_NO_PIC
static int stbi__pic_is4(stbi__context *s,const char *str)
5504 {
5505 int i;
5506 for (i=0; i<4; ++i)
5507 if (stbi__get8(s) != (stbi_uc)str[i])
5508 return 0;
5509
5510 return 1;
5511 }
5512
static int stbi__pic_test_core(stbi__context *s)
5514 {
5515 int i;
5516
5517 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5518 return 0;
5519
5520 for(i=0;i<84;++i)
5521 stbi__get8(s);
5522
5523 if (!stbi__pic_is4(s,"PICT"))
5524 return 0;
5525
5526 return 1;
5527 }
5528
5529 typedef struct
5530 {
5531 stbi_uc size,type,channel;
5532 } stbi__pic_packet;
5533
static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5535 {
5536 int mask=0x80, i;
5537
5538 for (i=0; i<4; ++i, mask>>=1) {
5539 if (channel & mask) {
5540 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5541 dest[i]=stbi__get8(s);
5542 }
5543 }
5544
5545 return dest;
5546 }
5547
static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5549 {
5550 int mask=0x80,i;
5551
5552 for (i=0;i<4; ++i, mask>>=1)
5553 if (channel&mask)
5554 dest[i]=src[i];
5555 }
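/* Illustrative sketch, not part of stb_image: stbi__readval and stbi__copyval
   treat the packet's channel byte as a bit mask, walking from 0x80 down to
   0x10 to select R, G, B and A in that order. The standalone helper below
   (hypothetical name sketch_copy_masked) shows the effect of one mask.

```c
#include <assert.h>

// Copy only the components selected by channel_mask:
// 0x80 = red, 0x40 = green, 0x20 = blue, 0x10 = alpha.
static void sketch_copy_masked(int channel_mask, unsigned char dst[4],
                               const unsigned char src[4])
{
   int bit = 0x80, i;
   assert(dst != 0 && src != 0);
   for (i = 0; i < 4; ++i, bit >>= 1)
      if (channel_mask & bit)
         dst[i] = src[i];
}
```

   With mask 0xA0 only the red and blue components are overwritten; green and
   alpha keep whatever an earlier packet (or the 0xff fill) left there.
*/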
5556
static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5558 {
5559 int act_comp=0,num_packets=0,y,chained;
5560 stbi__pic_packet packets[10];
5561
5562 // this will (should...) cater for even some bizarre stuff like having data
5563 // for the same channel in multiple packets.
5564 do {
5565 stbi__pic_packet *packet;
5566
5567 if (num_packets==sizeof(packets)/sizeof(packets[0]))
5568 return stbi__errpuc("bad format","too many packets");
5569
5570 packet = &packets[num_packets++];
5571
5572 chained = stbi__get8(s);
5573 packet->size = stbi__get8(s);
5574 packet->type = stbi__get8(s);
5575 packet->channel = stbi__get8(s);
5576
5577 act_comp |= packet->channel;
5578
5579 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5580 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5581 } while (chained);
5582
5583 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5584
5585 for(y=0; y<height; ++y) {
5586 int packet_idx;
5587
5588 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5589 stbi__pic_packet *packet = &packets[packet_idx];
5590 stbi_uc *dest = result+y*width*4;
5591
5592 switch (packet->type) {
5593 default:
5594 return stbi__errpuc("bad format","packet has bad compression type");
5595
5596 case 0: {//uncompressed
5597 int x;
5598
5599 for(x=0;x<width;++x, dest+=4)
5600 if (!stbi__readval(s,packet->channel,dest))
5601 return 0;
5602 break;
5603 }
5604
5605 case 1://Pure RLE
5606 {
5607 int left=width, i;
5608
5609 while (left>0) {
5610 stbi_uc count,value[4];
5611
5612 count=stbi__get8(s);
5613 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5614
5615 if (count > left)
5616 count = (stbi_uc) left;
5617
5618 if (!stbi__readval(s,packet->channel,value)) return 0;
5619
5620 for(i=0; i<count; ++i,dest+=4)
5621 stbi__copyval(packet->channel,dest,value);
5622 left -= count;
5623 }
5624 }
5625 break;
5626
5627 case 2: {//Mixed RLE
5628 int left=width;
5629 while (left>0) {
5630 int count = stbi__get8(s), i;
5631 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5632
5633 if (count >= 128) { // Repeated
5634 stbi_uc value[4];
5635
5636 if (count==128)
5637 count = stbi__get16be(s);
5638 else
5639 count -= 127;
5640 if (count > left)
5641 return stbi__errpuc("bad file","scanline overrun");
5642
5643 if (!stbi__readval(s,packet->channel,value))
5644 return 0;
5645
5646 for(i=0;i<count;++i, dest += 4)
5647 stbi__copyval(packet->channel,dest,value);
5648 } else { // Raw
5649 ++count;
5650 if (count>left) return stbi__errpuc("bad file","scanline overrun");
5651
5652 for(i=0;i<count;++i, dest+=4)
5653 if (!stbi__readval(s,packet->channel,dest))
5654 return 0;
5655 }
5656 left-=count;
5657 }
5658 break;
5659 }
5660 }
5661 }
5662 }
5663
5664 return result;
5665 }
5666
static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5668 {
5669 stbi_uc *result;
5670 int i, x,y;
5671
5672 for (i=0; i<92; ++i)
5673 stbi__get8(s);
5674
5675 x = stbi__get16be(s);
5676 y = stbi__get16be(s);
5677 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
   if (x != 0 && (1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5679
5680 stbi__get32be(s); //skip `ratio'
5681 stbi__get16be(s); //skip `fields'
5682 stbi__get16be(s); //skip `pad'
5683
5684 // intermediate buffer is RGBA
   result = (stbi_uc *) stbi__malloc(x*y*4);
   if (!result) return stbi__errpuc("outofmem", "Out of memory");
   memset(result, 0xff, x*y*4);
5687
   if (!stbi__pic_load_core(s,x,y,comp, result)) {
      STBI_FREE(result);
      return 0; // failure reason already set; don't hand NULL to stbi__convert_format
   }
5692 *px = x;
5693 *py = y;
5694 if (req_comp == 0) req_comp = *comp;
5695 result=stbi__convert_format(result,4,req_comp,x,y);
5696
5697 return result;
5698 }
5699
static int stbi__pic_test(stbi__context *s)
5701 {
5702 int r = stbi__pic_test_core(s);
5703 stbi__rewind(s);
5704 return r;
5705 }
5706 #endif
5707
5708 // *************************************************************************************************
5709 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5710
5711 #ifndef STBI_NO_GIF
5712 typedef struct
5713 {
5714 stbi__int16 prefix;
5715 stbi_uc first;
5716 stbi_uc suffix;
5717 } stbi__gif_lzw;
5718
5719 typedef struct
5720 {
5721 int w,h;
5722 stbi_uc *out, *old_out; // output buffer (always 4 components)
5723 int flags, bgindex, ratio, transparent, eflags, delay;
5724 stbi_uc pal[256][4];
5725 stbi_uc lpal[256][4];
5726 stbi__gif_lzw codes[4096];
5727 stbi_uc *color_table;
5728 int parse, step;
5729 int lflags;
5730 int start_x, start_y;
5731 int max_x, max_y;
5732 int cur_x, cur_y;
5733 int line_size;
5734 } stbi__gif;
5735
static int stbi__gif_test_raw(stbi__context *s)
5737 {
5738 int sz;
5739 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5740 sz = stbi__get8(s);
5741 if (sz != '9' && sz != '7') return 0;
5742 if (stbi__get8(s) != 'a') return 0;
5743 return 1;
5744 }
5745
static int stbi__gif_test(stbi__context *s)
5747 {
5748 int r = stbi__gif_test_raw(s);
5749 stbi__rewind(s);
5750 return r;
5751 }
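/* Illustrative sketch, not part of stb_image: stbi__gif_test_raw accepts the
   two GIF signatures "GIF87a" and "GIF89a". On a memory buffer the same test
   can be written with memcmp. The name sketch_is_gif is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Returns 1 if the first six bytes are a GIF87a or GIF89a signature.
static int sketch_is_gif(const unsigned char *p, size_t n)
{
   assert(p != 0);
   if (n < 6) return 0;
   return memcmp(p, "GIF87a", 6) == 0 || memcmp(p, "GIF89a", 6) == 0;
}
```
*/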
5752
static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5754 {
5755 int i;
5756 for (i=0; i < num_entries; ++i) {
5757 pal[i][2] = stbi__get8(s);
5758 pal[i][1] = stbi__get8(s);
5759 pal[i][0] = stbi__get8(s);
5760 pal[i][3] = transp == i ? 0 : 255;
5761 }
5762 }
5763
static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5765 {
5766 stbi_uc version;
5767 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5768 return stbi__err("not GIF", "Corrupt GIF");
5769
5770 version = stbi__get8(s);
5771 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5772 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5773
5774 stbi__g_failure_reason = "";
5775 g->w = stbi__get16le(s);
5776 g->h = stbi__get16le(s);
5777 g->flags = stbi__get8(s);
5778 g->bgindex = stbi__get8(s);
5779 g->ratio = stbi__get8(s);
5780 g->transparent = -1;
5781
5782 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5783
5784 if (is_info) return 1;
5785
5786 if (g->flags & 0x80)
5787 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5788
5789 return 1;
5790 }
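/* Illustrative sketch, not part of stb_image: in stbi__gif_header, bit 0x80
   of the flags byte says whether a global color table follows, and the low
   three bits give its size as 2 << (flags & 7) entries. The helper name
   sketch_gif_table_entries is hypothetical.

```c
#include <assert.h>

// Number of color table entries encoded in a GIF flags byte,
// or 0 when the "table present" bit (0x80) is clear.
static int sketch_gif_table_entries(unsigned char flags)
{
   if (!(flags & 0x80)) return 0;
   return 2 << (flags & 7);
}
```

   The result is always a power of two between 2 and 256, which is why the
   palette arrays in stbi__gif are sized for 256 entries.
*/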
5791
static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5793 {
   stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
   if (!g) return stbi__err("outofmem", "Out of memory");
5795 if (!stbi__gif_header(s, g, comp, 1)) {
5796 STBI_FREE(g);
5797 stbi__rewind( s );
5798 return 0;
5799 }
5800 if (x) *x = g->w;
5801 if (y) *y = g->h;
5802 STBI_FREE(g);
5803 return 1;
5804 }
5805
static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5807 {
5808 stbi_uc *p, *c;
5809
5810 // recurse to decode the prefixes, since the linked-list is backwards,
5811 // and working backwards through an interleaved image would be nasty
5812 if (g->codes[code].prefix >= 0)
5813 stbi__out_gif_code(g, g->codes[code].prefix);
5814
5815 if (g->cur_y >= g->max_y) return;
5816
5817 p = &g->out[g->cur_x + g->cur_y];
5818 c = &g->color_table[g->codes[code].suffix * 4];
5819
5820 if (c[3] >= 128) {
5821 p[0] = c[2];
5822 p[1] = c[1];
5823 p[2] = c[0];
5824 p[3] = c[3];
5825 }
5826 g->cur_x += 4;
5827
5828 if (g->cur_x >= g->max_x) {
5829 g->cur_x = g->start_x;
5830 g->cur_y += g->step;
5831
5832 while (g->cur_y >= g->max_y && g->parse > 0) {
5833 g->step = (1 << g->parse) * g->line_size;
5834 g->cur_y = g->start_y + (g->step >> 1);
5835 --g->parse;
5836 }
5837 }
5838 }
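/* Illustrative sketch, not part of stb_image: the step/parse bookkeeping at
   the end of stbi__out_gif_code produces the GIF interlace order (every 8th
   row from 0, then every 8th from 4, every 4th from 2, every 2nd from 1).
   The helper below reproduces that sequence in row units instead of byte
   offsets. The name sketch_gif_interlace_order is hypothetical, and the
   caller is assumed to supply room for `height` entries.

```c
#include <assert.h>

// Fill rows[0..height-1] with the order in which an interlaced GIF
// delivers its rows. Returns the number of rows written.
static int sketch_gif_interlace_order(int height, int *rows)
{
   int step = 8, parse = 3, y = 0, n = 0;
   assert(rows != 0);
   while (n < height) {
      rows[n++] = y;
      y += step;
      // Past the bottom: move on to the next, finer pass.
      while (y >= height && parse > 0) {
         step = 1 << parse;
         y = step >> 1;
         --parse;
      }
   }
   return n;
}
```
*/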
5839
static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5841 {
5842 stbi_uc lzw_cs;
5843 stbi__int32 len, init_code;
5844 stbi__uint32 first;
5845 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5846 stbi__gif_lzw *p;
5847
5848 lzw_cs = stbi__get8(s);
5849 if (lzw_cs > 12) return NULL;
5850 clear = 1 << lzw_cs;
5851 first = 1;
5852 codesize = lzw_cs + 1;
5853 codemask = (1 << codesize) - 1;
5854 bits = 0;
5855 valid_bits = 0;
5856 for (init_code = 0; init_code < clear; init_code++) {
5857 g->codes[init_code].prefix = -1;
5858 g->codes[init_code].first = (stbi_uc) init_code;
5859 g->codes[init_code].suffix = (stbi_uc) init_code;
5860 }
5861
5862 // support no starting clear code
5863 avail = clear+2;
5864 oldcode = -1;
5865
5866 len = 0;
5867 for(;;) {
5868 if (valid_bits < codesize) {
5869 if (len == 0) {
5870 len = stbi__get8(s); // start new block
5871 if (len == 0)
5872 return g->out;
5873 }
5874 --len;
5875 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5876 valid_bits += 8;
5877 } else {
5878 stbi__int32 code = bits & codemask;
5879 bits >>= codesize;
5880 valid_bits -= codesize;
5881 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5882 if (code == clear) { // clear code
5883 codesize = lzw_cs + 1;
5884 codemask = (1 << codesize) - 1;
5885 avail = clear + 2;
5886 oldcode = -1;
5887 first = 0;
5888 } else if (code == clear + 1) { // end of stream code
5889 stbi__skip(s, len);
5890 while ((len = stbi__get8(s)) > 0)
5891 stbi__skip(s,len);
5892 return g->out;
5893 } else if (code <= avail) {
5894 if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5895
5896 if (oldcode >= 0) {
5897 p = &g->codes[avail++];
5898 if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5899 p->prefix = (stbi__int16) oldcode;
5900 p->first = g->codes[oldcode].first;
5901 p->suffix = (code == avail) ? p->first : g->codes[code].first;
5902 } else if (code == avail)
5903 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5904
5905 stbi__out_gif_code(g, (stbi__uint16) code);
5906
5907 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5908 codesize++;
5909 codemask = (1 << codesize) - 1;
5910 }
5911
5912 oldcode = code;
5913 } else {
5914 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5915 }
5916 }
5917 }
5918 }
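/* Illustrative sketch, not part of stb_image: stbi__process_gif_raster pulls
   LZW codes out of the byte stream least-significant-bit first, topping up a
   bit accumulator one byte at a time. The helper below isolates just that
   bit reader for a fixed code size; it does no LZW decoding and ignores GIF
   sub-block framing. The name sketch_read_codes is hypothetical.

```c
#include <assert.h>

// Extract up to max_codes codes of `codesize` bits each (1..12),
// LSB-first, from src. Returns the number of codes extracted.
static int sketch_read_codes(const unsigned char *src, int src_len,
                             int codesize, int *codes, int max_codes)
{
   unsigned bits = 0;
   int valid_bits = 0, in = 0, n = 0;
   int codemask = (1 << codesize) - 1;
   assert(src != 0 && codes != 0 && codesize >= 1 && codesize <= 12);
   while (n < max_codes) {
      if (valid_bits < codesize) {
         if (in >= src_len) break;            // out of input
         bits |= (unsigned) src[in++] << valid_bits;
         valid_bits += 8;
      } else {
         codes[n++] = (int) (bits & (unsigned) codemask);
         bits >>= codesize;
         valid_bits -= codesize;
      }
   }
   return n;
}
```

   In the real decoder the code size grows as the table fills, so codemask is
   recomputed whenever codesize changes.
*/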
5919
static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5921 {
5922 int x, y;
5923 stbi_uc *c = g->pal[g->bgindex];
5924 for (y = y0; y < y1; y += 4 * g->w) {
5925 for (x = x0; x < x1; x += 4) {
5926 stbi_uc *p = &g->out[y + x];
5927 p[0] = c[2];
5928 p[1] = c[1];
5929 p[2] = c[0];
5930 p[3] = 0;
5931 }
5932 }
5933 }
5934
5935 // this function is designed to support animated gifs, although stb_image doesn't support it
static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5937 {
5938 int i;
5939 stbi_uc *prev_out = 0;
5940
5941 if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5942 return 0; // stbi__g_failure_reason set by stbi__gif_header
5943
5944 prev_out = g->out;
5945 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5946 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5947
5948 switch ((g->eflags & 0x1C) >> 2) {
5949 case 0: // unspecified (also always used on 1st frame)
5950 stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5951 break;
5952 case 1: // do not dispose
5953 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5954 g->old_out = prev_out;
5955 break;
5956 case 2: // dispose to background
5957 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5958 stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5959 break;
5960 case 3: // dispose to previous
5961 if (g->old_out) {
5962 for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5963 memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5964 }
5965 break;
5966 }
5967
5968 for (;;) {
5969 switch (stbi__get8(s)) {
5970 case 0x2C: /* Image Descriptor */
5971 {
5972 int prev_trans = -1;
5973 stbi__int32 x, y, w, h;
5974 stbi_uc *o;
5975
5976 x = stbi__get16le(s);
5977 y = stbi__get16le(s);
5978 w = stbi__get16le(s);
5979 h = stbi__get16le(s);
5980 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5981 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5982
5983 g->line_size = g->w * 4;
5984 g->start_x = x * 4;
5985 g->start_y = y * g->line_size;
5986 g->max_x = g->start_x + w * 4;
5987 g->max_y = g->start_y + h * g->line_size;
5988 g->cur_x = g->start_x;
5989 g->cur_y = g->start_y;
5990
5991 g->lflags = stbi__get8(s);
5992
5993 if (g->lflags & 0x40) {
5994 g->step = 8 * g->line_size; // first interlaced spacing
5995 g->parse = 3;
5996 } else {
5997 g->step = g->line_size;
5998 g->parse = 0;
5999 }
6000
6001 if (g->lflags & 0x80) {
6002 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
6003 g->color_table = (stbi_uc *) g->lpal;
6004 } else if (g->flags & 0x80) {
6005 if (g->transparent >= 0 && (g->eflags & 0x01)) {
6006 prev_trans = g->pal[g->transparent][3];
6007 g->pal[g->transparent][3] = 0;
6008 }
6009 g->color_table = (stbi_uc *) g->pal;
6010 } else
6011 return stbi__errpuc("missing color table", "Corrupt GIF");
6012
6013 o = stbi__process_gif_raster(s, g);
6014 if (o == NULL) return NULL;
6015
6016 if (prev_trans != -1)
6017 g->pal[g->transparent][3] = (stbi_uc) prev_trans;
6018
6019 return o;
6020 }
6021
         case 0x21: // Extension Introducer (Graphic Control, Comment, Application, ...)
6023 {
6024 int len;
6025 if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
6026 len = stbi__get8(s);
6027 if (len == 4) {
6028 g->eflags = stbi__get8(s);
6029 g->delay = stbi__get16le(s);
6030 g->transparent = stbi__get8(s);
6031 } else {
6032 stbi__skip(s, len);
6033 break;
6034 }
6035 }
6036 while ((len = stbi__get8(s)) != 0)
6037 stbi__skip(s, len);
6038 break;
6039 }
6040
6041 case 0x3B: // gif stream termination code
6042 return (stbi_uc *) s; // using '1' causes warning on some compilers
6043
6044 default:
6045 return stbi__errpuc("unknown code", "Corrupt GIF");
6046 }
6047 }
6048
6049 STBI_NOTUSED(req_comp);
6050 }
6051
static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6053 {
6054 stbi_uc *u = 0;
   stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
   if (!g) return stbi__errpuc("outofmem", "Out of memory");
   memset(g, 0, sizeof(*g));
6057
6058 u = stbi__gif_load_next(s, g, comp, req_comp);
6059 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
6060 if (u) {
6061 *x = g->w;
6062 *y = g->h;
6063 if (req_comp && req_comp != 4)
6064 u = stbi__convert_format(u, 4, req_comp, g->w, g->h);
6065 }
6066 else if (g->out)
6067 STBI_FREE(g->out);
6068 STBI_FREE(g);
6069 return u;
6070 }
6071
static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
6073 {
6074 return stbi__gif_info_raw(s,x,y,comp);
6075 }
6076 #endif
6077
6078 // *************************************************************************************************
6079 // Radiance RGBE HDR loader
6080 // originally by Nicolas Schulz
6081 #ifndef STBI_NO_HDR
static int stbi__hdr_test_core(stbi__context *s)
6083 {
6084 const char *signature = "#?RADIANCE\n";
6085 int i;
6086 for (i=0; signature[i]; ++i)
6087 if (stbi__get8(s) != signature[i])
6088 return 0;
6089 return 1;
6090 }
6091
static int stbi__hdr_test(stbi__context* s)
6093 {
6094 int r = stbi__hdr_test_core(s);
6095 stbi__rewind(s);
6096 return r;
6097 }
6098
6099 #define STBI__HDR_BUFLEN 1024
static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
6101 {
6102 int len=0;
6103 char c = '\0';
6104
6105 c = (char) stbi__get8(z);
6106
6107 while (!stbi__at_eof(z) && c != '\n') {
6108 buffer[len++] = c;
6109 if (len == STBI__HDR_BUFLEN-1) {
6110 // flush to end of line
6111 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
6112 ;
6113 break;
6114 }
6115 c = (char) stbi__get8(z);
6116 }
6117
6118 buffer[len] = 0;
6119 return buffer;
6120 }
6121
static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
6123 {
6124 if ( input[3] != 0 ) {
6125 float f1;
6126 // Exponent
6127 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
6128 if (req_comp <= 2)
6129 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
6130 else {
6131 output[0] = input[0] * f1;
6132 output[1] = input[1] * f1;
6133 output[2] = input[2] * f1;
6134 }
6135 if (req_comp == 2) output[1] = 1;
6136 if (req_comp == 4) output[3] = 1;
6137 } else {
6138 switch (req_comp) {
6139 case 4: output[3] = 1; /* fallthrough */
6140 case 3: output[0] = output[1] = output[2] = 0;
6141 break;
6142 case 2: output[1] = 1; /* fallthrough */
6143 case 1: output[0] = 0;
6144 break;
6145 }
6146 }
6147 }
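/* Illustrative sketch, not part of stb_image: stbi__hdr_convert turns an RGBE
   pixel into floats by scaling each 8-bit mantissa with 2^(E - 136), where
   136 = 128 (exponent bias) + 8 (mantissa bits); E == 0 means black. The
   version below handles only the 3-component case. To stay free of libm it
   swaps ldexp for a small hypothetical helper, sketch_pow2; both helper
   names are assumptions made for this example.

```c
#include <assert.h>

// 2^e for small integer e, computed by repeated doubling or halving.
// Stand-in for ldexp(1.0f, e) so the sketch needs no math library.
static float sketch_pow2(int e)
{
   float f = 1.0f;
   while (e > 0) { f *= 2.0f; --e; }
   while (e < 0) { f *= 0.5f; ++e; }
   return f;
}

// Convert one RGBE pixel to linear float RGB.
static void sketch_rgbe_to_rgb(const unsigned char rgbe[4], float rgb[3])
{
   assert(rgbe != 0 && rgb != 0);
   if (rgbe[3] == 0) {
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   } else {
      float scale = sketch_pow2((int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   }
}
```

   Sharing one exponent across the three channels is what lets the format
   cover a wide dynamic range in four bytes per pixel.
*/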
6148
static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6150 {
6151 char buffer[STBI__HDR_BUFLEN];
6152 char *token;
6153 int valid = 0;
6154 int width, height;
6155 stbi_uc *scanline;
6156 float *hdr_data;
6157 int len;
6158 unsigned char count, value;
6159 int i, j, k, c1,c2, z;
6160
6161
6162 // Check identifier
6163 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
6164 return stbi__errpf("not HDR", "Corrupt HDR image");
6165
6166 // Parse header
6167 for(;;) {
6168 token = stbi__hdr_gettoken(s,buffer);
6169 if (token[0] == 0) break;
6170 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6171 }
6172
6173 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
6174
6175 // Parse width and height
6176 // can't use sscanf() if we're not using stdio!
6177 token = stbi__hdr_gettoken(s,buffer);
6178 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6179 token += 3;
6180 height = (int) strtol(token, &token, 10);
6181 while (*token == ' ') ++token;
6182 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6183 token += 3;
6184 width = (int) strtol(token, NULL, 10);
6185
6186 *x = width;
6187 *y = height;
6188
6189 if (comp) *comp = 3;
6190 if (req_comp == 0) req_comp = 3;
6191
6192 // Read data
   hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
   if (!hdr_data) return stbi__errpf("outofmem", "Out of memory");
6194
6195 // Load image data
   // image data is stored as some number of scanlines
6197 if ( width < 8 || width >= 32768) {
6198 // Read flat data
6199 for (j=0; j < height; ++j) {
6200 for (i=0; i < width; ++i) {
6201 stbi_uc rgbe[4];
6202 main_decode_loop:
6203 stbi__getn(s, rgbe, 4);
6204 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6205 }
6206 }
6207 } else {
6208 // Read RLE-encoded data
6209 scanline = NULL;
6210
6211 for (j = 0; j < height; ++j) {
6212 c1 = stbi__get8(s);
6213 c2 = stbi__get8(s);
6214 len = stbi__get8(s);
6215 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6216 // not run-length encoded, so we have to actually use THIS data as a decoded
6217 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6218 stbi_uc rgbe[4];
6219 rgbe[0] = (stbi_uc) c1;
6220 rgbe[1] = (stbi_uc) c2;
6221 rgbe[2] = (stbi_uc) len;
6222 rgbe[3] = (stbi_uc) stbi__get8(s);
6223 stbi__hdr_convert(hdr_data, rgbe, req_comp);
6224 i = 1;
6225 j = 0;
6226 STBI_FREE(scanline);
6227 goto main_decode_loop; // yes, this makes no sense
6228 }
6229 len <<= 8;
6230 len |= stbi__get8(s);
6231 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
         if (scanline == NULL) {
            scanline = (stbi_uc *) stbi__malloc(width * 4);
            if (!scanline) { STBI_FREE(hdr_data); return stbi__errpf("outofmem", "Out of memory"); }
         }
6233
6234 for (k = 0; k < 4; ++k) {
6235 i = 0;
6236 while (i < width) {
6237 count = stbi__get8(s);
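               if (count > 128) {
                  // Run
                  value = stbi__get8(s);
                  count -= 128;
                  // a run must not extend past the end of the scanline
                  if (count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = value;
               } else {
                  // Dump
                  // a zero count would never advance i; an oversized one would overrun the scanline
                  if (count == 0 || count > width - i) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
                  for (z = 0; z < count; ++z)
                     scanline[i++ * 4 + k] = stbi__get8(s);
               }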
6249 }
6250 }
6251 for (i=0; i < width; ++i)
6252 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6253 }
6254 STBI_FREE(scanline);
6255 }
6256
6257 return hdr_data;
6258 }
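/* Illustrative sketch, not part of stb_image: stbi__hdr_load accepts only the
   "-Y <height> +X <width>" resolution line and parses it with strncmp and
   strtol, since sscanf is off limits when stdio is disabled. The same steps
   on a plain C string look roughly like this. The name
   sketch_parse_hdr_resolution is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Parse "-Y <height> +X <width>". Returns 1 on success, 0 for any
// other layout (for example "+X ... -Y ...").
static int sketch_parse_hdr_resolution(const char *line, int *width, int *height)
{
   char *end;
   assert(line != 0 && width != 0 && height != 0);
   if (strncmp(line, "-Y ", 3) != 0) return 0;
   *height = (int) strtol(line + 3, &end, 10);
   while (*end == ' ') ++end;
   if (strncmp(end, "+X ", 3) != 0) return 0;
   *width = (int) strtol(end + 3, NULL, 10);
   return 1;
}
```

   Like the loader, this sketch does not validate that the numbers are
   positive; a caller would want to check that before allocating.
*/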
6259
static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6261 {
6262 char buffer[STBI__HDR_BUFLEN];
6263 char *token;
6264 int valid = 0;
6265
6266 if (stbi__hdr_test(s) == 0) {
6267 stbi__rewind( s );
6268 return 0;
6269 }
6270
6271 for(;;) {
6272 token = stbi__hdr_gettoken(s,buffer);
6273 if (token[0] == 0) break;
6274 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6275 }
6276
6277 if (!valid) {
6278 stbi__rewind( s );
6279 return 0;
6280 }
6281 token = stbi__hdr_gettoken(s,buffer);
6282 if (strncmp(token, "-Y ", 3)) {
6283 stbi__rewind( s );
6284 return 0;
6285 }
6286 token += 3;
6287 *y = (int) strtol(token, &token, 10);
6288 while (*token == ' ') ++token;
6289 if (strncmp(token, "+X ", 3)) {
6290 stbi__rewind( s );
6291 return 0;
6292 }
6293 token += 3;
6294 *x = (int) strtol(token, NULL, 10);
6295 *comp = 3;
6296 return 1;
6297 }
6298 #endif // STBI_NO_HDR
6299
6300 #ifndef STBI_NO_BMP
static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6302 {
6303 void *p;
6304 stbi__bmp_data info;
6305
6306 info.all_a = 255;
6307 p = stbi__bmp_parse_header(s, &info);
6308 stbi__rewind( s );
6309 if (p == NULL)
6310 return 0;
6311 *x = s->img_x;
6312 *y = s->img_y;
6313 *comp = info.ma ? 4 : 3;
6314 return 1;
6315 }
6316 #endif
6317
6318 #ifndef STBI_NO_PSD
static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6320 {
6321 int channelCount;
6322 if (stbi__get32be(s) != 0x38425053) {
6323 stbi__rewind( s );
6324 return 0;
6325 }
6326 if (stbi__get16be(s) != 1) {
6327 stbi__rewind( s );
6328 return 0;
6329 }
6330 stbi__skip(s, 6);
6331 channelCount = stbi__get16be(s);
6332 if (channelCount < 0 || channelCount > 16) {
6333 stbi__rewind( s );
6334 return 0;
6335 }
6336 *y = stbi__get32be(s);
6337 *x = stbi__get32be(s);
   {
      // accept the same bit depths as stbi__psd_load (8 or 16)
      int depth = stbi__get16be(s);
      if (depth != 8 && depth != 16) {
         stbi__rewind( s );
         return 0;
      }
   }
6342 if (stbi__get16be(s) != 3) {
6343 stbi__rewind( s );
6344 return 0;
6345 }
6346 *comp = 4;
6347 return 1;
6348 }
6349 #endif
6350
6351 #ifndef STBI_NO_PIC
static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6353 {
6354 int act_comp=0,num_packets=0,chained;
6355 stbi__pic_packet packets[10];
6356
6357 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
6358 stbi__rewind(s);
6359 return 0;
6360 }
6361
6362 stbi__skip(s, 88);
6363
6364 *x = stbi__get16be(s);
6365 *y = stbi__get16be(s);
6366 if (stbi__at_eof(s)) {
6367 stbi__rewind( s);
6368 return 0;
6369 }
6370 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6371 stbi__rewind( s );
6372 return 0;
6373 }
6374
6375 stbi__skip(s, 8);
6376
6377 do {
6378 stbi__pic_packet *packet;
6379
6380 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6381 return 0;
6382
6383 packet = &packets[num_packets++];
6384 chained = stbi__get8(s);
6385 packet->size = stbi__get8(s);
6386 packet->type = stbi__get8(s);
6387 packet->channel = stbi__get8(s);
6388 act_comp |= packet->channel;
6389
6390 if (stbi__at_eof(s)) {
6391 stbi__rewind( s );
6392 return 0;
6393 }
6394 if (packet->size != 8) {
6395 stbi__rewind( s );
6396 return 0;
6397 }
6398 } while (chained);
6399
6400 *comp = (act_comp & 0x10 ? 4 : 3);
6401
6402 return 1;
6403 }
6404 #endif

// *************************************************************************************************
// Portable Gray Map and Portable Pixel Map loader
// by Ken Miller
//
// PGM: http://netpbm.sourceforge.net/doc/pgm.html
// PPM: http://netpbm.sourceforge.net/doc/ppm.html
//
// Known limitations:
//    Does not support ASCII image data (formats P2 and P3)
//    Does not support 16-bit-per-channel
//
// Comments in the header section ('#' to end of line) are skipped as of 2.09.

#ifndef STBI_NO_PNM

static int      stbi__pnm_test(stbi__context *s)
{
   char p, t;
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {
       stbi__rewind( s );
       return 0;
   }
   return 1;
}

static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi_uc *out;
   if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
      return 0;
   *x = s->img_x;
   *y = s->img_y;
   *comp = s->img_n;

   out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   stbi__getn(s, out, s->img_n * s->img_x * s->img_y);

   if (req_comp && req_comp != s->img_n) {
      out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   }
   return out;
}

static int      stbi__pnm_isspace(char c)
{
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
{
   for (;;) {
      while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
         *c = (char) stbi__get8(s);

      if (stbi__at_eof(s) || *c != '#')
         break;

      while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
         *c = (char) stbi__get8(s);
   }
}

static int      stbi__pnm_isdigit(char c)
{
   return c >= '0' && c <= '9';
}

static int      stbi__pnm_getinteger(stbi__context *s, char *c)
{
   int value = 0;

   while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
      value = value*10 + (*c - '0');
      *c = (char) stbi__get8(s);
   }

   return value;
}

static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
{
   int maxv;
   char c, p, t;

   stbi__rewind( s );

   // Get identifier
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {
       stbi__rewind( s );
       return 0;
   }

   *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm

   c = (char) stbi__get8(s);
   stbi__pnm_skip_whitespace(s, &c);

   *x = stbi__pnm_getinteger(s, &c); // read width
   stbi__pnm_skip_whitespace(s, &c);

   *y = stbi__pnm_getinteger(s, &c); // read height
   stbi__pnm_skip_whitespace(s, &c);

   maxv = stbi__pnm_getinteger(s, &c);  // read max value

   if (maxv > 255) {
      stbi__rewind( s );  // leave the stream at the start, like the other failure paths
      return stbi__err("max value > 255", "PPM image not 8-bit");
   }
   else
      return 1;
}
#endif
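/*
   Illustrative sketch only, not part of stb_image. The PNM header walk
   above (magic, then width, height and max value separated by whitespace
   or '#' comments) can be exercised on an in-memory buffer. All names
   starting with example_ are hypothetical. The sketch is deliberately
   stricter than stbi__pnm_info: it also rejects zero or overflowing
   numbers, which is an assumption about what a caller would want.

```c
#include <limits.h>
#include <stddef.h>

// Hypothetical read cursor over a byte buffer.
typedef struct {
   const unsigned char *data;
   size_t len, pos;
} example_cursor;

// Current byte, or -1 at end of buffer.
static int example_peek(const example_cursor *c)
{
   return c->pos < c->len ? c->data[c->pos] : -1;
}

// Skip whitespace and '#' comments, which run to the end of the line.
static void example_skip_ws_and_comments(example_cursor *c)
{
   for (;;) {
      int ch = example_peek(c);
      while (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\v' || ch == '\f' || ch == '\r') {
         c->pos++;
         ch = example_peek(c);
      }
      if (ch != '#') return;
      while (ch != -1 && ch != '\n' && ch != '\r') {
         c->pos++;
         ch = example_peek(c);
      }
   }
}

// Parse a run of decimal digits; returns -1 if the value would overflow int.
static int example_read_int(example_cursor *c)
{
   int value = 0, ch;
   while ((ch = example_peek(c)) >= '0' && ch <= '9') {
      if (value > (INT_MAX - 9) / 10) return -1;
      value = value*10 + (ch - '0');
      c->pos++;
   }
   return value;
}

// Returns 1 and fills *w, *h, *comp for a binary PGM (P5) or PPM (P6)
// header with max value 1..255, otherwise returns 0.
static int example_pnm_header(const unsigned char *buf, size_t len, int *w, int *h, int *comp)
{
   example_cursor c = { buf, len, 2 };
   int width, height, maxv;
   if (len < 2 || buf[0] != 'P' || (buf[1] != '5' && buf[1] != '6')) return 0;
   example_skip_ws_and_comments(&c);
   width = example_read_int(&c);
   example_skip_ws_and_comments(&c);
   height = example_read_int(&c);
   example_skip_ws_and_comments(&c);
   maxv = example_read_int(&c);
   if (width <= 0 || height <= 0 || maxv <= 0 || maxv > 255) return 0;
   *w = width;
   *h = height;
   *comp = (buf[1] == '6') ? 3 : 1;
   return 1;
}
```

   Peeking at the current byte instead of carrying a one-character lookahead
   keeps the helpers free of shared state; the loader above carries the
   lookahead in 'c' because it reads from a forward-only stream.
*/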

static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_info(s, x, y, comp)) return 1;
   #endif

   #ifndef STBI_NO_PNG
   if (stbi__png_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_GIF
   if (stbi__gif_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_BMP
   if (stbi__bmp_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PSD
   if (stbi__psd_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PIC
   if (stbi__pic_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PNM
   if (stbi__pnm_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_info(s, x, y, comp))  return 1;
   #endif

   // test tga last because it's a crappy test!
   #ifndef STBI_NO_TGA
   if (stbi__tga_info(s, x, y, comp))
       return 1;
   #endif
   return stbi__err("unknown image type", "Image not of any known type, or corrupt");
}

#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
{
    FILE *f = stbi__fopen(filename, "rb");
    int result;
    if (!f) return stbi__err("can't fopen", "Unable to open file");
    result = stbi_info_from_file(f, x, y, comp);
    fclose(f);
    return result;
}

STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
{
   int r;
   stbi__context s;
   long pos = ftell(f);
   stbi__start_file(&s, f);
   r = stbi__info_main(&s,x,y,comp);
   fseek(f,pos,SEEK_SET);
   return r;
}
#endif // !STBI_NO_STDIO

STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);
}

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);
}

#endif // STB_IMAGE_IMPLEMENTATION

/*
   revision history:
      2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
      2.11  (2016-04-02) allocate large structures on the stack
                         remove white matting for transparent PSD
                         fix reported channel count for PNG & BMP
                         re-enable SSE2 in non-gcc 64-bit
                         support RGB-formatted JPEG
                         read 16-bit PNGs (only as 8-bit)
      2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
      2.09  (2016-01-16) allow comments in PNM files
                         16-bit-per-pixel TGA (not bit-per-component)
                         info() for TGA could break due to .hdr handling
                         info() for BMP now shares code instead of sloppy parse
                         can use STBI_REALLOC_SIZED if allocator doesn't support realloc
                         code cleanup
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) fix compiler warnings
                         partial animated GIF support
                         limited 16-bpc PSD support
                         #ifdef unused functions
                         bug with < 92 byte PIC,PNM,HDR,TGA
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) extra corruption checking (mmozeiko)
                         stbi_set_flip_vertically_on_load (nguillemot)
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                         progressive JPEG (stb)
                         PGM/PPM support (Ken Miller)
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         GIF bugfix -- seemingly never worked
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                         optimize PNG (ryg)
                         fix bug in interlaced PNG with user-specified channel count (stb)
      1.46  (2014-08-26)
              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
      1.45  (2014-08-16)
              fix MSVC-ARM internal compiler error by wrapping malloc
      1.44  (2014-08-07)
              various warning fixes from Ronny Chevalier
      1.43  (2014-07-15)
              fix MSVC-only compiler problem in code changed in 1.42
      1.42  (2014-07-09)
              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
              fixes to stbi__cleanup_jpeg path
              added STBI_ASSERT to avoid requiring assert.h
      1.41  (2014-06-25)
              fix search&replace from 1.36 that messed up comments/error messages
      1.40  (2014-06-22)
              fix gcc struct-initialization warning
      1.39  (2014-06-15)
              fix to TGA optimization when req_comp != number of components in TGA;
              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
              add support for BMP version 5 (more ignored fields)
      1.38  (2014-06-06)
              suppress MSVC warnings on integer casts truncating values
              fix accidental rename of 'skip' field of I/O
      1.37  (2014-06-04)
              remove duplicate typedef
      1.36  (2014-06-03)
              convert to header file single-file library
              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
      1.35  (2014-05-27)
              various warnings
              fix broken STBI_SIMD path
              fix bug where stbi_load_from_file no longer left file pointer in correct place
              fix broken non-easy path for 32-bit BMP (possibly never used)
              TGA optimization by Arseny Kapoulkine
      1.34  (unknown)
              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
      1.33  (2011-07-14)
              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
      1.32  (2011-07-13)
              support for "info" function for all supported filetypes (SpartanJ)
      1.31  (2011-06-20)
              a few more leak fixes, bug in PNG handling (SpartanJ)
      1.30  (2011-06-11)
              added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
              removed deprecated format-specific test/load functions
              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
              fix inefficiency in decoding 32-bit BMP (David Woo)
      1.29  (2010-08-16)
              various warning fixes from Aurelien Pocheville
      1.28  (2010-08-01)
              fix bug in GIF palette transparency (SpartanJ)
      1.27  (2010-08-01)
              cast-to-stbi_uc to fix warnings
      1.26  (2010-07-24)
              fix bug in file buffering for PNG reported by SpartanJ
      1.25  (2010-07-17)
              refix trans_data warning (Won Chun)
      1.24  (2010-07-12)
              perf improvements reading from files on platforms with lock-heavy fgetc()
              minor perf improvements for jpeg
              deprecated type-specific functions so we'll get feedback if they're needed
              attempt to fix trans_data warning (Won Chun)
      1.23    fixed bug in iPhone support
      1.22  (2010-07-10)
              removed image *writing* support
              stbi_info support from Jetro Lauha
              GIF support from Jean-Marc Lienher
              iPhone PNG-extensions from James Brown
              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
      1.20    added support for Softimage PIC, by Tom Seddon
      1.19    bug in interlaced PNG corruption check (found by ryg)
      1.18  (2008-08-02)
              fix a threading bug (local mutable static)
      1.17    support interlaced PNG
      1.16    major bugfix - stbi__convert_format converted one too many pixels
      1.15    initialize some fields for thread safety
      1.14    fix threadsafe conversion bug
              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.13    threadsafe
      1.12    const qualifiers in the API
      1.11    Support installable IDCT, colorspace conversion routines
      1.10    Fixes for 64-bit (don't use "unsigned long")
              optimized upsampling by Fabian "ryg" Giesen
      1.09    Fix format-conversion for PSD code (bad global variables!)
      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07    attempt to fix C++ warning/errors again
      1.06    attempt to fix C++ warning/errors again
      1.05    fix TGA loading to return correct *comp and use good luminance calc
      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02    support for (subset of) HDR files, float interface for preferred access to them
      1.01    fix bug: possible bug in handling right-side up bmps... not sure
              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00    interface to zlib that skips zlib header
      0.99    correct handling of alpha in palette
      0.98    TGA loader by lonesock; dynamically add loaders (untested)
      0.97    jpeg errors on too large a file; also catch another malloc failure
      0.96    fix detection of invalid v value - particleman@mollyrocket forum
      0.95    during header scan, seek to markers in case of padding
      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93    handle jpegtran output; verbose errors
      0.92    read 4,8,16,24,32-bit BMP files of several formats
      0.91    output 24-bit Windows 3.0 BMP files
      0.90    fix a few more warnings; bump version number to approach 1.0
      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60    fix compiling as c++
      0.59    fix warnings: merge Dave Moore's -Wall fixes
      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56    fix bug: zlib uncompressed mode len vs. nlen
      0.55    fix bug: restart_interval not initialized to 0
      0.54    allow NULL for 'int *comp'
      0.53    fix bug in png 3->4; speedup png decoding
      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
              on 'test' only check type, not whether we support this variant
      0.50  (2006-11-19)
              first released version
*/