1 /* stb_image - v2.12 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
3
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
7
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17
18
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
22
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8-bit-per-channel (16 bpc added in 2.11, per the revision history)
25
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels, 8/16 bit-per-channel)
29
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
34
35 Animated GIF still needs a proper API, but here's one way to do it:
36 http://gist.github.com/urraka/685d9a6340b26b830d49
37
38 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
39 - decode from arbitrary I/O callbacks
40 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
41
42 Full documentation under "DOCUMENTATION" below.
43
44
45 Revision 2.00 release notes:
46
47 - Progressive JPEG is now supported.
48
49 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
50
51 - x86 platforms now make use of SSE2 SIMD instructions for
52 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
53 This work was done by Fabian "ryg" Giesen. SSE2 is used by
54 default, but NEON must be enabled explicitly; see docs.
55
56 With other JPEG optimizations included in this version, we see
57 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
58 on a JPEG on an ARM machine, relative to previous versions of this
59 library. The same results will not obtain for all JPGs and for all
60 x86/ARM machines. (Note that progressive JPEGs are significantly
61 slower to decode than regular JPEGs.) This doesn't mean that this
62 is the fastest JPEG decoder in the land; rather, it brings it
63 closer to parity with standard libraries. If you want the fastest
64 decode, look elsewhere. (See "Philosophy" section of docs below.)
65
66 See final bullet items below for more info on SIMD.
67
68 - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
69 the memory allocator. Unlike other STBI libraries, these macros don't
70 support a context parameter, so if you need to pass a context in to
71 the allocator, you'll have to store it in a global or a thread-local
72 variable.
73
74 - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
75 STBI_NO_LINEAR.
76 STBI_NO_HDR: suppress implementation of .hdr reader format
77 STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
78
79 - You can suppress implementation of any of the decoders to reduce
80 your code footprint by #defining one or more of the following
81 symbols before creating the implementation.
82
83 STBI_NO_JPEG
84 STBI_NO_PNG
85 STBI_NO_BMP
86 STBI_NO_PSD
87 STBI_NO_TGA
88 STBI_NO_GIF
89 STBI_NO_HDR
90 STBI_NO_PIC
91 STBI_NO_PNM (.ppm and .pgm)
92
93 - You can request *only* certain decoders and suppress all other ones
94 (this will be more forward-compatible, as addition of new decoders
95 doesn't require you to disable them explicitly):
96
97 STBI_ONLY_JPEG
98 STBI_ONLY_PNG
99 STBI_ONLY_BMP
100 STBI_ONLY_PSD
101 STBI_ONLY_TGA
102 STBI_ONLY_GIF
103 STBI_ONLY_HDR
104 STBI_ONLY_PIC
105 STBI_ONLY_PNM (.ppm and .pgm)
106
107 Note that you can define more than one of these, and you will get all
108 of them ("only x" and "only y" is interpreted to mean "only x&y").
109
110 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
111 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
112
113 - Compilation of all SIMD code can be suppressed with
114 #define STBI_NO_SIMD
115 It should not be necessary to disable SIMD unless you have issues
116 compiling (e.g. using an x86 compiler which doesn't support SSE
117 intrinsics or that doesn't support the method used to detect
118 SSE2 support at run-time), and even those can be reported as
119 bugs so I can refine the built-in compile-time checking to be
120 smarter.
121
122 - The old STBI_SIMD system which allowed installing a user-defined
123 IDCT etc. has been removed. If you need this, don't upgrade. My
124 assumption is that almost nobody was doing this, and those who
125 were will find the built-in SIMD more satisfactory anyway.
126
127 - RGB values computed for JPEG images are slightly different from
128 previous versions of stb_image. (This is due to using less
129 integer precision in SIMD.) The C code has been adjusted so
130 that the same RGB values will be computed regardless of whether
131 SIMD support is available, so your app should always produce
132 consistent results. But these results are slightly different from
133 previous versions. (Specifically, about 3% of available YCbCr values
134 will compute different RGB results from pre-1.49 versions by +-1;
135 most of the deviating values are one smaller in the G channel.)
136
137 - If you must produce consistent results with previous versions of
138 stb_image, #define STBI_JPEG_OLD and you will get the same results
139 you used to; however, you will not get the SIMD speedups for
140 the YCbCr-to-RGB conversion step (although you should still see
141 significant JPEG speedup from the other changes).
142
143 Please note that STBI_JPEG_OLD is a temporary feature; it will be
144 removed in future versions of the library. It is only intended for
145 near-term back-compatibility use.
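
A minimal sketch of the allocator-replacement note above (the STBI_MALLOC,
STBI_REALLOC, and STBI_FREE bullet). Because the macros take no context
parameter, the wrappers below reach their state through a global. The names
example_alloc_stats, example_malloc, example_realloc, and example_free are
hypothetical and are not part of stb_image; in a real build the three
#defines would appear before the implementation #include.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical allocator context, kept in a global because the STBI_ macros
// cannot carry a context argument.
typedef struct
{
    size_t live_blocks;   // blocks currently allocated
    size_t total_calls;   // malloc + realloc calls seen so far
} example_alloc_stats_t;

static example_alloc_stats_t example_alloc_stats = { 0, 0 };

static void *example_malloc(size_t sz)
{
    void *p = malloc(sz);
    example_alloc_stats.total_calls++;
    if (p != NULL) example_alloc_stats.live_blocks++;
    return p;
}

// newsz is assumed to be non-zero in this sketch.
static void *example_realloc(void *p, size_t newsz)
{
    void *q = realloc(p, newsz);
    example_alloc_stats.total_calls++;
    // realloc(NULL, n) acts like malloc(n), so a new block becomes live
    if (p == NULL && q != NULL) example_alloc_stats.live_blocks++;
    return q;
}

static void example_free(void *p)
{
    if (p != NULL) {
        assert(example_alloc_stats.live_blocks > 0);
        example_alloc_stats.live_blocks--;
    }
    free(p);
}

// Define all three together; the implementation rejects a partial set.
#define STBI_MALLOC(sz)        example_malloc(sz)
#define STBI_REALLOC(p,newsz)  example_realloc(p,newsz)
#define STBI_FREE(p)           example_free(p)
```

A thread-local variable could replace the global if each thread needs its
own context.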
146
147
148 Latest revision history:
149 2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
150 2.11 (2016-04-02) 16-bit PNGs; enable SSE2 in non-gcc x64
151 RGB-format JPEG; remove white matting in PSD;
152 allocate large structures on the stack;
153 correct channel count for PNG & BMP
154 2.10 (2016-01-22) avoid warning introduced in 2.09
155 2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
156 2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
157 2.07 (2015-09-13) partial animated GIF support
158 limited 16-bit PSD support
159 minor bugs, code cleanup, and compiler warnings
160 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
161 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
162 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
163 2.03 (2015-04-12) additional corruption checking
164 stbi_set_flip_vertically_on_load
165 fix NEON support; fix mingw support
166 2.02 (2015-01-19) fix incorrect assert, fix warning
167 2.01 (2015-01-17) fix various warnings
168 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
169 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
170 progressive JPEG
171 PGM/PPM support
172 STBI_MALLOC,STBI_REALLOC,STBI_FREE
173 STBI_NO_*, STBI_ONLY_*
174 GIF bugfix
175
176 See end of file for full revision history.
177
178
179 ============================ Contributors =========================
180
181 Image formats Extensions, features
182 Sean Barrett (jpeg, png, bmp) Jetro Lauha (stbi_info)
183 Nicolas Schulz (hdr, psd) Martin "SpartanJ" Golini (stbi_info)
184 Jonathan Dummer (tga) James "moose2000" Brown (iPhone PNG)
185 Jean-Marc Lienher (gif) Ben "Disch" Wenger (io callbacks)
186 Tom Seddon (pic) Omar Cornut (1/2/4-bit PNG)
187 Thatcher Ulrich (psd) Nicolas Guillemot (vertical flip)
188 Ken Miller (pgm, ppm) Richard Mitton (16-bit PSD)
189 urraka@github (animated gif) Junggon Kim (PNM comments)
190 Daniel Gibson (16-bit TGA)
191
192 Optimizations & bugfixes
193 Fabian "ryg" Giesen
194 Arseny Kapoulkine
195
196 Bug & warning fixes
197 Marc LeBlanc David Woo Guillaume George Martins Mozeiko
198 Christpher Lloyd Martin Golini Jerry Jansson Joseph Thomson
199 Dave Moore Roy Eltham Hayaki Saito Phil Jordan
200 Won Chun Luke Graham Johan Duparc Nathan Reed
201 the Horde3D community Thomas Ruf Ronny Chevalier Nick Verigakis
202 Janez Zemva John Bartholomew Michal Cichon svdijk@github
203 Jonathan Blow Ken Hamada Tero Hanninen Baldur Karlsson
204 Laurent Gomila Cort Stratton Sergio Gonzalez romigrou@github
205 Aruelien Pocheville Thibault Reuille Cass Everitt Matthew Gregan
206 Ryamond Barbiero Paul Du Bois Engin Manap snagar@github
207 Michaelangel007@github Oriol Ferrer Mesia socks-the-fox
208 Blazej Dariusz Roszkowski
209
210
211 LICENSE
212
213 This software is dual-licensed to the public domain and under the following
214 license: you are granted a perpetual, irrevocable license to copy, modify,
215 publish, and distribute this file as you see fit.
216
217 */
218
219 #ifndef STBI_INCLUDE_STB_IMAGE_H
220 #define STBI_INCLUDE_STB_IMAGE_H
221
222 // DOCUMENTATION
223 //
224 // Limitations:
225 // - 16-bit-per-channel PNG handling is new in 2.11 (see revision history)
226 // - no 12-bit-per-channel JPEG
227 // - no JPEGs with arithmetic coding
228 // - no 1-bit BMP
229 // - GIF always returns *comp=4
230 //
231 // Basic usage (see HDR discussion below for HDR usage):
232 // int x,y,n;
233 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
234 // // ... process data if not NULL ...
235 // // ... x = width, y = height, n = # 8-bit components per pixel ...
236 // // ... replace '0' with '1'..'4' to force that many components per pixel
237 // // ... but 'n' will always be the number that it would have been if you said 0
238 // stbi_image_free(data);
239 //
240 // Standard parameters:
241 // int *x -- outputs image width in pixels
242 // int *y -- outputs image height in pixels
243 // int *comp -- outputs # of image components in image file
244 // int req_comp -- if non-zero, # of image components requested in result
245 //
246 // The return value from an image loader is an 'unsigned char *' which points
247 // to the pixel data, or NULL on an allocation failure or if the image is
248 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
249 // with each pixel consisting of N interleaved 8-bit components; the first
250 // pixel pointed to is top-left-most in the image. There is no padding between
251 // image scanlines or between pixels, regardless of format. The number of
252 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
253 // If req_comp is non-zero, *comp has the number of components that _would_
254 // have been output otherwise. E.g. if you set req_comp to 4, you will always
255 // get RGBA output, but you can check *comp to see if it's trivially opaque
256 // because e.g. there were only 3 channels in the source image.
257 //
258 // An output image with N components has the following components interleaved
259 // in this order in each pixel:
260 //
261 // N=#comp components
262 // 1 grey
263 // 2 grey, alpha
264 // 3 red, green, blue
265 // 4 red, green, blue, alpha
266 //
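// A minimal sketch of the layout described above. With no padding between
// scanlines or pixels, component c of pixel (px,py) sits at byte offset
// (py*w + px)*N + c, where N is req_comp if non-zero and *comp otherwise.
// The helper names example_output_comp and example_pixel_offset are
// hypothetical and are not part of stb_image.
//
/*
```c
#include <assert.h>
#include <stddef.h>

// Number of components per pixel in the returned buffer:
// req_comp when it is non-zero, otherwise the count found in the file.
static int example_output_comp(int req_comp, int file_comp)
{
    return req_comp != 0 ? req_comp : file_comp;
}

// Byte offset of component c of pixel (px,py) in a w-pixel-wide image with
// n interleaved 8-bit components; rows are stored top to bottom, unpadded.
static size_t example_pixel_offset(int w, int n, int px, int py, int c)
{
    assert(w > 0 && n >= 1 && n <= 4);
    assert(px >= 0 && px < w && py >= 0 && c >= 0 && c < n);
    return ((size_t) py * (size_t) w + (size_t) px) * (size_t) n + (size_t) c;
}
```
*/
// For example, in a 10-pixel-wide RGBA image the alpha byte of pixel (2,1)
// would be data[example_pixel_offset(10, 4, 2, 1, 3)].
//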
267 // If image loading fails for any reason, the return value will be NULL,
268 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
269 // can be queried for an extremely brief, end-user unfriendly explanation
270 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
271 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
272 // more user-friendly ones.
273 //
274 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
275 //
276 // ===========================================================================
277 //
278 // Philosophy
279 //
280 // stb libraries are designed with the following priorities:
281 //
282 // 1. easy to use
283 // 2. easy to maintain
284 // 3. good performance
285 //
286 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
287 // and for best performance I may provide less-easy-to-use APIs that give higher
288 // performance, in addition to the easy to use ones. Nevertheless, it's important
289 // to keep in mind that from the standpoint of you, a client of this library,
290 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
291 //
292 // Some secondary priorities arise directly from the first two, and some of
293 // them make more explicit the reasons why performance can't be emphasized.
294 //
295 // - Portable ("easy to use")
296 // - Small footprint ("easy to maintain")
297 // - No dependencies ("easy to use")
298 //
299 // ===========================================================================
300 //
301 // I/O callbacks
302 //
303 // I/O callbacks allow you to read from arbitrary sources, like packaged
304 // files or some other source. Data read from callbacks are processed
305 // through a small internal buffer (currently 128 bytes) to try to reduce
306 // overhead.
307 //
308 // The three functions you must define are "read" (reads some bytes of data),
309 // "skip" (skips some bytes of data), and "eof" (reports if the stream is at the end).
310 //
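// A minimal sketch of the three callbacks, reading from a block of memory.
// To compile on its own it declares example_io_callbacks, a local mirror of
// the read/skip/eof signatures in stbi_io_callbacks; that type, the
// example_mem_stream struct, and the example_* functions are hypothetical
// names used only for illustration.
//
/*
```c
#include <assert.h>
#include <string.h>

// Local mirror of the three-callback struct, so the sketch compiles alone.
typedef struct
{
    int  (*read)(void *user, char *data, int size);
    void (*skip)(void *user, int n);
    int  (*eof) (void *user);
} example_io_callbacks;

// Hypothetical stream state: a memory block plus a read cursor.
typedef struct
{
    const unsigned char *bytes;
    int len;
    int pos;
} example_mem_stream;

// Copy up to 'size' bytes into 'data'; return how many were copied.
static int example_read(void *user, char *data, int size)
{
    example_mem_stream *s = (example_mem_stream *) user;
    int avail = s->len - s->pos;
    int n = size < avail ? size : avail;
    assert(data != NULL);
    if (n < 0) n = 0;
    memcpy(data, s->bytes + s->pos, (size_t) n);
    s->pos += n;
    return n;
}

// Move the cursor by n bytes (n may be negative), clamped to the block.
static void example_skip(void *user, int n)
{
    example_mem_stream *s = (example_mem_stream *) user;
    int target = s->pos + n;
    if (target < 0) target = 0;
    if (target > s->len) target = s->len;
    s->pos = target;
}

// Nonzero once every byte has been consumed.
static int example_eof(void *user)
{
    example_mem_stream *s = (example_mem_stream *) user;
    return s->pos >= s->len;
}

static const example_io_callbacks example_callbacks =
{
    example_read, example_skip, example_eof
};
```
*/
// With the real library, the same three functions would be stored in an
// stbi_io_callbacks and passed, with the stream as 'user', to
// stbi_load_from_callbacks.
//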
311 // ===========================================================================
312 //
313 // SIMD support
314 //
315 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
316 // supported by the compiler. For ARM NEON support, you must explicitly
317 // request it.
318 //
319 // (The old do-it-yourself SIMD API is no longer supported in the current
320 // code.)
321 //
322 // On x86, SSE2 will automatically be used when available based on a run-time
323 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
324 // the typical path is to have separate builds for NEON and non-NEON devices
325 // (at least this is true for iOS and Android). Therefore, the NEON support is
326 // toggled by a build flag: define STBI_NEON to get NEON loops.
327 //
328 // The output of the JPEG decoder is slightly different from that of versions
329 // that predate SIMD support (that is, versions before 1.49). The
330 // difference is only +-1 in the 8-bit RGB channels, and only on a small
331 // fraction of pixels. You can force the pre-1.49 behavior by defining
332 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
333 // and hence cost some performance.
334 //
335 // If for some reason you do not want to use any of the SIMD code, or if
336 // you have issues compiling it, you can disable it entirely by
337 // defining STBI_NO_SIMD.
338 //
339 // ===========================================================================
340 //
341 // HDR image support (disable by defining STBI_NO_HDR)
342 //
343 // stb_image supports loading HDR images. Currently the only HDR format
344 // read is the Radiance .HDR file format, although the support is provided
345 // generically. You can still load any file through the existing interface;
346 // if you attempt to load an HDR file, it will be automatically remapped to
347 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
348 // both of these constants can be reconfigured through this interface:
349 //
350 // stbi_hdr_to_ldr_gamma(2.2f);
351 // stbi_hdr_to_ldr_scale(1.0f);
352 //
353 // (note, do not use _inverse_ constants; stb_image will invert them
354 // appropriately).
355 //
356 // Additionally, there is a new, parallel interface for loading files as
357 // (linear) floats to preserve the full dynamic range:
358 //
359 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
360 //
361 // If you load LDR images through this interface, those images will
362 // be promoted to floating point values, run through the inverse of
363 // constants corresponding to the above:
364 //
365 // stbi_ldr_to_hdr_scale(1.0f);
366 // stbi_ldr_to_hdr_gamma(2.2f);
367 //
368 // Finally, given a filename (or an open file or memory block--see header
369 // file for details) containing image data, you can query for the "most
370 // appropriate" interface to use (that is, whether the image is HDR or
371 // not), using:
372 //
373 // stbi_is_hdr(char const *filename);
374 //
375 // ===========================================================================
376 //
377 // iPhone PNG support:
378 //
379 // By default we convert iphone-formatted PNGs back to RGB, even though
380 // they are internally encoded differently. You can disable this conversion
381 // by calling stbi_convert_iphone_png_to_rgb(0), in which case
382 // you will always just get the native iphone "format" through (which
383 // is BGR stored in RGB).
384 //
385 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
386 // pixel to remove any premultiplied alpha *only* if the image file explicitly
387 // says there's premultiplied data (currently only happens in iPhone images,
388 // and only if iPhone convert-to-rgb processing is on).
389 //
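// A rough sketch of what the two conversions described above amount to for a
// single 4-component pixel, assuming the stored byte order is B,G,R,A and a
// simple c*255/a unpremultiply. example_unpremul and example_bgra_to_rgba
// are hypothetical helpers for illustration, not the library's internal
// routines.
//
/*
```c
#include <assert.h>
#include <stddef.h>

// Undo premultiplied alpha for one color value; 'a' must be non-zero.
// Clamping to 255 is a choice made by this sketch for inputs where the
// color exceeds alpha.
static unsigned int example_unpremul(unsigned int c, unsigned int a)
{
    unsigned int v = c * 255u / a;
    return v > 255u ? 255u : v;
}

// Convert one 4-byte pixel in place from B,G,R,A order to R,G,B,A order.
// If 'unpremultiply' is nonzero and alpha is non-zero, the colors are also
// unpremultiplied; a fully transparent pixel is only reordered.
static void example_bgra_to_rgba(unsigned char *p, int unpremultiply)
{
    unsigned int b, g, r, a;
    assert(p != NULL);
    b = p[0]; g = p[1]; r = p[2]; a = p[3];
    if (unpremultiply && a != 0) {
        r = example_unpremul(r, a);
        g = example_unpremul(g, a);
        b = example_unpremul(b, a);
    }
    p[0] = (unsigned char) r;
    p[1] = (unsigned char) g;
    p[2] = (unsigned char) b;
}
```
*/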
390
391
392 #ifndef STBI_NO_STDIO
393 #include <stdio.h>
394 #endif // STBI_NO_STDIO
395
396 #define STBI_VERSION 1
397
398 enum
399 {
400 STBI_default = 0, // only used for req_comp
401
402 STBI_grey = 1,
403 STBI_grey_alpha = 2,
404 STBI_rgb = 3,
405 STBI_rgb_alpha = 4
406 };
407
408 typedef unsigned char stbi_uc;
409
410 #ifdef __cplusplus
411 extern "C" {
412 #endif
413
414 #ifdef STB_IMAGE_STATIC
415 #define STBIDEF static
416 #else
417 #define STBIDEF extern
418 #endif
419
420 //////////////////////////////////////////////////////////////////////////////
421 //
422 // PRIMARY API - works on images of any type
423 //
424
425 //
426 // load image by filename, open file, or memory buffer
427 //
428
429 typedef struct
430 {
431 int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
432 void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
433 int (*eof) (void *user); // returns nonzero if we are at end of file/data
434 } stbi_io_callbacks;
435
436 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
437 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
438 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
439
440 #ifndef STBI_NO_STDIO
441 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
442 // for stbi_load_from_file, file pointer is left pointing immediately after image
443 #endif
444
445 #ifndef STBI_NO_LINEAR
446 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
447 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
448 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
449
450 #ifndef STBI_NO_STDIO
451 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
452 #endif
453 #endif
454
455 #ifndef STBI_NO_HDR
456 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
457 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
458 #endif // STBI_NO_HDR
459
460 #ifndef STBI_NO_LINEAR
461 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
462 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
463 #endif // STBI_NO_LINEAR
464
465 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
466 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
467 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
468 #ifndef STBI_NO_STDIO
469 STBIDEF int stbi_is_hdr (char const *filename);
470 STBIDEF int stbi_is_hdr_from_file(FILE *f);
471 #endif // STBI_NO_STDIO
472
473
474 // get a VERY brief reason for failure
475 // NOT THREADSAFE
476 STBIDEF const char *stbi_failure_reason (void);
477
478 // free the loaded image -- this is just free()
479 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
480
481 // get image dimensions & components without fully decoding
482 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
483 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
484
485 #ifndef STBI_NO_STDIO
486 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
487 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
488
489 #endif
490
491
492
493 // for image formats that explicitly notate that they have premultiplied alpha,
494 // we just return the colors as stored in the file. set this flag to force
495 // unpremultiplication. results are undefined if the unpremultiply overflows.
496 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
497
498 // indicate whether we should process iphone images back to canonical format,
499 // or just pass them through "as-is"
500 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
501
502 // flip the image vertically, so the first pixel in the output array is the bottom left
503 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
504
505 // ZLIB client - used by PNG, available for other purposes
506
507 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
508 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
509 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
510 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
511
512 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
513 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
514
515
516 #ifdef __cplusplus
517 }
518 #endif
519
520 //
521 //
522 //// end header file /////////////////////////////////////////////////////
523 #endif // STBI_INCLUDE_STB_IMAGE_H
524
525 #ifdef STB_IMAGE_IMPLEMENTATION
526
527 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
528 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
529 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
530 || defined(STBI_ONLY_ZLIB)
531 #ifndef STBI_ONLY_JPEG
532 #define STBI_NO_JPEG
533 #endif
534 #ifndef STBI_ONLY_PNG
535 #define STBI_NO_PNG
536 #endif
537 #ifndef STBI_ONLY_BMP
538 #define STBI_NO_BMP
539 #endif
540 #ifndef STBI_ONLY_PSD
541 #define STBI_NO_PSD
542 #endif
543 #ifndef STBI_ONLY_TGA
544 #define STBI_NO_TGA
545 #endif
546 #ifndef STBI_ONLY_GIF
547 #define STBI_NO_GIF
548 #endif
549 #ifndef STBI_ONLY_HDR
550 #define STBI_NO_HDR
551 #endif
552 #ifndef STBI_ONLY_PIC
553 #define STBI_NO_PIC
554 #endif
555 #ifndef STBI_ONLY_PNM
556 #define STBI_NO_PNM
557 #endif
558 #endif
559
560 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
561 #define STBI_NO_ZLIB
562 #endif
563
564
565 #include <stdarg.h>
566 #include <stddef.h> // ptrdiff_t on osx
567 #include <stdlib.h>
568 #include <string.h>
569
570 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
571 #include <math.h> // ldexp
572 #endif
573
574 #ifndef STBI_NO_STDIO
575 #include <stdio.h>
576 #endif
577
578 #ifndef STBI_ASSERT
579 #include <assert.h>
580 #define STBI_ASSERT(x) assert(x)
581 #endif
582
583
584 #ifndef _MSC_VER
585 #ifdef __cplusplus
586 #define stbi_inline inline
587 #else
588 #define stbi_inline
589 #endif
590 #else
591 #define stbi_inline __forceinline
592 #endif
593
594
595 #ifdef _MSC_VER
596 typedef unsigned short stbi__uint16;
597 typedef signed short stbi__int16;
598 typedef unsigned int stbi__uint32;
599 typedef signed int stbi__int32;
600 #else
601 #include <stdint.h>
602 typedef uint16_t stbi__uint16;
603 typedef int16_t stbi__int16;
604 typedef uint32_t stbi__uint32;
605 typedef int32_t stbi__int32;
606 #endif
607
608 // should produce compiler error if size is wrong
609 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
610
611 #ifdef _MSC_VER
612 #define STBI_NOTUSED(v) (void)(v)
613 #else
614 #define STBI_NOTUSED(v) (void)sizeof(v)
615 #endif
616
617 #ifdef _MSC_VER
618 #define STBI_HAS_LROTL
619 #endif
620
621 #ifdef STBI_HAS_LROTL
622 #define stbi_lrot(x,y) _lrotl(x,y)
623 #else
624 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
625 #endif
626
627 #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
628 // ok
629 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
630 // ok
631 #else
632 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
633 #endif
634
635 #ifndef STBI_MALLOC
636 #define STBI_MALLOC(sz) malloc(sz)
637 #define STBI_REALLOC(p,newsz) realloc(p,newsz)
638 #define STBI_FREE(p) free(p)
639 #endif
640
641 #ifndef STBI_REALLOC_SIZED
642 #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
643 #endif
644
645 // x86/x64 detection
646 #if defined(__x86_64__) || defined(_M_X64)
647 #define STBI__X64_TARGET
648 #elif defined(__i386) || defined(_M_IX86)
649 #define STBI__X86_TARGET
650 #endif
651
652 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
653 // NOTE: it is not clear whether we actually need this for the 64-bit path.
654 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
655 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
656 // this is just broken and gcc are jerks for not fixing it properly
657 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
658 #define STBI_NO_SIMD
659 #endif
660
661 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
662 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
663 //
664 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
665 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
666 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
667 // simultaneously enabling "-mstackrealign".
668 //
669 // See https://github.com/nothings/stb/issues/81 for more information.
670 //
671 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
672 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
673 #define STBI_NO_SIMD
674 #endif
675
676 #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
677 #define STBI_SSE2
678 #include <emmintrin.h>
679
680 #ifdef _MSC_VER
681
682 #if _MSC_VER >= 1400 // not VC6
683 #include <intrin.h> // __cpuid
684 static int stbi__cpuid3(void)
685 {
686 int info[4];
687 __cpuid(info,1);
688 return info[3];
689 }
690 #else
691 static int stbi__cpuid3(void)
692 {
693 int res;
694 __asm {
695 mov eax,1
696 cpuid
697 mov res,edx
698 }
699 return res;
700 }
701 #endif
702
703 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
704
705 static int stbi__sse2_available()
706 {
707 int info3 = stbi__cpuid3();
708 return ((info3 >> 26) & 1) != 0;
709 }
710 #else // assume GCC-style if not VC++
711 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
712
713 static int stbi__sse2_available()
714 {
715 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
716 // GCC 4.8+ has a nice way to do this
717 return __builtin_cpu_supports("sse2");
718 #else
719 // portable way to do this, preferably without using GCC inline ASM?
720 // just bail for now.
721 return 0;
722 #endif
723 }
724 #endif
725 #endif
726
727 // ARM NEON
728 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
729 #undef STBI_NEON
730 #endif
731
732 #ifdef STBI_NEON
733 #include <arm_neon.h>
734 // assume GCC or Clang on ARM targets
735 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
736 #endif
737
738 #ifndef STBI_SIMD_ALIGN
739 #define STBI_SIMD_ALIGN(type, name) type name
740 #endif
741
742 ///////////////////////////////////////////////
743 //
744 // stbi__context struct and start_xxx functions
745
746 // stbi__context structure is our basic context used by all images, so it
747 // contains all the IO context, plus some basic image information
748 typedef struct
749 {
750 stbi__uint32 img_x, img_y;
751 int img_n, img_out_n;
752
753 stbi_io_callbacks io;
754 void *io_user_data;
755
756 int read_from_callbacks;
757 int buflen;
758 stbi_uc buffer_start[128];
759
760 stbi_uc *img_buffer, *img_buffer_end;
761 stbi_uc *img_buffer_original, *img_buffer_original_end;
762 } stbi__context;
763
764
765 static void stbi__refill_buffer(stbi__context *s);
766
767 // initialize a memory-decode context
768 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
769 {
770 s->io.read = NULL;
771 s->read_from_callbacks = 0;
772 s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
773 s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
774 }
775
776 // initialize a callback-based context
777 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
778 {
779 s->io = *c;
780 s->io_user_data = user;
781 s->buflen = sizeof(s->buffer_start);
782 s->read_from_callbacks = 1;
783 s->img_buffer_original = s->buffer_start;
784 stbi__refill_buffer(s);
785 s->img_buffer_original_end = s->img_buffer_end;
786 }
787
788 #ifndef STBI_NO_STDIO
789
790 static int stbi__stdio_read(void *user, char *data, int size)
791 {
792 return (int) fread(data,1,size,(FILE*) user);
793 }
794
795 static void stbi__stdio_skip(void *user, int n)
796 {
797 fseek((FILE*) user, n, SEEK_CUR);
798 }
799
800 static int stbi__stdio_eof(void *user)
801 {
802 return feof((FILE*) user);
803 }
804
805 static stbi_io_callbacks stbi__stdio_callbacks =
806 {
807 stbi__stdio_read,
808 stbi__stdio_skip,
809 stbi__stdio_eof,
810 };
811
812 static void stbi__start_file(stbi__context *s, FILE *f)
813 {
814 stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
815 }
816
817 //static void stop_file(stbi__context *s) { }
818
819 #endif // !STBI_NO_STDIO
820
821 static void stbi__rewind(stbi__context *s)
822 {
823 // conceptually rewind SHOULD rewind to the beginning of the stream,
824 // but we just rewind to the beginning of the initial buffer, because
825 // we only use it after doing 'test', which only ever looks at the first 92 bytes at most
826 s->img_buffer = s->img_buffer_original;
827 s->img_buffer_end = s->img_buffer_original_end;
828 }
829
830 #ifndef STBI_NO_JPEG
831 static int stbi__jpeg_test(stbi__context *s);
832 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
833 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
834 #endif
835
836 #ifndef STBI_NO_PNG
837 static int stbi__png_test(stbi__context *s);
838 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
839 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
840 #endif
841
842 #ifndef STBI_NO_BMP
843 static int stbi__bmp_test(stbi__context *s);
844 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
845 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
846 #endif
847
848 #ifndef STBI_NO_TGA
849 static int stbi__tga_test(stbi__context *s);
850 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
851 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
852 #endif
853
854 #ifndef STBI_NO_PSD
855 static int stbi__psd_test(stbi__context *s);
856 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
857 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
858 #endif
859
860 #ifndef STBI_NO_HDR
861 static int stbi__hdr_test(stbi__context *s);
862 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
863 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
864 #endif
865
866 #ifndef STBI_NO_PIC
867 static int stbi__pic_test(stbi__context *s);
868 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
869 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
870 #endif
871
872 #ifndef STBI_NO_GIF
873 static int stbi__gif_test(stbi__context *s);
874 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
875 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
876 #endif
877
878 #ifndef STBI_NO_PNM
879 static int stbi__pnm_test(stbi__context *s);
880 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
881 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
882 #endif
883
884 // this is not threadsafe
885 static const char *stbi__g_failure_reason;
886
887 STBIDEF const char *stbi_failure_reason(void)
888 {
889 return stbi__g_failure_reason;
890 }
891
892 static int stbi__err(const char *str)
893 {
894 stbi__g_failure_reason = str;
895 return 0;
896 }
897
898 static void *stbi__malloc(size_t size)
899 {
900 return STBI_MALLOC(size);
901 }
902
903 // stbi__err - error
904 // stbi__errpf - error returning pointer to float
905 // stbi__errpuc - error returning pointer to unsigned char
906
907 #ifdef STBI_NO_FAILURE_STRINGS
908 #define stbi__err(x,y) 0
909 #elif defined(STBI_FAILURE_USERMSG)
910 #define stbi__err(x,y) stbi__err(y)
911 #else
912 #define stbi__err(x,y) stbi__err(x)
913 #endif
914
915 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
916 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
917
918 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
919 {
920 STBI_FREE(retval_from_stbi_load);
921 }
922
923 #ifndef STBI_NO_LINEAR
924 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
925 #endif
926
927 #ifndef STBI_NO_HDR
928 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
929 #endif
930
931 static int stbi__vertically_flip_on_load = 0;
932
933 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
934 {
935 stbi__vertically_flip_on_load = flag_true_if_should_flip;
936 }
937
938 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
939 {
940 #ifndef STBI_NO_JPEG
941 if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
942 #endif
943 #ifndef STBI_NO_PNG
944 if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp);
945 #endif
946 #ifndef STBI_NO_BMP
947 if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp);
948 #endif
949 #ifndef STBI_NO_GIF
950 if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp);
951 #endif
952 #ifndef STBI_NO_PSD
953 if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp);
954 #endif
955 #ifndef STBI_NO_PIC
956 if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp);
957 #endif
958 #ifndef STBI_NO_PNM
959 if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp);
960 #endif
961
962 #ifndef STBI_NO_HDR
963 if (stbi__hdr_test(s)) {
964 float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
965 return hdr ? stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp) : NULL; // don't convert a failed load
966 }
967 #endif
968
969 #ifndef STBI_NO_TGA
970 // test tga last because it's a crappy test!
971 if (stbi__tga_test(s))
972 return stbi__tga_load(s,x,y,comp,req_comp);
973 #endif
974
975 return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
976 }
977
978 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
979 {
980 unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
981
982 if (stbi__vertically_flip_on_load && result != NULL) {
983 int w = *x, h = *y;
984 int depth = req_comp ? req_comp : *comp;
985 int row,col,z;
986 stbi_uc temp;
987
988 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
989 for (row = 0; row < (h>>1); row++) {
990 for (col = 0; col < w; col++) {
991 for (z = 0; z < depth; z++) {
992 temp = result[(row * w + col) * depth + z];
993 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
994 result[((h - row - 1) * w + col) * depth + z] = temp;
995 }
996 }
997 }
998 }
999
1000 return result;
1001 }
1002
1003 #ifndef STBI_NO_HDR
1004 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1005 {
1006 if (stbi__vertically_flip_on_load && result != NULL) {
1007 int w = *x, h = *y;
1008 int depth = req_comp ? req_comp : *comp;
1009 int row,col,z;
1010 float temp;
1011
1012 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1013 for (row = 0; row < (h>>1); row++) {
1014 for (col = 0; col < w; col++) {
1015 for (z = 0; z < depth; z++) {
1016 temp = result[(row * w + col) * depth + z];
1017 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1018 result[((h - row - 1) * w + col) * depth + z] = temp;
1019 }
1020 }
1021 }
1022 }
1023 }
1024 #endif
1025
1026 #ifndef STBI_NO_STDIO
1027
1028 static FILE *stbi__fopen(char const *filename, char const *mode)
1029 {
1030 FILE *f;
1031 #if defined(_MSC_VER) && _MSC_VER >= 1400
1032 if (0 != fopen_s(&f, filename, mode))
1033 f=0;
1034 #else
1035 f = fopen(filename, mode);
1036 #endif
1037 return f;
1038 }
1039
1040
1041 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1042 {
1043 FILE *f = stbi__fopen(filename, "rb");
1044 unsigned char *result;
1045 if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1046 result = stbi_load_from_file(f,x,y,comp,req_comp);
1047 fclose(f);
1048 return result;
1049 }
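// Usage sketch for stbi_load, not part of the library. The file name
// "image.png" and the choice of req_comp = 4 are assumptions made only for
// illustration. Note that the failure string is only recorded when
// STBI_NO_FAILURE_STRINGS is not defined.
//
//    int w, h, n;
//    unsigned char *pixels = stbi_load("image.png", &w, &h, &n, 4);
//    if (pixels == NULL) {
//       // reports the last string handed to stbi__err
//       fprintf(stderr, "load failed: %s\n", stbi_failure_reason());
//    } else {
//       // w*h pixels of 4 bytes each, because req_comp was 4;
//       // n still holds the number of components found in the file
//       stbi_image_free(pixels);
//    }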
1050
1051 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1052 {
1053 unsigned char *result;
1054 stbi__context s;
1055 stbi__start_file(&s,f);
1056 result = stbi__load_flip(&s,x,y,comp,req_comp);
1057 if (result) {
1058 // need to 'unget' all the characters in the IO buffer
1059 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1060 }
1061 return result;
1062 }
1063 #endif //!STBI_NO_STDIO
1064
1065 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1066 {
1067 stbi__context s;
1068 stbi__start_mem(&s,buffer,len);
1069 return stbi__load_flip(&s,x,y,comp,req_comp);
1070 }
1071
1072 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1073 {
1074 stbi__context s;
1075 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1076 return stbi__load_flip(&s,x,y,comp,req_comp);
1077 }
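// Sketch of user-supplied callbacks, not part of the library. Any byte
// source can be wrapped by three functions with the same signatures as the
// stdio callbacks; the names mem_cursor, mem_read, mem_skip and mem_eof are
// hypothetical, and 'bytes'/'len' stand for a caller-owned buffer.
//
//    typedef struct { const unsigned char *p, *end; } mem_cursor;
//
//    static int mem_read(void *user, char *data, int size)
//    {
//       mem_cursor *c = (mem_cursor *) user;
//       int avail = (int) (c->end - c->p);
//       if (size > avail) size = avail;
//       memcpy(data, c->p, size);
//       c->p += size;
//       return size;   // bytes actually read; 0 is treated as end of data
//    }
//
//    static void mem_skip(void *user, int n)
//    {
//       mem_cursor *c = (mem_cursor *) user;
//       int avail = (int) (c->end - c->p);
//       c->p += (n > avail) ? avail : n;   // never run past the end
//    }
//
//    static int mem_eof(void *user)
//    {
//       mem_cursor *c = (mem_cursor *) user;
//       return c->p >= c->end;
//    }
//
//    // same member order as stbi__stdio_callbacks: read, skip, eof
//    stbi_io_callbacks cb = { mem_read, mem_skip, mem_eof };
//    mem_cursor cur = { bytes, bytes + len };
//    int w, h, n;
//    unsigned char *pixels = stbi_load_from_callbacks(&cb, &cur, &w, &h, &n, 0);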
1078
1079 #ifndef STBI_NO_LINEAR
1080 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1081 {
1082 unsigned char *data;
1083 #ifndef STBI_NO_HDR
1084 if (stbi__hdr_test(s)) {
1085 float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1086 if (hdr_data)
1087 stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1088 return hdr_data;
1089 }
1090 #endif
1091 data = stbi__load_flip(s, x, y, comp, req_comp);
1092 if (data)
1093 return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1094 return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1095 }
1096
1097 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1098 {
1099 stbi__context s;
1100 stbi__start_mem(&s,buffer,len);
1101 return stbi__loadf_main(&s,x,y,comp,req_comp);
1102 }
1103
1104 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1105 {
1106 stbi__context s;
1107 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1108 return stbi__loadf_main(&s,x,y,comp,req_comp);
1109 }
1110
1111 #ifndef STBI_NO_STDIO
1112 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1113 {
1114 float *result;
1115 FILE *f = stbi__fopen(filename, "rb");
1116 if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1117 result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1118 fclose(f);
1119 return result;
1120 }
1121
1122 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1123 {
1124 stbi__context s;
1125 stbi__start_file(&s,f);
1126 return stbi__loadf_main(&s,x,y,comp,req_comp);
1127 }
1128 #endif // !STBI_NO_STDIO
1129
1130 #endif // !STBI_NO_LINEAR
1131
1132 // the is-hdr-or-not functions are defined regardless of STBI_NO_LINEAR and
1133 // STBI_NO_HDR, for API simplicity; if STBI_NO_HDR is defined, they always
1134 // report false!
1135
1136 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1137 {
1138 #ifndef STBI_NO_HDR
1139 stbi__context s;
1140 stbi__start_mem(&s,buffer,len);
1141 return stbi__hdr_test(&s);
1142 #else
1143 STBI_NOTUSED(buffer);
1144 STBI_NOTUSED(len);
1145 return 0;
1146 #endif
1147 }
1148
1149 #ifndef STBI_NO_STDIO
1150 STBIDEF int stbi_is_hdr (char const *filename)
1151 {
1152 FILE *f = stbi__fopen(filename, "rb");
1153 int result=0;
1154 if (f) {
1155 result = stbi_is_hdr_from_file(f);
1156 fclose(f);
1157 }
1158 return result;
1159 }
1160
1161 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1162 {
1163 #ifndef STBI_NO_HDR
1164 stbi__context s;
1165 stbi__start_file(&s,f);
1166 return stbi__hdr_test(&s);
1167 #else
1168 STBI_NOTUSED(f);
1169 return 0;
1170 #endif
1171 }
1172 #endif // !STBI_NO_STDIO
1173
1174 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1175 {
1176 #ifndef STBI_NO_HDR
1177 stbi__context s;
1178 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1179 return stbi__hdr_test(&s);
1180 #else
1181 STBI_NOTUSED(clbk);
1182 STBI_NOTUSED(user);
1183 return 0;
1184 #endif
1185 }
1186
1187 #ifndef STBI_NO_LINEAR
1188 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1189
1190 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
1191 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1192 #endif
1193
1194 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1195
1196 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
1197 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
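// Worked example with the default settings (gamma 2.2, scale 1.0), using
// the conversion formulas in stbi__ldr_to_hdr and stbi__hdr_to_ldr:
//    LDR byte 128  -> pow(128/255.0, 2.2) * 1.0          = about 0.22
//    HDR value 0.22 -> pow(0.22 * 1.0, 1/2.2) * 255 + 0.5 = about 128.6,
//                      which truncates to the byte 128
// Alpha channels skip the gamma step and are only scaled by 255.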
1198
1199
1200 //////////////////////////////////////////////////////////////////////////////
1201 //
1202 // Common code used by all image loaders
1203 //
1204
1205 enum
1206 {
1207 STBI__SCAN_load=0,
1208 STBI__SCAN_type,
1209 STBI__SCAN_header
1210 };
1211
1212 static void stbi__refill_buffer(stbi__context *s)
1213 {
1214 int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1215 if (n == 0) {
1216 // at end of file, treat same as if from memory, but need to handle case
1217 // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1218 s->read_from_callbacks = 0;
1219 s->img_buffer = s->buffer_start;
1220 s->img_buffer_end = s->buffer_start+1;
1221 *s->img_buffer = 0;
1222 } else {
1223 s->img_buffer = s->buffer_start;
1224 s->img_buffer_end = s->buffer_start + n;
1225 }
1226 }
1227
1228 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1229 {
1230 if (s->img_buffer < s->img_buffer_end)
1231 return *s->img_buffer++;
1232 if (s->read_from_callbacks) {
1233 stbi__refill_buffer(s);
1234 return *s->img_buffer++;
1235 }
1236 return 0;
1237 }
1238
1239 stbi_inline static int stbi__at_eof(stbi__context *s)
1240 {
1241 if (s->io.read) {
1242 if (!(s->io.eof)(s->io_user_data)) return 0;
1243 // if feof() is true, check if buffer = end
1244 // special case: we've only got the special 0 character at the end
1245 if (s->read_from_callbacks == 0) return 1;
1246 }
1247
1248 return s->img_buffer >= s->img_buffer_end;
1249 }
1250
1251 static void stbi__skip(stbi__context *s, int n)
1252 {
1253 if (n < 0) {
1254 s->img_buffer = s->img_buffer_end;
1255 return;
1256 }
1257 if (s->io.read) {
1258 int blen = (int) (s->img_buffer_end - s->img_buffer);
1259 if (blen < n) {
1260 s->img_buffer = s->img_buffer_end;
1261 (s->io.skip)(s->io_user_data, n - blen);
1262 return;
1263 }
1264 }
1265 s->img_buffer += n;
1266 }
1267
1268 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1269 {
1270 if (s->io.read) {
1271 int blen = (int) (s->img_buffer_end - s->img_buffer);
1272 if (blen < n) {
1273 int res, count;
1274
1275 memcpy(buffer, s->img_buffer, blen);
1276
1277 count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1278 res = (count == (n-blen));
1279 s->img_buffer = s->img_buffer_end;
1280 return res;
1281 }
1282 }
1283
1284 if (s->img_buffer+n <= s->img_buffer_end) {
1285 memcpy(buffer, s->img_buffer, n);
1286 s->img_buffer += n;
1287 return 1;
1288 } else
1289 return 0;
1290 }
1291
1292 static int stbi__get16be(stbi__context *s)
1293 {
1294 int z = stbi__get8(s);
1295 return (z << 8) + stbi__get8(s);
1296 }
1297
1298 static stbi__uint32 stbi__get32be(stbi__context *s)
1299 {
1300 stbi__uint32 z = stbi__get16be(s);
1301 return (z << 16) + stbi__get16be(s);
1302 }
1303
1304 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1305 // nothing
1306 #else
1307 static int stbi__get16le(stbi__context *s)
1308 {
1309 int z = stbi__get8(s);
1310 return z + (stbi__get8(s) << 8);
1311 }
1312 #endif
1313
1314 #ifndef STBI_NO_BMP
1315 static stbi__uint32 stbi__get32le(stbi__context *s)
1316 {
1317 stbi__uint32 z = stbi__get16le(s);
1318 return z + (stbi__get16le(s) << 16);
1319 }
1320 #endif
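// Example for the readers above: if the next two bytes in the stream are
// 0x12 0x34, stbi__get16be returns 0x1234 and stbi__get16le returns 0x3412.
// The 32-bit variants chain two 16-bit reads in the same order, so the
// bytes 0x12 0x34 0x56 0x78 give 0x12345678 from stbi__get32be and
// 0x78563412 from stbi__get32le.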
1321
1322 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1323
1324
1325 //////////////////////////////////////////////////////////////////////////////
1326 //
1327 // generic converter from built-in img_n to req_comp
1328 // individual types do this automatically as much as possible (e.g. jpeg
1329 // does all cases internally since it needs to colorspace convert anyway,
1330 // and it never has alpha, so very few cases ). png can automatically
1331 // interleave an alpha=255 channel, but falls back to this for other cases
1332 //
1333 // assume data buffer is malloced, so malloc a new one and free that one
1334 // only failure mode is malloc failing
1335
1336 static stbi_uc stbi__compute_y(int r, int g, int b)
1337 {
1338 return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1339 }
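// The weights 77, 150 and 29 sum to 256 and approximate the common luma
// coefficients 0.299, 0.587 and 0.114 scaled by 256, so the >> 8 brings the
// weighted sum back into 0..255. For example:
//    stbi__compute_y(255,255,255) == 65280 >> 8 == 255
//    stbi__compute_y(255,  0,  0) == 19635 >> 8 ==  76
//    stbi__compute_y(  0,255,  0) == 38250 >> 8 == 149
//    stbi__compute_y(  0,  0,255) ==  7395 >> 8 ==  28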
1340
1341 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1342 {
1343 int i,j;
1344 unsigned char *good;
1345
1346 if (req_comp == img_n) return data;
1347 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1348
1349 good = (unsigned char *) stbi__malloc(req_comp * x * y);
1350 if (good == NULL) {
1351 STBI_FREE(data);
1352 return stbi__errpuc("outofmem", "Out of memory");
1353 }
1354
1355 for (j=0; j < (int) y; ++j) {
1356 unsigned char *src = data + j * x * img_n ;
1357 unsigned char *dest = good + j * x * req_comp;
1358
1359 #define COMBO(a,b) ((a)*8+(b))
1360 #define CASE(a,b) case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1361 // convert source image with img_n components to one with req_comp components;
1362 // avoid switch per pixel, so use switch per scanline and massive macros
1363 switch (COMBO(img_n, req_comp)) {
1364 CASE(1,2) dest[0]=src[0], dest[1]=255; break;
1365 CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1366 CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
1367 CASE(2,1) dest[0]=src[0]; break;
1368 CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1369 CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
1370 CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
1371 CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1372 CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
1373 CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1374 CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
1375 CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
1376 default: STBI_ASSERT(0);
1377 }
1378 #undef CASE
1379 }
1380
1381 STBI_FREE(data);
1382 return good;
1383 }
1384
1385 #ifndef STBI_NO_LINEAR
1386 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1387 {
1388 int i,k,n;
1389 float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1390 if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1391 // compute number of non-alpha components
1392 if (comp & 1) n = comp; else n = comp-1;
1393 for (i=0; i < x*y; ++i) {
1394 for (k=0; k < n; ++k) {
1395 output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1396 }
1397 if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1398 }
1399 STBI_FREE(data);
1400 return output;
1401 }
1402 #endif
1403
1404 #ifndef STBI_NO_HDR
1405 #define stbi__float2int(x) ((int) (x))
1406 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1407 {
1408 int i,k,n;
1409 stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1410 if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1411 // compute number of non-alpha components
1412 if (comp & 1) n = comp; else n = comp-1;
1413 for (i=0; i < x*y; ++i) {
1414 for (k=0; k < n; ++k) {
1415 float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1416 if (z < 0) z = 0;
1417 if (z > 255) z = 255;
1418 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1419 }
1420 if (k < comp) {
1421 float z = data[i*comp+k] * 255 + 0.5f;
1422 if (z < 0) z = 0;
1423 if (z > 255) z = 255;
1424 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1425 }
1426 }
1427 STBI_FREE(data);
1428 return output;
1429 }
1430 #endif
1431
1432 //////////////////////////////////////////////////////////////////////////////
1433 //
1434 // "baseline" JPEG/JFIF decoder
1435 //
1436 // simple implementation
1437 // - doesn't support delayed output of y-dimension
1438 // - simple interface (only one output format: 8-bit interleaved RGB)
1439 // - doesn't try to recover corrupt jpegs
1440 // - doesn't allow partial loading, loading multiple at once
1441 // - still fast on x86 (copying globals into locals doesn't help x86)
1442 // - allocates lots of intermediate memory (full size of all components)
1443 // - non-interleaved case requires this anyway
1444 // - allows good upsampling (see next)
1445 // high-quality
1446 // - upsampled channels are bilinearly interpolated, even across blocks
1447 // - quality integer IDCT derived from IJG's 'slow'
1448 // performance
1449 // - fast huffman; reasonable integer IDCT
1450 // - some SIMD kernels for common paths on targets with SSE2/NEON
1451 // - uses a lot of intermediate memory, could cache poorly
1452
1453 #ifndef STBI_NO_JPEG
1454
1455 // huffman decoding acceleration
1456 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1457
1458 typedef struct
1459 {
1460 stbi_uc fast[1 << FAST_BITS];
1461 // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1462 stbi__uint16 code[256];
1463 stbi_uc values[256];
1464 stbi_uc size[257];
1465 unsigned int maxcode[18];
1466 int delta[17]; // old 'firstsymbol' - old 'firstcode'
1467 } stbi__huffman;
1468
1469 typedef struct
1470 {
1471 stbi__context *s;
1472 stbi__huffman huff_dc[4];
1473 stbi__huffman huff_ac[4];
1474 stbi_uc dequant[4][64];
1475 stbi__int16 fast_ac[4][1 << FAST_BITS];
1476
1477 // sizes for components, interleaved MCUs
1478 int img_h_max, img_v_max;
1479 int img_mcu_x, img_mcu_y;
1480 int img_mcu_w, img_mcu_h;
1481
1482 // definition of jpeg image component
1483 struct
1484 {
1485 int id;
1486 int h,v;
1487 int tq;
1488 int hd,ha;
1489 int dc_pred;
1490
1491 int x,y,w2,h2;
1492 stbi_uc *data;
1493 void *raw_data, *raw_coeff;
1494 stbi_uc *linebuf;
1495 short *coeff; // progressive only
1496 int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1497 } img_comp[4];
1498
1499 stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1500 int code_bits; // number of valid bits
1501 unsigned char marker; // marker seen while filling entropy buffer
1502 int nomore; // flag if we saw a marker so must stop
1503
1504 int progressive;
1505 int spec_start;
1506 int spec_end;
1507 int succ_high;
1508 int succ_low;
1509 int eob_run;
1510 int rgb;
1511
1512 int scan_n, order[4];
1513 int restart_interval, todo;
1514
1515 // kernels
1516 void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1517 void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1518 stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1519 } stbi__jpeg;
1520
1521 static int stbi__build_huffman(stbi__huffman *h, int *count)
1522 {
1523 int i,j,k=0,code;
1524 // build size list for each symbol (from JPEG spec)
1525 for (i=0; i < 16; ++i)
1526 for (j=0; j < count[i]; ++j)
1527 h->size[k++] = (stbi_uc) (i+1);
1528 h->size[k] = 0;
1529
1530 // compute actual symbols (from jpeg spec)
1531 code = 0;
1532 k = 0;
1533 for(j=1; j <= 16; ++j) {
1534 // compute delta to add to code to compute symbol id
1535 h->delta[j] = k - code;
1536 if (h->size[k] == j) {
1537 while (h->size[k] == j)
1538 h->code[k++] = (stbi__uint16) (code++);
1539 if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1540 }
1541 // compute largest code + 1 for this size, preshifted as needed later
1542 h->maxcode[j] = code << (16-j);
1543 code <<= 1;
1544 }
1545 h->maxcode[j] = 0xffffffff;
1546
1547 // build non-spec acceleration table; 255 is flag for not-accelerated
1548 memset(h->fast, 255, 1 << FAST_BITS);
1549 for (i=0; i < k; ++i) {
1550 int s = h->size[i];
1551 if (s <= FAST_BITS) {
1552 int c = h->code[i] << (FAST_BITS-s);
1553 int m = 1 << (FAST_BITS-s);
1554 for (j=0; j < m; ++j) {
1555 h->fast[c+j] = (stbi_uc) i;
1556 }
1557 }
1558 }
1559 return 1;
1560 }
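// Worked example for stbi__build_huffman (illustrative input, not taken
// from a real file): with count[] = {0,2,1,0,...}, i.e. two codes of
// length 2 and one code of length 3, the loops above produce
//    index 0: size 2, code 00
//    index 1: size 2, code 01
//    index 2: size 3, code 100
// together with delta[2] = 0, delta[3] = -2 (code 4 + delta = index 2),
// maxcode[2] = 0x8000 and maxcode[3] = 0xA000.
// With FAST_BITS = 9 the acceleration table becomes fast[0..127] = 0,
// fast[128..255] = 1 and fast[256..319] = 2; all other entries stay 255.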
1561
1562 // build a table that decodes both magnitude and value of small ACs in
1563 // one go.
1564 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1565 {
1566 int i;
1567 for (i=0; i < (1 << FAST_BITS); ++i) {
1568 stbi_uc fast = h->fast[i];
1569 fast_ac[i] = 0;
1570 if (fast < 255) {
1571 int rs = h->values[fast];
1572 int run = (rs >> 4) & 15;
1573 int magbits = rs & 15;
1574 int len = h->size[fast];
1575
1576 if (magbits && len + magbits <= FAST_BITS) {
1577 // magnitude code followed by receive_extend code
1578 int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1579 int m = 1 << (magbits - 1);
1580 if (k < m) k += (~0U << magbits) + 1; // unsigned shift: left-shifting -1 is undefined behavior
1581 // if the result is small enough, we can fit it in fast_ac table
1582 if (k >= -128 && k <= 127)
1583 fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
1584 }
1585 }
1586 }
1587 }
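// Layout of a fast_ac entry, with a worked example: the coefficient value
// sits in bits 8..15, the zero run in bits 4..7 and the total number of
// bits consumed (huffman code plus magnitude bits) in bits 0..3.
// A coefficient of -3 after a run of 2 zeros that consumes 7 bits is stored
// as -3*256 + 2*16 + 7 = -729. The decoder recovers the fields as
//    -729 >> 8        == -3   (relies on an arithmetic right shift)
//    (-729 >> 4) & 15 ==  2
//    -729 & 15        ==  7
// A valid entry is never 0, so 0 can mean "no fast path for this prefix".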
1588
1589 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1590 {
1591 do {
1592 int b = j->nomore ? 0 : stbi__get8(j->s);
1593 if (b == 0xff) {
1594 int c = stbi__get8(j->s);
1595 if (c != 0) {
1596 j->marker = (unsigned char) c;
1597 j->nomore = 1;
1598 return;
1599 }
1600 }
1601 j->code_buffer |= b << (24 - j->code_bits);
1602 j->code_bits += 8;
1603 } while (j->code_bits <= 24);
1604 }
1605
1606 // (1 << n) - 1
1607 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
1608
1609 // decode a jpeg huffman value from the bitstream
1610 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
1611 {
1612 unsigned int temp;
1613 int c,k;
1614
1615 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1616
1617 // look at the top FAST_BITS and determine what symbol ID it is,
1618 // if the code is <= FAST_BITS
1619 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1620 k = h->fast[c];
1621 if (k < 255) {
1622 int s = h->size[k];
1623 if (s > j->code_bits)
1624 return -1;
1625 j->code_buffer <<= s;
1626 j->code_bits -= s;
1627 return h->values[k];
1628 }
1629
1630 // naive test is to shift the code_buffer down so k bits are
1631 // valid, then test against maxcode. To speed this up, we've
1632 // preshifted maxcode left so that it has (16-k) 0s at the
1633 // end; in other words, regardless of the number of bits, it
1634 // wants to be compared against something shifted to have 16;
1635 // that way we don't need to shift inside the loop.
1636 temp = j->code_buffer >> 16;
1637 for (k=FAST_BITS+1 ; ; ++k)
1638 if (temp < h->maxcode[k])
1639 break;
1640 if (k == 17) {
1641 // error! code not found
1642 j->code_bits -= 16;
1643 return -1;
1644 }
1645
1646 if (k > j->code_bits)
1647 return -1;
1648
1649 // convert the huffman code to the symbol id
1650 c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1651 STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1652
1653 // convert the id to a symbol
1654 j->code_bits -= k;
1655 j->code_buffer <<= k;
1656 return h->values[c];
1657 }
1658
1659 // bias[n] = (-1<<n) + 1
1660 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
1661
1662 // combined JPEG 'receive' and JPEG 'extend', since baseline
1663 // always extends everything it receives.
1664 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
1665 {
1666 unsigned int k;
1667 int sgn;
1668 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1669
1670 sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1671 k = stbi_lrot(j->code_buffer, n);
1672 STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
1673 j->code_buffer = k & ~stbi__bmask[n];
1674 k &= stbi__bmask[n];
1675 j->code_bits -= n;
1676 return k + (stbi__jbias[n] & ~sgn);
1677 }
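// Worked example for n = 3: three received bits cover the two ranges
// -7..-4 and 4..7. If the first bit is 1, the result is the bits read as an
// unsigned number (101 -> 5). If the first bit is 0, stbi__jbias[3] = -7 is
// added (010 -> 2 - 7 = -5, 000 -> -7, 011 -> -4).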
1678
1679 // get some unsigned bits
1680 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
1681 {
1682 unsigned int k;
1683 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1684 k = stbi_lrot(j->code_buffer, n);
1685 j->code_buffer = k & ~stbi__bmask[n];
1686 k &= stbi__bmask[n];
1687 j->code_bits -= n;
1688 return k;
1689 }
1690
1691 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
1692 {
1693 unsigned int k;
1694 if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1695 k = j->code_buffer;
1696 j->code_buffer <<= 1;
1697 --j->code_bits;
1698 return k & 0x80000000;
1699 }
1700
1701 // given a value that's at position X in the zigzag stream,
1702 // where does it appear in the 8x8 matrix coded as row-major?
1703 static stbi_uc stbi__jpeg_dezigzag[64+15] =
1704 {
1705 0, 1, 8, 16, 9, 2, 3, 10,
1706 17, 24, 32, 25, 18, 11, 4, 5,
1707 12, 19, 26, 33, 40, 48, 41, 34,
1708 27, 20, 13, 6, 7, 14, 21, 28,
1709 35, 42, 49, 56, 57, 50, 43, 36,
1710 29, 22, 15, 23, 30, 37, 44, 51,
1711 58, 59, 52, 45, 38, 31, 39, 46,
1712 53, 60, 61, 54, 47, 55, 62, 63,
1713 // let corrupt input sample past end
1714 63, 63, 63, 63, 63, 63, 63, 63,
1715 63, 63, 63, 63, 63, 63, 63
1716 };
1717
1718 // decode one 64-entry block--
1719 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1720 {
1721 int diff,dc,k;
1722 int t;
1723
1724 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1725 t = stbi__jpeg_huff_decode(j, hdc);
1726 if (t < 0 || t > 15) return stbi__err("bad huffman code","Corrupt JPEG"); // stbi__jbias has only 16 entries
1727
1728 // 0 all the ac values now so we can do it 32-bits at a time
1729 memset(data,0,64*sizeof(data[0]));
1730
1731 diff = t ? stbi__extend_receive(j, t) : 0;
1732 dc = j->img_comp[b].dc_pred + diff;
1733 j->img_comp[b].dc_pred = dc;
1734 data[0] = (short) (dc * dequant[0]);
1735
1736 // decode AC components, see JPEG spec
1737 k = 1;
1738 do {
1739 unsigned int zig;
1740 int c,r,s;
1741 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1742 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1743 r = fac[c];
1744 if (r) { // fast-AC path
1745 k += (r >> 4) & 15; // run
1746 s = r & 15; // combined length
1747 j->code_buffer <<= s;
1748 j->code_bits -= s;
1749 // decode into unzigzag'd location
1750 zig = stbi__jpeg_dezigzag[k++];
1751 data[zig] = (short) ((r >> 8) * dequant[zig]);
1752 } else {
1753 int rs = stbi__jpeg_huff_decode(j, hac);
1754 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1755 s = rs & 15;
1756 r = rs >> 4;
1757 if (s == 0) {
1758 if (rs != 0xf0) break; // end block
1759 k += 16;
1760 } else {
1761 k += r;
1762 // decode into unzigzag'd location
1763 zig = stbi__jpeg_dezigzag[k++];
1764 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1765 }
1766 }
1767 } while (k < 64);
1768 return 1;
1769 }
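
// The fast-AC path above unpacks a 16-bit table entry into a run, a
// combined bit length, and a signed coefficient value. A hedged sketch
// of that layout follows. The packing side is an assumption inferred
// from the unpacking code (the table builder is not shown in this
// section), and both example_* helpers are hypothetical:

```c
// Pack value/run/length the way the unpacking code expects:
// bits 8..15 = signed value, bits 4..7 = zero run, bits 0..3 = bits consumed.
static short example_pack_fast_ac(int value, int run, int len)
{
   return (short) (value * 256 + run * 16 + len);
}

// Mirror of the fast-AC branch: relies on >> being an arithmetic shift
// for negative ints, as the decoder itself does.
static void example_unpack_fast_ac(short entry, int *value, int *run, int *len)
{
   int r = entry;
   *run   = (r >> 4) & 15;
   *len   = r & 15;
   *value = r >> 8;
}
```

// An entry of 0 is what the decoder treats as "no fast path available".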
1770
static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
{
   int diff,dc;
   int t;
   if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");

   if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);

   if (j->succ_high == 0) {
      // first scan for DC coefficient, must be first
      memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
      t = stbi__jpeg_huff_decode(j, hdc);
      // reject an invalid code before using it as a bit count,
      // the same way stbi__jpeg_decode_block does
      if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
      diff = t ? stbi__extend_receive(j, t) : 0;

      dc = j->img_comp[b].dc_pred + diff;
      j->img_comp[b].dc_pred = dc;
      data[0] = (short) (dc << j->succ_low);
   } else {
      // refinement scan for DC coefficient
      if (stbi__jpeg_get_bit(j))
         data[0] += (short) (1 << j->succ_low);
   }
   return 1;
}
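
// A small sketch of the successive-approximation arithmetic used for
// DC above: the first scan stores the transmitted value scaled up by
// succ_low, and each refinement scan may add one lower bit. The
// example_* names are hypothetical; multiplication stands in for the
// left shift so negative values stay well defined.

```c
// First DC scan: the stream carries the DC value with its low
// succ_low bits dropped, so scale it back up.
static short example_dc_first(int dc_sent, int succ_low)
{
   return (short) (dc_sent * (1 << succ_low));
}

// DC refinement scan: one raw bit per block supplies bit number succ_low.
static short example_dc_refine(short coeff, int bit, int succ_low)
{
   if (bit)
      coeff = (short) (coeff + (1 << succ_low));
   return coeff;
}
```

// For instance, a true DC of 13 sent as 6 with succ_low=1 is stored as
// 12, and a refinement bit of 1 at succ_low=0 restores 13.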
1795
1796 // @OPTIMIZE: store non-zigzagged during the decode passes,
1797 // and only de-zigzag when dequantizing
static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1799 {
1800 int k;
1801 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1802
1803 if (j->succ_high == 0) {
1804 int shift = j->succ_low;
1805
1806 if (j->eob_run) {
1807 --j->eob_run;
1808 return 1;
1809 }
1810
1811 k = j->spec_start;
1812 do {
1813 unsigned int zig;
1814 int c,r,s;
1815 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1816 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1817 r = fac[c];
1818 if (r) { // fast-AC path
1819 k += (r >> 4) & 15; // run
1820 s = r & 15; // combined length
1821 j->code_buffer <<= s;
1822 j->code_bits -= s;
1823 zig = stbi__jpeg_dezigzag[k++];
1824 data[zig] = (short) ((r >> 8) << shift);
1825 } else {
1826 int rs = stbi__jpeg_huff_decode(j, hac);
1827 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1828 s = rs & 15;
1829 r = rs >> 4;
1830 if (s == 0) {
1831 if (r < 15) {
1832 j->eob_run = (1 << r);
1833 if (r)
1834 j->eob_run += stbi__jpeg_get_bits(j, r);
1835 --j->eob_run;
1836 break;
1837 }
1838 k += 16;
1839 } else {
1840 k += r;
1841 zig = stbi__jpeg_dezigzag[k++];
1842 data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1843 }
1844 }
1845 } while (k <= j->spec_end);
1846 } else {
1847 // refinement scan for these AC coefficients
1848
1849 short bit = (short) (1 << j->succ_low);
1850
1851 if (j->eob_run) {
1852 --j->eob_run;
1853 for (k = j->spec_start; k <= j->spec_end; ++k) {
1854 short *p = &data[stbi__jpeg_dezigzag[k]];
1855 if (*p != 0)
1856 if (stbi__jpeg_get_bit(j))
1857 if ((*p & bit)==0) {
1858 if (*p > 0)
1859 *p += bit;
1860 else
1861 *p -= bit;
1862 }
1863 }
1864 } else {
1865 k = j->spec_start;
1866 do {
1867 int r,s;
1868 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1869 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1870 s = rs & 15;
1871 r = rs >> 4;
1872 if (s == 0) {
1873 if (r < 15) {
1874 j->eob_run = (1 << r) - 1;
1875 if (r)
1876 j->eob_run += stbi__jpeg_get_bits(j, r);
1877 r = 64; // force end of block
1878 } else {
1879 // r=15 s=0 should write 16 0s, so we just do
1880 // a run of 15 0s and then write s (which is 0),
1881 // so we don't have to do anything special here
1882 }
1883 } else {
1884 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1885 // sign bit
1886 if (stbi__jpeg_get_bit(j))
1887 s = bit;
1888 else
1889 s = -bit;
1890 }
1891
1892 // advance by r
1893 while (k <= j->spec_end) {
1894 short *p = &data[stbi__jpeg_dezigzag[k++]];
1895 if (*p != 0) {
1896 if (stbi__jpeg_get_bit(j))
1897 if ((*p & bit)==0) {
1898 if (*p > 0)
1899 *p += bit;
1900 else
1901 *p -= bit;
1902 }
1903 } else {
1904 if (r == 0) {
1905 *p = (short) s;
1906 break;
1907 }
1908 --r;
1909 }
1910 }
1911 } while (k <= j->spec_end);
1912 }
1913 }
1914 return 1;
1915 }
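
// Both AC paths above turn an end-of-band symbol (s == 0, r < 15) into
// a count of blocks to skip. As a hedged illustration, with the
// hypothetical helper example_eob_run_remaining standing in for the
// inline arithmetic:

```c
// r:     upper nibble of the Huffman symbol, 0..14
// extra: value of the r additional bits read from the stream
// Returns how many blocks after the current one are also ended.
static int example_eob_run_remaining(int r, int extra)
{
   int run = (1 << r) + extra;   // blocks covered, including the current one
   return run - 1;               // the current block is finished right away
}
```

// The decoder stores this value in eob_run and decrements it once per
// subsequent block until it reaches zero.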
1916
// clamp an int to 0..255; callers have already added the +128 level
// shift, so this only has to saturate out-of-range results
stbi_inline static stbi_uc stbi__clamp(int x)
{
   // trick to use a single test to catch both cases
   if ((unsigned int) x > 255) {
      if (x < 0) return 0;
      if (x > 255) return 255;
   }
   return (stbi_uc) x;
}
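
// Why the single test works: converting a negative int to unsigned
// wraps it to a value far above 255, so one unsigned comparison
// catches both x < 0 and x > 255. A standalone sketch (example_clamp_u8
// is a hypothetical name, not part of the library):

```c
static unsigned char example_clamp_u8(int x)
{
   // in-range values take the fast path with one compare
   if ((unsigned int) x > 255)
      return (unsigned char) (x < 0 ? 0 : 255);
   return (unsigned char) x;
}
```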
1927
1928 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1929 #define stbi__fsh(x) ((x) << 12)
1930
1931 // derived from jidctint -- DCT_ISLOW
1932 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1933 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1934 p2 = s2; \
1935 p3 = s6; \
1936 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1937 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1938 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1939 p2 = s0; \
1940 p3 = s4; \
1941 t0 = stbi__fsh(p2+p3); \
1942 t1 = stbi__fsh(p2-p3); \
1943 x0 = t0+t3; \
1944 x3 = t0-t3; \
1945 x1 = t1+t2; \
1946 x2 = t1-t2; \
1947 t0 = s7; \
1948 t1 = s5; \
1949 t2 = s3; \
1950 t3 = s1; \
1951 p3 = t0+t2; \
1952 p4 = t1+t3; \
1953 p1 = t0+t3; \
1954 p2 = t1+t2; \
1955 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1956 t0 = t0*stbi__f2f( 0.298631336f); \
1957 t1 = t1*stbi__f2f( 2.053119869f); \
1958 t2 = t2*stbi__f2f( 3.072711026f); \
1959 t3 = t3*stbi__f2f( 1.501321110f); \
1960 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
1961 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
1962 p3 = p3*stbi__f2f(-1.961570560f); \
1963 p4 = p4*stbi__f2f(-0.390180644f); \
1964 t3 += p1+p4; \
1965 t2 += p2+p3; \
1966 t1 += p2+p4; \
1967 t0 += p1+p3;
1968
static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
1970 {
1971 int i,val[64],*v=val;
1972 stbi_uc *o;
1973 short *d = data;
1974
1975 // columns
1976 for (i=0; i < 8; ++i,++d, ++v) {
1977 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
1978 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
1979 && d[40]==0 && d[48]==0 && d[56]==0) {
1980 // no shortcut 0 seconds
1981 // (1|2|3|4|5|6|7)==0 0 seconds
1982 // all separate -0.047 seconds
1983 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
1984 int dcterm = d[0] << 2;
1985 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
1986 } else {
1987 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
1988 // constants scaled things up by 1<<12; let's bring them back
1989 // down, but keep 2 extra bits of precision
1990 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
1991 v[ 0] = (x0+t3) >> 10;
1992 v[56] = (x0-t3) >> 10;
1993 v[ 8] = (x1+t2) >> 10;
1994 v[48] = (x1-t2) >> 10;
1995 v[16] = (x2+t1) >> 10;
1996 v[40] = (x2-t1) >> 10;
1997 v[24] = (x3+t0) >> 10;
1998 v[32] = (x3-t0) >> 10;
1999 }
2000 }
2001
2002 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2003 // no fast case since the first 1D IDCT spread components out
2004 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2005 // constants scaled things up by 1<<12, plus we had 1<<2 from first
2006 // loop, plus horizontal and vertical each scale by sqrt(8) so together
2007 // we've got an extra 1<<3, so 1<<17 total we need to remove.
2008 // so we want to round that, which means adding 0.5 * 1<<17,
2009 // aka 65536. Also, we'll end up with -128 to 127 that we want
2010 // to encode as 0..255 by adding 128, so we'll add that before the shift
2011 x0 += 65536 + (128<<17);
2012 x1 += 65536 + (128<<17);
2013 x2 += 65536 + (128<<17);
2014 x3 += 65536 + (128<<17);
2015 // tried computing the shifts into temps, or'ing the temps to see
2016 // if any were out of range, but that was slower
2017 o[0] = stbi__clamp((x0+t3) >> 17);
2018 o[7] = stbi__clamp((x0-t3) >> 17);
2019 o[1] = stbi__clamp((x1+t2) >> 17);
2020 o[6] = stbi__clamp((x1-t2) >> 17);
2021 o[2] = stbi__clamp((x2+t1) >> 17);
2022 o[5] = stbi__clamp((x2-t1) >> 17);
2023 o[3] = stbi__clamp((x3+t0) >> 17);
2024 o[4] = stbi__clamp((x3-t0) >> 17);
2025 }
2026 }
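
// The scaling described in the comments above (12 bits from the
// constants, 2 extra bits from the column pass, 3 bits from the two
// sqrt(8) factors, 17 in total) is easiest to check on a block whose
// only nonzero coefficient is the DC term. The sketch below follows
// that one path through stbi__idct_block by hand; example_dc_only_pixel
// is a hypothetical helper and takes the already dequantized DC value.

```c
static unsigned char example_dc_only_pixel(int dc)
{
   int v  = dc * 4;              // column pass shortcut: d[0] << 2
   int x0 = v * 4096;            // row pass: stbi__fsh(v), i.e. v << 12
   x0 += 65536 + (128 << 17);    // rounding bias plus the +128 level shift
   x0 >>= 17;                    // remove the accumulated 1<<17 scale
   // (assumes arithmetic >> for negative sums, as the decoder does)
   if (x0 < 0) return 0;
   if (x0 > 255) return 255;
   return (unsigned char) x0;
}
```

// Every pixel of such a block comes out as 128 + dc/8, rounded and
// clamped to 0..255.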
2027
2028 #ifdef STBI_SSE2
2029 // sse2 integer IDCT. not the fastest possible implementation but it
2030 // produces bit-identical results to the generic C version so it's
2031 // fully "transparent".
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2033 {
2034 // This is constructed to match our regular (generic) integer IDCT exactly.
2035 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2036 __m128i tmp;
2037
2038 // dot product constant: even elems=x, odd elems=y
2039 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2040
2041 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2042 // out(1) = c1[even]*x + c1[odd]*y
2043 #define dct_rot(out0,out1, x,y,c0,c1) \
2044 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2045 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2046 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2047 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2048 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2049 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2050
2051 // out = in << 12 (in 16-bit, out 32-bit)
2052 #define dct_widen(out, in) \
2053 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2054 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2055
2056 // wide add
2057 #define dct_wadd(out, a, b) \
2058 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2059 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2060
2061 // wide sub
2062 #define dct_wsub(out, a, b) \
2063 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2064 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2065
2066 // butterfly a/b, add bias, then shift by "s" and pack
2067 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2068 { \
2069 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2070 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2071 dct_wadd(sum, abiased, b); \
2072 dct_wsub(dif, abiased, b); \
2073 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2074 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2075 }
2076
2077 // 8-bit interleave step (for transposes)
2078 #define dct_interleave8(a, b) \
2079 tmp = a; \
2080 a = _mm_unpacklo_epi8(a, b); \
2081 b = _mm_unpackhi_epi8(tmp, b)
2082
2083 // 16-bit interleave step (for transposes)
2084 #define dct_interleave16(a, b) \
2085 tmp = a; \
2086 a = _mm_unpacklo_epi16(a, b); \
2087 b = _mm_unpackhi_epi16(tmp, b)
2088
2089 #define dct_pass(bias,shift) \
2090 { \
2091 /* even part */ \
2092 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2093 __m128i sum04 = _mm_add_epi16(row0, row4); \
2094 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2095 dct_widen(t0e, sum04); \
2096 dct_widen(t1e, dif04); \
2097 dct_wadd(x0, t0e, t3e); \
2098 dct_wsub(x3, t0e, t3e); \
2099 dct_wadd(x1, t1e, t2e); \
2100 dct_wsub(x2, t1e, t2e); \
2101 /* odd part */ \
2102 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2103 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2104 __m128i sum17 = _mm_add_epi16(row1, row7); \
2105 __m128i sum35 = _mm_add_epi16(row3, row5); \
2106 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2107 dct_wadd(x4, y0o, y4o); \
2108 dct_wadd(x5, y1o, y5o); \
2109 dct_wadd(x6, y2o, y5o); \
2110 dct_wadd(x7, y3o, y4o); \
2111 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2112 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2113 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2114 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2115 }
2116
2117 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2118 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2119 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2120 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2121 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2122 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2123 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2124 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2125
2126 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2127 __m128i bias_0 = _mm_set1_epi32(512);
2128 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2129
2130 // load
2131 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2132 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2133 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2134 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2135 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2136 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2137 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2138 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2139
2140 // column pass
2141 dct_pass(bias_0, 10);
2142
2143 {
2144 // 16bit 8x8 transpose pass 1
2145 dct_interleave16(row0, row4);
2146 dct_interleave16(row1, row5);
2147 dct_interleave16(row2, row6);
2148 dct_interleave16(row3, row7);
2149
2150 // transpose pass 2
2151 dct_interleave16(row0, row2);
2152 dct_interleave16(row1, row3);
2153 dct_interleave16(row4, row6);
2154 dct_interleave16(row5, row7);
2155
2156 // transpose pass 3
2157 dct_interleave16(row0, row1);
2158 dct_interleave16(row2, row3);
2159 dct_interleave16(row4, row5);
2160 dct_interleave16(row6, row7);
2161 }
2162
2163 // row pass
2164 dct_pass(bias_1, 17);
2165
2166 {
2167 // pack
2168 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2169 __m128i p1 = _mm_packus_epi16(row2, row3);
2170 __m128i p2 = _mm_packus_epi16(row4, row5);
2171 __m128i p3 = _mm_packus_epi16(row6, row7);
2172
2173 // 8bit 8x8 transpose pass 1
2174 dct_interleave8(p0, p2); // a0e0a1e1...
2175 dct_interleave8(p1, p3); // c0g0c1g1...
2176
2177 // transpose pass 2
2178 dct_interleave8(p0, p1); // a0c0e0g0...
2179 dct_interleave8(p2, p3); // b0d0f0h0...
2180
2181 // transpose pass 3
2182 dct_interleave8(p0, p2); // a0b0c0d0...
2183 dct_interleave8(p1, p3); // a4b4c4d4...
2184
2185 // store
2186 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2187 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2188 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2189 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2190 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2191 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2192 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2193 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2194 }
2195
2196 #undef dct_const
2197 #undef dct_rot
2198 #undef dct_widen
2199 #undef dct_wadd
2200 #undef dct_wsub
2201 #undef dct_bfly32o
2202 #undef dct_interleave8
2203 #undef dct_interleave16
2204 #undef dct_pass
2205 }
2206
2207 #endif // STBI_SSE2
2208
2209 #ifdef STBI_NEON
2210
2211 // NEON integer IDCT. should produce bit-identical
2212 // results to the generic C version.
static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2214 {
2215 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2216
2217 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2218 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2219 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2220 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2221 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2222 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2223 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2224 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2225 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2226 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2227 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2228 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2229
2230 #define dct_long_mul(out, inq, coeff) \
2231 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2232 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2233
2234 #define dct_long_mac(out, acc, inq, coeff) \
2235 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2236 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2237
2238 #define dct_widen(out, inq) \
2239 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2240 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2241
2242 // wide add
2243 #define dct_wadd(out, a, b) \
2244 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2245 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2246
2247 // wide sub
2248 #define dct_wsub(out, a, b) \
2249 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2250 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2251
2252 // butterfly a/b, then shift using "shiftop" by "s" and pack
2253 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2254 { \
2255 dct_wadd(sum, a, b); \
2256 dct_wsub(dif, a, b); \
2257 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2258 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2259 }
2260
2261 #define dct_pass(shiftop, shift) \
2262 { \
2263 /* even part */ \
2264 int16x8_t sum26 = vaddq_s16(row2, row6); \
2265 dct_long_mul(p1e, sum26, rot0_0); \
2266 dct_long_mac(t2e, p1e, row6, rot0_1); \
2267 dct_long_mac(t3e, p1e, row2, rot0_2); \
2268 int16x8_t sum04 = vaddq_s16(row0, row4); \
2269 int16x8_t dif04 = vsubq_s16(row0, row4); \
2270 dct_widen(t0e, sum04); \
2271 dct_widen(t1e, dif04); \
2272 dct_wadd(x0, t0e, t3e); \
2273 dct_wsub(x3, t0e, t3e); \
2274 dct_wadd(x1, t1e, t2e); \
2275 dct_wsub(x2, t1e, t2e); \
2276 /* odd part */ \
2277 int16x8_t sum15 = vaddq_s16(row1, row5); \
2278 int16x8_t sum17 = vaddq_s16(row1, row7); \
2279 int16x8_t sum35 = vaddq_s16(row3, row5); \
2280 int16x8_t sum37 = vaddq_s16(row3, row7); \
2281 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2282 dct_long_mul(p5o, sumodd, rot1_0); \
2283 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2284 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2285 dct_long_mul(p3o, sum37, rot2_0); \
2286 dct_long_mul(p4o, sum15, rot2_1); \
2287 dct_wadd(sump13o, p1o, p3o); \
2288 dct_wadd(sump24o, p2o, p4o); \
2289 dct_wadd(sump23o, p2o, p3o); \
2290 dct_wadd(sump14o, p1o, p4o); \
2291 dct_long_mac(x4, sump13o, row7, rot3_0); \
2292 dct_long_mac(x5, sump24o, row5, rot3_1); \
2293 dct_long_mac(x6, sump23o, row3, rot3_2); \
2294 dct_long_mac(x7, sump14o, row1, rot3_3); \
2295 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2296 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2297 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2298 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2299 }
2300
2301 // load
2302 row0 = vld1q_s16(data + 0*8);
2303 row1 = vld1q_s16(data + 1*8);
2304 row2 = vld1q_s16(data + 2*8);
2305 row3 = vld1q_s16(data + 3*8);
2306 row4 = vld1q_s16(data + 4*8);
2307 row5 = vld1q_s16(data + 5*8);
2308 row6 = vld1q_s16(data + 6*8);
2309 row7 = vld1q_s16(data + 7*8);
2310
2311 // add DC bias
2312 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2313
2314 // column pass
2315 dct_pass(vrshrn_n_s32, 10);
2316
2317 // 16bit 8x8 transpose
2318 {
2319 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2320 // whether compilers actually get this is another story, sadly.
2321 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2322 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2323 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2324
2325 // pass 1
2326 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2327 dct_trn16(row2, row3);
2328 dct_trn16(row4, row5);
2329 dct_trn16(row6, row7);
2330
2331 // pass 2
2332 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2333 dct_trn32(row1, row3);
2334 dct_trn32(row4, row6);
2335 dct_trn32(row5, row7);
2336
2337 // pass 3
2338 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2339 dct_trn64(row1, row5);
2340 dct_trn64(row2, row6);
2341 dct_trn64(row3, row7);
2342
2343 #undef dct_trn16
2344 #undef dct_trn32
2345 #undef dct_trn64
2346 }
2347
2348 // row pass
2349 // vrshrn_n_s32 only supports shifts up to 16, we need
2350 // 17. so do a non-rounding shift of 16 first then follow
2351 // up with a rounding shift by 1.
2352 dct_pass(vshrn_n_s32, 16);
2353
2354 {
2355 // pack and round
2356 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2357 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2358 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2359 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2360 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2361 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2362 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2363 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2364
2365 // again, these can translate into one instruction, but often don't.
2366 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2367 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2368 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2369
2370 // sadly can't use interleaved stores here since we only write
2371 // 8 bytes to each scan line!
2372
2373 // 8x8 8-bit transpose pass 1
2374 dct_trn8_8(p0, p1);
2375 dct_trn8_8(p2, p3);
2376 dct_trn8_8(p4, p5);
2377 dct_trn8_8(p6, p7);
2378
2379 // pass 2
2380 dct_trn8_16(p0, p2);
2381 dct_trn8_16(p1, p3);
2382 dct_trn8_16(p4, p6);
2383 dct_trn8_16(p5, p7);
2384
2385 // pass 3
2386 dct_trn8_32(p0, p4);
2387 dct_trn8_32(p1, p5);
2388 dct_trn8_32(p2, p6);
2389 dct_trn8_32(p3, p7);
2390
2391 // store
2392 vst1_u8(out, p0); out += out_stride;
2393 vst1_u8(out, p1); out += out_stride;
2394 vst1_u8(out, p2); out += out_stride;
2395 vst1_u8(out, p3); out += out_stride;
2396 vst1_u8(out, p4); out += out_stride;
2397 vst1_u8(out, p5); out += out_stride;
2398 vst1_u8(out, p6); out += out_stride;
2399 vst1_u8(out, p7);
2400
2401 #undef dct_trn8_8
2402 #undef dct_trn8_16
2403 #undef dct_trn8_32
2404 }
2405
2406 #undef dct_long_mul
2407 #undef dct_long_mac
2408 #undef dct_widen
2409 #undef dct_wadd
2410 #undef dct_wsub
2411 #undef dct_bfly32o
2412 #undef dct_pass
2413 }
2414
2415 #endif // STBI_NEON
2416
#define STBI__MARKER_none  0xff
// if there's a pending marker from the entropy stream, return that;
// otherwise, fetch from the stream and get a marker. if there's no
// marker, return 0xff, which is never a valid marker value
static stbi_uc stbi__get_marker(stbi__jpeg *j)
{
   stbi_uc x;
   if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
   x = stbi__get8(j->s);
   if (x != 0xff) return STBI__MARKER_none;
   while (x == 0xff)
      x = stbi__get8(j->s);
   return x;
}
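
// The same marker scan, sketched over a plain memory buffer so it can
// be tried in isolation. This is an illustration only: the names are
// hypothetical, there is no pending-marker state, and running out of
// input returns "no marker" here, which is an assumption of the sketch
// rather than the behavior of stbi__get8.

```c
#include <stddef.h>

#define EXAMPLE_MARKER_NONE 0xff

static unsigned char example_get_marker(const unsigned char *buf, size_t len, size_t *pos)
{
   unsigned char x;
   if (*pos >= len) return EXAMPLE_MARKER_NONE;
   x = buf[(*pos)++];
   if (x != 0xff) return EXAMPLE_MARKER_NONE;   // not positioned at a marker
   while (x == 0xff) {                           // skip 0xff fill bytes
      if (*pos >= len) return EXAMPLE_MARKER_NONE;
      x = buf[(*pos)++];
   }
   return x;                                     // the marker code itself
}
```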
2431
2432 // in each scan, we'll have scan_n components, and the order
2433 // of the components is specified by order[]
2434 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2435
// after a restart interval, reset the entropy decoder and
// the dc prediction
static void stbi__jpeg_reset(stbi__jpeg *j)
{
   j->code_bits = 0;
   j->code_buffer = 0;
   j->nomore = 0;
   j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   j->marker = STBI__MARKER_none;
   j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   j->eob_run = 0;
   // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
   // since we don't even allow 1<<30 pixels
}
2450
static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2452 {
2453 stbi__jpeg_reset(z);
2454 if (!z->progressive) {
2455 if (z->scan_n == 1) {
2456 int i,j;
2457 STBI_SIMD_ALIGN(short, data[64]);
2458 int n = z->order[0];
2459 // non-interleaved data, we just need to process one block at a time,
2460 // in trivial scanline order
2461 // number of blocks to do just depends on how many actual "pixels" this
2462 // component has, independent of interleaved MCU blocking and such
2463 int w = (z->img_comp[n].x+7) >> 3;
2464 int h = (z->img_comp[n].y+7) >> 3;
2465 for (j=0; j < h; ++j) {
2466 for (i=0; i < w; ++i) {
2467 int ha = z->img_comp[n].ha;
2468 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2469 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2470 // every data block is an MCU, so countdown the restart interval
2471 if (--z->todo <= 0) {
2472 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2473 // if it's NOT a restart, then just bail, so we get corrupt data
2474 // rather than no data
2475 if (!STBI__RESTART(z->marker)) return 1;
2476 stbi__jpeg_reset(z);
2477 }
2478 }
2479 }
2480 return 1;
2481 } else { // interleaved
2482 int i,j,k,x,y;
2483 STBI_SIMD_ALIGN(short, data[64]);
2484 for (j=0; j < z->img_mcu_y; ++j) {
2485 for (i=0; i < z->img_mcu_x; ++i) {
2486 // scan an interleaved mcu... process scan_n components in order
2487 for (k=0; k < z->scan_n; ++k) {
2488 int n = z->order[k];
2489 // scan out an mcu's worth of this component; that's just determined
2490 // by the basic H and V specified for the component
2491 for (y=0; y < z->img_comp[n].v; ++y) {
2492 for (x=0; x < z->img_comp[n].h; ++x) {
2493 int x2 = (i*z->img_comp[n].h + x)*8;
2494 int y2 = (j*z->img_comp[n].v + y)*8;
2495 int ha = z->img_comp[n].ha;
2496 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2497 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2498 }
2499 }
2500 }
2501 // after all interleaved components, that's an interleaved MCU,
2502 // so now count down the restart interval
2503 if (--z->todo <= 0) {
2504 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2505 if (!STBI__RESTART(z->marker)) return 1;
2506 stbi__jpeg_reset(z);
2507 }
2508 }
2509 }
2510 return 1;
2511 }
2512 } else {
2513 if (z->scan_n == 1) {
2514 int i,j;
2515 int n = z->order[0];
2516 // non-interleaved data, we just need to process one block at a time,
2517 // in trivial scanline order
2518 // number of blocks to do just depends on how many actual "pixels" this
2519 // component has, independent of interleaved MCU blocking and such
2520 int w = (z->img_comp[n].x+7) >> 3;
2521 int h = (z->img_comp[n].y+7) >> 3;
2522 for (j=0; j < h; ++j) {
2523 for (i=0; i < w; ++i) {
2524 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2525 if (z->spec_start == 0) {
2526 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2527 return 0;
2528 } else {
2529 int ha = z->img_comp[n].ha;
2530 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2531 return 0;
2532 }
2533 // every data block is an MCU, so countdown the restart interval
2534 if (--z->todo <= 0) {
2535 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2536 if (!STBI__RESTART(z->marker)) return 1;
2537 stbi__jpeg_reset(z);
2538 }
2539 }
2540 }
2541 return 1;
2542 } else { // interleaved
2543 int i,j,k,x,y;
2544 for (j=0; j < z->img_mcu_y; ++j) {
2545 for (i=0; i < z->img_mcu_x; ++i) {
2546 // scan an interleaved mcu... process scan_n components in order
2547 for (k=0; k < z->scan_n; ++k) {
2548 int n = z->order[k];
2549 // scan out an mcu's worth of this component; that's just determined
2550 // by the basic H and V specified for the component
2551 for (y=0; y < z->img_comp[n].v; ++y) {
2552 for (x=0; x < z->img_comp[n].h; ++x) {
2553 int x2 = (i*z->img_comp[n].h + x);
2554 int y2 = (j*z->img_comp[n].v + y);
2555 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2556 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2557 return 0;
2558 }
2559 }
2560 }
2561 // after all interleaved components, that's an interleaved MCU,
2562 // so now count down the restart interval
2563 if (--z->todo <= 0) {
2564 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2565 if (!STBI__RESTART(z->marker)) return 1;
2566 stbi__jpeg_reset(z);
2567 }
2568 }
2569 }
2570 return 1;
2571 }
2572 }
2573 }
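
// The non-interleaved loops above size their block grid from the
// component's pixel dimensions and address each 8x8 block inside the
// component's padded plane. A minimal sketch of that index arithmetic,
// using hypothetical helper names:

```c
#include <stddef.h>

// number of 8-pixel blocks needed to cover `pixels`, rounding up
static int example_blocks_across(int pixels)
{
   return (pixels + 7) >> 3;
}

// byte offset of block (bx,by) in a plane with `stride` bytes per row,
// matching the w2*j*8 + i*8 expression used by the decoder
static size_t example_block_offset(int stride, int bx, int by)
{
   return (size_t) stride * (size_t) by * 8 + (size_t) bx * 8;
}
```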
2574
static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
{
   int i;
   for (i=0; i < 64; ++i)
      data[i] *= dequant[i];
}

static void stbi__jpeg_finish(stbi__jpeg *z)
{
   if (z->progressive) {
      // dequantize and idct the data
      int i,j,n;
      for (n=0; n < z->s->img_n; ++n) {
         int w = (z->img_comp[n].x+7) >> 3;
         int h = (z->img_comp[n].y+7) >> 3;
         for (j=0; j < h; ++j) {
            for (i=0; i < w; ++i) {
               short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
               stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
               z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
            }
         }
      }
   }
}
2600
static int stbi__process_marker(stbi__jpeg *z, int m)
2602 {
2603 int L;
2604 switch (m) {
2605 case STBI__MARKER_none: // no marker found
2606 return stbi__err("expected marker","Corrupt JPEG");
2607
2608 case 0xDD: // DRI - specify restart interval
2609 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2610 z->restart_interval = stbi__get16be(z->s);
2611 return 1;
2612
2613 case 0xDB: // DQT - define quantization table
2614 L = stbi__get16be(z->s)-2;
2615 while (L > 0) {
2616 int q = stbi__get8(z->s);
2617 int p = q >> 4;
2618 int t = q & 15,i;
2619 if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2620 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2621 for (i=0; i < 64; ++i)
2622 z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2623 L -= 65;
2624 }
2625 return L==0;
2626
2627 case 0xC4: // DHT - define huffman table
2628 L = stbi__get16be(z->s)-2;
2629 while (L > 0) {
2630 stbi_uc *v;
2631 int sizes[16],i,n=0;
2632 int q = stbi__get8(z->s);
2633 int tc = q >> 4;
2634 int th = q & 15;
2635 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2636 for (i=0; i < 16; ++i) {
2637 sizes[i] = stbi__get8(z->s);
2638 n += sizes[i];
2639 }
2640 L -= 17;
2641 if (tc == 0) {
2642 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2643 v = z->huff_dc[th].values;
2644 } else {
2645 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2646 v = z->huff_ac[th].values;
2647 }
2648 for (i=0; i < n; ++i)
2649 v[i] = stbi__get8(z->s);
2650 if (tc != 0)
2651 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2652 L -= n;
2653 }
2654 return L==0;
2655 }
2656 // check for comment block or APP blocks
2657 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2658 L = stbi__get16be(z->s); if (L < 2) return stbi__err("bad APP/COM len","Corrupt JPEG"); stbi__skip(z->s, L-2);
2659 return 1;
2660 }
2661 return stbi__err("unknown marker","Corrupt JPEG");
2662 }
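#if 0
// Illustrative sketch only, not part of stb_image. The DHT case above reads a
// class/id byte and 16 per-length counts, then as many symbol bytes as the
// counts add up to. The stand-alone helper below (hypothetical name
// sketch_dht_table_size, working on a memory buffer rather than an
// stbi__context) shows one way that bookkeeping could be validated. The cap
// of 256 symbols is an assumption based on each symbol being a single byte.
```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: validates one DHT table definition inside a payload of
// len bytes. Returns the number of bytes the table occupies (17 + symbol
// count), or 0 if the table is malformed.
static size_t sketch_dht_table_size(const unsigned char *p, size_t len)
{
   size_t i, n = 0;
   assert(p != NULL);
   if (len < 17) return 0;            // class/id byte + 16 count bytes
   if ((p[0] >> 4) > 1) return 0;     // table class: 0 = DC, 1 = AC
   if ((p[0] & 15) > 3) return 0;     // table id: 0..3
   for (i = 0; i < 16; ++i)
      n += p[1 + i];                  // number of codes of length i+1
   if (n > 256) return 0;             // more symbols than a byte can name
   if (len < 17 + n) return 0;        // symbol bytes must fit in the payload
   return 17 + n;
}
```
// A caller would subtract the returned size from the remaining segment length
// and treat a return value of 0 as a corrupt file.
#endif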
2663
2664 // after we see SOS
2665 static int stbi__process_scan_header(stbi__jpeg *z)
2666 {
2667 int i;
2668 int Ls = stbi__get16be(z->s);
2669 z->scan_n = stbi__get8(z->s);
2670 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
2671 if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
2672 for (i=0; i < z->scan_n; ++i) {
2673 int id = stbi__get8(z->s), which;
2674 int q = stbi__get8(z->s);
2675 for (which = 0; which < z->s->img_n; ++which)
2676 if (z->img_comp[which].id == id)
2677 break;
2678 if (which == z->s->img_n) return stbi__err("bad SOS component ID","Corrupt JPEG"); // no match
2679 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
2680 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
2681 z->order[i] = which;
2682 }
2683
2684 {
2685 int aa;
2686 z->spec_start = stbi__get8(z->s);
2687 z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2688 aa = stbi__get8(z->s);
2689 z->succ_high = (aa >> 4);
2690 z->succ_low = (aa & 15);
2691 if (z->progressive) {
2692 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2693 return stbi__err("bad SOS", "Corrupt JPEG");
2694 } else {
2695 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2696 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2697 z->spec_end = 63;
2698 }
2699 }
2700
2701 return 1;
2702 }
2703
2704 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2705 {
2706 stbi__context *s = z->s;
2707 int Lf,p,i,q, h_max=1,v_max=1,c;
2708 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2709 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2710 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2711 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2712 c = stbi__get8(s);
2713 if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2714 s->img_n = c;
2715 for (i=0; i < c; ++i) {
2716 z->img_comp[i].data = NULL;
2717 z->img_comp[i].linebuf = NULL;
2718 }
2719
2720 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2721
2722 z->rgb = 0;
2723 for (i=0; i < s->img_n; ++i) {
2724 static unsigned char rgb[3] = { 'R', 'G', 'B' };
2725 z->img_comp[i].id = stbi__get8(s);
2726 if (z->img_comp[i].id != i+1) // JFIF requires
2727 if (z->img_comp[i].id != i) { // some version of jpegtran outputs non-JFIF-compliant files!
2728 // some things output this (see http://fileformats.archiveteam.org/wiki/JPEG#Color_format)
2729 if (z->img_comp[i].id != rgb[i])
2730 return stbi__err("bad component ID","Corrupt JPEG");
2731 ++z->rgb;
2732 }
2733 q = stbi__get8(s);
2734 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2735 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2736 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2737 }
2738
2739 if (scan != STBI__SCAN_load) return 1;
2740
2741 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2742
2743 for (i=0; i < s->img_n; ++i) {
2744 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2745 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2746 }
2747
2748 // compute interleaved mcu info
2749 z->img_h_max = h_max;
2750 z->img_v_max = v_max;
2751 z->img_mcu_w = h_max * 8;
2752 z->img_mcu_h = v_max * 8;
2753 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2754 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2755
2756 for (i=0; i < s->img_n; ++i) {
2757 // number of effective pixels (e.g. for non-interleaved MCU)
2758 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2759 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2760 // to simplify generation, we'll allocate enough memory to decode
2761 // the bogus oversized data from using interleaved MCUs and their
2762 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2763 // discard the extra data until colorspace conversion
2764 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2765 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2766 z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2767
2768 if (z->img_comp[i].raw_data == NULL) {
2769 for(--i; i >= 0; --i) {
2770 STBI_FREE(z->img_comp[i].raw_data);
2771 z->img_comp[i].raw_data = NULL;
2772 }
2773 return stbi__err("outofmem", "Out of memory");
2774 }
2775 // align blocks for idct using mmx/sse
2776 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2777 z->img_comp[i].linebuf = NULL;
2778 if (z->progressive) {
2779 z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2780 z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
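2781 z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15); if (z->img_comp[i].raw_coeff == NULL) return stbi__err("outofmem", "Out of memory"); // caller frees earlier buffers via stbi__cleanup_jpeg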
2782 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2783 } else {
2784 z->img_comp[i].coeff = 0;
2785 z->img_comp[i].raw_coeff = 0;
2786 }
2787 }
2788
2789 return 1;
2790 }
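#if 0
// Illustrative sketch only, not part of stb_image. The frame-header code above
// derives two sizes per component: the samples that carry image data (x, y)
// and the padded size that whole interleaved MCUs occupy (w2, h2). The helper
// below repeats that arithmetic in isolation. sketch_comp and sketch_layout
// are hypothetical names, and the struct only mirrors the fields used here.
```c
#include <assert.h>

// Hypothetical struct for illustration; field names mirror img_comp[] above.
typedef struct { int h, v; int x, y; int w2, h2; } sketch_comp;

// Fills in the per-component sizes for an img_x by img_y image whose n
// components carry sampling factors h and v (each 1..4).
static void sketch_layout(sketch_comp *c, int n, int img_x, int img_y)
{
   int i, h_max = 1, v_max = 1, mcu_x, mcu_y;
   assert(n >= 1 && img_x > 0 && img_y > 0);
   for (i = 0; i < n; ++i) {
      if (c[i].h > h_max) h_max = c[i].h;
      if (c[i].v > v_max) v_max = c[i].v;
   }
   // interleaved MCUs are h_max*8 by v_max*8 pixels; round the count up
   mcu_x = (img_x + h_max*8 - 1) / (h_max*8);
   mcu_y = (img_y + v_max*8 - 1) / (v_max*8);
   for (i = 0; i < n; ++i) {
      // samples that actually carry image data
      c[i].x = (img_x * c[i].h + h_max - 1) / h_max;
      c[i].y = (img_y * c[i].v + v_max - 1) / v_max;
      // samples allocated: whole MCUs, including padding past the edge
      c[i].w2 = mcu_x * c[i].h * 8;
      c[i].h2 = mcu_y * c[i].v * 8;
   }
}
```
// For the width-33 case mentioned in the comment above, with 2x2 luma and 1x1
// chroma, three 16-pixel MCUs span the row, so luma rows are padded to 48
// samples and chroma rows to 24.
#endif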
2791
2792 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2793 #define stbi__DNL(x) ((x) == 0xdc)
2794 #define stbi__SOI(x) ((x) == 0xd8)
2795 #define stbi__EOI(x) ((x) == 0xd9)
2796 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2797 #define stbi__SOS(x) ((x) == 0xda)
2798
2799 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2800
2801 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2802 {
2803 int m;
2804 z->marker = STBI__MARKER_none; // initialize cached marker to empty
2805 m = stbi__get_marker(z);
2806 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2807 if (scan == STBI__SCAN_type) return 1;
2808 m = stbi__get_marker(z);
2809 while (!stbi__SOF(m)) {
2810 if (!stbi__process_marker(z,m)) return 0;
2811 m = stbi__get_marker(z);
2812 while (m == STBI__MARKER_none) {
2813 // some files have extra padding after their blocks, so ok, we'll scan
2814 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2815 m = stbi__get_marker(z);
2816 }
2817 }
2818 z->progressive = stbi__SOF_progressive(m);
2819 if (!stbi__process_frame_header(z, scan)) return 0;
2820 return 1;
2821 }
2822
2823 // decode image to YCbCr format
2824 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2825 {
2826 int m;
2827 for (m = 0; m < 4; m++) {
2828 j->img_comp[m].raw_data = NULL;
2829 j->img_comp[m].raw_coeff = NULL;
2830 }
2831 j->restart_interval = 0;
2832 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2833 m = stbi__get_marker(j);
2834 while (!stbi__EOI(m)) {
2835 if (stbi__SOS(m)) {
2836 if (!stbi__process_scan_header(j)) return 0;
2837 if (!stbi__parse_entropy_coded_data(j)) return 0;
2838 if (j->marker == STBI__MARKER_none ) {
2839 // handle 0s at the end of image data from IP Kamera 9060
2840 while (!stbi__at_eof(j->s)) {
2841 int x = stbi__get8(j->s);
2842 if (x == 255) {
2843 j->marker = stbi__get8(j->s);
2844 break;
2845 } else if (x != 0) {
2846 return stbi__err("junk before marker", "Corrupt JPEG");
2847 }
2848 }
2849 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2850 }
2851 } else {
2852 if (!stbi__process_marker(j, m)) return 0;
2853 }
2854 m = stbi__get_marker(j);
2855 }
2856 if (j->progressive)
2857 stbi__jpeg_finish(j);
2858 return 1;
2859 }
2860
2861 // static jfif-centered resampling (across block boundaries)
2862
2863 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2864 int w, int hs);
2865
2866 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2867
2868 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2869 {
2870 STBI_NOTUSED(out);
2871 STBI_NOTUSED(in_far);
2872 STBI_NOTUSED(w);
2873 STBI_NOTUSED(hs);
2874 return in_near;
2875 }
2876
2877 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2878 {
2879 // need to generate two samples vertically for every one in input
2880 int i;
2881 STBI_NOTUSED(hs);
2882 for (i=0; i < w; ++i)
2883 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2884 return out;
2885 }
2886
2887 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2888 {
2889 // need to generate two samples horizontally for every one in input
2890 int i;
2891 stbi_uc *input = in_near;
2892
2893 if (w == 1) {
2894 // if only one sample, can't do any interpolation
2895 out[0] = out[1] = input[0];
2896 return out;
2897 }
2898
2899 out[0] = input[0];
2900 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2901 for (i=1; i < w-1; ++i) {
2902 int n = 3*input[i]+2;
2903 out[i*2+0] = stbi__div4(n+input[i-1]);
2904 out[i*2+1] = stbi__div4(n+input[i+1]);
2905 }
2906 out[i*2+0] = stbi__div4(input[w-1]*3 + input[w-2] + 2); // left half of the last sample leans on sample w-1, mirroring out[1]
2907 out[i*2+1] = input[w-1];
2908
2909 STBI_NOTUSED(in_far);
2910 STBI_NOTUSED(hs);
2911
2912 return out;
2913 }
2914
2915 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2916
2917 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2918 {
2919 // need to generate 2x2 samples for every one in input
2920 int i,t0,t1;
2921 if (w == 1) {
2922 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2923 return out;
2924 }
2925
2926 t1 = 3*in_near[0] + in_far[0];
2927 out[0] = stbi__div4(t1+2);
2928 for (i=1; i < w; ++i) {
2929 t0 = t1;
2930 t1 = 3*in_near[i]+in_far[i];
2931 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2932 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2933 }
2934 out[w*2-1] = stbi__div4(t1+2);
2935
2936 STBI_NOTUSED(hs);
2937
2938 return out;
2939 }
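#if 0
// Illustrative sketch only, not part of stb_image. The 2x2 upsampler above is
// a separable filter: a vertical 3:1 blend of the nearer and farther rows,
// then a horizontal 3:1 blend of neighboring columns, with rounding applied
// once at the end. The stand-alone copy below (hypothetical name
// sketch_upsample_hv2) is meant to make the arithmetic easy to trace by hand.
```c
#include <assert.h>

// Hypothetical stand-alone copy of the scalar 2x2 upsampling logic: writes
// 2*w samples to out from w samples of the nearer and farther source rows.
static void sketch_upsample_hv2(unsigned char *out, const unsigned char *in_near,
                                const unsigned char *in_far, int w)
{
   int i, t0, t1;
   assert(w >= 1);
   // vertical pass: 3/4 near row + 1/4 far row, kept scaled by 4
   t1 = 3*in_near[0] + in_far[0];
   if (w == 1) {
      out[0] = out[1] = (unsigned char) ((t1 + 2) >> 2);
      return;
   }
   out[0] = (unsigned char) ((t1 + 2) >> 2);        // left edge: no neighbor
   for (i = 1; i < w; ++i) {
      t0 = t1;
      t1 = 3*in_near[i] + in_far[i];
      // horizontal pass: 3/4 own column + 1/4 neighbor, total scale 16
      out[i*2-1] = (unsigned char) ((3*t0 + t1 + 8) >> 4);
      out[i*2  ] = (unsigned char) ((3*t1 + t0 + 8) >> 4);
   }
   out[w*2-1] = (unsigned char) ((t1 + 2) >> 2);    // right edge
}
```
// With both source rows equal to {10, 20}, the four outputs ramp from 10 to 20
// instead of repeating each sample twice.
#endif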
2940
2941 #if defined(STBI_SSE2) || defined(STBI_NEON)
2942 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2943 {
2944 // need to generate 2x2 samples for every one in input
2945 int i=0,t0,t1;
2946
2947 if (w == 1) {
2948 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2949 return out;
2950 }
2951
2952 t1 = 3*in_near[0] + in_far[0];
2953 // process groups of 8 pixels for as long as we can.
2954 // note we can't handle the last pixel in a row in this loop
2955 // because we need to handle the filter boundary conditions.
2956 for (; i < ((w-1) & ~7); i += 8) {
2957 #if defined(STBI_SSE2)
2958 // load and perform the vertical filtering pass
2959 // this uses 3*x + y = 4*x + (y - x)
2960 __m128i zero = _mm_setzero_si128();
2961 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2962 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2963 __m128i farw = _mm_unpacklo_epi8(farb, zero);
2964 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
2965 __m128i diff = _mm_sub_epi16(farw, nearw);
2966 __m128i nears = _mm_slli_epi16(nearw, 2);
2967 __m128i curr = _mm_add_epi16(nears, diff); // current row
2968
2969 // horizontal filter works the same way, based on shifted versions of the current
2970 // row. "prev" is current row shifted right by 1 pixel; we need to
2971 // insert the previous pixel value (from t1).
2972 // "next" is current row shifted left by 1 pixel, with first pixel
2973 // of next block of 8 pixels added in.
2974 __m128i prv0 = _mm_slli_si128(curr, 2);
2975 __m128i nxt0 = _mm_srli_si128(curr, 2);
2976 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
2977 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
2978
2979 // horizontal filter, polyphase implementation since it's convenient:
2980 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
2981 // odd pixels = 3*cur + next = cur*4 + (next - cur)
2982 // note the shared term.
2983 __m128i bias = _mm_set1_epi16(8);
2984 __m128i curs = _mm_slli_epi16(curr, 2);
2985 __m128i prvd = _mm_sub_epi16(prev, curr);
2986 __m128i nxtd = _mm_sub_epi16(next, curr);
2987 __m128i curb = _mm_add_epi16(curs, bias);
2988 __m128i even = _mm_add_epi16(prvd, curb);
2989 __m128i odd = _mm_add_epi16(nxtd, curb);
2990
2991 // interleave even and odd pixels, then undo scaling.
2992 __m128i int0 = _mm_unpacklo_epi16(even, odd);
2993 __m128i int1 = _mm_unpackhi_epi16(even, odd);
2994 __m128i de0 = _mm_srli_epi16(int0, 4);
2995 __m128i de1 = _mm_srli_epi16(int1, 4);
2996
2997 // pack and write output
2998 __m128i outv = _mm_packus_epi16(de0, de1);
2999 _mm_storeu_si128((__m128i *) (out + i*2), outv);
3000 #elif defined(STBI_NEON)
3001 // load and perform the vertical filtering pass
3002 // this uses 3*x + y = 4*x + (y - x)
3003 uint8x8_t farb = vld1_u8(in_far + i);
3004 uint8x8_t nearb = vld1_u8(in_near + i);
3005 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3006 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3007 int16x8_t curr = vaddq_s16(nears, diff); // current row
3008
3009 // horizontal filter works the same way, based on shifted versions of the current
3010 // row. "prev" is current row shifted right by 1 pixel; we need to
3011 // insert the previous pixel value (from t1).
3012 // "next" is current row shifted left by 1 pixel, with first pixel
3013 // of next block of 8 pixels added in.
3014 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3015 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3016 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3017 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3018
3019 // horizontal filter, polyphase implementation since it's convenient:
3020 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3021 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3022 // note the shared term.
3023 int16x8_t curs = vshlq_n_s16(curr, 2);
3024 int16x8_t prvd = vsubq_s16(prev, curr);
3025 int16x8_t nxtd = vsubq_s16(next, curr);
3026 int16x8_t even = vaddq_s16(curs, prvd);
3027 int16x8_t odd = vaddq_s16(curs, nxtd);
3028
3029 // undo scaling and round, then store with even/odd phases interleaved
3030 uint8x8x2_t o;
3031 o.val[0] = vqrshrun_n_s16(even, 4);
3032 o.val[1] = vqrshrun_n_s16(odd, 4);
3033 vst2_u8(out + i*2, o);
3034 #endif
3035
3036 // "previous" value for next iter
3037 t1 = 3*in_near[i+7] + in_far[i+7];
3038 }
3039
3040 t0 = t1;
3041 t1 = 3*in_near[i] + in_far[i];
3042 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3043
3044 for (++i; i < w; ++i) {
3045 t0 = t1;
3046 t1 = 3*in_near[i]+in_far[i];
3047 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3048 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3049 }
3050 out[w*2-1] = stbi__div4(t1+2);
3051
3052 STBI_NOTUSED(hs);
3053
3054 return out;
3055 }
3056 #endif
3057
3058 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3059 {
3060 // resample with nearest-neighbor
3061 int i,j;
3062 STBI_NOTUSED(in_far);
3063 for (i=0; i < w; ++i)
3064 for (j=0; j < hs; ++j)
3065 out[i*hs+j] = in_near[i];
3066 return out;
3067 }
3068
3069 #ifdef STBI_JPEG_OLD
3070 // this is the same YCbCr-to-RGB calculation that stb_image has used
3071 // historically before the algorithm changes in 1.49
3072 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
3073 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3074 {
3075 int i;
3076 for (i=0; i < count; ++i) {
3077 int y_fixed = (y[i] << 16) + 32768; // rounding
3078 int r,g,b;
3079 int cr = pcr[i] - 128;
3080 int cb = pcb[i] - 128;
3081 r = y_fixed + cr*float2fixed(1.40200f);
3082 g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3083 b = y_fixed + cb*float2fixed(1.77200f);
3084 r >>= 16;
3085 g >>= 16;
3086 b >>= 16;
3087 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3088 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3089 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3090 out[0] = (stbi_uc)r;
3091 out[1] = (stbi_uc)g;
3092 out[2] = (stbi_uc)b;
3093 out[3] = 255;
3094 out += step;
3095 }
3096 }
3097 #else
3098 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3099 // to make sure the code produces the same results in both SIMD and scalar
3100 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3101 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3102 {
3103 int i;
3104 for (i=0; i < count; ++i) {
3105 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3106 int r,g,b;
3107 int cr = pcr[i] - 128;
3108 int cb = pcb[i] - 128;
3109 r = y_fixed + cr* float2fixed(1.40200f);
3110 g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3111 b = y_fixed + cb* float2fixed(1.77200f);
3112 r >>= 20;
3113 g >>= 20;
3114 b >>= 20;
3115 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3116 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3117 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3118 out[0] = (stbi_uc)r;
3119 out[1] = (stbi_uc)g;
3120 out[2] = (stbi_uc)b;
3121 out[3] = 255;
3122 out += step;
3123 }
3124 }
3125 #endif
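#if 0
// Illustrative sketch only, not part of stb_image. The non-OLD scalar path
// above keeps luma with 20 fractional bits and builds each coefficient from a
// 12-bit constant shifted left by 8, so that the scalar and SIMD paths can
// agree. The single-pixel helper below repeats that arithmetic so one value
// can be traced by hand. SKETCH_FIX, sketch_clamp and sketch_ycbcr_to_rgb are
// hypothetical names. Like the code above, it assumes a 32-bit int and an
// arithmetic right shift for negative values.
```c
#include <assert.h>

// Same shape as float2fixed above: 12 fractional bits, pre-shifted left by 8.
#define SKETCH_FIX(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

static unsigned char sketch_clamp(int v)
{
   return (unsigned char) (v < 0 ? 0 : v > 255 ? 255 : v);
}

static void sketch_ycbcr_to_rgb(unsigned char rgb[3], int y, int cb, int cr)
{
   int y_fixed, r, g, b;
   assert(y >= 0 && y <= 255 && cb >= 0 && cb <= 255 && cr >= 0 && cr <= 255);
   y_fixed = (y << 20) + (1 << 19);   // 20 fractional bits, plus 0.5 to round
   cr -= 128;                         // chroma is stored with a +128 bias
   cb -= 128;
   r = y_fixed + cr * SKETCH_FIX(1.40200f);
   // the mask drops the low bits of the Cb term, mirroring the code above
   g = y_fixed + cr * -SKETCH_FIX(0.71414f) + ((cb * -SKETCH_FIX(0.34414f)) & 0xffff0000);
   b = y_fixed + cb * SKETCH_FIX(1.77200f);
   rgb[0] = sketch_clamp(r >> 20);
   rgb[1] = sketch_clamp(g >> 20);
   rgb[2] = sketch_clamp(b >> 20);
}
```
// Neutral chroma (Cb = Cr = 128) leaves the luma value unchanged in all three
// channels, which makes a quick sanity check for any port of this conversion.
#endif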
3126
3127 #if defined(STBI_SSE2) || defined(STBI_NEON)
3128 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3129 {
3130 int i = 0;
3131
3132 #ifdef STBI_SSE2
3133 // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3134 // it's useful in practice (you wouldn't use it for textures, for example).
3135 // so just accelerate step == 4 case.
3136 if (step == 4) {
3137 // this is a fairly straightforward implementation and not super-optimized.
3138 __m128i signflip = _mm_set1_epi8(-0x80);
3139 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3140 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3141 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3142 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3143 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3144 __m128i xw = _mm_set1_epi16(255); // alpha channel
3145
3146 for (; i+7 < count; i += 8) {
3147 // load
3148 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3149 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3150 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3151 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3152 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3153
3154 // unpack to short (and left-shift cr, cb by 8)
3155 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3156 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3157 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3158
3159 // color transform
3160 __m128i yws = _mm_srli_epi16(yw, 4);
3161 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3162 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3163 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3164 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3165 __m128i rws = _mm_add_epi16(cr0, yws);
3166 __m128i gwt = _mm_add_epi16(cb0, yws);
3167 __m128i bws = _mm_add_epi16(yws, cb1);
3168 __m128i gws = _mm_add_epi16(gwt, cr1);
3169
3170 // descale
3171 __m128i rw = _mm_srai_epi16(rws, 4);
3172 __m128i bw = _mm_srai_epi16(bws, 4);
3173 __m128i gw = _mm_srai_epi16(gws, 4);
3174
3175 // back to byte, set up for transpose
3176 __m128i brb = _mm_packus_epi16(rw, bw);
3177 __m128i gxb = _mm_packus_epi16(gw, xw);
3178
3179 // transpose to interleave channels
3180 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3181 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3182 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3183 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3184
3185 // store
3186 _mm_storeu_si128((__m128i *) (out + 0), o0);
3187 _mm_storeu_si128((__m128i *) (out + 16), o1);
3188 out += 32;
3189 }
3190 }
3191 #endif
3192
3193 #ifdef STBI_NEON
3194 // in this version, step=3 support would be easy to add. but is there demand?
3195 if (step == 4) {
3196 // this is a fairly straightforward implementation and not super-optimized.
3197 uint8x8_t signflip = vdup_n_u8(0x80);
3198 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3199 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3200 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3201 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3202
3203 for (; i+7 < count; i += 8) {
3204 // load
3205 uint8x8_t y_bytes = vld1_u8(y + i);
3206 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3207 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3208 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3209 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3210
3211 // expand to s16
3212 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3213 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3214 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3215
3216 // color transform
3217 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3218 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3219 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3220 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3221 int16x8_t rws = vaddq_s16(yws, cr0);
3222 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3223 int16x8_t bws = vaddq_s16(yws, cb1);
3224
3225 // undo scaling, round, convert to byte
3226 uint8x8x4_t o;
3227 o.val[0] = vqrshrun_n_s16(rws, 4);
3228 o.val[1] = vqrshrun_n_s16(gws, 4);
3229 o.val[2] = vqrshrun_n_s16(bws, 4);
3230 o.val[3] = vdup_n_u8(255);
3231
3232 // store, interleaving r/g/b/a
3233 vst4_u8(out, o);
3234 out += 8*4;
3235 }
3236 }
3237 #endif
3238
3239 for (; i < count; ++i) {
3240 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3241 int r,g,b;
3242 int cr = pcr[i] - 128;
3243 int cb = pcb[i] - 128;
3244 r = y_fixed + cr* float2fixed(1.40200f);
3245 g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3246 b = y_fixed + cb* float2fixed(1.77200f);
3247 r >>= 20;
3248 g >>= 20;
3249 b >>= 20;
3250 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3251 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3252 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3253 out[0] = (stbi_uc)r;
3254 out[1] = (stbi_uc)g;
3255 out[2] = (stbi_uc)b;
3256 out[3] = 255;
3257 out += step;
3258 }
3259 }
3260 #endif
3261
3262 // set up the kernels
3263 static void stbi__setup_jpeg(stbi__jpeg *j)
3264 {
3265 j->idct_block_kernel = stbi__idct_block;
3266 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3267 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3268
3269 #ifdef STBI_SSE2
3270 if (stbi__sse2_available()) {
3271 j->idct_block_kernel = stbi__idct_simd;
3272 #ifndef STBI_JPEG_OLD
3273 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3274 #endif
3275 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3276 }
3277 #endif
3278
3279 #ifdef STBI_NEON
3280 j->idct_block_kernel = stbi__idct_simd;
3281 #ifndef STBI_JPEG_OLD
3282 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3283 #endif
3284 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3285 #endif
3286 }
3287
3288 // clean up the temporary component buffers
3289 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3290 {
3291 int i;
3292 for (i=0; i < j->s->img_n; ++i) {
3293 if (j->img_comp[i].raw_data) {
3294 STBI_FREE(j->img_comp[i].raw_data);
3295 j->img_comp[i].raw_data = NULL;
3296 j->img_comp[i].data = NULL;
3297 }
3298 if (j->img_comp[i].raw_coeff) {
3299 STBI_FREE(j->img_comp[i].raw_coeff);
3300 j->img_comp[i].raw_coeff = 0;
3301 j->img_comp[i].coeff = 0;
3302 }
3303 if (j->img_comp[i].linebuf) {
3304 STBI_FREE(j->img_comp[i].linebuf);
3305 j->img_comp[i].linebuf = NULL;
3306 }
3307 }
3308 }
3309
3310 typedef struct
3311 {
3312 resample_row_func resample;
3313 stbi_uc *line0,*line1;
3314 int hs,vs; // expansion factor in each axis
3315 int w_lores; // horizontal pixels pre-expansion
3316 int ystep; // how far through vertical expansion we are
3317 int ypos; // which pre-expansion row we're on
3318 } stbi__resample;
3319
3320 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3321 {
3322 int n, decode_n;
3323 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3324
3325 // validate req_comp
3326 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3327
3328 // load a jpeg image from whichever source, but leave in YCbCr format
3329 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3330
3331 // determine actual number of components to generate
3332 n = req_comp ? req_comp : z->s->img_n;
3333
3334 if (z->s->img_n == 3 && n < 3)
3335 decode_n = 1;
3336 else
3337 decode_n = z->s->img_n;
3338
3339 // resample and color-convert
3340 {
3341 int k;
3342 unsigned int i,j;
3343 stbi_uc *output;
3344 stbi_uc *coutput[4];
3345
3346 stbi__resample res_comp[4];
3347
3348 for (k=0; k < decode_n; ++k) {
3349 stbi__resample *r = &res_comp[k];
3350
3351 // allocate line buffer big enough for upsampling off the edges
3352 // with upsample factor of 4
3353 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3354 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3355
3356 r->hs = z->img_h_max / z->img_comp[k].h;
3357 r->vs = z->img_v_max / z->img_comp[k].v;
3358 r->ystep = r->vs >> 1;
3359 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3360 r->ypos = 0;
3361 r->line0 = r->line1 = z->img_comp[k].data;
3362
3363 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3364 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3365 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3366 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3367 else r->resample = stbi__resample_row_generic;
3368 }
3369
3370 // can't error after this, so this is safe
3371 output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3372 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3373
3374 // now go ahead and resample
3375 for (j=0; j < z->s->img_y; ++j) {
3376 stbi_uc *out = output + n * z->s->img_x * j;
3377 for (k=0; k < decode_n; ++k) {
3378 stbi__resample *r = &res_comp[k];
3379 int y_bot = r->ystep >= (r->vs >> 1);
3380 coutput[k] = r->resample(z->img_comp[k].linebuf,
3381 y_bot ? r->line1 : r->line0,
3382 y_bot ? r->line0 : r->line1,
3383 r->w_lores, r->hs);
3384 if (++r->ystep >= r->vs) {
3385 r->ystep = 0;
3386 r->line0 = r->line1;
3387 if (++r->ypos < z->img_comp[k].y)
3388 r->line1 += z->img_comp[k].w2;
3389 }
3390 }
3391 if (n >= 3) {
3392 stbi_uc *y = coutput[0];
3393 if (z->s->img_n == 3) {
3394 if (z->rgb == 3) {
3395 for (i=0; i < z->s->img_x; ++i) {
3396 out[0] = y[i];
3397 out[1] = coutput[1][i];
3398 out[2] = coutput[2][i];
3399 out[3] = 255;
3400 out += n;
3401 }
3402 } else {
3403 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3404 }
3405 } else
3406 for (i=0; i < z->s->img_x; ++i) {
3407 out[0] = out[1] = out[2] = y[i];
3408 out[3] = 255; // not used if n==3
3409 out += n;
3410 }
3411 } else {
3412 stbi_uc *y = coutput[0];
3413 if (n == 1)
3414 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3415 else
3416 for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3417 }
3418 }
3419 stbi__cleanup_jpeg(z);
3420 *out_x = z->s->img_x;
3421 *out_y = z->s->img_y;
3422 if (comp) *comp = z->s->img_n; // report original components, not output
3423 return output;
3424 }
3425 }
3426
3427 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3428 {
3429 unsigned char* result;
3430 stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg)); if (!j) return stbi__errpuc("outofmem", "Out of memory");
3431 j->s = s;
3432 stbi__setup_jpeg(j);
3433 result = load_jpeg_image(j, x,y,comp,req_comp);
3434 STBI_FREE(j);
3435 return result;
3436 }
3437
3438 static int stbi__jpeg_test(stbi__context *s)
3439 {
3440 int r;
3441 stbi__jpeg j;
3442 j.s = s;
3443 stbi__setup_jpeg(&j);
3444 r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3445 stbi__rewind(s);
3446 return r;
3447 }
3448
3449 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3450 {
3451 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3452 stbi__rewind( j->s );
3453 return 0;
3454 }
3455 if (x) *x = j->s->img_x;
3456 if (y) *y = j->s->img_y;
3457 if (comp) *comp = j->s->img_n;
3458 return 1;
3459 }
3460
3461 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3462 {
3463 int result;
3464 stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg))); if (!j) return stbi__err("outofmem", "Out of memory");
3465 j->s = s;
3466 result = stbi__jpeg_info_raw(j, x, y, comp);
3467 STBI_FREE(j);
3468 return result;
3469 }
3470 #endif
3471
3472 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3473 // simple implementation
3474 // - all input must be provided in an upfront buffer
3475 // - all output is written to a single output buffer (can malloc/realloc)
3476 // performance
3477 // - fast huffman
3478
3479 #ifndef STBI_NO_ZLIB
3480
3481 // fast-way is faster to check than jpeg huffman, but slow way is slower
3482 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3483 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3484
3485 // zlib-style huffman encoding
3486 // (jpeg packs from left, zlib from right, so can't share code)
3487 typedef struct
3488 {
3489 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3490 stbi__uint16 firstcode[16];
3491 int maxcode[17];
3492 stbi__uint16 firstsymbol[16];
3493 stbi_uc size[288];
3494 stbi__uint16 value[288];
3495 } stbi__zhuffman;
3496
3497 stbi_inline static int stbi__bitreverse16(int n)
3498 {
3499 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3500 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3501 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3502 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3503 return n;
3504 }
3505
stbi_inline static int stbi__bit_reverse(int v, int bits)
3507 {
3508 STBI_ASSERT(bits <= 16);
3509 // to bit reverse n bits, reverse 16 and shift
3510 // e.g. 11 bits, bit reverse and shift away 5
3511 return stbi__bitreverse16(v) >> (16-bits);
3512 }
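// A minimal standalone sketch of the "reverse 16 bits, then shift" trick used
// above. The name example_bit_reverse is hypothetical and not part of the
// library; it uses unsigned arithmetic and ignores any bits of v above `bits`.

```c
#include <assert.h>

// Hypothetical helper: reverse the low `bits` bits of v (1 <= bits <= 16).
static unsigned int example_bit_reverse(unsigned int v, int bits)
{
   unsigned int n = v & 0xFFFFu;
   assert(bits >= 1 && bits <= 16);
   n = ((n & 0xAAAAu) >> 1) | ((n & 0x5555u) << 1);   // swap adjacent bits
   n = ((n & 0xCCCCu) >> 2) | ((n & 0x3333u) << 2);   // swap bit pairs
   n = ((n & 0xF0F0u) >> 4) | ((n & 0x0F0Fu) << 4);   // swap nibbles
   n = ((n & 0xFF00u) >> 8) | ((n & 0x00FFu) << 8);   // swap bytes
   return n >> (16 - bits);                           // keep only the top `bits` bits
}
```

// For example, the 3-bit value 001 reverses to 100, and an 11-bit value is
// reversed by discarding the 5 low bits of the 16-bit reversal.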
3513
static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3515 {
3516 int i,k=0;
3517 int code, next_code[16], sizes[17];
3518
3519 // DEFLATE spec for generating codes
3520 memset(sizes, 0, sizeof(sizes));
3521 memset(z->fast, 0, sizeof(z->fast));
3522 for (i=0; i < num; ++i)
3523 ++sizes[sizelist[i]];
3524 sizes[0] = 0;
3525 for (i=1; i < 16; ++i)
3526 if (sizes[i] > (1 << i))
3527 return stbi__err("bad sizes", "Corrupt PNG");
3528 code = 0;
3529 for (i=1; i < 16; ++i) {
3530 next_code[i] = code;
3531 z->firstcode[i] = (stbi__uint16) code;
3532 z->firstsymbol[i] = (stbi__uint16) k;
3533 code = (code + sizes[i]);
3534 if (sizes[i])
3535 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3536 z->maxcode[i] = code << (16-i); // preshift for inner loop
3537 code <<= 1;
3538 k += sizes[i];
3539 }
3540 z->maxcode[16] = 0x10000; // sentinel
3541 for (i=0; i < num; ++i) {
3542 int s = sizelist[i];
3543 if (s) {
3544 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3545 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3546 z->size [c] = (stbi_uc ) s;
3547 z->value[c] = (stbi__uint16) i;
3548 if (s <= STBI__ZFAST_BITS) {
3549 int j = stbi__bit_reverse(next_code[s],s);
3550 while (j < (1 << STBI__ZFAST_BITS)) {
3551 z->fast[j] = fastv;
3552 j += (1 << s);
3553 }
3554 }
3555 ++next_code[s];
3556 }
3557 }
3558 return 1;
3559 }
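// The code-assignment step above follows the canonical Huffman construction in
// the DEFLATE spec (RFC 1951, section 3.2.2). The sketch below shows only that
// step, without the fast table; example_assign_codes and EXAMPLE_MAX_BITS are
// hypothetical names, and codes are returned MSB-first (not bit-reversed).

```c
#include <assert.h>
#include <string.h>

#define EXAMPLE_MAX_BITS 15

// Assign canonical codes from code lengths. Returns 1 on success, 0 if a
// length is out of range or the lengths oversubscribe the code space.
static int example_assign_codes(const unsigned char *lengths, int num, unsigned short *codes)
{
   int count[EXAMPLE_MAX_BITS + 1];
   int next_code[EXAMPLE_MAX_BITS + 1];
   int code = 0, bits, i;

   assert(num >= 0);
   memset(count, 0, sizeof(count));
   for (i = 0; i < num; ++i) {
      if (lengths[i] > EXAMPLE_MAX_BITS) return 0;
      ++count[lengths[i]];
   }
   count[0] = 0;   // length 0 means "symbol unused"

   // first code of each length = (first code of previous length + its count) << 1
   for (bits = 1; bits <= EXAMPLE_MAX_BITS; ++bits) {
      code = (code + count[bits - 1]) << 1;
      if (count[bits] && code + count[bits] - 1 >= (1 << bits)) return 0;
      next_code[bits] = code;
   }

   // hand out consecutive codes to symbols of the same length, in symbol order
   for (i = 0; i < num; ++i) {
      int len = lengths[i];
      codes[i] = 0;
      if (len) codes[i] = (unsigned short) next_code[len]++;
   }
   return 1;
}
```

// With the RFC's example lengths (3,3,3,3,3,2,4,4) this yields the codes
// 010,011,100,101,110,00,1110,1111.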
3560
3561 // zlib-from-memory implementation for PNG reading
3562 // because PNG allows splitting the zlib stream arbitrarily,
3563 // and it's annoying structurally to have PNG call ZLIB call PNG,
3564 // we require PNG read all the IDATs and combine them into a single
3565 // memory buffer
3566
3567 typedef struct
3568 {
3569 stbi_uc *zbuffer, *zbuffer_end;
3570 int num_bits;
3571 stbi__uint32 code_buffer;
3572
3573 char *zout;
3574 char *zout_start;
3575 char *zout_end;
3576 int z_expandable;
3577
3578 stbi__zhuffman z_length, z_distance;
3579 } stbi__zbuf;
3580
stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3582 {
3583 if (z->zbuffer >= z->zbuffer_end) return 0;
3584 return *z->zbuffer++;
3585 }
3586
static void stbi__fill_bits(stbi__zbuf *z)
3588 {
3589 do {
3590 STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3591 z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
3592 z->num_bits += 8;
3593 } while (z->num_bits <= 24);
3594 }
3595
stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3597 {
3598 unsigned int k;
3599 if (z->num_bits < n) stbi__fill_bits(z);
3600 k = z->code_buffer & ((1 << n) - 1);
3601 z->code_buffer >>= n;
3602 z->num_bits -= n;
3603 return k;
3604 }
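// A minimal sketch of an LSB-first bit reader in the same spirit as
// stbi__fill_bits/stbi__zreceive above. The example_bitreader type and its
// functions are hypothetical; like stbi__zget8, reads past the end of the
// input are assumed to supply zero bytes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
   const unsigned char *p, *end;   // next unread byte and one-past-the-end
   unsigned int bitbuf;            // unread bits, least significant bit first
   int nbits;                      // number of valid bits in bitbuf
} example_bitreader;

static void example_bits_init(example_bitreader *r, const unsigned char *data, size_t len)
{
   r->p = data;
   r->end = data + len;
   r->bitbuf = 0;
   r->nbits = 0;
}

// Read n bits (0 <= n <= 16); the first bit read becomes the lowest bit of the result.
static unsigned int example_read_bits(example_bitreader *r, int n)
{
   unsigned int v;
   assert(n >= 0 && n <= 16);
   while (r->nbits < n) {
      unsigned int byte = (r->p < r->end) ? *r->p++ : 0u;
      r->bitbuf |= byte << r->nbits;   // new bytes go above the bits already buffered
      r->nbits += 8;
   }
   v = r->bitbuf & ((1u << n) - 1u);
   r->bitbuf >>= n;
   r->nbits -= n;
   return v;
}
```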
3605
static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3607 {
3608 int b,s,k;
3609 // not resolved by fast table, so compute it the slow way
3610 // use jpeg approach, which requires MSbits at top
3611 k = stbi__bit_reverse(a->code_buffer, 16);
3612 for (s=STBI__ZFAST_BITS+1; ; ++s)
3613 if (k < z->maxcode[s])
3614 break;
3615 if (s == 16) return -1; // invalid code!
3616 // code size is s, so:
3617 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3618 STBI_ASSERT(z->size[b] == s);
3619 a->code_buffer >>= s;
3620 a->num_bits -= s;
3621 return z->value[b];
3622 }
3623
stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3625 {
3626 int b,s;
3627 if (a->num_bits < 16) stbi__fill_bits(a);
3628 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3629 if (b) {
3630 s = b >> 9;
3631 a->code_buffer >>= s;
3632 a->num_bits -= s;
3633 return b & 511;
3634 }
3635 return stbi__zhuffman_decode_slowpath(a, z);
3636 }
3637
static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
3639 {
3640 char *q;
3641 int cur, limit, old_limit;
3642 z->zout = zout;
3643 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3644 cur = (int) (z->zout - z->zout_start);
3645 limit = old_limit = (int) (z->zout_end - z->zout_start);
3646 while (cur + n > limit)
3647 limit *= 2;
3648 q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
3649 STBI_NOTUSED(old_limit);
3650 if (q == NULL) return stbi__err("outofmem", "Out of memory");
3651 z->zout_start = q;
3652 z->zout = q + cur;
3653 z->zout_end = q + limit;
3654 return 1;
3655 }
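// The growth policy above (double the limit until the request fits, then
// realloc) can be sketched on its own as follows. example_grow is a
// hypothetical helper; the size_t overflow checks are an addition for the
// sketch and are not present in stbi__zexpand as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Ensure *buf can hold `used + need` bytes. Returns 1 on success (updating
// *buf and *cap), 0 on arithmetic overflow or allocation failure.
static int example_grow(char **buf, size_t *cap, size_t used, size_t need)
{
   size_t limit = *cap;
   char *q;
   assert(used <= *cap);
   if (need > (size_t) -1 - used) return 0;    // used + need would overflow
   if (limit == 0) limit = 1;
   while (used + need > limit) {
      if (limit > (size_t) -1 / 2) return 0;   // doubling would overflow
      limit *= 2;
   }
   q = (char *) realloc(*buf, limit);
   if (q == NULL) return 0;                    // the original buffer is still valid
   *buf = q;
   *cap = limit;
   return 1;
}
```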
3656
3657 static int stbi__zlength_base[31] = {
3658 3,4,5,6,7,8,9,10,11,13,
3659 15,17,19,23,27,31,35,43,51,59,
3660 67,83,99,115,131,163,195,227,258,0,0 };
3661
3662 static int stbi__zlength_extra[31]=
3663 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3664
3665 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3666 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3667
3668 static int stbi__zdist_extra[32] =
3669 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3670
static int stbi__parse_huffman_block(stbi__zbuf *a)
3672 {
3673 char *zout = a->zout;
3674 for(;;) {
3675 int z = stbi__zhuffman_decode(a, &a->z_length);
3676 if (z < 256) {
3677 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3678 if (zout >= a->zout_end) {
3679 if (!stbi__zexpand(a, zout, 1)) return 0;
3680 zout = a->zout;
3681 }
3682 *zout++ = (char) z;
3683 } else {
3684 stbi_uc *p;
3685 int len,dist;
3686 if (z == 256) {
3687 a->zout = zout;
3688 return 1;
3689 }
3690 z -= 257;
3691 len = stbi__zlength_base[z];
3692 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3693 z = stbi__zhuffman_decode(a, &a->z_distance);
3694 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3695 dist = stbi__zdist_base[z];
3696 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3697 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3698 if (zout + len > a->zout_end) {
3699 if (!stbi__zexpand(a, zout, len)) return 0;
3700 zout = a->zout;
3701 }
3702 p = (stbi_uc *) (zout - dist);
3703 if (dist == 1) { // run of one byte; common in images.
3704 stbi_uc v = *p;
3705 if (len) { do *zout++ = v; while (--len); }
3706 } else {
3707 if (len) { do *zout++ = *p++; while (--len); }
3708 }
3709 }
3710 }
3711 }
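// The back-reference copy at the end of the loop above must run byte by byte,
// because source and destination may overlap (dist < len repeats a pattern).
// A minimal sketch, assuming a fixed-capacity output buffer; the name
// example_lz77_copy and its error convention are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Copy `len` bytes from `dist` bytes behind out[pos] to out[pos].
// Returns the new write position, or (size_t)-1 if the reference is invalid
// or the copy would not fit in `cap` bytes.
static size_t example_lz77_copy(unsigned char *out, size_t cap, size_t pos, size_t dist, size_t len)
{
   size_t i;
   assert(pos <= cap);
   if (dist == 0 || dist > pos) return (size_t) -1;   // points before the start of the output
   if (len > cap - pos) return (size_t) -1;           // would overrun the buffer
   for (i = 0; i < len; ++i)
      out[pos + i] = out[pos + i - dist];             // forward copy, so overlap repeats the pattern
   return pos + len;
}
```

// memcpy would be wrong here for overlapping ranges; memmove would also not
// reproduce the repeating behavior DEFLATE relies on.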
3712
static int stbi__compute_huffman_codes(stbi__zbuf *a)
{
   static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   stbi__zhuffman z_codelength;
   stbi_uc lencodes[286+32+137];//padding for maximum single op
   stbi_uc codelength_sizes[19];
   int i,n;

   int hlit  = stbi__zreceive(a,5) + 257;
   int hdist = stbi__zreceive(a,5) + 1;
   int hclen = stbi__zreceive(a,4) + 4;

   memset(codelength_sizes, 0, sizeof(codelength_sizes));
   for (i=0; i < hclen; ++i) {
      int s = stbi__zreceive(a,3);
      codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
   }
   if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;

   n = 0;
   while (n < hlit + hdist) {
      int c = stbi__zhuffman_decode(a, &z_codelength);
      if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
      if (c < 16)
         lencodes[n++] = (stbi_uc) c;
      else if (c == 16) {
         // "repeat previous length" needs a previous entry; otherwise lencodes[n-1] reads out of bounds
         if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
         c = stbi__zreceive(a,2)+3;
         memset(lencodes+n, lencodes[n-1], c);
         n += c;
      } else if (c == 17) {
         c = stbi__zreceive(a,3)+3;
         memset(lencodes+n, 0, c);
         n += c;
      } else {
         STBI_ASSERT(c == 18);
         c = stbi__zreceive(a,7)+11;
         memset(lencodes+n, 0, c);
         n += c;
      }
   }
   if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
   if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   return 1;
}
3758
static int stbi__parse_uncompressed_block(stbi__zbuf *a)
3760 {
3761 stbi_uc header[4];
3762 int len,nlen,k;
3763 if (a->num_bits & 7)
3764 stbi__zreceive(a, a->num_bits & 7); // discard
3765 // drain the bit-packed data into header
3766 k = 0;
3767 while (a->num_bits > 0) {
3768 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3769 a->code_buffer >>= 8;
3770 a->num_bits -= 8;
3771 }
3772 STBI_ASSERT(a->num_bits == 0);
3773 // now fill header the normal way
3774 while (k < 4)
3775 header[k++] = stbi__zget8(a);
3776 len = header[1] * 256 + header[0];
3777 nlen = header[3] * 256 + header[2];
3778 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3779 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3780 if (a->zout + len > a->zout_end)
3781 if (!stbi__zexpand(a, a->zout, len)) return 0;
3782 memcpy(a->zout, a->zbuffer, len);
3783 a->zbuffer += len;
3784 a->zout += len;
3785 return 1;
3786 }
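// The LEN/NLEN validation above can be isolated as below: a stored block
// starts with a 16-bit little-endian length followed by its one's complement.
// example_stored_block_len is a hypothetical helper, and returning -1 for a
// corrupt header is a convention chosen for this sketch.

```c
#include <assert.h>
#include <stddef.h>

// Returns the stored-block length (0..65535), or -1 if NLEN != ~LEN.
static int example_stored_block_len(const unsigned char *hdr)
{
   unsigned int len, nlen;
   assert(hdr != NULL);
   len  = hdr[0] | ((unsigned int) hdr[1] << 8);   // little-endian LEN
   nlen = hdr[2] | ((unsigned int) hdr[3] << 8);   // little-endian NLEN
   if (nlen != (len ^ 0xFFFFu)) return -1;
   return (int) len;
}
```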
3787
static int stbi__parse_zlib_header(stbi__zbuf *a)
3789 {
3790 int cmf = stbi__zget8(a);
3791 int cm = cmf & 15;
3792 /* int cinfo = cmf >> 4; */
3793 int flg = stbi__zget8(a);
3794 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3795 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3796 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3797 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3798 return 1;
3799 }
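// The three header checks above, applied to the two zlib header bytes
// directly. example_check_zlib_header is a hypothetical name; as in the
// function above, a preset dictionary is rejected because PNG does not
// allow one.

```c
#include <assert.h>

// Returns 1 if (cmf, flg) is a zlib header this decoder would accept, else 0.
static int example_check_zlib_header(unsigned int cmf, unsigned int flg)
{
   assert(cmf <= 0xFFu && flg <= 0xFFu);
   if (((cmf << 8) | flg) % 31u != 0) return 0;   // FCHECK: 16-bit header must be a multiple of 31
   if (flg & 0x20u) return 0;                     // FDICT: preset dictionary not allowed
   if ((cmf & 0x0Fu) != 8u) return 0;             // CM: only DEFLATE (method 8)
   return 1;
}
```

// 0x78 0x9C and 0x78 0x01 are common valid headers; 0x78 0xBB passes FCHECK
// but has FDICT set.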
3800
3801 // @TODO: should statically initialize these for optimal thread safety
3802 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
static void stbi__init_zdefaults(void)
3804 {
3805 int i; // use <= to match clearly with spec
3806 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3807 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3808 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3809 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3810
3811 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3812 }
3813
static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3815 {
3816 int final, type;
3817 if (parse_header)
3818 if (!stbi__parse_zlib_header(a)) return 0;
3819 a->num_bits = 0;
3820 a->code_buffer = 0;
3821 do {
3822 final = stbi__zreceive(a,1);
3823 type = stbi__zreceive(a,2);
3824 if (type == 0) {
3825 if (!stbi__parse_uncompressed_block(a)) return 0;
3826 } else if (type == 3) {
3827 return 0;
3828 } else {
3829 if (type == 1) {
3830 // use fixed code lengths
3831 if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3832 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3833 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3834 } else {
3835 if (!stbi__compute_huffman_codes(a)) return 0;
3836 }
3837 if (!stbi__parse_huffman_block(a)) return 0;
3838 }
3839 } while (!final);
3840 return 1;
3841 }
3842
static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3844 {
3845 a->zout_start = obuf;
3846 a->zout = obuf;
3847 a->zout_end = obuf + olen;
3848 a->z_expandable = exp;
3849
3850 return stbi__parse_zlib(a, parse_header);
3851 }
3852
STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3854 {
3855 stbi__zbuf a;
3856 char *p = (char *) stbi__malloc(initial_size);
3857 if (p == NULL) return NULL;
3858 a.zbuffer = (stbi_uc *) buffer;
3859 a.zbuffer_end = (stbi_uc *) buffer + len;
3860 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3861 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3862 return a.zout_start;
3863 } else {
3864 STBI_FREE(a.zout_start);
3865 return NULL;
3866 }
3867 }
3868
STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3870 {
3871 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3872 }
3873
STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3875 {
3876 stbi__zbuf a;
3877 char *p = (char *) stbi__malloc(initial_size);
3878 if (p == NULL) return NULL;
3879 a.zbuffer = (stbi_uc *) buffer;
3880 a.zbuffer_end = (stbi_uc *) buffer + len;
3881 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3882 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3883 return a.zout_start;
3884 } else {
3885 STBI_FREE(a.zout_start);
3886 return NULL;
3887 }
3888 }
3889
STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3891 {
3892 stbi__zbuf a;
3893 a.zbuffer = (stbi_uc *) ibuffer;
3894 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3895 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3896 return (int) (a.zout - a.zout_start);
3897 else
3898 return -1;
3899 }
3900
STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3902 {
3903 stbi__zbuf a;
3904 char *p = (char *) stbi__malloc(16384);
3905 if (p == NULL) return NULL;
3906 a.zbuffer = (stbi_uc *) buffer;
3907 a.zbuffer_end = (stbi_uc *) buffer+len;
3908 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3909 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3910 return a.zout_start;
3911 } else {
3912 STBI_FREE(a.zout_start);
3913 return NULL;
3914 }
3915 }
3916
STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3918 {
3919 stbi__zbuf a;
3920 a.zbuffer = (stbi_uc *) ibuffer;
3921 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3922 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3923 return (int) (a.zout - a.zout_start);
3924 else
3925 return -1;
3926 }
3927 #endif
3928
3929 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3930 // simple implementation
//      - 1/2/4/8/16-bit samples
3932 // - no CRC checking
3933 // - allocates lots of intermediate memory
3934 // - avoids problem of streaming data between subsystems
3935 // - avoids explicit window management
3936 // performance
3937 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3938
3939 #ifndef STBI_NO_PNG
3940 typedef struct
3941 {
3942 stbi__uint32 length;
3943 stbi__uint32 type;
3944 } stbi__pngchunk;
3945
static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3947 {
3948 stbi__pngchunk c;
3949 c.length = stbi__get32be(s);
3950 c.type = stbi__get32be(s);
3951 return c;
3952 }
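// A chunk header is two big-endian 32-bit fields: the data length and the
// four-character type. The sketch below parses one from a memory buffer
// rather than through stbi__context; example_read_be32, example_chunk_header
// and example_parse_chunk_header are hypothetical names, and the caller is
// assumed to have at least 8 readable bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Read a 32-bit big-endian value from 4 bytes.
static uint32_t example_read_be32(const unsigned char *p)
{
   assert(p != NULL);
   return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
          ((uint32_t) p[2] <<  8) |  (uint32_t) p[3];
}

typedef struct {
   uint32_t length;   // number of data bytes that follow the header
   uint32_t type;     // four ASCII characters packed MSB-first, e.g. 'IHDR'
} example_chunk_header;

static example_chunk_header example_parse_chunk_header(const unsigned char *p)
{
   example_chunk_header c;
   c.length = example_read_be32(p);
   c.type   = example_read_be32(p + 4);
   return c;
}
```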
3953
static int stbi__check_png_header(stbi__context *s)
3955 {
3956 static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3957 int i;
3958 for (i=0; i < 8; ++i)
3959 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3960 return 1;
3961 }
3962
3963 typedef struct
3964 {
3965 stbi__context *s;
3966 stbi_uc *idata, *expanded, *out;
3967 int depth;
3968 } stbi__png;
3969
3970
3971 enum {
3972 STBI__F_none=0,
3973 STBI__F_sub=1,
3974 STBI__F_up=2,
3975 STBI__F_avg=3,
3976 STBI__F_paeth=4,
3977 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3978 STBI__F_avg_first,
3979 STBI__F_paeth_first
3980 };
3981
3982 static stbi_uc first_row_filter[5] =
3983 {
3984 STBI__F_none,
3985 STBI__F_sub,
3986 STBI__F_none,
3987 STBI__F_avg_first,
3988 STBI__F_paeth_first
3989 };
3990
static int stbi__paeth(int a, int b, int c)
3992 {
3993 int p = a + b - c;
3994 int pa = abs(p-a);
3995 int pb = abs(p-b);
3996 int pc = abs(p-c);
3997 if (pa <= pb && pa <= pc) return a;
3998 if (pb <= pc) return b;
3999 return c;
4000 }
4001
4002 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
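// The table above holds the multiplier that stretches a 1-, 2- or 4-bit
// grayscale sample to the full 0..255 range (0xff, 0x55, 0x11). A small
// sketch of how such a table is applied; example_scale_sample is a
// hypothetical helper, not a function of this library.

```c
#include <assert.h>

// Scale a `depth`-bit sample (depth in {1,2,4,8}) to 8 bits.
static unsigned char example_scale_sample(unsigned int v, int depth)
{
   static const unsigned char scale[9] = { 0, 0xff, 0x55, 0, 0x11, 0, 0, 0, 0x01 };
   assert(depth == 1 || depth == 2 || depth == 4 || depth == 8);
   assert(v < (1u << depth));
   return (unsigned char) (v * scale[depth]);   // max sample always maps to 255
}
```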
4003
4004 // create the png data from post-deflated data
static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4006 {
4007 int bytes = (depth == 16? 2 : 1);
4008 stbi__context *s = a->s;
4009 stbi__uint32 i,j,stride = x*out_n*bytes;
4010 stbi__uint32 img_len, img_width_bytes;
4011 int k;
4012 int img_n = s->img_n; // copy it into a local for later
4013
4014 int output_bytes = out_n*bytes;
4015 int filter_bytes = img_n*bytes;
4016 int width = x;
4017
4018 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4019 a->out = (stbi_uc *) stbi__malloc(x * y * output_bytes); // extra bytes to write off the end into
4020 if (!a->out) return stbi__err("outofmem", "Out of memory");
4021
4022 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4023 img_len = (img_width_bytes + 1) * y;
4024 if (s->img_x == x && s->img_y == y) {
4025 if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4026 } else { // interlaced:
4027 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4028 }
4029
4030 for (j=0; j < y; ++j) {
4031 stbi_uc *cur = a->out + stride*j;
4032 stbi_uc *prior = cur - stride;
4033 int filter = *raw++;
4034
4035 if (filter > 4)
4036 return stbi__err("invalid filter","Corrupt PNG");
4037
4038 if (depth < 8) {
4039 STBI_ASSERT(img_width_bytes <= x);
4040 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4041 filter_bytes = 1;
4042 width = img_width_bytes;
4043 }
4044
4045 // if first row, use special filter that doesn't sample previous row
4046 if (j == 0) filter = first_row_filter[filter];
4047
4048 // handle first byte explicitly
4049 for (k=0; k < filter_bytes; ++k) {
4050 switch (filter) {
4051 case STBI__F_none : cur[k] = raw[k]; break;
4052 case STBI__F_sub : cur[k] = raw[k]; break;
4053 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4054 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4055 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4056 case STBI__F_avg_first : cur[k] = raw[k]; break;
4057 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4058 }
4059 }
4060
4061 if (depth == 8) {
4062 if (img_n != out_n)
4063 cur[img_n] = 255; // first pixel
4064 raw += img_n;
4065 cur += out_n;
4066 prior += out_n;
4067 } else if (depth == 16) {
4068 if (img_n != out_n) {
4069 cur[filter_bytes] = 255; // first pixel top byte
4070 cur[filter_bytes+1] = 255; // first pixel bottom byte
4071 }
4072 raw += filter_bytes;
4073 cur += output_bytes;
4074 prior += output_bytes;
4075 } else {
4076 raw += 1;
4077 cur += 1;
4078 prior += 1;
4079 }
4080
4081 // this is a little gross, so that we don't switch per-pixel or per-component
4082 if (depth < 8 || img_n == out_n) {
4083 int nk = (width - 1)*filter_bytes;
4084 #define CASE(f) \
4085 case f: \
4086 for (k=0; k < nk; ++k)
4087 switch (filter) {
4088 // "none" filter turns into a memcpy here; make that explicit.
4089 case STBI__F_none: memcpy(cur, raw, nk); break;
4090 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4091 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4092 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4093 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4094 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4095 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4096 }
4097 #undef CASE
4098 raw += nk;
4099 } else {
4100 STBI_ASSERT(img_n+1 == out_n);
4101 #define CASE(f) \
4102 case f: \
4103 for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
4104 for (k=0; k < filter_bytes; ++k)
4105 switch (filter) {
4106 CASE(STBI__F_none) cur[k] = raw[k]; break;
4107 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); break;
4108 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4109 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); break;
4110 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); break;
4111 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); break;
4112 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); break;
4113 }
4114 #undef CASE
4115
4116 // the loop above sets the high byte of the pixels' alpha, but for
4117 // 16 bit png files we also need the low byte set. we'll do that here.
4118 if (depth == 16) {
4119 cur = a->out + stride*j; // start at the beginning of the row again
4120 for (i=0; i < x; ++i,cur+=output_bytes) {
4121 cur[filter_bytes+1] = 255;
4122 }
4123 }
4124 }
4125 }
4126
   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
4130 if (depth < 8) {
4131 for (j=0; j < y; ++j) {
4132 stbi_uc *cur = a->out + stride*j;
4133 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // png guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4136 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4137
4138 // note that the final byte might overshoot and write more data than desired.
4139 // we can allocate enough data that this never writes out of memory, but it
4140 // could also overwrite the next scanline. can it overwrite non-empty data
4141 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4142 // so we need to explicitly clamp the final ones
4143
4144 if (depth == 4) {
4145 for (k=x*img_n; k >= 2; k-=2, ++in) {
4146 *cur++ = scale * ((*in >> 4) );
4147 *cur++ = scale * ((*in ) & 0x0f);
4148 }
4149 if (k > 0) *cur++ = scale * ((*in >> 4) );
4150 } else if (depth == 2) {
4151 for (k=x*img_n; k >= 4; k-=4, ++in) {
4152 *cur++ = scale * ((*in >> 6) );
4153 *cur++ = scale * ((*in >> 4) & 0x03);
4154 *cur++ = scale * ((*in >> 2) & 0x03);
4155 *cur++ = scale * ((*in ) & 0x03);
4156 }
4157 if (k > 0) *cur++ = scale * ((*in >> 6) );
4158 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4159 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4160 } else if (depth == 1) {
4161 for (k=x*img_n; k >= 8; k-=8, ++in) {
4162 *cur++ = scale * ((*in >> 7) );
4163 *cur++ = scale * ((*in >> 6) & 0x01);
4164 *cur++ = scale * ((*in >> 5) & 0x01);
4165 *cur++ = scale * ((*in >> 4) & 0x01);
4166 *cur++ = scale * ((*in >> 3) & 0x01);
4167 *cur++ = scale * ((*in >> 2) & 0x01);
4168 *cur++ = scale * ((*in >> 1) & 0x01);
4169 *cur++ = scale * ((*in ) & 0x01);
4170 }
4171 if (k > 0) *cur++ = scale * ((*in >> 7) );
4172 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4173 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4174 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4175 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4176 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4177 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4178 }
4179 if (img_n != out_n) {
4180 int q;
4181 // insert alpha = 255
4182 cur = a->out + stride*j;
4183 if (img_n == 1) {
4184 for (q=x-1; q >= 0; --q) {
4185 cur[q*2+1] = 255;
4186 cur[q*2+0] = cur[q];
4187 }
4188 } else {
4189 STBI_ASSERT(img_n == 3);
4190 for (q=x-1; q >= 0; --q) {
4191 cur[q*4+3] = 255;
4192 cur[q*4+2] = cur[q*3+2];
4193 cur[q*4+1] = cur[q*3+1];
4194 cur[q*4+0] = cur[q*3+0];
4195 }
4196 }
4197 }
4198 }
4199 } else if (depth == 16) {
4200 // force the image data from big-endian to platform-native.
4201 // this is done in a separate pass due to the decoding relying
4202 // on the data being untouched, but could probably be done
4203 // per-line during decode if care is taken.
4204 stbi_uc *cur = a->out;
4205 stbi__uint16 *cur16 = (stbi__uint16*)cur;
4206
4207 for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
4208 *cur16 = (cur[0] << 8) | cur[1];
4209 }
4210 }
4211
4212 return 1;
4213 }
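// The function above unrolls the five PNG filter types for speed. For
// reference, here is a deliberately simple per-byte version of the same
// reconstruction. example_paeth and example_unfilter_row are hypothetical
// names; a NULL `prior` stands for the all-zero row before the first
// scanline, which replaces the synthetic *_first filters used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int example_paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p - a), pb = abs(p - b), pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}

// Reverse filter type `filter` (0..4) for one scanline of `row_bytes` bytes.
// `bpp` is the filter unit in bytes (at least 1). Returns 0 on a bad filter type.
static int example_unfilter_row(int filter, unsigned char *cur, const unsigned char *raw,
                                const unsigned char *prior, int row_bytes, int bpp)
{
   int k;
   assert(row_bytes >= 0 && bpp >= 1);
   if (filter < 0 || filter > 4) return 0;
   for (k = 0; k < row_bytes; ++k) {
      int left    = (k >= bpp) ? cur[k - bpp] : 0;              // already-decoded byte to the left
      int up      = prior ? prior[k] : 0;                       // byte above
      int up_left = (prior && k >= bpp) ? prior[k - bpp] : 0;   // byte above-left
      int pred;
      switch (filter) {
         case 0:  pred = 0; break;                                  // None
         case 1:  pred = left; break;                               // Sub
         case 2:  pred = up; break;                                 // Up
         case 3:  pred = (left + up) >> 1; break;                   // Average
         default: pred = example_paeth(left, up, up_left); break;   // Paeth
      }
      cur[k] = (unsigned char) ((raw[k] + pred) & 255);             // arithmetic is modulo 256
   }
   return 1;
}
```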
4214
static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
{
   stbi_uc *final;
   int p;
   if (!interlaced)
      return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);

   // de-interlacing
   final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
   if (!final) return stbi__err("outofmem", "Out of memory"); // the memcpy below would write through NULL
   for (p=0; p < 7; ++p) {
      int xorig[] = { 0,4,0,2,0,1,0 };
      int yorig[] = { 0,0,4,0,2,0,1 };
      int xspc[]  = { 8,8,4,4,2,2,1 };
      int yspc[]  = { 8,8,8,4,4,2,2 };
      int i,j,x,y;
      // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
      x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
      y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
      if (x && y) {
         stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
         if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
            STBI_FREE(final);
            return 0;
         }
         for (j=0; j < y; ++j) {
            for (i=0; i < x; ++i) {
               int out_y = j*yspc[p]+yorig[p];
               int out_x = i*xspc[p]+xorig[p];
               memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
                      a->out + (j*x+i)*out_n, out_n);
            }
         }
         STBI_FREE(a->out);
         image_data += img_len;
         image_data_len -= img_len;
      }
   }
   a->out = final;

   return 1;
}
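// The per-pass dimensions computed above follow the Adam7 origin/spacing
// tables. A standalone sketch using the same tables, with the hypothetical
// name example_adam7_pass_size and plain int dimensions; a pass may be empty
// (width or height 0) for small images.

```c
#include <assert.h>

// Compute the width and height of Adam7 pass `pass` (0..6) for a w x h image.
static void example_adam7_pass_size(int w, int h, int pass, int *pw, int *ph)
{
   static const int xorig[7] = { 0,4,0,2,0,1,0 };
   static const int yorig[7] = { 0,0,4,0,2,0,1 };
   static const int xspc[7]  = { 8,8,4,4,2,2,1 };
   static const int yspc[7]  = { 8,8,8,4,4,2,2 };
   assert(pass >= 0 && pass < 7 && w > 0 && h > 0);
   // ceil((w - origin) / spacing), written with integer arithmetic
   *pw = (w - xorig[pass] + xspc[pass] - 1) / xspc[pass];
   *ph = (h - yorig[pass] + yspc[pass] - 1) / yspc[pass];
}
```

// Summed over all seven passes, pw*ph equals w*h: every pixel belongs to
// exactly one pass.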
4256
static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4258 {
4259 stbi__context *s = z->s;
4260 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4261 stbi_uc *p = z->out;
4262
4263 // compute color-based transparency, assuming we've
4264 // already got 255 as the alpha value in the output
4265 STBI_ASSERT(out_n == 2 || out_n == 4);
4266
4267 if (out_n == 2) {
4268 for (i=0; i < pixel_count; ++i) {
4269 p[1] = (p[0] == tc[0] ? 0 : 255);
4270 p += 2;
4271 }
4272 } else {
4273 for (i=0; i < pixel_count; ++i) {
4274 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4275 p[3] = 0;
4276 p += 4;
4277 }
4278 }
4279 return 1;
4280 }
4281
static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
4283 {
4284 stbi__context *s = z->s;
4285 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4286 stbi__uint16 *p = (stbi__uint16*) z->out;
4287
4288 // compute color-based transparency, assuming we've
4289 // already got 65535 as the alpha value in the output
4290 STBI_ASSERT(out_n == 2 || out_n == 4);
4291
4292 if (out_n == 2) {
4293 for (i = 0; i < pixel_count; ++i) {
4294 p[1] = (p[0] == tc[0] ? 0 : 65535);
4295 p += 2;
4296 }
4297 } else {
4298 for (i = 0; i < pixel_count; ++i) {
4299 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4300 p[3] = 0;
4301 p += 4;
4302 }
4303 }
4304 return 1;
4305 }
4306
static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4308 {
4309 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4310 stbi_uc *p, *temp_out, *orig = a->out;
4311
4312 p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4313 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4314
   // between here and free(out) below, exiting would leak
4316 temp_out = p;
4317
4318 if (pal_img_n == 3) {
4319 for (i=0; i < pixel_count; ++i) {
4320 int n = orig[i]*4;
4321 p[0] = palette[n ];
4322 p[1] = palette[n+1];
4323 p[2] = palette[n+2];
4324 p += 3;
4325 }
4326 } else {
4327 for (i=0; i < pixel_count; ++i) {
4328 int n = orig[i]*4;
4329 p[0] = palette[n ];
4330 p[1] = palette[n+1];
4331 p[2] = palette[n+2];
4332 p[3] = palette[n+3];
4333 p += 4;
4334 }
4335 }
4336 STBI_FREE(a->out);
4337 a->out = temp_out;
4338
4339 STBI_NOTUSED(len);
4340
4341 return 1;
4342 }
4343
static int stbi__reduce_png(stbi__png *p)
{
   int i;
   int img_len = p->s->img_x * p->s->img_y * p->s->img_out_n;
   stbi_uc *reduced;
   stbi__uint16 *orig = (stbi__uint16*)p->out;

   if (p->depth != 16) return 1; // don't need to do anything if not 16-bit data

   reduced = (stbi_uc *)stbi__malloc(img_len);
   if (reduced == NULL) return stbi__err("outofmem", "Out of memory"); // check the new allocation, not p

   for (i = 0; i < img_len; ++i) reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top byte of each 16-bit sample is a decent approx of 16->8 bit scaling

   p->out = reduced;
   STBI_FREE(orig);

   return 1;
}
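// The 16-to-8-bit reduction above keeps only the most significant byte of
// each sample. A minimal sketch on plain arrays; example_reduce16 is a
// hypothetical name, and the caller is assumed to own both buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Write the high byte of each of `count` 16-bit samples to dst.
static void example_reduce16(const uint16_t *src, size_t count, uint8_t *dst)
{
   size_t i;
   assert(src != NULL && dst != NULL);
   for (i = 0; i < count; ++i)
      dst[i] = (uint8_t) (src[i] >> 8);   // 0xFFFF -> 0xFF, 0x1234 -> 0x12
}
```

// This truncates rather than rounds, so 0x00FF maps to 0x00.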
4363
4364 static int stbi__unpremultiply_on_load = 0;
4365 static int stbi__de_iphone_flag = 0;
4366
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4368 {
4369 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4370 }
4371
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4373 {
4374 stbi__de_iphone_flag = flag_true_if_should_convert;
4375 }
4376
static void stbi__de_iphone(stbi__png *z)
4378 {
4379 stbi__context *s = z->s;
4380 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4381 stbi_uc *p = z->out;
4382
4383 if (s->img_out_n == 3) { // convert bgr to rgb
4384 for (i=0; i < pixel_count; ++i) {
4385 stbi_uc t = p[0];
4386 p[0] = p[2];
4387 p[2] = t;
4388 p += 3;
4389 }
4390 } else {
4391 STBI_ASSERT(s->img_out_n == 4);
4392 if (stbi__unpremultiply_on_load) {
4393 // convert bgr to rgb and unpremultiply
4394 for (i=0; i < pixel_count; ++i) {
4395 stbi_uc a = p[3];
4396 stbi_uc t = p[0];
4397 if (a) {
4398 p[0] = p[2] * 255 / a;
4399 p[1] = p[1] * 255 / a;
4400 p[2] = t * 255 / a;
4401 } else {
4402 p[0] = p[2];
4403 p[2] = t;
4404 }
4405 p += 4;
4406 }
4407 } else {
4408 // convert bgr to rgb
4409 for (i=0; i < pixel_count; ++i) {
4410 stbi_uc t = p[0];
4411 p[0] = p[2];
4412 p[2] = t;
4413 p += 4;
4414 }
4415 }
4416 }
4417 }
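// The 4-channel path above combines a BGR-to-RGB swap with optional
// un-premultiplication by alpha. The sketch below shows the same per-pixel
// arithmetic on a plain byte array; example_bgra_to_rgba is a hypothetical
// name, and the input is assumed to be validly premultiplied (each color
// channel <= alpha) when unpremultiply is nonzero.

```c
#include <assert.h>
#include <stddef.h>

// Convert `pixel_count` BGRA pixels to RGBA in place.
static void example_bgra_to_rgba(unsigned char *p, size_t pixel_count, int unpremultiply)
{
   size_t i;
   assert(p != NULL || pixel_count == 0);
   for (i = 0; i < pixel_count; ++i, p += 4) {
      unsigned char a = p[3];
      unsigned char t = p[0];                      // save B before it is overwritten
      if (unpremultiply && a) {
         p[0] = (unsigned char) (p[2] * 255 / a);  // R
         p[1] = (unsigned char) (p[1] * 255 / a);  // G
         p[2] = (unsigned char) (t    * 255 / a);  // B
      } else {
         p[0] = p[2];                              // plain swap; alpha 0 leaves color as-is
         p[2] = t;
      }
   }
}
```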
4418
4419 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
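/*
   Illustration only, not part of stb_image: a sketch of how a four-character
   chunk type packs into one 32-bit value, and how the (1 << 29) test used by
   the chunk parser picks out ancillary chunks. EXAMPLE_FOURCC and
   example_chunk_is_ancillary are hypothetical names; the unsigned long casts
   are an addition of this sketch.

```c
#include <assert.h>

#define EXAMPLE_FOURCC(a,b,c,d) (((unsigned long)(a) << 24) + ((unsigned long)(b) << 16) + ((unsigned long)(c) << 8) + (unsigned long)(d))

// Bit 29 of the packed type is bit 5 (0x20) of the first character, so it is
// set exactly when that character is lowercase. PNG uses a lowercase first
// letter to mark a chunk as ancillary (safe to skip if unknown).
static int example_chunk_is_ancillary(unsigned long type)
{
   assert(type <= 0xFFFFFFFFul);
   return (type & (1ul << 29)) != 0;
}
```

   For example, 'I','H','D','R' packs to 0x49484452, which is the value a
   big-endian 32-bit read of the chunk type field would produce.
*/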
4420
static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4422 {
4423 stbi_uc palette[1024], pal_img_n=0;
4424 stbi_uc has_trans=0, tc[3];
4425 stbi__uint16 tc16[3];
4426 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4427 int first=1,k,interlace=0, color=0, is_iphone=0;
4428 stbi__context *s = z->s;
4429
4430 z->expanded = NULL;
4431 z->idata = NULL;
4432 z->out = NULL;
4433
4434 if (!stbi__check_png_header(s)) return 0;
4435
4436 if (scan == STBI__SCAN_type) return 1;
4437
4438 for (;;) {
4439 stbi__pngchunk c = stbi__get_chunk_header(s);
4440 switch (c.type) {
4441 case STBI__PNG_TYPE('C','g','B','I'):
4442 is_iphone = 1;
4443 stbi__skip(s, c.length);
4444 break;
4445 case STBI__PNG_TYPE('I','H','D','R'): {
4446 int comp,filter;
4447 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4448 first = 0;
4449 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4450 s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4451 s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4452 z->depth = stbi__get8(s); if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16) return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
4453 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4454 if (color == 3 && z->depth == 16) return stbi__err("bad ctype","Corrupt PNG");
4455 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4456 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4457 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4458 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4459 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4460 if (!pal_img_n) {
4461 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4462 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4463 if (scan == STBI__SCAN_header) return 1;
4464 } else {
               // if paletted, then pal_img_n is our final component count, and
               // img_n is the # of components to decompress/filter.
4467 s->img_n = 1;
4468 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4469 // if SCAN_header, have to scan to see if we have a tRNS
4470 }
4471 break;
4472 }
4473
4474 case STBI__PNG_TYPE('P','L','T','E'): {
4475 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4476 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4477 pal_len = c.length / 3;
4478 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4479 for (i=0; i < pal_len; ++i) {
4480 palette[i*4+0] = stbi__get8(s);
4481 palette[i*4+1] = stbi__get8(s);
4482 palette[i*4+2] = stbi__get8(s);
4483 palette[i*4+3] = 255;
4484 }
4485 break;
4486 }
4487
4488 case STBI__PNG_TYPE('t','R','N','S'): {
4489 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4490 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4491 if (pal_img_n) {
4492 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4493 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4494 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4495 pal_img_n = 4;
4496 for (i=0; i < c.length; ++i)
4497 palette[i*4+3] = stbi__get8(s);
4498 } else {
4499 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4500 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4501 has_trans = 1;
4502 if (z->depth == 16) {
4503 for (k = 0; k < s->img_n; ++k) tc16[k] = stbi__get16be(s); // copy the values as-is
4504 } else {
4505 for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
4506 }
4507 }
4508 break;
4509 }
4510
4511 case STBI__PNG_TYPE('I','D','A','T'): {
4512 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4513 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4514 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4515 if ((int)(ioff + c.length) < (int)ioff) return 0;
4516 if (ioff + c.length > idata_limit) {
4517 stbi__uint32 idata_limit_old = idata_limit;
4518 stbi_uc *p;
4519 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4520 while (ioff + c.length > idata_limit)
4521 idata_limit *= 2;
4522 STBI_NOTUSED(idata_limit_old);
4523 p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4524 z->idata = p;
4525 }
4526 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4527 ioff += c.length;
4528 break;
4529 }
4530
4531 case STBI__PNG_TYPE('I','E','N','D'): {
4532 stbi__uint32 raw_len, bpl;
4533 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4534 if (scan != STBI__SCAN_load) return 1;
4535 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4536 // initial guess for decoded data size to avoid unnecessary reallocs
4537 bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
4538 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4539 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4540 if (z->expanded == NULL) return 0; // zlib should set error
4541 STBI_FREE(z->idata); z->idata = NULL;
4542 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4543 s->img_out_n = s->img_n+1;
4544 else
4545 s->img_out_n = s->img_n;
4546 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
4547 if (has_trans) {
4548 if (z->depth == 16) {
4549 if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
4550 } else {
4551 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4552 }
4553 }
4554 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4555 stbi__de_iphone(z);
4556 if (pal_img_n) {
4557 // pal_img_n == 3 or 4
4558 s->img_n = pal_img_n; // record the actual colors we had
4559 s->img_out_n = pal_img_n;
4560 if (req_comp >= 3) s->img_out_n = req_comp;
4561 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4562 return 0;
4563 }
4564 STBI_FREE(z->expanded); z->expanded = NULL;
4565 return 1;
4566 }
4567
4568 default:
4569 // if critical, fail
4570 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4571 if ((c.type & (1 << 29)) == 0) {
4572 #ifndef STBI_NO_FAILURE_STRINGS
4573 // not threadsafe
4574 static char invalid_chunk[] = "XXXX PNG chunk not known";
4575 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4576 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4577 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4578 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4579 #endif
4580 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4581 }
4582 stbi__skip(s, c.length);
4583 break;
4584 }
4585 // end of PNG chunk, read and skip CRC
4586 stbi__get32be(s);
4587 }
4588 }
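/*
   Illustration only, not part of stb_image: standalone sketches of two small
   pieces of arithmetic used by the chunk parser above. Both function names are
   hypothetical. The first mirrors the IHDR size guard, the second mirrors the
   doubling policy used to grow the IDAT buffer; the second assumes the needed
   size is small enough that doubling does not overflow unsigned int.

```c
#include <assert.h>

// Returns 1 when x * y * n would exceed 2^30, using only divisions so the
// product itself is never formed and cannot overflow.
static int example_png_too_large(unsigned int x, unsigned int y, unsigned int n)
{
   assert(x != 0 && n != 0);
   return (1u << 30) / x / n < y;
}

// Returns the new buffer limit: at least 4096 on first use, then doubled
// until it can hold 'needed' bytes.
static unsigned int example_grow_limit(unsigned int limit, unsigned int needed)
{
   if (limit == 0) limit = needed > 4096 ? needed : 4096;
   while (needed > limit) limit *= 2;
   return limit;
}
```

   Doubling keeps the number of reallocations logarithmic in the total IDAT
   size when a file splits its data across many small chunks.
*/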
4589
static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4591 {
4592 unsigned char *result=NULL;
4593 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4594 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4595 if (p->depth == 16) {
4596 if (!stbi__reduce_png(p)) {
4597 return result;
4598 }
4599 }
4600 result = p->out;
4601 p->out = NULL;
4602 if (req_comp && req_comp != p->s->img_out_n) {
4603 result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4604 p->s->img_out_n = req_comp;
4605 if (result == NULL) return result;
4606 }
4607 *x = p->s->img_x;
4608 *y = p->s->img_y;
4609 if (n) *n = p->s->img_n;
4610 }
4611 STBI_FREE(p->out); p->out = NULL;
4612 STBI_FREE(p->expanded); p->expanded = NULL;
4613 STBI_FREE(p->idata); p->idata = NULL;
4614
4615 return result;
4616 }
4617
static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi__png p;
   p.s = s;
   return stbi__do_png(&p, x,y,comp,req_comp);
}
4624
static int stbi__png_test(stbi__context *s)
{
   int r;
   r = stbi__check_png_header(s);
   stbi__rewind(s);
   return r;
}
4632
static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
{
   if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
      stbi__rewind( p->s );
      return 0;
   }
   if (x) *x = p->s->img_x;
   if (y) *y = p->s->img_y;
   if (comp) *comp = p->s->img_n;
   return 1;
}
4644
static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
{
   stbi__png p;
   p.s = s;
   return stbi__png_info_raw(&p, x, y, comp);
}
4651 #endif
4652
4653 // Microsoft/Windows BMP image
4654
4655 #ifndef STBI_NO_BMP
static int stbi__bmp_test_raw(stbi__context *s)
{
   int r;
   int sz;
   if (stbi__get8(s) != 'B') return 0;
   if (stbi__get8(s) != 'M') return 0;
   stbi__get32le(s); // discard filesize
   stbi__get16le(s); // discard reserved
   stbi__get16le(s); // discard reserved
   stbi__get32le(s); // discard data offset
   sz = stbi__get32le(s);
   r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
   return r;
}
4670
static int stbi__bmp_test(stbi__context *s)
{
   int r = stbi__bmp_test_raw(s);
   stbi__rewind(s);
   return r;
}
4677
4678
// returns 0..31 for the highest set bit
static int stbi__high_bit(unsigned int z)
{
   int n=0;
   if (z == 0) return -1;
   if (z >= 0x10000) n += 16, z >>= 16;
   if (z >= 0x00100) n +=  8, z >>=  8;
   if (z >= 0x00010) n +=  4, z >>=  4;
   if (z >= 0x00004) n +=  2, z >>=  2;
   if (z >= 0x00002) n +=  1, z >>=  1;
   return n;
}
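/*
   Illustration only, not part of stb_image: a standalone copy of the
   binary-search high-bit routine above under the hypothetical name
   example_high_bit, so it can be tried in isolation. The closing assert
   assumes a 32-bit unsigned int, which the original does not state.

```c
#include <assert.h>

// Halve the search range at each step: 16, 8, 4, 2, then 1 bit.
// Returns -1 for zero, otherwise the index (0..31) of the highest set bit.
static int example_high_bit(unsigned int z)
{
   int n = 0;
   if (z == 0) return -1;
   if (z >= 0x10000) { n += 16; z >>= 16; }
   if (z >= 0x00100) { n +=  8; z >>=  8; }
   if (z >= 0x00010) { n +=  4; z >>=  4; }
   if (z >= 0x00004) { n +=  2; z >>=  2; }
   if (z >= 0x00002) { n +=  1; }
   assert(n >= 0 && n <= 31);
   return n;
}
```

   For the 32-bit BMP red mask 0x00ff0000 this yields 23, so the loader's
   right-shift amount (high bit minus 7) comes out as 16.
*/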
4691
static int stbi__bitcount(unsigned int a)
{
   a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   a = (a + (a >> 8)); // max 16 per 8 bits
   a = (a + (a >> 16)); // max 32 per 8 bits
   return a & 0xff;
}
4701
static int stbi__shiftsigned(int v, int shift, int bits)
{
   int result;
   int z=0;

   if (shift < 0) v <<= -shift;
   else v >>= shift;
   result = v;

   z = bits;
   while (z < 8) {
      result += v >> z;
      z += bits;
   }
   return result;
}
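/*
   Illustration only, not part of stb_image: a standalone sketch of how the
   three helpers above combine to turn one masked bitfield into an 8-bit
   channel, as the BMP loader does for 16- and 32-bit pixels. All names are
   hypothetical, and the first two helpers are simple loop stand-ins for
   stbi__high_bit and stbi__bitcount rather than copies of them.

```c
#include <assert.h>

// Index of the highest set bit, or -1 for zero (loop stand-in).
static int example_top_bit(unsigned int mask)
{
   int n = -1;
   while (mask) { ++n; mask >>= 1; }
   return n;
}

// Number of set bits (loop stand-in).
static int example_popcount(unsigned int mask)
{
   int n = 0;
   while (mask) { n += (int)(mask & 1u); mask >>= 1; }
   return n;
}

// Align the field so its top bit lands on bit 7, then fill the low bits by
// adding right-shifted copies of the field, so an all-ones field becomes 255.
static int example_expand(int v, int shift, int bits)
{
   int result, z;
   if (shift < 0) v <<= -shift; else v >>= shift;
   result = v;
   for (z = bits; z < 8; z += bits) result += v >> z;
   return result;
}

// Extract the channel selected by 'mask' from 'pixel' as an 8-bit value.
static unsigned char example_channel(unsigned int pixel, unsigned int mask)
{
   assert(mask != 0);
   return (unsigned char) example_expand((int)(pixel & mask), example_top_bit(mask) - 7, example_popcount(mask));
}
```

   With the common RGB565 masks 0xF800, 0x07E0 and 0x001F, a pixel of 0xFFFF
   decodes to 255 in every channel, which a plain shift would not achieve.
*/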
4718
4719 typedef struct
4720 {
4721 int bpp, offset, hsz;
4722 unsigned int mr,mg,mb,ma, all_a;
4723 } stbi__bmp_data;
4724
static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
4726 {
4727 int hsz;
4728 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4729 stbi__get32le(s); // discard filesize
4730 stbi__get16le(s); // discard reserved
4731 stbi__get16le(s); // discard reserved
4732 info->offset = stbi__get32le(s);
4733 info->hsz = hsz = stbi__get32le(s);
4734 info->mr = info->mg = info->mb = info->ma = 0;
4735
4736 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4737 if (hsz == 12) {
4738 s->img_x = stbi__get16le(s);
4739 s->img_y = stbi__get16le(s);
4740 } else {
4741 s->img_x = stbi__get32le(s);
4742 s->img_y = stbi__get32le(s);
4743 }
4744 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4745 info->bpp = stbi__get16le(s);
4746 if (info->bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4747 if (hsz != 12) {
4748 int compress = stbi__get32le(s);
4749 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4750 stbi__get32le(s); // discard sizeof
4751 stbi__get32le(s); // discard hres
4752 stbi__get32le(s); // discard vres
4753 stbi__get32le(s); // discard colorsused
4754 stbi__get32le(s); // discard max important
4755 if (hsz == 40 || hsz == 56) {
4756 if (hsz == 56) {
4757 stbi__get32le(s);
4758 stbi__get32le(s);
4759 stbi__get32le(s);
4760 stbi__get32le(s);
4761 }
4762 if (info->bpp == 16 || info->bpp == 32) {
4763 if (compress == 0) {
4764 if (info->bpp == 32) {
4765 info->mr = 0xffu << 16;
4766 info->mg = 0xffu << 8;
4767 info->mb = 0xffu << 0;
4768 info->ma = 0xffu << 24;
4769 info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
4770 } else {
4771 info->mr = 31u << 10;
4772 info->mg = 31u << 5;
4773 info->mb = 31u << 0;
4774 }
4775 } else if (compress == 3) {
4776 info->mr = stbi__get32le(s);
4777 info->mg = stbi__get32le(s);
4778 info->mb = stbi__get32le(s);
4779 // not documented, but generated by photoshop and handled by mspaint
4780 if (info->mr == info->mg && info->mg == info->mb) {
4781 // ?!?!?
4782 return stbi__errpuc("bad BMP", "bad BMP");
4783 }
4784 } else
4785 return stbi__errpuc("bad BMP", "bad BMP");
4786 }
4787 } else {
4788 int i;
4789 if (hsz != 108 && hsz != 124)
4790 return stbi__errpuc("bad BMP", "bad BMP");
4791 info->mr = stbi__get32le(s);
4792 info->mg = stbi__get32le(s);
4793 info->mb = stbi__get32le(s);
4794 info->ma = stbi__get32le(s);
4795 stbi__get32le(s); // discard color space
4796 for (i=0; i < 12; ++i)
4797 stbi__get32le(s); // discard color space parameters
4798 if (hsz == 124) {
4799 stbi__get32le(s); // discard rendering intent
4800 stbi__get32le(s); // discard offset of profile data
4801 stbi__get32le(s); // discard size of profile data
4802 stbi__get32le(s); // discard reserved
4803 }
4804 }
4805 }
4806 return (void *) 1;
4807 }
4808
4809
static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4811 {
4812 stbi_uc *out;
4813 unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
4814 stbi_uc pal[256][4];
4815 int psize=0,i,j,width;
4816 int flip_vertically, pad, target;
4817 stbi__bmp_data info;
4818
4819 info.all_a = 255;
4820 if (stbi__bmp_parse_header(s, &info) == NULL)
4821 return NULL; // error code already set
4822
4823 flip_vertically = ((int) s->img_y) > 0;
4824 s->img_y = abs((int) s->img_y);
4825
4826 mr = info.mr;
4827 mg = info.mg;
4828 mb = info.mb;
4829 ma = info.ma;
4830 all_a = info.all_a;
4831
4832 if (info.hsz == 12) {
4833 if (info.bpp < 24)
4834 psize = (info.offset - 14 - 24) / 3;
4835 } else {
4836 if (info.bpp < 16)
4837 psize = (info.offset - 14 - info.hsz) >> 2;
4838 }
4839
4840 s->img_n = ma ? 4 : 3;
4841 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4842 target = req_comp;
4843 else
4844 target = s->img_n; // if they want monochrome, we'll post-convert
4845
4846 out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4847 if (!out) return stbi__errpuc("outofmem", "Out of memory");
4848 if (info.bpp < 16) {
4849 int z=0;
4850 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4851 for (i=0; i < psize; ++i) {
4852 pal[i][2] = stbi__get8(s);
4853 pal[i][1] = stbi__get8(s);
4854 pal[i][0] = stbi__get8(s);
4855 if (info.hsz != 12) stbi__get8(s);
4856 pal[i][3] = 255;
4857 }
4858 stbi__skip(s, info.offset - 14 - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
4859 if (info.bpp == 4) width = (s->img_x + 1) >> 1;
4860 else if (info.bpp == 8) width = s->img_x;
4861 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4862 pad = (-width)&3;
4863 for (j=0; j < (int) s->img_y; ++j) {
4864 for (i=0; i < (int) s->img_x; i += 2) {
4865 int v=stbi__get8(s),v2=0;
4866 if (info.bpp == 4) {
4867 v2 = v & 15;
4868 v >>= 4;
4869 }
4870 out[z++] = pal[v][0];
4871 out[z++] = pal[v][1];
4872 out[z++] = pal[v][2];
4873 if (target == 4) out[z++] = 255;
4874 if (i+1 == (int) s->img_x) break;
4875 v = (info.bpp == 8) ? stbi__get8(s) : v2;
4876 out[z++] = pal[v][0];
4877 out[z++] = pal[v][1];
4878 out[z++] = pal[v][2];
4879 if (target == 4) out[z++] = 255;
4880 }
4881 stbi__skip(s, pad);
4882 }
4883 } else {
4884 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4885 int z = 0;
4886 int easy=0;
4887 stbi__skip(s, info.offset - 14 - info.hsz);
4888 if (info.bpp == 24) width = 3 * s->img_x;
4889 else if (info.bpp == 16) width = 2*s->img_x;
4890 else /* bpp = 32 and pad = 0 */ width=0;
4891 pad = (-width) & 3;
4892 if (info.bpp == 24) {
4893 easy = 1;
4894 } else if (info.bpp == 32) {
4895 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4896 easy = 2;
4897 }
4898 if (!easy) {
4899 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4900 // right shift amt to put high bit in position #7
4901 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4902 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4903 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4904 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4905 }
4906 for (j=0; j < (int) s->img_y; ++j) {
4907 if (easy) {
4908 for (i=0; i < (int) s->img_x; ++i) {
4909 unsigned char a;
4910 out[z+2] = stbi__get8(s);
4911 out[z+1] = stbi__get8(s);
4912 out[z+0] = stbi__get8(s);
4913 z += 3;
4914 a = (easy == 2 ? stbi__get8(s) : 255);
4915 all_a |= a;
4916 if (target == 4) out[z++] = a;
4917 }
4918 } else {
4919 int bpp = info.bpp;
4920 for (i=0; i < (int) s->img_x; ++i) {
4921 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4922 int a;
4923 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4924 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4925 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4926 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4927 all_a |= a;
4928 if (target == 4) out[z++] = STBI__BYTECAST(a);
4929 }
4930 }
4931 stbi__skip(s, pad);
4932 }
4933 }
4934
4935 // if alpha channel is all 0s, replace with all 255s
4936 if (target == 4 && all_a == 0)
4937 for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
4938 out[i] = 255;
4939
4940 if (flip_vertically) {
4941 stbi_uc t;
4942 for (j=0; j < (int) s->img_y>>1; ++j) {
4943 stbi_uc *p1 = out + j *s->img_x*target;
4944 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4945 for (i=0; i < (int) s->img_x*target; ++i) {
4946 t = p1[i], p1[i] = p2[i], p2[i] = t;
4947 }
4948 }
4949 }
4950
4951 if (req_comp && req_comp != target) {
4952 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4953 if (out == NULL) return out; // stbi__convert_format frees input on failure
4954 }
4955
4956 *x = s->img_x;
4957 *y = s->img_y;
4958 if (comp) *comp = s->img_n;
4959 return out;
4960 }
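/*
   Illustration only, not part of stb_image: a standalone sketch of the row
   padding computation used by the BMP loader above. BMP rows are stored padded
   to a multiple of 4 bytes; the expression (-width) & 3 gives the number of
   padding bytes. The name example_bmp_row_pad is hypothetical, and the sketch
   assumes a two's complement int.

```c
#include <assert.h>

// Number of padding bytes that follow a row of 'row_bytes' bytes so that the
// next row starts on a 4-byte boundary.
static int example_bmp_row_pad(int row_bytes)
{
   assert(row_bytes >= 0);
   return (-row_bytes) & 3;
}
```

   A 24-bit image that is 3 pixels wide has 9 bytes of pixel data per row and
   therefore 3 bytes of padding, which the loader skips after each row.
*/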
4961 #endif
4962
4963 // Targa Truevision - TGA
4964 // by Jonathan Dummer
4965 #ifndef STBI_NO_TGA
// returns STBI_rgb or whatever, 0 on error
static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
{
   // only RGB or RGBA (incl. 16bit) or grey allowed
   if(is_rgb16) *is_rgb16 = 0;
   switch(bits_per_pixel) {
      case 8:  return STBI_grey;
      case 16: if(is_grey) return STBI_grey_alpha;
            // else: fall-through
      case 15: if(is_rgb16) *is_rgb16 = 1;
            return STBI_rgb;
      case 24: // fall-through
      case 32: return bits_per_pixel/8;
      default: return 0;
   }
}
4982
static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4984 {
4985 int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
4986 int sz, tga_colormap_type;
4987 stbi__get8(s); // discard Offset
4988 tga_colormap_type = stbi__get8(s); // colormap type
4989 if( tga_colormap_type > 1 ) {
4990 stbi__rewind(s);
4991 return 0; // only RGB or indexed allowed
4992 }
4993 tga_image_type = stbi__get8(s); // image type
4994 if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
4995 if (tga_image_type != 1 && tga_image_type != 9) {
4996 stbi__rewind(s);
4997 return 0;
4998 }
4999 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5000 sz = stbi__get8(s); // check bits per palette color entry
5001 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
5002 stbi__rewind(s);
5003 return 0;
5004 }
5005 stbi__skip(s,4); // skip image x and y origin
5006 tga_colormap_bpp = sz;
5007 } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
5008 if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
5009 stbi__rewind(s);
5010 return 0; // only RGB or grey allowed, +/- RLE
5011 }
5012 stbi__skip(s,9); // skip colormap specification and image x/y origin
5013 tga_colormap_bpp = 0;
5014 }
5015 tga_w = stbi__get16le(s);
5016 if( tga_w < 1 ) {
5017 stbi__rewind(s);
5018 return 0; // test width
5019 }
5020 tga_h = stbi__get16le(s);
5021 if( tga_h < 1 ) {
5022 stbi__rewind(s);
5023 return 0; // test height
5024 }
5025 tga_bits_per_pixel = stbi__get8(s); // bits per pixel
5026 stbi__get8(s); // ignore alpha bits
5027 if (tga_colormap_bpp != 0) {
5028 if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
5029 // when using a colormap, tga_bits_per_pixel is the size of the indexes
5030 // I don't think anything but 8 or 16bit indexes makes sense
5031 stbi__rewind(s);
5032 return 0;
5033 }
5034 tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
5035 } else {
5036 tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
5037 }
5038 if(!tga_comp) {
5039 stbi__rewind(s);
5040 return 0;
5041 }
5042 if (x) *x = tga_w;
5043 if (y) *y = tga_h;
5044 if (comp) *comp = tga_comp;
5045 return 1; // seems to have passed everything
5046 }
5047
static int stbi__tga_test(stbi__context *s)
5049 {
5050 int res = 0;
5051 int sz, tga_color_type;
5052 stbi__get8(s); // discard Offset
5053 tga_color_type = stbi__get8(s); // color type
5054 if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed
5055 sz = stbi__get8(s); // image type
5056 if ( tga_color_type == 1 ) { // colormapped (paletted) image
5057 if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
5058 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5059 sz = stbi__get8(s); // check bits per palette color entry
5060 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5061 stbi__skip(s,4); // skip image x and y origin
5062 } else { // "normal" image w/o colormap
5063 if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
5064 stbi__skip(s,9); // skip colormap specification and image x/y origin
5065 }
5066 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width
5067 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height
5068 sz = stbi__get8(s); // bits per pixel
5069 if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
5070 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5071
5072 res = 1; // if we got this far, everything's good and we can return 1 instead of 0
5073
5074 errorEnd:
5075 stbi__rewind(s);
5076 return res;
5077 }
5078
// read 16bit value and convert to 24bit RGB
static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
{
   stbi__uint16 px = stbi__get16le(s);
   stbi__uint16 fiveBitMask = 31;
   // we have 3 channels with 5bits each
   int r = (px >> 10) & fiveBitMask;
   int g = (px >> 5) & fiveBitMask;
   int b = px & fiveBitMask;
   // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
   out[0] = (r * 255)/31;
   out[1] = (g * 255)/31;
   out[2] = (b * 255)/31;

   // some people claim that the most significant bit might be used for alpha
   // (possibly if an alpha-bit is set in the "image descriptor byte")
   // but that only made 16bit test images completely translucent..
   // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
}
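/*
   Illustration only, not part of stb_image: a standalone sketch of the 5-5-5
   expansion above that takes the already-read 16-bit value as a parameter in
   place of a stbi__context. The name example_rgb555_to_rgb888 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Split a 16-bit pixel into three 5-bit fields (bit 15 is ignored) and scale
// each from 0..31 to 0..255 with c * 255 / 31, so 31 maps exactly to 255.
static void example_rgb555_to_rgb888(unsigned int px, unsigned char out[3])
{
   unsigned int r = (px >> 10) & 31u;
   unsigned int g = (px >>  5) & 31u;
   unsigned int b =  px        & 31u;
   assert(out != NULL);
   out[0] = (unsigned char)((r * 255u) / 31u);
   out[1] = (unsigned char)((g * 255u) / 31u);
   out[2] = (unsigned char)((b * 255u) / 31u);
}
```

   The output is already in R,G,B order, which is why the loader skips its
   later channel swap for 15- and 16-bit sources.
*/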
5098
static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5100 {
5101 // read in the TGA header stuff
5102 int tga_offset = stbi__get8(s);
5103 int tga_indexed = stbi__get8(s);
5104 int tga_image_type = stbi__get8(s);
5105 int tga_is_RLE = 0;
5106 int tga_palette_start = stbi__get16le(s);
5107 int tga_palette_len = stbi__get16le(s);
5108 int tga_palette_bits = stbi__get8(s);
5109 int tga_x_origin = stbi__get16le(s);
5110 int tga_y_origin = stbi__get16le(s);
5111 int tga_width = stbi__get16le(s);
5112 int tga_height = stbi__get16le(s);
5113 int tga_bits_per_pixel = stbi__get8(s);
5114 int tga_comp, tga_rgb16=0;
5115 int tga_inverted = stbi__get8(s);
5116 // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5117 // image data
5118 unsigned char *tga_data;
5119 unsigned char *tga_palette = NULL;
5120 int i, j;
5121 unsigned char raw_data[4];
5122 int RLE_count = 0;
5123 int RLE_repeating = 0;
5124 int read_next_pixel = 1;
5125
   //   do a tiny bit of preprocessing
5127 if ( tga_image_type >= 8 )
5128 {
5129 tga_image_type -= 8;
5130 tga_is_RLE = 1;
5131 }
5132 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5133
5134 // If I'm paletted, then I'll use the number of bits from the palette
5135 if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5136 else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5137
5138 if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5139 return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5140
5141 // tga info
5142 *x = tga_width;
5143 *y = tga_height;
5144 if (comp) *comp = tga_comp;
5145
5146 tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
5147 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5148
5149 // skip to the data's starting position (offset usually = 0)
5150 stbi__skip(s, tga_offset );
5151
5152 if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5153 for (i=0; i < tga_height; ++i) {
5154 int row = tga_inverted ? tga_height -i - 1 : i;
5155 stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5156 stbi__getn(s, tga_row, tga_width * tga_comp);
5157 }
5158 } else {
5159 // do I need to load a palette?
5160 if ( tga_indexed)
5161 {
5162 // any data to skip? (offset usually = 0)
5163 stbi__skip(s, tga_palette_start );
5164 // load the palette
5165 tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_comp );
5166 if (!tga_palette) {
5167 STBI_FREE(tga_data);
5168 return stbi__errpuc("outofmem", "Out of memory");
5169 }
5170 if (tga_rgb16) {
5171 stbi_uc *pal_entry = tga_palette;
5172 STBI_ASSERT(tga_comp == STBI_rgb);
5173 for (i=0; i < tga_palette_len; ++i) {
5174 stbi__tga_read_rgb16(s, pal_entry);
5175 pal_entry += tga_comp;
5176 }
5177 } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5178 STBI_FREE(tga_data);
5179 STBI_FREE(tga_palette);
5180 return stbi__errpuc("bad palette", "Corrupt TGA");
5181 }
5182 }
5183 // load the data
5184 for (i=0; i < tga_width * tga_height; ++i)
5185 {
         //   if I'm in RLE mode, do I need to get an RLE packet?
5187 if ( tga_is_RLE )
5188 {
5189 if ( RLE_count == 0 )
5190 {
5191 // yep, get the next byte as a RLE command
5192 int RLE_cmd = stbi__get8(s);
5193 RLE_count = 1 + (RLE_cmd & 127);
5194 RLE_repeating = RLE_cmd >> 7;
5195 read_next_pixel = 1;
5196 } else if ( !RLE_repeating )
5197 {
5198 read_next_pixel = 1;
5199 }
5200 } else
5201 {
5202 read_next_pixel = 1;
5203 }
5204 // OK, if I need to read a pixel, do it now
5205 if ( read_next_pixel )
5206 {
5207 // load however much data we did have
5208 if ( tga_indexed )
5209 {
5210 // read in index, then perform the lookup
5211 int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5212 if ( pal_idx >= tga_palette_len ) {
5213 // invalid index
5214 pal_idx = 0;
5215 }
5216 pal_idx *= tga_comp;
5217 for (j = 0; j < tga_comp; ++j) {
5218 raw_data[j] = tga_palette[pal_idx+j];
5219 }
5220 } else if(tga_rgb16) {
5221 STBI_ASSERT(tga_comp == STBI_rgb);
5222 stbi__tga_read_rgb16(s, raw_data);
5223 } else {
5224 // read in the data raw
5225 for (j = 0; j < tga_comp; ++j) {
5226 raw_data[j] = stbi__get8(s);
5227 }
5228 }
5229 // clear the reading flag for the next pixel
5230 read_next_pixel = 0;
5231 } // end of reading a pixel
5232
5233 // copy data
5234 for (j = 0; j < tga_comp; ++j)
5235 tga_data[i*tga_comp+j] = raw_data[j];
5236
5237 // in case we're in RLE mode, keep counting down
5238 --RLE_count;
5239 }
5240 // do I need to invert the image?
5241 if ( tga_inverted )
5242 {
5243 for (j = 0; j*2 < tga_height; ++j)
5244 {
5245 int index1 = j * tga_width * tga_comp;
5246 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5247 for (i = tga_width * tga_comp; i > 0; --i)
5248 {
5249 unsigned char temp = tga_data[index1];
5250 tga_data[index1] = tga_data[index2];
5251 tga_data[index2] = temp;
5252 ++index1;
5253 ++index2;
5254 }
5255 }
5256 }
5257 // clear my palette, if I had one
5258 if ( tga_palette != NULL )
5259 {
5260 STBI_FREE( tga_palette );
5261 }
5262 }
5263
5264 // swap RGB - if the source data was RGB16, it already is in the right order
5265 if (tga_comp >= 3 && !tga_rgb16)
5266 {
5267 unsigned char* tga_pixel = tga_data;
5268 for (i=0; i < tga_width * tga_height; ++i)
5269 {
5270 unsigned char temp = tga_pixel[0];
5271 tga_pixel[0] = tga_pixel[2];
5272 tga_pixel[2] = temp;
5273 tga_pixel += tga_comp;
5274 }
5275 }
5276
5277 // convert to target component count
5278 if (req_comp && req_comp != tga_comp)
5279 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5280
5281 // the things I do to get rid of an error message, and yet keep
5282 // Microsoft's C compilers happy... [8^(
5283 tga_palette_start = tga_palette_len = tga_palette_bits =
5284 tga_x_origin = tga_y_origin = 0;
5285 // OK, done
5286 return tga_data;
5287 }
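/*
   Illustration only, not part of stb_image: a standalone sketch of the RLE
   packet scheme the loader above decodes, reduced to 1-byte pixels read from a
   memory buffer. The name example_tga_rle_decode and its buffer-based
   interface are hypothetical; the real loader reads through a stbi__context
   and handles multi-byte and paletted pixels.

```c
#include <assert.h>
#include <stddef.h>

// Each packet starts with a command byte: the low 7 bits hold (count - 1) and
// the top bit selects a run packet (one pixel repeated) or a raw packet
// (count literal pixels). Returns the number of pixels written to 'out'.
static size_t example_tga_rle_decode(const unsigned char *in, size_t in_len, unsigned char *out, size_t out_cap)
{
   size_t ip = 0, op = 0;
   assert(in != NULL && out != NULL);
   while (ip < in_len && op < out_cap) {
      unsigned char cmd = in[ip++];
      size_t count = 1u + (cmd & 127u);
      size_t k;
      if (cmd >> 7) {
         unsigned char px;
         if (ip >= in_len) break;          // truncated input: stop cleanly
         px = in[ip++];
         for (k = 0; k < count && op < out_cap; ++k) out[op++] = px;
      } else {
         for (k = 0; k < count && op < out_cap && ip < in_len; ++k) out[op++] = in[ip++];
      }
   }
   return op;
}
```

   The bytes 0x82 0xAA decode to three copies of 0xAA, and 0x01 0x10 0x20
   decode to the two literal pixels 0x10 and 0x20.
*/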
5288 #endif
5289
5290 // *************************************************************************************************
5291 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5292
5293 #ifndef STBI_NO_PSD
static int stbi__psd_test(stbi__context *s)
{
   int r = (stbi__get32be(s) == 0x38425053);
   stbi__rewind(s);
   return r;
}
5300
static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5302 {
5303 int pixelCount;
5304 int channelCount, compression;
5305 int channel, i, count, len;
5306 int bitdepth;
5307 int w,h;
5308 stbi_uc *out;
5309
5310 // Check identifier
5311 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5312 return stbi__errpuc("not PSD", "Corrupt PSD image");
5313
5314 // Check file type version.
5315 if (stbi__get16be(s) != 1)
5316 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5317
5318 // Skip 6 reserved bytes.
5319 stbi__skip(s, 6 );
5320
5321 // Read the number of channels (R, G, B, A, etc).
5322 channelCount = stbi__get16be(s);
5323 if (channelCount < 0 || channelCount > 16)
5324 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5325
5326 // Read the rows and columns of the image.
5327 h = stbi__get32be(s);
5328 w = stbi__get32be(s);
5329
5330 // Make sure the depth is 8 or 16 bits.
5331 bitdepth = stbi__get16be(s);
5332 if (bitdepth != 8 && bitdepth != 16)
5333 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
5334
5335 // Make sure the color mode is RGB.
5336 // Valid options are:
5337 // 0: Bitmap
5338 // 1: Grayscale
5339 // 2: Indexed color
5340 // 3: RGB color
5341 // 4: CMYK color
5342 // 7: Multichannel
5343 // 8: Duotone
5344 // 9: Lab color
5345 if (stbi__get16be(s) != 3)
5346 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5347
5348 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5349 stbi__skip(s,stbi__get32be(s) );
5350
5351 // Skip the image resources. (resolution, pen tool paths, etc)
5352 stbi__skip(s, stbi__get32be(s) );
5353
5354 // Skip the reserved data.
5355 stbi__skip(s, stbi__get32be(s) );
5356
5357 // Find out if the data is compressed.
5358 // Known values:
5359 // 0: no compression
5360 // 1: RLE compressed
5361 compression = stbi__get16be(s);
5362 if (compression > 1)
5363 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5364
5365 // Create the destination image.
5366 out = (stbi_uc *) stbi__malloc(4 * w*h);
5367 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5368 pixelCount = w*h;
5369
5370 // Initialize the data to zero.
5371 //memset( out, 0, pixelCount * 4 );
5372
5373 // Finally, the image data.
5374 if (compression) {
5375 // RLE as used by .PSD and .TIFF
5376 // Loop until you get the number of unpacked bytes you are expecting:
5377 // Read the next source byte into n.
5378 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5379 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5380 // Else if n is -128 (byte value 128), noop.
5381 // Endloop
5382
5383 // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
5384 // which we're going to just skip.
5385 stbi__skip(s, h * channelCount * 2 );
5386
5387 // Read the RLE data by channel.
5388 for (channel = 0; channel < 4; channel++) {
5389 stbi_uc *p;
5390
5391 p = out+channel;
5392 if (channel >= channelCount) {
5393 // Fill this channel with default data.
5394 for (i = 0; i < pixelCount; i++, p += 4)
5395 *p = (channel == 3 ? 255 : 0);
5396 } else {
5397 // Read the RLE data.
5398 count = 0;
5399 while (count < pixelCount) {
5400 len = stbi__get8(s);
5401 if (len == 128) {
5402 // No-op.
5403 } else if (len < 128) {
5404 // Copy next len+1 bytes literally.
5405 len++;
5406 count += len;
5407 while (len) {
5408 *p = stbi__get8(s);
5409 p += 4;
5410 len--;
5411 }
5412 } else if (len > 128) {
5413 stbi_uc val;
5414 // Next -len+1 bytes in the dest are replicated from next source byte.
5415 // (Interpret len as a negative 8-bit int.)
5416 len ^= 0x0FF;
5417 len += 2;
5418 val = stbi__get8(s);
5419 count += len;
5420 while (len) {
5421 *p = val;
5422 p += 4;
5423 len--;
5424 }
5425 }
5426 }
5427 }
5428 }
5429
5430 } else {
5431 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5432 // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
5433
5434 // Read the data by channel.
5435 for (channel = 0; channel < 4; channel++) {
5436 stbi_uc *p;
5437
5438 p = out + channel;
5439 if (channel >= channelCount) {
5440 // Fill this channel with default data.
5441 stbi_uc val = channel == 3 ? 255 : 0;
5442 for (i = 0; i < pixelCount; i++, p += 4)
5443 *p = val;
5444 } else {
5445 // Read the data.
5446 if (bitdepth == 16) {
5447 for (i = 0; i < pixelCount; i++, p += 4)
5448 *p = (stbi_uc) (stbi__get16be(s) >> 8);
5449 } else {
5450 for (i = 0; i < pixelCount; i++, p += 4)
5451 *p = stbi__get8(s);
5452 }
5453 }
5454 }
5455 }
5456
5457 if (channelCount >= 4) {
5458 for (i=0; i < w*h; ++i) {
5459 unsigned char *pixel = out + 4*i;
5460 if (pixel[3] != 0 && pixel[3] != 255) {
5461 // remove weird white matte from PSD
5462 float a = pixel[3] / 255.0f;
5463 float ra = 1.0f / a;
5464 float inv_a = 255.0f * (1 - ra);
5465 pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
5466 pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
5467 pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
5468 }
5469 }
5470 }
5471
5472 if (req_comp && req_comp != 4) {
5473 out = stbi__convert_format(out, 4, req_comp, w, h);
5474 if (out == NULL) return out; // stbi__convert_format frees input on failure
5475 }
5476
5477 if (comp) *comp = 4;
5478 *y = h;
5479 *x = w;
5480
5481 return out;
5482 }
5483 #endif
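/*
   Illustrative sketch, not part of stb_image: the PackBits-style RLE scheme
   described in the comments of stbi__psd_load can be written as a standalone
   decoder over memory buffers. The name packbits_decode and its signature
   are hypothetical. Unlike the loader above, this sketch checks every run
   against both buffer ends, which is an added assumption about how one
   might harden the loop rather than a description of the library.

```c
#include <stddef.h>

// Decode PackBits-style RLE from src into dst until dst is full.
// Returns the number of bytes written, or (size_t)-1 if the input is
// truncated or a run would overflow dst.
static size_t packbits_decode(const unsigned char *src, size_t src_len,
                              unsigned char *dst, size_t dst_len)
{
   size_t in = 0, out = 0;
   while (out < dst_len) {
      unsigned int n, run;
      if (in >= src_len) return (size_t) -1;
      n = src[in++];
      if (n == 128) {
         continue;                       // no-op control byte
      } else if (n < 128) {
         run = n + 1;                    // literal run: copy n+1 bytes
         if (run > src_len - in || run > dst_len - out) return (size_t) -1;
         while (run--) dst[out++] = src[in++];
      } else {
         run = 257 - n;                  // repeat run: 2..128 copies of one byte
         if (in >= src_len || run > dst_len - out) return (size_t) -1;
         while (run--) dst[out++] = src[in];
         ++in;
      }
   }
   return out;
}
```

   The repeat length 257 - n equals the (n ^ 0xFF) + 2 computed in the
   loader: both map byte 255 to 2 copies and byte 129 to 128 copies.
*/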
5484
5485 // *************************************************************************************************
5486 // Softimage PIC loader
5487 // by Tom Seddon
5488 //
5489 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5490 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5491
5492 #ifndef STBI_NO_PIC
5493 static int stbi__pic_is4(stbi__context *s,const char *str)
5494 {
5495 int i;
5496 for (i=0; i<4; ++i)
5497 if (stbi__get8(s) != (stbi_uc)str[i])
5498 return 0;
5499
5500 return 1;
5501 }
5502
5503 static int stbi__pic_test_core(stbi__context *s)
5504 {
5505 int i;
5506
5507 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5508 return 0;
5509
5510 for(i=0;i<84;++i)
5511 stbi__get8(s);
5512
5513 if (!stbi__pic_is4(s,"PICT"))
5514 return 0;
5515
5516 return 1;
5517 }
5518
5519 typedef struct
5520 {
5521 stbi_uc size,type,channel;
5522 } stbi__pic_packet;
5523
5524 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5525 {
5526 int mask=0x80, i;
5527
5528 for (i=0; i<4; ++i, mask>>=1) {
5529 if (channel & mask) {
5530 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5531 dest[i]=stbi__get8(s);
5532 }
5533 }
5534
5535 return dest;
5536 }
5537
5538 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5539 {
5540 int mask=0x80,i;
5541
5542 for (i=0;i<4; ++i, mask>>=1)
5543 if (channel&mask)
5544 dest[i]=src[i];
5545 }
5546
5547 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5548 {
5549 int act_comp=0,num_packets=0,y,chained;
5550 stbi__pic_packet packets[10];
5551
5552 // this will (should...) cater for even some bizarre stuff like having data
5553 // for the same channel in multiple packets.
5554 do {
5555 stbi__pic_packet *packet;
5556
5557 if (num_packets==sizeof(packets)/sizeof(packets[0]))
5558 return stbi__errpuc("bad format","too many packets");
5559
5560 packet = &packets[num_packets++];
5561
5562 chained = stbi__get8(s);
5563 packet->size = stbi__get8(s);
5564 packet->type = stbi__get8(s);
5565 packet->channel = stbi__get8(s);
5566
5567 act_comp |= packet->channel;
5568
5569 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5570 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5571 } while (chained);
5572
5573 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5574
5575 for(y=0; y<height; ++y) {
5576 int packet_idx;
5577
5578 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5579 stbi__pic_packet *packet = &packets[packet_idx];
5580 stbi_uc *dest = result+y*width*4;
5581
5582 switch (packet->type) {
5583 default:
5584 return stbi__errpuc("bad format","packet has bad compression type");
5585
5586 case 0: {//uncompressed
5587 int x;
5588
5589 for(x=0;x<width;++x, dest+=4)
5590 if (!stbi__readval(s,packet->channel,dest))
5591 return 0;
5592 break;
5593 }
5594
5595 case 1://Pure RLE
5596 {
5597 int left=width, i;
5598
5599 while (left>0) {
5600 stbi_uc count,value[4];
5601
5602 count=stbi__get8(s);
5603 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5604
5605 if (count > left)
5606 count = (stbi_uc) left;
5607
5608 if (!stbi__readval(s,packet->channel,value)) return 0;
5609
5610 for(i=0; i<count; ++i,dest+=4)
5611 stbi__copyval(packet->channel,dest,value);
5612 left -= count;
5613 }
5614 }
5615 break;
5616
5617 case 2: {//Mixed RLE
5618 int left=width;
5619 while (left>0) {
5620 int count = stbi__get8(s), i;
5621 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5622
5623 if (count >= 128) { // Repeated
5624 stbi_uc value[4];
5625
5626 if (count==128)
5627 count = stbi__get16be(s);
5628 else
5629 count -= 127;
5630 if (count > left)
5631 return stbi__errpuc("bad file","scanline overrun");
5632
5633 if (!stbi__readval(s,packet->channel,value))
5634 return 0;
5635
5636 for(i=0;i<count;++i, dest += 4)
5637 stbi__copyval(packet->channel,dest,value);
5638 } else { // Raw
5639 ++count;
5640 if (count>left) return stbi__errpuc("bad file","scanline overrun");
5641
5642 for(i=0;i<count;++i, dest+=4)
5643 if (!stbi__readval(s,packet->channel,dest))
5644 return 0;
5645 }
5646 left-=count;
5647 }
5648 break;
5649 }
5650 }
5651 }
5652 }
5653
5654 return result;
5655 }
5656
5657 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5658 {
5659 stbi_uc *result;
5660 int i, x,y;
5661
5662 for (i=0; i<92; ++i)
5663 stbi__get8(s);
5664
5665 x = stbi__get16be(s);
5666 y = stbi__get16be(s);
5667 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
5668 if (x != 0 && (1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5669
5670 stbi__get32be(s); //skip `ratio'
5671 stbi__get16be(s); //skip `fields'
5672 stbi__get16be(s); //skip `pad'
5673
5674 // intermediate buffer is RGBA
5675 result = (stbi_uc *) stbi__malloc(x*y*4);
5676 memset(result, 0xff, x*y*4);
5677
5678 if (!stbi__pic_load_core(s,x,y,comp, result)) {
5679 STBI_FREE(result);
5680 result=0;
5681 }
5682 *px = x;
5683 *py = y;
5684 if (req_comp == 0) req_comp = *comp;
5685 if (result) result=stbi__convert_format(result,4,req_comp,x,y);
5686
5687 return result;
5688 }
5689
5690 static int stbi__pic_test(stbi__context *s)
5691 {
5692 int r = stbi__pic_test_core(s);
5693 stbi__rewind(s);
5694 return r;
5695 }
5696 #endif
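/*
   Illustrative sketch, not part of stb_image: the PIC loader above treats
   the packet channel byte as a bit mask over an RGBA pixel, with 0x80
   selecting byte 0 (red), 0x40 green, 0x20 blue and 0x10 alpha. The helper
   below restates stbi__copyval with hypothetical names so the mask walk can
   be tried in isolation.

```c
// Copy only the components of src selected by channel_mask into dest.
// Bit 0x80 maps to index 0, 0x40 to index 1, 0x20 to index 2, 0x10 to index 3.
static void copy_masked_channels(unsigned int channel_mask,
                                 unsigned char dest[4],
                                 const unsigned char src[4])
{
   unsigned int mask = 0x80;
   int i;
   for (i = 0; i < 4; ++i, mask >>= 1)
      if (channel_mask & mask)
         dest[i] = src[i];
}
```

   A packet with channel 0xE0 therefore writes red, green and blue and leaves
   the alpha byte at whatever the buffer already held, which is why the
   loader pre-fills its RGBA buffer with 0xff.
*/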
5697
5698 // *************************************************************************************************
5699 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5700
5701 #ifndef STBI_NO_GIF
5702 typedef struct
5703 {
5704 stbi__int16 prefix;
5705 stbi_uc first;
5706 stbi_uc suffix;
5707 } stbi__gif_lzw;
5708
5709 typedef struct
5710 {
5711 int w,h;
5712 stbi_uc *out, *old_out; // output buffer (always 4 components)
5713 int flags, bgindex, ratio, transparent, eflags, delay;
5714 stbi_uc pal[256][4];
5715 stbi_uc lpal[256][4];
5716 stbi__gif_lzw codes[4096];
5717 stbi_uc *color_table;
5718 int parse, step;
5719 int lflags;
5720 int start_x, start_y;
5721 int max_x, max_y;
5722 int cur_x, cur_y;
5723 int line_size;
5724 } stbi__gif;
5725
5726 static int stbi__gif_test_raw(stbi__context *s)
5727 {
5728 int sz;
5729 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5730 sz = stbi__get8(s);
5731 if (sz != '9' && sz != '7') return 0;
5732 if (stbi__get8(s) != 'a') return 0;
5733 return 1;
5734 }
5735
5736 static int stbi__gif_test(stbi__context *s)
5737 {
5738 int r = stbi__gif_test_raw(s);
5739 stbi__rewind(s);
5740 return r;
5741 }
5742
5743 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5744 {
5745 int i;
5746 for (i=0; i < num_entries; ++i) {
5747 pal[i][2] = stbi__get8(s);
5748 pal[i][1] = stbi__get8(s);
5749 pal[i][0] = stbi__get8(s);
5750 pal[i][3] = transp == i ? 0 : 255;
5751 }
5752 }
5753
5754 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5755 {
5756 stbi_uc version;
5757 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5758 return stbi__err("not GIF", "Corrupt GIF");
5759
5760 version = stbi__get8(s);
5761 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5762 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5763
5764 stbi__g_failure_reason = "";
5765 g->w = stbi__get16le(s);
5766 g->h = stbi__get16le(s);
5767 g->flags = stbi__get8(s);
5768 g->bgindex = stbi__get8(s);
5769 g->ratio = stbi__get8(s);
5770 g->transparent = -1;
5771
5772 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5773
5774 if (is_info) return 1;
5775
5776 if (g->flags & 0x80)
5777 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5778
5779 return 1;
5780 }
5781
5782 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5783 {
5784 stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
5785 if (!stbi__gif_header(s, g, comp, 1)) {
5786 STBI_FREE(g);
5787 stbi__rewind( s );
5788 return 0;
5789 }
5790 if (x) *x = g->w;
5791 if (y) *y = g->h;
5792 STBI_FREE(g);
5793 return 1;
5794 }
5795
5796 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5797 {
5798 stbi_uc *p, *c;
5799
5800 // recurse to decode the prefixes, since the linked-list is backwards,
5801 // and working backwards through an interleaved image would be nasty
5802 if (g->codes[code].prefix >= 0)
5803 stbi__out_gif_code(g, g->codes[code].prefix);
5804
5805 if (g->cur_y >= g->max_y) return;
5806
5807 p = &g->out[g->cur_x + g->cur_y];
5808 c = &g->color_table[g->codes[code].suffix * 4];
5809
5810 if (c[3] >= 128) {
5811 p[0] = c[2];
5812 p[1] = c[1];
5813 p[2] = c[0];
5814 p[3] = c[3];
5815 }
5816 g->cur_x += 4;
5817
5818 if (g->cur_x >= g->max_x) {
5819 g->cur_x = g->start_x;
5820 g->cur_y += g->step;
5821
5822 while (g->cur_y >= g->max_y && g->parse > 0) {
5823 g->step = (1 << g->parse) * g->line_size;
5824 g->cur_y = g->start_y + (g->step >> 1);
5825 --g->parse;
5826 }
5827 }
5828 }
5829
5830 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5831 {
5832 stbi_uc lzw_cs;
5833 stbi__int32 len, init_code;
5834 stbi__uint32 first;
5835 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5836 stbi__gif_lzw *p;
5837
5838 lzw_cs = stbi__get8(s);
5839 if (lzw_cs > 12) return NULL;
5840 clear = 1 << lzw_cs;
5841 first = 1;
5842 codesize = lzw_cs + 1;
5843 codemask = (1 << codesize) - 1;
5844 bits = 0;
5845 valid_bits = 0;
5846 for (init_code = 0; init_code < clear; init_code++) {
5847 g->codes[init_code].prefix = -1;
5848 g->codes[init_code].first = (stbi_uc) init_code;
5849 g->codes[init_code].suffix = (stbi_uc) init_code;
5850 }
5851
5852 // support no starting clear code
5853 avail = clear+2;
5854 oldcode = -1;
5855
5856 len = 0;
5857 for(;;) {
5858 if (valid_bits < codesize) {
5859 if (len == 0) {
5860 len = stbi__get8(s); // start new block
5861 if (len == 0)
5862 return g->out;
5863 }
5864 --len;
5865 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5866 valid_bits += 8;
5867 } else {
5868 stbi__int32 code = bits & codemask;
5869 bits >>= codesize;
5870 valid_bits -= codesize;
5871 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5872 if (code == clear) { // clear code
5873 codesize = lzw_cs + 1;
5874 codemask = (1 << codesize) - 1;
5875 avail = clear + 2;
5876 oldcode = -1;
5877 first = 0;
5878 } else if (code == clear + 1) { // end of stream code
5879 stbi__skip(s, len);
5880 while ((len = stbi__get8(s)) > 0)
5881 stbi__skip(s,len);
5882 return g->out;
5883 } else if (code <= avail) {
5884 if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5885
5886 if (oldcode >= 0) {
5887 p = &g->codes[avail++];
5888 if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5889 p->prefix = (stbi__int16) oldcode;
5890 p->first = g->codes[oldcode].first;
5891 p->suffix = (code == avail) ? p->first : g->codes[code].first;
5892 } else if (code == avail)
5893 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5894
5895 stbi__out_gif_code(g, (stbi__uint16) code);
5896
5897 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5898 codesize++;
5899 codemask = (1 << codesize) - 1;
5900 }
5901
5902 oldcode = code;
5903 } else {
5904 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5905 }
5906 }
5907 }
5908 }
5909
5910 static void stbi__fill_gif_background(stbi__gif *g, int x0, int y0, int x1, int y1)
5911 {
5912 int x, y;
5913 stbi_uc *c = g->pal[g->bgindex];
5914 for (y = y0; y < y1; y += 4 * g->w) {
5915 for (x = x0; x < x1; x += 4) {
5916 stbi_uc *p = &g->out[y + x];
5917 p[0] = c[2];
5918 p[1] = c[1];
5919 p[2] = c[0];
5920 p[3] = 0;
5921 }
5922 }
5923 }
5924
5925 // this function is designed to support animated gifs, although stb_image doesn't support it
5926 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5927 {
5928 int i;
5929 stbi_uc *prev_out = 0;
5930
5931 if (g->out == 0 && !stbi__gif_header(s, g, comp,0))
5932 return 0; // stbi__g_failure_reason set by stbi__gif_header
5933
5934 prev_out = g->out;
5935 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5936 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5937
5938 switch ((g->eflags & 0x1C) >> 2) {
5939 case 0: // unspecified (also always used on 1st frame)
5940 stbi__fill_gif_background(g, 0, 0, 4 * g->w, 4 * g->w * g->h);
5941 break;
5942 case 1: // do not dispose
5943 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5944 g->old_out = prev_out;
5945 break;
5946 case 2: // dispose to background
5947 if (prev_out) memcpy(g->out, prev_out, 4 * g->w * g->h);
5948 stbi__fill_gif_background(g, g->start_x, g->start_y, g->max_x, g->max_y);
5949 break;
5950 case 3: // dispose to previous
5951 if (g->old_out) {
5952 for (i = g->start_y; i < g->max_y; i += 4 * g->w)
5953 memcpy(&g->out[i + g->start_x], &g->old_out[i + g->start_x], g->max_x - g->start_x);
5954 }
5955 break;
5956 }
5957
5958 for (;;) {
5959 switch (stbi__get8(s)) {
5960 case 0x2C: /* Image Descriptor */
5961 {
5962 int prev_trans = -1;
5963 stbi__int32 x, y, w, h;
5964 stbi_uc *o;
5965
5966 x = stbi__get16le(s);
5967 y = stbi__get16le(s);
5968 w = stbi__get16le(s);
5969 h = stbi__get16le(s);
5970 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5971 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5972
5973 g->line_size = g->w * 4;
5974 g->start_x = x * 4;
5975 g->start_y = y * g->line_size;
5976 g->max_x = g->start_x + w * 4;
5977 g->max_y = g->start_y + h * g->line_size;
5978 g->cur_x = g->start_x;
5979 g->cur_y = g->start_y;
5980
5981 g->lflags = stbi__get8(s);
5982
5983 if (g->lflags & 0x40) {
5984 g->step = 8 * g->line_size; // first interlaced spacing
5985 g->parse = 3;
5986 } else {
5987 g->step = g->line_size;
5988 g->parse = 0;
5989 }
5990
5991 if (g->lflags & 0x80) {
5992 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5993 g->color_table = (stbi_uc *) g->lpal;
5994 } else if (g->flags & 0x80) {
5995 if (g->transparent >= 0 && (g->eflags & 0x01)) {
5996 prev_trans = g->pal[g->transparent][3];
5997 g->pal[g->transparent][3] = 0;
5998 }
5999 g->color_table = (stbi_uc *) g->pal;
6000 } else
6001 return stbi__errpuc("missing color table", "Corrupt GIF");
6002
6003 o = stbi__process_gif_raster(s, g);
6004 if (o == NULL) return NULL;
6005
6006 if (prev_trans != -1)
6007 g->pal[g->transparent][3] = (stbi_uc) prev_trans;
6008
6009 return o;
6010 }
6011
6012 case 0x21: // Extension block (Graphic Control, Comment, etc.).
6013 {
6014 int len;
6015 if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
6016 len = stbi__get8(s);
6017 if (len == 4) {
6018 g->eflags = stbi__get8(s);
6019 g->delay = stbi__get16le(s);
6020 g->transparent = stbi__get8(s);
6021 } else {
6022 stbi__skip(s, len);
6023 break;
6024 }
6025 }
6026 while ((len = stbi__get8(s)) != 0)
6027 stbi__skip(s, len);
6028 break;
6029 }
6030
6031 case 0x3B: // gif stream termination code
6032 return (stbi_uc *) s; // using '1' causes warning on some compilers
6033
6034 default:
6035 return stbi__errpuc("unknown code", "Corrupt GIF");
6036 }
6037 }
6038
6039 STBI_NOTUSED(req_comp);
6040 }
6041
6042 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6043 {
6044 stbi_uc *u = 0;
6045 stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
6046 memset(g, 0, sizeof(*g));
6047
6048 u = stbi__gif_load_next(s, g, comp, req_comp);
6049 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
6050 if (u) {
6051 *x = g->w;
6052 *y = g->h;
6053 if (req_comp && req_comp != 4)
6054 u = stbi__convert_format(u, 4, req_comp, g->w, g->h);
6055 }
6056 else if (g->out)
6057 STBI_FREE(g->out);
6058 STBI_FREE(g);
6059 return u;
6060 }
6061
6062 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
6063 {
6064 return stbi__gif_info_raw(s,x,y,comp);
6065 }
6066 #endif
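/*
   Illustrative sketch, not part of stb_image: the interlace handling in
   stbi__out_gif_code (step, parse, cur_y) works in byte offsets. Restated
   in plain row indices it yields the order in which stored rows land in the
   image. The name gif_interlace_order is hypothetical, and the sketch
   assumes the frame starts at row 0.

```c
// Fill order[0..height-1] with the destination row of each stored row of an
// interlaced GIF frame. Passes: every 8th row from 0, every 8th from 4,
// every 4th from 2, every 2nd from 1.
static void gif_interlace_order(int height, int *order)
{
   int step = 8, parse = 3, row = 0, n;
   for (n = 0; n < height; ++n) {
      // current pass exhausted: halve the spacing and restart at step/2
      while (row >= height && parse > 0) {
         step = 1 << parse;
         row = step >> 1;
         --parse;
      }
      order[n] = row;
      row += step;
   }
}
```

   For a 10-row frame this produces rows 0, 8, 4, 2, 6, 1, 3, 5, 7, 9.
*/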
6067
6068 // *************************************************************************************************
6069 // Radiance RGBE HDR loader
6070 // originally by Nicolas Schulz
6071 #ifndef STBI_NO_HDR
6072 static int stbi__hdr_test_core(stbi__context *s)
6073 {
6074 const char *signature = "#?RADIANCE\n";
6075 int i;
6076 for (i=0; signature[i]; ++i)
6077 if (stbi__get8(s) != signature[i])
6078 return 0;
6079 return 1;
6080 }
6081
6082 static int stbi__hdr_test(stbi__context* s)
6083 {
6084 int r = stbi__hdr_test_core(s);
6085 stbi__rewind(s);
6086 return r;
6087 }
6088
6089 #define STBI__HDR_BUFLEN 1024
6090 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
6091 {
6092 int len=0;
6093 char c = '\0';
6094
6095 c = (char) stbi__get8(z);
6096
6097 while (!stbi__at_eof(z) && c != '\n') {
6098 buffer[len++] = c;
6099 if (len == STBI__HDR_BUFLEN-1) {
6100 // flush to end of line
6101 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
6102 ;
6103 break;
6104 }
6105 c = (char) stbi__get8(z);
6106 }
6107
6108 buffer[len] = 0;
6109 return buffer;
6110 }
6111
6112 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
6113 {
6114 if ( input[3] != 0 ) {
6115 float f1;
6116 // Exponent
6117 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
6118 if (req_comp <= 2)
6119 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
6120 else {
6121 output[0] = input[0] * f1;
6122 output[1] = input[1] * f1;
6123 output[2] = input[2] * f1;
6124 }
6125 if (req_comp == 2) output[1] = 1;
6126 if (req_comp == 4) output[3] = 1;
6127 } else {
6128 switch (req_comp) {
6129 case 4: output[3] = 1; /* fallthrough */
6130 case 3: output[0] = output[1] = output[2] = 0;
6131 break;
6132 case 2: output[1] = 1; /* fallthrough */
6133 case 1: output[0] = 0;
6134 break;
6135 }
6136 }
6137 }
6138
6139 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6140 {
6141 char buffer[STBI__HDR_BUFLEN];
6142 char *token;
6143 int valid = 0;
6144 int width, height;
6145 stbi_uc *scanline;
6146 float *hdr_data;
6147 int len;
6148 unsigned char count, value;
6149 int i, j, k, c1,c2, z;
6150
6151
6152 // Check identifier
6153 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
6154 return stbi__errpf("not HDR", "Corrupt HDR image");
6155
6156 // Parse header
6157 for(;;) {
6158 token = stbi__hdr_gettoken(s,buffer);
6159 if (token[0] == 0) break;
6160 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6161 }
6162
6163 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
6164
6165 // Parse width and height
6166 // can't use sscanf() if we're not using stdio!
6167 token = stbi__hdr_gettoken(s,buffer);
6168 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6169 token += 3;
6170 height = (int) strtol(token, &token, 10);
6171 while (*token == ' ') ++token;
6172 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
6173 token += 3;
6174 width = (int) strtol(token, NULL, 10);
6175
6176 *x = width;
6177 *y = height;
6178
6179 if (comp) *comp = 3;
6180 if (req_comp == 0) req_comp = 3;
6181
6182 // Read data
6183 hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
6184
6185 // Load image data
6186 // image data is stored as some number of sca
6187 if ( width < 8 || width >= 32768) {
6188 // Read flat data
6189 for (j=0; j < height; ++j) {
6190 for (i=0; i < width; ++i) {
6191 stbi_uc rgbe[4];
6192 main_decode_loop:
6193 stbi__getn(s, rgbe, 4);
6194 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
6195 }
6196 }
6197 } else {
6198 // Read RLE-encoded data
6199 scanline = NULL;
6200
6201 for (j = 0; j < height; ++j) {
6202 c1 = stbi__get8(s);
6203 c2 = stbi__get8(s);
6204 len = stbi__get8(s);
6205 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
6206 // not run-length encoded, so we have to actually use THIS data as a decoded
6207 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
6208 stbi_uc rgbe[4];
6209 rgbe[0] = (stbi_uc) c1;
6210 rgbe[1] = (stbi_uc) c2;
6211 rgbe[2] = (stbi_uc) len;
6212 rgbe[3] = (stbi_uc) stbi__get8(s);
6213 stbi__hdr_convert(hdr_data, rgbe, req_comp);
6214 i = 1;
6215 j = 0;
6216 STBI_FREE(scanline);
6217 goto main_decode_loop; // yes, this makes no sense
6218 }
6219 len <<= 8;
6220 len |= stbi__get8(s);
6221 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
6222 if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
6223
6224 for (k = 0; k < 4; ++k) {
6225 i = 0;
6226 while (i < width) {
6227 count = stbi__get8(s);
6228 if (count > 128) {
6229 // Run
6230 value = stbi__get8(s);
6231 count -= 128;
6232 for (z = 0; z < count; ++z)
6233 scanline[i++ * 4 + k] = value;
6234 } else {
6235 // Dump
6236 for (z = 0; z < count; ++z)
6237 scanline[i++ * 4 + k] = stbi__get8(s);
6238 }
6239 }
6240 }
6241 for (i=0; i < width; ++i)
6242 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
6243 }
6244 STBI_FREE(scanline);
6245 }
6246
6247 return hdr_data;
6248 }
6249
6250 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
6251 {
6252 char buffer[STBI__HDR_BUFLEN];
6253 char *token;
6254 int valid = 0;
6255
6256 if (stbi__hdr_test(s) == 0) {
6257 stbi__rewind( s );
6258 return 0;
6259 }
6260
6261 for(;;) {
6262 token = stbi__hdr_gettoken(s,buffer);
6263 if (token[0] == 0) break;
6264 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6265 }
6266
6267 if (!valid) {
6268 stbi__rewind( s );
6269 return 0;
6270 }
6271 token = stbi__hdr_gettoken(s,buffer);
6272 if (strncmp(token, "-Y ", 3)) {
6273 stbi__rewind( s );
6274 return 0;
6275 }
6276 token += 3;
6277 *y = (int) strtol(token, &token, 10);
6278 while (*token == ' ') ++token;
6279 if (strncmp(token, "+X ", 3)) {
6280 stbi__rewind( s );
6281 return 0;
6282 }
6283 token += 3;
6284 *x = (int) strtol(token, NULL, 10);
6285 *comp = 3;
6286 return 1;
6287 }
6288 #endif // STBI_NO_HDR
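/*
   Illustrative sketch, not part of stb_image: the three-component path of
   stbi__hdr_convert scales each 8-bit mantissa by 2^(E - 136), where E is
   the shared exponent byte (bias 128, plus 8 to divide the mantissa by
   256). The helper names below are hypothetical, and the power of two is
   built with a loop instead of ldexp purely so the sketch has no math
   library dependency.

```c
// Return 2^e as a float using repeated multiplication (exact for powers of two).
static float pow2_int(int e)
{
   float f = 1.0f;
   while (e > 0) { f *= 2.0f; --e; }
   while (e < 0) { f *= 0.5f; ++e; }
   return f;
}

// Convert one RGBE pixel to linear RGB floats. An exponent byte of 0 means black.
static void rgbe_to_float(const unsigned char rgbe[4], float rgb[3])
{
   if (rgbe[3] != 0) {
      float scale = pow2_int((int) rgbe[3] - (128 + 8));
      rgb[0] = rgbe[0] * scale;
      rgb[1] = rgbe[1] * scale;
      rgb[2] = rgbe[2] * scale;
   } else {
      rgb[0] = rgb[1] = rgb[2] = 0.0f;
   }
}
```

   With the bytes {128, 64, 0, 129} the scale is 2^-7, so the pixel decodes
   to (1.0, 0.5, 0.0).
*/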
6289
6290 #ifndef STBI_NO_BMP
6291 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6292 {
6293 void *p;
6294 stbi__bmp_data info;
6295
6296 info.all_a = 255;
6297 p = stbi__bmp_parse_header(s, &info);
6298 stbi__rewind( s );
6299 if (p == NULL)
6300 return 0;
6301 *x = s->img_x;
6302 *y = s->img_y;
6303 *comp = info.ma ? 4 : 3;
6304 return 1;
6305 }
6306 #endif
6307
6308 #ifndef STBI_NO_PSD
6309 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6310 {
6311 int channelCount;
6312 if (stbi__get32be(s) != 0x38425053) {
6313 stbi__rewind( s );
6314 return 0;
6315 }
6316 if (stbi__get16be(s) != 1) {
6317 stbi__rewind( s );
6318 return 0;
6319 }
6320 stbi__skip(s, 6);
6321 channelCount = stbi__get16be(s);
6322 if (channelCount < 0 || channelCount > 16) {
6323 stbi__rewind( s );
6324 return 0;
6325 }
6326 *y = stbi__get32be(s);
6327 *x = stbi__get32be(s);
6328 if (stbi__get16be(s) != 8) {
6329 stbi__rewind( s );
6330 return 0;
6331 }
6332 if (stbi__get16be(s) != 3) {
6333 stbi__rewind( s );
6334 return 0;
6335 }
6336 *comp = 4;
6337 return 1;
6338 }
6339 #endif
6340
6341 #ifndef STBI_NO_PIC
6342 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6343 {
6344 int act_comp=0,num_packets=0,chained;
6345 stbi__pic_packet packets[10];
6346
6347 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
6348 stbi__rewind(s);
6349 return 0;
6350 }
6351
6352 stbi__skip(s, 88);
6353
6354 *x = stbi__get16be(s);
6355 *y = stbi__get16be(s);
6356 if (stbi__at_eof(s)) {
6357 stbi__rewind( s);
6358 return 0;
6359 }
6360 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6361 stbi__rewind( s );
6362 return 0;
6363 }
6364
6365 stbi__skip(s, 8);
6366
6367 do {
6368 stbi__pic_packet *packet;
6369
6370 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6371 return 0;
6372
6373 packet = &packets[num_packets++];
6374 chained = stbi__get8(s);
6375 packet->size = stbi__get8(s);
6376 packet->type = stbi__get8(s);
6377 packet->channel = stbi__get8(s);
6378 act_comp |= packet->channel;
6379
6380 if (stbi__at_eof(s)) {
6381 stbi__rewind( s );
6382 return 0;
6383 }
6384 if (packet->size != 8) {
6385 stbi__rewind( s );
6386 return 0;
6387 }
6388 } while (chained);
6389
6390 *comp = (act_comp & 0x10 ? 4 : 3);
6391
6392 return 1;
6393 }
6394 #endif
6395
// *************************************************************************************************
// Portable Gray Map and Portable Pixel Map loader
// by Ken Miller
//
// PGM: http://netpbm.sourceforge.net/doc/pgm.html
// PPM: http://netpbm.sourceforge.net/doc/ppm.html
//
// Known limitations:
//    Does not support ASCII image data (formats P2 and P3)
//    Does not support 16-bit-per-channel

#ifndef STBI_NO_PNM

static int stbi__pnm_test(stbi__context *s)
{
   char p, t;
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {
      stbi__rewind( s );
      return 0;
   }
   return 1;
}

static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
{
   stbi_uc *out;
   if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
      return 0;
   *x = s->img_x;
   *y = s->img_y;
   *comp = s->img_n;

   out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
   if (!out) return stbi__errpuc("outofmem", "Out of memory");
   stbi__getn(s, out, s->img_n * s->img_x * s->img_y);

   if (req_comp && req_comp != s->img_n) {
      out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
      if (out == NULL) return out; // stbi__convert_format frees input on failure
   }
   return out;
}

static int stbi__pnm_isspace(char c)
{
   return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
{
   for (;;) {
      while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
         *c = (char) stbi__get8(s);

      if (stbi__at_eof(s) || *c != '#')
         break;

      // skip a '#' comment through the end of its line
      while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
         *c = (char) stbi__get8(s);
   }
}

static int stbi__pnm_isdigit(char c)
{
   return c >= '0' && c <= '9';
}

static int stbi__pnm_getinteger(stbi__context *s, char *c)
{
   int value = 0;

   while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
      value = value*10 + (*c - '0');
      *c = (char) stbi__get8(s);
   }

   return value;
}

static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
{
   int maxv;
   char c, p, t;

   stbi__rewind( s );

   // Get identifier
   p = (char) stbi__get8(s);
   t = (char) stbi__get8(s);
   if (p != 'P' || (t != '5' && t != '6')) {
      stbi__rewind( s );
      return 0;
   }

   *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm

   c = (char) stbi__get8(s);
   stbi__pnm_skip_whitespace(s, &c);

   *x = stbi__pnm_getinteger(s, &c); // read width
   stbi__pnm_skip_whitespace(s, &c);

   *y = stbi__pnm_getinteger(s, &c); // read height
   stbi__pnm_skip_whitespace(s, &c);

   maxv = stbi__pnm_getinteger(s, &c);  // read max value

   if (maxv > 255)
      return stbi__err("max value > 255", "PPM image not 8-bit");
   else
      return 1;
}
#endif

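/*
   Illustration only, not part of stb_image. The PNM header walk above
   (identifier, then width, height and max value separated by whitespace or
   '#' comments) can be sketched against a plain memory buffer. Every sketch_
   name below is hypothetical, and the integer parser does not guard against
   overflow on absurdly long digit runs.

```c
#include <stddef.h>

// Hypothetical cursor over an in-memory buffer; not an stb_image type.
typedef struct {
   const unsigned char *data;
   size_t len, pos;
} sketch_cursor;

// Returns the next byte, or 0 once the buffer is exhausted.
static int sketch_next(sketch_cursor *c)
{
   return c->pos < c->len ? c->data[c->pos++] : 0;
}

static int sketch_is_space(int ch)
{
   return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\v' || ch == '\f' || ch == '\r';
}

// Skips whitespace and '#' comments, leaving *ch on the first other byte.
static void sketch_skip(sketch_cursor *c, int *ch)
{
   for (;;) {
      while (sketch_is_space(*ch)) *ch = sketch_next(c);
      if (*ch != '#') break;
      while (*ch != 0 && *ch != '\n' && *ch != '\r') *ch = sketch_next(c);
   }
}

// Reads a run of decimal digits starting at *ch.
static int sketch_int(sketch_cursor *c, int *ch)
{
   int v = 0;
   while (*ch >= '0' && *ch <= '9') {
      v = v*10 + (*ch - '0');
      *ch = sketch_next(c);
   }
   return v;
}

// Parses a binary PGM ("P5") or PPM ("P6") header; returns 1 on success.
static int sketch_pnm_header(const unsigned char *buf, size_t len, int *w, int *h, int *comp)
{
   sketch_cursor c = { buf, len, 0 };
   int p = sketch_next(&c), t = sketch_next(&c), ch, maxv;
   if (p != 'P' || (t != '5' && t != '6')) return 0;
   *comp = (t == '6') ? 3 : 1;
   ch = sketch_next(&c);
   sketch_skip(&c, &ch);
   *w = sketch_int(&c, &ch);
   sketch_skip(&c, &ch);
   *h = sketch_int(&c, &ch);
   sketch_skip(&c, &ch);
   maxv = sketch_int(&c, &ch);
   return maxv <= 255;          // only 8-bit samples, as in the loader above
}
```

   Given the bytes "P6\n# made by hand\n4 2\n255\n", sketch_pnm_header reports
   width 4, height 2 and 3 components.
*/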
static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
{
   #ifndef STBI_NO_JPEG
   if (stbi__jpeg_info(s, x, y, comp)) return 1;
   #endif

   #ifndef STBI_NO_PNG
   if (stbi__png_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_GIF
   if (stbi__gif_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_BMP
   if (stbi__bmp_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PSD
   if (stbi__psd_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PIC
   if (stbi__pic_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_PNM
   if (stbi__pnm_info(s, x, y, comp))  return 1;
   #endif

   #ifndef STBI_NO_HDR
   if (stbi__hdr_info(s, x, y, comp))  return 1;
   #endif

   // test tga last because it's a crappy test!
   #ifndef STBI_NO_TGA
   if (stbi__tga_info(s, x, y, comp))
       return 1;
   #endif
   return stbi__err("unknown image type", "Image not of any known type, or corrupt");
}

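/*
   Illustration only, not part of stb_image. stbi__info_main works because
   each format probe rewinds the stream when it fails, so the next probe sees
   the data from the start. A minimal standalone sketch of that contract,
   using hypothetical sketch_ names and a toy in-memory stream in place of
   stbi__context:

```c
#include <stddef.h>
#include <string.h>

// Hypothetical in-memory stream; stb_image's real context is stbi__context.
typedef struct {
   const unsigned char *data;
   size_t len, pos;
} sketch_stream;

// Consumes 'magic' if the stream continues with it; otherwise rewinds to
// offset 0 so the next probe starts from the beginning.
static int sketch_match(sketch_stream *s, const char *magic)
{
   size_t n = strlen(magic);
   if (s->len - s->pos >= n && memcmp(s->data + s->pos, magic, n) == 0) {
      s->pos += n;
      return 1;
   }
   s->pos = 0;
   return 0;
}

static int sketch_is_png(sketch_stream *s) { return sketch_match(s, "\x89PNG"); }
static int sketch_is_gif(sketch_stream *s) { return sketch_match(s, "GIF8"); }
static int sketch_is_pnm(sketch_stream *s)
{
   return sketch_match(s, "P5") || sketch_match(s, "P6");
}

// Tries each probe in order and returns a short name for the first match,
// or NULL when no probe recognizes the data.
static const char *sketch_identify(const unsigned char *buf, size_t len)
{
   sketch_stream s = { buf, len, 0 };
   if (sketch_is_png(&s)) return "png";
   if (sketch_is_gif(&s)) return "gif";
   if (sketch_is_pnm(&s)) return "pnm";
   return NULL;
}
```

   A probe that returned 0 without rewinding would leave later probes reading
   from the middle of the data, which is why every failure path resets pos.
*/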
#ifndef STBI_NO_STDIO
STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
{
    FILE *f = stbi__fopen(filename, "rb");
    int result;
    if (!f) return stbi__err("can't fopen", "Unable to open file");
    result = stbi_info_from_file(f, x, y, comp);
    fclose(f);
    return result;
}

STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
{
   int r;
   stbi__context s;
   long pos = ftell(f);   // remember the position so the caller's FILE is restored afterwards
   stbi__start_file(&s, f);
   r = stbi__info_main(&s,x,y,comp);
   fseek(f,pos,SEEK_SET);
   return r;
}
#endif // !STBI_NO_STDIO

STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_mem(&s,buffer,len);
   return stbi__info_main(&s,x,y,comp);
}

STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
{
   stbi__context s;
   stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
   return stbi__info_main(&s,x,y,comp);
}

#endif // STB_IMAGE_IMPLEMENTATION

/*
   revision history:
      2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
      2.11  (2016-04-02) allocate large structures on the stack
                         remove white matting for transparent PSD
                         fix reported channel count for PNG & BMP
                         re-enable SSE2 in non-gcc 64-bit
                         support RGB-formatted JPEG
                         read 16-bit PNGs (only as 8-bit)
      2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
      2.09  (2016-01-16) allow comments in PNM files
                         16-bit-per-pixel TGA (not bit-per-component)
                         info() for TGA could break due to .hdr handling
                         info() for BMP now shares code instead of sloppy parse
                         can use STBI_REALLOC_SIZED if allocator doesn't support realloc
                         code cleanup
      2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
      2.07  (2015-09-13) fix compiler warnings
                         partial animated GIF support
                         limited 16-bpc PSD support
                         #ifdef unused functions
                         bug with < 92 byte PIC,PNM,HDR,TGA
      2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
      2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
      2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
      2.03  (2015-04-12) extra corruption checking (mmozeiko)
                         stbi_set_flip_vertically_on_load (nguillemot)
                         fix NEON support; fix mingw support
      2.02  (2015-01-19) fix incorrect assert, fix warning
      2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
      2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
      2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                         progressive JPEG (stb)
                         PGM/PPM support (Ken Miller)
                         STBI_MALLOC,STBI_REALLOC,STBI_FREE
                         GIF bugfix -- seemingly never worked
                         STBI_NO_*, STBI_ONLY_*
      1.48  (2014-12-14) fix incorrectly-named assert()
      1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
                         optimize PNG (ryg)
                         fix bug in interlaced PNG with user-specified channel count (stb)
      1.46  (2014-08-26)
              fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
      1.45  (2014-08-16)
              fix MSVC-ARM internal compiler error by wrapping malloc
      1.44  (2014-08-07)
              various warning fixes from Ronny Chevalier
      1.43  (2014-07-15)
              fix MSVC-only compiler problem in code changed in 1.42
      1.42  (2014-07-09)
              don't define _CRT_SECURE_NO_WARNINGS (affects user code)
              fixes to stbi__cleanup_jpeg path
              added STBI_ASSERT to avoid requiring assert.h
      1.41  (2014-06-25)
              fix search&replace from 1.36 that messed up comments/error messages
      1.40  (2014-06-22)
              fix gcc struct-initialization warning
      1.39  (2014-06-15)
              fix to TGA optimization when req_comp != number of components in TGA;
              fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
              add support for BMP version 5 (more ignored fields)
      1.38  (2014-06-06)
              suppress MSVC warnings on integer casts truncating values
              fix accidental rename of 'skip' field of I/O
      1.37  (2014-06-04)
              remove duplicate typedef
      1.36  (2014-06-03)
              convert to header file single-file library
              if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
      1.35  (2014-05-27)
              various warnings
              fix broken STBI_SIMD path
              fix bug where stbi_load_from_file no longer left file pointer in correct place
              fix broken non-easy path for 32-bit BMP (possibly never used)
              TGA optimization by Arseny Kapoulkine
      1.34  (unknown)
              use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
      1.33  (2011-07-14)
              make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
      1.32  (2011-07-13)
              support for "info" function for all supported filetypes (SpartanJ)
      1.31  (2011-06-20)
              a few more leak fixes, bug in PNG handling (SpartanJ)
      1.30  (2011-06-11)
              added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
              removed deprecated format-specific test/load functions
              removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
              error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
              fix inefficiency in decoding 32-bit BMP (David Woo)
      1.29  (2010-08-16)
              various warning fixes from Aurelien Pocheville
      1.28  (2010-08-01)
              fix bug in GIF palette transparency (SpartanJ)
      1.27  (2010-08-01)
              cast-to-stbi_uc to fix warnings
      1.26  (2010-07-24)
              fix bug in file buffering for PNG reported by SpartanJ
      1.25  (2010-07-17)
              refix trans_data warning (Won Chun)
      1.24  (2010-07-12)
              perf improvements reading from files on platforms with lock-heavy fgetc()
              minor perf improvements for jpeg
              deprecated type-specific functions so we'll get feedback if they're needed
              attempt to fix trans_data warning (Won Chun)
      1.23    fixed bug in iPhone support
      1.22  (2010-07-10)
              removed image *writing* support
              stbi_info support from Jetro Lauha
              GIF support from Jean-Marc Lienher
              iPhone PNG-extensions from James Brown
              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
      1.21    fix use of 'stbi_uc' in header (reported by jon blow)
      1.20    added support for Softimage PIC, by Tom Seddon
      1.19    bug in interlaced PNG corruption check (found by ryg)
      1.18  (2008-08-02)
              fix a threading bug (local mutable static)
      1.17    support interlaced PNG
      1.16    major bugfix - stbi__convert_format converted one too many pixels
      1.15    initialize some fields for thread safety
      1.14    fix threadsafe conversion bug
              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
      1.13    threadsafe
      1.12    const qualifiers in the API
      1.11    Support installable IDCT, colorspace conversion routines
      1.10    Fixes for 64-bit (don't use "unsigned long")
              optimized upsampling by Fabian "ryg" Giesen
      1.09    Fix format-conversion for PSD code (bad global variables!)
      1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
      1.07    attempt to fix C++ warning/errors again
      1.06    attempt to fix C++ warning/errors again
      1.05    fix TGA loading to return correct *comp and use good luminance calc
      1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
      1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
      1.02    support for (subset of) HDR files, float interface for preferred access to them
      1.01    fix bug: possible bug in handling right-side up bmps... not sure
              fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
      1.00    interface to zlib that skips zlib header
      0.99    correct handling of alpha in palette
      0.98    TGA loader by lonesock; dynamically add loaders (untested)
      0.97    jpeg errors on too large a file; also catch another malloc failure
      0.96    fix detection of invalid v value - particleman@mollyrocket forum
      0.95    during header scan, seek to markers in case of padding
      0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
      0.93    handle jpegtran output; verbose errors
      0.92    read 4,8,16,24,32-bit BMP files of several formats
      0.91    output 24-bit Windows 3.0 BMP files
      0.90    fix a few more warnings; bump version number to approach 1.0
      0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
      0.60    fix compiling as c++
      0.59    fix warnings: merge Dave Moore's -Wall fixes
      0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
      0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
      0.56    fix bug: zlib uncompressed mode len vs. nlen
      0.55    fix bug: restart_interval not initialized to 0
      0.54    allow NULL for 'int *comp'
      0.53    fix bug in png 3->4; speedup png decoding
      0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
      0.51    obey req_comp requests, 1-component jpegs return as 1-component,
              on 'test' only check type, not whether we support this variant
      0.50  (2006-11-19)
              first released version
*/