1 /* stb_image - v2.06 - public domain image loader - http://nothings.org/stb_image.h
2 no warranty implied; use at your own risk
3
4 Do this:
5 #define STB_IMAGE_IMPLEMENTATION
6 before you include this file in *one* C or C++ file to create the implementation.
7
8 // i.e. it should look like this:
9 #include ...
10 #include ...
11 #include ...
12 #define STB_IMAGE_IMPLEMENTATION
13 #include "stb_image.h"
14
15 You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16 And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17
18
19 QUICK NOTES:
20 Primarily of interest to game developers and other people who can
21 avoid problematic images and only need the trivial interface
22
23 JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24 PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
25
26 TGA (not sure what subset, if a subset)
27 BMP non-1bpp, non-RLE
28 PSD (composited view only, no extra channels)
29
30 GIF (*comp always reports as 4-channel)
31 HDR (radiance rgbE format)
32 PIC (Softimage PIC)
33 PNM (PPM and PGM binary only)
34
35 - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
36 - decode from arbitrary I/O callbacks
37 - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
38
39 Full documentation under "DOCUMENTATION" below.
40
41
42 Revision 2.00 release notes:
43
44 - Progressive JPEG is now supported.
45
46 - PPM and PGM binary formats are now supported, thanks to Ken Miller.
47
48 - x86 platforms now make use of SSE2 SIMD instructions for
49 JPEG decoding, and ARM platforms can use NEON SIMD if requested.
50 This work was done by Fabian "ryg" Giesen. SSE2 is used by
51 default, but NEON must be enabled explicitly; see docs.
52
53 With other JPEG optimizations included in this version, we see
54 2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
55 on a JPEG on an ARM machine, relative to previous versions of this
56 library. The same results will not obtain for all JPGs and for all
57 x86/ARM machines. (Note that progressive JPEGs are significantly
58 slower to decode than regular JPEGs.) This doesn't mean that this
59 is the fastest JPEG decoder in the land; rather, it brings it
60 closer to parity with standard libraries. If you want the fastest
61 decode, look elsewhere. (See "Philosophy" section of docs below.)
62
63 See final bullet items below for more info on SIMD.
64
65 - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
66 the memory allocator. Unlike other STBI libraries, these macros don't
67 support a context parameter, so if you need to pass a context in to
68 the allocator, you'll have to store it in a global or a thread-local
69 variable.
70
71 - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
72 STBI_NO_LINEAR.
73 STBI_NO_HDR: suppress implementation of .hdr reader format
74 STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
75
76 - You can suppress implementation of any of the decoders to reduce
77 your code footprint by #defining one or more of the following
78 symbols before creating the implementation.
79
80 STBI_NO_JPEG
81 STBI_NO_PNG
82 STBI_NO_BMP
83 STBI_NO_PSD
84 STBI_NO_TGA
85 STBI_NO_GIF
86 STBI_NO_HDR
87 STBI_NO_PIC
88 STBI_NO_PNM (.ppm and .pgm)
89
90 - You can request *only* certain decoders and suppress all other ones
91 (this will be more forward-compatible, as addition of new decoders
92 doesn't require you to disable them explicitly):
93
94 STBI_ONLY_JPEG
95 STBI_ONLY_PNG
96 STBI_ONLY_BMP
97 STBI_ONLY_PSD
98 STBI_ONLY_TGA
99 STBI_ONLY_GIF
100 STBI_ONLY_HDR
101 STBI_ONLY_PIC
102 STBI_ONLY_PNM (.ppm and .pgm)
103
104 Note that you can define multiples of these, and you will get all
105 of them ("only x" and "only y" is interpreted to mean "only x&y").
106
107 - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
108 want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
109
110 - Compilation of all SIMD code can be suppressed with
111 #define STBI_NO_SIMD
112 It should not be necessary to disable SIMD unless you have issues
113 compiling (e.g. using an x86 compiler which doesn't support SSE
114 intrinsics or that doesn't support the method used to detect
115 SSE2 support at run-time), and even those can be reported as
116 bugs so I can refine the built-in compile-time checking to be
117 smarter.
118
119 - The old STBI_SIMD system which allowed installing a user-defined
120 IDCT etc. has been removed. If you need this, don't upgrade. My
121 assumption is that almost nobody was doing this, and those who
122 were will find the built-in SIMD more satisfactory anyway.
123
124 - RGB values computed for JPEG images are slightly different from
125 previous versions of stb_image. (This is due to using less
126 integer precision in SIMD.) The C code has been adjusted so
127 that the same RGB values will be computed regardless of whether
128 SIMD support is available, so your app should always produce
129 consistent results. But these results are slightly different from
130 previous versions. (Specifically, about 3% of available YCbCr values
131 will compute different RGB results from pre-1.49 versions by +-1;
132 most of the deviating values are one smaller in the G channel.)
133
134 - If you must produce consistent results with previous versions of
135 stb_image, #define STBI_JPEG_OLD and you will get the same results
136 you used to; however, you will not get the SIMD speedups for
137 the YCbCr-to-RGB conversion step (although you should still see
138 significant JPEG speedup from the other changes).
139
140 Please note that STBI_JPEG_OLD is a temporary feature; it will be
141 removed in future versions of the library. It is only intended for
142 near-term back-compatibility use.
143
144
145 Latest revision history:
146 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
147 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
148 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
149 2.03 (2015-04-12) additional corruption checking
150 stbi_set_flip_vertically_on_load
151 fix NEON support; fix mingw support
152 2.02 (2015-01-19) fix incorrect assert, fix warning
153 2.01 (2015-01-17) fix various warnings
154 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
155 2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
156 progressive JPEG
157 PGM/PPM support
158 STBI_MALLOC,STBI_REALLOC,STBI_FREE
159 STBI_NO_*, STBI_ONLY_*
160 GIF bugfix
161 1.48 (2014-12-14) fix incorrectly-named assert()
162 1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
163 optimize PNG
164 fix bug in interlaced PNG with user-specified channel count
165
166 See end of file for full revision history.
167
168
169 ============================ Contributors =========================
170
171 Image formats Bug fixes & warning fixes
172 Sean Barrett (jpeg, png, bmp) Marc LeBlanc
173 Nicolas Schulz (hdr, psd) Christpher Lloyd
174 Jonathan Dummer (tga) Dave Moore
175 Jean-Marc Lienher (gif) Won Chun
176 Tom Seddon (pic) the Horde3D community
177 Thatcher Ulrich (psd) Janez Zemva
178 Ken Miller (pgm, ppm) Jonathan Blow
179 Laurent Gomila
180 Aruelien Pocheville
181 Extensions, features Ryamond Barbiero
182 Jetro Lauha (stbi_info) David Woo
183 Martin "SpartanJ" Golini (stbi_info) Martin Golini
184 James "moose2000" Brown (iPhone PNG) Roy Eltham
185 Ben "Disch" Wenger (io callbacks) Luke Graham
186 Omar Cornut (1/2/4-bit PNG) Thomas Ruf
187 Nicolas Guillemot (vertical flip) John Bartholomew
188 Ken Hamada
189 Optimizations & bugfixes Cort Stratton
190 Fabian "ryg" Giesen Blazej Dariusz Roszkowski
191 Arseny Kapoulkine Thibault Reuille
192 Paul Du Bois
193 Guillaume George
194 If your name should be here but Jerry Jansson
195 isn't, let Sean know. Hayaki Saito
196 Johan Duparc
197 Ronny Chevalier
198 Michal Cichon
199 Tero Hanninen
200 Sergio Gonzalez
201 Cass Everitt
202 Engin Manap
203 Martins Mozeiko
204 Joseph Thomson
205 Phil Jordan
206
207 License:
208 This software is in the public domain. Where that dedication is not
209 recognized, you are granted a perpetual, irrevocable license to copy
210 and modify this file however you want.
211
212 */
213
214 #ifndef STBI_INCLUDE_STB_IMAGE_H
215 #define STBI_INCLUDE_STB_IMAGE_H
216
217 // DOCUMENTATION
218 //
219 // Limitations:
220 // - no 16-bit-per-channel PNG
221 // - no 12-bit-per-channel JPEG
222 // - no JPEGs with arithmetic coding
223 // - no 1-bit BMP
224 // - GIF always returns *comp=4
225 //
226 // Basic usage (see HDR discussion below for HDR usage):
227 // int x,y,n;
228 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
229 // // ... process data if not NULL ...
230 // // ... x = width, y = height, n = # 8-bit components per pixel ...
231 // // ... replace '0' with '1'..'4' to force that many components per pixel
232 // // ... but 'n' will always be the number that it would have been if you said 0
233 // stbi_image_free(data)
234 //
235 // Standard parameters:
236 // int *x -- outputs image width in pixels
237 // int *y -- outputs image height in pixels
238 // int *comp -- outputs # of image components in image file
239 // int req_comp -- if non-zero, # of image components requested in result
240 //
241 // The return value from an image loader is an 'unsigned char *' which points
242 // to the pixel data, or NULL on an allocation failure or if the image is
243 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
244 // with each pixel consisting of N interleaved 8-bit components; the first
245 // pixel pointed to is top-left-most in the image. There is no padding between
246 // image scanlines or between pixels, regardless of format. The number of
247 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
248 // If req_comp is non-zero, *comp has the number of components that _would_
249 // have been output otherwise. E.g. if you set req_comp to 4, you will always
250 // get RGBA output, but you can check *comp to see if it's trivially opaque
251 // because e.g. there were only 3 channels in the source image.
252 //
253 // An output image with N components has the following components interleaved
254 // in this order in each pixel:
255 //
256 // N=#comp components
257 // 1 grey
258 // 2 grey, alpha
259 // 3 red, green, blue
260 // 4 red, green, blue, alpha
261 //
262 // If image loading fails for any reason, the return value will be NULL,
263 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
264 // can be queried for an extremely brief, end-user unfriendly explanation
265 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
266 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
267 // more user-friendly ones.
268 //
269 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
270 //
271 // ===========================================================================
272 //
273 // Philosophy
274 //
275 // stb libraries are designed with the following priorities:
276 //
277 // 1. easy to use
278 // 2. easy to maintain
279 // 3. good performance
280 //
281 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
282 // and for best performance I may provide less-easy-to-use APIs that give higher
283 // performance, in addition to the easy to use ones. Nevertheless, it's important
284 // to keep in mind that from the standpoint of you, a client of this library,
285 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
286 //
287 // Some secondary priorities arise directly from the first two, some of which
288 // make more explicit reasons why performance can't be emphasized.
289 //
290 // - Portable ("ease of use")
291 // - Small footprint ("easy to maintain")
292 // - No dependencies ("ease of use")
293 //
294 // ===========================================================================
295 //
296 // I/O callbacks
297 //
298 // I/O callbacks allow you to read from arbitrary sources, like packaged
299 // files or some other source. Data read from callbacks are processed
300 // through a small internal buffer (currently 128 bytes) to try to reduce
301 // overhead.
302 //
303 // The three functions you must define are "read" (reads some bytes of data),
304 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
305 //
306 // ===========================================================================
307 //
308 // SIMD support
309 //
310 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
311 // supported by the compiler. For ARM Neon support, you must explicitly
312 // request it.
313 //
314 // (The old do-it-yourself SIMD API is no longer supported in the current
315 // code.)
316 //
317 // On x86, SSE2 will automatically be used when available based on a run-time
318 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
319 // the typical path is to have separate builds for NEON and non-NEON devices
320 // (at least this is true for iOS and Android). Therefore, the NEON support is
321 // toggled by a build flag: define STBI_NEON to get NEON loops.
322 //
323 // The output of the JPEG decoder is slightly different from versions where
324 // SIMD support was introduced (that is, for versions before 1.49). The
325 // difference is only +-1 in the 8-bit RGB channels, and only on a small
326 // fraction of pixels. You can force the pre-1.49 behavior by defining
327 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
328 // and hence cost some performance.
329 //
330 // If for some reason you do not want to use any of SIMD code, or if
331 // you have issues compiling it, you can disable it entirely by
332 // defining STBI_NO_SIMD.
333 //
334 // ===========================================================================
335 //
336 // HDR image support (disable by defining STBI_NO_HDR)
337 //
338 // stb_image now supports loading HDR images in general, and currently
339 // the Radiance .HDR file format, although the support is provided
340 // generically. You can still load any file through the existing interface;
341 // if you attempt to load an HDR file, it will be automatically remapped to
342 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
343 // both of these constants can be reconfigured through this interface:
344 //
345 // stbi_hdr_to_ldr_gamma(2.2f);
346 // stbi_hdr_to_ldr_scale(1.0f);
347 //
348 // (note, do not use _inverse_ constants; stbi_image will invert them
349 // appropriately).
350 //
351 // Additionally, there is a new, parallel interface for loading files as
352 // (linear) floats to preserve the full dynamic range:
353 //
354 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
355 //
356 // If you load LDR images through this interface, those images will
357 // be promoted to floating point values, run through the inverse of
358 // constants corresponding to the above:
359 //
360 // stbi_ldr_to_hdr_scale(1.0f);
361 // stbi_ldr_to_hdr_gamma(2.2f);
362 //
363 // Finally, given a filename (or an open file or memory block--see header
364 // file for details) containing image data, you can query for the "most
365 // appropriate" interface to use (that is, whether the image is HDR or
366 // not), using:
367 //
368 // stbi_is_hdr(char *filename);
369 //
370 // ===========================================================================
371 //
372 // iPhone PNG support:
373 //
374 // By default we convert iphone-formatted PNGs back to RGB, even though
375 // they are internally encoded differently. You can disable this conversion
376 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
377 // you will always just get the native iphone "format" through (which
378 // is BGR stored in RGB).
379 //
380 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
381 // pixel to remove any premultiplied alpha *only* if the image file explicitly
382 // says there's premultiplied data (currently only happens in iPhone images,
383 // and only if iPhone convert-to-rgb processing is on).
384 //
385
386
387 #ifndef STBI_NO_STDIO
388 #include <stdio.h>
389 #endif /* STBI_NO_STDIO */
390
391 #define STBI_VERSION 1
392
393 enum
394 {
395 STBI_default = 0, /* only used for req_comp */
396 STBI_grey = 1,
397 STBI_grey_alpha = 2,
398 STBI_rgb = 3,
399 STBI_rgb_alpha = 4
400 };
401
402 typedef unsigned char stbi_uc;
403
404 #ifdef __cplusplus
405 extern "C" {
406 #endif
407
408 #ifdef STB_IMAGE_STATIC
409 #define STBIDEF static
410 #else
411 #define STBIDEF extern
412 #endif
413
414 //////////////////////////////////////////////////////////////////////////////
415 //
416 // PRIMARY API - works on images of any type
417 //
418
419 /* load image by filename, open file, or memory buffer */
420
421 typedef struct
422 {
423 int (*read) (void *user,char *data,int size); /* fill 'data' with 'size' bytes. return number of bytes actually read */
424 void (*skip) (void *user,int n); /* skip the next 'n' bytes, or 'unget' the last -n bytes if negative */
425 int (*eof) (void *user); /* returns nonzero if we are at end of file/data */
426 } stbi_io_callbacks;
427
428 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
429 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
430 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
431
432 #ifndef STBI_NO_STDIO
433 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
434 /* for stbi_load_from_file, file pointer is left pointing immediately after image */
435 #endif
436
437 #ifndef STBI_NO_LINEAR
438 STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
439 STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
440 STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
441
442 #ifndef STBI_NO_STDIO
443 STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
444 #endif
445 #endif
446
447 #ifndef STBI_NO_HDR
448 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
449 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
450 #endif
451
452 #ifndef STBI_NO_LINEAR
453 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
454 STBIDEF void stbi_ldr_to_hdr_scale(float scale);
455 #endif /* STBI_NO_HDR */
456
457 /* stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR */
458 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
459 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
460 #ifndef STBI_NO_STDIO
461 STBIDEF int stbi_is_hdr (char const *filename);
462 STBIDEF int stbi_is_hdr_from_file(FILE *f);
463 #endif /* STBI_NO_STDIO */
464
465
466 /* get a VERY brief reason for failure
467 * NOT THREADSAFE */
468 STBIDEF const char *stbi_failure_reason (void);
469
470 /* free the loaded image -- this is just free() */
471 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
472
473 /* get image dimensions & components without fully decoding */
474 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
475 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
476
477 #ifndef STBI_NO_STDIO
478 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
479 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
480
481 #endif
482
483 // for image formats that explicitly notate that they have premultiplied alpha,
484 // we just return the colors as stored in the file. set this flag to force
485 // unpremultiplication. results are undefined if the unpremultiply overflow.
486 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
487
488 // indicate whether we should process iphone images back to canonical format,
489 // or just pass them through "as-is"
490 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
491
492 /* flip the image vertically, so the first pixel in the output array is the bottom left */
493 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
494
495 /* ZLIB client - used by PNG, available for other purposes */
496
497 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
498 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
499 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
500 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
501
502 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
503 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
504
505
506 #ifdef __cplusplus
507 }
508 #endif
509
510 //// end header file /////////////////////////////////////////////////////
511 #endif /* STBI_INCLUDE_STB_IMAGE_H */
512
513 #ifdef STB_IMAGE_IMPLEMENTATION
514
515 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
516 || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
517 || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
518 || defined(STBI_ONLY_ZLIB)
519 #ifndef STBI_ONLY_JPEG
520 #define STBI_NO_JPEG
521 #endif
522 #ifndef STBI_ONLY_PNG
523 #define STBI_NO_PNG
524 #endif
525 #ifndef STBI_ONLY_BMP
526 #define STBI_NO_BMP
527 #endif
528 #ifndef STBI_ONLY_PSD
529 #define STBI_NO_PSD
530 #endif
531 #ifndef STBI_ONLY_TGA
532 #define STBI_NO_TGA
533 #endif
534 #ifndef STBI_ONLY_GIF
535 #define STBI_NO_GIF
536 #endif
537 #ifndef STBI_ONLY_HDR
538 #define STBI_NO_HDR
539 #endif
540 #ifndef STBI_ONLY_PIC
541 #define STBI_NO_PIC
542 #endif
543 #ifndef STBI_ONLY_PNM
544 #define STBI_NO_PNM
545 #endif
546 #endif
547
548 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
549 #define STBI_NO_ZLIB
550 #endif
551
552
553 #include <stdarg.h>
554 #include <stddef.h> /* ptrdiff_t on osx */
555 #include <stdlib.h>
556 #include <string.h>
557
558 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
559 #include <math.h> /* ldexp */
560 #endif
561
562 #ifndef STBI_NO_STDIO
563 #include <stdio.h>
564 #endif
565
566 #ifndef STBI_ASSERT
567 #include <assert.h>
568 #define STBI_ASSERT(x) assert(x)
569 #endif
570
571
572 #ifndef _MSC_VER
573 #ifdef __cplusplus
574 #define stbi_inline inline
575 #else
576 #define stbi_inline
577 #endif
578 #else
579 #define stbi_inline __forceinline
580 #endif
581
582
583 #ifdef _MSC_VER
584 typedef unsigned short stbi__uint16;
585 typedef signed short stbi__int16;
586 typedef unsigned int stbi__uint32;
587 typedef signed int stbi__int32;
588 #else
589 #include <stdint.h>
590 typedef uint16_t stbi__uint16;
591 typedef int16_t stbi__int16;
592 typedef uint32_t stbi__uint32;
593 typedef int32_t stbi__int32;
594 #endif
595
596 /* should produce compiler error if size is wrong */
597 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
598
599 #ifdef _MSC_VER
600 #define STBI_NOTUSED(v) (void)(v)
601 #else
602 #define STBI_NOTUSED(v) (void)sizeof(v)
603 #endif
604
605 #ifdef _MSC_VER
606 #define STBI_HAS_LROTL
607 #endif
608
609 #ifdef STBI_HAS_LROTL
610 #define stbi_lrot(x,y) _lrotl(x,y)
611 #else
612 #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
613 #endif
614
615 #if defined(STBI_MALLOC) && defined(STBI_FREE) && defined(STBI_REALLOC)
616 // ok
617 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC)
618 // ok
619 #else
620 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC."
621 #endif
622
623 #ifndef STBI_MALLOC
624 #define STBI_MALLOC(sz) malloc(sz)
625 #define STBI_REALLOC(p,sz) realloc(p,sz)
626 #define STBI_FREE(p) free(p)
627 #endif
628
629 // x86/x64 detection
630 #if defined(__x86_64__) || defined(_M_X64)
631 #define STBI__X64_TARGET
632 #elif defined(__i386) || defined(_M_IX86)
633 #define STBI__X86_TARGET
634 #endif
635
636 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
637 /* NOTE: not clear do we actually need this for the 64-bit path?
638 * gcc doesn't support sse2 intrinsics unless you compile with -msse2,
639 * (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
640 * this is just broken and gcc are jerks for not fixing it properly
641 * http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
642 */
643 #define STBI_NO_SIMD
644 #endif
645
646 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
647 /* Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
648 *
649 * 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
650 * Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
651 * As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
652 * simultaneously enabling "-mstackrealign".
653 *
654 * See https://github.com/nothings/stb/issues/81 for more information.
655 *
656 * So default to no SSE2 on 32-bit MinGW. If you've read this far and added
657 * -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
658 */
659 #define STBI_NO_SIMD
660 #endif
661
662 #if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
663 #define STBI_SSE2
664 #include <emmintrin.h>
665
666 #ifdef _MSC_VER
667
668 #if _MSC_VER >= 1400 /* not VC6 */
669 #include <intrin.h> /* __cpuid */
stbi__cpuid3(void)670 static int stbi__cpuid3(void)
671 {
672 int info[4];
673 __cpuid(info,1);
674 return info[3];
675 }
676 #else
stbi__cpuid3(void)677 static int stbi__cpuid3(void)
678 {
679 int res;
680 __asm {
681 mov eax,1
682 cpuid
683 mov res,edx
684 }
685 return res;
686 }
687 #endif
688
689 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
690
stbi__sse2_available()691 static int stbi__sse2_available()
692 {
693 int info3 = stbi__cpuid3();
694 return ((info3 >> 26) & 1) != 0;
695 }
696 #else /* assume GCC-style if not VC++ */
697 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
698
stbi__sse2_available()699 static int stbi__sse2_available()
700 {
701 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 /* GCC 4.8 or later */
702 /* GCC 4.8+ has a nice way to do this */
703 return __builtin_cpu_supports("sse2");
704 #else
705 /* portable way to do this, preferably without using GCC inline ASM?
706 * just bail for now. */
707 return 0;
708 #endif
709 }
710 #endif
711 #endif
712
713 /* ARM NEON */
714 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
715 #undef STBI_NEON
716 #endif
717
718 #ifdef STBI_NEON
719 #include <arm_neon.h>
720 /* assume GCC or Clang on ARM targets */
721 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
722 #endif
723
724 #ifndef STBI_SIMD_ALIGN
725 #define STBI_SIMD_ALIGN(type, name) type name
726 #endif
727
728 ///////////////////////////////////////////////
729 //
730 // stbi__context struct and start_xxx functions
731
732 // stbi__context structure is our basic context used by all images, so it
733 // contains all the IO context, plus some basic image information
734 typedef struct
735 {
736 stbi__uint32 img_x, img_y;
737 int img_n, img_out_n;
738
739 stbi_io_callbacks io;
740 void *io_user_data;
741
742 int read_from_callbacks;
743 int buflen;
744 stbi_uc buffer_start[128];
745
746 stbi_uc *img_buffer, *img_buffer_end;
747 stbi_uc *img_buffer_original;
748 } stbi__context;
749
750
751 static void stbi__refill_buffer(stbi__context *s);
752
753 // initialize a memory-decode context
stbi__start_mem(stbi__context * s,stbi_uc const * buffer,int len)754 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
755 {
756 s->io.read = NULL;
757 s->read_from_callbacks = 0;
758 s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
759 s->img_buffer_end = (stbi_uc *) buffer+len;
760 }
761
762 // initialize a callback-based context
stbi__start_callbacks(stbi__context * s,stbi_io_callbacks * c,void * user)763 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
764 {
765 s->io = *c;
766 s->io_user_data = user;
767 s->buflen = sizeof(s->buffer_start);
768 s->read_from_callbacks = 1;
769 s->img_buffer_original = s->buffer_start;
770 stbi__refill_buffer(s);
771 }
772
773 #ifndef STBI_NO_STDIO
774
stbi__stdio_read(void * user,char * data,int size)775 static int stbi__stdio_read(void *user, char *data, int size)
776 {
777 return (int) fread(data,1,size,(FILE*) user);
778 }
779
stbi__stdio_skip(void * user,int n)780 static void stbi__stdio_skip(void *user, int n)
781 {
782 fseek((FILE*) user, n, SEEK_CUR);
783 }
784
stbi__stdio_eof(void * user)785 static int stbi__stdio_eof(void *user)
786 {
787 return feof((FILE*) user);
788 }
789
790 static stbi_io_callbacks stbi__stdio_callbacks =
791 {
792 stbi__stdio_read,
793 stbi__stdio_skip,
794 stbi__stdio_eof,
795 };
796
stbi__start_file(stbi__context * s,FILE * f)797 static void stbi__start_file(stbi__context *s, FILE *f)
798 {
799 stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
800 }
801
802 #endif /* !STBI_NO_STDIO */
803
stbi__rewind(stbi__context * s)804 static void stbi__rewind(stbi__context *s)
805 {
806 /* conceptually rewind SHOULD rewind to the beginning of the stream,
807 * but we just rewind to the beginning of the initial buffer, because
808 * we only use it after doing 'test', which only ever looks at at most 92 bytes
809 */
810 s->img_buffer = s->img_buffer_original;
811 }
812
813 #ifndef STBI_NO_JPEG
814 static int stbi__jpeg_test(stbi__context *s);
815 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
816 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
817 #endif
818
819 #ifndef STBI_NO_PNG
820 static int stbi__png_test(stbi__context *s);
821 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
822 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
823 #endif
824
825 #ifndef STBI_NO_BMP
826 static int stbi__bmp_test(stbi__context *s);
827 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
828 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
829 #endif
830
831 #ifndef STBI_NO_TGA
832 static int stbi__tga_test(stbi__context *s);
833 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
834 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
835 #endif
836
837 #ifndef STBI_NO_PSD
838 static int stbi__psd_test(stbi__context *s);
839 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
840 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
841 #endif
842
843 #ifndef STBI_NO_HDR
844 static int stbi__hdr_test(stbi__context *s);
845 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
846 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
847 #endif
848
849 #ifndef STBI_NO_PIC
850 static int stbi__pic_test(stbi__context *s);
851 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
852 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
853 #endif
854
855 #ifndef STBI_NO_GIF
856 static int stbi__gif_test(stbi__context *s);
857 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
858 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
859 #endif
860
861 #ifndef STBI_NO_PNM
862 static int stbi__pnm_test(stbi__context *s);
863 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
864 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
865 #endif
866
867 // this is not threadsafe
868 static const char *stbi__g_failure_reason;
869
stbi_failure_reason(void)870 STBIDEF const char *stbi_failure_reason(void)
871 {
872 return stbi__g_failure_reason;
873 }
874
stbi__err(const char * str)875 static int stbi__err(const char *str)
876 {
877 stbi__g_failure_reason = str;
878 return 0;
879 }
880
stbi__malloc(size_t size)881 static void *stbi__malloc(size_t size)
882 {
883 return STBI_MALLOC(size);
884 }
885
886 // stbi__err - error
887 // stbi__errpf - error returning pointer to float
888 // stbi__errpuc - error returning pointer to unsigned char
889
890 #ifdef STBI_NO_FAILURE_STRINGS
891 #define stbi__err(x,y) 0
892 #elif defined(STBI_FAILURE_USERMSG)
893 #define stbi__err(x,y) stbi__err(y)
894 #else
895 #define stbi__err(x,y) stbi__err(x)
896 #endif
897
898 #define stbi__errpf(x,y) ((float *) (stbi__err(x,y)?NULL:NULL))
899 #define stbi__errpuc(x,y) ((unsigned char *) (stbi__err(x,y)?NULL:NULL))
900
stbi_image_free(void * retval_from_stbi_load)901 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
902 {
903 STBI_FREE(retval_from_stbi_load);
904 }
905
906 #ifndef STBI_NO_LINEAR
907 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
908 #endif
909
910 #ifndef STBI_NO_HDR
911 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
912 #endif
913
914 static int stbi__vertically_flip_on_load = 0;
915
stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)916 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
917 {
918 stbi__vertically_flip_on_load = flag_true_if_should_flip;
919 }
920
stbi__load_main(stbi__context * s,int * x,int * y,int * comp,int req_comp)921 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
922 {
923 #ifndef STBI_NO_JPEG
924 if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
925 #endif
926 #ifndef STBI_NO_PNG
927 if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp);
928 #endif
929 #ifndef STBI_NO_BMP
930 if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp);
931 #endif
932 #ifndef STBI_NO_GIF
933 if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp);
934 #endif
935 #ifndef STBI_NO_PSD
936 if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp);
937 #endif
938 #ifndef STBI_NO_PIC
939 if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp);
940 #endif
941 #ifndef STBI_NO_PNM
942 if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp);
943 #endif
944
945 #ifndef STBI_NO_HDR
946 if (stbi__hdr_test(s)) {
947 float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
948 return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
949 }
950 #endif
951
952 #ifndef STBI_NO_TGA
953 // test tga last because it's a crappy test!
954 if (stbi__tga_test(s))
955 return stbi__tga_load(s,x,y,comp,req_comp);
956 #endif
957
958 return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
959 }
960
stbi__load_flip(stbi__context * s,int * x,int * y,int * comp,int req_comp)961 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
962 {
963 unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
964
965 if (stbi__vertically_flip_on_load && result != NULL) {
966 int w = *x, h = *y;
967 int depth = req_comp ? req_comp : *comp;
968 int row,col,z;
969 stbi_uc temp;
970
971 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
972 for (row = 0; row < (h>>1); row++) {
973 for (col = 0; col < w; col++) {
974 for (z = 0; z < depth; z++) {
975 temp = result[(row * w + col) * depth + z];
976 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
977 result[((h - row - 1) * w + col) * depth + z] = temp;
978 }
979 }
980 }
981 }
982
983 return result;
984 }
985
986 #ifndef STBI_NO_HDR
stbi__float_postprocess(float * result,int * x,int * y,int * comp,int req_comp)987 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
988 {
989 if (stbi__vertically_flip_on_load && result != NULL) {
990 int w = *x, h = *y;
991 int depth = req_comp ? req_comp : *comp;
992 int row,col,z;
993 float temp;
994
995 // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
996 for (row = 0; row < (h>>1); row++) {
997 for (col = 0; col < w; col++) {
998 for (z = 0; z < depth; z++) {
999 temp = result[(row * w + col) * depth + z];
1000 result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1001 result[((h - row - 1) * w + col) * depth + z] = temp;
1002 }
1003 }
1004 }
1005 }
1006 }
1007 #endif
1008
1009
1010 #ifndef STBI_NO_STDIO
1011
stbi__fopen(char const * filename,char const * mode)1012 static FILE *stbi__fopen(char const *filename, char const *mode)
1013 {
1014 FILE *f;
1015 #if defined(_MSC_VER) && _MSC_VER >= 1400
1016 if (0 != fopen_s(&f, filename, mode))
1017 f=0;
1018 #else
1019 f = fopen(filename, mode);
1020 #endif
1021 return f;
1022 }
1023
1024
stbi_load(char const * filename,int * x,int * y,int * comp,int req_comp)1025 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1026 {
1027 FILE *f = stbi__fopen(filename, "rb");
1028 unsigned char *result;
1029 if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1030 result = stbi_load_from_file(f,x,y,comp,req_comp);
1031 fclose(f);
1032 return result;
1033 }
1034
stbi_load_from_file(FILE * f,int * x,int * y,int * comp,int req_comp)1035 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1036 {
1037 unsigned char *result;
1038 stbi__context s;
1039 stbi__start_file(&s,f);
1040 result = stbi__load_flip(&s,x,y,comp,req_comp);
1041 if (result) {
1042 /* need to 'unget' all the characters in the IO buffer */
1043 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1044 }
1045 return result;
1046 }
1047 #endif /* !STBI_NO_STDIO */
1048
stbi_load_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp,int req_comp)1049 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1050 {
1051 stbi__context s;
1052 stbi__start_mem(&s,buffer,len);
1053 return stbi__load_flip(&s,x,y,comp,req_comp);
1054 }
1055
stbi_load_from_callbacks(stbi_io_callbacks const * clbk,void * user,int * x,int * y,int * comp,int req_comp)1056 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1057 {
1058 stbi__context s;
1059 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1060 return stbi__load_flip(&s,x,y,comp,req_comp);
1061 }
1062
1063 #ifndef STBI_NO_LINEAR
stbi__loadf_main(stbi__context * s,int * x,int * y,int * comp,int req_comp)1064 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1065 {
1066 unsigned char *data;
1067 #ifndef STBI_NO_HDR
1068 if (stbi__hdr_test(s)) {
1069 float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1070 if (hdr_data)
1071 stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1072 return hdr_data;
1073 }
1074 #endif
1075 data = stbi__load_flip(s, x, y, comp, req_comp);
1076 if (data)
1077 return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1078 return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1079 }
1080
stbi_loadf_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp,int req_comp)1081 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1082 {
1083 stbi__context s;
1084 stbi__start_mem(&s,buffer,len);
1085 return stbi__loadf_main(&s,x,y,comp,req_comp);
1086 }
1087
stbi_loadf_from_callbacks(stbi_io_callbacks const * clbk,void * user,int * x,int * y,int * comp,int req_comp)1088 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1089 {
1090 stbi__context s;
1091 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1092 return stbi__loadf_main(&s,x,y,comp,req_comp);
1093 }
1094
1095 #ifndef STBI_NO_STDIO
stbi_loadf(char const * filename,int * x,int * y,int * comp,int req_comp)1096 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1097 {
1098 float *result;
1099 FILE *f = stbi__fopen(filename, "rb");
1100 if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1101 result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1102 fclose(f);
1103 return result;
1104 }
1105
stbi_loadf_from_file(FILE * f,int * x,int * y,int * comp,int req_comp)1106 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1107 {
1108 stbi__context s;
1109 stbi__start_file(&s,f);
1110 return stbi__loadf_main(&s,x,y,comp,req_comp);
1111 }
1112 #endif // !STBI_NO_STDIO
1113
1114 #endif // !STBI_NO_LINEAR
1115
1116 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1117 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1118 // reports false!
1119
stbi_is_hdr_from_memory(stbi_uc const * buffer,int len)1120 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1121 {
1122 #ifndef STBI_NO_HDR
1123 stbi__context s;
1124 stbi__start_mem(&s,buffer,len);
1125 return stbi__hdr_test(&s);
1126 #else
1127 STBI_NOTUSED(buffer);
1128 STBI_NOTUSED(len);
1129 return 0;
1130 #endif
1131 }
1132
1133 #ifndef STBI_NO_STDIO
stbi_is_hdr(char const * filename)1134 STBIDEF int stbi_is_hdr (char const *filename)
1135 {
1136 FILE *f = stbi__fopen(filename, "rb");
1137 int result=0;
1138 if (f) {
1139 result = stbi_is_hdr_from_file(f);
1140 fclose(f);
1141 }
1142 return result;
1143 }
1144
stbi_is_hdr_from_file(FILE * f)1145 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1146 {
1147 #ifndef STBI_NO_HDR
1148 stbi__context s;
1149 stbi__start_file(&s,f);
1150 return stbi__hdr_test(&s);
1151 #else
1152 return 0;
1153 #endif
1154 }
1155 #endif // !STBI_NO_STDIO
1156
stbi_is_hdr_from_callbacks(stbi_io_callbacks const * clbk,void * user)1157 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1158 {
1159 #ifndef STBI_NO_HDR
1160 stbi__context s;
1161 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1162 return stbi__hdr_test(&s);
1163 #else
1164 return 0;
1165 #endif
1166 }
1167
1168 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1169 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1170
1171 #ifndef STBI_NO_LINEAR
stbi_ldr_to_hdr_gamma(float gamma)1172 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
stbi_ldr_to_hdr_scale(float scale)1173 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1174 #endif
1175
1176 /* forward declarations */
1177 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
1178 STBIDEF void stbi_hdr_to_ldr_scale(float scale);
1179
stbi_hdr_to_ldr_gamma(float gamma)1180 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
stbi_hdr_to_ldr_scale(float scale)1181 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1182
1183
1184 //////////////////////////////////////////////////////////////////////////////
1185 //
1186 // Common code used by all image loaders
1187 //
1188
1189 enum
1190 {
1191 STBI__SCAN_load=0,
1192 STBI__SCAN_type,
1193 STBI__SCAN_header
1194 };
1195
stbi__refill_buffer(stbi__context * s)1196 static void stbi__refill_buffer(stbi__context *s)
1197 {
1198 int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1199 if (n == 0) {
1200 // at end of file, treat same as if from memory, but need to handle case
1201 // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1202 s->read_from_callbacks = 0;
1203 s->img_buffer = s->buffer_start;
1204 s->img_buffer_end = s->buffer_start+1;
1205 *s->img_buffer = 0;
1206 } else {
1207 s->img_buffer = s->buffer_start;
1208 s->img_buffer_end = s->buffer_start + n;
1209 }
1210 }
1211
stbi__get8(stbi__context * s)1212 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1213 {
1214 if (s->img_buffer < s->img_buffer_end)
1215 return *s->img_buffer++;
1216 if (s->read_from_callbacks) {
1217 stbi__refill_buffer(s);
1218 return *s->img_buffer++;
1219 }
1220 return 0;
1221 }
1222
stbi__at_eof(stbi__context * s)1223 stbi_inline static int stbi__at_eof(stbi__context *s)
1224 {
1225 if (s->io.read) {
1226 if (!(s->io.eof)(s->io_user_data)) return 0;
1227 // if feof() is true, check if buffer = end
1228 // special case: we've only got the special 0 character at the end
1229 if (s->read_from_callbacks == 0) return 1;
1230 }
1231
1232 return s->img_buffer >= s->img_buffer_end;
1233 }
1234
stbi__skip(stbi__context * s,int n)1235 static void stbi__skip(stbi__context *s, int n)
1236 {
1237 if (n < 0) {
1238 s->img_buffer = s->img_buffer_end;
1239 return;
1240 }
1241 if (s->io.read) {
1242 int blen = (int) (s->img_buffer_end - s->img_buffer);
1243 if (blen < n) {
1244 s->img_buffer = s->img_buffer_end;
1245 (s->io.skip)(s->io_user_data, n - blen);
1246 return;
1247 }
1248 }
1249 s->img_buffer += n;
1250 }
1251
stbi__getn(stbi__context * s,stbi_uc * buffer,int n)1252 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1253 {
1254 if (s->io.read) {
1255 int blen = (int) (s->img_buffer_end - s->img_buffer);
1256 if (blen < n) {
1257 int res, count;
1258
1259 memcpy(buffer, s->img_buffer, blen);
1260
1261 count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1262 res = (count == (n-blen));
1263 s->img_buffer = s->img_buffer_end;
1264 return res;
1265 }
1266 }
1267
1268 if (s->img_buffer+n <= s->img_buffer_end) {
1269 memcpy(buffer, s->img_buffer, n);
1270 s->img_buffer += n;
1271 return 1;
1272 } else
1273 return 0;
1274 }
1275
stbi__get16be(stbi__context * s)1276 static int stbi__get16be(stbi__context *s)
1277 {
1278 int z = stbi__get8(s);
1279 return (z << 8) + stbi__get8(s);
1280 }
1281
stbi__get32be(stbi__context * s)1282 static stbi__uint32 stbi__get32be(stbi__context *s)
1283 {
1284 stbi__uint32 z = stbi__get16be(s);
1285 return (z << 16) + stbi__get16be(s);
1286 }
1287
stbi__get16le(stbi__context * s)1288 static int stbi__get16le(stbi__context *s)
1289 {
1290 int z = stbi__get8(s);
1291 return z + (stbi__get8(s) << 8);
1292 }
1293
stbi__get32le(stbi__context * s)1294 static stbi__uint32 stbi__get32le(stbi__context *s)
1295 {
1296 stbi__uint32 z = stbi__get16le(s);
1297 return z + (stbi__get16le(s) << 16);
1298 }
1299
1300 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1301
1302
1303 //////////////////////////////////////////////////////////////////////////////
1304 //
1305 // generic converter from built-in img_n to req_comp
1306 // individual types do this automatically as much as possible (e.g. jpeg
1307 // does all cases internally since it needs to colorspace convert anyway,
1308 // and it never has alpha, so very few cases ). png can automatically
1309 // interleave an alpha=255 channel, but falls back to this for other cases
1310 //
1311 // assume data buffer is malloced, so malloc a new one and free that one
1312 // only failure mode is malloc failing
1313
stbi__compute_y(int r,int g,int b)1314 static stbi_uc stbi__compute_y(int r, int g, int b)
1315 {
1316 return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1317 }
1318
stbi__convert_format(unsigned char * data,int img_n,int req_comp,unsigned int x,unsigned int y)1319 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1320 {
1321 int i,j;
1322 unsigned char *good;
1323
1324 if (req_comp == img_n) return data;
1325 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1326
1327 good = (unsigned char *) stbi__malloc(req_comp * x * y);
1328 if (good == NULL) {
1329 STBI_FREE(data);
1330 return stbi__errpuc("outofmem", "Out of memory");
1331 }
1332
1333 for (j=0; j < (int) y; ++j) {
1334 unsigned char *src = data + j * x * img_n ;
1335 unsigned char *dest = good + j * x * req_comp;
1336
1337 #define COMBO(a,b) ((a)*8+(b))
1338 #define CASE(a,b) case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1339 // convert source image with img_n components to one with req_comp components;
1340 // avoid switch per pixel, so use switch per scanline and massive macros
1341 switch (COMBO(img_n, req_comp)) {
1342 CASE(1,2) dest[0]=src[0], dest[1]=255; break;
1343 CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1344 CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
1345 CASE(2,1) dest[0]=src[0]; break;
1346 CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1347 CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
1348 CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
1349 CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1350 CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
1351 CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1352 CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
1353 CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
1354 default: STBI_ASSERT(0);
1355 }
1356 #undef CASE
1357 }
1358
1359 STBI_FREE(data);
1360 return good;
1361 }
1362
1363 #ifndef STBI_NO_LINEAR
stbi__ldr_to_hdr(stbi_uc * data,int x,int y,int comp)1364 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1365 {
1366 int i,k,n;
1367 float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1368 if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1369 // compute number of non-alpha components
1370 if (comp & 1) n = comp; else n = comp-1;
1371 for (i=0; i < x*y; ++i) {
1372 for (k=0; k < n; ++k) {
1373 output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1374 }
1375 if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1376 }
1377 STBI_FREE(data);
1378 return output;
1379 }
1380 #endif
1381
1382 #ifndef STBI_NO_HDR
1383 #define stbi__float2int(x) ((int) (x))
stbi__hdr_to_ldr(float * data,int x,int y,int comp)1384 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1385 {
1386 int i,k,n;
1387 stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1388 if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1389 // compute number of non-alpha components
1390 if (comp & 1) n = comp; else n = comp-1;
1391 for (i=0; i < x*y; ++i) {
1392 for (k=0; k < n; ++k) {
1393 float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1394 if (z < 0) z = 0;
1395 if (z > 255) z = 255;
1396 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1397 }
1398 if (k < comp) {
1399 float z = data[i*comp+k] * 255 + 0.5f;
1400 if (z < 0) z = 0;
1401 if (z > 255) z = 255;
1402 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1403 }
1404 }
1405 STBI_FREE(data);
1406 return output;
1407 }
1408 #endif
1409
1410 //////////////////////////////////////////////////////////////////////////////
1411 //
1412 // "baseline" JPEG/JFIF decoder
1413 //
1414 // simple implementation
1415 // - doesn't support delayed output of y-dimension
1416 // - simple interface (only one output format: 8-bit interleaved RGB)
1417 // - doesn't try to recover corrupt jpegs
1418 // - doesn't allow partial loading, loading multiple at once
1419 // - still fast on x86 (copying globals into locals doesn't help x86)
1420 // - allocates lots of intermediate memory (full size of all components)
1421 // - non-interleaved case requires this anyway
1422 // - allows good upsampling (see next)
1423 // high-quality
1424 // - upsampled channels are bilinearly interpolated, even across blocks
1425 // - quality integer IDCT derived from IJG's 'slow'
1426 // performance
1427 // - fast huffman; reasonable integer IDCT
1428 // - some SIMD kernels for common paths on targets with SSE2/NEON
1429 // - uses a lot of intermediate memory, could cache poorly
1430
1431 #ifndef STBI_NO_JPEG
1432
1433 // huffman decoding acceleration
1434 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1435
1436 typedef struct
1437 {
1438 stbi_uc fast[1 << FAST_BITS];
1439 // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1440 stbi__uint16 code[256];
1441 stbi_uc values[256];
1442 stbi_uc size[257];
1443 unsigned int maxcode[18];
1444 int delta[17]; // old 'firstsymbol' - old 'firstcode'
1445 } stbi__huffman;
1446
1447 typedef struct
1448 {
1449 stbi__context *s;
1450 stbi__huffman huff_dc[4];
1451 stbi__huffman huff_ac[4];
1452 stbi_uc dequant[4][64];
1453 stbi__int16 fast_ac[4][1 << FAST_BITS];
1454
1455 // sizes for components, interleaved MCUs
1456 int img_h_max, img_v_max;
1457 int img_mcu_x, img_mcu_y;
1458 int img_mcu_w, img_mcu_h;
1459
1460 // definition of jpeg image component
1461 struct
1462 {
1463 int id;
1464 int h,v;
1465 int tq;
1466 int hd,ha;
1467 int dc_pred;
1468
1469 int x,y,w2,h2;
1470 stbi_uc *data;
1471 void *raw_data, *raw_coeff;
1472 stbi_uc *linebuf;
1473 short *coeff; // progressive only
1474 int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1475 } img_comp[4];
1476
1477 stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1478 int code_bits; // number of valid bits
1479 unsigned char marker; // marker seen while filling entropy buffer
1480 int nomore; // flag if we saw a marker so must stop
1481
1482 int progressive;
1483 int spec_start;
1484 int spec_end;
1485 int succ_high;
1486 int succ_low;
1487 int eob_run;
1488
1489 int scan_n, order[4];
1490 int restart_interval, todo;
1491
1492 // kernels
1493 void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1494 void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1495 stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1496 } stbi__jpeg;
1497
stbi__build_huffman(stbi__huffman * h,int * count)1498 static int stbi__build_huffman(stbi__huffman *h, int *count)
1499 {
1500 int i,j,k=0,code;
1501 // build size list for each symbol (from JPEG spec)
1502 for (i=0; i < 16; ++i)
1503 for (j=0; j < count[i]; ++j)
1504 h->size[k++] = (stbi_uc) (i+1);
1505 h->size[k] = 0;
1506
1507 // compute actual symbols (from jpeg spec)
1508 code = 0;
1509 k = 0;
1510 for(j=1; j <= 16; ++j) {
1511 // compute delta to add to code to compute symbol id
1512 h->delta[j] = k - code;
1513 if (h->size[k] == j) {
1514 while (h->size[k] == j)
1515 h->code[k++] = (stbi__uint16) (code++);
1516 if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1517 }
1518 // compute largest code + 1 for this size, preshifted as needed later
1519 h->maxcode[j] = code << (16-j);
1520 code <<= 1;
1521 }
1522 h->maxcode[j] = 0xffffffff;
1523
1524 // build non-spec acceleration table; 255 is flag for not-accelerated
1525 memset(h->fast, 255, 1 << FAST_BITS);
1526 for (i=0; i < k; ++i) {
1527 int s = h->size[i];
1528 if (s <= FAST_BITS) {
1529 int c = h->code[i] << (FAST_BITS-s);
1530 int m = 1 << (FAST_BITS-s);
1531 for (j=0; j < m; ++j) {
1532 h->fast[c+j] = (stbi_uc) i;
1533 }
1534 }
1535 }
1536 return 1;
1537 }
1538
1539 // build a table that decodes both magnitude and value of small ACs in
1540 // one go.
stbi__build_fast_ac(stbi__int16 * fast_ac,stbi__huffman * h)1541 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1542 {
1543 int i;
1544 for (i=0; i < (1 << FAST_BITS); ++i) {
1545 stbi_uc fast = h->fast[i];
1546 fast_ac[i] = 0;
1547 if (fast < 255) {
1548 int rs = h->values[fast];
1549 int run = (rs >> 4) & 15;
1550 int magbits = rs & 15;
1551 int len = h->size[fast];
1552
1553 if (magbits && len + magbits <= FAST_BITS) {
1554 // magnitude code followed by receive_extend code
1555 int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1556 int m = 1 << (magbits - 1);
1557 if (k < m) k += (-1 << magbits) + 1;
1558 // if the result is small enough, we can fit it in fast_ac table
1559 if (k >= -128 && k <= 127)
1560 fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
1561 }
1562 }
1563 }
1564 }
1565
stbi__grow_buffer_unsafe(stbi__jpeg * j)1566 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1567 {
1568 do {
1569 int b = j->nomore ? 0 : stbi__get8(j->s);
1570 if (b == 0xff) {
1571 int c = stbi__get8(j->s);
1572 if (c != 0) {
1573 j->marker = (unsigned char) c;
1574 j->nomore = 1;
1575 return;
1576 }
1577 }
1578 j->code_buffer |= b << (24 - j->code_bits);
1579 j->code_bits += 8;
1580 } while (j->code_bits <= 24);
1581 }
1582
1583 // (1 << n) - 1
1584 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
1585
1586 // decode a jpeg huffman value from the bitstream
stbi__jpeg_huff_decode(stbi__jpeg * j,stbi__huffman * h)1587 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
1588 {
1589 unsigned int temp;
1590 int c,k;
1591
1592 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1593
1594 // look at the top FAST_BITS and determine what symbol ID it is,
1595 // if the code is <= FAST_BITS
1596 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1597 k = h->fast[c];
1598 if (k < 255) {
1599 int s = h->size[k];
1600 if (s > j->code_bits)
1601 return -1;
1602 j->code_buffer <<= s;
1603 j->code_bits -= s;
1604 return h->values[k];
1605 }
1606
1607 // naive test is to shift the code_buffer down so k bits are
1608 // valid, then test against maxcode. To speed this up, we've
1609 // preshifted maxcode left so that it has (16-k) 0s at the
1610 // end; in other words, regardless of the number of bits, it
1611 // wants to be compared against something shifted to have 16;
1612 // that way we don't need to shift inside the loop.
1613 temp = j->code_buffer >> 16;
1614 for (k=FAST_BITS+1 ; ; ++k)
1615 if (temp < h->maxcode[k])
1616 break;
1617 if (k == 17) {
1618 // error! code not found
1619 j->code_bits -= 16;
1620 return -1;
1621 }
1622
1623 if (k > j->code_bits)
1624 return -1;
1625
1626 // convert the huffman code to the symbol id
1627 c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1628 STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1629
1630 // convert the id to a symbol
1631 j->code_bits -= k;
1632 j->code_buffer <<= k;
1633 return h->values[c];
1634 }
1635
1636 // bias[n] = (-1<<n) + 1
1637 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
1638
1639 // combined JPEG 'receive' and JPEG 'extend', since baseline
1640 // always extends everything it receives.
stbi__extend_receive(stbi__jpeg * j,int n)1641 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
1642 {
1643 unsigned int k;
1644 int sgn;
1645 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1646
1647 sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1648 k = stbi_lrot(j->code_buffer, n);
1649 STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
1650 j->code_buffer = k & ~stbi__bmask[n];
1651 k &= stbi__bmask[n];
1652 j->code_bits -= n;
1653 return k + (stbi__jbias[n] & ~sgn);
1654 }
1655
1656 // get some unsigned bits
stbi__jpeg_get_bits(stbi__jpeg * j,int n)1657 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
1658 {
1659 unsigned int k;
1660 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1661 k = stbi_lrot(j->code_buffer, n);
1662 j->code_buffer = k & ~stbi__bmask[n];
1663 k &= stbi__bmask[n];
1664 j->code_bits -= n;
1665 return k;
1666 }
1667
stbi__jpeg_get_bit(stbi__jpeg * j)1668 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
1669 {
1670 unsigned int k;
1671 if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1672 k = j->code_buffer;
1673 j->code_buffer <<= 1;
1674 --j->code_bits;
1675 return k & 0x80000000;
1676 }
1677
1678 // given a value that's at position X in the zigzag stream,
1679 // where does it appear in the 8x8 matrix coded as row-major?
1680 static stbi_uc stbi__jpeg_dezigzag[64+15] =
1681 {
1682 0, 1, 8, 16, 9, 2, 3, 10,
1683 17, 24, 32, 25, 18, 11, 4, 5,
1684 12, 19, 26, 33, 40, 48, 41, 34,
1685 27, 20, 13, 6, 7, 14, 21, 28,
1686 35, 42, 49, 56, 57, 50, 43, 36,
1687 29, 22, 15, 23, 30, 37, 44, 51,
1688 58, 59, 52, 45, 38, 31, 39, 46,
1689 53, 60, 61, 54, 47, 55, 62, 63,
1690 // let corrupt input sample past end
1691 63, 63, 63, 63, 63, 63, 63, 63,
1692 63, 63, 63, 63, 63, 63, 63
1693 };
1694
1695 // decode one 64-entry block--
stbi__jpeg_decode_block(stbi__jpeg * j,short data[64],stbi__huffman * hdc,stbi__huffman * hac,stbi__int16 * fac,int b,stbi_uc * dequant)1696 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1697 {
1698 int diff,dc,k;
1699 int t;
1700
1701 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1702 t = stbi__jpeg_huff_decode(j, hdc);
1703 if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1704
1705 // 0 all the ac values now so we can do it 32-bits at a time
1706 memset(data,0,64*sizeof(data[0]));
1707
1708 diff = t ? stbi__extend_receive(j, t) : 0;
1709 dc = j->img_comp[b].dc_pred + diff;
1710 j->img_comp[b].dc_pred = dc;
1711 data[0] = (short) (dc * dequant[0]);
1712
1713 // decode AC components, see JPEG spec
1714 k = 1;
1715 do {
1716 unsigned int zig;
1717 int c,r,s;
1718 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1719 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1720 r = fac[c];
1721 if (r) { // fast-AC path
1722 k += (r >> 4) & 15; // run
1723 s = r & 15; // combined length
1724 j->code_buffer <<= s;
1725 j->code_bits -= s;
1726 // decode into unzigzag'd location
1727 zig = stbi__jpeg_dezigzag[k++];
1728 data[zig] = (short) ((r >> 8) * dequant[zig]);
1729 } else {
1730 int rs = stbi__jpeg_huff_decode(j, hac);
1731 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1732 s = rs & 15;
1733 r = rs >> 4;
1734 if (s == 0) {
1735 if (rs != 0xf0) break; // end block
1736 k += 16;
1737 } else {
1738 k += r;
1739 // decode into unzigzag'd location
1740 zig = stbi__jpeg_dezigzag[k++];
1741 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1742 }
1743 }
1744 } while (k < 64);
1745 return 1;
1746 }
1747
stbi__jpeg_decode_block_prog_dc(stbi__jpeg * j,short data[64],stbi__huffman * hdc,int b)1748 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
1749 {
1750 if (j->spec_end != 0)
1751 return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1752
1753 if (j->code_bits < 16)
1754 stbi__grow_buffer_unsafe(j);
1755
1756 if (j->succ_high == 0)
1757 {
1758 int diff,dc;
1759 int t;
1760
1761 /* first scan for DC coefficient, must be first */
1762 memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
1763 t = stbi__jpeg_huff_decode(j, hdc);
1764 diff = t ? stbi__extend_receive(j, t) : 0;
1765
1766 dc = j->img_comp[b].dc_pred + diff;
1767 j->img_comp[b].dc_pred = dc;
1768 data[0] = (short) (dc << j->succ_low);
1769 }
1770 else
1771 {
1772 /* refinement scan for DC coefficient */
1773 if (stbi__jpeg_get_bit(j))
1774 data[0] += (short) (1 << j->succ_low);
1775 }
1776 return 1;
1777 }
1778
1779 // @OPTIMIZE: store non-zigzagged during the decode passes,
1780 // and only de-zigzag when dequantizing
stbi__jpeg_decode_block_prog_ac(stbi__jpeg * j,short data[64],stbi__huffman * hac,stbi__int16 * fac)1781 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1782 {
1783 int k;
1784 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1785
1786 if (j->succ_high == 0) {
1787 int shift = j->succ_low;
1788
1789 if (j->eob_run) {
1790 --j->eob_run;
1791 return 1;
1792 }
1793
1794 k = j->spec_start;
1795 do {
1796 unsigned int zig;
1797 int c,r,s;
1798 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1799 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1800 r = fac[c];
1801 if (r) { // fast-AC path
1802 k += (r >> 4) & 15; // run
1803 s = r & 15; // combined length
1804 j->code_buffer <<= s;
1805 j->code_bits -= s;
1806 zig = stbi__jpeg_dezigzag[k++];
1807 data[zig] = (short) ((r >> 8) << shift);
1808 } else {
1809 int rs = stbi__jpeg_huff_decode(j, hac);
1810 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1811 s = rs & 15;
1812 r = rs >> 4;
1813 if (s == 0) {
1814 if (r < 15) {
1815 j->eob_run = (1 << r);
1816 if (r)
1817 j->eob_run += stbi__jpeg_get_bits(j, r);
1818 --j->eob_run;
1819 break;
1820 }
1821 k += 16;
1822 } else {
1823 k += r;
1824 zig = stbi__jpeg_dezigzag[k++];
1825 data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1826 }
1827 }
1828 } while (k <= j->spec_end);
1829 } else {
1830 // refinement scan for these AC coefficients
1831
1832 short bit = (short) (1 << j->succ_low);
1833
1834 if (j->eob_run) {
1835 --j->eob_run;
1836 for (k = j->spec_start; k <= j->spec_end; ++k) {
1837 short *p = &data[stbi__jpeg_dezigzag[k]];
1838 if (*p != 0)
1839 if (stbi__jpeg_get_bit(j))
1840 if ((*p & bit)==0) {
1841 if (*p > 0)
1842 *p += bit;
1843 else
1844 *p -= bit;
1845 }
1846 }
1847 } else {
1848 k = j->spec_start;
1849 do {
1850 int r,s;
1851 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1852 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1853 s = rs & 15;
1854 r = rs >> 4;
1855 if (s == 0) {
1856 if (r < 15) {
1857 j->eob_run = (1 << r) - 1;
1858 if (r)
1859 j->eob_run += stbi__jpeg_get_bits(j, r);
1860 r = 64; // force end of block
1861 } else {
1862 // r=15 s=0 should write 16 0s, so we just do
1863 // a run of 15 0s and then write s (which is 0),
1864 // so we don't have to do anything special here
1865 }
1866 } else {
1867 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1868 // sign bit
1869 if (stbi__jpeg_get_bit(j))
1870 s = bit;
1871 else
1872 s = -bit;
1873 }
1874
1875 // advance by r
1876 while (k <= j->spec_end) {
1877 short *p = &data[stbi__jpeg_dezigzag[k++]];
1878 if (*p != 0) {
1879 if (stbi__jpeg_get_bit(j))
1880 if ((*p & bit)==0) {
1881 if (*p > 0)
1882 *p += bit;
1883 else
1884 *p -= bit;
1885 }
1886 } else {
1887 if (r == 0) {
1888 *p = (short) s;
1889 break;
1890 }
1891 --r;
1892 }
1893 }
1894 } while (k <= j->spec_end);
1895 }
1896 }
1897 return 1;
1898 }
1899
1900 // take a -128..127 value and stbi__clamp it and convert to 0..255
stbi__clamp(int x)1901 stbi_inline static stbi_uc stbi__clamp(int x)
1902 {
1903 // trick to use a single test to catch both cases
1904 if ((unsigned int) x > 255) {
1905 if (x < 0) return 0;
1906 if (x > 255) return 255;
1907 }
1908 return (stbi_uc) x;
1909 }
1910
1911 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1912 #define stbi__fsh(x) ((x) << 12)
1913
1914 // derived from jidctint -- DCT_ISLOW
1915 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1916 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1917 p2 = s2; \
1918 p3 = s6; \
1919 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1920 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1921 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1922 p2 = s0; \
1923 p3 = s4; \
1924 t0 = stbi__fsh(p2+p3); \
1925 t1 = stbi__fsh(p2-p3); \
1926 x0 = t0+t3; \
1927 x3 = t0-t3; \
1928 x1 = t1+t2; \
1929 x2 = t1-t2; \
1930 t0 = s7; \
1931 t1 = s5; \
1932 t2 = s3; \
1933 t3 = s1; \
1934 p3 = t0+t2; \
1935 p4 = t1+t3; \
1936 p1 = t0+t3; \
1937 p2 = t1+t2; \
1938 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1939 t0 = t0*stbi__f2f( 0.298631336f); \
1940 t1 = t1*stbi__f2f( 2.053119869f); \
1941 t2 = t2*stbi__f2f( 3.072711026f); \
1942 t3 = t3*stbi__f2f( 1.501321110f); \
1943 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
1944 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
1945 p3 = p3*stbi__f2f(-1.961570560f); \
1946 p4 = p4*stbi__f2f(-0.390180644f); \
1947 t3 += p1+p4; \
1948 t2 += p2+p3; \
1949 t1 += p2+p4; \
1950 t0 += p1+p3;
1951
stbi__idct_block(stbi_uc * out,int out_stride,short data[64])1952 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
1953 {
1954 int i,val[64],*v=val;
1955 stbi_uc *o;
1956 short *d = data;
1957
1958 // columns
1959 for (i=0; i < 8; ++i,++d, ++v) {
1960 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
1961 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
1962 && d[40]==0 && d[48]==0 && d[56]==0) {
1963 // no shortcut 0 seconds
1964 // (1|2|3|4|5|6|7)==0 0 seconds
1965 // all separate -0.047 seconds
1966 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
1967 int dcterm = d[0] << 2;
1968 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
1969 } else {
1970 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
1971 // constants scaled things up by 1<<12; let's bring them back
1972 // down, but keep 2 extra bits of precision
1973 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
1974 v[ 0] = (x0+t3) >> 10;
1975 v[56] = (x0-t3) >> 10;
1976 v[ 8] = (x1+t2) >> 10;
1977 v[48] = (x1-t2) >> 10;
1978 v[16] = (x2+t1) >> 10;
1979 v[40] = (x2-t1) >> 10;
1980 v[24] = (x3+t0) >> 10;
1981 v[32] = (x3-t0) >> 10;
1982 }
1983 }
1984
1985 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
1986 // no fast case since the first 1D IDCT spread components out
1987 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
1988 // constants scaled things up by 1<<12, plus we had 1<<2 from first
1989 // loop, plus horizontal and vertical each scale by sqrt(8) so together
1990 // we've got an extra 1<<3, so 1<<17 total we need to remove.
1991 // so we want to round that, which means adding 0.5 * 1<<17,
1992 // aka 65536. Also, we'll end up with -128 to 127 that we want
1993 // to encode as 0..255 by adding 128, so we'll add that before the shift
1994 x0 += 65536 + (128<<17);
1995 x1 += 65536 + (128<<17);
1996 x2 += 65536 + (128<<17);
1997 x3 += 65536 + (128<<17);
1998 // tried computing the shifts into temps, or'ing the temps to see
1999 // if any were out of range, but that was slower
2000 o[0] = stbi__clamp((x0+t3) >> 17);
2001 o[7] = stbi__clamp((x0-t3) >> 17);
2002 o[1] = stbi__clamp((x1+t2) >> 17);
2003 o[6] = stbi__clamp((x1-t2) >> 17);
2004 o[2] = stbi__clamp((x2+t1) >> 17);
2005 o[5] = stbi__clamp((x2-t1) >> 17);
2006 o[3] = stbi__clamp((x3+t0) >> 17);
2007 o[4] = stbi__clamp((x3-t0) >> 17);
2008 }
2009 }
2010
2011 #ifdef STBI_SSE2
2012 /* sse2 integer IDCT. not the fastest possible implementation but it
2013 * produces bit-identical results to the generic C version so it's
2014 * fully "transparent".
2015 */
stbi__idct_simd(stbi_uc * out,int out_stride,short data[64])2016 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2017 {
2018 /* This is constructed to match our regular (generic) integer IDCT exactly. */
2019 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2020 __m128i tmp;
2021
2022 /* dot product constant: even elems=x, odd elems=y */
2023 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2024
2025 /* out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2026 * out(1) = c1[even]*x + c1[odd]*y
2027 */
2028 #define dct_rot(out0,out1, x,y,c0,c1) \
2029 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2030 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2031 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2032 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2033 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2034 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2035
2036 /* out = in << 12 (in 16-bit, out 32-bit) */
2037 #define dct_widen(out, in) \
2038 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2039 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2040
2041 /* wide add */
2042 #define dct_wadd(out, a, b) \
2043 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2044 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2045
2046 /* wide sub */
2047 #define dct_wsub(out, a, b) \
2048 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2049 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2050
2051 /* butterfly a/b, add bias, then shift by "s" and pack */
2052 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2053 { \
2054 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2055 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2056 dct_wadd(sum, abiased, b); \
2057 dct_wsub(dif, abiased, b); \
2058 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2059 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2060 }
2061
2062 /* 8-bit interleave step (for transposes) */
2063 #define dct_interleave8(a, b) \
2064 tmp = a; \
2065 a = _mm_unpacklo_epi8(a, b); \
2066 b = _mm_unpackhi_epi8(tmp, b)
2067
2068 /* 16-bit interleave step (for transposes) */
2069 #define dct_interleave16(a, b) \
2070 tmp = a; \
2071 a = _mm_unpacklo_epi16(a, b); \
2072 b = _mm_unpackhi_epi16(tmp, b)
2073
2074 #define dct_pass(bias,shift) \
2075 { \
2076 /* even part */ \
2077 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2078 __m128i sum04 = _mm_add_epi16(row0, row4); \
2079 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2080 dct_widen(t0e, sum04); \
2081 dct_widen(t1e, dif04); \
2082 dct_wadd(x0, t0e, t3e); \
2083 dct_wsub(x3, t0e, t3e); \
2084 dct_wadd(x1, t1e, t2e); \
2085 dct_wsub(x2, t1e, t2e); \
2086 /* odd part */ \
2087 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2088 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2089 __m128i sum17 = _mm_add_epi16(row1, row7); \
2090 __m128i sum35 = _mm_add_epi16(row3, row5); \
2091 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2092 dct_wadd(x4, y0o, y4o); \
2093 dct_wadd(x5, y1o, y5o); \
2094 dct_wadd(x6, y2o, y5o); \
2095 dct_wadd(x7, y3o, y4o); \
2096 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2097 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2098 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2099 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2100 }
2101
2102 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2103 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2104 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2105 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2106 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2107 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2108 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2109 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2110
2111 /* rounding biases in column/row passes, see stbi__idct_block for explanation. */
2112 __m128i bias_0 = _mm_set1_epi32(512);
2113 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2114
2115 /* load */
2116 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2117 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2118 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2119 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2120 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2121 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2122 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2123 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2124
2125 /* column pass */
2126 dct_pass(bias_0, 10);
2127
2128 {
2129 /* 16bit 8x8 transpose pass 1 */
2130 dct_interleave16(row0, row4);
2131 dct_interleave16(row1, row5);
2132 dct_interleave16(row2, row6);
2133 dct_interleave16(row3, row7);
2134
2135 /* transpose pass 2 */
2136 dct_interleave16(row0, row2);
2137 dct_interleave16(row1, row3);
2138 dct_interleave16(row4, row6);
2139 dct_interleave16(row5, row7);
2140
2141 /* transpose pass 3 */
2142 dct_interleave16(row0, row1);
2143 dct_interleave16(row2, row3);
2144 dct_interleave16(row4, row5);
2145 dct_interleave16(row6, row7);
2146 }
2147
2148 /* row pass */
2149 dct_pass(bias_1, 17);
2150
2151 {
2152 /* pack */
2153 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2154 __m128i p1 = _mm_packus_epi16(row2, row3);
2155 __m128i p2 = _mm_packus_epi16(row4, row5);
2156 __m128i p3 = _mm_packus_epi16(row6, row7);
2157
2158 // 8bit 8x8 transpose pass 1
2159 dct_interleave8(p0, p2); // a0e0a1e1...
2160 dct_interleave8(p1, p3); // c0g0c1g1...
2161
2162 // transpose pass 2
2163 dct_interleave8(p0, p1); // a0c0e0g0...
2164 dct_interleave8(p2, p3); // b0d0f0h0...
2165
2166 // transpose pass 3
2167 dct_interleave8(p0, p2); // a0b0c0d0...
2168 dct_interleave8(p1, p3); // a4b4c4d4...
2169
2170 // store
2171 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2172 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2173 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2174 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2175 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2176 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2177 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2178 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2179 }
2180
2181 #undef dct_const
2182 #undef dct_rot
2183 #undef dct_widen
2184 #undef dct_wadd
2185 #undef dct_wsub
2186 #undef dct_bfly32o
2187 #undef dct_interleave8
2188 #undef dct_interleave16
2189 #undef dct_pass
2190 }
2191
2192 #endif /* STBI_SSE2 */
2193
2194 #ifdef STBI_NEON
2195
2196 /* NEON integer IDCT. should produce bit-identical
2197 * results to the generic C version. */
stbi__idct_simd(stbi_uc * out,int out_stride,short data[64])2198 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2199 {
2200 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2201
2202 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2203 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2204 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2205 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2206 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2207 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2208 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2209 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2210 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2211 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2212 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2213 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2214
2215 #define dct_long_mul(out, inq, coeff) \
2216 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2217 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2218
2219 #define dct_long_mac(out, acc, inq, coeff) \
2220 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2221 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2222
2223 #define dct_widen(out, inq) \
2224 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2225 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2226
2227 /* wide add */
2228 #define dct_wadd(out, a, b) \
2229 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2230 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2231
2232 /* wide sub */
2233 #define dct_wsub(out, a, b) \
2234 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2235 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2236
2237 // butterfly a/b, then shift using "shiftop" by "s" and pack
2238 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2239 { \
2240 dct_wadd(sum, a, b); \
2241 dct_wsub(dif, a, b); \
2242 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2243 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2244 }
2245
2246 #define dct_pass(shiftop, shift) \
2247 { \
2248 /* even part */ \
2249 int16x8_t sum26 = vaddq_s16(row2, row6); \
2250 dct_long_mul(p1e, sum26, rot0_0); \
2251 dct_long_mac(t2e, p1e, row6, rot0_1); \
2252 dct_long_mac(t3e, p1e, row2, rot0_2); \
2253 int16x8_t sum04 = vaddq_s16(row0, row4); \
2254 int16x8_t dif04 = vsubq_s16(row0, row4); \
2255 dct_widen(t0e, sum04); \
2256 dct_widen(t1e, dif04); \
2257 dct_wadd(x0, t0e, t3e); \
2258 dct_wsub(x3, t0e, t3e); \
2259 dct_wadd(x1, t1e, t2e); \
2260 dct_wsub(x2, t1e, t2e); \
2261 /* odd part */ \
2262 int16x8_t sum15 = vaddq_s16(row1, row5); \
2263 int16x8_t sum17 = vaddq_s16(row1, row7); \
2264 int16x8_t sum35 = vaddq_s16(row3, row5); \
2265 int16x8_t sum37 = vaddq_s16(row3, row7); \
2266 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2267 dct_long_mul(p5o, sumodd, rot1_0); \
2268 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2269 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2270 dct_long_mul(p3o, sum37, rot2_0); \
2271 dct_long_mul(p4o, sum15, rot2_1); \
2272 dct_wadd(sump13o, p1o, p3o); \
2273 dct_wadd(sump24o, p2o, p4o); \
2274 dct_wadd(sump23o, p2o, p3o); \
2275 dct_wadd(sump14o, p1o, p4o); \
2276 dct_long_mac(x4, sump13o, row7, rot3_0); \
2277 dct_long_mac(x5, sump24o, row5, rot3_1); \
2278 dct_long_mac(x6, sump23o, row3, rot3_2); \
2279 dct_long_mac(x7, sump14o, row1, rot3_3); \
2280 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2281 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2282 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2283 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2284 }
2285
2286 // load
2287 row0 = vld1q_s16(data + 0*8);
2288 row1 = vld1q_s16(data + 1*8);
2289 row2 = vld1q_s16(data + 2*8);
2290 row3 = vld1q_s16(data + 3*8);
2291 row4 = vld1q_s16(data + 4*8);
2292 row5 = vld1q_s16(data + 5*8);
2293 row6 = vld1q_s16(data + 6*8);
2294 row7 = vld1q_s16(data + 7*8);
2295
2296 // add DC bias
2297 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2298
2299 // column pass
2300 dct_pass(vrshrn_n_s32, 10);
2301
2302 // 16bit 8x8 transpose
2303 {
2304 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2305 // whether compilers actually get this is another story, sadly.
2306 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2307 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2308 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2309
2310 // pass 1
2311 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2312 dct_trn16(row2, row3);
2313 dct_trn16(row4, row5);
2314 dct_trn16(row6, row7);
2315
2316 // pass 2
2317 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2318 dct_trn32(row1, row3);
2319 dct_trn32(row4, row6);
2320 dct_trn32(row5, row7);
2321
2322 // pass 3
2323 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2324 dct_trn64(row1, row5);
2325 dct_trn64(row2, row6);
2326 dct_trn64(row3, row7);
2327
2328 #undef dct_trn16
2329 #undef dct_trn32
2330 #undef dct_trn64
2331 }
2332
2333 // row pass
2334 // vrshrn_n_s32 only supports shifts up to 16, we need
2335 // 17. so do a non-rounding shift of 16 first then follow
2336 // up with a rounding shift by 1.
2337 dct_pass(vshrn_n_s32, 16);
2338
2339 {
2340 /* pack and round */
2341 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2342 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2343 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2344 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2345 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2346 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2347 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2348 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2349
2350 /* again, these can translate into one instruction, but often don't. */
2351 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2352 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2353 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2354
2355 /* sadly can't use interleaved stores here since we only write
2356 * 8 bytes to each scan line! */
2357
2358 /* 8x8 8-bit transpose pass 1 */
2359 dct_trn8_8(p0, p1);
2360 dct_trn8_8(p2, p3);
2361 dct_trn8_8(p4, p5);
2362 dct_trn8_8(p6, p7);
2363
2364 /* pass 2 */
2365 dct_trn8_16(p0, p2);
2366 dct_trn8_16(p1, p3);
2367 dct_trn8_16(p4, p6);
2368 dct_trn8_16(p5, p7);
2369
2370 /* pass 3 */
2371 dct_trn8_32(p0, p4);
2372 dct_trn8_32(p1, p5);
2373 dct_trn8_32(p2, p6);
2374 dct_trn8_32(p3, p7);
2375
2376 /* store */
2377 vst1_u8(out, p0); out += out_stride;
2378 vst1_u8(out, p1); out += out_stride;
2379 vst1_u8(out, p2); out += out_stride;
2380 vst1_u8(out, p3); out += out_stride;
2381 vst1_u8(out, p4); out += out_stride;
2382 vst1_u8(out, p5); out += out_stride;
2383 vst1_u8(out, p6); out += out_stride;
2384 vst1_u8(out, p7);
2385
2386 #undef dct_trn8_8
2387 #undef dct_trn8_16
2388 #undef dct_trn8_32
2389 }
2390
2391 #undef dct_long_mul
2392 #undef dct_long_mac
2393 #undef dct_widen
2394 #undef dct_wadd
2395 #undef dct_wsub
2396 #undef dct_bfly32o
2397 #undef dct_pass
2398 }
2399
2400 #endif /* STBI_NEON */
2401
2402 #define STBI__MARKER_none 0xff
2403 /* if there's a pending marker from the entropy stream, return that
2404 * otherwise, fetch from the stream and get a marker. if there's no
2405 * marker, return 0xff, which is never a valid marker value
2406 */
stbi__get_marker(stbi__jpeg * j)2407 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2408 {
2409 stbi_uc x;
2410 if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2411 x = stbi__get8(j->s);
2412 if (x != 0xff) return STBI__MARKER_none;
2413 while (x == 0xff)
2414 x = stbi__get8(j->s);
2415 return x;
2416 }
2417
2418 /* in each scan, we'll have scan_n components, and the order
2419 * of the components is specified by order[]
2420 */
2421 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2422
2423 /* after a restart interval, stbi__jpeg_reset the entropy decoder and
2424 * the dc prediction
2425 */
stbi__jpeg_reset(stbi__jpeg * j)2426 static void stbi__jpeg_reset(stbi__jpeg *j)
2427 {
2428 j->code_bits = 0;
2429 j->code_buffer = 0;
2430 j->nomore = 0;
2431 j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
2432 j->marker = STBI__MARKER_none;
2433 j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2434 j->eob_run = 0;
2435 // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2436 // since we don't even allow 1<<30 pixels
2437 }
2438
stbi__parse_entropy_coded_data(stbi__jpeg * z)2439 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2440 {
2441 stbi__jpeg_reset(z);
2442 if (!z->progressive) {
2443 if (z->scan_n == 1) {
2444 int i,j;
2445 STBI_SIMD_ALIGN(short, data[64]);
2446 int n = z->order[0];
2447 // non-interleaved data, we just need to process one block at a time,
2448 // in trivial scanline order
2449 // number of blocks to do just depends on how many actual "pixels" this
2450 // component has, independent of interleaved MCU blocking and such
2451 int w = (z->img_comp[n].x+7) >> 3;
2452 int h = (z->img_comp[n].y+7) >> 3;
2453 for (j=0; j < h; ++j) {
2454 for (i=0; i < w; ++i) {
2455 int ha = z->img_comp[n].ha;
2456 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2457 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2458 // every data block is an MCU, so countdown the restart interval
2459 if (--z->todo <= 0) {
2460 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2461 // if it's NOT a restart, then just bail, so we get corrupt data
2462 // rather than no data
2463 if (!STBI__RESTART(z->marker)) return 1;
2464 stbi__jpeg_reset(z);
2465 }
2466 }
2467 }
2468 return 1;
2469 } else { // interleaved
2470 int i,j,k,x,y;
2471 STBI_SIMD_ALIGN(short, data[64]);
2472 for (j=0; j < z->img_mcu_y; ++j) {
2473 for (i=0; i < z->img_mcu_x; ++i) {
2474 // scan an interleaved mcu... process scan_n components in order
2475 for (k=0; k < z->scan_n; ++k) {
2476 int n = z->order[k];
2477 // scan out an mcu's worth of this component; that's just determined
2478 // by the basic H and V specified for the component
2479 for (y=0; y < z->img_comp[n].v; ++y) {
2480 for (x=0; x < z->img_comp[n].h; ++x) {
2481 int x2 = (i*z->img_comp[n].h + x)*8;
2482 int y2 = (j*z->img_comp[n].v + y)*8;
2483 int ha = z->img_comp[n].ha;
2484 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2485 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2486 }
2487 }
2488 }
2489 // after all interleaved components, that's an interleaved MCU,
2490 // so now count down the restart interval
2491 if (--z->todo <= 0) {
2492 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2493 if (!STBI__RESTART(z->marker)) return 1;
2494 stbi__jpeg_reset(z);
2495 }
2496 }
2497 }
2498 return 1;
2499 }
2500 } else {
2501 if (z->scan_n == 1) {
2502 int i,j;
2503 int n = z->order[0];
2504 // non-interleaved data, we just need to process one block at a time,
2505 // in trivial scanline order
2506 // number of blocks to do just depends on how many actual "pixels" this
2507 // component has, independent of interleaved MCU blocking and such
2508 int w = (z->img_comp[n].x+7) >> 3;
2509 int h = (z->img_comp[n].y+7) >> 3;
2510 for (j=0; j < h; ++j) {
2511 for (i=0; i < w; ++i) {
2512 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2513 if (z->spec_start == 0) {
2514 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2515 return 0;
2516 } else {
2517 int ha = z->img_comp[n].ha;
2518 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2519 return 0;
2520 }
2521 // every data block is an MCU, so countdown the restart interval
2522 if (--z->todo <= 0) {
2523 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2524 if (!STBI__RESTART(z->marker)) return 1;
2525 stbi__jpeg_reset(z);
2526 }
2527 }
2528 }
2529 return 1;
2530 } else { // interleaved
2531 int i,j,k,x,y;
2532 for (j=0; j < z->img_mcu_y; ++j) {
2533 for (i=0; i < z->img_mcu_x; ++i) {
2534 // scan an interleaved mcu... process scan_n components in order
2535 for (k=0; k < z->scan_n; ++k) {
2536 int n = z->order[k];
2537 // scan out an mcu's worth of this component; that's just determined
2538 // by the basic H and V specified for the component
2539 for (y=0; y < z->img_comp[n].v; ++y) {
2540 for (x=0; x < z->img_comp[n].h; ++x) {
2541 int x2 = (i*z->img_comp[n].h + x);
2542 int y2 = (j*z->img_comp[n].v + y);
2543 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2544 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2545 return 0;
2546 }
2547 }
2548 }
2549 // after all interleaved components, that's an interleaved MCU,
2550 // so now count down the restart interval
2551 if (--z->todo <= 0) {
2552 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2553 if (!STBI__RESTART(z->marker)) return 1;
2554 stbi__jpeg_reset(z);
2555 }
2556 }
2557 }
2558 return 1;
2559 }
2560 }
2561 }
2562
stbi__jpeg_dequantize(short * data,stbi_uc * dequant)2563 static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
2564 {
2565 int i;
2566 for (i=0; i < 64; ++i)
2567 data[i] *= dequant[i];
2568 }
2569
stbi__jpeg_finish(stbi__jpeg * z)2570 static void stbi__jpeg_finish(stbi__jpeg *z)
2571 {
2572 if (z->progressive) {
2573 // dequantize and idct the data
2574 int i,j,n;
2575 for (n=0; n < z->s->img_n; ++n) {
2576 int w = (z->img_comp[n].x+7) >> 3;
2577 int h = (z->img_comp[n].y+7) >> 3;
2578 for (j=0; j < h; ++j) {
2579 for (i=0; i < w; ++i) {
2580 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2581 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2582 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2583 }
2584 }
2585 }
2586 }
2587 }
2588
stbi__process_marker(stbi__jpeg * z,int m)2589 static int stbi__process_marker(stbi__jpeg *z, int m)
2590 {
2591 int L;
2592 switch (m) {
2593 case STBI__MARKER_none: // no marker found
2594 return stbi__err("expected marker","Corrupt JPEG");
2595
2596 case 0xDD: // DRI - specify restart interval
2597 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2598 z->restart_interval = stbi__get16be(z->s);
2599 return 1;
2600
2601 case 0xDB: // DQT - define quantization table
2602 L = stbi__get16be(z->s)-2;
2603 while (L > 0) {
2604 int q = stbi__get8(z->s);
2605 int p = q >> 4;
2606 int t = q & 15,i;
2607 if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2608 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2609 for (i=0; i < 64; ++i)
2610 z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2611 L -= 65;
2612 }
2613 return L==0;
2614
2615 case 0xC4: // DHT - define huffman table
2616 L = stbi__get16be(z->s)-2;
2617 while (L > 0) {
2618 stbi_uc *v;
2619 int sizes[16],i,n=0;
2620 int q = stbi__get8(z->s);
2621 int tc = q >> 4;
2622 int th = q & 15;
2623 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2624 for (i=0; i < 16; ++i) {
2625 sizes[i] = stbi__get8(z->s);
2626 n += sizes[i];
2627 }
2628 L -= 17;
2629 if (tc == 0) {
2630 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2631 v = z->huff_dc[th].values;
2632 } else {
2633 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2634 v = z->huff_ac[th].values;
2635 }
2636 for (i=0; i < n; ++i)
2637 v[i] = stbi__get8(z->s);
2638 if (tc != 0)
2639 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2640 L -= n;
2641 }
2642 return L==0;
2643 }
2644 // check for comment block or APP blocks
2645 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2646 stbi__skip(z->s, stbi__get16be(z->s)-2);
2647 return 1;
2648 }
2649 return 0;
2650 }
2651
2652 // after we see SOS
stbi__process_scan_header(stbi__jpeg * z)2653 static int stbi__process_scan_header(stbi__jpeg *z)
2654 {
2655 int i;
2656 int Ls = stbi__get16be(z->s);
2657
2658 z->scan_n = stbi__get8(z->s);
2659
2660 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n)
2661 return stbi__err("bad SOS component count","Corrupt JPEG");
2662 if (Ls != 6+2*z->scan_n)
2663 return stbi__err("bad SOS len","Corrupt JPEG");
2664
2665 for (i=0; i < z->scan_n; ++i)
2666 {
2667 int id = stbi__get8(z->s), which;
2668 int q = stbi__get8(z->s);
2669
2670 for (which = 0; which < z->s->img_n; ++which)
2671 if (z->img_comp[which].id == id)
2672 break;
2673 if (which == z->s->img_n)
2674 return 0; /* no match */
2675
2676 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3)
2677 return stbi__err("bad DC huff","Corrupt JPEG");
2678 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3)
2679 return stbi__err("bad AC huff","Corrupt JPEG");
2680 z->order[i] = which;
2681 }
2682
2683 {
2684 int aa;
2685 z->spec_start = stbi__get8(z->s);
2686 z->spec_end = stbi__get8(z->s); /* should be 63, but might be 0 */
2687 aa = stbi__get8(z->s);
2688 z->succ_high = (aa >> 4);
2689 z->succ_low = (aa & 15);
2690 if (z->progressive) {
2691 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2692 return stbi__err("bad SOS", "Corrupt JPEG");
2693 } else {
2694 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2695 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2696 z->spec_end = 63;
2697 }
2698 }
2699
2700 return 1;
2701 }
2702
stbi__process_frame_header(stbi__jpeg * z,int scan)2703 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2704 {
2705 stbi__context *s = z->s;
2706 int Lf,p,i,q, h_max=1,v_max=1,c;
2707 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2708 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2709 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2710 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2711 c = stbi__get8(s);
2712 if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2713 s->img_n = c;
2714 for (i=0; i < c; ++i) {
2715 z->img_comp[i].data = NULL;
2716 z->img_comp[i].linebuf = NULL;
2717 }
2718
2719 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2720
2721 for (i=0; i < s->img_n; ++i) {
2722 z->img_comp[i].id = stbi__get8(s);
2723 if (z->img_comp[i].id != i+1) // JFIF requires
2724 if (z->img_comp[i].id != i) // some version of jpegtran outputs non-JFIF-compliant files!
2725 return stbi__err("bad component ID","Corrupt JPEG");
2726 q = stbi__get8(s);
2727 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2728 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2729 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2730 }
2731
2732 if (scan != STBI__SCAN_load) return 1;
2733
2734 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2735
2736 for (i=0; i < s->img_n; ++i) {
2737 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2738 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2739 }
2740
2741 // compute interleaved mcu info
2742 z->img_h_max = h_max;
2743 z->img_v_max = v_max;
2744 z->img_mcu_w = h_max * 8;
2745 z->img_mcu_h = v_max * 8;
2746 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2747 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2748
2749 for (i=0; i < s->img_n; ++i) {
2750 // number of effective pixels (e.g. for non-interleaved MCU)
2751 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2752 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2753 // to simplify generation, we'll allocate enough memory to decode
2754 // the bogus oversized data from using interleaved MCUs and their
2755 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2756 // discard the extra data until colorspace conversion
2757 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2758 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2759 z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2760
2761 if (z->img_comp[i].raw_data == NULL) {
2762 for(--i; i >= 0; --i) {
2763 STBI_FREE(z->img_comp[i].raw_data);
2764 z->img_comp[i].data = NULL;
2765 }
2766 return stbi__err("outofmem", "Out of memory");
2767 }
2768 // align blocks for idct using mmx/sse
2769 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2770 z->img_comp[i].linebuf = NULL;
2771 if (z->progressive) {
2772 z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2773 z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
2774 z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
2775 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2776 } else {
2777 z->img_comp[i].coeff = 0;
2778 z->img_comp[i].raw_coeff = 0;
2779 }
2780 }
2781
2782 return 1;
2783 }
2784
2785 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2786 #define stbi__DNL(x) ((x) == 0xdc)
2787 #define stbi__SOI(x) ((x) == 0xd8)
2788 #define stbi__EOI(x) ((x) == 0xd9)
2789 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2790 #define stbi__SOS(x) ((x) == 0xda)
2791
2792 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2793
stbi__decode_jpeg_header(stbi__jpeg * z,int scan)2794 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2795 {
2796 int m;
2797 z->marker = STBI__MARKER_none; // initialize cached marker to empty
2798 m = stbi__get_marker(z);
2799 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2800 if (scan == STBI__SCAN_type) return 1;
2801 m = stbi__get_marker(z);
2802 while (!stbi__SOF(m)) {
2803 if (!stbi__process_marker(z,m)) return 0;
2804 m = stbi__get_marker(z);
2805 while (m == STBI__MARKER_none) {
2806 // some files have extra padding after their blocks, so ok, we'll scan
2807 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2808 m = stbi__get_marker(z);
2809 }
2810 }
2811 z->progressive = stbi__SOF_progressive(m);
2812 if (!stbi__process_frame_header(z, scan)) return 0;
2813 return 1;
2814 }
2815
2816 // decode image to YCbCr format
stbi__decode_jpeg_image(stbi__jpeg * j)2817 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2818 {
2819 int m;
2820 for (m = 0; m < 4; m++) {
2821 j->img_comp[m].raw_data = NULL;
2822 j->img_comp[m].raw_coeff = NULL;
2823 }
2824 j->restart_interval = 0;
2825 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2826 m = stbi__get_marker(j);
2827 while (!stbi__EOI(m)) {
2828 if (stbi__SOS(m)) {
2829 if (!stbi__process_scan_header(j)) return 0;
2830 if (!stbi__parse_entropy_coded_data(j)) return 0;
2831 if (j->marker == STBI__MARKER_none ) {
2832 // handle 0s at the end of image data from IP Kamera 9060
2833 while (!stbi__at_eof(j->s)) {
2834 int x = stbi__get8(j->s);
2835 if (x == 255) {
2836 j->marker = stbi__get8(j->s);
2837 break;
2838 } else if (x != 0) {
2839 return stbi__err("junk before marker", "Corrupt JPEG");
2840 }
2841 }
2842 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2843 }
2844 } else {
2845 if (!stbi__process_marker(j, m)) return 0;
2846 }
2847 m = stbi__get_marker(j);
2848 }
2849 if (j->progressive)
2850 stbi__jpeg_finish(j);
2851 return 1;
2852 }
2853
2854 // static jfif-centered resampling (across block boundaries)
2855
2856 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2857 int w, int hs);
2858
2859 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2860
resample_row_1(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2861 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2862 {
2863 STBI_NOTUSED(out);
2864 STBI_NOTUSED(in_far);
2865 STBI_NOTUSED(w);
2866 STBI_NOTUSED(hs);
2867 return in_near;
2868 }
2869
stbi__resample_row_v_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2870 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2871 {
2872 // need to generate two samples vertically for every one in input
2873 int i;
2874 STBI_NOTUSED(hs);
2875 for (i=0; i < w; ++i)
2876 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2877 return out;
2878 }
2879
stbi__resample_row_h_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2880 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2881 {
2882 // need to generate two samples horizontally for every one in input
2883 int i;
2884 stbi_uc *input = in_near;
2885
2886 if (w == 1) {
2887 // if only one sample, can't do any interpolation
2888 out[0] = out[1] = input[0];
2889 return out;
2890 }
2891
2892 out[0] = input[0];
2893 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2894 for (i=1; i < w-1; ++i) {
2895 int n = 3*input[i]+2;
2896 out[i*2+0] = stbi__div4(n+input[i-1]);
2897 out[i*2+1] = stbi__div4(n+input[i+1]);
2898 }
2899 out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
2900 out[i*2+1] = input[w-1];
2901
2902 STBI_NOTUSED(in_far);
2903 STBI_NOTUSED(hs);
2904
2905 return out;
2906 }
2907
2908 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2909
stbi__resample_row_hv_2(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2910 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2911 {
2912 // need to generate 2x2 samples for every one in input
2913 int i,t0,t1;
2914 if (w == 1) {
2915 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2916 return out;
2917 }
2918
2919 t1 = 3*in_near[0] + in_far[0];
2920 out[0] = stbi__div4(t1+2);
2921 for (i=1; i < w; ++i) {
2922 t0 = t1;
2923 t1 = 3*in_near[i]+in_far[i];
2924 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2925 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2926 }
2927 out[w*2-1] = stbi__div4(t1+2);
2928
2929 STBI_NOTUSED(hs);
2930
2931 return out;
2932 }
2933
2934 #if defined(STBI_SSE2) || defined(STBI_NEON)
stbi__resample_row_hv_2_simd(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)2935 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2936 {
2937 /* need to generate 2x2 samples for every one in input */
2938 int i=0,t0,t1;
2939
2940 if (w == 1) {
2941 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2942 return out;
2943 }
2944
2945 t1 = 3*in_near[0] + in_far[0];
2946 /* process groups of 8 pixels for as long as we can.
2947 * note we can't handle the last pixel in a row in this loop
2948 * because we need to handle the filter boundary conditions.
2949 */
2950 for (; i < ((w-1) & ~7); i += 8)
2951 {
2952 #if defined(STBI_SSE2)
2953 /* load and perform the vertical filtering pass
2954 * this uses 3*x + y = 4*x + (y - x) */
2955 __m128i zero = _mm_setzero_si128();
2956 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2957 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2958 __m128i farw = _mm_unpacklo_epi8(farb, zero);
2959 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
2960 __m128i diff = _mm_sub_epi16(farw, nearw);
2961 __m128i nears = _mm_slli_epi16(nearw, 2);
2962 __m128i curr = _mm_add_epi16(nears, diff); /* current row */
2963
2964 /* horizontal filter works the same based on shifted vers of current
2965 * row. "prev" is current row shifted right by 1 pixel; we need to
2966 * insert the previous pixel value (from t1).
2967 * "next" is current row shifted left by 1 pixel, with first pixel
2968 * of next block of 8 pixels added in.
2969 */
2970 __m128i prv0 = _mm_slli_si128(curr, 2);
2971 __m128i nxt0 = _mm_srli_si128(curr, 2);
2972 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
2973 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
2974
2975 /* horizontal filter, polyphase implementation since it's convenient:
2976 * even pixels = 3*cur + prev = cur*4 + (prev - cur)
2977 * odd pixels = 3*cur + next = cur*4 + (next - cur)
2978 * note the shared term. */
2979 __m128i bias = _mm_set1_epi16(8);
2980 __m128i curs = _mm_slli_epi16(curr, 2);
2981 __m128i prvd = _mm_sub_epi16(prev, curr);
2982 __m128i nxtd = _mm_sub_epi16(next, curr);
2983 __m128i curb = _mm_add_epi16(curs, bias);
2984 __m128i even = _mm_add_epi16(prvd, curb);
2985 __m128i odd = _mm_add_epi16(nxtd, curb);
2986
2987 /* interleave even and odd pixels, then undo scaling. */
2988 __m128i int0 = _mm_unpacklo_epi16(even, odd);
2989 __m128i int1 = _mm_unpackhi_epi16(even, odd);
2990 __m128i de0 = _mm_srli_epi16(int0, 4);
2991 __m128i de1 = _mm_srli_epi16(int1, 4);
2992
2993 /* pack and write output */
2994 __m128i outv = _mm_packus_epi16(de0, de1);
2995 _mm_storeu_si128((__m128i *) (out + i*2), outv);
2996 #elif defined(STBI_NEON)
2997 // load and perform the vertical filtering pass
2998 // this uses 3*x + y = 4*x + (y - x)
2999 uint8x8_t farb = vld1_u8(in_far + i);
3000 uint8x8_t nearb = vld1_u8(in_near + i);
3001 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3002 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3003 int16x8_t curr = vaddq_s16(nears, diff); // current row
3004
3005 // horizontal filter works the same based on shifted vers of current
3006 // row. "prev" is current row shifted right by 1 pixel; we need to
3007 // insert the previous pixel value (from t1).
3008 // "next" is current row shifted left by 1 pixel, with first pixel
3009 // of next block of 8 pixels added in.
3010 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3011 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3012 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3013 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3014
3015 /* horizontal filter, polyphase implementation since it's convenient:
3016 * even pixels = 3*cur + prev = cur*4 + (prev - cur)
3017 * odd pixels = 3*cur + next = cur*4 + (next - cur)
3018 * note the shared term.
3019 */
3020 int16x8_t curs = vshlq_n_s16(curr, 2);
3021 int16x8_t prvd = vsubq_s16(prev, curr);
3022 int16x8_t nxtd = vsubq_s16(next, curr);
3023 int16x8_t even = vaddq_s16(curs, prvd);
3024 int16x8_t odd = vaddq_s16(curs, nxtd);
3025
3026 /* undo scaling and round, then store with even/odd phases interleaved */
3027 uint8x8x2_t o;
3028 o.val[0] = vqrshrun_n_s16(even, 4);
3029 o.val[1] = vqrshrun_n_s16(odd, 4);
3030 vst2_u8(out + i*2, o);
3031 #endif
3032
3033 /* "previous" value for next iteration */
3034 t1 = 3*in_near[i+7] + in_far[i+7];
3035 }
3036
3037 t0 = t1;
3038 t1 = 3*in_near[i] + in_far[i];
3039 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3040
3041 for (++i; i < w; ++i) {
3042 t0 = t1;
3043 t1 = 3*in_near[i]+in_far[i];
3044 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3045 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3046 }
3047 out[w*2-1] = stbi__div4(t1+2);
3048
3049 STBI_NOTUSED(hs);
3050
3051 return out;
3052 }
3053 #endif
3054
stbi__resample_row_generic(stbi_uc * out,stbi_uc * in_near,stbi_uc * in_far,int w,int hs)3055 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3056 {
3057 /* resample with nearest-neighbor */
3058 int i,j;
3059 STBI_NOTUSED(in_far);
3060 for (i=0; i < w; ++i)
3061 for (j=0; j < hs; ++j)
3062 out[i*hs+j] = in_near[i];
3063 return out;
3064 }
3065
3066 #ifdef STBI_JPEG_OLD
3067 /* this is the same YCbCr-to-RGB calculation that stb_image has used
3068 * historically before the algorithm changes in 1.49 */
3069 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3070 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3071 {
3072 int i;
3073 for (i=0; i < count; ++i) {
3074 int y_fixed = (y[i] << 16) + 32768; // rounding
3075 int r,g,b;
3076 int cr = pcr[i] - 128;
3077 int cb = pcb[i] - 128;
3078 r = y_fixed + cr*float2fixed(1.40200f);
3079 g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3080 b = y_fixed + cb*float2fixed(1.77200f);
3081 r >>= 16;
3082 g >>= 16;
3083 b >>= 16;
3084 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3085 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3086 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3087 out[0] = (stbi_uc)r;
3088 out[1] = (stbi_uc)g;
3089 out[2] = (stbi_uc)b;
3090 out[3] = 255;
3091 out += step;
3092 }
3093 }
3094 #else
3095 /* this is a reduced-precision calculation of YCbCr-to-RGB introduced
3096 * to make sure the code produces the same results in both SIMD and scalar */
3097 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
stbi__YCbCr_to_RGB_row(stbi_uc * out,const stbi_uc * y,const stbi_uc * pcb,const stbi_uc * pcr,int count,int step)3098 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3099 {
3100 int i;
3101 for (i=0; i < count; ++i) {
3102 int y_fixed = (y[i] << 20) + (1<<19); /* rounding */
3103 int r,g,b;
3104 int cr = pcr[i] - 128;
3105 int cb = pcb[i] - 128;
3106 r = y_fixed + cr* float2fixed(1.40200f);
3107 g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3108 b = y_fixed + cb* float2fixed(1.77200f);
3109 r >>= 20;
3110 g >>= 20;
3111 b >>= 20;
3112 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3113 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3114 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3115 out[0] = (stbi_uc)r;
3116 out[1] = (stbi_uc)g;
3117 out[2] = (stbi_uc)b;
3118 out[3] = 255;
3119 out += step;
3120 }
3121 }
3122 #endif
3123
3124 #if defined(STBI_SSE2) || defined(STBI_NEON)
stbi__YCbCr_to_RGB_simd(stbi_uc * out,stbi_uc const * y,stbi_uc const * pcb,stbi_uc const * pcr,int count,int step)3125 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3126 {
3127 int i = 0;
3128
3129 #ifdef STBI_SSE2
3130 /* step == 3 is pretty ugly on the final interleave, and i'm not convinced
3131 * it's useful in practice (you wouldn't use it for textures, for example).
3132 * so just accelerate step == 4 case.
3133 */
3134 if (step == 4)
3135 {
3136 /* this is a fairly straightforward implementation and not super-optimized. */
3137 __m128i signflip = _mm_set1_epi8(-0x80);
3138 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3139 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3140 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3141 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3142 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3143 __m128i xw = _mm_set1_epi16(255); /* alpha channel */
3144
3145 for (; i+7 < count; i += 8)
3146 {
3147 // load
3148 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3149 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3150 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3151 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3152 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3153
3154 // unpack to short (and left-shift cr, cb by 8)
3155 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3156 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3157 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3158
3159 // color transform
3160 __m128i yws = _mm_srli_epi16(yw, 4);
3161 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3162 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3163 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3164 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3165 __m128i rws = _mm_add_epi16(cr0, yws);
3166 __m128i gwt = _mm_add_epi16(cb0, yws);
3167 __m128i bws = _mm_add_epi16(yws, cb1);
3168 __m128i gws = _mm_add_epi16(gwt, cr1);
3169
3170 // descale
3171 __m128i rw = _mm_srai_epi16(rws, 4);
3172 __m128i bw = _mm_srai_epi16(bws, 4);
3173 __m128i gw = _mm_srai_epi16(gws, 4);
3174
3175 // back to byte, set up for transpose
3176 __m128i brb = _mm_packus_epi16(rw, bw);
3177 __m128i gxb = _mm_packus_epi16(gw, xw);
3178
3179 // transpose to interleave channels
3180 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3181 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3182 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3183 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3184
3185 // store
3186 _mm_storeu_si128((__m128i *) (out + 0), o0);
3187 _mm_storeu_si128((__m128i *) (out + 16), o1);
3188 out += 32;
3189 }
3190 }
3191 #endif
3192
3193 #ifdef STBI_NEON
3194 // in this version, step=3 support would be easy to add. but is there demand?
3195 if (step == 4) {
3196 // this is a fairly straightforward implementation and not super-optimized.
3197 uint8x8_t signflip = vdup_n_u8(0x80);
3198 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3199 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3200 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3201 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3202
3203 for (; i+7 < count; i += 8) {
3204 // load
3205 uint8x8_t y_bytes = vld1_u8(y + i);
3206 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3207 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3208 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3209 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3210
3211 // expand to s16
3212 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3213 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3214 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3215
3216 // color transform
3217 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3218 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3219 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3220 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3221 int16x8_t rws = vaddq_s16(yws, cr0);
3222 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3223 int16x8_t bws = vaddq_s16(yws, cb1);
3224
3225 // undo scaling, round, convert to byte
3226 uint8x8x4_t o;
3227 o.val[0] = vqrshrun_n_s16(rws, 4);
3228 o.val[1] = vqrshrun_n_s16(gws, 4);
3229 o.val[2] = vqrshrun_n_s16(bws, 4);
3230 o.val[3] = vdup_n_u8(255);
3231
3232 // store, interleaving r/g/b/a
3233 vst4_u8(out, o);
3234 out += 8*4;
3235 }
3236 }
3237 #endif
3238
3239 for (; i < count; ++i) {
3240 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3241 int r,g,b;
3242 int cr = pcr[i] - 128;
3243 int cb = pcb[i] - 128;
3244 r = y_fixed + cr* float2fixed(1.40200f);
3245 g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3246 b = y_fixed + cb* float2fixed(1.77200f);
3247 r >>= 20;
3248 g >>= 20;
3249 b >>= 20;
3250 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3251 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3252 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3253 out[0] = (stbi_uc)r;
3254 out[1] = (stbi_uc)g;
3255 out[2] = (stbi_uc)b;
3256 out[3] = 255;
3257 out += step;
3258 }
3259 }
3260 #endif
3261
3262 /* set up the kernels */
stbi__setup_jpeg(stbi__jpeg * j)3263 static void stbi__setup_jpeg(stbi__jpeg *j)
3264 {
3265 j->idct_block_kernel = stbi__idct_block;
3266 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3267 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3268
3269 #ifdef STBI_SSE2
3270 if (stbi__sse2_available()) {
3271 j->idct_block_kernel = stbi__idct_simd;
3272 #ifndef STBI_JPEG_OLD
3273 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3274 #endif
3275 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3276 }
3277 #endif
3278
3279 #ifdef STBI_NEON
3280 j->idct_block_kernel = stbi__idct_simd;
3281 #ifndef STBI_JPEG_OLD
3282 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3283 #endif
3284 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3285 #endif
3286 }
3287
3288 /* clean up the temporary component buffers */
stbi__cleanup_jpeg(stbi__jpeg * j)3289 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3290 {
3291 int i;
3292 for (i=0; i < j->s->img_n; ++i) {
3293 if (j->img_comp[i].raw_data) {
3294 STBI_FREE(j->img_comp[i].raw_data);
3295 j->img_comp[i].raw_data = NULL;
3296 j->img_comp[i].data = NULL;
3297 }
3298 if (j->img_comp[i].raw_coeff) {
3299 STBI_FREE(j->img_comp[i].raw_coeff);
3300 j->img_comp[i].raw_coeff = 0;
3301 j->img_comp[i].coeff = 0;
3302 }
3303 if (j->img_comp[i].linebuf) {
3304 STBI_FREE(j->img_comp[i].linebuf);
3305 j->img_comp[i].linebuf = NULL;
3306 }
3307 }
3308 }
3309
3310 typedef struct
3311 {
3312 resample_row_func resample;
3313 stbi_uc *line0,*line1;
3314 int hs,vs; // expansion factor in each axis
3315 int w_lores; // horizontal pixels pre-expansion
3316 int ystep; // how far through vertical expansion we are
3317 int ypos; // which pre-expansion row we're on
3318 } stbi__resample;
3319
load_jpeg_image(stbi__jpeg * z,int * out_x,int * out_y,int * comp,int req_comp)3320 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3321 {
3322 int n, decode_n;
3323 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3324
3325 // validate req_comp
3326 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3327
3328 // load a jpeg image from whichever source, but leave in YCbCr format
3329 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3330
3331 // determine actual number of components to generate
3332 n = req_comp ? req_comp : z->s->img_n;
3333
3334 if (z->s->img_n == 3 && n < 3)
3335 decode_n = 1;
3336 else
3337 decode_n = z->s->img_n;
3338
3339 // resample and color-convert
3340 {
3341 int k;
3342 unsigned int i,j;
3343 stbi_uc *output;
3344 stbi_uc *coutput[4];
3345
3346 stbi__resample res_comp[4];
3347
3348 for (k=0; k < decode_n; ++k) {
3349 stbi__resample *r = &res_comp[k];
3350
3351 // allocate line buffer big enough for upsampling off the edges
3352 // with upsample factor of 4
3353 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3354 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3355
3356 r->hs = z->img_h_max / z->img_comp[k].h;
3357 r->vs = z->img_v_max / z->img_comp[k].v;
3358 r->ystep = r->vs >> 1;
3359 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3360 r->ypos = 0;
3361 r->line0 = r->line1 = z->img_comp[k].data;
3362
3363 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3364 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3365 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3366 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3367 else r->resample = stbi__resample_row_generic;
3368 }
3369
3370 // can't error after this so, this is safe
3371 output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3372 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3373
3374 // now go ahead and resample
3375 for (j=0; j < z->s->img_y; ++j) {
3376 stbi_uc *out = output + n * z->s->img_x * j;
3377 for (k=0; k < decode_n; ++k) {
3378 stbi__resample *r = &res_comp[k];
3379 int y_bot = r->ystep >= (r->vs >> 1);
3380 coutput[k] = r->resample(z->img_comp[k].linebuf,
3381 y_bot ? r->line1 : r->line0,
3382 y_bot ? r->line0 : r->line1,
3383 r->w_lores, r->hs);
3384 if (++r->ystep >= r->vs) {
3385 r->ystep = 0;
3386 r->line0 = r->line1;
3387 if (++r->ypos < z->img_comp[k].y)
3388 r->line1 += z->img_comp[k].w2;
3389 }
3390 }
3391 if (n >= 3) {
3392 stbi_uc *y = coutput[0];
3393 if (z->s->img_n == 3) {
3394 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3395 } else
3396 for (i=0; i < z->s->img_x; ++i) {
3397 out[0] = out[1] = out[2] = y[i];
3398 out[3] = 255; // not used if n==3
3399 out += n;
3400 }
3401 } else {
3402 stbi_uc *y = coutput[0];
3403 if (n == 1)
3404 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3405 else
3406 for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3407 }
3408 }
3409 stbi__cleanup_jpeg(z);
3410 *out_x = z->s->img_x;
3411 *out_y = z->s->img_y;
3412 if (comp) *comp = z->s->img_n; // report original components, not output
3413 return output;
3414 }
3415 }
3416
stbi__jpeg_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)3417 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3418 {
3419 stbi__jpeg j;
3420 j.s = s;
3421 stbi__setup_jpeg(&j);
3422 return load_jpeg_image(&j, x,y,comp,req_comp);
3423 }
3424
stbi__jpeg_test(stbi__context * s)3425 static int stbi__jpeg_test(stbi__context *s)
3426 {
3427 int r;
3428 stbi__jpeg j;
3429 j.s = s;
3430 stbi__setup_jpeg(&j);
3431 r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3432 stbi__rewind(s);
3433 return r;
3434 }
3435
stbi__jpeg_info_raw(stbi__jpeg * j,int * x,int * y,int * comp)3436 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3437 {
3438 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3439 stbi__rewind( j->s );
3440 return 0;
3441 }
3442 if (x) *x = j->s->img_x;
3443 if (y) *y = j->s->img_y;
3444 if (comp) *comp = j->s->img_n;
3445 return 1;
3446 }
3447
stbi__jpeg_info(stbi__context * s,int * x,int * y,int * comp)3448 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3449 {
3450 stbi__jpeg j;
3451 j.s = s;
3452 return stbi__jpeg_info_raw(&j, x, y, comp);
3453 }
3454 #endif
3455
3456 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3457 // simple implementation
3458 // - all input must be provided in an upfront buffer
3459 // - all output is written to a single output buffer (can malloc/realloc)
3460 // performance
3461 // - fast huffman
3462
3463 #ifndef STBI_NO_ZLIB
3464
3465 // fast-way is faster to check than jpeg huffman, but slow way is slower
3466 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3467 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3468
3469 // zlib-style huffman encoding
3470 // (jpegs packs from left, zlib from right, so can't share code)
3471 typedef struct
3472 {
3473 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3474 stbi__uint16 firstcode[16];
3475 int maxcode[17];
3476 stbi__uint16 firstsymbol[16];
3477 stbi_uc size[288];
3478 stbi__uint16 value[288];
3479 } stbi__zhuffman;
3480
stbi__bitreverse16(int n)3481 stbi_inline static int stbi__bitreverse16(int n)
3482 {
3483 n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3484 n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3485 n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3486 n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3487 return n;
3488 }
3489
stbi__bit_reverse(int v,int bits)3490 stbi_inline static int stbi__bit_reverse(int v, int bits)
3491 {
3492 STBI_ASSERT(bits <= 16);
3493 // to bit reverse n bits, reverse 16 and shift
3494 // e.g. 11 bits, bit reverse and shift away 5
3495 return stbi__bitreverse16(v) >> (16-bits);
3496 }
3497
stbi__zbuild_huffman(stbi__zhuffman * z,stbi_uc * sizelist,int num)3498 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3499 {
3500 int i,k=0;
3501 int code, next_code[16], sizes[17];
3502
3503 // DEFLATE spec for generating codes
3504 memset(sizes, 0, sizeof(sizes));
3505 memset(z->fast, 0, sizeof(z->fast));
3506 for (i=0; i < num; ++i)
3507 ++sizes[sizelist[i]];
3508 sizes[0] = 0;
3509 for (i=1; i < 16; ++i)
3510 if (sizes[i] > (1 << i))
3511 return stbi__err("bad sizes", "Corrupt PNG");
3512 code = 0;
3513 for (i=1; i < 16; ++i) {
3514 next_code[i] = code;
3515 z->firstcode[i] = (stbi__uint16) code;
3516 z->firstsymbol[i] = (stbi__uint16) k;
3517 code = (code + sizes[i]);
3518 if (sizes[i])
3519 if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3520 z->maxcode[i] = code << (16-i); // preshift for inner loop
3521 code <<= 1;
3522 k += sizes[i];
3523 }
3524 z->maxcode[16] = 0x10000; // sentinel
3525 for (i=0; i < num; ++i) {
3526 int s = sizelist[i];
3527 if (s) {
3528 int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3529 stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3530 z->size [c] = (stbi_uc ) s;
3531 z->value[c] = (stbi__uint16) i;
3532 if (s <= STBI__ZFAST_BITS) {
3533 int k = stbi__bit_reverse(next_code[s],s);
3534 while (k < (1 << STBI__ZFAST_BITS)) {
3535 z->fast[k] = fastv;
3536 k += (1 << s);
3537 }
3538 }
3539 ++next_code[s];
3540 }
3541 }
3542 return 1;
3543 }
3544
3545 // zlib-from-memory implementation for PNG reading
3546 // because PNG allows splitting the zlib stream arbitrarily,
3547 // and it's annoying structurally to have PNG call ZLIB call PNG,
3548 // we require PNG read all the IDATs and combine them into a single
3549 // memory buffer
3550
3551 typedef struct
3552 {
3553 stbi_uc *zbuffer, *zbuffer_end;
3554 int num_bits;
3555 stbi__uint32 code_buffer;
3556
3557 char *zout;
3558 char *zout_start;
3559 char *zout_end;
3560 int z_expandable;
3561
3562 stbi__zhuffman z_length, z_distance;
3563 } stbi__zbuf;
3564
stbi__zget8(stbi__zbuf * z)3565 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3566 {
3567 if (z->zbuffer >= z->zbuffer_end) return 0;
3568 return *z->zbuffer++;
3569 }
3570
stbi__fill_bits(stbi__zbuf * z)3571 static void stbi__fill_bits(stbi__zbuf *z)
3572 {
3573 do {
3574 STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3575 z->code_buffer |= stbi__zget8(z) << z->num_bits;
3576 z->num_bits += 8;
3577 } while (z->num_bits <= 24);
3578 }
3579
stbi__zreceive(stbi__zbuf * z,int n)3580 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3581 {
3582 unsigned int k;
3583 if (z->num_bits < n) stbi__fill_bits(z);
3584 k = z->code_buffer & ((1 << n) - 1);
3585 z->code_buffer >>= n;
3586 z->num_bits -= n;
3587 return k;
3588 }
3589
stbi__zhuffman_decode_slowpath(stbi__zbuf * a,stbi__zhuffman * z)3590 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3591 {
3592 int b,s,k;
3593 // not resolved by fast table, so compute it the slow way
3594 // use jpeg approach, which requires MSbits at top
3595 k = stbi__bit_reverse(a->code_buffer, 16);
3596 for (s=STBI__ZFAST_BITS+1; ; ++s)
3597 if (k < z->maxcode[s])
3598 break;
3599 if (s == 16) return -1; // invalid code!
3600 // code size is s, so:
3601 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3602 STBI_ASSERT(z->size[b] == s);
3603 a->code_buffer >>= s;
3604 a->num_bits -= s;
3605 return z->value[b];
3606 }
3607
stbi__zhuffman_decode(stbi__zbuf * a,stbi__zhuffman * z)3608 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3609 {
3610 int b,s;
3611 if (a->num_bits < 16) stbi__fill_bits(a);
3612 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3613 if (b) {
3614 s = b >> 9;
3615 a->code_buffer >>= s;
3616 a->num_bits -= s;
3617 return b & 511;
3618 }
3619 return stbi__zhuffman_decode_slowpath(a, z);
3620 }
3621
stbi__zexpand(stbi__zbuf * z,char * zout,int n)3622 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3623 {
3624 char *q;
3625 int cur, limit;
3626 z->zout = zout;
3627 if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3628 cur = (int) (z->zout - z->zout_start);
3629 limit = (int) (z->zout_end - z->zout_start);
3630 while (cur + n > limit)
3631 limit *= 2;
3632 q = (char *) STBI_REALLOC(z->zout_start, limit);
3633 if (q == NULL) return stbi__err("outofmem", "Out of memory");
3634 z->zout_start = q;
3635 z->zout = q + cur;
3636 z->zout_end = q + limit;
3637 return 1;
3638 }
3639
3640 static int stbi__zlength_base[31] = {
3641 3,4,5,6,7,8,9,10,11,13,
3642 15,17,19,23,27,31,35,43,51,59,
3643 67,83,99,115,131,163,195,227,258,0,0 };
3644
3645 static int stbi__zlength_extra[31]=
3646 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3647
3648 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3649 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3650
3651 static int stbi__zdist_extra[32] =
3652 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3653
stbi__parse_huffman_block(stbi__zbuf * a)3654 static int stbi__parse_huffman_block(stbi__zbuf *a)
3655 {
3656 char *zout = a->zout;
3657 for(;;) {
3658 int z = stbi__zhuffman_decode(a, &a->z_length);
3659 if (z < 256) {
3660 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3661 if (zout >= a->zout_end) {
3662 if (!stbi__zexpand(a, zout, 1)) return 0;
3663 zout = a->zout;
3664 }
3665 *zout++ = (char) z;
3666 } else {
3667 stbi_uc *p;
3668 int len,dist;
3669 if (z == 256) {
3670 a->zout = zout;
3671 return 1;
3672 }
3673 z -= 257;
3674 len = stbi__zlength_base[z];
3675 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3676 z = stbi__zhuffman_decode(a, &a->z_distance);
3677 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3678 dist = stbi__zdist_base[z];
3679 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3680 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3681 if (zout + len > a->zout_end) {
3682 if (!stbi__zexpand(a, zout, len)) return 0;
3683 zout = a->zout;
3684 }
3685 p = (stbi_uc *) (zout - dist);
3686 if (dist == 1) { // run of one byte; common in images.
3687 stbi_uc v = *p;
3688 if (len) { do *zout++ = v; while (--len); }
3689 } else {
3690 if (len) { do *zout++ = *p++; while (--len); }
3691 }
3692 }
3693 }
3694 }
3695
stbi__compute_huffman_codes(stbi__zbuf * a)3696 static int stbi__compute_huffman_codes(stbi__zbuf *a)
3697 {
3698 static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3699 stbi__zhuffman z_codelength;
3700 stbi_uc lencodes[286+32+137];//padding for maximum single op
3701 stbi_uc codelength_sizes[19];
3702 int i,n;
3703
3704 int hlit = stbi__zreceive(a,5) + 257;
3705 int hdist = stbi__zreceive(a,5) + 1;
3706 int hclen = stbi__zreceive(a,4) + 4;
3707
3708 memset(codelength_sizes, 0, sizeof(codelength_sizes));
3709 for (i=0; i < hclen; ++i) {
3710 int s = stbi__zreceive(a,3);
3711 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3712 }
3713 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3714
3715 n = 0;
3716 while (n < hlit + hdist) {
3717 int c = stbi__zhuffman_decode(a, &z_codelength);
3718 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3719 if (c < 16)
3720 lencodes[n++] = (stbi_uc) c;
3721 else if (c == 16) {
3722 c = stbi__zreceive(a,2)+3;
3723 memset(lencodes+n, lencodes[n-1], c);
3724 n += c;
3725 } else if (c == 17) {
3726 c = stbi__zreceive(a,3)+3;
3727 memset(lencodes+n, 0, c);
3728 n += c;
3729 } else {
3730 STBI_ASSERT(c == 18);
3731 c = stbi__zreceive(a,7)+11;
3732 memset(lencodes+n, 0, c);
3733 n += c;
3734 }
3735 }
3736 if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3737 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3738 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3739 return 1;
3740 }
3741
stbi__parse_uncomperssed_block(stbi__zbuf * a)3742 static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
3743 {
3744 stbi_uc header[4];
3745 int len,nlen,k;
3746 if (a->num_bits & 7)
3747 stbi__zreceive(a, a->num_bits & 7); // discard
3748 // drain the bit-packed data into header
3749 k = 0;
3750 while (a->num_bits > 0) {
3751 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3752 a->code_buffer >>= 8;
3753 a->num_bits -= 8;
3754 }
3755 STBI_ASSERT(a->num_bits == 0);
3756 // now fill header the normal way
3757 while (k < 4)
3758 header[k++] = stbi__zget8(a);
3759 len = header[1] * 256 + header[0];
3760 nlen = header[3] * 256 + header[2];
3761 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3762 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3763 if (a->zout + len > a->zout_end)
3764 if (!stbi__zexpand(a, a->zout, len)) return 0;
3765 memcpy(a->zout, a->zbuffer, len);
3766 a->zbuffer += len;
3767 a->zout += len;
3768 return 1;
3769 }
3770
stbi__parse_zlib_header(stbi__zbuf * a)3771 static int stbi__parse_zlib_header(stbi__zbuf *a)
3772 {
3773 int cmf = stbi__zget8(a);
3774 int cm = cmf & 15;
3775 /* int cinfo = cmf >> 4; */
3776 int flg = stbi__zget8(a);
3777 if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3778 if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3779 if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3780 // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3781 return 1;
3782 }
3783
3784 // @TODO: should statically initialize these for optimal thread safety
3785 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
stbi__init_zdefaults(void)3786 static void stbi__init_zdefaults(void)
3787 {
3788 int i; // use <= to match clearly with spec
3789 for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3790 for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3791 for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3792 for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3793
3794 for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3795 }
3796
stbi__parse_zlib(stbi__zbuf * a,int parse_header)3797 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3798 {
3799 int final, type;
3800 if (parse_header)
3801 if (!stbi__parse_zlib_header(a)) return 0;
3802 a->num_bits = 0;
3803 a->code_buffer = 0;
3804 do {
3805 final = stbi__zreceive(a,1);
3806 type = stbi__zreceive(a,2);
3807 if (type == 0) {
3808 if (!stbi__parse_uncomperssed_block(a)) return 0;
3809 } else if (type == 3) {
3810 return 0;
3811 } else {
3812 if (type == 1) {
3813 // use fixed code lengths
3814 if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3815 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3816 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3817 } else {
3818 if (!stbi__compute_huffman_codes(a)) return 0;
3819 }
3820 if (!stbi__parse_huffman_block(a)) return 0;
3821 }
3822 } while (!final);
3823 return 1;
3824 }
3825
stbi__do_zlib(stbi__zbuf * a,char * obuf,int olen,int exp,int parse_header)3826 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3827 {
3828 a->zout_start = obuf;
3829 a->zout = obuf;
3830 a->zout_end = obuf + olen;
3831 a->z_expandable = exp;
3832
3833 return stbi__parse_zlib(a, parse_header);
3834 }
3835
stbi_zlib_decode_malloc_guesssize(const char * buffer,int len,int initial_size,int * outlen)3836 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3837 {
3838 stbi__zbuf a;
3839 char *p = (char *) stbi__malloc(initial_size);
3840 if (p == NULL) return NULL;
3841 a.zbuffer = (stbi_uc *) buffer;
3842 a.zbuffer_end = (stbi_uc *) buffer + len;
3843 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3844 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3845 return a.zout_start;
3846 } else {
3847 STBI_FREE(a.zout_start);
3848 return NULL;
3849 }
3850 }
3851
stbi_zlib_decode_malloc(char const * buffer,int len,int * outlen)3852 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3853 {
3854 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3855 }
3856
stbi_zlib_decode_malloc_guesssize_headerflag(const char * buffer,int len,int initial_size,int * outlen,int parse_header)3857 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3858 {
3859 stbi__zbuf a;
3860 char *p = (char *) stbi__malloc(initial_size);
3861 if (p == NULL) return NULL;
3862 a.zbuffer = (stbi_uc *) buffer;
3863 a.zbuffer_end = (stbi_uc *) buffer + len;
3864 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3865 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3866 return a.zout_start;
3867 } else {
3868 STBI_FREE(a.zout_start);
3869 return NULL;
3870 }
3871 }
3872
stbi_zlib_decode_buffer(char * obuffer,int olen,char const * ibuffer,int ilen)3873 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3874 {
3875 stbi__zbuf a;
3876 a.zbuffer = (stbi_uc *) ibuffer;
3877 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3878 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3879 return (int) (a.zout - a.zout_start);
3880 else
3881 return -1;
3882 }
3883
stbi_zlib_decode_noheader_malloc(char const * buffer,int len,int * outlen)3884 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3885 {
3886 stbi__zbuf a;
3887 char *p = (char *) stbi__malloc(16384);
3888 if (p == NULL) return NULL;
3889 a.zbuffer = (stbi_uc *) buffer;
3890 a.zbuffer_end = (stbi_uc *) buffer+len;
3891 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3892 if (outlen) *outlen = (int) (a.zout - a.zout_start);
3893 return a.zout_start;
3894 } else {
3895 STBI_FREE(a.zout_start);
3896 return NULL;
3897 }
3898 }
3899
stbi_zlib_decode_noheader_buffer(char * obuffer,int olen,const char * ibuffer,int ilen)3900 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3901 {
3902 stbi__zbuf a;
3903 a.zbuffer = (stbi_uc *) ibuffer;
3904 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3905 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3906 return (int) (a.zout - a.zout_start);
3907 else
3908 return -1;
3909 }
3910 #endif
3911
3912 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3913 // simple implementation
3914 // - only 8-bit samples
3915 // - no CRC checking
3916 // - allocates lots of intermediate memory
3917 // - avoids problem of streaming data between subsystems
3918 // - avoids explicit window management
3919 // performance
3920 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3921
3922 #ifndef STBI_NO_PNG
3923 typedef struct
3924 {
3925 stbi__uint32 length;
3926 stbi__uint32 type;
3927 } stbi__pngchunk;
3928
stbi__get_chunk_header(stbi__context * s)3929 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3930 {
3931 stbi__pngchunk c;
3932 c.length = stbi__get32be(s);
3933 c.type = stbi__get32be(s);
3934 return c;
3935 }
3936
stbi__check_png_header(stbi__context * s)3937 static int stbi__check_png_header(stbi__context *s)
3938 {
3939 static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3940 int i;
3941 for (i=0; i < 8; ++i)
3942 if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3943 return 1;
3944 }
3945
3946 typedef struct
3947 {
3948 stbi__context *s;
3949 stbi_uc *idata, *expanded, *out;
3950 } stbi__png;
3951
3952
3953 enum {
3954 STBI__F_none=0,
3955 STBI__F_sub=1,
3956 STBI__F_up=2,
3957 STBI__F_avg=3,
3958 STBI__F_paeth=4,
3959 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3960 STBI__F_avg_first,
3961 STBI__F_paeth_first
3962 };
3963
3964 static stbi_uc first_row_filter[5] =
3965 {
3966 STBI__F_none,
3967 STBI__F_sub,
3968 STBI__F_none,
3969 STBI__F_avg_first,
3970 STBI__F_paeth_first
3971 };
3972
stbi__paeth(int a,int b,int c)3973 static int stbi__paeth(int a, int b, int c)
3974 {
3975 int p = a + b - c;
3976 int pa = abs(p-a);
3977 int pb = abs(p-b);
3978 int pc = abs(p-c);
3979 if (pa <= pb && pa <= pc) return a;
3980 if (pb <= pc) return b;
3981 return c;
3982 }
3983
3984 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
3985
3986 // create the png data from post-deflated data
stbi__create_png_image_raw(stbi__png * a,stbi_uc * raw,stbi__uint32 raw_len,int out_n,stbi__uint32 x,stbi__uint32 y,int depth,int color)3987 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
3988 {
3989 stbi__context *s = a->s;
3990 stbi__uint32 i,j,stride = x*out_n;
3991 stbi__uint32 img_len, img_width_bytes;
3992 int k;
3993 int img_n = s->img_n; // copy it into a local for later
3994
3995 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
3996 a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
3997 if (!a->out) return stbi__err("outofmem", "Out of memory");
3998
3999 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4000 img_len = (img_width_bytes + 1) * y;
4001 if (s->img_x == x && s->img_y == y) {
4002 if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
4003 } else { // interlaced:
4004 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4005 }
4006
4007 for (j=0; j < y; ++j) {
4008 stbi_uc *cur = a->out + stride*j;
4009 stbi_uc *prior = cur - stride;
4010 int filter = *raw++;
4011 int filter_bytes = img_n;
4012 int width = x;
4013 if (filter > 4)
4014 return stbi__err("invalid filter","Corrupt PNG");
4015
4016 if (depth < 8) {
4017 STBI_ASSERT(img_width_bytes <= x);
4018 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4019 filter_bytes = 1;
4020 width = img_width_bytes;
4021 }
4022
4023 // if first row, use special filter that doesn't sample previous row
4024 if (j == 0) filter = first_row_filter[filter];
4025
4026 // handle first byte explicitly
4027 for (k=0; k < filter_bytes; ++k) {
4028 switch (filter) {
4029 case STBI__F_none : cur[k] = raw[k]; break;
4030 case STBI__F_sub : cur[k] = raw[k]; break;
4031 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4032 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4033 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4034 case STBI__F_avg_first : cur[k] = raw[k]; break;
4035 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4036 }
4037 }
4038
4039 if (depth == 8) {
4040 if (img_n != out_n)
4041 cur[img_n] = 255; // first pixel
4042 raw += img_n;
4043 cur += out_n;
4044 prior += out_n;
4045 } else {
4046 raw += 1;
4047 cur += 1;
4048 prior += 1;
4049 }
4050
4051 // this is a little gross, so that we don't switch per-pixel or per-component
4052 if (depth < 8 || img_n == out_n) {
4053 int nk = (width - 1)*img_n;
4054 #define CASE(f) \
4055 case f: \
4056 for (k=0; k < nk; ++k)
4057 switch (filter) {
4058 // "none" filter turns into a memcpy here; make that explicit.
4059 case STBI__F_none: memcpy(cur, raw, nk); break;
4060 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4061 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4062 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4063 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4064 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4065 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4066 }
4067 #undef CASE
4068 raw += nk;
4069 } else {
4070 STBI_ASSERT(img_n+1 == out_n);
4071 #define CASE(f) \
4072 case f: \
4073 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
4074 for (k=0; k < img_n; ++k)
4075 switch (filter) {
4076 CASE(STBI__F_none) cur[k] = raw[k]; break;
4077 CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
4078 CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4079 CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
4080 CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
4081 CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
4082 CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
4083 }
4084 #undef CASE
4085 }
4086 }
4087
4088 // we make a separate pass to expand bits to pixels; for performance,
4089 // this could run two scanlines behind the above code, so it won't
4090 // intefere with filtering but will still be in the cache.
4091 if (depth < 8) {
4092 for (j=0; j < y; ++j) {
4093 stbi_uc *cur = a->out + stride*j;
4094 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4095 // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4096 // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4097 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4098
4099 // note that the final byte might overshoot and write more data than desired.
4100 // we can allocate enough data that this never writes out of memory, but it
4101 // could also overwrite the next scanline. can it overwrite non-empty data
4102 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4103 // so we need to explicitly clamp the final ones
4104
4105 if (depth == 4) {
4106 for (k=x*img_n; k >= 2; k-=2, ++in) {
4107 *cur++ = scale * ((*in >> 4) );
4108 *cur++ = scale * ((*in ) & 0x0f);
4109 }
4110 if (k > 0) *cur++ = scale * ((*in >> 4) );
4111 } else if (depth == 2) {
4112 for (k=x*img_n; k >= 4; k-=4, ++in) {
4113 *cur++ = scale * ((*in >> 6) );
4114 *cur++ = scale * ((*in >> 4) & 0x03);
4115 *cur++ = scale * ((*in >> 2) & 0x03);
4116 *cur++ = scale * ((*in ) & 0x03);
4117 }
4118 if (k > 0) *cur++ = scale * ((*in >> 6) );
4119 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4120 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4121 } else if (depth == 1) {
4122 for (k=x*img_n; k >= 8; k-=8, ++in) {
4123 *cur++ = scale * ((*in >> 7) );
4124 *cur++ = scale * ((*in >> 6) & 0x01);
4125 *cur++ = scale * ((*in >> 5) & 0x01);
4126 *cur++ = scale * ((*in >> 4) & 0x01);
4127 *cur++ = scale * ((*in >> 3) & 0x01);
4128 *cur++ = scale * ((*in >> 2) & 0x01);
4129 *cur++ = scale * ((*in >> 1) & 0x01);
4130 *cur++ = scale * ((*in ) & 0x01);
4131 }
4132 if (k > 0) *cur++ = scale * ((*in >> 7) );
4133 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4134 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4135 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4136 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4137 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4138 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4139 }
4140 if (img_n != out_n) {
4141 // insert alpha = 255
4142 stbi_uc *cur = a->out + stride*j;
4143 int i;
4144 if (img_n == 1) {
4145 for (i=x-1; i >= 0; --i) {
4146 cur[i*2+1] = 255;
4147 cur[i*2+0] = cur[i];
4148 }
4149 } else {
4150 STBI_ASSERT(img_n == 3);
4151 for (i=x-1; i >= 0; --i) {
4152 cur[i*4+3] = 255;
4153 cur[i*4+2] = cur[i*3+2];
4154 cur[i*4+1] = cur[i*3+1];
4155 cur[i*4+0] = cur[i*3+0];
4156 }
4157 }
4158 }
4159 }
4160 }
4161
4162 return 1;
4163 }
4164
stbi__create_png_image(stbi__png * a,stbi_uc * image_data,stbi__uint32 image_data_len,int out_n,int depth,int color,int interlaced)4165 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4166 {
4167 stbi_uc *final;
4168 int p;
4169 if (!interlaced)
4170 return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4171
4172 // de-interlacing
4173 final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
4174 for (p=0; p < 7; ++p) {
4175 int xorig[] = { 0,4,0,2,0,1,0 };
4176 int yorig[] = { 0,0,4,0,2,0,1 };
4177 int xspc[] = { 8,8,4,4,2,2,1 };
4178 int yspc[] = { 8,8,8,4,4,2,2 };
4179 int i,j,x,y;
4180 // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4181 x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4182 y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4183 if (x && y) {
4184 stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4185 if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4186 STBI_FREE(final);
4187 return 0;
4188 }
4189 for (j=0; j < y; ++j) {
4190 for (i=0; i < x; ++i) {
4191 int out_y = j*yspc[p]+yorig[p];
4192 int out_x = i*xspc[p]+xorig[p];
4193 memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
4194 a->out + (j*x+i)*out_n, out_n);
4195 }
4196 }
4197 STBI_FREE(a->out);
4198 image_data += img_len;
4199 image_data_len -= img_len;
4200 }
4201 }
4202 a->out = final;
4203
4204 return 1;
4205 }
4206
stbi__compute_transparency(stbi__png * z,stbi_uc tc[3],int out_n)4207 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4208 {
4209 stbi__context *s = z->s;
4210 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4211 stbi_uc *p = z->out;
4212
4213 // compute color-based transparency, assuming we've
4214 // already got 255 as the alpha value in the output
4215 STBI_ASSERT(out_n == 2 || out_n == 4);
4216
4217 if (out_n == 2) {
4218 for (i=0; i < pixel_count; ++i) {
4219 p[1] = (p[0] == tc[0] ? 0 : 255);
4220 p += 2;
4221 }
4222 } else {
4223 for (i=0; i < pixel_count; ++i) {
4224 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4225 p[3] = 0;
4226 p += 4;
4227 }
4228 }
4229 return 1;
4230 }
4231
stbi__expand_png_palette(stbi__png * a,stbi_uc * palette,int len,int pal_img_n)4232 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4233 {
4234 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4235 stbi_uc *p, *temp_out, *orig = a->out;
4236
4237 p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4238 if (p == NULL) return stbi__err("outofmem", "Out of memory");
4239
4240 // between here and free(out) below, exitting would leak
4241 temp_out = p;
4242
4243 if (pal_img_n == 3) {
4244 for (i=0; i < pixel_count; ++i) {
4245 int n = orig[i]*4;
4246 p[0] = palette[n ];
4247 p[1] = palette[n+1];
4248 p[2] = palette[n+2];
4249 p += 3;
4250 }
4251 } else {
4252 for (i=0; i < pixel_count; ++i) {
4253 int n = orig[i]*4;
4254 p[0] = palette[n ];
4255 p[1] = palette[n+1];
4256 p[2] = palette[n+2];
4257 p[3] = palette[n+3];
4258 p += 4;
4259 }
4260 }
4261 STBI_FREE(a->out);
4262 a->out = temp_out;
4263
4264 STBI_NOTUSED(len);
4265
4266 return 1;
4267 }
4268
4269 static int stbi__unpremultiply_on_load = 0;
4270 static int stbi__de_iphone_flag = 0;
4271
stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)4272 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4273 {
4274 stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4275 }
4276
stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)4277 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4278 {
4279 stbi__de_iphone_flag = flag_true_if_should_convert;
4280 }
4281
stbi__de_iphone(stbi__png * z)4282 static void stbi__de_iphone(stbi__png *z)
4283 {
4284 stbi__context *s = z->s;
4285 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4286 stbi_uc *p = z->out;
4287
4288 if (s->img_out_n == 3) { // convert bgr to rgb
4289 for (i=0; i < pixel_count; ++i) {
4290 stbi_uc t = p[0];
4291 p[0] = p[2];
4292 p[2] = t;
4293 p += 3;
4294 }
4295 } else {
4296 STBI_ASSERT(s->img_out_n == 4);
4297 if (stbi__unpremultiply_on_load) {
4298 // convert bgr to rgb and unpremultiply
4299 for (i=0; i < pixel_count; ++i) {
4300 stbi_uc a = p[3];
4301 stbi_uc t = p[0];
4302 if (a) {
4303 p[0] = p[2] * 255 / a;
4304 p[1] = p[1] * 255 / a;
4305 p[2] = t * 255 / a;
4306 } else {
4307 p[0] = p[2];
4308 p[2] = t;
4309 }
4310 p += 4;
4311 }
4312 } else {
4313 // convert bgr to rgb
4314 for (i=0; i < pixel_count; ++i) {
4315 stbi_uc t = p[0];
4316 p[0] = p[2];
4317 p[2] = t;
4318 p += 4;
4319 }
4320 }
4321 }
4322 }
4323
4324 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4325
stbi__parse_png_file(stbi__png * z,int scan,int req_comp)4326 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4327 {
4328 stbi_uc palette[1024], pal_img_n=0;
4329 stbi_uc has_trans=0, tc[3];
4330 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4331 int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
4332 stbi__context *s = z->s;
4333
4334 z->expanded = NULL;
4335 z->idata = NULL;
4336 z->out = NULL;
4337
4338 if (!stbi__check_png_header(s)) return 0;
4339
4340 if (scan == STBI__SCAN_type) return 1;
4341
4342 for (;;) {
4343 stbi__pngchunk c = stbi__get_chunk_header(s);
4344 switch (c.type) {
4345 case STBI__PNG_TYPE('C','g','B','I'):
4346 is_iphone = 1;
4347 stbi__skip(s, c.length);
4348 break;
4349 case STBI__PNG_TYPE('I','H','D','R'): {
4350 int comp,filter;
4351 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4352 first = 0;
4353 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4354 s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4355 s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4356 depth = stbi__get8(s); if (depth != 1 && depth != 2 && depth != 4 && depth != 8) return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
4357 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4358 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4359 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4360 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4361 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4362 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4363 if (!pal_img_n) {
4364 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4365 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4366 if (scan == STBI__SCAN_header) return 1;
4367 } else {
4368 // if paletted, then pal_n is our final components, and
4369 // img_n is # components to decompress/filter.
4370 s->img_n = 1;
4371 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4372 // if SCAN_header, have to scan to see if we have a tRNS
4373 }
4374 break;
4375 }
4376
4377 case STBI__PNG_TYPE('P','L','T','E'): {
4378 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4379 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4380 pal_len = c.length / 3;
4381 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4382 for (i=0; i < pal_len; ++i) {
4383 palette[i*4+0] = stbi__get8(s);
4384 palette[i*4+1] = stbi__get8(s);
4385 palette[i*4+2] = stbi__get8(s);
4386 palette[i*4+3] = 255;
4387 }
4388 break;
4389 }
4390
4391 case STBI__PNG_TYPE('t','R','N','S'): {
4392 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4393 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4394 if (pal_img_n) {
4395 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4396 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4397 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4398 pal_img_n = 4;
4399 for (i=0; i < c.length; ++i)
4400 palette[i*4+3] = stbi__get8(s);
4401 } else {
4402 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4403 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4404 has_trans = 1;
4405 for (k=0; k < s->img_n; ++k)
4406 tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
4407 }
4408 break;
4409 }
4410
4411 case STBI__PNG_TYPE('I','D','A','T'): {
4412 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4413 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4414 if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4415 if ((int)(ioff + c.length) < (int)ioff) return 0;
4416 if (ioff + c.length > idata_limit) {
4417 stbi_uc *p;
4418 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4419 while (ioff + c.length > idata_limit)
4420 idata_limit *= 2;
4421 p = (stbi_uc *) STBI_REALLOC(z->idata, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4422 z->idata = p;
4423 }
4424 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4425 ioff += c.length;
4426 break;
4427 }
4428
4429 case STBI__PNG_TYPE('I','E','N','D'): {
4430 stbi__uint32 raw_len, bpl;
4431 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4432 if (scan != STBI__SCAN_load) return 1;
4433 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4434 // initial guess for decoded data size to avoid unnecessary reallocs
4435 bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
4436 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4437 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4438 if (z->expanded == NULL) return 0; // zlib should set error
4439 STBI_FREE(z->idata); z->idata = NULL;
4440 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4441 s->img_out_n = s->img_n+1;
4442 else
4443 s->img_out_n = s->img_n;
4444 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
4445 if (has_trans)
4446 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4447 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4448 stbi__de_iphone(z);
4449 if (pal_img_n) {
4450 // pal_img_n == 3 or 4
4451 s->img_n = pal_img_n; // record the actual colors we had
4452 s->img_out_n = pal_img_n;
4453 if (req_comp >= 3) s->img_out_n = req_comp;
4454 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4455 return 0;
4456 }
4457 STBI_FREE(z->expanded); z->expanded = NULL;
4458 return 1;
4459 }
4460
4461 default:
4462 // if critical, fail
4463 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4464 if ((c.type & (1 << 29)) == 0) {
4465 #ifndef STBI_NO_FAILURE_STRINGS
4466 // not threadsafe
4467 static char invalid_chunk[] = "XXXX PNG chunk not known";
4468 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4469 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4470 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4471 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4472 #endif
4473 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4474 }
4475 stbi__skip(s, c.length);
4476 break;
4477 }
4478 // end of PNG chunk, read and skip CRC
4479 stbi__get32be(s);
4480 }
4481 }
4482
stbi__do_png(stbi__png * p,int * x,int * y,int * n,int req_comp)4483 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4484 {
4485 unsigned char *result=NULL;
4486 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4487 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4488 result = p->out;
4489 p->out = NULL;
4490 if (req_comp && req_comp != p->s->img_out_n) {
4491 result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4492 p->s->img_out_n = req_comp;
4493 if (result == NULL) return result;
4494 }
4495 *x = p->s->img_x;
4496 *y = p->s->img_y;
4497 if (n) *n = p->s->img_out_n;
4498 }
4499 STBI_FREE(p->out); p->out = NULL;
4500 STBI_FREE(p->expanded); p->expanded = NULL;
4501 STBI_FREE(p->idata); p->idata = NULL;
4502
4503 return result;
4504 }
4505
stbi__png_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4506 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4507 {
4508 stbi__png p;
4509 p.s = s;
4510 return stbi__do_png(&p, x,y,comp,req_comp);
4511 }
4512
stbi__png_test(stbi__context * s)4513 static int stbi__png_test(stbi__context *s)
4514 {
4515 int r;
4516 r = stbi__check_png_header(s);
4517 stbi__rewind(s);
4518 return r;
4519 }
4520
stbi__png_info_raw(stbi__png * p,int * x,int * y,int * comp)4521 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4522 {
4523 if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4524 stbi__rewind( p->s );
4525 return 0;
4526 }
4527 if (x) *x = p->s->img_x;
4528 if (y) *y = p->s->img_y;
4529 if (comp) *comp = p->s->img_n;
4530 return 1;
4531 }
4532
stbi__png_info(stbi__context * s,int * x,int * y,int * comp)4533 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4534 {
4535 stbi__png p;
4536 p.s = s;
4537 return stbi__png_info_raw(&p, x, y, comp);
4538 }
4539 #endif
4540
4541 // Microsoft/Windows BMP image
4542
4543 #ifndef STBI_NO_BMP
stbi__bmp_test_raw(stbi__context * s)4544 static int stbi__bmp_test_raw(stbi__context *s)
4545 {
4546 int r;
4547 int sz;
4548 if (stbi__get8(s) != 'B') return 0;
4549 if (stbi__get8(s) != 'M') return 0;
4550 stbi__get32le(s); // discard filesize
4551 stbi__get16le(s); // discard reserved
4552 stbi__get16le(s); // discard reserved
4553 stbi__get32le(s); // discard data offset
4554 sz = stbi__get32le(s);
4555 r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4556 return r;
4557 }
4558
stbi__bmp_test(stbi__context * s)4559 static int stbi__bmp_test(stbi__context *s)
4560 {
4561 int r = stbi__bmp_test_raw(s);
4562 stbi__rewind(s);
4563 return r;
4564 }
4565
4566
4567 // returns 0..31 for the highest set bit
stbi__high_bit(unsigned int z)4568 static int stbi__high_bit(unsigned int z)
4569 {
4570 int n=0;
4571 if (z == 0) return -1;
4572 if (z >= 0x10000) n += 16, z >>= 16;
4573 if (z >= 0x00100) n += 8, z >>= 8;
4574 if (z >= 0x00010) n += 4, z >>= 4;
4575 if (z >= 0x00004) n += 2, z >>= 2;
4576 if (z >= 0x00002) n += 1, z >>= 1;
4577 return n;
4578 }
4579
stbi__bitcount(unsigned int a)4580 static int stbi__bitcount(unsigned int a)
4581 {
4582 a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
4583 a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
4584 a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4585 a = (a + (a >> 8)); // max 16 per 8 bits
4586 a = (a + (a >> 16)); // max 32 per 8 bits
4587 return a & 0xff;
4588 }
4589
stbi__shiftsigned(int v,int shift,int bits)4590 static int stbi__shiftsigned(int v, int shift, int bits)
4591 {
4592 int result;
4593 int z=0;
4594
4595 if (shift < 0) v <<= -shift;
4596 else v >>= shift;
4597 result = v;
4598
4599 z = bits;
4600 while (z < 8) {
4601 result += v >> z;
4602 z += bits;
4603 }
4604 return result;
4605 }
4606
stbi__bmp_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4607 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4608 {
4609 stbi_uc *out;
4610 unsigned int mr=0,mg=0,mb=0,ma=0;
4611 stbi_uc pal[256][4];
4612 int psize=0,i,j,compress=0,width;
4613 int bpp, flip_vertically, pad, target, offset, hsz;
4614 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4615 stbi__get32le(s); // discard filesize
4616 stbi__get16le(s); // discard reserved
4617 stbi__get16le(s); // discard reserved
4618 offset = stbi__get32le(s);
4619 hsz = stbi__get32le(s);
4620 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4621 if (hsz == 12) {
4622 s->img_x = stbi__get16le(s);
4623 s->img_y = stbi__get16le(s);
4624 } else {
4625 s->img_x = stbi__get32le(s);
4626 s->img_y = stbi__get32le(s);
4627 }
4628 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4629 bpp = stbi__get16le(s);
4630 if (bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4631 flip_vertically = ((int) s->img_y) > 0;
4632 s->img_y = abs((int) s->img_y);
4633 if (hsz == 12) {
4634 if (bpp < 24)
4635 psize = (offset - 14 - 24) / 3;
4636 } else {
4637 compress = stbi__get32le(s);
4638 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4639 stbi__get32le(s); // discard sizeof
4640 stbi__get32le(s); // discard hres
4641 stbi__get32le(s); // discard vres
4642 stbi__get32le(s); // discard colorsused
4643 stbi__get32le(s); // discard max important
4644 if (hsz == 40 || hsz == 56) {
4645 if (hsz == 56) {
4646 stbi__get32le(s);
4647 stbi__get32le(s);
4648 stbi__get32le(s);
4649 stbi__get32le(s);
4650 }
4651 if (bpp == 16 || bpp == 32)
4652 {
4653 mr = mg = mb = 0;
4654 if (compress == 0)
4655 {
4656 if (bpp == 32)
4657 {
4658 mr = 0xffu << 16;
4659 mg = 0xffu << 8;
4660 mb = 0xffu << 0;
4661 ma = 0xffu << 24;
4662 }
4663 else
4664 {
4665 mr = 31u << 10;
4666 mg = 31u << 5;
4667 mb = 31u << 0;
4668 }
4669 } else if (compress == 3) {
4670 mr = stbi__get32le(s);
4671 mg = stbi__get32le(s);
4672 mb = stbi__get32le(s);
4673 // not documented, but generated by photoshop and handled by mspaint
4674 if (mr == mg && mg == mb) {
4675 // ?!?!?
4676 return stbi__errpuc("bad BMP", "bad BMP");
4677 }
4678 } else
4679 return stbi__errpuc("bad BMP", "bad BMP");
4680 }
4681 } else {
4682 STBI_ASSERT(hsz == 108 || hsz == 124);
4683 mr = stbi__get32le(s);
4684 mg = stbi__get32le(s);
4685 mb = stbi__get32le(s);
4686 ma = stbi__get32le(s);
4687 stbi__get32le(s); // discard color space
4688 for (i=0; i < 12; ++i)
4689 stbi__get32le(s); // discard color space parameters
4690 if (hsz == 124) {
4691 stbi__get32le(s); // discard rendering intent
4692 stbi__get32le(s); // discard offset of profile data
4693 stbi__get32le(s); // discard size of profile data
4694 stbi__get32le(s); // discard reserved
4695 }
4696 }
4697 if (bpp < 16)
4698 psize = (offset - 14 - hsz) >> 2;
4699 }
4700 s->img_n = ma ? 4 : 3;
4701 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4702 target = req_comp;
4703 else
4704 target = s->img_n; // if they want monochrome, we'll post-convert
4705 out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4706 if (!out) return stbi__errpuc("outofmem", "Out of memory");
4707 if (bpp < 16) {
4708 int z=0;
4709 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4710 for (i=0; i < psize; ++i) {
4711 pal[i][2] = stbi__get8(s);
4712 pal[i][1] = stbi__get8(s);
4713 pal[i][0] = stbi__get8(s);
4714 if (hsz != 12) stbi__get8(s);
4715 pal[i][3] = 255;
4716 }
4717 stbi__skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
4718 if (bpp == 4) width = (s->img_x + 1) >> 1;
4719 else if (bpp == 8) width = s->img_x;
4720 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4721 pad = (-width)&3;
4722 for (j=0; j < (int) s->img_y; ++j) {
4723 for (i=0; i < (int) s->img_x; i += 2) {
4724 int v=stbi__get8(s),v2=0;
4725 if (bpp == 4) {
4726 v2 = v & 15;
4727 v >>= 4;
4728 }
4729 out[z++] = pal[v][0];
4730 out[z++] = pal[v][1];
4731 out[z++] = pal[v][2];
4732 if (target == 4) out[z++] = 255;
4733 if (i+1 == (int) s->img_x) break;
4734 v = (bpp == 8) ? stbi__get8(s) : v2;
4735 out[z++] = pal[v][0];
4736 out[z++] = pal[v][1];
4737 out[z++] = pal[v][2];
4738 if (target == 4) out[z++] = 255;
4739 }
4740 stbi__skip(s, pad);
4741 }
4742 } else {
4743 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4744 int z = 0;
4745 int easy=0;
4746 stbi__skip(s, offset - 14 - hsz);
4747 if (bpp == 24) width = 3 * s->img_x;
4748 else if (bpp == 16) width = 2*s->img_x;
4749 else /* bpp = 32 and pad = 0 */ width=0;
4750 pad = (-width) & 3;
4751 if (bpp == 24) {
4752 easy = 1;
4753 } else if (bpp == 32) {
4754 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4755 easy = 2;
4756 }
4757 if (!easy) {
4758 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4759 // right shift amt to put high bit in position #7
4760 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4761 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4762 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4763 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4764 }
4765 for (j=0; j < (int) s->img_y; ++j) {
4766 if (easy) {
4767 for (i=0; i < (int) s->img_x; ++i) {
4768 unsigned char a;
4769 out[z+2] = stbi__get8(s);
4770 out[z+1] = stbi__get8(s);
4771 out[z+0] = stbi__get8(s);
4772 z += 3;
4773 a = (easy == 2 ? stbi__get8(s) : 255);
4774 if (target == 4) out[z++] = a;
4775 }
4776 } else {
4777 for (i=0; i < (int) s->img_x; ++i) {
4778 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4779 int a;
4780 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4781 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4782 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4783 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4784 if (target == 4) out[z++] = STBI__BYTECAST(a);
4785 }
4786 }
4787 stbi__skip(s, pad);
4788 }
4789 }
4790 if (flip_vertically) {
4791 stbi_uc t;
4792 for (j=0; j < (int) s->img_y>>1; ++j) {
4793 stbi_uc *p1 = out + j *s->img_x*target;
4794 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4795 for (i=0; i < (int) s->img_x*target; ++i) {
4796 t = p1[i], p1[i] = p2[i], p2[i] = t;
4797 }
4798 }
4799 }
4800
4801 if (req_comp && req_comp != target) {
4802 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4803 if (out == NULL) return out; // stbi__convert_format frees input on failure
4804 }
4805
4806 *x = s->img_x;
4807 *y = s->img_y;
4808 if (comp) *comp = s->img_n;
4809 return out;
4810 }
4811 #endif
4812
4813 // Targa Truevision - TGA
4814 // by Jonathan Dummer
4815 #ifndef STBI_NO_TGA
stbi__tga_info(stbi__context * s,int * x,int * y,int * comp)4816 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4817 {
4818 int tga_w, tga_h, tga_comp;
4819 int sz;
4820 stbi__get8(s); // discard Offset
4821 sz = stbi__get8(s); // color type
4822 if( sz > 1 ) {
4823 stbi__rewind(s);
4824 return 0; // only RGB or indexed allowed
4825 }
4826 sz = stbi__get8(s); // image type
4827 // only RGB or grey allowed, +/- RLE
4828 if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
4829 stbi__skip(s,9);
4830 tga_w = stbi__get16le(s);
4831 if( tga_w < 1 ) {
4832 stbi__rewind(s);
4833 return 0; // test width
4834 }
4835 tga_h = stbi__get16le(s);
4836 if( tga_h < 1 ) {
4837 stbi__rewind(s);
4838 return 0; // test height
4839 }
4840 sz = stbi__get8(s); // bits per pixel
4841 // only RGB or RGBA or grey allowed
4842 if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
4843 stbi__rewind(s);
4844 return 0;
4845 }
4846 tga_comp = sz;
4847 if (x) *x = tga_w;
4848 if (y) *y = tga_h;
4849 if (comp) *comp = tga_comp / 8;
4850 return 1; // seems to have passed everything
4851 }
4852
stbi__tga_test(stbi__context * s)4853 static int stbi__tga_test(stbi__context *s)
4854 {
4855 int res;
4856 int sz;
4857 stbi__get8(s); // discard Offset
4858 sz = stbi__get8(s); // color type
4859 if ( sz > 1 ) return 0; // only RGB or indexed allowed
4860 sz = stbi__get8(s); // image type
4861 if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0; // only RGB or grey allowed, +/- RLE
4862 stbi__get16be(s); // discard palette start
4863 stbi__get16be(s); // discard palette length
4864 stbi__get8(s); // discard bits per palette color entry
4865 stbi__get16be(s); // discard x origin
4866 stbi__get16be(s); // discard y origin
4867 if ( stbi__get16be(s) < 1 ) return 0; // test width
4868 if ( stbi__get16be(s) < 1 ) return 0; // test height
4869 sz = stbi__get8(s); // bits per pixel
4870 if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) )
4871 res = 0;
4872 else
4873 res = 1;
4874 stbi__rewind(s);
4875 return res;
4876 }
4877
stbi__tga_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)4878 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4879 {
4880 // read in the TGA header stuff
4881 int tga_offset = stbi__get8(s);
4882 int tga_indexed = stbi__get8(s);
4883 int tga_image_type = stbi__get8(s);
4884 int tga_is_RLE = 0;
4885 int tga_palette_start = stbi__get16le(s);
4886 int tga_palette_len = stbi__get16le(s);
4887 int tga_palette_bits = stbi__get8(s);
4888 int tga_x_origin = stbi__get16le(s);
4889 int tga_y_origin = stbi__get16le(s);
4890 int tga_width = stbi__get16le(s);
4891 int tga_height = stbi__get16le(s);
4892 int tga_bits_per_pixel = stbi__get8(s);
4893 int tga_comp = tga_bits_per_pixel / 8;
4894 int tga_inverted = stbi__get8(s);
4895 // image data
4896 unsigned char *tga_data;
4897 unsigned char *tga_palette = NULL;
4898 int i, j;
4899 unsigned char raw_data[4] = {0};
4900 int RLE_count = 0;
4901 int RLE_repeating = 0;
4902 int read_next_pixel = 1;
4903
4904 // do a tiny bit of precessing
4905 if ( tga_image_type >= 8 )
4906 {
4907 tga_image_type -= 8;
4908 tga_is_RLE = 1;
4909 }
4910 /* int tga_alpha_bits = tga_inverted & 15; */
4911 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
4912
4913 // error check
4914 if ( //(tga_indexed) ||
4915 (tga_width < 1) || (tga_height < 1) ||
4916 (tga_image_type < 1) || (tga_image_type > 3) ||
4917 ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
4918 (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
4919 )
4920 {
4921 return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
4922 }
4923
4924 // If I'm paletted, then I'll use the number of bits from the palette
4925 if ( tga_indexed )
4926 {
4927 tga_comp = tga_palette_bits / 8;
4928 }
4929
4930 // tga info
4931 *x = tga_width;
4932 *y = tga_height;
4933 if (comp) *comp = tga_comp;
4934
4935 tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
4936 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
4937
4938 // skip to the data's starting position (offset usually = 0)
4939 stbi__skip(s, tga_offset );
4940
4941 if ( !tga_indexed && !tga_is_RLE) {
4942 for (i=0; i < tga_height; ++i) {
4943 int y = tga_inverted ? tga_height -i - 1 : i;
4944 stbi_uc *tga_row = tga_data + y*tga_width*tga_comp;
4945 stbi__getn(s, tga_row, tga_width * tga_comp);
4946 }
4947 } else {
4948 // do I need to load a palette?
4949 if ( tga_indexed)
4950 {
4951 // any data to skip? (offset usually = 0)
4952 stbi__skip(s, tga_palette_start );
4953 // load the palette
4954 tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_palette_bits / 8 );
4955 if (!tga_palette) {
4956 STBI_FREE(tga_data);
4957 return stbi__errpuc("outofmem", "Out of memory");
4958 }
4959 if (!stbi__getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
4960 STBI_FREE(tga_data);
4961 STBI_FREE(tga_palette);
4962 return stbi__errpuc("bad palette", "Corrupt TGA");
4963 }
4964 }
4965 // load the data
4966 for (i=0; i < tga_width * tga_height; ++i)
4967 {
4968 // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
4969 if ( tga_is_RLE )
4970 {
4971 if ( RLE_count == 0 )
4972 {
4973 // yep, get the next byte as a RLE command
4974 int RLE_cmd = stbi__get8(s);
4975 RLE_count = 1 + (RLE_cmd & 127);
4976 RLE_repeating = RLE_cmd >> 7;
4977 read_next_pixel = 1;
4978 } else if ( !RLE_repeating )
4979 {
4980 read_next_pixel = 1;
4981 }
4982 } else
4983 {
4984 read_next_pixel = 1;
4985 }
4986 // OK, if I need to read a pixel, do it now
4987 if ( read_next_pixel )
4988 {
4989 // load however much data we did have
4990 if ( tga_indexed )
4991 {
4992 // read in 1 byte, then perform the lookup
4993 int pal_idx = stbi__get8(s);
4994 if ( pal_idx >= tga_palette_len )
4995 {
4996 // invalid index
4997 pal_idx = 0;
4998 }
4999 pal_idx *= tga_bits_per_pixel / 8;
5000 for (j = 0; j*8 < tga_bits_per_pixel; ++j)
5001 {
5002 raw_data[j] = tga_palette[pal_idx+j];
5003 }
5004 } else
5005 {
5006 // read in the data raw
5007 for (j = 0; j*8 < tga_bits_per_pixel; ++j)
5008 {
5009 raw_data[j] = stbi__get8(s);
5010 }
5011 }
5012 // clear the reading flag for the next pixel
5013 read_next_pixel = 0;
5014 } // end of reading a pixel
5015
5016 // copy data
5017 for (j = 0; j < tga_comp; ++j)
5018 tga_data[i*tga_comp+j] = raw_data[j];
5019
5020 // in case we're in RLE mode, keep counting down
5021 --RLE_count;
5022 }
5023 // do I need to invert the image?
5024 if ( tga_inverted )
5025 {
5026 for (j = 0; j*2 < tga_height; ++j)
5027 {
5028 int index1 = j * tga_width * tga_comp;
5029 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
5030 for (i = tga_width * tga_comp; i > 0; --i)
5031 {
5032 unsigned char temp = tga_data[index1];
5033 tga_data[index1] = tga_data[index2];
5034 tga_data[index2] = temp;
5035 ++index1;
5036 ++index2;
5037 }
5038 }
5039 }
5040 // clear my palette, if I had one
5041 if ( tga_palette != NULL )
5042 {
5043 STBI_FREE( tga_palette );
5044 }
5045 }
5046
5047 // swap RGB
5048 if (tga_comp >= 3)
5049 {
5050 unsigned char* tga_pixel = tga_data;
5051 for (i=0; i < tga_width * tga_height; ++i)
5052 {
5053 unsigned char temp = tga_pixel[0];
5054 tga_pixel[0] = tga_pixel[2];
5055 tga_pixel[2] = temp;
5056 tga_pixel += tga_comp;
5057 }
5058 }
5059
5060 // convert to target component count
5061 if (req_comp && req_comp != tga_comp)
5062 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5063
5064 // the things I do to get rid of an error message, and yet keep
5065 // Microsoft's C compilers happy... [8^(
5066 tga_palette_start = tga_palette_len = tga_palette_bits =
5067 tga_x_origin = tga_y_origin = 0;
5068 // OK, done
5069 return tga_data;
5070 }
5071 #endif
5072
5073 /* Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB */
5074
5075 #ifndef STBI_NO_PSD
stbi__psd_test(stbi__context * s)5076 static int stbi__psd_test(stbi__context *s)
5077 {
5078 int r = (stbi__get32be(s) == 0x38425053);
5079 stbi__rewind(s);
5080 return r;
5081 }
5082
stbi__psd_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5083 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5084 {
5085 int pixelCount;
5086 int channelCount, compression;
5087 int channel, i, count, len;
5088 int w,h;
5089 stbi_uc *out;
5090
5091 /* Check identifier */
5092 if (stbi__get32be(s) != 0x38425053) // "8BPS"
5093 return stbi__errpuc("not PSD", "Corrupt PSD image");
5094
5095 /* Check file type version. */
5096 if (stbi__get16be(s) != 1)
5097 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5098
5099 /* Skip 6 reserved bytes. */
5100 stbi__skip(s, 6 );
5101
5102 /* Read the number of channels (R, G, B, A, etc). */
5103 channelCount = stbi__get16be(s);
5104 if (channelCount < 0 || channelCount > 16)
5105 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5106
5107 /* Read the rows and columns of the image. */
5108 h = stbi__get32be(s);
5109 w = stbi__get32be(s);
5110
5111 /* Make sure the depth is 8 bits. */
5112 if (stbi__get16be(s) != 8)
5113 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 bit");
5114
5115 // Make sure the color mode is RGB.
5116 // Valid options are:
5117 // 0: Bitmap
5118 // 1: Grayscale
5119 // 2: Indexed color
5120 // 3: RGB color
5121 // 4: CMYK color
5122 // 7: Multichannel
5123 // 8: Duotone
5124 // 9: Lab color
5125 if (stbi__get16be(s) != 3)
5126 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5127
5128 /* Skip the Mode Data. (It's the palette for indexed color; other info for other modes.) */
5129 stbi__skip(s,stbi__get32be(s) );
5130
5131 /* Skip the image resources. (resolution, pen tool paths, etc) */
5132 stbi__skip(s, stbi__get32be(s) );
5133
5134 /* Skip the reserved data. */
5135 stbi__skip(s, stbi__get32be(s) );
5136
5137 // Find out if the data is compressed.
5138 // Known values:
5139 // 0: no compression
5140 // 1: RLE compressed
5141 compression = stbi__get16be(s);
5142 if (compression > 1)
5143 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5144
5145 // Create the destination image.
5146 out = (stbi_uc *) stbi__malloc(4 * w*h);
5147 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5148 pixelCount = w*h;
5149
5150 /* Finally, the image data. */
5151 if (compression)
5152 {
5153 // RLE as used by .PSD and .TIFF
5154 // Loop until you get the number of unpacked bytes you are expecting:
5155 // Read the next source byte into n.
5156 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5157 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5158 // Else if n is 128, noop.
5159 // Endloop
5160
5161 /* The RLE-compressed data is preceeded by a 2-byte data count
5162 * for each row in the data, which we're going to just skip. */
5163 stbi__skip(s, h * channelCount * 2 );
5164
5165 // Read the RLE data by channel.
5166 for (channel = 0; channel < 4; channel++) {
5167 stbi_uc *p;
5168
5169 p = out+channel;
5170 if (channel >= channelCount) {
5171 // Fill this channel with default data.
5172 for (i = 0; i < pixelCount; i++, p += 4)
5173 *p = (channel == 3 ? 255 : 0);
5174 } else {
5175 // Read the RLE data.
5176 count = 0;
5177 while (count < pixelCount) {
5178 len = stbi__get8(s);
5179 if (len == 128) {
5180 // No-op.
5181 } else if (len < 128) {
5182 // Copy next len+1 bytes literally.
5183 len++;
5184 count += len;
5185 while (len) {
5186 *p = stbi__get8(s);
5187 p += 4;
5188 len--;
5189 }
5190 } else if (len > 128) {
5191 stbi_uc val;
5192 // Next -len+1 bytes in the dest are replicated from next source byte.
5193 // (Interpret len as a negative 8-bit int.)
5194 len ^= 0x0FF;
5195 len += 2;
5196 val = stbi__get8(s);
5197 count += len;
5198 while (len) {
5199 *p = val;
5200 p += 4;
5201 len--;
5202 }
5203 }
5204 }
5205 }
5206 }
5207
5208 } else {
5209 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5210 // where each channel consists of an 8-bit value for each pixel in the image.
5211
5212 // Read the data by channel.
5213 for (channel = 0; channel < 4; channel++) {
5214 stbi_uc *p;
5215
5216 p = out + channel;
5217 if (channel > channelCount) {
5218 // Fill this channel with default data.
5219 for (i = 0; i < pixelCount; i++, p += 4)
5220 *p = channel == 3 ? 255 : 0;
5221 } else {
5222 // Read the data.
5223 for (i = 0; i < pixelCount; i++, p += 4)
5224 *p = stbi__get8(s);
5225 }
5226 }
5227 }
5228
5229 if (req_comp && req_comp != 4) {
5230 out = stbi__convert_format(out, 4, req_comp, w, h);
5231 if (out == NULL) return out; // stbi__convert_format frees input on failure
5232 }
5233
5234 if (comp) *comp = 4;
5235 *y = h;
5236 *x = w;
5237
5238 return out;
5239 }
5240 #endif
5241
5242 /* *************************************************************************************************
5243 * Softimage PIC loader
5244 * by Tom Seddon
5245 *
5246 * See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5247 * See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5248 */
5249
5250 #ifndef STBI_NO_PIC
stbi__pic_is4(stbi__context * s,const char * str)5251 static int stbi__pic_is4(stbi__context *s,const char *str)
5252 {
5253 int i;
5254 for (i=0; i<4; ++i)
5255 if (stbi__get8(s) != (stbi_uc)str[i])
5256 return 0;
5257
5258 return 1;
5259 }
5260
stbi__pic_test_core(stbi__context * s)5261 static int stbi__pic_test_core(stbi__context *s)
5262 {
5263 int i;
5264
5265 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5266 return 0;
5267
5268 for(i=0;i<84;++i)
5269 stbi__get8(s);
5270
5271 if (!stbi__pic_is4(s,"PICT"))
5272 return 0;
5273
5274 return 1;
5275 }
5276
5277 typedef struct
5278 {
5279 stbi_uc size,type,channel;
5280 } stbi__pic_packet;
5281
stbi__readval(stbi__context * s,int channel,stbi_uc * dest)5282 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5283 {
5284 int mask=0x80, i;
5285
5286 for (i=0; i<4; ++i, mask>>=1) {
5287 if (channel & mask) {
5288 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5289 dest[i]=stbi__get8(s);
5290 }
5291 }
5292
5293 return dest;
5294 }
5295
stbi__copyval(int channel,stbi_uc * dest,const stbi_uc * src)5296 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5297 {
5298 int mask=0x80,i;
5299
5300 for (i=0;i<4; ++i, mask>>=1)
5301 if (channel&mask)
5302 dest[i]=src[i];
5303 }
5304
stbi__pic_load_core(stbi__context * s,int width,int height,int * comp,stbi_uc * result)5305 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5306 {
5307 int act_comp=0,num_packets=0,y,chained;
5308 stbi__pic_packet packets[10];
5309
5310 /* this will (should...) cater for even some bizarre stuff like having data
5311 * for the same channel in multiple packets.
5312 */
5313 do
5314 {
5315 stbi__pic_packet *packet;
5316
5317 if (num_packets==sizeof(packets)/sizeof(packets[0]))
5318 return stbi__errpuc("bad format","too many packets");
5319
5320 packet = &packets[num_packets++];
5321
5322 chained = stbi__get8(s);
5323 packet->size = stbi__get8(s);
5324 packet->type = stbi__get8(s);
5325 packet->channel = stbi__get8(s);
5326
5327 act_comp |= packet->channel;
5328
5329 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5330 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5331 } while (chained);
5332
5333 *comp = (act_comp & 0x10 ? 4 : 3); /* has alpha channel? */
5334
5335 for(y=0; y<height; ++y) {
5336 int packet_idx;
5337
5338 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5339 stbi__pic_packet *packet = &packets[packet_idx];
5340 stbi_uc *dest = result+y*width*4;
5341
5342 switch (packet->type) {
5343 default:
5344 return stbi__errpuc("bad format","packet has bad compression type");
5345
5346 case 0: {//uncompressed
5347 int x;
5348
5349 for(x=0;x<width;++x, dest+=4)
5350 if (!stbi__readval(s,packet->channel,dest))
5351 return 0;
5352 break;
5353 }
5354
5355 case 1://Pure RLE
5356 {
5357 int left=width, i;
5358
5359 while (left>0) {
5360 stbi_uc count,value[4];
5361
5362 count=stbi__get8(s);
5363 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5364
5365 if (count > left)
5366 count = (stbi_uc) left;
5367
5368 if (!stbi__readval(s,packet->channel,value)) return 0;
5369
5370 for(i=0; i<count; ++i,dest+=4)
5371 stbi__copyval(packet->channel,dest,value);
5372 left -= count;
5373 }
5374 }
5375 break;
5376
5377 case 2: {//Mixed RLE
5378 int left=width;
5379 while (left>0) {
5380 int count = stbi__get8(s), i;
5381 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5382
5383 if (count >= 128) { // Repeated
5384 stbi_uc value[4];
5385 int i;
5386
5387 if (count==128)
5388 count = stbi__get16be(s);
5389 else
5390 count -= 127;
5391 if (count > left)
5392 return stbi__errpuc("bad file","scanline overrun");
5393
5394 if (!stbi__readval(s,packet->channel,value))
5395 return 0;
5396
5397 for(i=0;i<count;++i, dest += 4)
5398 stbi__copyval(packet->channel,dest,value);
5399 } else { // Raw
5400 ++count;
5401 if (count>left) return stbi__errpuc("bad file","scanline overrun");
5402
5403 for(i=0;i<count;++i, dest+=4)
5404 if (!stbi__readval(s,packet->channel,dest))
5405 return 0;
5406 }
5407 left-=count;
5408 }
5409 break;
5410 }
5411 }
5412 }
5413 }
5414
5415 return result;
5416 }
5417
stbi__pic_load(stbi__context * s,int * px,int * py,int * comp,int req_comp)5418 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5419 {
5420 stbi_uc *result;
5421 int i, x,y;
5422
5423 for (i=0; i<92; ++i)
5424 stbi__get8(s);
5425
5426 x = stbi__get16be(s);
5427 y = stbi__get16be(s);
5428 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
5429 if ((1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5430
5431 stbi__get32be(s); /* skip `ratio' */
5432 stbi__get16be(s); /* skip `fields' */
5433 stbi__get16be(s); /* skip `pad' */
5434
5435 /* intermediate buffer is RGBA */
5436 result = (stbi_uc *) stbi__malloc(x*y*4);
5437 memset(result, 0xff, x*y*4);
5438
5439 if (!stbi__pic_load_core(s,x,y,comp, result))
5440 {
5441 STBI_FREE(result);
5442 result=0;
5443 }
5444 *px = x;
5445 *py = y;
5446 if (req_comp == 0) req_comp = *comp;
5447 result=stbi__convert_format(result,4,req_comp,x,y);
5448
5449 return result;
5450 }
5451
stbi__pic_test(stbi__context * s)5452 static int stbi__pic_test(stbi__context *s)
5453 {
5454 int r = stbi__pic_test_core(s);
5455 stbi__rewind(s);
5456 return r;
5457 }
5458 #endif
5459
5460 /* *************************************************************************************************
5461 * GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5462 */
5463
5464 #ifndef STBI_NO_GIF
5465 typedef struct
5466 {
5467 stbi__int16 prefix;
5468 stbi_uc first;
5469 stbi_uc suffix;
5470 } stbi__gif_lzw;
5471
5472 typedef struct
5473 {
5474 int w,h;
5475 stbi_uc *out; /* output buffer (always 4 components) */
5476 int flags, bgindex, ratio, transparent, eflags;
5477 stbi_uc pal[256][4];
5478 stbi_uc lpal[256][4];
5479 stbi__gif_lzw codes[4096];
5480 stbi_uc *color_table;
5481 int parse, step;
5482 int lflags;
5483 int start_x, start_y;
5484 int max_x, max_y;
5485 int cur_x, cur_y;
5486 int line_size;
5487 } stbi__gif;
5488
stbi__gif_test_raw(stbi__context * s)5489 static int stbi__gif_test_raw(stbi__context *s)
5490 {
5491 int sz;
5492 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5493 sz = stbi__get8(s);
5494 if (sz != '9' && sz != '7') return 0;
5495 if (stbi__get8(s) != 'a') return 0;
5496 return 1;
5497 }
5498
stbi__gif_test(stbi__context * s)5499 static int stbi__gif_test(stbi__context *s)
5500 {
5501 int r = stbi__gif_test_raw(s);
5502 stbi__rewind(s);
5503 return r;
5504 }
5505
stbi__gif_parse_colortable(stbi__context * s,stbi_uc pal[256][4],int num_entries,int transp)5506 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5507 {
5508 int i;
5509 for (i=0; i < num_entries; ++i) {
5510 pal[i][2] = stbi__get8(s);
5511 pal[i][1] = stbi__get8(s);
5512 pal[i][0] = stbi__get8(s);
5513 pal[i][3] = transp == i ? 0 : 255;
5514 }
5515 }
5516
stbi__gif_header(stbi__context * s,stbi__gif * g,int * comp,int is_info)5517 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5518 {
5519 stbi_uc version;
5520 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5521 return stbi__err("not GIF", "Corrupt GIF");
5522
5523 version = stbi__get8(s);
5524 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5525 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5526
5527 stbi__g_failure_reason = "";
5528 g->w = stbi__get16le(s);
5529 g->h = stbi__get16le(s);
5530 g->flags = stbi__get8(s);
5531 g->bgindex = stbi__get8(s);
5532 g->ratio = stbi__get8(s);
5533 g->transparent = -1;
5534
5535 if (comp != 0) *comp = 4; /* can't actually tell whether it's 3 or 4 until we parse the comments */
5536
5537 if (is_info) return 1;
5538
5539 if (g->flags & 0x80)
5540 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5541
5542 return 1;
5543 }
5544
stbi__gif_info_raw(stbi__context * s,int * x,int * y,int * comp)5545 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5546 {
5547 stbi__gif g;
5548 if (!stbi__gif_header(s, &g, comp, 1)) {
5549 stbi__rewind( s );
5550 return 0;
5551 }
5552 if (x) *x = g.w;
5553 if (y) *y = g.h;
5554 return 1;
5555 }
5556
stbi__out_gif_code(stbi__gif * g,stbi__uint16 code)5557 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5558 {
5559 stbi_uc *p, *c;
5560
5561 /* recurse to decode the prefixes, since the linked-list is backwards,
5562 * and working backwards through an interleaved image would be nasty
5563 */
5564 if (g->codes[code].prefix >= 0)
5565 stbi__out_gif_code(g, g->codes[code].prefix);
5566
5567 if (g->cur_y >= g->max_y)
5568 return;
5569
5570 p = &g->out[g->cur_x + g->cur_y];
5571 c = &g->color_table[g->codes[code].suffix * 4];
5572
5573 if (c[3] >= 128) {
5574 p[0] = c[2];
5575 p[1] = c[1];
5576 p[2] = c[0];
5577 p[3] = c[3];
5578 }
5579 g->cur_x += 4;
5580
5581 if (g->cur_x >= g->max_x) {
5582 g->cur_x = g->start_x;
5583 g->cur_y += g->step;
5584
5585 while (g->cur_y >= g->max_y && g->parse > 0) {
5586 g->step = (1 << g->parse) * g->line_size;
5587 g->cur_y = g->start_y + (g->step >> 1);
5588 --g->parse;
5589 }
5590 }
5591 }
5592
stbi__process_gif_raster(stbi__context * s,stbi__gif * g)5593 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5594 {
5595 stbi_uc lzw_cs;
5596 stbi__int32 len, code;
5597 stbi__uint32 first;
5598 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5599 stbi__gif_lzw *p;
5600
5601 lzw_cs = stbi__get8(s);
5602 if (lzw_cs > 12) return NULL;
5603 clear = 1 << lzw_cs;
5604 first = 1;
5605 codesize = lzw_cs + 1;
5606 codemask = (1 << codesize) - 1;
5607 bits = 0;
5608 valid_bits = 0;
5609 for (code = 0; code < clear; code++) {
5610 g->codes[code].prefix = -1;
5611 g->codes[code].first = (stbi_uc) code;
5612 g->codes[code].suffix = (stbi_uc) code;
5613 }
5614
5615 /* support no starting clear code */
5616 avail = clear+2;
5617 oldcode = -1;
5618
5619 len = 0;
5620 for(;;) {
5621 if (valid_bits < codesize) {
5622 if (len == 0) {
5623 len = stbi__get8(s); /* start new block */
5624 if (len == 0)
5625 return g->out;
5626 }
5627 --len;
5628 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5629 valid_bits += 8;
5630 } else {
5631 stbi__int32 code = bits & codemask;
5632 bits >>= codesize;
5633 valid_bits -= codesize;
5634 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5635 if (code == clear) { // clear code
5636 codesize = lzw_cs + 1;
5637 codemask = (1 << codesize) - 1;
5638 avail = clear + 2;
5639 oldcode = -1;
5640 first = 0;
5641 } else if (code == clear + 1) { // end of stream code
5642 stbi__skip(s, len);
5643 while ((len = stbi__get8(s)) > 0)
5644 stbi__skip(s,len);
5645 return g->out;
5646 } else if (code <= avail) {
5647 if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5648
5649 if (oldcode >= 0) {
5650 p = &g->codes[avail++];
5651 if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5652 p->prefix = (stbi__int16) oldcode;
5653 p->first = g->codes[oldcode].first;
5654 p->suffix = (code == avail) ? p->first : g->codes[code].first;
5655 } else if (code == avail)
5656 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5657
5658 stbi__out_gif_code(g, (stbi__uint16) code);
5659
5660 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5661 codesize++;
5662 codemask = (1 << codesize) - 1;
5663 }
5664
5665 oldcode = code;
5666 } else {
5667 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5668 }
5669 }
5670 }
5671 }
5672
stbi__fill_gif_background(stbi__gif * g)5673 static void stbi__fill_gif_background(stbi__gif *g)
5674 {
5675 int i;
5676 stbi_uc *c = g->pal[g->bgindex];
5677 /* @OPTIMIZE: write a dword at a time */
5678 for (i = 0; i < g->w * g->h * 4; i += 4)
5679 {
5680 stbi_uc *p = &g->out[i];
5681 p[0] = c[2];
5682 p[1] = c[1];
5683 p[2] = c[0];
5684 p[3] = c[3];
5685 }
5686 }
5687
5688 /* this function is designed to support animated gifs, although stb_image doesn't support it */
stbi__gif_load_next(stbi__context * s,stbi__gif * g,int * comp,int req_comp)5689 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5690 {
5691 int i;
5692 stbi_uc *old_out = 0;
5693
5694 if (g->out == 0) {
5695 if (!stbi__gif_header(s, g, comp,0))
5696 return 0;
5697
5698 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5699 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5700 stbi__fill_gif_background(g);
5701 } else {
5702 // animated-gif-only path
5703 if (((g->eflags & 0x1C) >> 2) == 3) {
5704 old_out = g->out;
5705 g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5706 if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5707 memcpy(g->out, old_out, g->w*g->h*4);
5708 }
5709 }
5710
5711 for (;;) {
5712 switch (stbi__get8(s)) {
5713 case 0x2C: /* Image Descriptor */
5714 {
5715 stbi__int32 x, y, w, h;
5716 stbi_uc *o;
5717
5718 x = stbi__get16le(s);
5719 y = stbi__get16le(s);
5720 w = stbi__get16le(s);
5721 h = stbi__get16le(s);
5722 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5723 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5724
5725 g->line_size = g->w * 4;
5726 g->start_x = x * 4;
5727 g->start_y = y * g->line_size;
5728 g->max_x = g->start_x + w * 4;
5729 g->max_y = g->start_y + h * g->line_size;
5730 g->cur_x = g->start_x;
5731 g->cur_y = g->start_y;
5732
5733 g->lflags = stbi__get8(s);
5734
5735 if (g->lflags & 0x40) {
5736 g->step = 8 * g->line_size; // first interlaced spacing
5737 g->parse = 3;
5738 } else {
5739 g->step = g->line_size;
5740 g->parse = 0;
5741 }
5742
5743 if (g->lflags & 0x80) {
5744 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5745 g->color_table = (stbi_uc *) g->lpal;
5746 } else if (g->flags & 0x80) {
5747 for (i=0; i < 256; ++i) // @OPTIMIZE: stbi__jpeg_reset only the previous transparent
5748 g->pal[i][3] = 255;
5749 if (g->transparent >= 0 && (g->eflags & 0x01))
5750 g->pal[g->transparent][3] = 0;
5751 g->color_table = (stbi_uc *) g->pal;
5752 } else
5753 return stbi__errpuc("missing color table", "Corrupt GIF");
5754
5755 o = stbi__process_gif_raster(s, g);
5756 if (o == NULL) return NULL;
5757
5758 if (req_comp && req_comp != 4)
5759 o = stbi__convert_format(o, 4, req_comp, g->w, g->h);
5760 return o;
5761 }
5762
5763 case 0x21: // Comment Extension.
5764 {
5765 int len;
5766 if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
5767 len = stbi__get8(s);
5768 if (len == 4) {
5769 g->eflags = stbi__get8(s);
5770 stbi__get16le(s); // delay
5771 g->transparent = stbi__get8(s);
5772 } else {
5773 stbi__skip(s, len);
5774 break;
5775 }
5776 }
5777 while ((len = stbi__get8(s)) != 0)
5778 stbi__skip(s, len);
5779 break;
5780 }
5781
5782 case 0x3B: // gif stream termination code
5783 return (stbi_uc *) s; // using '1' causes warning on some compilers
5784
5785 default:
5786 return stbi__errpuc("unknown code", "Corrupt GIF");
5787 }
5788 }
5789 }
5790
stbi__gif_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5791 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5792 {
5793 stbi_uc *u = 0;
5794 stbi__gif g;
5795 memset(&g, 0, sizeof(g));
5796
5797 u = stbi__gif_load_next(s, &g, comp, req_comp);
5798 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
5799 if (u) {
5800 *x = g.w;
5801 *y = g.h;
5802 }
5803
5804 return u;
5805 }
5806
stbi__gif_info(stbi__context * s,int * x,int * y,int * comp)5807 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
5808 {
5809 return stbi__gif_info_raw(s,x,y,comp);
5810 }
5811 #endif
5812
5813 // *************************************************************************************************
5814 // Radiance RGBE HDR loader
5815 // originally by Nicolas Schulz
5816 #ifndef STBI_NO_HDR
stbi__hdr_test_core(stbi__context * s)5817 static int stbi__hdr_test_core(stbi__context *s)
5818 {
5819 const char *signature = "#?RADIANCE\n";
5820 int i;
5821 for (i=0; signature[i]; ++i)
5822 if (stbi__get8(s) != signature[i])
5823 return 0;
5824 return 1;
5825 }
5826
stbi__hdr_test(stbi__context * s)5827 static int stbi__hdr_test(stbi__context* s)
5828 {
5829 int r = stbi__hdr_test_core(s);
5830 stbi__rewind(s);
5831 return r;
5832 }
5833
5834 #define STBI__HDR_BUFLEN 1024
stbi__hdr_gettoken(stbi__context * z,char * buffer)5835 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
5836 {
5837 int len=0;
5838 char c = '\0';
5839
5840 c = (char) stbi__get8(z);
5841
5842 while (!stbi__at_eof(z) && c != '\n') {
5843 buffer[len++] = c;
5844 if (len == STBI__HDR_BUFLEN-1) {
5845 // flush to end of line
5846 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
5847 ;
5848 break;
5849 }
5850 c = (char) stbi__get8(z);
5851 }
5852
5853 buffer[len] = 0;
5854 return buffer;
5855 }
5856
stbi__hdr_convert(float * output,stbi_uc * input,int req_comp)5857 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
5858 {
5859 if ( input[3] != 0 ) {
5860 float f1;
5861 // Exponent
5862 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
5863 if (req_comp <= 2)
5864 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
5865 else {
5866 output[0] = input[0] * f1;
5867 output[1] = input[1] * f1;
5868 output[2] = input[2] * f1;
5869 }
5870 if (req_comp == 2) output[1] = 1;
5871 if (req_comp == 4) output[3] = 1;
5872 } else {
5873 switch (req_comp) {
5874 case 4: output[3] = 1; /* fallthrough */
5875 case 3: output[0] = output[1] = output[2] = 0;
5876 break;
5877 case 2: output[1] = 1; /* fallthrough */
5878 case 1: output[0] = 0;
5879 break;
5880 }
5881 }
5882 }
5883
stbi__hdr_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)5884 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5885 {
5886 char buffer[STBI__HDR_BUFLEN];
5887 char *token;
5888 int valid = 0;
5889 int width, height;
5890 stbi_uc *scanline;
5891 float *hdr_data;
5892 int len;
5893 unsigned char count, value;
5894 int i, j, k, c1,c2, z;
5895
5896
5897 // Check identifier
5898 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
5899 return stbi__errpf("not HDR", "Corrupt HDR image");
5900
5901 // Parse header
5902 for(;;) {
5903 token = stbi__hdr_gettoken(s,buffer);
5904 if (token[0] == 0) break;
5905 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
5906 }
5907
5908 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
5909
5910 // Parse width and height
5911 // can't use sscanf() if we're not using stdio!
5912 token = stbi__hdr_gettoken(s,buffer);
5913 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
5914 token += 3;
5915 height = (int) strtol(token, &token, 10);
5916 while (*token == ' ') ++token;
5917 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
5918 token += 3;
5919 width = (int) strtol(token, NULL, 10);
5920
5921 *x = width;
5922 *y = height;
5923
5924 if (comp) *comp = 3;
5925 if (req_comp == 0) req_comp = 3;
5926
5927 // Read data
5928 hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
5929
5930 // Load image data
5931 // image data is stored as some number of sca
5932 if ( width < 8 || width >= 32768) {
5933 // Read flat data
5934 for (j=0; j < height; ++j) {
5935 for (i=0; i < width; ++i) {
5936 stbi_uc rgbe[4];
5937 main_decode_loop:
5938 stbi__getn(s, rgbe, 4);
5939 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
5940 }
5941 }
5942 } else {
5943 /* Read RLE-encoded data */
5944 scanline = NULL;
5945
5946 for (j = 0; j < height; ++j) {
5947 c1 = stbi__get8(s);
5948 c2 = stbi__get8(s);
5949 len = stbi__get8(s);
5950 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
5951 /* not run-length encoded, so we have to
5952 * actually use THIS data as a decoded
5953 * pixel (note this can't be a valid pixel
5954 * --one of RGB must be >= 128) */
5955 stbi_uc rgbe[4];
5956 rgbe[0] = (stbi_uc) c1;
5957 rgbe[1] = (stbi_uc) c2;
5958 rgbe[2] = (stbi_uc) len;
5959 rgbe[3] = (stbi_uc) stbi__get8(s);
5960 stbi__hdr_convert(hdr_data, rgbe, req_comp);
5961 i = 1;
5962 j = 0;
5963 STBI_FREE(scanline);
5964 goto main_decode_loop; // yes, this makes no sense
5965 }
5966 len <<= 8;
5967 len |= stbi__get8(s);
5968 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
5969 if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
5970
5971 for (k = 0; k < 4; ++k) {
5972 i = 0;
5973 while (i < width) {
5974 count = stbi__get8(s);
5975 if (count > 128) {
5976 // Run
5977 value = stbi__get8(s);
5978 count -= 128;
5979 for (z = 0; z < count; ++z)
5980 scanline[i++ * 4 + k] = value;
5981 } else {
5982 // Dump
5983 for (z = 0; z < count; ++z)
5984 scanline[i++ * 4 + k] = stbi__get8(s);
5985 }
5986 }
5987 }
5988 for (i=0; i < width; ++i)
5989 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
5990 }
5991 STBI_FREE(scanline);
5992 }
5993
5994 return hdr_data;
5995 }
5996
stbi__hdr_info(stbi__context * s,int * x,int * y,int * comp)5997 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
5998 {
5999 char buffer[STBI__HDR_BUFLEN];
6000 char *token;
6001 int valid = 0;
6002
6003 if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
6004 stbi__rewind( s );
6005 return 0;
6006 }
6007
6008 for(;;) {
6009 token = stbi__hdr_gettoken(s,buffer);
6010 if (token[0] == 0) break;
6011 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
6012 }
6013
6014 if (!valid) {
6015 stbi__rewind( s );
6016 return 0;
6017 }
6018 token = stbi__hdr_gettoken(s,buffer);
6019 if (strncmp(token, "-Y ", 3)) {
6020 stbi__rewind( s );
6021 return 0;
6022 }
6023 token += 3;
6024 *y = (int) strtol(token, &token, 10);
6025 while (*token == ' ') ++token;
6026 if (strncmp(token, "+X ", 3)) {
6027 stbi__rewind( s );
6028 return 0;
6029 }
6030 token += 3;
6031 *x = (int) strtol(token, NULL, 10);
6032 *comp = 3;
6033 return 1;
6034 }
6035 #endif /* STBI_NO_HDR */
6036
6037 #ifndef STBI_NO_BMP
stbi__bmp_info(stbi__context * s,int * x,int * y,int * comp)6038 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
6039 {
6040 int hsz;
6041 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') {
6042 stbi__rewind( s );
6043 return 0;
6044 }
6045 stbi__skip(s,12);
6046 hsz = stbi__get32le(s);
6047 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) {
6048 stbi__rewind( s );
6049 return 0;
6050 }
6051 if (hsz == 12) {
6052 *x = stbi__get16le(s);
6053 *y = stbi__get16le(s);
6054 } else {
6055 *x = stbi__get32le(s);
6056 *y = stbi__get32le(s);
6057 }
6058 if (stbi__get16le(s) != 1) {
6059 stbi__rewind( s );
6060 return 0;
6061 }
6062 *comp = stbi__get16le(s) / 8;
6063 return 1;
6064 }
6065 #endif
6066
6067 #ifndef STBI_NO_PSD
stbi__psd_info(stbi__context * s,int * x,int * y,int * comp)6068 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6069 {
6070 int channelCount;
6071 if (stbi__get32be(s) != 0x38425053) {
6072 stbi__rewind( s );
6073 return 0;
6074 }
6075 if (stbi__get16be(s) != 1) {
6076 stbi__rewind( s );
6077 return 0;
6078 }
6079 stbi__skip(s, 6);
6080 channelCount = stbi__get16be(s);
6081 if (channelCount < 0 || channelCount > 16) {
6082 stbi__rewind( s );
6083 return 0;
6084 }
6085 *y = stbi__get32be(s);
6086 *x = stbi__get32be(s);
6087 if (stbi__get16be(s) != 8) {
6088 stbi__rewind( s );
6089 return 0;
6090 }
6091 if (stbi__get16be(s) != 3) {
6092 stbi__rewind( s );
6093 return 0;
6094 }
6095 *comp = 4;
6096 return 1;
6097 }
6098 #endif
6099
6100 #ifndef STBI_NO_PIC
stbi__pic_info(stbi__context * s,int * x,int * y,int * comp)6101 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6102 {
6103 int act_comp=0,num_packets=0,chained;
6104 stbi__pic_packet packets[10];
6105
6106 stbi__skip(s, 92);
6107
6108 *x = stbi__get16be(s);
6109 *y = stbi__get16be(s);
6110 if (stbi__at_eof(s)) return 0;
6111 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6112 stbi__rewind( s );
6113 return 0;
6114 }
6115
6116 stbi__skip(s, 8);
6117
6118 do {
6119 stbi__pic_packet *packet;
6120
6121 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6122 return 0;
6123
6124 packet = &packets[num_packets++];
6125 chained = stbi__get8(s);
6126 packet->size = stbi__get8(s);
6127 packet->type = stbi__get8(s);
6128 packet->channel = stbi__get8(s);
6129 act_comp |= packet->channel;
6130
6131 if (stbi__at_eof(s)) {
6132 stbi__rewind( s );
6133 return 0;
6134 }
6135 if (packet->size != 8) {
6136 stbi__rewind( s );
6137 return 0;
6138 }
6139 } while (chained);
6140
6141 *comp = (act_comp & 0x10 ? 4 : 3);
6142
6143 return 1;
6144 }
6145 #endif
6146
6147 // *************************************************************************************************
6148 // Portable Gray Map and Portable Pixel Map loader
6149 // by Ken Miller
6150 //
6151 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6152 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6153 //
6154 // Known limitations:
6155 // Does not support comments in the header section
6156 // Does not support ASCII image data (formats P2 and P3)
6157 // Does not support 16-bit-per-channel
6158
6159 #ifndef STBI_NO_PNM
6160
stbi__pnm_test(stbi__context * s)6161 static int stbi__pnm_test(stbi__context *s)
6162 {
6163 char p, t;
6164 p = (char) stbi__get8(s);
6165 t = (char) stbi__get8(s);
6166 if (p != 'P' || (t != '5' && t != '6')) {
6167 stbi__rewind( s );
6168 return 0;
6169 }
6170 return 1;
6171 }
6172
stbi__pnm_load(stbi__context * s,int * x,int * y,int * comp,int req_comp)6173 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6174 {
6175 stbi_uc *out;
6176 if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6177 return 0;
6178 *x = s->img_x;
6179 *y = s->img_y;
6180 *comp = s->img_n;
6181
6182 out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6183 if (!out) return stbi__errpuc("outofmem", "Out of memory");
6184 stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6185
6186 if (req_comp && req_comp != s->img_n) {
6187 out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6188 if (out == NULL) return out; // stbi__convert_format frees input on failure
6189 }
6190 return out;
6191 }
6192
stbi__pnm_isspace(char c)6193 static int stbi__pnm_isspace(char c)
6194 {
6195 return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6196 }
6197
stbi__pnm_skip_whitespace(stbi__context * s,char * c)6198 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6199 {
6200 while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6201 *c = (char) stbi__get8(s);
6202 }
6203
stbi__pnm_isdigit(char c)6204 static int stbi__pnm_isdigit(char c)
6205 {
6206 return c >= '0' && c <= '9';
6207 }
6208
stbi__pnm_getinteger(stbi__context * s,char * c)6209 static int stbi__pnm_getinteger(stbi__context *s, char *c)
6210 {
6211 int value = 0;
6212
6213 while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6214 value = value*10 + (*c - '0');
6215 *c = (char) stbi__get8(s);
6216 }
6217
6218 return value;
6219 }
6220
stbi__pnm_info(stbi__context * s,int * x,int * y,int * comp)6221 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6222 {
6223 int maxv;
6224 char c, p, t;
6225
6226 stbi__rewind( s );
6227
6228 // Get identifier
6229 p = (char) stbi__get8(s);
6230 t = (char) stbi__get8(s);
6231 if (p != 'P' || (t != '5' && t != '6')) {
6232 stbi__rewind( s );
6233 return 0;
6234 }
6235
6236 *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
6237
6238 c = (char) stbi__get8(s);
6239 stbi__pnm_skip_whitespace(s, &c);
6240
6241 *x = stbi__pnm_getinteger(s, &c); // read width
6242 stbi__pnm_skip_whitespace(s, &c);
6243
6244 *y = stbi__pnm_getinteger(s, &c); // read height
6245 stbi__pnm_skip_whitespace(s, &c);
6246
6247 maxv = stbi__pnm_getinteger(s, &c); // read max value
6248
6249 if (maxv > 255)
6250 return stbi__err("max value > 255", "PPM image not 8-bit");
6251 else
6252 return 1;
6253 }
6254 #endif
6255
stbi__info_main(stbi__context * s,int * x,int * y,int * comp)6256 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
6257 {
6258 #ifndef STBI_NO_JPEG
6259 if (stbi__jpeg_info(s, x, y, comp)) return 1;
6260 #endif
6261
6262 #ifndef STBI_NO_PNG
6263 if (stbi__png_info(s, x, y, comp)) return 1;
6264 #endif
6265
6266 #ifndef STBI_NO_GIF
6267 if (stbi__gif_info(s, x, y, comp)) return 1;
6268 #endif
6269
6270 #ifndef STBI_NO_BMP
6271 if (stbi__bmp_info(s, x, y, comp)) return 1;
6272 #endif
6273
6274 #ifndef STBI_NO_PSD
6275 if (stbi__psd_info(s, x, y, comp)) return 1;
6276 #endif
6277
6278 #ifndef STBI_NO_PIC
6279 if (stbi__pic_info(s, x, y, comp)) return 1;
6280 #endif
6281
6282 #ifndef STBI_NO_PNM
6283 if (stbi__pnm_info(s, x, y, comp)) return 1;
6284 #endif
6285
6286 #ifndef STBI_NO_HDR
6287 if (stbi__hdr_info(s, x, y, comp)) return 1;
6288 #endif
6289
6290 // test tga last because it's a crappy test!
6291 #ifndef STBI_NO_TGA
6292 if (stbi__tga_info(s, x, y, comp))
6293 return 1;
6294 #endif
6295 return stbi__err("unknown image type", "Image not of any known type, or corrupt");
6296 }
6297
6298 #ifndef STBI_NO_STDIO
stbi_info(char const * filename,int * x,int * y,int * comp)6299 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
6300 {
6301 FILE *f = stbi__fopen(filename, "rb");
6302 int result;
6303 if (!f) return stbi__err("can't fopen", "Unable to open file");
6304 result = stbi_info_from_file(f, x, y, comp);
6305 fclose(f);
6306 return result;
6307 }
6308
stbi_info_from_file(FILE * f,int * x,int * y,int * comp)6309 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
6310 {
6311 int r;
6312 stbi__context s;
6313 long pos = ftell(f);
6314 stbi__start_file(&s, f);
6315 r = stbi__info_main(&s,x,y,comp);
6316 fseek(f,pos,SEEK_SET);
6317 return r;
6318 }
6319 #endif /* !STBI_NO_STDIO */
6320
stbi_info_from_memory(stbi_uc const * buffer,int len,int * x,int * y,int * comp)6321 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
6322 {
6323 stbi__context s;
6324 stbi__start_mem(&s,buffer,len);
6325 return stbi__info_main(&s,x,y,comp);
6326 }
6327
stbi_info_from_callbacks(stbi_io_callbacks const * c,void * user,int * x,int * y,int * comp)6328 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
6329 {
6330 stbi__context s;
6331 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
6332 return stbi__info_main(&s,x,y,comp);
6333 }
6334
6335 #endif /* STB_IMAGE_IMPLEMENTATION */
6336
6337 /*
6338 revision history:
6339 2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
6340 2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
6341 2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
6342 2.03 (2015-04-12) extra corruption checking (mmozeiko)
6343 stbi_set_flip_vertically_on_load (nguillemot)
6344 fix NEON support; fix mingw support
6345 2.02 (2015-01-19) fix incorrect assert, fix warning
6346 2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
6347 2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
6348 2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
6349 progressive JPEG (stb)
6350 PGM/PPM support (Ken Miller)
6351 STBI_MALLOC,STBI_REALLOC,STBI_FREE
6352 GIF bugfix -- seemingly never worked
6353 STBI_NO_*, STBI_ONLY_*
6354 1.48 (2014-12-14) fix incorrectly-named assert()
6355 1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
6356 optimize PNG (ryg)
6357 fix bug in interlaced PNG with user-specified channel count (stb)
6358 1.46 (2014-08-26)
6359 fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
6360 1.45 (2014-08-16)
6361 fix MSVC-ARM internal compiler error by wrapping malloc
6362 1.44 (2014-08-07)
6363 various warning fixes from Ronny Chevalier
6364 1.43 (2014-07-15)
6365 fix MSVC-only compiler problem in code changed in 1.42
6366 1.42 (2014-07-09)
6367 don't define _CRT_SECURE_NO_WARNINGS (affects user code)
6368 fixes to stbi__cleanup_jpeg path
6369 added STBI_ASSERT to avoid requiring assert.h
6370 1.41 (2014-06-25)
6371 fix search&replace from 1.36 that messed up comments/error messages
6372 1.40 (2014-06-22)
6373 fix gcc struct-initialization warning
6374 1.39 (2014-06-15)
6375 fix to TGA optimization when req_comp != number of components in TGA;
6376 fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
6377 add support for BMP version 5 (more ignored fields)
6378 1.38 (2014-06-06)
6379 suppress MSVC warnings on integer casts truncating values
6380 fix accidental rename of 'skip' field of I/O
6381 1.37 (2014-06-04)
6382 remove duplicate typedef
6383 1.36 (2014-06-03)
6384 convert to header file single-file library
6385 if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
6386 1.35 (2014-05-27)
6387 various warnings
6388 fix broken STBI_SIMD path
6389 fix bug where stbi_load_from_file no longer left file pointer in correct place
6390 fix broken non-easy path for 32-bit BMP (possibly never used)
6391 TGA optimization by Arseny Kapoulkine
6392 1.34 (unknown)
6393 use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
6394 1.33 (2011-07-14)
6395 make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
6396 1.32 (2011-07-13)
6397 support for "info" function for all supported filetypes (SpartanJ)
6398 1.31 (2011-06-20)
6399 a few more leak fixes, bug in PNG handling (SpartanJ)
6400 1.30 (2011-06-11)
6401 added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
6402 removed deprecated format-specific test/load functions
6403 removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
6404 error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
6405 fix inefficiency in decoding 32-bit BMP (David Woo)
6406 1.29 (2010-08-16)
6407 various warning fixes from Aurelien Pocheville
6408 1.28 (2010-08-01)
6409 fix bug in GIF palette transparency (SpartanJ)
6410 1.27 (2010-08-01)
6411 cast-to-stbi_uc to fix warnings
6412 1.26 (2010-07-24)
6413 fix bug in file buffering for PNG reported by SpartanJ
6414 1.25 (2010-07-17)
6415 refix trans_data warning (Won Chun)
6416 1.24 (2010-07-12)
6417 perf improvements reading from files on platforms with lock-heavy fgetc()
6418 minor perf improvements for jpeg
6419 deprecated type-specific functions so we'll get feedback if they're needed
6420 attempt to fix trans_data warning (Won Chun)
6421 1.23 fixed bug in iPhone support
6422 1.22 (2010-07-10)
6423 removed image *writing* support
6424 stbi_info support from Jetro Lauha
6425 GIF support from Jean-Marc Lienher
6426 iPhone PNG-extensions from James Brown
6427 warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
6428 1.21 fix use of 'stbi_uc' in header (reported by jon blow)
6429 1.20 added support for Softimage PIC, by Tom Seddon
6430 1.19 bug in interlaced PNG corruption check (found by ryg)
6431 1.18 (2008-08-02)
6432 fix a threading bug (local mutable static)
6433 1.17 support interlaced PNG
6434 1.16 major bugfix - stbi__convert_format converted one too many pixels
6435 1.15 initialize some fields for thread safety
6436 1.14 fix threadsafe conversion bug
6437 header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
6438 1.13 threadsafe
6439 1.12 const qualifiers in the API
6440 1.11 Support installable IDCT, colorspace conversion routines
6441 1.10 Fixes for 64-bit (don't use "unsigned long")
6442 optimized upsampling by Fabian "ryg" Giesen
6443 1.09 Fix format-conversion for PSD code (bad global variables!)
6444 1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
6445 1.07 attempt to fix C++ warning/errors again
6446 1.06 attempt to fix C++ warning/errors again
6447 1.05 fix TGA loading to return correct *comp and use good luminance calc
6448 1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
6449 1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
6450 1.02 support for (subset of) HDR files, float interface for preferred access to them
6451 1.01 fix bug: possible bug in handling right-side up bmps... not sure
6452 fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
6453 1.00 interface to zlib that skips zlib header
6454 0.99 correct handling of alpha in palette
6455 0.98 TGA loader by lonesock; dynamically add loaders (untested)
6456 0.97 jpeg errors on too large a file; also catch another malloc failure
6457 0.96 fix detection of invalid v value - particleman@mollyrocket forum
6458 0.95 during header scan, seek to markers in case of padding
6459 0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
6460 0.93 handle jpegtran output; verbose errors
6461 0.92 read 4,8,16,24,32-bit BMP files of several formats
6462 0.91 output 24-bit Windows 3.0 BMP files
6463 0.90 fix a few more warnings; bump version number to approach 1.0
6464 0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
6465 0.60 fix compiling as c++
6466 0.59 fix warnings: merge Dave Moore's -Wall fixes
6467 0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
6468 0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
6469 0.56 fix bug: zlib uncompressed mode len vs. nlen
6470 0.55 fix bug: restart_interval not initialized to 0
6471 0.54 allow NULL for 'int *comp'
6472 0.53 fix bug in png 3->4; speedup png decoding
6473 0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
6474 0.51 obey req_comp requests, 1-component jpegs return as 1-component,
6475 on 'test' only check type, not whether we support this variant
6476 0.50 (2006-11-19)
6477 first released version
6478 */
6479