== 2 March 2021 ==
gperftools 2.9.1 is out!

Minor fixes landed since previous release:

* OSX builds now prefer backtrace() and have somewhat working heap
  sampling.

* Incorrect assertion failure was fixed that crashed tcmalloc if
  assertions were on and sized delete was used. More details in github
  issue #1254.

== 21 February 2021 ==
gperftools 2.9 is out!

Few more changes landed compared to rc:

* Venkatesh Srinivas has contributed thread-safety annotations
  support.

* couple more unit test bugs that caused tcmalloc_unittest to fail on
  recent clang have been fixed.

* usage of unsupportable linux_syscall_support.h has been removed from
  few places. Building with --disable-heap-checker now completely
  avoids it. Expect complete death of this header in next major
  release.

== 14 February 2021 ==
gperftools 2.9rc is out!

Here are notable changes:

* Jarno Rajahalme has contributed fix for crashing bug in syscalls
  support for aarch64.

* User SSE4 has contributed basic support for Elbrus 2000 architecture
  (!)

* Venkatesh Srinivas has contributed cleanup to atomic ops.

* Đoàn Trần Công Danh has fixed cpu profiler compilation on musl.

* there is now better backtracing support for aarch64 and
  riscv. x86-64 with frame pointers now also defaults to this new
  "generic" frame pointer backtracer.

* emergency malloc is now enabled by default. Fixes hang on musl when
  libgcc backtracer is enabled.

* bunch of legacy config tests have been removed

== 20 December 2020 ==
gperftools 2.8.1 is out!

Here are notable changes:

* previous release contained change to release memory without page
  heap lock, but this change had at least one bug that caused
  crashes and corruption when running under aggressive decommit mode
  (this is not default). While we check for other bugs, this feature
  was reverted. See github issue #1204 and issue #1227.

* stack traces depth captured by gperftools is now up to 254 levels
  deep. Thanks to Kerrick Staley for this small but useful tweak.

* Levon Ter-Grigoryan has contributed small fix for compiler warning.

* Grant Henke has contributed updated detection of program counter
  register for OS X on arm64.

* Tim Gates has contributed small typo fix.

* Steve Langasek has contributed basic build fixes for riscv64 (!).

* Isaac Hier and okhowang have contributed preliminary port of build
  infrastructure to cmake. This works, but it is very preliminary.
  Autotools-based build is the only officially supported build for
  now.

== 6 July 2020 ==
gperftools 2.8 is out!

Here are notable changes:

* ProfilerGetStackTrace is now officially supported API for
  libprofiler. Contributed by Kirill Müller.

* Build failures on mingw were fixed. This fixed issue #1108.

* Build failure of page_heap_test on MSVC was fixed.

* Ryan Macnak contributed fix for compiling linux syscall support on
  i386 and recent GCCs. This fixed issue #1076.

* test failures caused by new gcc 10 optimizations were fixed. Same
  change also fixed tests on clang.

== 8 Mar 2020 ==
gperftools 2.8rc is out!

Here are notable changes:

* building code now requires c++11 or later. Bundled MSVC project was
  converted to Visual Studio 2015.

* User obones contributed fix for windows x64 TLS callbacks. This
  fixed leak of thread caches on thread exits in 64-bit windows.

* releasing memory back to kernel is now made with page heap lock
  dropped.

* HolyWu contributed fix for correct malloc patching on debug builds
  on windows. This configuration previously crashed.

* Romain Geissler contributed fix for tls access during early tls
  initialization on dlopen.

* large allocation reports are now silenced by default.
  Since not all programs want their stderr polluted by those
  messages. Contributed by Junhao Li.

* HolyWu contributed improvements to MSVC project files. Notably,
  there is now project for "overriding" version of tcmalloc.

* MS-specific _recalloc is now correctly zeroing only malloced
  part. This fix was contributed by HolyWu.

* Brian Silverman contributed correctness fix to sampler_test.

* Gabriel Marin ported few fixes from chromium's fork. As part of
  those fixes, we reduced number of static initializers (forbidden in
  chromium). Also we now make syscalls via syscall function instead of
  reimplementing direct way to make syscalls on each platform.

* Brian Silverman fixed flakiness in page heap test.

* There is now configure flag to skip installing perl pprof, since
  external golang pprof is much superior. --disable-deprecated-pprof
  is the flag.

* Fabrice Fontaine contributed fixes to drop use of nonstandard
  __off64_t type.

* Fabrice Fontaine contributed build fix to check for presence of
  nonstandard __sbrk functions. It is only used by mmap hooks code and
  (rightfully) not available on musl.

* Fabrice Fontaine contributed build fix around mmap64 macro and
  function conflict in some cases.

* there is now configure time option to enable aggressive decommit by
  default. Contributed by Laurent
  Stacul. --enable-aggressive-decommit-by-default is the flag.

* Tulio Magno Quites Machado Filho contributed build fixes for ppc
  around ucontext access.

* User pkubaj contributed couple build fixes for FreeBSD/ppc.

* configure now always assumes we have mmap. This fixes configure
  failures on some linux guests inside virtualbox. This fixed issue
  #1008.

* User shipujin contributed syscall support fixes for mips64 (big and
  little endian).

* Henrik Edin contributed configurable support for wide range of
  malloc page sizes. 4K, 8K, 16K, 32K, 64K, 128K and 256K are now
  supported via existing --with-tcmalloc-pagesize flag to configure.

* Jon Kohler added overheads fields to per-size-class textual
  stats. Stats that are available via
  MallocExtension::instance()->GetStats().

* tcmalloc can now avoid fallback from memfs to default sys
  allocator. TCMALLOC_MEMFS_DISABLE_FALLBACK switches this on. This
  was contributed by Jon Kohler.

* Ilya Leoshkevich fixed mmap syscall support on s390.

* Todd Lipcon contributed small build warning fix.

* User prehistoricpenguin contributed misc source file mode fixes (we
  still had few c++ files marked executable).

* User invalid_ms_user contributed fix for typo.

* Jakub Wilk contributed typos fixes.

== 29 Apr 2018 ==
gperftools 2.7 is out!

Few people contributed minor, but important fixes since rc.

Changes:

* bug in span stats printing introduced by new scalable page heap
  change was fixed.

* Christoph Müllner has contributed couple warnings fixes and initial
  support for aarch64_ilp32 architecture.

* Ben Dang contributed documentation fix for heap checker.

* Fabrice Fontaine contributed fix for linking benchmarks with
  --disable-static.

* Holy Wu has added sized deallocation unit tests.

* Holy Wu has enabled support of sized deallocation (c++14) on recent
  MSVC.

* Holy Wu has fixed MSVC build in WIN32_OVERRIDE_ALLOCATORS mode. This
  closed issue #716.

* Holy Wu has contributed cleanup of config.h used on windows.

* Mao Huang has contributed couple simple tcmalloc changes from
  chromium code base. Making our tcmalloc forks a tiny bit closer.

* issue #946 that caused compilation failures on some Linux clang
  installations has been fixed.
  Much thanks to github user htuch for helping to diagnose issue and
  proposing a fix.

* Tulio Magno Quites Machado Filho has contributed build-time fix for
  PPC (for problem introduced in one of commits since RC).

== 18 Mar 2018 ==
gperftools 2.7rc is out!

Changes:

* Most notable change in this release is that very large allocations
  (>1MiB) are now handled by O(log n) implementation. This is
  contributed by Todd Lipcon based on earlier work by Aliaksei
  Kandratsenka and James Golick. Special thanks to Alexey Serbin for
  contributing OSX fix for that commit.

* detection of sized deallocation support is improved. Which should
  fix another set of issues building on OSX. Much thanks to Alexey
  Serbin for reporting the issue, suggesting a fix and verifying it.

* Todd Lipcon made a change to extend page heaps freelists to 1 MiB
  (up from 1MiB - 8KiB). This may help a little for some workloads.

* Ishan Arora contributed typo fix to docs

== 9 Dec 2017 ==
gperftools 2.6.3 is out!

Just two fixes were made in this release:

* Stephan Zuercher has contributed a build fix for some recent XCode
  versions. See issue #942 for more details.

* assertion failure on some windows builds introduced by 2.6.2 was
  fixed. Thanks to github user nkeemik for reporting it and testing
  fix. See issue #944 for more details.

== 30 Nov 2017 ==
gperftools 2.6.2 is out!

Most notable change is recently added support for C++17 over-aligned
allocation operators contributed by Andrey Semashev. I've extended his
implementation to have roughly same performance as malloc/new. This
release also has native support for C11 aligned_alloc.

Rest is mostly bug fixes:

* Jianbo Yang has contributed a fix for potentially severe data race
  introduced by malloc fast-path work in gperftools 2.6.
  This race could cause occasional violation of total thread cache
  size constraint. See issue #929 for more details.

* Correct behavior in out-of-memory condition in fast-path cases was
  restored. This was another bug introduced by fast-path optimization
  in gperftools 2.6 which caused operator new to silently return NULL
  instead of doing correct C++ OOM handling (calling new_handler and
  throwing bad_alloc).

* Khem Raj has contributed couple build fixes for newer glibcs (ucontext_t vs
  struct ucontext and loff_t definition)

* Piotr Sikora has contributed build fix for OSX (not building unwind
  benchmark). This was issue #910 (thanks to Yuriy Solovyov for
  reporting it).

* Dorin Lazăr has contributed fix for compiler warning

* issue #912 (occasional deadlocking calling getenv too early on
  windows) was fixed. Thanks to github user shangcangriluo for
  reporting it.

* Couple earlier lsan-related commits still causing occasional issues
  linking on OSX have been reverted. See issue #901.

* Volodimir Krylov has contributed GetProgramInvocationName for FreeBSD

* changsu lee has contributed couple minor correctness fixes (missing
  va_end() and missing free() call in rarely executed Symbolize path)

* Andrew C. Morrow has contributed some more page heap stats. See issue
  #935.

* some cases of build-time warnings from various gcc/clang versions
  about throw() declarations have been fixed.

== 9 July 2017 ==

gperftools 2.6.1 is out! This is mostly bug-fixes release.

* issue #901: build issue on OSX introduced in last-minute commit in 2.6
  was fixed (contributed by Francis Ricci)

* tcmalloc_minimal now works on 32-bit ABI of mips64. This is issue
  #845. Much thanks to Adhemerval Zanella and github user mtone.

* Romain Geissler contributed build fix for -std=c++17. This is pull
  request #897.

* As part of fixing issue #904, tcmalloc atfork handler is now
  installed early. This should fix slight chance of hitting deadlocks
  at fork in some cases.

== 4 July 2017 ==

gperftools 2.6 is out!

* Kim Gräsman contributed documentation update for HEAPPROFILESIGNAL
  environment variable

* KernelMaker contributed fix for population of min_object_size field
  returned by MallocExtension::GetFreeListSizes

* commit 8c3dc52fcfe0 "issue-654: [pprof] handle split text segments"
  was reverted. Some OSX users reported issues with this commit. Given
  our pprof implementation is strongly deprecated it is best to drop
  recently introduced features rather than breaking it badly.

* Francis Ricci contributed improvement for interaction with leak
  sanitizer.

== 22 May 2017 ==

gperftools 2.6rc4 is out!

Dynamic sized delete is disabled by default again. There is no hope of
it working with eager dynamic symbols resolution (-z now linker
flag). More details in
https://bugzilla.redhat.com/show_bug.cgi?id=1452813

== 21 May 2017 ==

gperftools 2.6rc3 is out!

gperftools compilation on older systems (e.g. rhel 5) was fixed. This
was originally reported in github issue #888.

== 14 May 2017 ==

gperftools 2.6rc2 is out!

Just 2 small fixes on top of 2.6rc. Particularly, Rajalakshmi
Srinivasaraghavan contributed build fix for ppc32.

== 14 May 2017 ==

gperftools 2.6rc is out!

Highlights of this release are performance work on malloc fast-path
and support for more modern visual studio runtimes, and deprecation of
bundled pprof. Other significant performance-affecting changes are
reverting central free list transfer batch size back to 32 and
disabling of aggressive decommit mode by default.

Note, while we still ship perl implementation of pprof, everyone is
strongly advised to use golang reimplementation of pprof from
https://github.com/google/pprof.

Here are notable changes in more details (and see ChangeLog for full
details):

* a bunch of performance tweaks to tcmalloc fast-path were
  merged. This speeds up critical path of tcmalloc by few tens of
  %. Well tuned and allocation-heavy programs should see substantial
  performance boost (should apply to all modern elf platforms). This
  is based on Google-internal tcmalloc changes for fast-path (with
  obvious exception of lacking per-cpu mode, of course). Original
  changes were made by Aliaksei Kandratsenka. And Andrew Hunter,
  Dmitry Vyukov and Sanjay Ghemawat contributed with reviews and
  discussions.

* Architectures with 48 bits address space (x86-64 and aarch64) now
  use faster 2 level page map. This was ported from Google-internal
  change by Sanjay Ghemawat.

* Default value of TCMALLOC_TRANSFER_NUM_OBJ was returned back to
  32. Larger values have been found to hurt certain programs (but help
  some other benchmarks). Value can still be tweaked at run time via
  environment variable.

* tcmalloc aggressive decommit mode is now disabled by default
  again. It was found to degrade performance of certain tensorflow
  benchmarks. Users who prefer smaller heap over small performance win
  can still set environment variable TCMALLOC_AGGRESSIVE_DECOMMIT=t.

* runtime switchable sized delete support has been fixed and re-enabled
  (on GNU/Linux). Programs that use C++ 14 or later that use sized
  delete can again be sped up by setting environment variable
  TCMALLOC_ENABLE_SIZED_DELETE=t. Support for enabling sized
  deallocation support at compile-time is still present, of course.

* tcmalloc now explicitly avoids use of MADV_FREE on Linux, unless
  TCMALLOC_USE_MADV_FREE is defined at compile time. This is because
  performance impact of MADV_FREE is not well known. Original issue
  #780 raised by Mathias Stearn.

* issue #786 with occasional deadlocks in stack trace capturing via
  libunwind was fixed. It was originally reported as Ceph issue:
  http://tracker.ceph.com/issues/13522

* ChangeLog is now automatically generated from git log. Old ChangeLog
  is now ChangeLog.old.

* tcmalloc now provides implementation of nallocx. Function was
  originally introduced by jemalloc and can be used to return real
  allocation size given allocation request size. This is ported from
  Google-internal tcmalloc change contributed by Dmitry Vyukov.

* issue #843 which made tcmalloc crash when used with erlang runtime
  was fixed.

* issue #839 which caused tcmalloc's aggressive decommit mode to
  degrade performance in some corner cases was fixed.

* Bryan Chan contributed support for 31-bit s390.

* Brian Silverman contributed compilation fix for 32-bit ARMs

* Issue #817 that was causing tcmalloc to fail on windows 10 and
  later, as well as on recent msvc was fixed. We now patch _free_base
  as well.

* a bunch of minor documentation/typos fixes by: Mike Gaffney
  <mike@uberu.com>, iivlev <iivlev@productengine.com>, savefromgoogle
  <savefromgoogle@users.noreply.github.com>, John McDole
  <jtmcdole@gmail.com>, zmertens <zmertens@asu.edu>, Kirill Müller
  <krlmlr@mailbox.org>, Eugene <n.eugene536@gmail.com>, Ola Olsson
  <ola1olsson@gmail.com>, Mostyn Bramley-Moore <mostynb@opera.com>

* Tulio Magno Quites Machado Filho has contributed removal of
  deprecated glibc malloc hooks.

* Issue #827 that caused intercepting malloc on osx 10.12 to fail was
  fixed, by copying fix made by Mike Hommey to jemalloc.
  Much thanks to Koichi Shiraishi and David Ribeiro Alves for
  reporting it and testing fix.

* Aman Gupta and Kenton Varda contributed minor fixes to pprof (but
  note again that pprof is deprecated)

* Ryan Macnak contributed compilation fix for aarch64

* Francis Ricci has fixed unaligned memory access in debug allocator

* TCMALLOC_PAGE_FENCE_NEVER_RECLAIM now actually works thanks to
  contribution by Andrew Morrow.

== 12 Mar 2016 ==

gperftools 2.5 is out!

Just single bugfix was merged after rc2. Which was fix for issue #777.

== 5 Mar 2016 ==

gperftools 2.5rc2 is out!

New release contains just few commits on top of first release
candidate. One of them is build fix for Visual Studio. Another
significant change is that dynamic sized delete is now disabled by
default. It turned out that IFUNC relocations are not supporting our
advanced use case on all platforms and in all cases.

== 21 Feb 2016 ==

gperftools 2.5rc is out!

Here are major changes since 2.4:

* we've moved to github!

* Bryan Chan has contributed s390x support

* stacktrace capturing via libgcc's _Unwind_Backtrace was implemented
  (for architectures with missing or broken libunwind).

* "emergency malloc" was implemented. Which unbreaks recursive calls
  to malloc/free from stacktrace capturing functions (such as glibc's
  backtrace() or libunwind on arm). It is enabled by
  --enable-emergency-malloc configure flag or by default on arm when
  --enable-stacktrace-via-backtrace is given. It is another fix for a
  number of common issues people had on platforms with missing or broken
  libunwind.

* C++14 sized-deallocation is now supported (on gcc 5 and recent
  clangs). It is off by default and can be enabled at configure time
  via --enable-sized-delete.
  On GNU/Linux it can also be enabled at run-time by either
  TCMALLOC_ENABLE_SIZED_DELETE environment variable or by defining
  tcmalloc_sized_delete_enabled function which should return 1 to
  enable it.

* we've lowered default value of transfer batch size to 512. Previous
  value (bumped up in 2.1) was too high and caused performance
  regression for some users. 512 should still give us performance
  boost for workloads that need higher transfer batch size while not
  penalizing other workloads too much.

* Brian Silverman's patch finally stopped arming profiling timer
  unless profiling is started.

* Andrew Morrow has contributed support for obtaining cache size of the
  current thread and softer idling (for use in MongoDB).

* we've implemented few minor performance improvements, particularly
  on malloc fast-path.

A number of smaller fixes were made. Many of them were contributed:

* issue that caused spurious profiler_unittest.sh failures was fixed.

* Jonathan Lambrechts contributed improved callgrind format support to
  pprof.

* Matt Cross contributed better support for debug symbols in separate
  files to pprof.

* Matt Cross contributed support for printing collapsed stack frame
  from pprof aimed at producing flame graphs.

* Angus Gratton has contributed documentation fix mentioning that on
  windows only tcmalloc_minimal is supported.

* Anton Samokhvalov has made tcmalloc use mi_force_{un,}lock on OSX
  instead of pthread_atfork. Which apparently fixes forking
  issues tcmalloc had on OSX.

* Milton Chiang has contributed support for building 32-bit gperftools
  on arm8.

* Patrick LoPresti has contributed support for specifying alternative
  profiling signal via CPUPROFILE_TIMER_SIGNAL environment variable.

* Paolo Bonzini has contributed support for configuring filename for
  sending malloc tracing output via TCMALLOC_TRACE_FILE environment
  variable.

* user spotrh has enabled use of futex on arm.

* user mitchblank has contributed better declaration for arg-less
  profiler functions.

* Tom Conerly contributed proper freeing of memory allocated in
  HeapProfileTable::FillOrderedProfile on error paths.

* user fdeweerdt has contributed curl arguments handling fix in pprof

* Fredrik Mellbin fixed tcmalloc's idea of mangled new and delete
  symbols on windows x64

* Dair Grant has contributed cacheline alignment for ThreadCache
  objects

* Fredrik Mellbin has contributed updated windows/config.h for Visual
  Studio 2015 and other windows fixes.

* we're not linking libpthread to libtcmalloc_minimal anymore. Instead
  libtcmalloc_minimal links to pthread symbols weakly. As a result
  single-threaded programs remain single-threaded when linking to or
  preloading libtcmalloc_minimal.so.

* Boris Sazonov has contributed mips compilation fix and printf misuse
  fix in pprof.

* Adhemerval Zanella has contributed alignment fixes for statically
  allocated variables.

* Jens Rosenboom has contributed fixes for heap-profiler_unittest.sh

* gshirishfree has contributed better description for GetStats method.

* cyshi has contributed spinlock pause fix.

* Chris Mayo has contributed --docdir argument support for configure.

* Duncan Sands has contributed fix for function aliases.

* Simon Que contributed better include for malloc_hook_c.h

* user wmamrak contributed struct timespec fix for Visual Studio 2015.

* user ssubotin contributed typo fix in PrintAvailability code.


== 10 Jan 2015 ==

gperftools 2.4 is out! The code is exactly same as 2.4rc.

== 28 Dec 2014 ==

gperftools 2.4rc is out!

Here are changes since 2.3:

* enabled aggressive decommit option by default. It was found to
  significantly improve memory fragmentation with negligible impact on
  performance. (Thanks to investigation work performed by Adhemerval
  Zanella)

* added ./configure flags for tcmalloc pagesize and tcmalloc
  allocation alignment. Larger page sizes have been reported to
  improve performance occasionally. (Patch by Raphael Moreira Zinsly)

* sped-up hot-path of malloc/free. By about 5% on static library and
  about 10% on shared library. Mainly due to more efficient checking
  of malloc hooks.

* improved stacktrace capturing in cpu profiler (due to issue found by
  Arun Sharma). As part of that issue pprof's handling of cpu profiles
  was also improved.

== 7 Dec 2014 ==

gperftools 2.3 is out!

Here are changes since 2.3rc:

* (issue 658) correctly close socketpair fds on failure (patch by glider)

* libunwind integration can be disabled at configure time (patch by
  Raphael Moreira Zinsly)

* libunwind integration is disabled by default for ppc64 (patch by
  Raphael Moreira Zinsly)

* libunwind integration is force-disabled for OSX. It was not used by
  default anyways. Fixes compilation issue I saw.

== 2 Nov 2014 ==

gperftools 2.3rc is out!

Most small improvements in this release were made to pprof tool.

New experimental Linux-only (for now) cpu profiling mode is a notable
big improvement.

Here are notable changes since 2.2.1:

* (issue-631) fixed debugallocation miscompilation on mmap-less
  platforms (courtesy of user iamxujian)

* (issue-630) reference to wrong PROFILE (vs.
  correct CPUPROFILE) environment variable was fixed (courtesy of
  WenSheng He)

* pprof now has option to display stack traces in output for heap
  checker (courtesy of Michael Pasieka)

* (issue-636) pprof web command now works on mingw

* (issue-635) pprof now handles library paths that contain spaces
  (courtesy of user mich...@sebesbefut.com)

* (issue-637) pprof now has an option to not strip template arguments
  (patch by jiakai)

* (issue-644) possible out-of-bounds access in GetenvBeforeMain was
  fixed (thanks to user abyss.7)

* (issue-641) pprof now has an option --show_addresses (thanks to user
  yurivict). New option prints instruction address in addition to
  function name in stack traces

* (issue-646) pprof now works around some issues of addr2line
  reportedly when DWARF v4 format is used (patch by Adam McNeeney)

* (issue-645) heap profiler exit message now includes remaining memory
  allocated info (patch by user yurivict)

* pprof code that finds location of /proc/<pid>/maps in cpu profile
  files is now fixed (patch by Ricardo M. Correia)

* (issue-654) pprof now handles "split text segments" feature of
  Chromium for Android. (patch by simonb)

* (issue-655) potential deadlock on windows caused by early call to
  getenv in malloc initialization code was fixed (bug reported and fix
  proposed by user zndmitry)

* incorrect detection of arm 6zk instruction set support
  (-mcpu=arm1176jzf-s) was fixed. (Reported by pedronavf on old
  issue-493)

* new cpu profiling mode on Linux is now implemented. It sets up
  separate profiling timers for separate threads. Which improves
  accuracy of profiling on Linux a lot. It is off by default. And is
  enabled if both librt.f is loaded and CPUPROFILE_PER_THREAD_TIMERS
  environment variable is set. But note that all threads need to be
  registered via ProfilerRegisterThread.
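
The per-thread timers mode above requires every thread to register
itself. Below is a minimal sketch of one way a program might do that
without a hard link-time dependency on libprofiler. It is not taken
from gperftools sources. It assumes GCC or clang on an ELF platform
(for the weak-symbol attribute) and assumes ProfilerRegisterThread is
exported with C linkage and takes no arguments. The helper names
RegisterThreadIfProfilerLoaded, ThreadEntry and
CheckRegistrationConsistency are made up for illustration.

```cpp
#include <cassert>

// Weak declaration (GCC/clang extension). Assumption: this matches the
// declaration shipped in <gperftools/profiler.h>. When libprofiler is
// neither linked nor preloaded, the symbol's address is null.
extern "C" void ProfilerRegisterThread() __attribute__((weak));

// Hypothetical helper: registers the calling thread with the cpu
// profiler if libprofiler is present. Returns true if the call was made.
inline bool RegisterThreadIfProfilerLoaded() {
  if (!ProfilerRegisterThread) return false;
  ProfilerRegisterThread();
  return true;
}

// Hypothetical thread entry wrapper: register first, then run the work.
// Returns whether registration happened.
template <typename Fn>
bool ThreadEntry(Fn work) {
  const bool registered = RegisterThreadIfProfilerLoaded();
  work();
  return registered;
}

// Small self-check: the work always runs, and the registration result
// agrees with whether the weak symbol resolved.
inline void CheckRegistrationConsistency() {
  bool ran = false;
  const bool registered = ThreadEntry([&ran] { ran = true; });
  assert(ran);
  assert(registered == (ProfilerRegisterThread != nullptr));
}
```

Each thread's start routine would call ThreadEntry (or just
RegisterThreadIfProfilerLoaded) before doing real work. With the weak
declaration the same binary runs unchanged whether or not libprofiler
is loaded.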

== 21 Jun 2014 ==

gperftools 2.2.1 is out!

Here's list of fixes:

* issue-626 was closed. Which fixes initialization of statically linked
  tcmalloc.

* issue 628 was closed. It adds missing header file into source
  tarball. This fixes compilation on PPC Linux.

== 3 May 2014 ==

gperftools 2.2 is out!

Here are notable changes since 2.2rc:

* issue 620 (crash on windows when c runtime dll is reloaded) was
  fixed

== 19 Apr 2014 ==

gperftools 2.2rc is out!

Here are notable changes since 2.1:

* a number of fixes for a number of compilers and platforms. Notably
  Visual Studio 2013, recent mingw with c++ threads and some OSX
  fixes.

* we now have mips and mips64 support! (courtesy of Jovan Zelincevic,
  Jean Lee, user xiaoyur347 and others)

* we now have aarch64 (aka arm64) support! (contributed by Riku
  Voipio)

* there's now support for ppc64-le (by Raphael Moreira Zinsly and
  Adhemerval Zanella)

* there's now some support of uclibc (contributed by user xiaoyur347)

* google/ headers will now give you deprecation warning. They are
  deprecated since 2.0

* there's now new api: tc_malloc_skip_new_handler (ported from chromium
  fork)

* issue-557: added support for dumping heap profile via signal (by
  Jean Lee)

* issue-567: Petr Hosek contributed SysAllocator support for windows

* Joonsoo Kim contributed several speedups for central freelist code

* TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable now works

* configure scripts are now using AM_MAINTAINER_MODE. It'll only
  affect folks who modify source from .tar.gz and want automake to
  automatically rebuild Makefile-s. See automake documentation for
  that.

* issue-586: detect main executable even if PIE is active (based on
  patch by user themastermind1). Notably, it fixes profiler use with
  ruby.

* there is now support for switching backtrace capturing method at
  runtime (via TCMALLOC_STACKTRACE_METHOD and
  TCMALLOC_STACKTRACE_METHOD_VERBOSE environment variables)

* there is new backtrace capturing method using -finstrument-functions
  prologues contributed by user xiaoyur347

* few cases of crashes/deadlocks in profiler were addressed. See
  (famous) issue-66, issue-547 and issue-579.

* issue-464 (memory corruption in debugalloc's realloc after
  memalign) is now fixed

* tcmalloc is now able to release memory back to OS on windows
  (issue-489). The code was ported from chromium fork (by a number of
  authors).

* Together with issue-489 we ported chromium's "aggressive decommit"
  mode. In this mode (settable via malloc extension and via
  environment variable TCMALLOC_AGGRESSIVE_DECOMMIT), free pages are
  returned back to OS immediately.

* MallocExtension::instance() is now faster (based on patch by
  Adhemerval Zanella)

* issue-610 (hangs on windows in multibyte locales) is now fixed

The following people helped with ideas or patches (based on git log,
some contributions purely in bugtracker might be missing): Andrew
C. Morrow, yurivict, Wang YanQing, Thomas Klausner,
davide.italiano@10gen.com, Dai MIKURUBE, Joon-Sung Um, Jovan
Zelincevic, Jean Lee, Petr Hosek, Ben Avison, drussel, Joonsoo Kim,
Hannes Weisbach, xiaoyur347, Riku Voipio, Adhemerval Zanella, Raphael
Moreira Zinsly

== 30 July 2013 ==

gperftools 2.1 is out!

Just few fixes were merged after rc. Most notably:

* Some fixes for debug allocation on POWER/Linux

== 20 July 2013 ==

gperftools 2.1rc is out!

As a result of more than a year of contributions we're ready for 2.1
release.

But before making that step I'd like to create RC and make sure people
have chance to test it.

Here are notable changes since 2.0:

* fixes for building on newer platforms. Notably, there's now initial
  support for x32 ABI (--enable-minimal only at this time)

* new getNumericProperty stats for cache sizes

* added HEAP_PROFILER_TIME_INTERVAL variable (see documentation)

* added environment variable to control heap size (TCMALLOC_HEAP_LIMIT_MB)

* added environment variable to disable release of memory back to OS
  (TCMALLOC_DISABLE_MEMORY_RELEASE)

* cpu profiler can now be switched on and off by sending it a signal
  (specified in CPUPROFILESIGNAL)

* (issue 491) fixed race-ful spinlock wake-ups

* (issue 496) added some support for fork-ing of process that is using
  tcmalloc

* (issue 368) improved memory fragmentation when large chunks of
  memory are allocated/freed

== 03 February 2012 ==

I've just released gperftools 2.0

The `google-perftools` project has been renamed to `gperftools`. I
(csilvers) am stepping down as maintainer, to be replaced by
David Chappelle. Welcome to the team, David! David has been
an active contributor to perftools in the past -- in fact, he's the
only person other than me that already has commit status. I am
pleased to have him take over as maintainer.

I have both renamed the project (the Google Code site renamed a few
weeks ago), and bumped the major version number up to 2, to reflect
the new community ownership of the project. Almost all the
[http://gperftools.googlecode.com/svn/tags/gperftools-2.0/ChangeLog changes]
are related to the renaming.

The main functional change from google-perftools 1.10 is that
I've renamed the `google/` include-directory to be `gperftools/`
instead. New code should `#include <gperftools/tcmalloc.h>`/etc.
(Most users of perftools don't need any perftools-specific includes at
all, so this is mostly directed to "power users.") I've kept the old
names around as forwarding headers to the new, so `#include
<google/tcmalloc.h>` will continue to work.

(The other functional change which I snuck in is getting rid of some
bash-isms in one of the unittest driver scripts, so it could run on
Solaris.)

Note that some internal names still contain the text `google`, such as
the `google_malloc` internal linker section. I think that's a
trickier transition, and can happen in a future release (if at all).


=== 31 January 2012 ===

I've just released perftools 1.10

There is an API-incompatible change: several of the methods in the
`MallocExtension` class have changed from taking a `void*` to taking a
`const void*`. You should not be affected by this API change
unless you've written your own custom malloc extension that derives
from `MallocExtension`, but since it is a user-visible change, I have
upped the `.so` version number for this release.

This release focuses on improvements to linux-syscall-support.h,
including ARM and PPC fixups and general cleanups. I hope this will
magically fix an array of bugs people have been seeing.

There is also exciting news on the porting front, with support for
patching win64 assembly contributed by IBM Canada! This is an
important step -- perhaps the most difficult -- to getting perftools
to work on 64-bit windows using the patching technique (it doesn't
affect the libc-modification technique). `preamble_patcher_test` has
been added to help test these changes; it is meant to compile under
x86_64, and won't work under win32.

For the full list of changes, including improved `HEAP_PROFILE_MMAP`
support, see the
[http://gperftools.googlecode.com/svn/tags/google-perftools-1.10/ChangeLog ChangeLog].
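
To make the perftools 1.10 signature change concrete, here is a minimal, self-contained sketch of why a subclass written against the old `void*` signature is affected once the base method takes `const void*`. Everything in it is a hypothetical stand-in invented for illustration: `BaseExtension`, `StaleExtension`, `UpdatedExtension`, the `GetSize` method, and the helpers `SizeVia` and `CheckDispatch` are assumptions, not the real `MallocExtension` declarations. It also assumes a C++11 compiler for the `override` keyword.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a base class whose virtual method now takes
// const void*. This is NOT the real MallocExtension declaration.
class BaseExtension {
 public:
  virtual ~BaseExtension() {}
  virtual std::size_t GetSize(const void*) { return 0; }
};

// Written against the old signature: the void* parameter makes this a
// different function, so it hides the base method instead of overriding it.
class StaleExtension : public BaseExtension {
 public:
  virtual std::size_t GetSize(void*) { return 16; }
};

// Updated to the new signature; `override` (C++11) makes the compiler
// reject any future mismatch with the base class.
class UpdatedExtension : public BaseExtension {
 public:
  std::size_t GetSize(const void*) override { return 16; }
};

// Calls through the base interface, the way library code would.
std::size_t SizeVia(BaseExtension& ext, const void* p) {
  return ext.GetSize(p);
}

// Small self-check showing which implementation each call reaches.
void CheckDispatch() {
  int x = 0;
  StaleExtension stale;
  UpdatedExtension updated;
  assert(SizeVia(stale, &x) == 0);     // base version runs; subclass is bypassed
  assert(SizeVia(updated, &x) == 16);  // subclass version runs
}
```

In this sketch the stale subclass still compiles, but calls made through the base class never reach it. Marking overrides with `override` turns that silent mismatch into a compile error, which is one way a custom extension could guard against similar signature changes.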


=== 24 January 2012 ===

The `google-perftools` Google Code page has been renamed to
`gperftools`, in preparation for the project being renamed to
`gperftools`. In the coming weeks, I'll be stepping down as
maintainer for the perftools project, and as part of that Google is
relinquishing ownership of the project; it will now be entirely
community run. The name change reflects that shift. The 'g' in
'gperftools' stands for 'great'. :-)

=== 23 December 2011 ===

I've just released perftools 1.9.1

I missed including a file in the tarball that is needed to compile on
ARM. If you are not compiling on ARM, or have successfully compiled
perftools 1.9, there is no need to upgrade.


=== 22 December 2011 ===

I've just released perftools 1.9

This change has a slew of improvements, from better ARM and freebsd
support, to improved performance by moving some code outside of locks,
to better pprof reporting of code with overloaded functions.

The full list of changes is in the
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.9/ChangeLog ChangeLog].


=== 26 August 2011 ===

I've just released perftools 1.8.3

The star-crossed 1.8 series continues; in 1.8.1, I had accidentally
removed some code that was needed for FreeBSD. (Without this code
many apps would crash at startup.) This release re-adds that code.
If you are not on FreeBSD, or are using FreeBSD with perftools 1.8 or
earlier, there is no need to upgrade.

=== 11 August 2011 ===

I've just released perftools 1.8.2

I was incorrectly calculating the patch-level in the configuration
step, meaning the TC_VERSION_PATCH #define in tcmalloc.h was wrong.
Since the testing framework checks for this, it was failing. Now it
should work again. This time, I was careful to re-run my tests after
upping the version number.
:-)

If you don't care about the TC_VERSION_PATCH #define, there's no
reason to upgrade.

=== 26 July 2011 ===

I've just released perftools 1.8.1

I was missing an #include that caused the build to break under some
compilers, especially newer gcc's, that wanted it. This only affects
people who build from source, so only the .tar.gz file is updated from
perftools 1.8. If you didn't have any problems compiling perftools
1.8, there's no reason to upgrade.

=== 15 July 2011 ===

I've just released perftools 1.8

Of the many changes in this release, a good number pertain to porting.
I've revamped OS X support to use the malloc-zone framework; it should
now Just Work to link in tcmalloc, without needing
`DYLD_FORCE_FLAT_NAMESPACE` or the like. (This is a pretty major
change, so please feel free to report feedback at
google-perftools@googlegroups.com.) 64-bit Windows support is also
improved, as is ARM support, and the hooks are in place to improve
FreeBSD support as well.

On the other hand, I'm seeing hanging tests on Cygwin. I see the same
hanging even with (the old) perftools 1.7, so I'm guessing this is
either a problem specific to my Cygwin installation, or nobody is
trying to use perftools under Cygwin. If you can reproduce the
problem, and even better have a solution, you can report it at
google-perftools@googlegroups.com.

Internal changes include several performance and space-saving tweaks.
One is user-visible (but in "stealth mode", and otherwise
undocumented): you can compile with `-DTCMALLOC_SMALL_BUT_SLOW`. In
this mode, tcmalloc will use less memory overhead, at the cost of
running (likely not noticeably) slower.

There are many other changes as well, too numerous to recount here,
but present in the
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.8/ChangeLog ChangeLog].


=== 7 February 2011 ===

Thanks to endlessr..., who
[http://code.google.com/p/google-perftools/issues/detail?id=307 identified]
why some tests were failing under MSVC 10 in release mode. It does not look
like these failures point toward any problem with tcmalloc itself; rather, the
problem is with the test, which made some assumptions that broke under
some of the aggressive optimizations used in MSVC 10. I'll fix the test, but in
the meantime, feel free to use perftools even when compiled under MSVC
10.

=== 4 February 2011 ===

I've just released perftools 1.7

I apologize for the delay since the last release; so many great new
patches and bugfixes kept coming in (and are still coming in; I also
apologize to those folks who have to slip until the next release). I
picked this arbitrary time to make a cut.

Among the many new features in this release is a multi-megabyte
reduction in the amount of tcmalloc overhead under x86_64, improved
performance in the case of contention, and many many bugfixes,
especially architecture-specific bugfixes. See the
[http://google-perftools.googlecode.com/svn/tags/google-perftools-1.7/ChangeLog ChangeLog]
for full details.

One architecture-specific change of note is added comments in the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.7/README README]
for using tcmalloc under OS X. I'm trying to get my head around the
exact behavior of the OS X linker, and hope to have more improvements
for the next release, but I hope these notes help folks who have been
having trouble with tcmalloc on OS X.

*Windows users*: I've heard reports that some unittests fail on
Windows when compiled with MSVC 10 in Release mode. All tests pass in
Debug mode. I've not heard of any problems with earlier versions of
MSVC.
I don't know if this is a problem with the runtime patching (so
the static patching discussed in README_windows.txt will still work),
a problem with perftools more generally, or a bug in MSVC 10. Anyone
with windows expertise that can debug this, I'd be glad to hear from!


=== 5 August 2010 ===

I've just released perftools 1.6

This version also has a large number of minor changes, including
support for `malloc_usable_size()` as a glibc-compatible alias to
`malloc_size()`, the addition of SVG-based output to `pprof`, and
experimental support for tcmalloc large pages, which may speed up
tcmalloc at the cost of greater memory use. To use tcmalloc large
pages, see the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.6/INSTALL
INSTALL file]; for all changes, see the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.6/ChangeLog
ChangeLog].

OS X NOTE: improvements in the profiler unittest have turned up an OS
X issue: in multithreaded programs, it seems that OS X often delivers
the profiling signal (from setitimer()) to the main thread, even when
it's sleeping, rather than spawned threads that are doing actual work.
If anyone knows details of how OS X handles SIGPROF events (from
setitimer) in threaded programs, and has insight into this problem,
please send mail to google-perftools@googlegroups.com.

To see if you're affected by this, look for profiling time that pprof
attributes to `___semwait_signal`. This is work being done in other
threads, that is being attributed to sleeping-time in the main thread.


=== 20 January 2010 ===

I've just released perftools 1.5

This version has a slew of changes, leading to somewhat faster
performance and improvements in portability.
It adds features like
`ITIMER_REAL` support to the cpu profiler, and `tc_set_new_mode` to
mimic the windows function of the same name. Full details are in the
[http://google-perftools.googlecode.com/svn/tags/perftools-1.5/ChangeLog
ChangeLog].


=== 11 September 2009 ===

I've just released perftools 1.4

The major change this release is the addition of a debugging malloc
library! If you link with `libtcmalloc_debug.so` instead of
`libtcmalloc.so` (and likewise for the `minimal` variants) you'll get
a debugging malloc, which will catch double-frees, writes to freed
data, `free`/`delete` and `delete`/`delete[]` mismatches, and even
(optionally) writes past the end of an allocated block.

We plan to do more with this library in the future, including
supporting it on Windows, and adding the ability to use the debugging
library with your default malloc in addition to using it with
tcmalloc.

There is also the usual complement of bug fixes, documented in the
ChangeLog, and a few minor user-tunable knobs added to components like
the system allocator.


=== 9 June 2009 ===

I've just released perftools 1.3

Like 1.2, this has a variety of bug fixes, especially related to the
Windows build. One of my bugfixes is to undo the weird `ld -r` fix to
`.a` files that I introduced in perftools 1.2: it caused problems on
too many platforms. I've reverted back to normal `.a` files. To work
around the original problem that prompted the `ld -r` fix, I now
provide `libtcmalloc_and_profiler.a`, for folks who want to link in
both.

The most interesting API change is that I now not only override
`malloc`/`free`/etc, I also expose them via a unique set of symbols:
`tc_malloc`/`tc_free`/etc.
This enables clients to write their own
memory wrappers that use tcmalloc:
{{{
   // Log() stands in for whatever hook the client wants to run.
   void* malloc(size_t size) { void* r = tc_malloc(size); Log(r); return r; }
}}}


=== 17 April 2009 ===

I've just released perftools 1.2.

This is mostly a bugfix release. The major change is internal: I have
a new system for creating packages, which allows me to create 64-bit
packages. (I still don't do that for perftools, because there is
still no great 64-bit solution, with libunwind still giving problems
and --disable-frame-pointers not practical in every environment.)

Another interesting change involves Windows: a
[http://code.google.com/p/google-perftools/issues/detail?id=126 new
patch] allows users to choose to override malloc/free/etc on Windows
rather than patching, as is done now. This can be used to create
custom CRTs.

My fix for this
[http://groups.google.com/group/google-perftools/browse_thread/thread/1ff9b50043090d9d/a59210c4206f2060?lnk=gst&q=dynamic#a59210c4206f2060
bug involving static linking] ended up being to make libtcmalloc.a and
libperftools.a a big .o file, rather than a true `ar` archive. This
should not yield any problems in practice -- in fact, it should be
better, since the heap profiler, leak checker, and cpu profiler will
now all work even with the static libraries -- but if you find it
does, please file a bug report.

Finally, the profile_handler_unittest provided in the perftools
testsuite (new in this release) is failing on FreeBSD. The end-to-end
test that uses the profile-handler is passing, so I suspect the
problem may be with the test, not the perftools code itself. However,
I do not know enough about how itimers work on FreeBSD to be able to
debug it. If you can figure it out, please let me know!

=== 11 March 2009 ===

I've just released perftools 1.1!

It has many changes since perftools 1.0 including

  * Faster performance due to dynamically sized thread caches
  * Better heap-sampling for more realistic profiles
  * Improved support on Windows (MSVC 7.1 and cygwin)
  * Better stacktraces in linux (using VDSO)
  * Many bug fixes and feature requests

Note: if you use the CPU-profiler with applications that fork without
doing an exec right afterwards, please see the README. Recent testing
has shown that profiles are unreliable in that case. The problem has
existed since the first release of perftools. We expect to have a fix
for perftools 1.2. For more details, see
[http://code.google.com/p/google-perftools/issues/detail?id=105 issue 105].

Everyone who uses perftools 1.0 is encouraged to upgrade to perftools
1.1. If you see any problems with the new release, please file a bug
report at http://code.google.com/p/google-perftools/issues/list.

Enjoy!