Version 2.1.5 (3/24/2003)

  * Bug fix: Fortran wrappers were disabled in version 2.1.4.

Version 2.1.4 (3/16/2003)

  * Upgraded to newer versions of autoconf, etcetera, to fix compilation
    problems on various recent systems.

  * The configure script no longer picks the wrong architecture flags
    (which caused FFTW to crash) on newer IBM POWER machines running AIX.

  * Multi-threaded transforms should now utilize multiple CPUs on
    Solaris (which creates threads in single-processor mode by default).

  * Added experimental support for OpenMP (and SGI MP) compiler
    parallelization directives in the multi-threaded transforms,
    instead of using explicit thread spawning.  Enable by configuring
    --with-openmp or --with-sgi-mp in addition to --enable-threads.

  * Expanded FAQ.

Version 2.1.3 (11/7/1999)

  * The configure script no longer overrides the CFLAGS environment
    variable if it is defined.  (Thanks to Diab Jerius.)

  * Experimental Fortran-callable wrapper routines for MPI FFTW.
    See mpi/README.f77 for more information.

  * The configure script now detects and works around a stack
    alignment bug in gcc 2.95.x on x86.

  * configure attempts to guess the appropriate -mcpu flag on
    Linux/PPC systems, improving performance (especially on G3s with
    gcc 2.95 or later).

  * Fixed integer overflow bug for complex transforms of large prime
    sizes (> 32768).  Thanks to Ezio Riva for the bug report.

  * Fixed memory leak in the Matlab wrappers; thanks to Matthew Davis
    for the bug report.

  * Fixed bugs in the configure script when detecting POSIX threads
    libraries on AIX and Tru64 (nee Digital) Unix.

  * Fixed bug in multi-threaded transforms on AIX (which strangely
    creates threads in non-joinable mode by default).  Thanks to
    Jim Lindsay for the bug report, and for allowing us to debug on
    Northwestern University's IBM SP2.

  * Slight fix to help build DLLs on Win32 (thanks to Andrew Sterian).
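
The AIX item above turns on the fact that a thread can only be waited
for with pthread_join if it was created joinable.  One way to avoid
depending on the platform default is to request the joinable state
explicitly.  The following is a minimal sketch using plain POSIX
threads, not the code FFTW actually uses; the names spawn_joinable,
square_worker, and square_in_thread are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: squares the int it is handed, in place. */
static void *square_worker(void *arg)
{
    int *value = arg;
    *value = (*value) * (*value);
    return arg;
}

/* Create a thread that is explicitly joinable instead of relying on
   the platform default.  Returns 0 on success, else a pthread error
   code. */
int spawn_joinable(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int err;

    assert(tid != NULL && fn != NULL);
    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    if (err == 0)
        err = pthread_create(tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return err;
}

/* Square x in a separate thread and wait for the result.  Returns the
   squared value, or -1 if the thread could not be created or joined. */
int square_in_thread(int x)
{
    pthread_t tid;

    if (spawn_joinable(&tid, square_worker, &x) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return x;
}
```

On some systems this needs to be compiled with the compiler's threads
option (for example -pthread with gcc).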

Version 2.1.2 (5/18/1999)

  * Fixed bug in our MPI test programs which made them fail under MPICH with
    the p4 device (TCP/IP).  (The 2.1.1 transforms worked, but the test
    programs crashed.)

  * Added missing fftw_f77_threads_init function to the Fortran wrappers
    for the multi-threaded transforms.  Thanks to V. Sundararajan for
    the bug report.

  * The codelet generator can now output efficient hard-coded DCT/DST
    transforms.  As a side effect of this work, we slightly reduced the
    code size of rfftw.

  * Test programs now support GNU-style long options when used with glibc.

  * Added some more ideas to our TODO list.

  * Improved codelet generator speed.

Version 2.1.1 (3/31/1999)

  * Fixed bug in the complex transforms for certain sizes with
    intermediate-length prime factors (17-97), which under some
    (hopefully rare) circumstances could cause incorrect results.
    Thanks to Ming-Chang Liu for the bug report and patch.  (The test
    program will now catch this sort of problem when it is run in
    paranoid mode.)

Version 2.1 (3/8/1999)

  * Added Fortran-callable wrapper routines for the multi-threaded
    transforms.

  * Documentation fixes and improvements.

  * The --enable-type-prefix option to configure makes it easy to install
    both single- and double-precision versions of FFTW on the same
    (Unix) system.  (See the installation section of the manual.)

  * The MPI FFTW routines now include parallel one-dimensional transforms
    for complex data.  (See the fftw_mpi documentation in the FFTW
    manual.)

  * The MPI FFTW routines now include parallel multi-dimensional transforms
    specialized for real data.  (See the rfftwnd_mpi documentation in the
    FFTW manual.)

  * The MPI FFTW routines are now documented in the main
    manual (in the doc directory).  On Unix systems, they are also
    automatically configured, compiled, and installed along with the main
    FFTW library when you include --enable-mpi in the flags to the
    configure script.  (See the FFTW manual.)

  * Largely-rewritten MPI code.  It is now cleaner and (sometimes) faster.
    It also supports the option of a user-supplied workspace for (often)
    greater performance (using the MPI_Alltoall primitive).  Beware that
    the interfaces have changed slightly, however.

  * The multi-threaded FFTW routines now include parallel one- and
    multi-dimensional transforms of real data.  (See the rfftw_threads
    documentation in the FFTW manual.)

  * The multi-threaded FFTW routines are now documented in the main
    manual (in the doc directory).  On Unix systems, they are also
    automatically configured, compiled, and installed along with the main
    FFTW library when you include --enable-threads in the flags to the
    configure script.  (See the FFTW manual.)

  * The multi-threaded FFTW routines now include support for Mach C
    threads (used, for example, in Apple's MacOS X).

  * The Fortran-callable wrapper routines are now incorporated into
    the ordinary FFTW libraries by default (although you can
    disable this with the --disable-fortran option to configure) and
    are documented in the main FFTW manual.

  * Added an illustration of the data layout to the rfftwnd tutorial
    section of the manual, in the hope of preventing future confusion
    on this subject.

  * The test programs now allow you to specify multidimensional sizes
    (e.g. 128x54x81) for the -c and -s correctness and speed test options.

Version 2.0.1 (9/29/1998)

  * (bug fix) Due to a poorly-parenthesized expression, rfftwnd overflowed
    32-bit integer precision for rank > 1 transforms with a final
    dimension >= 65536.  This is now fixed.  (Thanks to Walter Brisken
    for the bug report.)

  * (bug fix) Added definition of FFTW_OUT_OF_PLACE to fftw.h.  The
    flag is mentioned several times in the documentation, but its
    definition was accidentally omitted since FFTW_OUT_OF_PLACE is the
    default behavior.

  * Corrected various small errors in the documentation.  Thanks to
    Geir Thomassen and Jeremy Buhler for their comments.

  * Improved speed of the codelet generator by orders of magnitude,
    since a user needed a hard-coded FFT of size 101.

  * Modified buffering in multidimensional transforms for some speed
    improvements (only when fftwnd_create_plan_specific is used).
    Thanks to Geert van Kempen for his tips.

  * Added Andrew Sterian's patch to allow FFTW to be used as a shared
    library more easily on Win32.
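
To illustrate how parenthesization alone can cause the kind of 32-bit
overflow described in the rfftwnd item above, here is a small sketch.
It is not the expression from the FFTW source, which these notes do
not show; the function names and the quantity computed (rows * n / 2)
are assumptions chosen for illustration.  Unsigned arithmetic is used
so that the wraparound is well-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates (rows * n) / 2 entirely in 32-bit unsigned arithmetic.
   The intermediate product wraps modulo 2^32 once rows * n reaches
   2^32, so the result is wrong for large sizes. */
uint32_t half_product_wrapping(uint32_t rows, uint32_t n)
{
    return (rows * n) / 2;
}

/* Same quantity for even n, regrouped so that the division happens
   first and the intermediate value stays smaller. */
uint32_t half_product_regrouped(uint32_t rows, uint32_t n)
{
    assert(n % 2 == 0);
    return rows * (n / 2);
}

/* Alternative fix: widen to 64 bits before multiplying. */
uint64_t half_product_wide(uint32_t rows, uint32_t n)
{
    return ((uint64_t)rows * n) / 2;
}
```

With rows = n = 65536, the first form wraps to 0, whereas the other
two return 2147483648.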

Version 2.0 (9/11/1998)

  * Completely rewritten real-complex transforms, now using
    specialized codelets and an inherently real-complex algorithm for
    greatly increased speed.  Also, rfftw can now handle odd sizes and
    strided transforms.  Beware that the output format for 1D rfftw
    transforms has changed.  See the manual for more details.

  * The complex transforms now use a fast algorithm for large prime
    factors, working in O(N lg N) time even for prime sizes.
    (Previously, the complexity contained an O(p^2) term, where p is
    the largest prime factor of N.  This is still the case for the
    rfftw transforms.)  Small prime factors are still more efficient,
    however.

  * Added functions fftw_one, fftwnd_one, rfftw_one, etcetera, to
    simplify and clarify the use of fftw for single, unit-stride
    transforms.

  * Renamed FFTW_COMPLEX, FFTW_REAL to fftw_complex, fftw_real (for
    greater consistency in capitalization).  The all-caps names will
    continue to be supported indefinitely, but are deprecated.  (Also,
    support for the COMPLEX and REAL types from FFTW 1.0 is now
    disabled by default.)

  * There are now Fortran-callable wrappers for the rfftw real-complex
    transforms.

  * New section of the manual discussing the use of FFTW with multiple
    threads, and a new FFTW_THREADSAFE flag (described therein).

  * Added shared library support.  Use configure --enable-shared to
    produce a shared library instead of a static library (the default).

  * Dropped support for the operation-count (*_op_count) routines
    introduced in v1.3, as these were little-used and were a pain to
    keep up-to-date as FFTW changed internally.

  * Made it easier to support floating-point types other than float
    and double (e.g. long double).  (See the file fftw-int.h.)

Version 1.3 (4/9/1998)

  * Multi-dimensional transforms contain significant performance
    improvements for dimensions >= 3.

  * Performance improvements in multi-dimensional transforms
    with howmany > 1 and stride > dist.

  * Improved parallelization and performance in the threads
    code for dimensions >= 3.

  * Changed the wisdom import/export format (the new wisdom remembers
    the stride of the plan that generated it, for use with the new
    create_plan_specific functions).  (You should regenerate any stored
    wisdom you have anyway, since this is a new version of FFTW.)

  * Several small fixes to aid compilation on some systems.

  * Fixed a bug in the MPI transform (in the transpose routine) that
    caused errors for some array sizes.

  * Fixed the (hopefully) last few things causing problems with C++
    compilers.

  * Hack for x86/gcc to properly align local double-precision variables.

  * Completely rewritten codelet generator.  Now it produces
    better code for non-powers of 2, and is ready to produce
    real->complex transforms.

  * Testing algorithm is now more robust, and has a more rigorous
    theoretical foundation.  (Bugs in testing large transforms or
    in single precision are now fixed--these bugs were only in the
    test programs and not in the FFTW library itself.)

  * Added "specific" planners, which allow plan optimization for a
    specific array/stride.  They also reduce the memory requirements
    of the planner, and permit new optimizations in the multi-dimensional
    case.  (See the *_create_plan_specific functions.)

  * FFTW can now compute a count of the number of arithmetic operations
    it requires, which is useful for some academic purposes.  (See the
    *_count_plan_ops functions.)

  * Adapted for use with GNU autoconf to aid installation on UNIX systems.
    (Installation on non-UNIX systems should be the same as before.)

  * The gettimeofday function is now used if available.  (This function
    typically has much higher accuracy than clock(), permitting plans to
    be created much more quickly than before on many machines.)

  * Made timing algorithm (hopefully) more robust in the face of
    system interrupts, etc.

  * Added wrapper routines for calling FFTW from MATLAB (in the
    matlab/ directory).

  * Added wrapper routines for calling FFTW from Fortran (in the
    fortran/ directory).  (These were available separately before.)

Version 1.2.1 (12/4/1997)

  * Fixed a third bug in the mpi transpose routines (sheesh!) that
    could cause problems when re-using a transpose plan.  Thanks
    to Eric Skyllingstad for the bug reports.

  * Fixed another bug in the mpi transpose routines.  This bug produced
    a memory leak and also occasionally tried to free a null pointer,
    which caused problems on some systems.  The mpi transpose/fft routines
    now pass all of our malloc paranoia tests.

  * Fixed bug in mpi transpose routines, where wrong results
    could be given for some large 2D arrays.

Version 1.2 (9/8/1997)

  * Added a FAQ (in the FAQ/ directory).

  * Fixed bug in rfftwnd routines where a block was accidentally
    allocated to be too small, causing random memory to be
    overwritten (yikes!).  (Amazingly, this bug only caused the
    test program to fail on one system that we could find.  Our
    test suite can now catch this sort of bug.)

  * Abstractified taking differences of times (with fftw_time_diff
    macro/function) to allow more general timer data structures.

  * Added "wisdom" mechanism for saving plans & related info.

  * Made timing mechanism more robust and maintainable.  (Instead of
    using a fixed number of iterations, we now repeatedly double
    the number of iterations until a specified time interval
    (FFTW_TIME_MIN) is reached.)

  * Fixed header files to prevent difficulties when a mix of C and
    C++ compilers is used, and to prevent problems with multiple
    inclusions.

  * Added experimental distributed-memory transforms using MPI.

  * Fixed memory leak in fftwnd_destroy_plan (reported by Richard
    Sullivan).  Our test programs now all check for leaks.
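
The timing change above (doubling the iteration count until a minimum
interval is reached) can be sketched roughly as follows.  This is an
illustration of the idea only, not FFTW's timer code: it uses the
standard clock() function, the time_min parameter stands in for
FFTW_TIME_MIN, and the names timing_result, dummy_work, and
time_by_doubling are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Result of one calibration. */
typedef struct {
    unsigned long iterations; /* iteration count of the accepted run */
    double seconds;           /* CPU time that run took */
} timing_result;

/* Hypothetical workload standing in for one transform execution;
   arg must point to an unsigned long counter. */
void dummy_work(void *arg)
{
    volatile unsigned long *counter = arg;
    unsigned i;

    for (i = 0; i < 1000; ++i)
        ++*counter;
}

/* Time fn(arg), doubling the number of iterations until one run
   takes at least time_min seconds (or a safety cap is reached). */
timing_result time_by_doubling(void (*fn)(void *), void *arg,
                               double time_min)
{
    const unsigned long max_iterations = 1UL << 30; /* safety cap */
    timing_result r = { 1, 0.0 };

    assert(fn != NULL && time_min > 0.0);
    for (;;) {
        unsigned long i;
        clock_t start = clock();

        for (i = 0; i < r.iterations; ++i)
            fn(arg);
        r.seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (r.seconds >= time_min || r.iterations >= max_iterations)
            return r;
        r.iterations *= 2;
    }
}
```

The time per call is then r.seconds / r.iterations.  The advantage
over a fixed iteration count is that fast and slow routines are both
measured over an interval long enough for the clock to resolve.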

Version 1.1 (5/5/1997)

  * Improved speed (yes!) [Some clever tricks with twiddle factors
    and better code generator.]

  * Renamed `blocks' to `codelets', just to be fashionable.

  * Rewritten planner and executor--much simpler and more readable
    code.  Reference-counting garbage collection employed throughout.

  * Much improved codelet generator.  The ML code should now be
    readable by humans, and easier to modify.

  * Support for Prime Factor transforms in the codelet generator.

  * Renamed COMPLEX -> FFTW_COMPLEX to avoid clashes with
    existing packages.  COMPLEX is still supported
    for compatibility with 1.0.

  * Added experimental real->complex transform (quick hack,
    use at your own risk).

  * Added experimental parallel transforms using Cilk.

  * Added experimental parallel transforms using threads (currently,
    POSIX threads and Solaris threads are implemented and tested).

  * Added DOS support, in the sense that we now support 8.3 filenames.

Version 1.0 (3/24/1997)

  * First release.