1 /* phasevocoder.c -- this is a time-stretching phase vocoder.
2 
3 Design notes: we will use the "absolute" interface of the cmupv library.
4 "Absolute" means that rather than giving the phase vocoder a hop size
5 with which to move through the input (thus input position is
6 "relative"), a callback gives an exact output location. Thus, the input
7 parameters to the phase vocoder will be the input sound and a mapping
8 from output time to input time. For each frame of output, we'll get a
9 callback asking for input. In the callback, we'll evaluate the mapping
10 up to the frame's center time, using interpolation if necessary, and
11 then evaluate the input sound as needed to find an input frame at the
12 mapped time.
13 
14 Because Nyquist sounds are computed incrementally, the phase vocoder
15 input position must be non-decreasing. This will result in an
16 interface similar to the sound-warp function. The documentation for
17 sound-warp has a detailed explanation of the warp-fn parameter that
18 maps between "score" time and real time.
19 
20 It should be possible to build on the phase vocoder to provide pitch
21 manipulation as well as time stretching, including using high quality
22 resampling provided in sound-warp. Extending the sound-warp interface,
23 suppose we have two control functions. One, called warp-fn maps from
24 "score" time to real time (as in sound-warp). Another, called pitch-fn
25 provides a frequency scale factor (>0) as a function of "score" time.
26 
27 For example, supposed the input is 10s long. To transpose continuously
28 from 0 to 12 semitones, the pitch-fn can be pwev(1, 10, 2), generating
29 an exponential sweep from 1 to 2 over the course of the sound. To
30 simultaneously slow the tempo gradually by a factor of 2, let's use
31 the same function, pwev(1, 10, 2). If we integrate, we get a function
32 from score time to real time (warp-fn), so taking the inverse, we get
33 a function from real time to score time, which can be used to find the
34 location of each input frame for processing.
35 
36 This is not really the solution because we want to incorporate
37 additional stretching to allow for resampling to change the pitch.
38 Consider the construction of time stretching and pitch shifting
39 functions that simultaneously produce the right time mappings and
40 pitch shifting. We will first apply the phase vocoder, then apply
41 resampling.
42 Let V(u) be the mapping from phase vocoder output time to input time,
43 and let Vinv(t) be the inverse ov V(u).
44 Let W(u) be the mapping from phase vocoder output time to final output
45 time, i.e. this is the sound-warp mapping.
46 Let S(t) be the stretch factor to be applied at time t (of the input).
47 Let P(t) be the pitch transposition to be applied at time t (of the input).
48 We can compute Vinv(t) by considering the following: At time t, the
49 signal must be stretched by the product S(t)*P(t) because S(t) is the
50 stretch factor, and because we need to stretch by an additional factor
51 of P(t) so that when we resample to achieve pitch transposition,
52 effectively stretching by 1/P(t), the net stretch due to transposition
53 will be 1.
54 Thus, Vinv(t) = Integral[S(t)*P(t)]. V(u) is derived by taking the
55 inverse of Vinv(t), a primitive operation in Nyquist.
56 Now we need W(u). The input to the sound-warp (resampling) function
57 will have the pitch of the original signal because the phase vocoder
58 preserves pitch. At each point u, the pitch change applied to the
59 signal will be the inverse of the derivative of W(u):
60    Pitch change at u = 1/W'(u)
61 The "pitch change at u" is P(V(u)) and we know V, so we can write
62 W'(u) = 1/P(V(u))
63 thus, W(u) = Integral[1/P(V(u))]
64 
65 
66 INTERFACE WTIH CMUPV
67 --------------------
68 
69 f is the input sound
70 g is the map from output to input
71 
72 Samples are computed by pv_fetch which has a state[] field available
73 as well as an interface to get samples from input signals.
74 
75 The state[] information is partly initialized in snd_phasevocoder, and
76 partly in the first call to snd_fetch(). Original comments imply that
77 this had to do with state[] being hidden within the pvshell
78 abstraction and revealed only to snd_fetch(), but it appears that
79 state[] could be completely initialized in snd_phasevocoder and passed
80 into snd_make_pvshell(). Finishing the initialization in snd_fetch()
81 splits the initialization code and requires a conditional branch in
82 every call to snd_fetch (an insignificant cost) but it defers calling
83 pv_create2() until it is really needed.
84 
85 Output is taken from OUTPUT as needed until REMAINING is zero.
86 Then, pv_get_output2() is called to generate more samples.
87 pv_get_output2() calls the callback, which does most of the complex work.
88 (1) The callback must figure out the "time" of the next frame it will
89     generate. This will be based on out_count provided to the callback.
90 (2) Map this time via g to an input time for f and convert to samples.
91 (3) Subtract framesize/2 to get the first_sample we need from f.
92 (4) f_count is the total sample count for the end of input so the beginning
93     of input is at f_count - fftsize.  first_sample is the place we want
94     to start the next frame, so we need to skip over
95     first_sample - (f_count - fftsize) samples.
96 (5) fill the rest of input from f.
97 
98 When the input frame time is less than framesize/2, we need to fetch
99 samples with a negative sample count. The callback should generate
100 zeros until samplecount 0 is reached, then get samples from f.
101 
102 Logical Stop and Terminate Logic
103 --------------------------------
104 The logical stop time should be the logical stop time of the input (f)
105 mapped to the output. Since g is a map from output time to input time, we want
106         g(output.lst) = f.lst, or output.lst = g-inverse(f.lst)
107     In practice, we're not given g-inverse and would like not to compute
108 it. We iterate through g to find g(t) for each fft frame center
109 time. When we reach the logical stop time of the input, detected by
110 PVSHELL_TEST_F returning PVSHELL_FLAG_LOGICAL_STOP, we can set the
111 logical stop time of the output by linear interpolation as a
112 reasonable approximation.  We save previous time points in g as t0,g0
113 and t1,g1, where g0 = g(t0) and g1 = g(t1). We have the logical stop
114 time of f that we'll call g2. When g0 < g2 <= g1,
115         (t2 - t0)/(t1 - t0) = (g2 - g0)/(g1 - g0), so
116         (t2 - t0) = (t1 - t0) * (g2 - g0)/(g1 - g0), so
117         t2 = t0 + (t1 - t0) * (g2 - g0)/(g1 - g0), where
118         t2 is the logical stop time of the output.
119 If g1 == g0, the function g(t) is not increasing. This should mean that
120 we don't advance the input so we won't discover a logical stop time, but
121 just in case, we can set t2 = t0 if g1 == g0.
122 
123     Notice that we get t2 when g0 < g2 <= g1, so we have to test for that.
124 g2 (the logical stop time of f) in general becomes known when we read that
125 sample from f, but we are always reading fftsize/2 samples ahead in f to
126 fill the analysis window, so we should learn g2 before the output reaches
127 t2. (We won't output t2 until we've overlap-added a number of frames
128 centered beyond t2. These samples-in-progress are held in pvs->output[].)
129 
130     Another complication is that when we are actually ready to output
131 sample t2, it must be at the beginning of output and the output is marked
132 with the logical stop flag. This is handled by pv_fetch. pv_callback
133 (which is not assembling output blocks for Nyquist is only responsible
134 for calculating the logical stop time of the output.)
135 
136     The logical stop time can also be the terminate time of g -- if g
137 terminates, we must terminate the output (otherwise we'll be reading from
138 time 0 of the input, but we're not allowed to go backward.)
139 
140 The terminate time is when the remaining output will be zero. Since the
141 phase vocoder output continues for half a window beyond the last point
142 mapped from input to output, we really don't want to try to do any mapping.
143 Instead, we just wait until the input is all zeros and figure out when the
144 output will be all zeros.
145 
146 Input becomes all zero when either we get a frame past the terminate time
147 of the input f, or we reach beyond the terminate time of g. Either way, we
148 should set a flag saying input has terminated and will be all zero.
149 
150 Output becomes all zero (fftsize/2 - hopsize) beyond the time point of the
151 first all-zero frame: Let's say we see the flag saying the input is all
152 zero because we've terminated on the input side. The *previous* frame was
153 therefore the last non-zero signal, and it extends for fftsize/2, but it was
154 one hopsize ago, so the non-zero signal extends (fftsize/2-hopsize) from the
155 time of the all-zero frame.
156 
157 Access to PV state
158 ------------------
159 
160 Things start with a call to snd_phasevocoder(f, g, fftsize, hopsize). The
161 info is put into pv_state_node, which is passed to pvshell and copied into
162 susp->pvshell.state. The fetch function is pvshell_fetch, which calls
163 pv_fetch through the pointer susp->pvshell.h. h (which is pv_fetch) returns
164 flags to indicate logical stop and terminate, and it returns n, the number
165 of samples computed. If the terminate flag is set, the output is assumed to
166 be zero and the zero block is used.
167 
168 The susp info and the pv_state_node info can be accessed in pv_fetch, but
169 the phase vocoder computation is in a callback. However, the parameter to
170 the callback is the susp pointer, so in the callback we can access the
171 pvshell_type and the pvstate_type data.
172 
173 To calculate the return flags, we have to stuff data into the
174 pvstate_type struct and read it back out in pv_fetch after calling
175 pv_get_output2(), which is the phase vocoder calculation that calls the
176 callback.
177 
178 TODO: if g0 and t0 are not initialized because of early logical stop,
179 what do we do?
180 */
181 
182 #include "assert.h"
183 #include "stdio.h"
184 #ifndef mips
185 #include "stdlib.h"
186 #endif
187 #include "xlisp.h"
188 #include "sound.h"
189 
190 #include "falloc.h"
191 #include "cext.h"
192 #include "pvshell.h"
193 
194 #include "phasevocoder.h"
195 #include "cmupv.h"
196 
197 typedef struct pvstate_struct {
198     int64_t f_count;  /* how many samples have we taken from f? */
199     int64_t g_count;  /* how many samples have we taken from g? */
200     double g_prev; /* the previous value of g (at g_count - 2) */
201     double g_next; /* the current value of g (at g_count - 1) */
202     int64_t sample_count; /* how many total samples computed, specifically
203                         * the number of samples copied into Nyquist
204                         * sample blocks via *out++ = pvs->output[index++];
205                         */
206     Phase_vocoder *pv;   /* the phase vocoder object */
207     sample_type *input;  /* a frame of samples to go into fft */
208     int64_t input_count;    /* sample number of first sample in input */
209     sample_type *output; /* output from phase vocoder */
210     long output_count;   /* since we deliver samples on demand,
211             output_count keeps track of how much is left in output.
212             ouput[OUTPUT_SIZE - output_count] is the next sample to deliver */
213     int fftsize;         /* the length of an fft frame */
214     int hopsize;         /* the hopsize -- not used */
215     int mode;            /* the mode -- see cmupv.h */
216     /* data to compute logical stop time and termination */
217     long f_logical_stop_valid;  /* true when we know f_logical_stop_count */
218     long f_terminated;   /* set when f terminates */
219     long g_terminated;   /* set when g terminates */
220     int64_t f_logical_stop_count;  /* logical stop time of input (f) */
221     int64_t t0; /* output sample count of previous frame */
222     double g0; /* input time of previous frame center */
223     /* data to detect termination */
224     int64_t f_terminate_count; /* sample count of f when it terminates */
225     int64_t g_terminate_count; /* sample count of g when it terminates */
226     /* return values from pv_callback */
227     long logical_stop_valid;
228     long terminate_count_valid;
229     int64_t logical_stop_count; /* sample count of output logical stop */
230     int64_t terminate_count; /* sample count of output terminate time */
231 } pvstate_node, *pvstate_type;
232 
233 #define OUTPUT_SIZE 256
234 
pv_callback(int64_t out_count,float * samples,int len,void * rock)235 int pv_callback(int64_t out_count, float *samples, int len, void *rock)
236 {
237     pvshell_type susp = (pvshell_type) rock;
238     pvstate_type pvs = (pvstate_type) susp->state;
239 
240     /* (1) figure out the "time" of the start of next frame */
241     double out_time = out_count / susp->f->sr;
242     /* (2) Map this time via g to an input time for f. */
243     /* compute sample position that we need; at 0th sample,
244      * pvs->g_count is 1, so add 1 to 0-based position: */
245     double g_count = out_time * susp->g->sr + 1.0;
246     double g; /* the value of g at g_count which is at the time of out_count */
247     int64_t f_start; /* the start sample of input f for the next frame */
248     int hop; /* the hopsize from the previous frame to this frame, thus the
249                 offset into input buffer of the data we want to keep */
250     int got_from_f; /* samples already in input */
251     int needed_from_f; /* samples to get from f this time */
252     sample_type *input = pvs->input;
253     int i;
254 
255     while (pvs->g_count < g_count) {
256         long flags = PVSHELL_TEST_G(susp); /* prepare to get a sample */
257         if (!pvs->g_terminated && (flags & PVSHELL_FLAG_TERMINATE)) {
258             pvs->g_terminated = TRUE;
259             pvs->g_terminate_count = susp->g->current - susp->g_cnt;
260         }
261         pvs->g_prev = pvs->g_next;
262         pvs->g_next = PVSHELL_FETCH_G(susp);
263         pvs->g_count++;
264     }
265     /* numbering samples from 1,
266          pvs->g_prev corresponds to pvs->g_count - 1
267          pvs->g_next corresponds to pvs->g_count, and
268          pvs->g_count - 1 < g_count <= pvs->gcount
269     */
270     /* fetch frame by mapping with g unless we've gone beyond g's
271        termination time */
272     if (!pvs->g_terminated) {
273         /* now interpolate to get the value of g at out_count */
274         g = pvs->g_prev + (pvs->g_next - pvs->g_prev) *
275                           (g_count - (pvs->g_count - 1));
276         /* (3) get the first sample we need from f. */
277         /* g is now the sample time we want for center of f window */
278         f_start = ROUNDBIG(g * susp->f->sr) - pvs->fftsize / 2;
279 
280         /* f_start is now the first sample position of the window */
281         /* (4) shift INPUT */
282         hop = (int) (f_start - pvs->input_count);
283         if (hop < 0) {
284             hop = 0;
285         }
286 
287         /* printf("pv_callback f_start %ld hop %d\n", f_start, hop); */
288 
289         got_from_f = pvs->fftsize - hop;
290         needed_from_f = pvs->fftsize; /* unless we can resuse samples */
291         if (hop == 0) {
292             ; /* nothing to do, the samples are already in input */
293         } else if (hop < pvs->fftsize) {
294             memmove(input, input + hop,
295                     got_from_f * sizeof(sample_type));
296             needed_from_f = hop;
297         } else { /* skip over some samples of f */
298             int skip = hop - pvs->fftsize;
299             int i;
300             got_from_f = 0;
301             for (i = 0; i < skip; i++) {
302                 long flags = PVSHELL_TEST_F(susp);
303                 if (flags) { /* normal case is all flags zero, so I think it
304                                 is faster to test for either and only if we
305                                 know one is set do we test individual flags */
306                     if (flags & PVSHELL_FLAG_LOGICAL_STOP) {
307                         pvs->f_logical_stop_valid = TRUE;
308                         pvs->f_logical_stop_count =
309                                 susp->f->current - susp->f_cnt;
310                     }
311                     if ((flags & PVSHELL_FLAG_TERMINATE) &&
312                         !pvs->f_terminated) {
313                         pvs->f_terminated = TRUE;
314                         pvs->f_terminate_count = susp->f->current - susp->f_cnt;
315                     }
316                 }
317                 PVSHELL_FETCH_F(susp);
318             }
319         }
320         pvs->input_count = f_start;
321         /* (5) fill the rest of input from f */
322         /* (5A) any samples from negative f counts are zero: */
323         while (f_start < 0 && needed_from_f > 0) {
324             input[got_from_f++] = 0.0F;
325             f_start++;
326             needed_from_f--;
327         }
328         /* (5B) any samples for positive f counts use PVSHELL_FETCH_F: */
329         for (i = 0; i < needed_from_f; i++) {
330             long flags = PVSHELL_TEST_F(susp);
331             if (!pvs->f_logical_stop_valid &&
332                 (flags & PVSHELL_FLAG_LOGICAL_STOP)) {
333                 pvs->f_logical_stop_valid = TRUE;
334                 pvs->f_logical_stop_count = susp->f->current - susp->f_cnt;
335             }
336             input[got_from_f++] = PVSHELL_FETCH_F(susp);
337         }
338         /* now, input consists of previous samples that were shifted plus
339            need_from_f samples that were just fetched, giving us fftsize
340            samples to return by copying to samples: */
341         memmove(samples, input, pvs->fftsize * sizeof(float));
342         /* did we terminate? If window is all zeros, we can compute
343            terminate time */
344         if ((!(pvs->terminate_count_valid)) && pvs->f_terminated &&
345             pvs->f_terminate_count <= f_start) {
346             /* new window is all zero, so output terminates soon ... */
347             pvs->terminate_count_valid = TRUE;
348             pvs->terminate_count = out_count - hop + pvs->fftsize / 2;
349             /* printf("pv_callback terminated by f at %ld\n", pvs->terminate_count); */
350         }
351     } else { /* g has terminated, so we just fill input with zeros */
352              /* hopsize does not matter, so we'll set it to fftsize/8 */
353         memset(samples, 0, pvs->fftsize * sizeof(*samples));
354         hop = pvs->fftsize / 8;
355         /* printf("filled samples with 0, hop %d\n", hop); */
356     }
357     /* there are two sources of logical stop: f and g. If f, then
358        pvs->f_logical_stop_valid is TRUE, and we need to map using g-inverse.
359        We'll do that first to get a candidate logical stop time. (This
360        is skipped if g has terminated, because the variable g would not
361        be defined in that case.)
362        Then, test if g is terminated. If so, g_terminate_time is the other
363        candidate logical stop time. If not g_terminated, we do nothing
364        (letting the mapped f logical stop time stand if applicable).
365        Otherwise, if g_terminated then {
366            if pvs->f_logical_stop_valid, take the minimum of the two candidates,
367            else take the terminate time of g }
368        (See comments at top of file for more about the computation here.)
369     */
370     /* see if we should determine the logical stop time */
371     if (pvs->f_logical_stop_valid && !pvs->g_terminated &&
372         !pvs->logical_stop_valid) {
373         /* g is valid because !pvs->g_terminated. Following the math at the
374            top of this file... */
375         int64_t t1 = out_count;
376         double g1 = g;
377         double g2 = pvs->f_logical_stop_count / susp->f->sr;
378         if (pvs->g0 < g2 && g2 <= g1) {
379             if (g1 == pvs->g0) {
380                 assert(FALSE);  /* DEBUG - just to see if it happens */
381                 pvs->logical_stop_valid = TRUE;
382                 pvs->logical_stop_count = pvs->t0;
383             } else {
384                 pvs->logical_stop_count = (int64_t) (pvs->t0 +
385                     (t1 - pvs->t0) * ((g2 - pvs->g0) / (g1 - pvs->g0)));
386             }
387         } /* else, we wait until g catches up to logical stop of f */
388     }
389     if (pvs->g_terminated) {
390         int64_t term_cnt_from_g = /* account for different sample rates */
391                 ROUNDBIG((pvs->g_terminate_count / susp->g->sr) * susp->f->sr);
392         if (pvs->logical_stop_valid) { /* take min to be logical stop count */
393             pvs->logical_stop_count = MIN(pvs->logical_stop_count,
394                                           term_cnt_from_g);
395         } else {
396             pvs->f_logical_stop_valid = TRUE;
397             pvs->logical_stop_count = term_cnt_from_g;
398         }
399         /* maybe output has terminated */
400         if (pvs->g_terminate_count < out_count + pvs->fftsize / 2) {
401             if (pvs->terminate_count_valid) {
402                 pvs->terminate_count = MIN(pvs->terminate_count,
403                                            term_cnt_from_g);
404             } else {
405                 pvs->terminate_count_valid = TRUE;
406                 pvs->terminate_count = term_cnt_from_g;
407             }
408             /* printf("pv_callback terminated by g at %ld\n", term_cnt_from_g); */
409         }
410     }
411     pvs->t0 = out_count;
412     pvs->g0 = g;
413 
414     return hop;
415 }
416 
417 
418 /* pv_fetch -- f is the signal. g is the map from output to input
419  *
420  * g has an arbitrary sample rate with respect to f, and will interpolate.
421  * out is where to put samples,
422  * n is how many samples to compute (maximum)
423  * sample_count is how many output samples we have computed
424  */
pv_fetch(pvshell_type susp,sample_block_values_type out,long * n,int64_t sample_count)425 long pv_fetch(pvshell_type susp,
426               sample_block_values_type out, long *n,
427               int64_t sample_count)
428 {
429     pvstate_type pvs = (pvstate_type) susp->state;
430     int i;
431     int flags = 0;
432     int count = 0; /* how many samples computed? */
433     /* initialize phase vocoder if this is the first call */
434     if (pvs->sample_count == 0) {
435         Phase_vocoder pv = pv_create2(malloc, free, pv_callback, susp);
436         pv_set_blocksize(pv, OUTPUT_SIZE);
437         pv_set_fftsize(pv, pvs->fftsize);
438         pv_set_syn_hopsize(pv, pvs->hopsize);
439         pv_set_mode(pv, pvs->mode);
440         pv_initialize(pv);
441         pvs->pv = pv;
442         pvs->input = (float *) malloc(pvs->fftsize * sizeof(float));
443         pvs->input_count = -pvs->fftsize; /* no valid samples in input yet */
444         /* fill input with zero: we might actually want samples starting
445            near -pvs->fftsize and think that we should use what's already
446            in pvs->input: */
447         memset(pvs->input, 0, pvs->fftsize * sizeof(float));
448     }
449     while (count < *n) {
450         int take = *n - count; /* how many to take from (pv) output */
451         int remaining;
452         int index;
453         if (pvs->output_count <= 0) {
454             pvs->output = pv_get_output2(pvs->pv);
455             pvs->output_count = OUTPUT_SIZE;
456         }
457         remaining = pvs->output_count;
458         /* printf("pv_fetch take %ld remaining %ld\n", take, remaining); */
459         if (take > remaining) take = remaining;
460         if (pvs->terminate_count_valid) {
461             int64_t to_term = pvs->terminate_count - sample_count;
462             if (to_term < take) take = (int) to_term;
463             if (take == 0) {
464                 /* we want to set the terminate flag at the beginning
465                    of the sample block, i.e. only if count == 0; if
466                    there are samples in the block already, we just
467                    return them and we'll set the terminate flag next time
468                 */
469                 if (count == 0) {
470                     flags |= PVSHELL_FLAG_TERMINATE;
471                 }
472             }
473         }
474         if (pvs->logical_stop_valid) {
475             int64_t to_stop = pvs->logical_stop_count - sample_count;
476             /* if we're exactly at the logical stop block, then
477                set the logical stop flag and compute the block as
478                normal. Otherwise, if we have not reached the logical
479                stop sample yet (to_stop > 0) and we have room to go
480                past it (to_stop < take), then take only up to logical
481                stop sample.
482             */
483             if (to_stop == 0 && count == 0) {
484                 flags |= PVSHELL_FLAG_LOGICAL_STOP;
485             } else if (to_stop > 0 && to_stop < take) {
486                 take = (int) to_stop;
487             }
488         }
489         if (take == 0) {
490             break; /* no more samples; we now terminate */
491         }
492         index = OUTPUT_SIZE - pvs->output_count;
493         for (i = 0; i < take; i++) {
494             *out++ = pvs->output[index++];
495         }
496         count += take;
497         sample_count += take;
498         pvs->output_count -= take;
499         pvs->sample_count += take;
500     }
501     *n = count;
502     /* printf("pv_fetch output_count %ld flags %ld\n",
503               pvs->sample_count, susp->flags); */
504     return flags;
505 }
506 
507 
pv_free(struct pvshell_struct * susp)508 void pv_free(struct pvshell_struct *susp)
509 {
510     pvstate_type pvs = (pvstate_type) susp->state;
511     if (pvs->pv) pv_end(pvs->pv);
512     if (pvs->input) free(pvs->input);
513 }
514 
515 
snd_phasevocoder(sound_type f,sound_type g,long fftsize,long hopsize,long mode)516 sound_type snd_phasevocoder(sound_type f, sound_type g, long fftsize,
517                             long hopsize, long mode)
518 {
519     /* we're using 5 doubles of state. The first is a parameter,
520      * and the rest are initialized to zero except for state[2],
521      * aka G_COUNT. This is the number of samples we have read
522      * from G. Since we're interpolating we need a one-sample
523      * lookahead, and initializing the count to -1 causes an
524      * extra fetch and hence 1-sample lookahead. This state is copied
525      * into the pvshell structure, so we don't need to allocate
526      * a vector on the heap.
527      */
528     long temp;
529     if (fftsize == -1)
530         fftsize = 2048;
531     if (hopsize == -1)
532         hopsize = fftsize / 8;
533     pvstate_node state = {
534         0,       /* f_count */
535         0,       /* g_count */
536         0,       /* g_prev */
537         0,       /* g_next */
538         0,       /* sample_count */
539         NULL,    /* pv */
540         NULL,    /* input */
541         0,       /* input_count */
542         NULL,    /* output */
543         0,       /* output_count */
544         fftsize, /* fftsize */
545         hopsize, /* hopsize */
546         mode,    /* mode -- see cmupv.h */
547         FALSE,   /* f_logical_stop_valid */
548         FALSE,   /* f_terminated */
549         FALSE,   /* g_terminated */
550         0,       /* f_logical_stop_count */
551         0,       /* t0: output sample count of previous frame */
552         0,       /* g0: input time of previous frame center */
553         0,       /* f_terminate_count */
554         0,       /* g_terminate_count */
555         FALSE,   /* logical_stop_valid */
556         FALSE,   /* terminate_count_valid */
557         0,       /* logical_stop_count */
558         0        /* terminate_count */
559   };
560 
561     /* If f and g do not start at the same time, we should really
562      * should do something about it, but we'll just throw an error.
563      * Be careful to allow small differences (within one sample).
564      */
565     if (fabs(f->t0 - g->t0) * f->sr > 0.5) {
566         xlfail("phasevocoder inputs must start at the same time");
567     }
568     /* fftsize should be a power of 2, hopsize should be a power of
569      * 2 smaller than fftsize.
570      */
571     if (fftsize <= 0) {
572         xlfail("phasevocoder fftsize must be > 0");
573     }
574     /* Test for power of 2. Subtract 1 and a power of 2 will change
575      * from 0...010...0 to 0...001...1, and the "and" will be zero.
576      * But a non-power of 2 will go from 0...01?...? to 0...01?...?"
577      * and the "and" will be non-zero.
578      */
579     temp = fftsize - 1;
580     if ((temp & fftsize) != 0) {
581         xlfail("phasevocoder fftsize must be a power of 2");
582     }
583     /* Test that hopsize is a power of 2 smaller than fftsize: */
584     temp = fftsize / 2;
585     while (temp && temp != hopsize) temp >>= 1;
586     if (!temp) {
587         xlfail("phasevocoder hopsize must be a power of 2 smaller than fftsize");
588     }
589     /* output the same sample rate and start time as f */
590     sound_type pv = snd_make_pvshell("snd_phasevocoder", f->sr, f->t0,
591                                      &pv_fetch, &pv_free, f, g,
592                                      (void *) &state, sizeof(state));
593     return pv;
594 }
595