README.md

         Flite: a small run-time speech synthesis engine
                      version 2.1-release
          Copyright Carnegie Mellon University 1999-2017
                      All rights reserved
                      http://cmuflite.org
              https://github.com/festvox/flite


Flite is an open source, small, fast run-time text to speech engine.  It
is the latest addition to the suite of free software synthesis tools
including University of Edinburgh's Festival Speech Synthesis System
and Carnegie Mellon University's FestVox project, tools, scripts and
documentation for building synthetic voices.  However, flite itself
does not require either of these systems to compile and run.

The core Flite library was developed by Alan W Black <awb@cs.cmu.edu>
(mostly in his so-called spare time) while employed in the Language
Technologies Institute at Carnegie Mellon University.  The name
"flite", originally chosen to mean "festival-lite", is perhaps doubly
appropriate as a substantial part of the design and coding was done above
30,000ft while awb was travelling, and (usually) wasn't in meetings.

The voices, lexicon and language components of flite, both their
compression techniques and their actual contents, were developed by
Kevin A. Lenzo <lenzo@cs.cmu.edu> and Alan W Black <awb@cs.cmu.edu>.

Flite is the answer to the complaint that Festival is too big, too slow,
and not portable enough.

o Flite is designed for very small devices, such as PDAs, and also
  for large server machines which need to serve lots of ports.

o Flite is not a replacement for Festival but an alternative run time
  engine for voices developed in the FestVox framework where size and
  speed are crucial.

o Flite is written entirely in ANSI C; it contains no C++ or Scheme, and
  thus requires more care in programming and is harder to customize at
  run time.

o It is thread safe.

o Voices, lexicons and language descriptions can be compiled
  (mostly automatically for voices and lexicons) into C representations
  from their FestVox formats.

o All voices, lexicons and language model data are const and in the
  text segment (i.e. they may be put in ROM).  As they are linked in
  at compile time, there is virtually no startup delay.

o Although the synthesized output is not exactly the same as the same
  voice in Festival, they are effectively equivalent.  That is, flite
  doesn't sound better or worse than the equivalent voice in Festival,
  it is just faster, smaller and scalable.

o For standard diphone voices, maximum run time memory
  requirements are less than about twice the memory requirement
  for the waveform generated.  For 32-bit architectures
  this effectively means under 1M.

o The flite program supports synthesis of individual strings or files
  (utterance by utterance) to direct audio devices or to waveform files.

o The flite library offers simple functions suitable for use in specific
  applications.
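
To put the diphone memory bound from the list above in concrete terms,
here is a rough sketch of the arithmetic.  It assumes 16-bit (2 byte)
mono samples, which this README does not state, and both function names
are purely illustrative.

```shell
# Bytes needed to hold SECONDS of mono audio at RATE samples per second,
# assuming 2 bytes per sample (an assumption, not stated in this README).
wave_bytes() {
    echo $(( $1 * $2 * 2 ))
}

# Upper bound described above: about twice the waveform's own size.
runtime_bound_bytes() {
    echo $(( 2 * $(wave_bytes "$1" "$2") ))
}

wave_bytes 10 8000            # 10 seconds of speech at 8KHz
runtime_bound_bytes 10 8000
```

Under these assumptions a 10 second utterance at 8KHz occupies 160000
bytes, so the bound comes to about 320000 bytes.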

Flite is distributed with a single 8KHz diphone voice (derived from the
cmu_us_kal voice), a pruned lexicon (derived from
cmulex) and a set of models for US English.  Here are comparisons
with Festival using basically the same 8KHz diphone voice:

                Flite    Festival
    core code    60K      2.6M
    USEnglish    100K     ??
    lexicon      600K     5M
    diphone      1.8M     2.1M
    runtime      <1M      16-20M


On a 500MHz PIII, a timing test of the first two chapters of
"Alice in Wonderland" (doc/alice) was done.  This produces about
1300 seconds of speech.  With flite it takes 19.128 seconds (about
70.6 times faster than real time); with Festival it takes 97 seconds
(13.4 times faster than real time).  On the ipaq (with the 16KHz diphones)
flite synthesizes 9.79 times faster than real time.
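
The "times faster than real time" figures are the amount of speech
produced divided by the time taken to synthesize it.  The sketch below
shows that arithmetic; rtf is an illustrative helper, not part of flite.
Note that with the rounded 1300 second figure it gives roughly 68 for
flite rather than 70.6, so the quoted figure was presumably computed
from the unrounded speech duration.

```shell
# rtf SPEECH_SECONDS SYNTH_SECONDS
# Prints how many times faster than real time the synthesis ran.
rtf() {
    awk -v speech="$1" -v synth="$2" 'BEGIN { printf "%.1f\n", speech / synth }'
}

rtf 1300 19.128   # flite on the 500MHz PIII
rtf 1300 97       # Festival on the same machine
```

To measure this on your own machine, the -v option with the "none"
output described under Usage gives a speed summary directly.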

Requirements:
-------------

    o A good C compiler.  Some of these files are quite large and some C
      compilers might choke on them; gcc is fine.  Sun CC 3.01 has been
      tested too.  Visual C++ 6.0 is known to fail on the large diphone
      database files.  We recommend you use GCC under Windows Subsystem
      for Linux, Cygwin or mingw32 instead.

    o GNU Make

    o An audio device isn't required as flite can write its output to
      a waveform file.

Supported platforms:
--------------------

We have successfully compiled and run on

    o Various Intel Linux systems (and iPaq Linux), under various versions
      of GCC (2.7.2 to 6.x)

    o Mac OS X

    o Various Android devices

    o Various openwrt devices

    o FreeBSD 3.x and 4.x

    o Solaris 5.7, and Solaris 9

    o Windows 2000/XP and later under Cygwin 1.3.5 and later

    o Windows 10 with Windows Subsystem for Linux

    o 64-bit Linux architectures

    o OSF1 V4.0 (gives an unimportant warning about sizes when compiling
      cst_val.c)

Previously we supported PalmOS and Windows CE but these seem to be rare
nowadays so they are no longer actively supported.

Other similar platforms should just work; we have also cross compiled
on a Linux machine for StrongARM.  However, note that new byte order
architectures may not work directly as there are some careful
byte order constraints in some structures.  These are portable but may
require reordering of some fields; contact us if you are moving to
a new architecture.

News
----

New in 2.1 (Oct 2017)

    o Improved Indic front end support (thanks to Suresh Bazaj @ Hear2Read)

    o 18 English Voices (various accents)

    o 12 Indian Voices (Bengali, Gujarati, Hindi, Kannada, Marathi, Panjabi,
      Tamil and Telugu) usually with bilingual (with English) support

    o Can do byteswap architectures [again] (ar9331 yun arduino, zsun etc)

    o flitecheck front-end test suite

    o grapheme based festvox builds give working flitevox voices

    o SAPI support for CG voices (thanks to Alok Parlikar @ Cobalt Speech and
      Language INC)

    o gcc 6.x support

    o .flitevox files (and models) 40% of previous size, but same quality

New in 2.0.0 (Dec 2014)
    o Indic language support (Hindi, Tamil and Telugu)

    o SSML support

    o CG voices as files accessible by file:/// and http://
      (and set of 13 voices to load)

    o random forest (multimodel support) improves voice quality

    o Supports different sample rates/mgc order to tune for speed

    o Kal diphone 500K smaller

    o Fixed lots of API issues

    o thread safe (again) [after initialization]

    o Generalized tokenstreams (used in Bard Storyteller)

    o simple-Pulseaudio support

    o Improved Android support

    o Removed PalmOS support from distribution

    o Companion multilingual ebook reader Bard Storyteller
       https://github.com/festvox/bard

New in 1.4.1 (March 2010)
    o better ssml support (actually does something)

    o better clunit support (smaller)

    o Android support

New in 1.4 (December 2009)
    o crude multi-voice selection support (may change)

    o 4 basic voices are included: 3 clustergen (awb, rms and slt) plus
      the kal diphone database

    o CMULEX now uses maximum onset for syllabification

    o alsa support

    o Clustergen support (including mlpg with mixed excitation),
      but it is still slow on limited processors

    o Windows support with Visual Studio (specifically for the Olympus
        Spoken Dialog System)

    o WinCE support is redone with cegcc/mingw32ce with
        example TTS app: Flowm: Flite on Windows Mobile

    o Speed-ups in feature interpretation limiting calls to alloc

    o Speed-ups (and fixes) for converting clunits festvox voices

New in 1.3-release (October 2005)
    o fixes to lpc residual extraction to give better quality output

    o An updated lexicon (festlex_CMU from festival-2.0.95) with better
      compression: it is about 30% of the previous size, with about
      the same accuracy

    o Fairly substantial code movements to better support PalmOS and
      multi-platform cross compilation builds

    o A PalmOS 5.0 port with a small example talking app ("flop")

    o runs under ix86_64 linux

New in 1.2-release  (February 2003)
    o A build process for diphone and clunits/ldom voices
      FestVox voices can be converted (sometimes) automatically

    o Various bug fixes

    o Initial support for Mac OS X (not talking to audio device yet)
      but compiles and runs

    o Text files can be synthesized to a single audio file

    o (optional) shared library support (Linux)

Compilation
-----------

In general

    tar zxvf flite-2.1-current.tar.gz

    cd flite-2.1-current
    ./configure
    make
    make get_voices

Where tar is GNU tar (gtar), and make is GNU make (gmake).

Or

    git clone http://github.com/festvox/flite
    cd flite
    ./configure
    make
    make get_voices

Configuration should be automatic, but may not work in all cases,
especially if you have some new compiler.  You can explicitly set the
compiler in config/config and add any options you see fit.  Configure
tries to guess these but it might be unable to guess for cross
compilation cases.  Interesting options there are

    -DWORDS_BIGENDIAN=1  for bigendian machines (e.g. Sparc, M68x, ar9331)
    -DNO_UNION_INITIALIZATION=1  for compilers without C99 union initialization
    -DCST_AUDIO_NONE     if you don't need/want audio support
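
If configure cannot tell whether -DWORDS_BIGENDIAN=1 is needed, one
quick check is to ask the machine that will run flite for its byte
order.  The sketch below uses only printf, od and tr; the function name
is illustrative.  It reports the byte order of the machine it runs on,
so for cross compilation it has to be run on the target, not on the
build host.

```shell
# Prints "little" or "big" for the machine this runs on.
# od reads the two bytes 0x01 0x00 as one 16-bit integer in host byte
# order: 1 on a little-endian machine, 256 on a big-endian one.
host_byte_order() {
    value=$(printf '\001\000' | od -An -tu2 | tr -d '[:space:]')
    if [ "$value" = "1" ]; then
        echo little
    else
        echo big
    fi
}

host_byte_order
```

A result of "big" suggests the -DWORDS_BIGENDIAN=1 option is the one
to add.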

There are different sets of voices and languages that you can select
between (and you can add your own sets by creating config/XXX.lv).
For example

    ./configure --with-langvox=transtac

Will use the languages and voices defined in config/transtac.lv

Usage:
------

The ./bin/flite binary contains all supported voices and you may
choose between the voices with the -voice flag and list the supported
voices with the -lv flag.  Note the kal (diphone) voice is a different
technology from the others and is much less computationally expensive
but more robotic.  For each voice additional binaries that contain
only that voice are created in ./bin/flite_FULLVOICENAME,
e.g. ./bin/flite_cmu_us_awb.  You can also refer to an external clustergen
.flitevox voice via a pathname argument with -voice (note the pathname
must contain at least one "/").
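
Because of the "/" rule, a .flitevox file in the current directory
presumably has to be written with a leading "./" to be recognized as a
pathname.  The helper below is a sketch of that normalization.
voice_arg is a hypothetical name, and the assumption that a bare name
is looked up as a built-in voice is inferred from the description above
rather than checked against the flite source.

```shell
# Print an argument suitable for -voice.
#   anything containing "/" (a path or URL) is passed through unchanged
#   a bare *.flitevox filename gets "./" prepended so it contains a "/"
#   anything else is assumed to be a built-in voice name
voice_arg() {
    case "$1" in
        */*)        printf '%s\n' "$1" ;;
        *.flitevox) printf './%s\n' "$1" ;;
        *)          printf '%s\n' "$1" ;;
    esac
}

voice_arg cmu_us_ahw.flitevox
voice_arg voices/cmu_us_ahw.flitevox
voice_arg slt
```

It would be used as, for example,
./bin/flite -voice "$(voice_arg cmu_us_ahw.flitevox)" -f doc/alice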

If it compiles properly a binary will be put in bin/.  Note that by
default -g is on so it will be bigger than is actually required.

    ./bin/flite "Flite is a small fast run-time synthesis engine" flite.wav

Will produce an 8KHz riff headered waveform file (riff is Microsoft's
wave format often called .WAV).

    ./bin/flite doc/alice

Will play the text file doc/alice.  If the first argument contains
a space it is treated as text, otherwise it is treated as a filename.
If a second argument is given a waveform file is written to it;
if no argument is given or "play" is given it will attempt to
write directly to the audio device (if supported).  If "none"
is given the audio is simply thrown away (used for benchmarking).
Explicit options are also available.
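
The argument rules above can be sketched as two small shell functions.
This is only a model of the behaviour as described in this README, not
flite's actual argument parsing, and the function names are made up for
illustration.

```shell
# How the first argument is interpreted: text if it contains a space,
# otherwise a filename.
input_kind() {
    case "$1" in
        *" "*) echo text ;;
        *)     echo file ;;
    esac
}

# Where the audio goes, based on the optional second argument.
output_kind() {
    case "${1:-play}" in
        play) echo "audio device" ;;
        none) echo "discarded" ;;
        *)    echo "waveform file" ;;
    esac
}

input_kind "Flite is a small fast run-time synthesis engine"
input_kind doc/alice
output_kind flite.wav
output_kind
```

One consequence of the first rule as stated is that a single word with
no space would be taken as a filename, which is a case where the
explicit options are the safer choice.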

    ./bin/flite -v doc/alice none

Will synthesize the file without playing the audio and give a summary
of the speed.

    ./bin/flite doc/alice alice.wav

Will synthesize the whole of alice into a single file (previous
versions would only give the last utterance in the file, but
that is fixed now).

An additional set of feature setting options is available.  These are
*debug* options.  Voices are represented as sets of feature values (see
lang/cmu_us_kal/cmu_us_kal.c) and you can override values on the
command line.  This can stop flite from working if malicious values
are set and therefore this facility is not intended to be made
available for standard users.  But these are useful for
debugging.  Some typical examples are

Use simple concatenation of diphones without prosodic modification

    ./bin/flite --sets join_type=simple_join doc/intro

Print sentences as they are said

    ./bin/flite -pw doc/alice

Make it speak slower

    ./bin/flite --setf duration_stretch=1.5 doc/alice

Make it speak at a higher pitch

    ./bin/flite --setf int_f0_target_mean=145 doc/alice

The talking clock is an example discussed on
http://festvox.org/ldom.  It requires a single argument HH:MM;
under Unix you can call it

    ./bin/flite_time `date +%H:%M`
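
If the time comes from somewhere other than date, it may be worth
checking that it really has the HH:MM shape first.  This is a minimal
sketch, assuming a 24-hour clock (which is what date +%H:%M produces).
is_hhmm is a hypothetical helper and nothing here is taken from
flite_time itself, which may well accept other forms.

```shell
# Succeeds only for a 24-hour HH:MM string such as 07:45 or 23:05.
is_hhmm() {
    case "$1" in
        [01][0-9]:[0-5][0-9] | 2[0-3]:[0-5][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

for t in 07:45 23:05 24:00 7:45; do
    if is_hhmm "$t"; then
        echo "$t ok"
    else
        echo "$t rejected"
    fi
done
```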

List the voices linked in directly in this build

    ./bin/flite -lv

Speak with the US male rms voice (builtin version)

    ./bin/flite -voice rms -f doc/alice

Speak with the "Scottish" male awb voice (builtin version)

    ./bin/flite -voice awb -f doc/alice

Speak with the US female slt voice

    ./bin/flite -voice slt -f doc/alice

Speak with AEW voice, downloaded on the fly from festvox.org

    ./bin/flite -voice http://festvox.org/flite/packed/flite-2.1/voices/cmu_us_aew.flitevox -f doc/alice

Speak with AHW voice loaded from the local file.

    ./bin/flite -voice voices/cmu_us_ahw.flitevox -f doc/alice

You can download the available voices into voices/

    ./bin/get_voices us_voices

and/or

    ./bin/get_voices indic_voices

Voice quality
-------------

So you've eagerly downloaded flite, compiled it and run it, and now you
are disappointed that it doesn't sound wonderful.  Sure, it's fast and
small, but what you really hoped for was the dulcet tones of a deep
baritone voice that would make you desperately hang on every phrase it
mellifluously produces.  But instead you get an 8KHz diphone voice that
sounds like it came from the last millennium.

Well, first, you are right, it is an 8KHz diphone voice from the last
millennium, and that was actually deliberate.  As we developed flite we
wanted a voice that was stable and that we could directly compare with
that very same voice in Festival.  Flite is an *engine*.  We want to
be able to take voices built with the FestVox process and compile them
for flite; the result should be exactly the same quality (though of
course trading the size for quality in flite is also an option).  The
included voice is just a sample voice that was used in the testing
process.

We expect that often voices will be loaded from external files, and we
have now set up a voice repository in

    http://festvox.org/flite/flite-2.1/voices/*.flitevox

If you visit there with a browser you can hear the examples.  You can
also download the .flitevox files to your machine so you don't need a
network connection every time you need to load a voice.
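
For fetching a single voice rather than a whole set, something like the
following might do.  It is a sketch only: voice_url and fetch_voice are
made-up helpers, curl is assumed to be installed, and the URL layout is
copied from the cmu_us_aew example in the Usage section rather than
from any documented scheme.

```shell
# Build the download URL for a voice, following the pattern of the
# cmu_us_aew example URL (assumed to hold for other voices too).
voice_url() {
    printf 'http://festvox.org/flite/packed/flite-2.1/voices/%s.flitevox\n' "$1"
}

# Download one voice into voices/ so it can be loaded without a network.
fetch_voice() {
    mkdir -p voices &&
    curl -fL -o "voices/$1.flitevox" "$(voice_url "$1")"
}

voice_url cmu_us_aew
```

After running fetch_voice cmu_us_aew, the voice can be loaded with
-voice voices/cmu_us_aew.flitevox as shown under Usage.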

We are now actively adding to this list of available voices in English (16)
and other languages.

Bard Storyteller:  https://github.com/festvox/bard
--------------------------------------------------

Bard is a companion app that reads ebooks, both displaying them and
actually reading them to you out loud using flite.  Bard supports a
wide range of fonts, and flite voices, and books in text, html and
epub format.  Bard is used as an evaluation of flite's capabilities and
an example of a serious application using flite.