1\input texinfo   @c -*-texinfo-*-
2@c %**start of header
3@setfilename speech-dispatcher.info
4@settitle Speech Dispatcher
5@finalout
6@c @setchapternewpage odd
7@c %**end of header
8
9@syncodeindex pg cp
10@syncodeindex fn cp
11@syncodeindex vr cp
12
13@include version.texi
14
15@dircategory Sound
16@dircategory Development
17
18@direntry
19* Speech Dispatcher: (speech-dispatcher).       Speech Dispatcher.
20@end direntry
21
22@titlepage
23@title Speech Dispatcher
24@subtitle Mastering the Babylon of TTS'
25@subtitle for Speech Dispatcher @value{VERSION}
26@author Tom@'a@v{s} Cerha <@email{cerha@@brailcom.org}>
27@author Hynek Hanke <@email{hanke@@volny.cz}>
28@author Milan Zamazal <@email{pdm@@brailcom.org}>
29
30@page
31@vskip 0pt plus 1filll
32
33This manual documents Speech Dispatcher, version @value{VERSION}.
34
35Copyright @copyright{} 2001, 2002, 2003, 2006, 2007, 2008 Brailcom, o.p.s.
36
37@quotation
38Permission is granted to copy, distribute and/or modify this document
39under the terms of the GNU Free Documentation License, Version 1.2 or
40any later version published by the Free Software Foundation; with no
41Invariant Sections, with no Front-Cover Texts and no Back-Cover Texts.
42A copy of the license is included in the section entitled ``GNU Free
43Documentation License.''
44@end quotation
45
46You can also (at your option) distribute this manual under the GNU
47General Public License:
48
49@quotation
50Permission is granted to copy, distribute and/or modify this document
51under the terms of the GNU General Public License as published by the
52Free Software Foundation; either version 2 of the License, or (at your
53option) any later version.
54
55A copy of the license is included in the section entitled ``GNU
56General Public License''
57@end quotation
58
59@end titlepage
60
61@ifnottex
62@node Top, Introduction, (dir), (dir)
63
64This manual documents Speech Dispatcher, version @value{VERSION}.
65
66Copyright @copyright{} 2001, 2002, 2003, 2006 Brailcom, o.p.s.
67
68@quotation
69Permission is granted to copy, distribute and/or modify this document
70under the terms of the GNU Free Documentation License, Version 1.2 or
71any later version published by the Free Software Foundation; with no
72Invariant Sections, with no Front-Cover Texts and no Back-Cover Texts.
73A copy of the license is included in the section entitled ``GNU Free
74Documentation License.''
75@end quotation
76
77You can also (at your option) distribute this manual under the GNU
78General Public License:
79
80@quotation
81Permission is granted to copy, distribute and/or modify this document
82under the terms of the GNU General Public License as published by the
83Free Software Foundation; either version 2 of the License, or (at your
84option) any later version.
85
86A copy of the license is included in the section entitled ``GNU
87General Public License''
88@end quotation
89
90@end ifnottex
91
92@ifhtml
93@heading Menu
94@end ifhtml
95
96@menu
97* Introduction::                What is Speech Dispatcher.
98* User's Documentation::        Usage, Configuration...
99* Technical Specifications::
100* Client Programming::          Documentation for application developers.
101* Server Programming::          Documentation for project contributors.
102
103* Download and Contact::        How to get Speech Dispatcher and how to contact us
104* Reporting Bugs::              How to report a bug
105* How You Can Help::            What is needed
106
107* Appendices::
108* GNU General Public License::  Copying conditions for Speech Dispatcher
109* GNU Free Documentation License::  Copying conditions for this manual
110
111* Index of Concepts::
112@end menu
113
114@node Introduction, User's Documentation, Top, Top
115@chapter Introduction
116
117@menu
118* Motivation::                  Why Speech Dispatcher?
119* Basic Design::                How does it work?
120* Features Overview::           What are the assets?
121* Current State::               What is done?
122@end menu
123
124@node Motivation, Basic Design, Introduction, Introduction
125@section Motivation
126@cindex Basic ideas, Motivation
127@cindex Philosophy
128
129Speech Dispatcher is a device independent layer for speech synthesis
130that provides a common easy to use interface for both client
131applications (programs that want to speak) and for software
132synthesizers (programs actually able to convert text to speech).
133
134High quality speech synthesis is now commonly available both as
135propriatary and Free Software solutions. It has a wide field of
136possible uses from educational software to specialized systems,
137e.g. in hospitals or laboratories. It is also a key compensation tool
138for the visually impaired users. For them, it is one of the two
139possible ways of getting output from a computer (the second one being
140a Braille display).
141
142The various speech synthesizers are quite different, both in their
143interfaces and capabilities. Thus a general common interface is needed
144so that the client application programmers have an easy way to use
145software speech synthesis and don't have to care about peculiar
146details of the various synthesizers.
147
148The absence of such a common and standardized interface and thus the
149difficulty for programmers to use software speech synthesis has
150been a major reason why the potential of speech synthesis technology
151is still not fully expoited.
152
153Ideally, there would be little distinction for applications whether
154they output messages on the screen or via speech. Speech Dispatcher
155can be compared to what a GUI toolkit is for the graphical
156interface. Not only does it provide an easy to use interface, some
157kind of theming and configuration mechanisms, but also it takes care
158of some of the issues inherent with this particular mode of output,
159such as the need for speech message serialization and interaction with
160the audio subsystem.
161
162@node Basic Design, Features Overview, Motivation, Introduction
163@section Design
164@cindex Design
165
166@heading Current Design
167The communication between all applications and synthesizers, when
168implemented directly, is a mess. For this purpose, we wanted
169Speech Dispatcher to be a layer separating applications and
170synthesizers so that applications wouldn't have to care about
171synthesizers and synthesizers wouldn't have to care about interaction
172with applications.
173
174We decided we would implement Speech Dispatcher as a server receiving
175commands from applications over a protocol called @code{SSIP},
176parsing them if needed, and calling the appropriate functions
177of output modules communicating with the different synthesizers.
178These output modules are implemented as plug-ins, so that the user
179can just load a new module if he wants to use a new synthesizer.
180
181Each client (application that wants to speak) opens a socket
182connection to Speech Dispatcher and calls functions like say(),
183stop(), and pause() provided by a library implementing the protocol.
184This shared library is still on the client side and sends Speech
185Dispatcher SSIP commands over the socket. When the messages arrive at
186Speech Dispatcher, it parses them, reads the text that should be said
187and puts it in one of several queues according to the priority of the
188message and other criteria. It then decides when, with which
189parameters (set up by the client and the user), and on which
190synthesizer it will say the message. These requests are handled by the
191output plug-ins (output modules) for different hardware and software
192synthesizers and then said aloud.
193
194@image{figures/architecture,155mm,,Speech Dispatcher architecture}
195
196See also the detailed description @ref{Client Programming} interfaces, and
197@ref{Server Programming} documentation.
198
199@heading Future Design
200
201Speech Dispatcher currently mixes two important features: common
202low-level interface to multiple speech synthesizers and message
203management (including priorities and history). This became even more
204evident when we started thinking about handling messages intended for
205output on braille devices.  Such messages of course need to be
206synchronized with speech messages and there is little reason why the
207accessibility tools should send the same message twice for these two
208different kinds of output used by blind people (often simultaneously).
209Outside the world of accessibility, applications also want to either
210have full control over the sound (bypass prioritisation) or to only
211retrieve the synthesized data, but not play them immediatelly.
212
213We want to eventually split Speech Dispatcher into two independent
214components: one providing a low-level interface to speech synthesis
215drivers, which we now call TTS API Provider and is already largely
216implemented in the Free(b)Soft project, and the second doing message
217managemenet, called Message Dispatcher. This will allow Message
218Dispatcher to also output on Braille as well as to use the TTS API
219Provider separately.
220
221From implementation point of view, an opportunity for new design based
222on our previous experiences allowed us to remove several bottlenecks
223for speed (responsiveness), ease of use and ease of implementation of
224extensions (particularly output modules for new synthesizers). From
225the architecture point of view and possibilities for new developments,
226we are entirely convinced that both the new design in general and the
227inner design of the new components is much better.
228
229While a good API and its implementation for Braille are already
230existent in the form of BrlAPI, the API for speech is now under
231developement. Please see another architecture diagram showing how we
232imagine Message Dispatcher in the future.
233
234@image{figures/architecture-future,155mm,,Speech Dispatcher architecture}
235
236References:
237@uref{http://www.freebsoft.org/tts-api/}
238@uref{http://www.freebsoft.org/tts-api-provider/}
239
240@node Features Overview, Current State, Basic Design, Introduction
241@section Features Overview
242Speech Dispatcher from user's point of view:
243
244@itemize @bullet
245@item ability to freely combine applications with your favorite synthesizer
246@item message synchronization and coordination
247@item less time devoted to configuration of applications
248@end itemize
249
250Speech Dispatcher from application programmers's point of view:
251
252@itemize @bullet
253@item easy way to make your applications speak
254@item common interface to different synthesizers
255@item higher level synchronization of messages (priorities)
256@item no need to take care about configuration of voice(s)
257@end itemize
258
259@node Current State,  , Features Overview, Introduction
260@section Current State
261@cindex Synthesizers
262@cindex Other programs
263
264In this version, most of the features of Speech Dispatcher are
265implemented and we believe it is now useful for applications as a
266device independent Text-to-Speech layer and an accessibility message
267coordination layer.
268
269Currently, one of the most advanced applications that works with
270Speech Dispatcher is @code{speechd-el}. This is a client for Emacs,
271targeted primarily for blind people. It is similar to Emacspeak,
272however the two take a bit different approach and serve different user
273needs. You can find speechd-el on
274@uref{http://www.freebsoft.org/speechd-el/}. speechd-el provides
275speech output when using nearly any GNU/Linux text interface, like
276editing text, reading email, browsing the web, etc.
277
278Orca, the primary screen reader for the Gnome Desktop, supports Speech
279Dispatcher directly since its version 2.19.0.  See
280@uref{http://live.gnome.org/Orca/SpeechDispatcher} for more
281information.
282
283We also provide a shared C library, a Python library, a Java, Guile
284and a Common Lisp libraries that implement the SSIP functions of
285Speech Dispatcher in higher level interfaces. Writing client
286applications in these languages should be quite easy.
287
288On the synthesis side, there is good support for Festival, eSpeak,
289Flite, Cicero, IBM TTS, MBROLA, Epos, Dectalk software, Cepstral Swift
290and others. See @xref{Supported Modules}.
291
292We decided not to interface the simple hardware speech devices as they
293don't support synchronization and therefore cause serious problems
294when handling multiple messages.  Also they are not extensible, they
295are usually expensive and often hard to support. Today's computers are
296fast enough to perform software speech synthesis and Festival is a
297great example.
298
299@node User's Documentation, Technical Specifications, Introduction, Top
300@chapter User's Documentation
301
302@menu
303* Installation::                How to get it installed in the best way.
304* Running::                     The different ways to start it.
305* Troubleshooting::             What to do if something doesn't work...
306* Configuration::               How to configure Speech Dispatcher.
307* Tools::                       What tools come with Speech Dispatcher.
308* Synthesis Output Modules::    Drivers for different synthesizers.
309* Security::                    Security mechanisms and restrictions.
310@end menu
311
312@node Installation, Running, User's Documentation, User's Documentation
313@section Installation
314
315This part only deals with the general aspects of installing
316Speech Dispatcher. If you are compiling from source code (distribution
317tarball or git), please refer to the file @file{INSTALL} in your
318source tree.
319
320@subsection The requirements
321
322You will need these components to run Speech Dispatcher:
323@itemize
324@item glib 2.0  (@uref{http://www.gtk.org})
325@item libdotconf 1.3 (@uref{http://github.com/williamh/dotconf})
326@item pthreads
327@end itemize
328
329We recommend to also install these packages:
330@itemize
331 @item Festival (@uref{http://www.cstr.ed.ac.uk/projects/festival/})
332 @item festival-freebsoft-utils 0.3+ (@uref{http://www.freebsoft.org/festival-freebsoft-utils})
333 @item Sound icons library @* (@uref{http://www.freebsoft.org/pub/projects/sound-icons/sound-icons-0.1.tar.gz})
334@end itemize
335
336@subsection Recommended installation procedure
337
338@itemize
339
340@item Install your software synthesizer
341
342Although we highly recommend to use Festival for its excellent
343extensibility, good quality voices, good responsiveness and best
344support in Speech Dispatcher, you might want to start with eSpeak, a
345lightweight multi-lingual feature-complete synthesizer, to get all the
346key components working and perhaps only then switch to
347Festival. Installation of eSpeak should be easier and the default
348configuration of Speech Dispatcher is set up for eSpeak for this
349reason.
350
351You can of course also start with Epos or any other supported synthesizer.
352
353@item Make sure your synthesizer works
354
355There is usually a way to test if the installation of your speech
356synthesizer works. For eSpeak run @code{espeak "test"}, for Flite run
357@code{flite -t "Hello!"} and hear the speech. For Festival run
358@code{festival} and type in
359
360@example
361(SayText "Hello!")
362(quit)
363@end example
364
365@item Install Speech Dispatcher
366
367Install the packages for Speech Dispatcher from your distribution or
368download the source tarball (or git) from
369@url{http://www.freebsoft.org/speechd} and follow the instructions in
370the file @code{INSTALL} in the source tree.
371
372@item Configure Speech Dispatcher
373
374You can skip this step in most cases. If you however want to setup
375your own configuration of the Dispatchers default values, the easiest
376way to do so is through the @code{spd-conf} configuration script. It
377will guide you through the basic configuration. It will also
378subsequently perform some diagnostics tests and offer some limited
379help with troubleshooting. Just execute
380
381@example
382spd-conf
383@end example
384
385under an ordinary user or system user like 'speech-dispatcher'
386depending on whether you like to setup Speech Dispatcher as user or
387system service respectively. You might also want to explore the
388offered options or run some of its subsystems manually, type
389@code{spd-conf -h} for help.
390
391If you do not want to use this script, it doesn't work in your case
392or it doesn't provide enough configuration flexibility, please
393continue as described below and/or in @xref{Running Under Ordinary Users}.
394
395@item Test Speech Dispatcher
396
397The simplest way to test Speech Dispatcher is through
398@code{spd-conf -d} or through the @code{spd-say} tool.
399
400Example:
401@example
402spd-conf -d
403spd-say "Hello!"
404spd-say -l cs -r 90 "Ahoj"
405@end example
406
407If you don't hear anything, please @xref{Troubleshooting}.
408
409@end itemize
410
411@subsection How to use eSpeak NG or eSpeak with MBROLA
412
413Please follow the guidelines at
414@url{https://github.com/espeak-ng/espeak-ng/blob/master/docs/mbrola.md}
415(resp. @url{http://espeak.sourceforge.net/mbrola.html})
416for installing eSpeak NG (resp. eSpeak) with a set of MBROLA voices that you
417want to use.
418
419Check the @file{modules/espeak-ng-mbrola-generic.conf}
420(resp. @file{modules/espeak-mbrola-generic.conf}) configuration files for the
421@code{AddVoice} lines. If a line for any of the voices you have installed
422(and it is supported by your version of eSpeak NG (resp. eSpeak),
423e.g. @code{ls /usr/share/espeak-ng-data/voices/mb/mb-*}
424(resp. @code{ls /usr/share/espeak-data/voices/mb/mb-*}))
425is not contained here, please add it. Check if @code{GenericExecuteString}
426contains the correct name of your mbrola binary and correct path to its voice
427database.
428
429Restart speech-dispatcher and in your client, select
430@code{espeak-ng-mbrola-generic} (resp. @code{espeak-mbrola-generic}) as your
431output module, or test it with the following command
432
433@example
434spd-say -o espeak-ng-mbrola-generic -l cs Testing
435@end example
436
437(resp.
438@example
439spd-say -o espeak-mbrola-generic -l cs Testing
440@end example
441)
442
443@node Running, Troubleshooting, Installation, User's Documentation
444@section Running
445
446Speech Dispatcher is normally executed on a per-user basis.  This
447provides more flexibility in user configuration, access rights and is
448essential in any environment where multiple people use the computer at
449the same time. It used to be possible to run Speech Dispatcher as a system
450service under a special user (and still is, with some limitations), but
451this mode of execution is strongly discouraged.
452
453@menu
454* Running Under Ordinary Users::
455* Running in a Custom Setup::
456* Setting Communication Method::
457@end menu
458
459@node Running Under Ordinary Users, Running in a Custom Setup, Running, Running
460@subsection Running Under Ordinary Users
461
462No special provisions need to be done to run Speech Dispatcher under
463the current user. The Speech Dispatcher process will use (or create) a
464@file{~/.cache/speech-dispatcher/} directory for its purposes (logging,
465pidfile).
466
467Optionally, a user can place his own configuration file in
468@file{~/.config/speech-dispatcher/speechd.conf} and it will be
469automatically loaded by Speech Dispatcher. The preferred way to do so
470is via the @code{spd-conf} configuration command. If this user
471configuration file is not found, Speech Dispatcher will simply use the
472system wide configuration file (e.g. in
473@file{/etc/speech-dispatcher/speechd.conf}).
474
475@example
476# speech-dispatcher
477# spd-say test
478@end example
479
480@node Running in a Custom Setup, Setting Communication Method, Running Under Ordinary Users, Running
481@subsection Running in a Custom Setup
482
483Speech Dispatcher can be run in any other setup of executing users, port
484numbers and system paths as well. The path to configuration, pidfile and
485logfiles can be specified separately via compilation flags,
486configuration file options or command line options in this ascending
487order of their priority.
488
489This way can also be used to start Speech Dispatcher as a system wide
490service from /etc/init.d/ , although this approach is now discouraged.
491
492@node Setting Communication Method,  , Running in a Custom Setup, Running
493@subsection Setting Communication Method
494
495Currently, two different methods are supported for communication
496between the server and its clients.
497
498For local communication, it's preferred to use @emph{Unix sockets},
499where the communication takes place over a Unix socket with its
500driving file located by default in the user's runtime directory as
501@code{XDG_RUNTIME_DIR/speech-dispatcher/speechd.sock}. In this way, there can be no
502conflict between different user sessions using different Speech
503Dispatchers in the same system. By default, permissions are set in
504such a way, that only the same user who started the server can access
505it, and communication is hidden to all other users.
506
507The other supported mechanism is @emph{Inet sockets}. The server will
508thus run on a given port, which can be made accessible either localy
509or to other machines on the network as well. This is very useful in a
510network setup. Be however aware that while using Inet sockets, both
511parties (server and clients) must first agree on the communication
512port number to use, which can create a lot of confusion in a setup
513where multiple instances of the server serve multiple different users.
514Also, since there is currently no authentication mechanism, during
515Inet socket communication, the server will make no distinction between
516the different users connecting to it. The default port is 6560 as set
517in the server configuration.
518
519Client applications will respect the @emph{SPEECHD_ADDRESS} environment
520variable.  The method ('@code{unix_socket}' or '@code{inet_socket}')
521is optionally followed by it's parameters separated by a colon.
522For an exact description, please  @xref{Address specification}.
523
524An example of launching Speech Dispatcher using unix_sockets
525for communication on a non-standard destination and subsequently
526using spd-say to speak a message:
527
528@example
529killall -u `whoami` speech-dispatcher
530speech-dispatcher -c unix_socket -S /tmp/my.sock
531SPEECHD_ADDRESS=unix_socket:/tmp/my.sock spd-say "test"
532@end example
533
534@node Troubleshooting, Configuration, Running, User's Documentation
535@section Troubleshooting
536
537If you are experiencing problems when running Speech Dispatcher, please:
538
539@itemize
540
541@item
542Use @code{spd-conf} to run diagnostics:
543
544@example
545spd-conf -d
546@end example
547
548@item
549Check the appropriate logfile in
550@file{~/.cache/speech-dispatcher/log/speech-dispatcher.log} for user Speech
551Dispatcher or in @file{/var/log/speech-dispatcher/speech-dispatcher.log}. Look
552for lines containing the string 'ERROR' and their surrounding
553contents. If you hear no speech, restart Speech Dispatcher and look
554near the end of the log file -- before any attempts for synthesis of
555any message. Usually, if something goes wrong with the initialization
556of the output modules, a textual description of the problem and a
557suggested solution can be found in the log file.
558
559@item
560If this doesn't reveal the problem, please run
561@example
562spd-conf -D
563@end example
564
565Which will genereate a very detailed logfile archive
566which you can examine yourself or send to us with
567a request for help.
568
569@item
570You can also try to say some message directly through the utility
571@code{spd-say}.
572
573Example:
574@example
575        spd-say "Hello, does it work?"
576        spd-say --language=cs --rate=20 "Everything ok?"
577@end example
578
579@item
580Check if your configuration files (speechd.conf, modules/*.conf)
581are correct (some uninstalled synthesizer specified as the default,
582wrong values for default voice parameters etc.)
583
584@item
585There is a know problem in some versions of Festival. Please make sure
586that Festival server_access_list configuration variable and your
587/etc/hosts.conf are set properly. server_access_list must contain the
588symbolic name of your machine and this name must be defined in
589/etc/hosts.conf and point to your IP address. You can test if this is
590set correctly by trying to connect to the port Festival server is
591running on via an ordinary telnet (by default like this: @code{telnet
592localhost 1314}). If you are not rejected, it works.
593
594@end itemize
595
596@node Configuration, Tools, Troubleshooting, User's Documentation
597@section Configuration
598@cindex configuration
599@cindex default values
600
601Speech Dispatcher can be configured on several different levels.  You
602can configure the global settings through the server configuration
603file, which can be placed either in the Speech Dispatcher default
604configuration system path like /etc/speech-dispatcher/ or in your home
605directory in @file{~/.config/speech-dispatcher/}.  There is also support for
606per-client configuration, this is, specifying different default values
607for different client applications.
608
609Furthermore, applications often come with their own means of configuring
610speech related settings.  Please see the documentation of your
611application for details about application specific configuration.
612
613@menu
614* Configuration file syntax::   Basic rules.
615* Configuration options::       What to configure.
616* Audio Output Configuration::  How to switch to ALSA, Pulse...
617* Client Specific Configuration::  Specific default values for applications.
618* Output Modules Configuration::  Adding and customizing output modules.
619* Log Levels::                  Description of log levels.
620@end menu
621
622@node Configuration file syntax, Configuration options, Configuration, Configuration
623@subsection Configuration file syntax
624
625We use the DotConf library to read a permanent text-file based
626configuration, so the syntax might be familiar to many users.
627
628Each of the string constants, if not otherwise stated differently,
629should be encoded in UTF-8. The option names use only the
630standard ASCII charset restricted to upper- and lowercase characters
631(@code{a}, @code{b}), dashes (@code{-}) and underscores @code{_}.
632
633Comments and temporarily inactive options begin with @code{#}.
634If such an option should be turned on, just remove the comment
635character and set it to the desired value.
636@example
637# this is a comment
638# InactiveOption "this option is turned off"
639@end example
640
641Strings are enclosed in doublequotes.
642@example
643LogFile  "/var/log/speech-dispatcher.log"
644@end example
645
646Numbers are written without any quotes.
647@example
648Port 6560
649@end example
650
651Boolean values use On (for true) and Off (for false).
652@example
653Debug Off
654@end example
655
656@node Configuration options, Audio Output Configuration, Configuration file syntax, Configuration
657@subsection Configuration options
658
659All available options are documented directly in the file and examples
660are provided.  Most of the options are set to their default value and
661commented out.  If you want to change them, just change the value and
662remove the comment symbol @code{#}.
663
664@node Audio Output Configuration, Client Specific Configuration, Configuration options, Configuration
665@subsection Audio Output Configuration
666
667Audio output method (ALSA, Pulse etc.) can be configured centrally
668from the main configuration file @code{speechd.conf}. The option
669@code{AudioOutputMethod} selects the desired audio method and further
670options labeled as @code{AudioALSA...} or @code{AudioPulse...} provide
671a more detailed configuration of the given audio output method.
672
673It is possible to use a list of preferred audio output methods,
674in which case each output module attempts to use the first availble
675in the given order.
676
677The example below prefers Pulse Audio, but will use ALSA if unable
678to connect to Pulse:
679@example
680 AudioOutputMethod "pulse,alsa"
681@end example
682
683Please note however that some more simple output modules or
684synthesizers, like the generic output module, do not respect these
685settings and use their own means of audio output which can't be
686influenced this way. On the other hand, the fallback dummy output
687module tries to use any available means of audio output to deliver its
688error message.
689
690@node Client Specific Configuration, Output Modules Configuration, Audio Output Configuration, Configuration
691@subsection Client Specific Configuration
692
693It is possible to automatically set different default values of speech
694parameters (e.g.  rate, volume, punctuation, default voice...) for
695different applications that connect to Speech Dispatcher. This is
696especially useful for simple applications that have no parameter
697setting capabilities themselves or they don't support a parameter
698setting you wish to change (e.g. language).
699
700Using the commands @code{BeginClient "IDENTIFICATION"} and
701@code{EndClient} it is possible to open and close a section of
702parameter settings that only affects those client applications that
703identify themselves to Speech Dispatcher under the specific
704identification code which is matched against the string
705@code{IDENTIFICATION}.  It is possible to use wildcards ('*' matches
706any number of characters and '?' matches exactly one character) in the
707string @code{IDENTIFICATION}.
708
709The identification code normally consists of 3 parts:
710@code{user:application:connection}. @code{user} is the username of the
711one who started the application, @code{application} is the name of the
712application (usually the name of the binary for it) and
713@code{connection} is a name for the connection (one application might
714use more connections for different purposes).
715
716An example is provided in @code{/etc/speech-dispatcher/speechd.conf}
717(see the line @code{Include "clients/emacs.conf"} and
718@code{/etc/speech-dispatcher/clients/emacs.conf}.
719
720@node Output Modules Configuration, Log Levels, Client Specific Configuration, Configuration
721@subsection Output Modules Configuration
722
723Each user should turn on at least one output module in his
724configuration, if he wants Speech Dispatcher to produce
725any sound output. If no output module is loaded, Speech Dispatcher
726will start, log messages into history and communicate with clients,
727but no sound is produced.
728
729Each output module has an
730``AddModule'' line in
731@file{speech-dispatcher/speechd.conf}. Additionally, each output
732module can have its own configuration file.
733
734The audio output is handled by the output modules themselves, so this
735can be switched in their own configuration files under
736@code{etc/speech-dispatcher/modules/}.
737
738@menu
739* Loading Modules in speechd.conf::
740* Configuration files of output modules::
741* Configuration of the Generic Output Module::
742@end menu
743
744@node Loading Modules in speechd.conf, Configuration files of output modules, Output Modules Configuration, Output Modules Configuration
745@subsubsection Loading Modules in speechd.conf
746
747@anchor{AddModule}
748Each module that should be run when Speech Dispatcher starts must be loaded
749by the @code{AddModule} command in the configuration. Note that you can load
750one binary module multiple times under different names with different
751configurations. This is especially useful for loading the generic output
752module. @xref{Configuration of the Generic Output Module}.
753
754@example
755AddModule "@var{module_name}" "@var{module_binary}" "@var{module_config}"
756@end example
757
758@var{module_name} is the name of the output module.
759
760@var{module_binary} is the name of the binary executable
761of this output module. It can be either absolute or relative
762to @file{bin/speechd-modules/}.
763
764@var{module_config} is the file where the configuration for
765this output module is stored. It can be either absolute or relative
766to @file{etc/speech-dispatcher/modules/}. This parameter is optional.
767
768@node Configuration files of output modules, Configuration of the Generic Output Module, Loading Modules in speechd.conf, Output Modules Configuration
769@subsubsection Configuration Files of Output Modules
770
771Each output module is different and therefore has different
772configuration options. Please look at the comments in its
773configuration file for a detailed description. However, there are
774several options which are common for some output modules. Here is a
775short overview of them.
776
777@itemize
778@item AddVoice "@var{language}" "@var{symbolicname}" "@var{name}"
779@anchor{AddVoice}
780
781Each output module provides some voices and sometimes it even supports
782different languages. For this reason, there is a common mechanism for
783specifying these in the configuration, although no module is
784obligated to use it. Some synthesizers, e.g. Festival, support the
785SSIP symbolic names directly, so the particular configuration of these
786voices is done in the synthesizer itself.
787
788For each voice, there is exactly one @code{AddVoice} line.
789
790@var{language} is the ISO language code of the language of this voice, possibly
791with a region qualification.
792
793@var{symbolicname} is a symbolic name under which you wish this voice
794to be available. See @ref{Top,,Standard Voices, ssip, SSIP
795Documentation} for the list of names you can use.
796
797@var{name} is a name specific for the given output module. Please see
798the comments in the configuration file under the appropriate AddModule
799section for more info.
800
801For example our current definition of voices for Epos (file
802@code{/etc/speech-dispatcher/modules/generic-epos.conf} looks like
803this:
804
805@example
806        AddVoice        "cs"  "male1"   "kadlec"
807        AddVoice        "sk"  "male1"   "bob"
808@end example
809
810@item ModuleDelimiters "@var{delimiters}", ModuleMaxChunkLength @var{length}
811
812Normally, the output module doesn't try to synthesize all
813incoming text at once, but instead it cuts it into smaller
814chunks (sentences, parts of sentences) and then synthesizes
815them one by one. This second approach, used by some output
816modules, is much faster, however it limits the ability of
817the output module to provide good intonation.
818
819NOTE: The Festival module does not use ModuleDelimiters and
820ModuleMaxChunkLength.
821
822For this reason, you can configure at which characters
823(@var{delimiters}) the text should be cut into smaller blocks
824or after how many characters (@var{length}) it should be cut,
825if there is no @var{delimiter} found.
826
827Making the two rules more strict, you will get better speed
828but give away some quality of intonation. So for example
829for slower computers, we recommend to include comma (,)
830in @var{delimiters} so that sentences are cut into phrases,
831while for faster computers, it's preferable
832not to include comma and synthesize the whole compound
833sentence.
834
835The same applies to @code{MaxChunkLength}, it's better
836to set higher values for faster computers.
837
838For example, curently the default for Flite is
839
840@example
841    FestivalMaxChunkLength  500
842    FestivalDelimiters  ".?!;"
843@end example
844
845The output module may also decide to cut sentences on delimiters
846only if they are followed by a space. This way for example
847``file123.tmp'' would not be cut in two parts, but ``The horse
848raced around the fence, that was lately painted green, fell.''
849would be. (This is an interesting sentence, by the way.)
850@end itemize
851
852@node Configuration of the Generic Output Module,  , Configuration files of output modules, Output Modules Configuration
853@subsubsection Configuration files of the Generic Output Module
854
855The generic output module allows you to easily write your
856own output module for synthesizers that have a simple
857command line interface by modifying the configuration
858file. This way, users can add support for their device even if they don't
859know how to program. @xref{AddModule}.
860
861The core part of a generic output module is the command
862execution line.
863
864@defvr {Generic Module Configuration} GenericExecuteSynth "@var{execution_string}"
865
866@code{execution_string} is the command that should be executed
867in a shell when it's desired to say something. In fact, it can
868be multiple commands concatenated by the @code{&&} operator. To stop
869saying the message, the output module will send a KILL signal to
870the process group, so it's important that it immediately
871stops speaking after the processes are killed. (On most GNU/Linux
872system, the @code{play} utility has this property).
873
874In the execution string, you can use the following variables,
875which will be substituted by the desired values before executing
876the command.
877
878@itemize
879@item @code{$DATA}
880The text data that should be said. The string's characters that would interfere
881with bash processing are already escaped. However, it may be necessary to put
882double quotes around it (like this: @code{\"$DATA\"}).
883@item @code{$LANG}
884The language identification string (it's defined by GenericLanguage).
885@item @code{$VOICE}
886The voice identification string (it's defined by AddVoice).
887@item @code{$PITCH}
888The desired pitch (a float number defined in GenericPitchAdd and GenericPitchMultiply).
889@item @code{$PITCH_RANGE}
890The desired pitch range (a float number defined in GenericPitchRangeAdd and GenericPitchRangeMultiply).
891@item @code{$RATE}
892The desired rate or speed (a float number defined in GenericRateAdd and GenericRateMultiply)
893@end itemize
894
895Here is an example from @file{etc/speech-dispatcher/modules/epos-generic.conf}
896@example
897GenericExecuteSynth \
898"epos-say -o --language $LANG --voice $VOICE --init_f $PITCH --init_t $RATE \
899\"$DATA\" | sed -e s+unknown.*$++ >/tmp/epos-said.wav && play /tmp/epos-said.wav >/dev/null"
900@end example
901@end defvr
902
903@defvr {GenericModuleConfiguration} AddVoice "@var{language}" "@var{symbolicname}" "@var{name}"
904@xref{AddVoice}.
905@end defvr
906
907@defvr {GenericModuleConfiguration} GenericLanguage "iso-code" "string-subst"
908
909Defines which string @code{string-subst} should be substituted for @code{$LANG}
910given an @code{iso-code} language code.
911
912Another example from Epos generic:
913@example
914GenericLanguage "en-US" "english-US"
915GenericLanguage "cs" "czech"
916GenericLanguage "sk" "slovak"
917@end example
918@end defvr
919
920@defvr {GenericModuleConfiguration} GenericRateAdd @var{num}
921@end defvr
922@defvr {GenericModuleConfiguration} GenericRateMultiply @var{num}
923@end defvr
924@defvr {GenericModuleConfiguration} GenericPitchAdd @var{num}
925@end defvr
926@defvr {GenericModuleConfiguration} GenericPitchMultiply @var{num}
927@end defvr
928@defvr {GenericModuleConfiguration} GenericPitchRangeAdd @var{num}
929@end defvr
930@defvr {GenericModuleConfiguration} GenericPitchRangeMultiply @var{num}
931These parameters set rate and pitch conversion to compute
932the value of @code{$RATE}, @code{$PITCH} and @code{$PITCH_RANGE}.
933
934The resulting rate (or pitch) is calculated using the following formula:
935@example
936   (speechd_rate * GenericRateMultiply) + GenericRateAdd
937@end example
938where speechd_rate is a value between -100 (lowest) and +100 (highest)
939Some meaningful conversion for the specific text-to-speech system
940used must by defined.
941
942(The values in GenericSthMultiply are multiplied by 100 because
943DotConf currently doesn't support floats. So you can write 0.85 as 85 and
944so on.)
945@end defvr
946
947@node Log Levels,  , Output Modules Configuration, Configuration
948@subsection Log Levels
949
950There are 6 different verbosity levels of Speech Dispatcher logging.
9510 means no logging, while 5 means that nearly all the information
952about Speech Dispatcher's operation is logged.
953
954@itemize @bullet
955
956@item Level 0
957@itemize @bullet
958@item No information.
959@end itemize
960
961@item Level 1
962@itemize @bullet
963@item Information about loading and exiting.
964@end itemize
965
966@item Level 2
967@itemize @bullet
968@item Information about errors that occurred.
969@item Allocating and freeing resources on start and exit.
970@end itemize
971
972@item Level 3
973@itemize @bullet
974@item Information about accepting/rejecting/closing clients' connections.
975@item Information about invalid client commands.
976@end itemize
977
978@item Level 4
979@itemize @bullet
980@item Every received command is output.
981@item Information preceding the command output.
982@item Information about queueing/allocating messages.
983@item Information about the history, sound icons and other
984facilities.
985@item Information about the work of the speak() thread.
986@end itemize
987
988@item Level 5
989(This is only for debugging purposes and will output *a lot*
990of data. Use with caution.)
991@itemize @bullet
992@item Received data (messages etc.) is output.
993@item Debugging information.
994@end itemize
995@end itemize
996
997@node Tools, Synthesis Output Modules, Configuration, User's Documentation
998@section Tools
999
1000Several small tools are distributed together with Speech Dispatcher.
1001@code{spd-say} is a small client that allows you to send messages to
1002Speech Dispatcher in an easy way and have them spoken, or cancel
1003speech from other applications.
1004
1005@menu
1006* spd-say::                     Say a given text or cancel messages in Dispatcher.
1007* spd-conf::                    Configuration, diagnostics and troubleshooting tool
1008* spd-send::                    Direct SSIP communication from command line.
1009@end menu
1010
1011@node spd-say, spd-conf, Tools, Tools
1012@subsection spd-say
1013
1014spd-say is documented in its own manual. @xref{Top,,,spd-say, Spd-say
1015Documentation}.
1016
1017@node spd-conf, spd-send, spd-say, Tools
1018@subsection spd-conf
1019
1020spd-conf is a tool for creating basic configuration, initial setup of
1021some basic settings (output module, audio method), diagnostics and
1022automated debugging with a possibility to send the debugging output to
1023the developers with a request for help.
1024
1025The available command options are self-documented through
1026@code{spd-say -h}. In any working mode, the tool asks the user about
1027future actions and preferred configuration of the basic options.
1028
1029Most useful ways of execution are:
1030@itemize @bullet
1031@item @code{spd-conf}
1032Create new configuration and setup basic settings according to user
1033answers. Run diagnostics and if some problems occur, run debugging
1034and offer to send a request for help to the developers.
1035
1036@item @code{spd-conf -d}
1037Run diagnostics of problems.
1038
1039@item @code{spd-conf -D}
1040Run debugging and offer to send a request for help to the developers.
1041
1042@end itemize
1043
1044@node spd-send,  , spd-conf, Tools
1045@subsection spd-send
1046
1047spd-send is a small client/server application that allows you to
1048establish a connection to Speech Dispatcher and then use a simple
1049command line tool to send and receive SSIP protocol communication.
1050
1051Please see @file{src/c/clients/spd-say/README} in the Speech
1052Dispatcher's source tree for more information.
1053
1054@node Synthesis Output Modules, Security, Tools, User's Documentation
1055@section Synthesis Output Modules
1056@cindex output module
1057@cindex different synthesizers
1058
1059Speech Dispatcher supports concurrent use of multiple output modules.  If the
1060output modules provide good synchronization, you can combine them when
1061reading messages.  For example if module1 can speak English and Czech while
1062module2 speaks only German, the idea is that if there is some message in
1063German, module2 is used, while module1 is used for the other languages.
1064However the language is not the only criteria for the decision.  The rules for
1065selection of an output module can be influenced through the configuration file
1066@file{speech-dispatcher/speechd.conf}.
1067
1068@menu
1069* Provided Functionality::      Some synthesizers don't support the full set of SSIP features.
1070@end menu
1071
1072@node Provided Functionality,  , Synthesis Output Modules, Synthesis Output Modules
1073@subsection Provided functionality
1074
1075Please note that some output modules don't support the full Speech
1076Dispatcher functionality (e.g. spelling mode, sound icons). If there
1077is no easy way around the missing functionality, we don't try to
1078emulate it in some complicated way and rather try to encourage the
1079developers of that particular synthesizer to add that
1080functionality. We are actively working on adding the missing parts to
1081Festival, so Festival supports nearly all of the features of Speech
1082Dispatcher and we encourage you to use it. Much progress has also been
1083done with eSpeak.
1084
1085@menu
1086* Supported Modules::
1087@end menu
1088
1089@node Supported Modules,  , Provided Functionality, Provided Functionality
1090@subsubsection Supported Modules
1091
1092@itemize @bullet
1093
1094@item Festival
1095Festival is a free software multi-language Text-to-Speech
1096synthesis system that is very flexible and extensible using the
1097Scheme scripting language. Currently, it supports high quality
1098synthesis for several languages, and on today's computers it runs
1099reasonably fast.  If you are not sure which one to use and your
1100language is supported by Festival, we advise you to use it. See
1101@uref{http://www.cstr.ed.ac.uk/projects/festival/}.
1102
1103@item eSpeak
1104eSpeak is a newer very lightweight free software engine with a broad
1105range of supported languages and a good quality of voice at high
1106rates. See @uref{http://espeak.sourceforge.net/}.
1107
1108@item Flite
1109Flite (Festival Light) is a lightweight free software TTS synthesizer
1110intended to run on systems with limited resources. At this time, it
1111has only one English voice and porting voices from Festival looks
1112rather difficult.  With the caching mechanism provided by Speech
1113Dispatcher, Festival is faster than Flite in most situations.  See
1114@uref{http://www.speech.cs.cmu.edu/flite/}.
1115
1116@item Generic
1117The Generic module can be used with any synthesizer that can be
1118managed by a simple command line application. @xref{Configuration of
1119the Generic Output Module}, for more details about how to use it.
1120However, it provides only very rudimentary support of speaking.
1121
1122@item Pico
1123The SVOX Pico engine is a software speech synthesizer for German, English (GB
1124and US), Spanish, French and Italian.
1125SVOX produces clear and distinct speech output made possible by the use of
1126Hidden Markov Model (HMM) algorithms.
1127See @uref{http://git.debian.org/?p=collab-maint/svox.git}.
1128Pico documentation can be found at
1129@uref{http://android.git.kernel.org/?p=platform/external/svox.git;
1130a=tree;f=pico_resources/docs}
1131It includes three manuals:
1132- SVOX_Pico_Lingware.pdf
1133- SVOX_Pico_Manual.pdf
1134- SVOX_Pico_architecture_and_design.pdf
1135
1136@end itemize
1137
1138@node Security,  , Synthesis Output Modules, User's Documentation
1139@section Security
1140
1141Speech Dispatcher doesn't implement any special authentication
1142mechanisms but uses the standard system mechanisms to regulate access.
1143
1144If the default `unix_socket' communication mechanism is used, only the
1145user who starts the server can connect to it due to imposed
1146restrictions on the unix socket file permissions.
1147
1148In case of the `inet_socket' communication mechanism, where clients
1149connect to Speech Dispatcher on a specified port, theoretically
1150everyone could connect to it. The access is by default restricted only
1151for connections originating on the same machine, which can be changed
1152via the LocalhostAccessOnly option in the server configuration
1153file. In such a case, the user is reponsible to set appropriate
1154security restrictions on the access to the given port on his machine
1155from the outside network using a firewall or similar mechanism.
1156
1157@node Technical Specifications, Client Programming, User's Documentation, Top
1158@chapter Technical Specifications
1159
1160
1161@menu
1162* Communication mechanisms::
1163* Address specification::
1164* Actions performed on startup::
1165* Accepted signals::
1166@end menu
1167
1168@node Communication mechanisms, Address specification, Technical Specifications, Technical Specifications
1169@section Communication mechanisms
1170
1171Speech Dispatcher supports two communicatino mechanisms: UNIX-style
1172and Inet sockets, which are refered as 'unix-socket' and 'inet-socket'
1173respectively. The communication mechanism is decided on startup and
1174cannot be changed at runtime. Unix sockets are now the default and
1175preferred variant for local communication, Inet sockets are necessary
1176for communication over network.
1177
1178The mechanism for the decision of which method to use is as follows in
1179this order of precedence: command-line option, configuration option,
1180the default value 'unix-socket'.
1181
1182@emph{Unix sockets} are associated with a file in the filesystem. By
1183default, this file is placed in the user's runtime directory (as
1184determined by the value of the XDG_RUNTIME_DIR environment variable and the
1185system configuration for the given user). It's default name is
1186constructed as @code{XDG_RUNTIME_DIR/speech-dispatcher/speechd.sock}. The access
1187permissions for this file are set to 600 so that it's restricted to
1188read/write by the current user.
1189
1190As such, access is handled properly and there are no conflicts between
1191the different instances of Speech Dispatcher run by the different
1192users.
1193
1194Client applications and libraries are supposed to independently
1195replicate the process of construction of the socket path and connect
1196to it, thus establishing a common communication channel in the default
1197setup.
1198
1199It should be however possible in the client libraries and is possible
1200in the server, to define a custom file as a socket name if needed.
1201Client libraries should respect the @var{SPEECHD_ADDRESS} environment
1202variable.
1203
1204@emph{Inet sockets} are based on communication over a given port on
1205the given host, two variables which must be previously agreed between
1206the server and client before a connection can be established. The only
1207implicit security restriction is the server configuration option which
1208can allow or disallow access from machines other than localhost.
1209
1210By convention, the clients should use host and port given by one of
1211the following sources in the following order of precedence: its own
1212configuration, value of the @var{SPEECHD_ADDRESS} environment variable
1213and the default pair (localhost, 6560).
1214
1215@xref{Setting Communication Method}.
1216
1217@node Address specification, Actions performed on startup, Communication mechanisms, Technical Specifications
1218@section Address specification
1219
1220Speech Dispatcher provies several methods of communication and can be
1221used both locally and over network. @xref{Communication
1222mechanisms}. Client applications and interface libraries need to
1223recognize an address, which specifies how and where to contact the
1224appropriate server.
1225
1226Address specification consits from the method and one or more of its
1227parameters, each item separated by a colon:
1228
1229@example
1230method:parameter1:parameter2
1231@end example
1232
1233The method is either 'unix_socket' or 'inet_socket'. Parameters are
1234optional.  If not used in the address line, their default value will
1235be used.
1236
1237Two forms are currently recognized:
1238
1239@example
1240unix_socket:full/path/to/socket
1241inet_socket:host_ip:port
1242@end example
1243
1244Examples of valid address lines are:
1245@example
1246unix_socket
1247unix_socket:/tmp/test.sock
1248inet_socket
1249inet_socket:192.168.0.34
1250inet_socket:192.168.0.34:6563
1251@end example
1252
1253Clients implement different mechanisms how the user can set the
1254address. Clients should respect the @var{SPEECHD_ADDRESS} environment
1255variable @xref{Setting Communication Method}, unless the user
1256ovverrides its value by settins in the client application
1257itself. Clients should fallback to the default address, if neither the
1258environment variable or their specific configuration is set.
1259
1260The default communication address currently is:
1261
1262@example
1263unix_socket:/$XDG_RUNTIME_DIR/speech-dispatcher/speechd.sock
1264@end example
1265
1266where `~' stands for the path to the users home directory.
1267
1268@node Actions performed on startup, Accepted signals, Address specification, Technical Specifications
1269@section Actions performed on startup
1270
1271What follows is an overview of the actions the server takes on startup
1272in this order:
1273
1274@itemize @bullet
1275
1276@item Initialize logging stage 1
1277
1278Set loglevel to 1 and log destination to stderr (logfile is not ready yet).
1279
1280@item Parse command line options
1281
1282Read preferred communication method, destinations for logfile and pidfile
1283
1284@item Establish the @file{~/.config/speech-dispatcher/} and
1285@file{~/.cache/speech-dispatcher/} directories
1286
1287If pid and conf paths were not given as command line options, the
1288server will place them in @file{~/.config/speech-dispatcher/} and
1289@file{~/.cache/speech-dispatcher/} by default. If they
1290are not specified AND the current user doesn't have a system home directory,
1291the server will fail startup.
1292
1293The configuration file is pointed to @file{~/.config/speech-dispatcher/speechd.conf}
1294if it exists, otherwise to @file{/etc/speech-dispatcher/speechd.conf} or a similar
1295system location according to compile options. One of these files must
1296exists, otherwise Speech Dispatcher will not know where to find its output
1297modules.
1298
1299@item Create pid file
1300
1301Check the pid file in the determined location. If an instance of the
1302server is already running, log an error message and exit with error
1303code 1, otherwise create and lock a new pid file.
1304
1305@item Check for autospawning enabled
1306
1307If the server is started with --spawn, check whether autospawn is not
1308disabled in the configuration (DisableAutoSpawn config option in
1309speechd.conf). If it is disabled, log an error message and exit with
1310error code 1.
1311
1312@item Install signal handlers
1313
1314@item Create unix or inet sockets and start listening
1315
1316@item Initialize Speech Dispatcher
1317
1318Read the configuration files, setup some lateral threads, start and
1319initialize output modules. Reinitialize logging (stage 2) into the
1320final logfile destination (as determined by the command line option,
1321the configuration option and the default location in this order of
1322precedence).
1323
1324After this step, Speech Dispatcher is ready to accept new connections.
1325
1326@item Daemonize the process
1327
1328Fork the process, disconnect from standard input and outputs,
1329disconnect from parent process etc. as prescribed by the POSIX
1330standards.
1331
1332@item Initialize the speaking lateral thread
1333
1334Initialize the second main thread which will process the speech
1335request from the queues and pass them onto the Speech Dispatcher
1336modules.
1337
1338
1339@item Start accepting new connections from clients
1340
1341Start listening for new connections from clients and processing them
1342in a loop.
1343
1344@end itemize
1345
1346@node Accepted signals,  , Actions performed on startup, Technical Specifications
1347@section Accepted signals
1348
1349@itemize @bullet
1350
1351@item SIGINT
1352
1353Terminate the server
1354
1355@item SIGHUP
1356
1357Reload configuration from config files but do not restart modules
1358
1359@item SIGUSR1
1360
1361Reload dead output modules (modules which were previously working but
1362crashed during runtime and marked as dead)
1363
1364@item SIGPIPE
1365
1366Ignored
1367
1368@end itemize
1369
1370@node Client Programming, Server Programming, Technical Specifications, Top
1371@chapter Client Programming
1372
1373Clients communicate with Speech Dispatcher via the Speech Synthesis
1374Internet Protocol (SSIP) @xref{Top, , , ssip, Speech Synthesis
1375Internet Protocol documentation}.  The protocol is the actual
1376interface to Speech Dispatcher.
1377
1378Usually you don't need to use SSIP directly.  You can use one of the supplied
1379libraries, which wrap the SSIP interface.  This is the
1380recommended way of communicating with Speech Dispatcher.  We try so support as
1381many programming environments as possible.  This manual (except SSIP) contains
1382documentation for the C and Python libraries, however there are also other
1383libraries developed as external projects.  Please contact us for information
1384about current external client libraries.
1385
1386@menu
1387* C API::                       Shared library for C/C++
1388* Python API::                  Python module.
1389* Guile API::
1390* Common Lisp API::
1391* Autospawning::                How server is started from clients
1392@end menu
1393
1394@node C API, Python API, Client Programming, Client Programming
1395@section C API
1396
1397@menu
1398* Initializing and Terminating in C::
1399* Speech Synthesis Commands in C::
1400* Speech output control commands in C::
1401* Characters and Keys in C::
1402* Sound Icons in C::
1403* Parameter Setting Commands in C::
1404* Other Functions in C::
1405* Information Retrieval Commands in C::
1406* Event Notification and Index Marking in C::
1407* History Commands in C::
1408* Direct SSIP Communication in C::
1409@end menu
1410
1411@node Initializing and Terminating in C, Speech Synthesis Commands in C, C API, C API
1412@subsection Initializing and Terminating
1413
1414@deffn {C API function} SPDConnection* spd_open(char* client_name, char* connection_name, char* user_name, SPDConnectionMode connection_mode)
1415@findex spd_open()
1416
1417Opens a new connection to Speech Dispatcher and returns a socket file
1418descriptor you will use to communicate with Speech Dispatcher. The
1419socket file descriptor is a parameter used in all the other
1420functions. It now uses local communication via inet sockets.
1421See @code{spd_open2} for more details.
1422
1423The three parameters @code{client_name}, @code{connection_name} and
1424@code{username} are there only for informational and navigational
1425purposes, they don't affect any settings or behavior of any
1426functions. The authentication mechanism has nothing to do with
1427@code{username}. These parameters are important for the user when he
1428wants to set some parameters for a given session, when he wants to
1429browse through history, etc. The parameter @code{connection_mode}
1430specifies how this connection should be handled internally and if
1431event notifications and index marking capabilities will be available.
1432
1433@code{client_name} is the name of the client that opens the connection. Normally,
1434it should be the name of the executable, for example ``lynx'', ``emacs'', ``bash'',
1435or ``gcc''. It can be left as NULL.
1436
1437@code{connection_name} determines the particular use of that connection. If you
1438use only one connection in your program, this should be set to ``main'' (passing
1439a NULL pointer has the same effect). If you use two or more connections in
1440your program, their @code{client_name}s should be the same, but @code{connection_name}s
1441should differ. For example: ``buffer'', ``command_line'', ``text'', ``menu''.
1442
1443@code{username} should be set to the name of the user. Normally, you should
1444get this string from the system. If set to NULL, libspeechd will try to
1445determine it automatically by g_get_user_name().
1446
1447@code{connection_mode} has two possible values: @code{SPD_MODE_SINGLE}
1448and @code{SPD_MODE_THREADED}. If the parameter is set to
1449@code{SPD_MODE_THREADED}, then @code{spd_open()} will open an
1450additional thread in your program which will handle asynchronous SSIP
1451replies and will allow you to use callbacks for event notifications
1452and index marking, allowing you to keep track of the progress
1453of speaking the messages. However, you must be aware that your
1454program is now multi-threaded and care must be taken when
1455using/handling signals. If @code{SPD_MODE_SINGLE} is chosen, the
1456library won't execute any additional threads and SSIP will run only as
1457a synchronous protocol, therefore event notifications and index
1458marking won't be available.
1459
1460It returns a newly allocated SPDConnection* structure on success, or @code{NULL}
1461on error.
1462
1463Each connection you open should be closed by spd_close() before the
1464end of the program, so that the associated connection descriptor is
1465closed, threads are terminated and memory is freed.
1466
1467@end deffn
1468
1469@deffn {C API function} SPDConnection* spd_open2(char* client_name, char* connection_name, char* user_name, SPDConnectionMode connection_mode, SPDConnectionMethod method, int autospawn)
1470@findex spd_open2()
1471
1472Opens a new connection to Speech Dispatcher and returns a socket file
1473descriptor. This function is the same as @code{spd_open} except that
1474it gives more control of the communication method and autospawn
1475functionality as described below.
1476
1477@code{method} is either @code{SPD_METHOD_UNIX_SOCKET} or @code{SPD_METHOD_INET_SOCKET}. By default,
1478unix socket communication should be preferred, but inet sockets are necessary for cross-network
1479communication.
1480
1481@code{autospawn} is a boolean flag specifying whether the function
1482should try to autospawn (autostart) the Speech Dispatcher server
1483process if it is not running already. This is set to 1 by default, so
1484this function should normally not fail even if the server is not yet
1485running.
1486
1487@end deffn
1488
1489@deffn {C API function}  int spd_get_client_id(SPDConnection *connection)
1490@findex spd_get_client_id()
1491
1492Get the client ID.
1493
1494@code{connection} is the SPDConnection* connection created by spd_open().
1495
1496It returns the client ID of the connection.
1497
1498@end deffn
1499
1500@deffn {C API function}  void spd_close(SPDConnection *connection)
1501@findex spd_close()
1502
1503Closes a Speech Dispatcher socket connection, terminates associated
1504threads (if necessary) and frees the memory allocated by
1505spd_open(). You should close every connection before the end of your
1506program.
1507
1508@code{connection} is the SPDConnection connection obtained by spd_open().
1509@end deffn
1510
1511@node Speech Synthesis Commands in C, Speech output control commands in C, Initializing and Terminating in C, C API
1512@subsection Speech Synthesis Commands
1513
1514@defvar {C API type} SPDPriority
1515@vindex SPDPriority
1516
1517@code{SPDPriority} is an enum type that represents the possible priorities that
1518can be assigned to a message.
1519
1520@example
1521typedef enum@{
1522    SPD_IMPORTANT = 1,
1523    SPD_MESSAGE = 2,
1524    SPD_TEXT = 3,
1525    SPD_NOTIFICATION = 4,
1526    SPD_PROGRESS = 5
1527@}SPDPriority;
1528@end example
1529
1530@xref{Top,,Message Priority Model,ssip, SSIP Documentation}.
1531
1532@end defvar
1533
1534@deffn {C API function}  int spd_say(SPDConnection* connection, SPDPriority priority, char* text);
1535@findex spd_say()
1536
1537Sends a message to Speech Dispatcher. If this message isn't blocked by
1538some message of higher priority and this CONNECTION isn't paused, it
1539will be synthesized directly on one of the output devices. Otherwise,
1540the message will be discarded or delayed according to its priority.
1541
1542@code{connection} is the SPDConnection* connection created by spd_open().
1543
1544@code{priority} is the desired priority for this message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.
1545
1546@code{text} is a null terminated string containing text you want sent
1547to synthesis. It must be encoded in UTF-8. Note that this doesn't have
1548to be what you will finally hear. It can be affected by different
1549settings, such as spelling, punctuation, text substitution etc.
1550
1551It returns a positive unique message identification number on success,
1552-1 otherwise.  This message identification number can be saved and
1553used for the purpose of event notification callbacks or history
1554handling.
1555
1556@end deffn
1557
1558@deffn {C API function}  int spd_sayf(SPDConnection* connection, SPDPriority priority, char* format, ...);
1559@findex spd_sayf()
1560
1561Similar to @code{spd_say()}, simulates the behavior of printf().
1562
1563@code{format} is a string containing text and formatting of the parameters, such as ``%d'',
1564``%s'' etc. It must be encoded in UTF-8.
1565
1566@code{...} is an arbitrary number of arguments.
1567
1568All other parameters are the same as for spd_say().
1569
1570For example:
1571@example
1572       spd_sayf(conn, SPD_TEXT, "Hello %s, how are you?", username);
1573       spd_sayf(conn, SPD_IMPORTANT, "Fatal error on [%s:%d]", filename, line);
1574@end example
1575
1576But be careful with unicode! For example this doesn't work:
1577
1578@example
1579       spd_sayf(conn, SPD_NOTIFY, ``Pressed key is %c.'', key);
1580@end example
1581
1582Why? Because you are supposing that key is a char, but that will
1583fail with languages using multibyte charsets. The proper solution
1584is:
1585
1586@example
1587       spd_sayf(conn, SPD_NOTIFY, ``Pressed key is %s'', key);
1588@end example
1589where key is an encoded string.
1590
1591It returns a positive unique message identification number on success, -1 otherwise.
1592This message identification number can be saved and used for the purpose of
1593event notification callbacks or history handling.
1594@end deffn
1595
1596@node Speech output control commands in C, Characters and Keys in C, Speech Synthesis Commands in C, C API
1597@subsection Speech Output Control Commands
1598
1599@subsubheading Stop Commands
1600
1601@deffn {C API function}  int spd_stop(SPDConnection* connection);
1602@findex spd_stop()
1603
1604Stops the message currently being spoken on a given connection. If there
1605is no message being spoken, does nothing. (It doesn't touch the messages
1606waiting in queues). This is intended for stops executed by the user,
1607not for automatic stops (because automatically you can't control
1608how many messages are still waiting in queues on the server).
1609
1610@code{connection} is the SPDConnection* connection created by spd_open().
1611
1612It returns 0 on success, -1 otherwise.
1613@end deffn
1614
1615@deffn {C API function}  int spd_stop_all(SPDConnection* connection);
1616@findex spd_stop_all()
1617
1618The same as spd_stop(), but it stops every message being said,
1619without distinguishing where it came from.
1620
1621It returns 0 on success, -1 if some of the stops failed.
1622@end deffn
1623
1624@deffn {C API function}  int spd_stop_uid(SPDConnection* connection, int target_uid);
1625@findex spd_stop_uid()
1626
1627The same as spd_stop() except that it stops a client client different from
1628the calling one. You must specify this client in @code{target_uid}.
1629
1630@code{target_uid} is the unique ID of the connection you want
1631to execute stop() on. It can be obtained from spd_history_get_client_list().
1632@xref{History Commands in C}.
1633
1634It returns 0 on success, -1 otherwise.
1635
1636@end deffn
1637
1638@subsubheading Cancel Commands
1639
1640@deffn {C API function}  int spd_cancel(SPDConnection* connection);
1641
1642Stops the currently spoken message from this connection
1643(if there is any) and discards all the queued messages
1644from this connection. This is probably what you want
1645to do, when you call spd_cancel() automatically in
1646your program.
1647@end deffn
1648
1649@deffn {C API function}  int spd_cancel_all(SPDConnection* connection);
1650@findex spd_cancel_all()
1651
1652The same as spd_cancel(), but it cancels every message
1653without distinguishing where it came from.
1654
1655It returns 0 on success, -1 if some of the stops failed.
1656@end deffn
1657
1658@deffn {C API function}  int spd_cancel_uid(SPDConnection* connection, int target_uid);
1659@findex spd_cancel_uid()
1660
1661The same as spd_cancel() except that it executes cancel for some other client
1662than the calling one. You must specify this client in @code{target_uid}.
1663
1664@code{target_uid} is the unique ID of the connection you want to
1665execute cancel() on.  It can be obtained from
1666spd_history_get_client_list().  @xref{History Commands in C}.
1667
1668It returns 0 on success, -1 otherwise.
1669@end deffn
1670
1671@subsubheading Pause Commands
1672
1673@deffn {C API function}  int spd_pause(SPDConnection* connection);
1674@findex int spd_pause()
1675
1676Pauses all messages received from the given connection. No messages
1677except for priority @code{notification} and @code{progress} are thrown
1678away, they are all waiting in a separate queue for resume(). Upon resume(), the
1679message that was being said at the moment pause() was received will be
1680continued from the place where it was paused.
1681
1682It returns immediately. However, that doesn't mean that the speech
1683output will stop immediately. Instead, it can continue speaking
1684the message for a while until a place where the position in the text
1685can be determined exactly is reached. This is necessary to be able to
1686provide `resume' without gaps and overlapping.
1687
1688When pause is on for the given client, all newly received
1689messages are also queued and waiting for resume().
1690
1691It returns 0 on success, -1 if something failed.
1692@end deffn
1693
1694@deffn {C API function}  int spd_pause_all(SPDConnection* connection);
1695@findex spd_pause_all()
1696
1697The same as spd_pause(), but it pauses every message,
1698without distinguishing where it came from.
1699
1700It returns 0 on success, -1 if some of the pauses failed.
1701@end deffn
1702
1703@deffn {C API function}  int spd_pause_uid(SPDConnection* connection, int target_uid);
1704@findex spd_pause_uid()
1705
1706The same as spd_pause() except that it executes pause for a client different from
1707the calling one. You must specify the client in @code{target_uid}.
1708
1709@code{target_uid} is the unique ID of the connection you want
1710to pause. It can be obtained from spd_history_get_client_list().
1711@xref{History Commands in C}.
1712
1713It returns 0 on success, -1 otherwise.
1714@end deffn
1715
1716@subsubheading Resume Commands
1717
1718@deffn {C API function}  int spd_resume(SPDConnection* connection);
1719@findex int spd_resume()
1720
1721Resumes all paused messages from the given connection. The rest
1722of the message that was being said at the moment pause() was
1723received will be said and all the other messages are queued
1724for synthesis again.
1725
1726@code{connection} is the SPDConnection* connection created by spd_open().
1727
1728It returns 0 on success, -1 otherwise.
1729@end deffn
1730
1731@deffn {C API function}  int spd_resume_all(SPDConnection* connection);
1732@findex spd_resume_all()
1733
1734The same as spd_resume(), but it resumes every paused message,
1735without distinguishing where it came from.
1736
1737It returns 0 on success, -1 if some of the pauses failed.
1738@end deffn
1739
1740@deffn {C API function}  int spd_resume_uid(SPDConnection* connection, int target_uid);
1741@findex spd_resume_uid()
1742
1743The same as spd_resume() except that it executes resume for a client different from
1744the calling one. You must specify the client in @code{target_uid}.
1745
1746@code{target_uid} is the unique ID of the connection you want
1747to resume. It can be obtained from spd_history_get_client_list().
1748@xref{History Commands in C}.
1749
1750It returns 0 on success, -1 otherwise.
1751@end deffn
1752
1753@node Characters and Keys in C, Sound Icons in C, Speech output control commands in C, C API
1754@subsection Characters and Keys
1755
1756@deffn {C API function}  int spd_char(SPDConnection* connection, SPDPriority priority, char* character);
1757@findex spd_char()
1758
1759Says a character according to user settings for characters. For example, this can be
1760used for speaking letters under the cursor.
1761
1762@code{connection} is the SPDConnection* connection created by spd_open().
1763
1764@code{priority} is the desired priority for this
1765message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.
1766
1767@code{character} is a NULL terminated string of chars containing one UTF-8
1768character. If it contains more characters, only the first one is processed.
1769
1770It returns 0 on success, -1 otherwise.
1771@end deffn
1772
1773@deffn {C API function}  int spd_wchar(SPDConnection* connection, SPDPriority priority, wchar_t wcharacter);
1774@findex spd_say_wchar()
1775
1776The same as spd_char(), but it takes a wchar_t variable as its argument.
1777
1778It returns 0 on success, -1 otherwise.
1779@end deffn
1780
1781@deffn {C API function} int spd_key(SPDConnection* connection, SPDPriority priority, char* key_name);
1782@findex spd_key()
1783
1784Says a key according to user settings for keys.
1785
1786@code{connection} is the SPDConnection* connection created by spd_open().
1787
1788@code{priority} is the desired priority for this
1789message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.
1790
1791@code{key_name} is the name of the key in a special format.
1792@xref{Top,,Speech Synthesis and Sound Output Commands, ssip, SSIP
1793Documentation}, (KEY, the corresponding SSIP command) for description
1794of the format of @code{key_name}
1795
1796It returns 0 on success, -1 otherwise.
1797@end deffn
1798
1799@node Sound Icons in C, Parameter Setting Commands in C, Characters and Keys in C, C API
1800@subsection Sound Icons
1801
1802@deffn {C API function}  int spd_sound_icon(SPDConnection* connection, SPDPriority priority, char* icon_name);
1803@findex spd_sound_icon()
1804
1805Sends a sound icon ICON_NAME. These are symbolic names that are mapped
1806to a sound or to a text string (in the particular language) according to
1807Speech Dispatcher tables and user settings. Each program can also
1808define its own icons.
1809
1810@code{connection} is the SPDConnection* connection created by spd_open().
1811
1812@code{priority} is the desired priority for this
1813message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.
1814
1815@code{icon_name} is the name of the icon. It can't contain spaces, instead
1816use underscores (`_'). Icon names starting with an underscore
1817are considered internal and shouldn't be used.
1818@end deffn
1819
1820@node Parameter Setting Commands in C, Other Functions in C, Sound Icons in C, C API
1821@subsection Parameter Settings Commands
1822
1823The following parameter setting commands are available. For configuration
1824and history clients there are also functions for setting the value for
1825some other connection and for all connections. They are listed separately below.
1826
1827Please see @ref{Top,,Parameter Setting Commands,ssip, SSIP
1828Documentation} for a general description of what they mean.
1829
1830@deffn {C API function} int spd_set_data_mode(SPDConnection *connection, SPDDataMode mode)
1831@findex spd_set_data_mode()
1832
1833Set Speech Dispatcher data mode. Currently, plain text and SSML are
1834supported. SSML is especially useful if you want to use index marks
1835or include changes of voice parameters in the text.
1836
1837@code{mode} is the requested data mode: @code{SPD_DATA_TEXT} or
1838@code{SPD_DATA_SSML}.
1839
1840@end deffn
1841
1842@deffn {C API function}  int spd_set_language(SPDConnection* connection, char* language);
1843@findex spd_set_language()
1844
1845Sets the language that should be used for synthesis.
1846
1847@code{connection} is the SPDConnection* connection created by spd_open().
1848
1849@code{language} is the language code as defined in RFC 1766 (``cs'',
1850``en'', ``en-US'', ...).
1851
1852@end deffn
1853
1854@deffn {C API function}  int spd_set_output_module(SPDConnection* connection, char* output_module);
1855@findex spd_set_output_module()
1856@anchor{spd_set_output_module}
1857
1858Sets the output module that should be used for synthesis. The parameter
1859of this command should always be entered by the user in some way
1860and not hardcoded anywhere in the code as the available synthesizers
1861and their registration names may vary from machine to machine.
1862
1863@code{connection} is the SPDConnection* connection created by spd_open().
1864
1865@code{output_module} is the output module name under which the module
1866was loaded into Speech Dispatcher in its configuration (``flite'',
1867``festival'', ``epos-generic''... )
1868
1869@end deffn
1870
1871@deffn {C API function} char* spd_get_output_module(SPDConnection* connection);
1872@findex spd_get_output_module()
1873@anchor{spd_get_output_module}
1874
1875Gets the current output module in use for synthesis.
1876
1877@code{connection} is the SPDConnection* connection created by spd_open().
1878
1879It returns the output module name under which the module was loaded into Speech
1880Dispatcher in its configuration (``flite'', ``festival'',  ``espeak''... )
1881
1882@end deffn
1883
1884@deffn {C API function}  int spd_set_punctuation(SPDConnection* connection, SPDPunctuation type);
1885@findex spd_set_punctuation()
1886
1887Set punctuation mode to the given value.  `all' means speak all
1888punctuation characters, `none' means speak no punctuation characters,
1889`some' and `most' mean speak intermediate sets of punctuation characters as
1890set in symbols tables or output modules.
1891
1892@code{connection} is the SPDConnection* connection created by spd_open().
1893
1894@code{type} is one of the following values: @code{SPD_PUNCT_ALL},
1895@code{SPD_PUNCT_NONE}, @code{SPD_PUNCT_SOME}.
1896
1897It returns 0 on success, -1 otherwise.
1898@end deffn
1899
1900@deffn {C API function}  int spd_set_spelling(SPDConnection* connection, SPDSpelling type);
1901@findex spd_set_spelling()
1902
1903Switches spelling mode on and off. If set to on, all incoming messages
1904from this particular connection will be processed according to appropriate
1905spelling tables (see spd_set_spelling_table()).
1906
1907@code{connection} is the SPDConnection* connection created by spd_open().
1908
1909@code{type} is one of the following values: @code{SPD_SPELL_ON}, @code{SPD_SPELL_OFF}.
1910@end deffn
1911
1912@deffn {C API function}  int spd_set_voice_type(SPDConnection* connection, SPDVoiceType voice);
1913@findex spd_set_voice_type()
1914@anchor{spd_set_voice_type}
1915
1916Set a preferred symbolic voice.
1917
1918@code{connection} is the SPDConnection* connection created by spd_open().
1919
1920@code{voice} is one of the following values: @code{SPD_MALE1},
1921@code{SPD_MALE2}, @code{SPD_MALE3}, @code{SPD_FEMALE1}, @code{SPD_FEMALE2},
1922@code{SPD_FEMALE3}, @code{SPD_CHILD_MALE}, @code{SPD_CHILD_FEMALE}.
1923
1924@end deffn
1925
1926@deffn {C API function}  int spd_set_synthesis_voice(SPDConnection* connection, char* voice_name);
1927@findex spd_set_voice_type()
1928@anchor{spd_set_synthesis_voice}
1929
1930Set the speech synthesizer voice to use. Please note that synthesis
1931voices are an attribute of the synthesizer, so this setting only takes
1932effect until the output module in use is changed (via
1933@code{spd_set_output_module()} or via @code{spd_set_language}).
1934
1935@code{connection} is the SPDConnection* connection created by spd_open().
1936
1937@code{voice_name} is any of the voice name values retrieved by @xref{spd_list_synthesis_voices}.
1938
1939@end deffn
1940
1941@deffn {C API function}  int spd_set_voice_rate(SPDConnection* connection, int rate);
1942@findex spd_set_voice_rate()
1943
1944Set voice speaking rate.
1945
1946@code{connection} is the SPDConnection* connection created by spd_open().
1947
1948@code{rate} is a number between -100 and +100 which means
1949the slowest and the fastest speech rate respectively.
1950
1951@end deffn
1952
1953@deffn {C API function}  int spd_get_voice_rate(SPDConnection* connection);
1954@findex spd_get_voice_rate()
1955
1956Get voice speaking rate.
1957
1958@code{connection} is the SPDConnection* connection created by spd_open().
1959
1960It returns the current voice rate.
1961
1962@end deffn
1963
1964@deffn {C API function}  int spd_set_voice_pitch(SPDConnection* connection, int pitch);
1965@findex spd_set_voice_pitch()
1966
1967Set voice pitch.
1968
1969@code{connection} is the SPDConnection* connection created by spd_open().
1970
1971@code{pitch} is a number between -100 and +100, which means the
1972lowest and the highest pitch respectively.
1973
1974@end deffn
1975
1976@deffn {C API function}  int spd_get_voice_pitch(SPDConnection* connection);
1977@findex spd_get_voice_pitch()
1978
1979Get voice pitch.
1980
1981@code{connection} is the SPDConnection* connection created by spd_open().
1982
1983It returns the current voice pitch.
1984
1985@end deffn
1986
1987@deffn {C API function}  int spd_set_voice_pitch_range(SPDConnection* connection, int pitch_range);
1988@findex spd_set_voice_pitch()
1989
1990Set voice pitch range.
1991
1992@code{connection} is the SPDConnection* connection created by spd_open().
1993
1994@code{pitch_range} is a number between -100 and +100, which means the
1995lowest and the highest pitch range respectively.
1996
1997@end deffn
1998
1999@deffn {C API function}  int spd_set_volume(SPDConnection* connection, int volume);
2000@findex spd_set_volume()
2001
2002Set the volume of the voice and sounds produced by Speech Dispatcher's output
2003modules.
2004
2005@code{connection} is the SPDConnection* connection created by spd_open().
2006
2007@code{volume} is a number between -100 and +100 which means
2008the lowest and the loudest voice respectively.
2009
2010@end deffn
2011
2012@deffn {C API function}  int spd_get_volume(SPDConnection* connection);
2013@findex spd_get_volume()
2014
2015Get the volume of the voice and sounds produced by Speech Dispatcher's output
2016modules.
2017
2018@code{connection} is the SPDConnection* connection created by spd_open().
2019
2020It returns the current volume.
2021
2022@end deffn
2023
2024
2025@node Other Functions in C, Information Retrieval Commands in C, Parameter Setting Commands in C, C API
2026@subsection Other Functions
2027
2028@node Information Retrieval Commands in C, Event Notification and Index Marking in C, Other Functions in C, C API
2029@subsection Information Retrieval Commands
2030
2031@deffn {C API function}  char** spd_list_modules(SPDConnection* connection)
2032@findex spd_list_modules()
2033@anchor{spd_list_modules}
2034
2035
2036Returns a null-terminated array of identification names of the available
2037output modules. You can subsequently set the desired output module with
2038@xref{spd_set_output_module}. In case of error, the return value is
2039a NULL pointer.
2040
2041@code{connection} is the SPDConnection* connection created by spd_open().
2042
2043@end deffn
2044
2045@deffn {C API function}  char** spd_list_voices(SPDConnection* connection)
2046@findex spd_list_voices()
2047@anchor{spd_list_voices}
2048
2049Returns a null-terminated array of identification names of the
2050symbolic voices. You can subsequently set the desired voice
2051with @xref{spd_set_voice_type}.
2052
2053Please note that this is a fixed list independent of the synthesizer
2054in use. The given voices can be mapped to specific synthesizer voices
2055according to user wish or may, for example, all be mapped to the same
2056voice. To choose directly from the raw list of voices as implemented
2057in the synthesizer, @xref{spd_list_synthesis_voices}.
2058
2059In case of error, the return value is a NULL pointer.
2060
2061@code{connection} is the SPDConnection* connection created by spd_open().
2062
2063@end deffn
2064
2065@deffn {C API function}  char** spd_list_synthesis_voices(SPDConnection* connection)
2066@findex spd_list_synthesis_voices()
2067@anchor{spd_list_synthesis_voices}
2068
2069Returns a null-terminated array of identification names of
2070@code{SPDVoice*} structures describing the available voices as given
2071by the synthesizer. You can subsequently set the desired voice with
2072@code{spd_set_synthesis_voice()}.
2073
2074@example
2075typedef struct@{
2076  char *name;   /* Name of the voice (id) */
2077  char *language;  /* 2/3-letter ISO language code,
2078                    * possibly followed by 2/3-letter ISO region code,
2079		    * e.g. en-US */
2080  char *variant;   /* a not-well defined string describing dialect etc. */
2081@}SPDVoice;
2082@end example
2083
2084Please note that the list returned is specific to each synthesizer in
2085use (so when you switch to another output module, you must also
2086retrieve a new list). If you want instead to use symbolic voice
2087names which are independent of the synthesizer in use, @xref{spd_list_voices}.
2088
2089In case of error, the return value is a NULL pointer.
2090
2091@code{connection} is the SPDConnection* connection created by spd_open().
2092
2093@end deffn
2094
2095@node Event Notification and Index Marking in C, History Commands in C, Information Retrieval Commands in C, C API
2096@subsection Event Notification and Index Marking in C
2097
2098When the SSIP connection is run in asynchronous mode, it is possible
2099to register callbacks for all the SSIP event notifications and index
2100mark notifications, as defined in @ref{Message Event Notification
2101and Index Marking,,, ssip, SSIP Documentation}
2102
2103@defvar {C API type} SPDNotification
2104@vindex SPDNotification
2105@anchor{SPDNotification}
2106
2107@code{SPDNotification} is an enum type that represents the possible
2108base notification types that can be assigned to a message.
2109
2110@example
2111typedef enum@{
2112    SPD_BEGIN = 1,
2113    SPD_END = 2,
2114    SPD_INDEX_MARKS = 4,
2115    SPD_CANCEL = 8,
2116    SPD_PAUSE = 16,
2117    SPD_RESUME = 32
2118@}SPDNotification;
2119@end example
2120@end defvar
2121
2122There are currently two types of callbacks in the C API.
2123
2124@defvar {C API type} SPDCallback
2125@vindex SPDCallback
2126@anchor{SPDCallback}
2127@code{void (*SPDCallback)(size_t msg_id, size_t client_id, SPDNotificationType state);}
2128
2129This one is used for notifications about the events: @code{BEGIN}, @code{END}, @code{PAUSE}
2130and @code{RESUME}. When the callback is called, it provides three parameters for the event.
2131
2132@code{msg_id} unique identification number of the message the notification is about.
2133
2134@code{client_id} specifies the unique identification number of the client who sent the
2135message. This is usually the same connection as the connection which registered this
2136callback, and therefore uninteresting. However, in some special cases it might be useful
2137to register this callback for other SSIP connections, or register the same callback for
2138several connections originating from the same application.
2139
2140@code{state} is the @code{SPD_Notification} type of this notification. @xref{SPDNotification}.
2141@end defvar
2142
2143@defvar {C API type} SPDCallbackIM
2144@vindex SPDCallbackIM
2145@code{void (*SPDCallbackIM)(size_t msg_id, size_t client_id, SPDNotificationType state,
2146char *index_mark);}
2147
2148@code{SPDCallbackIM} is used for notifications about index marks that have been reached
2149in the message.  (A way to specify index marks is e.g. through the SSML element
2150<mark/> in ssml mode.)
2151
2152The syntax and meaning of these parameters are the same as for @ref{SPDCallback}
2153except for the additional parameter @code{index_mark}.
2154
2155@code{index_mark} is a NULL terminated string associated with the index mark. Please
2156note that this string is specified by client application and therefore it needn't be
2157unique.
2158@end defvar
2159
2160One or more callbacks can be supplied for a given @code{SPDConnection*} connection by
2161assigning the values of pointers to the appropriate functions to the following connection
2162members:
2163
2164@example
2165    SPDCallback callback_begin;
2166    SPDCallback callback_end;
2167    SPDCallback callback_cancel;
2168    SPDCallback callback_pause;
2169    SPDCallback callback_resume;
2170    SPDCallbackIM callback_im;
2171@end example
2172
2173There are three settings commands which will turn notifications on and
2174off for the current SSIP connection and cause the callbacks to be called
2175when the event is registered by Speech Dispatcher.
2176
2177@deffn {C API function} int spd_set_notification_on(SPDConnection* connection, SPDNotification notification);
2178@findex spd_set_notification_on
2179@end deffn
2180@deffn {C API function} int spd_set_notification_off(SPDConnection* connection, SPDNotification notification);
2181@findex spd_set_notification_off
2182@end deffn
2183@deffn {C API function} int spd_set_notification(SPDConnection* connection, SPDNotification notification, const char* state);
2184@findex spd_set_notification
2185
2186These functions will set the notification specified by the parameter
2187@code{notification} on or off (or to the given value)
2188respectively. Note that it is only safe to call these functions after
2189the appropriate callback functions have been set in the @code{SPDCallback}
2190structure. Doing otherwise is not considered an error, but the
2191application might miss some events due to callback functions not being
2192executed (e.g. the client might receive an @code{END} event without
2193receiving the corresponding @code{BEGIN} event in advance.
2194
2195@code{connection} is the SPDConnection* connection created by spd_open().
2196
2197@code{notification} is the requested type of notifications that should be reported by SSIP. @xref{SPDNotification}.
2198Note that also '|' combinations are possible, as illustrated in the example below.
2199
2200@code{state} must be either the string ``on'' or ``off'', for switching the given notification on or off.
2201
2202@end deffn
2203
2204The following example shows how to use callbacks for the simple
2205purpose of playing a message and waiting until its end. (Please note
2206that checks of return values in this example as well as other code
2207not directly related to index marking, have been removed for the purpose
2208of clarity.)
2209
2210@example
2211#include <semaphore.h>
2212
2213sem_t semaphore;
2214
2215/* Callback for Speech Dispatcher notifications */
2216void end_of_speech(size_t msg_id, size_t client_id, SPDNotificationType type)
2217@{
2218   /* We don't check msg_id here since we will only send one
2219       message. */
2220
2221   /* Callbacks are running in a separate thread, so let the
2222       (sleeping) main thread know about the event and wake it up. */
2223   sem_post(&semaphore);
2224@}
2225
2226int
2227main(int argc, char **argv)
2228@{
2229   SPDConnection *conn;
2230
2231   sem_init(&semaphore, 0, 0);
2232
2233   /* Open Speech Dispatcher connection in THREADED mode. */
2234   conn = spd_open("say","main", NULL, SPD_MODE_THREADED);
2235
2236   /* Set callback handler for 'end' and 'cancel' events. */
2237   conn->callback_end = con->callback_cancel = end_of_speech;
2238
2239   /* Ask Speech Dispatcher to notify us about these events. */
2240   spd_set_notification_on(conn, SPD_END);
2241   spd_set_notification_on(conn, SPD_CANCEL);
2242
2243   /* Say our message. */
2244   spd_sayf(conn, SPD_MESSAGE, "%s", argv[1]);
2245
2246   /* Wait for 'end' or 'cancel' of the sent message.
2247      By SSIP specifications, we are guaranteed to get
2248      one of these two eventually. */
2249   sem_wait(&semaphore);
2250
2251   return 0;
2252@}
2253@end example
2254
2255@node History Commands in C, Direct SSIP Communication in C, Event Notification and Index Marking in C, C API
2256@subsection History Commands
2257@findex spd_history_select_client()
2258@findex spd_get_client_list()
2259@findex spd_get_message_list_fd()
2260
2261@node Direct SSIP Communication in C,  , History Commands in C, C API
2262@subsection Direct SSIP Communication in C
2263
2264It might happen that you want to use some SSIP function that is not
2265available through a library or you may want to use an available
2266function in a different manner. (If you think there is something
2267missing in a library or you have some useful comment on the
2268available functions, please let us know.) For this purpose, there are
2269a few functions that will allow you to send arbitrary SSIP commands on
2270your connection and read the replies.
2271
2272@deffn {C API function} int spd_execute_command(SPDConnection* connection, char *command);
2273@findex spd_execute_command()
2274
2275You can send an arbitrary SSIP command specified in the parameter @code{command}.
2276
2277If the command is successful, the function returns a 0. If there is no such
2278command or the command failed for some reason, it returns -1.
2279
2280@code{connection} is the SPDConnection* connection created by spd_open().
2281
2282@code{command} is a null terminated string containing a full SSIP command
2283without the terminating sequence @code{\r\n}.
2284
2285For example:
2286@example
2287        spd_execute_command(fd, "SET SELF RATE 60");
2288        spd_execute_command(fd, "SOUND_ICON bell");
2289@end example
2290
2291It's not possible to use this function for compound commands like @code{SPEAK}
2292where you are receiving more than one reply. If this is your case, please
2293see `spd_send_data()'.
2294@end deffn
2295
2296@deffn {C API function} char* spd_send_data(SPDConnection* connection, const char *message, int wfr);
2297@findex spd_send_data()
2298
2299You can send an arbitrary SSIP string specified in the parameter @code{message}
2300and, if specified, wait for the reply. The string can be any SSIP command, but
2301it can also be textual data or a command parameter.
2302
2303If @code{wfr} (wait for reply) is set to SPD_WAIT_REPLY, you will receive the reply string
2304as the return value. If wfr is set to SPD_NO_REPLY, the return value is a NULL pointer.
2305If wfr is set to SPD_WAIT_REPLY, you should always free the returned string.
2306
2307@code{connection} is the SPDConnection* connection created by spd_open().
2308
2309@code{message} is a null terminated string containing a full SSIP
2310string.  If this is a complete SSIP command, it must include the full
2311terminating sequence @code{\r\n}.
2312
2313@code{wfr} is either SPD_WAIT_REPLY (integer value of 1) or SPD_NO_REPLY (0).
2314This specifies if you expect to get a reply on the sent data according to SSIP.
2315For example, if you are sending ordinary text inside a @code{SPEAK} command,
2316you don't expect to get a reply, but you expect a reply after sending the final
2317sequence @code{\r\n.\r\n} of the multi-line command.
2318
2319For example (simplified by not checking and freeing the returned strings):
2320@example
2321        spd_send_data(conn, "SPEAK", SPD_WAIT_REPLY);
2322        spd_send_data(conn, "Hello world!\n", SPD_NO_REPLY);
2323        spd_send_data(conn, "How are you today?!", SPD_NO_REPLY);
2324        spd_send_data(conn, "\r\n.\r\n.", SPD_WAIT_REPLY);
2325@end example
2326
2327@end deffn
2328
2329
2330@node Python API, Guile API, C API, Client Programming
2331@section Python API
2332
2333There is a full Python API available in @file{src/python/speechd/} in
2334the source tree.  Please see the Python docstrings for full reference
2335about the available objects and methods.
2336
2337Simple Python client:
2338@example
2339import speechd
2340client = speechd.SSIPClient('test')
2341client.set_output_module('festival')
2342client.set_language('en-US')
2343client.set_punctuation(speechd.PunctuationMode.SOME)
2344client.speak("Hello World!")
2345client.close()
2346@end example
2347
2348The Python API respects the environment variables
2349@var{SPEECHD_ADDRESS} it the communication address is not specified
2350explicitly (see @code{SSIPClient} constructor arguments).
2351
2352Implementation of callbacks within the Python API tries to hide the
2353low level details of SSIP callback handling and provide a convenient
2354Pythonic interface.  You just pass a callable object (function) to the
2355@code{speak()} method and this function will be called whenever an
2356event occurs for the corresponding message.
2357
2358Callback example:
2359@example
2360import speechd, time
2361called = []
2362client = speechd.SSIPClient('callback-test')
2363client.speak("Hi!", callback=lambda cb_type: called.append(cb_type))
2364time.sleep(2) # Wait for the events to happen.
2365print "Called callbacks:", called
2366client.close()
2367@end example
2368
2369Real-world callback functions will most often need some sort of
2370context information to be able to distinguish for which message the
2371callback was called.  This can be simply done in Python.  The
2372following example uses the actual message text as the context
2373information within the callback function.
2374
2375Callback context example:
2376@example
2377import speechd, time
2378
2379class CallbackExample(object):
2380    def __init__(self):
2381        self._client = speechd.SSIPClient('callback-test')
2382
2383    def speak(self, text):
2384        def callback(callback_type):
2385            if callback_type == speechd.CallbackType.BEGIN:
2386                print "Speech started:", text
2387            elif callback_type == speechd.CallbackType.END:
2388                print "Speech completed:", text
2389            elif callback_type == speechd.CallbackType.CANCEL:
2390                print "Speech interupted:", text
2391        self._client.speak(text, callback=callback,
2392                           event_types=(speechd.CallbackType.BEGIN,
2393                                        speechd.CallbackType.CANCEL,
2394                                        speechd.CallbackType.END))
2395
2396    def go(self):
2397        self.speak("Hi!")
2398        self.speak("How are you?")
2399        time.sleep(4) # Wait for the events to happen.
2400        self._client.close()
2401
2402CallbackExample().go()
2403@end example
2404
2405@emph{Important notice:} The callback is called in Speech Dispatcher
2406listener thread.  No subsequent Speech Dispatcher interaction is
2407allowed from within the callback invocation.  If you need to do
2408something more complicated, do it in another thread to prevent
2409deadlocks in SSIP communication.
2410
2411@node Guile API, Common Lisp API, Python API, Client Programming
2412@section Guile API
2413
2414The Guile API can be found @file{src/guile/} in
2415the source tree, however it's still considered to be
2416at the experimental stage. Please read @file{src/guile/README}.
2417
2418@node Common Lisp API, Autospawning, Guile API, Client Programming
2419@section Common Lisp API
2420
2421The Common Lisp API can be found @file{src/cl/} in
2422the source tree, however it's still considered to be
2423at the experimental stage. Please read @file{src/cl/README}.
2424
2425@node Autospawning,  , Common Lisp API, Client Programming
2426@section Autospawning
2427
2428It is suggested that client libraries offer an autospawn functionality
2429to automatically start the server process when connecting locally and if
2430it is not already running. E.g. if the client application starts and
2431Speech Dispatcher is not running already, the client will start Speech
2432Dispatcher.
2433
2434The library API should provide a possibility to turn this
2435functionality off, but we suggest to set the default behavior to
2436autospawn.
2437
2438Autospawn is performed by executing Speech Dispatcher with the --spawn
2439parameter under the same user and permissions as the client process:
2440
2441@example
2442speech-dispatcher --spawn
2443@end example
2444
2445With the @code{--spawn} parameter, the process will start and return
2446with an exit code of 0 only if a) it is not already running (pidfile
2447check) b) the server doesn't have autospawn disabled in its
2448configuration c) no other error preventing the start
2449occurs. Otherwise, Speech Dispatcher is not started and the error code
2450of 1 is returned.
2451
2452The client library should redirect its stdout and stderr outputs
2453either to nowhere or to its logging system. It should subsequently
2454completely detach from the newly spawned process.
2455
2456Due to a bug in Speech Dispatcher, it is currently necessary to
2457include a wait statement after the autospawn for about 0.5 seconds
2458before attempting a connection.
2459
2460Please see how autospawn is implemented in the C API and in the Python
2461API for an example.
2462
2463@node Server Programming, Download and Contact, Client Programming, Top
2464@chapter Server Programming
2465
2466@menu
2467* Server Core::                 Internal structure and functionality overview.
2468* Output Modules::              Plugins for various speech synthesizers.
2469@end menu
2470
2471@node Server Core, Output Modules, Server Programming, Server Programming
2472@section Server Core
2473
2474The main documentation for the server core is the code itself. This section
2475is only a general introduction intended to give you some basic information
2476and hints where to look for things. If you are going to make some modifications
2477in the server core, we will be happy if you get in touch with us on
2478@email{speechd-discuss@@nongnu.org}.
2479
2480The server core is composed of two main parts, each of them implemented
2481in a separate thread. The @emph{server part} handles the communication
2482with clients and, with the desired configuration options, stores the messages
2483in the priority queue. The @emph{speaking part} takes care of
2484communicating with the output modules, pulls messages out of the priority
2485queue at the correct time and sends them to the appropriate synthesizer.
2486
2487Synchronization between these two parts is done by thread mutexes.
2488Additionally, synchronization of the speaking part from both sides
2489(server part, output modules) is done via a SYSV/IPC semaphore.
2490
2491@subheading Server part
2492
2493After switching to the daemon mode (if required), it reads configuration
2494files and initializes the speaking part. Then it opens the socket
2495and waits for incoming data. This is implemented mainly in
2496@file{src/server/speechd.c} and @file{src/server/server.c}.
2497
2498There are three types of events: new client connects to speechd,
2499old client disconnects, or a client sends some data. In the third
2500case, the data is passed to the @code{parse()} function defined
2501in @file{src/server/parse.c}.
2502
2503If the incoming data is a new message, it's stored in a
2504queue according to its priority. If it is SSIP
2505commands, it's handled by the appropriate handlers.
2506Handling of the @code{SET} family of commands can be found
2507in @file{src/server/set.c} and @code{HISTORY} commands are
2508processed in @file{src/server/history.c}.
2509
2510All reply messages of SSIP are defined in @file{src/server/msg.h}.
2511
2512@subheading Speaking part
2513
2514This thread, the function @code{speak()} defined in
2515@file{src/server/speaking.c}, is created from the server part process
2516shortly after initialization. Then it enters an infinite loop and
2517waits on a SYSV/IPC semaphore until one of the following actions
2518happen:
2519
2520@itemize @bullet
2521@item
2522The server adds a new message to the queue of messages waiting
2523to be said.
2524@item
2525The currently active output module signals that the message
2526that was being spoken is done.
2527@item
2528Pause or resume is requested.
2529@end itemize
2530
2531After handling the rest of the priority interaction (like actions
2532needed to repeat the last priority progress message) it decides
2533which action should be performed. Usually it's picking up
2534a message from the queue and sending it to the desired output
2535module (synthesizer), but sometimes it's handling the pause
2536or resume requests, and sometimes it's doing nothing.
2537
2538As said before, this is the part of Speech Dispatcher that
2539talks to the output modules. It does so by using the output
2540interface defined in @file{src/server/output.c}.
2541
2542@node Output Modules,  , Server Core, Server Programming
2543@section Output Modules
2544
2545@menu
2546* Basic Structure::             The definition of an output module.
2547* Communication Protocol for Output Modules::
2548* How to Write New Output Module::  How to include support for new synthesizers
2549* The Skeleton of an Output Module::
2550* Output Module Functions::
2551* Module Utils Functions and Macros::
2552* Index Marks in Output Modules::
2553@end menu
2554
2555@node Basic Structure, Communication Protocol for Output Modules, Output Modules, Output Modules
2556@subsection Basic Structure
2557
2558Speech Dispatcher output modules are independent applications that,
2559using a simple common communication protocol, read commands from
2560standard input and then output replies on standard output,
2561communicating the requests to the particular software or hardware
2562synthesizer. Everything the output module writes on standard output
2563or reads from standard input should conform to the specifications
2564of the communication protocol. Additionally, standard error output
2565is used for logging of the modules.
2566
2567Output module binaries are usually located in
2568@file{bin/speechd-modules/} and are loaded automatically when Speech
2569Dispatcher starts, according to configuration.  Their standard
2570input/output/error output is redirected to a pipe to Speech Dispatcher
2571and this way both sides can communicate.
2572
2573When the modules start, they are passed the name of a configuration file
2574that should be used for this particular output module.
2575
2576Each output module is started by Speech Dispatcher as:
2577
2578@example
2579my_module "configfile"
2580@end example
2581
2582where @code{configfile} is the full path to the desired configuration
2583file that the output module should parse.
2584
2585@node Communication Protocol for Output Modules, How to Write New Output Module, Basic Structure, Output Modules
2586@subsection Communication Protocol for Output Modules
2587
2588The protocol by which the output modules communicate on standard
2589input/output is based on @ref{Top,,SSIP,ssip, SSIP
2590Documentation}, although it is highly simplified and a little bit
2591modified for the different purpose here. Another difference
2592is that event notification is obligatory in modules communication,
2593while in SSIP, this is an optional feature. This is because Speech
2594Dispatcher has to know all the events happening in the output modules
2595for the purpose of synchronization of various messages.
2596
2597Since it's very similar to SSIP, @ref{Top,,General Rules,ssip, SSIP
2598Documentation}, for a general description of what the protocol looks
2599like. One of the exceptions is that since the output modules
2600communicate on standard input/output, we use only @code{LF} as the
2601line separator.
2602
2603The return values are:
2604@itemize
2605@item 2xx         OK
2606@item 3xx         CLIENT ERROR or BAD SYNTAX or INVALID VALUE
2607@item 4xx         OUTPUT MODULE ERROR or INTERNAL ERROR
2608
2609@item 700         EVENT INDEX MARK
2610@item 701         EVENT BEGIN
2611@item 702         EVENT END
2612@item 703         EVENT STOP
2613@item 704         EVENT PAUSE
2614@end itemize
2615
2616@table @code
2617@item SPEAK
2618Start receiving a text message in the SSML format and synthesize it.
2619After sending a reply to the command, output module waits for the text
2620of the message.  The text can spread over any number of lines and is
2621finished by an end of line marker followed by the line containing the
2622single character @code{.} (dot).  Thus the complete character sequence
2623closing the input text is @code{LF . LF}.  If any line within the sent
2624text contains only a dot, an extra dot should be prepended before it.
2625
2626During reception of the text message, output module doesn't send a
2627response to the particular lines sent.  The response line is sent only
2628immediately after the @code{SPEAK} command and after receiving the
2629closing dot line. This doesn't provide any means of synchronization,
2630instead, event notification is used for this purpose.
2631
2632There is no explicit upper limit on the size of the text.
2633
2634If the @code{SPEAK} command is received while the output module
2635is already speaking, it is considered an error.
2636
2637Example:
2638@example
2639SPEAK
2640202 OK SEND DATA
2641<speak>
2642Hello, GNU!
2643</speak>
2644.
2645200 OK SPEAKING
2646@end example
2647
2648After receiving the full text (or the first part of it), the output
2649module is supposed to start synthesizing it and take care of
2650delivering it to an audio device. When (or just before) the first
2651synthesized samples are delivered to the audio and start playing, the
2652output module must send the @code{BEGIN} event over the communication
2653socket to Speech Dispatcher, @xref{Events notification and index
2654marking}. After the audio stops playing, the event @code{STOP},
2655@code{PAUSE} or @code{END} must be delivered to Speech
2656Dispatcher. Additionally, if supported by the given synthesizer, the
2657output module can issue events associated with the included SSML index
2658marks when they are reached in the audio output.
2659
2660@item CHAR
2661Synthesize a character. If the synthesizer supports a different behavior
2662for the event of ``character'', this should be used.
2663
2664It works like the command @code{SPEAK} above, except that the argument
2665has to be exactly one line long. It contains the UTF-8 form of exactly
2666one character.
2667
2668@item KEY
2669Synthesize a key name. If the synthesizer supports a different behavior
2670for the event of ``key name'', this should be used.
2671
2672It works like the command @code{SPEAK} above, except that the argument
2673has to be exactly one line long. @xref{Top, ,SSIP KEY,ssip, SSIP
2674Documentation}, for the description of the allowed arguments.
2675
2676@item SOUND_ICON
2677Produce a sound icon. According to the configuration of the particular
2678synthesizer, this can produce either a sound (e.g. .wav) or synthesize
2679some text.
2680
2681It works like the command @code{SPEAK} above, except that the argument
2682has to be exactly one line long. It contains the symbolic name of the
2683icon that should be said. @xref{Top,,SSIP SOUND_ICON, ssip, SSIP
2684Documentation}, for more detailed description of the sound icons
2685mechanism.
2686
2687@item STOP
2688Immediately stop speaking on the output device and cancel synthesizing
2689the current message so that the output module is prepared to receive a
2690new message. If there is currently no message being synthesized, it is
2691not considered an error to call @code{STOP} anyway.
2692
2693This command is asynchronous. The output module is not supposed to
2694send any reply (not even error reply).
2695
2696It should return immediately, although stopping the synthesizer may
2697require a little bit more time. The module must issue one of the events
2698@code{STOPPED} or @code{END} when the module is finally
2699stopped. @code{END} is issued when the playing stopped by itself
2700before the module could terminate it or if the architecture of the
2701output module doesn't allow it to decide, otherwise @code{STOPPED}
2702should be used.
2703
2704@example
2705STOP
2706@end example
2707
2708@item PAUSE
2709Stop speaking the current message at a place where we can exactly
2710determine the position (preferably after a @code{__spd_} index mark).  This
2711doesn't have to be immediate and can be delayed even for a few
2712seconds. (Knowing the position exactly is important so that we can
2713later continue the message without gaps or overlapping.) It doesn't do
2714anything else (like storing the message etc.).
2715
2716This command is asynchronous. The output module is not supposed to
2717send any reply (not even error reply).
2718
2719For example:
2720@example
2721PAUSE
2722@end example
2723
2724@item SET
2725Set one of several speech parameters for the future messages.
2726
2727Each of the parameters is written on a single line in the form
2728@example
2729name=value
2730@end example
2731where @code{value} can be either a number or a string, depending upon
2732the name of the parameter.
2733
2734The @code{SET} environment is terminated by a dot on a single line.
2735Thus the complete character sequence closing the input text is
2736@code{LF . LF}
2737
2738During reception of the settings, output module doesn't send any
2739response to the particular lines sent.  The response line is sent only
2740immediately after the @code{SET} command and after receiving the
2741closing dot line.
2742
2743The available parameters that accept numerical values are @code{rate},
2744@code{pitch} and @code{pitch_range}.
2745
2746The available parameters that accept string values are
2747@code{punctuation_mode}, @code{spelling_mode}, @code{cap_let_recogn},
2748@code{voice}, and @code{language}.  The arguments are the same as for the
2749corresponding SSIP commands, except that they are written with small
2750letters. @xref{Top,,Parameter Setting Commands,ssip, SSIP
2751Documentation}.  The conversion between these string values and the
2752corresponding C enum variables can be easily done using
2753@file{src/common/fdsetconv.c}.
2754
2755Not all of these parameters must be set and the value of the string
2756arguments can also be @code{NULL}. If some of the parameters aren't
2757set, the output module should use its default.
2758
2759It's not necessary to set these parameters on the synthesizer right
2760away, instead, it can be postponed until some message to be spoken arrives.
2761
2762Here is an example:
2763@example
2764SET
2765203 OK RECEIVING SETTINGS
2766rate=20
2767pitch=-10
2768pitch_range=50
2769punctuation_mode=all
2770spelling_mode=on
2771punctuation_some=NULL
2772.
2773203 OK SETTINGS RECEIVED
2774@end example
2775
2776@item AUDIO
2777Audio has exactly the same structure as @code{SET}, but is transmitted
2778only once immediatelly after @code{INIT} to transmit the requested audio
2779parameters and tell the output module to open the audio device.
2780
2781@item QUIT
2782Terminates the output module. It should send the response, deallocate
2783all the resources, close all descriptors, terminate all child
2784processes etc. Then the output module should exit itself.
2785
2786@example
2787QUIT
2788210 OK QUIT
2789@end example
2790@end table
2791
2792@subsubsection Events notification and index marking
2793@anchor{Events notification and index marking}
2794
2795Each output module must take care of sending asynchronous
2796notifications whenever the synthesizer (or the module) starts or stops
2797output audio on the speakers. Additionally, whenever possible, the
2798output module should report back to Speech Dispatcher index marks
2799found in the incoming SSML text whenever they are reached while
2800speaking. See SSML specifications for more details about the
2801@code{mark} element
2802
2803Event and index mark notifications are reported by simply writing them
2804to the standard output. An event notification must never get in
2805between synchronous commands (those which require a reply) and their
2806reply. Before Speech Dispatcher sends any new requests (like
2807@code{SET}, @code{SPEAK} etc.) it waits for the previous request to be
2808terminated by the output module by signalling @code{STOP}, @code{END}
2809or @code{PAUSE} index marks. So the only thing the output module must
2810ensure in order to satisfy this requirement is that it doesn't send
2811any index marks until it acknowledges the receival of the new message
2812via @code{200 OK SPEAKING}. It must also ensure that index marks
2813written to the pipe are well ordered -- of course it doesn't make any
2814sense and it is an error to send any index marks after @code{STOP},
2815@code{END} or @code{PAUSE} is sent.
2816
2817
2818@table @code
2819
2820@item BEGIN
2821
2822This event must be issued whenever the module starts to speak the
2823given message. If this is not possible, it can issue it when it
2824starts to synthesize the message or when it receives the message.
2825
2826It is prepended by the code @code{701} and takes the form
2827
2828@example
2829701 BEGIN
2830@end example
2831
2832@item END
2833
2834This event must be issued whenever the module terminates speaking the
2835given message because it reached its end. If this is not possible, it
2836can issue this event when it is ready to receive a new message after
2837speaking the previous message.
2838
2839Each @code{END} must always be preceeded (possibly not directly) by a
2840@code{BEGIN}.
2841
2842It is prepended by the code @code{702} and takes the form
2843
2844@example
2845702 END
2846@end example
2847
2848@item STOP
2849
2850This event should be issued whenever the module terminates speaking
2851the given message without reaching its end (as a consequence of
2852receiving the STOP command or because of some error) not because of
2853a @code{PAUSE} command. When the synthesizer in use doesn't allow
2854the module to decide, the event @code{END} can be used instead.
2855
2856Each @code{STOP} must always be preceeded (possibly not directly) by a
2857@code{BEGIN}.
2858
2859It is prepended by the code @code{703} and takes the form
2860
2861@example
2862703 STOP
2863@end example
2864
2865@item PAUSE
2866
2867This event should be issued whenever the module terminates speaking
2868the given message without reaching its end because of receiving the
2869@code{PAUSE} command.
2870
2871Each @code{PAUSE} must always be preceeded (possibly not directly) by a
2872@code{BEGIN}.
2873
2874It is prepended by the code @code{704} and takes the form
2875
2876@example
2877704 PAUSE
2878@end example
2879
2880@item INDEX MARK
2881
2882This event should be issued by the output module (if supported)
2883whenever an index mark (SSML tag @code{<mark/>}) is passed while speaking
2884a message. It is preceeded by the code @code{700} and takes the form
2885
2886@example
2887700-name
2888700 INDEX MARK
2889@end example
2890
2891where @code{name} is the value of the SSML attribute @code{name} in
2892the tag @code{<mark/>}.
2893
2894@end table
2895
2896@node How to Write New Output Module, The Skeleton of an Output Module, Communication Protocol for Output Modules, Output Modules
2897@subsection How to Write New Output Module
2898
2899If you want to write your own output module, there are basically two
2900ways to do it. Either you can program it all yourself, which is fine
2901as long as you stick to the definition of an output module and its
2902communication protocol, or you can use our @file{module_*.c} tools.
2903If you use these tools, you will only have to write the core functions
2904like module_speak() and module_stop etc. and you will not have to
2905worry about the communication protocol and other formal things that
2906are common for all modules. Here is how you can do it using the
2907provided tools.
2908
2909We will recommend here a basic structure of the code for an output
2910module you should follow, although it's perfectly ok to establish your
2911own if you have reasons to do so, if all the necessary functions and
2912data are defined somewhere in the file. For this purpose, we will use
2913examples from the output module for Flite (Festival Lite), so it's
2914recommended to keep looking at @code{flite.c} for reference.
2915
2916A few rules you should respect:
2917@itemize
2918@item
2919The @file{module_*.c} files should be included at the specified place and
2920in the specified order, because they include directly some pieces of the
2921code and won't work in other places.
2922@item
2923If one or more new threads are used in the output module, they must block all signals.
2924@item
2925On module_close(), all lateral threads and processes should be terminated,
2926all memory freed. Don't assume module_close() is always called before exit()
2927and the sources will be freed automatically.
2928@item
2929We will be happy if all the copyrights are assigned to Brailcom, o.p.s.
2930in order for us to be in a better legal position against possible intruders.
2931@end itemize
2932
2933@node The Skeleton of an Output Module, Output Module Functions, How to Write New Output Module, Output Modules
2934@subsection The Skeleton of an Output Module
2935
2936Each output module should include @file{module_utils.h} where the
2937SPDMsgSettings structure is defined to be able to handle the different
2938speech synthesis settings.  This file also provides tools which help
2939with writing output modules and making the code simpler.
2940
2941@example
2942#include "module_utils.h"
2943@end example
2944
2945If your plugin needs the audio tools (if you take
2946care of the output to the soundcard instead of the synthesizer),
2947you also have to include @file{spd_audio.h}
2948
2949@example
2950#include "spd_audio.h"
2951@end example
2952
2953The definition of macros @code{MODULE_NAME} and @code{MODULE_VERSION}
2954should follow:
2955
2956@example
2957#define MODULE_NAME     "flite"
2958#define MODULE_VERSION  "0.1"
2959@end example
2960
2961If you want to use the @code{DBG(message)} macro from @file{module_utils.c}
2962to print out debugging messages, you should insert these two lines. (Please
2963don't use printf for debugging, this doesn't work with multiple processes!)
2964(You will later have to actually start debugging in @code{module_init()})
2965
2966@example
2967DECLARE_DEBUG();
2968@end example
2969
2970You don't have to define the prototypes of the core functions
2971like module_speak() and module_stop(), these are already
2972defined in @file{module_utils.h}
2973
2974Optionally, if your output module requires some special configuration,
2975apart from defining voices and configuring debugging (they are handled
2976differently, see below), you can declare the requested option
2977here. It will expand into a dotconf callback and declaration of the
2978variable.
2979
2980(You will later have to actually register these options for
2981Speech Dispatcher in @code{module_load()})
2982
2983There are currently 4 types of possible configuration options:
2984
2985@itemize
2986@item @code{MOD_OPTION_1_INT(name);   /* Set up `int name' */}
2987@item @code{MOD_OPTION_1_STR(name);   /* Set up `char* name' */}
2988@item @code{MOD_OPTION_2(name);       /* Set up `char *name[2]' */}
2989@item @code{MOD_OPTION_@{2,3@}_HT(name);  /* Set up a hash table */}
2990@end itemize
2991
2992@xref{Output Modules Configuration}.
2993
2994For example Flite uses 2 options:
2995@example
2996MOD_OPTION_1_INT(FliteMaxChunkLength);
2997MOD_OPTION_1_STR(FliteDelimiters);
2998@end example
2999
3000Every output module is started in 2 phases: @emph{loading} and
3001@emph{initialization}.
3002
3003The goal of loading is to initialize empty structures for storing
3004settings and declare the DotConf callbacks for parsing configuration
3005files. In the second phase, initialization, all the configuration has
3006been read and the output module can accomplish the rest (check if
3007the synthesizer works, set up threads etc.).
3008
3009You should start with the definition of @code{module_load()}.
3010
3011@example
3012int
3013module_load(void)
3014@{
3015@end example
3016
3017Then you should initialize the settings tables. These are defined in
3018@file{module_utils.h} and will be used to store the settings received
3019by the @code{SET} command.
3020@example
3021    INIT_SETTINGS_TABLES();
3022@end example
3023
3024Also, define the configuration callbacks for debugging if you use
3025the @code{DBG()} macro.
3026
3027@example
3028    REGISTER_DEBUG();
3029@end example
3030
3031Now you can finally register the options for the configuration file
3032parsing. Just use these macros:
3033@itemize
3034        @item MOD_OPTION_1_INT_REG(name, default);  /* for integer parameters */
3035        @item MOD_OPTION_1_STR_REG(name, default);  /* for string parameters */
3036        @item MOD_OPTION_MORE_REG(name);   /* for an array of strings */
3037        @item MOD_OPTION_HT_REG(name);     /* for hash tables */
3038@end itemize
3039
3040Again, an example from Flite:
3041@example
3042    MOD_OPTION_1_INT_REG(FliteMaxChunkLength, 300);
3043    MOD_OPTION_1_STR_REG(FliteDelimiters, ".");
3044@end example
3045
3046If you want to enable the mechanism for setting
3047voices through AddVoice, use this function (for
3048an example see @code{generic.c}):
3049
3050Example from Festival:
3051@example
3052    module_register_settings_voices();
3053@end example
3054
3055@xref{Output Modules Configuration}.
3056
3057If everything went correctly, the function should return 0, otherwise -1.
3058
3059@example
3060    return 0;
3061@}
3062@end example
3063
3064The second phase of starting an output module is handled by:
3065
3066@example
3067int
3068module_init(void)
3069@{
3070@end example
3071
3072If you use the DBG() macro, you should init debugging on the start
3073of this function. From that moment on, you can use DBG(). Apart from that,
3074the body of this function is entirely up to you. You should do all the
3075necessary initialization of the particular synthesizer.  All declared
3076configuration variables and configuration hash tables, together with
3077the definition of voices, are filled with their values (either default
3078or read from configuration), so you can use them already.
3079
3080@example
3081   INIT_DEBUG();
3082   DBG("FliteMaxChunkLength = %d\n", FliteMaxChunkLength);
3083   DBG("FliteDelimiters = %s\n", FliteDelimiters);
3084@end example
3085
3086This function should return 0 if the module was initialized
3087successfully, or -1 if some failure was encountered. In this case, you
3088should clean up everything, cancel threads, deallocate memory etc.; no
3089more functions of this output module will be touched (except for other
3090tries to load and initialize the module).
3091
3092Example from Flite:
3093
3094@example
3095    /* Init flite and register a new voice */
3096    flite_init();
3097    flite_voice = register_cmu_us_kal();
3098
3099    if (flite_voice == NULL)@{
3100        DBG("Couldn't register the basic kal voice.\n");
3101        return -1;
3102    @}
3103    [...]
3104@end example
3105
3106The third part is opening the audio. This is commanded
3107by the @code{AUDIO} protocol command. If the synthesizer is able
3108to retrieve audio data, it is desirable to open the @code{spd_audio}
3109output according to the requested parameters and then use this
3110method for audio output. Audio initialization can be done as
3111follows:
3112
3113@example
3114int
3115module_audio_init(char **status_info)@{
3116  DBG("Opening audio");
3117  return module_audio_init_spd(status_info);
3118@}
3119@end example
3120
3121If it is impossible to retrieve audio from the synthesizer and
3122the synthesizer itself is used for playback, than the module must
3123still contain this function, but it should just return 0 and
3124do nothing.
3125
3126Now you have to define all the synthesis control functions
3127@code{module_speak}, @code{module_stop} etc.  See @ref{Output Module
3128Functions}.
3129
3130At the end, this simple include provides the main() function and all
3131the functionality related to being an output module of Speech
3132Dispatcher (parsing argv[] parameters, communicating on stdin/stdout,
3133...). It's recommended to study this file carefully and try to
3134understand what exactly it does, as it will be part of the source code
3135of your output module.
3136
3137@example
3138#include "module_main.c"
3139@end example
3140
3141If it doesn't work, it's most likely not your fault. Complain!  This
3142manual is not complete and the instructions in this sections aren't
3143either. Get in touch with us and together we can figure out what's
3144wrong, fix it and then warn others in this manual.
3145
3146@node Output Module Functions, Module Utils Functions and Macros, The Skeleton of an Output Module, Output Modules
3147@subsection Output Module Functions
3148
3149@deffn {Output Module Functions} int module_speak (char *data, size_t bytes, EMessageType msgtype)
3150@findex module_speak()
3151
3152This is the function where the actual speech output is produced. It is
3153called every time Speech Dispatcher decides to send a message to
3154synthesis. The data of length @var{bytes} are passed in
3155a NULL terminated string @var{data}.  The argument @var{msgtype}
3156defines what type of message it is (different types should be handled
3157differently, if the synthesizer supports it).
3158
3159Each output module should take care of setting the output device to
3160the parameters from msg_settings (defined in module_utils.h) (See
3161SPDMsgSettings in @file{module_utils.h}). However, it is not an error if
3162some of these values are ignored. At least rate, pitch and language
3163should be set correctly.
3164
3165Speed and pitch are values between -100 and 100 included. 0 is the default
3166value that represents normal speech flow. So -100 is the slowest (or lowest)
3167and +100 is the fastest (or highest) speech.
3168
3169The language parameter is given as a null-terminated string containing
3170the name of the language according to RFC 1766 (en, cs, fr, en-US, ...). If the
3171requested language is not supported by this synthesizer, it's ok to abort
3172and return 0, because that's an error in user settings.
3173
3174An easy way to set the parameters is using the UPDATE_PARAMETER() and
3175UPDATE_STRING_PARAMETER() macros. @xref{Module Utils Functions and
3176Macros}.
3177
3178Example from festival:
3179@example
3180    UPDATE_STRING_PARAMETER(language, festival_set_language);
3181    UPDATE_PARAMETER(voice, festival_set_voice);
3182    UPDATE_PARAMETER(rate, festival_set_rate);
3183    UPDATE_PARAMETER(pitch, festival_set_pitch);
3184    UPDATE_PARAMETER(punctuation_mode, festival_set_punctuation_mode);
3185    UPDATE_PARAMETER(cap_let_recogn, festival_set_cap_let_recogn);
3186@end example
3187
3188This function should return 0 if it fails and 1 if the delivery
3189to the synthesizer is successful. It should return immediately,
3190because otherwise, it would block stopping, priority handling
3191and other important things in Speech Dispatcher.
3192
3193If there is a need to stay longer, you should create a separate thread
3194or process. This is for example the case of some software synthesizers
3195which use a blocking function (eg. spd_audio_play) or hardware devices
3196that have to send data to output modules at some particular
3197speed. Note that if you use threads for this purpose, you have to set
3198them to ignore all signals. The simplest way to do this is to call
3199@code{set_speaking_thread_parameters()} which is defined in
3200module_utils.c.  Call it at the beginning of the thread code.
3201@end deffn
3202
3203@deffn {Output module function}  {int module_stop} (void)
3204@findex module_stop()
3205
3206This function should stop the synthesis of the currently spoken message
3207immediately and throw away the rest of the message.
3208
3209This function should return immediately.  Speech Dispatcher will
3210not send another command until module_report_event_stop() is called.
3211Note that you cannot call module_report_event_stop() from within
3212the call to module_stop().  The best thing to do is emit
3213the stop event from another thread.
3214
3215It should return 0 on success, -1 otherwise.
3216@end deffn
3217
3218@deffn {Output module function}  {size_t module_pause} (void)
3219@findex module_pause()
3220
3221This function should stop speaking on the synthesizer (or sending
3222data to soundcard) just after sending an @code{__spd_} index
3223mark so that Speech Dispatcher knows the position of stop.
3224
3225The pause can wait for a short time until
3226an index mark is reached. However, if it's not possible to determine
3227the exact position, this function should have the same effect
3228as @code{module_stop}.
3229
3230This function should return immediately.  Speech Dispatcher will
3231not send another command until module_report_event_pause() is called.
3232Note that you cannot call module_report_event_pause() from within
3233the call to module_pause().  The best thing to do is emit
3234the pause event from another thread.
3235
3236For some software synthesizers, the desired effect can be archieved in this way:
3237When @code{module_speak()} is called, you execute a separate
3238process and pass it the requested message. This process
3239cuts the message into sentences and then runs in a loop
3240and sends the pieces to synthesis. If a signal arrives
3241from @code{module_pause()}, you set a flag and stop the loop
3242at the point where next piece of text would be synthesized.
3243
3244It's not an error if this function is called when the device
3245is not speaking. In this case, it should return 0.
3246
3247Note there is no module_resume() function.  The semantics of
3248@code{module_pause()} is the same as @code{module_stop()} except that
3249your module should stop after reaching a @code{__spd_} index mark.
3250Just like @code{module_stop()}, it should discard the rest of the
3251message after pausing.  On the next @code{module_speak()} call,
3252Speech Dispatcher will resend the rest of the message after the
3253index mark.
3254@end deffn
3255
3256
3257@node Module Utils Functions and Macros, Index Marks in Output Modules, Output Module Functions, Output Modules
3258@subsection Module Utils Functions and Macros
3259
3260This section describes the various variables, functions and macros
3261that are available in the @file{module_utils.h} file. They are
3262intended to make writing new output modules easier and allow the
3263programmer to reuse existing pieces of code instead of writing
3264everything from scratch.
3265
3266@menu
3267* Initialization Macros and Functions::
3268* Generic Macros and Functions::
3269* Functions used by module_main.c::
3270* Functions for use when talking to synthesizer::
3271* Multi-process output modules::
3272* Memory Handling Functions::
3273@end menu
3274
3275@node Initialization Macros and Functions, Generic Macros and Functions, Module Utils Functions and Macros, Module Utils Functions and Macros
3276@subsubsection Initialization Macros and Functions
3277
3278@deffn {Module Utils macro} INIT_SETTINGS_TABLES ()
3279@findex INIT_SETTINGS_TABLES
3280This macro initializes the settings tables where the parameters
3281received with the @code{SET} command are stored. You must call
3282this macro if you want to use the @code{UPDATE_PARAMETER()}
3283and @code{UPDATE_STRING_PARAMETER()} macros.
3284
3285It is intended to be called from inside a function just
3286after the output module starts.
3287@end deffn
3288
3289@subsubsection Debugging Macros
3290@deffn {Module Utils macro} DBG (format, ...)
3291@findex DBG
3292DBG() outputs a debugging message, if the @code{Debug} option in module's
3293configuration is set, to the file specified in configuration ad
3294@code{DebugFile}. The parameter syntax is the same as for the printf()
3295function. In fact, it calls printf() internally.
3296@end deffn
3297
3298@deffn {Module Utils macro} FATAL (text)
3299@findex FATAL
3300Outputs a message specified as @code{text} and calls exit() with
3301the value EXIT_FAILURE. This terminates the whole output module
3302without trying to kill the child processes or freeing other
3303resources other than those that will be freed by the system.
3304
3305It is intended to be used after some severe error has occurred.
3306@end deffn
3307
3308@node Generic Macros and Functions, Functions used by module_main.c, Initialization Macros and Functions, Module Utils Functions and Macros
3309@subsubsection Generic Macros and Functions
3310
3311@deffn {Module Utils macro} UPDATE_PARAMETER (param, setter)
3312@findex UPDATE_PARAMETER
3313Tests if the integer or enum parameter specified in @code{param}
3314(e.g. rate, pitch, cap_let_recogn, ...) changed since the
3315last time when the @code{setter} function was called.
3316
3317If it changed, it calls the function @code{setter} with the
3318new value. (The new value is stored in the msg_settings
3319structure that is created by module_utils.h, which
3320you normally don't have to care about.)
3321
3322The function @code{setter} should be defined as:
3323@example
3324void setter_name(type value);
3325@end example
3326
3327Please look at the @code{SET} command in the communication protocol
3328for the list of all available parameters.
3329@pxref{Communication Protocol for Output Modules}.
3330
3331An example from Festival output module:
3332@verbatim
3333static void
3334festival_set_rate(signed int rate)
3335{
3336    assert(rate >= -100 && rate <= +100);
3337    festivalSetRate(festival_info, rate);
3338}
3339[...]
3340int
3341module_speak(char *data, size_t bytes, EMessageType msgtype)
3342{
3343    [...]
3344    UPDATE_PARAMETER(rate, festival_set_rate);
3345    UPDATE_PARAMETER(pitch, festival_set_pitch);
3346    [...]
3347}
3348@end verbatim
3349@end deffn
3350
3351@deffn {Module Utils macro} UPDATE_STRING_PARAMETER (param, setter)
3352@findex  UPDATE_STRING_PARAMETER
3353The same as @code{UPDATE_PARAMETER} except that it works for
3354parameters with a string value.
3355@end deffn
3356
3357@node Functions used by module_main.c, Functions for use when talking to synthesizer, Generic Macros and Functions, Module Utils Functions and Macros
3358@subsubsection Functions used by @file{module_main.c}
3359
3360@deffn {Module Utils function} char* do_speak(void)
3361@findex do_speak
3362Takes care of communication after the @code{SPEAK} command was
3363received. Calls @code{module_speak()} when the full text is received.
3364
3365It returns a response according to the communication protocol.
3366@end deffn
3367
3368@deffn {Module Utils function} char* do_stop(void)
3369@findex do_stop
3370Calls the @code{module_stop()} function of the particular
3371output module.
3372
3373It returns a response according to the communication protocol.
3374@end deffn
3375
3376@deffn {Module Utils function} char* do_pause(void)
3377@findex do_pause
3378Calls the @code{module_pause()} function of the particular
3379output module.
3380
3381It returns a response according to the communication protocol
3382and the value returned by @code{module_pause()}.
3383@end deffn
3384
3385@deffn {Module Utils function} char* do_set()
3386@findex do_set
3387Takes care of communication after the @code{SET} command was
3388received. Doesn't call any particular function of the output module,
3389only sets the values in the settings tables. (You should then call the
3390@code{UPDATE_PARAMETER()} macro in module_speak() to actually set the
3391synthesizer to these values.)
3392
3393It returns a response according to the communication protocol.
3394@end deffn
3395
3396@deffn {Module Utils function} char* do_speaking()
3397@findex do_speaking
3398Calls the @code{module_speaking()} function.
3399
3400It returns a response according to the communication protocol
3401and the value returned by @code{module_speaking()}.
3402@end deffn
3403
3404@deffn {Module Utils function} void do_quit()
3405@findex do_quit
3406Prints the farewell message to the standard output, according
3407to the protocol. Then it calls @code{module_close()}.
3408@end deffn
3409
3410@node Functions for use when talking to synthesizer, Multi-process output modules, Functions used by module_main.c, Module Utils Functions and Macros
3411@subsubsection Functions for use when talking to synthesizer
3412
3413@deffn {Module Utils function} static int module_get_message_part ( const char* message, char* part, unsigned int *pos, size_t maxlen, const char* dividers)
3414@findex  module_get_message_part
3415
3416Gets a part of the @code{message} according to the specified @code{dividers}.
3417
3418It scans the text in @code{message} from the byte specified by
3419@code{*pos} and looks for one of the characters specified in
3420@code{dividers} followed by a whitespace character or the
3421terminating NULL byte. If one of them is encountered, the read text is
3422stored in @code{part} and the number of bytes read is
3423returned. If end of @code{message} is reached, the return value is
3424-1.
3425
3426@code{message} is the text to process. It must be a NULL-terminated
3427uni-byte string.
3428
3429@code{part} is a pointer to the place where the output text should
3430be stored. It must contain at least @code{maxlen} bytes of space.
3431
3432@code{maxlen} is the maximum number of bytes that should be written
3433to @code{part}.
3434
3435@code{dividers} is a NULL-terminated uni-byte string containing
3436the punctuation characters where the message should be divided
3437into smaller parts (if they are followed by whitespace).
3438
3439After returning, @code{pos} is the position
3440where the function terminated in processing @code{message}.
3441@end deffn
3442
3443@deffn {Output module function} void module_report_index_mark(char *mark)
3444@findex module_report_index_mark
3445@end deffn
3446@deffn {Output module function} void module_report_event_*()
3447@findex module_report_event_*
3448
3449The @code{module_report_} functions serve for reporting event
3450notifications and index marking events. You should use them whenever
3451you get an event from the synthesizer which is defined in the output
3452module communication protocol.
3453
3454Note that you cannot call these functions from within a call
3455to module_speak(), module_stop(), or module_pause().  The best
3456way to do this is to emit the events from another thread.
3457
3458@end deffn
3459
3460@deffn {Output module function} int module_close(void)
3461@findex module_close
3462
3463This function is called when Speech Dispatcher terminates.  The output
3464module should terminate all threads and processes, free all resources,
3465close all sockets etc.  Never assume this function is called only when
3466Speech Dispatcher terminates and exit(0) will do the work for you.  It's
3467perfectly ok for Speech Dispatcher to load, unload or reload output modules
3468in the middle of its run.
3469
3470@end deffn
3471
3472@node Multi-process output modules, Memory Handling Functions, Functions for use when talking to synthesizer, Module Utils Functions and Macros
3473@subsubsection Multi-process output modules
3474
3475@deffn {Module Utils function} size_t module_parent_wfork ( TModuleDoublePipe dpipe,
3476const char* message, SPDMessageType msgtype, const size_t maxlen,
3477const char* dividers, int *pause_requested)
3478@findex module_parent_wfork
3479
3480It simply sends the data to the
3481child in smaller pieces and waits for confirmation with a single
3482@code{C} character on the pipe from child to parent.
3483
3484@code{dpipe} is a parameter which contains the information
3485necessary for communicating through pipes between the parent and the
3486child and vice-versa.
3487
3488@example
3489typedef struct@{
3490    int pc[2];            /* Parent to child pipe */
3491    int cp[2];            /* Child to parent pipe */
3492@}TModuleDoublePipe;
3493@end example
3494
3495@code{message} is a pointer to a NULL-terminated string containing the message
3496for synthesis.
3497
3498@code{msgtype} is  the type of the message for synthesis.
3499
3500@code{maxlen} is the maximum number of bytes that should be transfered
3501over the pipe.
3502
3503@code{dividers} is a NULL-terminated string containing the punctuation characters
3504at which this function should divide the message into smaller pieces.
3505
3506@code{pause_requested} is a pointer to an integer flag, which is either 0 if
3507no pause request is pending, or 1 if the function should terminate
3508at a convenient place in the message because a pause is requested.
3509
3510In the beginning, it initializes the pipes and then it enters a simple cycle:
3511@enumerate
3512@item
3513Reads a part of the message or an index mark using
3514@code{module_get_message_part()}.
3515@item
3516Looks if there isn't a pending request for pause and handles
3517it.
3518@item
3519Sends the current part of the message to the child
3520using @code{module_parent_dp_write()}.
3521@item
3522Waits until a single character @code{C} comes from the other pipe
3523using @code{module_parent_dp_read()}.
3524@item
3525Repeats the cycle or terminates, if there is no more data.
3526@end enumerate
3527@end deffn
3528
3529@deffn {Module Utils function} int module_parent_wait_continue(TModuleDoublePipe dpipe)
3530@findex module_parent_wait_continue
3531Waits until the character @code{C} (continue) is read from the pipe from child.
3532This function is intended to be run from the parent.
3533
3534@code{dpipe} is the double pipe used for communication between the child and parent.
3535
3536Returns 0 if the character was read or 1 if the pipe was broken before the
3537character could be read.
3538@end deffn
3539
3540@deffn {Module Utils function} void module_parent_dp_init (TModuleDoublePipe dpipe)
3541@findex module_parent_dp_init
3542Initializes pipes (dpipe) in the parent. Currently it only closes the unnecessary ends.
3543@end deffn
3544
3545@deffn {Module Utils function} void module_child_dp_close (TModuleDoublePipe dpipe)
3546@findex module_child_dp_init
3547Initializes pipes (dpipe) in the child. Currently it only closes the unnecessary ends.
3548@end deffn
3549
3550@deffn {Module Utils function} void module_child_dp_write(TModuleDoublePipe dpipe,  const char *msg, size_t bytes)
3551@findex module_child_dp_write
3552Writes the specified number of @code{bytes} from @code{msg} to the pipe to the
3553parent. This function is intended, as the prefix says, to be run from the child.
3554Uses the pipes defined in @code{dpipe}.
3555@end deffn
3556
3557@deffn {Module Utils function} void module_parent_dp_write(TModuleDoublePipe dpipe,  const char *msg, size_t bytes)
3558@findex module_parent_dp_write
3559Writes the specified number of @code{bytes} from @code{msg} into the pipe to the
3560child. This function is intended, as the prefix says, to be run from the parent.
3561Uses the pipes defined in @code{dpipe}.
3562@end deffn
3563
3564@deffn {Module Utils function} int module_child_dp_read(TModuleDoublePipe dpipe  char *msg, size_t maxlen)
3565@findex module_child_dp_read
3566Reads up to @code{maxlen} bytes from the pipe from parent into the buffer @code{msg}.
3567This function is intended, as the prefix says, to be run from the child.
3568Uses the pipes defined in @code{dpipe}.
3569@end deffn
3570
3571@deffn {Module Utils function} int module_parent_dp_read(TModuleDoublePipe dpipe,  char *msg, size_t maxlen)
3572@findex module_parent_dp_read
3573Reads up to @code{maxlen} bytes from the pipe from child into the buffer @code{msg}.
3574This function is intended, as the prefix says, to be run from the parent.
3575Uses the pipes defined in @code{dpipe}.
3576@end deffn
3577
3578@deffn {Module Utils function} void module_sigblockall(void)
3579@findex module_sigblockall
3580Blocks all signals. This is intended to be run from the child processes
3581and threads so that their signal handling won't interfere with the
3582parent.
3583@end deffn
3584
3585@deffn {Module Utils function} void module_sigunblockusr(sigset_t *some_signals)
3586@findex module_sigunblockusr
3587Use the set @code{some_signals} to unblock SIGUSR1.
3588@end deffn
3589
3590@deffn {Module Utils function} void module_sigblockusr(sigset_t *some_signals)
3591@findex module_sigblockusr
3592Use the set @code{some_signals} to block SIGUSR1.
3593@end deffn
3594
3595@node Memory Handling Functions,  , Multi-process output modules, Module Utils Functions and Macros
3596@subsubsection Memory Handling Functions
3597
3598@deffn {Module Utils function} static void* xmalloc (size_t size)
3599@findex xmalloc
3600The same as the classical @code{malloc()} except that it executes
3601@code{FATAL(``Not enough memory'')} on error.
3602@end deffn
3603
3604@deffn {Module Utils function} static void* xrealloc (void *data, size_t size)
3605@findex xrealloc
3606The same as the classical @code{realloc()} except that it also accepts
3607@code{NULL} as @code{data}. In this case, it behaves as @code{xmalloc}.
3608@end deffn
3609
3610@deffn {Module Utils function} void xfree(void *data)
3611@findex xfree
3612The same as the classical @code{free()} except that it checks
3613if data isn't NULL before calling @code{free()}.
3614@end deffn
3615
3616@node Index Marks in Output Modules,  , Module Utils Functions and Macros, Output Modules
3617@subsection Index Marks in Output Modules
3618
3619Output modules need to provide some kind of synchronization and they have to
3620give Speech Dispatcher back some information about what part of the message
3621is currently being said. On the other hand, output modules are not able to tell
3622the exact position in the text because various conversions and message processing take place
3623(sometimes punctuation and spelling substitution, the message needs to be
3624recoded from multibyte to unibyte coding etc.) before the text reaches
3625the synthesizer.
3626
3627For this reason, Speech Dispatcher places so-called index marks in
3628the text it sends to its output modules. They have the form:
3629
3630@example
3631<mark name="id"/>
3632@end example
3633
3634@code{id} is the identifier associated with each index
3635mark. Within a @code{module_speak()} message, each identifer is unique.
3636It consists of the string @code{__spd_} and a counter number.  Numbers
3637begin from zero for each message.  For example, the fourth index mark
3638within a message looks like
3639
3640@example
3641<mark name="__spd_id_3"/>
3642@end example
3643
3644When an index mark is reached, its identifier should be stored
3645so that the output module is able to tell Speech Dispatcher the identifier
3646of the last index mark. Also, index marks are the best place to stop
3647when the module is requested to pause (although it's ok to stop at
3648some place close by and report the last index mark).
3649
3650Notice that index marks are in SSML format using the @code{mark} tag.
3651
3652@node Download and Contact, Reporting Bugs, Server Programming, Top
3653@chapter Download
3654
3655You can download Speech Dispatcher's latest release source code from
3656@uref{http://www.freebsoft.org/speechd}. There is also information
3657on how to set up anonymous access to our git repository.
3658
3659However, you may prefer to download Speech Dispatcher in a binary
3660package for your system. We don't distribute such packages ourselves.
3661If you run Debian GNU/Linux, it should be in the central repository
3662under the name @code{speech-dispatcher} or @code{speechd}. If you run
3663an rpm-based distribution like RedHat, Mandrake or SuSE Linux, please
3664try to look at @uref{http://www.rpmfind.net/}.
3665
3666If you want to contact us, please look at
3667@uref{http://www.freebsoft.org/contact}
3668or use the email @email{users@@lists.freebsoft.org}.
3669
3670@node Reporting Bugs, How You Can Help, Download and Contact, Top
3671@chapter Reporting Bugs
3672
3673If you believe you found a bug in Speech Dispatcher, we will be very
3674grateful if you let us know about it. Please do it by email on the
3675address @email{speechd@@bugs.freebsoft.org}, but please don't send us
3676messages larger than half a megabyte unless we ask you.
3677
3678To report a bug in a way that is useful for the developers is not
3679as easy as it may seem. Here are some hints that you should follow in
3680order to give us the best information so that we can find and fix
3681the bug easily.
3682
3683First of all, please try to describe the problem as exactly as you
3684can. We prefer raw data over speculations about where the problem may
3685lie. Please try to explain in what situation the bug happens. Even
3686if it's a general bug that happens in many situations, please try to
3687describe at least one case in as much detail, as possible.
3688
3689Also, please specify the versions of programs that you use when
3690the bug happens. This is not only Speech Dispatcher, but also
3691the client application you use (speechd-el, say, etc.) and
3692the synthesizer name and version.
3693
3694If you can reproduce the bug, please send us the log file also.  This
3695is very useful, because otherwise, we may not be able to reproduce the
3696bug with our configuration and program versions that differ from
3697yours. Configuration must be set to logging priority at least 4, but
3698best 5, so that it's useful for debugging purposes. You can do so in
3699@file{etc/speech-dispatcher/speechd.conf} by modifying the variable
3700@code{LogLevel}. Also, you may want to modify the log destination with
3701variable @code{LogFile}. After modifying these options, please restart
3702Speech Dispatcher and repeat the situation in which the bug
3703happens. After it happened, please take the log and attach it to the
3704bug report, preferably compressed using @code{gzip}. But note, that
3705when logging with level 5, all the data that come from Speech Dispatcher
3706is also recorded, so make sure there is no sensitive information
3707when you are reproducing the bug. Please make sure you switch back
3708to priority 3 or lower logging, because priority 4 or 5 produces
3709really huge logs.
3710
3711If you are a programmer and you find a bug that is reproducible in
3712SSIP, you can send us the sequence of SSIP commands that lead to the
3713bug (preferably from starting the connection). You can also try to
3714reproduce the bug in a simple test-script under
3715@file{speech-dispatcher/src/tests} in the source tree. Please check
3716@file{speech-dispatcher/src/tests/README} and see the other tests
3717scripts there for an example.
3718
3719When the bug is a SEGMENTATION FAULT, a backtrace from gdb is also
3720valuable, but if you are not familiar with gdb, don't bother with
3721that, we may ask you to do it later.
3722
3723Finally, you may also send us a guess of what you think
3724happens in Speech Dispatcher that causes the bug, but this is
3725usually not very helpful. If you are able to provide additional technical
3726information instead, please do so.
3727
3728@node How You Can Help, Appendices, Reporting Bugs, Top
3729@chapter How You Can Help
3730
3731If you want to contribute to the development of Speech Dispatcher,
3732we will be very happy if you do so. Please contact us on
3733@email{users@@lists.freebsoft.org}.
3734
3735Here is a short, definitively not exhaustive, list of how you can
3736help us and other users.
3737
3738@itemize
3739@item
3740@emph{Donate money:} We are a non-profit organization and we can't work without
3741funding. Brailcom, o.p.s. created Speech Dispatcher, speechd-el and also works
3742on other projects to help blind and visually impaired users of computers. We build
3743on Free Software and GNU/Linux, because we believe this is the right way. But it
3744won't be possible when we have no money. @uref{http://www.freebsoft.org/}
3745
3746@item
3747@emph{Report bugs:} Every user, even if he can't give us money and he is not
3748a programmer, can help us very much by just using our software and telling
3749us about the bugs and inconveniences he encounters. A good user community that
3750reports bugs is a crucial part of development of a good Free Software package.
3751We can't test our software under all circumstances and on all platforms, so each
3752constructive bug report is highly appreciated. You can report bugs in Speech
3753Dispatcher on @email{speechd@@bugs.freebsoft.org}.
3754
3755@item
3756@emph{Write or modify an application to support synthesis:} With
3757Speech Dispatcher, we have provided an interface that allows
3758applications easy access to speech synthesis. However powerful, it's
3759no more than an interface, and it's useless on its own. Now it's time
3760to write the particular client applications, or modify existing
3761applications so that they can support speech synthesis.  It is useful
3762if the application needs a specific interface for blind people or if
3763it wants to use speech synthesis for educational or other purposes.
3764
3765@item
3766@emph{Develop new voices and language definitions for Festival:} In
3767the world of Free Software, currently Festival is the most promising
3768interface for Text-to-Speech processing and speech synthesis. It's
3769an extensible and highly configurable platform for developing synthetic
3770voices. If there is a lack of synthetic voices or no voices at all for
3771some language, we believe the wisest solution is to try to develop
3772a voice in Festival. It's certainly not advisable to develop your
3773own synthesizer if the goal is producing a quality voice system
3774in a reasonable time. Festival developers provide nice documentation
3775about how to develop a voice and a lot of tools that help doing
3776this. We found that some language definitions can be constructed
3777by canibalizing the already existing definitions and can be tuned
3778later. As for the voice samples, one can temporarily use the
3779MBROLA project voices. But please note that, although they are
3780downloadable for free (as price), they are not Free Software
3781and it would be wonderful if we could replace them by Free Software
3782alternatives as soon as possible.
3783See @uref{http://www.cstr.ed.ac.uk/projects/festival/}.
3784
3785@item
3786@emph{Help us with this or other Free-b-Soft projects:} Please look at
3787@uref{http://www.freebsoft.org} to find information about our
3788projects. There is a plenty of work to be done for the blind and
3789visually impaired people to make their work with computers easier.
3790
3791@item
3792@emph{Spread the word about Speech Dispatcher and Free Software:} You can
3793help us, and the whole community around Free Software, just by telling
3794your friends about the amazing world of Free Software. It doesn't
3795have to be just about Speech Dispatcher; you can tell them about
3796other projects or about Free Software in general. Remember that
3797Speech Dispatcher could only arise out of understanding of some people
3798of the principles and ideas behind Free Software. And this is mostly
3799the same for the rest of the Free Software world.
3800See @uref{http://www.gnu.org/} for more information about GNU/Linux
3801and Free Software.
3802
3803@end itemize
3804
3805@node Appendices, GNU General Public License, How You Can Help, Top
3806@appendix Appendices
3807
3808@node GNU General Public License, GNU Free Documentation License, Appendices, Top
3809@appendix GNU General Public License
3810@center Version 2, June 1991
3811@cindex GPL, GNU General Public License
3812
3813@include gpl.texi
3814
3815@node GNU Free Documentation License, Index of Concepts, GNU General Public License, Top
3816@appendix GNU Free Documentation License
3817@center Version 1.2, November 2002
3818@cindex FDL, GNU Free Documentation License
3819
3820@include fdl.texi
3821
3822@node Index of Concepts,  , GNU Free Documentation License, Top
3823@unnumbered Index of Concepts
3824
3825@cindex tail recursion
3826@printindex cp
3827
3828@bye
3829
3830@c  LocalWords:  texinfo setfilename speechd settitle finalout syncodeindex pg
3831@c  LocalWords:  setchapternewpage cp fn vr texi dircategory direntry titlepage
3832@c  LocalWords:  Cerha Hynek Hanke vskip pt filll insertcopying ifnottex dir fd
3833@c  LocalWords:  API SSIP cindex printf ISA pindex Flite Odmluva FreeTTS TTS CR
3834@c  LocalWords:  ViaVoice Lite Tcl Zandt wxWindows AWT spd dfn backend findex
3835@c  LocalWords:  src struct gchar gint const OutputModule intl FDSetElement len
3836@c  LocalWords:  fdset init flite deffn TFDSetElement var int enum EVoiceType
3837@c  LocalWords:  sayf ifinfo verbatiminclude ref UTF ccc ddd pxref LF cs conf
3838@c  LocalWords:  su AddModule DefaultModule xref identd printindex Dectalk GTK
3839
3840@c speechd.texi ends here
3841@c  LocalWords:  emph soundcard precission archieved succes Dispatcher When
3842