\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename speech-dispatcher.info
@settitle Speech Dispatcher
@finalout
@c @setchapternewpage odd
@c %**end of header

@syncodeindex pg cp
@syncodeindex fn cp
@syncodeindex vr cp

@include version.texi

@dircategory Sound
@dircategory Development

@direntry
* Speech Dispatcher: (speech-dispatcher). Speech Dispatcher.
@end direntry

@titlepage
@title Speech Dispatcher
@subtitle Mastering the Babylon of TTS'
@subtitle for Speech Dispatcher @value{VERSION}
@author Tom@'a@v{s} Cerha <@email{cerha@@brailcom.org}>
@author Hynek Hanke <@email{hanke@@volny.cz}>
@author Milan Zamazal <@email{pdm@@brailcom.org}>

@page
@vskip 0pt plus 1filll

This manual documents Speech Dispatcher, version @value{VERSION}.

Copyright @copyright{} 2001, 2002, 2003, 2006, 2007, 2008 Brailcom, o.p.s.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts and no Back-Cover Texts.
A copy of the license is included in the section entitled ``GNU Free
Documentation License.''
@end quotation

You can also (at your option) distribute this manual under the GNU
General Public License:

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

A copy of the license is included in the section entitled ``GNU
General Public License.''
@end quotation

@end titlepage

@ifnottex
@node Top, Introduction, (dir), (dir)

This manual documents Speech Dispatcher, version @value{VERSION}.

Copyright @copyright{} 2001, 2002, 2003, 2006 Brailcom, o.p.s.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts and no Back-Cover Texts.
A copy of the license is included in the section entitled ``GNU Free
Documentation License.''
@end quotation

You can also (at your option) distribute this manual under the GNU
General Public License:

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

A copy of the license is included in the section entitled ``GNU
General Public License.''
@end quotation

@end ifnottex

@ifhtml
@heading Menu
@end ifhtml

@menu
* Introduction:: What is Speech Dispatcher.
* User's Documentation:: Usage, Configuration...
* Technical Specifications::
* Client Programming:: Documentation for application developers.
* Server Programming:: Documentation for project contributors.

* Download and Contact:: How to get Speech Dispatcher and how to contact us
* Reporting Bugs:: How to report a bug
* How You Can Help:: What is needed

* Appendices::
* GNU General Public License:: Copying conditions for Speech Dispatcher
* GNU Free Documentation License:: Copying conditions for this manual

* Index of Concepts::
@end menu

@node Introduction, User's Documentation, Top, Top
@chapter Introduction

@menu
* Motivation:: Why Speech Dispatcher?
* Basic Design:: How does it work?
* Features Overview:: What are the assets?
* Current State:: What is done?
@end menu

@node Motivation, Basic Design, Introduction, Introduction
@section Motivation
@cindex Basic ideas, Motivation
@cindex Philosophy

Speech Dispatcher is a device independent layer for speech synthesis
that provides a common, easy to use interface for both client
applications (programs that want to speak) and software synthesizers
(programs actually able to convert text to speech).

High quality speech synthesis is now commonly available, both as
proprietary and Free Software solutions.  It has a wide field of
possible uses, from educational software to specialized systems,
e.g. in hospitals or laboratories.  It is also a key compensation
tool for visually impaired users.  For them, it is one of the two
possible ways of getting output from a computer (the other being a
Braille display).

The various speech synthesizers are quite different, both in their
interfaces and capabilities.  A general common interface is therefore
needed so that client application programmers have an easy way to use
software speech synthesis and don't have to care about the peculiar
details of the various synthesizers.

The absence of such a common and standardized interface, and thus the
difficulty for programmers to use software speech synthesis, has been
a major reason why the potential of speech synthesis technology is
still not fully exploited.

Ideally, there would be little distinction for applications between
outputting messages on the screen and via speech.  Speech Dispatcher
can be compared to what a GUI toolkit is for the graphical interface.
Not only does it provide an easy to use interface and some kind of
theming and configuration mechanism, it also takes care of some of
the issues inherent to this particular mode of output, such as the
need for speech message serialization and interaction with the audio
subsystem.

@node Basic Design, Features Overview, Motivation, Introduction
@section Design
@cindex Design

@heading Current Design
The communication between all applications and synthesizers, when
implemented directly, is a mess.  For this reason, we wanted Speech
Dispatcher to be a layer separating applications and synthesizers, so
that applications wouldn't have to care about synthesizers and
synthesizers wouldn't have to care about interaction with
applications.

We decided we would implement Speech Dispatcher as a server receiving
commands from applications over a protocol called @code{SSIP},
parsing them if needed, and calling the appropriate functions of
output modules communicating with the different synthesizers.  These
output modules are implemented as plug-ins, so that the user can just
load a new module if he wants to use a new synthesizer.

Each client (an application that wants to speak) opens a socket
connection to Speech Dispatcher and calls functions like say(),
stop(), and pause() provided by a library implementing the protocol.
This shared library is still on the client side and sends Speech
Dispatcher SSIP commands over the socket.  When the messages arrive
at Speech Dispatcher, it parses them, reads the text that should be
said and puts it in one of several queues according to the priority
of the message and other criteria.  It then decides when, with which
parameters (set up by the client and the user), and on which
synthesizer it will say the message.  These requests are handled by
the output plug-ins (output modules) for the different hardware and
software synthesizers and then said aloud.

@image{figures/architecture,155mm,,Speech Dispatcher architecture}

See also the detailed description of the @ref{Client Programming}
interfaces and the @ref{Server Programming} documentation.
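
As an illustration of the protocol, a short SSIP session might look
as follows (client commands alternate with server replies; this is a
sketch, and the exact reply wordings may differ between versions):

@example
SET self CLIENT_NAME "joe:my_app:main"
208 OK CLIENT NAME SET
SPEAK
230 OK RECEIVING DATA
Hello, world!
.
225 OK MESSAGE QUEUED
@end example

The client libraries hide this exchange behind ordinary function
calls.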

@heading Future Design

Speech Dispatcher currently mixes two important features: a common
low-level interface to multiple speech synthesizers and message
management (including priorities and history).  This became even more
evident when we started thinking about handling messages intended for
output on braille devices.  Such messages of course need to be
synchronized with speech messages, and there is little reason why the
accessibility tools should send the same message twice for these two
different kinds of output used by blind people (often simultaneously).
Outside the world of accessibility, applications also want to either
have full control over the sound (bypassing prioritization) or to only
retrieve the synthesized data, but not play them immediately.

We want to eventually split Speech Dispatcher into two independent
components: one providing a low-level interface to speech synthesis
drivers, which we now call the TTS API Provider and which is already
largely implemented in the Free(b)Soft project, and a second one
doing message management, called Message Dispatcher.  This will allow
Message Dispatcher to also output on Braille as well as to use the
TTS API Provider separately.

From the implementation point of view, the opportunity for a new
design based on our previous experience allowed us to remove several
bottlenecks for speed (responsiveness), ease of use and ease of
implementation of extensions (particularly output modules for new
synthesizers).  From the architecture point of view and the
possibilities for new developments, we are entirely convinced that
both the new design in general and the inner design of the new
components are much better.

While a good API and its implementation for Braille already exist in
the form of BrlAPI, the API for speech is now under development.
Please see another architecture diagram showing how we imagine
Message Dispatcher in the future.

@image{figures/architecture-future,155mm,,Speech Dispatcher architecture}

References:
@uref{http://www.freebsoft.org/tts-api/}
@uref{http://www.freebsoft.org/tts-api-provider/}

@node Features Overview, Current State, Basic Design, Introduction
@section Features Overview

Speech Dispatcher from the user's point of view:

@itemize @bullet
@item the ability to freely combine applications with your favorite synthesizer
@item message synchronization and coordination
@item less time devoted to the configuration of applications
@end itemize

Speech Dispatcher from the application programmer's point of view:

@itemize @bullet
@item an easy way to make your applications speak
@item a common interface to different synthesizers
@item higher-level synchronization of messages (priorities)
@item no need to take care of the configuration of voice(s)
@end itemize

@node Current State, , Features Overview, Introduction
@section Current State
@cindex Synthesizers
@cindex Other programs

In this version, most of the features of Speech Dispatcher are
implemented and we believe it is now useful for applications as a
device independent Text-to-Speech layer and an accessibility message
coordination layer.

Currently, one of the most advanced applications that work with
Speech Dispatcher is @code{speechd-el}.  This is a client for Emacs,
targeted primarily at blind people.  It is similar to Emacspeak;
however, the two take somewhat different approaches and serve
different user needs.  You can find speechd-el at
@uref{http://www.freebsoft.org/speechd-el/}.  speechd-el provides
speech output when using nearly any GNU/Linux text interface, e.g. for
editing text, reading email, browsing the web, etc.

Orca, the primary screen reader for the GNOME Desktop, supports
Speech Dispatcher directly since its version 2.19.0.  See
@uref{http://live.gnome.org/Orca/SpeechDispatcher} for more
information.

We also provide a shared C library as well as Python, Java, Guile and
Common Lisp libraries that wrap the SSIP functions of Speech
Dispatcher in higher-level interfaces.  Writing client applications
in these languages should be quite easy.

On the synthesis side, there is good support for Festival, eSpeak,
Flite, Cicero, IBM TTS, MBROLA, Epos, Dectalk software, Cepstral
Swift and others.  @xref{Supported Modules}.

We decided not to interface the simple hardware speech devices, as
they don't support synchronization and therefore cause serious
problems when handling multiple messages.  Also, they are not
extensible, they are usually expensive and often hard to support.
Today's computers are fast enough to perform software speech
synthesis; Festival is a great example.

@node User's Documentation, Technical Specifications, Introduction, Top
@chapter User's Documentation

@menu
* Installation:: How to get it installed in the best way.
* Running:: The different ways to start it.
* Troubleshooting:: What to do if something doesn't work...
* Configuration:: How to configure Speech Dispatcher.
* Tools:: What tools come with Speech Dispatcher.
* Synthesis Output Modules:: Drivers for different synthesizers.
* Security:: Security mechanisms and restrictions.
@end menu

@node Installation, Running, User's Documentation, User's Documentation
@section Installation

This part only deals with the general aspects of installing Speech
Dispatcher.  If you are compiling from source code (a distribution
tarball or git), please refer to the file @file{INSTALL} in your
source tree.

@subsection The requirements

You will need these components to run Speech Dispatcher:
@itemize
@item glib 2.0 (@uref{http://www.gtk.org})
@item libdotconf 1.3 (@uref{http://github.com/williamh/dotconf})
@item pthreads
@end itemize

We recommend also installing these packages:
@itemize
@item Festival (@uref{http://www.cstr.ed.ac.uk/projects/festival/})
@item festival-freebsoft-utils 0.3+ (@uref{http://www.freebsoft.org/festival-freebsoft-utils})
@item Sound icons library @* (@uref{http://www.freebsoft.org/pub/projects/sound-icons/sound-icons-0.1.tar.gz})
@end itemize

@subsection Recommended installation procedure

@itemize

@item Install your software synthesizer

Although we highly recommend using Festival for its excellent
extensibility, good quality voices, good responsiveness and best
support in Speech Dispatcher, you might want to start with eSpeak, a
lightweight, multi-lingual, feature-complete synthesizer, to get all
the key components working, and perhaps only then switch to Festival.
The installation of eSpeak should be easier, and for this reason the
default configuration of Speech Dispatcher is set up for eSpeak.

You can of course also start with Epos or any other supported
synthesizer.

@item Make sure your synthesizer works

There is usually a way to test if the installation of your speech
synthesizer works.  For eSpeak, run @code{espeak "test"}; for Flite,
run @code{flite -t "Hello!"} and listen for the speech.  For
Festival, run @code{festival} and type in

@example
(SayText "Hello!")
(quit)
@end example

@item Install Speech Dispatcher

Install the packages for Speech Dispatcher from your distribution or
download the source tarball (or git) from
@url{http://www.freebsoft.org/speechd} and follow the instructions in
the file @code{INSTALL} in the source tree.

@item Configure Speech Dispatcher

You can skip this step in most cases.  If, however, you want to set
up your own configuration of Speech Dispatcher's default values, the
easiest way to do so is through the @code{spd-conf} configuration
script.  It will guide you through the basic configuration.  It will
also subsequently perform some diagnostic tests and offer some
limited help with troubleshooting.  Just execute

@example
spd-conf
@end example

as an ordinary user or as a system user like 'speech-dispatcher',
depending on whether you want to set up Speech Dispatcher as a user
or a system service, respectively.  You might also want to explore
the offered options or run some of its subsystems manually; type
@code{spd-conf -h} for help.

If you do not want to use this script, if it doesn't work in your
case, or if it doesn't provide enough configuration flexibility,
please continue as described below and/or in @ref{Running Under
Ordinary Users}.

@item Test Speech Dispatcher

The simplest way to test Speech Dispatcher is through
@code{spd-conf -d} or through the @code{spd-say} tool.

Example:
@example
spd-conf -d
spd-say "Hello!"
spd-say -l cs -r 90 "Ahoj"
@end example

If you don't hear anything, please see @ref{Troubleshooting}.

@end itemize

@subsection How to use eSpeak NG or eSpeak with MBROLA

Please follow the guidelines at
@url{https://github.com/espeak-ng/espeak-ng/blob/master/docs/mbrola.md}
(resp. @url{http://espeak.sourceforge.net/mbrola.html})
for installing eSpeak NG (resp. eSpeak) with the set of MBROLA voices
that you want to use.

Check the @file{modules/espeak-ng-mbrola-generic.conf}
(resp. @file{modules/espeak-mbrola-generic.conf}) configuration file
for the @code{AddVoice} lines.  If a line for one of the voices you
have installed (and supported by your version of eSpeak NG
(resp. eSpeak), e.g.
@code{ls /usr/share/espeak-ng-data/voices/mb/mb-*}
(resp. @code{ls /usr/share/espeak-data/voices/mb/mb-*}))
is not contained here, please add it.  Check that
@code{GenericExecuteString} contains the correct name of your mbrola
binary and the correct path to its voice database.

Restart speech-dispatcher and, in your client, select
@code{espeak-ng-mbrola-generic} (resp. @code{espeak-mbrola-generic})
as your output module, or test it with the following command:

@example
spd-say -o espeak-ng-mbrola-generic -l cs Testing
@end example

(resp.
@example
spd-say -o espeak-mbrola-generic -l cs Testing
@end example
)

@node Running, Troubleshooting, Installation, User's Documentation
@section Running

Speech Dispatcher is normally executed on a per-user basis.  This
provides more flexibility in user configuration and access rights,
and is essential in any environment where multiple people use the
computer at the same time.  It used to be possible to run Speech
Dispatcher as a system service under a special user (and it still is,
with some limitations), but this mode of execution is strongly
discouraged.

@menu
* Running Under Ordinary Users::
* Running in a Custom Setup::
* Setting Communication Method::
@end menu

@node Running Under Ordinary Users, Running in a Custom Setup, Running, Running
@subsection Running Under Ordinary Users

No special provisions need to be made to run Speech Dispatcher under
the current user.  The Speech Dispatcher process will use (or create)
a @file{~/.cache/speech-dispatcher/} directory for its purposes
(logging, pidfile).

Optionally, a user can place his own configuration file in
@file{~/.config/speech-dispatcher/speechd.conf} and it will be
automatically loaded by Speech Dispatcher.  The preferred way to do
so is via the @code{spd-conf} configuration command.
If this user configuration file is not found, Speech Dispatcher will
simply use the system-wide configuration file (e.g. in
@file{/etc/speech-dispatcher/speechd.conf}).

@example
# speech-dispatcher
# spd-say test
@end example

@node Running in a Custom Setup, Setting Communication Method, Running Under Ordinary Users, Running
@subsection Running in a Custom Setup

Speech Dispatcher can also be run in any other setup of executing
users, port numbers and system paths.  The paths to the
configuration, pidfile and logfiles can be specified separately via
compilation flags, configuration file options or command line
options, in this ascending order of priority.

This way can also be used to start Speech Dispatcher as a system-wide
service from @file{/etc/init.d/}, although this approach is now
discouraged.

@node Setting Communication Method, , Running in a Custom Setup, Running
@subsection Setting Communication Method

Currently, two different methods are supported for communication
between the server and its clients.

For local communication, it is preferred to use @emph{Unix sockets},
where the communication takes place over a Unix socket whose driving
file is located by default in the user's runtime directory as
@code{XDG_RUNTIME_DIR/speech-dispatcher/speechd.sock}.  This way,
there can be no conflict between different user sessions using
different Speech Dispatchers on the same system.  By default,
permissions are set in such a way that only the user who started the
server can access it, and communication is hidden from all other
users.

The other supported mechanism is @emph{Inet sockets}.  The server
will then run on a given port, which can be made accessible either
locally or to other machines on the network as well.  This is very
useful in a network setup.
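
For instance, the server could be started on an Inet socket and a
client pointed at it like this (a hypothetical session; the @code{-p}
port option is assumed to be available in your installation):

@example
speech-dispatcher -c inet_socket -p 6560
SPEECHD_ADDRESS=inet_socket:127.0.0.1:6560 spd-say "test"
@end example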
Be aware, however, that when using Inet sockets, both parties (server
and clients) must first agree on the communication port number to
use, which can create a lot of confusion in a setup where multiple
instances of the server serve multiple different users.  Also, since
there is currently no authentication mechanism for Inet socket
communication, the server will make no distinction between the
different users connecting to it.  The default port is 6560, as set
in the server configuration.

Client applications respect the @emph{SPEECHD_ADDRESS} environment
variable.  The method ('@code{unix_socket}' or '@code{inet_socket}')
is optionally followed by its parameters, separated by colons.  For
an exact description, see @ref{Address specification}.

An example of launching Speech Dispatcher using unix_socket
communication on a non-standard destination and subsequently using
spd-say to speak a message:

@example
killall -u `whoami` speech-dispatcher
speech-dispatcher -c unix_socket -S /tmp/my.sock
SPEECHD_ADDRESS=unix_socket:/tmp/my.sock spd-say "test"
@end example

@node Troubleshooting, Configuration, Running, User's Documentation
@section Troubleshooting

If you are experiencing problems when running Speech Dispatcher,
please:

@itemize

@item
Use @code{spd-conf} to run diagnostics:

@example
spd-conf -d
@end example

@item
Check the appropriate logfile, in
@file{~/.cache/speech-dispatcher/log/speech-dispatcher.log} for a
user Speech Dispatcher or in
@file{/var/log/speech-dispatcher/speech-dispatcher.log}.  Look for
lines containing the string 'ERROR' and their surrounding contents.
If you hear no speech, restart Speech Dispatcher and look near the
end of the log file -- before any attempts to synthesize any message.
Usually, if something goes wrong with the initialization of the
output modules, a textual description of the problem and a suggested
solution can be found in the log file.

@item
If this doesn't reveal the problem, please run
@example
spd-conf -D
@end example
which will generate a very detailed logfile archive which you can
examine yourself or send to us with a request for help.

@item
You can also try to say some message directly through the utility
@code{spd-say}.

Example:
@example
spd-say "Hello, does it work?"
spd-say --language=cs --rate=20 "Everything ok?"
@end example

@item
Check your configuration files (speechd.conf, modules/*.conf) for
errors (an uninstalled synthesizer specified as the default, wrong
values for default voice parameters, etc.).

@item
There is a known problem in some versions of Festival.  Please make
sure that the Festival @code{server_access_list} configuration
variable and your @file{/etc/hosts.conf} are set properly.
@code{server_access_list} must contain the symbolic name of your
machine, and this name must be defined in @file{/etc/hosts.conf} and
point to your IP address.  You can test whether this is set correctly
by trying to connect to the port the Festival server is running on
via ordinary telnet (by default like this: @code{telnet localhost
1314}).  If you are not rejected, it works.

@end itemize

@node Configuration, Tools, Troubleshooting, User's Documentation
@section Configuration
@cindex configuration
@cindex default values

Speech Dispatcher can be configured on several different levels.  You
can configure the global settings through the server configuration
file, which can be placed either in the Speech Dispatcher default
system configuration path, like @file{/etc/speech-dispatcher/}, or in
your home directory in @file{~/.config/speech-dispatcher/}.
There is also support for per-client configuration, that is,
specifying different default values for different client
applications.

Furthermore, applications often come with their own means of
configuring speech related settings.  Please see the documentation of
your application for details about application specific
configuration.

@menu
* Configuration file syntax:: Basic rules.
* Configuration options:: What to configure.
* Audio Output Configuration:: How to switch to ALSA, Pulse...
* Client Specific Configuration:: Specific default values for applications.
* Output Modules Configuration:: Adding and customizing output modules.
* Log Levels:: Description of log levels.
@end menu

@node Configuration file syntax, Configuration options, Configuration, Configuration
@subsection Configuration file syntax

We use the DotConf library to read a permanent text-file based
configuration, so the syntax might be familiar to many users.

Each of the string constants, unless stated otherwise, should be
encoded in UTF-8.  The option names use only the standard ASCII
charset restricted to upper- and lowercase characters (@code{a},
@code{b}), dashes (@code{-}) and underscores (@code{_}).

Comments and temporarily inactive options begin with @code{#}.  If
such an option should be turned on, just remove the comment character
and set it to the desired value.
@example
# this is a comment
# InactiveOption "this option is turned off"
@end example

Strings are enclosed in double quotes.
@example
LogFile "/var/log/speech-dispatcher.log"
@end example

Numbers are written without any quotes.
@example
Port 6560
@end example

Boolean values use On (for true) and Off (for false).
@example
Debug Off
@end example

@node Configuration options, Audio Output Configuration, Configuration file syntax, Configuration
@subsection Configuration options

All available options are documented directly in the file and
examples are provided.  Most of the options are set to their default
value and commented out.  If you want to change them, just change the
value and remove the comment symbol @code{#}.

@node Audio Output Configuration, Client Specific Configuration, Configuration options, Configuration
@subsection Audio Output Configuration

The audio output method (ALSA, Pulse etc.) can be configured
centrally from the main configuration file @code{speechd.conf}.  The
option @code{AudioOutputMethod} selects the desired audio method, and
further options labeled @code{AudioALSA...} or @code{AudioPulse...}
provide a more detailed configuration of the given audio output
method.

It is possible to use a list of preferred audio output methods, in
which case each output module attempts to use the first one available
in the given order.

The example below prefers Pulse Audio, but will use ALSA if unable to
connect to Pulse:
@example
AudioOutputMethod "pulse,alsa"
@end example

Please note, however, that some simpler output modules or
synthesizers, like the generic output module, do not respect these
settings and use their own means of audio output, which can't be
influenced this way.  On the other hand, the fallback dummy output
module tries to use any available means of audio output to deliver
its error message.

@node Client Specific Configuration, Output Modules Configuration, Audio Output Configuration, Configuration
@subsection Client Specific Configuration

It is possible to automatically set different default values of
speech parameters (e.g. rate, volume, punctuation, default voice...)
for different applications that connect to Speech Dispatcher.  This
is especially useful for simple applications that have no parameter
setting capabilities themselves or that don't support a parameter
setting you wish to change (e.g. language).

Using the commands @code{BeginClient "IDENTIFICATION"} and
@code{EndClient}, it is possible to open and close a section of
parameter settings that only affects those client applications that
identify themselves to Speech Dispatcher under a specific
identification code which is matched against the string
@code{IDENTIFICATION}.  It is possible to use wildcards ('*' matches
any number of characters and '?' matches exactly one character) in
the string @code{IDENTIFICATION}.

The identification code normally consists of 3 parts:
@code{user:application:connection}.  @code{user} is the username of
the one who started the application, @code{application} is the name
of the application (usually the name of its binary) and
@code{connection} is a name for the connection (one application might
use more connections for different purposes).

An example is provided in @code{/etc/speech-dispatcher/speechd.conf}
(see the line @code{Include "clients/emacs.conf"} and
@code{/etc/speech-dispatcher/clients/emacs.conf}).

@node Output Modules Configuration, Log Levels, Client Specific Configuration, Configuration
@subsection Output Modules Configuration

Each user should turn on at least one output module in his
configuration if he wants Speech Dispatcher to produce any sound
output.  If no output module is loaded, Speech Dispatcher will start,
log messages into the history and communicate with clients, but no
sound will be produced.

Each output module has an ``AddModule'' line in
@file{speech-dispatcher/speechd.conf}.  Additionally, each output
module can have its own configuration file.
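
For instance, a @code{speechd.conf} might load an eSpeak and a
Festival module like this (the module and binary names below are
illustrative; check the @code{AddModule} lines shipped with your
installation for the exact names):

@example
AddModule "espeak" "sd_espeak" "espeak.conf"
AddModule "festival" "sd_festival" "festival.conf"
@end example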

The audio output is handled by the output modules themselves, so this
can be switched in their own configuration files under
@code{etc/speech-dispatcher/modules/}.

@menu
* Loading Modules in speechd.conf::
* Configuration files of output modules::
* Configuration of the Generic Output Module::
@end menu

@node Loading Modules in speechd.conf, Configuration files of output modules, Output Modules Configuration, Output Modules Configuration
@subsubsection Loading Modules in speechd.conf

@anchor{AddModule}
Each module that should be run when Speech Dispatcher starts must be
loaded by the @code{AddModule} command in the configuration.  Note
that you can load one binary module multiple times under different
names with different configurations.  This is especially useful for
loading the generic output module.  @xref{Configuration of the
Generic Output Module}.

@example
AddModule "@var{module_name}" "@var{module_binary}" "@var{module_config}"
@end example

@var{module_name} is the name of the output module.

@var{module_binary} is the name of the binary executable of this
output module.  It can be either absolute or relative to
@file{bin/speechd-modules/}.

@var{module_config} is the file where the configuration for this
output module is stored.  It can be either absolute or relative to
@file{etc/speech-dispatcher/modules/}.  This parameter is optional.

@node Configuration files of output modules, Configuration of the Generic Output Module, Loading Modules in speechd.conf, Output Modules Configuration
@subsubsection Configuration Files of Output Modules

Each output module is different and therefore has different
configuration options.  Please look at the comments in its
configuration file for a detailed description.  However, there are
several options which are common to some output modules.  Here is a
short overview of them.
@itemize
@item AddVoice "@var{language}" "@var{symbolicname}" "@var{name}"
@anchor{AddVoice}

Each output module provides some voices and sometimes it even supports
different languages. For this reason, there is a common mechanism for
specifying these in the configuration, although no module is
obligated to use it. Some synthesizers, e.g. Festival, support the
SSIP symbolic names directly, so the particular configuration of these
voices is done in the synthesizer itself.

For each voice, there is exactly one @code{AddVoice} line.

@var{language} is the ISO language code of the language of this voice,
possibly with a region qualification.

@var{symbolicname} is a symbolic name under which you wish this voice
to be available. See @ref{Top,,Standard Voices, ssip, SSIP
Documentation} for the list of names you can use.

@var{name} is a name specific to the given output module. Please see
the comments in the configuration file under the appropriate AddModule
section for more info.

For example, our current definition of voices for Epos (in the file
@code{/etc/speech-dispatcher/modules/generic-epos.conf}) looks like
this:

@example
 AddVoice "cs" "male1" "kadlec"
 AddVoice "sk" "male1" "bob"
@end example

@item ModuleDelimiters "@var{delimiters}", ModuleMaxChunkLength @var{length}

Normally, the output module doesn't try to synthesize all incoming
text at once; instead, it cuts it into smaller chunks (sentences,
parts of sentences) and then synthesizes them one by one. This second
approach, used by some output modules, is much faster; however, it
limits the ability of the output module to provide good intonation.

NOTE: The Festival module does not use ModuleDelimiters and
ModuleMaxChunkLength.
For this reason, you can configure at which characters
(@var{delimiters}) the text should be cut into smaller blocks,
or after how many characters (@var{length}) it should be cut
if no @var{delimiter} is found.

By making the two rules stricter, you will get better speed
but give away some quality of intonation. So, for example,
for slower computers we recommend including the comma (,)
in @var{delimiters} so that sentences are cut into phrases,
while for faster computers it's preferable
not to include the comma and to synthesize the whole compound
sentence.

The same applies to @code{MaxChunkLength}; it's better
to set higher values for faster computers.

For example, currently the default for Flite is

@example
 FliteMaxChunkLength 500
 FliteDelimiters ".?!;"
@end example

The output module may also decide to cut sentences on delimiters
only if they are followed by a space. This way, for example,
``file123.tmp'' would not be cut in two parts, but ``The horse
raced around the fence, that was lately painted green, fell.''
would be. (This is an interesting sentence, by the way.)
@end itemize

@node Configuration of the Generic Output Module, , Configuration files of output modules, Output Modules Configuration
@subsubsection Configuration of the Generic Output Module

The generic output module allows you to easily write your
own output module for synthesizers that have a simple
command line interface by modifying the configuration
file. This way, users can add support for their device even if they don't
know how to program. @xref{AddModule}.

The core part of a generic output module is the command
execution line.

@defvr {Generic Module Configuration} GenericExecuteSynth "@var{execution_string}"

@code{execution_string} is the command that should be executed
in a shell when it's desired to say something.
In fact, it can
be multiple commands concatenated by the @code{&&} operator. To stop
saying the message, the output module will send a KILL signal to
the process group, so it's important that speaking immediately
stops after the processes are killed. (On most GNU/Linux
systems, the @code{play} utility has this property.)

In the execution string, you can use the following variables,
which will be substituted by the desired values before executing
the command.

@itemize
@item @code{$DATA}
The text data that should be said. Characters that would interfere
with shell processing are already escaped. However, it may be necessary to put
double quotes around it (like this: @code{\"$DATA\"}).
@item @code{$LANG}
The language identification string (defined by GenericLanguage).
@item @code{$VOICE}
The voice identification string (defined by AddVoice).
@item @code{$PITCH}
The desired pitch (a float number defined by GenericPitchAdd and GenericPitchMultiply).
@item @code{$PITCH_RANGE}
The desired pitch range (a float number defined by GenericPitchRangeAdd and GenericPitchRangeMultiply).
@item @code{$RATE}
The desired rate or speed (a float number defined by GenericRateAdd and GenericRateMultiply).
@end itemize

Here is an example from @file{etc/speech-dispatcher/modules/epos-generic.conf}:
@example
GenericExecuteSynth \
"epos-say -o --language $LANG --voice $VOICE --init_f $PITCH --init_t $RATE \
\"$DATA\" | sed -e s+unknown.*$++ >/tmp/epos-said.wav && play /tmp/epos-said.wav >/dev/null"
@end example
@end defvr

@defvr {Generic Module Configuration} AddVoice "@var{language}" "@var{symbolicname}" "@var{name}"
@xref{AddVoice}.
@end defvr

@defvr {Generic Module Configuration} GenericLanguage "iso-code" "string-subst"

Defines which string @code{string-subst} should be substituted for @code{$LANG}
given an @code{iso-code} language code.
Another example from Epos generic:
@example
GenericLanguage "en-US" "english-US"
GenericLanguage "cs" "czech"
GenericLanguage "sk" "slovak"
@end example
@end defvr

@defvr {Generic Module Configuration} GenericRateAdd @var{num}
@end defvr
@defvr {Generic Module Configuration} GenericRateMultiply @var{num}
@end defvr
@defvr {Generic Module Configuration} GenericPitchAdd @var{num}
@end defvr
@defvr {Generic Module Configuration} GenericPitchMultiply @var{num}
@end defvr
@defvr {Generic Module Configuration} GenericPitchRangeAdd @var{num}
@end defvr
@defvr {Generic Module Configuration} GenericPitchRangeMultiply @var{num}
These parameters set the rate, pitch and pitch range conversions used
to compute the values of @code{$RATE}, @code{$PITCH} and
@code{$PITCH_RANGE}.

The resulting rate (or pitch) is calculated using the following formula:
@example
 (speechd_rate * GenericRateMultiply) + GenericRateAdd
@end example
where speechd_rate is a value between -100 (lowest) and +100 (highest).
Some meaningful conversion for the specific text-to-speech system
used must be defined.

(The values in the Generic...Multiply options are multiplied by 100 because
DotConf currently doesn't support floats. So you can write 0.85 as 85 and
so on.)
@end defvr

@node Log Levels, , Output Modules Configuration, Configuration
@subsection Log Levels

There are 6 different verbosity levels of Speech Dispatcher logging.
0 means no logging, while 5 means that nearly all the information
about Speech Dispatcher's operation is logged.

@itemize @bullet

@item Level 0
@itemize @bullet
@item No information.
@end itemize

@item Level 1
@itemize @bullet
@item Information about loading and exiting.
@end itemize

@item Level 2
@itemize @bullet
@item Information about errors that occurred.
@item Allocating and freeing resources on start and exit.
@end itemize

@item Level 3
@itemize @bullet
@item Information about accepting/rejecting/closing clients' connections.
@item Information about invalid client commands.
@end itemize

@item Level 4
@itemize @bullet
@item Every received command is output.
@item Information preceding the command output.
@item Information about queueing/allocating messages.
@item Information about the history, sound icons and other
facilities.
@item Information about the work of the speak() thread.
@end itemize

@item Level 5
(This is only for debugging purposes and will output *a lot*
of data. Use with caution.)
@itemize @bullet
@item Received data (messages etc.) is output.
@item Debugging information.
@end itemize
@end itemize

@node Tools, Synthesis Output Modules, Configuration, User's Documentation
@section Tools

Several small tools are distributed together with Speech Dispatcher.
@code{spd-say} is a small client that allows you to send messages to
Speech Dispatcher in an easy way and have them spoken, or to cancel
speech from other applications.

@menu
* spd-say:: Say a given text or cancel messages in Dispatcher.
* spd-conf:: Configuration, diagnostics and troubleshooting tool.
* spd-send:: Direct SSIP communication from the command line.
@end menu

@node spd-say, spd-conf, Tools, Tools
@subsection spd-say

spd-say is documented in its own manual. @xref{Top,,,spd-say, Spd-say
Documentation}.

@node spd-conf, spd-send, spd-say, Tools
@subsection spd-conf

spd-conf is a tool for creating a basic configuration, the initial
setup of some basic settings (output module, audio method),
diagnostics and automated debugging, with the possibility to send the
debugging output to the developers with a request for help.

The available command options are self-documented through
@code{spd-conf -h}.
In any working mode, the tool asks the user about
future actions and the preferred configuration of the basic options.

The most useful ways of execution are:
@itemize @bullet
@item @code{spd-conf}
Create a new configuration and set up basic settings according to the
user's answers. Run diagnostics and, if problems occur, run debugging
and offer to send a request for help to the developers.

@item @code{spd-conf -d}
Run diagnostics of problems.

@item @code{spd-conf -D}
Run debugging and offer to send a request for help to the developers.

@end itemize

@node spd-send, , spd-conf, Tools
@subsection spd-send

spd-send is a small client/server application that allows you to
establish a connection to Speech Dispatcher and then use a simple
command line tool to send and receive SSIP protocol communication.

Please see @file{src/c/clients/spd-send/README} in the Speech
Dispatcher source tree for more information.

@node Synthesis Output Modules, Security, Tools, User's Documentation
@section Synthesis Output Modules
@cindex output module
@cindex different synthesizers

Speech Dispatcher supports the concurrent use of multiple output modules. If
the output modules provide good synchronization, you can combine them when
reading messages. For example, if module1 can speak English and Czech while
module2 speaks only German, the idea is that if there is some message in
German, module2 is used, while module1 is used for the other languages.
However, the language is not the only criterion for the decision. The rules
for the selection of an output module can be influenced through the
configuration file @file{speech-dispatcher/speechd.conf}.

@menu
* Provided Functionality:: Some synthesizers don't support the full set of SSIP features.
@end menu

@node Provided Functionality, , Synthesis Output Modules, Synthesis Output Modules
@subsection Provided Functionality

Please note that some output modules don't support the full Speech
Dispatcher functionality (e.g. spelling mode, sound icons). If there
is no easy way around the missing functionality, we don't try to
emulate it in some complicated way, but rather try to encourage the
developers of that particular synthesizer to add that
functionality. We are actively working on adding the missing parts to
Festival, so Festival supports nearly all of the features of Speech
Dispatcher and we encourage you to use it. Much progress has also been
made with eSpeak.

@menu
* Supported Modules::
@end menu

@node Supported Modules, , Provided Functionality, Provided Functionality
@subsubsection Supported Modules

@itemize @bullet

@item Festival
Festival is a free software multi-language Text-to-Speech
synthesis system that is very flexible and extensible using the
Scheme scripting language. Currently, it supports high quality
synthesis for several languages, and on today's computers it runs
reasonably fast. If you are not sure which one to use and your
language is supported by Festival, we advise you to use it. See
@uref{http://www.cstr.ed.ac.uk/projects/festival/}.

@item eSpeak
eSpeak is a newer, very lightweight free software engine with a broad
range of supported languages and a good quality of voice at high
rates. See @uref{http://espeak.sourceforge.net/}.

@item Flite
Flite (Festival Light) is a lightweight free software TTS synthesizer
intended to run on systems with limited resources. At this time, it
has only one English voice and porting voices from Festival looks
rather difficult.
With the caching mechanism provided by Speech
Dispatcher, Festival is faster than Flite in most situations. See
@uref{http://www.speech.cs.cmu.edu/flite/}.

@item Generic
The Generic module can be used with any synthesizer that can be
managed by a simple command line application. @xref{Configuration of
the Generic Output Module}, for more details about how to use it.
However, it provides only very rudimentary support for speaking.

@item Pico
The SVOX Pico engine is a software speech synthesizer for German, English (GB
and US), Spanish, French and Italian.
SVOX produces clear and distinct speech output made possible by the use of
Hidden Markov Model (HMM) algorithms.
See @uref{http://git.debian.org/?p=collab-maint/svox.git}.
Pico documentation can be found at
@uref{http://android.git.kernel.org/?p=platform/external/svox.git;a=tree;f=pico_resources/docs}.
It includes three manuals:
@itemize @minus
@item SVOX_Pico_Lingware.pdf
@item SVOX_Pico_Manual.pdf
@item SVOX_Pico_architecture_and_design.pdf
@end itemize

@end itemize

@node Security, , Synthesis Output Modules, User's Documentation
@section Security

Speech Dispatcher doesn't implement any special authentication
mechanisms, but uses the standard system mechanisms to regulate access.

If the default `unix_socket' communication mechanism is used, only the
user who starts the server can connect to it, due to the restrictions
imposed on the unix socket file permissions.

In case of the `inet_socket' communication mechanism, where clients
connect to Speech Dispatcher on a specified port, theoretically
everyone could connect to it. By default, access is restricted to
connections originating on the same machine; this can be changed via
the LocalhostAccessOnly option in the server configuration file.
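For instance, to deliberately open the port to other machines, the
relevant part of @file{speechd.conf} might look like this. This is a
sketch only; the option names are taken from this section and from the
defaults mentioned elsewhere in this manual.

```
# Use TCP (inet) sockets instead of the default unix socket
CommunicationMethod "inet_socket"
Port 6560
# Set to 0 to accept connections from other machines (the default, 1,
# restricts access to localhost)
LocalhostAccessOnly 0
```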
In such a case, the user is responsible for setting appropriate
security restrictions on the access to the given port on his machine
from the outside network, using a firewall or a similar mechanism.

@node Technical Specifications, Client Programming, User's Documentation, Top
@chapter Technical Specifications


@menu
* Communication mechanisms::
* Address specification::
* Actions performed on startup::
* Accepted signals::
@end menu

@node Communication mechanisms, Address specification, Technical Specifications, Technical Specifications
@section Communication mechanisms

Speech Dispatcher supports two communication mechanisms: UNIX-style
and Inet sockets, referred to as 'unix-socket' and 'inet-socket'
respectively. The communication mechanism is decided on startup and
cannot be changed at runtime. Unix sockets are now the default and
preferred variant for local communication; Inet sockets are necessary
for communication over a network.

The method to use is decided as follows, in this order of precedence:
command-line option, configuration option, the default value
'unix-socket'.

@emph{Unix sockets} are associated with a file in the filesystem. By
default, this file is placed in the user's runtime directory (as
determined by the value of the XDG_RUNTIME_DIR environment variable and the
system configuration for the given user). Its default name is
constructed as @code{XDG_RUNTIME_DIR/speech-dispatcher/speechd.sock}. The access
permissions for this file are set to 600, so that it's restricted to
read/write by the current user.

As such, access is handled properly and there are no conflicts between
the different instances of Speech Dispatcher run by different users.
Client applications and libraries are supposed to independently
replicate the process of construction of the socket path and connect
to it, thus establishing a common communication channel in the default
setup.

It should, however, be possible in the client libraries, and is
possible in the server, to define a custom file as a socket name if
needed. Client libraries should respect the @var{SPEECHD_ADDRESS}
environment variable.

@emph{Inet sockets} are based on communication over a given port on
a given host, two variables which must be agreed upon between
the server and client before a connection can be established. The only
implicit security restriction is the server configuration option which
can allow or disallow access from machines other than localhost.

By convention, clients should use the host and port given by one of
the following sources, in the following order of precedence: their own
configuration, the value of the @var{SPEECHD_ADDRESS} environment
variable, and the default pair (localhost, 6560).

@xref{Setting Communication Method}.

@node Address specification, Actions performed on startup, Communication mechanisms, Technical Specifications
@section Address specification

Speech Dispatcher provides several methods of communication and can be
used both locally and over a network. @xref{Communication
mechanisms}. Client applications and interface libraries need to
recognize an address, which specifies how and where to contact the
appropriate server.

An address specification consists of the method and one or more of its
parameters, each item separated by a colon:

@example
method:parameter1:parameter2
@end example

The method is either 'unix_socket' or 'inet_socket'. Parameters are
optional. If not used in the address line, their default values will
be used.
Two forms are currently recognized:

@example
unix_socket:full/path/to/socket
inet_socket:host_ip:port
@end example

Examples of valid address lines are:
@example
unix_socket
unix_socket:/tmp/test.sock
inet_socket
inet_socket:192.168.0.34
inet_socket:192.168.0.34:6563
@end example

Clients implement different mechanisms for how the user can set the
address. Clients should respect the @var{SPEECHD_ADDRESS} environment
variable (@pxref{Setting Communication Method}), unless the user
overrides its value by settings in the client application
itself. Clients should fall back to the default address if neither the
environment variable nor their specific configuration is set.

The default communication address currently is:

@example
unix_socket:$XDG_RUNTIME_DIR/speech-dispatcher/speechd.sock
@end example

where @code{$XDG_RUNTIME_DIR} stands for the user's runtime directory.

@node Actions performed on startup, Accepted signals, Address specification, Technical Specifications
@section Actions performed on startup

What follows is an overview of the actions the server takes on startup,
in this order:

@itemize @bullet

@item Initialize logging stage 1

Set the log level to 1 and the log destination to stderr (the logfile
is not ready yet).

@item Parse command line options

Read the preferred communication method and the destinations for the
logfile and pidfile.

@item Establish the @file{~/.config/speech-dispatcher/} and
@file{~/.cache/speech-dispatcher/} directories

If the pid and conf paths were not given as command line options, the
server will place them in @file{~/.config/speech-dispatcher/} and
@file{~/.cache/speech-dispatcher/} by default. If they
are not specified AND the current user doesn't have a system home directory,
the server will fail on startup.
The configuration file is looked for at @file{~/.config/speech-dispatcher/speechd.conf}
if it exists, otherwise at @file{/etc/speech-dispatcher/speechd.conf} or a similar
system location according to compile options. One of these files must
exist, otherwise Speech Dispatcher will not know where to find its output
modules.

@item Create pid file

Check the pid file in the determined location. If an instance of the
server is already running, log an error message and exit with error
code 1, otherwise create and lock a new pid file.

@item Check whether autospawning is enabled

If the server is started with --spawn, check whether autospawn is not
disabled in the configuration (DisableAutoSpawn config option in
speechd.conf). If it is disabled, log an error message and exit with
error code 1.

@item Install signal handlers

@item Create unix or inet sockets and start listening

@item Initialize Speech Dispatcher

Read the configuration files, set up some lateral threads, start and
initialize the output modules. Reinitialize logging (stage 2) into the
final logfile destination (as determined by the command line option,
the configuration option and the default location, in this order of
precedence).

After this step, Speech Dispatcher is ready to accept new connections.

@item Daemonize the process

Fork the process, disconnect from standard input and outputs,
disconnect from the parent process etc., as prescribed by the POSIX
standards.

@item Initialize the speaking lateral thread

Initialize the second main thread, which will process the speech
requests from the queues and pass them on to the Speech Dispatcher
modules.


@item Start accepting new connections from clients

Start listening for new connections from clients and processing them
in a loop.
@end itemize

@node Accepted signals, , Actions performed on startup, Technical Specifications
@section Accepted signals

@itemize @bullet

@item SIGINT

Terminate the server.

@item SIGHUP

Reload the configuration from the config files, but do not restart the
modules.

@item SIGUSR1

Reload dead output modules (modules which were previously working but
crashed during runtime and were marked as dead).

@item SIGPIPE

Ignored.

@end itemize

@node Client Programming, Server Programming, Technical Specifications, Top
@chapter Client Programming

Clients communicate with Speech Dispatcher via the Speech Synthesis
Internet Protocol (SSIP). @xref{Top, , , ssip, Speech Synthesis
Internet Protocol documentation}. The protocol is the actual
interface to Speech Dispatcher.

Usually you don't need to use SSIP directly. You can use one of the supplied
libraries, which wrap the SSIP interface. This is the
recommended way of communicating with Speech Dispatcher. We try to support as
many programming environments as possible. This manual (except for SSIP)
contains documentation for the C and Python libraries; however, there are also
other libraries developed as external projects. Please contact us for
information about current external client libraries.

@menu
* C API:: Shared library for C/C++.
* Python API:: Python module.
* Guile API::
* Common Lisp API::
* Autospawning:: How the server is started from clients.
@end menu

@node C API, Python API, Client Programming, Client Programming
@section C API

@menu
* Initializing and Terminating in C::
* Speech Synthesis Commands in C::
* Speech output control commands in C::
* Characters and Keys in C::
* Sound Icons in C::
* Parameter Setting Commands in C::
* Other Functions in C::
* Information Retrieval Commands in C::
* Event Notification and Index Marking in C::
* History Commands in C::
* Direct SSIP Communication in C::
@end menu

@node Initializing and Terminating in C, Speech Synthesis Commands in C, C API, C API
@subsection Initializing and Terminating

@deffn {C API function} SPDConnection* spd_open(char* client_name, char* connection_name, char* user_name, SPDConnectionMode connection_mode)
@findex spd_open()

Opens a new connection to Speech Dispatcher and returns a connection
structure you will use to communicate with Speech Dispatcher. This
structure is a parameter of all the other functions. By default, it
uses local communication via unix sockets.
See @code{spd_open2} for more details.

The three parameters @code{client_name}, @code{connection_name} and
@code{user_name} are there only for informational and navigational
purposes; they don't affect any settings or the behavior of any
functions. The authentication mechanism has nothing to do with
@code{user_name}. These parameters are important for the user when he
wants to set some parameters for a given session, when he wants to
browse through the history, etc. The parameter @code{connection_mode}
specifies how this connection should be handled internally and whether
event notification and index marking capabilities will be available.
@code{client_name} is the name of the client that opens the connection. Normally,
it should be the name of the executable, for example ``lynx'', ``emacs'', ``bash'',
or ``gcc''. It can be left as NULL.

@code{connection_name} determines the particular use of that connection. If you
use only one connection in your program, this should be set to ``main'' (passing
a NULL pointer has the same effect). If you use two or more connections in
your program, their @code{client_name}s should be the same, but their
@code{connection_name}s should differ. For example: ``buffer'',
``command_line'', ``text'', ``menu''.

@code{user_name} should be set to the name of the user. Normally, you should
get this string from the system. If set to NULL, libspeechd will try to
determine it automatically via g_get_user_name().

@code{connection_mode} has two possible values: @code{SPD_MODE_SINGLE}
and @code{SPD_MODE_THREADED}. If the parameter is set to
@code{SPD_MODE_THREADED}, then @code{spd_open()} will start an
additional thread in your program which will handle asynchronous SSIP
replies and will allow you to use callbacks for event notifications
and index marking, allowing you to keep track of the progress
of speaking the messages. However, you must be aware that your
program is now multi-threaded and care must be taken when
using/handling signals. If @code{SPD_MODE_SINGLE} is chosen, the
library won't start any additional threads and SSIP will run only as
a synchronous protocol, therefore event notifications and index
marking won't be available.

It returns a newly allocated SPDConnection* structure on success, or @code{NULL}
on error.

Each connection you open should be closed by spd_close() before the
end of the program, so that the associated connection descriptor is
closed, threads are terminated and memory is freed.
@end deffn

@deffn {C API function} SPDConnection* spd_open2(char* client_name, char* connection_name, char* user_name, SPDConnectionMode connection_mode, SPDConnectionMethod method, int autospawn)
@findex spd_open2()

Opens a new connection to Speech Dispatcher and returns a connection
structure. This function is the same as @code{spd_open} except that
it gives more control over the communication method and the autospawn
functionality, as described below.

@code{method} is either @code{SPD_METHOD_UNIX_SOCKET} or @code{SPD_METHOD_INET_SOCKET}. By default,
unix socket communication should be preferred, but inet sockets are necessary for cross-network
communication.

@code{autospawn} is a boolean flag specifying whether the function
should try to autospawn (autostart) the Speech Dispatcher server
process if it is not already running. This is set to 1 by default, so
this function should normally not fail even if the server is not yet
running.

@end deffn

@deffn {C API function} int spd_get_client_id(SPDConnection *connection)
@findex spd_get_client_id()

Get the client ID.

@code{connection} is the SPDConnection* connection created by spd_open().

It returns the client ID of the connection.

@end deffn

@deffn {C API function} void spd_close(SPDConnection *connection)
@findex spd_close()

Closes a Speech Dispatcher socket connection, terminates associated
threads (if necessary) and frees the memory allocated by
spd_open(). You should close every connection before the end of your
program.

@code{connection} is the SPDConnection* connection obtained by spd_open().
@end deffn

@node Speech Synthesis Commands in C, Speech output control commands in C, Initializing and Terminating in C, C API
@subsection Speech Synthesis Commands

@defvar {C API type} SPDPriority
@vindex SPDPriority

@code{SPDPriority} is an enum type that represents the possible priorities that
can be assigned to a message.

@example
typedef enum @{
    SPD_IMPORTANT = 1,
    SPD_MESSAGE = 2,
    SPD_TEXT = 3,
    SPD_NOTIFICATION = 4,
    SPD_PROGRESS = 5
@} SPDPriority;
@end example

@xref{Top,,Message Priority Model,ssip, SSIP Documentation}.

@end defvar

@deffn {C API function} int spd_say(SPDConnection* connection, SPDPriority priority, char* text);
@findex spd_say()

Sends a message to Speech Dispatcher. If this message isn't blocked by
some message of higher priority and this connection isn't paused, it
will be synthesized directly on one of the output devices. Otherwise,
the message will be discarded or delayed according to its priority.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{priority} is the desired priority for this message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.

@code{text} is a null terminated string containing the text you want
to be synthesized. It must be encoded in UTF-8. Note that this doesn't have
to be what you will finally hear. It can be affected by different
settings, such as spelling, punctuation, text substitution etc.

It returns a positive unique message identification number on success,
-1 otherwise. This message identification number can be saved and
used for the purpose of event notification callbacks or history
handling.

@end deffn

@deffn {C API function} int spd_sayf(SPDConnection* connection, SPDPriority priority, char* format, ...);
@findex spd_sayf()

Similar to @code{spd_say()}, but it simulates the behavior of printf().

@code{format} is a string containing text and formatting of the parameters, such as ``%d'',
``%s'' etc. It must be encoded in UTF-8.

@code{...} is an arbitrary number of arguments.

All other parameters are the same as for spd_say().

For example:
@example
    spd_sayf(conn, SPD_TEXT, "Hello %s, how are you?", username);
    spd_sayf(conn, SPD_IMPORTANT, "Fatal error on [%s:%d]", filename, line);
@end example

But be careful with Unicode! For example, this doesn't work:

@example
    spd_sayf(conn, SPD_NOTIFICATION, "Pressed key is %c.", key);
@end example

Why? Because you are supposing that the key is a char, but that will
fail with languages using multibyte charsets. The proper solution
is:

@example
    spd_sayf(conn, SPD_NOTIFICATION, "Pressed key is %s.", key);
@end example
where key is an encoded string.

It returns a positive unique message identification number on success, -1 otherwise.
This message identification number can be saved and used for the purpose of
event notification callbacks or history handling.
@end deffn

@node Speech output control commands in C, Characters and Keys in C, Speech Synthesis Commands in C, C API
@subsection Speech Output Control Commands

@subsubheading Stop Commands

@deffn {C API function} int spd_stop(SPDConnection* connection);
@findex spd_stop()

Stops the message currently being spoken on the given connection. If there
is no message being spoken, it does nothing. (It doesn't touch the messages
waiting in queues.)
This is intended for stops executed by the user,
not for automatic stops (because your program can't control
how many messages are still waiting in queues on the server).

@code{connection} is the SPDConnection* connection created by spd_open().

It returns 0 on success, -1 otherwise.
@end deffn

@deffn {C API function} int spd_stop_all(SPDConnection* connection);
@findex spd_stop_all()

The same as spd_stop(), but it stops every message being said,
without distinguishing where it came from.

It returns 0 on success, -1 if some of the stops failed.
@end deffn

@deffn {C API function} int spd_stop_uid(SPDConnection* connection, int target_uid);
@findex spd_stop_uid()

The same as spd_stop(), except that it stops a client different from
the calling one. You must specify this client in @code{target_uid}.

@code{target_uid} is the unique ID of the connection you want
to execute stop() on. It can be obtained from spd_history_get_client_list().
@xref{History Commands in C}.

It returns 0 on success, -1 otherwise.

@end deffn

@subsubheading Cancel Commands

@deffn {C API function} int spd_cancel(SPDConnection* connection);
@findex spd_cancel()

Stops the currently spoken message from this connection
(if there is any) and discards all the queued messages
from this connection. This is probably what you want
to do when you call spd_cancel() automatically in
your program.
@end deffn

@deffn {C API function} int spd_cancel_all(SPDConnection* connection);
@findex spd_cancel_all()

The same as spd_cancel(), but it cancels every message,
without distinguishing where it came from.

It returns 0 on success, -1 if some of the cancellations failed.
@end deffn

@deffn {C API function} int spd_cancel_uid(SPDConnection* connection, int target_uid);
@findex spd_cancel_uid()

The same as spd_cancel(), except that it executes cancel for a client other
than the calling one. You must specify this client in @code{target_uid}.

@code{target_uid} is the unique ID of the connection you want to
execute cancel() on. It can be obtained from
spd_history_get_client_list(). @xref{History Commands in C}.

It returns 0 on success, -1 otherwise.
@end deffn

@subsubheading Pause Commands

@deffn {C API function} int spd_pause(SPDConnection* connection);
@findex spd_pause()

Pauses all messages received from the given connection. No messages
are thrown away except those of priority @code{notification} and
@code{progress}; all the others wait in a separate queue for resume(). Upon resume(), the
message that was being said at the moment pause() was received will be
continued from the place where it was paused.

It returns immediately. However, that doesn't mean that the speech
output will stop immediately. Instead, it can continue speaking
the message for a while, until a place is reached where the position
in the text can be determined exactly. This is necessary to be able to
provide `resume' without gaps and overlapping.

When pause is on for the given client, all newly received
messages are also queued and wait for resume().

It returns 0 on success, -1 if something failed.
@end deffn

@deffn {C API function} int spd_pause_all(SPDConnection* connection);
@findex spd_pause_all()

The same as spd_pause(), but it pauses every message,
without distinguishing where it came from.

It returns 0 on success, -1 if some of the pauses failed.
@end deffn

@deffn {C API function} int spd_pause_uid(SPDConnection* connection, int target_uid);
@findex spd_pause_uid()

The same as spd_pause(), except that it executes pause for a client different from
the calling one. You must specify the client in @code{target_uid}.

@code{target_uid} is the unique ID of the connection you want
to pause. It can be obtained from spd_history_get_client_list().
@xref{History Commands in C}.

It returns 0 on success, -1 otherwise.
@end deffn

@subsubheading Resume Commands

@deffn {C API function} int spd_resume(SPDConnection* connection);
@findex spd_resume()

Resumes all paused messages from the given connection. The rest
of the message that was being said at the moment pause() was
received will be said, and all the other messages are queued
for synthesis again.

@code{connection} is the SPDConnection* connection created by spd_open().

It returns 0 on success, -1 otherwise.
@end deffn

@deffn {C API function} int spd_resume_all(SPDConnection* connection);
@findex spd_resume_all()

The same as spd_resume(), but it resumes every paused message,
without distinguishing where it came from.

It returns 0 on success, -1 if some of the resume operations failed.
@end deffn

@deffn {C API function} int spd_resume_uid(SPDConnection* connection, int target_uid);
@findex spd_resume_uid()

The same as spd_resume(), except that it executes resume for a client different from
the calling one. You must specify the client in @code{target_uid}.

@code{target_uid} is the unique ID of the connection you want
to resume. It can be obtained from spd_history_get_client_list().
@xref{History Commands in C}.

It returns 0 on success, -1 otherwise.
@end deffn

@node Characters and Keys in C, Sound Icons in C, Speech output control commands in C, C API
@subsection Characters and Keys

@deffn {C API function} int spd_char(SPDConnection* connection, SPDPriority priority, char* character);
@findex spd_char()

Says a character according to user settings for characters. For example, this can be
used for speaking letters under the cursor.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{priority} is the desired priority for this
message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.

@code{character} is a NULL-terminated string of chars containing one UTF-8
character. If it contains more characters, only the first one is processed.

It returns 0 on success, -1 otherwise.
@end deffn

@deffn {C API function} int spd_wchar(SPDConnection* connection, SPDPriority priority, wchar_t wcharacter);
@findex spd_wchar()

The same as spd_char(), but it takes a wchar_t variable as its argument.

It returns 0 on success, -1 otherwise.
@end deffn

@deffn {C API function} int spd_key(SPDConnection* connection, SPDPriority priority, char* key_name);
@findex spd_key()

Says a key according to user settings for keys.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{priority} is the desired priority for this
message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.

@code{key_name} is the name of the key in a special format.
@xref{Top,,Speech Synthesis and Sound Output Commands, ssip, SSIP
Documentation}, (KEY, the corresponding SSIP command) for a description
of the format of @code{key_name}.

It returns 0 on success, -1 otherwise.
@end deffn

@node Sound Icons in C, Parameter Setting Commands in C, Characters and Keys in C, C API
@subsection Sound Icons

@deffn {C API function} int spd_sound_icon(SPDConnection* connection, SPDPriority priority, char* icon_name);
@findex spd_sound_icon()

Sends a sound icon ICON_NAME. These are symbolic names that are mapped
to a sound or to a text string (in the particular language) according to
Speech Dispatcher tables and user settings. Each program can also
define its own icons.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{priority} is the desired priority for this
message. @xref{Top,,Message Priority Model,ssip, SSIP Documentation}.

@code{icon_name} is the name of the icon. It can't contain spaces; use
underscores (`_') instead. Icon names starting with an underscore
are considered internal and shouldn't be used.
@end deffn

@node Parameter Setting Commands in C, Other Functions in C, Sound Icons in C, C API
@subsection Parameter Setting Commands

The following parameter setting commands are available. For configuration
and history clients, there are also functions for setting the value for
some other connection and for all connections. They are listed separately below.

Please see @ref{Top,,Parameter Setting Commands,ssip, SSIP
Documentation} for a general description of what they mean.

@deffn {C API function} int spd_set_data_mode(SPDConnection *connection, SPDDataMode mode)
@findex spd_set_data_mode()

Set Speech Dispatcher data mode. Currently, plain text and SSML are
supported. SSML is especially useful if you want to use index marks
or include changes of voice parameters in the text.

@code{mode} is the requested data mode: @code{SPD_DATA_TEXT} or
@code{SPD_DATA_SSML}.

@end deffn

@deffn {C API function} int spd_set_language(SPDConnection* connection, char* language);
@findex spd_set_language()

Sets the language that should be used for synthesis.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{language} is the language code as defined in RFC 1766 (``cs'',
``en'', ``en-US'', ...).

@end deffn

@deffn {C API function} int spd_set_output_module(SPDConnection* connection, char* output_module);
@findex spd_set_output_module()
@anchor{spd_set_output_module}

Sets the output module that should be used for synthesis. The parameter
of this command should always be entered by the user in some way
and not hardcoded anywhere in the code, as the available synthesizers
and their registration names may vary from machine to machine.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{output_module} is the name under which the module
was loaded into Speech Dispatcher in its configuration (``flite'',
``festival'', ``epos-generic''...).

@end deffn

@deffn {C API function} char* spd_get_output_module(SPDConnection* connection);
@findex spd_get_output_module()
@anchor{spd_get_output_module}

Gets the output module currently in use for synthesis.

@code{connection} is the SPDConnection* connection created by spd_open().

It returns the name under which the module was loaded into Speech
Dispatcher in its configuration (``flite'', ``festival'', ``espeak''...).

@end deffn

@deffn {C API function} int spd_set_punctuation(SPDConnection* connection, SPDPunctuation type);
@findex spd_set_punctuation()

Set punctuation mode to the given value.
`all' means speak all
punctuation characters, `none' means speak no punctuation characters, and
`some' and `most' mean speak intermediate sets of punctuation characters, as
set in symbol tables or output modules.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{type} is one of the following values: @code{SPD_PUNCT_ALL},
@code{SPD_PUNCT_NONE}, @code{SPD_PUNCT_SOME}, @code{SPD_PUNCT_MOST}.

It returns 0 on success, -1 otherwise.
@end deffn

@deffn {C API function} int spd_set_spelling(SPDConnection* connection, SPDSpelling type);
@findex spd_set_spelling()

Switches spelling mode on and off. If set to on, all incoming messages
from this particular connection will be processed according to the appropriate
spelling tables (see spd_set_spelling_table()).

@code{connection} is the SPDConnection* connection created by spd_open().

@code{type} is one of the following values: @code{SPD_SPELL_ON}, @code{SPD_SPELL_OFF}.
@end deffn

@deffn {C API function} int spd_set_voice_type(SPDConnection* connection, SPDVoiceType voice);
@findex spd_set_voice_type()
@anchor{spd_set_voice_type}

Set a preferred symbolic voice.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{voice} is one of the following values: @code{SPD_MALE1},
@code{SPD_MALE2}, @code{SPD_MALE3}, @code{SPD_FEMALE1}, @code{SPD_FEMALE2},
@code{SPD_FEMALE3}, @code{SPD_CHILD_MALE}, @code{SPD_CHILD_FEMALE}.

@end deffn

@deffn {C API function} int spd_set_synthesis_voice(SPDConnection* connection, char* voice_name);
@findex spd_set_synthesis_voice()
@anchor{spd_set_synthesis_voice}

Set the speech synthesizer voice to use.
Please note that synthesis
voices are an attribute of the synthesizer, so this setting only takes
effect until the output module in use is changed (via
@code{spd_set_output_module()} or via @code{spd_set_language()}).

@code{connection} is the SPDConnection* connection created by spd_open().

@code{voice_name} is any of the voice name values retrieved by
@ref{spd_list_synthesis_voices}.

@end deffn

@deffn {C API function} int spd_set_voice_rate(SPDConnection* connection, int rate);
@findex spd_set_voice_rate()

Set the voice speaking rate.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{rate} is a number between -100 and +100, which means
the slowest and the fastest speech rate respectively.

@end deffn

@deffn {C API function} int spd_get_voice_rate(SPDConnection* connection);
@findex spd_get_voice_rate()

Get the voice speaking rate.

@code{connection} is the SPDConnection* connection created by spd_open().

It returns the current voice rate.

@end deffn

@deffn {C API function} int spd_set_voice_pitch(SPDConnection* connection, int pitch);
@findex spd_set_voice_pitch()

Set the voice pitch.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{pitch} is a number between -100 and +100, which means the
lowest and the highest pitch respectively.

@end deffn

@deffn {C API function} int spd_get_voice_pitch(SPDConnection* connection);
@findex spd_get_voice_pitch()

Get the voice pitch.

@code{connection} is the SPDConnection* connection created by spd_open().

It returns the current voice pitch.

@end deffn

@deffn {C API function} int spd_set_voice_pitch_range(SPDConnection* connection, int pitch_range);
@findex spd_set_voice_pitch_range()

Set the voice pitch range.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{pitch_range} is a number between -100 and +100, which means the
lowest and the highest pitch range respectively.

@end deffn

@deffn {C API function} int spd_set_volume(SPDConnection* connection, int volume);
@findex spd_set_volume()

Set the volume of the voice and sounds produced by Speech Dispatcher's output
modules.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{volume} is a number between -100 and +100, which means
the quietest and the loudest voice respectively.

@end deffn

@deffn {C API function} int spd_get_volume(SPDConnection* connection);
@findex spd_get_volume()

Get the volume of the voice and sounds produced by Speech Dispatcher's output
modules.

@code{connection} is the SPDConnection* connection created by spd_open().

It returns the current volume.

@end deffn


@node Other Functions in C, Information Retrieval Commands in C, Parameter Setting Commands in C, C API
@subsection Other Functions

@node Information Retrieval Commands in C, Event Notification and Index Marking in C, Other Functions in C, C API
@subsection Information Retrieval Commands

@deffn {C API function} char** spd_list_modules(SPDConnection* connection)
@findex spd_list_modules()
@anchor{spd_list_modules}

Returns a NULL-terminated array of identification names of the available
output modules. You can subsequently set the desired output module with
@ref{spd_set_output_module}. In case of error, the return value is
a NULL pointer.

@code{connection} is the SPDConnection* connection created by spd_open().

@end deffn

@deffn {C API function} char** spd_list_voices(SPDConnection* connection)
@findex spd_list_voices()
@anchor{spd_list_voices}

Returns a NULL-terminated array of identification names of the
symbolic voices. You can subsequently set the desired voice
with @ref{spd_set_voice_type}.

Please note that this is a fixed list, independent of the synthesizer
in use. The given voices can be mapped to specific synthesizer voices
according to the user's wishes, or may, for example, all be mapped to the same
voice. To choose directly from the raw list of voices as implemented
in the synthesizer, see @ref{spd_list_synthesis_voices}.

In case of error, the return value is a NULL pointer.

@code{connection} is the SPDConnection* connection created by spd_open().

@end deffn

@deffn {C API function} SPDVoice** spd_list_synthesis_voices(SPDConnection* connection)
@findex spd_list_synthesis_voices()
@anchor{spd_list_synthesis_voices}

Returns a NULL-terminated array of pointers to @code{SPDVoice}
structures describing the available voices as given
by the synthesizer. You can subsequently set the desired voice with
@code{spd_set_synthesis_voice()}.

@example
typedef struct@{
    char *name;     /* Name of the voice (id) */
    char *language; /* 2/3-letter ISO language code,
                     * possibly followed by 2/3-letter ISO region code,
                     * e.g. en-US */
    char *variant;  /* a not-well-defined string describing dialect etc. */
@}SPDVoice;
@end example

Please note that the list returned is specific to each synthesizer in
use (so when you switch to another output module, you must also
retrieve a new list). If you want instead to use symbolic voice
names which are independent of the synthesizer in use, see @ref{spd_list_voices}.

In case of error, the return value is a NULL pointer.

@code{connection} is the SPDConnection* connection created by spd_open().

@end deffn

@node Event Notification and Index Marking in C, History Commands in C, Information Retrieval Commands in C, C API
@subsection Event Notification and Index Marking in C

When the SSIP connection is run in asynchronous mode, it is possible
to register callbacks for all the SSIP event notifications and index
mark notifications, as defined in @ref{Message Event Notification
and Index Marking,,, ssip, SSIP Documentation}.

@defvar {C API type} SPDNotification
@vindex SPDNotification
@anchor{SPDNotification}

@code{SPDNotification} is an enum type that represents the possible
base notification types that can be assigned to a message.

@example
typedef enum@{
    SPD_BEGIN = 1,
    SPD_END = 2,
    SPD_INDEX_MARKS = 4,
    SPD_CANCEL = 8,
    SPD_PAUSE = 16,
    SPD_RESUME = 32
@}SPDNotification;
@end example
@end defvar

There are currently two types of callbacks in the C API.

@defvar {C API type} SPDCallback
@vindex SPDCallback
@anchor{SPDCallback}
@code{void (*SPDCallback)(size_t msg_id, size_t client_id, SPDNotificationType state);}

This one is used for notifications about the events @code{BEGIN}, @code{END}, @code{PAUSE}
and @code{RESUME}. When the callback is called, it provides three parameters for the event.

@code{msg_id} is the unique identification number of the message the notification is about.

@code{client_id} specifies the unique identification number of the client who sent the
message. This is usually the same connection as the connection which registered this
callback, and therefore uninteresting. However, in some special cases it might be useful
to register this callback for other SSIP connections, or register the same callback for
several connections originating from the same application.

@code{state} is the @code{SPDNotification} type of this notification. @xref{SPDNotification}.
@end defvar

@defvar {C API type} SPDCallbackIM
@vindex SPDCallbackIM
@code{void (*SPDCallbackIM)(size_t msg_id, size_t client_id, SPDNotificationType state,
char *index_mark);}

@code{SPDCallbackIM} is used for notifications about index marks that have been reached
in the message. (A way to specify index marks is e.g. through the SSML element
<mark/> in SSML mode.)

The syntax and meaning of these parameters are the same as for @ref{SPDCallback},
except for the additional parameter @code{index_mark}.

@code{index_mark} is a NULL-terminated string associated with the index mark. Please
note that this string is specified by the client application and therefore needn't be
unique.
@end defvar

One or more callbacks can be supplied for a given @code{SPDConnection*} connection by
assigning the values of pointers to the appropriate functions to the following connection
members:

@example
    SPDCallback callback_begin;
    SPDCallback callback_end;
    SPDCallback callback_cancel;
    SPDCallback callback_pause;
    SPDCallback callback_resume;
    SPDCallbackIM callback_im;
@end example

There are three settings commands which will turn notifications on and
off for the current SSIP connection and cause the callbacks to be called
when the event is registered by Speech Dispatcher.

@deffn {C API function} int spd_set_notification_on(SPDConnection* connection, SPDNotification notification);
@findex spd_set_notification_on
@end deffn
@deffn {C API function} int spd_set_notification_off(SPDConnection* connection, SPDNotification notification);
@findex spd_set_notification_off
@end deffn
@deffn {C API function} int spd_set_notification(SPDConnection* connection, SPDNotification notification, const char* state);
@findex spd_set_notification

These functions will set the notification specified by the parameter
@code{notification} on or off (or to the given value)
respectively. Note that it is only safe to call these functions after
the appropriate callback functions have been set in the @code{SPDConnection}
structure. Doing otherwise is not considered an error, but the
application might miss some events due to callback functions not being
executed (e.g. the client might receive an @code{END} event without
receiving the corresponding @code{BEGIN} event in advance).

@code{connection} is the SPDConnection* connection created by spd_open().

@code{notification} is the requested type of notifications that should be reported by SSIP. @xref{SPDNotification}.
Note that `|' combinations are also possible, as illustrated in the example below.

@code{state} must be either the string ``on'' or ``off'', for switching the given notification on or off.

@end deffn

The following example shows how to use callbacks for the simple
purpose of playing a message and waiting until its end. (Please note
that checks of return values in this example, as well as other code
not directly related to index marking, have been removed for the purpose
of clarity.)

@example
#include <semaphore.h>
#include <libspeechd.h>

sem_t semaphore;

/* Callback for Speech Dispatcher notifications */
void end_of_speech(size_t msg_id, size_t client_id, SPDNotificationType type)
@{
    /* We don't check msg_id here since we will only send one
       message. */

    /* Callbacks are running in a separate thread, so let the
       (sleeping) main thread know about the event and wake it up. */
    sem_post(&semaphore);
@}

int
main(int argc, char **argv)
@{
    SPDConnection *conn;

    sem_init(&semaphore, 0, 0);

    /* Open Speech Dispatcher connection in THREADED mode. */
    conn = spd_open("say", "main", NULL, SPD_MODE_THREADED);

    /* Set callback handler for 'end' and 'cancel' events. */
    conn->callback_end = conn->callback_cancel = end_of_speech;

    /* Ask Speech Dispatcher to notify us about these events. */
    spd_set_notification_on(conn, SPD_END);
    spd_set_notification_on(conn, SPD_CANCEL);

    /* Say our message. */
    spd_sayf(conn, SPD_MESSAGE, "%s", argv[1]);

    /* Wait for 'end' or 'cancel' of the sent message.
       By the SSIP specification, we are guaranteed to get
       one of these two eventually. */
    sem_wait(&semaphore);

    spd_close(conn);
    return 0;
@}
@end example

@node History Commands in C, Direct SSIP Communication in C, Event Notification and Index Marking in C, C API
@subsection History Commands
@findex spd_history_select_client()
@findex spd_get_client_list()
@findex spd_get_message_list_fd()

@node Direct SSIP Communication in C, , History Commands in C, C API
@subsection Direct SSIP Communication in C

It might happen that you want to use some SSIP function that is not
available through the library, or you may want to use an available
function in a different manner.
(If you think there is something
missing in the library, or you have some useful comment on the
available functions, please let us know.) For this purpose, there are
a few functions that will allow you to send arbitrary SSIP commands on
your connection and read the replies.

@deffn {C API function} int spd_execute_command(SPDConnection* connection, char *command);
@findex spd_execute_command()

You can send an arbitrary SSIP command specified in the parameter @code{command}.

If the command is successful, the function returns 0. If there is no such
command or the command failed for some reason, it returns -1.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{command} is a null-terminated string containing a full SSIP command
without the terminating sequence @code{\r\n}.

For example:
@example
    spd_execute_command(conn, "SET SELF RATE 60");
    spd_execute_command(conn, "SOUND_ICON bell");
@end example

It's not possible to use this function for compound commands like @code{SPEAK},
where you receive more than one reply. If this is your case, please
see spd_send_data().
@end deffn

@deffn {C API function} char* spd_send_data(SPDConnection* connection, const char *message, int wfr);
@findex spd_send_data()

You can send an arbitrary SSIP string specified in the parameter @code{message}
and, if specified, wait for the reply. The string can be any SSIP command, but
it can also be textual data or a command parameter.

If @code{wfr} (wait for reply) is set to SPD_WAIT_REPLY, you will receive the reply string
as the return value. If wfr is set to SPD_NO_REPLY, the return value is a NULL pointer.
If wfr is set to SPD_WAIT_REPLY, you should always free the returned string.

@code{connection} is the SPDConnection* connection created by spd_open().

@code{message} is a null-terminated string containing a full SSIP
string. If this is a complete SSIP command, it must include the full
terminating sequence @code{\r\n}.

@code{wfr} is either SPD_WAIT_REPLY (integer value of 1) or SPD_NO_REPLY (0).
This specifies whether you expect to get a reply to the sent data according to SSIP.
For example, if you are sending ordinary text inside a @code{SPEAK} command,
you don't expect to get a reply, but you expect a reply after sending the final
sequence @code{\r\n.\r\n} of the multi-line command.

For example (simplified by not checking and freeing the returned strings):
@example
    spd_send_data(conn, "SPEAK\r\n", SPD_WAIT_REPLY);
    spd_send_data(conn, "Hello world!\n", SPD_NO_REPLY);
    spd_send_data(conn, "How are you today?!", SPD_NO_REPLY);
    spd_send_data(conn, "\r\n.\r\n", SPD_WAIT_REPLY);
@end example

@end deffn


@node Python API, Guile API, C API, Client Programming
@section Python API

There is a full Python API available in @file{src/python/speechd/} in
the source tree. Please see the Python docstrings for a full reference
on the available objects and methods.

Simple Python client:
@example
import speechd
client = speechd.SSIPClient('test')
client.set_output_module('festival')
client.set_language('en-US')
client.set_punctuation(speechd.PunctuationMode.SOME)
client.speak("Hello World!")
client.close()
@end example

The Python API respects the environment variable
@var{SPEECHD_ADDRESS} if the communication address is not specified
explicitly (see the @code{SSIPClient} constructor arguments).

The implementation of callbacks within the Python API tries to hide the
low-level details of SSIP callback handling and provide a convenient
Pythonic interface.
You just pass a callable object (function) to the
@code{speak()} method and this function will be called whenever an
event occurs for the corresponding message.

Callback example:
@example
import speechd, time
called = []
client = speechd.SSIPClient('callback-test')
client.speak("Hi!", callback=lambda cb_type: called.append(cb_type))
time.sleep(2) # Wait for the events to happen.
print "Called callbacks:", called
client.close()
@end example

Real-world callback functions will most often need some sort of
context information to be able to distinguish for which message the
callback was called. This can be simply done in Python. The
following example uses the actual message text as the context
information within the callback function.

Callback context example:
@example
import speechd, time

class CallbackExample(object):
    def __init__(self):
        self._client = speechd.SSIPClient('callback-test')

    def speak(self, text):
        def callback(callback_type):
            if callback_type == speechd.CallbackType.BEGIN:
                print "Speech started:", text
            elif callback_type == speechd.CallbackType.END:
                print "Speech completed:", text
            elif callback_type == speechd.CallbackType.CANCEL:
                print "Speech interrupted:", text
        self._client.speak(text, callback=callback,
                           event_types=(speechd.CallbackType.BEGIN,
                                        speechd.CallbackType.CANCEL,
                                        speechd.CallbackType.END))

    def go(self):
        self.speak("Hi!")
        self.speak("How are you?")
        time.sleep(4) # Wait for the events to happen.
        self._client.close()

CallbackExample().go()
@end example

@emph{Important notice:} The callback is called in the Speech Dispatcher
listener thread. No subsequent Speech Dispatcher interaction is
allowed from within the callback invocation.
If you need to do
something more complicated, do it in another thread to prevent
deadlocks in SSIP communication.

@node Guile API, Common Lisp API, Python API, Client Programming
@section Guile API

The Guile API can be found in @file{src/guile/} in
the source tree; however, it is still considered
experimental. Please read @file{src/guile/README}.

@node Common Lisp API, Autospawning, Guile API, Client Programming
@section Common Lisp API

The Common Lisp API can be found in @file{src/cl/} in
the source tree; however, it is still considered
experimental. Please read @file{src/cl/README}.

@node Autospawning,  , Common Lisp API, Client Programming
@section Autospawning

It is suggested that client libraries offer autospawn functionality
to automatically start the server process when connecting locally, if
it is not already running. E.g. if the client application starts and
Speech Dispatcher is not already running, the client will start Speech
Dispatcher.

The library API should provide a possibility to turn this
functionality off, but we suggest making autospawn the default
behavior.

Autospawn is performed by executing Speech Dispatcher with the
@code{--spawn} parameter under the same user and permissions as the
client process:

@example
speech-dispatcher --spawn
@end example

With the @code{--spawn} parameter, the process will start and return
with an exit code of 0 only if a) it is not already running (pidfile
check), b) the server doesn't have autospawn disabled in its
configuration, and c) no other error preventing the start
occurs. Otherwise, Speech Dispatcher is not started and an error code
of 1 is returned.

The client library should redirect its stdout and stderr outputs
either to nowhere or to its logging system.
It should subsequently
completely detach from the newly spawned process.

Due to a bug in Speech Dispatcher, it is currently necessary to
wait for about 0.5 seconds after the autospawn
before attempting a connection.

Please see how autospawn is implemented in the C API and in the Python
API for an example.

@node Server Programming, Download and Contact, Client Programming, Top
@chapter Server Programming

@menu
* Server Core::                 Internal structure and functionality overview.
* Output Modules::              Plugins for various speech synthesizers.
@end menu

@node Server Core, Output Modules, Server Programming, Server Programming
@section Server Core

The main documentation for the server core is the code itself. This section
is only a general introduction intended to give you some basic information
and hints where to look for things. If you are going to make some modifications
in the server core, we will be happy if you get in touch with us at
@email{speechd-discuss@@nongnu.org}.

The server core is composed of two main parts, each of them implemented
in a separate thread. The @emph{server part} handles the communication
with clients and, applying the desired configuration options, stores the
messages in the priority queue. The @emph{speaking part} takes care of
communicating with the output modules, pulls messages out of the priority
queue at the correct time and sends them to the appropriate synthesizer.

Synchronization between these two parts is done by thread mutexes.
Additionally, synchronization of the speaking part from both sides
(server part, output modules) is done via a SYSV/IPC semaphore.

@subheading Server part

After switching to daemon mode (if required), the server reads its
configuration files and initializes the speaking part. Then it opens
the socket and waits for incoming data.
This is implemented mainly in
@file{src/server/speechd.c} and @file{src/server/server.c}.

There are three types of events: a new client connects to speechd, an
old client disconnects, or a client sends some data. In the third
case, the data is passed to the @code{parse()} function defined
in @file{src/server/parse.c}.

If the incoming data is a new message, it's stored in a
queue according to its priority. If it is an SSIP
command, it's handled by the appropriate handler.
Handling of the @code{SET} family of commands can be found
in @file{src/server/set.c} and @code{HISTORY} commands are
processed in @file{src/server/history.c}.

All SSIP reply messages are defined in @file{src/server/msg.h}.

@subheading Speaking part

This thread, the function @code{speak()} defined in
@file{src/server/speaking.c}, is created from the server part process
shortly after initialization. Then it enters an infinite loop and
waits on a SYSV/IPC semaphore until one of the following actions
happens:

@itemize @bullet
@item
The server adds a new message to the queue of messages waiting
to be said.
@item
The currently active output module signals that the message
that was being spoken is done.
@item
Pause or resume is requested.
@end itemize

After handling the rest of the priority interaction (like actions
needed to repeat the last priority progress message) it decides
which action should be performed. Usually it's picking up
a message from the queue and sending it to the desired output
module (synthesizer), but sometimes it's handling the pause
or resume requests, and sometimes it's doing nothing.

As said before, this is the part of Speech Dispatcher that
talks to the output modules. It does so by using the output
interface defined in @file{src/server/output.c}.
@node Output Modules,  , Server Core, Server Programming
@section Output Modules

@menu
* Basic Structure::             The definition of an output module.
* Communication Protocol for Output Modules::
* How to Write New Output Module::  How to include support for new synthesizers
* The Skeleton of an Output Module::
* Output Module Functions::
* Module Utils Functions and Macros::
* Index Marks in Output Modules::
@end menu

@node Basic Structure, Communication Protocol for Output Modules, Output Modules, Output Modules
@subsection Basic Structure

Speech Dispatcher output modules are independent applications that,
using a simple common communication protocol, read commands from
standard input and write replies to standard output,
communicating the requests to the particular software or hardware
synthesizer. Everything the output module writes on standard output
or reads from standard input should conform to the specifications
of the communication protocol. Additionally, the standard error output
is used for logging by the modules.

Output module binaries are usually located in
@file{bin/speechd-modules/} and are loaded automatically when Speech
Dispatcher starts, according to its configuration. Their standard
input/output/error output is redirected to a pipe to Speech Dispatcher
and this way the two sides can communicate.

When the modules start, they are passed the name of a configuration file
that should be used for this particular output module.

Each output module is started by Speech Dispatcher as:

@example
my_module "configfile"
@end example

where @code{configfile} is the full path to the desired configuration
file that the output module should parse.
@node Communication Protocol for Output Modules, How to Write New Output Module, Basic Structure, Output Modules
@subsection Communication Protocol for Output Modules

The protocol by which the output modules communicate on standard
input/output is based on SSIP (@pxref{Top,,SSIP,ssip, SSIP
Documentation}), although it is highly simplified and slightly
modified for its different purpose here. Another difference
is that event notification is obligatory in the modules' communication,
while in SSIP it is an optional feature. This is because Speech
Dispatcher has to know about all the events happening in the output
modules for the purpose of synchronizing the various messages.

Since it is very similar to SSIP, see @ref{Top,,General Rules,ssip, SSIP
Documentation} for a general description of what the protocol looks
like. One of the exceptions is that, since the output modules
communicate on standard input/output, we use only @code{LF} as the
line separator.

The return values are:
@itemize
@item 2xx OK
@item 3xx CLIENT ERROR or BAD SYNTAX or INVALID VALUE
@item 4xx OUTPUT MODULE ERROR or INTERNAL ERROR

@item 700 EVENT INDEX MARK
@item 701 EVENT BEGIN
@item 702 EVENT END
@item 703 EVENT STOP
@item 704 EVENT PAUSE
@end itemize

@table @code
@item SPEAK
Start receiving a text message in the SSML format and synthesize it.
After sending a reply to the command, the output module waits for the
text of the message. The text can spread over any number of lines and is
finished by an end of line marker followed by a line containing the
single character @code{.} (dot). Thus the complete character sequence
closing the input text is @code{LF . LF}. If any line within the sent
text contains only a dot, an extra dot should be prepended before it.
During reception of the text message, the output module doesn't send a
response to the particular lines sent. The response line is sent only
immediately after the @code{SPEAK} command and after receiving the
closing dot line. This doesn't provide any means of synchronization;
instead, event notification is used for this purpose.

There is no explicit upper limit on the size of the text.

If the @code{SPEAK} command is received while the output module
is already speaking, it is considered an error.

Example:
@example
SPEAK
202 OK SEND DATA
<speak>
Hello, GNU!
</speak>
.
200 OK SPEAKING
@end example

After receiving the full text (or the first part of it), the output
module is supposed to start synthesizing it and take care of
delivering it to an audio device. When (or just before) the first
synthesized samples are delivered to the audio device and start playing,
the output module must send the @code{BEGIN} event over the communication
socket to Speech Dispatcher (@pxref{Events notification and index
marking}). After the audio stops playing, the event @code{STOP},
@code{PAUSE} or @code{END} must be delivered to Speech
Dispatcher. Additionally, if supported by the given synthesizer, the
output module can issue events associated with the included SSML index
marks when they are reached in the audio output.

@item CHAR
Synthesize a character. If the synthesizer supports a different behavior
for the event of ``character'', this should be used.

It works like the command @code{SPEAK} above, except that the argument
has to be exactly one line long. It contains the UTF-8 form of exactly
one character.

@item KEY
Synthesize a key name. If the synthesizer supports a different behavior
for the event of ``key name'', this should be used.
It works like the command @code{SPEAK} above, except that the argument
has to be exactly one line long. @xref{Top, ,SSIP KEY,ssip, SSIP
Documentation}, for the description of the allowed arguments.

@item SOUND_ICON
Produce a sound icon. According to the configuration of the particular
synthesizer, this can either produce a sound (e.g. a .wav file) or
synthesize some text.

It works like the command @code{SPEAK} above, except that the argument
has to be exactly one line long. It contains the symbolic name of the
icon that should be said. @xref{Top,,SSIP SOUND_ICON, ssip, SSIP
Documentation}, for a more detailed description of the sound icon
mechanism.

@item STOP
Immediately stop speaking on the output device and cancel synthesizing
the current message so that the output module is prepared to receive a
new message. If there is currently no message being synthesized, it is
not considered an error to call @code{STOP} anyway.

This command is asynchronous. The output module is not supposed to
send any reply (not even an error reply).

It should return immediately, although stopping the synthesizer may
require a little more time. The module must issue one of the events
@code{STOP} or @code{END} when the module is finally
stopped. @code{END} is issued when the playing stopped by itself
before the module could terminate it or if the architecture of the
output module doesn't allow it to decide; otherwise @code{STOP}
should be used.

@example
STOP
@end example

@item PAUSE
Stop speaking the current message at a place where we can exactly
determine the position (preferably after a @code{__spd_} index mark). This
doesn't have to be immediate and can be delayed even for a few
seconds. (Knowing the position exactly is important so that we can
later continue the message without gaps or overlapping.)
It doesn't do
anything else (like storing the message etc.).

This command is asynchronous. The output module is not supposed to
send any reply (not even an error reply).

For example:
@example
PAUSE
@end example

@item SET
Set one of several speech parameters for future messages.

Each of the parameters is written on a single line in the form
@example
name=value
@end example
where @code{value} can be either a number or a string, depending upon
the name of the parameter.

The @code{SET} environment is terminated by a dot on a single line.
Thus the complete character sequence closing the settings is
@code{LF . LF}.

During reception of the settings, the output module doesn't send any
response to the particular lines sent. The response line is sent only
immediately after the @code{SET} command and after receiving the
closing dot line.

The available parameters that accept numerical values are @code{rate},
@code{pitch} and @code{pitch_range}.

The available parameters that accept string values are
@code{punctuation_mode}, @code{spelling_mode}, @code{cap_let_recogn},
@code{voice}, and @code{language}. The arguments are the same as for the
corresponding SSIP commands, except that they are written in small
letters. @xref{Top,,Parameter Setting Commands,ssip, SSIP
Documentation}. The conversion between these string values and the
corresponding C enum variables can easily be done using
@file{src/common/fdsetconv.c}.

Not all of these parameters must be set and the value of the string
arguments can also be @code{NULL}. If some of the parameters aren't
set, the output module should use its defaults.

It's not necessary to set these parameters on the synthesizer right
away; instead, this can be postponed until some message to be spoken
arrives.
Here is an example:
@example
SET
203 OK RECEIVING SETTINGS
rate=20
pitch=-10
pitch_range=50
punctuation_mode=all
spelling_mode=on
punctuation_some=NULL
.
203 OK SETTINGS RECEIVED
@end example

@item AUDIO
@code{AUDIO} has exactly the same structure as @code{SET}, but is
transmitted only once, immediately after @code{INIT}, to transmit the
requested audio parameters and tell the output module to open the audio
device.

@item QUIT
Terminates the output module. It should send the response, deallocate
all the resources, close all descriptors, terminate all child
processes etc. Then the output module should exit itself.

@example
QUIT
210 OK QUIT
@end example
@end table

@subsubsection Events notification and index marking
@anchor{Events notification and index marking}

Each output module must take care of sending asynchronous
notifications whenever the synthesizer (or the module) starts or stops
outputting audio on the speakers. Additionally, whenever possible, the
output module should report back to Speech Dispatcher index marks
found in the incoming SSML text whenever they are reached while
speaking. See the SSML specification for more details about the
@code{mark} element.

Event and index mark notifications are reported by simply writing them
to the standard output. An event notification must never get in
between synchronous commands (those which require a reply) and their
reply. Before Speech Dispatcher sends any new requests (like
@code{SET}, @code{SPEAK} etc.) it waits for the previous request to be
terminated by the output module signalling the @code{STOP}, @code{END}
or @code{PAUSE} events.
So the only thing the output module must
ensure in order to satisfy this requirement is that it doesn't send
any index marks until it acknowledges the receipt of the new message
via @code{200 OK SPEAKING}. It must also ensure that index marks
written to the pipe are well ordered -- of course it doesn't make any
sense, and it is an error, to send any index marks after @code{STOP},
@code{END} or @code{PAUSE} is sent.


@table @code

@item BEGIN

This event must be issued whenever the module starts to speak the
given message. If this is not possible, it can issue it when it
starts to synthesize the message or when it receives the message.

It is prefixed with the code @code{701} and takes the form

@example
701 BEGIN
@end example

@item END

This event must be issued whenever the module terminates speaking the
given message because it reached its end. If this is not possible, it
can issue this event when it is ready to receive a new message after
speaking the previous message.

Each @code{END} must always be preceded (possibly not directly) by a
@code{BEGIN}.

It is prefixed with the code @code{702} and takes the form

@example
702 END
@end example

@item STOP

This event should be issued whenever the module terminates speaking
the given message without reaching its end (as a consequence of
receiving the @code{STOP} command or because of some error), not because
of a @code{PAUSE} command. When the synthesizer in use doesn't allow
the module to decide, the event @code{END} can be used instead.

Each @code{STOP} must always be preceded (possibly not directly) by a
@code{BEGIN}.
It is prefixed with the code @code{703} and takes the form

@example
703 STOP
@end example

@item PAUSE

This event should be issued whenever the module terminates speaking
the given message without reaching its end because of receiving the
@code{PAUSE} command.

Each @code{PAUSE} must always be preceded (possibly not directly) by a
@code{BEGIN}.

It is prefixed with the code @code{704} and takes the form

@example
704 PAUSE
@end example

@item INDEX MARK

This event should be issued by the output module (if supported)
whenever an index mark (SSML tag @code{<mark/>}) is passed while speaking
a message. It is prefixed with the code @code{700} and takes the form

@example
700-name
700 INDEX MARK
@end example

where @code{name} is the value of the SSML attribute @code{name} in
the tag @code{<mark/>}.

@end table

@node How to Write New Output Module, The Skeleton of an Output Module, Communication Protocol for Output Modules, Output Modules
@subsection How to Write New Output Module

If you want to write your own output module, there are basically two
ways to do it. Either you can program it all yourself, which is fine
as long as you stick to the definition of an output module and its
communication protocol, or you can use our @file{module_*.c} tools.
If you use these tools, you will only have to write the core functions
like module_speak() and module_stop() etc. and you will not have to
worry about the communication protocol and other formal things that
are common to all modules. Here is how you can do it using the
provided tools.
We recommend here a basic structure of the code for an output
module that you should follow, although it's perfectly ok to establish
your own if you have reasons to do so, as long as all the necessary
functions and data are defined somewhere in the file. For this purpose,
we will use examples from the output module for Flite (Festival Lite),
so it's recommended to keep looking at @code{flite.c} for reference.

A few rules you should respect:
@itemize
@item
The @file{module_*.c} files should be included at the specified place and
in the specified order, because they directly include some pieces of
code and won't work in other places.
@item
If one or more new threads are used in the output module, they must
block all signals.
@item
On module_close(), all auxiliary threads and processes should be
terminated and all memory freed. Don't assume module_close() is always
called before exit() and that the resources will be freed automatically.
@item
We will be happy if all the copyrights are assigned to Brailcom, o.p.s.
in order for us to be in a better legal position against possible intruders.
@end itemize

@node The Skeleton of an Output Module, Output Module Functions, How to Write New Output Module, Output Modules
@subsection The Skeleton of an Output Module

Each output module should include @file{module_utils.h}, where the
SPDMsgSettings structure is defined, to be able to handle the different
speech synthesis settings. This file also provides tools which help
with writing output modules and make the code simpler.
@example
#include "module_utils.h"
@end example

If your plugin needs the audio tools (i.e. if you take
care of the output to the soundcard instead of the synthesizer),
you also have to include @file{spd_audio.h}:

@example
#include "spd_audio.h"
@end example

The definition of the macros @code{MODULE_NAME} and @code{MODULE_VERSION}
should follow:

@example
#define MODULE_NAME "flite"
#define MODULE_VERSION "0.1"
@end example

If you want to use the @code{DBG(message)} macro from @file{module_utils.c}
to print out debugging messages, you should insert this line. (Please
don't use printf for debugging; it doesn't work with multiple processes!)
(You will later have to actually start debugging in @code{module_init()}.)

@example
DECLARE_DEBUG();
@end example

You don't have to define the prototypes of the core functions
like module_speak() and module_stop(); these are already
defined in @file{module_utils.h}.

Optionally, if your output module requires some special configuration,
apart from defining voices and configuring debugging (they are handled
differently, see below), you can declare the requested options
here. Each declaration will expand into a dotconf callback and a
declaration of the variable.

(You will later have to actually register these options for
Speech Dispatcher in @code{module_load()}.)

There are currently 4 types of possible configuration options:

@itemize
@item @code{MOD_OPTION_1_INT(name);  /* Set up `int name' */}
@item @code{MOD_OPTION_1_STR(name);  /* Set up `char* name' */}
@item @code{MOD_OPTION_2(name);  /* Set up `char *name[2]' */}
@item @code{MOD_OPTION_@{2,3@}_HT(name);  /* Set up a hash table */}
@end itemize

@xref{Output Modules Configuration}.
For example, Flite uses 2 options:
@example
MOD_OPTION_1_INT(FliteMaxChunkLength);
MOD_OPTION_1_STR(FliteDelimiters);
@end example

Every output module is started in 2 phases: @emph{loading} and
@emph{initialization}.

The goal of loading is to initialize empty structures for storing
settings and declare the DotConf callbacks for parsing configuration
files. In the second phase, initialization, all the configuration has
been read and the output module can accomplish the rest (check if
the synthesizer works, set up threads etc.).

You should start with the definition of @code{module_load()}:

@example
int
module_load(void)
@{
@end example

Then you should initialize the settings tables. These are defined in
@file{module_utils.h} and will be used to store the settings received
via the @code{SET} command.
@example
    INIT_SETTINGS_TABLES();
@end example

Also, define the configuration callbacks for debugging if you use
the @code{DBG()} macro.

@example
    REGISTER_DEBUG();
@end example

Now you can finally register the options for the configuration file
parsing.
Just use these macros:
@itemize
@item @code{MOD_OPTION_1_INT_REG(name, default);  /* for integer parameters */}
@item @code{MOD_OPTION_1_STR_REG(name, default);  /* for string parameters */}
@item @code{MOD_OPTION_MORE_REG(name);  /* for an array of strings */}
@item @code{MOD_OPTION_HT_REG(name);  /* for hash tables */}
@end itemize

Again, an example from Flite:
@example
    MOD_OPTION_1_INT_REG(FliteMaxChunkLength, 300);
    MOD_OPTION_1_STR_REG(FliteDelimiters, ".");
@end example

If you want to enable the mechanism for setting
voices through AddVoice, use the following function (for
an example see @code{generic.c}).

Example from Festival:
@example
    module_register_settings_voices();
@end example

@xref{Output Modules Configuration}.

If everything went correctly, the function should return 0, otherwise -1.

@example
    return 0;
@}
@end example

The second phase of starting an output module is handled by:

@example
int
module_init(void)
@{
@end example

If you use the DBG() macro, you should initialize debugging at the start
of this function. From that moment on, you can use DBG(). Apart from
that, the body of this function is entirely up to you. You should do all
the necessary initialization of the particular synthesizer. All declared
configuration variables and configuration hash tables, together with
the definition of voices, are filled with their values (either default
or read from the configuration), so you can use them already.

@example
    INIT_DEBUG();
    DBG("FliteMaxChunkLength = %d\n", FliteMaxChunkLength);
    DBG("FliteDelimiters = %s\n", FliteDelimiters);
@end example

This function should return 0 if the module was initialized
successfully, or -1 if some failure was encountered.
In this case, you
should clean up everything, cancel threads, deallocate memory etc.; no
more functions of this output module will be touched (except for other
tries to load and initialize the module).

Example from Flite:

@example
    /* Init flite and register a new voice */
    flite_init();
    flite_voice = register_cmu_us_kal();

    if (flite_voice == NULL)@{
        DBG("Couldn't register the basic kal voice.\n");
        return -1;
    @}
    [...]
@end example

The third part is opening the audio. This is commanded
by the @code{AUDIO} protocol command. If the synthesizer is able
to retrieve audio data, it is desirable to open the @code{spd_audio}
output according to the requested parameters and then use this
method for audio output. Audio initialization can be done as
follows:

@example
int
module_audio_init(char **status_info)@{
    DBG("Opening audio");
    return module_audio_init_spd(status_info);
@}
@end example

If it is impossible to retrieve audio from the synthesizer and
the synthesizer itself is used for playback, then the module must
still contain this function, but it should just return 0 and
do nothing.

Now you have to define all the synthesis control functions
@code{module_speak}, @code{module_stop} etc. See @ref{Output Module
Functions}.

At the end, this simple include provides the main() function and all
the functionality related to being an output module of Speech
Dispatcher (parsing argv[] parameters, communicating on stdin/stdout,
...). It's recommended to study this file carefully and try to
understand what exactly it does, as it will be part of the source code
of your output module.

@example
#include "module_main.c"
@end example

If it doesn't work, it's most likely not your fault. Complain!
This
manual is not complete and the instructions in this section aren't
either. Get in touch with us and together we can figure out what's
wrong, fix it and then warn others in this manual.

@node Output Module Functions, Module Utils Functions and Macros, The Skeleton of an Output Module, Output Modules
@subsection Output Module Functions

@deffn {Output Module Functions} int module_speak (char *data, size_t bytes, EMessageType msgtype)
@findex module_speak()

This is the function where the actual speech output is produced. It is
called every time Speech Dispatcher decides to send a message to
synthesis. The data of length @var{bytes} are passed in
a NULL-terminated string @var{data}. The argument @var{msgtype}
defines what type of message it is (different types should be handled
differently, if the synthesizer supports it).

Each output module should take care of setting the output device to
the parameters from msg_settings (see SPDMsgSettings in
@file{module_utils.h}). However, it is not an error if
some of these values are ignored. At least rate, pitch and language
should be set correctly.

Rate and pitch are values between -100 and 100 inclusive. 0 is the
default value that represents the normal speech flow. So -100 is the
slowest (or lowest) and +100 is the fastest (or highest) speech.

The language parameter is given as a NULL-terminated string containing
the name of the language according to RFC 1766 (en, cs, fr, en-US, ...).
If the requested language is not supported by this synthesizer, it's ok
to abort and return 0, because that's an error in the user's settings.

An easy way to set the parameters is using the UPDATE_PARAMETER() and
UPDATE_STRING_PARAMETER() macros. @xref{Module Utils Functions and
Macros}.
Example from Festival:
@example
    UPDATE_STRING_PARAMETER(language, festival_set_language);
    UPDATE_PARAMETER(voice, festival_set_voice);
    UPDATE_PARAMETER(rate, festival_set_rate);
    UPDATE_PARAMETER(pitch, festival_set_pitch);
    UPDATE_PARAMETER(punctuation_mode, festival_set_punctuation_mode);
    UPDATE_PARAMETER(cap_let_recogn, festival_set_cap_let_recogn);
@end example

This function should return 0 if it fails and 1 if the delivery
to the synthesizer is successful. It should return immediately,
because otherwise it would block stopping, priority handling
and other important things in Speech Dispatcher.

If there is a need to stay longer, you should create a separate thread
or process. This is for example the case of some software synthesizers
which use a blocking function (e.g. spd_audio_play) or hardware devices
that have to send data to output modules at some particular
speed. Note that if you use threads for this purpose, you have to set
them to ignore all signals. The simplest way to do this is to call
@code{set_speaking_thread_parameters()}, which is defined in
module_utils.c. Call it at the beginning of the thread code.
@end deffn

@deffn {Output module function} {int module_stop} (void)
@findex module_stop()

This function should stop the synthesis of the currently spoken message
immediately and throw away the rest of the message.

This function should return immediately. Speech Dispatcher will
not send another command until module_report_event_stop() is called.
Note that you cannot call module_report_event_stop() from within
the call to module_stop(). The best thing to do is to emit
the stop event from another thread.

It should return 0 on success, -1 otherwise.
@end deffn

@deffn {Output module function} {size_t module_pause} (void)
@findex module_pause()

This function should stop speaking on the synthesizer (or stop sending
data to the soundcard) just after sending an @code{__spd_} index mark,
so that Speech Dispatcher knows the position of the stop.

The pause can wait for a short time until an index mark is reached.
However, if it's not possible to determine the exact position, this
function should have the same effect as @code{module_stop}.

This function should return immediately.  Speech Dispatcher will not
send another command until module_report_event_pause() is called.
Note that you cannot call module_report_event_pause() from within the
call to module_pause().  The best thing to do is to emit the pause
event from another thread.

For some software synthesizers, the desired effect can be achieved in
this way: when @code{module_speak()} is called, you execute a separate
process and pass it the requested message.  This process cuts the
message into sentences and then runs in a loop, sending the pieces to
synthesis.  If a signal arrives from @code{module_pause()}, you set a
flag and stop the loop at the point where the next piece of text would
be synthesized.

It's not an error if this function is called when the device is not
speaking.  In this case, it should return 0.

Note there is no module_resume() function.  The semantics of
@code{module_pause()} is the same as that of @code{module_stop()},
except that your module should stop after reaching a @code{__spd_}
index mark.  Just like @code{module_stop()}, it should discard the
rest of the message after pausing.  On the next @code{module_speak()}
call, Speech Dispatcher will resend the rest of the message after the
index mark.
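The pause-aware sentence loop described above can be sketched like
this.  All names are illustrative (nothing here comes from
module_utils.h), and piece-level synthesis is simulated:

```c
/* Hypothetical sketch of the pause-aware sentence loop.
   synthesize_piece() simulates speaking one sentence that ends with
   an __spd_ index mark. */

static volatile int pause_requested = 0;
static int last_mark = -1;            /* number of the last index mark reached */

static void synthesize_piece(int piece)
{
    last_mark = piece;                /* "speak" the piece, remember its mark */
}

static int speak_loop(int npieces)
{
    for (int i = 0; i < npieces; i++) {
        if (pause_requested)          /* flag set by module_pause() */
            break;                    /* stop before the next piece starts */
        synthesize_piece(i);
    }
    return last_mark;                 /* report the last __spd_ mark reached */
}
```

Because the loop checks the flag only between pieces, the stop always
falls on an index mark boundary, which is exactly what Speech
Dispatcher needs to resume later.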
@end deffn


@node Module Utils Functions and Macros, Index Marks in Output Modules, Output Module Functions, Output Modules
@subsection Module Utils Functions and Macros

This section describes the various variables, functions and macros
available in the @file{module_utils.h} file.  They are intended to
make writing new output modules easier and to allow the programmer to
reuse existing pieces of code instead of writing everything from
scratch.

@menu
* Initialization Macros and Functions::
* Generic Macros and Functions::
* Functions used by module_main.c::
* Functions for use when talking to synthesizer::
* Multi-process output modules::
* Memory Handling Functions::
@end menu

@node Initialization Macros and Functions, Generic Macros and Functions, Module Utils Functions and Macros, Module Utils Functions and Macros
@subsubsection Initialization Macros and Functions

@deffn {Module Utils macro} INIT_SETTINGS_TABLES ()
@findex INIT_SETTINGS_TABLES
This macro initializes the settings tables where the parameters
received with the @code{SET} command are stored.  You must call this
macro if you want to use the @code{UPDATE_PARAMETER()} and
@code{UPDATE_STRING_PARAMETER()} macros.

It is intended to be called from inside a function just after the
output module starts.
@end deffn

@subsubsection Debugging Macros
@deffn {Module Utils macro} DBG (format, ...)
@findex DBG
DBG() outputs a debugging message, if the @code{Debug} option in the
module's configuration is set, to the file specified in the
configuration as @code{DebugFile}.  The parameter syntax is the same
as for the printf() function; in fact, it calls printf() internally.
@end deffn

@deffn {Module Utils macro} FATAL (text)
@findex FATAL
Outputs the message specified as @code{text} and calls exit() with the
value EXIT_FAILURE.
This terminates the whole output module without trying to kill the
child processes or free resources other than those that will be freed
by the system.

It is intended to be used after some severe error has occurred.
@end deffn

@node Generic Macros and Functions, Functions used by module_main.c, Initialization Macros and Functions, Module Utils Functions and Macros
@subsubsection Generic Macros and Functions

@deffn {Module Utils macro} UPDATE_PARAMETER (param, setter)
@findex UPDATE_PARAMETER
Tests if the integer or enum parameter specified in @code{param}
(e.g. rate, pitch, cap_let_recogn, ...) has changed since the last
time the @code{setter} function was called.

If it has changed, it calls the function @code{setter} with the new
value.  (The new value is stored in the msg_settings structure that is
created by module_utils.h, which you normally don't have to care
about.)

The function @code{setter} should be defined as:
@example
void setter_name(type value);
@end example

Please look at the @code{SET} command in the communication protocol
for the list of all available parameters.
@pxref{Communication Protocol for Output Modules}.

An example from the Festival output module:
@verbatim
static void
festival_set_rate(signed int rate)
{
    assert(rate >= -100 && rate <= +100);
    festivalSetRate(festival_info, rate);
}
[...]
int
module_speak(char *data, size_t bytes, EMessageType msgtype)
{
    [...]
    UPDATE_PARAMETER(rate, festival_set_rate);
    UPDATE_PARAMETER(pitch, festival_set_pitch);
    [...]
}
@end verbatim
@end deffn

@deffn {Module Utils macro} UPDATE_STRING_PARAMETER (param, setter)
@findex UPDATE_STRING_PARAMETER
The same as @code{UPDATE_PARAMETER} except that it works for
parameters with a string value.
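As an illustration of what such a macro has to do internally, here is
a minimal re-creation of the change-detection logic for a single
string parameter.  The names @code{update_string_param},
@code{stored_language} and @code{set_language} are ours, not the real
module_utils.h implementation:

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative sketch only: remember the last value set and call the
   setter only when the value has changed, which is the behaviour
   UPDATE_STRING_PARAMETER is described to have. */

static char *stored_language = NULL;
static int setter_calls = 0;

static void set_language(const char *lang)
{
    (void)lang;                        /* a real setter would talk to the synth */
    setter_calls++;
}

static void update_string_param(const char *new_value,
                                void (*setter)(const char *))
{
    if (stored_language == NULL || strcmp(stored_language, new_value) != 0) {
        free(stored_language);
        stored_language = malloc(strlen(new_value) + 1);
        strcpy(stored_language, new_value);   /* keep a copy for next time */
        setter(stored_language);
    }
}
```

Calling it twice with the same value invokes the setter only once;
this is what keeps repeated @code{SET} commands from hammering the
synthesizer with redundant settings.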
@end deffn

@node Functions used by module_main.c, Functions for use when talking to synthesizer, Generic Macros and Functions, Module Utils Functions and Macros
@subsubsection Functions used by @file{module_main.c}

@deffn {Module Utils function} char* do_speak(void)
@findex do_speak
Takes care of communication after the @code{SPEAK} command was
received.  Calls @code{module_speak()} when the full text is received.

It returns a response according to the communication protocol.
@end deffn

@deffn {Module Utils function} char* do_stop(void)
@findex do_stop
Calls the @code{module_stop()} function of the particular output
module.

It returns a response according to the communication protocol.
@end deffn

@deffn {Module Utils function} char* do_pause(void)
@findex do_pause
Calls the @code{module_pause()} function of the particular output
module.

It returns a response according to the communication protocol and the
value returned by @code{module_pause()}.
@end deffn

@deffn {Module Utils function} char* do_set()
@findex do_set
Takes care of communication after the @code{SET} command was received.
It doesn't call any particular function of the output module, it only
sets the values in the settings tables.  (You should then call the
@code{UPDATE_PARAMETER()} macro in module_speak() to actually set the
synthesizer to these values.)

It returns a response according to the communication protocol.
@end deffn

@deffn {Module Utils function} char* do_speaking()
@findex do_speaking
Calls the @code{module_speaking()} function.

It returns a response according to the communication protocol and the
value returned by @code{module_speaking()}.
@end deffn

@deffn {Module Utils function} void do_quit()
@findex do_quit
Prints the farewell message to the standard output, according to the
protocol.  Then it calls @code{module_close()}.
@end deffn

@node Functions for use when talking to synthesizer, Multi-process output modules, Functions used by module_main.c, Module Utils Functions and Macros
@subsubsection Functions for use when talking to synthesizer

@deffn {Module Utils function} static int module_get_message_part ( const char* message, char* part, unsigned int *pos, size_t maxlen, const char* dividers)
@findex module_get_message_part

Gets a part of the @code{message} according to the specified
@code{dividers}.

It scans the text in @code{message} from the byte specified by
@code{*pos} and looks for one of the characters specified in
@code{dividers} followed by a whitespace character or the terminating
NULL byte.  If one of them is encountered, the read text is stored in
@code{part} and the number of bytes read is returned.  If the end of
@code{message} is reached, the return value is -1.

@code{message} is the text to process.  It must be a NULL-terminated
uni-byte string.

@code{part} is a pointer to the place where the output text should be
stored.  It must contain at least @code{maxlen} bytes of space.

@code{maxlen} is the maximum number of bytes that should be written to
@code{part}.

@code{dividers} is a NULL-terminated uni-byte string containing the
punctuation characters at which the message should be divided into
smaller parts (if they are followed by whitespace).

After returning, @code{pos} is the position where the function
terminated in processing @code{message}.
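A simplified, self-contained version of this splitting logic might
look as follows.  This is a sketch of the behaviour described above,
not the actual module_utils.c code; in particular, it returns -1 only
when no input is left, rather than on every end-of-message condition:

```c
#include <ctype.h>
#include <string.h>

/* Simplified sketch: copy bytes from message starting at *pos into
   part until a divider character followed by whitespace (or the end
   of the string) is found.  Returns the number of bytes copied, or
   -1 when nothing is left to read. */

static int get_message_part(const char *message, char *part,
                            unsigned int *pos, size_t maxlen,
                            const char *dividers)
{
    size_t n = 0;

    if (message[*pos] == '\0')
        return -1;                          /* nothing left to read */

    while (message[*pos] != '\0' && n < maxlen - 1) {
        part[n++] = message[(*pos)++];
        if (strchr(dividers, part[n - 1]) != NULL
            && (message[*pos] == '\0'
                || isspace((unsigned char)message[*pos])))
            break;                          /* divider + whitespace ends the part */
    }
    part[n] = '\0';
    return (int)n;
}
```

Repeated calls with the same @code{pos} variable walk through the
message one sentence-like piece at a time, which is how the
multi-process helpers below consume it.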
@end deffn

@deffn {Output module function} void module_report_index_mark(char *mark)
@findex module_report_index_mark
@end deffn
@deffn {Output module function} void module_report_event_*()
@findex module_report_event_*

The @code{module_report_} functions serve for reporting event
notifications and index marking events.  You should use them whenever
you get an event from the synthesizer which is defined in the output
module communication protocol.

Note that you cannot call these functions from within a call to
module_speak(), module_stop(), or module_pause().  The best way to
handle this is to emit the events from another thread.

@end deffn

@deffn {Output module function} int module_close(void)
@findex module_close

This function is called when Speech Dispatcher terminates.  The
output module should terminate all threads and processes, free all
resources, close all sockets etc.  Never assume this function is
called only when Speech Dispatcher terminates and that exit(0) will do
the work for you.  It's perfectly ok for Speech Dispatcher to load,
unload or reload output modules in the middle of its run.

@end deffn

@node Multi-process output modules, Memory Handling Functions, Functions for use when talking to synthesizer, Module Utils Functions and Macros
@subsubsection Multi-process output modules

@deffn {Module Utils function} size_t module_parent_wfork ( TModuleDoublePipe dpipe, const char* message, SPDMessageType msgtype, const size_t maxlen, const char* dividers, int *pause_requested)
@findex module_parent_wfork

It simply sends the data to the child in smaller pieces and waits for
confirmation with a single @code{C} character on the pipe from the
child to the parent.
@code{dpipe} is a parameter which contains the information necessary
for communicating through pipes between the parent and the child and
vice-versa.

@example
typedef struct@{
   int pc[2];    /* Parent to child pipe */
   int cp[2];    /* Child to parent pipe */
@}TModuleDoublePipe;
@end example

@code{message} is a pointer to a NULL-terminated string containing the
message for synthesis.

@code{msgtype} is the type of the message for synthesis.

@code{maxlen} is the maximum number of bytes that should be
transferred over the pipe.

@code{dividers} is a NULL-terminated string containing the punctuation
characters at which this function should divide the message into
smaller pieces.

@code{pause_requested} is a pointer to an integer flag, which is
either 0 if no pause request is pending, or 1 if the function should
terminate at a convenient place in the message because a pause is
requested.

At the beginning, it initializes the pipes and then it enters a simple
cycle:
@enumerate
@item
Reads a part of the message or an index mark using
@code{module_get_message_part()}.
@item
Checks whether there is a pending request for a pause and handles it.
@item
Sends the current part of the message to the child using
@code{module_parent_dp_write()}.
@item
Waits until a single character @code{C} comes from the other pipe
using @code{module_parent_dp_read()}.
@item
Repeats the cycle, or terminates if there is no more data.
@end enumerate
@end deffn

@deffn {Module Utils function} int module_parent_wait_continue(TModuleDoublePipe dpipe)
@findex module_parent_wait_continue
Waits until the character @code{C} (continue) is read from the pipe
from the child.  This function is intended to be run from the parent.

@code{dpipe} is the double pipe used for communication between the
child and parent.
Returns 0 if the character was read or 1 if the pipe was broken before
the character could be read.
@end deffn

@deffn {Module Utils function} void module_parent_dp_init (TModuleDoublePipe dpipe)
@findex module_parent_dp_init
Initializes the pipes (dpipe) in the parent.  Currently it only closes
the unnecessary ends.
@end deffn

@deffn {Module Utils function} void module_child_dp_close (TModuleDoublePipe dpipe)
@findex module_child_dp_init
Initializes the pipes (dpipe) in the child.  Currently it only closes
the unnecessary ends.
@end deffn

@deffn {Module Utils function} void module_child_dp_write(TModuleDoublePipe dpipe, const char *msg, size_t bytes)
@findex module_child_dp_write
Writes the specified number of @code{bytes} from @code{msg} into the
pipe to the parent.  This function is intended, as the prefix says, to
be run from the child.  Uses the pipes defined in @code{dpipe}.
@end deffn

@deffn {Module Utils function} void module_parent_dp_write(TModuleDoublePipe dpipe, const char *msg, size_t bytes)
@findex module_parent_dp_write
Writes the specified number of @code{bytes} from @code{msg} into the
pipe to the child.  This function is intended, as the prefix says, to
be run from the parent.  Uses the pipes defined in @code{dpipe}.
@end deffn

@deffn {Module Utils function} int module_child_dp_read(TModuleDoublePipe dpipe, char *msg, size_t maxlen)
@findex module_child_dp_read
Reads up to @code{maxlen} bytes from the pipe from the parent into the
buffer @code{msg}.  This function is intended, as the prefix says, to
be run from the child.  Uses the pipes defined in @code{dpipe}.
@end deffn

@deffn {Module Utils function} int module_parent_dp_read(TModuleDoublePipe dpipe, char *msg, size_t maxlen)
@findex module_parent_dp_read
Reads up to @code{maxlen} bytes from the pipe from the child into the
buffer @code{msg}.
This function is intended, as the prefix says, to be run from the
parent.  Uses the pipes defined in @code{dpipe}.
@end deffn

@deffn {Module Utils function} void module_sigblockall(void)
@findex module_sigblockall
Blocks all signals.  This is intended to be run from the child
processes and threads so that their signal handling won't interfere
with the parent.
@end deffn

@deffn {Module Utils function} void module_sigunblockusr(sigset_t *some_signals)
@findex module_sigunblockusr
Uses the set @code{some_signals} to unblock SIGUSR1.
@end deffn

@deffn {Module Utils function} void module_sigblockusr(sigset_t *some_signals)
@findex module_sigblockusr
Uses the set @code{some_signals} to block SIGUSR1.
@end deffn

@node Memory Handling Functions, , Multi-process output modules, Module Utils Functions and Macros
@subsubsection Memory Handling Functions

@deffn {Module Utils function} static void* xmalloc (size_t size)
@findex xmalloc
The same as the classical @code{malloc()} except that it executes
@code{FATAL(``Not enough memory'')} on error.
@end deffn

@deffn {Module Utils function} static void* xrealloc (void *data, size_t size)
@findex xrealloc
The same as the classical @code{realloc()} except that it also accepts
@code{NULL} as @code{data}.  In this case, it behaves as
@code{xmalloc}.
@end deffn

@deffn {Module Utils function} void xfree(void *data)
@findex xfree
The same as the classical @code{free()} except that it checks whether
@code{data} isn't NULL before calling @code{free()}.
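A minimal sketch of these three wrappers, assuming FATAL() ultimately
just prints a message and exits (the @code{_sketch} suffixes stress
that these are illustrations, not the real module_utils.c code):

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative sketches of the memory wrappers described above. */

static void *xmalloc_sketch(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "Not enough memory\n");   /* stands in for FATAL() */
        exit(EXIT_FAILURE);
    }
    return p;
}

static void *xrealloc_sketch(void *data, size_t size)
{
    void *p = realloc(data, size);   /* realloc(NULL, n) behaves as malloc(n) */
    if (p == NULL) {
        fprintf(stderr, "Not enough memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

static void xfree_sketch(void *data)
{
    if (data != NULL)                /* free(NULL) is legal, but be explicit */
        free(data);
}
```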
@end deffn

@node Index Marks in Output Modules, , Module Utils Functions and Macros, Output Modules
@subsection Index Marks in Output Modules

Output modules need to provide some kind of synchronization and they
have to give Speech Dispatcher back some information about what part
of the message is currently being said.  On the other hand, output
modules are not able to tell the exact position in the text, because
various conversions and message processing take place (sometimes
punctuation and spelling substitution, the message needs to be recoded
from a multibyte to a unibyte encoding etc.) before the text reaches
the synthesizer.

For this reason, Speech Dispatcher places so-called index marks in the
text it sends to its output modules.  They have the form:

@example
<mark name="id"/>
@end example

@code{id} is the identifier associated with each index mark.  Within a
@code{module_speak()} message, each identifier is unique.  It consists
of the string @code{__spd_} and a counter number.  Numbers begin from
zero for each message.  For example, the fourth index mark within a
message looks like

@example
<mark name="__spd_id_3"/>
@end example

When an index mark is reached, its identifier should be stored so that
the output module is able to tell Speech Dispatcher the identifier of
the last index mark.  Also, index marks are the best places to stop
when the module is requested to pause (although it's ok to stop at
some place close by and report the last index mark).

Notice that index marks are in SSML format using the @code{mark} tag.

@node Download and Contact, Reporting Bugs, Server Programming, Top
@chapter Download

You can download Speech Dispatcher's latest release source code from
@uref{http://www.freebsoft.org/speechd}.
There is also information there on how to set up anonymous access to
our git repository.

However, you may prefer to download Speech Dispatcher as a binary
package for your system.  We don't distribute such packages ourselves.
If you run Debian GNU/Linux, it should be in the central repository
under the name @code{speech-dispatcher} or @code{speechd}.  If you run
an rpm-based distribution like RedHat, Mandrake or SuSE Linux, please
try to look at @uref{http://www.rpmfind.net/}.

If you want to contact us, please look at
@uref{http://www.freebsoft.org/contact}
or use the email @email{users@@lists.freebsoft.org}.

@node Reporting Bugs, How You Can Help, Download and Contact, Top
@chapter Reporting Bugs

If you believe you have found a bug in Speech Dispatcher, we will be
very grateful if you let us know about it.  Please report it by email
to @email{speechd@@bugs.freebsoft.org}, but please don't send us
messages larger than half a megabyte unless we ask you to.

Reporting a bug in a way that is useful for the developers is not as
easy as it may seem.  Here are some hints that you should follow in
order to give us the best information, so that we can find and fix the
bug easily.

First of all, please try to describe the problem as exactly as you
can.  We prefer raw data over speculations about where the problem may
lie.  Please try to explain in what situation the bug happens.  Even
if it's a general bug that happens in many situations, please try to
describe at least one case in as much detail as possible.

Also, please specify the versions of the programs you use when the bug
happens.  This means not only Speech Dispatcher, but also the client
application you use (speechd-el, say, etc.) and the synthesizer name
and version.

If you can reproduce the bug, please also send us the log file.
This is very useful, because otherwise we may not be able to reproduce
the bug with our configuration and program versions, which differ from
yours.  The logging priority must be set to at least 4, but preferably
5, for the log to be useful for debugging purposes.  You can do so in
@file{etc/speech-dispatcher/speechd.conf} by modifying the variable
@code{LogLevel}.  Also, you may want to modify the log destination
with the variable @code{LogFile}.  After modifying these options,
please restart Speech Dispatcher and repeat the situation in which the
bug happens.  After it has happened, please take the log and attach it
to the bug report, preferably compressed using @code{gzip}.  But note
that when logging with level 5, all the data that comes from Speech
Dispatcher is also recorded, so make sure there is no sensitive
information in it when you are reproducing the bug.  Please make sure
you switch back to priority 3 or lower afterwards, because priority 4
or 5 logging produces really huge logs.

If you are a programmer and you find a bug that is reproducible in
SSIP, you can send us the sequence of SSIP commands that leads to the
bug (preferably from the start of the connection).  You can also try
to reproduce the bug in a simple test script under
@file{speech-dispatcher/src/tests} in the source tree.  Please check
@file{speech-dispatcher/src/tests/README} and see the other test
scripts there for an example.

When the bug is a segmentation fault, a backtrace from gdb is also
valuable, but if you are not familiar with gdb, don't bother with
that; we may ask you to do it later.

Finally, you may also send us a guess of what you think happens in
Speech Dispatcher to cause the bug, but this is usually not very
helpful.  If you are able to provide additional technical information
instead, please do so.
@node How You Can Help, Appendices, Reporting Bugs, Top
@chapter How You Can Help

If you want to contribute to the development of Speech Dispatcher, we
will be very happy if you do so.  Please contact us at
@email{users@@lists.freebsoft.org}.

Here is a short, definitely not exhaustive, list of how you can help
us and other users.

@itemize
@item
@emph{Donate money:} We are a non-profit organization and we can't
work without funding.  Brailcom, o.p.s. created Speech Dispatcher and
speechd-el and also works on other projects to help blind and visually
impaired computer users.  We build on Free Software and GNU/Linux
because we believe this is the right way.  But it won't be possible if
we have no money.  @uref{http://www.freebsoft.org/}

@item
@emph{Report bugs:} Every user, even one who can't give us money and
is not a programmer, can help us very much just by using our software
and telling us about the bugs and inconveniences they encounter.  A
good user community that reports bugs is a crucial part of the
development of a good Free Software package.  We can't test our
software under all circumstances and on all platforms, so each
constructive bug report is highly appreciated.  You can report bugs in
Speech Dispatcher to @email{speechd@@bugs.freebsoft.org}.

@item
@emph{Write or modify an application to support speech synthesis:}
With Speech Dispatcher, we have provided an interface that allows
applications easy access to speech synthesis.  However powerful, it's
no more than an interface, and it's useless on its own.  Now it's time
to write the particular client applications, or modify existing
applications, so that they can support speech synthesis.  This is
useful if the application needs a specific interface for blind people
or if it wants to use speech synthesis for educational or other
purposes.
@item
@emph{Develop new voices and language definitions for Festival:} In
the world of Free Software, Festival is currently the most promising
interface for Text-to-Speech processing and speech synthesis.  It's an
extensible and highly configurable platform for developing synthetic
voices.  If there is a lack of synthetic voices, or no voices at all,
for some language, we believe the wisest solution is to try to develop
a voice in Festival.  It's certainly not advisable to develop your own
synthesizer if the goal is producing a quality voice system in a
reasonable time.  The Festival developers provide nice documentation
about how to develop a voice, and a lot of tools that help with doing
so.  We found that some language definitions can be constructed by
cannibalizing the already existing definitions and can be tuned later.
As for the voice samples, one can temporarily use the MBROLA project
voices.  But please note that, although they are downloadable for free
(as in price), they are not Free Software, and it would be wonderful
if we could replace them with Free Software alternatives as soon as
possible.  See @uref{http://www.cstr.ed.ac.uk/projects/festival/}.

@item
@emph{Help us with this or other Free-b-Soft projects:} Please look at
@uref{http://www.freebsoft.org} to find information about our
projects.  There is plenty of work to be done to make working with
computers easier for blind and visually impaired people.

@item
@emph{Spread the word about Speech Dispatcher and Free Software:} You
can help us, and the whole community around Free Software, just by
telling your friends about the amazing world of Free Software.  It
doesn't have to be just about Speech Dispatcher; you can tell them
about other projects or about Free Software in general.
Remember that Speech Dispatcher could only arise out of the
understanding some people have of the principles and ideas behind Free
Software.  And this is mostly the same for the rest of the Free
Software world.  See @uref{http://www.gnu.org/} for more information
about GNU/Linux and Free Software.

@end itemize

@node Appendices, GNU General Public License, How You Can Help, Top
@appendix Appendices

@node GNU General Public License, GNU Free Documentation License, Appendices, Top
@appendix GNU General Public License
@center Version 2, June 1991
@cindex GPL, GNU General Public License

@include gpl.texi

@node GNU Free Documentation License, Index of Concepts, GNU General Public License, Top
@appendix GNU Free Documentation License
@center Version 1.2, November 2002
@cindex FDL, GNU Free Documentation License

@include fdl.texi

@node Index of Concepts, , GNU Free Documentation License, Top
@unnumbered Index of Concepts

@printindex cp

@bye

@c LocalWords: texinfo setfilename speechd settitle finalout syncodeindex pg
@c LocalWords: setchapternewpage cp fn vr texi dircategory direntry titlepage
@c LocalWords: Cerha Hynek Hanke vskip pt filll insertcopying ifnottex dir fd
@c LocalWords: API SSIP cindex printf ISA pindex Flite Odmluva FreeTTS TTS CR
@c LocalWords: ViaVoice Lite Tcl Zandt wxWindows AWT spd dfn backend findex
@c LocalWords: src struct gchar gint const OutputModule intl FDSetElement len
@c LocalWords: fdset init flite deffn TFDSetElement var int enum EVoiceType
@c LocalWords: sayf ifinfo verbatiminclude ref UTF ccc ddd pxref LF cs conf
@c LocalWords: su AddModule DefaultModule xref identd printindex Dectalk GTK

@c speechd.texi ends here
@c LocalWords: emph soundcard precission archieved succes Dispatcher When