1Digital Speech Decoder 1.6.0
2Copyright (C) 2010 DSD Author
3GPG Key ID: 0x3F1D7FD0 (74EF 430D F7F2 0A48 FCE6 F630 FAA2 635D 3F1D 7FD0)
4
5Permission to use, copy, modify, and/or distribute this software for any
6purpose with or without fee is hereby granted, provided that the above
7copyright notice and this permission notice appear in all copies.
8
9THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
10REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
11AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT,
12INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
13LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
14OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
15PERFORMANCE OF THIS SOFTWARE.
16
17 DSD is able to decode several digital voice formats from discriminator
18 tap audio and synthesize the decoded speech. Speech
19 synthesis requires mbelib, which is a separate package. DSD 1.4.1
20 requires mbelib 1.1 or later.
21
22 Supported formats in version 1.6.0:
23
24 P25 Phase 1 Widely deployed radio standard used in public safety
25 and amateur radio.
26
 Support includes decoding and synthesis of speech,
 display of all link control info, and the ability to save
 and replay .imb data files.
30
31 ProVoice EDACS Digital voice format used by public safety and
32 amateur radio.
33
34 Support includes decoding and synthesis of speech and
35 the ability to save and replay .imb data files.
36 Note: not enabled by default, use -fp to enable.
37
38 X2-TDMA Two slot TDMA system currently being deployed by several
39 public safety organizations. Based on the DMR
40 standard with extensions for P25 style signaling.
41
 Support includes decoding and synthesis of speech,
 display of all link control info, and the ability to save
 and replay .amb data files.
45
 DMR/MOTOTRBO "Digital Mobile Radio" European two slot TDMA standard.
 MOTOTRBO is a popular implementation of this standard.
48
49 Support includes decoding and synthesis of speech and
50 the ability to save and replay .amb data files.
51
52 NXDN Digital radio standard used by NEXEDGE and IDAS brands.
53 Supports both 9600 baud (12.5 kHz) and
54 4800 baud (6.25 kHz) digital voice.
55
56 Support includes decoding and synthesis of speech and
57 the ability to save and replay .amb data files.
58
59 Development (no speech) support only:
60
61 D-STAR Amateur radio digital voice standard
62
 Development support only. DSD recognizes frames and can
 extract the voice bits, but speech is not yet decoded.
 D-STAR likely uses a version of AMBE not yet supported by
 mbelib. The voice bit interleave pattern also needs to be
 determined.
 Note: not enabled by default, use -fd to enable.
69
70 Unsupported formats in version 1.4 considered for future development:
71
72 P25 Phase 2 This is not yet a published standard. Full support is
73 expected once the standard is published and there are
74 systems operating to test against. Phase 2 will use
75 a vocoder supported by mbelib.
76
77 OpenSKY It is possible that the four slot version uses a vocoder
78 supported by mbelib. The two slot version does not.
79
80 Supported demodulation optimizations in version 1.4:
81
82 C4FM Continuous envelope 2 or 4 level FSK with relatively
83 sharp transitions between symbols. Used by most P25
84 systems.
85
86 Optimizations include calibrating decision points only
87 during sync, 4/10 sample window per symbol, and symbol
88 edge timing calibration.
89
 GFSK Continuous envelope 2 or 4 level FSK with a narrower
 Gaussian/"raised cosine" filter that affects transitions
 between symbols. Used by DMR/MOTOTRBO, NXDN and many
 others. Noisy C4FM signals may be detected as GFSK,
 but this is OK: the change in optimizations will help with
 noisy signals.
97
98 Optimizations are similar to C4FM except symbol transitions
99 are only kept out of the middle 4 samples and only the
100 middle two samples are used.
101
 QPSK Quadrature Phase Shift Keying (and variants) used in
 some P25 systems and all known X2-TDMA systems. May be
 advertised under the marketing term "LSM".
105
106 Optimizations include continuous decision point
107 calibration, using middle two samples, and using the
108 symbol midpoint "spike" for symbol timing.
109
110Installation
111
112 DSD should easily compile on any Linux or *BSD system with gcc.
113 Just untar and run "make" or "make install". There are some debugging/
114 development options in config.h that normal users will want to leave
115 disabled as they can severely impact performance.
116
117Operation
118
 There are two main operating modes: "Live scanner" and "Play files".
120
121 Usage: dsd [options] Live scanner mode
122
 Live scanner mode takes 48 kHz/16 bit mono audio samples from a
 sound card input and decodes speech in real time. Options are provided
 for controlling information display and saving mbe data files.
 The synthesized speech can be output to a sound card and/or a
 .wav file.
128
129 Usage: dsd [options] -r <files> Read/Play saved mbe data from file(s)
130
131
 Play files mode reads mbe data from files specified on the command
 line (including wildcards) and synthesizes speech from those files.
 The synthesized speech can be output to a sound card and/or a
 .wav file. The -r command line option is used to activate Play files
 mode.
137
138Display modes
139
 There are two main display modes in Live scanner mode: "Errorbars"
 and "Datascope".
142
143 Errorbars mode output for P25 Phase 1 looks like this:
144
145
146Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
147Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
148Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
149Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
150Sync: -P25p1 mod: C4FM inlvl: 38% nac: 5C2 src: 0 tg: 32464 TDU
151Sync: -P25p1 mod: C4FM inlvl: 38% nac: 5C2 src: 0 tg: 32464 HDU
152Sync: -P25p1 mod: C4FM inlvl: 42% nac: 5C2 src: 0 tg: 32464 LDU1 e:
153Sync: (-P25p1) mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 (LDU2) e:
154Sync: -P25p1 mod: C4FM inlvl: 38% nac: 5C2 src: 52610 tg: 32464 LDU1 e:
155Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU2 e:
156Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU1 e:
157Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU2 e:
158Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU1 e:
159
 "Sync" indicates the frame type detected and whether the polarity is
 positive or negative. DSD automatically detects and handles either
 polarity except for DMR/MOTOTRBO/X2-TDMA, which unfortunately use both
 sync polarities.

 Most combinations of transmitter, receiver and sound card show negative
 (-) polarity for X2-TDMA signals and positive (+) polarity for
 DMR/MOTOTRBO, so those are the defaults.
168
169********
 You may need to use the -xx or -xr option (see Decoder options) to
 select the opposite polarity if you are not getting usable
 X2-TDMA/MOTOTRBO/DMR speech. As they use both normal and inverted sync
 it is not possible to detect polarity automatically.
174********
175
176 "mod" indicates the current demodulation optimizations.
177
 "inlvl" indicates the audio input level. QPSK signals tend to appear
 much "wider" than C4FM from a discriminator tap, so it is important
 to set your input gain using a QPSK signal if you plan to monitor them.
 It is neither necessary nor desirable to get to 100%; in fact your sound
 card may max out below 100%. It is best to use the Datascope mode for
 setting input gain (see below). Typical values with good results are
 40% for C4FM and 66% for QPSK.
185
186 "nac" is the P25 Phase 1 Network Access Code. This is a 12 bit field
187 in each P25 Phase 1 header. It should not be confused with the 16
188 bit System ID used in non-P25 trunking control channels.
189
 "src" is the radio id of the transmitting subscriber unit.
191
192 "tg" is the talkgroup derived from link control information.
193
194 "HDU/LDU1/LDU2/TDU/TDULC" are P25 Phase 1 frame types, referred to as
195 frame subtype within DSD.
196
 "e:" is the beginning of the errorbars display. Each "=" indicates a
 detected error within the voice data. "R" and "M" indicate that a voice
 frame was repeated or muted due to excessive errors.
200
201 Values in parentheses () indicate an assumption (soft decision) was
202 made based on the previous frame.
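
 If you want to log or post-process Errorbars output, the fields described
 above can be pulled out of each line with a small script. The following
 is a minimal sketch in Python, not part of DSD. The regular expression
 and the function name parse_errorbars are assumptions based only on the
 sample lines shown above, so other frame types or verbosity levels may
 need adjustments.

```python
import re

# Hypothetical parser: the field labels come from the sample output above,
# but the regular expression itself is an assumption, not part of DSD.
LINE_RE = re.compile(
    r"Sync:\s*(?P<sync>\(?[+-][^\s)]+\)?)\s+"
    r"mod:\s*(?P<mod>\S+)\s+"
    r"inlvl:\s*(?P<inlvl>\d+)%\s+"
    r"(?:nac:\s*(?P<nac>[0-9A-Fa-f]+)\s+)?"   # nac is only shown for P25
    r"src:\s*(?P<src>\d+)\s+"
    r"tg:\s*(?P<tg>\d+)"
    r"(?P<rest>.*)$"
)


def parse_errorbars(line):
    """Return a dict of fields from one Errorbars line, or None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    sync = m.group("sync")
    # Everything before "e:" is the frame subtype (and TDMA slot info);
    # everything after it is the errorbars display itself.
    subtype, _, bars = m.group("rest").partition("e:")
    return {
        "assumed": sync.startswith("("),      # parentheses = soft decision
        "polarity": sync.strip("()")[0],      # "+" or "-"
        "frame_type": sync.strip("()")[1:],
        "mod": m.group("mod"),
        "inlvl": int(m.group("inlvl")),
        "nac": int(m.group("nac"), 16) if m.group("nac") else None,
        "src": int(m.group("src")),
        "tg": int(m.group("tg")),
        "subtype": subtype.strip(),
        "errors": bars.count("="),
        "repeated": bars.count("R"),
        "muted": bars.count("M"),
    }
```

 The "nac" value is parsed as hexadecimal, so a 12 bit NAC such as 5C2
 becomes the integer 0x5C2.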
203
204 Errorbars mode output for X2-TDMA looks like this:
205
206Sync: -X2-TDMA mod: QPSK inlvl: 59% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:
207Sync: -X2-TDMA mod: QPSK inlvl: 47% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:
208Sync: -X2-TDMA mod: QPSK inlvl: 43% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:
209Sync: (-X2-TDMA) mod: QPSK inlvl: 28% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:
210
211 DMR/MOTOTRBO display is similar except it does not yet show source
212 and talkgroup information.
213
 As of version 1.2 DSD shows which specific TDMA slots are active (with
 capital SLOT letters) and which slot is currently being monitored (with
 square brackets []). Noisy/degraded signals will affect the accuracy
 of this display.
218
219 The frame subtypes (Voice/LC etc) are shown based on the DMR standard
220 types.
221
222 Datascope mode output looks like this:
223
224
225Demod mode: C4FM Nac: 8C3
226Frame Type: P25 Phase 1 Talkgroup: 16528
227Frame Subtype: LDU1 Source: 0
228TDMA activity: slot0 slot1 Voice errors:
229+----------------------------------------------------------------+
230| # ^ !| ^ # |
231| * | * |
232| * | * |
233| * | * * |
234| * * | * * |
235| * * | ** * |
236| * ** | ** * |
237| ** ** | ** * |
238| ** ** | ** * |
239| ** ** | ** * |
240+----------------------------------------------------------------+
241 C4FM Example
242
243Demod mode: C4FM Nac: 126
244Frame Type: P25 Phase 1 Talkgroup: 25283
245Frame Subtype: LDU2 Source: 0
246TDMA activity: slot0 slot1 Voice errors:
247+----------------------------------------------------------------+
248| # ^ ! ^ # |
249| * | |
250| * | |
251| ** | |
252| ** | * |
253| * ** | * * |
254| ** ** | * * |
255| *** ** | ** * |
256| *** ** | *** * |
257| *** **** | **** * * |
258+----------------------------------------------------------------+
259 QPSK Example
260
261
262 At the top is various information about the signal, similar to the
263 information provided in Errorbars mode. The large box is similar to
264 a spectrum analyzer viewing the channel bandwidth.
265
 The horizontal axis is the input audio level, minimum on the left and
 maximum on the right. The vertical axis is the number of samples
 seen at each audio level.
269
 The "*" symbols represent the number of audio samples that were at
 each level during the aggregation period (default = 36 symbols).
 The -S option controls the aggregation period as well as the QPSK
 tracking symbol buffer, so changing it will affect QPSK performance
 as well as the Datascope display.
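
 To make the axes concrete, here is a rough Python sketch of how such a
 display could be built from a buffer of 16 bit samples. It is not DSD's
 code. The 64 column width matches the box above, but the full-scale
 binning range, the 10 row height and the function name datascope are
 assumptions made for illustration.

```python
def datascope(samples, width=64, height=10, lo=-32768, hi=32767):
    """Render a text histogram of sample levels (illustrative only)."""
    counts = [0] * width
    span = hi - lo + 1
    for s in samples:
        s = min(max(s, lo), hi)                  # clamp to the display range
        counts[(s - lo) * width // span] += 1    # column for this audio level
    peak = max(counts) or 1                      # avoid dividing by zero
    border = "+" + "-" * width + "+"
    rows = [border]
    # Draw from the top row down; a column gets a "*" on every row at or
    # below its scaled height.
    for level in range(height, 0, -1):
        row = "".join("*" if c * height >= level * peak else " " for c in counts)
        rows.append("|" + row + "|")
    rows.append(border)
    return "\n".join(rows)
```

 A clean four level signal fed to this function produces four narrow
 columns, while a noisy or distorted one produces broad humps.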
275
 As you can see from the figures above, clean C4FM signals tend to have
 four very sharply defined audio levels. The datascope pattern also
 tends to be fairly stable with minor shifts left and right as the
 receiver tries to frequency track any DC offset.

 QPSK signals on the other hand tend to appear much broader (an artifact
 of how they are distorted by FM PLL discriminators). They also tend
 to vary wildly in width and centering. This is especially true when
 monitoring simulcast systems. Multiple QPSK signals interfere much more
 dramatically with an FM discriminator than C4FM signals.
286
287*******
288 For this reason it is important to isolate your receiver to one
289 transmitter tower, _especially_ for QPSK signals.
290*******
291
292 The "#" symbols indicate the detected min/max values that are used
293 to calibrate the symbol decision points. These are indicated by
294 "!" for the center decision point and "^" for the mid decision points.
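
 The relationship between the min/max values and the decision points can
 be sketched as a simple four level slicer. This Python example is an
 illustration, not DSD's implementation. In particular, mid_frac (how far
 the mid decision points sit between the center and the min/max) is an
 assumed parameter, and the function names are hypothetical.

```python
def decision_points(lo, hi, mid_frac=0.5):
    """Return (lower_mid, center, upper_mid) from the detected min/max.

    mid_frac is an assumption for illustration; DSD's actual placement
    of the mid decision points may differ.
    """
    center = (lo + hi) / 2                       # "!" in the Datascope display
    upper_mid = center + (hi - center) * mid_frac  # right-hand "^"
    lower_mid = center - (center - lo) * mid_frac  # left-hand "^"
    return lower_mid, center, upper_mid


def slice_symbol(sample, lo, hi, mid_frac=0.5):
    """Map one sample to a level index 0..3 (lowest to highest)."""
    lower_mid, center, upper_mid = decision_points(lo, hi, mid_frac)
    if sample > upper_mid:
        return 3
    if sample > center:
        return 2
    if sample > lower_mid:
        return 1
    return 0
```

 If the min/max values are thrown off by noise or a second transmitter,
 all three decision points move with them, which is why a stable
 Datascope pattern matters.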
295
296Display Options
297
298 There are several options to control the type and quantity of
299 information displayed in Errorbars mode:
300
301 -e Show Frame Info and errorbars (default)
302 -pe Show P25 encryption sync bits
303 -pl Show P25 link control bits
304 -ps Show P25 status bits and low speed data
305 -pt Show P25 talkgroup info
306 -q Don't show Frame Info/errorbars
307 -s Datascope (disables other display options)
308 -t Show symbol timing during sync
309 -v <num> Frame information Verbosity
310 -z <num> Frame rate for datascope
311
 Most of these options are self-explanatory. Symbol timing is a noisy
 option that allows you to view the quality of the frame sync samples
 and accuracy of the symbol timing adjustments.
315
316 Symbol Timing display looks like this:
317
318Symbol Timing:
319----------
320----------
321----------
322----------
323----------
324-+++++++++ 1
325+---------- 0
326----------
327++++++++++ 0
328++++++++++
329---------- 0
330----------
331++++++++++ 0
332++++++++++
333++++++++++
334++++++++++
335---------- 0
336++++++++++ 0
337---------- 0
338++++++++++ 0
339++++++++++
340++++++++++
341++++++++++
342++++++++++
343C4FM example
344
345Symbol Timing:
346+---------
347----------
348----------
349----------
350-----X---- 5
351--+++O++++- 4
352----------
353----X----- 4
354++++O++--- 4
355--++O++++- 4
356----X----- 4
357----------
358++++O+++-- 4
359-+++O+++-- 4
360--++O+++-- 4
361--++O+++-- 4
362----------
363++++O++++- 4
364----------
365++++O+++-- 4
366-+++O++++- 4
367-+++O+++++ 4
368-+++O++--- 4
369--++O+++-- 4
370QPSK example
371
 Symbol timing is only displayed for symbols during the frame sync
 period. Each horizontal line represents the 10 audio samples for each
 symbol. "-" indicates an audio sample below the center reference level
 and "+" represents a sample above center. "X" indicates a low spike
 below a reference threshold (reference minimum for C4FM and 80%
 of reference minimum for QPSK). "O" represents a high spike above
 the high reference threshold. The numbers to the right indicate the
 sample position at which the targeted transition occurred (+/- for
 C4FM or spike high/low for QPSK). The number of audio samples for the
 next symbol is adjusted to get this value closer to the target (0 for
 C4FM and 4 for QPSK). This shows how DSD maintains accurate symbol
 timing. Symbol timing adjustments are only made during sync, which
 is the only time reliable transitions can be observed.
385
386 In both examples above the symbol timing was off by one sample at
387 the beginning of the frame sync period and was adjusted. Generally
388 if you see any spike values "X/O" in C4FM mode, or lots of them in
389 QPSK mode it indicates noise on the input signal.
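
 The character mapping and the timing correction described above can be
 sketched as follows. This is a simplified Python illustration, not the
 DSD source. The one sample step size is an assumption inferred from the
 examples above (an 11 sample line follows each late transition), and the
 function names and threshold arguments are hypothetical.

```python
def timing_line(samples, center, low_spike, high_spike):
    """Render one symbol's samples with the characters described above."""
    chars = []
    for s in samples:
        if s < low_spike:
            chars.append("X")       # low spike below the low threshold
        elif s > high_spike:
            chars.append("O")       # high spike above the high threshold
        elif s > center:
            chars.append("+")       # above the center reference level
        else:
            chars.append("-")       # at or below the center reference level
    return "".join(chars)


def next_symbol_length(observed, target, nominal=10):
    """Pick the sample count for the next symbol (assumed +/-1 step).

    A transition seen later than the target lengthens the next symbol by
    one sample; one seen earlier shortens it by one sample.
    """
    if observed > target:
        return nominal + 1
    if observed < target:
        return nominal - 1
    return nominal
```

 For example, a QPSK spike observed at position 5 with a target of 4
 gives next_symbol_length(5, 4) == 11, matching the 11 sample line that
 follows "-----X---- 5" in the QPSK example.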
390
391Input/Output Options
392
393 -i <device> Audio input device (default is /dev/audio)
394 -o <device> Audio output device (default is /dev/audio)
395 -d <dir> Create mbe data files, use this directory
396 -g <num> Audio output gain (default = 0 = auto)
397 -n Do not send synthesized speech to audio output device
398 -w <file> Output synthesized speech to a .wav file
399
 The audio in device can be a sound card OR a .wav file if the file
 is in the exact format 48k/16bits/mono/pcm. Audio in should be an
 unfiltered discriminator tap signal.
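
 Because the .wav format requirement is strict, it can be worth checking
 a recording before feeding it to DSD. A minimal sketch using Python's
 standard wave module is shown below. The function name is_dsd_input_wav
 is hypothetical, and the check covers only the header fields named
 above, not whether the audio is a real discriminator tap signal.

```python
import wave


def is_dsd_input_wav(path):
    """Return True if the file is 48 kHz, 16 bit, mono, uncompressed PCM."""
    try:
        with wave.open(path, "rb") as w:
            return (
                w.getframerate() == 48000      # 48k sample rate
                and w.getsampwidth() == 2      # 2 bytes = 16 bits per sample
                and w.getnchannels() == 1      # mono
                and w.getcomptype() == "NONE"  # plain PCM
            )
    except wave.Error:
        # Raised for files the wave module cannot read as PCM WAV data.
        return False
```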
403
 The audio out device should be a sound card (use the -w option to
 output to a .wav file).
406
407 If the audio in device is the same as the audio out device, the
408 synthesized speech has to be upsampled to the 48k sample rate required
409 for input. A fast upsample function is provided but still leaves some
410 artifacts.
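
 To illustrate why fast upsampling leaves artifacts, here is a sketch of
 one cheap way to go from 8k to 48k: linear interpolation by a factor of
 6. This Python example is an assumption for illustration and is not
 necessarily the method DSD uses. Straight-line interpolation does not
 fully suppress spectral images of the original signal, which is one
 source of audible artifacts in simple upsamplers.

```python
def upsample_linear(samples, factor=6):
    """Upsample by an integer factor using straight-line interpolation.

    With factor=6 this converts 8k audio to 48k. Illustrative only.
    """
    out = []
    prev = samples[0] if samples else 0
    for cur in samples:
        # Emit `factor` points stepping evenly from prev up to cur.
        for k in range(1, factor + 1):
            out.append(prev + (cur - prev) * k / factor)
        prev = cur
    return out
```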
411
412********
413 The best sound and minimum cpu usage is achieved with separate sound
414 cards for input and output
415********
416
 If you specify different input/output devices DSD will use 8k as the
 output sample rate, and the lack of resampling results in much better
 audio as well as lower CPU consumption.
420
 If you are using an onboard "AC97" sound device you may find that DSD
 uses much more CPU than expected, in some cases more than is available.
 This is because many AC97 sound devices are designed to rely on CPU
 processing power instead of hardware. You may also find that 8k sample
 rate output is upsampled in the driver using a very basic algorithm,
 resulting in severe distortion. The solution is to use a real hardware
 sound device (PCI card, USB device etc).
428
 As of version 1.2 DSD automatically levels the output audio. This
 greatly improves readability and eliminates the painful effects of
 noise bursts. You can specify a fixed audio output gain with the -g
 option.
433
434Scanner control options:
435
436 -B <num> Serial port baud rate (default=115200)
437 -C <device> Serial port for scanner control (default=/dev/ttyUSB0)
438 -R <num> Resume scan after <num> TDULC frames or any PDU or TSDU
439
440
 On some P25 systems Packet Data Units (PDU) are sent on the same
 frequencies used for voice traffic. If done constantly this can
 be a severe hindrance to scanning the system in conventional
 mode. The -R option enables sending a "resume scan" command to
 a scanner connected to a serial port. Use -B and -C to set the baud
 rate and serial port device if necessary.
447
448Decoder options
449
450 -fa Auto-detect frame type (default)
451 -f1 Decode only P25 Phase 1
452 -fd Decode only D-STAR* (no audio)
453 -fi Decode only NXDN48* (6.25 kHz) / IDAS*
454 -fn Decode only NXDN96 (12.5 kHz)
455 -fp Decode only ProVoice*
456 -fr Decode only DMR/MOTOTRBO
457 -fx Decode only X2-TDMA
458 -ma Auto-select modulation optimizations (default)
459 -mc Use only C4FM modulation optimizations
460 -mg Use only GFSK modulation optimizations
461 -mq Use only QPSK modulation optimizations
462 -u <num> Unvoiced speech quality (default=3)
463 -xx Expect non-inverted X2-TDMA signal
464 -xr Expect inverted DMR/MOTOTRBO signal
465
466 * denotes frame types that cannot be auto-detected.
467
 ProVoice and NXDN48 are not auto-detected as they use different symbol
 rates (9600 and 2400) than most formats (4800). D-STAR is not
 enabled by default as voice decode does not work and it has a
 short sync word that is prone to false triggering. It is included
 for development/testing only.
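
 As a worked example of combining these options, the Python sketch below
 assembles a dsd command line from the flags documented in this file. The
 helper dsd_command is hypothetical and only builds a string; it does not
 run DSD. The file names used in the examples are placeholders.

```python
import shlex

FRAME_TYPES = {"a", "1", "d", "i", "n", "p", "r", "x"}  # -fa, -f1, -fd, ...
MOD_TYPES = {"a", "c", "g", "q"}                        # -ma, -mc, -mg, -mq


def dsd_command(frame=None, mod=None, wav_out=None, mbe_dir=None,
                play_files=None):
    """Build a dsd command line string from documented options."""
    if frame is not None and frame not in FRAME_TYPES:
        raise ValueError("unknown frame type: %r" % frame)
    if mod is not None and mod not in MOD_TYPES:
        raise ValueError("unknown modulation: %r" % mod)
    args = ["dsd"]
    if frame is not None:
        args.append("-f" + frame)       # e.g. -fp for ProVoice only
    if mod is not None:
        args.append("-m" + mod)         # e.g. -mq for QPSK optimizations
    if mbe_dir is not None:
        args += ["-d", mbe_dir]         # save mbe data files here
    if wav_out is not None:
        args += ["-w", wav_out]         # also write speech to a .wav file
    if play_files:
        args += ["-r"] + list(play_files)  # Play files mode
    # Quote each argument so paths with spaces survive the shell.
    return " ".join(shlex.quote(a) for a in args)
```

 For instance, dsd_command(frame="p", wav_out="out.wav") yields
 "dsd -fp -w out.wav", which enables ProVoice decoding (not auto-detected)
 and records the synthesized speech.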
473
474 MBE speech synthesis is broken down into two main types of sounds,
475 "Voiced" and "Unvoiced". Voiced speech bands are synthesized with
476 a single sine wave centered in the frequency band with the appropriate
477 phase and amplitude.
478
 Unvoiced speech is supposed to be generated with a noise source, a 256
 point DFT, a number of band filters, followed by a 256 point inverse DFT.
 For computational simplicity mbelib uses a different method. For each
 unvoiced speech band, a number of sine waves are generated, each with a
 different random initial phase. The number of waves used per band is
 controlled by the -u option. A setting of 4 would approximate the
 performance of the 256 point DFT method as the maximum number of voice
 bands is 56, and very low frequencies are not synthesized. Values less
 than 3 have a noticeable lack of unvoiced speech and/or artifacts. The
 default of 3 provides good speech quality with reasonable CPU use.
 Increasing the quality above the default rapidly consumes more CPU for
 diminishing returns.
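
 The sine wave approach to unvoiced speech can be sketched as below. This
 Python example is a simplified illustration, not mbelib's code. The even
 frequency spacing within the band, the amplitude scaling, the 160 sample
 frame length and all names are assumptions. Only the idea of summing
 n_waves sine waves with random initial phases per band comes from the
 description above.

```python
import math
import random


def unvoiced_band(center_hz, bandwidth_hz, amplitude, n_waves=3,
                  n_samples=160, rate=8000, rng=None):
    """Approximate band-limited noise with a few random-phase sine waves."""
    rng = rng or random.Random()
    out = [0.0] * n_samples
    for k in range(n_waves):
        # Assumed layout: spread the waves evenly across the band.
        freq = center_hz - bandwidth_hz / 2 + bandwidth_hz * (k + 0.5) / n_waves
        # Each wave gets its own random initial phase.
        phase = rng.uniform(-math.pi, math.pi)
        for n in range(n_samples):
            out[n] += (amplitude / n_waves) * math.sin(
                2 * math.pi * freq * n / rate + phase)
    return out
```

 The inner loop runs once per wave per band per sample, so the work grows
 linearly with n_waves. That is consistent with higher -u values costing
 more CPU.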
491
492
493Advanced decoder options
494 -A <num> QPSK modulation auto detection threshold (default=26)
495 -S <num> Symbol buffer size for QPSK decision point tracking
496 (default=36)
497 -M <num> Min/Max buffer size for QPSK decision point tracking
498 (default=15)
499
500Encryption
501
502********
 Decryption of speech is NOT supported, even if you lawfully possess the
 encryption keys. Decryption support will not be added in the future as
 the authors wish to steer as far away from the legal issues associated
 with encryption as possible.
507********
508
 We realize that there are many legitimate and lawful uses of decryption
 software including system/interoperability testing and lawful monitoring.
 This software is distributed under a liberal BSD license so there is
 nothing to stop others from supplying patches, forking this project or
 incorporating it into a commercial product and adding decryption support.
514
515 There is support for displaying the encryption sync bits transmitted in
516 the clear on P25 Phase 1 systems. These bits do not allow for the
517 decryption of signals without the secret encryption keys. The
518 encryption sync bits are useful for determining whether a signal is
519 encrypted vs merely noisy or degraded. As the encryption sync bits
520 typically include long strings of zeros when a transmission is not
521 encrypted they can also be used to visually estimate bit error rates.
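
 The visual estimate described above can also be done numerically: in a
 field that should be all zeros, every 1 bit is an error. The Python
 sketch below assumes you have copied such a field out of the display as
 a hexadecimal string. The function name and the input format are
 assumptions, not something DSD provides.

```python
def estimate_ber(hex_field):
    """Estimate bit error rate from a field expected to be all zeros.

    hex_field is a hexadecimal string (an assumed input format).
    Returns the fraction of bits that are set.
    """
    if not hex_field:
        raise ValueError("empty field")
    total_bits = len(hex_field) * 4               # 4 bits per hex digit
    one_bits = bin(int(hex_field, 16)).count("1")  # each 1 is a bit error
    return one_bits / total_bits
```

 This only makes sense for unencrypted transmissions. On an encrypted
 signal the same bits carry real data and will not be mostly zero.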
522