Secure voice

Secure voice (alternatively secure speech or ciphony) is a term in cryptography for the encryption of voice communication over a range of communication types such as radio, telephone or IP.

History

The implementation of voice encryption dates back to World War II, when secure communication was paramount to the US armed forces. During that time, noise was simply added to a voice signal to prevent enemies from listening to conversations. The noise was added by playing a record of noise in sync with the voice signal; when the voice signal reached the receiver, the noise was subtracted out, leaving the original voice signal. To subtract the noise, the receiver needed exactly the same noise signal, so the noise records were made only in pairs: one for the transmitter and one for the receiver. Because only two copies of each record existed, a receiver without the matching record could not decrypt the signal.

To implement the system, the army contracted Bell Laboratories, which developed a system called SIGSALY. With SIGSALY, ten channels were used to sample the voice frequency spectrum from 250 Hz to 3 kHz, and two channels were allocated to sample voice pitch and background hiss. At the time of SIGSALY the transistor had not been developed, and the digital sampling was done by circuits using the model 2051 Thyratron vacuum tube. Each SIGSALY terminal used 40 racks of equipment weighing 55 tons and filled a large room. This equipment included radio transmitters and receivers and large phonograph turntables. The voice was keyed to two 410-millimetre (16 in) vinyl phonograph records that contained a frequency-shift keying (FSK) audio tone. The records were played on large, precise turntables in sync with the voice transmission.
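The noise-record scheme described above can be sketched in a few lines. This is only an illustration of the add-then-subtract idea, not a reconstruction of the SIGSALY circuitry: the number of quantization levels, the seeded random generator standing in for a pressed noise record, and all function names are assumptions made for the example.

```python
import random

LEVELS = 6  # assumed number of quantization levels, for illustration only

def make_noise_record(length, seed):
    """Stand-in for one pressing of a noise record: a list of random levels.

    A seeded generator is used only so that two "copies" can be produced;
    a real record would hold physically generated noise.
    """
    rng = random.Random(seed)
    return [rng.randrange(LEVELS) for _ in range(length)]

def mask(samples, noise):
    """Transmitter: add the noise to each quantized sample, wrapping around."""
    return [(s + n) % LEVELS for s, n in zip(samples, noise)]

def unmask(masked, noise):
    """Receiver: subtract the same noise to recover the original samples."""
    return [(m - n) % LEVELS for m, n in zip(masked, noise)]

voice = [0, 3, 5, 2, 2, 4, 1, 0]                       # quantized voice samples
record_tx = make_noise_record(len(voice), seed=1943)   # transmitter's copy
record_rx = make_noise_record(len(voice), seed=1943)   # receiver's matching copy

sent = mask(voice, record_tx)
recovered = unmask(sent, record_rx)
print(recovered == voice)
```

A receiver holding any other record subtracts the wrong values and, with overwhelming probability, obtains samples unrelated to the original voice, which is why the records were produced only as matched pairs.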

From the introduction of voice encryption to today, encryption techniques have evolved drastically. Digital technology has effectively replaced the old analog methods of voice encryption, and complex algorithms have made voice encryption much more secure and efficient. One relatively modern voice encryption method is sub-band coding. With sub-band coding, the voice signal is split into multiple frequency bands using bandpass filters that cover specific frequency ranges of interest. The output signals from the bandpass filters are then lowpass translated to reduce their bandwidth, which reduces the required sampling rate. The lowpass signals are then quantized and encoded using techniques such as pulse-code modulation (PCM). After the encoding stage, the signals are multiplexed and sent out along the communication network. When the signal reaches the receiver, the inverse operations are applied to restore it to its original state.^[1]

A speech scrambling system was developed at Bell Laboratories in the 1970s by Subhash Kak and Nikil Jayant.^[2] In this system, permutation matrices were used to scramble coded representations (such as pulse-code modulation and its variants) of the speech data.

Motorola developed a voice encryption system called Digital Voice Protection (DVP) as part of its first generation of voice encryption techniques. DVP uses a self-synchronizing encryption technique known as cipher feedback (CFB). The extremely high number of possible keys associated with the early DVP algorithm makes the algorithm very robust and gives a high level of security. As with other symmetric-key encryption systems, the encryption key is required to decrypt the signal with the corresponding decryption algorithm.
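The self-synchronizing behaviour of cipher feedback can be illustrated with a toy byte-wise CFB loop. This is a sketch under stated assumptions, not the DVP algorithm: the source does not describe DVP's cipher or register length, so a SHA-256 hash of the key and shift register stands in for the keyed function, and `REGISTER_BYTES`, `keystream_byte`, `cfb_encrypt` and `cfb_decrypt` are hypothetical names chosen for the example.

```python
import hashlib

REGISTER_BYTES = 8  # assumed shift-register length for this sketch

def keystream_byte(key, register):
    """Toy keyed function standing in for the real cipher (an assumption)."""
    return hashlib.sha256(key + bytes(register)).digest()[0]

def cfb_encrypt(key, iv, plaintext):
    register = list(iv)
    out = bytearray()
    for p in plaintext:
        c = p ^ keystream_byte(key, register)
        out.append(c)
        register = register[1:] + [c]  # feed the ciphertext byte back
    return bytes(out)

def cfb_decrypt(key, iv, ciphertext):
    register = list(iv)
    out = bytearray()
    for c in ciphertext:
        out.append(c ^ keystream_byte(key, register))
        register = register[1:] + [c]  # same feedback, from received ciphertext
    return bytes(out)

key = b"demo key"
iv = bytes(REGISTER_BYTES)
frames = bytes(range(40))  # stand-in for 40 bytes of coded speech

ciphertext = cfb_encrypt(key, iv, frames)
print(cfb_decrypt(key, iv, ciphertext) == frames)

# Corrupt one ciphertext byte in transit and decrypt anyway.
damaged = bytearray(ciphertext)
damaged[10] ^= 0xFF
received = cfb_decrypt(key, iv, bytes(damaged))

# Once the bad byte has shifted out of the register, output is correct again.
resync = 10 + REGISTER_BYTES + 1
print(received[resync:] == frames[resync:])
```

Because the receiver's register is refilled from the ciphertext it actually receives, a transmission error garbles only the bytes decrypted while the damaged byte is still in the register. No separate resynchronization signal is needed, which is useful on noisy radio channels.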

Digital

A digital secure voice system usually includes two components: a digitizer to convert between speech and digital signals, and an encryption system to provide confidentiality. It is difficult in practice to send the encrypted signal over the same voiceband communication circuits used to transmit unencrypted voice, e.g. analog telephone lines or mobile radios, due to bandwidth expansion.

This has led to the use of voice coders (vocoders) to achieve tight bandwidth compression of the speech signals. NSA's STU-III, KY-57 and SCIP are examples of systems that operate over existing voice circuits. The STE system, by contrast, requires wide-bandwidth ISDN lines for its normal mode of operation. For encrypting GSM and VoIP, which are natively digital, the standard protocol ZRTP can be used as an end-to-end encryption technology.

Secure voice's robustness greatly benefits from having the voice data compressed to very low bit rates by a special component called speech coding, voice compression or voice coder (also known as a vocoder). The older secure voice compression standards include CVSD, CELP, LPC-10e and MELP; the latest standard is the state-of-the-art MELPe algorithm.
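To give a feel for how low these bit rates are, the short calculation below compares the coder rates quoted elsewhere in this article with uncompressed telephone-quality PCM. The 64 kbit/s baseline (8 kHz sampling at 8 bits per sample) is an assumption added for the comparison and is not stated in the article; the helper names are likewise illustrative.

```python
PCM_BIT_RATE = 8000 * 8  # assumed baseline: 8 kHz sampling, 8 bits per sample

# Bit rates in bit/s as quoted in this article.
CODER_BIT_RATES = {
    "CVSD": 16000,
    "CELP": 4800,
    "LPC-10e": 2400,
    "MELPe 2400": 2400,
    "MELPe 1200": 1200,
    "MELPe 600": 600,
}

def compression_ratio(bit_rate):
    """How many times smaller the coded stream is than the PCM baseline."""
    return PCM_BIT_RATE / bit_rate

def bytes_for(seconds, bit_rate):
    """Bytes needed to carry the given duration of coded speech."""
    return seconds * bit_rate // 8

for name, rate in CODER_BIT_RATES.items():
    ratio = compression_ratio(rate)
    size = bytes_for(10, rate)
    print(f"{name:11s} {rate:6d} bit/s  {ratio:6.1f}x  {size:6d} bytes per 10 s")
```

Under this assumed baseline, ten seconds of speech occupy 80,000 bytes as PCM but only 3,000 bytes at 2400 bit/s, which is what lets an encrypted stream fit within a narrowband voice channel.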

Digital methods using voice compression: MELP or MELPe

MELPe, or enhanced MELP (Mixed Excitation Linear Prediction), is a United States Department of Defense speech coding standard used mainly in military applications and satellite communications, secure voice, and secure radio devices. Its development was led and supported by the NSA and NATO. The US government's MELPe secure voice standard is also known as MIL-STD-3005, and NATO's MELPe secure voice standard is also known as STANAG-4591.

The initial MELP was invented by Alan McCree around 1995.^[3] That initial speech coder was standardized in 1997 and was known as MIL-STD-3005.^[4] It surpassed other candidate vocoders in the US DoD competition, including: (a) Frequency Selective Harmonic Coder (FSHC), (b) Advanced Multi-Band Excitation (AMBE), (c) Enhanced Multiband Excitation (EMBE), (d) Sinusoid Transform Coder (STC), and (e) Subband LPC Coder (SBC). Due to its lower complexity^[citation needed] than the Waveform Interpolative (WI) coder, the MELP vocoder won the DoD competition and was selected for MIL-STD-3005.

Between 1998 and 2001, a new MELP-based vocoder was created at half the rate (i.e. 1200 bit/s), and substantial enhancements were added to MIL-STD-3005 by SignalCom (later acquired by Microsoft), AT&T Corporation, and Compandent. These included (a) an additional new vocoder at half the rate (i.e. 1200 bit/s), (b) substantially improved encoding (analysis), (c) substantially improved decoding (synthesis), (d) noise preprocessing for removing background noise, (e) transcoding between the 2400 bit/s and 1200 bit/s bitstreams, and (f) a new postfilter. This fairly significant development was aimed at creating a new coder at half the rate and making it interoperable with the old MELP standard. This enhanced MELP (also known as MELPe) was adopted as the new MIL-STD-3005 in 2001 in the form of annexes and supplements to the original MIL-STD-3005, enabling the same quality as the old 2400 bit/s MELP at half the rate. One of the greatest advantages of the new 2400 bit/s MELPe is that it shares the same bit format as MELP and hence can interoperate with legacy MELP systems while delivering better quality at both ends. MELPe provides much better quality than all older military standards, especially in noisy environments such as battlefields, vehicles and aircraft.

In 2002, following extensive competition and testing, the 2400 and 1200 bit/s US DoD MELPe was also adopted as a NATO standard, known as STANAG-4591.^[5] As part of NATO testing for the new standard, MELPe was tested against other candidates such as France's HSX (Harmonic Stochastic eXcitation) and Turkey's SB-LPC (Split-Band Linear Predictive Coding), as well as the old secure voice standards such as FS1015 LPC-10e (2.4 kbit/s), FS1016 CELP (4.8 kbit/s) and CVSD (16 kbit/s). MELPe subsequently won the NATO competition as well, surpassing the quality of all other candidates and of all the old secure voice standards (CVSD, CELP and LPC-10e). The NATO competition concluded that MELPe substantially improved performance (in terms of speech quality, intelligibility, and noise immunity) while reducing throughput requirements. The NATO testing also included interoperability tests, used over 200 hours of speech data, and was conducted by three test laboratories worldwide. Compandent Inc, as part of MELPe-based projects performed for the NSA and NATO, provided them with a special test-bed platform known as the MELCODER device, which served as the golden reference for real-time implementation of MELPe. The low-cost FLEXI-232 Data Terminal Equipment (DTE) made by Compandent, which is based on the MELCODER golden reference, is widely used for evaluating and testing MELPe in real time, over various channels and networks, and under field conditions.

In 2005, a new 600 bit/s MELPe variation by Thales Group (France) was added to the NATO standard STANAG-4591 (without the extensive competition and testing performed for the 2400/1200 bit/s MELPe),^[6] and there are more advanced efforts to lower the bit rates to 300 bit/s and even 150 bit/s.^[7]

In 2010, Lincoln Labs, Compandent, BBN, and General Dynamics also developed a 300 bit/s MELP device for DARPA.^[8] Its quality was better than that of the 600 bit/s MELPe, but its delay was longer.

References

[1] Owens, F. J. (1993). Signal Processing of Speech. Houndmills: MacMillan Press. ISBN 0-333-51922-1.
[2] Kak, S. and Jayant, N. S. (1977). "Speech encryption using waveform scrambling". Bell System Technical Journal, vol. 56, pp. 781–808, May–June 1977.
[3] McCree, Alan V. and Barnwell, Thomas P. (1995). "A Mixed Excitation LPC Vocoder Model for Low Bit Rate Speech Coding". IEEE Trans. Speech and Audio Processing. (Original MELP)
[4] "Analog-to-Digital Conversion of Voice by 2,400 Bit/Second Mixed Excitation Linear Prediction (MELP)". US DoD. (MIL-STD-3005, original MELP)
[5] "The 1200 and 2400 bit/s NATO Interoperable Narrow Band Voice Coder". STANAG-4591, NATO.
[6] "MELPe Variation for 600 bit/s NATO Narrow Band Voice Coder". STANAG-4591, NATO.
[7] Nichols, Randall K. and Lekkas, Panos C. (2002). "Speech cryptology". Wireless Security: Models, Threats, and Solutions. New York: McGraw-Hill. ISBN 0-07-138038-8.
[8] McCree, Alan (2006). "A scalable phonetic vocoder framework using joint predictive vector quantization of MELP parameters". Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. I 705–708, Toulouse, France.
