Pulse-code modulation

Pulse-code modulation (PCM) is a method used to digitally represent analog signals. It is the standard form of digital audio in computers, compact discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps. Alec Reeves, Claude Shannon, Barney Oliver and John R. Pierce are credited with its invention.[6][7][8]

Pulse-code modulation
Filename extension: .L16, .WAV, .AIFF, .AU, .PCM[1]
Internet media type: audio/L16, audio/L8,[2] audio/L20, audio/L24[3][4]
Type code: "AIFF" for L16,[1] none[3]
Magic number: Varies
Type of format: Uncompressed audio
Contained by: Audio CD, AES3, WAV, AIFF, AU, M2TS, VOB, and many others
Open format: Yes
Free format: Yes[5]

Linear pulse-code modulation (LPCM) is a specific type of PCM in which the quantization levels are linearly uniform.[5] This is in contrast to PCM encodings in which quantization levels vary as a function of amplitude (as with the A-law algorithm or the μ-law algorithm). Though PCM is a more general term, it is often used to describe data encoded as LPCM.

A PCM stream has two basic properties that determine the stream's fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that can be used to represent each sample.
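
As a rough illustration of how these two properties combine, the sketch below computes the raw bit rate of an uncompressed LPCM stream and the number of levels a given bit depth allows. The function names are hypothetical and chosen only for this example; the figures used (44,100 Hz, 16-bit stereo for CD audio and 8,000 Hz, 8-bit for telephony) are the ones quoted elsewhere in this article.

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Raw LPCM bit rate in bits per second, ignoring any framing or container overhead."""
    return sample_rate_hz * bit_depth * channels


def quantization_levels(bit_depth: int) -> int:
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bit_depth


# CD audio: 44,100 Hz, 16 bits, 2 channels.
print(pcm_bit_rate(44_100, 16, channels=2))  # 1411200
# Telephony channel: 8,000 Hz, 8 bits, 1 channel.
print(pcm_bit_rate(8_000, 8))                # 64000
print(quantization_levels(16))               # 65536
```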

History

Early electrical communications started to sample signals in order to multiplex samples from multiple telegraphy sources and to convey them over a single telegraph cable. The American inventor Moses G. Farmer conceived telegraph time-division multiplexing (TDM) as early as 1853. Electrical engineer W. M. Miner, in 1903, used an electro-mechanical commutator for time-division multiplexing multiple telegraph signals; he also applied this technology to telephony. He obtained intelligible speech from channels sampled at a rate above 3500–4300 Hz; lower rates proved unsatisfactory.

In 1920, the Bartlane cable picture transmission system used telegraph signaling of characters punched in paper tape to send samples of images quantized to 5 levels.[9] In 1926, Paul M. Rainey of Western Electric patented a facsimile machine that transmitted its signal using 5-bit PCM, encoded by an opto-mechanical analog-to-digital converter.[10] The machine did not go into production.[11]

British engineer Alec Reeves, unaware of previous work, conceived the use of PCM for voice communication in 1937 while working for International Telephone and Telegraph in France. He described the theory and its advantages, but no practical application resulted. Reeves filed for a French patent in 1938, and his US patent was granted in 1943.[12] By this time Reeves had started working at the Telecommunications Research Establishment.[11]

The first transmission of speech by digital techniques, the SIGSALY encryption equipment, conveyed high-level Allied communications during World War II. In 1943 the Bell Labs researchers who designed the SIGSALY system became aware of the use of PCM binary coding as already proposed by Reeves. In 1949, for the Canadian Navy's DATAR system, Ferranti Canada built a working PCM radio system that was able to transmit digitized radar data over long distances.[13]

PCM in the late 1940s and early 1950s used a cathode-ray coding tube with a plate electrode having encoding perforations.[14] As in an oscilloscope, the beam was swept horizontally at the sample rate while the vertical deflection was controlled by the input analog signal, causing the beam to pass through higher or lower portions of the perforated plate. The plate collected or passed the beam, producing current variations in binary code, one bit at a time. Rather than natural binary, the grid of Goodall's later tube was perforated to produce a glitch-free Gray code and produced all bits simultaneously by using a fan beam instead of a scanning beam.[15]

In the United States, the National Inventors Hall of Fame has honored Bernard M. Oliver[16] and Claude Shannon[17] as the inventors of PCM,[18] as described in "Communication System Employing Pulse Code Modulation", U.S. patent 2,801,281 filed in 1946 and 1952, granted in 1956. Another patent by the same title was filed by John R. Pierce in 1945, and issued in 1948: U.S. patent 2,437,707. The three of them published "The Philosophy of PCM" in 1948.[19]

The T-carrier system, introduced in 1961, uses two twisted-pair transmission lines to carry 24 PCM telephone calls sampled at 8 kHz and 8-bit resolution. This development improved capacity and call quality compared to the previous frequency-division multiplexing schemes.

In 1973, adaptive differential pulse-code modulation (ADPCM) was developed by P. Cummiskey, Nikil Jayant and James L. Flanagan.[20]

Digital audio recordings

In 1967, the first PCM recorder was developed by NHK's research facilities in Japan.[21] The 30 kHz 12-bit device used a compander (similar to DBX Noise Reduction) to extend the dynamic range, and stored the signals on a video tape recorder. In 1969, NHK expanded the system's capabilities to 2-channel stereo and 32 kHz 13-bit resolution. In January 1971, using NHK's PCM recording system, engineers at Denon recorded the first commercial digital recordings.[note 1][21]

In 1972, Denon unveiled the first 8-channel digital recorder, the DN-023R, which used a 4-head open reel broadcast video tape recorder to record 47.25 kHz, 13-bit PCM audio.[note 2] In 1977, Denon developed the portable PCM recording system, the DN-034R. Like the DN-023R, it recorded 8 channels at 47.25 kHz, but it used 14 bits "with emphasis, making it equivalent to 15.5 bits."[21]

In 1979, the first digital pop album, Bop till You Drop, was recorded in 50 kHz, 16-bit linear PCM using a 3M digital tape recorder.[22]

The compact disc (CD) brought PCM to consumer audio applications with its introduction in 1982. The CD uses a 44,100 Hz sampling frequency and 16-bit resolution and stores up to 80 minutes of stereo audio per disc.

Digital telephony

The rapid development and wide adoption of PCM digital telephony was enabled by metal–oxide–semiconductor (MOS) switched capacitor (SC) circuit technology, developed in the early 1970s.[23] This led to the development of PCM codec-filter chips in the late 1970s.[23][24] The silicon-gate CMOS (complementary MOS) PCM codec-filter chip, developed by David A. Hodges and W.C. Black in 1980,[23] has since been the industry standard for digital telephony.[23][24] By the 1990s, telecommunication networks such as the public switched telephone network (PSTN) had been largely digitized with very-large-scale integration (VLSI) CMOS PCM codec-filters, widely used in electronic switching systems for telephone exchanges, user-end modems and a wide range of digital transmission applications such as the integrated services digital network (ISDN), cordless telephones and cell phones.[24]

Implementations

PCM is the method of encoding typically used for uncompressed digital audio.[note 3]

  • The 4ESS switch introduced time-division switching into the US telephone system in 1976, based on medium scale integrated circuit technology.[25]
  • LPCM is used for the lossless encoding of audio data in the compact disc Red Book standard (informally also known as Audio CD), introduced in 1982.
  • AES3 (specified in 1985, upon which S/PDIF is based) is a particular format using LPCM.
  • LaserDiscs with digital sound have an LPCM track on the digital channel.
  • On PCs, PCM and LPCM often refer to the format used in the WAV (defined in 1991) and AIFF (defined in 1988) audio container formats. LPCM data may also be stored in other formats such as AU, raw audio format (header-less file) and various multimedia container formats.
  • LPCM has been defined as a part of the DVD (since 1995) and Blu-ray (since 2006) standards.[26][27][28] It is also defined as a part of various digital video and audio storage formats (e.g. DV since 1995,[29] AVCHD since 2006[30]).
  • LPCM is used by HDMI (defined in 2002), a single-cable digital audio/video connector interface for transmitting uncompressed digital data.
  • The RF64 container format (defined in 2007) uses LPCM and also allows non-PCM bitstream storage: various compression formats contained in the RF64 file as data bursts (Dolby E, Dolby AC3, DTS, MPEG-1/MPEG-2 Audio) can be "disguised" as PCM linear.[31]

Modulation

Figure: Sampling and quantization of a signal (red) for 4-bit LPCM over a time domain at a specific frequency.

In the diagram, a sine wave (red curve) is sampled and quantized for PCM. The sine wave is sampled at regular intervals, shown as vertical lines. For each sample, one of the available values (on the y-axis) is chosen. The PCM process is commonly implemented on a single integrated circuit called an analog-to-digital converter (ADC). This produces a fully discrete representation of the input signal (blue points) that can be easily encoded as digital data for storage or manipulation. Several PCM streams could also be multiplexed into a larger aggregate data stream, generally for transmission of multiple streams over a single physical link. One technique is called time-division multiplexing (TDM) and is widely used, notably in the modern public telephone system.
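
The sampling and quantization steps described above can be sketched in a few lines. This is a minimal illustration, not a model of any particular ADC: the function name, the assumed input range of -1 to 1, and the choice to map that range onto codes 0 to 15 by rounding to the nearest level are all assumptions made for the example.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bit_depth=4):
    """Sample a unit-amplitude sine wave and quantize each sample to an integer code.

    Assumption: the input range [-1, 1] is mapped linearly onto the codes
    0 .. 2**bit_depth - 1, and each sample is rounded to the nearest code.
    """
    levels = 2 ** bit_depth
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # uniform sampling instants
        x = math.sin(2 * math.pi * freq_hz * t)     # analog value in [-1, 1]
        codes.append(round((x + 1.0) / 2.0 * (levels - 1)))
    return codes


# One cycle of a 1 Hz sine wave sampled 16 times, quantized to 4 bits.
codes = sample_and_quantize(freq_hz=1, sample_rate_hz=16, n_samples=16)
print(codes)
```

Each code occupies four bits, so the sixteen samples above would take 64 bits to store as raw LPCM.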

Demodulation

The electronics involved in producing an accurate analog signal from the discrete data are similar to those used for generating the digital signal. These devices are digital-to-analog converters (DACs). They produce a voltage or current (depending on type) that represents the value presented on their digital inputs. This output would then generally be filtered and amplified for use.

To recover the original signal from the sampled data, a demodulator can apply the procedure of modulation in reverse. After each sampling period, the demodulator reads the next value and transitions the output signal to the new value. As a result of these transitions, the signal retains a significant amount of high-frequency energy due to imaging effects. To remove these undesirable frequencies, the demodulator passes the signal through a reconstruction filter that suppresses energy outside the expected frequency range (greater than the Nyquist frequency).[note 4]
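
The hold-then-filter behaviour described above can be imitated in software. The sketch below is only a rough analogy: the helper names are hypothetical, the code-to-level mapping assumes codes spread linearly over -1 to 1, and a short moving average stands in for a real reconstruction filter, which would normally be a much sharper low-pass design.

```python
def codes_to_levels(codes, bit_depth=4):
    """Map integer codes 0 .. 2**bit_depth - 1 back to amplitudes in [-1, 1]."""
    levels = 2 ** bit_depth
    return [2.0 * c / (levels - 1) - 1.0 for c in codes]


def zero_order_hold(values, hold):
    """Hold each decoded value for `hold` output steps, producing a staircase."""
    out = []
    for v in values:
        out.extend([v] * hold)
    return out


def moving_average(signal, width):
    """Crude low-pass smoothing, used here as a stand-in for a reconstruction filter."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1):i + 1]
        out.append(sum(window) / len(window))
    return out


staircase = zero_order_hold(codes_to_levels([8, 13, 15, 13, 8, 2, 0, 2]), hold=4)
smoothed = moving_average(staircase, width=4)
print(len(staircase), len(smoothed))
```

The staircase contains the abrupt transitions the text refers to; the smoothing step reduces them at the cost of some delay.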

Standard sampling precision and rates

Common sample depths for LPCM are 8, 16, 20 or 24 bits per sample.[1][2][3][32]

LPCM encodes a single sound channel. Support for multichannel audio depends on file format and relies on synchronization of multiple LPCM streams.[5][33] While two channels (stereo) is the most common format, systems can support up to 8 audio channels (7.1 surround)[2][3] or more.

Common sampling frequencies are 48 kHz as used with DVD format videos, or 44.1 kHz as used in CDs. Sampling frequencies of 96 kHz or 192 kHz can be used on some equipment, but the benefits have been debated.[34]

Limitations

The Nyquist–Shannon sampling theorem shows PCM devices can operate without introducing distortions within their designed frequency bands if they provide a sampling frequency at least twice that of the highest frequency contained in the input signal. For example, in telephony, the usable voice frequency band ranges from approximately 300 Hz to 3400 Hz.[35] For effective reconstruction of the voice signal, telephony applications therefore typically use an 8000 Hz sampling frequency, which is more than twice the highest usable voice frequency.
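
A small numerical check can make the sampling condition concrete. The sketch below uses hypothetical helper names and cosine test tones chosen for convenience. It confirms that the 3400 Hz upper edge of the voice band is covered by 8000 Hz sampling, and shows what happens when the condition is violated: a 5000 Hz tone sampled at 8000 Hz yields the same sample values as a 3000 Hz tone, so the two cannot be told apart afterwards.

```python
import math


def satisfies_sampling_condition(max_signal_hz, sample_rate_hz):
    """True if the sample rate is more than twice the highest signal frequency."""
    return sample_rate_hz > 2 * max_signal_hz


def alias_frequency(f_hz, sample_rate_hz):
    """Frequency at which a sampled tone of f_hz appears, folded into 0 .. fs/2."""
    f = f_hz % sample_rate_hz
    return sample_rate_hz - f if f > sample_rate_hz / 2 else f


def cosine_samples(f_hz, sample_rate_hz, n_samples):
    """Samples of a unit-amplitude cosine tone taken at uniform intervals."""
    return [math.cos(2 * math.pi * f_hz * k / sample_rate_hz) for k in range(n_samples)]


print(satisfies_sampling_condition(3400, 8000))  # True
print(alias_frequency(5000, 8000))               # 3000

high = cosine_samples(5000, 8000, 16)
low = cosine_samples(3000, 8000, 16)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low)))  # True
```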

Regardless, there are potential sources of impairment implicit in any PCM system:

  • Choosing a discrete value that is near but not exactly at the analog signal level for each sample leads to quantization error.[note 5]
  • Between samples no measurement of the signal is made; the sampling theorem guarantees non-ambiguous representation and recovery of the signal only if it has no energy at frequency fs/2 or higher (one half the sampling frequency, known as the Nyquist frequency); higher frequencies will not be correctly represented or recovered and add aliasing distortion to the signal below the Nyquist frequency.
  • As samples are dependent on time, an accurate clock is required for accurate reproduction. If either the encoding or decoding clock is not stable, these imperfections will directly affect the output quality of the device.[note 6]
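
The first impairment in the list, quantization error, can be examined numerically. The sketch below is a simulation under stated assumptions rather than a measurement of any real converter: it assumes an ideal uniform quantizer with step q, inputs drawn uniformly from -1 to 1, and a fixed random seed. Under those conditions the error should stay within plus or minus q/2 and its variance should come out close to q²/12, the figure given in the explanatory notes.

```python
import random


def quantize(x, q):
    """Ideal uniform quantizer: round x to the nearest multiple of the step q."""
    return q * round(x / q)


def error_stats(q, n=50_000, seed=0):
    """Return (largest |error|, mean error, error variance) over n random inputs."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        errors.append(quantize(x, q) - x)
    mean = sum(errors) / n
    variance = sum((e - mean) ** 2 for e in errors) / n
    return max(abs(e) for e in errors), mean, variance


q = 2.0 / 2 ** 8            # step size for 8 bits spread over the range [-1, 1]
max_err, mean, variance = error_stats(q)
print(max_err <= q / 2 + 1e-12)       # error bounded by half a step
print(variance / (q * q / 12))        # ratio to the theoretical variance, near 1
```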

Processing and coding

Some forms of PCM combine signal processing with coding. Older versions of these systems applied the processing in the analog domain as part of the analog-to-digital process; newer implementations do so in the digital domain. These simple techniques have been largely rendered obsolete by modern transform-based audio compression techniques, such as modified discrete cosine transform (MDCT) coding.

  • Linear PCM (LPCM) is PCM with linear quantization.[5]
  • Differential PCM (DPCM) encodes the PCM values as differences between the current and the predicted value. An algorithm predicts the next sample based on the previous samples, and the encoder stores only the difference between this prediction and the actual value. If the prediction is reasonable, fewer bits can be used to represent the same information. For audio, this type of encoding reduces the number of bits required per sample by about 25% compared to PCM.
  • Adaptive differential pulse-code modulation (ADPCM) is a variant of DPCM that varies the size of the quantization step, to allow further reduction of the required bandwidth for a given signal-to-noise ratio.
  • Delta modulation is a form of DPCM that uses one bit per sample to indicate whether the signal is increasing or decreasing compared to the previous sample.
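
The differential schemes in the list above can be sketched as follows. This is a deliberately simplified illustration with hypothetical function names: the DPCM predictor simply assumes the next sample equals the previous one, and the differences are stored unquantized, so the round trip is lossless. A practical codec would use a better predictor and quantize the differences to save bits. The delta modulator uses a fixed step size, which is also an assumption of the example.

```python
def dpcm_encode(samples):
    """Store each sample as its difference from the prediction (the previous sample)."""
    prediction = 0
    diffs = []
    for s in samples:
        diffs.append(s - prediction)
        prediction = s
    return diffs


def dpcm_decode(diffs):
    """Rebuild the samples by adding each difference to the running prediction."""
    prediction = 0
    out = []
    for d in diffs:
        prediction += d
        out.append(prediction)
    return out


def delta_modulate(samples, step):
    """Emit one bit per sample: 1 if the signal is at or above the tracked value, else 0."""
    approx = 0
    bits = []
    for s in samples:
        bit = 1 if s >= approx else 0
        approx += step if bit else -step
        bits.append(bit)
    return bits


def delta_demodulate(bits, step):
    """Rebuild the staircase approximation tracked by the modulator."""
    approx = 0
    out = []
    for b in bits:
        approx += step if b else -step
        out.append(approx)
    return out


samples = [100, 102, 105, 105, 103]
print(dpcm_encode(samples))                # [100, 2, 3, 0, -2]
print(dpcm_decode(dpcm_encode(samples)))   # [100, 102, 105, 105, 103]
print(delta_modulate([1, 2, 3, 2, 1], 1))  # [1, 1, 1, 0, 0]
```

After the first sample, the DPCM differences are much smaller than the samples themselves, which is why they can be represented with fewer bits.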

In telephony, a standard audio signal for a single phone call is encoded as 8,000 samples per second, of 8 bits each, giving a 64 kbit/s digital signal known as DS0. The default signal compression encoding on a DS0 is either μ-law (mu-law) PCM (North America and Japan) or A-law PCM (Europe and most of the rest of the world). These are logarithmic compression systems where a 12- or 13-bit linear PCM sample number is mapped into an 8-bit value. This system is described by international standard G.711.
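
The idea behind logarithmic companding can be illustrated with the continuous μ-law curve with μ = 255. The sketch below is not a G.711 implementation: the standard uses a piecewise-linear approximation and a specific 8-bit layout that are not reproduced here, and the function names and the uniform 8-bit mapping at the end are assumptions made for the example.

```python
import math

MU = 255.0


def mu_law_compress(x):
    """Continuous mu-law curve: maps x in [-1, 1] to y in [-1, 1], expanding small values."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def to_8bit_code(y):
    """Hypothetical uniform mapping of a companded value in [-1, 1] to a code 0 .. 255."""
    return round((y + 1.0) / 2.0 * 255)


for x in (0.01, 0.1, 1.0):
    y = mu_law_compress(x)
    print(x, round(y, 4), to_8bit_code(y))
```

Quiet inputs occupy a disproportionately large share of the output range, so after 8-bit quantization they are represented with finer steps than loud inputs.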

Where circuit costs are high and loss of voice quality is acceptable, it sometimes makes sense to compress the voice signal even further. An ADPCM algorithm is used to map a series of 8-bit μ-law or A-law PCM samples into a series of 4-bit ADPCM samples. In this way, the capacity of the line is doubled. The technique is detailed in the G.726 standard.

Audio coding formats and audio codecs have been developed to achieve further compression. Some of these techniques have been standardized and patented. Advanced compression techniques, such as modified discrete cosine transform (MDCT) and linear predictive coding (LPC), are now widely used in mobile phones, voice over IP (VoIP) and streaming media.

Encoding for serial transmission

PCM can be either return-to-zero (RZ) or non-return-to-zero (NRZ). For an NRZ system to be synchronized using in-band information, there must not be long sequences of identical symbols, such as ones or zeroes. For binary PCM systems, the density of 1-symbols is called ones-density.[36]

Ones-density is often controlled using precoding techniques such as run-length limited encoding, where the PCM code is expanded into a slightly longer code with a guaranteed bound on ones-density before modulation into the channel. In other cases, extra framing bits are added into the stream, which guarantees at least occasional symbol transitions.

Another technique used to control ones-density is the use of a scrambler on the data, which will tend to turn the data stream into a stream that looks pseudo-random, but where the data can be recovered exactly by a complementary descrambler. In this case, long runs of zeroes or ones are still possible on the output but are considered unlikely enough to allow reliable synchronization.
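
One common way to build such a scrambler is to XOR the data with the output of a linear-feedback shift register (LFSR), so that applying the same operation again with the same starting state recovers the data. The sketch below is illustrative only: the 7-bit register, its tap positions and the seed value are arbitrary choices for this example and are not taken from any particular transmission standard.

```python
def lfsr_keystream(seed, n_bits):
    """Generate n_bits from a 7-bit LFSR with feedback from its two highest bits."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the register stays at zero")
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


def scramble(bits, seed=0x5B):
    """XOR the data with the keystream; the same call also descrambles."""
    return [b ^ k for b, k in zip(bits, lfsr_keystream(seed, len(bits)))]


def longest_run(bits):
    """Length of the longest run of identical consecutive symbols."""
    best = run = 0
    prev = None
    for b in bits:
        run = run + 1 if b == prev else 1
        best = max(best, run)
        prev = b
    return best


data = [0] * 32                      # worst case for clock recovery: no transitions
scrambled = scramble(data)
print(longest_run(data), longest_run(scrambled))
print(scramble(scrambled) == data)   # True: descrambling recovers the data exactly
```

As the text notes, a scrambler only makes long runs unlikely; an input that happens to match the keystream would still produce a constant output.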

In other cases, the long-term DC value of the modulated signal is important, as building up a DC bias will tend to move communications circuits out of their operating range. In this case, special measures are taken to keep a count of the cumulative DC bias and to modify the codes if necessary to make the DC bias always tend back to zero.

Many of these codes are bipolar codes, where the pulses can be positive, negative or absent. In the typical alternate mark inversion code, non-zero pulses alternate between being positive and negative. These rules may be violated to generate special symbols used for framing or other special purposes.
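
The alternation rule of alternate mark inversion (AMI) is simple enough to sketch directly. In the example below the function names are hypothetical, pulses are represented as the integers +1, 0 and -1, and the first mark is arbitrarily assumed to be positive. Deliberate violations of the rule for framing are not modelled.

```python
def ami_encode(bits):
    """Encode bits as pulses: 0 becomes no pulse, 1 becomes alternating +1 and -1."""
    last_mark = -1            # assumption: the first mark is sent as +1
    symbols = []
    for b in bits:
        if b:
            last_mark = -last_mark
            symbols.append(last_mark)
        else:
            symbols.append(0)
    return symbols


def ami_decode(symbols):
    """Any non-zero pulse is a 1; absence of a pulse is a 0."""
    return [1 if s != 0 else 0 for s in symbols]


bits = [1, 0, 1, 1, 0, 1]
symbols = ami_encode(bits)
print(symbols)        # [1, 0, -1, 1, 0, -1]
print(sum(symbols))   # 0
```

Because consecutive marks cancel, the running sum of the pulses never leaves the range 0 to 1 in this sketch, which keeps the DC bias discussed above from accumulating.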

Nomenclature

The word pulse in the term pulse-code modulation refers to the pulses to be found in the transmission line. This perhaps is a natural consequence of this technique having evolved alongside two analog methods, pulse-width modulation and pulse-position modulation, in which the information to be encoded is represented by discrete signal pulses of varying width or position, respectively.[citation needed] In this respect, PCM bears little resemblance to these other forms of signal encoding, except that all can be used in time-division multiplexing, and the numbers of the PCM codes are represented as electrical pulses.

Explanatory notes

  1. Among the first recordings was Uzu: The World Of Stomu Yamash'ta 2 by Stomu Yamashta.
  2. The first recording with this new system was recorded in Tokyo during April 24–26, 1972.
  3. Other methods exist, such as pulse-density modulation, which is also used on Super Audio CD.
  4. Some systems use digital filtering to remove some of the aliasing, converting the signal from digital to analog at a higher sample rate such that the analog anti-aliasing filter is much simpler. In some systems, no explicit filtering is done at all; as it is impossible for any system to reproduce a signal with infinite bandwidth, inherent losses in the system compensate for the artifacts, or the system simply does not require much precision.
  5. Quantization error swings between -q/2 and q/2. In the ideal case (with a fully linear ADC and signal level >> q) it is uniformly distributed over this interval, with zero mean and variance of q²/12.
  6. A slight difference between the encoding and decoding clock frequencies is not generally a major concern; a small constant error is not noticeable. Clock error does become a major issue if the clock contains significant jitter, however.

References

  1. Alvestrand, Harald Tveit; Salsman, James (May 1999). "RFC 2586 – The Audio/L16 MIME content type". The Internet Society. doi:10.17487/RFC2586. Retrieved March 16, 2010.
  2. Casner, S. (March 2007). "RFC 4856 – Media Type Registration of Payload Formats in the RTP Profile for Audio and Video Conferences – Registration of Media Type audio/L8". The IETF Trust. doi:10.17487/RFC4856. Retrieved March 16, 2010.
  3. Bormann, C.; Casner, S.; Kobayashi, K.; Ogawa, A. (January 2002). "RFC 3190 – RTP Payload Format for 12-bit DAT Audio and 20- and 24-bit Linear Sampled Audio". The Internet Society. doi:10.17487/RFC3190. Retrieved March 16, 2010.
  4. "Audio Media Types". Internet Assigned Numbers Authority. Retrieved March 16, 2010.
  5. "Linear Pulse Code Modulated Audio (LPCM)". Library of Congress. April 19, 2022. Retrieved September 5, 2022.
  6. Noll, A. Michael (1997). Highway of Dreams: A Critical View Along the Information Superhighway. Telecommunications (Revised ed.). Mahwah, NJ: Erlbaum. p. 50. ISBN 978-0-8058-2557-2.
  7. Leibson, Steven (September 7, 2021). "A Brief History of the Single-Chip DSP, Part I". EEJournal. Retrieved September 19, 2024.
  8. Barrett, G. Douglas (2023). Experimenting the Human: Art, Music, and the Contemporary Posthuman. Chicago London: The University of Chicago Press. p. 102. ISBN 978-0-226-82340-9.
  9. "The Bartlane Transmission System". DigicamHistory. Archived from the original on February 10, 2010. Retrieved January 7, 2010.
  10. U.S. patent number 1,608,527; also see p. 8, Data conversion handbook, Walter Allan Kester, ed., Newnes, 2005, ISBN 0-7506-7841-0.
  11. John Vardalas (June 2013), Pulse Code Modulation: It all Started 75 Years Ago with Alec Reeves, IEEE.
  12. US 2272070.
  13. Porter, Arthur (2004). So Many Hills to Climb. Beckham Publications Group. ISBN 9780931761188. [page needed]
  14. Sears, R. W. (January 1948). Electron Beam Deflection Tube for Pulse Code Modulation. Vol. 27. Bell Labs. pp. 44–57. Retrieved May 14, 2017.
  15. Goodall, W. M. (January 1951). Television by Pulse Code Modulation. Vol. 30. Bell Labs. pp. 33–49. Retrieved May 14, 2017.
  16. "Bernard Oliver". National Inventor's Hall of Fame. Archived from the original on December 5, 2010. Retrieved February 6, 2011.
  17. "Claude Shannon". National Inventor's Hall of Fame. Archived from the original on December 6, 2010. Retrieved February 6, 2011.
  18. "National Inventors Hall of Fame announces 2004 class of inventors". Science Blog. February 11, 2004. Retrieved February 6, 2011.
  19. B. M. Oliver; J. R. Pierce & C. E. Shannon (November 1948). "The Philosophy of PCM". Proceedings of the IRE. 36 (11): 1324–1331. doi:10.1109/JRPROC.1948.231941. ISSN 0096-8390. S2CID 51663786.
  20. P. Cummiskey, N. S. Jayant, and J. L. Flanagan, "Adaptive quantization in differential PCM coding of speech," Bell Syst. Tech. J., vol. 52, pp. 1105–1118, Sept. 1973.
  21. Thomas Fine (2008). "The dawn of commercial digital recording" (PDF). ARSC Journal. 39 (1): 1–17.
  22. Roger Nichols. "I Can't Keep Up With All The Formats II". Archived from the original on October 20, 2002. "The Ry Cooder Bop Till You Drop album was the first digitally recorded pop album".
  23. Allstot, David J. (2016). "Switched Capacitor Filters" (PDF). In Maloberti, Franco; Davies, Anthony C. (eds.). A Short History of Circuits and Systems: From Green, Mobile, Pervasive Networking to Big Data Computing. IEEE Circuits and Systems Society. pp. 105–110. ISBN 9788793609860. Archived from the original (PDF) on September 30, 2021. Retrieved November 29, 2019.
  24. Floyd, Michael D.; Hillman, Garth D. (October 8, 2018) [1st pub. 2000]. "Pulse-Code Modulation Codec-Filters". The Communications Handbook (2nd ed.). CRC Press. pp. 26–1, 26–2, 26–3. ISBN 9781420041163.
  25. Cambron, G. Keith (October 17, 2012). Global Networks: Engineering, Operations and Design. John Wiley & Sons. p. 345.
  26. Blu-ray Disc Association (March 2005), White paper Blu-ray Disc Format – 2.B Audio Visual Application Format Specifications for BD-ROM (PDF), retrieved July 26, 2009.
  27. "DVD Technical Notes (DVD Video – "Book B") – Audio data specifications". July 21, 1996. Retrieved March 16, 2010.
  28. Jim Taylor. "DVD Frequently Asked Questions (and Answers) – Audio details of DVD-Video". Retrieved March 20, 2010.
  29. "How DV works". Archived from the original on December 6, 2007. Retrieved March 21, 2010.
  30. "AVCHD Information Website – AVCHD format specification overview". Retrieved March 21, 2010.
  31. EBU (July 2009), EBU Tech 3306 – MBWF / RF64: An Extended File Format for Audio (PDF), archived from the original (PDF) on November 22, 2009, retrieved January 19, 2010.
  32. Mostafa, Mohamed; Kumar, Rajesh (May 2001). "RFC 3108 – Conventions for the use of the Session Description Protocol (SDP) for ATM Bearer Connections". doi:10.17487/RFC3108. Retrieved March 16, 2010.
  33. "PCM, Pulse Code Modulated Audio". Library of Congress. April 6, 2022. Retrieved September 5, 2022.
  34. Montgomery, Christopher. "24/192 Music Downloads, and why they do not make sense". Chris "Monty" Montgomery. Archived from the original on September 6, 2014. Retrieved March 16, 2013.
  35. https://its.bldrdoc.gov/fs-1037/dir-039/_5829.htm [failed verification]
  36. Stallings, William, Digital Signaling Techniques, December 1984, Vol. 22, No. 12, IEEE Communications Magazine.
