In companding, we convert the signal nonlinearly into the digital domain to achieve compression. Mu-law coding (also written μ-law) is a nonlinear companding method used in telecommunications. Companding is a method of compressing a digital signal by reducing its bit depth before it is sent and then expanding it when it is received. The advantage of mu-law coding is that it retains some of the dynamic range that would be lost if a linear method of reducing bit depth were used instead. Let's take a closer look at the difference between linear and nonlinear methods of reducing bit depth.

| Original 16-bit sample | Sample after linear companding (16 to 8 to 16 bits) | Percentage error | Sample after nonlinear companding (16 to 8 to 16 bits) | Percentage error |
|---|---|---|---|---|
| 33 | 0 | 100% | 32 | 3% |
| 66 | 0 | 100% | 63 | 4.50% |
| 32145 | 32000 | 0.45% | 31373 | 2.40% |
| 32178 | 32000 | 0.55% | 31373 | 2.50% |

According to Nuvoton's archived Speech Codec FAQ, μ-law and A-law are audio compression methods defined by ITU-T Recommendation G.711 that compress 16-bit linear data into 8-bit logarithmic data. The encoding process (called logarithmic companding) breaks the linear data into segments, with each successively higher segment doubling in size. This ensures that low-amplitude signals (where most of the information in speech is found) receive the highest bit resolution while still leaving enough dynamic range to encode high-amplitude signals. Although this method does not offer a very high compression ratio (about 2:1), it does not require much processing power to decode.
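To make the nonlinear columns of the table more concrete, here is a minimal Python sketch of a mu-law round trip. It rests on assumptions that are not stated in the text: the standard continuous mu-law curve with μ = 255, 16-bit samples normalized by 32768, and 8-bit code magnitudes obtained by scaling to 128 levels and truncating. The function names are illustrative only. Decoded low-amplitude values depend on rounding conventions, so they may differ by a unit or two from the tabulated figures.

```python
import math

MU = 255            # assumed: the usual mu value for 8-bit mu-law
FULL_SCALE = 32768  # assumed: 16-bit samples are normalized by 2**15
LEVELS = 128        # assumed: signed 8-bit code with magnitudes 0..127


def mu_law_encode(sample: int) -> int:
    """Map a signed 16-bit sample to a signed 8-bit code (truncating)."""
    x = abs(sample) / FULL_SCALE
    # Continuous mu-law compression: ln(1 + mu*x) / ln(1 + mu)
    y = math.log1p(MU * x) / math.log1p(MU)
    code = min(int(y * LEVELS), LEVELS - 1)
    return -code if sample < 0 else code


def mu_law_decode(code: int) -> int:
    """Map a signed 8-bit code back to a signed 16-bit sample (truncating)."""
    y = abs(code) / LEVELS
    # Inverse curve: ((1 + mu)**y - 1) / mu
    x = ((1 + MU) ** y - 1) / MU
    value = int(x * FULL_SCALE)
    return -value if code < 0 else value


for sample in (33, 66, 32145, 32178):
    code = mu_law_encode(sample)
    print(sample, "->", code, "->", mu_law_decode(code))
```

Under these assumptions, the two small samples land on different 8-bit codes, while the two large samples share one code, which is the behavior the table illustrates.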
The μ-law and A-law companding standards use logarithm-based functions to encode audio samples for Integrated Services Digital Network (ISDN) telephone services by means of nonlinear quantization. Both are companding techniques used in telephone systems. A-law is used throughout Europe as the companding standard recommended by the CCITT. By limiting the linear sample value to 12 bits, we can obtain the A-law equation given below. Here, A is the compression parameter, whose value is about 87.6 in Europe, and x is the normalized input value to be compressed.
The μ-law algorithm (sometimes spelled mu-law, often approximated as u-law) is a companding algorithm primarily used in 8-bit digital PCM telecommunications systems in North America and Japan. It is one of two versions of ITU-T standard G.711, the other being the similar A-law. A-law is used in regions where digital telecommunications signals are carried on E-1 circuits, such as Europe. U-law, μ-law, or mu-law is a standard form of signal compression in digital telecommunications.
It is one of the two standard versions of G.711. This companding algorithm is used in telecommunications in North America and Japan to optimize the dynamic range of an analog audio signal before it is digitized.
With mu-law encoding, the results are better than if a linear method of bit reduction had been used. Let's look at the linear method. Again, let's say we convert from 16 to 8 bits by dividing by 256 and rounding down. Then the two 16-bit values 33 and 66 would both be converted to 0 in 8 bits. To convert back up to 16 bits, we multiply by 256 again and still have 0. We have lost all the information contained in these samples. On the other hand, the linear method converts both 32145 and 32178 to 125 in 8 bits.
When restored to 16 bits, they both become 32000. A comparison of the percentage errors for all these samples under the two conversion methods is presented in Table 5.2. You can see that mu-law coding preserves low-amplitude information that would have been lost by a linear method. The overall result is that, with a bit depth of only 8 bits for data transmission, it is possible to recover a dynamic range of about 12 bits, or 72 dB, when the data is decompressed to 16 bits (as opposed to the 48 dB expected from only 8 bits). The dynamic range is increased because fewer low-amplitude samples fall below the noise floor.
Let's take a closer look at the steps of this algorithm. The μ-law algorithm provides a slightly wider dynamic range than A-law at the cost of worse proportional distortion for small signals. By convention, A-law is used for an international connection if at least one country uses it.
In the next two steps, we combine E4/5 with 9, and then D4 with G4, as shown. Steps 2 and 3 are independent of each other and could actually be done in parallel.
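Returning to the linear example and the dynamic-range figures quoted above, the arithmetic can be checked with a short Python sketch. It assumes non-negative samples, integer division by 256 for the 16-to-8-bit step, and the usual 20·log10(2^n) estimate of dynamic range for n bits. The function names are illustrative only.

```python
import math


def linear_round_trip(sample: int) -> int:
    """Reduce a non-negative 16-bit sample to 8 bits and restore it."""
    code = sample // 256   # 16 -> 8 bits: divide by 256 and round down
    return code * 256      # 8 -> 16 bits: multiply by 256 again


def percent_error(original: int, restored: int) -> float:
    """Error of the restored sample relative to the original, in percent."""
    return abs(original - restored) / abs(original) * 100


def dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of an n-bit linear scale in decibels."""
    return 20 * math.log10(2 ** bits)


for sample in (33, 66, 32145, 32178):
    restored = linear_round_trip(sample)
    print(sample, restored, round(percent_error(sample, restored), 2))

print(round(dynamic_range_db(8), 1), round(dynamic_range_db(12), 1))
```

The small samples collapse to 0 (a 100% error), while the large ones both come back as 32000. The two dynamic-range values come out close to the 48 dB and 72 dB mentioned in the text.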
The frame is divided into frequency bands with filter banks. Each filter bank is a bandpass filter that allows only one range of frequencies to pass through. (Chapter 7 provides more details on bandpass filters.) The total frequency range that can occur in the original signal runs from 0 to 1/2 of the sampling rate, as we know from the Nyquist theorem. For example, if the sampling rate is 44.1 kHz, the highest frequency that can be present in the signal is 22.05 kHz. Thus, the filter banks produce 32 frequency bands between 0 and 22.05 kHz, each with a width of 22050/32, or about 689 Hz.
The values are raised to the 3/4 power before quantization. This yields nonuniform quantization, which reduces quantization noise for low-amplitude signals, where it is most damaging.
5. Sort the sub-bands into 22 groups called scale factor bands and determine a scale factor for each scale factor band based on the SMR.
For quantization, use nonuniform quantization in combination with the scale factors.
2. Use the Fourier transform to transform the time-domain data into the frequency domain and send the results to the psychoacoustic analyzer.
A) Determine the position of the most significant bit in the input.
Samples 33 and 66 on a 16-bit scale become 5 and 9 on an 8-bit scale, respectively. Samples 32145 and 32178 both become 127 on an 8-bit scale. Although the difference between 66 and 33 is the same as the difference between 32178 and 32145, 66 and 33 fall to different quantization levels when converted to 8 bits, whereas 32178 and 32145 fall to the same level. After the mu-law function is applied, there are more quantization levels at lower amplitudes. μ-law encoding effectively reduces the dynamic range of the signal, thereby increasing coding efficiency while biasing the signal in a way that yields a higher signal-to-distortion ratio than linear coding achieves for a given number of bits.
MDCT, like the Fourier transform, can be used to change audio data from the time domain to the frequency domain. Its distinction is that it is applied to overlapping windows, which minimizes the spurious frequencies that arise from discontinuities at window boundaries. ("Spurious frequencies" are frequencies that are not actually present in the audio but result from the transform.) The overlap between successive MDCT windows depends on the information that the psychoacoustic analyzer provides about the type of audio in the frame and band.
When transients are involved, a shorter window is used to achieve higher time resolution. Otherwise, a longer window is used for higher frequency resolution.
Compression and expansion according to the A-law equation are shown in Figure 2. Because it pairs an encoding (compression) step with a decoding (expansion) step, the scheme is also known as A-law companding.
Is this simply a matter of political rivalry between two standards organizations, or is there a compelling technical reason to choose one over the other? (I can't help but notice that telecommunications organizations tied to different countries seem to strive for "hegemony" at the expense of simplicity and interoperability.)
This code word is then scaled to (-256, 255) using the following format recommended by G.711:
8-bit code $= \operatorname{sgn}(x)\,\dfrac{A\,|x|}{1 + \ln A}$ for $0 \le |x| \le 1/A$
People hear best (i.e., have the greatest amplitude sensitivity) in the range of about 1000 to 5000 Hz, which is close to the range of the human voice. We hear less well at both ends of the frequency spectrum.
This is shown in Figure 5.45, which plots the threshold of hearing across frequencies.
Our original 16-bit values of 33 and 66 are converted to the 8-bit values 5 and 9, respectively. We can convert them back to 16 bits using the inverse mu-law function. The range of the signed input is (-4096, +4095), and this sample input x is normalized to the interval (-1, 1) for use in the logarithmic expression.
The resulting 32 bands are still in the time domain. Note that dividing the audio signal into frequency bands increases the amount of data at this point by a factor of 32. That is, there are 32 sets of 1152 time-domain samples, each containing only the frequencies of its band. (You can understand this better if you imagine that the audio signal is a piece of music that you break into 32 frequency bands. After the separation, you could play each band individually and hear the piece of music, but only the frequencies in that band. However, the segments would need to be longer than 1152 samples for you to hear any music, since 1152 samples amount to only 0.026 seconds of sound at a sampling rate of 44.1 kHz.)
Another source of compression in FLAC is inter-channel decorrelation. For stereo input, the left and right channels can be converted to mid and side channels.
The effect of the mu-law function is that quantization intervals are more finely spaced at lower amplitudes.
FLAC is unpatented and open source.
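The mid/side conversion mentioned above can be sketched in Python as follows. The text does not give the equations, so this sketch assumes the commonly described integer form, mid = (left + right) >> 1 and side = left - right. The function names are hypothetical and this is not taken from the FLAC source code. The point of the example is that the conversion is lossless: the bit dropped by the shift can be recovered from the parity of the side channel.

```python
def to_mid_side(left: int, right: int) -> tuple[int, int]:
    """Convert a stereo sample pair to (mid, side); assumed integer form."""
    mid = (left + right) >> 1   # average, rounded down
    side = left - right         # difference between the channels
    return mid, side


def from_mid_side(mid: int, side: int) -> tuple[int, int]:
    """Recover (left, right) exactly from (mid, side)."""
    # left + right and left - right always have the same parity, so the
    # low bit of side restores the bit that the shift discarded.
    total = (mid << 1) | (side & 1)
    left = (total + side) >> 1
    right = (total - side) >> 1
    return left, right


mid, side = to_mid_side(1000, 998)
print(mid, side, from_mid_side(mid, side))
```

When the two channels are similar, the side values stay close to zero and therefore tend to need fewer bits, which is why this step helps compression.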
The FLAC website is a good source of details and documentation on the implementation of the codec.
A-law and μ-law are two algorithms used to modify an input signal before digitization. These algorithms are implemented in telephony systems around the world. The difference between them is fairly small, and most people would not notice it. The first difference between the two is the dynamic range of the output: μ-law has a wider dynamic range than A-law.
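To see how the two curves differ numerically, the following Python sketch evaluates both compression functions on normalized inputs in (-1, 1). It assumes the standard continuous definitions with μ = 255 and A = 87.6. The logarithmic branch of the A-law function for |x| ≥ 1/A is the usual textbook form and is not shown in the text above. The function names are illustrative only.

```python
import math

A = 87.6   # assumed: commonly quoted A-law compression parameter
MU = 255   # assumed: commonly quoted mu-law compression parameter


def a_law_compress(x: float) -> float:
    """Continuous A-law compression of a normalized sample in [-1, 1]."""
    ax = abs(x)
    if ax < 1 / A:
        # Linear segment for very small inputs
        y = A * ax / (1 + math.log(A))
    else:
        # Logarithmic segment (standard form, not given in the text)
        y = (1 + math.log(A * ax)) / (1 + math.log(A))
    return math.copysign(y, x)


def mu_law_compress(x: float) -> float:
    """Continuous mu-law compression of a normalized sample in [-1, 1]."""
    y = math.log1p(MU * abs(x)) / math.log1p(MU)
    return math.copysign(y, x)


for x in (0.001, 0.01, 0.1, 0.5, 1.0):
    print(x, round(a_law_compress(x), 4), round(mu_law_compress(x), 4))
```

Under these assumptions the μ-law curve maps very small inputs to larger outputs than the A-law curve does, and the two curves converge as the input approaches full scale.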