The two important vectors driving audio on handset and tablet platforms are the increasing need for mobile product differentiation and higher user expectations. As for differentiation, smart phone and tablet OEMs are building their products around applications processors and baseband modem ICs (both integrated and multi-chip) from various merchant IC suppliers. As a result, the OEMs do not control, and thus cannot differentiate, the feature sets in the same way they did when they specified and/or designed the chip sets themselves. Now they must find other ways to differentiate how their products look, feel, and sound, and what they enable users to do. In the not too distant past handset users expected very little in the way of sound, so whatever amplifier was built into the core chip set was good enough. However, with higher performance applications processors, mobile capabilities and user expectations have mushroomed.
Any time/anywhere audio puts a huge burden on cell phone designers to get everything they can out of the little speakers on handsets and tablets, which simply are not known for great sound. Little speakers are limited, but audio expectations are limitless. That boils down to making higher output (louder) speaker amplifiers, adding DSP sound processing (aka audio post-processing), eliminating pops and clicks, and doing other audio tricks. A lot of engineering is necessary to make those things happen, and because the platforms are mobile they must happen with extremely low power consumption and without creating unwanted secondary effects like EMI.
Audio engineering is a work of art because sound is such a subjective and personal experience and audio engineering is replete with trade-offs and judgment calls. Below are several examples of the techniques that audio engineers employ to bring better sound to mobile platforms. Each can be a dissertation in and of itself, so they are briefly introduced to call to mind some of the important issues, trends, and solutions.
1. Dynamic Range, SNR, and THD+N
The classic measure of performance in audio is signal to noise ratio (SNR) or dynamic range. As with many comparative measures of a product’s performance, SNR measurements have been subject to “specsmanship” where various interpretations are used to position a product in the best light with less than 100% regard for objective purity. To provide more objectivity, the Audio Engineering Society (AES) created the “AES standard method for digital audio engineering-measurement of digital audio equipment (AES17-1998 r2004)” to measure an audio converter including the entire audio signal chain. This specification defines dynamic range as the ratio of the full-scale input/output level to the background noise level, measured with a 1 kHz input at 60 dB below full scale (-60 dBFS).
When describing audio performance, other important specifications come into play, such as total harmonic distortion plus noise (THD+N), for which measurements are taken with the signal 1 dB below full scale (-1 dBFS). THD plus noise is a more common and useful metric than THD alone because it represents the usage case more accurately (i.e. noise matters). As a formula: THD+N = (harmonic power + noise power) / total output power. Note that an output power rating is meaningful only when the distortion level at which it was measured is specified, such as 0.01%, 0.1%, 1%, or 10%, so always look for that.
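The THD+N ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example power values are hypothetical, and the percentage is computed as an amplitude ratio (square root of the power ratio), which is how THD+N percentages are usually quoted.

```python
import math

def thd_plus_n(fundamental_power, harmonic_powers, noise_power):
    """THD+N as a power ratio: (harmonic power + noise power) / total output power."""
    unwanted = sum(harmonic_powers) + noise_power
    total = fundamental_power + unwanted
    return unwanted / total

def ratio_to_db(power_ratio):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(power_ratio)

def ratio_to_percent(power_ratio):
    """Express a power ratio as an amplitude percentage (assumed convention)."""
    return 100.0 * math.sqrt(power_ratio)

# Hypothetical measurement: 1 W fundamental, 0.9 uW of harmonics, 0.1 uW of noise.
ratio = thd_plus_n(1.0, [0.6e-6, 0.3e-6], 0.1e-6)
print(f"THD+N = {ratio_to_db(ratio):.1f} dB, {ratio_to_percent(ratio):.3f} %")
```

With these made-up numbers the result lands at roughly -60 dB, or about 0.1%, which is why a power rating quoted "at 0.1% THD+N" and one quoted "at 10%" are not comparable.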
The chart below makes a direct comparison of the dynamic ranges of different audio equipment, from CD to professional audio. The driving force for cell phone and tablet audio performance is to approach or beat the dynamic range of MP3 players (around 100 dB), and that is what is happening now in the market.
Figure 1. Dynamic range of audio products
2. Microphones
Support for several mics on mobile platforms is becoming necessary since multiple mics are required for DSP echo cancellation and other sophisticated audio and voice processing. Differential mic inputs are being used to lower susceptibility to noise, RF, and crosstalk. In mobile audio systems, mics, audio codecs, and DSP processors work together to enable advanced features. DSP is increasingly the battleground in smart phone audio. DSP echo cancellation, 3D spatialization, and other audio features use multiple mics to provide inputs for processing. A simple example of why at least two mics are needed is the act of separating voice signals from background noise. One mic listens to the voice plus background noise and the other to the background noise only. The DSP/codec system then uses those two signals to subtract out the noise and provide a clearer voice signal.
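The two-mic subtraction idea can be sketched with a one-tap LMS adaptive canceller. This is a simplified illustration, not how any particular codec or voice processor does it: the signals are synthetic, the assumption that the noise reaches the primary mic as a plain scaled copy of the reference mic is unrealistic, and the names and step size are hypothetical.

```python
import math
import random

def lms_noise_cancel(primary, reference, mu=0.005):
    """One-tap LMS canceller.

    primary   : samples from the mic that hears voice plus noise
    reference : samples from the mic that hears (ideally) noise only
    Returns the cleaned signal and the final adapted noise gain.
    """
    w = 0.0
    cleaned = []
    for d, x in zip(primary, reference):
        e = d - w * x      # subtract the estimated noise; e is the voice estimate
        w += mu * e * x    # nudge the gain to reduce the remaining noise
        cleaned.append(e)
    return cleaned, w

# Synthetic test signals (assumptions for illustration only).
rng = random.Random(0)
n = 8000
voice = [0.5 * math.sin(2 * math.pi * 300 * i / 8000) for i in range(n)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
primary = [v + 0.8 * x for v, x in zip(voice, noise)]  # noise leaks in at gain 0.8

cleaned, w = lms_noise_cancel(primary, noise)
residual = sum((c - v) ** 2 for c, v in zip(cleaned[-2000:], voice[-2000:])) / 2000
print(f"adapted gain ~ {w:.2f}, residual noise power ~ {residual:.5f}")
```

The adaptive gain is used instead of a fixed subtraction because, in a real handset, the noise level seen by each mic depends on placement and is not known in advance.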
The use of digital mics in lieu of analog mics is increasingly popular and being driven by mobile OEMs since digital traces are more immune to injection of noise from noisy cell phone processors and radio ICs such as the baseband modems, Bluetooth, GPS, and WiFi ICs. Due to such noise injection immunity, digital mics allow for more flexibility in the placement of microphones thus increasing the options for more interesting industrial design, which is clearly growing in importance for mobile end products. Also, there is a trend towards MEMS mics, which are joining ECM mics in the mobile market. The chart below shows the increasing use of digital mics, both ECM and MEMS.
Figure 2. Microphone shipments by type
3. Pop and Click Reduction
Pop and click noise is a classic issue in audio. It occurs when an audio input is powered up or powered down, muted, or connected to different loads, any of which can create transients that can be heard through the speaker or headphone. Pop and click reduction is increasingly important in mobile applications since mobile operation naturally leads to turning things off to save power. Of course, that means they have to be turned on again, and those on/off and off/on transitions can cause audible transients. In the past, pop and click abatement was addressed with external capacitors, but with increased levels of integration the drive has been to eliminate capacitors for cost and performance reasons and provide innovative on-chip approaches to reduce pop and click.
Switching between internal and external mics when a headset is plugged in is another source of pops and clicks. This can be managed by providing insertion detection circuitry. Slow turn-on and turn-off is another way to suppress pop and click noise. It is useful with audio drivers that do not have blocking caps, because DC offset voltages can be present. Slow turn-on, as the name suggests, ramps the switching resistance so that the load voltage slews slowly enough that the pop or click does not occur. One way to do that is by using integrated audio switches that provide slow turn-on. The diagram below from a Fairchild slow turn-on audio switch illustrates the concept. This particular product has an adjustable (slow) turn-on and turn-off time ranging from 1 to 200 msec. The diagram shows the 100 msec setting.
Figure 3. Slow turn on/off timing.
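The effect of a slow turn-on can be sketched numerically. The example below is only a model: it treats the switch as a linear gain ramp rather than a ramped switch resistance, and the 10 mV DC offset and 48 kHz sample rate are hypothetical values chosen for illustration.

```python
def turn_on_ramp(ramp_ms, sample_rate_hz=48000):
    """Linear gain ramp from just above 0 up to 1.0 over ramp_ms milliseconds."""
    n = max(1, round(ramp_ms * sample_rate_hz / 1000))
    return [(i + 1) / n for i in range(n)]

def largest_step(dc_offset_v, gains):
    """Largest sample-to-sample voltage jump seen by the load during turn-on."""
    prev = 0.0
    biggest = 0.0
    for g in gains:
        v = dc_offset_v * g
        biggest = max(biggest, abs(v - prev))
        prev = v
    return biggest

abrupt = largest_step(0.010, [1.0])               # switch closes in one step
slow = largest_step(0.010, turn_on_ramp(100.0))   # 100 msec setting
print(f"abrupt step: {abrupt * 1e3:.3f} mV, slow turn-on step: {slow * 1e6:.3f} uV")
```

The same DC offset still ends up on the load, but it arrives in thousands of tiny increments instead of one audible jump.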
Especially for mobile, pop and click design know-how becomes critical and is another one of those places where audio engineering starts to look more like an art form, because perception of pop and click is subjective, making experience and the little tricks of the trade a bigger part of the success model.
4. AGC (Automatic Gain Control) and DRC (Dynamic Range Control)
Anyone who has turned up a home stereo amp too high (and who hasn’t?) is intimately aware of what happens when a speaker is over-driven. Your ears are greeted with ugly distortion. On the other hand, the sense of a speaker’s loudness cannot be obtained unless the amp is allowed to get close to the speaker’s limits. A fine line needs to be approached. One way to do that is to control the amp’s gain through automatic gain control (AGC) and dynamic range control (DRC), which try to optimize the sound quality by preventing distortion while maximizing the signal level. This is like the old days of making analog recordings, when you would manually adjust the input volume while watching the VU meter on the tape recorder, keeping the needle out of the red zone during loud passages and increasing the volume during quiet periods so as to mask the bias hiss from analog tape recorders.
Figure 4. Classic VU meter
This was literally manual dynamic range compression. Now it is done automatically by ICs that monitor the signal level and provide feedback to increase or decrease the gain. A too-small signal will call for more gain and a too-large signal will call for reducing the gain: clearly this is a simple concept. The result is compression of the dynamic range (i.e. higher lows and lower highs). The output signal is in a narrower amplitude band but is overall louder and is psycho-acoustically more attractive. It is a proven method of providing a better audio experience. AGC/DRC operation can happen in various places in an audio system such as the digital processor, the digital section of the audio codec, or the speaker amplifier.
Figure 5. AGC/DRC increase the volume (Sound Pressure Level) without creating distortion from clipping
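The "higher lows and lower highs" behavior can be sketched as a static gain curve. This is a deliberately simplified model: real AGC/DRC blocks track the signal level over time with attack and release behavior, and the threshold, ratio, and make-up gain below are hypothetical settings rather than values from any particular IC.

```python
def drc_gain_db(level_db, threshold_db=-20.0, ratio=4.0, makeup_db=6.0):
    """Gain (in dB) to apply for a given input level (in dB relative to full scale).

    Below the threshold only the make-up gain is applied (quiet passages get louder).
    Above it, the output rises 1 dB for every `ratio` dB of input (loud passages are held back).
    """
    if level_db <= threshold_db:
        return makeup_db
    compressed_db = threshold_db + (level_db - threshold_db) / ratio
    return compressed_db - level_db + makeup_db

for level in (-40.0, -20.0, 0.0):
    out = level + drc_gain_db(level)
    print(f"input {level:6.1f} dB -> output {out:6.1f} dB")
```

With these settings a 40 dB input range (-40 dB to 0 dB) is squeezed into a 25 dB output range (-34 dB to -9 dB), so the average level goes up while the peaks stay away from clipping.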
With loudness levels increasing for small speakers, the risk of damage is going up. Therefore, speaker protection methods are becoming more important. One of the most dangerous phenomena a speaker can encounter is a clipped signal. Clipping occurs when the amplifier is driven beyond the limits of its output swing, which causes the signal to look increasingly like a square wave rather than a normal audio signal. The term clipping is used because the top of the waveform looks clipped off.
Figure 6. Clipped signal
Clipping is dangerous to the speaker because, as the signal starts to square off, more high frequency harmonics are created, presenting more energy than the speaker can tolerate, which can cause permanent damage. Mobile audio amps are now adding anti-clipping mechanisms to limit their output to better match and thereby protect speakers. The most basic anti-clipping method is to use an AGC technique to set the gain such that the output does not go beyond a pre-set level. Other, more sophisticated methods of speaker protection are being developed that actively monitor what is happening to the speaker and feed information back to the amp to keep the operation safe. Some active speaker protection schemes use small signal processors on the amp itself to conduct signal analysis. All types of speaker protection, active or passive, make it easier for mobile platform makers to provide louder and more durable audio, so speaker protection will continue to be a key item in the audio pantheon.
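The link between clipping and added harmonics can be checked with a short sketch. It hard-clips one period of a sine wave at half its amplitude (an arbitrary, hypothetical clip level) and measures the third harmonic with a single-bin DFT; the function names are illustrative only.

```python
import math

def hard_clip(x, limit):
    """Limit a sample to the range [-limit, +limit]."""
    return max(-limit, min(limit, x))

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of exactly one period, via one DFT bin."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

N = 1000
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
clipped = [hard_clip(s, 0.5) for s in sine]

print(f"3rd harmonic, clean sine:   {harmonic_amplitude(sine, 3):.6f}")
print(f"3rd harmonic, clipped sine: {harmonic_amplitude(clipped, 3):.6f}")
```

The clean sine has essentially no third harmonic, while the clipped version shows a clearly measurable one, and higher odd harmonics appear in the same way.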
5. Noise Gate
Certain mobile audio amplifiers are now offering a noise gate function. A noise gate mutes the output when the signal is below a specified level, just like an old-fashioned squelch, to eliminate radio, DAC, and other noise in the system. The noise gate function is typically more useful for voice than music. Depending on the design, attack and decay parameters are sometimes provided to control how quickly the gate opens and closes.
Figure 7. Noise gate mutes low signals to remove the perception of noise
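A noise gate can be sketched with a simple peak envelope follower. This is a minimal illustration, assuming an instant attack and an exponential decay; the threshold and release values are hypothetical, and real devices may implement attack and decay differently.

```python
def noise_gate(samples, threshold=0.05, release=0.99):
    """Mute samples whenever the tracked signal envelope is below the threshold.

    The envelope jumps up immediately with the signal (instant attack) and
    decays by the factor `release` per sample, so the gate does not chatter
    on every zero crossing of a loud signal.
    """
    out = []
    envelope = 0.0
    for s in samples:
        envelope = max(abs(s), envelope * release)
        out.append(s if envelope >= threshold else 0.0)
    return out

# Low-level hiss is muted, the loud sample opens the gate, and the tail passes
# until the envelope decays back below the threshold.
print(noise_gate([0.01, -0.02, 0.5, 0.01, 0.0]))
```

A longer release keeps the gate open through short pauses in speech; a shorter one mutes the noise floor sooner.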
6. Equalizers
An equalizer is simply a set of volume controls applied to narrow slices of the audio frequency range. 5-band equalizers are typical in mobile platforms. Equalizers enable system designers to customize the output signal to optimize the performance of the speaker in a particular environment. Back in the early days of audiophile stereo, when equalizers became known to the general audio consumer, it was noted that once an equalizer was set to account for the particular acoustics of the room it was in, it was a matter of “set it and forget it” (to steal a phrase from Ron Popeil). This is a similar case, but instead of the room, the mobile designer is concerned about the case of the handset or tablet. The idea is to match the speaker and the case for optimal acoustic performance. Equalization is particularly important if the speaker is jammed into a non-optimal position in the case, which often happens due to industrial design considerations. The equalizer, which is on the chip set’s internal codec or in an external audio codec, sets the parameters for the particular enclosure (case) and thus improves the overall sound of the product. Equalization can be viewed as one of several important DSP functions that make the handset sound better.
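One common way to build such per-band volume controls digitally is a cascade of peaking biquad filters. The sketch below uses the widely circulated Audio EQ Cookbook peaking-filter coefficients and evaluates the combined frequency response; the five center frequencies and gains are hypothetical settings, not taken from any real handset or codec.

```python
import cmath
import math

def peaking_biquad(f0_hz, gain_db, q=1.0, fs_hz=48000.0):
    """Coefficients (b, a) of a peaking EQ biquad centered at f0_hz."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp)
    a = (1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp)
    return b, a

def response_db(bands, freq_hz, fs_hz=48000.0):
    """Combined gain in dB of a cascade of biquads at one frequency."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / fs_hz)  # z^-1 on the unit circle
    total = 0.0
    for b, a in bands:
        h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
        total += 20.0 * math.log10(abs(h))
    return total

# Hypothetical "set it and forget it" tuning for a small speaker in a tight case.
SETTINGS = [(100.0, 6.0), (300.0, 3.0), (1000.0, 0.0), (3000.0, -2.0), (8000.0, 2.0)]
five_band = [peaking_biquad(f, g) for f, g in SETTINGS]

for f, _ in SETTINGS:
    print(f"{f:7.0f} Hz: {response_db(five_band, f):+.2f} dB")
```

Because neighboring bands overlap, the gain measured at a band center is not exactly the value dialed in for that band, which is one reason enclosure tuning tends to be iterative.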
7. Sound Mixers
A theme that seems to be emerging when looking at the features of mobile audio today is that the stereo equipment experiences that audio fans had in the pure analog days have clear counterparts in the current mobile audio IC domain. Another item to add to this list is the sound mixer. The concept is very simple: just like it sounds, a mixer takes sound from different sources and merges it into the same output. IC audio codecs are able to do that now, and mixers with sample rate conversion have become a standard codec feature. The mixing function is used in mobile phones to allow the user to listen to music in the background while receiving a call or navigation commands, and to use voice recognition software in conjunction with noise cancellation technology. Various analog line and microphone inputs are mixed in the IC from sources such as baseband voice, mobile TV, GPS, WiFi and FM radio.
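Mixing with sample rate conversion can be sketched as two small steps: bring every source to a common rate, then form a weighted sum. The example is a rough illustration only; linear interpolation is the crudest possible rate converter (real codecs use proper filtering), and the function names and gains are hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Convert a stream from src_rate to dst_rate by linear interpolation."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source stream
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1.0 - frac) + nxt * frac)
    return out

def mix(streams, gains):
    """Weighted sum of equal-rate streams, clamped to the range [-1.0, 1.0]."""
    length = min(len(s) for s in streams)
    mixed = []
    for i in range(length):
        total = sum(g * s[i] for s, g in zip(streams, gains))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed

# Narrowband voice (8 kHz) brought up to the music rate (16 kHz here), then
# mixed with the music turned down so the voice stays intelligible.
voice_16k = resample_linear([0.0, 1.0, 0.0, -1.0], 8000, 16000)
music_16k = [0.2] * 8
print(mix([voice_16k, music_16k], [1.0, 0.5]))
```

The clamp at the end stands in for the headroom management a real mixer needs when several loud sources are summed.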
8. Class D and EMI Reduction
Discussions of audio seem to eventually get around to the concept of class warfare, meaning the comparison of the various classes of audio amplifiers. So here we go. Class D is a compelling technology for mobile because its efficiency versus Class AB is far better, but the downsides of Class D are EMI and that the sound is not considered to be as good as Class AB. The invisible hand of the market weighed the trade-offs and seems to have decided that Class D for speaker amps is acceptable, at least at the current moment. For headphones, Classes AB, G, and H remain popular choices, though Class D sometimes is used as well.
Figure 8. Class AB versus Class D efficiency example
Class D’s tendency to emit unwanted EMI as a result of its inherent PWM switching architecture requires different methods to suppress EMI. One suppression method often employed on PWM systems (like Class D) is spread spectrum modulation, where the switching frequency of the output bridge varies around a center switching frequency. With the frequency variation being randomized the EM energy spreads out more widely and the peak radiated energy decreases.
Another method to reduce EMI is edge rate control (ERC). In a Class D product the high frequency energy is located in the edges of the PWM square wave output, so faster output rise or fall times generate more high frequency energy. By making the edges less sharp the amount of high frequency energy will be reduced. Recall that the more perfect a square wave (i.e. the more square) the more (odd) harmonics are present, so reducing the perfection of the square wave eliminates the high frequency harmonic components that cause radiation. This comes, however, at the cost of making the system dissipate more power. Also, by now it should be clear that changing the shape of the square wave signal is literally distorting it, thus increasing THD+N, though this time on purpose. So, there is a balance as to what is acceptable, and once again we see that making the right tradeoffs is part of the “art” behind good audio engineering.
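The effect of slowing the edges can be illustrated numerically. The sketch below compares an ideal 50% duty square wave with one whose edges each take 2% of the switching period (a hypothetical figure) and measures the 25th harmonic of both with a single-bin DFT. It models only the waveform shape, not a real output stage.

```python
import math

def triangle(phase):
    """Unit triangle wave: 0 at phase 0, +1 at 0.25, 0 at 0.5, -1 at 0.75."""
    return 4.0 * abs(((phase - 0.25) % 1.0) - 0.5) - 1.0

def pwm_output(phase, edge_fraction):
    """50% duty output in [-1, 1]; each edge takes edge_fraction of the period."""
    if edge_fraction <= 0.0:
        return 1.0 if triangle(phase) >= 0.0 else -1.0
    return max(-1.0, min(1.0, triangle(phase) / (2.0 * edge_fraction)))

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of exactly one period, via one DFT bin."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

N = 10000
square = [pwm_output(i / N, 0.0) for i in range(N)]
slowed = [pwm_output(i / N, 0.02) for i in range(N)]

for k in (1, 25):
    print(f"harmonic {k:2d}: square {harmonic_amplitude(square, k):.4f}, "
          f"slowed edges {harmonic_amplitude(slowed, k):.4f}")
```

The fundamental is almost unchanged while the high harmonic drops noticeably, which is the trade the paragraph above describes: less radiated high frequency energy in exchange for more time spent in the lossy transition region.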
9. Battery AGC
Another concept used on audio speaker amplifiers is Battery-AGC. Battery-AGC extends battery run time by preventing the user from cranking up the audio volume as the battery approaches depletion. Battery-AGC circuitry monitors the battery voltage and automatically adjusts the gain of the amp according to a selected gain versus voltage curve to limit the draw from the battery when it starts getting low. The curves below show an example of the settings for the Fairchild FAB3102 boosted Class D amp with Battery-AGC.
Figure 9. Battery-AGC curves
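A gain versus battery voltage curve of this kind can be sketched as a simple piecewise-linear function. All numbers below (knee voltage, slope, gain limits) are hypothetical and chosen only to illustrate the idea; they are not the FAB3102 settings.

```python
def battery_agc_gain_db(vbat, full_gain_db=12.0, knee_v=3.6,
                        slope_db_per_v=10.0, min_gain_db=0.0):
    """Amplifier gain allowed at a given battery voltage.

    Above the knee voltage the full gain is available. Below it, the gain is
    reduced linearly as the battery sags, down to a minimum gain floor.
    """
    if vbat >= knee_v:
        return full_gain_db
    reduced = full_gain_db - slope_db_per_v * (knee_v - vbat)
    return max(min_gain_db, reduced)

for v in (4.2, 3.6, 3.3, 3.0, 2.5):
    print(f"Vbat = {v:.1f} V -> gain limit {battery_agc_gain_db(v):.1f} dB")
```

Selecting a different curve amounts to changing the knee and slope, trading loudness at low battery against run time.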
10. Boosted Speaker Amplifiers
One of the important trends in mobile audio is the move to louder audio, which is made possible by higher output speaker amps. Higher output is accomplished by adding a DC-DC boost converter right onto the speaker amp IC. A good example of an internally-boosted Class D amp is the Fairchild FAB3102.
Figure 10. Boosted Class D speaker amp architecture (FAB3102)
Adding the DC/DC boost converter is having a significant impact on audio IC partitioning because boost circuitry does not fit well inside the analog mobile baseband IC or even an audio codec due to the voltage levels. So, the need for louder operation has started a trend towards disintegration of audio amps. This is another example of the integration-disintegration pendulum swinging back and forth over time. The market is demanding louder audio for several reasons including wanting to share music and audio/video with others (without a dock) and to be able to hear in noisy environments. The ability to hear in noisy environments is especially attractive for emerging markets and is becoming a very important part of the success equation for mobile products meant for those important and growing regions.
The output power of speaker amps is currently in the range of 1.7 to 2.5 W and moving higher because mobile platform makers are saying they simply want the output to be as loud as possible. Higher output drive, of course, means more power consumption, which is not a price mobile platform makers really want to pay. So, audio engineers are finding new ways and re-using old tricks to reduce power. One innovative method being applied to Class D now borrows the concepts behind Class G, namely the use of multiple power rails. The Fairchild FAB3102 is one of the Class D audio amps using such a technique. Figure 11 shows the concept of switching in different power rails depending on the amplitude of the audio signal.
Figure 11. Different power supply rails on the Class D FAB3102 are switched in depending on the amplitude of the audio signal to save power.
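The rail-switching idea can be sketched as picking the lowest supply rail that still covers the signal peak. The rail voltages and headroom below are hypothetical examples, not the FAB3102's actual rails, and a real amp would also need hysteresis and look-ahead so the rail is raised before the peak arrives.

```python
def select_rail(peak_v, rails=(1.8, 3.6, 5.0), headroom_v=0.2):
    """Return the lowest rail that covers the signal peak plus some headroom.

    If no rail is high enough, the highest one is returned and the output
    will be limited by that rail.
    """
    for rail in sorted(rails):
        if peak_v + headroom_v <= rail:
            return rail
    return max(rails)

for peak in (0.5, 2.0, 4.0, 4.9):
    print(f"signal peak {peak:.1f} V -> {select_rail(peak):.1f} V rail")
```

Quiet passages run from the low rail, so the boost converter only has to work hard during the loud moments that actually need it.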
11. Audio Post Processing
As noted in the Mic section, the dramatically increased DSP capability of applications processors and the implementation of DSP cores onto audio codecs and voice processing ICs make it possible to add various audio processing functions such as noise cancellation, surround sound/3D spatialization, echo cancellation, enhanced bass response, frequency range extension, and many others. Audio codec and voice processor IC makers are adding powerful DSP cores crafted specifically for audio applications. Such embedded DSPs offer the ability to off-load the applications processor by putting the audio specific functions on the codec or voice processing IC, thus providing a way to separate the mobile audio and video development evolution paths.
It is particularly useful to partition audio and video functions according to the most efficient process node. Smaller process geometries typically apply to video functions first while audio ICs can use prior generations very cost effectively. Audio residing outside the applications processor allows smart phone and tablet designers more flexibility in adding differentiating audio features without affecting the applications processor. Making changes outside the applications processor is particularly important since modifications to the applications processor, and even to its software, can require wireless carrier and communications authority re-certification testing of the entire chip set, which is much more time consuming, complicated, and expensive than re-certifying a phone or tablet. Conversely, when an audio codec is modified in a mobile platform, only the phone or tablet and not the entire chip set would need to be re-certified. So, off-loading audio can speed time to market dramatically and allow for faster audio feature evolution.
12. Audio Sub-systems
One of the species of audio amplifiers is the so-called audio sub-system, which combines headphone amplifiers with speaker amplifiers on a single chip to reduce size and cost. The Fairchild FAB2200 is an example of such a product.
Figure 12. Audio sub-system ICs merge speaker and headphone amps on the same die
The FAB2200 uses a Class G headphone amp together with a Class D speaker amp and other features. Such an all-in-one speaker and headphone amp partitioning provides cell phone designers with an additional option to choose from.
13. New Directions
New areas of mobile audio advancement to watch for may be flat membrane speakers and the merging of haptics (vibrational feedback) and speaker technologies. Haptic actuators are driven by amplifiers that are similar to audio amps in most ways. The merging of speaker and haptics is theoretically possible since vibration actuators on the touch screen can pull double duty by turning the screen itself or the back of the phone case into a speaker. These concepts sound a bit farfetched, but patents have been issued regarding the concept of using the handset’s case as a speaker. Finally, it would be remiss to talk about future mobile audio developments without mentioning streaming. Mobile platform and consumer A/V product makers are already adding audio (and video) streaming into their products (such as DLNA). Furthermore, there is a lot of buzz now about streaming from the cloud and using that concept to keep mobile platforms synced, which is certain to be attractive. Industry researchers are saying that mobile music streaming will increase from around 6 million subscribers in 2011 to over 160 million subscribers by 2016, for an amazing compound annual growth rate of 95% (with the Asia-Pacific region becoming the largest market during 2012). With such rapid growth in audio streaming, mobile phone and tablet hardware products will need to be equipped with audio ICs that can bring forth the any time/anywhere mobile audio experience.
July 9, 2012