Choosing Audio Codecs

In this article, we look at the main characteristics to consider when choosing codecs for communication systems.

Codec stands for coder-decoder. A codec encodes a signal into a digital data stream and decodes a digital stream back into a signal.

In the case of VoIP, when choosing a codec, we are mainly interested in two characteristics:

  • Amount of bandwidth used
  • Quality

Adequate bandwidth is necessary for a high-quality conversation. Even with jitter buffers and packet loss concealment implemented in the endpoint, if the connection cannot carry the traffic sent over it, packet loss will be heavy enough to make a conversation impossible.


Bandwidth is important whenever communications are routed over the Internet, as we will often face bandwidth limitations when deploying a Unified Communications solution. On a LAN, however, the bandwidth used by audio calls is small compared to other forms of traffic.

Nowadays, bandwidth is becoming less and less of a constraint, even over the Internet, due to increasingly faster connections. Regardless, when an Internet connection is shared by many VoIP endpoints (meaning that many calls can take place at the same time), we need to make sure that bandwidth is sufficient, and that VoIP traffic is prioritized over all other traffic.
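As a rough illustration of this sizing exercise, the sketch below estimates how many simultaneous calls fit on a shared link. The function name, the 2 Mbit/s uplink, the 87.2 Kbit/s per-call figure (G.711 with 20 ms packets over Ethernet) and the 75% share reserved for voice are all assumptions chosen for the example, not recommendations.

```python
def max_concurrent_calls(link_bps, per_call_bps, voice_share=0.75):
    """Estimate how many calls fit in the part of a link reserved for voice.

    link_bps     -- link capacity in one direction, in bits per second
    per_call_bps -- bandwidth of one call in one direction, headers included
    voice_share  -- fraction of the link set aside for voice (assumed value)
    """
    if per_call_bps <= 0:
        raise ValueError("per_call_bps must be positive")
    return int(link_bps * voice_share // per_call_bps)


# Hypothetical 2 Mbit/s uplink carrying calls of 87.2 Kbit/s each.
print(max_concurrent_calls(2_000_000, 87_200))
```

The estimate covers one direction only, so on an asymmetric connection it should be run against the slower direction, which is usually the uplink.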

We can compare the most popular VoIP codecs:

How do you read this table?

Bit Rate is the number of bits per second that the codec must transmit to carry a phone call, before any packet overhead is added.

The Sample Size is the amount of data the codec produces in each sample interval.

Mean Opinion Score (or MOS) indicates the quality that a person will perceive when using the codec during a phone call. The score is obtained by playing back an audio sample and asking real listeners to rate its quality from 1 (bad) to 5 (excellent). The average of these ratings becomes the Mean Opinion Score.
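The averaging step can be sketched in a few lines. This is a minimal example that assumes the five-point listening scale (1 = bad, 5 = excellent); the function name and the sample ratings are made up for illustration and do not describe any real codec.

```python
from statistics import mean


def mean_opinion_score(ratings, lowest=1, highest=5):
    """Average listener ratings, rejecting values outside the scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not lowest <= rating <= highest:
            raise ValueError(f"rating {rating} is outside {lowest}-{highest}")
    return mean(ratings)


# Hypothetical ratings from five listeners (illustrative values only).
sample_ratings = [4, 4, 5, 3, 4]
print(f"MOS = {mean_opinion_score(sample_ratings):.2f}")
```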

The Codec Sample Interval is the interval of time, in milliseconds, at which the codec operates. Most systems send 20 ms of audio in each packet.

To calculate the final bandwidth, we need to take into consideration the IP/UDP/RTP headers and the Layer 2 (L2) overhead, using the following equation:

Total packet size = (L2 header: MP or FRF.12 or Ethernet) + (IP/UDP/RTP header) + (voice payload size)

Bandwidth per second can then be calculated by multiplying the packet size by the number of packets sent per second.

Bandwidth (one direction) = Total packet size * Packets per second (with 20 ms of audio in each packet, we have 50 packets per second). If the packet size is expressed in bytes, multiply the result by 8 to obtain bits per second.
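The two formulas above can be worked through in code. The sketch below assumes 40 bytes of uncompressed IPv4/UDP/RTP headers (20 + 8 + 12) and an 18-byte Ethernet L2 header including the frame check sequence; both are assumptions that change with the link type and with header compression. The function and constant names are illustrative.

```python
IP_UDP_RTP_HEADER_BYTES = 40  # assumed: 20 (IPv4) + 8 (UDP) + 12 (RTP)
ETHERNET_L2_BYTES = 18        # assumed: Ethernet header plus frame check sequence


def call_bandwidth_bps(codec_bit_rate_bps, packet_interval_ms=20,
                       l2_header_bytes=ETHERNET_L2_BYTES):
    """Return the one-direction bandwidth of a call in bits per second."""
    # Voice payload: codec bits produced during one packet interval, in bytes.
    payload_bytes = codec_bit_rate_bps * packet_interval_ms / 1000 / 8
    total_packet_bytes = l2_header_bytes + IP_UDP_RTP_HEADER_BYTES + payload_bytes
    packets_per_second = 1000 / packet_interval_ms
    return total_packet_bytes * 8 * packets_per_second


# Nominal codec bit rates: G.711 = 64 Kbit/s, G.729 = 8 Kbit/s.
print(call_bandwidth_bps(64_000))  # G.711, 20 ms packets, Ethernet
print(call_bandwidth_bps(8_000))   # G.729, 20 ms packets, Ethernet
```

Under these assumptions G.711 needs 87.2 Kbit/s and G.729 needs 31.2 Kbit/s in each direction, both well above the nominal codec bit rates.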

Frame Size and Bandwidth

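Frame size greatly influences the amount of bandwidth used, especially for high-compression codecs. In the case of G.729 with 20 ms packets over Ethernet, the header overhead is about 23.2 Kbit/s (58 bytes of headers on every 20-byte payload, 50 times per second), which is almost three times the nominal 8 Kbit/s bandwidth of the codec.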

One way to mitigate this effect is to increase the Voice Payload Size to between 40 and 60 ms. Sending fewer, larger packets greatly reduces the bandwidth, but at the cost of increased latency and of losing more audio every time a packet is dropped.
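To see how the payload size changes the picture, the following sketch compares G.729 at 20, 40 and 60 ms per packet. It assumes 58 bytes of headers on every packet (18 bytes of Ethernet plus 40 bytes of IPv4/UDP/RTP, no header compression); the names are illustrative.

```python
HEADER_BYTES = 18 + 40      # assumed: Ethernet L2 + IPv4/UDP/RTP headers
G729_BIT_RATE_BPS = 8_000   # nominal G.729 bit rate


def bandwidth_bps(interval_ms, bit_rate_bps=G729_BIT_RATE_BPS):
    """One-direction bandwidth for a given amount of audio per packet."""
    payload_bytes = bit_rate_bps * interval_ms / 8000
    return (HEADER_BYTES + payload_bytes) * 8 * 1000 / interval_ms


for interval_ms in (20, 40, 60):
    total = bandwidth_bps(interval_ms)
    overhead = total - G729_BIT_RATE_BPS
    print(f"{interval_ms} ms: {total / 1000:.1f} Kbit/s total, "
          f"{overhead / 1000:.1f} Kbit/s overhead")
```

Under these assumptions, moving from 20 ms to 40 ms packets lowers the total from 31.2 Kbit/s to 19.6 Kbit/s, while each lost packet now removes twice as much audio from the conversation.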


Transcoding

Transcoding is the act of converting digital media originally encoded with one codec to a different codec. The quality of a call is only as good as the quality of the worse codec used by the two sides of the call. In many cases, it will be even lower than the quality of the worse codec, as the transcoding algorithm can lead to an additional loss in quality.

Therefore, transcoding has the following negative aspects:

  • It decreases the overall quality to the quality of the lower-quality codec or possibly even lower. In a best case scenario, it does not alter the quality, such as when transcoding is performed between similar codecs (for example, G.711a / G.711u).
  • It requires CPU power.
  • It adds latency because of the additional computation time needed.

Clearly, transcoding should be avoided as much as possible.
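One common way to avoid transcoding is to have both endpoints agree on a codec that each of them supports. The sketch below picks the first codec in the caller's preference order that the other side also offers. It is a simplified illustration with hypothetical names and codec lists, not the negotiation logic of any particular PBX or signalling stack.

```python
def first_common_codec(caller_codecs, callee_codecs):
    """Return the caller's most preferred codec that the callee also supports.

    Returns None when the two sides share no codec, which is the case
    where a device in the middle would have to transcode.
    """
    callee_supported = set(callee_codecs)
    for codec in caller_codecs:
        if codec in callee_supported:
            return codec
    return None


# Hypothetical preference lists, most preferred codec first.
caller = ["G.722", "G.711a", "G.729"]
callee = ["G.729", "G.711a"]
print(first_common_codec(caller, callee))
```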

Narrowband and “Toll quality” Codecs

In the image below (courtesy of the Opus website), we see a comparison of different codecs.

Narrowband codecs (like G.729) offer a high compression rate (with 20 ms packets, roughly one third of the bandwidth of G.711a) but at the price of quality, and they cannot be used for fax / modem transmissions.

Codecs like G.711a / G.711u use more bandwidth, but offer the so-called “toll quality” that customers grew accustomed to when using ISDN lines and other digital links for voice transmission (such as T1).

Offering even higher quality are the wideband codecs, such as G.722, which uses the same bandwidth as G.711a.
