Major Stages of Voice Processing in VoIP


For voice transmission over an IP network, the voice waveform must be sampled, quantized, encoded, optionally compressed, and then encapsulated in a VoIP packet.

The first four steps are performed by a digital signal processor (DSP) in the originating gateway and are detailed in the following section. The VoIP packets are then delivered to the destination gateway, where the voice information is retrieved from each packet. Finally, a DSP on the terminating gateway decodes the payload and reconstructs the analog waveform, reversing the process performed on the originating gateway.
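To make the encapsulation step concrete, here is a rough sketch of how much bandwidth one voice stream needs once the encoded voice is wrapped in RTP, UDP, and IP headers. The function name and its parameters are made up for illustration, the header sizes assume IPv4 without options, and layer 2 overhead is left at zero unless you pass it in.

```python
# Header sizes in bytes (IPv4 without options, UDP, and a basic RTP header).
IP_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12


def voip_packet_stats(codec_bps: int, packet_ms: int, layer2_bytes: int = 0):
    """Return (payload_bytes, packets_per_second, bandwidth_bps) for one voice stream."""
    payload_bytes = codec_bps * packet_ms // 8000   # bits/s * ms -> bytes per packet
    packets_per_second = 1000 // packet_ms
    packet_bytes = payload_bytes + IP_HEADER + UDP_HEADER + RTP_HEADER + layer2_bytes
    bandwidth_bps = packet_bytes * 8 * packets_per_second
    return payload_bytes, packets_per_second, bandwidth_bps


# G.711 (64 kbps) carrying 20 ms of voice per packet, layer 2 overhead ignored.
print(voip_packet_stats(64000, 20))   # (160, 50, 80000)
# A compressed 8 kbps codec with the same packetization interval.
print(voip_packet_stats(8000, 20))    # (20, 50, 24000)
```

Note how the 40 bytes of headers dominate when the codec is compressed: the 8 kbps stream still costs 24 kbps on the wire.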

VoIP components

The components are illustrated below:


The components shown are as follows:
Cisco Unified IP Phones: Provide an IP endpoint for voice communication.
Gatekeeper: Provides call admission control (CAC), bandwidth control and management, and address translation.
Gateway: Provides translation between VoIP and non-VoIP networks such as a public switched telephone network (PSTN). Gateways also provide physical access for local analog and digital voice devices such as telephones, fax machines, key sets, and PBXs.
Cisco Unified Border Element (Cisco UBE): Interconnects two VoIP networks. It acts as a proxy between signaling protocols and can be configured to provide proxy services to the media stream.
Multipoint control unit (MCU): Provides real-time connectivity for participants in multiple locations to attend the same videoconference or meeting.
Call agent: Provides call control for Cisco Unified IP Phones, CAC, bandwidth control and management, and address translation.
Application servers: Provide services such as voice-mail, unified messaging, interactive voice response (IVR), presence information, multimedia conferencing, and others.
Videoconference station: Provides access for end-user participation in videoconferencing. The videoconference station contains a video capture device for video input and a microphone for audio input. The user can view video streams and hear the audio that originates at a remote user station.


Sampling takes readings of the waveform amplitude at regular intervals, using a process called pulse-amplitude modulation (PAM). The output is a series of pulses that approximates the analog waveform. For the signal to be reconstructed with acceptable quality, the sampling rate must be at least twice the highest frequency in the signal (the Nyquist rate). Telephony therefore samples the roughly 4-kHz voice band 8000 times per second.
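A small sketch of sampling followed by quantization may help here. It assumes the standard telephony rate of 8000 samples per second and 8 bits per sample, and it uses plain uniform quantization for simplicity. Real G.711 codecs use mu-law or A-law companding instead, so treat the function names and the quantizer as illustrative only.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed telephony sampling rate
BITS_PER_SAMPLE = 8     # assumed sample size


def sample_tone(freq_hz: float, duration_s: float, rate_hz: int = SAMPLE_RATE_HZ):
    """Read the amplitude of a sine tone at regular intervals (values in -1..1)."""
    count = round(duration_s * rate_hz)
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def quantize(samples, bits: int = BITS_PER_SAMPLE):
    """Map each amplitude in -1..1 to one of 2**bits uniformly spaced integer levels."""
    levels = 2 ** bits
    return [min(levels - 1, int((s + 1.0) / 2.0 * levels)) for s in samples]


# One millisecond of a 1 kHz tone gives 8 samples, each stored in 8 bits.
codes = quantize(sample_tone(1000, 0.001))
print(len(codes), SAMPLE_RATE_HZ * BITS_PER_SAMPLE)   # 8 64000
```

The second printed value is the familiar 64 kbps bit rate of an uncompressed voice channel.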



Delay: The maximum one-way delay between any UCM servers for all priority ICCS traffic should not exceed 40 ms, or 80 ms round-trip time (RTT).

Jitter: Jitter is the varying delay that packets incur through the network because of processing, queue, buffer, congestion, or path variation delay.
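One common way to put a number on jitter is the running interarrival estimate defined for RTP in RFC 3550. The sketch below applies that smoothing formula to lists of send and receive timestamps in milliseconds. The function name and the sample timestamps are made up for illustration.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in ms, using the RFC 3550 smoothing formula."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Change in one-way transit time between two consecutive packets.
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (send_times_ms[i] - send_times_ms[i - 1])
        # Move the estimate 1/16 of the way toward the latest deviation.
        jitter += (abs(d) - jitter) / 16
    return jitter


# Packets sent every 20 ms. A constant 50 ms network delay produces no jitter.
print(interarrival_jitter([0, 20, 40, 60], [50, 70, 90, 110]))   # 0.0
# The second packet arrives 5 ms late, so the estimate becomes non-zero.
print(interarrival_jitter([0, 20], [50, 75]))                    # 0.3125
```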

Bandwidth: Provision the correct amount of bandwidth between each server for the expected call volume, type of devices, and number of devices.

QoS: The network infrastructure relies on QoS engineering to provide consistent and predictable end-to-end levels of service for traffic. Neither QoS nor bandwidth alone is a solution. Rather, QoS-enabled bandwidth must be engineered into the network infrastructure.

Gatekeeper: Cisco gatekeepers are used to group gateways into logical zones and perform call routing between them. Gateways are responsible for edge routing decisions between the Public Switched Telephone Network (PSTN) and the H.323 network. Cisco gatekeepers handle the core call routing among devices in the H.323 network and provide centralized dial plan administration. Without a Cisco gatekeeper, explicit IP addresses for each terminating gateway would have to be configured at the originating gateway and matched to a Voice over IP (VoIP) dial-peer. With a Cisco gatekeeper, gateways query the gatekeeper when trying to establish VoIP calls with remote VoIP gateways.

A voice-switching gateway connects various analog and digital voice circuits. This functionality is equivalent to the operation of central office switches and PBXs in traditional telephony.

A VoIP gateway connects the traditional telephony network to the IP network. It converts the signaling and media transmission methods used on one side to the other side.

Cisco Unified Border Element (Cisco UBE) interconnects two IP networks. It terminates the signaling sessions and either passes through or terminates the media channels.


Voice Gateway call legs

A voice call over a packet or traditional telephony network is segmented into discrete call legs. When a gateway receives a call setup, it performs a routing decision and sends the call setup request to the next device. The incoming part of the call is referred to as the incoming call leg and the outgoing part of the call is referred to as the outgoing call leg.

On Cisco IOS routers, the call legs are associated with dial peers. One dial peer corresponds to one call leg. A call leg is a logical connection between two gateways (routers) or between a gateway and a telephony device. If the gateway receives or forwards the call over an analog or digital voice circuit, the corresponding call leg is referred to as POTS. If the gateway receives or forwards the call over an IP interface, the corresponding call leg is referred to as VoIP.

The call legs are relevant for call routing. Before a gateway makes the call-routing decision, it must apply the settings defined in the incoming call leg. In the case of POTS incoming call legs, these parameters define how the gateway collects the dialed digits and optional applications. In the case of VoIP incoming call legs, these parameters describe the voice transmission methods, such as codec, voice activity detection (VAD), and dual-tone multifrequency (DTMF)-related features. These parameters must be successfully negotiated between the local and preceding gateway before the call can be forwarded to the next gateway in the path.

Voice-switching Gateway

A voice-switching gateway, as depicted in the figure, has traditional telephony interfaces. Multiple call-signaling protocols exist, such as SS7, ISDN, Q Signaling (QSIG), and the analog signaling methods, including supervisory signaling (loop-start, ground-start, immediate-start, wink-start, delay-start), address signaling (pulse, DTMF), and informational signaling. The voice-switching gateway receives and forwards the call setup request over analog or digital voice circuits. The gateway might have to convert the call signaling and the voice format when the call traverses the gateway from one port to another. Both the incoming and the outgoing call legs are POTS call legs.

VoIP gateway

The gateway provides translation between VoIP and non-VoIP networks, such as the PSTN. It converts the signaling and voice signal between traditional telephony circuits and the VoIP transmission in an IP network.

One of the call legs is a POTS call leg, while the other is a VoIP call leg.

The VoIP originating gateway has the POTS incoming call leg and the VoIP outgoing call leg, while the VoIP terminating gateway has the VoIP incoming call leg and the POTS outgoing call leg. Both gateways must first successfully negotiate the VoIP parameters associated with their respective outgoing and incoming call legs before the VoIP terminating gateway can forward the call to the destination PSTN network.
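As a toy illustration of that negotiation, the sketch below picks a codec both gateways can use. The selection rule (first codec in the offered list that the other side supports) is an assumption made for this example. Real H.323 and SIP negotiation covers more parameters and is more involved. The function name is hypothetical.

```python
def negotiate_codec(offered, supported):
    """Return the first offered codec that the other gateway also supports, or None."""
    for codec in offered:
        if codec in supported:
            return codec
    return None   # no common codec, so the call cannot be forwarded


# The originating gateway prefers G.729 but the terminating one only supports G.711.
print(negotiate_codec(["g729r8", "g711ulaw"], {"g711ulaw", "g711alaw"}))   # g711ulaw
# No codec in common: negotiation fails.
print(negotiate_codec(["g729r8"], {"g711ulaw"}))                           # None
```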

Cisco Unified Border Element

Cisco Unified Border Element, as illustrated in the figure below, forwards an incoming VoIP call as another, outgoing VoIP call. It receives a call setup request, negotiates parameters, and forwards the call setup request to the next gateway. The incoming signaling protocol might differ from the outgoing signaling protocol. When the call is successfully signaled end to end, Cisco UBE might either proxy the media channel, which is referred to as flow-through, or let the media flow directly between the endpoints without passing through Cisco UBE, which is referred to as flow-around.

The media proxy function is necessary when the VoIP traffic parameters of the incoming call leg differ from the VoIP parameters of the outgoing call leg. When Cisco UBE proxies the media channel, it changes the IP addresses of the media packets. This feature is very useful for security or connectivity reasons. Both call legs of a Cisco UBE are VoIP call legs.
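The three gateway roles described above differ only in the types of their two call legs, which the following sketch summarizes. The enum and function names are invented for illustration.

```python
from enum import Enum


class Leg(Enum):
    POTS = "pots"   # analog or digital voice circuit
    VOIP = "voip"   # IP interface


def gateway_role(incoming: Leg, outgoing: Leg) -> str:
    """Name the gateway role implied by the incoming and outgoing call leg types."""
    if incoming is Leg.POTS and outgoing is Leg.POTS:
        return "voice-switching gateway"
    if incoming is Leg.VOIP and outgoing is Leg.VOIP:
        return "Cisco UBE"
    # One POTS leg and one VoIP leg, in either direction.
    return "VoIP gateway"


print(gateway_role(Leg.POTS, Leg.POTS))   # voice-switching gateway
print(gateway_role(Leg.POTS, Leg.VOIP))   # VoIP gateway
print(gateway_role(Leg.VOIP, Leg.VOIP))   # Cisco UBE
```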



How Voice Gateways Route calls

A comparison of IP packet routing and call-routing is shown below:

| IP routing | Call routing |
| --- | --- |
| Static or dynamic | Static only |
| IP routing table | Dial plan |
| IP route | Dial peer |
| Hop-by-hop routing, where each router makes an independent decision | Inbound and outbound call legs, where the gateway negotiates VoIP parameters with preceding and next gateways before a call is forwarded |
| Destination-based routing | Called number, matched by destination pattern, is one of many selection criteria |
| Longest-match rule | The longest-match rule applies to a dial peer's destination pattern |
| Equal paths | Preference can be applied to equal dial peers, or a random selection is made if all criteria are the same |
| Default route | Possible to have a default route, which often points to a gatekeeper |

Dial peers are essential to implementing dial plans and providing voice services over an IP packet network. Dial peers are used to identify call source and destination endpoints and to define the characteristics that are applied to each call leg in the call connection.

Call legs are router-centric. When an inbound call arrives on a gateway, the gateway finds the inbound dial peer and processes its settings. If the settings are acceptable, the gateway finds the outbound dial peer, establishes the outgoing call leg, and the call is switched from the incoming call leg to the outgoing call leg. You need to configure dial peers to enable call routing on a gateway.

Because dial peers collectively define where to forward calls, all dial peers together build a dial plan, which is equivalent to the IP routing table. The dial peers are static in nature. Hop-by-hop call routing builds on the principle of call legs. Before a call-routing decision is made, the gateway must identify the inbound dial peer and process its parameters. This process might involve VoIP parameter negotiation.
The call-routing decision is the selection of the outbound dial peer. This selection is commonly based on the called number when the destination-pattern command is used. The selection might be based on other information, and those other criteria might have higher precedence than the called number. When the called number is matched to find the outbound dial peer, the longest-match rule applies.

If more than one dial peer equally matches the dial string, all the matching dial peers are used to form a rotary group. The router attempts to establish the outbound call leg using all the dial peers in the rotary group until one is successful. The selection order within the group can be influenced by configuring a preference value.
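The selection logic can be sketched roughly as follows. This is a simplified model, not how Cisco IOS implements it: patterns contain only digits and the "." wildcard, a pattern must be the same length as the called number, and "longest match" is counted as the number of explicitly matched digits. The class and function names are hypothetical. Lower preference values are tried first, and peers with equal preference simply keep their input order here instead of being chosen at random.

```python
from dataclasses import dataclass


@dataclass
class DialPeer:
    tag: int
    destination_pattern: str   # digits, with "." matching any single digit
    preference: int = 0        # lower value is tried first


def explicit_digits(pattern: str, number: str) -> int:
    """Count explicitly matched digits, or return -1 if the pattern does not match."""
    if len(pattern) != len(number):
        return -1
    matched = 0
    for p, d in zip(pattern, number):
        if p == ".":
            continue
        if p != d:
            return -1
        matched += 1
    return matched


def outbound_rotary_group(peers, called_number):
    """Return the best-matching dial peers, ordered by preference."""
    scored = [(explicit_digits(p.destination_pattern, called_number), p) for p in peers]
    scored = [(score, p) for score, p in scored if score >= 0]
    if not scored:
        return []
    best = max(score for score, _ in scored)
    group = [p for score, p in scored if score == best]
    return sorted(group, key=lambda p: p.preference)


peers = [
    DialPeer(1, "555...."),
    DialPeer(2, "5551...", preference=1),
    DialPeer(3, "5551...", preference=0),
    DialPeer(4, "408...."),
]
print([p.tag for p in outbound_rotary_group(peers, "5551234")])   # [3, 2]
```

Dial peers 2 and 3 match four digits explicitly, so they beat dial peer 1 and form the rotary group. Dial peer 3 is tried first because of its lower preference value.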

Dial Peers

| Type of Dial Peer | Network Technology |
| --- | --- |
| Plain old telephone service (POTS) | Maps a dial string to a specific voice port on the local gateway. The voice port connects the gateway to the PSTN, PBX, or analog telephone. |
| VoIP | Points to the IP address or DNS name of the destination VoIP device that terminates the call. This mapping applies to VoIP protocols, such as H.323 and SIP. |
| Multimedia Mail over IP (MMoIP) | The dial peer is mapped to the email address of the SMTP server. This type of dial peer is used for store-and-forward fax (on-ramp and off-ramp faxing). |


POTS dial peer:

In the figure below, an analog telephone is connected to the Cisco Unified Communications gateway. The gateway needs two dial peers. The POTS dial-peer configuration includes at least the telephone number of the analog telephone and the voice port to which it is attached. Based on this information, the gateway forwards calls destined to the defined telephone over the specified port.

To successfully forward calls in both directions, at least these call-routing elements are needed in every voice-processing system:

An appropriate POTS dial peer that specifies to which voice port the telephone is attached. This applies only to the edge voice-processing systems (usually the ones with a phone attached).
An appropriate VoIP dial peer that specifies the recipient destination address, or at least the address of the next hop.

A VoIP dial peer can point to either an H.323 or SIP device.

VoIP dial-peer parameters include coder-decoder (codec), quality of service (QoS), voice activity detection (VAD), dual-tone multifrequency (DTMF) relay, and fax rate.
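To show roughly what the two dial peers look like on a gateway, here is a small helper that renders Cisco IOS dial-peer configuration text. The helper functions are hypothetical. The tags, telephone numbers, voice port, and IP address in the usage example are invented sample values, and only a minimal set of commands is emitted.

```python
def pots_dial_peer(tag: int, pattern: str, port: str) -> str:
    """Render a POTS dial peer that maps a number to a local voice port."""
    return "\n".join([
        f"dial-peer voice {tag} pots",
        f" destination-pattern {pattern}",
        f" port {port}",
    ])


def voip_dial_peer(tag: int, pattern: str, target_ip: str, codec: str = "g711ulaw") -> str:
    """Render a VoIP dial peer that points to the next-hop IP address."""
    return "\n".join([
        f"dial-peer voice {tag} voip",
        f" destination-pattern {pattern}",
        f" session target ipv4:{target_ip}",
        f" codec {codec}",
    ])


# Sample values only: a local analog phone and a remote gateway.
print(pots_dial_peer(1, "7777", "1/0/0"))
print(voip_dial_peer(2, "8888", "10.18.0.1"))
```

The POTS dial peer covers the locally attached telephone, and the VoIP dial peer covers numbers reached through the remote gateway.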


Best practices for Single-site model
  • Know the calling patterns for your enterprise. Use the single-site model if most of the calls from your enterprise are within the same site or to PSTN users outside your enterprise.
  • Use G.711 codecs for all endpoints. This practice eliminates the consumption of DSP resources for transcoding, and those resources can be allocated to other functions, such as conferencing and MTPs.
  • Use SIP, SRST, and MGCP gateways for the PSTN. This practice simplifies dial plan configuration. H.323 might be required to support specific functionality, such as support for SS7 or Nonfacility Associated Signaling (NFAS).


Lab Setup

In GNS3 I have been using a 3725 with c3725-adventerprisek9-mz.124-18. For soft phones I have been using Cisco IP Communicator and this:


Those two are the only softphones I could find that support SCCP. There are a TON that support SIP. I am using VMware Workstation 10.0, but VMware Player works fine too. I am running CUCM 8.6 in VMware.

I’d suggest getting comfortable with loopback adapters in GNS3. This is how you can use the softphones. There are some really good step-by-step CCNA Voice virtual labs on YouTube…like this one: