Network Optimization News
Analysis - Blocking Skype won't be easy
By Art Reisman
The Editors of Extreme VoIP asked Art Reisman, chief technical officer at packet shaper APConnections, what lessons
he learned as he tried -- and failed -- to detect and block traffic from Skype, the world's most popular VOIP application.
This article appeared April 19, 2006.
The recent surge in Internet VOIP carriers such as Skype, Vonage, and Net2phone has fueled a political debate
unforeseen as recently as five years ago. This controversy presents a new plot twist in the ever-unfolding soap opera of
government deregulation and who has rights to the last mile of customer wire.
Traditional CLEC providers make most of their money from residential phone and DSL lines. Now they are seeing
competition from nontraditional carriers running VOIP services on the very DSL lines funded by the traditional CLECs
and cable providers. These third parties pipe phone service down their wires without a penny of revenue to the CLECs
that provided that infrastructure.
If you are a savvy reader who keeps up to date with the trade mags, you are likely aware that this controversy has all the
human voyeuristic interest normally reserved for tabloids. The players don't have names like Pitt and Aniston, but
instead Skype, Qwest, Comcast and Vonage. You likely have seen various editorials and commentaries on two or
more sides beating this subject to death.
For now I am going to leave the debate alone. Let's just focus on the operational strategy: How to deal with specific
traffic on a data line and how this can be applied to the special case of Skype.
As CTO of a company that specializes in bandwidth control and traffic shaping, I am well informed on the subject of
carriers blocking competitors' traffic on their data networks. I am often asked if we can come up with a solution to block
(insert evil music here) "Skype" traffic. Skype and Vonage have become the scourge of ISPs who are
looking to offer phone service for a fee bundled with their data services. The obvious conclusion for the owner of the
data line is to just block these hobos altogether and be done with it.
While blocking most data traffic is easily accomplished, I must confess up front that I have made a few attempts at blocking
Skype, only to retreat to fight another day after being soundly defeated. What follows is a short tutorial on traffic blocking,
made simple for the casual reader of technology. After we cover the general case of traffic blocking we'll cover the
special case of why blocking Skype traffic is a different animal.
Diving right into the mechanics of traffic shaping by application, the first lesson involves how to recognize traffic on a
network. As you are likely aware, all traffic on the Internet travels around in what is called an IP packet. An IP packet can
very simply be thought of as a string of characters moving from Computer A to Computer B. The string of characters is
called the "payload," much like the freight inside a railroad car. On the outside of this payload, or data, is the address
where it is being sent. These two elements, the address and the payload, comprise the complete IP packet.
In the case of different applications on the Internet we would expect to see different kinds of payloads. As an analogy,
consider a skyscraper being transported from New York to Los Angeles. How could this be done by
using a freight train? Common sense suggests that one would disassemble the office tower, stuff it into as many
freight cars as it takes to transport it, and then, when the train arrived in Los Angeles, hope that the workers on the other
end had the instructions for reassembling the tower.
Well, this analogy works with almost anything that is sent across the Internet, only the payload is some form of data, not
a physical hunk of bricks, metal and wires. If we were sending a Word document as an e-mail attachment, guess
what? The contents of the document would be disassembled into a bunch of IP packets and sent to the receiving
e-mail client where it would be reassembled. If I looked at the payload of each Internet packet in transit I could actually
see snippets of the document in each packet (assuming nothing was encrypted) and could quite easily read the words as they went by.
This is the basis of traffic blocking: Look inside Internet packets and see if you can tell what they are. Conceptually,
there is really nothing more to it.
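To make the idea concrete, here is a minimal Python sketch of "looking inside" a packet. It is a toy under stated assumptions, not how any particular shaper works: the Packet class, the readable_snippets helper, and the sample addresses and payload are all hypothetical, and a real packet carries more header fields than two addresses.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Simplified stand-in for an IP packet: addresses plus payload (hypothetical)."""
    src: str
    dst: str
    payload: bytes


def readable_snippets(payload: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_len characters long."""
    snippets = []
    run = bytearray()
    for byte in payload:
        if 32 <= byte <= 126:          # printable ASCII range
            run.append(byte)
        else:
            if len(run) >= min_len:
                snippets.append(run.decode("ascii"))
            run.clear()
    if len(run) >= min_len:            # flush a run that reaches the end
        snippets.append(run.decode("ascii"))
    return snippets


# Made-up payload: readable text mixed with a few binary bytes.
packet = Packet("10.0.0.1", "10.0.0.2", b"\x00\x01Dear Bob,\x00\xffsee attached\x02")
print(readable_snippets(packet.payload))  # ['Dear Bob,', 'see attached']
```

Anything that comes back from a scan like this is a clue to what the packet is carrying. An empty result is a clue too.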
Now moving beyond the simple case of sending a Word file, let's suppose that we are sending a phone call from user
A to user B. How does that work in a traditional sense? Perhaps you have heard of SIP or H.323 as common VOIP
protocols. We need to make a small conceptual hop from the e-mail attachment example to a live phone call moving
across the Internet, but I can assure you this is quite painless. When sending a live stream of voice data using the
Internet you need to stuff pieces of the digitized phone call into a series of IP packets. Special equipment on the front
end of the phone call digitizes the voice data and stuffs it into an IP packet, it is sent, and at the other side it's
reassembled into a comprehensible voice emulation.
It is possible for an appliance to monitor the data going across the lines, categorize it and display it. Digitized voice data
is much different from a Word file in transport because digitized voice when displayed as ASCII characters looks like a
mess of garbled goop. It is conspicuously random, so much so that there is no easily discernible pattern and you can
forget about human readable words.
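One rough way to put a number on "garbled goop" is byte entropy: how evenly the 256 possible byte values are used. The sketch below is an illustration only. The function names and the 7.0 threshold are my own assumptions, and the "noise" sample is a synthetic stand-in, not real voice data.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte values in data, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_random(data: bytes, threshold: float = 7.0) -> bool:
    """True when the payload scores at or above an (arbitrary, assumed) threshold."""
    return byte_entropy(data) >= threshold


text_payload = b"Hey buddy, I am about to send you a call. " * 20
noise_payload = bytes(range(256)) * 4  # synthetic stand-in for compressed audio

print(looks_random(text_payload))    # False
print(byte_entropy(noise_payload))   # 8.0
```

Two caveats. A payload of n bytes can never score above log2(n), so very short packets need a lower threshold. And a high score only says the data looks random; compressed files and encrypted traffic score just as high as voice, which is why the payload alone does not identify a call.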
So how would one tell that the data going over an Internet connection is a voice call?
Before the invention of Skype, things were quite simple. One nice thing about all these standard VOIP solutions from
Avaya, Toshiba, Cisco, and others is that you could see a very predictable human readable information exchange
between two endpoints just prior to the actual phone call. This is what is commonly referred to as "call setup." Before a
voice phone call commenced it was common for the two phone systems to exchange data that mimicked a human conversation:
Computer A: "Hey buddy, I am about to send you a call."
Computer B response: "Not now, I am busy."
These call setup formalities are sent back and forth inside IP packets as very human readable text streams. Although
perhaps it might not be as comprehensible as "Hey Buddy, I am about to call you," it is often clear just by reading the
text what is going on.
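As an illustration, SIP (one of the protocols mentioned earlier) sends its setup messages as plain text. A call request begins with a line such as "INVITE sip:bob@example.com SIP/2.0" and a busy reply begins with "SIP/2.0 486 Busy Here". The sketch below checks only that first line. The function name and labels are hypothetical, the sample messages are abbreviated, and other VOIP protocols (H.323, for one) are not text-based and would need a different test.

```python
from typing import Optional


def classify_setup_message(payload: bytes) -> Optional[str]:
    """Label a payload by its first line if it resembles a SIP setup message."""
    first_line = payload.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = first_line.split(" ")
    # Request line: METHOD, target address, protocol version
    if len(parts) >= 3 and parts[0] == "INVITE" and parts[-1] == "SIP/2.0":
        return "call-request"   # "Hey buddy, I am about to send you a call."
    # Status line: protocol version, status code, reason phrase
    if len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1] == "486":
        return "busy"           # "Not now, I am busy."
    return None


print(classify_setup_message(b"INVITE sip:bob@example.com SIP/2.0\r\n\r\n"))  # call-request
print(classify_setup_message(b"SIP/2.0 486 Busy Here\r\n\r\n"))               # busy
print(classify_setup_message(b"\x8f\x02\xa1\x33\x90"))                        # None
```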
Meanwhile, there are various automated devices engineered by commercial companies that specialize in detecting all
sorts of Internet traffic including voice. Some corporations purchase these devices intent on stopping streaming audio,
or perhaps to give priority to Citrix traffic.
The list of reasons for detecting and giving special treatment to various streams of traffic is
endless and would be an interesting subject in itself. For now, let's get back to detecting voice and the special case of Skype.
If you recall, with voice calls, once the call is up and in progress the data payload looks like garbled goop and is not
specifically identifiable as a call in progress. Thus, it is important to see the setup in action. The setup of the call
between two IP phones is easily detectable. By remembering the IP addresses involved in the setup, you can safely
assume that future traffic between the two IP addresses is a phone call and block traffic between the two.
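The bookkeeping described above can be sketched in a few lines of Python. This is a simplified illustration, not a product design: the CallBlocker class is hypothetical, and the test for a setup message (a payload starting with "INVITE ", as in SIP) is an assumption standing in for whatever setup signature a real device would recognize.

```python
class CallBlocker:
    """Remember address pairs seen in a call setup and drop their later traffic."""

    def __init__(self):
        self.flagged_pairs = set()

    def should_drop(self, src: str, dst: str, payload: bytes) -> bool:
        # frozenset makes the pair direction-independent: A->B equals B->A.
        pair = frozenset((src, dst))
        if payload.startswith(b"INVITE "):   # assumed setup signature
            self.flagged_pairs.add(pair)
        return pair in self.flagged_pairs


blocker = CallBlocker()
print(blocker.should_drop("10.0.0.1", "10.0.0.2", b"GET / HTTP/1.1"))      # False
print(blocker.should_drop("10.0.0.1", "10.0.0.2", b"INVITE sip:b SIP/2.0"))  # True
print(blocker.should_drop("10.0.0.2", "10.0.0.1", b"\x80\x12\x9a"))        # True
```

Note the blunt side effect: once a pair is flagged, everything between those two addresses is dropped, voice or not.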
Scenario 2 Centralized VOIP Source
The previous scenario assumes two IP end points talking to each other. Another version of VOIP phone service uses a
VOIP PBX. In this scenario all phone calls emanate from a common PBX which has a well known IP address, so it is
just a matter of blocking any traffic to or from that IP address of the PBX if you want to stop voice traffic. Watching a
network of this type will yield one common IP address that always seems to be sending common identifiable call setup
messages to other IP addresses. Once you know this, you only need to remember the IP address of one party (the
PBX) and you can take care of future calls.
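Spotting the PBX amounts to counting. The sketch below assumes setup messages have already been recognized and reduced to (source, destination) address pairs. The function name, the min_peers cutoff, and the sample addresses are all hypothetical.

```python
from collections import defaultdict


def find_pbx_candidates(setup_events, min_peers: int = 3):
    """Return addresses that exchanged setup messages with at least min_peers others."""
    peers = defaultdict(set)
    for src, dst in setup_events:
        peers[src].add(dst)
        peers[dst].add(src)
    return sorted(ip for ip, seen in peers.items() if len(seen) >= min_peers)


# Made-up observations: 10.0.0.5 shows up in every setup exchange.
events = [
    ("10.0.0.5", "10.0.0.11"),
    ("10.0.0.5", "10.0.0.12"),
    ("10.0.0.13", "10.0.0.5"),
]
print(find_pbx_candidates(events))  # ['10.0.0.5']
```

Once an address stands out this way, a single rule blocking traffic to and from it covers every future call through that PBX.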
Scenario 3 Centralized Broker
In a third scenario a centralized broker is used to set up phone calls. This would typically involve a form of PBX that
arranges a contract between two VOIP phones to talk directly to one another. The centralized PBX is contacted by one of
the parties wishing to make a call. It then contacts the destination party to arrange the call. During this brokered setup
process one could see the setup communication of the broker within the IP packets. The conversation would go something like this:
Computer A to broker: "Hi, I'd like to call my friend in Miami but all I have is his name. Can you arrange an IP call for me?"
Broker to Computer A: "Yes, just a second, I'll look him up."
Broker to Computer B: "Hey Miami, a phone in Los Angeles would like to make a phone call . . . "
Well, you get the idea. The final phone call would again be a stream of garbled goop, but by listening to the context of
the setup one could determine both IP addresses about to engage in a phone call and block the call plus future traffic
between the two of them.
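For a rough idea of how both addresses could be pulled out of a brokered setup, assume the broker speaks SIP and the setup messages carry an SDP body. In SDP each side announces where it wants to receive audio on a connection line such as "c=IN IP4 192.0.2.10". The helper names below are hypothetical, the sample messages are heavily abbreviated, and only IPv4 connection lines are handled.

```python
import re
from typing import Optional

# SDP connection line, for example "c=IN IP4 192.0.2.10" (IPv4 only in this sketch).
_CONNECTION_LINE = re.compile(rb"^c=IN IP4 (\d{1,3}(?:\.\d{1,3}){3})\s*$", re.MULTILINE)


def announced_media_address(payload: bytes) -> Optional[str]:
    """Return the address a setup message announces for its audio, if any."""
    match = _CONNECTION_LINE.search(payload)
    return match.group(1).decode("ascii") if match else None


def learn_call_pair(offer: bytes, answer: bytes) -> Optional[frozenset]:
    """Combine the addresses announced by both sides into one pair to watch."""
    a = announced_media_address(offer)
    b = announced_media_address(answer)
    if a is None or b is None:
        return None
    return frozenset((a, b))


offer = b"INVITE sip:bob@example.com SIP/2.0\r\n\r\nv=0\r\nc=IN IP4 192.0.2.10\r\nm=audio 49170 RTP/AVP 0\r\n"
answer = b"SIP/2.0 200 OK\r\n\r\nv=0\r\nc=IN IP4 198.51.100.7\r\nm=audio 3456 RTP/AVP 0\r\n"
print(sorted(learn_call_pair(offer, answer)))  # ['192.0.2.10', '198.51.100.7']
```

The resulting pair is exactly what a blocker needs: the garbled stream that follows between those two addresses can be treated as the call.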
So now you know my entire library of knowledge and secrets about detecting VOIP traffic. It is time to move on to what I
don't know about Skype.
Skype calls appear to talk point-to-point when a call is finally set up and active. This activity I can see by setting up
Skype calls in my laboratory. Of course I know beforehand what the two endpoints are, and therefore I can see the
Skype traffic whizzing by on my sniffer. However, when examining the stream I failed to see any human discernible call
set up, so without prior knowledge of a call being made I could never be certain if what I was seeing was a Skype call.
Skype setup appears to take place through a broker of some kind; however, the setup appears to have no intelligible
human-readable pattern. The setup portion of a Skype call appears as just garbled goop.
It appears that Skype uses a distributed topology where calls are set up through a number of ever-changing
brokers. If Skype used a common broker I could learn the IP address of that broker and hence I would know anybody
talking to it is setting up a Skype call. But without a well known common broker, there is no generic way I can look for
contact to a broker.
To date all my common tricks for determining VOIP traffic on the Internet have been thwarted by the Skype designers. I
have no idea if this result was a deliberate attempt to thwart detection or just an unintended side effect of their design.
Perhaps a reader with inside knowledge will step forward and answer this and other questions. For now I have plenty
on my plate, so I'll leave the mystery of Skype detection to my contemporaries.
Art Reisman is chief technical officer of APConnections, known for their line of packet shapers using the NetEqualizer
brand name. You can learn more about his background at NetEqualizer.com, or you can e-mail him at