Wednesday, January 4, 2012

Meaningless telecom statistics are a pain in the asr.

In the 1960s, General Telephone and Electronics Corporation, GTE, employed my father to visit their Central Offices in Georgia to connect measurement equipment to their telephone switches. The purpose was to monitor the system for evidence of problems. They came up with all kinds of measurements in that era; one of the most enduring was Answer-Seizure Ratio, ASR. The ASR is 100*(number of calls answered)/(number of calls attempted).

So if we place 1,750 calls in a day, and 1,650 of them are answered, then the ASR is 100*1650/1750=94.3%. ASR is formally documented in ITU-T document E.411: "International network management - Operational guidance". One of the earliest written references to Answer-Seizure Ratio, from a 1985 CCITT "Red Book", is a sort of apology for the measurement:
The answer/seizure ratio should be also based on historical records or, if available, on measurements taken during the period the route was used. 
Two key points here:

  • An ASR value for a specific route (call path) is meaningful only when compared to other values from the same call path
  • An ASR is meaningful only if it expresses a period of normal traffic
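As a rough sketch of the arithmetic, the ratio and a same-route comparison could be computed like this. The function names and the idea of comparing against a simple historical mean are illustrative assumptions, not anything taken from E.411.

```python
def asr_percent(answered: int, attempted: int) -> float:
    """Answer-Seizure Ratio as a percentage: answered calls over attempted calls."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return 100.0 * answered / attempted


def asr_drop_vs_baseline(current: float, history: list[float]) -> float:
    """Percentage points by which the current ASR sits below the route's own history.

    Assumption: the baseline is a plain mean of earlier ASR values for the
    SAME route, taken during periods of normal traffic.
    """
    baseline = sum(history) / len(history)
    return baseline - current


today = asr_percent(answered=1650, attempted=1750)
print(round(today, 1))  # 94.3
```

The comparison only makes sense against history from the same call path; a number from one route says nothing about another.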

 Let's look at what has happened since 1985:

There are lots of devices automatically answering calls. Recipients have good ways to avoid answering calls. And lots of people are calling that we don't want to talk to. 

For example: suppose a political campaign starts calling a lot of your subscribers.  Those subscribers really don't want to hear from the "COMMITTEE TO REELEC" or "TOLL-FREE CALL" as caller-ID will tell them. So the calls go unanswered, or go to voicemail. If you're measuring ASR for those users (based on the actual calls answered by the endpoints), then the ASR may plummet during the calling campaign.

A low ASR doesn't necessarily mean the network broke. The problem could just be that some dufus is making a lot of calls.

Using ASR as a measurement of network behavior is not an interesting way to assess problems. Instead of trying to classify "success" (as in "an answered call"), look specifically for the problems in the signaling protocol.
  • Errors. How many calls failed with some sort of error? Here you have to be intelligent. For example, an INVITE that "fails" with a 401 is no failure at all. On the other hand, an INVITE that failed with 606 is probably a real failure.
  • Timeouts. How many calls failed due to some sort of timeout? This would indicate that the called device failed to reply to some sort of signaling message.
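One way to sketch this kind of counting: bucket each INVITE by its final SIP response code rather than by whether somebody answered. The set of "benign" codes below is an assumption for illustration (401 is from the text above; 407, the proxy-authentication challenge, is added by analogy), and a real deployment would probably extend it, for example for calls that simply reached a busy user.

```python
from collections import Counter
from typing import Optional

# Assumption: these final responses are normal signaling, not failures.
BENIGN_CODES = {401, 407}


def classify_invite(final_code: Optional[int]) -> str:
    """Bucket one INVITE transaction by its final SIP response code.

    final_code is None when no final response ever arrived.
    """
    if final_code is None or final_code == 408:  # 408 = Request Timeout
        return "timeout"
    if final_code in BENIGN_CODES:
        return "benign"
    if 200 <= final_code < 300:
        return "ok"
    if final_code >= 400:
        return "error"
    return "other"  # e.g. 3xx redirects


def summarize(final_codes):
    """Count how many INVITEs fall into each bucket."""
    return Counter(classify_invite(code) for code in final_codes)


print(summarize([200, 401, 606, None, 408]))
```

Watching the "error" and "timeout" counts over time says more about the network than a drop in answered calls does.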

Friday, December 9, 2011

GrandStream GXP-2100: A step in the right direction

I once frustrated the Grandstream marketing folks at a tradeshow. The year was around 2007, and the show was SuperComm. (I think.) My exposure to GrandStream was the BudgeTone and the HandyTone. We had tested both devices, and made them work, but neither of them was something we could recommend.

I asked the booth-staffers whether they had any business phones. They pointed at something with novelty-oversized pushbuttons and grumped, "Business phones? Of course! We've always had business phones. Look right here."

Time has passed, and they've since released the GXP-2100 phone. It looks a lot more like a traditional business phone, and it has the nice, standard feel you expect from one.

This is a phone worth considering for your deployment. But I'm concerned that Cost Engineering is still running the show at Grandstream's handset-design department:

  • The stand seems especially lightweight, while most other manufacturers have a very sturdy stand.
  • The paper speed-dials seem like a blast from the past. I know Aastra also sells paper speed dials, but this doesn't make it a good idea. The problem is that you can't use those buttons for Busy Lamp Field / Line State Monitoring.
  • My sample model, provided for free by GrandStream, had a rattle in the handset.
Of course, what matters most to me is the phone's reliability, its configuration flexibility, and robust support for troubleshooting. But I don't have a project that allows me to spend time studying that. If you'd like more info, let me know.

Thursday, November 17, 2011

What's Driving SBC Growth?




Seven years ago I sat down in a cramped office outside Boston. I was visiting Acme Packet, and the question on everyone's mind was this: How long will this "SBC" thing hold out?

You see, in 2004, everybody I talked to about Acme Packet thought it was a temporary fix to a technical oversight. Maybe the SIP people would modify the standard to accommodate NAT and Firewalls better. Or perhaps the firewall and NAT vendors would improve their behavior and start supporting VoIP natively.

And nobody liked the cost the SBC added to the network. In the past seven years, multiple major VoIP software vendors have pleaded with me not to bring up SBCs with my network design clients. "Isn't there a way to design these networks without so much cost?"

But that hasn't happened. I've probably dealt with 100 carrier networks since then, and every one of them was either using an SBC, or making elaborate contrivances to avoid doing so.

But now the question is different. We have at least three significant SBC vendors to think about -- Acme Packet, Metaswitch, and Sonus -- plus Cisco too. And the questions aren't "when will this go away?" so much as "how big will this market become?"

So, what is driving this growth? What has convinced Metaswitch to enter the market at this stage in the game, after such a clear leader has emerged?

PRIs. Specifically: they're turning them off. And SIP trunking is vacuuming up the calls that used to flow through them. How many PBX ports were in use in 2001? That's probably a reasonable estimate for the upper limit of the SIP trunking market over the short term.

Nosiness. Or, more precisely, "Busy Lamp Field," "Line State Monitoring," "Shared Call Appearance," and other popular Hosted VoIP features. These features require many more SIP packets than basic call control, and they consume lots of SBC CPU time as they traverse the network. Carriers are buying SBCs in many cases simply to provide capacity for new features.

SS7 Peering that isn't there any more. Yes, there are ISUP trunks and SS7 linksets out there, but they're becoming the domain of specialized carriers, not every telephone company. Many carriers who have SS7-capable equipment aren't even using it at all; instead, they do SIP peering with other carriers. And the SBC often handles these calls twice: once on the access leg, and again on the carrier leg. So the SBC vendors may be paid twice for each "call."

I frankly wish the SBC wasn't a fact of life. But due to several interesting technical, historical, and human reasons, it's here to stay. I've listed a few, and I'd be glad to discuss others with any reader who is interested.






Monday, August 29, 2011

When will my SBC Be Out of Gas? Predicting Signaling Element Maximum Capacity



When will my Session Border Controller be "full"? The same question applies to any other signaling element in a VoIP network: how much more work can I put into this device before it's overloaded?

In this discussion, we'll focus on the Acme Packet OS-C Session Border Controller, such as the NN4250 or NN3820. The typical constraints people bump into are CPU usage and session licenses.

Predicting Processor Overload



For CPU utilization, the key question is to predict when your CPU will be at the peak. Above that peak, you'll start dropping calls or having other problems. So what's the right peak?

Acme Packet has stated that when the CPU is somewhere between 80% and 90%, the SD will postpone certain maintenance tasks -- such as session state replication. If the CPU sustains a high load for long enough, then the two SDs in a pair can lose synchronization. So we try to ensure that the peak CPU is no greater than 80%.

Simple Arithmetic For Predicting



So the question for capacity planning is: how much of my CURRENT type of workload can the SD support, and not exceed 80% CPU? Here's one rough prediction method that we've had success with:


maxAllowableCpu <- 0.8

currentInvitesPS <- (Server INVITE requests Recent) / (Recent Period duration in seconds)

currentSipCpu <- ("show process cpu all" tSipd Avg value) * 0.01

predictedInvitesPS <- maxAllowableCpu/currentSipCpu * currentInvitesPS



For example: we're allowing the CPU to grow to 80% utilization. Suppose that you're observing 1260 INVITEs over the past 90 seconds. That would mean current INVITEs per second is 14. Suppose also that your tSipd CPU load is 50%. That would make your predictedInvitesPS = (0.8 / 0.5) * 14 = 22.4, or about 22 INVITEs per second.

I'm assuming that the CPU is dominated by SIP traffic. You'd need a more complex model if you're doing a lot of other CPU work.
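The same arithmetic can be written as a short function. This is only a sketch of the rough method above: the function and parameter names are made up for illustration, and the inputs are values you would read off the SD yourself (recent server INVITE count, recent period length, tSipd average CPU percentage).

```python
def predicted_invites_per_second(
    recent_invites: int,
    period_seconds: float,
    sip_cpu_percent: float,
    max_allowable_cpu: float = 0.8,
) -> float:
    """Linear extrapolation of INVITE rate up to the allowed CPU ceiling.

    Assumes CPU load is dominated by SIP traffic and scales linearly
    with the INVITE rate.
    """
    if period_seconds <= 0 or sip_cpu_percent <= 0:
        raise ValueError("period and CPU percentage must be positive")
    current_invites_ps = recent_invites / period_seconds
    current_sip_cpu = sip_cpu_percent * 0.01
    return max_allowable_cpu / current_sip_cpu * current_invites_ps


# The worked example: 1260 INVITEs in 90 seconds at 50% tSipd CPU.
print(round(predicted_invites_per_second(1260, 90, 50), 1))  # 22.4
```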

Better Than A Spreadsheet



What's great about this model is that it expects linear growth of all the other activity related to your system. Service Providers vary WILDLY in the amount of NOTIFY traffic they support, for features like Busy Lamp Field (BLF). If your BLF usage grows with your INVITEs per second, then the INVITEs per second can give you a sense for your existing headroom.

Another great feature is that this model accommodates all of your existing SIP Header Manipulation Rules. One SP may have simple NAT_IP HMRs, while another one is doing a total rewrite of large XML SIP messages traversing the SD. So SPs can vary a lot in CPU load per call. This model assumes that your same HMRs will be applied to your future growth just as they are used right now.

Adapting for Customer Count rather than Call Count



You could also model the maximum registered customers, by using the current number of registered users instead of INVITE volume.

A Proven Model



This model has been proven out several times across several customers in the US and abroad. It's been very helpful for predicting growth requirements and getting more SDs on order. And, in some cases, it has helped service providers to optimize their SBC CPU utilization and delay the purchase of another SBC.

Session Counts



Determining when your session licenses or capacity will run out is typically more straightforward, because the resource is less flexible. If an SBC has been licensed for 1000 sessions, then it's easy to determine how many of those sessions are in use right now. You can use the MIB or command-line monitoring. If you need more licenses at peak, then buy more.

Tuesday, June 21, 2011

New Podcast: Samuel Rausch of Comcast: Engineering SIP Timers End-to-End for Optimal Network Performance

At the SIPNOC 2011 Conference held 2011 April 26-27 in Reston, Virginia, Samuel Rausch of Comcast spoke about their project tuning SIP Timers to improve failover performance in their network. They were able to successfully make network faults invisible by tuning the timers appropriately.

I sat down with Samuel during a break in the conference and recorded a short interview on his presentation.

You can listen to the interview by subscribing to the 200OK.info Podcast:


Wednesday, June 15, 2011

Cisco Pricing: Easier to get than ever.

Cisco has made it much easier for engineers to properly account for costs when evaluating alternatives and options. They've published a very extensive list of product part numbers and retail pricing.

The standard way to estimate pricing on larger items (like routers) involves a long and arduous back-and-forth with a Cisco reseller. The reseller does help by configuring the proper combination of parts, but the turnaround time of hours or days can be a real deterrent to using Cisco.

The job of the sales team should be to help evaluate what I need, propose good solutions, and then answer the questions that aren't already answered in online documentation. Amazingly for Cisco, they don't really make it easy to even contact a sales team. I'm often in the role of recommending Cisco (or other manufacturer) equipment to folks, but Cisco doesn't give me any way to have a chat with a competent sales rep.

Cisco, Juniper, Foundry and others should follow Dell's and IBM's model of giving pricing instantly, online, with a smart configuration tool. But this move by Cisco, publishing the MSRP pricing online, is an excellent start. It will help us engineers and designers know much more rapidly what cost-effective alternatives exist.

A Microsoft-Excel-free version of Cisco's spreadsheet is available here.

Saturday, March 26, 2011

Do Telephone Companies Actually Matter Any More?

Note: The following essay was adapted from a talk I gave to staff at Earthlink Business in January 2011.

Do we really need telephone companies, or are they a relic of the past technology, now replaced by data networks?

I had this question upon entering the voice side of the Networking Industry, coming from classical Computer Science and Data Networking. Technology history had shown that centralized systems concentrated too much risk, and could generally not scale well. All the successful systems distributed the knowledge and the power. Consider, for example, DNS, the web and email. In every case of successful Internet technology, the smarts and flexibility and power were moved from a centralized database to a vast set of unaffiliated servers. The future was distributed, with intelligence moving closer to the edge.





When I began my work in telephony, I was mildly shocked to learn that the modern Voice over Internet Protocol (VoIP) phones, each with its own IP address, DNS resolver, and modern processor could not automatically locate the other VoIP phones out there on the Internet. Why would a new era of technology bother to reproduce the oddities of a former generation? By that time, in 2003, Skype was a big, working success. Shouldn't a $400 VoIP phone have the same kinds of capability, but without the dependence on a PC?

To my mind, VoIP only made sense if we went all the way to distributing all the logic, intelligence and features to the endpoint. This was the Internet era, after all!





Instead, "modern" VoIP telephone companies are busily reproducing the architecture of the past. Each service provider has some sort of centralized "switch," and each telephone is provisioned (configured) in that switch as a separate user or line. Then the switches are connected to one another, primarily via 1970's-1980's TDM technology using antiquated protocols.

It's easy to see the historical reasons for this model: the old endpoints were extremely simple by modern standards. And it's easy to see why the telephone companies would like this model: it ensures customers are dependent on them; that helps to ensure ongoing recurring revenue.

But are those historical reasons enough to maintain the "telephone company" model into the current era? If there's a more efficient model coming, then the market will, eventually, kill off the old model, even if it has a lot of inertia. So is there any real reason to keep this centralized model in place, besides the business motivations, and plain inertia?

And why hasn't VoIP brought about the hope heralded since the 90's: free phone calling, and no more telephone bill?

It's been about ten years since contemporary VoIP systems were born. Now we don't even start CLECs -- we start ITSPs -- but some of the fundamental values of telephone companies have been retained. One key value is (1) the user location database. Telcos have a way of tracking down a user, whether he's connected to a different point on the frame, or does a SIP REGISTER from a new IP address. Closely related to this, telcos have (2) simple and well-understood user addressing -- i.e., telephone numbers. People are comfortable with telephone numbers, and they work reliably to connect calls. Another key value is (3) quality, i.e., reliably establishing the voice call, because the telco may own or operate every device along the path of the telephone call.

Those are all technical issues that users experience. But there are other advantages owing to the telco model that users don't normally experience. One of those is (4) call routing. Users aren't aware of which way their call is routed to reach the final destination, yet they still get the user addressing, quality, and location services. Another related function is (5) interconnection, allowing competent network operators to connect calls to other competent network operators. Closely related to routing is (6) privacy: in traditional telephony there's an expectation, backed by a penal code, that calls are not subject to disclosure. Calls are routed between competent service providers who each take responsibility for ensuring the privacy.

If we consider VoIP in particular, service providers have an advantage by (7) accommodating existing network complexity -- especially NAT. Hosted NAT traversal through SBCs has been a key contributor allowing Hosted PBX services to be easy to deploy, reducing the cost and complexity of the device at the NAT boundary between Public Internet and private VoIP network. Further, telephone companies can provide (8) simple deployment of endpoints by centralizing the software and configuration of SIP phones into centralized servers.

There are other convenient services, like accounting. If every device was independent, who would keep the records of the phone calls? But perhaps the question is: would anybody care about records, if every call was free? Who keeps the records of web site visits, or Instant Messenger exchanges? So I don't consider accounting to be a fundamental value, but rather an incidental service useful for the current model. Another is interoperability. Service providers reduce the number of logical interconnection points that must be integrated properly. The alternative is negotiation, where each endpoint tries to cooperate with every other endpoint. Within the VoIP industry, desultory success with practical endpoint negotiation has not established confidence that this will be easy.

So some of us thought VoIP would be a disruptive technology, and would bring the swift end to the traditional telephone company. But it turns out that the telephone company model appears to provide a lot of value. All of these eight features have to be provided by something, even if it's not the telephone company.

If you're in a business, and you're worried about competition coming in and replacing you with innovation, you should go ahead and do that innovation yourself. So what would it take to displace the telephone company? I.e., what would it take to eliminate the organization that is somehow responsible for each phone call? For this purpose, I'm going to allow some service providers, like DNS operators and ISPs, because they don't know or care about each phone call.



1. User Location: Dynamic DNS to the rescue?
The Telco/service provider knows where users are, via static configuration, or SIP registration, or some other trick. Distributed Hash Tables were a popular idea in the 2002 timeframe. Skype was designed around a DHT, but they still operate a for-profit service provider at the center that is involved in individual phone calls. The classic SIP idea is to use DNS, then have a SIP URI Address of Record similar to your email address. For example, sip:lindsey@ecg.co could be the address on my business card. Conceivably, dynamic DNS could be used to update my user-agent location, so that when you call me, DNS can show you how to reach me. (It would also expose to the world the current IP address of one of my SIP devices.) But there are some drawbacks: How would traditional PSTN callers contact you? Clearly, you'd want a forwarding number connected to the PSTN, and that would mean you need a service provider.
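To make the DNS idea a little more concrete, here is a minimal sketch of how a caller could turn a SIP URI into the DNS SRV names it would query, following the `_sip._udp` / `_sip._tcp` / `_sips._tcp` naming used for locating SIP servers. The function name is hypothetical, the parsing is deliberately simplified (no IPv6 literals, no NAPTR step), and no DNS query is actually sent; with dynamic DNS, the records behind these names are what the called party would keep updated.

```python
def sip_srv_names(sip_uri: str) -> list[str]:
    """Return the DNS SRV owner names a caller could query for a SIP URI's domain."""
    scheme, _, rest = sip_uri.partition(":")
    scheme = scheme.lower()
    if scheme not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {sip_uri!r}")
    # Drop URI parameters and headers, then the user part, then any port.
    hostport = rest.split(";")[0].split("?")[0].rpartition("@")[2]
    host = hostport.split(":")[0].lower()
    if not host:
        raise ValueError(f"no host in SIP URI: {sip_uri!r}")
    if scheme == "sips":
        return [f"_sips._tcp.{host}"]
    return [f"_sip._udp.{host}", f"_sip._tcp.{host}"]


print(sip_srv_names("sip:lindsey@ecg.co"))
# ['_sip._udp.ecg.co', '_sip._tcp.ecg.co']
```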

2. Convenient User Addressing: Long live the telephone number?
Telco-type Service Providers can use telephone numbers, but this relies on a centralized coordination authority to prevent collisions and partition the space. Telephone numbers also once gave a sense of location. One method would be a separate block of telephone numbers, e.g., a new country code, connected to the PSTN. Dynamic DNS could be used to map those to your current IP address, to accommodate numeric dialing from the Internet. But you'd still need a gateway from the PSTN onto the Internet. And when you call to the PSTN, you'd want to appear on caller ID as something useful -- maybe "sip:mark@ecg.co" in the caller name field.

3. Quality. Packet Prioritization by all ISPs
Telephone companies can ensure quality by using dedicated transport for the telephone calls. But users don't have a convenient way of getting their own guaranteed capacity for communication with one another. Right now, the Internet does not provide reliable, consistent quality for all voice calls. It's really close, though. One way to ensure quality would be prioritization. We'd each pay for the right to transmit prioritized packets. All the ISPs would agree to prioritize some packets through their networks, e.g., packets with DSCP value EF. ISPs may find it's easier just to allow 10% of each user's capacity to be prioritized, to avoid billing by the packet and to sidestep Net Neutrality concerns that this would be used to give some commercial operators an advantage. Hard Problem.
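For reference, marking packets with the Expedited Forwarding code point is the easy part on the endpoint; the hard part is getting every ISP on the path to honor it. A minimal sketch, assuming an IPv4 UDP socket on a platform where the `IP_TOS` socket option is available and honored (the helper names are hypothetical):

```python
import socket

DSCP_EF = 46  # Expedited Forwarding code point


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value into the upper six bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def open_marked_udp_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Create a UDP socket whose outgoing packets request the given DSCP marking.

    Whether any network along the path acts on the marking is outside
    the sender's control.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(hex(dscp_to_tos(DSCP_EF)))  # 0xb8
```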

4. Call Routing
Service Providers have to find a path to get your call to the called party. But if parties can discover one another by resolving to a SIP Contact via DNS, and all service providers can prioritize packets, then this shouldn't be required per se. Any ISP should be able to do this. But if we have one service provider that does not properly prioritize the packets, it would become important to circumvent their network. As of today, practically no service provider prioritizes traffic. But perhaps the cooperating ISPs could also enable explicit tunnels, as is done with IPv6, to build certain routes where known packet prioritization is available. Hard Problem.

5. Interconnection.
Interconnection for traditional telcos allows them to pass traffic to other telcos. How would a user get calls to and from the PSTN? Overall, the only way to communicate with the PSTN is to be part of the PSTN, and that means being a telco. You could, of course, buy service from a telco, but only for those calls that involve the PSTN. You could do this with a dual-mode SIP endpoint device: it can SIP register with a traditional telco/service provider, and it can participate in the telco-free environment of the Internet as well. Hard Problem.

6. Privacy.
Telcos provide privacy by using arcane technology with expensive monitoring devices. They're also protected by laws in many countries, such as the wiretapping rules in the Communications Act of 1934 in the United States. Telco-free users could provide their own privacy through end-to-end encryption. The standards for SIP over TLS and SRTP are well evolved and proven.

7. Accommodating Existing IP Network Complexity, especially NAT.
NAT and firewalls run counter to everything that's good about the Internet. They're necessary evils, like pesticides and nuclear weapons. Telcos can deal with NAT and with firewalls by operating an SBC, like the Acme Packet SBC. But if you're trying to receive calls from the Internet, and you're behind NAT and a firewall, and there's no server on the Internet to which you can connect because you have no telco, you still need to receive your inbound calls. UPnP tries to solve this problem. IPv6 could solve the NAT part of this problem, but there are signs that the IPv6 ISPs are too infatuated with NAT to give it up for IPv6. To solve the firewall piece of this, devices need to get better at defending themselves, so that firewalls become unnecessary. Hard Problem.

8. Simple Deployment of Endpoints.
Telcos of the VoIP persuasion operate servers, called "Provisioning Servers" or "Profile Servers" or "CPE Management" servers, that store the VoIP phone software and configuration files. This can make it easy to deploy minimally-configured endpoints. But if we assume there is no telco, how will a phone get its software and configuration? The iPhone is a good example here: where does your iPhone get its software -- from the original device vendor, or from the telco? Of course, Apple provides software to the iPhone that a user can install without help from the telco. And vendors of other VoIP devices could deliver their software in the same way: via the Internet, directly to the device. So I think this problem can be easily solved.


So we have four Hard Problems. And we have to rely on ISPs to do a lot more for us to ensure quality. And if we do want PSTN access, we're going to need to keep a service provider.

With this level of difficulty ahead, I would say the future is bright for the Telephone Company.

Wednesday, September 22, 2010

Four Predictions about IPv6 for VoIP Carriers

ORLANDO, Metaswitch Forum. VoIP service providers are thinking more about IPv6 every day, judging by the requests I receive. Here at the Metaswitch Forum, the UK-based company is dedicating an entire session to IPv6 deployments. They'll be speaking to the 200+ carriers, large and small, assembled at this year's Metaswitch event.

The questions we hear are, "How will it affect me? How should I be planning now?" Every network wants to know how to respond to the depletion of IPv4 address space, and the "imminent" roll-out of IPv6 space. In this article, I'm going to reach back into history to make some predictions about IPv6's effects on VoIP carriers.

Prediction #1. IPv6 will be deployed first on mobile devices

For example, some carriers are starting to assign IPv6 IP addresses to cell phones and mobile cards. Then they'll use 6-4 Carrier-Grade NAT (CGN) to connect their IPv6 customers to the IPv4 Internet. Mobile devices are a good fit for this because they rarely run services. For example, your PC may run a VNC screen-sharing service, or share a printer, or share a hard drive. That requires other computers to connect into it. If your PC is behind a NAT device then fewer devices can connect to it. There's a security upside to that, but it also limits some services.

Handheld devices almost never run services like this, so their users will be minimally affected by the absence of an IPv4 address assignment.

If this prediction comes to pass, however, the innovation promised by IPv6 will be stymied. This move keeps the server operators in complete control over traffic going from the Internet to the phone. Why NOT have connections to the VoIP SIP stack on your cell phone? Why NOT have inbound file transfers from your friends?


Prediction #2. IPv6 will hasten the focus on the Session Border Controller as key demarcation.

The original dream of many VoIP and Internet designers has been to simplify direct communication. We know that intermediate systems and servers create bottlenecks. Cell Phones and PCs should connect you to friends and colleagues with minimal need for intermediate systems, they opine.

But within Telecom, those "intermediate systems" are called "telephone companies". Connecting you to your friends is called "switching"; it allows you to do things like dial telephone numbers to locate someone.

Session Border Controller (SBC) Vendors already encourage IPv4 VoIP Cores using Private addresses. This arcane detail ensures the SBC is the true gatekeeper between the VoIP core and the public Internet. And Acme Packet already supports IPv6 to IPv4 interworking.

We should expect other SBC vendors to enhance their support. This will enable Service Providers to claim IPv6 readiness -- while leaving their core networks untouched.

In an all IPv4 world, service providers can conceivably interconnect individual devices. For example, Carrier A's application server can communicate with Carrier B's media server. But with an IPv6 world connecting IPv4 networks, this will be much more difficult. So the SBC will be even more critical as a demarcation between carriers.

But is this a bad thing? In general, fewer interactions may be better. (See Christopher Alexander's extensive discussion on this.) Already, smart carriers are using the SBC as the sole interconnection between carriers and customers.

Prediction #3. IPv4 VoIP Core "walled gardens" will persist.

Many carriers are installing new IPv4-only VoIP core systems this very day. Many of them prefer to upgrade only when they must, so some of them will never install IPv6 capable equipment to carry voice calls.

We see a similar situation in the Banking industry. Systems of high importance -- like those that track your bank account -- usually run on decades-old technology using decades-old networking. However, these systems are incredibly important, and the old technology has evolved to be amazingly robust. American banks have a "regulated" side, and folks connect from their modern Windows PCs into the venerable old "Core Processing" system to make transfers.

We'll see the same in telecommunications; in fact, we already do. In many cases, there's no technical reason to use SS7, but regulations and reliability concerns send traffic from VoIP system to VoIP system -- via SS7 ISUP and TDM equipment. SIP over IPv4 will be another SS7 ISUP.

Prediction #4. We will never run out of IPv4 address space, unless governments intervene.

Eventually, the global IPv4 registrars like APNIC and ARIN may not have any more giant IP address blocks to assign. Does that mean it's the end of the IPv4 world? No.

From personal experience with many carriers and large enterprises, there is ample IPv4 space unused. Unless governments intervene, these companies with unused IP space will reasonably monetize their IPv4 space by leasing it to other companies.

Friday, July 9, 2010

Metaswitch pitches DC-SBC; should Acme Packet be afraid?

Metaswitch has been talking a lot lately about VoIP Session Border Controllers (SBCs). They recently took their DC-SBC product to the SIPit interop testing event in June. And they just published another white paper about SBCs and their role in IMS.

The Enfield company formerly called Data Connection Limited definitely makes solid software. Rumor has it some of these gents were hired to make some of the first VoIP implementations for Cisco and for Microsoft. And if you look closely when doing VoIP troubleshooting, you'll still see Alcatel-Lucent gear advertising their SIP stack built years ago. So they certainly know VoIP protocols, and I have reason to expect great things from their SBC implementation.

Nevertheless, rolling out a Session Border Controller product at this stage in the market is tough. Acme Packet has a dominant lead in market share; but more importantly, they dominate in field-tested features. The Acme Packet Configuration Guide reads like a military history of battle plans of the form:

so your endpoint devices came at you with <insert insane endpoint behavior here>? Well, add this magic incantation to sip-options and you're off and running!


But Metaswitch isn't trying to go up against Acme Packet. In fact, in early 2009, they signed a deal to re-sell Acme Packet gear.

So what is Metaswitch's angle on Session Border Controllers?

As the recent white paper "Session Border Controllers in IMS" discusses, they see three key markets for their SBC software:

  • High End Tier-1 Mongo Service Provider: Integrate the SBC into the other VoIP equipment, like the access media gateway.
  • Small/Medium Mom-and-Pop Telco: Integrate the SBC into the Service Provider Edge Router.
  • Enterprises: Integrate the SBC into the Access router.


SBC as part of Core VoIP Equipment



The brilliant minds behind IMS believe the SBC features will be smeared across your core feature server, media gateways, and other such devices. You'll have bits of SBC functionality here and there. This logic follows the classic zero-one-infinity principle of Computer Science:

0: The IETF told us we didn't need SBCs at all.

1: Most VoIP Service Providers wanted SBC at one point in the network.

∞: But we IMS guys love SBCs so much, we'll put it everywhere!


The pros:
  • You get to do things like interworking, ALG, codec stripping, and policy enforcement wherever you dream it to be.
  • You'll get to share the power supplies and rack space of an existing chassis.

    The cons:
  • You get to troubleshoot things like interworking, ALG, codec stripping, and policy enforcement everywhere! Novice engineers and technicians, given the power to configure these complex features at numerous points in the network, will inevitably create monsters where manipulations in behavior could occur anywhere. While the goal of IMS was to create a straightforward framework that network designers could build to match, the flexibility of the IMS standards will birth mind-blowing complexity.
  • Since when are things like media gateways put at the edge of the network? They're always somewhere inside the core of the Walled Garden managed network, protected from attack or overload by the SBC.
  • Sprinkling SBC functionality throughout ignores one of the key SBC functions in actual working networks: high-performance security policy enforcement.

    Several vendors are trying to do this in the media gateway. Did you know Convergent is still in business? They're attempting to sell a SIP version of their media gateway, complete with SBC functionality built right in. Genband is trying this too with their SBC product, integrating it into their core media gateway.

    SBC in the Service Provider Edge Router



    It's clear from all the marketing that one of Metaswitch's big pushes is to get their DC-SIP software integrated into Service Provider Edge Routers. This is the router to which service providers connect their DSL, T1, DS3 and other customers.

    Metaswitch has some neat ideas here. For example, they discuss the ability to assign a virtual SBC instance to individual interfaces on the router. That sounds a lot like the way the Cisco Firewall Services Module (FWSM) lets you assign firewall instances to individual VLANs.


    But more specifically, this sounds like Metaswitch wants to provide exactly the kind of functionality found in the Cisco Unified Border Element, an SBC available in some of Cisco's routers. (In fact, it's possible Cisco is already using DC-SBC for this purpose, but I don't have any direct evidence of that.)

    The pros:
  • You'll have the SBC functionality close to the customers.
  • You'll get to share the power supplies and rack space of an existing chassis.
  • This puts the SBC in the neighborhood of the VoIP-Edge firewall.

    The cons:
  • It's possible the SBC functionality might be a little too close to some customers; you might have some complex routing to get all of your traffic through the SBC blades/line-cards. E.g., suppose you have fifty customer-edge routers but only one core VoIP network. Are you really going to buy 50x SBC cards/licenses, and connect every customer-edge router to the VoIP core network?
  • VoIP Carrier networks are already quite complex. It's a challenge for carriers, currently, to manage the basic network; throw in the additional virtualization complexity of VLANs and VRFs, and many smaller Carriers just can't control their network. Further virtualization and combination will only make the network harder to manage. Metaswitch needs to think hard about ways to make the network comprehensible, and virtualization/integration of devices that would otherwise stand alone militates against it.

    SBC in the Customer-Premise Edge Router




    Metaswitch would love to get their DC-SBC into the customer premise equipment. Who wouldn't? This is the device that gets sold millions of times. The Edgewater EdgeMarc is one leading Customer-premise SBC-type gadget.

    The pros:
  • It is helpful to have a device that's aware of the phone calls at the customer edge for troubleshooting and monitoring.
  • The EdgeMarc is very popular among network designers, so you'd expect a competing product to do well too.

    The cons:
  • Unfortunately, the EdgeMarc is very popular among network designers of overweight networks. They design in so much complexity that the costs per customer can be significantly higher than more streamlined designs. Sometimes the complexity of the CPE ALG interworks with other complexity in the network in ways that make the network fragile.

    Many network designers want an ALG at the customer premise, so there's good reason to expect Metaswitch could be quite successful here.

    MetaSwitch's Gentle Argument against Standalone SBCs



    Metaswitch has gently argued against the standalone SBC, e.g., the Acme Packet SBC products.

    They seem to argue:
  • The IMS standards call for SBC functionality smeared everywhere, so a standalone device isn't the best fit.
  • Having the SBC as a separate device "adds another device into the network, increasing the network's complexity and latency, and introducing another point of failure".

    However, as a network integrator and operator for the past seven years, I find these arguments weak:
  • It's hard to find a working, functional, profitable network with more than two bits of IMS resemblance. When IMS suggests that SBC features should be available at many points in the network, they're daydreaming in the "wouldn't it be nice if we could" genre.
  • If you install the SBC as a separate line-card into an existing router or gateway, you've added all the same complexity that you'd have by installing a new device in the rack. You might be using an existing power supply and communications buses, but the complexity is equivalent.
  • If you DO integrate the SBC into the same software platform with another network element, you've genuinely increased the complexity of that software element while reducing the physical complexity.

    The SBC Integration that Makes Sense




    In many networks, the SBC sits side-by-side with a data firewall. It's not parallel to the Media Gateway, and it's not parallel to the Provider Edge router. So as a designer of functional networks, I would be most interested in an SBC / data-firewall integration.

    The combination of two security devices that sit at the border of the same security domain would be quite practical. You could use the same two-plane architecture commonly employed in high-end SBCs (e.g., Acme 4250) and high-end firewalls (e.g., Cisco FWSM):


    • A conventional processor to handle signaling. Scale it up by splitting endpoints across different processors.
    • Network Processors to handle packets after the flows have been approved. Scale it up by splitting flows across different NPs.
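
    As a rough illustration of that split (my sketch, not how any particular SBC or firewall does it), here is one way to pin each endpoint to a signaling processor and each approved flow to a network processor by hashing a stable key. The function name, the key formats, and the processor counts are all made up for the example.

```python
import zlib


def pick_processor(key, processor_count):
    """Map a stable key to one of processor_count processors (hypothetical scheme)."""
    # crc32 is deterministic across runs, unlike Python's built-in hash() on strings.
    return zlib.crc32(key.encode("utf-8")) % processor_count


SIGNALING_CPUS = 4      # assumed number of conventional processors
NETWORK_PROCESSORS = 8  # assumed number of NPs

endpoint = "sip:alice@example.com"                   # signaling is split per endpoint
flow = "198.51.100.7:16384->203.0.113.20:20000/udp"  # media is split per flow

print("signaling CPU:", pick_processor(endpoint, SIGNALING_CPUS))
print("network processor:", pick_processor(flow, NETWORK_PROCESSORS))
```

    The point of hashing a stable key is that every message for one endpoint, and every packet for one flow, lands on the same processor, so no state has to be shared between them.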



    Should Acme Packet fear?



    The key reasons I recommend Acme Packet for standalone SBC deployments are:

    1. Stability: The platform works all day, every day for months at a time.
    2. Features: I'm rarely the first to need a feature, so the oddball and innovative features are there by the time I need them. Plus, there are features that are only needed as a carrier scales up.
    3. Deployment Scale: Thousands of these things are out in use, which means they're being tested heavily by those thousands of deployments.



    The DC-SBC might do great for Stability; let's assume it will. And because DC-SBC is a young platform, it's not clear just how widely it is used, and actually knowing that will be tough. For example, it's important to note that not every Cisco ISR with IP-IP-Gateway functionality is actually using that functionality; likewise when the DC-SBC is integrated into another device, it's not always clear when or if that will be used.

    But I don't know that DC-SBC has the thousands of oddball features needed for interworking in modern, diverse carrier environments. When I talk to large service providers using Acme Packet SBC, they're always employing tons of custom processing of the signaling by rewriting the SIP in many ways. Further, Service Providers large and small use many of the bizarre and obscure features of the SBC to get a reliable network.

    So if you, dear customer, are considering the DC-SBC, be sure it really has the features you actually need. And the only way to know that is to actually integrate and prove the network under load. And then you have to anticipate what features you'll need as the network scales up.

    I'm not an Acme Packet shareholder, and I genuinely hope for some honest competition among SBCs. SBCs are mind-blowingly complex -- almost as complex as the telephony application itself. For a new SBC to enter the market seriously, some service providers are going to have to have the guts to deploy and test the things, and the vendors are going to have to work awfully hard to meet the existing expectations for SBCs.

    Wednesday, June 2, 2010

    SIP Registration Attacks are Here -- Defend Yourself

    A few cases of SIP dictionary attacks using the "friendly-scanner" have been reported recently. These appear to be active attempts to steal service.

    We responded today to an attack on a nationwide Service Provider. They reported up to 69 REGISTERs per second originating from an IP address in Anhui province, China. 69 REGISTERs per second is roughly the equivalent load of 5,000 users.

    Unfortunately for the victims, the "friendly scanner", SIPVicious, runs very hot and fast, apparently blasting out lots of requests without even waiting for earlier attempts to fail. The SIPVicious tool is focused on cracking SIP PBXs, and will be only slightly less effective on Carrier VoIP systems.

    The main reports of problems due to SIP Registration scanning are server overloads. But if the registration scanner users are smart, they'll slow down their rates so they don't alarm the parties being probed.

    How do you defend against SIP Registration storms?


    1. For registering endpoints like SIP phones and IADs, always use SIP authentication, and use quality passwords.

    2. If you have a competent Session Border Controller like the Acme Packet OS-C system, you can blacklist devices after they fail a few REGISTER attempts.

    3. If you're using non-registering SIP (such as SIP peerings for SIP Trunking), you should have a small number of SIP signaling IP addresses. Use firewall rules or ACLs to block all SIP except for what comes from that small list.

    4. Use heavy-hitter detectors to spot SIP devices that are sending more-than-normal traffic loads, and alarm your staff.
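
    Defenses 2 and 4 both come down to counting per source address. Here is a minimal sketch of that bookkeeping, assuming you can feed it one event per REGISTER with the source IP, a timestamp, and whether authentication succeeded. The class name, thresholds, and return values are invented for illustration; this is not how the Acme Packet (or any other) SBC implements it.

```python
from collections import defaultdict, deque


class RegisterGuard:
    """Tracks REGISTER activity per source IP (illustrative only)."""

    def __init__(self, max_failures=5, max_rate=10, window_seconds=1.0):
        self.max_failures = max_failures      # failed auths before blacklisting
        self.max_rate = max_rate              # REGISTERs tolerated per window
        self.window_seconds = window_seconds
        self.failures = defaultdict(int)      # consecutive failures per IP
        self.recent = defaultdict(deque)      # recent REGISTER timestamps per IP
        self.blacklist = set()

    def observe(self, source_ip, timestamp, auth_ok):
        """Record one REGISTER; return 'blocked', 'alarm', or 'ok'."""
        if source_ip in self.blacklist:
            return "blocked"
        times = self.recent[source_ip]
        times.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while times and timestamp - times[0] >= self.window_seconds:
            times.popleft()
        if auth_ok:
            self.failures[source_ip] = 0
        else:
            self.failures[source_ip] += 1
            if self.failures[source_ip] >= self.max_failures:
                self.blacklist.add(source_ip)
                return "blocked"
        if len(times) > self.max_rate:
            return "alarm"  # heavy hitter: tell the staff
        return "ok"


# Replay one second of a 69-per-second scan where every attempt fails.
guard = RegisterGuard(max_failures=3)
scan = [guard.observe("203.0.113.9", i / 69, auth_ok=False) for i in range(69)]
print(scan[:4])
```

    A slow scanner that stays under the rate threshold still trips the failure counter, which is why you want both checks rather than either one alone.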

    Saturday, April 24, 2010

    Surprise! Your new Hosted PBX Features just ruined your business model.

    Advanced IP PBX Features Can Radically Change the Network Engineering, Support, and Costs for Hosted IP PBX Providers


    Many Hosted PBX providers based on VoIP are surprised by the network load caused by the SIP Features "Shared Call Appearance" (SCA), "Shared Line Appearance", "Simultaneous Ring", "Busy Lamp Field" (BLF), or "Line State Monitoring". Both of the big Application Server players, Metaswitch and BroadSoft, offer these features. And carriers are starting to deploy them in spades. They're often deployed together, so I'll call them collectively BLF/SCA/Simring.

    All these features are fundamental departures in the underlying signaling model. I'll explain why, and what a VoIP Service Provider needs to do.

    1,000% Increase in Signaling

    Normal modern call control requires only a few SIP signaling messages to set up or end a telephone call: INVITE, 100, 180, PRACK, 200, 200, ACK, BYE, 200. You'll get more messages when calls are put on hold, or switch to fax mode, and Metaswitch does session-audits with a re-INVITE every 30 seconds. But 10 or 20 messages are typical for a phone call.

    Enter BLF/SCA. Now every phone in the group can get 6 or more NOTIFY-200 SIP messages for every call placed by other people in the group. Plus they get a call setup attempt for every single call for every person in their group.

    The SIP signaling load per user grows enormously. A user who has 20 calls per day might only need 200 signaling messages for call control, but a user in a 5-person BLF/SCA group could have 1,100 messages per day. Raise that to a ten-person group, and now it's 2,180 messages per day.



    The signaling load grows superlinearly -- in this case, it grows as n² with the number of users in the group!
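
    One way to see the quadratic growth is to multiply the per-user figures above by the number of people in the group. The short script below does only that arithmetic; the per-user numbers are the ones quoted above, and treating the plain 200-message user as a "group of one" is my own shorthand.

```python
# Per-user daily SIP message counts quoted above (group size -> messages per user).
PER_USER_MESSAGES = {1: 200, 5: 1_100, 10: 2_180}


def group_total(group_size):
    """Total daily messages for one group: members x per-user messages."""
    return group_size * PER_USER_MESSAGES[group_size]


for size in sorted(PER_USER_MESSAGES):
    print(f"group of {size:2d}: {PER_USER_MESSAGES[size]:5d} per user, "
          f"{group_total(size):6d} for the group")

ratio = group_total(10) / group_total(5)
print(f"Doubling the group from 5 to 10 multiplies the total by {ratio:.2f}")
```

    Doubling the group size roughly quadruples the group's total signaling (a factor of about 3.96 with these figures), which is what n² growth looks like.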

    Surprise! Your system is full earlier than expected.


    The real danger is not the messages per day -- it's the messages per second at your peak. If your customers are clustered into one geographic area, that peak probably happens around 10:00am or 3:00pm local time. But for typical users, peak load will correspond to the daily workload.

    And the problem is typically not in the routers and switches; signaling load is just more IP traffic. Solving problems in the transport network would be easy. But signaling load affects the application plane -- i.e., the devices that process the SIP. Solving application-layer problems is much more complex, because (obviously) the application has to track the state and progress of all of the user-oriented business logic.

    So what is most likely to get overloaded during the peak?


    1. The Session Border Controller.
      Even some folks at the SBC vendor have been caught off-guard in a few cases by this. In one case, a sales engineer told me that he had planned for a single SBC installation, but needed 3x SBC systems to handle all the BLF/SCA traffic. "The planning tool we had was all wrong," he said.

    2. The Application Server.
      When you're using BLF/SCA, the AS or Call Feature Server (CFS) may have to process 10x as many SIP dialogs, and has to handle 10x as many SIP messages. "The most expensive thing an Application Server does is process SIP packets," a BroadSoft Systems Engineer once told me.


    BLF/SCA and Simultaneous Ring are certainly very useful -- but they come with a price.

    Cool features -- Big cost differences.


    For example: suppose an SBC costs $67,500 for a non-redundant system and three years of support. Without BLF/SCA/Simultaneous Ring, you could expect to support 30,000 users on the system; i.e., that's $0.75/user/year CapEx for the SBC. But with BLF/SCA/Simultaneous Ring, your efficiency could drop significantly -- reasonably down to 5,000 users -- i.e., $4.50/user/year CapEx. That's a 6x difference in cost per user -- but still a great price for all the features an SBC provides.
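
    If you want to plug in your own quotes, the arithmetic is just cost divided by users divided by years. Here it is with the example figures above; the function name is mine, and a real model would add redundancy, licenses, and so on.

```python
def capex_per_user_year(system_cost, users, years):
    """Spread a one-time system cost over its users and its support term."""
    return system_cost / users / years


SBC_COST = 67_500    # non-redundant system plus support, from the example above
SUPPORT_YEARS = 3

plain = capex_per_user_year(SBC_COST, 30_000, SUPPORT_YEARS)
featured = capex_per_user_year(SBC_COST, 5_000, SUPPORT_YEARS)

print(f"without BLF/SCA/Simring: ${plain:.2f}/user/year")
print(f"with BLF/SCA/Simring:    ${featured:.2f}/user/year")
print(f"difference:              {featured / plain:.0f}x")
```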

    The revenues from BLF/SCA/Simultaneous Ring are also great, but probably not proportional to the signaling load. Can you really charge 6x or 10x the price for a BLF/SCA/Simultaneous Ring customer? Of course not; but remember that the signaling load is not the only cost of the system. Technical talent, network transport, Customer Premise Equipment, sales and marketing etc., are all significant expenses that are largely unaffected by these features.

    Not all SIP Phones are Created Equal


    You've got to consider the Customer Premise Equipment selection. BLF/SCA are advanced new features, and do not enjoy the robust, mature, time-and-customer-tested support of ordinary call control. While Polycom has been supporting BLF/SCA for years and in wide deployment, many other phones do not have that history. There's nothing magical about Polycom -- they just have a head start on software maturity and reliability. And they have the experience with SIP-over-TCP necessary to make BLF work well.

    Finally, test scale before you deploy. Testing a three-member BLF/SCA group in the lab is not adequate preparation to sell a ten-member BLF/SCA group. Because of the non-linear growth in signaling, and the requirement to use SIP over TCP for reliable BLF/Line State updates, these features do not scale up in clean intuitive ways.

    Further, you've got to test the transport. Very low levels of packet loss can seriously affect SIP over UDP deployments of Busy Lamp Field. Your Gigabit-Ethernet VoIP lab probably isn't the best place to evaluate the reliability of your service, if that service is deployed using T1s to customers.

    Before you deploy a big Simultaneous Ring / BLF / SCA group, you need to test and prove the reliability of that big group. You'll be sorry if you don't, and your customer will let you know just how sorry.

    Proceed -- with Caution.


    Busy Lamp Field, Shared Call Appearance, Shared Line Appearance, and Simultaneous Ring are cool features, and well worth selling. But they constitute a genuine disruption to the ordinary Hosted IP PBX model that has been built with BroadWorks and Metaswitch for years. Network engineering, costs, and support all change in serious ways.

    Wednesday, March 17, 2010

    US Government Technology: "Phone Lines are not like Computers"

    In a March 16 article on The Hill, "Limbaugh prompts healthcare calls, ties up House phone lines", the US House of Representatives' Chief Administrative Officer's spokesman, Jeff Ventura, is quoted:


    Our phone system is nearing capacity . . . Unlike computers, which can be scaled to accommodate something like this in real time, phone lines are hard-wired, so you have your capacity and once the capacity is full, you’re going to get the good old-fashioned busy signal.




    The funny part is that the House's telephone system most certainly IS a computer (i.e., a digital PBX or a small telephone switch, like a DMS10). And unlike most computers, PBXs and telephone switches are built for scaling up. They're designed for growth!

    Further, most computers cannot be scaled up without a lot of design put into the scaling.

    Ventura's attitude reflects a few common misconceptions:

    • Telephone lines aren't computer systems.
    • Telephone lines aren't as flexible as general computers and networks.
    • Computers can be easily added on demand, for any application.


    Telephone Lines ARE Computer Systems

    Every phone system available for at least ten years is a digital computer with some specialized hardware to connect to the PSTN. Many of them are based on ordinary Windows-PC-type components, like an Intel processor and a hard drive. Many of the new VoIP-based systems are just software running on a PC-based server.

    Telephone lines ARE as flexible as general computers and networks

    Old-school ISDN, TDM, or loop-start analog telephone lines are different than typical computing. In regular computing and networking, there are no guarantees that data will get through. But in telephony, the network is built to guarantee that after a call has started, it will continue.

    In general, you have the same sort of design flexibility with telephony that you do with general computing, but you DO have to retain this imperative; i.e., you can't drop or degrade calls after they start. VoIP systems do this through careful planning.

    Computers cannot be easily added on demand, for any arbitrary application

    If you want to scale up a computer system to accommodate more capacity, the system has to be carefully designed to accommodate that scaling. You can't just add computers and connect them to the same network.

    Obviously each new computer has to have the same software running on it. It needs access to the same data. And, if it's a real application that changes that data, it has to keep the data in sync with all the other computers in the cluster. It's no trivial task.

    Fortunately, many designers build this kind of scaling into their applications. But it's certainly not as easy as just adding more servers. Ventura is probably accustomed to computer systems that have this scalability designed into every application.

    Thursday, February 4, 2010

    Metaswitch CEO Kevin DeNuccio focuses on new development, purchases, other countries


    As widely reported today in press releases,


    Metaswitch Networks today signaled its ambition to become a major telecommunications vendor, with the appointment of successful industry executive Kevin DeNuccio as chief executive officer. At the same time, John Lazar has been promoted to chairman.



    On the conference call discussing the move, DeNuccio focused on some key points.

    1. Product Line Expansion. Metaswitch wants to expand their current product line, primarily using their existing engineering base. Metaswitch has strong software-based telecom and network-protocol support, but you don't see them doing a lot of hardware design with FPGAs or CAMs with their current development staff. Their DSPs, for example, are products from Texas Instruments. Products like the Acme Packet SBC or Cisco Routers have sophisticated fast path hardware.

    So I think we can look for more software-based system-control products, and probably not as many line-rate products like routers, switches. Could Metaswitch make a competent firewall? Absolutely. Could they make a great Call-Quality Enhancer? Yep. Could they make an SBC? They'd actually have a lot of work.

    Could they create some real competition in the Application Server / Registrar world? Certainly, especially since the demise of Sylantro and Tekelec's feature servers, and the inability to launch Sonus's feature server in a significant way. Right now, Metaswitch's product is tightly integrated to their TDM gateway. Some customers (ITSPs) don't actually want an SS7-capable gateway, and don't need T1 voice interfaces.

    2. Purchase of new companies. DeNuccio mentioned purchasing other companies several times. You haven't seen many publicly-announced mergers out of Metaswitch in the past; they've built their own tools, but they sometimes do integrate with others' products. I wonder if we'll hear a merger announcement soon.


    3. Global expansion. Metaswitch is very strong with switches in North America. Just based on the technology, I'd expect the fastest growth in the Caribbean (who share most of the US's telecom technology), and Japan (ditto). I suspect, but don't know, that the British Commonwealth nations might be good targets next, because engineers tend to know about the technologies in use nearby, and many of Metaswitch's developers are in London. I suspect many commonwealth nations use the UK telecom standards and C7 variant.

    Tuesday, December 8, 2009

    If iPhone development was complete, the Apple Store wouldn't have Avaya phones

    When you go to the Apple store, do you see an IBM cash register? No, you see store employees using iPod Touches to do everything.

    When you go to the Apple store, do you see a PC at the back for use by the manager? No, all the business machines are Macs.

    When you go to the Apple store, do you see PBX business phones? Yes -- you see a store employee talking on an Avaya PBX phone. He's telling a customer about the iPhone.

    So then: Why isn't Apple eating its own dogfood? Why are they selling a phone, but not using it themselves?


    • The PBX business phones are coupled to the location, but iPhones are coupled to individuals. Answer 1: Use a SIP client on the iPhone to connect to the phone system. Employees would carry their own iPhone that can only REGISTER to the local SIP system when it's in the store. Answer 2: Use an AT&T Femtocell, and add some call routing to the Femtocell. When an employee's phone is within the store (and within range of the Femtocell), that phone receives business phone calls.

    • The iPhone doesn't have the call control features a store would want; e.g., to put a call back on hold, or to transfer calls to another desk in the store. Answer 1: Provide a custom call control interface for managing calls after they're received. This would require that calls route through an Application Server (such as BroadWorks or Metaswitch) that has some external call-control interface. Answer 2: Use Centrex features from the AT&T telephone switch.


    Sunday, November 1, 2009

    And then there were two: BroadSoft notices Metaswitch

    At the BroadSoft Connections 2009 "Executive User's Forum" in
    Scottsdale, AZ, I got my first positive confirmation that BroadSoft
    has actually noticed Metaswitch.

    I've written before about the two companies. BroadSoft makes quality
    software, while MetaSwitch makes both good software and good hardware.
    While the two companies are very different, both BroadSoft and
    Metaswitch can deliver phone calls and calling features reliably.
    Their customers are telephone companies: giants like Verizon or
    CenturyLink (formerly Embarq and CenturyTel), or mom-and-pop's like
    NGTelecom. These telephone companies want to use VoIP to offer
    telephone service.

    BroadSoft has long provided features for businesses, like line-hunting
    and Automated Attendant. Metaswitch's carrier base is primarily
    telephone companies offering residential service, but Metaswitch now
    provides many of the same features. Metaswitch also sells network
    gateways used to connect to the old-school traditional telephone
    equipment.

    On one hand, BroadSoft and Metaswitch are not really competitors.
    BroadSoft couldn't clock a T1 or parse an SS7 ISUP message to save its
    life -- i.e., BroadSoft does not function as the VoIP-to-oldschool
    network gateway. Whereas Metaswitch does SS7, CAS, ISDN Q.931, GR-303,
    and many of Metaswitch's carrier customers depend heavily on these
    technologies. On the same hand, BroadSoft has long had integrated
    Voicemail, automated attendant, call center queueing, incoming and
    outgoing calling policy enforcement, music-on-hold, line hunting, and
    a customer-facing web interface -- all built into the most basic
    BroadWorks system.

    On the other hand, BroadSoft and Metaswitch are closer than ever, and
    not because BroadWorks handles robbed-bit signaling. Metaswitch has
    been gradually enhancing its MetaSphere-brand products. According to
    sources within the company, they've added all of the features popular
    among BroadSoft's carrier customers.

    Several years ago, I heard from some key technical Metaswitch
    employees their regard for BroadSoft. But for years, I had never heard
    BroadSoft folks mention Metaswitch. But that changed this week, when
    one of BroadSoft's long-time technical folks started talking smack
    about the "jerks over at MetaSwitch."

    There's a hierarchical, asymmetrical college rivalry between NC State
    University (NCSU), the University of North Carolina at Chapel Hill
    (UNC-CH), and Duke University. While I was at UNC-CH for grad school,
    I heard much about Duke. But once I married into an NCSU family, I
    heard a lot about UNC-CH. I'm not sure who the Duke people care about.

    NCSU thinks more of UNC-CH than UNC-CH thinks of NCSU. And UNC-CH
    thinks more of Duke than Duke thinks of UNC-CH. At UNC-CH, the "big
    game" is the Duke-UNC match. At NCSU, the "big game" is the NCSU-UNC
    match. To use an operator from Relational Theory: NCSU < UNC-CH < Duke.

    In 2005, Metaswitch looked up to BroadSoft as a rival, but I never
    heard evidence of the reverse. The rivalry was asymmetrical. But now
    the influence is becoming more balanced: BroadSoft and Metaswitch both
    see each other as rivals.

    Competition is a good thing and can draw out good hard work. I am not
    biased toward either vendor: if the equipment and software can work,
    I'll make it work. With the consumption of Sylantro and the General
    Bandwidth (GenBand) M6, BroadSoft has removed nearly(*) all the other
    serious competitors. Now there are only two mature VoIP platforms --
    MetaSwitch and BroadSoft.

    In the old days, nearly every telephone company of any size had either
    a Nortel switch or a Lucent switch. (**) And at this point, there seem
    to be two major core-telecom equipment vendors: Metaswitch and
    BroadSoft. Will they be the Nortel and Lucent of the next era of
    telecom?


    ----------------
    (*) I would say "all," but I've heard Sonus is out there. I've heard
    rumors of people using their application server platform, but I've
    never had the pleasure to see it in person.

    (**) OK, a few had Siemens switches, but Nortel and Lucent are
    definitely kings.

    Monday, October 12, 2009

    IT'S THE LATENCY, STUPID. OH, AND PACKET LOSS TOO. The Problems with 3G Mobile Networks.


    All the carriers brag about their 3G Networks. I mostly attend to AT&T 3G and Sprint EV-DO; these are the two carriers I use.

    Email and file transfer performance is usually disappointing. Transfer speeds as measured by the Ookla "Speedtest" application vary wildly, but the round-trip-time (RTT) latency is consistently quite high.

    I recently did a study for a client about the effects of network delay in cross-country file transmission. With fixed networks (such as normal Internet links), cross-country latency is often 40 to 50 ms; much of that time is due to the speed of light to get from New York to Seattle, for example. So the round-trip time (RTT) is 80 ms to 100 ms, typically. The overall goal is to transfer a 4 GB file across the US in a matter of minutes. We demonstrated through real lab tests that nontrivial delays under 100 ms do slow TCP transfers, so that FTP will not get the job done. The effects are exaggerated as the network throughput increases. E.g., at 100 Mbps, a link with 16 ms RTT slows to about 52 Mbps (assuming all the TCP buffers/windows are tuned properly).

    Many national labs, such as LANL and CERN, deal with problems of large data transfers. Their scientific instruments output gigabytes of data within seconds, and they have to transport it across the network or around the world. They have the money to buy 100 Mbps links from the US to Switzerland.

    But many ordinary folks are on the other end of the spectrum, below 1 Mbps, and experiencing large delays. The vaunted 3G networks fall into this category: even with 3 Mbps or better downstream paths, the large delays can also kill performance.

    A reasonable AT&T 3G speed is around 350 Kbps, i.e., 0.35 Mbps. Consider these lab tests with varying levels of delay:


    • 350 Kbps downstream, 45 Kbps upstream, 0 ms RTT delay, 10 kB TCP window: 330 Kbps TCP connection throughput

    • 350 Kbps downstream, 45 Kbps upstream, 200 ms RTT delay, 10 kB TCP window: 230 Kbps TCP connection throughput

    • 350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 10 kB TCP window: 135 Kbps TCP connection throughput

    • 350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 64 kB TCP window: 309 Kbps TCP connection throughput


    The TCP window size is normally configured in the operating system by default; 64 kB (i.e., 65,536 bytes) is a typical value.
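
    A useful sanity check on results like these: a TCP sender can have at most one window of data in flight per round trip, so throughput can't exceed window/RTT, and it can't exceed the link rate either. The sketch below computes that ceiling for the four tests above. It assumes 1 kB = 1,024 bytes and 1 Kbps = 1,000 bits per second, and the function names are mine. It is an upper bound only, not a prediction of what the lab measured.

```python
def window_limited_kbps(window_kb, rtt_ms):
    """Upper bound on throughput when one full window is sent per round trip."""
    if rtt_ms == 0:
        return float("inf")          # no delay: the window is never the limit
    bits_per_window = window_kb * 1024 * 8
    return bits_per_window / (rtt_ms / 1000) / 1000


def ceiling_kbps(link_kbps, window_kb, rtt_ms):
    """The tighter of the link rate and the window/RTT bound."""
    return min(link_kbps, window_limited_kbps(window_kb, rtt_ms))


# (downstream Kbps, window kB, RTT ms, measured Kbps) from the tests above
TESTS = [
    (350, 10, 0, 330),
    (350, 10, 200, 230),
    (350, 10, 400, 135),
    (350, 64, 400, 309),
]

for link, window, rtt, measured in TESTS:
    bound = ceiling_kbps(link, window, rtt)
    print(f"{window:3d} kB window, {rtt:3d} ms RTT: "
          f"ceiling {bound:6.1f} Kbps, measured {measured} Kbps")
```

    With a 10 kB window and 400 ms RTT the ceiling is about 205 Kbps, below the 350 Kbps link rate, so the window rather than the link is the limit in that test. Every measured value sits under its ceiling, as it should.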

    In the AT&T 3G and Sprint EV-DO networks, 400 ms RTT to the first-hop router is quite typical. Consider these examples with higher capacity and 400 ms RTT:


    • 1,350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 10 kB TCP window: 150 Kbps TCP connection throughput

    • 1,350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 64 kB TCP window: 430 Kbps TCP connection throughput

    • 1,350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 64 kB TCP window, 2% downstream packet loss: 220 Kbps TCP connection throughput

    • 1,350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 527 kB TCP window: 600 Kbps TCP connection throughput

    • 1,350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 782 kB TCP window: 650 Kbps TCP connection throughput

    • 1,350 Kbps downstream, 45 Kbps upstream, 400 ms RTT delay, 3500 kB TCP window: 980 Kbps TCP connection throughput


    As you can see, as network delay increases, the TCP window must also increase. But even a tiny amount of packet loss -- 2%, fully reasonable in my tests of these networks -- can slow things down as well.
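
    For the packet-loss case, there is a widely used rule of thumb (usually attributed to Mathis et al.) that steady-state TCP throughput is roughly (MSS/RTT) x C/sqrt(p), where p is the loss rate and C is about sqrt(3/2). I did not use this formula in the lab tests; it is offered here only as a cross-check. The 1,460-byte MSS is an assumption (typical for Ethernet-sized packets), and the function name is hypothetical.

```python
import math


def loss_limited_kbps(mss_bytes, rtt_ms, loss_rate, c=math.sqrt(1.5)):
    """Rule-of-thumb TCP throughput under random loss: (MSS/RTT) * C/sqrt(p)."""
    bits_per_rtt = mss_bytes * 8 / (rtt_ms / 1000)   # one segment per RTT, in bits/s
    return bits_per_rtt * c / math.sqrt(loss_rate) / 1000


# 400 ms RTT and 2% loss, as in the test above; the MSS is assumed.
estimate = loss_limited_kbps(mss_bytes=1460, rtt_ms=400, loss_rate=0.02)
print(f"estimated ceiling with 2% loss: {estimate:.0f} Kbps")
```

    That works out to roughly 250 Kbps, in the same neighborhood as the 220 Kbps measured with 2% loss. Note that the link capacity doesn't appear in the formula at all: at this RTT and loss rate, buying a faster link wouldn't help.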

    Conclusions



    • The TCP Buffer sizes need to be tuned to 3G networks, not just high-performance networks: OS Vendors need to dynamically adjust TCP buffer sizing based on the first-hop round trip time.
    • Users may need to explore non-TCP mechanisms to fully exploit network transport.
    • High latency and even moderate packet loss can significantly reduce the usefulness of 3G wireless networks.


    I performed these lab tests using two Mac OS X 10.5 machines and the Dummynet Delay/bandwidth emulator in the kernel, using 1 MB buffers on each end of the link. iperf was used to generate traffic.

    Wednesday, September 30, 2009

    Metered Broadband Won't Work, or Won't Matter: Stop focusing on the heavy-tail customers

    In a Telephony Online article, we're told that the Verizon (Landline) CTO thinks the end will come for flat-rate broadband. He means that you won't be able to pay a flat rate and get unlimited access to the Internet. Apparently, AT&T and others have come out and said the same thing.



    The problem is that CTOs are focusing on the heavy-tail customers; it works like this:


    • Most of your users are clustered toward the bottom. They don't use very much Internet access.
    • A few of your users are serious; they use twice or three times as much as most of your users.
    • A very few of the users never stop downloading. They use thousands or millions of times as much bandwidth as their friends.



    Heavy-tailed distributions skew familiar values like averages; so if you have users that follow a heavy-tailed distribution as described above, and you look at your "average" customer use, you'll get something much larger than your "typical" (median) customer's use.



    The simplest approach to avoid metering broadband use is to just fire the customers who are using way too much. If you drop the top 5% of users, you'll regain tons of network capacity.
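    To make the skew concrete, here is a small sketch with made-up numbers: 100 hypothetical subscribers, most of them light users, and one who never stops downloading. The usage figures are invented for illustration and do not come from any carrier's data.

```python
from statistics import mean, median

# Hypothetical monthly usage in GB for 100 subscribers (illustrative only).
usage_gb = [5] * 95 + [15] * 4 + [5000]

def share_of_top(usage, fraction):
    """Fraction of total traffic generated by the heaviest `fraction` of users."""
    ranked = sorted(usage, reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)

print("average use:", mean(usage_gb))
print("typical (median) use:", median(usage_gb))
print("share used by top 5%:", round(share_of_top(usage_gb, 0.05), 3))
```

    With these invented figures, the average is more than ten times the median, and the top 5% of users account for over 90% of the traffic. Real distributions will differ, but the shape of the argument is the same.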



    If you want to use versioning, then you can charge one price that covers nearly everybody, then a higher price for people who go over their limits. This is the standard US cell phone billing plan; you get N minutes, and if you use more than N minutes, then you get charged per minute. But people don't want to have the meter running.
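    The versioned plan described above is just a flat price plus a per-unit charge beyond an allowance. Here is a minimal sketch; the allowance, base price, and overage rate are hypothetical values chosen for illustration.

```python
def monthly_bill(used, included, base_price, overage_rate):
    """Flat price up to the included allowance, then a per-unit overage charge."""
    overage = max(0, used - included)
    return base_price + overage * overage_rate

# Hypothetical plan: 250 GB included for $40, then $1 per extra GB.
print(monthly_bill(100, 250, 40, 1))  # light user pays only the flat price
print(monthly_bill(300, 250, 40, 1))  # heavy user pays for 50 GB of overage
```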

    Monday, September 28, 2009

    Google's Next Big Service: Plain-text Transcriptions of Phone Calls, Podcasts, and Youtube Videos


    Google provides some very nice services; Google Books is useful and technically interesting. They started with printed books, and figured out how to de-warp those images to produce nice flat pages.



    Then they used Optical Character Recognition (OCR) to convert the photographs of text into actual computer-manageable text. This brings image data into the domain of structured data, and it makes the text searchable. On top of that, it's effectively 500-to-1 compression of the text data (assuming that a 1 MB photograph of a document compresses to about 2 kB of text).



    Google is poised to make a similar transformation by converting audio to text. Witness:


    • Google purchased "Grand Central" and called it "Google Voice". Each Google Voice user gets voicemail service, and incoming voicemails are automatically transcribed to text for email.
    • Many people complained about the transcription quality, but last month, in August 2009, Google Voice announced improved transcription service. Therefore, Google has, and is improving, the technology to automatically transcribe spoken-word audio.
    • Google Voice functions as a type of telephone service, allowing you to place and receive calls. All of the call audio can route through Google Voice's system. This gives Google access to your call audio when using Google Voice.
    • Google owns Youtube, which has much useful and educational spoken-word material.
    • Google Books has demonstrated that Google is interested in converting from analog-domain data (printed books) to structured textual data.
    • Lots of useful, unique spoken-word information is presented in podcast and video form. A few podcasts are recordings of other written material (e.g., the Economist Magazine's podcast and audio edition), but most is probably never published in written format.
    • Spoken-word audio is not directly searchable by Google. Further, audio recordings are quite large; the number of bytes per word is very large. (The previous sentence in written form requires about 200 bytes of storage, including all the formatting around it. A 64 kbps MP3/Ogg Vorbis audio stream of the same sentence would require around 80,000 bytes of storage. The ratio is 400:1, similar to the ratio of photographed book pages to their extracted text.)
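    The storage arithmetic in the last bullet can be checked in a few lines. The 10-second speaking time is an assumption; it is the duration implied by 80,000 bytes at 64 kbps.

```python
def audio_bytes(bitrate_kbps, seconds):
    """Bytes needed for a constant-bitrate audio stream of the given length."""
    return bitrate_kbps * 1000 // 8 * seconds

text_bytes = 200                      # written sentence, with formatting
speech_bytes = audio_bytes(64, 10)    # assumed 10 seconds to speak it
print(speech_bytes, speech_bytes // text_bytes)
```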



    Therefore, we should fully expect Google to implement new services:


    • Podcasts can be transcribed. This would unlock lots of useful information that is currently inaccessible without a huge commitment of human time.
    • Youtube videos can be transcribed. For ordinary, spoken-word presentation material, the benefits would be huge; the spoken-word material could be readily indexed, searched, and read. Google could use temporal redundancy clues in the video (i.e., whether each frame looks a lot like the previous frame; if there is motion in the video, the frames have little temporal redundancy) to determine whether the video is changing often. If it's just a speaker with some text beside his head, they could even show us what's on the screen as the speaker speaks: at the beginning of each sentence, an individual frame can be extracted and shown along with the text.
    • Google Voice telephone calls can be recorded and transcribed. This would be a great way to record notes on work-related phone calls, and then go back and review notes from old calls. Recordings of phone calls are tedious to review, but transcriptions of those recordings would be fabulously useful.
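    The temporal-redundancy idea from the Youtube bullet can be sketched as a frame-difference test. This is a toy illustration, not a description of anything Google has built: frames are tiny grayscale grids stored as nested lists, and the threshold is an arbitrary assumption.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average per-pixel absolute difference between two equal-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count

def is_static(frames, threshold=2.0):
    """True when every consecutive pair of frames differs by less than threshold."""
    return all(mean_abs_diff(a, b) < threshold for a, b in zip(frames, frames[1:]))

slide = [[200, 200], [10, 10]]
slide_again = [[200, 201], [10, 10]]   # nearly identical: high redundancy
motion = [[10, 200], [200, 10]]        # very different: low redundancy

print(is_static([slide, slide_again]))
print(is_static([slide, motion]))
```

    A video that passes this kind of test for long stretches is probably a talking head or a slide, so a single extracted frame per sentence would represent it well.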

    Friday, September 11, 2009

    IANAL Analysis of Patent 5,912,888 Claims against VoIP Gateway developers (MetaSwitch, etc.)


    Last week, "Network Gateway Solutions LLC" filed a patent infringement case against many (but not all) VoIP gateway vendors. It's court case 1:09-cv-00667-UNA at the moment. (Justia)

    You can download the complete court filings here. (I paid the PACER document access fees for you.)



    US Patent 5,912,888 is in question. (US Patent and Trademark Office version.) The Patent was filed originally by US Robotics. It describes a Network Access Server, such as the US Robotics Total Control system. I worked extensively with the US Robotics TC system at a regional ISP in the 1990s. It was eventually sold to and marketed by 3Com.



    The patent is a continuation of a patent that was filed June 9, 1994. It describes an "all-digital" system, where the Network Access Server has a T1 link to the telephone network. Back in 1994 and 1995, it was much more common for smallish ISPs to have analog modems connected via RS232 port to a network access server.

    What companies are being sued?

    Many (but not all) of the big names in VoIP network gateways.


    • Adtran Inc -- sued for TA900
    • Audiocodes Ltd. -- sued for Mediant 1000 MSBG
    • Audiocodes Inc.
    • Avaya Inc. -- G350 media gateway
    • Cisco Systems Inc. -- AS5350 universal gateway
    • Genband Inc. -- G6
    • Juniper Networks, Inc. -- J2320
    • Alcatel-Lucent -- CellPipe CELL-IAD-8T
    • Alcatel-Lucent USA Inc. -- CELL-IAD-8T
    • Media5 Corporation -- Mediatrix 3000
    • Mediatrix Telecom Inc. -- Mediatrix 3000
    • Metaswitch Inc. -- PB3100 card
    • Mitel Networks Corporation -- MNC SX-200
    • Mitel Networks Inc. -- MNi SX-200
    • Multi-Tech Systems Inc. -- MultiVOIP MVP2410
    • Patton Electronics Co. -- SmartNode 4960
    • Quintum Technologies LLC -- Tenor Gateway DX2008
    • Siemens AG -- HiPath RG2500
    • Siemens Corporation -- HiPath RG2500
    • Sonus Networks Inc. -- Sonus GSX9000
    • Zhone Technologies Inc. -- IMACS-200


    How did Digium, Taqua, Tekelec, and Nortel get left out of the list? Certainly they didn't actually buy a license for this patent.

    So does this patent apply to modern VoIP gateways?

    It's a real stretch to adapt this to VoIP equipment. The actual complaint sues on the basis of Claim 2 in the patent. The standard complaint is,
    Defendant X, within United States, manufactures, uses, offers for sale, or sells network gateway systems including, but not limited to, Y, that fall within the '888 patent. These devices have an all-digital network access system that connects remote computers for generating digital data to a network computer on a local or wide area network via a digital telephone line. The Y includes modems, a telephone control interface, telephone bus, digital signal processing system, parallel bus, and a network gateway controller. At a minimum, X's Y contains each limitation set forth in at least claim 2 of the '888 patent.



    The limitations of Claim 2 require (summarized in my language)


    • At least one modem
    • A telephone control interface for TDM management
    • A telephone bus
    • A digital signal processor (DSP) system for demodulating time-spaced bus signals into incoming binary data without conversion to analog form
    • A parallel bus connected to the DSP
    • A network gateway controller connected to that parallel bus, which formats the data for transmission on the local area network (like Ethernet).


    The key problem for Network Gateway Solutions, LLC is that VoIP gateways have no modem. They have a digital network interface connected to the PSTN, usually in the form of DS1/T1 or DS3/T3 CSU/DSUs, but they don't have a "modem" in any conventional sense. And the lack of a modem is key to the distinction: the '888 patent relates to computers dialing in over modems, through the PSTN, to a network access server (NAS).
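    One rough way to picture the argument: the complaint asserts that each accused product "contains each limitation" of claim 2, so a product that lacks any one limitation falls outside that assertion. Here is a minimal sketch of that check. The short labels for the six limitations and the feature sets for the two kinds of device are made up for illustration; this is a layman's model, not legal analysis.

```python
# Short, informal labels for the six limitations summarized above.
CLAIM_2_LIMITATIONS = {
    "modem",
    "telephone control interface",
    "telephone bus",
    "dsp demodulation",
    "parallel bus",
    "network gateway controller",
}

def missing_limitations(device_features):
    """Return the claim 2 limitations that a device does not have."""
    return CLAIM_2_LIMITATIONS - set(device_features)

# Hypothetical feature sets, following the argument in the text.
dial_in_nas = set(CLAIM_2_LIMITATIONS)            # has every element
voip_gateway = CLAIM_2_LIMITATIONS - {"modem"}    # generously, all but the modem

print(missing_limitations(dial_in_nas))
print(missing_limitations(voip_gateway))
```

    Even when the hypothetical VoIP gateway is credited with every other element, the modem is still missing, and that is the whole point.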

    Claims of the Patent vs Modern VoIP Gateways

    But let's look at all the claims, because they all hang together.

    Patent Claims 1 and 2, "An all-digital network access server connecting remote computers generating digital data to a network computer on a local or wide area network": The inventor claims a system that connects a computer to an analog modem which connects to the digital PSTN, and eventually gets connected to another modem in the Network Access Server. But in a VoIP network, there isn't a computer on the PSTN side: it's a telephone in the kitchen.

    Patent Claim 3 discusses "said incoming binary data", which data was generated by the digital computer connected to the analog modem of Claim 1.

    Patent Claim 4 discusses the "packet of data from said network computer" (at the service provider) "destined for said remote computers" (connected to the analog modems) "via said digital telephone line". Again, there are no computers at the other end of the PSTN. Just phones in kitchens.

    Patent Claim 5 also relates to the digital computer at the other end.

    Patent Claim 6 discusses "an all-digital network access server for receiving and transmitting calls representing digital data", but in VoIP calls, the calls don't represent digital data. They represent calls.

    The key to the Network Access Server is a digital computer at the far end of the connection, using an analog modem to connect to the PSTN. But in modern VoIP there is no analog modem. You might stretch to say that SIP phones are a type of digital computer, but there is no analog modem involved.

    So if none of the claims relate to telephone calls, what could they possibly have on MetaSwitch, AudioCodes, and others?

    So who DOES this patent apply to?

    I believe this patent does not properly apply to VoIP network gateways because there is no modem involved in a VoIP gateway, as described in patent '888.



    I'm not opposed to patents in principle. This all-digital network access system is a useful and innovative invention from the early 1990s. I do think Network Gateway Solutions LLC got one defendant right: the Cisco AS5300, AS5350, AS5400, etc., which can operate with MICA modems, do have modem capabilities. It appears that they are covered under this patent.






    I am not a lawyer, but patents are about technology and precise thinking. Both of those are interesting. The Verizon Patent Win Against Vonage is also interesting to study because it applies quite broadly to VoIP Carriers.

    Wednesday, September 2, 2009

    TMC Blogger, Keating, wowed by new Decades-Old technology




    Tom Keating of the TMC Net "VoIP and Gadgets Blog", please consider researching the facts before posting. You can't trust every company president who booths at the TMC IT Expo.

    In a September 1, 2009 posting, "XCast Labs Cuts VoIP Bandwidth Requirements In Half," Keating oohs and ahs over XCast Labs' use of ordinary SIP technology. The technology innovation is described thus:

    XCast Labs on the other hand uses their patented "direct RTP" which is able to tunnel through both user's firewalls and setup a direct peer-to-peer (P2P) RTP session between the two users. Once the call is setup, the two users are able to send the RTP media directly to each other. Since both callers are in California, they are just a few hops/routers away from each other thus dramatically reducing latency and jitter. XCast Labs simply maintains a small signaling connection to determine when the call ends for billing purposes.


    Keating was "dumbfounded that no one else had thought of this". He should be dumbfounded if no one else had done this. XCast Labs is using the STANDARD way of doing RTP (prior to the hosted VoIP industry). There's nothing new here.


    • Look in RFC 3261, which defines SIPv2: In the basic "SIP Trapezoid", the signaling flows between SIP proxies, but the RTP flows along the shortest, most efficient path between the endpoints.

    • Look at the Acme Packet SBC: it supports Media Release directly. It's simple to configure (mm-in-realm=disabled), and used very commonly.

    • Look at the Ditech PeerPoint C100: it supports direct media between endpoints, including those who are NAT'd with certain types of NAT.

    • Look at the MetaSwitch: You can disable a setting called "restricted media" and get the exact behavior they're describing. So any MetaSwitch customer can do it.

    • Look at BroadWorks: It does direct, shortest-path RTP *by default*. The only way to make BroadWorks control the media is to use advanced call mixing or recording features.
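    Direct media works because each endpoint learns where to send RTP from the SDP body carried in the SIP signaling: the `c=` line carries the connection address and the `m=` line carries the port. Here is a minimal sketch of reading those two fields. The sample SDP is invented and uses documentation-range addresses, and the parser is a simplification that reads only the first `c=` line and the first audio stream.

```python
def media_destination(sdp_text):
    """Return (ip, port) for the first audio stream described in an SDP body."""
    ip = None
    port = None
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("c=") and ip is None:
            # c=<nettype> <addrtype> <connection-address>
            ip = line[2:].split()[2]
        elif line.startswith("m=audio") and port is None:
            # m=<media> <port> <proto> <fmt> ...
            port = int(line.split()[1])
    return ip, port

# Invented SDP answer from the called party (documentation addresses).
ANSWER_SDP = """\
v=0
o=bob 2890844527 2890844527 IN IP4 198.51.100.7
s=-
c=IN IP4 198.51.100.7
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000
"""

print(media_destination(ANSWER_SDP))
```

    When the proxies pass this body through unmodified, the caller sends RTP straight to the address the called party advertised. A media-anchoring SBC instead rewrites the `c=` and `m=` lines to point at itself.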



    And let's not forget Verizon's patent, US Patent 6,104,711. In 2007, Vonage was found to infringe that patent. That patent covers cases where the IP address of the called party is returned to the calling party. This is exactly the requirement for XCast's direct media to occur!

    So this is prior art -- and actually covered under another patent. Verizon has apparently not yet tried to enforce this patent against all the other service providers. In fact, Verizon USES BroadWorks and Acme Packet itself in their new FiOS VoIP architecture.




    Update 2009 September 15: Tom Keating revised his post with some more details on the claim of new technology:
    Ok, some clarification from XConnect. The key point which I guess I may not have made clear in my article is that their direct RTP is able to penetrate firewalls and initiate direct RTP sessions. That's their "secret sauce".

    Here's what Vlad Smelyansky from XConnect told me:

    Your description of our RTP is wrong and the comments (except reference to the patent) are correct. The guy actually picked on the core mistakes and ignored the rest.

    There are gazillions of other cases when people are using direct RTP. Our patent and technology is not about the idea of sending traffic directly between end points because that is part of the RFC. Our patent and technology is that we actually can do it between endpoints behind firewall(s) without any special configuration of firewalls (NAT) or deploying additional technologies (STUN) on those endpoints. ACME, Nextone, and Ditech tried to do this and came to the conclusion that it is impossible without additional protocol or special firewall configuration. But if endpoint devices are on a public IP or the user is willing to do special NAT or use STUN, they all can do Direct RTP for last eight-nine years. Our specialty is that we can do it with almost any endpoint behind a firewall and do it totally plug-n-play. No one else can do that.



    He referred to my comment about the patent. I am not a lawyer, but Vonage was found to infringe the claim in the patent where the IP address of the called party is provided to the calling party. AFAIK, Verizon hasn't tried to sue anybody else about that claim in US patent 6,104,711. Perhaps XCast has some way of setting up direct media between the calling and called parties without returning their IP addresses to each other; telepathy, maybe?

    Nevertheless, I only mentioned the patent because it's a dated legal document showing that people were thinking about sending media directly between the two voice-over-packet endpoints.





    (SIP Trapezoid image from https://wiki.sch.bme.hu/bin/view/Infoszak/IpTetelKidolgozasMedia?CGISESSID=0d198e2a22067629087267719d075294, but obviously drawn by Cisco.)