Moore’s Law Vs the Reality of Quality Conference Calls

Good morning or good afternoon, everyone. My name is Brad, and happy Tuesday. Thank you for joining our monthly webinar. This time, we’re going to be talking about Moore’s law and how it compares with the reality of conference call quality. As we’re getting going, just a couple of housekeeping items before we start.

You should know that the title page is up, so if you don’t see the opening slide, press star, zero on your phone, and an operator will help you. We are in the web conference. Everyone is muted, so don’t worry if this is lunch time for you and you’re eating a sandwich. You won’t interrupt anyone. That’s also true on the website as well, so there’s a chat pane on the right side, and feel free to punch in a question or tell me what your interest is. Nobody else is going to see that, so don’t worry about silly questions or anything like that. We’ve got a lot to cover. It is five after the hour, so let’s go ahead and get started.

First of all, we are going to cover a number of different things in this session today. We’re obviously going to talk about what Moore’s law is, in case you don’t know. Send me a text and tell me if you do know what it is. We’re going to talk about conferencing from the perspective of the last decade, ten years ago, and then what it is currently. We’ll look at how different aspects of it, specifically the transport side and the bridging side, and the details within each, impact the quality, and how that then relates back to Moore’s law. Ultimately, we’re going to talk about what the impact is and what kind of solutions are available for us.

Just before we get going, I am President of Adigo here. I’ve got experience at both public and private companies. Specifically relevant to today’s topic, the equipment and how it impacts conference call quality, I was director of products more than ten years ago at a company called Voyant. That was a company that made the bridging equipment for all the major carriers. I have quite a background, and it was interesting preparing this webinar, thinking back to my days there and what was going on.

First of all, what is Moore’s law? Basically, what Moore’s law talks about is a doubling of performance every 18 months, so very, very quickly. Basically every year or two, the performance of a technology should actually double. This was originally presented as it relates to transistors on a microprocessor or a chip, but it’s also been proven true for memory and memory density, display monitors and the number of pixels in a display, and processing speed. You may remember the old days when Intel talked about their 486, and then 500 megahertz, and then 750, et cetera.

Basically, the bottom line to Moore’s law is just that everything gets better and better, and typically less expensive, very, very rapidly. That’s true, but let’s think about conference calls from ten years ago, so much more than 18 months. What happened with the conference call back in the day? Basically, it was a pretty straightforward process. You’d dial an 800 number, you’d enter in some form of an ID, a conference ID, and the audio quality, I think, generally, people thought was pretty good. Basically though, it was expensive, and so not everybody did it. It was less expensive in 2004 than it was back in the late ’90s, but still fairly expensive.

All right. Let’s think about it today. How does a conference call work today? We’re going to ignore this call because this has a lot of fun, fast, neat features on it. On a typical call that you may be involved with, on lots of different types of providers, you would typically still dial some kind of 800 number, but normally, now, you have to search for that email invite and find it, because you often have to dial a fairly long ID, maybe ten digits, so you have to have it right in front of you to know what it is. You may have to go through a number of different types of prompts. It could be a little cumbersome. Once you’re on the call, we hear from people that sometimes there are audio quality issues, often because people might be participating from a mobile phone. You might hear background noise from where they are, maybe at an airport or in a car, and then sometimes, it can be difficult for you to be heard. You’re trying to get into the conversation, but it seems like no one hears you.

Certainly, they’re much less expensive than they used to be ten years ago. No doubt about that, but when you think about all this, is it better than what we described from ten years ago? Certainly, it hasn’t doubled in performance every 18 months. Over the course of a decade, that would mean doubling six or seven times over, and it doesn’t seem like that’s happened in conferencing. Many people would actually say, “It’s worse.” Albeit it is cheaper, but the quality and other aspects of a conference call are actually worse.
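To put a rough number on that expectation, here is a small Python sketch of the compounding. It takes the 18-month doubling period mentioned earlier at face value, and the function names are purely illustrative.

```python
def doublings(years: float, months_per_doubling: float = 18.0) -> float:
    """Number of doubling periods that fit into the given span of years."""
    return years * 12.0 / months_per_doubling


def improvement_factor(years: float, months_per_doubling: float = 18.0) -> float:
    """Overall multiplier if performance doubles once per period."""
    return 2.0 ** doublings(years, months_per_doubling)


# A decade at one doubling every 18 months (the figure used in this talk).
print(f"doublings in 10 years: {doublings(10):.2f}")
print(f"implied improvement:   {improvement_factor(10):.0f}x")
```

Under that assumption, a decade holds between six and seven doublings, which compounds to roughly a hundredfold improvement.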

Why is that? Why hasn’t Moore’s law … If you’ve got some ideas, send a chat in, and let’s discuss them. I’m going to talk about two specific aspects of quality issues in the conference call today. The first one I typically refer to as the ‘pass the mic’ issue, or a walkie-talkie type of phenomenon. That has to do with not being able to be heard or get into the conversation, and only being able to hear one person at a time.

You may have thought, “Gosh. How come I can’t get into the conversation?” It reminds me of the old days when Nextel was around. You may remember them as a mobile carrier, and they had this push-to-talk technology. It was “Chirp. Hey. Need this at the construction site.” They did market it to tradespeople and construction quite a bit. Basically, it was like a walkie-talkie, and only one microphone, if you will, could be on at a time, and so only one person could be talking.

That’s what the situation is now, and the impact of that is you lose the conversational speech aspect of a collaboration. You lose the ability to interrupt somebody. They get going on a monologue, maybe like I’m doing now. The dialogue is just less conversational, and you even miss out on understanding whether people are following along with what you’re talking about, just the simple “Uh-huh” or “Wait a second”. That is lost, and so that’s a definite issue for a lot of people.

Another problem people experience in conference calls is what I refer to as the ‘snap, crackle, pop’. If you have a kid, or remember back to the days with the Rice Krispies, it might be a lot of fun if you’re eating breakfast, but it’s not that much fun if you’re on a conference call and you’re hearing different ‘artifacts’, as we call them, in the conversation. Or it could even be latency. It could be an issue where a certain speaker or their words seem out of sync. It’s like watching a movie, where it’s very easy to tell when the visual is out of sync with the audio, but that can happen just in a conversation as well, when one person seems to always be either behind or not aware that someone has started talking.

Why are these issues happening, and what are the causes? There are two primary technical reasons why these occur. The first one has to do with transport. Even though we think of conference calls as being in the cloud, what’s actually happening is that my voice still needs to physically move to what’s called the ‘bridge’, which mixes it all together, and back to the endpoint or phone of each of the different people participating on the call. Just like a car on a highway, it has to physically move across the highway, and some highways are better, faster and more efficient than others. The same is true with voice and the actual voice energy. The second one is bridging. Just like there are eight-lane bridges and there are very small bridges, conference bridges have different performance levels.

Let’s talk about those two items in a little bit more detail. First of all, let’s talk about transport from ten years ago, so circa 2004. Who knows what PSTN stands for? Go ahead and send me a chat if you know what that stands for. Basically, in the old days, so to speak, the physical route that the voice information traveled was entirely on the PSTN, or the public switched telephone network.

This was a physical connection, a direct connection, and that system was very predictable, very reliable, and has served people for 60 plus years in a very, very reliable fashion. So much so that it was actually designed for 99.999% uptime and quality. People refer to that as the ‘five nines’, five nines of quality. You can see that on this little pictorial diagram below. In, say, a six-person call, everybody is on the old traditional corded handset, and their connections are all to only the PSTN network, and each of their connections is a solid line because that was a solid, direct connection. Again, pretty simple. Pretty straightforward. Very predictable, and very reliable.
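As a quick illustration of what ‘five nines’ means in practice, here is a short Python sketch. It assumes the figure is read as availability over a 365-day year, which is the common interpretation and not something spelled out in this talk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes per year a service may be down at the given availability (0..1)."""
    return MINUTES_PER_YEAR * (1.0 - availability)


# Compare three, four and five nines.
for label, availability in (("99.9%", 0.999), ("99.99%", 0.9999), ("99.999%", 0.99999)):
    minutes = allowed_downtime_minutes(availability)
    print(f"{label:>8}: about {minutes:.1f} minutes of downtime per year")
```

On that reading, five nines leaves only a little over five minutes of downtime in a whole year.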

What about today? It’s quite different actually, as you can imagine. We all know that people are on different types of endpoints, first of all, so you’ve got people on mobile phones. You’ve got people on cordless phones. You’ve got people connecting through their computer on Vonage or Skype. Those services go on what’s called the ‘public internet’ or a ‘public VoIP connection’. You’ve got people at their office that may be connecting on a VoIP system as well, but maybe that VoIP connection is with quality of service, or QoS.

First of all, all of the endpoints are different, and second, the networks are different. Of course, with a mobile phone, the transport of that connection is via the mobile network, and some of the other connections are going to be VoIP, over the internet and different aspects of the internet, and then all those connections have to be combined as well. Now, you’ve got people coming in on different networks, and those networks being interconnected as well. Especially on transport over the internet, you’ve got different architecture within the internet: softswitches, routers, codecs … Those are meant to efficiently move data, but as it relates to voice, they can introduce different types of artifacts, like static and sometimes echo, and so you see those notations on the lines connecting the speakers into the cloud, if you will, on the pictorial diagram here.

Now, instead of having straight, solid lines, we’ve got dashed lines from each connection, and they’re wavy because there are different types of artifacts being introduced. To have a conversation or a conference call in a mixed environment like this is very, very different, and as you can imagine, much more challenging because it has much higher variability. Think about it if you’ve ever used Skype. I know my kids talk to their grandparents who haven’t actually [inaudible 00:13:50] in quite a while. They used to do it a little bit more often, but that variability got annoying. Sometimes, it worked just fine, but sometimes it was crap, and you couldn’t even hear, and that was frustrating. That’s because of the different types of endpoints, the different types of networks, and these artifacts that are introduced. I’ve got listed here on the slide, if you’re interested, some of the causes for these different types of artifacts.

For example, a delay might be caused by excessive hops on the internet or on the VoIP network, or long sections on the public internet as opposed to a dedicated internet connection. Basically, from a provider perspective of conference calling, there are multiple ways to implement this transportation network, and there are dramatic cost differences in how it’s implemented. Each provider does it a different way, and depending on how they do it, it will definitely affect the variability and ultimately the quality.

Let’s talk about the next thing now, the bridge. What’s it do? The bridge first of all has to mix and combine the sound from everyone’s endpoint.

Let’s take that example with six people, and maybe there are two people talking. That’s signified with the heavy lines, the thickness of the line being the volume, so the person in orange and the one in blue are actually talking on the call, so their lines are heavier, and that goes into the bridge. In reality, even for the people that aren’t talking, there is some amount of volume going into their mouthpiece. There might be, for example, someone breathing. That puts in a very small amount of noise, but some level of it. Even just listening, there’s going to be some level of soft noise. You can imagine that if you combined all of that with six people, the thin lines, even though they’re thin, can drown out the ones that are actually talking because there are so many of them, and the result is just a fuzzy, white noise. You could imagine the more people on a call, the worse this is.

For example, if there were ten people or 15, I mean, it would literally be just fuzz. How do we avoid that on a conference call? The other thing the bridge does, which is critically important, is it mutes the people that aren’t talking. All four of those lines that are not actively talking are actually muted by the bridge, and so that sound doesn’t get contributed along with the other ones. The process that the bridge uses for this is called the ‘Talker Selection Algorithm’ or TSA.
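The build-up of background noise described above can be roughed out in numbers. This Python sketch assumes the noise on each idle line is uncorrelated, so the noise powers simply add, and it uses made-up power levels (the two constants are hypothetical, not measurements from any real bridge).

```python
import math

TALKER_POWER = 1.0       # hypothetical power of one active talker
IDLE_NOISE_POWER = 0.01  # hypothetical noise power on one idle line


def snr_db(talker_power: float, idle_lines: int, idle_noise_power: float) -> float:
    """Signal-to-noise ratio in dB when every idle line is mixed in unmuted."""
    noise = idle_lines * idle_noise_power  # uncorrelated noise: powers add
    if noise == 0:
        return math.inf  # no idle lines contributing, so no added noise
    return 10.0 * math.log10(talker_power / noise)


# The more unmuted idle lines, the lower the ratio of speech to noise.
for idle in (4, 10, 15, 100):
    print(f"{idle:>3} idle lines: {snr_db(TALKER_POWER, idle, IDLE_NOISE_POWER):.1f} dB")
```

With these illustrative levels, each added idle line lowers the ratio of speech to noise, and muting the idle lines removes that contribution altogether.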

The bridge has a TSA that is monitoring all the connections and identifying which ones are talking, muting the ones that are not, and then combining the resulting talkers and redistributing that, and so the result is more clarity. As you can imagine, the challenge for a bridge is that the talker changes, and often, in an interactive conversation, that talker is changing very rapidly and all the time. Before, it was the orange and blue bubbles, if you will, and now it’s the green and purple, and so the bridge very quickly had to identify that, mute the different lines, and, if you will, open up the mics for the purple and green. A bridge is doing that processing literally in microseconds. That’s a very critical job of the bridge.
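To make the talker selection idea concrete, here is a minimal Python sketch of one round of it. It is an illustration under stated assumptions, not how any particular bridge is built: the energy threshold, the `max_talkers` limit and all function names are hypothetical, and a real bridge works on streaming audio rather than short lists of samples.

```python
from typing import Dict, List


def frame_energy(samples: List[float]) -> float:
    """Mean squared amplitude of one short frame of audio."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def select_talkers(frames: Dict[str, List[float]],
                   max_talkers: int = 2,
                   threshold: float = 0.01) -> List[str]:
    """Pick the loudest lines above the threshold; every other line is muted."""
    energies = {line: frame_energy(frame) for line, frame in frames.items()}
    active = [line for line, energy in energies.items() if energy >= threshold]
    active.sort(key=lambda line: energies[line], reverse=True)
    return active[:max_talkers]


def mix(frames: Dict[str, List[float]], talkers: List[str]) -> List[float]:
    """Sum the selected talkers sample by sample; muted lines contribute nothing."""
    if not talkers:
        return []
    length = min(len(frames[t]) for t in talkers)
    return [sum(frames[t][i] for t in talkers) for i in range(length)]


# One frame from four lines: orange and blue are talking, the others are idle.
frames = {
    "orange": [0.5, -0.5, 0.5, -0.5],
    "blue": [0.3, -0.3, 0.3, -0.3],
    "green": [0.01, -0.01, 0.01, -0.01],
    "purple": [0.02, -0.02, 0.02, -0.02],
}
talkers = select_talkers(frames)
print("open mics:", talkers)
print("mixed frame:", mix(frames, talkers))
```

Running the selection again on every new frame is what lets the open mics move from orange and blue to green and purple as the conversation shifts.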

How has bridge technology changed? Back in the day, ten years ago, bridges were made with custom hardware. It was called ‘big iron’. They were loaded with digital signal processors, special chips, and they had different layers. Today, typically, they’re generalized servers, and a lot of the processing is done on the software side. Just like we talked about on the transport issue earlier, cost is a big driver for providers. They’re trying to save money on both of these technologies.

What’s the result of that? Let’s overlay what we’ve talked about, the transport issues, onto the bridging issues. Now, if you look at this, it’s a much more complicated slide. Now, we’ve got the different types of endpoints contributing through their different networks, potentially putting artifacts into the conversation. You can imagine, for the bridge, how does it know whether some static coming into the line is someone talking, or if it’s an artifact? That can be confusing to a bridge. It doesn’t know whether to open up that line as a speaker or not.

Secondly, if it does open up that line and there are artifacts on it, that’s a big problem because now, all of a sudden, you’re getting static into the connection for everybody. Everybody is starting to hear that artifact that may be introduced from only one specific line. Then, if that TSA processing on a bridge isn’t extremely fast, you can have latency, or you could have latency from the transport. Now, all of a sudden, if you’ve got multiple people talking, they could become out of sync. What happens is that most providers choose for their bridge to have a TSA of one, only one mic open at a time, because that’s the safest. That eliminates the out-of-sync issue, and it is going to be less likely for artifacts to get introduced and broadcast out to everyone, but the problem is you lose that conversational speech. This is why you’ve experienced the issue of, “Hey. I can’t be heard.” That’s because the bridge is only allowing one talker at a time.
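The cost of a TSA of one can be shown with a toy example. This Python sketch models a short exchange as a list of moments, each listing who is speaking (loudest first), and counts how many voices the bridge drops for a given number of open mics. The names and the sample conversation are invented for illustration.

```python
from typing import List


def heard(active_per_frame: List[List[str]], open_mics: int) -> List[List[str]]:
    """Lines that actually reach the mix at each moment, given the open-mic limit."""
    return [frame[:open_mics] for frame in active_per_frame]


def dropped_voices(active_per_frame: List[List[str]], open_mics: int) -> int:
    """How many times someone was speaking but was not let into the mix."""
    return sum(max(0, len(frame) - open_mics) for frame in active_per_frame)


# Hypothetical exchange: a presenter talks while a listener tries to interject.
conversation = [
    ["presenter"],
    ["presenter", "listener"],  # listener says "uh-huh"
    ["presenter"],
    ["presenter", "listener"],  # listener says "wait a second"
    ["listener"],
]

for open_mics in (1, 2):
    print(f"open mics = {open_mics}: dropped {dropped_voices(conversation, open_mics)} interjections")
```

With one open mic, both interjections are lost; with two, the listener gets through, which is the conversational quality being traded away for safety.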

What can be done? If you pay more on the transport side, you can get better noise cancellation. You can get higher quality codecs, and on the bridging, if you pay more for more powerful processing or have a purpose-built bridge, you can get a more sophisticated TSA. That allows you to have more mics open without introducing the artifact or the latency or sync issue. Basically, you can spend more and get back to the quality of yesterday if you will, where you had good conversational speech, and the ability to interrupt people.

What that looks like is still, you’ve got obviously the different endpoints, but the connections into the bridge are more like straight lines. The artifacts aren’t there to the same degree, and you can have conversational speech. You can allow more mics open, so you can see in this case an orange, a blue. They’re all contributing to the conversation, and that doesn’t introduce any issues because of the smart and quality-focused implementation of the transport and the bridging.

We’ve got additional resources. I know that was a quick explanation, so there’s additional information about this on our website, as well as other information on conference calls and meetings in general, so feel free to check that out. My information is here. I would certainly appreciate a chat, so if you could send one in and let me know if this was helpful or not, we’re constantly trying to get feedback and improve. If you have ideas for other webinars that you’d be interested in, I would really appreciate that as well.

With that, if you’ve got additional questions, you can send me those in the chat, and I’ll take them offline. Thanks so much for your time today, and I look forward to speaking with you soon. I will go ahead and put the music back on, so if you have a question, I’ll be right with you, or you can email me as well. Thanks so much.

