I should have called this article, “How I spent my Christmas vacation developing an ulcer” (from the abominable state of backhaul tech support). I know, I know, WISPs never really get a vacation. I’m so paranoid about being out of touch that I took a satellite phone on a cruise and left the number with my Vienna Hot Dog vendor in case he got worried when I didn’t show up every Tuesday. Recently, however, I’ve put 24 hours on my cell phone over 4 days, driven 700 miles, and wasted two 16-hour days on-site while my family was enjoying Italian dinners, Christmas cookies, and many hours around the card table. But this isn’t an article about goofy customer support calls.
Sure, I could tell you how the first Christmas call I got was from a customer with a new Kindle. Since her computer was wired directly to the CPE, she realized that she needed her wireless router. This would be the router that had been sitting in her garage for the last 18 months, since 9 months before we did our installation. So the technician who installed the CPE antenna never even knew she had a wireless router, and of course she thinks he should magically know the security code. No, I’m not here to tell you stories about those kinds of calls, since I can’t devote January to compiling them and many of you have your own list.
I’m here to tell you about data carriers who think that nobody above Level 1 tech support needs to give a flying donut about the customers. As they sit on Mount Data Center Olympus, they look upon us knaves as the ignorant masses who should accept the fact that our network is down until they feel like dealing with it. It’s simply inconceivable to them that they screwed up; it must be our fault. They think about us, the last-mile data carriers that aren’t Cox, Time Warner, or CenturyLink, the way the Goa’uld think about the Tau’ri (a Stargate reference: the Goa’uld think they are gods and that the Tau’ri, the humans, are ignorant peons to be enslaved). Yeah, I’m a sci-fi geek.
I just spent 4 days working through denial, stonewalling, outright incompetence, arrogance, and downright laziness with 3 out of the 4 companies I deal with. I’m not naming names because, in my experience, they aren’t unique. Apparently the concept of customer support is totally foreign to some backhaul carriers. The lack of oversight on how their departments treat their customers is extremely evident; I’m thinking that if the CEO of any of these companies knew how bad it was, heads would be rolling. Of course, the culture of keeping customers boxed in with Level 1 technicians who are trained so that it’s impossible to escalate a call to the mysterious Gods of Networking (at least that’s what they tell the customers until you threaten all sorts of mean, nasty things) is the fault of the CEOs. In this culture, it’s not even remotely possible that Level 2 staff made a mistake, and they can’t even be bothered to check with the customer to see if a call should be closed (part 2 will cover this ingenious decision). If this were one company out of 4, okay, maybe it’s an anomaly. But it occurred with 3 out of 4. And the company that I exempted basically did their job, but I still had to initiate every single follow-up call, all 5 of them. They never called me back every hour for an update (as they said they would every single time) even though this outage was 5 hours past their SLA. However, when I did call, a technician answered and then moved to get an update.
Since I was dealing with 2 separate outages, I’ll start with the first one, which also took the longest to resolve. That’s because it’s harder to fight your reseller and their last-mile carrier simultaneously. It’s kind of like having to climb over the barbed-wire fence first just to get to the minefield. And while you are working your way through the minefield, another barbed-wire fence keeps popping up after each step. When you have to do the job of the reseller’s tech support supervisor on top of that, life gets rough. It’s even worse when the reseller’s Level 2 techs try to parse and jump on every word you say to avoid having to call the last-mile carrier. They also like to subtly remind you that you aren’t qualified to operate a TV remote control, let alone tell them that their network doesn’t work properly and why. I have now learned from 3 different people how the ping command works because apparently I was unaware that I needed re-education in this. Must be a test coming up that nobody has told me about.
If this article does nothing more than tell the resellers and backhaul carriers that somebody needs to be reviewing their absolutely pathetic customer service above Level 1, then it has some value besides its therapeutic value for me. Hopefully this article helps you navigate the painful ordeal that is our backhaul carriers’ support processes. I firmly believe now that these processes were developed by the medical insurance industry and ported word-for-word into our industry. Either way, I hope my ranting has some redeeming value because I know that if I have to go through this again, I will be hiring a full-time Anger Management consultant, making an appointment with the Hair Club for Men, and putting a gastroenterologist on speed dial.
A few days before Christmas, we started getting calls that web browsing had pauses of 15-30 seconds between pages. Netflix users were also having big streaming buffering problems. We knew about the Netflix/Amazon problem and also thought it might be our network, since our load was up about 300% since September. We looked at the 5.5.2 Ubiquiti firmware as a possibility since there were some rumblings in that area on the forums. So, after quickly testing 5.5.4 Beta 2 in our office, we decided to deploy it on our APs, and for a day or so things seemed to be much better. Ah, problem solved, life is good. Silly me, I was so naïve. It had nothing to do with the firmware, although 5.5.4 Beta 2 was working better than 5.5.2. However, we did go back to 5.5.2 when further testing showed some possible compatibility issues with 5.5.4 RC-1 that we didn’t have time to review as of this article (my wife made it clear to me that spending time with the in-laws was more fun than playing with new firmware features across a few hundred radios). The firmware does seem to work very well and will be an improvement, but you probably shouldn’t deploy it mixed with 5.5.2 Beta 2 in a production environment yet, or at least do your own testing before making that decision.
The problem cropped up again the next day, or at least that’s when we started getting calls again. Now we were looking at the authentication servers and routers. We couldn’t find a problem there either, and the Peplink load-balancing routers had been rock solid for years. So, we were down to the provider’s routers. We realized that two circuits that normally run at about the same level of bandwidth because of the load-balancing had an unexplained bandwidth usage differential of 10 to 1. We rebooted one of the carrier routers and bounced the Peplink port on that router. Now it wasn’t coming up at all. This is where the fun starts, so grab a HoHo, a glass of milk, and the Pepto-Bismol and get comfortable.
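For anyone who wants to catch that kind of lopsided load before the phones start ringing, here is a minimal sketch in Python. It assumes you already have a throughput reading in Mbps for each of two load-balanced circuits from whatever monitoring you use; the function names and the 3-to-1 alert threshold are made up for illustration and are not taken from Peplink or any carrier tool.

```python
def imbalance_ratio(mbps_a: float, mbps_b: float) -> float:
    """Return busier-circuit throughput divided by quieter-circuit throughput.

    The result is always >= 1.0. If one circuit carries nothing while the
    other carries traffic, the ratio is reported as infinity.
    """
    high, low = max(mbps_a, mbps_b), min(mbps_a, mbps_b)
    if low <= 0:
        return float("inf") if high > 0 else 1.0
    return high / low


def is_suspicious(mbps_a: float, mbps_b: float, threshold: float = 3.0) -> bool:
    """Flag a pair of circuits whose load differs by at least `threshold` to 1.

    The default threshold is an arbitrary assumption; tune it to how evenly
    your own load balancer normally splits traffic.
    """
    return imbalance_ratio(mbps_a, mbps_b) >= threshold


if __name__ == "__main__":
    # Hypothetical readings: one circuit at 50 Mbps, the other at 5 Mbps.
    print(imbalance_ratio(50.0, 5.0), is_suspicious(50.0, 5.0))
```

A check like this only tells you that the split looks wrong, not which side of the demarc is at fault, but it gives you a number to put in the trouble ticket.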
I’m not a big fan of following the rules, especially when everything in my experience tells me that something smells wrong and the obvious direction I’m being told to follow is the same failed path I’ve gone down before. However, in the interest of the Christmas Spirit and the Good Will Towards your Fellow Man thing, I followed process and opened the call as I’m supposed to. A Level 1 tech (or as I have nicknamed them, Evil Guardians of Truth and Productivity and the preventers of getting my circuit back up and running) took the call and tried the usual “did you power cycle every computer within 5 square miles” shtick. I realized it needed to get escalated quickly, but he said I would hear back from them within a couple of hours. Even though I let him do it, I knew from past experience that the model for their Level 2 tech support was “We don’t need to call no stinkin’ customers back”. However, I waited my obligatory 2 hours, got scolded by at least 10 new Kindle owners during that time, added another 30 minutes, and then started over again.
The standard lie that Level 1 tech support personnel tell you is that they have no way to contact Level 2 tech support. Companies that think this firewall is good policy, where a data communications company can’t reach another division within its own company, clearly need to clean out some middle management. However, the level of difficulty for Level 1 to reach Level 2 varies between companies. Some companies only allow Level 1 to use some type of instant messaging or email to reach Level 2 (brilliant move there, CenturyLink, for your business customers, and the reason I’m now with another provider at my office). Yeah, dictating a routing problem to a Level 1 middleman who is typing at 50 words per minute is so much more efficient than talking at 125 words per minute. And with feedback on new information taking minutes, it takes an hour to convey something that could be resolved in moments. It’s simply a really stupid policy that does nothing but waste the Level 1 tech’s time, tick off the customer to no end, and force Level 2 techs to multi-task, which is difficult in a conversation but impossible in a highly technical solution process. It doesn’t help that the Level 2 techs also start off with the same stupid question, “Have you checked the cables and power cycled everything first?” I too am a fan of The IT Crowd (a British comedy TV show about a dysfunctional corporation and the tech support guys in the basement), but that wastes another 20 minutes waiting for reboots and is the reason my cell phone almost became a projectile.
If you calmly explain to the Level 1 tech that you graduated from the 3rd grade, know that the word computer doesn’t start with a K, and that your shirt matches your pants, you might gain enough credibility to get them to understand you aren’t a total idiot. When that doesn’t work (you have a 50/50 chance here), you ask for a supervisor or tell them to connect you to Level 2 because this problem is above their pay grade. This effort raises your chances of getting to Level 2 by 0%, but it makes you feel better. That’s when you threaten to commit hara-kiri, start throwing out your technical analysis of the problem with every detail of what you have done to resolve it, thus proving it’s their problem, not yours, and tell them you have contracts from their competition sitting on your desk ready to sign. Miraculously, they have now discovered the extension number of Level 2 support.
So, we are now about 3 hours into this problem and still nothing has been accomplished. When you finally reach Level 2 tech support and get past the obligatory “have you power cycled the router” (I never get tired of hearing that one because it’s so much funnier the 30th time it’s repeated), you basically sit on the phone while the new tech rereads the notes on the problem. It’s really silly of you to think that anybody in Level 2 has even looked at this yet, since it’s only an hour past the time they were supposed to respond. Oh wait, I almost forgot the best part. Before this second call, while I was waiting for Level 2 to call back, this little jewel popped up in my Inbox from said department on my case, Level 2:
“We need verification from the EU if they want DHCP enabled on the router or not.”
After I picked myself up off the floor, I reviewed everything to make sure I hadn’t just jumped to another dimension where Rod Serling was talking to an audience about this. Let’s see, one of my circuits is dropping 5-15 pings, or basically freezing up every 2-3 minutes for 30 seconds or so, and my other circuit isn’t even close to delivering the bandwidth I’m paying for. That’s the official trouble ticket. Now go back and read the response from Level 2 again and tell me that as an industry, we should be worried that “going networking” hasn’t replaced “going postal”. Of course they called the local carrier and asked them to check everything. When the carrier responded with, “everything is working fine but the DHCP setting might be wrong”, my reseller’s Level 2 technician thought this was such a great response that he couldn’t wait to share it with me before critically analyzing this vital information to see how it fit into my trouble ticket. Utterly brilliant, and now I’m kicking myself because I didn’t think of the DHCP setting being turned on or off as the possible culprit. What was I thinking?
This is on my business circuit, which never used DHCP and has been on static IP addresses from the beginning. How two Level 2 technicians or network engineers came to the epiphany that DHCP has anything to do with this issue is something that I will never be able to explain to my dying day and for which I never got an answer. If you are wondering whether everyone in the world has gone crazy or it’s just me, reread the first paragraph. It was either that or Allen Funt was going to jump out any time and tell me I was on Candid Camera.
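If you want to hand a carrier something harder to argue with than “it freezes every couple of minutes”, a log of the dropouts helps. The sketch below is one hedged way to do that in Python. It assumes you have already collected one ping result per second as a list of True (reply) and False (no reply) values; how you collect them is up to you. The function name and the minimum burst length of 5 lost pings are illustrative assumptions that simply mirror the 5-15 dropped pings in the ticket above.

```python
from typing import List, Tuple


def loss_bursts(replies: List[bool], min_len: int = 5) -> List[Tuple[int, int]]:
    """Find runs of consecutive lost pings.

    `replies` holds one entry per ping: True for a reply, False for a loss.
    Returns a list of (start_index, length) for every run of at least
    `min_len` consecutive losses. Shorter blips are ignored.
    """
    bursts: List[Tuple[int, int]] = []
    start = None  # index where the current run of losses began, if any
    for i, ok in enumerate(replies):
        if not ok and start is None:
            start = i
        elif ok and start is not None:
            if i - start >= min_len:
                bursts.append((start, i - start))
            start = None
    # Handle a run of losses that continues to the end of the sample.
    if start is not None and len(replies) - start >= min_len:
        bursts.append((start, len(replies) - start))
    return bursts


if __name__ == "__main__":
    # Hypothetical sample: 2 minutes clean, 10 lost pings, then clean again.
    sample = [True] * 120 + [False] * 10 + [True] * 60
    for start, length in loss_bursts(sample):
        print(f"lost {length} pings starting at second {start}")
```

With one ping per second, the start index doubles as a timestamp offset, so the gap between bursts shows whether the freeze really does repeat every 2-3 minutes.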
Maybe the Level 2 techs got into the eggnog too early, or maybe their multi-tasking led to this mistake. What I do know is that my reseller’s Level 2 support never called me and then somehow came up with this brilliant conclusion. I also knew at this point that I was going to have to restart the tech support ladder from the bottom, with 50 client calls backed up and about 4 hours of downtime already behind me. The last thing I wanted to do was walk 2 companies through troubleshooting processes for something they are supposed to be the experts in. I was also 100 miles from home and missing hand-made ravioli with all my relatives who flew in to visit us for Christmas. I was not in a great mood for this kind of apathetic effort on their part. I could tell that this was going to be a very long day.
I’m going to pause now until Part Two of this epic adventure. I know that many of you have had far worse technical support problems, with issues 300’ in the air in below-freezing weather. I understand that and have been through some of that myself, so believe me, I feel for you. This article isn’t about looking for sympathy for the challenging and difficult environments that we deal with. Be honest, it beats sitting behind a desk all day, and if we didn’t love it, we would be doing something else. It’s more about how difficult it is for a WISP to deal with tech support processes put in place by incumbent carriers that simply don’t care about customer service. It’s either that or they hire managers and staff who would rather be playing Halo 4 than doing their job. The only guy I feel for in this process is the Level 1 tech who has to make the decision to escalate a call to Level 2 or actually call them to do their job, especially if I’m on the other end of that line after my last trouble ticket email from Level 2.
So now you know the reason for my Christmas ulcer. I started the whole process over again, but with the knowledge that Level 2 tech support was going to require me to hold their hand to get their job done. That means no mercy anywhere along the process, since I am now painfully aware that Level 1 is impotent and Level 2 is lazy and inattentive. As you can see, I’m 3 pages deep and still haven’t gotten a single thing accomplished with either company, and now other data circuits are down at another location 80 miles away. The hits just keep on coming. Part two of this article might be a little less cynical after I’ve had lots of really, really good eggnog along with my used-to-be-fresh leftover ravioli that now have to be microwaved to death (aaarrrggghhhh). If I ever needed a HoHo for dessert and comfort, now would be the time.
* * * * *