Sunday, March 16, 2008

Now where did I put those packets...

Several months ago I started noticing some trouble with my outgoing connections. As my wife put it, "the internet is slow again..." Indeed. Initially, power cycling the cable modem or router would sometimes do the trick, but about 3 weeks ago that no longer fixed things.

The first thing I checked was my outbound hop. I was surprised to see pretty massive loss to that gateway. I ended up writing a script to check my connection to this gateway every 5 minutes around the clock so I could have some picture of when the outages were occurring.

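#!/bin/bash -x
# Ping the first-hop gateway every INTERVAL seconds (default 300) and log each run.
INTERVAL=${1:-300}
# Use $HOME here: a quoted "~" is not expanded by the shell.
LOGDIR="$HOME/work/suddenlink_logging"
mkdir -p "${LOGDIR}"
while true; do
    # Look up the current public IP, then assume the gateway is .1 on the same /24.
    IP=$(wget -q checkip.dyndns.org -O - | awk -F': ' '{print $2}' | awk -F'<' '{print $1}')
    GATEWAY="${IP%.*}.1"
    LF="${LOGDIR}/$(date +%Y%m%d:%H:%M:%S).log"
    ping -c 50 "${GATEWAY}" | tee "${LF}"
    gzip "${LF}"
    sleep "${INTERVAL}"
done
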
I just spent some time with gnuplot and have some images of the issues over the last month.
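
For anyone wanting to do something similar, here is a rough sketch of how the gzipped logs could be boiled down to one "timestamp loss%" row per run before handing them to gnuplot. This is not the exact processing I used; the function names loss_from_log and summarize_logs are made up for illustration, and it assumes the ping summary line looks like the iputils one ("50 packets transmitted, 48 received, 4% packet loss, ...").

```shell
#!/bin/bash
# Sketch: turn gzipped ping logs into "timestamp loss%" rows for plotting.
# Assumes log names of the form YYYYmmdd:HH:MM:SS.log.gz, as written by the
# logging script, and a summary line containing "N% packet loss".

# loss_from_log FILE: print "<timestamp> <loss percent>" for one log.
loss_from_log() {
    local file="$1"
    local stamp loss
    # The timestamp is the file name minus the .log.gz suffix.
    stamp="$(basename "$file" .log.gz)"
    # Pull the number in front of "% packet loss" out of the summary line.
    loss="$(gzip -dc "$file" | sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p')"
    # A log with no summary line (ping never ran properly) is counted as
    # 100% loss; that is a choice made for this sketch.
    echo "${stamp} ${loss:-100}"
}

# summarize_logs DIR: print one row per log found in DIR.
summarize_logs() {
    local dir="${1:-$HOME/work/suddenlink_logging}"
    local f
    for f in "$dir"/*.log.gz; do
        [ -e "$f" ] || continue   # skip the unexpanded glob when DIR is empty
        loss_from_log "$f"
    done
}
```

Running summarize_logs ~/work/suddenlink_logging > loss.dat should give a file gnuplot can read with time data enabled (set xdata time) and a matching format (set timefmt "%Y%m%d:%H:%M:%S").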



Once I got the openWRT working the way I wanted, I swapped out my router hoping that would help me isolate whether or not it was the router causing issues. I had seen a couple times that power-cycling the router would help, but not always. After about a week, I ended up seeing the same issue on the new router.

One of the most frustrating bits of this issue has been that the standard practices of the Suddenlink techs end up masking the problem. The first thing they tell you to do is power cycle your router and modem. This ends up "fixing" the issue without ever diagnosing why it happened in the first place. They also ask you to connect directly to the cable modem, taking the router "out of the picture." This, too, is problematic because the cable modem has to be power-cycled to recognize the MAC address of the new Ethernet device before it will allow you to DHCP upstream. As I mentioned, this power-cycle tends to temporarily fix the issue.

Thanks to openWRT, though, I was able to clone the MAC address of my laptop onto the router, which let me swap devices without power-cycling the modem. When I did this, I observed that when the upstream DHCP server sees a new MAC address, you typically get placed in a different subnet than before. After another round of calls with Suddenlink, I eventually told them that I was having issues even when directly connected. That was the required step before they would send out a tech. Oh, I guess I also forgot to mention that it wasn't clear that sending out a tech would help, since Suddenlink can probe the modem remotely to see the various signal levels going to the device; they all looked fine.

In any case, the tech came out on March 10th and replaced the connector of every piece of cable from the modem to my box outside the house and even the filter in the combo box up the street. In addition to that, I'm also using a loaner cable modem. The current plan is to test for a week to see if I get any packet loss and if I see none, then I'll swap back to my old cable modem and test for another week. That should determine if the issue is with my cable modem or if the wiring fixes made the difference.

I'm hoping to get this issue resolved soon, but as you can see from the charts, one never knows when it's going to cause a problem and there isn't a clear fix other than continuous power-cycling of the devices until their system rights itself.
