Saturday, August 9, 2008

Somewhere between Home and the Enterprise

You've probably noticed that I'm a bit of a storage guy. But I'm also looking for just what I need, not necessarily more, and I want it at a good price. There are, of course, a number of highly scalable solutions in the storage space, but they tend to extend up into a price range that I'm just not willing to pay. IBM's DS3000 and 4000 series are good examples. The 3200 is scalable up to 48TB (4 arrays of 12 SAS/SATA drives), the 4800 up to 16 trays of 14 drives for 224 TB. The 4800 is extremely enticing, except for the cost.

That leaves me looking at pulling something together myself. SAS controllers provide a great way to scale out at low cost simply because the cards themselves support a large number of devices, something like 256 per SAS adapter versus the SATA cards, which top out at around 24. The scale-out is achieved by using the mini-SAS external ports to cascade multiple disk enclosures. However, cascading requires the enclosure to include a fan-out adapter; basically the enclosure includes a SAS/SATA chip that handles identifying and managing the disks in that enclosure. The typical SAS/SATA enclosure at the low end supports cascading up to 4 chassis, which gives us between 48 and 64 devices (12- or 16-disk enclosures). There are some exceptions: the SANBloc S50 and this new beast I've just found. The SANBloc is quite nice: 12 disks, cascading up to 7 enclosures, and a reasonable price for this no-man's land between the low end's 4-chassis limit and the high end's 16+.

I'll probably have to contact the company directly for pricing; none of the web shopping tools seem to know much about the DNF SPOD 16000 JBOD. The SPOD keeps the 7-chassis limit but sports 16 drives per enclosure, bringing the total capacity to 112 devices, roughly half the limit of my fav adapter: Adaptec's 51245 Unified SAS/SATA controller. My ideal setup includes this single card, direct attachment of 16 drives, plus another 112 devices externally, for a total of 128 devices, or 128 TB of raw capacity with 1TB drives.

How does the cost compare? The IBM EXP3000 12-disk enclosure trays start at $3200 apiece, the SANBlocs are $1600, and let's assume the SPOD is about $2200. For 7 enclosures, that gives us $22,400 for 84 devices from IBM, $11,200 for 84 devices in SANBlocs, and $15,400 for 112 devices from DNF. Breaking that down as cost per device: $267 per IBM slot, $133 per SANBloc slot, and $138 per DNF slot. Even at $2500 per DNF enclosure, that's only $156 per slot, and it supports 28 more devices than the SANBloc.
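For anyone who wants to plug in their own quotes, here is a minimal sketch of that per-slot arithmetic. The function name `chain_cost` is made up for this example, and it assumes a chain of 7 enclosures unless told otherwise; the prices are the ones quoted above, with the SPOD figures still being my guess.

```shell
#!/bin/bash
# chain_cost PRICE SLOTS [COUNT]   (hypothetical helper, not from any tool)
# Prints "total_cost total_slots cost_per_slot" for COUNT enclosures (default 7).
chain_cost() {
    local price=$1 slots=$2 count=${3:-7}
    # awk does the division; bash arithmetic is integer-only.
    awk -v p="$price" -v s="$slots" -v n="$count" \
        'BEGIN { printf "%d %d %.2f\n", p * n, s * n, (p * n) / (s * n) }'
}

chain_cost 3200 12   # IBM EXP3000        -> 22400 84 266.67
chain_cost 1600 12   # SANBloc            -> 11200 84 133.33
chain_cost 2200 16   # DNF SPOD at $2200  -> 15400 112 137.50
chain_cost 2500 16   # DNF SPOD at $2500  -> 17500 112 156.25
```

The enclosure count cancels out of the per-slot figure, so the count only matters for the total outlay and the total number of slots.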

The other concern that I'm throwing out the window here is array performance. Each mini-SAS port has 4 300 MB/s lanes, for 1200 MB/s of total bandwidth. With 112 300 MB/s SATA II drives we're capable of generating 33,600 MB/s. Funnel that torrent through the 1200 MB/s link and we squeak out ~3.6% of the total bandwidth. For much higher throughput, one needs to incorporate far more adapter cards, without cascading more drives than each link can handle. However, my main use-case is total storage capacity. None of my uses for the storage will exceed the capacity of 1Gbit networking either, and that makes either the SANBloc or the DNF a good deal for lower-cost, highly scalable storage.
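The bottleneck math is easy to redo for other drive counts. This is just a sketch: `link_share` is a name I made up, and it assumes every drive can stream at its full interface speed at once, which real drives won't.

```shell
#!/bin/bash
# link_share LANES LANE_MBS DRIVES DRIVE_MBS   (hypothetical helper)
# Prints "link_bandwidth drive_bandwidth percent_of_drive_bandwidth_the_link_carries".
link_share() {
    local lanes=$1 lane_mbs=$2 drives=$3 drive_mbs=$4
    awk -v l="$lanes" -v lm="$lane_mbs" -v d="$drives" -v dm="$drive_mbs" \
        'BEGIN {
            link = l * lm      # what one mini-SAS port can carry, in MB/s
            demand = d * dm    # what the drives could push if all ran flat out
            printf "%d %d %.1f\n", link, demand, 100 * link / demand
        }'
}

link_share 4 300 112 300   # -> 1200 33600 3.6
```

By the same arithmetic, 4 drives at 300 MB/s are already enough to fill one 4-lane link.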

Thursday, April 3, 2008

OpenVPN - solid open-source software

OpenVPN is a project that reminds me why I love open source software. I first read about OpenVPN on the intertubes after looking for a secure, remote access solution for my church. The current setup there involves lots of Microsoft and all of the crappy and costly "solutions" that go along with it. The previous method of connecting to the church network entailed an RDP session to one Windows computer. Only one user was allowed, since the license level of the Windows machine restricted the remote connections. Lovely. Insecure AND overpriced; what a combination.

Eager to displace any Windows installation, I dove into looking for an open-source solution. I installed Ubuntu 6.06 LTS server onto an aging Dell machine and a couple of hours later, I had a working OpenVPN installation which bridged remote traffic onto the church network. This was fantastic. The bridging option meant that things like Windows sharing and other broadcast programs would 'Just Work (tm)'.

I leaned heavily on OpenVPN's existing documentation along with a few fantastic tutorials for generating all of the keys. To simplify deployment, I wrote a script that automatically generates a new user's keys and config files based on a template from the config samples.
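The config half of such a script can be very small. What follows is a sketch and not my actual script: it assumes a template in which a made-up `@USER@` placeholder marks where the user name goes, and it leaves key generation out entirely since that depends on how your certificate tooling is set up.

```shell
#!/bin/bash
# render_client_config USER TEMPLATE   (hypothetical helper)
# Prints the template with every @USER@ placeholder replaced by USER.
# The @USER@ marker is an assumption of this sketch, not an OpenVPN convention.
render_client_config() {
    local user=$1 template=$2
    # Plain sed substitution; user names containing "/" or "&" would need escaping.
    sed "s/@USER@/${user}/g" "$template"
}
```

With a template containing lines such as `cert @USER@.crt` and `key @USER@.key`, running `render_client_config alice client.template > alice.conf` would produce a config that points at alice's certificate and key files.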

Now, the goodies don't stop there. As I mentioned above, Windows is clearly in the picture at my organization and I knew I was going to need some way to provide access to the VPN for the Windows users out there. As usual, someone in the open-source community had already done my work for me. Not only does OpenVPN work on Windows, someone has created a solid GUI-based version _AND_, get this, they provide a method for generating custom OpenVPN installers into which one can embed keys and configurations. This allowed me to bundle a version of OpenVPN and pre-configure the install to put the right config and keys where they need to be on the user's system.

After deploying OpenVPN, we've gone from single-user, Windows-only remote access to supporting multiple connections from multiple platforms.

Sunday, March 16, 2008

Now where did I put those packets...

Several months ago I started noticing some trouble with the outgoing connections. As my wife put it, "the internet is slow again..." Indeed. Initially, power cycling the cable modem or router would sometimes do the trick, but about 3 weeks ago that no longer fixed things.

The first thing I checked was my outbound hop. I was surprised to see pretty massive loss to that gateway. I ended up writing a script to check my connection to this gateway every 5 minutes around the clock so I could have some picture of when the outages were occurring.

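#!/bin/bash -x
# Ping the ISP gateway every INTERVAL seconds (default 300) and keep each run as a gzipped log.
INTERVAL=${1:-300}
# Use $HOME here: a quoted "~" is never expanded by the shell.
LOGDIR="$HOME/work/suddenlink_logging"
mkdir -p "$LOGDIR"
while true; do
    # Look up the current public IP and strip the surrounding HTML.
    IP=$(wget -q checkip.dyndns.org -O - | awk -F': ' '{print $2}' | awk -F'<' '{print $1}')
    # Treat the .1 address of the same /24 as the gateway.
    GATEWAY="${IP%.*}.1"
    LF="${LOGDIR}/$(date +%Y%m%d:%H:%M:%S).log"
    ping -c 50 "${GATEWAY}" | tee "${LF}"
    gzip "${LF}"
    sleep "${INTERVAL}"
done

I just spent some time with gnuplot and have some images of the issues over the last month.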
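Getting from a pile of gzipped ping logs to something gnuplot can read takes one more small step. Here is a sketch of how that could look; `ping_loss` and `summarize_logs` are names invented for this example, and it assumes the Linux ping summary line of the form "50 packets transmitted, 45 received, 10% packet loss, time 49073ms".

```shell
#!/bin/bash
# ping_loss: read one ping transcript on stdin, print the packet-loss percentage.
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# summarize_logs DIR: print "timestamp loss_percent" for every DIR/*.log.gz,
# taking the timestamp from the file name written by the logging script.
summarize_logs() {
    local dir=$1 f stamp
    for f in "$dir"/*.log.gz; do
        [ -e "$f" ] || continue              # no logs yet: print nothing
        stamp=$(basename "$f" .log.gz)
        printf '%s %s\n' "$stamp" "$(gzip -dc "$f" | ping_loss)"
    done
}
```

Running `summarize_logs ~/work/suddenlink_logging > loss.dat` gives one line per ping run, with the timestamp in the first column and the loss percentage in the second, which is a convenient shape for plotting.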



Once I got the openWRT working the way I wanted, I swapped out my router hoping that would help me isolate whether or not it was the router causing issues. I had seen a couple times that power-cycling the router would help, but not always. After about a week, I ended up seeing the same issue on the new router.

One of the most frustrating bits of this issue has been that the standard practices of the Suddenlink techs end up masking the problem. The first thing they tell you to do is power cycle your router and modem. This ends up "fixing" the issue without ever diagnosing why it happened in the first place. They also ask you to connect directly to the cable modem, taking the router "out of the picture." This, too, is problematic because the cable modem has to be power-cycled to recognize the MAC address of the new ethernet device before it will allow you to DHCP upstream. As I mentioned, this power-cycle tends to temporarily fix the issue.

Thanks to openWRT though I was able to clone the MAC address of my laptop and when I did this, I observed that typically when the upstream DHCP server sees a new MAC address, you get placed in a different subnet than before. After another round of calls with Suddenlink, I eventually told them that I was having issues even when directly connected. That was the required step before they would send out a tech. Oh, I guess I also forgot to mention that it wasn't clear that sending out a tech would help since Suddenlink can probe the modem remotely to see the various voltage signals going to the device; they all looked fine.

In any case, the tech came out on March 10th and replaced the connector of every piece of cable from the modem to my box outside the house and even the filter in the combo box up the street. In addition to that, I'm also using a loaner cable modem. The current plan is to test for a week to see if I get any packet loss and if I see none, then I'll swap back to my old cable modem and test for another week. That should determine if the issue is with my cable modem or if the wiring fixes made the difference.

I'm hoping to get this issue resolved soon, but as you can see from the charts, one never knows when it's going to cause a problem and there isn't a clear fix other than continuous power-cycling of the devices until their system rights itself.

Saturday, March 8, 2008

A Quest Completed

If you've been looking for a computer upgrade recently, you've probably seen the stellar reviews of Intel's new 45nm Penryn-based CPU, the e8400. This dual-core, 3GHz processor with 6MB of shared L2 cache is a great price-to-performance value, delivers the best price per watt, and runs cooler than the E6600 Conroe both at idle and at load.

I certainly wasn't the only one to notice this. The processor went from being available everywhere online at roughly $190 to nowhere to be found except at shifty places at sky-high prices of $300.

Once I had decided to go ahead and pick one of these guys up, it was too late to find it at my favorite online computer store, newegg.com; out-of-stock there. I started searching far and wide. Eventually I ran across this Official Intel e8400 price thread.

One of the reasons I love the internet is that almost always you will find that someone else has more time than you do and spends it on finding out information that is useful to everyone and posts it for all. Such is the case with the e8400 shortage. An enterprising netizen discovered that the Intel Xeon e3110 ends up being the same processor as the e8400. It's 100% compatible, and in most cases when users run a utility like CPU-Z, it says they have an e8400 processor. I've not seen any reports of the e3110 not working in boards that are known to work with the e8400. Alas, I missed the e3110 boat as well, and that processor too is not available.

But that thread keeps on giving. Friday I read a post that mentioned that a local Fry's had plenty in stock at a reasonable price, $225 + tax. That set the idea for me to visit my local Fry's on Saturday. Bingo! The salesman said they had 60 in stock. Make that 59.

Quest Reward: +5000 xp, +1 skill points, +3 attribute points

Friday, March 7, 2008

240 TB of pudding? I wanna dip my storage array in it![1]

Ha! I'll have to settle for something less "enterprise." How about 96TB? I'm game. There aren't that many low-end storage solutions that can scale up near 100TB ranges without some serious cash and lots of racks. I'm sitting on about 2TB of data: 700G cobbled together by lvm2, and a more robust 1.4TB in a RAID5 setup backed by a 3ware 4-port 9650 RAID card. I can't bear to let drives go unused, so swapping out my 4 500G drives in the RAID array for 1TB drives isn't something I'm considering, even though that would double my storage to 3TB in the RAID array. Just what would I do with the other 4 500G drives? I had initially hoped that 3ware's multi-card support would allow the construction of logical arrays spread across multiple cards. That is, keep my 4 drives on the current card, purchase a 12-port 3ware, and then build a single logical array of all 16 drives. The 3ware support team said that it is possible, but only using software RAID in the host. Well, duh! But what's the point of the hardware RAID?

So where does that leave us? Looking at SAS cards, that's where. SAS is the "enterprise" version of SATA, with advanced features like dual ports, multi-pathing, support for large numbers of drives per adapter, etc. Two features of SAS are of key interest. First, a single SAS card can address something in the range of 128 to 256 devices. Second, SAS and SATA drives are interchangeable on a SAS controller. This means you can get "enterprise" scaling (SAS) with cheaper storage (SATA). In my above example, I'd be looking at multiple sets of 16-port SATA RAID cards to get large numbers of drives going in an array. Each one of these RAID cards goes for $1200-$1500 a pop. 3ware's SAS controller is $650. Yea, that's right, a $650 card can support up to 128 devices!

Hold your horses because there is one other dirty secret about this gold-mine of scalability. You can't directly attach that many drives to a SAS controller. Instead, SAS relies on your storage enclosure to include an expander. This expander helps the SAS controller attach to all of these additional drives. The trouble here is that most storage enclosures with expanders included are rather pricey. Typically $3000 to $6000 - just for the enclosure; you still have to go buy your drives.

There is some light at the end of the tunnel though. First, Adaptec makes a scalable enclosure, the SANBloc S50. With 12 3.5" hot-swap bays *and* support for daisy-chaining up to 7 S50s to a single SAS controller, we've got a big bowl of pudding here. Looking around, I see empty S50s going for around $1600. Add an Adaptec SAS adapter and, to start with, 12 1TB drives, and we're looking at about $5600. All told, about $465 per TB with 1 tray and a $270 drive, $419 if you scale out to 7 trays. As drive prices go down, say $200 by end of 2008, then $395 per TB for one tray, or $349 for all 7 trays. That's downright respectable scaling and a rock-bottom price.

Now, before you choke on that initial $5600 layout, let me introduce you to the other feature of the unified SAS card from Adaptec. Rather than just having a port for external enclosures, the Adaptec RAID 51245 has 12 internal ports, and 4 external. This provides an even lower entry point by supporting up to 12 drives internally AND then any of those 7 external S50s. At the cheapest, the card will retail for around $900 and then you can add in any set of drives you like, up to 12 of them, slowly adding a drive at a time up to 12 and then scale out to external storage, all with the same card.

I don't know about you, but I'm going to go get some pudding.

1. Crude references to MTV's The State skits (Barry and Levon, Louie).

Saturday, March 1, 2008

Say Hello to Failsafe

Life can be very exciting on the bleeding edge, but as I read the other day in a forum post, sometimes it means that you get cut. Figuratively speaking, I got "cut" the other morning as I attempted to switch my main router over to the glorious wrt54gl using my hand-tuned openWRT install. This did not go as planned.

I had spent the last week or so tuning the openWRT installation to do exactly what I wanted. I used one of my server's second NIC to simulate a connection to the internet. Something like this:


Such a setup allowed client1 to connect to the internet just like Server could. In my previous post I also spent time splitting up the WIFI and LAN networks to keep them independent of each other. Now, what I didn't do was test the settings across a reboot of the device. Having skipped this crucial step, I went ahead and swapped out my dlink for the wrt54gl.

At first, things went well. My WIFI connection to the wrt54gl worked just fine and I could connect to the internet. However, nothing on the LAN was getting a connection to the wrt54gl. I logged onto the wrt54gl (via the WIFI link) and looked at the config. After examining the routing table (route -n), I noticed that the LAN network wasn't listed. I had configured WIFI to use 10.23.24.X and LAN to use 10.23.23.X. At first, I figured this was just a hiccup and attempted to bring the LAN interface up (ifup eth0.0). No such device. Huh. ifup eth0. No such device. Booo. After about 10 minutes of nothing working, I figured I would revert the separation configuration and bridge the WIFI and LAN together. This was simple: just add the bridge option back to the config (/etc/config/network). Reboot.

At this point the bleeding had started. The WIFI associated with the wrt54gl, but I couldn't get an IP. I couldn't get an IP via the LAN either. I manually configured both WIFI and LAN and neither could connect to the wrt. Thinking for a bit, I suddenly realized my mistake. In the previous post, I mentioned some firewall rules. Specifically, I had entries to prevent the WIFI and LAN from passing traffic to each other. And now they were part of the same bridge. This meant that no traffic from either LAN or WIFI was getting into the device. The wrt54gl was bricked and I needed a tourniquet.

As hopeless as it seemed, I had to assume I wasn't the first guy to hose up a firewall configuration on the wrt and lock himself out of the device. And I was right. The smart folks at openWRT built in a failsafe mode.

I had already attempted to use the reset button on my wrt54gl, hoping that holding that during boot up might reset to defaults. This is true, but as the link described, it has to be done after the DMZ light turns on. Thankfully on my model, holding it down before the light came on didn't trash the device. On the second try, I got it to enter Failsafe mode: DMZ light flashing three times a second. A quick reconfiguring of my ethernet interface and I was telnetting to 192.168.1.1 and greeted with the openWRT logo and a shell.

I chose to use the firstboot and sync method and before rebooting, I ran 'passwd' to switch over to ssh. I then rebooted the device and it came up with all of the defaults again. Whew! It was also good to know that even if the Failsafe method hadn't worked for me, there were still several other options: UDP broadcast message and TFTP booting. Great Job openWRT!

Sunday, February 24, 2008

openwrt and wrt54gl - separate wireless from lan

I've always wanted to get one of those wrt54gls (the linksys wireless router that you can flash your own linux on) and was motivated to make the purchase after hearing good things from a friend at work. The wrt arrived last Thursday and I had it flashed with the latest openwrt within 5 minutes. There is always that dread that you just bricked your 50 dollar toy while you wait for the firmware to upgrade and reboot the device. Within 2 minutes though, I was telnetting to my router and was greeted with a pleasant banner:

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 KAMIKAZE (7.09) -----------------------------------
  * 10 oz Vodka       Shake well with ice and strain
  * 10 oz Triple sec  mixture into 10 shot glasses.
  * 10 oz lime juice  Salute!
 ---------------------------------------------------
root@OpenWrt:~#

Fantastic! The first thing I wanted to learn was how to configure iptables so I could pass through traffic from wrt clients to my uplink, but prevent any access to other machines on my home LAN. This would seem to be a fairly common request and I found tons of pages with tips and such, but nothing as straightforward as I wanted. I eventually spent time with the iptables tutorial and divined the following command:

iptables -A FORWARD -m iprange --dst-range 192.168.1.2-192.168.1.255 -j DROP

This is exactly what I needed. First, I found out that the 'FORWARD' chain handles network traffic not destined for the host machine itself. That means it would be either for my LAN or for the internet. My uplink router IP is 192.168.1.1 and that is the only valid IP I want to forward between my two segments. Once I ran that command, I could ping external machines but nothing on my LAN.

After that success, I moved on to the real topic: separating wireless from LAN. I run an open wireless router with ESSID broadcasting. This means that anyone close enough to the router can associate and connect. I'm comfortable with this setup; however, the default config for nearly all WIFI routers is to bridge the wireless with the wired connections. This just makes sense for most folks. My preference is to ensure that whoever connects to the WIFI link can't attempt to connect/crack/DoS anything on the LAN.

I finally stumbled upon exactly what I needed. As usual, the hardest part is knowing what to look for when searching on google. I knew that I needed to first remove the wireless link on the wrt from the software bridge that allows traffic between WIFI and the LAN. After separating them, one needs some firewall changes if the WIFI connection is to be able to connect to the internet. The guide to making that happen was on the openwrt wiki. In about 5 minutes, I had it all working. I skipped over the Shorewall config and chose to use a modified version of the 'Using OpenWrt Stock Scripts' section. I removed the last two lines, which allowed traffic between WIFI and LAN; exactly what I don't want. But the line above them is needed to ensure that WIFI traffic is allowed to WAN. Here is what I added to the 'allow' section:

# wireless-to-wireless OK
iptables -A FORWARD -i $WIFI -o $WIFI -j ACCEPT
# wireless to WAN, if WAN present
[ -z "$WAN" ] || iptables -A FORWARD -i $WIFI -o $WAN -j ACCEPT

# wifi to lan -- OFF
#iptables -A FORWARD -i $WIFI -o $LAN -j ACCEPT
# lan to wifi -- OFF
#iptables -A FORWARD -i $LAN -o $WIFI -j ACCEPT

The next step, of course, is to start playing with QoS on the WIFI link. I want to ensure that openVPN WIFI traffic is prioritized over any other.