Preparing for the Worst and Setting Up for the Best: Western Tool & Supply’s Disaster Recovery and Modernization Story
As Western Tool & Supply’s system administrator, part of my job is to just get IT out of the way. I make sure the systems run smoothly so it’s business as usual, instead of: “Why is our system down again?”
But we’ve had problems. Our servers were out of date and disaster recovery was non-existent. It was one horror story after another. The system would go down all the time. Even worse, when things went wrong, our former IT Manager couldn't be reached. He’d be gone, for whatever reason.
I wanted to move us away from that sad state of affairs, and I needed the budget to do it. I was able to justify the expense because ensuring our systems’ availability would not only keep the business running day to day, but also support us in the future.
Finding a Robust IT Backend for a Call Center Business
At Western Tool & Supply, most of our business comes through our call centers. We have ten branches on the West Coast, and our headquarters is in Livermore, California, in the Bay Area. We are the number one regional distributor of cutting tools, in no small part because our customers can call us up and have same-day delivery of the tools they need to keep production going.
Our value-add is immediacy. The delivery system we have in place is like Amazon Prime Now, but for machine tools. If your business is within range of one of our offices, we can deliver the tools you need the same day, at no charge. Our drivers go out twice a day: once in the morning and once in the afternoon. If you place your order by 8:30 a.m., it’s on the nine o’clock truck. If you call us by noon, and we have it in stock, it goes out at 1 p.m. If it’s at one of our other offices, you’ll typically see it in a couple of days.
Each of our offices is both a warehouse and a call center, and we do most of our business by phone, which is why our system has to be fast. Our customers don’t browse the website looking for what they need. They call us with technical cutting questions, or to see if we have a tool in stock, and then place the order. If our system is down or slow, they could miss one of our daily delivery windows. It also keeps our reps from doing their jobs. That hurts our business, and it can severely disrupt our customers’ operations as well.
Just think about it. Running out of a tool, even a small, inexpensive item, can cost a machine shop hundreds of thousands of dollars. It can bring down an entire manufacturing line for hours, or days. The sooner we can resupply the item, the faster a shop can get back to work. You can imagine what a horror it was when our systems went down. (Side note: Western Tool & Supply has built a great product to help manage your tool inventory.)
Installing a Data Center in a “Regular Office Building”
Many years ago, when Livermore became our headquarters, the decision was made to house our data center in the building. The problem is that there’s nothing particularly special about our head office, except that it’s bigger than our ten branch offices.
Sure, they put in special flooring and beefed up the AC and threw in some extra power—but our data center was vulnerable to all the things that can go wrong at a regular office building. If there were a power outage, bye-bye server. If someone doing construction down the street dug up the wrong line, we’d lose our network connection.
I’d thought about moving our server infrastructure to a co-location facility, but a business decision had already been made to keep it at our corporate headquarters. We had to switch gears and start thinking about a disaster recovery solution.
My other challenge was our out-of-date equipment. We had something like 30 bare-metal servers. Not only were they six or seven years old, but they were also running Windows Server 2003, and we needed to get up to Windows Server 2008 or 2012.
That was back in 2014. We had this great idea that moving to VDI (virtual desktop infrastructure) would help us streamline and modernize, as well as possibly solve our disaster recovery problem. So I started doing some research on potential solutions.
Researching the Best Solution
We began to look at virtualization solutions, and hyperconvergence kept popping up as this new concept. So I dug a little deeper. We didn’t have anybody who was an expert in storage or virtualization, but hyperconvergence was obviously the way we wanted to go because of how it could streamline our IT operations.
I was really excited to learn everything about it. I get a kick out of learning what makes things tick. I started tearing down computers and building them back up as early as I can remember. That’s how I started to learn how they work.
I poured so much time into this research because I wanted to make sure we got the best solution. On top of that, it was a lot of money to invest. When handling my budget, I constantly remind myself I’m spending someone else’s money. When I’m put in that position, I ask myself, “If this were my money, how would I want to spend it?”
After we’d figured out how hyperconvergence works, we sought out the top vendors. We then went to lunch with each of them because, hey, we love getting a free lunch.
We had lots of conversations with potential vendors. But we didn’t just take their word for it. We wanted to know what other IT professionals had experienced, so I hit the message boards and review sites: Gartner, Reddit, and even Spiceworks.
After we’d seen how the various solutions worked in the real world, we narrowed it down to the final two vendors: HPE SimpliVity and Nutanix. In the end, HPE SimpliVity won based on features that centered around those two key initiatives for us: disaster recovery and modernization of our servers.
Avoiding Catastrophes—Big and Small
It’s amazing how far we’ve come thanks to HPE SimpliVity.
Back in the old days, updating our software was a nightmare. Sometimes, we’d apply a patch or update to the latest version of an app, and everything would fall apart. It would take us hours to recover from something like that. We’d have to locate the image and re-image the machine. Even then, we weren’t always sure how old an image was.
With HPE SimpliVity, we’re constantly backing up our servers. A backup only takes a few seconds, and we can schedule several of them every day. If an update fails, we simply roll back to a backup from a couple of hours, or even minutes, earlier. It’s amazing how fast we can recover.
There’s also the topic of redundancy. When we upgraded to HPE SimpliVity servers, we not only got faster hardware that sped up everything, we also got built-in server redundancy. We’ve had a couple of hard drives fail since then. You can’t always avoid failures like that. But when they did happen, they weren’t catastrophic. The solution to the problem was ready when we were.
Now, we have a disaster recovery system in place running on HPE SimpliVity. We still have the same problems with power outages at our main office, but in the event of a blackout, we can go into vCenter, highlight all our VMs, and power them off safely. We can then spin up those VMs at our disaster recovery site. We haven’t had to do this yet, but if it happens, we’re ready.
Downsizing Data, Upsizing Performance, and Improving Uptime
Another huge saving comes from reduced storage costs. The cost of media is going down, and drives are getting bigger and faster, but enterprise-class storage is still expensive, and you always need more. It’s pretty crazy how data blows up. The amount you need to store, maintain, and protect is always going up.
A further benefit of upgrading our software and hardware was taking advantage of the latest data deduplication technology. Our VDI setup consists of clones of a single image with little variation: we have something like 150 Windows 7 or Windows 10 VMs, so most of the data they hold is identical and only needs to be stored once.
We’re saving around 68 terabytes on our VDI cluster, and another 55 terabytes on our server VM cluster. The compression ratio is around 11.7:1 on our VDIs, and 10.1:1 on our servers. The efficiencies are huge.
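For a rough sense of what those figures imply, here is a back-of-the-envelope sketch in Python. It assumes the ratio means logical data divided by the physical space actually used, and that the savings figure is the difference between the two. That is one plausible reading of the numbers, not a definition taken from the HPE SimpliVity tooling, and the function name is made up for illustration.

```python
def implied_capacity(saved_tb, ratio):
    """Estimate logical and physical terabytes from a savings figure and an N:1 ratio.

    Assumption (not from the vendor): ratio = logical / physical
    and saved = logical - physical.
    """
    if ratio <= 1:
        raise ValueError("ratio must be greater than 1 to imply any savings")
    physical_tb = saved_tb / (ratio - 1)
    logical_tb = physical_tb * ratio
    return logical_tb, physical_tb


# Figures quoted above: terabytes saved and the ratio for each cluster.
clusters = [("VDI cluster", 68, 11.7), ("server VM cluster", 55, 10.1)]

for name, saved, ratio in clusters:
    logical, physical = implied_capacity(saved, ratio)
    print(f"{name}: about {logical:.1f} TB of data held in about {physical:.1f} TB")
```

Under those assumptions, each cluster would be holding tens of terabytes of logical data in roughly 6 terabytes of physical capacity. If the reported figures are measured differently, for example by counting backups, the estimate would change.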
I can sleep so much better now. I used to worry constantly about the servers we had before. Our HPE SimpliVity setup runs better, is more resilient, and makes it easy to recover from failures or add capacity. Spinning up a new VM and adding a new data store is nearly instantaneous.
Spending the Time and Money Now to Future-Proof Our Business
To put it mildly, HPE SimpliVity took us from the Stone Age to the private cloud era. It is a single affordable solution that allowed us to meet a number of business goals at once. We modernized our data center, boosted our processing power, vastly improved our uptime, and streamlined our storage—all within budget.
The thing you have to remember is to take the time and spend the money to do it right. Whatever the project, it always takes longer and costs more than you think. Don’t be afraid to budget more time and money. I used to be a penny pincher, but I learned over time that it’s in your best interest to invest a little more in your business.
Thanks to HPE SimpliVity, Western Tool & Supply can manage our ever-changing business requirements. We don’t have to worry about missing opportunities because our backend is not up to the task. The right setup today means we’re better prepared for the challenges of tomorrow.
You have to think of the future. And one of the things I hope to see in our future is moving our hardware to a co-location facility. HPE SimpliVity halved our physical footprint: we went from four racks to two. This makes moving to a separate facility more cost-effective. Instead of moving a mess of equipment to an external data center, we can pull it off by transferring a few HPE SimpliVity 2U systems.
It’s really liberating to know how simple IT can be. That’s the thing about finding the right IT solution: It simplifies your life, and lets you get on with the job.