Powering Music City and Ensuring Compliance with HCI
Boasting a proud place in America’s cultural heritage, Nashville, Tennessee, has earned its moniker of Music City. From the Grand Ole Opry to the Country Music Hall of Fame and many small stages in between, Nashville Electric Service (NES) powers it all.
Distributing electricity to more than 410,000 customers in central Tennessee, NES is one of the largest municipally owned utilities in the United States. I know this organization intimately, having been here for more than 20 years. When I was first hired, I had never laid eyes on a server; our only tool was a laptop computer. Today, I oversee installation and maintenance of all operations data centers and communications circuits, including fiber and radio systems.
Times have changed, but the utility industry is slow to adopt new technology. That all changed for us in 2015 when the North American Electric Reliability Corporation (NERC) became responsible for compliance. Suddenly, we had to have systems in place to monitor and document actions, and hold onto that data for long periods of time to satisfy audits or face significant fines. We had to change our ways.
Regulatory Compliance Demanded a Change in Architecture
And I do mean suddenly: my group, which maintains all operations communications, was told in February 2015 that we had to be compliant by July 2015. That was a short time to implement all the systems we needed to meet our compliance requirements.
We purchased a lot of turnkey systems, but one of the compliance requirements was that we couldn’t use a traditional data architecture consisting of shared hardware. Having separate LUNs in the same hardware chassis in the data center did not meet NERC requirements. Because we couldn’t share hardware, we had to come up with a cost-effective way to implement all these different systems on separate physical hardware. The most cost-efficient way to do that was to move to a hyperconverged infrastructure (HCI).
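To make the no-shared-hardware constraint concrete, here is a minimal sketch of how such a rule could be checked against an inventory. It assumes the rule reduces to "no two in-scope systems on the same physical chassis," which is a simplification of the real requirement. The system and chassis names, the `placement` mapping, and the `shared_chassis` helper are all hypothetical and do not describe the actual NES environment.

```python
from collections import defaultdict

# Hypothetical inventory: system name -> physical chassis it runs on.
# These names are illustrative only; they are not taken from NES.
placement = {
    "siem": "chassis-a",
    "log-archive": "chassis-b",
    "access-control": "chassis-c",
}

def shared_chassis(placement):
    """Return {chassis: [systems]} for every chassis hosting more than one system."""
    by_chassis = defaultdict(list)
    for system, chassis in placement.items():
        by_chassis[chassis].append(system)
    # Keep only the chassis that would violate the assumed separation rule.
    return {c: sorted(s) for c, s in by_chassis.items() if len(s) > 1}

violations = shared_chassis(placement)
print(violations or "no shared hardware")
```

An empty result means every system sits on its own chassis under this simplified rule; a real audit would check far more than chassis placement.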
We looked at every HCI platform on the market, including NetApp, SimpliVity, and Dell EMC. While cost efficiency set us on the path to HCI, cost became a secondary factor when it came to comparing platforms against one another. Our primary goal was to achieve NERC compliance, and on the technical aspects, Nutanix came out on top. Once the Nutanix clusters were in place, we rushed to spin up seven virtual appliances in addition to about 35 Windows servers, but we needn’t have worried. Setting up those appliances on Nutanix took minimal time and effort, and it worked fantastically. I don’t believe anyone outside my group even realized we’d changed anything.
Moving Towards Virtualization with a Flexible Solution
We started out running Nutanix on Dell ESXi hosts in a dark site setup, meaning that we have no internet connectivity. The only way in and out is through specific protocols we set up for manual file transfers.
What’s pretty unique is that we are running Nutanix Metro Availability, a continuous availability solution that limits downtime and replicates all data between our data centers. We own the fiber between our primary data center and our backup data center, with a 20 Gbps link between the two, which gave us the opportunity to run a Metro cluster. That has saved us a lot of heartache by protecting application workloads, and it has also given us lots of flexibility in performing upgrades, disaster recovery drills, and the like. It’s super easy to just shift all the data storage and compute from one data center to the other and perform whatever maintenance needs to be done, all with no interruptions to the users.
In the future, we hope to move away from the ESXi hosts, but for now ESXi is still required for some of our virtual appliances. In the meantime, we’ve virtualized our Windows-based SIEM, which has allowed us to run the new SIEM on clusters running AHV. We knew that moving our SIEM from dedicated hardware to virtual machines would demand a lot of compute and storage, but the Nutanix solution scales beautifully. When we selected the hardware for the virtualized SIEM, we went with Nutanix AHV for three reasons:
- It’s compatible with Windows
- It doesn’t have the licensing costs of ESXi
- AHV had recently implemented the real-time sync feature, which would replicate the ESXi Metro Availability functionality
I was sold. In moving to a virtualized SIEM with Nutanix, we expanded data availability, which in turn improved our security posture. Sharing data has become much easier and more granular, we have a lot more compute power, and I’m not concerned about ever running out of storage. I’ve expanded my storage and compute by a third.
It’s easy to add another node if necessary, and Nutanix gives me options: I could deploy storage-only nodes, for example. It’s such a flexible platform that it gives me whatever I need.
Overcoming Internal Reluctance
Using AHV has involved a bit of a learning curve because it’s different from ESXi, but the time and effort that’s been put into the common management platform, the Prism GUI, is much appreciated. It looks good, it works well, and even though it’s a rapidly evolving product, it’s pretty bug-free. Since implementing any of our flavors of Nutanix, we have never, ever lost data due to a Nutanix problem, and that speaks volumes.
That said, we had to overcome some reluctance from within our ranks. NES has its electric customers, but from my perspective in managing internal compliance, my customers include our SCADA developers, who want to ensure that all their software runs as expected. These internal customers were comfortable with standalone servers since they could see them and touch them, and it was easy to calculate a one-for-one translation to determine how much redundant hardware they needed to cover any potential failure.
Unfortunately, their first experience with virtual machines was a standalone ESXi host, which didn’t have the greatest hardware redundancy or compute speed. It took a significant amount of convincing to get them to try a full-bore virtual environment after their sketchy experiences with the standalone hosts. They were not at all keen, especially when it came to the idea of migrating critical systems.
That’s why, when we purchased the virtual system for our SIEM, I gave the developers a three-node Nutanix cluster in a development environment that they could play with. Now that they’ve seen how easy it is to create and delete their own machines from the Prism Element interface, and how redundant the hardware is, they are very happy. Since they came to understand how it works, they’ve been quick to ask, “What more can we do with this?” We’re now looking at the possibility of building a larger cluster for the software team and using Nutanix Flow to separate their domains. If we can manage that, we could realize a 75% reduction in network hardware costs.
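The redundancy argument that won the developers over can be put in rough numbers. The sketch below compares the one-for-one approach they were used to (a dedicated spare for every standalone server) with a cluster that keeps spare capacity for a set number of node failures. It assumes each cluster node carries about as much as one standalone server, and the three-node minimum and one-failure tolerance are illustrative defaults rather than figures from our sizing; both function names are hypothetical.

```python
def standalone_hosts(workload_hosts: int) -> int:
    """One-for-one redundancy: every server gets its own dedicated spare."""
    return 2 * workload_hosts

def clustered_nodes(workload_hosts: int,
                    failures_tolerated: int = 1,
                    min_nodes: int = 3) -> int:
    """Cluster sizing under the stated assumptions: workload plus shared spare capacity."""
    # Assumed: one node is roughly equivalent to one standalone server.
    return max(min_nodes, workload_hosts + failures_tolerated)

# Compare hardware counts for a few hypothetical workload sizes.
for n in (2, 4, 8):
    print(f"{n} workloads: {standalone_hosts(n)} standalone hosts "
          f"vs {clustered_nodes(n)} cluster nodes")
```

Under these assumptions the gap widens as the workload grows, because the spare capacity in a cluster is shared rather than duplicated per server.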
Nutanix has come a long way in their upgrade cycles, and with Nutanix Life Cycle Manager (LCM), upgrades are now easy and seamless. I can pull updates for all of my Nutanix platforms from one place. It’s been a great help in simplifying the upgrade process as well as documentation for compliance.
Nutanix Prism Central allows me to manage all the clusters at a data center and orchestrate Metro Availability from one centralized site. Visually, it’s very clean, and I can set up alerts that help me stay on top of everything. We also use Veeam for backup, which integrates seamlessly with AHV.
With a Rock-Solid Solution, My Team Can Do More
Using Nutanix has been a huge leap forward for NES. The compute is faster, allowing my team to do more with less hardware and less cost. Because it runs so reliably, I don’t have to spend a lot of time on maintenance. I just get to use it. It’s a fantastic product that works exactly as it should.
Like most of the world these days, my team is short-staffed, with only three of us to perform network maintenance as well as administer all the compliance systems, from security to network admin to help desk functions. Because Nutanix doesn’t require a lot of maintenance and Nutanix support is fast, responsive, and easy to deal with, I don’t have to memorize everything that goes on with this system under the hood. If we do run into an issue, we can get an engineer on the phone and solve the problem quickly. Nutanix helps my small team get all our jobs done.
Our sales team at Nutanix has also been amazing (special shout-out to Adam, Brian, and Pete!). I’ve had experiences with other vendors where it has been like pulling teeth trying to get technical details, but the Nutanix sales team always provides me with a Sales Engineer who can give me those detailed answers. Nutanix has also been flexible on their delivery avenues, which is a significant benefit for us as a municipal utility. It has been a refreshingly pleasant experience working with them.
Going forward, I see us moving more into a virtualized environment, especially for our SCADA group. That will save money in the long run compared with paying for separate network and hardware. It’s a much more efficient way of doing business than with individual machines, and we intend to stay with Nutanix for the long haul.
Before moving to Nutanix, I had no idea how simple virtual administration could be. We came from the land of hardware, and the move has been a game-changer for us. Now it just feels natural, the way things should be. Looking back, it’s shocking to see how much we wouldn’t have been able to do without HCI. It has opened our eyes to a whole new world.