Hyperconvergence and Housing: Providing Affordable Homes to Those in Need
Our primary goal at Metropolitan Thames Valley Housing is to provide affordable homes to those in need. We’re one of the largest housing authorities in the UK. We also operate elder care facilities, as well as social centres where people can look for jobs and access various community services.
One of our initiatives is offering low-cost lodgings to doctors, nurses, and police officers in London. Rents in the capital are astronomical, and it is often difficult—if not impossible—for first responders to find affordable accommodations near their workplaces. This is our way of assisting those who help the rest of us in times of crisis.
We also operate a commercial branch that offers shared ownership opportunities to first-time homebuyers. We buy part of a property and they purchase the rest. This allows people to overcome some of the hurdles to owning a property. We then reinvest the profits from these ventures into our social and community efforts.
The current organization emerged from the merger of Thames Valley Housing and Metropolitan Housing in October of last year, and we are still working on consolidating our operational policies and procedures.
Innovation and IT for the Collective Good
IT plays a major role in our activities, and is tightly woven into the fabric of everything we do. One of our biggest concerns is efficiency. The more we can streamline our day-to-day operations, the better we can deliver services to our residents. This is mostly a matter of using hardware and software to work faster and smarter. But there’s also a question of cost. If we can reduce our IT equipment and management expenditures, we can put that money back into the community and build another house, or add a new service.
We are also on the cutting edge in terms of how we use IT to serve our residents. Thames Valley Housing was one of the first housing authorities to offer a self-service web portal where our residents can log in, view statements, make monthly payments, and obtain information about our policies. The site won several awards for innovation, and we’re extremely proud of this. After all, we’re a bunch of geeks trying to make the world a better place.
Ballooning Storage and Disastrous Disaster Recovery
As a public service, we collect and maintain enormous amounts of data, and it just keeps growing. Unfortunately, we weren’t up to speed on storage and disaster recovery. Our SANs were ballooning, and they were lossy to boot. We had rack after rack of servers and hard drives. They were taking up space and consuming far too much power. And you know how these things go: The more devices you’re running, the more things can go wrong.
Our disaster recovery protocol was—well—a disaster. We’d back up everything in our offices to a SAN inside the building. As a last resort, we would write everything to 10 or so tapes overnight, and someone would take them away the next morning. We didn’t have the bandwidth or the speed to move these backups to offsite servers.
To further complicate matters, we could only back up the last seven days' worth of data because we lacked the capacity and the funds to maintain a bigger archive. It also took days to recover from tape, and we could have lost an entire week of backups if a single tape was corrupt.
We had this happen a few times. Every year we do a disaster recovery drill to test our setup and to train our staff for the real thing. This arduous ordeal often took seven days. On several such occasions, we found ourselves three or four days into the process when we discovered a corrupt tape. As a result, we were unable to restore from that backup and we had to go back to another version and start all over.
An Existential Threat
It was obvious that our approach was broken. I knew that storage and disaster recovery technology had evolved, and that we hadn’t kept up. We had to make a change and someone had to drive it, and so I took on that role.
That’s when I sat down and wrote an email to our executive team. I outlined the crisis in detail and explained the consequences. If we were to lose a backup for real, we’d face a complete systems outage for days or weeks on end. What would this cost the business? How would it affect our residents? And how would it affect our reputation? What would happen when the story hit the media? We serve a vulnerable population, and a failure to live up to our obligations would have significant consequences.
“This is a big red flag,” I said. “Can I get some support to change this? If I’m going to take on responsibility for our infrastructure, it is really important that we have this covered.” I’d been here before in my 15 years as an IT professional. You need to know when to ask for help, and you have to know how to do it.
You can’t just say, “People, we have a problem,” and expect executive buy-in. You have to lay out the past, the present, and the future of the issue, and then you have to frame it in the wider context of the organization.
If you present your ask as an IT concern, then it becomes your problem, and you have to deal with it. But if you outline it as an existential threat that affects every aspect of your organization’s activities, you’ll get senior management on board.
A Proven IT Partner
Once I was given the go-ahead, we started looking. I knew we wanted a replicated backup and recovery system, but I wasn’t sure whether we should go with an internal solution or move our backups to the public cloud. I then spoke to our usual IT partners about the challenges we faced. Our friends at Softcat analysed our storage needs for the next few years.
We’ve been buying storage from Softcat for years, and so they understood our timeline and our needs. They’d also supplied us with servers that were running virtual machines, and these were approaching end of life, too. In the course of our discussions, they introduced us to hyperconverged architecture, sat us down for some workshops, and gave us several options to look at. One of those options was HPE SimpliVity.
I was hesitant at first. For starters, I didn’t know anything about hyperconvergence, and I had to get up to speed. Also, I’d been burned before, and so I asked a lot of questions. I was sceptical about HPE SimpliVity’s promises, but at the same time, I knew that it was backed by HPE, the company that provided most of our servers. Everything looked good on paper, and the demos were fabulous. HPE SimpliVity catered to the backup and replication side of things better than any other product, but it was HPE’s reputation that sealed the deal.
Saving Space, Time, and Energy
The key functions that excited us were inline deduplication and compression, and the ability to quickly restore servers of any size. HPE SimpliVity not only delivered, but also exceeded all our expectations.
The time to complete our annual disaster recovery exercise dropped from a week to a couple of hours after we installed HPE SimpliVity. Now, we use HPE SimpliVity in our dev and test environment, as well as production.
Every few weeks we have to patch our Windows installation. Instead of going live, we spin up our entire dev environment at our disaster recovery site, apply all the patches, and test everything. If there are no issues, we sign off on the updates and install them in our production environment.
Once we’re done, we delete all the test VMs, and that’s it for another month. HPE SimpliVity has empowered us to spin up a test and dev environment at little or no cost whenever we need it.
In day-to-day terms, restoring a single server used to mean days of our help desk looking through tapes. Now it takes less than half an hour of effort.
The other major innovation in HPE SimpliVity is zero-lag compression and deduplication. Traditionally, compressing and deduplicating data created so much overhead that server performance suffered, but this is no longer the case. Again, I was sceptical, but HPE SimpliVity proved me wrong.
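For readers who, like me at the start, are new to the idea, here is a minimal sketch of what deduplication and compression mean in principle. It is a generic illustration and not how HPE SimpliVity is implemented: the `DedupStore` class, the 4 KB block size, and the `demo` function are all hypothetical names and values chosen for this example. Each block of data is identified by a hash of its contents, and a block that has already been seen is never stored a second time.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


class DedupStore:
    """Toy content-addressed store: each unique block is compressed and kept once."""

    def __init__(self):
        # Maps SHA-256 digest of a block -> compressed block bytes.
        self.blocks = {}

    def write(self, data: bytes) -> list:
        """Split data into fixed-size blocks and return the ordered list of digests."""
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # First time this content is seen: compress and store it.
                self.blocks[digest] = zlib.compress(block)
            recipe.append(digest)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its list of digests."""
        return b"".join(zlib.decompress(self.blocks[d]) for d in recipe)

    def stored_bytes(self) -> int:
        """Total size of the unique, compressed blocks actually kept."""
        return sum(len(b) for b in self.blocks.values())


def demo():
    store = DedupStore()
    # A made-up "server image": three identical blocks followed by a different one.
    vm_image = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    backup_1 = store.write(vm_image)
    backup_2 = store.write(vm_image)  # second backup of unchanged data adds no new blocks
    return store, vm_image, backup_1, backup_2
```

In this toy version, backing up the same image twice stores only two unique blocks, and either backup can be rebuilt in full from its list of digests. A real product does this work as data is written, which is where the performance overhead traditionally came from.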
Another big win for us was the reduction in the amount of space and energy we use. We went from two-and-a-half racks of servers to less than half a rack, and our move from mechanical hard drives to SSDs further reduced our energy consumption.
Integrating HPE SimpliVity and Veeam
The other change was our move to Veeam as a backup solution. In fact, we adopted it before we went with HPE SimpliVity. At the time it was purely a tape storage solution, but now it also saves backups to a local disk for quick file-level restores. We have partially integrated Veeam, which sees HPE SimpliVity as another virtual host.
Although both perform similar functions, we use them differently. HPE SimpliVity is our primary server restoration tool, but it can also do file-level restores in a pinch. Veeam is for file-level restores, but can also do short-term server restores. It also works with our legacy systems when we have to retrieve a server from our tape archives.
The combination of HPE SimpliVity and Veeam has reduced the time spent retrieving files from days to minutes. It has also reduced server maintenance to a minimum. Everything just works. When there is an issue, HPE SimpliVity’s built-in diagnostic functions trigger an alarm before it escalates into something catastrophic.
We also have more capacity than we need. When a user comes to us with a request for more computing power or more storage, we can spin up a virtual machine and quickly allocate resources. We’re saving even more because we don’t have to go out and spend money on new hardware.
Recently, one of our teams asked for a server with a terabyte of storage, and I immediately signed off on it. A few years ago, I would have needed to sit down and write up a use case, perform a cost analysis, and remap our network to accommodate the new hardware. Now, all I have to do is push a button.
We also don’t need emergency servers any more. We used to keep a few powered-down servers as reserves. We mirrored our systems onto them, and we would switch them on when one of our physical servers failed. HPE SimpliVity allows us to spin up cloned VMs in a matter of seconds.
The Cost of Inaction
I could go on for days about the benefits of a safe, secure, and always available IT infrastructure, but the truth is plain to see. You have a business mandate and a moral imperative to keep your commercial and customer information safe. A data breach is a breach of trust; an ongoing system disruption disrupts your reputation. You cannot afford such failures.
To your executive team, IT infrastructure is not a problem until something goes wrong. If your organization finds itself on the evening news after a malware or ransomware attack, it is too late to contain the damage, even if you can wrestle back control of your network. As an IT professional, you need to stay ahead of the curve, and you need to tell your executive team exactly where things stand.
You cannot sugarcoat it. You cannot tell them there’s a magic pill that will make everything go away. Sure, you can talk about specs, but that will get you nowhere. You need to speak their language, and that means showing them the bottom line, thus revealing the cost of inaction to the organization as a whole.
In the end, everything we do at Metropolitan Thames Valley Housing is about getting people into homes, and that takes money. At some point, we fell behind in our IT spending, and our ability to serve the wider community was compromised. We have now closed that gap, and have set our sights on further enhancing the ways we deliver services to our residents.