Doing More with Less: Embracing Automation with Cisco DNA
To achieve more with less effort and expense: That’s the goal of so many organisations, regardless of industry or specialisation. And with the rise of digital transformation, there’s no denying that automation is the way of the future, especially when it comes to cloud infrastructure. Client needs are increasing exponentially, and companies are leaning heavily on the flexibility, speed, and increased capabilities that automation offers.
We are one of those companies. Interxion is a European provider of carrier- and cloud-neutral colocation data centre services. We rent out space in our data centres to our customers, who are big-name enterprises or cloud providers. We have more than 50 data centres across 13 cities in 11 European countries.
Security always tops our list of priorities. We maintain our services as securely as possible because we provide critical infrastructure for customers who harbour critical data for society at large. Flexibility and scalability are also key priorities, because even though we are a large company, our operational team is quite small.
A few years ago, Interxion’s management team declared its intent to double the size of the company within five years without doubling the size of our workforce. To do that, we needed to find a way to scale alongside our customers and keep up with customer demand, all with a limited IT department. The only way to achieve this was to standardise our processes, have a clear blueprint to follow, and then automate as much as possible around those processes.
A System Ripe for Automation
Despite dealing with COVID-19, we’re still moving quickly and building a new data centre in Europe approximately every two to three months. It’s a challenging process, and it was even more challenging when I joined Interxion two years ago.
Back then, there was only one internal network engineer. Now, there are seven team members, including myself, and I’m the manager of the European network engineers. My department is responsible for all the pieces related to the network that makes our data centres operate smoothly, including access, security, surveillance, cooling, heating—any detail that protects and regulates the space so that all customers have to do is rent the space and put their equipment in it.
When I started, we had a lot of end-of-life, end-of-support devices in our data centres around Europe, and we weren’t going to replace them easily overnight. We didn’t have the resources to do so, and management was hesitant to spend a lot of money on maintenance. Replacing old equipment simply wasn’t a priority.
We also lacked operational standardisation for our network and environment. In our efforts to standardise as much as possible, we made huge strides from a security and deployment point of view. This process standardisation was a crucial step in our automation journey, but we still had a long way to go.
On top of that, our data centres need to be 100% reliable. Our business model doesn’t tolerate outages of any kind, including loss of surveillance, cooling, physical equipment, or power. Any outages could be crippling to our customers’ businesses and result in penalties.
This was difficult to maintain, however, given that updating a switch required a physical reboot of the box itself. This meant that the devices connected behind that switch would be down for 10–15 minutes, which was a very tough process to arrange and get approved. With so many data centres and switches—and so few engineers—manually upgrading our infrastructure was essentially impossible.
Upgrade deployment was also cumbersome and error-prone. Often, when we ordered new switches, they would make several stops on their way to us. Once they arrived, my team and I had to manually unbox and configure the switches before sending them to the correct data centre to be installed. This manual process limited our scalability and hindered our growth goals. We had to find a way to improve the way we delivered, monitored, and executed the upgrades while mitigating any possible negative outcomes.
Debating a Brand New Solution
Interxion is a “Cisco house”, so we were looking for a Cisco solution from the beginning. But it was still a lengthy process. When Cisco introduced us to its Cisco Digital Network Architecture (DNA), we liked what we saw, but some people were resistant to using this new product—myself included.
When we started exploring the solution, Cisco DNA was still relatively new, and being an early adopter carries some inherent risks. There weren’t as many peer experts or use cases as there are now. Cisco DNA was designed to manage Cisco devices and the user experience, and it offered the support our systems required. And Cisco DNA Analytics and Assurance would allow us to get greater insights into our network so we could get the most out of it. That all sounded great, but some features were still in development. We would be the ones providing feedback to Cisco about features as they were released and noting which ones needed improving.
Ultimately, it was Cisco DNA’s compatibility with our existing infrastructure, combined with the base features Interxion needed the most (managing updates and deploying zero-touch provisioning of switches), that made it the logical Cisco solution for our needs. We drew a deep breath and took the plunge.
Seeking Support from Cisco CX and Cisco TAC
Cisco DNA was designed and built to make it fast and easy for customers to push an upgrade to their entire site and forget about it. That wasn’t going to work in our case. We couldn’t just press the button and reboot the entire data centre, and even if we could, it was difficult to get the operational expenses approved to use Cisco DNA for our entire environment. Instead, we decided to start small and build our case for Cisco DNA as we moved forward.
With that plan in mind, deployment began quickly. As is the case with any new solution, there were some growing pains in trying to integrate such a new product. While we could have tried to sort it all out ourselves, we thankfully found another option.
Usually, with Cisco solutions, you can turn to the community forums to find answers to your questions, but we decided to speed up our resolution with the help of the Cisco Customer Experience (CX) Team and the Cisco Technical Assistance Center (TAC). Both of these internal Cisco teams have been our partners throughout this deployment and automation journey.
Because we worked so closely with them, they could help us quickly with known issues, and in return, we were able to highlight the product adjustments and troubleshooting that were needed. We were in constant communication with the CX team, Cisco engineers, and Cisco TAC, who worked to address our concerns on everything from creating tickets to receiving appliances not covered by support contracts.
Support from Cisco CX and Cisco TAC prevented many headaches. The puzzles we needed to solve required specific knowledge and access to the back end of the product, which only Cisco TAC has. Both teams saved us so much time, energy, and frustration.
Zero-Touch Provisioning Wins Out
Alongside Cisco TAC and their engineers, we were able to manage upgrades in a controlled, standardised fashion. We were also able to monitor the upgrades with pre- and post-upgrade checks, which help prove that each upgrade succeeded.
There were also two specific use cases, Campus Software Image Management and Network Device Onboarding, where we saw the best results.
Cisco DNA has a feature called Plug and Play (PnP). You enable it on your upstream switch, and when you connect an access switch below it, the two exchange information. That PnP process points the new, onboarding switch to Cisco DNA. If Cisco DNA recognises the switch as part of our environment, it picks the switch up and selects a profile based on its location. Then, it pushes that profile to the switch and provisions it to make it available on the network. If we need to apply a switch-specific configuration, we can access the switch directly over our network and apply it, all without physically touching the switch.
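To make that decision flow concrete, here is a minimal Python sketch of the onboarding logic as described above. It is an illustration only, not the Cisco DNA API: the `Controller` class, its fields, and the serial numbers, site names, and profile names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Toy stand-in for the controller; every name here is hypothetical."""

    # Serial number -> site, for the switches we expect to arrive.
    expected: dict
    # Site -> name of the profile applied to switches at that site.
    site_profiles: dict
    # Serial number -> list of configuration pieces pushed so far.
    provisioned: dict = field(default_factory=dict)

    def onboard(self, serial: str) -> str:
        """Handle a new switch that has been pointed at the controller."""
        site = self.expected.get(serial)
        if site is None:
            # Not recognised as part of our environment: leave it alone.
            return "unclaimed"
        # Recognised: apply the profile that belongs to its location.
        self.provisioned[serial] = [self.site_profiles[site]]
        return "provisioned"

    def apply_override(self, serial: str, config: str) -> None:
        """Add a switch-specific configuration after provisioning."""
        if serial not in self.provisioned:
            raise KeyError(f"{serial} has not been provisioned")
        self.provisioned[serial].append(config)


if __name__ == "__main__":
    controller = Controller(
        expected={"SERIAL-001": "site-a"},
        site_profiles={"site-a": "access-profile-a"},
    )
    print(controller.onboard("SERIAL-001"))
    print(controller.onboard("SERIAL-999"))
```

The point of the sketch is that the outcome depends only on the inventory and the site profile, which is why the same switch model ends up configured the same way at every site.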
This new process saves us a lot of time and also prevents errors. If everything is programmed properly and all the pieces are in place, Cisco DNA will always provision a switch in exactly the same way, regardless of the number of Cisco Catalyst 9K switches being provisioned or the number of sites. No matter how proficient humans are, mistakes happen. Automating these repetitive steps takes those mistakes out of the process.
We can now schedule upgrades for a specific time or site. Pre-checks are built in, and the device will flag anything that doesn’t match, so we can troubleshoot before the upgrade begins. Automating the process through Cisco DNA has standardised the entire upgrade process and brought it under control.
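The idea of gating an upgrade on pre-checks and confirming it with post-checks can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not how Cisco DNA implements it: the function names, the check names, and the callbacks are hypothetical.

```python
def failed_checks(expected: dict, observed: dict) -> list:
    """Return the names of checks whose observed value differs from expected."""
    return sorted(
        name for name, want in expected.items() if observed.get(name) != want
    )


def run_upgrade(expected_pre, read_state, upgrade, expected_post):
    """Run `upgrade` only if every pre-check matches, then verify the result.

    `read_state` is a callable returning the device's current state as a dict;
    `upgrade` is a callable that performs the upgrade. Both are placeholders.
    """
    mismatches = failed_checks(expected_pre, read_state())
    if mismatches:
        # Something doesn't match: stop so it can be investigated first.
        return ("blocked", mismatches)
    upgrade()
    mismatches = failed_checks(expected_post, read_state())
    if mismatches:
        return ("needs_review", mismatches)
    return ("ok", [])


if __name__ == "__main__":
    # Hypothetical device state, kept in a plain dict for the example.
    state = {"version": "old", "free_space_ok": True}

    def do_upgrade():
        state["version"] = "new"

    result = run_upgrade(
        expected_pre={"version": "old", "free_space_ok": True},
        read_state=lambda: dict(state),
        upgrade=do_upgrade,
        expected_post={"version": "new"},
    )
    print(result)
```

Because a failed pre-check returns before `upgrade` is ever called, a device that doesn't match expectations is never rebooted by accident.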
Cisco DNA has done exactly what it was designed to do: automate and simplify. A process that used to take three weeks can now be delivered in three days.
Now, we open a service request in ServiceNow and the switch is shipped directly to the correct data centre. Our team just has to unpack it and connect it to the network, and it’s provisioned automatically.
To date, we’ve reached about 85% of our goal to automate all of our data centres. Unfortunately, certain operations are on hold due to the global pandemic. However, everything is in place so that, once it’s safe to do so, we can forge ahead.
The Next Level: Assurance and Analytics
I joined Interxion at an exciting time, even though it was challenging. I’ve been able to experience all this growth firsthand, and have a say in developing processes to handle it. The speed with which we can now deploy and make our data centres operational is vital for our business and has a significant impact on our customers’ experience.