Mission (Not) M-PESAble: Managing a Critical Telecom Migration

Enghouse Networks

Migrating a platform is always challenging. For telecommunications companies, it’s even more of a challenge because customers need always-on connectivity. A successful migration plan requires the right tools, the right approach, and the right partner to minimize or prevent service interruptions. At Safaricom, we found the trifecta that led to a painless transition.

I’m a Senior VAS Planning and Evolution Engineer at Safaricom PLC. Based in Kenya, we are East Africa’s largest telecommunications company, serving 42 million customers. We were the first Kenyan operator to roll out 3G connectivity, and one of our standout achievements is the launch of M-PESA, the mobile phone-based money transfer, payment, and micro-financing service.

I’ve been with Safaricom since 2014 and oversee capacity planning and project management to ensure the timely delivery of software updates, new products, and infrastructure upgrades. I help chunk projects into manageable deliverables for our operations and DevOps teams, write and proof the documentation in our knowledge base and reports, manage SLAs, and oversee RFPs.

My job is to get the best from our people, technology, and vendors so we can offer cutting-edge value-added services to Safaricom customers. These include M-PESA, Baze Video-on-Demand services, Beatz music streaming, and our Bonga Points loyalty program.

Inadequate Bare-Metal Infrastructure

Like many telecommunications companies, we are migrating our infrastructure from bare metal servers to the private cloud. It was never a matter of if, but when. However, in recent years, the disadvantages of staying with traditional infrastructure started to add up. Maintaining and upgrading our bare metal servers and switches was painful and expensive. Our hardware shipped from Ireland, so if we received a defective switch or hard drive, we had to send everything back and wait weeks for a replacement.

We had service agreements for maintenance because the VAS team doesn’t have the time or expertise to install, upgrade, and maintain hardware-based servers. Unfortunately, outsourcing maintenance created a lot of paperwork and caused further delays as we worked with remote and on-site assistance teams to resolve hardware issues.

Our biggest stumbling block was scalability. When you run out of capacity on a bare-metal server, your only option is to buy another server. The only way around this was to buy more servers than we needed. For example, if we needed 25 servers, we would deploy 30 servers with the same specs, all of them taking up space, consuming power, and generating heat at our data centers. We needed a rack of SMSC servers at each of our sites. We couldn’t deploy certain products without exceeding our electricity budget or the space constraints at our data centers. It was time to move to the cloud.

We Partnered with Enghouse to Move to the Cloud

In March 2020, we were ready to make the change. We initially embarked on an RFP process to find a partner who could transition our infrastructure from bare metal to a private cloud. One of our top criteria was compatibility with our existing messaging platform, which runs on Enghouse Core SMSC. We’ve been an Enghouse customer for years and have built our SMS platform on the company’s products. These include Broadcast Manager to run SMS-based marketing campaigns, SmartGuard for spam and fraud prevention, Diameter Signalling Management for mobility and charging control, and Messaging Gateway to consolidate and deliver application traffic. 

We rely on the Enghouse ecosystem for its ease of use and interoperability. Nevertheless, we performed our due diligence and reached out to multiple suppliers for a quote. Unsurprisingly, Enghouse offered a solution that met our needs in terms of pricing and functionality. More importantly, the company laid out a migration strategy that was transparent to our internal teams and customers.

With the help of Enghouse engineers, we migrated our messaging system and M-PESA to our cloud-based infrastructure one component at a time. We didn’t have to rehearse or pilot the migration, shut down services overnight, or reroute traffic for a couple of hours after every deployment to test the new infrastructure. Instead, we backed up the apps and data from each component, copied them to our new infrastructure, and ran the old and new setups in parallel. When we encountered an issue, we rolled back that one component instead of having to rebuild our entire messaging platform.
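The component-by-component pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling Safaricom or Enghouse actually used: the `Component` class, the `healthy_on_cloud` check, and the component names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Component:
    name: str
    # Hypothetical health check run against the new (cloud) copy while
    # the old bare-metal copy keeps serving traffic in parallel.
    healthy_on_cloud: Callable[[], bool]


@dataclass
class MigrationLog:
    migrated: List[str] = field(default_factory=list)
    rolled_back: List[str] = field(default_factory=list)


def migrate(components: List[Component]) -> MigrationLog:
    """Migrate components one at a time; roll back only the one that fails."""
    log = MigrationLog()
    for component in components:
        # Backup and copy steps would happen here (omitted in this sketch).
        if component.healthy_on_cloud():
            log.migrated.append(component.name)
        else:
            # Only this component reverts to bare metal; the rest are untouched.
            log.rolled_back.append(component.name)
    return log


components = [
    Component("broadcast-manager", lambda: True),
    Component("messaging-gateway", lambda: False),  # simulated failure
    Component("smsc", lambda: True),
]
log = migrate(components)
print("migrated:", log.migrated)
print("rolled back:", log.rolled_back)
```

The point of the sketch is that a failure is contained to a single entry in the list: the other components stay migrated, and the failed one can be retried later.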

Our Modularized Approach Succeeded

This modularized approach succeeded beyond our wildest expectations. Every minute of downtime erodes customer confidence, results in lost revenue, incurs service costs, and ties up our in-house and contracted technicians. Thanks to Enghouse, we migrated our critical components to the cloud with zero downtime and with no changes to the way services were managed or delivered. It was like replacing a car’s engine while it was still running.

Our customers and most of Safaricom’s team didn’t even realize we switched from bare metal servers to the cloud. Migrating components one at a time meant we didn’t have to flip a switch and hope everything worked. We had already tested every component before deploying it, so each new part of our messaging platform was ready to go each time we made a change.

More importantly, our customers enjoyed uninterrupted access to M-PESA. For Kenyans, M-PESA is much more than a mobile payment service; it also serves as a branchless banking service. If it goes down, people don’t have access to their money, can’t pay their hospital bills, or risk having their cars clamped should they fail to pay a parking ticket. It’s a part of Kenya’s economic fabric and an essential service that we strive to keep running at SLA levels much higher than 99.9%.
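To put that availability figure in perspective, here is a small back-of-the-envelope calculation of how much downtime per year different targets allow. The 99.99% and 99.999% targets are shown for comparison only and are assumptions, not Safaricom's published SLA figures.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_pct: float) -> float:
    """Maximum minutes of downtime per year allowed at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


for target in (99.9, 99.99, 99.999):
    budget = downtime_budget_minutes(target)
    print(f"{target}% availability allows about {budget:.1f} minutes of downtime per year")
```

At 99.9%, the budget is roughly 525.6 minutes, or under nine hours a year. Every extra "nine" cuts that allowance by a factor of ten.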

We Now Have Room to Grow

Our private cloud infrastructure has given Safaricom room to grow. We can spin up servers at the push of a button, adding or subtracting infrastructure as needed while keeping an eye on costs. We are currently adding to our M-PESA infrastructure and separating it from our other messaging infrastructure, and we can do so without worrying about running out of capacity.

Our transaction processing system (TPS) typically handles over 50,000 messages per second, and with bare metal servers, we had just enough capacity to meet this demand. However, network traffic rises sharply during occasional events like elections, and we see daily spikes at times like the end of the school day when children tend to text their parents. We no longer have to run physical servers 24/7 to meet these increased demands. Instead, we deploy virtual machines as needed, substantially reducing software licensing and electricity costs.

We can also test new Enghouse solutions and system upgrades before deploying them because we now have a dedicated test environment that we can scale up and down as needed. We can trial new Enghouse products and updates to our existing applications in this separate environment without breaking or bogging down our production environment. We can validate patches, new applications, and components like OMC monitoring software and new messaging services before rolling them out to our customers, thus minimizing the risk of downtime.

The Best Partner Makes Change Easy

Over the years, Safaricom has built a partnership based on trust with Enghouse. No project has proven too big or complex for the company. A few years ago, we migrated M-PESA from its original servers in Germany to one of our Kenyan data centers. Enghouse set up concurrent instances of M-PESA during the transition and routed network traffic to both until we could switch over to our internal servers. Their engineers worked with our team to ensure seamless service for our customers.

Enghouse products are also incredibly stable. Their Messaging Gateway is the heart of our SMS infrastructure, and it rarely, if ever, goes down. When there is a problem, it has built-in redundancies, and the failsafe mechanism kicks in. If an MPU messaging node fails, traffic is routed to another node, and the system generates an alert. The gateway is so reliable that we haven’t experienced a single P1 incident (a service outage affecting more than three percent of our users) in the last three years.
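The failover behaviour described above (route to another node, raise an alert) can be illustrated with a minimal Python sketch. It is a generic example of the pattern, not Enghouse's implementation; the `route` function, the node names, and the alert text are hypothetical.

```python
from typing import Dict, List, Optional


def route(nodes: Dict[str, bool], preferred: str, alerts: List[str]) -> Optional[str]:
    """Return the node that should receive traffic, failing over if needed.

    `nodes` maps a node name to True (up) or False (down).
    """
    if nodes.get(preferred, False):
        return preferred
    # Preferred node is down: record an alert, then pick the first healthy node.
    alerts.append(f"node {preferred} unavailable")
    for name, is_up in nodes.items():
        if is_up:
            return name
    return None  # No healthy node left; this would be a real outage.


nodes = {"mpu-1": False, "mpu-2": True, "mpu-3": True}
alerts: List[str] = []
chosen = route(nodes, "mpu-1", alerts)
print("traffic routed to:", chosen)
print("alerts:", alerts)
```

Traffic keeps flowing as long as at least one node is healthy, and the alert gives the operations team a record of the failed node to investigate.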

Working with Enghouse also means staying within a single ecosystem. When we decide to add infrastructure or deploy new messaging applications, we don’t have to seek out another vendor or purchase a new system. Instead, we can add a new module or functionality. It’s a way of keeping down costs, reducing complexity, and working with a company that offers powerhouse solutions out of the box and room to grow as our customer base grows and our users’ needs change.

There’s nothing fancy about text messaging, but it has to work where and when you need it. As we’ve shown with M-PESA, it can serve as the foundation for transformative services that benefit entire populations and become part of a nation’s social fabric. Working with Enghouse gives Safaricom the peace of mind, stability, and confidence we need to continue offering Kenyans and East Africans cutting-edge telecommunications products and services for years to come.