Multi-cloud monitoring keeps Q2 integrated operations center humming
Five years ago, Q2 had 240 servers. Today, it has 8,500 servers. The company spent $150 million over the last five years building out its infrastructure, where it now hosts more than 4 petabytes of user data.
“We’ve grown from 1.2 million users to 11.5 million users and reduced downtime to one-fifth of what it was during that same period,” says Lou Senko, CIO of Q2, which is headquartered in Austin, Texas, and provides a digital banking platform for banks and credit unions.
Q2’s cloud-based platform is aimed at helping smaller, community-based financial institutions compete with giants such as Bank of America, Wells Fargo and Citigroup. “Local financial institutions have to compete against some big, big players,” Senko says. “It’s our technology that levels the playing field in the digital world.”
Smaller banks can gain access to features consumers want – such as remote deposit, mobile enrollment, financial management and person-to-person payments – and Q2 handles the backend integration, security and performance.
“Weaving that together to create one seamless user experience is what we bring to the table,” says Senko, whose domain includes risk management and assessment, compliance and regulatory management, product development, disaster recovery and incident response.
From a technology standpoint, one of the biggest challenges is tying banks’ legacy backend systems to new applications and exposing them to the internet while maintaining security, compliance and availability. Uptime is critical, obviously. Q2 balances its efforts between increasing resiliency to avoid downtime and streamlining remediation for the inevitable hiccups. “Things are still going to break. So how do you have less downtime? How do you monitor? How do you remediate faster?” he says.
The company rearchitected its platform over the last few years, Senko says. “We were a .Net shop, three-tier architecture, SQL backend,” he says. Q2 made the transition to an open-source microservices architecture enabled by container technologies. “We’ve totally embraced containers and orchestration,” he says.
The goal is agility.
“I can’t sit down with my development department and plan for a year out,” he says. “’What are the things that you’re going to use in the next 12 months? Let’s plan all that out and let’s work with our vendors to make sure we’re covered.’ That’s not the way it happens. It’s very, very rapid here.”
Small development teams work fast to deliver new features, often taking a new idea from conception to production in as little as 60 days.
“All these technologies that we can quickly download from the internet, rapidly prototype, take through QA and then resiliency and scale testing, and then it goes through DevOps, and then boom, into production,” Senko says. “What will show up in six months wasn’t even thought of six months ago.”
As Q2 has grown its platform and quickened its development pace, it has outgrown a number of monitoring tools that couldn’t keep up. One that helped Q2 get to the next level is LogicMonitor. The vendor’s SaaS-based performance monitoring platform is designed for on-premises, public cloud and hybrid IT infrastructure. “They’ve been able to keep pace with us,” Senko says.
As Q2 onboards new applications, LogicMonitor ensures they’re discoverable and monitorable. It also allows Q2 to consolidate monitoring environments behind a single pane of glass. The LogicMonitor team is nimble, expanding monitoring coverage when Q2 needs it, so IT can stay one step ahead of its development partners, Senko says.
“It’s not only part of our production-ization of new things that come in, but also it’s our backstop should stuff show up that we didn’t know about.”
As Q2 has strengthened its tooling arsenal, it also has changed how it views monitoring and troubleshooting. To take the place of its network operations center (NOC), Q2 built an integrated operations center (IOC). The shift to an IOC brought together more skilled personnel working on more complex issues. When Q2 ran a NOC, issues were often escalated outside the NOC to a DBA or network pro or server team. The IOC brings together all those experts, so that the people who are solving the problem are working alongside the teams that are responding to alerts and monitoring, Senko says.
The IOC has 33 full-time employees and two managers. Now 87% of the issues that are picked up by monitoring tools are solved from within the IOC, he says.
The organizational shift elevated incident management to a more vital, higher-profile discipline at Q2. “We rebranded it,” Senko says. “It really is a showcase. It’s not the stepchild of IT anymore. It’s the heart and soul that watches all this stuff.”
In addition to investing in the IOC buildout, Q2 is prioritizing employee training and engagement. “As we keep evolving the technology platform, we keep the employees growing and evolving with us. We spend a ton of money on training and career development.”
Retaining IT employees who understand the business is critical, Senko says.
“Their knowledge of the customer and how this stuff is used, and their connection to Q2’s mission is something that you can’t easily replace. You can teach people new technology. You can’t teach them that connection.”