Years of acquiring specialised IT management tools have left some enterprises with too many tools and too much overlap. It might be time to clean house.
In the process, companies can shed tools that perpetuate the tendency to monitor specific technology silos, opting instead for tools that encourage a shift to broader IT service management.
"Consolidating tools offers many benefits, including real tool integration, better service perspective, quicker incident resolution and higher service reliability," says Glenn O'Donnell, senior analyst at Forrester Research.
One company that has been there and done that is Oracle, which just went through an ambitious project to decommission piecemeal network monitoring and management tools in favor of a single, consolidated system.
"The real driving force for us was the savings on integration," says Craig Yappert, senior director of enterprise monitoring solutions at Oracle. "It's not that the tools we had weren't capable. But if I had to continue to integrate six or seven different tools into my business process every time something changed, I was taking a huge hit in time and valuable people resources to do that integration."
Oracle chose a platform from Monolith Software. The company initially considered using Monolith as a syslog management tool, but after a demonstration, Oracle started thinking about Monolith as a replacement option for its existing patchwork of network management infrastructure.
"At some point in time, you're looking at another tool, and you come to the realisation that another tool is going to be the straw that breaks the camel's back," Yappert says.
For Oracle, a decade of fast growth led to its decision to re-architect its network monitoring and management approach. During the last 10 years, Oracle acquired more than 60 companies, doubled the number of network devices being monitored, increased international bandwidth by a factor of 30 to 50, and saw its network management workload (measured in tickets initiated and new technologies monitored) grow 70%.
End users dependent on Oracle's systems include 85,000 employees worldwide, plus 4.5 million people who use the hosted Oracle on Demand business applications.
"Over 10 years of doing network and systems monitoring in our IT organisation, we, like most organisations, gathered up tools like we needed another tool to do every little specific thing," Yappert says. Oracle was using HP OpenView for network monitoring, IBM Netcool for some performance and event consolidation, MRTG for measuring network traffic, and homegrown and open-source tools for other tasks.
A forklift replacement of existing technologies is no easy decision, but Oracle justified it with cost-saving opportunities related to licensing, headcount, hardware, and annual software and hardware maintenance.
"We've seen a tremendous reduction in our infrastructure and in the costs associated with that infrastructure," Yappert says. "By the time we turn off our HP OpenView, our IBM Netcool and our open source tooling, we will have removed approximately 20 servers from our mix. We've replaced them with half as many servers within the Monolith space."
However, the real driver was less about cost savings and more about the chance to architect a better system. One thing that stood out about Monolith's architecture is that it gives Oracle complete access to the data being gathered and maintained.
"Getting at the data that was in our previous tools was quite difficult, to be honest. Oftentimes the data that was coming in would be stored in a very proprietary format. The vendors didn't want us to have direct access to it, because having direct access to the data sort of removed the need for their layer, their interface or console," Yappert says.
"Monolith is a great engine. It collects the data and organises it. But more importantly, it allows us to get at the data so that we can merge it with other things and provide that bigger business context. We couldn't provide that higher value context before. It was just too difficult to bring all that information together."
IT can now run reports to show network performance – such as latency, bandwidth, voice quality scores and device availability – as it relates to a specific line of business or geographic region. For instance, Oracle in the past didn't have a view of all the network components that supported its financial and HR system. "Now we're able to tag and organise those devices, and the events that happen against those devices, around that business grouping," Yappert says. When a switch, router or firewall has a problem, "we have an understanding of what's impacted by that, at a business level."
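The tagging idea Yappert describes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the device names, the business groupings and the `impacted_groups` function are hypothetical and do not represent Monolith's data model or Oracle's implementation.

```python
from collections import defaultdict

# Hypothetical inventory: device name -> business groupings it supports.
DEVICE_TAGS = {
    "core-rtr-01": {"finance", "hr"},
    "edge-fw-02": {"finance"},
    "access-sw-17": {"hr"},
}


def impacted_groups(events, device_tags=DEVICE_TAGS):
    """Organise (device, message) events by the business grouping they affect."""
    impact = defaultdict(list)
    for device, message in events:
        # Devices without a tag are collected under "untagged" so they stay visible.
        for group in sorted(device_tags.get(device, {"untagged"})):
            impact[group].append(f"{device}: {message}")
    return dict(impact)


events = [
    ("core-rtr-01", "high latency"),
    ("edge-fw-02", "link down"),
    ("lab-sw-99", "fan failure"),
]
report = impacted_groups(events)
for group, messages in sorted(report.items()):
    print(group, messages)
```

Keeping the tags in a lookup separate from the event stream means a device can be regrouped without touching how events are collected.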
Oracle is also merging network data with information it collects from other sources such as its fixed asset system, which aggregates data related to device specifications, purchase information, deployment location and maintenance contract status. With this type of data available, a network support staffer who's responding to a trouble ticket can weigh whether it's time to replace a device based on its age or maintenance contract, for instance, Yappert says.
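As a rough illustration of that kind of merge, the sketch below joins a device named in a trouble ticket with a record from an asset system and suggests whether replacement is worth weighing. The record fields, the five-year age threshold and the `replacement_hint` function are all hypothetical assumptions, not details of Oracle's fixed asset system.

```python
from datetime import date

# Hypothetical asset records keyed by device name (fields are assumptions).
ASSETS = {
    "edge-fw-02": {"purchased": date(2003, 5, 1), "contract_active": False},
    "core-rtr-01": {"purchased": date(2009, 2, 1), "contract_active": True},
}


def replacement_hint(device, today, assets=ASSETS, max_age_years=5):
    """Return a short hint for support staff based on age and contract status."""
    record = assets.get(device)
    if record is None:
        return "no asset record"
    age_years = (today - record["purchased"]).days / 365.25
    reasons = []
    if age_years > max_age_years:
        reasons.append(f"older than {max_age_years} years")
    if not record["contract_active"]:
        reasons.append("maintenance contract lapsed")
    if reasons:
        return "consider replacement: " + ", ".join(reasons)
    return "keep in service"


today = date(2010, 6, 1)
print(replacement_hint("edge-fw-02", today))
print(replacement_hint("core-rtr-01", today))
```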
Oracle's $7.4 billion acquisition of Sun tested the network monitoring system's integration capabilities – and it passed with flying colors. "Within a 45-day period we were able to integrate all of Sun's networking infrastructure into our monitoring model," Yappert says. "We could not have done that with our preexisting model."
A key challenge as Oracle considered its deployment was determining if the Monolith platform could scale to the required size. Oracle ran an evaluation using 5,000 network devices (about one-third of its total network devices). Then as the company added thousands more devices, "you start to see some scary numbers on the amount of data you're collecting, and how often you're pinging," Yappert says. "There were some struggles there," he says, but none that Monolith and Oracle couldn't resolve.
In the big picture, there's a lesson to be learned about making the right choices about what assets to monitor and how deeply to monitor them, Yappert says.
"There's this general desire to monitor everything at the deepest level possible, everything needs to be monitored for performance, everything needs to be monitored for fault, everything needs to be monitored for configuration," he says. But not every asset requires that level of oversight.
"It may be necessary to monitor routers, firewalls and load balancers at a very deep level for performance reasons, because those tend to be performance control points on a network. But it may not be necessary to monitor access switches, for instance."
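One way to picture that advice is a simple policy that assigns monitoring depth by device role. This is only a sketch under stated assumptions: the role names, the choice of fault-only monitoring for non-control-point devices and the `monitoring_profile` function are hypothetical, not Oracle's actual policy.

```python
# Monitoring categories named in the article: fault, performance, configuration.
DEEP = frozenset({"fault", "performance", "configuration"})
# Assumption: lighter-weight devices are still watched for faults only.
LIGHT = frozenset({"fault"})

# Roles the article describes as performance control points on a network.
CONTROL_POINT_ROLES = {"router", "firewall", "load_balancer"}


def monitoring_profile(role):
    """Return the set of monitoring categories to apply to a device role."""
    return DEEP if role in CONTROL_POINT_ROLES else LIGHT


for role in ("router", "firewall", "load_balancer", "access_switch"):
    print(role, sorted(monitoring_profile(role)))
```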
His other advice is to make sure that the data you end up with – specific performance charts and graphical representations of network activity, for instance – align with the needs of the people who will be consuming them.
"Oftentimes monitoring and management are done in isolation from the rest of the business," Yappert says. Companies that are considering re-architecting their management infrastructure need to ask themselves: What are the business processes that are going to consume this data? What other information do you need? How are you going to integrate the data?
"It's easy to overwhelm yourself with too much information in the network that just doesn't mean anything."