Users don’t care if server availability is 99.999 per cent, 99.99 per cent or 95 per cent. Nor do they care about response time. They care whether their applications work when they want them.
Recognizing that, the Regional Municipality of Niagara has evolved its service level agreements in recent years to focus on the availability of applications, not statistics about hardware availability. And that’s the priority in managing the municipality’s infrastructure, says Bob Diakow, director of information systems.
“It’s really from the perspective of the user and meeting commitments to them,” he says.
The goal is to monitor application performance and availability, which is what users see, and when something goes wrong find the underlying cause and fix it as quickly as possible — or better yet, see the problem coming before users do.
“If we’re doing our management work correctly,” says Bill Dupley, business solutions manager at Hewlett-Packard (Canada) Ltd. in Mississauga, Ont., “then you, as a user on the screen, should never see a deterioration in performance.”
It’s not always easy. In today’s increasingly complex information systems environments, “it could be any of a number of pieces that could be affecting the availability of the thing that the user needs to get their work done,” Diakow says. Hardware, software, local-area and wide-area networks, Internet access: the problem could be in many places.
What IT organizations need is to see the whole picture in one place and identify the root causes of problems quickly. “That’s I think the promise of some of the larger integrated packages,” Diakow says. “So far I think that’s more promise than reality.”
Niagara has assorted systems, network and storage management tools, including some components of Computer Associates International Inc.’s Unicenter, Ipswitch Inc.’s What’s Up Gold network monitoring package and some other standalone tools.
“From the investigation we’ve done so far,” Diakow says, “the ability to get one product that does it all takes a great deal of time, money and resources.” And while it’s possible to get best-of-breed packages to work together, “you do have to do some programming or some scripting to make that happen, so you have to look at it and say is it worth it to me?” he says. “In most cases, you live with it the way it is.”
MOM’s the word
Alliance Atlantis Communications Inc. has servers and networks spread across multiple continents, the majority running Windows 2000 or 2003, with a few Unix boxes and IBM AS/400s. The Toronto-based entertainment company recently moved to Microsoft Corp.’s Microsoft Operations Manager (MOM) to monitor all servers except the AS/400s, says John Kemp, director of computer operations and infrastructure. Operators still monitor the AS/400s locally using internal monitoring tools. Kemp hopes to tie them in with MOM next year.
MOM’s global monitoring capabilities help the centrally located IT department keep track of distributed systems, but it is not the whole answer, Kemp says. Data from MOM and other tools must be combined to support service level management and other tasks. Alliance Atlantis hopes soon to pull together more functions such as help desk and configuration management so IT staff will be able to see a more complete picture in one place.
The Halton District School Board in Burlington, Ont., moved to Unicenter in search of a more unified view of its IT infrastructure, says Fernando Pinho, network operations and telecommunications manager. The board also has Service Desk, a former IBM product that was sold to Peregrine Systems Inc. in 2000. Ultimately, Pinho says, the goal is to get to a single management console.
Vendors of infrastructure management tools say they understand the need, and in fact have addressed many integration needs and are focusing on requirements most IT managers aren’t even thinking about yet.
Bob Madey, vice-president of strategic and market management at IBM Corp.’s Tivoli software unit, says it’s not that the integration of infrastructure management tools is increasing, but that the type of integration is changing. Integration has existed for some time, Madey says, in that alerts from various monitoring agents are brought together at a single monitoring console where they can be correlated and the root cause extracted.
The trouble is, he says, that this “bottoms-up” approach doesn’t work as well in a more distributed computing environment.
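The console-side correlation Madey describes can be pictured with a small sketch. It is an illustration only, not how Tivoli or any other product mentioned here works: the dependency map, the component names and the root_causes function are all hypothetical. Given a set of components that are raising alerts, it keeps only those with no alerting component underneath them.

```python
from typing import Dict, List, Set


def root_causes(depends_on: Dict[str, List[str]], alerting: Set[str]) -> Set[str]:
    """Return the alerting components that have no alerting dependency beneath them."""

    def has_alerting_dependency(component: str, seen: Set[str]) -> bool:
        # Walk the dependency graph; 'seen' guards against cycles.
        for dep in depends_on.get(component, []):
            if dep in seen:
                continue
            seen.add(dep)
            if dep in alerting or has_alerting_dependency(dep, seen):
                return True
        return False

    return {c for c in alerting if not has_alerting_dependency(c, {c})}


# Hypothetical topology: each component lists what it relies on.
DEPENDS_ON = {
    "payroll-app": ["app-server", "database"],
    "app-server": ["lan-switch"],
    "database": ["storage-array", "lan-switch"],
}

# Four agents report trouble at once; only one has nothing failing below it.
alerts = {"payroll-app", "app-server", "database", "lan-switch"}
print(root_causes(DEPENDS_ON, alerts))
```

The sketch depends on an accurate dependency map, which is exactly what becomes harder to maintain as applications spread across more infrastructure.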
A monitor drill-down
In the past, Madey says, an application usually ran on one machine, and “if you kept the (hardware) up and running, chances are the application was up and running.” Today, applications may be spread across a lot of infrastructure. You can’t monitor this type of application effectively in the old way, Madey says. Instead, you either take an outside-in approach, in which you follow real or simulated transactions through the system, or an inside-out approach, in which a monitor right on the application server watches everything going in and out. Or you do both, though Madey says that may consume too many resources.
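As a rough illustration of the outside-in idea, the sketch below runs a simulated transaction step by step, times each step, and reports whether the transaction completed and stayed within a response-time budget. Everything in it is an assumption made for the example: the step names, the placeholder step functions, the five-second budget and the run_synthetic_transaction helper do not come from any product named in this article.

```python
import time
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_synthetic_transaction(steps: List[Step], budget_seconds: float) -> Dict[str, object]:
    """Run each step in order, timing it, and stop at the first failure."""
    timings: Dict[str, float] = {}
    failed_step = None
    start = time.perf_counter()
    for name, action in steps:
        step_start = time.perf_counter()
        try:
            action()
        except Exception:
            failed_step = name
        timings[name] = time.perf_counter() - step_start
        if failed_step is not None:
            break
    total = time.perf_counter() - start
    return {
        "available": failed_step is None,
        "within_budget": failed_step is None and total <= budget_seconds,
        "failed_step": failed_step,
        "timings": timings,
        "total_seconds": total,
    }


# Placeholder steps; a real probe would drive the application the way a user does.
def log_in() -> None:
    pass


def run_query() -> None:
    pass


def broken_query() -> None:
    raise RuntimeError("simulated database timeout")


healthy = run_synthetic_transaction([("log_in", log_in), ("query", run_query)], budget_seconds=5.0)
degraded = run_synthetic_transaction([("log_in", log_in), ("query", broken_query)], budget_seconds=5.0)
print(healthy["available"], degraded["failed_step"])
```

The per-step timings point to where the transaction slowed or failed, but not why; that still takes a look inside the affected resource.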
Having identified a problem with an application and traced it to a particular resource, though, it still may be necessary to dig down into the infrastructure to find out exactly what’s wrong with that device. This, Madey says, is where tools for monitoring what users see must work with tools for analysing what happens inside the boxes.
And it’s also where different parts of the IT organization need to work together better. Dupley calls it a major challenge in IT governance.
Traditionally, he says, purchase decisions about systems management tools have been made low in the IT organization. Specialists in various areas chose the tools they needed to do their jobs. But now, there’s a growing need for an overall infrastructure management architecture. “You’ve moved from just giving me some data to building a process control system,” Dupley says. “To do what we’re talking about you have to actually have a system management architecture.”
Practice best standards
One effort in this direction is the Information Technology Infrastructure Library (ITIL), a set of standards for IT service delivery developed by the British government’s Central Computer and Telecommunications Agency. Alliance Atlantis is in the process of implementing ITIL. Kemp says that when he joined the company in April, he found its IT processes needed improving, and felt a best practices standard such as ITIL would help.
The challenges are not all technical, though. An overall infrastructure management architecture removes decisions from individual IT people.
“You’re taking authority and autonomy away from all these people that used to just choose their own tools, and they don’t go quietly into the night,” Dupley says.