As information systems and networks become more important to organizations, the task of making sure they do their jobs is increasingly vital. Since some applications are more important than others, that can mean not only keeping the infrastructure running smoothly, but making sure the things that matter most get priority.
The first step in service-level management is to understand what sort of service you are trying to manage, says Michael Marks, director of service provider market development at Marlboro, Mass.-based Concord Communications, Inc.
“More and more, the service is some application that some end-user is trying to use to make money for the company,” Marks says.
It comes down to employees using applications to get work done, so the focus in managing service levels has been shifting from the technical “speeds and feeds” issues like bandwidth utilization to application performance and the users’ experience. That is “probably the biggest trend we’ve been seeing in the last 18 months or so,” says Gerry Roy, director of network installations for BMC Software Inc. of Houston, a maker of systems management tools. And network administrators are also looking for ways to ensure that the most important applications and users get priority when necessary.
Jay Richards, product manager for the Vantage performance testing suite at CompuWare Corp. in Farmington Hills, Mich., says one good side-effect of the Year 2000 scare is that organizations did evaluations and decided which applications were most important to them. Now networks can be managed with those priorities in mind.
Of course, the underlying “speeds and feeds” still affect how applications run. So managers need to be able to make the connection between the way individual applications perform and the behaviour of various parts of the network. If an application is running slowly, is it because of the application itself, the server on which it runs, a network link between that server and the user’s workstation, or what?
When there is a problem with an application, one of the first tasks is to determine whether it is an application-layer problem or one caused by the network, explains Van Negley, marketing manager for Agilent Technologies Inc., a Palo Alto, Calif., company whose products include network test and monitoring instruments.
To get beyond basic network availability, you need to measure applications and the underlying infrastructure so you can monitor the whole picture, says Marks. “It’s not enough to manage the application,” he says. “You have to understand what’s happening at the infrastructure level.”
One way to do this is with dummy transactions, generated by test systems such as NetAlly, from Viola Networks Inc. of Somerset, N.J. Tony van Kessel, Viola’s regional manager in Canada, likens this to testing a circuit by putting a signal on the inputs and then looking at the outputs.
“You can make the most accurate and meaningful measurements on something if you know what it is that’s expected,” he says. Viola even has a Web-based agent for testing links to remote devices without installing anything on the remote client.
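The dummy-transaction idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not how NetAlly or any other product mentioned here works: the names `ProbeResult` and `run_probe` are hypothetical, and the "transaction" is any callable standing in for a real request to an application.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    ok: bool          # True if the response matched what was expected
    elapsed_s: float  # how long the transaction took, in seconds


def run_probe(transaction: Callable[[str], str],
              request: str,
              expected: str) -> ProbeResult:
    """Send a known input, time it, and compare the output to the expected one."""
    start = time.perf_counter()
    try:
        response = transaction(request)
    except Exception:
        # A transaction that raises counts as a failed probe.
        return ProbeResult(False, time.perf_counter() - start)
    elapsed = time.perf_counter() - start
    return ProbeResult(response == expected, elapsed)


if __name__ == "__main__":
    # str.upper stands in for a real application call.
    result = run_probe(str.upper, "ping", "PING")
    print(result.ok, f"{result.elapsed_s:.6f}s")
```

Because the input and the expected output are both known in advance, a wrong answer and a slow answer can be told apart, which is the point of van Kessel's circuit-testing analogy.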
It’s hard to give useful information to administrators
Lokesh Jindal, director of marketing in charge of service management for Islandia, N.Y.-based Computer Associates International Inc.’s Unicenter systems and network management software, notes that when users spot problems with applications, what they can tell technicians at a corporate service desk or help desk often has little connection to what is actually wrong in the underlying infrastructure. To try to get around that problem, Jindal says, Computer Associates is working on an initiative it calls ServiceAware.
The idea of ServiceAware is that code embedded in the applications themselves would monitor the performance of the infrastructure on which they run.
“If the application sees a degradation in performance somewhere, it will immediately open a ticket or send a notification to the service desk, giving all the details of its current environment and what are the problems it has seen,” says Jindal.
Improvements in productivity aren’t always enough
CA has built the foundations for ServiceAware into the next release of Unicenter, which is due out soon. The company is now talking with applications developers, who must build ServiceAware components into off-the-shelf applications to make the concept work. In-house developers will also be able to incorporate ServiceAware capabilities into applications they create, Jindal says.
“You are expected to deliver better performance at lower cost for a more complex infrastructure,” says Jindal. “Improvements in productivity are probably not going to get you there. What you need is a jump or a paradigm shift.”
About 18 months ago, CompuWare regrouped its network testing products into what Richards calls an application performance suite. The Vantage suite takes data from parts of the network infrastructure and relates it to the overall performance of applications, he says. Richards says the goal, only partly achieved so far, is to get away from the “not-me syndrome” in which IT people simply verify a problem isn’t caused by the piece of the infrastructure for which they’re responsible, then pass the call on to someone else.
BMC’s Patrol Dashboard gathers information from various sources, such as network sniffers, probes and automated dummy transactions, to create a picture of network performance. Administrators can see what response time end-users are receiving, and set the system to notify them if performance strays outside limits they set. When a problem appears, Roy says, Patrol Dashboard can help network managers locate the underlying cause, such as slower response time because of reduced throughput on a Frame Relay circuit.
Roy admits today’s products still aren’t as good as they could be at determining which pieces of infrastructure affect which services to end users. He hopes to eventually see tools that deal with configuration changes on their own, identifying what has changed and how it affects services.
While better ways of finding the causes of obvious problems are fine, what network managers really want is to see the problems coming and fix them before end-users even notice anything is wrong.
“What management means to us is the ability to identify and act on problems before they become service-affecting,” Marks says.
Marks says doing this requires combining historical trends in the performance of the infrastructure with fault management. For instance, he says, it helps first to identify the usual variations in performance, such as the fact that network traffic tends to peak at certain times of day.
“There are natural trends in usage that produce as a result expected variations in response time and in performance,” says Marks. So spotting faults before they affect users depends not just on noticing variations in performance, but on noticing how those variations compare with those you would expect.
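One simple way to compare a measurement with the variation you would expect is to keep a separate baseline for each hour of the day. The sketch below assumes response times in milliseconds tagged with the hour they were taken; the function names and the three-standard-deviation threshold are illustrative assumptions, not something Concord or any other vendor quoted here describes.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_baseline(history):
    """history: iterable of (hour_of_day, response_ms) samples.

    Returns {hour: (mean_ms, stdev_ms)} for hours with at least two samples.
    """
    by_hour = defaultdict(list)
    for hour, ms in history:
        by_hour[hour].append(ms)
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}


def is_anomalous(baseline, hour, ms, k=3.0):
    """Flag a sample that is more than k standard deviations from that hour's norm."""
    if hour not in baseline:
        return False  # no expectation for this hour, so nothing to compare against
    mu, sigma = baseline[hour]
    return abs(ms - mu) > k * max(sigma, 1e-9)


if __name__ == "__main__":
    history = [(9, 200), (9, 220), (9, 210),   # busy morning hour
               (3, 50), (3, 60), (3, 55)]      # quiet overnight hour
    baseline = build_baseline(history)
    print(is_anomalous(baseline, 9, 215))  # within the usual 9 a.m. range
    print(is_anomalous(baseline, 3, 215))  # far outside the usual 3 a.m. range
```

The same 215 ms reading is unremarkable at the morning peak and a warning sign overnight, which is the distinction Marks is drawing.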
Once service management tools identify trends that are going to affect performance, they can tell managers what’s happening and even suggest corrective action. In fact, they may be able to go a step farther than that and take the action themselves. That could mean shifting traffic from an overloaded network link to an alternate route.
“It has to be an automated process, so that way the end user never notices a degradation,” Roy says.
This is akin to the concept of self-managing systems for which IBM last fall coined the term autonomic computing — a reference to the autonomic portion of the human nervous system that handles functions like breathing. The term encompasses a range of things from what is fairly common today, such as servers that fail over automatically to a second processor or disk when one malfunctions, to an ultimate vision in which network managers simply specify performance requirements for applications and the systems manage themselves to meet those targets.
“Managers want products that can auto-tune based on pre-defined policies,” says Roy.
Automated management so far is moving faster on the systems side than in networks, Marks says.
Traffic prioritization hasn’t really taken off
Part of that management job would be allocating available network resources to the most important tasks. If a network link or server goes down, for instance, better that internal applications affecting only employees slow down or even stop than that e-commerce transactions affecting customers suffer.
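As a rough illustration of allocating a scarce resource to the most important tasks first, the sketch below hands out a fixed capacity in strict priority order. The function name, the application names and the numbers are all hypothetical, and real QoS mechanisms are considerably more involved than this.

```python
def allocate(capacity, demands):
    """Grant capacity to applications in priority order.

    demands: list of (name, priority, demand); a lower priority number
    means a more important application. Returns {name: granted}.
    """
    allocation = {}
    remaining = capacity
    for name, _priority, demand in sorted(demands, key=lambda d: d[1]):
        granted = min(demand, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


if __name__ == "__main__":
    # A failed link has cut capacity to 100 units; total demand is 150.
    demands = [("intranet", 2, 80), ("ecommerce", 1, 70)]
    print(allocate(100, demands))
```

With only 100 units available, the customer-facing application gets its full 70 and the internal one makes do with the remaining 30.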
From a network point of view, this means prioritizing traffic. To date, this has been talked about but not widely practiced. Giving priority to voice over IP (VoIP) applications through Quality of Service (QoS) capabilities in networks has had plenty of attention, because VoIP is not very forgiving of network latency. As Viola Networks’ van Kessel says, “as soon as you put VoIP on your network, it immediately and automatically becomes the most critical application.”
Beyond that, Roy says, traffic prioritization has not really taken off. “It seems to be picking up speed, but it doesn’t seem to be growing as fast as we thought it would,” he says.
But service-level management software, which makes it possible to set different service levels for different applications or parts of the organization, is becoming popular.
“It’s no longer ‘is the pipe working, is the switch working,’ and that kind of stuff,” Roy says. “It’s ‘is the end-user getting the kind of performance they want?’”