Mobile Environments Add New Wrinkles to Service Level Management

Brad Stone
President, Aspirin Software
October, 2003

Information Technology departments want to measure the effectiveness of their computing infrastructure. This means measuring both service performance and availability. Enterprise performance management vendors have recognized in recent years that it is no longer sufficient to report component-level performance and availability statistics. Customers expect to see service-level metrics that directly map to end user experience. Thus system availability statistics have been replaced by service availability, and raw performance data such as the CPU utilization on a particular server has been replaced by end user response time data. As more and more corporate workers become mobile, the customer requirements for performance management software are going to change once again.

Corporations are mobilizing their workforce in different ways and for different reasons. Business executives use laptops together with virtual private networks to access corporate data while traveling. Workers in warehouses use scanner-enabled PDAs to work more efficiently and to reduce the errors associated with manually entering inventory data. Field service representatives are taking PDAs on the road so that they can receive new work orders in real time without having to return to the office. Measuring performance and availability service levels for these mobile users requires new methodologies and metrics.

In the past, service availability would be measured indirectly by measuring the availability of each component being used to deliver the service. Component availability might be measured by something as simple as a ping test. There was typically not enough data collected and maintained to convert this into service availability data. A couple of different approaches have been taken by vendors claiming service level availability metrics. Some companies have made a business out of testing a company's Web-based services from multiple remote locations. Another approach has been to create synthetic transactions that mimic an end user accessing a service. Adequate tests now exist for service availability in wired environments, but it is unclear how appropriate they will be for wireless and mobile environments.
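The gap between component-level and service-level availability can be illustrated with a short Python sketch. The function names and sample figures are hypothetical and chosen only for illustration. The component estimate assumes the components sit in series and fail independently, which a real deployment may not satisfy.

```python
from math import prod


def availability_from_components(component_availabilities):
    """Indirect estimate: multiply per-component availabilities.

    Assumption (not from the article): every component is required
    for the service and components fail independently.
    """
    return prod(component_availabilities)


def availability_from_probes(probe_results):
    """Direct estimate: fraction of synthetic transactions that succeeded.

    probe_results is a sequence of booleans, one per synthetic test run.
    """
    if not probe_results:
        raise ValueError("no probe results to evaluate")
    successes = sum(1 for ok in probe_results if ok)
    return successes / len(probe_results)


# Hypothetical figures: a router, a web server and a database,
# each reported as 99% available by a ping-style test.
estimated = availability_from_components([0.99, 0.99, 0.99])

# Hypothetical outcomes of eight synthetic transactions.
measured = availability_from_probes(
    [True, True, True, False, True, True, True, True]
)
```

In this made-up case the component estimate is about 97 percent while the synthetic transactions succeed 87.5 percent of the time, which is the kind of discrepancy that pushes customers toward service-level measurement.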

In a wireless environment, one will want to actively test the wireless networking infrastructure (access points and wireless switches), as well as any wired components that are involved in service delivery. However, in many cases it may be inappropriate to actively probe the availability of the mobile device itself. Querying the device can adversely affect device performance and can drain the battery. Also, truly mobile workers may often be out of radio coverage, so an availability test failure may not have any meaning. Availability tests of mobile devices may therefore be useful only when run reactively during troubleshooting.

End user response time is an important service level performance metric. A variety of vendors have developed performance management solutions appropriate for wired environments. For e-business services, one approach has been to install software on end user systems and measure performance directly. Another approach has been to measure the performance of synthetic tests run at predefined intervals.

The problem with measuring wireless performance through synthetic tests is the problem with synthetic tests in general: there is no guarantee that they accurately reflect what a real user is experiencing. In e-business environments, network topology is a common cause of measurement differences; the synthetic test runs from within the corporate intranet, while the end user accesses the service from an unknown Internet location. In wireless environments, significant performance differences can occur even if the network paths are the same for both the test and the end user. Two mobile devices could be connected to the same access point, yet radio interference could give them radically different response times because of lost network packets and retransmissions.

The apparent solution would be to measure performance with agents running directly on the mobile devices. This is possible, but the data collection creates problems unique to mobility. The available memory and processing power on the device limit the sophistication of a management agent. Also, devices will be periodically powered off to save battery life, which leads to sporadic data collection intervals. However, if the user is actively performing business transactions, then the data collection should be active as well. Another issue is that devices may be out of radio coverage at unpredictable times, resulting in irregular transmissions to a management server. The limited bandwidth of wireless environments may further restrict the transmission of performance data. The management server should be prepared to smooth the statistical data to create meaningful graphs for the customer.
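One way a management server might smooth irregularly reported data is sketched below in Python. This is an illustration under stated assumptions, not a description of any vendor's product: the function names, the fixed-width bucketing and the exponentially weighted moving average are all choices made for this example. Samples are assumed to arrive as (timestamp in seconds, response time in milliseconds) pairs.

```python
from collections import defaultdict


def bucket_means(samples, interval_s):
    """Average irregular samples into fixed-width time buckets.

    Buckets with no samples (device powered off or out of coverage)
    are reported as None so the gap stays visible on a graph.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket start -> [sum, count]
    for timestamp, value in samples:
        start = int(timestamp // interval_s) * interval_s
        totals[start][0] += value
        totals[start][1] += 1
    if not totals:
        return []
    series = []
    start, last = min(totals), max(totals)
    while start <= last:
        if start in totals:
            total, count = totals[start]
            series.append((start, total / count))
        else:
            series.append((start, None))
        start += interval_s
    return series


def smooth(series, alpha=0.3):
    """Apply an exponentially weighted moving average, skipping gaps.

    alpha is a hypothetical tuning value; higher values follow the
    most recent bucket more closely.
    """
    smoothed = []
    previous = None
    for start, value in series:
        if value is None:
            smoothed.append((start, None))
            continue
        if previous is None:
            previous = value
        else:
            previous = alpha * value + (1 - alpha) * previous
        smoothed.append((start, previous))
    return smoothed


# Hypothetical samples: two readings, a silent period, then one more.
raw = [(0, 100.0), (10, 200.0), (130, 400.0)]
per_minute = bucket_means(raw, 60)
graph_points = smooth(per_minute, alpha=0.5)
```

Keeping empty buckets as gaps rather than interpolating across them is a deliberate choice here, since a device that was powered off or out of coverage reported nothing and the graph should not suggest otherwise.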

Service level monitoring should provide the management dashboard for the IT department. When service levels are not being met, the IT operator should be able to drill down and see component-level statistics. Unlike servers and wired network infrastructure, mobile devices and wireless network infrastructure do not yet have mature performance instrumentation available. Battery level, for example, cannot always be tracked, yet a low battery is a common cause of failures. An SLM solution for mobile environments will have limited value without the ability to drill down and see detailed performance and availability metrics.

Service level monitoring solutions will need to evolve to account for disconnected operations. In the field service example mentioned above, the service representative will spend most of the working day away from the corporate office. However, it will still be important to track device performance, both to understand device behavior and to measure worker productivity more accurately. This is related to the emerging trend toward business process monitoring, which tracks entire business processes, including both manual and automated steps, rather than service levels.

Mobility is likely to create a new set of SLM vendors that specifically target this niche. Corporate IT departments should look for these solutions when planning mobile deployments, to ensure that their infrastructure can be tracked and optimized.

Brad Stone is the president of Aspirin Software, a firm providing IT products and consulting services. Stone has more than 15 years of experience delivering enterprise high availability and service level management solutions to market. He has contributed to the designs of mobility management software for Symbol Technologies. As CTO at Resonate, Stone helped architect the company's transition to service level management.