Portable Databases: Information on the Go
A new crop of portable database products enables remote and centralized systems to securely synchronize and replicate information -- empowering mobile users with data and providing the enterprise with up-to-the-minute transaction information, reports Cheryl Krivda in this issue's feature story.
In a world where as many as 75 percent of all computer users periodically work away from their desks, centralized databases are no longer adequate support for enterprise computing. Mobile users require access to corporate data to efficiently serve customers, and the corporation needs frequent updates to its central repository from the transactions made and information gathered by remote users.
Enter a new twist on an old standard: portable and remote database technology. Remote users carry only the chunks of data they need on portable devices, downloading more information as needed and uploading changes to the central repository. A new crop of portable database products enables remote and centralized systems to securely synchronize and replicate information -- empowering mobile users with data and providing the enterprise with up-to-the-minute transaction information.
Portable database technology is applicable to a number of form factors, ranging from standard laptop computers to palm-sized devices to smart cellular phones. Early adopters of portable database technology include industries with a large contingent of field personnel, such as finance and insurance, as well as professionals who cannot be tied to a desk, such as doctors and nurses.
For these users, being able to access a local data store while disconnected from the enterprise system is critical. "Giving them the best information is the key," explains Carl Hartman, vice president of information management marketing at Computer Associates Int'l Inc.
Portable database technology is in its early stages, with many enhancements still in development. But the market has been fueled largely by sales force automation applications, which have grown by double digits over the past three years, explains Doug Leland, group product manager for SQL Server marketing at Microsoft Corp.
Database footprint and other issues related to the physical characteristics of portable databases are coming into focus, yet replication technologies remain a mystery to many NT managers. And, as with any remote computing, the security question is always there.
Characteristics
The significance of footprint issues depends on the devices employed by remote users. Most of the major database vendors claim that their products can support portable databases using a minimal amount of memory and other system resources.
For example, Adaptive Server Anywhere 6.01 from Sybase (www.sybase.com) requires 1 MB for the executable code and an additional 1 MB for caching. Oracle Corp.’s Oracle8i Lite, a single-user database for remote users, has a 400 KB memory footprint. Microsoft’s SQL Server 7.0 requires 3.5 MB of memory and 30 MB of disk space. One of the smaller players, Cloudscape Inc. (www.cloudscape.com), offers a 1.5 MB portable database footprint.
For laptop computer users, these requirements are within the working parameters of their devices. But as more users opt for smaller machines, such as Windows CE and PalmPilot devices, vendors are being pressed to "slim down" database products even further.
Careful NT managers must evaluate how portable database products lose weight. In the ongoing balancing act between size and functionality, vendors choose different approaches. For example, Microsoft stripped out interdependencies from various SQL Server 7.0 modules to create a more "componentized" product, says Leland.
Sybase is beta testing its "ultralight" technology, which "achieves significant database application functionality in 30 to 50 KB," says Chris Kleisath, director of product management at the company. The product incorporates logic that analyzes the user’s application, then customizes each database engine by stripping away any unneeded functionality from the resident database code.
But smaller is not necessarily better for some applications. A company with a product catalog of a million items cannot realistically expect to store that data on a hand-held device. For an application of this size, a small-footprint database that can be easily distributed and that offers powerful replication and gateway technology may be a better bet, Hartman says.
Replication
Replicating data stored on a user’s portable device and synchronizing it with the central repository -- and vice versa -- is key to portable database technology. When done well, replication can increase the currency and accuracy of data, provide remote users with opportunities to better serve their customer base, and give a corporation the opportunity to make wiser, more competitive business decisions.
Vendors employ numerous replication methodologies, so NT managers who are selecting a replication product must investigate how the product replicates updated data and how it resolves conflicting updates.
For example, the Cloudscape product replicates only the procedure call that caused data changes, explains Malcolm Colton, vice president of marketing at Cloudscape. The central repository replays the event causing the change through the central application, which uses business rules to determine the appropriateness of individual data changes. This technique minimizes the volume of data uploaded and downloaded, making transactions run more efficiently.
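The idea of shipping the call rather than the changed rows can be sketched in a few lines of Python. This is an illustrative model only, not Cloudscape's API: the `CentralApp` class, the `place_order` procedure, and the stock rule are all hypothetical names and assumptions introduced for the example.

```python
# Hypothetical sketch: the remote device records procedure calls, and the
# central application replays them under its own business rules.

class CentralApp:
    def __init__(self, stock):
        self.stock = dict(stock)   # item -> units on hand
        self.orders = []           # accepted (customer, item, qty) tuples

    def place_order(self, customer, item, qty):
        # Assumed business rule: reject any order that exceeds current stock.
        if self.stock.get(item, 0) < qty:
            return False
        self.stock[item] -= qty
        self.orders.append((customer, item, qty))
        return True

    def replay(self, calls):
        # Replay each recorded call by name and report which were accepted.
        return [getattr(self, name)(*args) for name, args in calls]


# Calls recorded on a remote device while it was disconnected.
recorded_calls = [
    ("place_order", ("acme", "widget", 2)),
    ("place_order", ("zenith", "widget", 5)),
]

central = CentralApp({"widget": 3})
results = central.replay(recorded_calls)
```

Only the two small call records cross the wire; the central application, not the remote device, decides that the second order cannot be filled.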
The Sybase product uses a slightly different replication strategy designed to minimize a remote user’s connection time during updates. The Sybase database engine maintains a log of all transactions performed. When an upload or download is requested, the engine sends only the log in a message format, such as Microsoft Mail or Lotus Notes, or as a file, Kleisath explains. Each message is processed by the central database in the same order that it was executed.
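A log-shipping scheme of this general kind might look like the following Python sketch. It is not Sybase's implementation: the `RemoteEngine` class, the `set` and `delete` operations, and the use of a JSON string as a stand-in for a mail message or file are assumptions made for illustration.

```python
# Hypothetical sketch of log-based replication: the remote engine appends
# every change to a log, ships the log as one message, and the central
# database applies the entries in their original order.
import json


class RemoteEngine:
    def __init__(self):
        self.log = []
        self.next_seq = 1

    def execute(self, op, key, value=None):
        # Record the operation with a sequence number that fixes its order.
        self.log.append({"seq": self.next_seq, "op": op, "key": key, "value": value})
        self.next_seq += 1

    def package_log(self):
        # Serialize the pending log as a single message, then clear it.
        message = json.dumps(self.log)
        self.log = []
        return message


def apply_message(central, message):
    # Apply entries in sequence order, matching the order of execution.
    for entry in sorted(json.loads(message), key=lambda e: e["seq"]):
        if entry["op"] == "set":
            central[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            central.pop(entry["key"], None)
    return central


remote = RemoteEngine()
remote.execute("set", "cust:1", "Acme")
remote.execute("set", "cust:1", "Acme Corp")
remote.execute("set", "cust:2", "Zenith")
remote.execute("delete", "cust:2")

central_db = apply_message({}, remote.package_log())
```

Because the user only needs to stay connected long enough to hand over one message, connection time stays short even when many transactions were logged offline.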
A second critical concern in the area of replication is conflict resolution. Remote users may transmit data changes that are contradictory or require resolution from the central database or a central application. For example, if two sales representatives each submit transactions requesting delivery of the last widget in stock, the centralized business application decides which customer receives the widget and which customer waits until inventory is replenished.
Some products apply the change based on a timestamp: the first-changed or most recently changed data is applied and other changes are ignored. Other products assign changes according to user authorization levels. Changes submitted by a user with a higher authority level are applied over those submitted by lower-ranking users.
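These simple policies are easy to express in code. The Python sketch below is a generic illustration rather than any vendor's resolver; the `Change` record, its numeric `authority` field, and the assumption that timestamps are comparable across devices are all hypothetical.

```python
# Hypothetical sketch of timestamp-based and authority-based conflict policies.
from dataclasses import dataclass


@dataclass
class Change:
    user: str
    value: str
    timestamp: float   # assumed comparable across devices
    authority: int     # assumed: higher number means higher authorization level


def resolve_first(changes):
    # The first-changed value wins; later changes are ignored.
    return min(changes, key=lambda c: c.timestamp)


def resolve_latest(changes):
    # The most recently changed value wins; earlier changes are ignored.
    return max(changes, key=lambda c: c.timestamp)


def resolve_by_authority(changes):
    # The highest authorization level wins; ties go to the most recent change.
    return max(changes, key=lambda c: (c.authority, c.timestamp))


conflict = [
    Change("rep_a", "ship Monday", timestamp=100.0, authority=1),
    Change("manager", "ship Friday", timestamp=110.0, authority=3),
    Change("rep_b", "ship Tuesday", timestamp=120.0, authority=1),
]

first = resolve_first(conflict)
latest = resolve_latest(conflict)
ranked = resolve_by_authority(conflict)
```

The same three conflicting changes produce three different winners depending on the policy, which is why the resolution method deserves scrutiny during product selection.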
More sophisticated products allow developers to build rules into the database logic that use corporate business policies to resolve conflicts. So-called "custom resolvers" can be as simple as building a miniprogram using a COM object, Leland says. Such technology is key for NT sites that anticipate an expanding role for portable databases because it ensures that the replication technology can grow with the application over time, he adds.
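To make the idea of a custom resolver concrete, here is a minimal Python sketch applied to the last-widget scenario. It does not model the COM interface Leland mentions; the `RESOLVERS` registry, the `preferred` flag, and the policy itself are hypothetical stand-ins for whatever business rule a site might choose.

```python
# Hypothetical sketch: a pluggable "custom resolver" selected per table.

def earliest_request_wins(requests):
    # Default policy: the earliest request gets the item.
    return min(requests, key=lambda r: r["requested_at"])


def preferred_customer_wins(requests):
    # Assumed business policy: preferred customers are served first;
    # among equals, the earliest request wins.
    return min(requests, key=lambda r: (not r["preferred"], r["requested_at"]))


# Registry mapping a table name to its custom resolver.
RESOLVERS = {"orders": preferred_customer_wins}


def resolve(table, requests):
    resolver = RESOLVERS.get(table, earliest_request_wins)
    winner = resolver(requests)
    waiting = [r for r in requests if r is not winner]
    return winner, waiting


requests = [
    {"rep": "rep_a", "customer": "acme", "preferred": False, "requested_at": 10},
    {"rep": "rep_b", "customer": "zenith", "preferred": True, "requested_at": 12},
]

winner, waiting = resolve("orders", requests)
default_winner, _ = resolve("inventory", requests)
```

Keeping the policy in a replaceable function means the rule can change as the application grows without touching the replication machinery around it.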
Cost, Complexity and Compatibility
For NT application developers who want to build their own replication within homegrown corporate programs, vendors such as Raima Corp. (www.raima.com) and Poet Software Corp. (www.poet.com) offer embedded database technology that can be made portable. But "rolling your own" replication adds another layer of complexity, explains William D. Bell, programming manager at Vista Development Corp., a wholly owned subsidiary of Raima Corp.
Raima’s products offer configurable replication strategies that enable developers to include functionality such as mirroring, triggered replication, or proactive application-level replication, as needed.
But for most NT sites, implementing portable database technology should not be a complex proposition. Even simple-to-use products must provide compatibility with the centralized database or database-reliant applications.
"Compatibility with the server data storage environment is a must," Leland confirms. "Otherwise [NT managers] will be building two environments." Some vendors, he warns, purchased their portable database products, setting the stage for potential compatibility problems between applications running remotely and centrally, and creating a variety of issues for developers.
Security Issues
Security is always a concern in remote computing applications. But having devices that can be easily loaded with critical corporate data -- as well as access to even more data on the central database -- is enough to spook some prospects. Yet portable database vendors say there is no need to worry.
Products include many types of security features ranging from standard authorization measures, such as password protection and access restrictions -- some down to the column level -- to file and disk encryption.
The Sybase product, for example, can automatically encrypt database files on remote machines to protect against unauthorized access if the device is lost or stolen. While encrypting files adds some procedural overhead, for most applications this is a minor cost, Kleisath says.
In fact, some vendors report that security is not an issue at all for NT sites whose remote users access esoteric corporate applications. "It’s more important for these companies to have portable database technology that is fast and small," Bell says.
For other sites, the security challenge is a matter of developing protective business policies. Sites should implement a policy that prevents remote users from replicating information unless they have authorized access, Leland says. Such policies blend technology and good management to create the most secure remote computing environment, including portable databases.
Replication Techniques
When learning about portable database technology, don’t forget to ask about the replication methods used by a vendor’s product.
"There is no such thing as the right solution," says Neil Shepherd, senior product marketing manager of mobile and embedded products at Oracle Corp. Every enterprise has different portable database needs that must be met, and vendors offer products with multiple replication technologies.
The most basic type is called snapshot replication. This method typically delivers data that was current at one moment in time for read-only access by remote users.
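A snapshot can be pictured as a frozen, read-only copy of the central data. The Python sketch below is a simplified illustration under that assumption; the `take_snapshot` helper and the sample price table are hypothetical, and real products deliver snapshots to a separate device rather than within one process.

```python
# Hypothetical sketch of snapshot replication: a point-in-time, read-only copy.
import copy
from types import MappingProxyType


def take_snapshot(central):
    # Deep-copy so later central changes do not leak into the snapshot,
    # then wrap the copy in a read-only view for the remote user.
    return MappingProxyType(copy.deepcopy(central))


central_prices = {"widget": 19.99, "gadget": 4.50}
snapshot = take_snapshot(central_prices)

# The central repository moves on after the snapshot was taken.
central_prices["widget"] = 24.99

# The remote user cannot write to the snapshot.
try:
    snapshot["widget"] = 0.0
    write_allowed = True
except TypeError:
    write_allowed = False
```

The remote copy still shows the price that was current when the snapshot was taken, which is the trade-off of this method: simple and safe, but only as fresh as the last delivery.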
The most common type of replication is merge replication. This technology collects changes from multiple, occasionally connected users and taps predefined rules or application-specific logic to resolve conflicts among the changes.
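In outline, a merge step gathers each user's changes, applies the ones nobody disputes, and hands the rest to a rule. The following Python sketch shows that flow under simplifying assumptions; the `merge` function, the per-user rank table, and the sample account data are hypothetical and do not reflect a specific vendor's design.

```python
# Hypothetical sketch of merge replication with a predefined conflict rule.

def merge(central, change_sets, resolve):
    # central: key -> value; change_sets: user -> {key: value}
    merged = dict(central)
    proposed = {}
    for user, changes in change_sets.items():
        for key, value in changes.items():
            proposed.setdefault(key, []).append((user, value))

    conflicts = {}
    for key, candidates in proposed.items():
        distinct_values = {value for _, value in candidates}
        if len(distinct_values) == 1:
            # Everyone agrees, so the change applies directly.
            merged[key] = candidates[0][1]
        else:
            # Users disagree, so the rule picks the value to keep.
            conflicts[key] = candidates
            merged[key] = resolve(key, candidates)
    return merged, conflicts


# Assumed rule: the change from the higher-ranked user is kept.
rank = {"rep_a": 1, "rep_b": 2}


def highest_rank(key, candidates):
    return max(candidates, key=lambda c: rank[c[0]])[1]


central = {"acct:1": "555-0000"}
change_sets = {
    "rep_a": {"acct:1": "555-0100", "acct:2": "new lead"},
    "rep_b": {"acct:1": "555-0199"},
}

merged, conflicts = merge(central, change_sets, highest_rank)
```

Swapping `highest_rank` for application-specific logic changes the outcome of disputed keys without altering how undisputed changes are collected and applied.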
For users who connect more frequently, subscriber update technology provides transaction updates while the user is in connected mode. Data integrity is provided using two-phase commit protection.
The most resource-intensive replication strategy uses a subscriber/publisher model, in which frequently connected remote users are updated via message queues. This technology is infrequently used.
When weighing the replication possibilities, NT managers should determine whether they want to replicate data synchronously or asynchronously. Synchronous replication is most often used because it is considered more reliable in transmitting complete data streams. Certain file-based replication methods use asynchronous transmission to perform more limited updates, such as when an abbreviated table of changes is uploaded to the central repository. Sites where data integrity is critical should avoid asynchronous data transmission, Shepherd says.
Many vendors offer ways of ensuring that transactions are accurately propagated. For example, many vendors include a two-phase commit feature. If a transaction is not recorded completely and correctly, the entire transaction is rolled back and the data returns to the state it was in before the attempt.
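The two-phase commit idea can be shown with a small Python sketch: every participant first votes on whether it can apply the change, and the change is committed only if all vote yes. This is a textbook-style simplification, not a vendor implementation; the `Participant` class and its `can_commit` flag are hypothetical, and real systems also handle timeouts and crash recovery.

```python
# Hypothetical sketch of two-phase commit across a remote and a central store.

class Participant:
    def __init__(self, name, data, can_commit=True):
        self.name = name
        self.data = dict(data)
        self.can_commit = can_commit   # assumed flag simulating a failure
        self._pending = None

    def prepare(self, changes):
        # Phase 1: stage the changes and vote yes or no.
        if not self.can_commit:
            return False
        self._pending = dict(changes)
        return True

    def commit(self):
        # Phase 2 (success): make the staged changes permanent.
        self.data.update(self._pending)
        self._pending = None

    def rollback(self):
        # Phase 2 (failure): discard staged changes; data is untouched.
        self._pending = None


def two_phase_commit(participants, changes):
    votes = [p.prepare(changes) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


laptop = Participant("laptop", {"balance": 100})
server = Participant("server", {"balance": 100})
ok = two_phase_commit([laptop, server], {"balance": 80})

broken = Participant("server2", {"balance": 100}, can_commit=False)
failed = two_phase_commit([laptop, broken], {"balance": 50})
```

In the second attempt one participant votes no, so neither store changes and both keep the values left by the first, successful transaction.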
Each enterprise has a unique set of requirements for data replication and propagation that should be respected, says John Ainsworth, vice president of research and development at Computer Associates. "Different applications have different comfort levels," and NT managers are key to determining what is required in each portable database application.