In-Depth
Virtual Tape Technology: Software-Based Tape System Gives New Meaning to "Virtual"
As companies continue to generate vast quantities of data, they must have compact, inexpensive ways to store the information they need to do business. Efficiency and cost control are at the heart of virtual tape technology.
However, most virtual tape systems are clunky configurations that bundle hardware and software – certainly not virtual in the pure meaning of the word. Furthermore, these hybrid systems are plagued with limitations, starting with cost. Because these systems are proprietary, investment in vendor-dependent solutions is tricky. But, beyond fiscal considerations, bundled systems are difficult to implement and even harder to scale. When implemented, they often are based on non-mainframe class components with limited throughput and reliability.
However, that is changing. Recent software-based, hardware-independent advances in virtual tape technology now enable storage managers to create a virtual tape environment from any combination of OS/390 disk and tape hardware – including resources the organization already owns. Through this new technology, storage managers also can add or upgrade resources as needed to take advantage of improvements in storage management capacity and performance, regardless of the manufacturer.
The Virtual Tape Environment
A software-based virtual tape environment is composed of virtual devices, volumes, data sets and the disk buffer pool or cache. An Interactive System Productivity Facility (ISPF) interface enables the storage manager to manage all of these components easily and efficiently.
Once installed, the virtual tape system creates virtual tape devices inside the OS/390 operating system. Because the system emulates the microcode that controls 3480 and 3490 tapes, the operating system acts as if these devices actually exist. In fact, applications read from and write to the virtual devices as if they were real.
In this environment, virtual volumes are managed by any tape management system the same way physical volumes are managed. These virtual volumes may reside in the disk cache or on a physical tape volume stacked with other virtual volumes.
Data sets for the virtual tapes are VSAM linear data sets (LDS) that contain all the data for the virtual volumes, including control and indexing information.
Software-based virtual tape uses a disk buffer pool on Direct Access Storage Devices (DASD) to hold data before and after it is written to tape. This pool is also referred to as disk cache, or simply cache, and should not be confused with traditional cache. As long as a virtual volume remains in disk cache, it is recalled from there. The cache can span DASD on any storage device in the storage environment.
When cache usage reaches a user-defined high threshold, the virtual tape system reclaims space by reusing the space occupied by the least recently used virtual volumes that have already been copied to physical tape. This process stops when usage falls to a preset low threshold. Large tape data sets are divided into multiple virtual volumes. As a result, volumes that have completed processing can be copied, or de-staged, to physical tape while succeeding volumes for the same data set continue processing. This ensures that very large tape data sets will not consume the entire disk cache or interfere with other tape data sets.
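The threshold-driven reclaim described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the product's implementation: the VirtualVolume record, the reclaim function and the percentage parameters are all hypothetical names, and a simple logical timestamp stands in for whatever usage tracking a real system keeps.

```python
from dataclasses import dataclass


@dataclass
class VirtualVolume:
    volser: str      # virtual volume serial (hypothetical field names)
    size_mb: int     # space the volume occupies in the disk cache
    last_used: int   # logical timestamp of the most recent access
    on_tape: bool    # True once the volume has been copied to physical tape


def reclaim(cache, capacity_mb, high_pct, low_pct):
    """Evict least recently used, already-copied volumes from the cache.

    Nothing happens until usage reaches the high threshold; eviction then
    continues until usage is at or below the low threshold, or until no
    eligible volume is left. Returns the evicted volsers in eviction order.
    """
    used = sum(v.size_mb for v in cache.values())
    if used * 100 < high_pct * capacity_mb:
        return []

    # Only volumes that already exist on physical tape may lose their cache copy.
    candidates = sorted(
        (v for v in cache.values() if v.on_tape),
        key=lambda v: v.last_used,
    )
    evicted = []
    for vol in candidates:
        if used * 100 <= low_pct * capacity_mb:
            break
        del cache[vol.volser]
        used -= vol.size_mb
        evicted.append(vol.volser)
    return evicted
```

Volumes not yet copied to tape are never candidates, so in this sketch a cache full of un-staged volumes cannot be reclaimed until de-staging catches up.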
Managing the Virtual Tape Environment
Managing the software-based virtual tape system is surprisingly simple. The storage manager can keep track of all virtual volumes through a display panel. This panel shows whether a volume is in cache, its size, data set names, virtual volume serial and status. Using console commands, or through a virtual tape ISPF screen, the storage manager can determine the current cache status. The screen shows which jobs are using virtual devices, the status of each job, which real tapes are being used for stacking and the current utilization of each.
Furthermore, the storage manager is not required to manually initiate the system for each use. Once the system is installed and its parameters are set, the virtual tape system essentially manages itself. The system activates automatically as the operating system prepares to choose a tape device and volume. This happens when a data set name or Data Facility Storage Management Subsystem (DFSMS) dataclass matches an entry in the virtual tape system’s control tables. If a match is found, the virtual tape system goes into action and creates the applicable virtual volume entry.
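As a rough sketch of that selection step, the check below decides whether an allocation should be routed to virtual tape. The function name, the table layout and the shell-style wildcard syntax are assumptions made for illustration; the article does not describe the actual format of the control tables.

```python
from fnmatch import fnmatchcase


def is_virtual(dsname, dataclass, dsn_patterns, virtual_dataclasses):
    """Return True if the allocation matches the (hypothetical) control tables.

    dsn_patterns uses shell-style wildcards purely for illustration;
    virtual_dataclasses is a set of DFSMS dataclass names routed to virtual tape.
    """
    if dataclass is not None and dataclass in virtual_dataclasses:
        return True
    return any(fnmatchcase(dsname, pattern) for pattern in dsn_patterns)
```

For example, with the pattern list ["PROD.BACKUP.*"], a data set named PROD.BACKUP.DAILY would be directed to a virtual device, while TEST.DATA with no matching dataclass would go to a real one.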
Retrieving existing information is equally automated. When an application requests an existing virtual data set, OS/390 finds it and furnishes the requested virtual volume. At the same time, the virtual tape system requests that a virtual volume data set be opened.
If the data set is in cache, the virtual tape system provides it immediately. If not, the data set is recalled from physical tape. As soon as the first blocks of data are transferred, the application can begin processing them. The remainder of the data set is rapidly staged to the cache.
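The two retrieval paths can be illustrated with a small Python generator. It is a simplified sketch with hypothetical names: plain dictionaries stand in for the disk cache and the stacked physical tapes, and staging advances only as the application reads, whereas the system described here stages the rest of the data set on its own.

```python
def read_volume(volser, cache, tape):
    """Yield the blocks of a virtual volume, staging it to cache on a miss.

    cache and tape are dicts mapping a volser to a list of blocks
    (stand-ins for the disk cache and the stacked physical tapes).
    """
    if volser in cache:
        # Cache hit: serve the volume directly from disk.
        yield from cache[volser]
        return

    # Cache miss: recall from tape, handing each block to the application
    # as soon as it arrives instead of waiting for the whole volume.
    staged = []
    for block in tape[volser]:
        staged.append(block)
        yield block
    cache[volser] = staged  # the volume is now resident in cache
```

After the first full read, a later request for the same volume is satisfied from the cache without touching the tape.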
Once the application has closed the data set, all data is synchronized and written to the disk data set. As a result, all the data is on disk before the virtual tape volume is dismounted. This ensures data integrity between the application and the virtual volume disk data set.
In a multi-volume output-processing situation, single-thread processing is avoided through the use of concurrent processing. Also, because sometimes data can be received from applications faster than the disk cache can accept it, the virtual tape system detects delays and restricts the virtual device by showing that it is busy.
Virtual Volume Stacking
Most OS/390 tapes are substantially underused, even in the most efficient environments; utilization of less than 25 percent of capacity is common. Some tape stacking systems can increase physical tape utilization to more than 80 percent. However, these systems have undesirable side effects, such as application failures, security exposures or extended delays when attempting concurrent access to multiple data sets on a stacked tape. Furthermore, data set stacking is not supported by all applications, and stacking software does nothing to improve throughput.
A software-based virtual tape system maximizes the effectiveness of tape and reduces silo slot management. It automatically stacks virtual tape volumes on physical tapes in a format that eliminates failures, and prevents unauthorized access and extended delays during recalls. A software-based system allows more data to be stored on fewer tapes – up to a 300 percent improvement over non-stacked tapes.
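As rough arithmetic behind a figure of this size (an illustration, not a vendor calculation): if non-stacked tapes average 25 percent utilization and stacked tapes were filled close to capacity, each tape would hold four times as much data, which is a 300 percent improvement. The helper name below is hypothetical.

```python
def improvement_pct(before_util_pct, after_util_pct):
    """Percentage increase in data stored per tape when utilization rises."""
    return (after_util_pct - before_util_pct) / before_util_pct * 100


# Assumed figures: 25 percent utilization before stacking, 100 percent after.
print(improvement_pct(25, 100))  # 300.0
```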
The software-based virtual tape system also allows the storage manager to organize related data for fast access. When output processing is complete, the virtual volume resides in the disk cache, but still must be copied to physical tape. Once the virtual volume is dismounted, the user has complete control over when and how virtual volumes are copied from the disk cache to physical tape. To improve tape volume utilization, virtual volumes are stacked together based on their stacking group and retention period. The data is saved in the virtual tape system’s stacking format, allowing simultaneous access to multiple data sets on a physical tape.
Where disaster recovery or media recovery backups are necessary, the virtual tape’s duplexing capability creates a second, stacked physical copy of the data sets in a local or remote site. The duplex copy also is stacked in the virtual tape system’s format for optimum tape utilization. Duplex copies may be used in any OS/390 system containing the virtual tape software at an off-site location. Duplexing can provide a vault copy for disaster backup and recovery processing and for local recovery as well.
Just as important, the software-based virtual tape system also provides for export copies, because tapes must sometimes be directly readable without a virtual tape system. To export a copy, the storage manager can request a separate, stacked copy of one or more virtual volumes. Export copies can be accessed by any application capable of reading S/390 tapes.
In storing routine data, however, stacking groups increase tape efficiency. These groups allow data sets with similar characteristics and requirements to be stacked together. Through the administrator’s predefined filters, the virtual tape system automatically controls the group assignment and stacking process with minimal ongoing management.
Examples of ways data sets are stacked include additional copies, export copies for off-site use or disaster recovery, expiration date, desired esoteric (automatic device vs. manual device), desired storage location or a user-assigned separator group. Expiration date stacking groups enable the system to minimize fragmentation of free space on cartridges. The fact that different groups use different physical tapes allows a user to ensure that, for example, a second copy of a database log is located on a different tape volume than the primary copy.
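The grouping idea can be sketched in Python as follows. This is a simplified illustration, not the product's stacking algorithm: the function name and the tuple layout are hypothetical, volumes are packed in arrival order, and each virtual volume is assumed to fit on a single physical tape.

```python
from collections import defaultdict


def stack_volumes(volumes, tape_capacity_mb):
    """Assign virtual volumes to physical tapes by stacking group and retention.

    volumes: iterable of (volser, size_mb, group, retention_days) tuples.
    Returns a list of ((group, retention_days), [volsers]) entries, one per
    physical tape. Volumes with different keys never share a tape.
    """
    by_key = defaultdict(list)
    for volser, size_mb, group, retention_days in volumes:
        by_key[(group, retention_days)].append((volser, size_mb))

    tapes = []
    for key in sorted(by_key):
        current, free = [], tape_capacity_mb
        for volser, size_mb in by_key[key]:
            # Start a new tape when the next volume no longer fits.
            if size_mb > free and current:
                tapes.append((key, current))
                current, free = [], tape_capacity_mb
            current.append(volser)
            free -= size_mb
        if current:
            tapes.append((key, current))
    return tapes
```

Because the primary and secondary copies of a database log would carry different group names in this sketch, they always land on different physical tapes, which is the separation property described above.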
Evaluating Virtual Tape Systems
When researching options in virtual tape technology, storage managers must understand both the long-term and short-term impact of installation and implementation to determine which system is most appropriate for their environment. For example, if the system requires specific tape or disk hardware, chances are it will bring severe limitations and challenges with it. Installation and management requirements also should be understood up front. Some virtual tape systems, particularly those based on hardware solutions or on software solutions requiring specific hardware, need intense planning.
The storage manager also must be clear on the system’s scalability, flexibility and potential to meet future storage needs. If a system can be scaled only in vendor-specified increments using only vendor-specified hardware, it is neither easily scalable nor flexible. This is critical. Not only does it mean decreased flexibility, it means that all future needs can be met by only one vendor. This places the organization at a serious disadvantage when negotiating future purchases, thus defeating many of the advantages of a virtual tape system, namely increased efficiencies and decreased costs.
About the Author: Osvaldo Ridner is Product Manager for Computer Associates International Inc.