SMS: From the Beginning

Giving credit where it's due

A few columns back, I recalled IBM’s Systems Managed Storage (SMS) initiative in mainframe computing to illuminate current efforts to address storage management woes in open systems environments. In the piece, Nick Tabellion, Vice President of Development for Fujitsu Softek, was credited with playing a major role in developing the strategy and its underlying software components, a feat he is now seeking to reprise in the distributed computing world at Softek.

A few weeks later, in a chance encounter at the Storage World Conference in Anaheim, Calif., I had the pleasure of meeting one of the patent holders on SMS, John Tyrrell. Tyrrell, who is today a Storage Architect with EMC Corporation, introduced himself with an almost shy demeanor and offered a fascinating perspective on the IBM project that defined thinking about storage management thereafter.

In a follow-up e-mail to me, Tyrrell expanded on his recollection of SMS and the problems it sought to address. I was amazed by its relevance to enterprise storage architecture issues that large organizations continue to wrestle with today. Here is what he said.

“SMS was an enormous undertaking starting with about 11 people and growing as high as 1200 (this didn't count all of the support from the other side of the S/390 OS, the hardware, databases, etc.). The most difficult thing about implementing SMS was not technical, it was changing the way you managed storage. IBM spent an enormous effort studying customers and helping them implement SMS. I personally have done over 350 storage studies and visited 650 data centers in the Global 2000 customers.

“SMS actually had [three] logical policies (Storage Class, Management Class, and Data Class) and [one] physical policy (Storage Group). The purpose of each of them [was as follows:]

“Initially, Data Class [was intended] to simplify Job Control Language (JCL). Later, it could be expanded to address the properties of the data that might tag it to the proper management or application. We never made it that far.

“Management Class was intended to provide lifecycle management and backup management at the file level (Hierarchical Storage Management (HSM) controls, deletion control, backup frequency, [number] of backups, lifetime of backups, etc.). It was totally integrated into the workings of the DFHSM product of the SMS suite.

“Storage Class identified a level of service associated with the data to guide initial storage selection, as well as access (e.g., there were facilities in S/390 that would do data-in-memory or expanded storage (E-store) loading based on Storage Class). Examples of this were things like the directories of partitioned data sets, application load libraries, VSAM indexes, etc. It also controlled who got priority whenever the cache in the subsystem was overloaded. There was a lot more sophistication than just initial placement.

“Storage Group was the only physical policy. It was intended to identify a physical set of storage, real physical volumes that could be placed in a locked room, could have different allocation thresholds, had the same lease expiration, etc. One use of this was when you knew a set of storage was coming up on lease expiration, you could change the Storage Group policy to quiesce new allocations [from] going to those devices. In this way, the group of storage would empty [through] atrophy. Allocation on S/390 initially had a ‘parking lot management’ problem because all space was pre-allocated and there were limited rules about extending in general and extending to multiple volumes. Most of these got corrected. But under ‘parking lot management,’ you put buses with buses and Volkswagens with Volkswagens to reduce fragmentation and the chance of failing a future space allocation due to the initial extent limitations and the initial multiple volume rules.

“The best SMS sites were those with the fewest number of policies. Most sites went from private volume management (much like today with LUNs, LUN masking and zoning) to one or two pools. Their goal was to get to a single pool.

“I participated in the first 100 studies done with customers. When we added up the people time to 'manage' storage, we not only counted the IT people but we interviewed application groups and asked them how they did their job with respect to storage. They said, ‘We don't do storage management.’ But after interviewing them, we found they spent lots of time cleaning up space, correcting job failures, resubmitting, looking for space on volumes, changing JCL and resubmitting, backing up their own data, and so on. When we added up the real cost of managing, we found that 80% of the management cost was in the application areas, not the data center. The real cost of managing lots of islands of storage is astronomical. We finally stopped keeping track of it. If you think about the time you spend managing your own little laptop island, this will make sense to you.

“We found that there was a 1-to-1 correspondence between the worst utilization, the most number of performance and space problems, and the most number of people management costs and the number of islands of storage they had to manage. Moral: we should have one LUN and share it. Interesting concept.”

Interesting indeed. The note from Tyrrell concluded with the unassuming observation, almost as if to minimize his achievement, that SMS took storage management in the S/390 world from 10 MB per person to about 15 TB per person. On behalf of the readers of this column, we would like to thank John for writing.
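
For readers who think in code, the Storage Group behavior Tyrrell describes (allocation thresholds, plus quiescing a group so it stops receiving new allocations as a lease expires) can be sketched roughly as follows. This is only an illustration in Python, not how SMS was implemented: the class names, the 0.85 threshold, the volume names, and the first-fit placement rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    capacity_mb: int
    used_mb: int = 0


@dataclass
class StorageGroup:
    name: str
    volumes: list
    threshold: float = 0.85   # hypothetical: maximum fill fraction after an allocation
    quiesced: bool = False    # True means no new allocations are placed in this group


def allocate(groups, size_mb):
    """Place a new allocation on the first eligible volume; return its name or None."""
    for group in groups:
        if group.quiesced:
            continue  # skip groups that have been closed to new allocations
        for vol in group.volumes:
            if vol.used_mb + size_mb <= vol.capacity_mb * group.threshold:
                vol.used_mb += size_mb
                return vol.name
    return None  # no volume can take the request without exceeding its threshold


leased = StorageGroup("LEASED", [Volume("VOL001", 1000)])
owned = StorageGroup("OWNED", [Volume("VOL002", 1000)])
groups = [leased, owned]

first = allocate(groups, 100)    # lands in the leased group, which is first in the list
leased.quiesced = True           # lease is about to expire: stop new allocations there
second = allocate(groups, 100)   # now lands in the owned group instead
```

Existing data on the quiesced group is left alone in this sketch; only new requests are steered elsewhere, which is the sense in which such a group would gradually empty as its data expired or migrated.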

About the Author

Jon William Toigo is chairman of The Data Management Institute, the CEO of data management consulting and research firm Toigo Partners International, as well as a contributing editor to Enterprise Systems and its Storage Strategies columnist. Mr. Toigo is the author of 14 books, including Disaster Recovery Planning, 3rd Edition, and The Holy Grail of Network Storage Management, both from Prentice Hall.