In-Depth
Solid as Rock: From Linux to the S/390, Seton Hall University Overhauls Its Entire Infrastructure
Many well-respected analysts and academics, including MIT Professor of Economics Lester Thurow, claim that Linux will never become a major industry force because it will eventually fragment into competing fiefdoms, similar to the commercial UNIX products. At best, other experts say, Linux will continue to fill a niche as a firewall, name server, small Web server or other dedicated appliance.
Try telling that to the folks at Seton Hall University (SHU) and reactions will likely range from raised eyebrows to quiet bemusement, because Linux is alive and well and extraordinarily successful here.
Bishop James Roosevelt Bayley, the first bishop of Newark, founded SHU in 1856. He named it after his aunt, Mother Elizabeth Ann Seton. Mother Seton was a pioneer in Catholic education and the first American-born saint.
Today, Seton Hall is the largest and oldest diocesan university in the United States. From its original enrollment of a handful of students, Seton Hall has grown into a major Catholic institution with more than 10,000 students from 40 U.S. states and territories, and dozens of countries.
In past generations, Seton Hall students came from the surrounding communities by foot and by train. Today, they come by planes, trains, automobiles and the Internet, and SHU is continuing its pioneering tradition by embracing information technology in all facets of its operation.
As Seton Hall’s Technology Committee explains, "Part of the mission of Seton Hall University must be to equip its students to take full advantage of the opportunities afforded by an information society. If that mission is going to be fulfilled in an excellent manner, then the University must embrace information technology. A commitment to information technology as an enabler for transforming teaching and learning at Seton Hall is vital to the future competitiveness of this institution."
Seton Hall did not always view information technology this way.
Back in 1994, Seton Hall’s IT infrastructure was typical of many similarly sized organizations. The University computing infrastructure had evolved over several years to include over forty 486 and early Pentium NetWare 3 and 4 servers spread all over campus, on top of a mixture of 10 Mbps Ethernet and token ring LANs. Administrative applications, such as payroll, grades and student affairs, ran on an aging IBM 4381 mainframe system that filled a room.
Joe DiVito is now a Senior Network Engineer at Seton Hall. Recalling those days, he says, "Seton Hall’s server infrastructure prior to 1995 was virtually 100 percent based on Novell NetWare. All were running 3.x code and for the most part were departmental in nature. As a result, we had servers scattered throughout the campus, but the Academic Computing group had responsibility for user account and password maintenance.
"Making things even more complex was the fact that our e-mail system was based on GroupWise. Each server had a separate Post Office with a separate database that had to be synchronized with the Novell Bindery on that server. We had thousands of students and faculty who needed access to computing services, but there were no tools to help us."
So, We Wrote One
DiVito explains, "We adapted the Novell APIs to work with Visual Basic and wrote a tool we dubbed M.A.G.I.C. (Maintenance And Generation of IntraCampus accounts). This tool simultaneously created the Novell account and GroupWise account, and made sure the account was unique across all the NetWare servers. We built rules into the software that used a student’s name and major to create both accounts on the correct server and print out an instruction sheet specific to the user name. We refined the tool with Version 2.0 to allow input files from our Student Information System (mainframe-based) that would create accounts for every incoming freshman at the same time. This totaled over 1,100 new accounts every year.
"It was a great tool, and we even came up with a way to easily handle password changes, but, fundamentally, all those little servers made it confusing for everyone."
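The kind of rule-based account creation DiVito describes can be sketched roughly as follows. This is a minimal illustration in Python rather than the original Visual Basic, and every detail in it is an assumption made for the example: the naming rule, the major-to-server map and the server names are hypothetical, not a description of how M.A.G.I.C. actually worked.

```python
# Hypothetical sketch of rule-based account creation; not the real M.A.G.I.C. logic.

# Assumed mapping from a student's major to a departmental server.
SERVER_BY_MAJOR = {
    "biology": "SCI1",
    "accounting": "BUS1",
}
DEFAULT_SERVER = "ACAD1"  # assumed fallback server for unmapped majors


def make_username(first, last, taken):
    """Build a login from the last name plus first initial, unique campus-wide."""
    base = (last[:7] + first[:1]).lower()
    candidate = base
    suffix = 1
    # Append a counter until the name no longer clashes with an existing account.
    while candidate in taken:
        suffix += 1
        candidate = f"{base}{suffix}"
    return candidate


def create_accounts(students, taken=None):
    """Return (username, server) pairs for a batch of (first, last, major) records."""
    taken = set(taken or ())
    accounts = []
    for first, last, major in students:
        username = make_username(first, last, taken)
        taken.add(username)
        server = SERVER_BY_MAJOR.get(major.lower(), DEFAULT_SERVER)
        accounts.append((username, server))
    return accounts
```

Feeding the function a whole incoming class at once mirrors the batch import from the Student Information System, and checking one campus-wide set of names is what keeps accounts unique across servers.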
Today, Matt Stevenson is the Manager of Systems Administration and also remembers those days. "With all those servers, it was sometimes really difficult to remember which server anyone was logged in to at any given time. Sharing files was also very difficult because we had different versions of spreadsheets, word processors and other software installed on different machines with no central authority to coordinate it all. It was a challenge to keep up with it all because there weren’t enough of us to go around."
It was time for a fundamental overhaul, and in late 1994 the University put together a committee to figure out what to do. While the committee was responsible for several tasks, its ultimate charge was to assess the current computing and information technology environment on campus and propose a clear direction and long-range plan for the future of information technology at Seton Hall.
The committee eventually included roughly a dozen people from various departments and an outside consultant. They conducted 13 focus group sessions, involving more than 140 members of the University community, including faculty, staff, administrators and students. They also distributed surveys to more than 1,500 members of the University community, with a return rate of more than 40 percent.
Based on this feedback and intense educational sessions with IBM and other vendors, the committee put together an exhaustive report detailing the state of Seton Hall’s IT infrastructure and a vision for the future. The report, dated June 8, 1995, is available on Seton Hall’s Web site at http://technology.shu.edu/techplan/.
The report called for fundamental cultural changes in the way Seton Hall views and uses information technology. The vision was ambitious, calling for spending $15 million over five years to upgrade its infrastructure, train faculty and staff, and make open computing easily available to everyone in the Seton Hall community.
Taking a Gamble
The stakes were high. If the IT trend continued as it was, Seton Hall could well find itself in the 21st century as a footnote in history. If Seton Hall had the courage to implement the recommendations in the report, it had an opportunity to enrich its history of innovation and leadership. To the credit of everyone involved, Seton Hall made the tough decisions to reinvent itself. That report became a blueprint that transformed Seton Hall into a technology powerhouse.
Serious technology work started several months later, with a specific goal to centralize as much management and storage as possible. The first step – rebuild the core network. Seton Hall replaced all the old token ring and 10 Mbps Ethernet with fast ATM and fiber optic cabling among buildings. The University has 2,200 dorm rooms in six buildings, and this project wired every room with at least two network connections. This provided students and faculty easy physical access to the network.
The next project replaced the old IBM 4381 mainframe and its room full of cabinets with a more powerful, refrigerator-sized S/390 generation 4 system. An IBM business partner helped with the migration.
Why not replace all that expensive mainframe software with Linux, especially since the committee recommended moving to open computing? As Stevenson explains, "One of the problems with mainframes is the cost of moving because of the proprietary lock-ins. And, it’s not just the expense of moving, it’s shutting everything down. Besides, our mainframe software all worked fine, so why move off it? That was and is our philosophy – don’t throw out what works."
Next up was Lotus Notes. Seton Hall deployed it on a top-of-the-line RS/6000 SP supercomputer, along with DNS and some other miscellaneous applications. This behemoth contains nine nodes, each an 8-way system with 200 MHz PowerPC processors.
Linux in the data center started innocently enough in 1996 when the primary DNS server, a very old RS/6000 system, failed. The secondary system took over the load and nobody noticed an outage, but Stevenson and others decided to try something new. They had been testing "Red Hat Linux version 3 something," and it claimed to run the same version of BIND as the big RS/6000 iron.
So, they copied the appropriate configuration and zone files to an ancient 486-20sx PC and put it into production as the primary DNS server. To everyone’s amazement, it worked, and to this day, Seton Hall provides DNS this way. Stevenson reports that they recently upgraded the primary DNS server to a 486-66 with a leftover RAID array from one of the old NetWare servers. During its busiest period, it is 97 percent idle.
The Seton Hall Web page ran on a Sun SPARC 20 machine using the NCSA HTTPd Web server. To provide students and faculty with their own personal Web pages, Seton Hall purchased IBM Netfinity servers and loaded Linux and Apache. Seton Hall eventually migrated its main Web site from the SPARC to the RS/6000 and retired the SPARC.
Next came a massive Novell consolidation project as Seton Hall migrated from over 40 servers to two large IBM Netfinity servers running NetWare 4.11. The server group decided to use IBM’s LANRES for centralized storage, which carves out space on the mainframe disk farm using an ESCON connection.
With the bulk of storage now centralized, the next project was to centralize and automate all backups. No more traveling all over campus to change tapes. Instead, Seton Hall totally re-engineered its backups to use Tivoli Storage Manager (TSM), formerly named ADSM. All servers back up their data across the network to the mainframe storage pool.
The mainframe then copies the data to tape. TSM has clients for every platform in use at Seton Hall, including Linux. In fact, IBM’s first Linux software was a TSM client. This means that Seton Hall can buy inexpensive, Intel-based Linux servers with no need for expensive tape drives, and no need for operators to change tapes every night.
By the late 1990s, with these projects up and running, Seton Hall had implemented the recommendations from the 1995 technology plan. But the work had only started. Once word was around campus that IT could do powerful things, ideas flowed and Linux blossomed.
The faculty wanted a way to do department and personal Web pages to stay in touch with students, but did not want to learn HTML. The obvious tool of choice was Microsoft FrontPage, and this caused a storm of controversy over what to put on the back end. On one hand, Microsoft Windows NT and IIS were a natural fit for FrontPage Web development. On the other hand, Linux is free and requires much less hardware than NT or Windows 2000. The debate was rendered moot when FrontPage Server Extensions became available for the Apache Web server. Today, Seton Hall’s IT department offers Web hosting services to departments via both FTP and FrontPage. They maintain roughly 3,000 user accounts with over 60 virtual Web servers, including hundreds of FrontPage sub-Webs.
The Big Payoff
Linux was penetrating Seton Hall in a more-or-less typical manner, filling traditional roles it had filled in hundreds of other settings. The breakthrough project came in 1998 when some of the faculty decided they wanted to stream audio feeds of their lectures. This way, students could attend class regardless of their geographic location.
Achieving this goal would stress the network and servers as nothing had done before. As a proof of concept, Dr. Steven Landry, Seton Hall’s CIO, decided to use streaming technology to conduct a department meeting with the IT staff of 90 people, spread across multiple buildings on campus. The goal – everyone should be able to fully participate in the meeting from their desks as if they were in Dr. Landry’s office. This meant that everyone would need to see Dr. Landry presenting, with the opportunity to interact in real time.
As Stevenson explains, "I’d heard people were working on streaming media with Linux, and Linux had done well for us with other applications, so we decided to give it a shot. The test exceeded our expectations. During the meeting, everyone saw video with Dr. Landry conducting the meeting, and they asked questions using AOL Instant Messenger. The whole thing worked flawlessly, and people remarked later that this was one of the most interactive meetings ever."
"This test was remarkable because our streaming server was nothing more than a spare 486-66 with ATM LAN emulation. Even more stunning was the fact that we used drivers that were not yet even beta quality. In fact, the production versions of those drivers only became available in early January 2001 when the 2.4 kernel came out. But, it all worked. For me, this proved that Linux will do pretty much anything we ask it to do."
Based on the overwhelmingly positive outcome of that test, Seton Hall ordered a large Netfinity server with 2 GB RAM and 120 GB disk space to act as a streaming server. Because streaming audio and video is a demanding application, Seton Hall feeds this server with LANE (LAN Emulation) over ATM.
Philosophically, Seton Hall turned a corner with this project. "We could have looked for traditionally developed products for our streaming project, but, like all universities, we have to watch our budget," reflects Stevenson. "So, we tried the experiment with Linux, and it worked beyond all of our expectations. Linux support is great because we can ask questions in any number of forums and get answers from an entire community, often from the developers themselves. And, the whole thing is free. It’s a no-brainer. Why would we do anything else?"
Now, with a very large and sophisticated network, the IT Department needed a way to manage it all with a minimum of labor. So, they decided to set up a Linux statistics machine and loaded it with MRTG, a Linux-based open source project to monitor and report on network use. They use a product called WebTrends Enterprise Suite to track Web site hits, and a series of home-brewed scripts to track other items of interest.
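In general terms, tools like MRTG work by polling traffic counters on network devices at regular intervals and turning the difference between two readings into a rate. The Python sketch below shows only that arithmetic. The function names, the five-minute interval and the 10 Mbps link speed in the usage note are illustrative assumptions, not details of Seton Hall's setup or of MRTG's own code.

```python
# Illustrative rate calculation from two counter samples; not MRTG source code.

COUNTER32_MODULUS = 2 ** 32  # 32-bit SNMP octet counters wrap around at 2^32


def octet_delta(previous, current):
    """Octets transferred between two samples, allowing for one counter wrap."""
    return (current - previous) % COUNTER32_MODULUS


def utilization(previous, current, interval_seconds, link_bits_per_second):
    """Fraction of link capacity used, averaged over the polling interval."""
    bits = octet_delta(previous, current) * 8
    return bits / interval_seconds / link_bits_per_second
```

For example, a counter that advances by 37,500,000 octets over an assumed 300-second interval on a 10 Mbps link works out to an average utilization of about 10 percent.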
Centralized logging is also important in a large environment. This way, a single logging server can notify staff when problems or issues come up with various application servers. All Linux machines on campus keep local copies of their logs and also send the data to the logging server. Seton Hall also now centrally manages its printers, using the Java HP JetDirect management software. So now, Seton Hall manages everything of interest on the network from one central source powered by Linux.
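The "local copy plus central copy" pattern can be sketched from an application's point of view using Python's standard logging module. This is a minimal illustration under stated assumptions: the function name, log format and choice of UDP port 514 are picked for the example, and in practice this forwarding is often handled by each machine's syslog daemon rather than by application code.

```python
import logging
import logging.handlers


def build_logger(name, local_path, log_host, log_port=514):
    """Return a logger that writes each record locally and forwards it via syslog."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Local copy of the log, kept on the machine itself.
    local = logging.FileHandler(local_path)
    local.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(local)

    # Second copy sent over UDP to the central logging server.
    remote = logging.handlers.SysLogHandler(address=(log_host, log_port))
    remote.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(remote)
    return logger
```

A server would call something like build_logger("webserver", "/var/log/webserver.log", log_host) once at startup, where log_host is whatever address the central logging machine uses. If that machine is unreachable, the local file still holds a complete record.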
Seton Hall recently implemented another ambitious project based on Linux – e-mail for life for alumni. This will eventually involve keeping e-mail accounts for a growing alumni population of roughly 65,000. The biggest challenge of this project is contacting existing alumni to initially set up the service.
Here and Now
The current big project underway involves centralizing all storage onto a 4.3-terabyte IBM Shark storage array. Roughly 17 servers and 24 SCSI interfaces will connect to this array, and each will carve out a portion of this space as its own. This project is also spawning some file system upgrades. "We’re getting into some very big file systems, some more than 800 GB, and the traditional UNIX file system is starting to have problems. For example, if something shuts down abnormally, it needs to go through an entire fsck cycle at the next boot.
"This is a big deal with 800 GB! So, we’re going to use the Reiser file system, which is a journaling file system coming with the Linux 2.4 kernel. Reiser uses transactional control similar to databases, so we can simply roll journals forward or backward as needed to recover from a failure event," says Stevenson. "Our FTP site (ftp.shu.edu) uses Reiser right now, and we’re going to roll it out all over the place soon."
The projects won’t end there. Seton Hall is evaluating Lotus Domino on Linux. Storage, of course, will reside inside the IBM Shark. Eventually, all storage – even IBM mainframe storage – will reside on the Shark. Seton Hall is also evaluating Samba as a replacement for all 15,000 NetWare user accounts on campus, using instances of Linux deployed on an IBM S/390 Series mainframe. IBM claims it can run up to 30,000 virtual Linux servers on top of VM. Details are available by searching for "Virtual Linux" at www.s390.ibm.com.
Bernd Walter, Executive Director of IT Services, notes, "Because Linux runs on a variety of hardware platforms, we can consolidate workloads onto fewer 'boxes,' as well as larger scalable hardware platforms like the S/390. It also helps us on the support side, since there are fewer operating systems to support."
That pivotal 1995 report set in motion a chain of events that is still unfolding. As the world moves into the next millennium, Seton Hall University is helping to lead the way. Linux doesn’t scale to the enterprise? Don’t say that to anyone at Seton Hall University because they know better.
Stevenson sums it up best. "One of the most curious things we noticed early on about all this is that our Intel servers have been the most reliable of any. We’ve had hardware problems with the mainframes, the RS/6000s, and all the proprietary stuff. But, the Intel servers have been rock solid. If something new comes along that’s even better, we’ll certainly take a look at it. But, the bar is really high right now."
Greg Scott is Chief Technology Officer at Infrasupport Etc. Inc. (Eagan, Minn.), and has been a technology author for over 10 years. He can be reached at [email protected].