Comparing OS/390 to Windows NT: Been There, Done That
For a long time Microsoft’s Windows was regarded merely as a desktop system, the home of single-user applications like word processors. In NT, Windows has grown up (and keeps growing, to 30, 40 or more million lines of code in Windows 2000). NT does not squarely confront OS/390 as a multi-user, server and transaction-support operating system, but the OS/390 community is conscious of NT, if only because its members employ applications like terminal and FTP clients to reach back-ends like TSO, UNIX System Services and DB2.
For those who are curious about Windows NT’s workings or who feel a vague threat from it, this article will delve beneath the user-friendly GUI to compare and contrast NT with OS/390, looking for similarities and the occasional striking difference.
The Ascent of Windows NT
Windows NT has more history than is usually realized. The direct line started with Digital Equipment engineers who created the VMS operating system; in 1988 many of them moved to Microsoft. Their work was released in 1993 as Windows NT 3.1, the first version of NT.
There are other strands of influence. A great deal of the original GUI work was done at Xerox PARC. Virtual storage and preemptive multitasking had been in use since the 1960s and 1970s. Objects had become common currency in the 1980s. The Carnegie-Mellon MACH kernel was reflected in the division of the NT core into layers, where each higher layer is more abstract and bigger than the previous and is available to neighboring layers via APIs only.
Address spaces isolate programs from one another, provide more virtual storage than is available in real memory, are the medium in which most programs run and serve as a point of resource ownership. In NT on x86, an address space is 4G in size; in OS/390, 2G. In a typical NT system, the first 2G addresses (0-7FFFFFFF) are private, and the next 2G (80000000-FFFFFFFF) are common, or system. In OS/390, an address space runs from 0 through 7FFFFFFF. For reasons of history, common areas – CSA, SQA, LPA and nucleus – straddle the 16M line, with private areas above and below.
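The boundaries just described can be expressed as a few lines of arithmetic. The following Python sketch uses only the addresses given in the text; the function names are invented for illustration and correspond to no NT or OS/390 interface.

```python
# Boundaries taken from the text; all names here are illustrative only.
NT_PRIVATE_TOP = 0x7FFFFFFF      # last private address in a typical NT x86 layout
NT_SPACE_TOP = 0xFFFFFFFF        # last address in the 4G NT address space
OS390_SPACE_TOP = 0x7FFFFFFF     # last address in the 2G OS/390 address space
SIXTEEN_MEG_LINE = 16 * 1024 * 1024


def nt_region(addr: int) -> str:
    """Classify an NT x86 virtual address as private or system (common)."""
    if not 0 <= addr <= NT_SPACE_TOP:
        raise ValueError("address outside the 4G address space")
    return "private" if addr <= NT_PRIVATE_TOP else "system"


def os390_side_of_line(addr: int) -> str:
    """Say on which side of the 16M line an OS/390 virtual address falls."""
    if not 0 <= addr <= OS390_SPACE_TOP:
        raise ValueError("address outside the 2G address space")
    return "below 16M" if addr < SIXTEEN_MEG_LINE else "above 16M"
```

Whether a given OS/390 address is private or common cannot be decided from the address alone in this sketch, since the sizes of CSA, SQA, LPA and nucleus vary by installation.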
In NT, a process comprises not only an address space but also threads and attributes like priority class and resources. In OS/390, an address space includes separate virtual storage, tasks, resources and attributes like non-swappability. In both systems, an address space may be swapped out to free up storage frames, and pages may be stolen for the same purpose.
In NT, virtual storage is divided into pages, which on x86 are mostly 4K in size. A page can be backed by a real frame (RAM) and by hard disk storage. Paged virtual storage can be backed by a real frame or not; non-paged storage is always backed. Where page faulting cannot be tolerated, backing can be guaranteed by locking down the pages. Otherwise, paged storage is subject to stealing. Before being reused, stolen frames are put on a queue from which they can be reclaimed with content intact. Page tables consume enough real storage that they are allowed to be paged out. OS/390 has very close analogs to all these features.
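Two of these ideas lend themselves to a small model: splitting an address into page number and offset, and reclaiming a stolen frame before it has been reused. The Python sketch below is a simplification under stated assumptions. The class and method names are hypothetical, and the real page-stealing policies of NT and OS/390 are far more elaborate.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # the 4K page size mentioned in the text


def split_address(addr):
    """Return (page number, offset within page) for a virtual address."""
    return addr // PAGE_SIZE, addr % PAGE_SIZE


class StolenFrameQueue:
    """Hypothetical model: stolen frames wait here with their contents intact."""

    def __init__(self):
        self._frames = OrderedDict()  # page number -> contents, oldest first

    def steal(self, page, contents):
        """Take a frame away from its page and queue it for possible reuse."""
        self._frames[page] = contents

    def reclaim(self, page):
        """Return the contents if the frame has not been reused, else None."""
        return self._frames.pop(page, None)

    def reuse_oldest(self):
        """Give the oldest stolen frame to another page; its contents are lost."""
        page, _ = self._frames.popitem(last=False)
        return page
```

A page that is referenced while its frame is still queued gets the frame back without any disk read; once the frame has been reused, the page must come in from its disk backing instead.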
An NT page may be readable and writeable. For user-mode programs, it may be readable only, or neither readable nor writeable. Storage above 2G is not accessible by user-mode programs. There is also a type of storage duplicated upon first write, which is handy in accommodating writeable static in DLLs and program forking. OS/390 storage can be readable, writeable, copy-on-write or none of these.
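Storage duplicated upon first write can be observed from user code. The sketch below uses Python's standard mmap module with ACCESS_COPY, which asks the host operating system for a copy-on-write mapping of a file. It illustrates the concept on whatever system runs it, not NT's internal implementation, and the function name is invented for this example.

```python
import mmap
import os
import tempfile


def copy_on_write_demo():
    """Write through a copy-on-write mapping; return (private view, file contents)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"static data")
        # ACCESS_COPY: writes change this mapping only, never the underlying file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as view:
            view[0:6] = b"STATIC"      # first write triggers the private copy
            private = view[:]
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 64)      # the file itself is unchanged
        return private, on_disk
    finally:
        os.close(fd)
        os.remove(path)
```

The writer sees its own change while the file, and any other mapping of it, keeps the original bytes, which is the behavior that makes writeable static data in a shared module safe.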
One thing not present in NT is storage protection by key. This dates from MVT, an OS/390 predecessor, which ran several programs in a single address space and isolated them by storage key.
Preemptive, Interrupt-Driven Scheduling
In NT and in OS/390, a piece of work has a priority, and it is subject to losing the CPU whenever higher-priority work becomes ready. Preemption may happen when an external event, such as an I/O interrupt, takes place; if the interrupt makes some other thread ready and the latter is of higher priority, the interrupted thread loses the CPU until no higher-priority work remains. Preemption is supplemented by allocating a fixed quantum of time to a thread when it gets control on the CPU. When the quantum expires, the thread may be displaced from the CPU, even by other threads with the same priority.
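The interplay of priority preemption and quantum expiry can be seen in a toy simulation of a single CPU. The Python sketch below is a deliberately simplified model, not NT's dispatcher: time advances in whole ticks, a larger number means higher priority, and a preempted thread rejoins the back of its priority level. All names are hypothetical.

```python
import heapq
from itertools import count


def simulate(threads, quantum):
    """threads: iterable of (name, priority, ready_at, work_ticks).

    Returns a list naming the thread that held the CPU at each tick
    (None for an idle tick)."""
    pending = sorted(threads, key=lambda t: t[2])  # ordered by ready time
    seq = count()                                  # FIFO order within a priority
    ready = []                                     # heap of (-priority, seq, name, work_left)
    running = None                                 # [priority, name, work_left, quantum_left]
    timeline = []
    tick = 0
    while pending or ready or running:
        # An "interrupt" at this tick makes waiting threads ready.
        while pending and pending[0][2] <= tick:
            name, prio, _, work = pending.pop(0)
            heapq.heappush(ready, (-prio, next(seq), name, work))
        # Preemption: higher-priority ready work displaces the running thread.
        if running and ready and -ready[0][0] > running[0]:
            heapq.heappush(ready, (-running[0], next(seq), running[1], running[2]))
            running = None
        # Quantum expiry: give way to ready work of equal priority, if any.
        if running and running[3] == 0:
            if ready and -ready[0][0] == running[0]:
                heapq.heappush(ready, (-running[0], next(seq), running[1], running[2]))
                running = None
            else:
                running[3] = quantum
        if running is None and ready:
            neg_prio, _, name, work = heapq.heappop(ready)
            running = [-neg_prio, name, work, quantum]
        if running is None:
            timeline.append(None)
        else:
            timeline.append(running[1])
            running[2] -= 1
            running[3] -= 1
            if running[2] == 0:
                running = None
        tick += 1
    return timeline
```

With two equal-priority threads A and B ready at tick 0 and a higher-priority thread C becoming ready at tick 1, C takes the CPU from A at once, while A and B otherwise alternate by quantum.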
OS/390 uses time slicing to a similar end. If an interruption on the same CPU makes higher-priority work ready, the running SRB or TCB is preempted immediately. If the interruption occurred on another CPU, the SRB or TCB becomes subject to losing its CPU to that work only after it has run for a dynamically adjusted amount of time. But OS/390 employs techniques in addition to time slicing to achieve fair scheduling.
Windows NT features two fundamental privilege states: User mode (not privileged) and kernel mode (privileged), which correspond to ring 3 and ring 0, respectively, in x86 hardware. In OS/390, problem state is not privileged, and supervisor state is. In both systems privileged state confers, amongst other things, the use of all instructions and access to resources denied to the non-privileged state.
Access to kernel mode in NT can be given by 1) the boot sequence; 2) invocation by an interrupt, such as I/O or timer, by an exception (e.g., invalid opcode) or by the INT instruction; or 3) receiving control from another program itself running in kernel mode. These mechanisms are recognizable from OS/390. INT in particular will be familiar, since it behaves much like OS/390’s SVC instruction: Status is saved in a predefined place, control passes in kernel mode to a handler defined by the operating system and the handler gives control to a routine designated by the operand of the instruction.
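The three steps shared by INT and SVC (save status, enter a privileged handler, route by operand) can be modeled in a few lines. The Python sketch below is purely conceptual: the class, the service number and the routine are hypothetical, and real hardware saves far more state than a mode flag.

```python
class TrapModel:
    """Hypothetical model of an INT- or SVC-style transfer into privileged code."""

    def __init__(self):
        self.mode = "user"
        self.save_area = []   # stands in for the predefined place where status is saved
        self.services = {}    # operand -> routine, as defined by the operating system

    def trap(self, operand, *args):
        self.save_area.append(self.mode)      # 1. status is saved
        self.mode = "kernel"                  # 2. the handler runs privileged
        try:
            routine = self.services[operand]  # 3. routed by the instruction's operand
            return routine(self, *args)
        finally:
            self.mode = self.save_area.pop()  # caller's state is restored on return
```

Because the table of routines belongs to the operating system, a user program can choose which service to request but cannot choose what code runs in privileged state.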
Are there user-written INT routines, like user SVCs? No, but read on.
Kernel-Mode Code via a Device Driver
The only supported mechanism for adding kernel-mode code to NT is the device driver. This may sound like having to write an IOS driver to perform authorized functions: In reality the process is simply founded on the I/O structure, but demands little knowledge of I/O unless one intends to effect, or affect, I/O.
To set up a device driver, the registry is updated to indicate that the driver is to be loaded at boot time; alternatively, a program may invoke APIs to install the driver at once. Under either approach, the installer must have sufficient authority. The driver code is loaded in system space, and the initialization entry point is invoked. This routine creates a device object (the driver’s master "control block") and a symbolic link to that object. For a driver called MyDriver, the link might be named \DosDevices\MyDriver. Finally, the routine defines entry points for operations MyDriver will support. At a minimum, those would be Create (open), DeviceControl and Close.
A program connects to the driver by opening the device object through the file name \\.\MyDriver. The Open effector, running in kernel mode, saves the caller’s state, understands the file name to refer to \DosDevices\MyDriver and gives control to MyDriver’s entry point for Open. This routine, if satisfied that the invoker is a legitimate user, returns a handle.
To get a driver function, the program supplies the handle along with driver-defined parameters in a call to DeviceIoControl. The DeviceIoControl effector (in kernel mode) gives control to MyDriver’s entry point for DeviceControl. This routine satisfies itself that the request is proper, does whatever is necessary and returns.
When the program no longer needs the driver’s functions, it should close \\.\MyDriver so that MyDriver’s Close entry point can clean up (if the program omits to close the file, NT does so upon program termination).
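The open, control and close sequence can be summarized in a conceptual model. The Python sketch below is not driver code and calls no Windows API; it imitates the routing described above, in which \\.\MyDriver resolves to \DosDevices\MyDriver and each request reaches the matching driver entry point. The class names, the caller check and the control code are all assumptions made for illustration.

```python
class MyDriver:
    """Hypothetical driver exposing Create, DeviceControl and Close entry points."""

    IOCTL_UPPERCASE = 1  # invented control code

    def __init__(self):
        self.open_count = 0

    def create(self, caller):
        if caller != "admin":               # driver decides who is a legitimate user
            raise PermissionError(caller)
        self.open_count += 1

    def device_control(self, code, data):
        if code != self.IOCTL_UPPERCASE:    # driver validates the request
            raise ValueError("unsupported control code")
        return data.upper()

    def close(self):
        self.open_count -= 1                # clean-up on close


class IoManagerModel:
    """Hypothetical stand-in for the kernel-mode effectors that route requests."""

    def __init__(self):
        self.links = {}      # symbolic link name -> driver
        self.handles = {}    # handle -> driver
        self._next_handle = 1

    def load_driver(self, name, driver):
        self.links["\\DosDevices\\" + name] = driver

    def open(self, path, caller):
        prefix = "\\\\.\\"                  # the four characters \\.\
        if not path.startswith(prefix):
            raise FileNotFoundError(path)
        driver = self.links["\\DosDevices\\" + path[len(prefix):]]
        driver.create(caller)
        handle = self._next_handle
        self._next_handle += 1
        self.handles[handle] = driver
        return handle

    def device_io_control(self, handle, code, data):
        return self.handles[handle].device_control(code, data)

    def close(self, handle):
        self.handles.pop(handle).close()
```

A caller would load the driver once, open r"\\.\MyDriver" to obtain a handle, pass that handle with a control code to device_io_control, and close the handle when finished.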
The reader will appreciate the similarities of a driver to an SVC routine. An SVC can be loaded at IPL or dynamically. It is invoked by the SVC instruction, which passes control to the SVC Interrupt Handler, which itself saves caller status and passes control to the designated SVC routine. The SVC runs in PSW key 0, supervisor state, and does what it was designed to carry out. And though SVCs do not require connection, PC routines, the main alternative to SVC routines, do.
I/O processing is perhaps the most striking similarity between Windows NT and OS/390, for in each, I/O is effected in stages and with similar mechanisms.
In NT, I/O works like this:
The application asks for I/O, specifying the handle of an open file and other particulars. NT routes the request to the routine the device driver defined for such I/O. If the device is not busy, the routine calls lower-level components to issue a physical request to the device. If the device is in use, the routine may queue the request off the driver’s device object.
When the device completes its work, it generates an interrupt. NT’s trap handler gets control in a random address space, and control is routed to the driver’s Interrupt Service Routine (ISR). The ISR is running in a state severely restricting execution (no page faulting, for example) and so does only what must be done immediately, e.g., clearing the interrupt, perhaps starting a queued-up I/O operation. The ISR schedules the driver’s Deferred Procedure Call (DPC) routine and exits.
The DPC is given control (usually on the same CPU, in the same random address space). Running at a somewhat less restrictive state, the routine does things like updating control structures in kernel space. If the DPC needs the application’s address space, it schedules an Asynchronous Procedure Call (APC) routine and exits.
The APC runs in the application address space, in kernel mode. The APC, which can page-fault, updates user buffer and parameter areas and makes final status available. If the I/O request has been entirely effected, the APC marks it complete and ready for the application.
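The hand-off from ISR to DPC to APC amounts to passing a request along a chain of queues, each stage doing only what its execution state allows. The Python sketch below models that chain under simplifying assumptions: there is one CPU, requests are plain dictionaries, and every name is hypothetical.

```python
from collections import deque


class IoCompletionModel:
    """Hypothetical model of staged I/O completion: ISR, then DPC, then APC."""

    def __init__(self):
        self.dpc_queue = deque()   # work deferred by ISRs
        self.apc_queues = {}       # thread name -> queue of APCs for that thread
        self.log = []

    def isr(self, request):
        # Most restricted stage: acknowledge the device and defer everything else.
        self.log.append("ISR")
        self.dpc_queue.append(request)

    def run_dpcs(self):
        # Less restricted: may touch kernel structures, but not the user's buffers.
        while self.dpc_queue:
            request = self.dpc_queue.popleft()
            self.log.append("DPC")
            self.apc_queues.setdefault(request["thread"], deque()).append(request)

    def run_apcs(self, thread):
        # Runs in the requesting thread's address space, so user buffers are reachable.
        queue = self.apc_queues.get(thread, deque())
        while queue:
            request = queue.popleft()
            data = request["device_data"]
            request["buffer"][:len(data)] = data
            request["status"] = "complete"
            self.log.append("APC")
```

The request stays pending through the first two stages and is marked complete only in the third, once the data has reached the application's buffer.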
Anyone familiar with the I/O mechanism of OS/390 from the earliest MVS days will see similarities: Access-method request, SVC, STARTIO in IOS, the SSCH instruction, interrupt, I/O FLIH gathering status and SLIH perhaps driving more I/O, SRB routine and notification of the originating program. NT and OS/390 have arrived at similar solutions to maximizing thread throughput and efficient device use in an operating system of address spaces and preemptive scheduling.
One thing in NT I/O is strikingly different from anything in OS/390: The ability to intercept requests. By defining a filter driver in front of a file system’s driver, requests made to a file system, network card or display can be inspected and manipulated. For example, a filter driver can be inserted above the NT File System driver, so that requests (open, read, write, etc.) can be seen and perhaps changed. One purpose of a filter driver might be to compress/decompress a file’s data.
Another use could be to learn of any changes made to the file data. A third purpose might be to run a virus scanner as part of file operations. In short, a driver can be simply an exit on the I/O path.
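The compression case shows how little a filter needs to know about the driver beneath it. In the Python sketch below, a hypothetical filter object sits above a hypothetical file system object and transforms data on the way down and on the way back up. Real filter drivers work on I/O request packets rather than method calls, so this is an analogy only.

```python
import zlib


class FileSystemModel:
    """Hypothetical bottom layer: stores whatever bytes it is handed."""

    def __init__(self):
        self.stored = {}

    def write(self, name, data):
        self.stored[name] = data

    def read(self, name):
        return self.stored[name]


class CompressionFilter:
    """Hypothetical filter layered above another driver with the same interface."""

    def __init__(self, lower):
        self.lower = lower

    def write(self, name, data):
        self.lower.write(name, zlib.compress(data))    # change the request going down

    def read(self, name):
        return zlib.decompress(self.lower.read(name))  # change the result coming up
```

The application calls the top of the stack and sees its own data unchanged, while the layer underneath holds only the compressed form.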
Interrupt Request Level (IRQL)
IRQL is the major NT mechanism with no direct analog in OS/390. IRQL is a software-enforced privilege, and every executing routine has an IRQL. The system scheduler, when it gets control from a routine losing control of its CPU, assumes the IRQL associated with the event causing loss of the CPU. One of the scheduler’s purposes is to drain queues of work requests, so the scheduler starts looking for work that is ready to go, that requires the IRQL now in force and that can run on that CPU. If such work exists, it is given control. Eventually, there remain no suitable work requests, so the scheduler reduces IRQL one notch and starts looking at that level.
A work request is presented by an interrupt. In this context, interrupt comprises hardware events (for example, I/O, timer and clock interrupts) and software events (queuing of requests to run routines). Every interrupt is associated with a specific IRQL and will not be carried out on a CPU so long as something is running at that IRQL. IRQL on the CPU must drop beneath the interrupt IRQL for the interrupt to get service. It is one of those interrupts that causes a routine to lose the CPU.
A kernel-mode routine may increase or decrease its IRQL. NT effects change of IRQL by raising or lowering an internal value and, for higher IRQLs, by masking of interrupts, such as I/O and timer. NT prohibits a routine from calling functions not supporting its current IRQL. In effect, higher IRQL confers greater importance and more restriction.
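The rule that an interrupt waits until the CPU's IRQL drops beneath its own can be captured in a small model of one CPU. The Python sketch below is a conceptual illustration with invented names and numeric levels; it does not reproduce NT's actual IRQL values or its dispatching code.

```python
class IrqlModel:
    """Hypothetical single-CPU model of IRQL-gated interrupt service."""

    def __init__(self):
        self.irql = 0
        self.pending = []    # (irql, name) of interrupts not yet serviced
        self.serviced = []   # names, in the order they were serviced

    def interrupt(self, irql, name):
        """Present an interrupt; it is serviced now only if above the current IRQL."""
        if irql > self.irql:
            self._service(irql, name)
        else:
            self.pending.append((irql, name))

    def raise_irql(self, new_irql):
        """Raise the current IRQL and return the previous value."""
        if new_irql < self.irql:
            raise ValueError("raise_irql cannot lower the IRQL")
        previous, self.irql = self.irql, new_irql
        return previous

    def lower_irql(self, new_irql):
        """Lower the IRQL, then service pending interrupts now above it."""
        self.irql = new_irql
        while True:
            eligible = [p for p in self.pending if p[0] > self.irql]
            if not eligible:
                break
            highest = max(eligible, key=lambda p: p[0])
            self.pending.remove(highest)
            self._service(*highest)

    def _service(self, irql, name):
        previous = self.irql
        self.irql = irql               # the routine runs at the interrupt's IRQL
        self.serviced.append(name)
        self.lower_irql(previous)      # return to the interrupted level
```

If a routine raises its level to 5, interrupts at levels 2 and 4 stay pending while one at level 7 is serviced at once; when the routine drops back to 0, the level 4 interrupt is serviced before the level 2 one.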
IRQL is a scheduling mechanism. It does not provide serialization if there is more than a single CPU, for IRQL governs scheduling on the current CPU only. Although IRQL’s hierarchy bears a resemblance to OS/390 locking, IRQL is not truly like OS/390 locking, inasmuch as an OS/390 lock serializes all of a machine’s CPUs.
Serialization and Coordination
For serialization, spin locks are available to kernel-mode routines. These function as spin locks do on OS/390. Spin locks are intended for relatively short code paths. For longer-running code paths and for user-mode programs, a resource object may be acquired with shared or exclusive access, and if it is not immediately available, the invoker may indicate that it will wait until the resource is free. This is rather like OS/390’s ENQ, although NT mechanisms do not provide for serialization between machines.
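Shared versus exclusive acquisition, with the choice of waiting or not, can be sketched with Python's standard threading primitives. The class below is a minimal illustration of the semantics described, not the NT resource object or ENQ; the names are hypothetical, and the sketch makes no attempt at fairness between readers and writers.

```python
import threading


class SharedExclusiveResource:
    """Hypothetical resource that many may share or one may hold exclusively."""

    def __init__(self):
        self._cond = threading.Condition()
        self._sharers = 0
        self._exclusive = False

    def acquire_shared(self, wait=True):
        with self._cond:
            while self._exclusive:
                if not wait:
                    return False           # caller chose not to wait
                self._cond.wait()
            self._sharers += 1
            return True

    def acquire_exclusive(self, wait=True):
        with self._cond:
            while self._exclusive or self._sharers:
                if not wait:
                    return False
                self._cond.wait()
            self._exclusive = True
            return True

    def release(self):
        with self._cond:
            if self._exclusive:
                self._exclusive = False
            elif self._sharers:
                self._sharers -= 1
            else:
                raise RuntimeError("resource is not held")
            self._cond.notify_all()        # wake any waiters to retry
```

Any number of holders may share the resource at once, but an exclusive request succeeds only when nobody holds it at all.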
For coordination between two programs in NT, synchronization or notification objects may be created, waited on and signalled ("posted"). If a program needs to wait for a certain time or ensure it is active after a certain time, timer objects may be employed. OS/390 has equivalents in ECBs, WAIT/POST services and timer requests.
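The wait-and-signal pattern looks much the same in any threaded environment. The Python sketch below uses threading.Event from the standard library as a stand-in for an NT notification object or an OS/390 ECB; the function name and the five-second limit are arbitrary choices for this example.

```python
import threading


def wait_and_post():
    """One thread waits on an event; another signals ("posts") it when done."""
    done = threading.Event()
    results = []

    def worker():
        results.append("work finished")
        done.set()                       # the "post"

    thread = threading.Thread(target=worker)
    thread.start()
    signalled = done.wait(timeout=5)     # the "wait", bounded by a time limit
    thread.join()
    return signalled, results
```

The timeout on the wait plays the part of a timer: the waiter regains control either when the event is signalled or when the time runs out, whichever comes first.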
Threads in NT are units of work, under which most programs run. They have priority relative to one another, own resources like timers and are an error boundary (an error under one thread does not necessarily affect another thread). Since threads own resources, thread termination is the occasion for resource cleanup, such as closing files. Threads may be doing work, or they may be waiting for work. Consequently, threads can wait on all manner of events (I/O completion, timer objects or objects to be signalled by other threads, for example). OS/390 tasks are analogs of NT threads.
There are, however, a couple of notable differences from OS/390. First, NT has no lightweight threads like OS/390’s SRBs. If a program requires the context (addressability) of a particular thread, the program must create a new thread in the process, or it must queue up an APC to run under the thread. A second difference is that in NT, only threads have priorities, and processes do not. However, the priority class of a process does limit how far the priorities of its threads can change. In OS/390, address spaces have priorities, and within an address space each task or SRB has a priority relative to the other TCBs and SRBs in the address space.
In NT, two or more processes can share memory by mapping the same real frames to addresses in each address space. If it is a file that the concerned memory contains, the file on hard disk is employed for paging rather than the usual paging space. Since the underlying frames and file storage are shared, the processes enjoy a consistent view of the memory’s contents. OS/390, for its part, supports memory mapping in general, and for VSAM files specifically.
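File-backed sharing can be tried with Python's standard mmap module, which requests a shared mapping from whatever operating system runs it. In the sketch below, two mappings of one temporary file stand in for two processes; this is an assumption made to keep the example self-contained, and the function name is invented.

```python
import mmap
import os
import tempfile


def shared_mapping_demo():
    """Write through one mapping of a file and read the result through another."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * mmap.PAGESIZE)
        # ACCESS_WRITE: changes go to the shared pages and to the underlying file.
        view_a = mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_WRITE)
        view_b = mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_WRITE)
        try:
            view_a[0:5] = b"hello"
            return view_b[0:5]           # the second view sees the first view's write
        finally:
            view_a.close()
            view_b.close()
    finally:
        os.close(fd)
        os.remove(path)
```

No copy is made between the two views: both refer to the same underlying pages, which is why the write is visible without any explicit transfer.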
System Activity in the Background
Both NT and OS/390 maintain threads that work in the background. In NT, there are system threads (always running in kernel mode) responsible for things like swapping processes out and in, writing changed pages to paging files, workload balancing and I/O caching via read ahead and write behind. Some of these run in the System process. OS/390, for its part, has special-purpose tasks doing similar duties, executing in authorized state and running in system address spaces.
Services provide functions important to the proper running of NT. Services typically run with special authority (the system account), although any valid id/password may be designated. Services are usually started late in boot, before a person can log on. A service’s start-up can be made contingent on the start-up of other services. In fact, successful completion of NT’s start-up can be made dependent on successful service start-up. Alternatively, a service can be started manually by a person or programmatically, via an API. Once started, a service runs under a thread in a process. A service typically has no GUI or other user interface; rather, the service accepts requests via a programming interface.
Examples of services are anti-virus shields, TCP/IP services, the spooler and the logon service. They are on the order of OS/390 subsystems such as JESx, XCF, CICS, TSO and UNIX System Services daemons.
The Windows NT registry has a role unlike that of any single mechanism in OS/390. It keeps configuration information and so is used in boot. The registry holds information about installed software, including any information the software may wish to save. The registry contains user information and access information. Hence the registry is a general-purpose repository, capable of storing whatever information can be made to fit into its directory-like structure.
The registry has some interesting features. For example, a piece of the registry can be protected from access by programs that lack sufficient authority. By default, some pieces are so protected, and other pieces are open to all access. Another characteristic of the registry is that a program can be apprised of a change to a piece of interest by opening the appropriate key, associating a notification event with the key and waiting on the event. A third feature is that the registry can be backed up, in whole or in part. Given the popularity of registry hacks, flexible backup is no small virtue.
About the Author: James Antognini is Senior Software Engineer at IBM T. J. Watson Research Center in Yorktown Heights, N.Y.