Mutually Exclusive Goals
I ran across an interesting situation a couple weeks ago that strikes at the very heart of the security dilemma we all face every day. The general challenge: How does an organization allow public access to its data, while keeping crackers and other bad guys away?
In this specific scenario, the customer has a series of Web servers sitting outside its firewall and an Ethernet switch and some software to load-balance them. The switch uses Network Address Translation (NAT) to present a single IP address for all the Web servers to the public. Members of the public connect to a URL associated with this IP address, and the switch decides which Web server will service the request. When people connect to the site, they fill out various forms with demographic information. My customer collects this data and analyzes the demographics.
Here are some challenges. Each Web server is a discrete system with its own set of disk drives. This means each Web server updates files on a local hard drive when people fill out the Web forms. But for proper analysis, the data needs to live in a single, central repository. The data also needs to be available immediately. This means solutions that periodically use FTP or some other means to copy data from the Web servers, across the firewall, to the internal main file server are not acceptable.
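To make the problem concrete, here is a rough sketch (in Python, purely for illustration; the directory, file, and field names are all invented) of what each Web server effectively does today. Every form submission lands on that server's own local disk:

import csv, os, time

LOCAL_DATA_DIR = r"C:\inetpub\data"   # each server's private disk (hypothetical path)

def save_submission(fields):
    """Append one form submission to this server's local data file."""
    path = os.path.join(LOCAL_DATA_DIR, "forms.csv")
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [time.strftime("%Y-%m-%d %H:%M:%S")] +
            [fields.get(k, "") for k in ("name", "age", "zip")])

With several load-balanced servers, one stream of submissions ends up scattered across that many private files, and somebody must merge them before any analysis can happen.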
The system also needs to be highly reliable: No single failure should cause a loss of data or availability. Instead of Windows NT/2000, the internal main file server uses Pathworks Advanced Server running on a four-node Compaq OpenVMS cluster with a fully redundant, central disk farm. Based on AT&T Advanced Server for Unix, Pathworks presents itself to the outside world as a Windows NT primary or backup domain controller. Unlike an NT cluster, any member of an OpenVMS cluster can access any file on any disk in the central disk farm, concurrent with any other OpenVMS cluster member, without fear of data corruption or other concurrency problems. This capability makes Pathworks a perfect file server in situations requiring the utmost reliability.
To meet my customer’s unusually demanding availability and reliability requirements, the most tempting solution is for each of the Web servers to map a drive letter to a share offered by the Pathworks server. This way, each Web server thinks it is writing to its own hard drive, when, in fact, it is populating data into a central directory inside the Pathworks cluster. The data resides safely inside an internal server, immediately available for subsequent analysis.
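Here is the same hypothetical handler with the one change the drive-mapping idea requires: the destination now points across the firewall at a share on the Pathworks cluster. The UNC path is made up, and a drive letter mapped with net use would work the same way:

import csv, os, time

CENTRAL_DATA_DIR = r"\\pathworks\webforms"   # hypothetical Pathworks share

def save_submission(fields):
    """Identical write, but the bytes land on the internal cluster."""
    path = os.path.join(CENTRAL_DATA_DIR, "forms.csv")
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [time.strftime("%Y-%m-%d %H:%M:%S")] +
            [fields.get(k, "") for k in ("name", "age", "zip")])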
Unfortunately, this solution creates gaping security holes. To offer a Windows NT share across a firewall, Microsoft Knowledge Base article Q179442 says the firewall must enable TCP and UDP ports 135, 137, 138, and 139, plus all ports above 1024. It may also be necessary to enable other ports for name translation and other services. To my mind, this pretty much defeats the purpose of a firewall: Why have one only to leave it wide open?
We came up with some other ideas. The most popular is to set up two firewalls. Put a heavily restricted firewall between the Internet and the Web servers that enables only, say, HTTP and SMTP traffic. Then put an internal firewall between the Web servers and the internal network that enables traffic only from the Web servers’ IP addresses. This is not perfect: a determined bad guy might somehow use HTTP or SMTP to compromise a Web server and ultimately get inside the internal network. But he would need to work very hard to make that happen.
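Just to pin down the two rule sets, here is a toy model of the idea (the addresses and port numbers are examples, not the customer's actual configuration):

WEB_SERVER_ADDRS = {"192.0.2.10", "192.0.2.11", "192.0.2.12"}   # example Web server addresses

def outer_firewall_allows(dst_port):
    """Internet to Web servers: only HTTP and SMTP get through."""
    return dst_port in (80, 25)

def inner_firewall_allows(src_addr):
    """Web servers to internal network: only known Web server addresses."""
    return src_addr in WEB_SERVER_ADDRS

An attacker who wants the internal network must now pass both checks: first subvert a Web server using nothing but HTTP or SMTP, then ride that server's address through the inner firewall.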
Another possibility: Add NFS to the internal file server and set up the Web servers as NFS clients. This approach may reduce the number of ports the firewall must enable, but it still creates a security hole.
Or we could use Rlogin or a similar mechanism to move Web form data on demand to the internal servers. When somebody fills out a Web form, the server would capture the data, perform an Rlogin across the firewall to the file server, and transmit the data. The Rlogin process on the receiving file server could immediately update the database. Unfortunately, this solution creates a security nightmare because the external Web servers would need to know a password for the internal file server.
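A sketch of the push-on-demand idea, using an rsh-style remote command (Rlogin proper is interactive; rsh is its scriptable cousin). The host, account, and loader command names are hypothetical:

import subprocess

INTERNAL_HOST = "fileserver.internal.example.com"   # hypothetical internal server
INTERNAL_USER = "webfeed"                           # the trusted account, and the security hole

def push_submission(record_line):
    """Send one form record to a loader command on the internal server."""
    subprocess.run(
        ["rsh", "-l", INTERNAL_USER, INTERNAL_HOST, "load_form_record"],
        input=record_line.encode(), check=True)

The fatal flaw is visible right in the sketch: the credential (or .rhosts trust entry) for INTERNAL_USER has to live on the exposed Web server, so anyone who cracks the Web server inherits a login on the internal file server.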
Using e-mail instead of Rlogin eliminates the password problem. When a person fills out the form, the Web server could package the data in an e-mail message and deliver it to an internal e-mail server. Software on the inside could monitor the mailbox and take appropriate action whenever new data arrives.
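This one is easy to sketch end to end, assuming an SMTP relay and a POP3 mailbox on the inside (every host, account, and password here is invented):

import smtplib, poplib
from email.message import EmailMessage

MAIL_HOST = "mail.internal.example.com"   # hypothetical internal mail server

def mail_submission(record_line):
    """Web server side: wrap one form record in a message and send it."""
    msg = EmailMessage()
    msg["From"] = "webform@dmz.example.com"
    msg["To"] = "formdata@internal.example.com"
    msg["Subject"] = "web form submission"
    msg.set_content(record_line)
    with smtplib.SMTP(MAIL_HOST) as s:
        s.send_message(msg)

def drain_mailbox():
    """Internal side: pull waiting submissions and feed the database."""
    box = poplib.POP3(MAIL_HOST)
    box.user("formdata")
    box.pass_("secret")                # this password never leaves the inside
    for i in range(1, len(box.list()[1]) + 1):
        lines = box.retr(i)[1]         # raw message lines as bytes
        body = b"\n".join(lines).decode("ascii", "replace")
        print("would load into database:", body)   # database step omitted
        box.dele(i)
    box.quit()

Notice what moved: the only password in the picture sits on the internal listener, not on the exposed Web servers, and SMTP is traffic the outer firewall was going to allow anyway.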
How do other organizations balance the mutually exclusive goals of high data security and public access? I am curious to know your thoughts. --Greg Scott, Microsoft Certified Systems Engineer (MCSE), is chief technology officer of Infrasupport Etc. Inc. (Eagan, Minn.). Contact him at gregscott@infrasupportetc.com.