Analysis: Do You Know Where Your Data's Been?

John Bussert

You've spent lots of time over the last few years making sure your systems are just right--that there won't be any missed steps in your programs, that Y2K is just a passing problem. But do you know where all your data is coming from?

Take a look at the control you have over your data and you might be a little surprised. Gone are the days when the keypunch department entered everything on pretty little cards that you had complete "back office" control over. You thought your order entry program was the next best thing to sliced bread because you could edit the data as it was entered!

Now, though, we're in an environment where data comes from so many areas. EDI is one, but we have some control over that. At least we've written programs to massage the data so it fits our system. However, there are two newer ways of getting data, the Web and the desktop PC, that might be more efficient from a user entry standpoint, and they should give us pause when we think about the cleanliness of that data.

I was at a conference on Internet security recently and had my eyes opened quite a bit. Not to the fact that someone could see my messages being transmitted, or that some hacker would come into my system and mess around--that's all pretty obvious stuff to watch for. What opened my eyes was what people are doing on their Web pages with the data. Some of it just boggles my mind and shows that many of the folks designing and building these sites are not systems people.

The example (from a true-life finding) was a site where the company was selling a product, had prices on the page, and the user filled in a shopping cart. I'm sure most of you have used something like this by now. Well, the developer created HTML pages from the data and sent them down to the browser. The price was sent in a non-updateable field, which sounds fine until you realize that this is exactly what hackers look for. They can easily edit the HTML, change the price you sent from $999.99 to $000.99, and then send the updated order right back to you.

Until I saw this it never occurred to me that the data we send out could come back different. This problem, however, is really a data design issue. If you wrote your own order entry program, unless you gave your users the ability to change the price for some reason, you'd look up the price from a database table. Even if the price on the screen became corrupt, you could get the right one as the order was processed.

We have to make sure that our data is "scrubbed" as it gets into our DBMS. Now this is not real tough, but it's getting more complicated than it used to be. We need real developers involved in our Web projects. They need to look not just at feeding the data to the Web page, but at how that data is being returned--if it's not the same, don't use it. You should never use the static data from a page that comes back. Only take the entry fields, then put them through the same edits and updates you would apply if the order had been entered through your OE system.
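To make that concrete, here is a minimal sketch in Python of pricing an order line from a returned page. The names (PRICE_TABLE, price_order_line, the form field names) are hypothetical, and the in-memory table simply stands in for a database lookup. The point is that only the entry fields are read from the form; any price that came back with the page is ignored.

```python
from decimal import Decimal

# Hypothetical price table; in a real system this would be a database lookup.
PRICE_TABLE = {"WIDGET-1": Decimal("999.99")}


def price_order_line(form):
    """Price one order line from a submitted form (a dict of strings).

    Only the entry fields (item_id, quantity) are read. Any "price" field
    that came back from the browser is deliberately never consulted.
    """
    item = form.get("item_id", "").strip()
    if item not in PRICE_TABLE:
        raise ValueError(f"unknown item: {item!r}")

    try:
        qty = int(form.get("quantity", ""))
    except ValueError:
        raise ValueError("quantity must be a whole number")
    if qty < 1:
        raise ValueError("quantity must be at least 1")

    unit_price = PRICE_TABLE[item]  # trusted, server-side value
    return {
        "item_id": item,
        "quantity": qty,
        "unit_price": unit_price,
        "total": unit_price * qty,
    }
```

With this approach, a form that comes back with the price changed to 0.99 is still billed at the price in the table, just as it would be in a back-office order entry program.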

This same problem manifests itself in our wonderful little PCs. Our users have gotten pretty good with Excel spreadsheets and Access databases. It's so easy for them to update, report and manipulate that they can't figure out why it's so hard on the "big system." Now though, they're updating data into that big system from their PC. This can be a great thing, but it can also lead to real data problems if it's not tightly managed.

Should you let users import data directly into their main tables with no up-front edits? I'm all for having users own their data, but who gets to clean it up when it gets messed up? You do. Worse, bad data sometimes manifests itself later after the damage has been done.
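One way to put up-front edits on a user import is sketched below in Python. It assumes the users export a CSV file with customer_id and amount columns; those column names, the split_import function, and the two edit rules are all made up for illustration. Rows that pass go on to the main table, and rows that fail are bounced back with a line number and a reason.

```python
import csv
import io
from decimal import Decimal, InvalidOperation


def split_import(csv_text, known_customers):
    """Split imported rows into (accepted, rejected) before any load.

    accepted: list of cleaned rows, ready for the main table.
    rejected: list of (line_number, raw_row, reasons) to send back to the user.
    """
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        errors = []

        customer = (row.get("customer_id") or "").strip()
        if customer not in known_customers:
            errors.append("unknown customer_id")

        amount = None
        try:
            amount = Decimal((row.get("amount") or "").strip())
            if not amount.is_finite() or amount < 0:
                errors.append("amount must be a non-negative number")
        except InvalidOperation:
            errors.append("amount is not a number")

        if errors:
            rejected.append((line_no, row, errors))
        else:
            accepted.append({"customer_id": customer, "amount": amount})
    return accepted, rejected
```

Nothing here touches the database. The idea is only that the edits run before the load, so the bad rows are caught while the user who owns them is still around to fix them.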

It seems that as things get more complicated and the demands to get data into our systems grow, we're losing control of the data coming in. We should have standards set up for how data is allowed to enter our databases, and those standards must be policed in some way.

I hope that by the time you read this, your Y2K activities are dying down, if not completed altogether. So now maybe you can take a few minutes and see where you're vulnerable to data entry point problems. If you were your own competitor, customer, or vendor, what would worry you about your systems?

There are new languages, technologies and people, all involved in updating our data now. But that doesn't mean that the rules for allowing it into the database have changed. It still needs to be checked and bounced if it's not right. Just a little common sense goes a long way.