Under Control


This columnist was recently a panel member at a medical conference on data mining and knowledge discovery. I know little about the subject, but agreed to speak anyway. Fools rush in...? It was actually "an invitational conference for collaborative development of a network-based real-time rule-server (NRR) for timely decision support."

Physicians need tools to help them make timely decisions while in the presence of patients. The complexity of medical science is such that the unaided mind cannot cope with the challenges of delivering state-of-the-art care. Doctors need to mine the knowledge base quickly enough for the knowledge to make a difference. As I listened to the introductory speaker make these simple points, I was struck by the similarity of what he said to explanations I have heard vis-à-vis the need for manufacturing execution systems. Only in this case the process is a patient.

Now in the medical field most of what's currently found in databases concerns financials--billing and costs are what's important and what's been put on the line first (much as the first uses of writing were for shipping manifests, not literature). Useful tidbits related to treatments, such as patient history, pharmacy records, surgery schedules, admitting forms, and lab test results, await integration into the computing realm.

It won't be long before 50-gigabyte databases are fairly common. It's easy to be buried by that quantity of data. Search engines, WWW published sites, terabyte memories and T1 communications bandwidth on every desktop can be problems as well as solutions. But as with every evolutionary challenge, new tools are being developed in response to the tensions arising from immense available storage capabilities and the promise of the Internet, which is the largest database of all.

The task is to reduce the data to a small quantity of useful information. Knowledge discovery in databases (KDD) automates data analysis by identifying valid, novel, useful, and understandable patterns in the data being examined, using data heredity and parentage tracking techniques that have themselves been available for two decades. The technology base for KDD is also available, if expensive. Rapid database and Internet access, including WAN and LAN TCP/IP compatibility, is mandatory. Enabling technologies include neural nets, parallel systems, emergent and inductive software, landscape theory, and system understanding.

Many of the applications envisioned for KDD presuppose the ability to access a variety of data formats and procedures. This implies either a single standard for all databases or using a technology for heterogeneous access. Look for the latter to be the reality. Java and applets are but a start in this direction.

In fact, there is other important data that will not be found in any database. The knowledge engine also needs an awareness of the present state of the "process." For example, one of the "secrets" of the PLC is that the system state is treated as part of the database. Real-time capabilities (fast, robust, and repeatable) must be part of the decision-supporting process. Obviously, it is better to make a less than optimal decision in time than an optimal decision too late. This applies to a surgeon in an operating room or a planner trying to formulate the next day's production schedule.
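The trade-off between a good answer now and a better one too late can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `best_within_deadline`, its parameters, and the idea of scoring candidates one at a time are hypothetical and come from no particular product or from the conference itself.

```python
import time


def best_within_deadline(candidates, score, budget_s, clock=time.monotonic):
    """Return the best-scoring candidate examined before the time budget expires.

    Hypothetical helper: at least one candidate is always examined, so the
    caller gets an answer even when the budget is very small.
    """
    deadline = clock() + budget_s
    best, best_score = None, float("-inf")
    for candidate in candidates:
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
        # Stop searching once the deadline has passed; keep what we have.
        if clock() >= deadline:
            break
    return best


if __name__ == "__main__":
    # Illustrative only: pick the highest-valued schedule option in 50 ms.
    options = [3, 9, 4, 10]
    print(best_within_deadline(options, score=lambda x: x, budget_s=0.05))
```

The `clock` parameter is there so that the time source can be replaced, which makes the behavior repeatable under test; a real system would also need to bound the cost of each individual scoring step.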

Imagine a system that operates by connecting diverse databases to a variety of search engines. Interfacing is done via memory-mapped blackboard systems across several embedded PCs, with a rules-based, scalable management operating system coupled to it.
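A toy version of the blackboard idea might look like the following Python sketch. It makes large simplifying assumptions: the shared "blackboard" is an ordinary in-process dictionary rather than memory mapped across several PCs, and the class name, fact names, and rules are all hypothetical examples invented for illustration.

```python
class Blackboard:
    """Shared store of facts plus rules that react to those facts."""

    def __init__(self):
        self.facts = {}
        self.rules = []  # list of (name, condition, action) tuples

    def post(self, key, value):
        """Publish a fact, e.g. a database result or the current process state."""
        self.facts[key] = value

    def add_rule(self, name, condition, action):
        self.rules.append((name, condition, action))

    def run(self):
        """Fire each rule at most once, repeating until no rule can fire."""
        fired = []
        progress = True
        while progress:
            progress = False
            for name, condition, action in self.rules:
                if name not in fired and condition(self.facts):
                    action(self.facts)
                    fired.append(name)
                    progress = True
        return fired


if __name__ == "__main__":
    bb = Blackboard()
    bb.post("backlog_units", 1200)   # assumed to come from an orders database
    bb.post("daily_capacity", 1000)  # assumed to come from a planning database
    bb.post("line_state", "down")    # live process state, PLC style
    bb.add_rule(
        "escalate",
        lambda f: f.get("overtime_needed") and f["line_state"] == "down",
        lambda f: f.update(action="call maintenance"),
    )
    bb.add_rule(
        "overtime",
        lambda f: f["backlog_units"] > f["daily_capacity"],
        lambda f: f.update(overtime_needed=True),
    )
    print(bb.run(), bb.facts.get("action"))
```

The point of the design is that rules never talk to each other directly; one rule posts a conclusion to the board and another picks it up, so data sources and rules can be added without rewiring the rest.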

Applications in manufacturing might include 1) understanding complex purchasing behaviors, 2) demand management and forecasting, and 3) sales-channel analysis. Additional applications can be envisioned in the transportation, automotive, semiconductor, pharmaceutical, and food industries.

KDD is more than a search engine. It is a knowledge-based repository that helps in decision making by identifying trends, behaviors, and patterns too evanescent for less compute-intensive engines (ourselves) to handle. With the Web being the de facto standard for databases of the future, we'll be able to put the execution system, and even the entire enterprise, on a phone jack. The opportunity is there and we have the data; now we need the answers. All we lack is courage.

As appeared in Manufacturing Systems Magazine August 1996 Page 116

