Thursday, April 20, 2006

MUMPS (not that one)

[Only computer scientists would name a language to be used in hospital computers after a disease....]
MUMPS (Massachusetts General Hospital Utility Multi-Programming System), or alternatively M, is a programming language created in the late 1960s for use in the healthcare industry. It was designed to make writing database-driven applications easy while using computing resources as efficiently as possible. Although it never gained widespread popularity, it was adopted as the language of choice for many healthcare and financial information systems and databases (especially ones developed in the 1970s and early 1980s) and continues to be used by many of the same clients today.

Because it predates C and most other popular languages in current usage, it has very different syntax and terminology. It offers a number of features unavailable in other languages and showcases some rarely used programming and database concepts....

MUMPS is a language designed for building database applications. Secondary language features are designed to help programmers make applications that use as few computing resources as possible. Original implementations were interpreted, though modern implementations may be either fully or partially compiled.

The core feature of MUMPS is that database interaction is transparently built into the language. Simply by using a variable prefixed with a caret '^' character, you are referencing a database node. Assignment and retrieval use the same commands as for standard RAM-based variables. Additionally, all variables (both RAM-based and database-based) can be treated as a multidimensional hash or array. Child nodes of a variable (called subscripts in M) can have numeric or string keys. The keys themselves are also called subscripts, so in the variable name ^A("B",2,6), "B", 2 and 6 are the first, second and third subscripts of ^A. String keys are automatically stored in alphabetical order following all numeric keys. Numeric keys can have negative and/or floating-point values, all of which are stored in order from lowest to highest. The MUMPS term for a database-linked variable is a global, not to be confused with the C term for unscoped variables.
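The subscript ordering described above can be sketched in Python. This is only a toy in-memory model, not how any MUMPS implementation stores globals: the names GlobalNode and subscript_sort_key are made up for illustration, nothing is written to disk, and string subscripts are assumed to sort by character code.

```python
def subscript_sort_key(subscript):
    """Sort key: numeric subscripts first (by value), then strings.

    Assumption: strings are compared by code point, which is only
    "alphabetical" when the keys share the same letter case.
    """
    if isinstance(subscript, (int, float)):
        return (0, subscript, "")
    return (1, 0, subscript)


class GlobalNode:
    """Hypothetical stand-in for one node of a MUMPS array (no persistence)."""

    def __init__(self):
        self.value = None
        self.children = {}

    def _walk(self, subscripts):
        # Follow existing child nodes; raises KeyError if a subscript is missing.
        node = self
        for s in subscripts:
            node = node.children[s]
        return node

    def set(self, subscripts, value):
        # Create intermediate nodes as needed, then store the value at the leaf.
        node = self
        for s in subscripts:
            node = node.children.setdefault(s, GlobalNode())
        node.value = value

    def get(self, subscripts):
        return self._walk(subscripts).value

    def ordered_subscripts(self, subscripts=()):
        # Child subscripts of the addressed node, in the order described above.
        return sorted(self._walk(subscripts).children, key=subscript_sort_key)


a = GlobalNode()
a.set(("B", 2, 6), "leaf")   # loosely mirrors ^A("B",2,6)
a.set(("APPLE",), 1)
a.set((-1.5,), 2)
a.set((10,), 3)
a.set((3,), 4)
print(a.ordered_subscripts())
```

The first-level subscripts come back with the three numbers in ascending order, followed by the two strings.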

As a secondary language feature, you can abbreviate nearly all commands and native functions down to a single character to save space. Additionally, special operators exist to let you treat a delimited string (such as comma-separated values) as an array. To reduce the number of hard-disk reads, early MUMPS programmers would store a structure of related information as a delimited string, parsing it out after it was read.
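To give a feel for the delimited-string idiom, here is a rough Python imitation of what MUMPS offers through its $PIECE function. The helper names piece and set_piece are invented for this sketch, the field index is assumed to be 1 or greater, and the delimiter is assumed to be non-empty.

```python
def piece(text, delimiter, index):
    """Return the index-th (1-based) delimited field, or "" if there is none."""
    fields = text.split(delimiter)
    if 1 <= index <= len(fields):
        return fields[index - 1]
    return ""


def set_piece(text, delimiter, index, value):
    """Return a copy of text with the index-th field replaced by value.

    Missing fields are padded with empty strings. Assumes index >= 1.
    """
    fields = text.split(delimiter)
    while len(fields) < index:
        fields.append("")
    fields[index - 1] = value
    return delimiter.join(fields)


# One record packed into a single string, so it can be fetched in one read.
record = "SMITH,JOHN,1970"
print(piece(record, ",", 2))
print(set_piece(record, ",", 5, "BOSTON"))
```

Packing a whole record into one string like this is the trade-off the paragraph describes: one read from disk, followed by cheap in-memory parsing.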

MUMPS has no data types. Numbers can be treated as strings of digits, and strings can be cast (coerced, in MUMPS terminology) into numbers by numeric operators. When a string is coerced, the parser turns as much of the string (starting from the left) into a number as it can, then discards the rest. Thus the statement 'IF 20<"30 DUCKS"' is evaluated as TRUE in MUMPS.
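The coercion rule can be approximated in a few lines of Python. This is a simplified sketch, not a faithful copy of the MUMPS rules: coerce_to_number is a made-up name, exponent notation and repeated leading signs are ignored, and a string with no leading number is assumed to coerce to 0.

```python
import re

# Optional sign, then digits with an optional fraction, or a bare fraction.
_LEADING_NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)")


def coerce_to_number(value):
    """Interpret the longest numeric prefix of value; discard the rest."""
    if isinstance(value, (int, float)):
        return value
    match = _LEADING_NUMBER.match(value)
    if match is None:
        return 0  # assumption: no numeric prefix means zero
    number = float(match.group())
    return int(number) if number.is_integer() else number


# The example from the text: IF 20<"30 DUCKS"
print(20 < coerce_to_number("30 DUCKS"))
```

Only the leading "30" takes part in the comparison, which is why it comes out true.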

Other features of the language standard are designed to help applications interact with each other in a multi-user environment. Database locks, process identifiers and atomicity of database update transactions are all required of MUMPS implementations that follow the standard.

In contrast to languages based on C, whitespace is significant. A single space separates a command from its argument, and a space or newline separates the argument from the next command. Commands that take no arguments (like ELSE) require two following spaces: one to separate the command from its (nonexistent) argument, then another to separate the "argument" from the next command. Newlines are also significant; an IF, ELSE or FOR command processes or skips everything else on the line. To make them affect multiple lines, you must use the DO command to create a block.

