The script uses a dictionary file from the CEDICT online dictionary project (I'm currently using the 1 November 1998 update) for its pinyin lookup and translations, augmented by the PY input-method file from cxterm (a Chinese terminal emulator for Unix/X) as a pinyin reference for the increasingly few characters not in CEDICT. Audio links are to the Bell Labs text-to-speech CGI script, and dictionary links are to the index of Web dictionaries at zhongwen.com. A stream of characters is subdivided into "words" solely by taking the longest possible dictionary match at each position, so nothing guarantees that the word boundaries are actually correct... but I find the results useful anyway.
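To make the longest-match idea concrete, here is a minimal sketch in Python. It is not the script's actual code: the function name `segment`, the `max_len` parameter and the toy dictionary are all assumptions made for illustration, and a real run would load its entries from the CEDICT file instead.

```python
def segment(text, dictionary, max_len=None):
    """Split text into "words" by greedy longest-match dictionary lookup.

    `dictionary` is any container supporting `in` (a dict or a set).
    Hypothetical helper for illustration; not taken from the original script.
    """
    if max_len is None:
        # Longest entry in the dictionary, but never less than one character.
        max_len = max(1, max((len(w) for w in dictionary), default=1))
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted, even if it is unknown,
            # so the loop is guaranteed to advance.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Toy dictionary (assumed entries, standing in for CEDICT).
toy = {
    "我": "I",
    "是": "to be",
    "中": "middle",
    "中国": "China",
    "人": "person",
    "中国人": "Chinese person",
}

print(segment("我是中国人", toy))  # ['我', '是', '中国人']
```

Because the match is greedy, an earlier long match can swallow a character that really belongs to the following word. With a dictionary containing only "ab" and "bc", the input "abc" always comes out as "ab" + "c", whether or not that is the intended reading.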
It would probably be reasonably straightforward to merge some of this code with James Marshall's CGI proxy script to make a Chinese-character-translating proxy. That would be very cool and I might do it some day, but I don't have the bandwidth to run such a proxy at this site.
The original application for this CGI script, and motivation for developing it, was to give me translations for the characters in a set of grammar notes for the first-year Chinese student (that's me). These notes can be found here with a simple form interface, or here with the advanced form.
I got the grammar notes (and some useful revision at the same time) by reading through Hongchu Fu's online grammar notes and typing lots of the Book 1 notes into the NJStar word processor, nearly verbatim. Any errors will doubtless be mine.
Also included is a list of measure-words. I make no claims for its usefulness and it certainly isn't complete, but you can always just print it out and use it to amaze your friends.
Chris Cannam, 1998