Language Compression
History

Newspapers could be briefer, e-mails shorter, paperback novels thinner, and billboards smaller. The cost savings in paper alone could easily offset the national debt to the tune of $20 billion a year and improve the global economy by at least twice that. Hey, we’re in the information age. It’s time for our language to come up to speed, to complement technology, not hobble it. Any effort to improve thru-put by data compression should be aggressively pursued.

The Holy Grail of Gl5 (pronounced Glish) orthography is not simply a consistent, predictable, or phonetic (shouldn’t that be fonetic?) rendering of printed language, but true readable text compression.

Many reformers have littered the political and educational landscape with well-intended schemes for the reform of English spelling. No one denies the shameful state of affairs, nor doubts the benefits that would blossom in the wake of a predictable, consistent and truly phonetic orthography (which the noble English tongue so sadly lacks). But motives buoyed by altruistic aims to facilitate our children’s education or assist the foreign language student will never be sufficient to offset the immense inertia of tradition and an entrenched status quo that balks at such efforts simply because ‘it just doesn’t look right.’

Mark Twain recognized this long ago when beseeching the Associated Press to promote the 1906 attempt at Simplified Spelling: "And we shall be rid of phthisis and phthisic and pneumonia and pneumatics, and diphtheria and pterodactyl, and all those other insane words which no man addicted to the simple Christian life can try to spell and not lose some of the bloom of his piety in the demoralizing attempt."

Spelling conventions for current English words are sorely weakened by the sheer multiplicity of orthographic alternatives. And etymology plays a role as well.
English has adopted so many words from other languages (often retaining much of their original spelling) that rules are strewn with exceptions. Remnants of foreign influence linger on as silent or redundant letters and superfluous syllables: useless vestiges that serve no other purpose than to bloat our tongue and render it a hopeless mass of variation out of control. Or, as Mark Twain said, "grotesque to the eye and revolting to the soul."

Consistency

The need to provide English with some semblance of organization and consistency should be considered for more than merely data compression benefits. Truly phonetic spelling would reduce the time and cost imposed on educational programs worldwide. And consistency should involve more than merely spelling. Eliminating grammatical irregularities would not only facilitate the learning process but also aid in moving English toward an even less inflected language, as well as reduce regional discrimination.

Start with consonantal variation. English sports two ways to denote the soft ‘S’ (S, C) and ‘F’ (F, PH) sounds, three alternative renderings for ‘J’ (J, G, DG) and at least five methods of creating a hard ‘C’ (C, K, CK, CH, Q). Many consonants are at one time or another rendered silent (and thus redundant). Take, for instance, the GH in through, the S in island, the B in lamb, the P and S in corps. And the state of vowel usage is really quite dismal. We still tolerate bus, busy, womb, and women.

There is much to applaud in the vast and colorful English vocabulary. English has accommodated with amazing verbal adroitness thousands of additional words: borrowed, adopted and outright stolen from the rest of the world’s linguistic community. The ever-evolving resourcefulness of the language has invested it with a wealth of terminology and beauty of expression that serves equally well the demands of technology and the passions of poetry. But there is, at least in some quarters, a need for a much slimmer version of the language. With the arrival of the Information Age a new, more pressing rationale has emerged for entertaining, once again, that ancient dream of spelling reform.

Spelling Pioneers

There have been numerous attempts to reform the
English tongue. The language’s archaic orthography, which has remained virtually unchanged since the 17th century, has drawn the most fire. The following information comes courtesy of Cornell Kimball and his fine web page, drawing mostly from Ken Ives' "Written Dialects and Spelling Reform" (1979) and Abraham Tauber's "Spelling Reform in the United States."

Benjamin Franklin, Noah Webster, Theodore Roosevelt, and Andrew Carnegie have all championed the cause.

Noah Webster’s 1806 American Dictionary had the most profound effect, establishing the significant spelling differences that exist to this day between British and American orthography. Webster is responsible for rendering ‘gaol’ as our current ‘jail’, removing the ‘u’ from colour and honour, and reversing the ‘re’ in centre and theatre. He wanted to remove the silent ‘e’ from such words as ‘give’, and use the double ‘ee’ digraph for all long ‘e’ sounding words such as ‘read’ and ‘leave’. Sadly, purists of the time objected too strongly to allow his efforts to prevail.

In 1876, the American
Philological Association promoted the use of ar, catalog, definit, gard, giv, hav, infinit, liv, tho, thru, wisht. That same year the International Convention for the Amendment of English Orthography was held in Philadelphia, during the Centennial Exposition. Later this organization evolved into the Spelling Reform Association. A burst of reform associations emerged over the next few years, among them the British Spelling Reform Association, the American Philological Association, and the National Education Association. Additional nominations for improved spelling now included altho, thruout, thoro, thoroly, thorofare, program, prolog, pedagog, decalog.

The Simplified
Spelling Board was founded in the U.S. in 1906 and its sister, the Simplified Spelling Society, appeared two years later in the U.K. One of the American founding members was Andrew Carnegie, who pumped in more than $250,000 to promote the cause.

U.S. President Theodore Roosevelt ordered the Government Printing Office on August 27, 1906 to use the Simplified Spelling Board's 300 or so proposed spellings. The date was strategically chosen by Teddy because the U.S. Congress happened to be in recess. But the order was revoked when Congress reconvened that fall, by a vote of 142 to 24.

In January 1934 the
Chicago Tribune inaugurated what it called a "practical test of spelling reform" by applying its own list of 80 respelled words, which included advertisment, catalog, agast, ameba, burocrat, crum, missil, subpena, bazar, hemloc, herse, intern, rime, sherif, staf, glamor, harth, iland, jaz, tarif, trafic. The list was introduced over a series of editorials, finally reporting that "short spelling wins votes of readers 3 to 1." The editors chided dictionary makers for not daring to pioneer the effort. But within five years their list had shriveled by half. A few new recruits were added, such as the previous favorites tho, altho, thru, thoro and a series of 'ph' alternatives such as autograf, telegraf, philosofy, photograf, sofomore. Over the decades more and more words were dropped until only "thru" and "tho" remained in 1975 when they, too, were abandoned.

Today the only remaining vestige of the Spelling Reform Association and the Simplified Spelling Board is an organization called the American Literacy Council. Their concern is now directed toward the teaching of reading and writing as well as spelling reform.

In 1948 linguists
Daniel Jones and Harold Orton proposed their New Spelling system. With the aim of making English spelling more phonetic, the resulting changes rendered the appearance of written English uncomfortably foreign. For example:

“Dhe langgwej wood be impruuvd bie dhe adopshon of nue speling for wurds”

This was an attempt at consistent phonetic application: thus ‘dh’ for voiced ‘th’, ‘sh’ instead of ‘ti’, ‘ie’ to indicate the long i sound. But why is ‘uu’ used for long ‘u’ in impruuvd but rendered ‘ue’ in nue? And no attempt was made to keep ‘o’ consistent. It appears as short ‘o’ and short ‘u’ in adopshon and long ‘o’ in for. Then the usage of ‘e’: long in be but short in langgwej and speling. But aside from the spelling implications, the result did nothing for shrinking the size of written communication.

More spelling systems

Spanglish uses letter doubling to change stress and vowel usage. It is summarized at http://www.unifon.org/alfa-saxon.html

Founded in 1978, the Better Education thru Simplified Spelling organization favors simply the use of tho, thru, and possibly hav as an initial effort to buck established convention. The second wave against orthographic inertia would focus on legitimizing the popular lite and nite.

Australians proposed introducing a series of limited changes beginning in 1984. Their initial nominees were: hed, fotograf, caut, cof, and (once again) giv. These illustrated two basic principles: use of consistent symbols for phonemes, and elimination of silent letters. Except for caut (why not cot?) and fotograf (everyone already uses foto), the improvement in word length was nearly optimal.

This effort was followed by a more comprehensive scheme known as Cut Spelling, appearing in 1992. It continues to be supported by the SSS (www.les.aston.ac.uk/sss).

Cut Spelng: Esy readng for continuity

One first notices that
one can imediatly read CS quite esily without even noing th rules of th
systm. Since most words ar unchanjed and few letrs substituted, one has th
impression of norml ritn english with a lot of od slips, rathr than of a
totaly new riting systm. The esential cor of words, the leters that identify them, is
rarely afectd, so that ther is a hy levl of compatibility between th old
and new spelngs. This is esential for the gradul introduction of any
spelng reform, as ther must be no risk of a brekdown of ritn comunication
between th jenrations educated in th old and th new systms. CS represents
not a radicl upheval, but rather a streamlining, a trimng away of many of
those featurs of traditionl english spelng wich dislocate th smooth
opration of th alfabetic principl of regulr sound-symbl
corespondnce.

Several concepts are exciting: remove all double consonants (like letrs, esential, spelng). Well, almost all. Notice impression. Cut Spelng does delete most unnecessary vowels (as in th, norml, esily, systm, ritn, rathr, levl, cor), and apply consonants fairly consistently (jenrations, unchanjed, alfabetic).

But this, too, is a disappointment. The rules are still applied inconsistently. In one occurrence ‘letters’ appears as letrs, and in another as leters. Probably an oversight, but ‘the’ is rendered the in two places instead of the preferred th. And why isn’t ‘-ing’ (as in streamlining) always treated as efficiently as it is in trimng and spelng? We are treated to a very nice compact afectd, but are still left with unchanjed, substituted, educated, rather. We still have to deal with the uncertainty of how to spell with the letter ‘e’. Sometimes it is short (impression, them, identify, afectd, levl, spelng).

CS advocates promote
its space saving ‘advantajs’ in that it is “som 10% shortr than traditionl
spelng. This has sevrl importnt advantajs. To begin with, it saves time
and trubl for evryone involvd in producing ritn text, from scoolchildren
to publishrs, from novlists to advrtisers, from secretris to grafic
desynrs.”

Why is this CS example so timid in achieving even more impressive space reduction by applying its rules more consistently? If we can benefit from the concise versions of sevrl and importnt, why not producng, scoolchildrn, and advrtisrs? And if we can use ‘y’ to replace ‘ig’ in desynrs, why not capitalize on similar savings for words like tym?

It is understandable why so much of English’s inconsistency is retained by CS. The emphasis on keeping the appearance of English for the sake of bridging future generations was probably irresistible. But CS generally scores high on quick acceptance and introduces the concept that alternate spellings are not flaws of education because, as W.C. Fields once observed, “I have no respect for anyone who can spell a word only one way.”

Cut Spelng proposes
four basic rules:

Eliminate silent letters. For example: have, through, who, yacht, herb/honest, psychology, pneumonia, would, debt, scene, treasure, friend, people, build, etc.

Eliminate unstressed vowels before l, m, n, r, d. For example: exampl, chapl, centr, entr, randm, persistnt, curtn, fashn, litl, watrd, bedd, submitd, edbl.

Eliminate double consonants. For example: letr, betr, ritn, mitn, hapn, clapn, omitd, travld.

Eliminate redundant variations. Replace gh, ph with f: ruf, cof, laf, fon, graf, fiziks.

Other than these, it’s just as capricious as English in predicting just how a word will be spelled.

Gl5 would add to these concepts: vowel consistency and economy thru an even more radical elimination of superfluous or non-critical syllables. Something along the line of Dutton’s Speedwords.

Speedwords

In
the late 1940s Reginald J. G. Dutton created a constructed language called Speedwords as a candidate for an international auxiliary language derived primarily from English. But it has over the years found greater utility as an effective stenographic system, as it uses only normal Roman alphabetic letters with no unique symbols.

Speedwords was designed around Zipf's Law, an observation that frequently used words tend to be short words. Dutton claimed that his Speedwords were "logically and methodically built up from Professor Ernest Horn's remarkable analysis of the frequency of occurrence of all words. The Iowa University philologist and his staff examined and tabulated 15,000,000 running words of all classes of written and printed matter. The very-high-frequency words tabulated by Professor Horn are expressed in Dutton Speedwords by single alphabetic letters standing alone. The next highest in his order of frequency are allotted two-letter speedwords, and so on..."

Dutton
further intended that a carefully chosen small vocabulary of basic concepts (or as he called them, "semantic primitives") could be compounded to express virtually any idea within semantic space. After a study of Roget’s original 1000 thesaurus categories Dutton surmised that "Only 493 one-, two-, or three-letter word-roots have to be memorised." Thus, Speedwords attempted to achieve two goals: make the most common morphemes as brief as possible, and cover all of semantic space with the fewest possible morphemes.

An illustration of Speedwords’ compact expression capabilities is shown here:

E 3 le ir f v = (There) are three letters here for you.
Be 3 letters hir for vous.

Dutton drew from languages other than English, as illustrated here by the French vous. His system also imposed its own (often arbitrary and highly idiomatic) grammar upon the user. One interesting aspect of this was his ambitious use of single-letter affixes. Nearly every letter was assigned some modifying quality.

-a = unfavorable: pro = promise, proa = threaten (unfavorable promise)

Single-letter words (both upper and lower case) were also employed:

a = at, to, toward

Speedwords quickly begins to appear largely alien to standard English. It is, in many respects, a divergent language with new word forms and structural mechanisms. It does, however, demonstrate several significant compression techniques that can be adapted to a more immediately readable format.

GL5

GL5, in its primary goal of attaining the highest possible
text compression ratios, shares several aspects of the Speedwords model.
Single-letter words are also applied to full advantage, producing similar-looking sentences of extreme brevity:

izpm Bdtym = It is past my bedtime.

GL5 diverges from Speedwords, however, in a few significant areas:

Economy

In order to achieve maximum data compression, GL5 incorporates several techniques to achieve innate reductions in word length and increases in thru-put: GL5 Breviations
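
As a rough check on the brevity claimed for the sample sentences on this page, the following Python sketch counts characters in the Speedwords and GL5 examples and reports the share saved. It is only an illustration under stated assumptions: the function name compression_ratio is made up for this sketch, spaces are counted as characters, trailing periods are dropped from both sides, and the bracketed "(There)" is treated as part of the English sentence. None of this is an official GL5 or Speedwords measure.

```python
def compression_ratio(original: str, compressed: str) -> float:
    """Return the fraction of characters saved (0.0 means no saving).

    Hypothetical helper for illustration: every character, including
    spaces, counts equally.
    """
    if not original:
        raise ValueError("original text must be non-empty")
    return 1 - len(compressed) / len(original)


# Sample pairs taken from the text above (trailing periods dropped,
# which is an assumption of this sketch).
EXAMPLES = [
    ("Speedwords", "There are three letters here for you", "E 3 le ir f v"),
    ("GL5", "It is past my bedtime", "izpm Bdtym"),
]

for system, english, short in EXAMPLES:
    saved = compression_ratio(english, short)
    print(f"{system}: {len(english)} -> {len(short)} characters, {saved:.0%} saved")
```

A plain character count says nothing about readability, which is the other half of the goal stated here, so the figures are best read as an upper bound on what a reader actually gains.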