<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.2//EN">
<html>
<head><title>HTMLArea Spell Checker</title></head>
<body>
<h1>HTMLArea Spell Checker</h1>
<p>The HTMLArea Spell Checker subsystem consists of the following files:</p>
<ul>
<li>spell-checker.js: the spell checker plugin interface for HTMLArea</li>
<li>spell-check-ui.html: the HTML code for the user interface</li>
<li>spell-check-ui.js: functionality of the user interface</li>
<li>spell-check-logic.cgi: Perl CGI script that checks a text given through POST for spelling errors</li>
<li>spell-check-style.css: style for misspelled words</li>
<li>lang/en.js: main language file (English).</li>
</ul>
<h2>Process overview</h2>
<p>When an end-user clicks the "spell-check" button in the HTMLArea editor, a new window is opened with the URL of "spell-check-ui.html". This window initializes itself with the text found in the editor (using the <tt>window.opener.SpellChecker.editor</tt> global variable) and submits the text to the server-side script "spell-check-logic.cgi". The target of the FORM is an inline frame, which is used both to display the text and to correct it.</p>
<p>Next, spell-check-logic.cgi calls Aspell for each portion of plain text found in the given HTML. It then builds an HTML file that clearly marks the incorrect words, along with suggestions for each of them. This file is then loaded in the inline frame. Upon loading, a JavaScript function from "spell-check-ui.js" is called. This function retrieves all misspelled words from the HTML of the iframe and sets up the user interface so that they can be corrected.</p>
<h2>The server-side script (spell-check-logic.cgi)</h2>
<p><strong>Unicode safety:</strong> the program <em>is</em> Unicode safe. HTML entities are expanded into their corresponding Unicode characters. These characters will be matched as part of the word passed to Aspell. All texts passed to Aspell are in Unicode (when appropriate).
<strike>However, Aspell seems to not support Unicode yet (<a href="http://mail.gnu.org/archive/html/aspell-user/2000-11/msg00007.html">thread concerning Aspell and Unicode</a>). This means that words containing Unicode characters that are not in 0..255 are likely to be reported as "misspelled" by Aspell.</strike></p>
<p><strong style="font-variant: small-caps; color:red;">Update:</strong> though I've never seen it mentioned anywhere, it looks like Aspell <em>does</em>, in fact, speak Unicode. Or else, maybe <code>Text::Aspell</code> does transparent conversion; anyway, this new version of our SpellChecker plugin is, as tests show so far, fully Unicode-safe... well, probably the <em>only</em> freeware Web-based spell-checker which happens to have Unicode support.</p>
<p>The Perl Unicode manual (man perluniintro) states:</p>
<blockquote><em>Starting from Perl 5.6.0, Perl has had the capacity to handle Unicode natively. Perl 5.8.0, however, is the first recommended release for serious Unicode work. The maintenance release 5.6.1 fixed many of the problems of the initial Unicode implementation, but for example regular expressions still do not work with Unicode in 5.6.1.</em></blockquote>
<p>In other words, do <em>not</em> assume that this script is Unicode-safe on Perl interpreters older than 5.8.0.</p>
<p>The following Perl modules are required:</p>
<ul>
<li><a href="http://search.cpan.org/search?query=Text%3A%3AAspell&mode=all" target="_blank">Text::Aspell</a></li>
<li><a href="http://search.cpan.org/search?query=XML%3A%3ADOM&mode=all" target="_blank">XML::DOM</a></li>
<li><a href="http://search.cpan.org/search?query=CGI&mode=all" target="_blank">CGI</a></li>
</ul>
<p>Of these, only Text::Aspell might need to be installed manually.
The others are likely to be available by default in most Perl distributions.</p>
<hr />
<address><a href="http://dynarch.com/mishoo/">Mihai Bazon</a></address>
<!-- Created: Thu Jul 17 13:22:27 EEST 2003 -->
<!-- hhmts start --> Last modified: Fri Jan 30 19:14:11 EET 2004 <!-- hhmts end -->
<!-- doc-lang: English -->
</body>
</html>