Misspelling or Miscoding?

Marc Durdin marc at keyman.com
Sat Jan 21 04:18:18 CST 2017


On 20 January 2017 at 15:37, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:

> On Thu, 19 Jan 2017 18:41:07 -0800
> Asmus Freytag <asmusf at ix.netcom.com> wrote:
>
> > On 1/19/2017 5:04 PM, Richard Wordingham wrote:
> > > On Thu, 19 Jan 2017 14:25:14 -0800
> > > Asmus Freytag <asmusf at ix.netcom.com> wrote:
>
> > >> The Khmer example would seem fairly resistant to automated
> > >> correction if it is a free choice. If, instead, the immediately
> > >> preceding consonant comes from two disjoined sets, for example if
> > >> TA COENG TA was possible, but not TA COENG DA, then there's scope
> > >> for spell check.
> > > It's supposed to be based on the phonetics, so a spell check could
> > > be used, but not a grammar rule.  However, I can imagine someone
> > > writing in accordance with a rule restricting them to certain
> > > bases.
> > Your last sentence reads as if you might equally well meant "can't"
> > instead of "can" (?)
>
> I meant 'can'.  According to Huffman's 'Cambodian System of Writing',
> initial TA is to be read as /d/ in compounds formed by infixes.  (The
> spelling may have changed since then.)  Suffixed to ណ NNO (which is in
> the retroflex series), the subscript is to be read as /d/, while
> subscripted to ន NO, it is usually /t/ but occasionally /d/.  I would be
> tempted to apply the Pali & Sanskrit rule of place agreement and
> use COENG DA below ណ NNO and COENG TA below ន NO.  I would expect
> similar agreement with ដ DA and ត TA.
>

Khmer spelling is inconsistent enough that attempts to leverage this kind
of rule are, in my opinion, of limited utility. This kind of knowledge is
better embedded in dictionaries, where it is accessible to readers, than in
an encoding, where it just introduces ambiguity and confusion for the
average user.

The two subscripts are presented identically in modern Khmer. From what
I've observed, most Khmer users type the subscript which is most obvious to
them, that is, COENG + TA, since the full form of TA is visually similar to
the subscript.

The online dictionaries I've consulted are somewhat inconsistent in their
use of COENG DA/TA (and do not normalise searches). The rule regarding
suffixing to ណ NNO seems consistent as far as I can tell, but when suffixed
to other letters, the pronunciation is less consistent. In my current Khmer
language learning, my tutors have suggested that the pronunciation is
inconsistent and that some words can be pronounced either way. Some
examples of words using COENG DA/TA:

បណ្ដាល /bɑndaal/ giving rise to
បន្តិច /bɑntəc/ a little
ប្តី /pdəy/ husband
កត្តា /kattaa/ agent, factor
ចិន្តា /cendaa/ thought, thinking
វិចិន្តា /viʔcəntaa/ or /viʔcəndaa/ daydreaming
ស្ដា /staa/ arrogantly
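
Since the dictionaries mentioned above do not normalise searches, here is
a minimal sketch of what such normalisation might look like. It assumes a
search layer is free to fold COENG DA into COENG TA when building lookup
keys only, leaving the stored text untouched. The function names are
hypothetical and are not taken from any existing dictionary or library.

```python
# Hypothetical search-key folding for Khmer COENG DA / COENG TA.
# Assumption: folding is applied only to lookup keys, never to stored text.

COENG = "\u17d2"  # KHMER SIGN COENG
DA = "\u178a"     # KHMER LETTER DA
TA = "\u178f"     # KHMER LETTER TA


def fold_coeng_da_ta(text: str) -> str:
    """Return a search key in which every COENG DA becomes COENG TA.

    A base (non-subscript) DA is left alone, because it is not
    preceded by COENG and is visually distinct from TA.
    """
    return text.replace(COENG + DA, COENG + TA)


def same_headword(a: str, b: str) -> bool:
    """Compare two spellings while ignoring the COENG DA/TA distinction."""
    return fold_coeng_da_ta(a) == fold_coeng_da_ta(b)


# BA, NNO, COENG, DA/TA, AA, LO: two encodings of one visible spelling.
with_da = "\u1794\u178e\u17d2\u178a\u17b6\u179b"
with_ta = "\u1794\u178e\u17d2\u178f\u17b6\u179b"
print(same_headword(with_da, with_ta))
```

Folding towards COENG TA is an arbitrary choice here; folding towards
COENG DA would work equally well as long as the index and the query use the
same direction.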

Marc