More Plural Categories?
Richard Wordingham
richard.wordingham at ntlworld.com
Mon Apr 21 04:23:13 CDT 2014
I fear I've found a need for more plural categories. I was
running my own English language data exploration program and came across
the following grammatical error in my output:
'... is a 11-element table.'
This fragment should, of course, have been
'... is an 11-element table.'
I'd not noticed this issue before; perhaps I'd been sensitised by
pondering the production of the Latin locale.
Does the 'other' category need to have a category extracted for
numbers whose names start with a vowel? These numbers would be something like
<pluralRule count="few">i in 11, 18, 80..89, 800..899,
1100..1199, 1800..1899, 8000..8999, 11000..11999, 18000..18999,
80000..89999, 800000..899999</pluralRule>
I don't see a nice way of carrying it on beyond a million. There may
well be national variation in the validity of the 1100..1199 and
1800..1899 ranges.
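The ranges above can at least be checked mechanically. What follows is only a minimal Python sketch: the names VOWEL_INITIAL_RANGES, starts_with_vowel and indefinite_article are invented for illustration, and the single value 8 is an added assumption (it is not in the rule as quoted, but 'an 8-element table' follows the same pattern). Like the rule, it makes no attempt to go beyond a million.

```python
# Ranges copied from the proposed rule; the lone value 8 is an added assumption.
VOWEL_INITIAL_RANGES = [
    (8, 8),                        # assumption: not in the quoted rule
    (11, 11), (18, 18),
    (80, 89), (800, 899),
    (1100, 1199), (1800, 1899),    # 'eleven hundred' etc.; may vary nationally
    (8000, 8999), (11000, 11999), (18000, 18999),
    (80000, 89999), (800000, 899999),
]


def starts_with_vowel(n: int) -> bool:
    """True if n falls in one of the listed ranges (below one million only)."""
    return any(lo <= n <= hi for lo, hi in VOWEL_INITIAL_RANGES)


def indefinite_article(n: int) -> str:
    """Pick 'a' or 'an' for a number written in digits."""
    return "an" if starts_with_vowel(n) else "a"


print(f"... is {indefinite_article(11)} 11-element table.")
```

A table of ranges is used rather than spelling the number out, simply because it mirrors the shape of the proposed rule.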
This complication will extend to quite a few languages.
Are negative numbers supposed to be supported? Negative numbers belong
to the 'other' category in English, but CLDR seems to put -1 in the
'one' category for English. There seems to be a subtle dependency on
whether the word 'minus' denotes a relative value or an absolute value.
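To make the -1 case concrete, here is a minimal Python sketch. It assumes that the plural operands are taken from the absolute value of the number and that the English rule is 'one: i = 1 and v = 0'; the function name english_category is hypothetical and is not a CLDR or ICU API.

```python
def english_category(text: str) -> str:
    """Classify a decimal string with the rule 'one: i = 1 and v = 0'."""
    unsigned = text.lstrip("+-")              # assumption: operands ignore the sign
    integer_part, _, fraction = unsigned.partition(".")
    i = int(integer_part or "0")              # integer digits
    v = len(fraction)                         # count of visible fraction digits
    return "one" if i == 1 and v == 0 else "other"


print(english_category("-1"))
```

Under these assumptions the sign never reaches the rule, which would explain why -1 lands in 'one'.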
The Welsh numbers are complicated enough for natural numbers. They
deviate from taking the unmutated singular noun as follows:
zero: plural form for nouns
one: Soft mutation for feminine nouns
two: Soft mutation for all nouns
few (i.e. 3): Spirant mutation for masculine nouns
many (i.e. 6): Spirant mutation for all nouns
other: No mutation
However, it is not quite as simple as that, even ignoring the argument
that Welsh ought to be localised. The complication arises with the
numerative forms of _blwyddyn_ 'year', namely _blynedd_ 'years' and
_blwydd_ 'years old'. While in general they unusually take the nasal
mutation for 'other' (yielding _mlynedd_ and _mlwydd_), the standard
form for '4 years' is 'pedair blynedd', with no mutation! 'Pedair
blwydd' is the standard form for '4 years old', though 'pedair mlwydd'
is quite common. This makes a seventh category, for '4', but one that
is only significant with _blynedd_ and, less so, _blwydd_, and in
archaic diction with _diwrnod_ 'day'.
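The category assignment described above, with the extra category for 4 made optional, can be sketched in Python as follows. The function name welsh_category, the split_four flag and the category name 'four' are all hypothetical; 'four' is not an existing plural keyword.

```python
def welsh_category(n: int, split_four: bool = False) -> str:
    """Plural category for a natural number n, following the list above."""
    fixed = {0: "zero", 1: "one", 2: "two", 3: "few", 6: "many"}
    if split_four and n == 4:
        return "four"          # hypothetical seventh category
    return fixed.get(n, "other")


for n in range(8):
    print(n, welsh_category(n, split_four=True))
```

Only nouns such as _blynedd_ would need messages that distinguish 'four' from 'other'; for everything else the two would carry the same text.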
Welsh may precede numbers by the definite article as English does, so
there is variation between _y_ and _yr_ depending on whether the
following number starts with a vowel or not. This splits 'other' much
as in English, with the complication that Welsh has both vigesimal and
decimal systems - see http://en.wikipedia.org/wiki/Welsh_numerals for a
quick summary. The RBNF rules have gone for the decimal system.
Apparently the choice between the two systems is affected by what is
being counted.
Possibly the words for 'year' should be special-cased - it seems to
have exceptional usage with numbers in several languages. For example,
in Thai, the ages of children should be expressed using ขวบ (tr. 'khuap')
instead of ปี (tr. 'pi') as the word for 'year'.
Talking of Thai, although usage seems quite variable, there is a rule
that the number for 'one' should follow the classifier rather than
precede it like other numbers. Does this justify Thai having a
separate category 'one'? (At present, it just has the sole
category 'other'.) Possibly this is covered by the advice to consider
special-casing 0 and 1 anyway. There are several cases in Thai where
the numeral '1' normally disappears in speech, e.g. times of the day.
I am also wondering if the existence of what are translated as plural
forms of the demonstrative adjectives calls for a separate category
'one' in Thai. Possibly one can just avoid using these plural forms
when the number of items (one v. more than one) is not known beforehand.
Richard.