Use of Unicode 6.3 bidi format chars in CLDR number formats?

Philippe Verdy verdy_p at wanadoo.fr
Thu Apr 28 20:59:05 CDT 2016


Those characters are only needed in plain-text documents that have no other
solution. In rich-text documents, directional controls should be replaced
by styles or equivalent tags (e.g. the "bdi" container element in HTML5).
As most applications are now being developed with an interface in
HTML5 (or similar rendering and layout engines), I'd prefer to avoid using
those controls, whose support is erratic and which frequently conflict with
tagging/styling in non-obvious ways.
RLI/PDI or LRI/PDI would then not be needed: use "bdi" instead (its default
CSS styling maps to the necessary "unicode-bidi:" property).

I know that "unicode-bidi:isolate" is still not supported in all
browsers, some of which still only have "unicode-bidi:embed", but those same
browsers don't support the isolate characters either: they don't recognize
RLI/PDI and LRI/PDI, as they still use the older specification of the Bidi
algorithm that did not have isolates.

Additionally, RLI and LRI set a default direction inside, whereas bdi does
not force it, allowing the content to determine its own initial
direction: RLI and LRI are in fact equivalent to "bdi" elements with a
"dir" attribute, or to the combination of CSS "unicode-bidi:isolate" with
"direction:rtl" or "direction:ltr", whereas bdi alone (without a "dir"
attribute) sets "direction:" to "initial" (overriding the inheritance of
the current CSS direction from the parent to the children, in order to
create a true isolate, as if the inner children were in a new separate
document, rendered without any previous context).
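
To make that mapping concrete, here is a minimal Python sketch that turns
isolate controls found in plain text into "bdi" markup. The function name
isolates_to_bdi is hypothetical, and FSI (U+2068, which I did not discuss
above) is mapped to a bare "bdi" on the assumption that a "bdi" without
"dir" lets the content choose its own direction. It is not taken from CLDR
or ICU tooling.

```python
import html

LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"

# Opening tag emitted for each isolate initiator (assumed mapping).
OPEN_TAG = {
    LRI: '<bdi dir="ltr">',
    RLI: '<bdi dir="rtl">',
    FSI: "<bdi>",  # no forced direction: the content decides
}


def isolates_to_bdi(text: str) -> str:
    """Replace isolate controls in plain text with HTML bdi elements."""
    out = []
    depth = 0  # number of bdi elements currently open
    for ch in text:
        if ch in OPEN_TAG:
            out.append(OPEN_TAG[ch])
            depth += 1
        elif ch == PDI:
            if depth > 0:  # silently drop an unmatched PDI
                out.append("</bdi>")
                depth -= 1
        else:
            out.append(html.escape(ch))
    # Close any isolate left open at the end of the text.
    out.append("</bdi>" * depth)
    return "".join(out)
```

A renderer that controls its own markup could run formatted values through
something like this, so that no control characters reach the HTML layer.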

Using RLM would make things even worse (the isolation would be completely
lost): the number will be correct, but any text after the formatted value
would inherit the context set by the formatted entity (which is not
necessarily in the same language or script).
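
As an illustration, the RLM fallback proposed in the quoted message could
look roughly like the Python sketch below. The names rlm_fallback and
bidi_classes are hypothetical, and the check is simplified: it assumes a
leading RLI and a trailing PDI form one matching pair, which a real tool
would have to verify.

```python
import unicodedata

RLI, PDI, RLM = "\u2067", "\u2069", "\u200f"


def rlm_fallback(formatted: str) -> str:
    """Replace an RLI...PDI pair wrapping the whole string with a leading RLM.

    Simplification: the first and last characters are assumed to pair up.
    """
    if len(formatted) >= 2 and formatted.startswith(RLI) and formatted.endswith(PDI):
        return RLM + formatted[1:-1]
    return formatted


def bidi_classes(text: str) -> list[str]:
    """Return the Unicode bidi class of each character, for inspection."""
    return [unicodedata.bidirectional(ch) for ch in text]
```

Comparing bidi_classes() on both forms shows the difference: the wrapped
value starts with class RLI and ends with PDI, while the fallback starts
with a strong R character and has nothing that marks where the formatted
value ends.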

Note also that for currencies, the format could use a currency symbol in
the same native script, or an ISO currency code (using Latin
letters). That symbol may be left on the incorrect side of the currency
value depending on context, or could force a direction after it (notably
for Latin symbols, which have strong LTR direction): the symbol itself may
need to be isolated within the whole currency format, which combines the
formatted value with the effective symbol.
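
A rough Python sketch of such a nested isolation follows. The helper names
and the "amount, no-break space, symbol" layout are assumptions made for
illustration only; they are not an actual CLDR pattern. FSI is used for the
symbol so that a Latin code and a native-script symbol each resolve their
own direction.

```python
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"
NBSP = "\u00a0"


def isolate(text: str, initiator: str = FSI) -> str:
    """Wrap text in an isolate; FSI lets the content pick its direction."""
    return f"{initiator}{text}{PDI}"


def format_currency_rtl(amount: str, symbol: str) -> str:
    """Isolate the symbol, then isolate the whole value as right-to-left.

    The layout (amount, NBSP, symbol) is hypothetical.
    """
    return isolate(f"{amount}{NBSP}{isolate(symbol)}", RLI)
```

With this shape, a strong LTR code such as "USD" cannot affect the digits
next to it, and the outer isolate keeps the whole value from affecting the
surrounding text.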


2016-04-29 0:44 GMT+02:00 Peter Edberg <pedberg at apple.com>:

> Dear CLDR users,
>
> One of the longstanding challenges in CLDR has been designing number
> formats (especially currency formats) and short date formats for
> bidi-language locales (e.g. ar, fa, he) so that the formatted text is
> displayed correctly in various contexts (where there may be no surrounding
> text, or initial text with strong right-to-left or strong left-to-right
> characters). With currency formats, the currency symbols themselves may
> involve characters that are neutral, or strong right-to-left or strong
> left-to-right.
>
> This is exactly one of the types of problems that was intended to be
> addressed by the addition of new bidi direction format characters in
> Unicode 6.3 (Sept. 2013), such as U+2067 RIGHT‑TO‑LEFT ISOLATE (RLI) and
> U+2069 POP DIRECTIONAL ISOLATE (PDI). See UAX #9 Unicode Bidirectional
> Algorithm <http://www.unicode.org/reports/tr9/>.
>
> For CLDR 30, we are considering whether to start using some of these
> characters in some number formats; typically those formats would begin with
> a RLI, and end with a PDI. Two important considerations are:
> 1. Will the systems on which CLDR 30 data is used implement support for
> these bidi direction format characters?
> 2. Do the systems that will be used for CLDR 30 Survey Tool data
> collection implement support for those characters (e.g. for generating
> correct examples)?
>
> For cases in which the answers to the above questions are “no”, we can
> address some of the issues as follows:
> • For #1, in the tools that generate JSON data and ICU-format data,
> options can be added to replace any RLI…PDI combination that wraps a number
> format with an initial RLM (right-to-left mark) instead. This will result in
> the format having the same display layout when used in isolation, though
> it may not have the same layout when used in the middle of other text.
> • For #2, the Survey Tool example generators can also replace RLI…PDI with
> an initial RLM, and ensure that the resulting format is displayed by itself
> in a text cell, in order to produce the same format display layout that
> will be produced by the RLI…PDI on systems that support them.
>
> One remaining concern is the extent to which copy-paste of formats
> generated by CLDR will correctly include the RLI..PDI characters.
>
> We would appreciate any input from CLDR users on this, thanks!
>
> Peter Edberg, for the CLDR project