UCA unnecessary collation weight 0000
Richard Wordingham via Unicode (unicode at unicode.org)
Thu Nov  1 16:47:40 CDT 2018
  
On Thu, 1 Nov 2018 18:39:16 +0100
Philippe Verdy via Unicode <unicode at unicode.org> wrote:
> What this means is that we can safely implement UCA using basic
> substitutions (e.g. with a function like "string:gsub(map)" in Lua,
> which uses a "map" to map source (binary) strings or regexps into
> target (binary) strings):
> 
> For a level-3 collation, you then need only 3 calls to
> "string:gsub()" to compute any collation:
> 
> - the first ":gsub(mapNormalize)" can decompose a source text into
> collation elements and can perform reordering to enforce a normalized
> order (possibly tuned for the tailored locale) using basic regexps.

Are you sure of this?  Will you publish the algorithm?  Have you
passed the official conformance tests?  (Mind you, DUCET is a
relatively easy UCA collation to implement successfully.)

> - the second ":gsub(mapSecondary)" will substitute each collation
> element by its "intermediary" collation element + tertiary weight.
> 
> - the third ":gsub(mapSecondary)" will substitute each "intermediary"
> collation element by its primary weight + secondary weight.
Richard.
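
For illustration only, the three-pass substitution idea quoted above could be sketched roughly as follows. This is a toy under stated assumptions, not the quoted author's algorithm and not a conformant UCA implementation: it is written in Python, with `re.sub` plus a dictionary lookup standing in for Lua's `string:gsub` with a table argument; the names `MAP_NORMALIZE`, `MAP_TERTIARY`, `MAP_PRIMARY_SECONDARY`, `gsub`, `collation_elements` and `sort_key` are hypothetical; and the weights are invented rather than taken from DUCET. It handles no contractions, no reordering of combining marks and no tailoring, so it says nothing about whether the approach can pass the conformance tests. The final step that assembles a sort key level by level is an extra step not mentioned in the quoted description.

```python
import re

# Toy tables, invented for illustration; these are NOT DUCET data.
# Pass 1: decompose precomposed characters into base + combining mark.
MAP_NORMALIZE = {"\u00e9": "e\u0301", "\u00c9": "E\u0301"}

# Pass 2: element -> (intermediate element, tertiary weight).
MAP_TERTIARY = {
    "a": ("a", 0x02), "A": ("a", 0x08),
    "e": ("e", 0x02), "E": ("e", 0x08),
    "\u0301": ("acute", 0x02),
}

# Pass 3: intermediate element -> (primary weight, secondary weight).
MAP_PRIMARY_SECONDARY = {
    "a": (0x1000, 0x20),
    "e": (0x1001, 0x20),
    "acute": (0x0000, 0x24),
}

# String-to-string tables, so that each pass is a plain substitution.
PASS2 = {ch: f"[{inter}|{t:04X}]" for ch, (inter, t) in MAP_TERTIARY.items()}
PASS3 = {f"[{inter}|": f"[{p:04X}.{s:04X}."
         for inter, (p, s) in MAP_PRIMARY_SECONDARY.items()}


def gsub(text, pattern, table):
    """Rough stand-in for Lua's text:gsub(pattern, table).

    Each match is replaced by table[match]; matches with no entry in the
    table are left unchanged.
    """
    return re.sub(pattern, lambda m: table.get(m.group(0), m.group(0)), text)


def collation_elements(text):
    """Apply the three substitution passes and return a weight string."""
    text = gsub(text, r".", MAP_NORMALIZE)   # pass 1: decompose
    text = gsub(text, r".", PASS2)           # pass 2: add tertiary weights
    return gsub(text, r"\[\w+\|", PASS3)     # pass 3: add primary + secondary


def sort_key(text):
    """Assemble a level-by-level key from the substituted string.

    Zero weights are skipped and a 0 separates the levels. This step is an
    addition to the three passes described in the quoted message.
    """
    ces = re.findall(r"\[(\w{4})\.(\w{4})\.(\w{4})\]", collation_elements(text))
    key = []
    for level in range(3):
        if level:
            key.append(0)  # level separator
        key.extend(w for w in (int(ce[level], 16) for ce in ces) if w)
    return tuple(key)


if __name__ == "__main__":
    print(collation_elements("\u00e9"))
    print(sorted(["\u00e9a", "Ea", "eA", "ea"], key=sort_key))
```

Keeping every table string-to-string is what lets each pass be a single substitution call; anything that needs context, such as contractions or canonical reordering, would need more elaborate patterns than this sketch attempts.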
    
    