Grapheme clusters and east asian width

Daniel Bünzli daniel.buenzli at erratique.ch
Tue Sep 15 20:45:27 CDT 2015


Hello,   

Is there any guidance on how to combine the information given by grapheme clusters and the East Asian Width property to do fixed-width layout in terminal emulators?

For example, if we have:

U+AC01 ( 각 ) HANGUL SYLLABLE GAG

This is a single grapheme cluster with East Asian Width W, and hence takes 2 columns in a tty. However, if we have the same syllable as the decomposed sequence:

U+1100 ( ᄀ ) HANGUL CHOSEONG KIYEOK
U+1161 ( ᅡ ) HANGUL JUNGSEONG A
U+11A8 ( ᆨ ) HANGUL JONGSEONG KIYEOK

This is still a single grapheme cluster, but if I add up the East Asian Widths of its scalar values (W, N, N), counting 2 columns for W and 1 for N, the result is 4 columns.
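To make the arithmetic concrete, here is a minimal Python sketch of the per-scalar summation described above. It assumes the cluster has already been segmented (Python's standard library does not do grapheme cluster segmentation), and the helper name summed_width is hypothetical, introduced only for illustration. The property values come from whatever Unicode database ships with the interpreter.

```python
import unicodedata

def summed_width(cluster: str) -> int:
    """Add up per-scalar widths: 2 columns for F or W, 1 otherwise.

    Hypothetical helper; this is the summation that over-counts,
    not a recommended width function.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1
        for ch in cluster
    )

precomposed = "\uAC01"             # HANGUL SYLLABLE GAG
decomposed = "\u1100\u1161\u11A8"  # CHOSEONG KIYEOK, JUNGSEONG A, JONGSEONG KIYEOK

print(summed_width(precomposed))  # 2
print(summed_width(decomposed))   # 4

# Both strings are canonically equivalent, so a layout would
# presumably want the same column count for each.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```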

Does something naïve work, like looking up only the East Asian Width of the first scalar value in the grapheme cluster and using 2 columns if it is F or W and 1 column otherwise? Or are there counterexamples where this breaks? Is there anything more clever that can be done?
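For reference, the naïve rule in question could be sketched in Python as follows. This is only an illustration of the heuristic being asked about, not a claim that it is correct for every cluster. The function name first_scalar_width is hypothetical, and the input is assumed to be one non-empty, already segmented grapheme cluster.

```python
import unicodedata

def first_scalar_width(cluster: str) -> int:
    """Width from the first scalar value only: 2 for F or W, else 1.

    Hypothetical helper; assumes `cluster` is a single non-empty
    grapheme cluster produced by some external segmenter.
    """
    first = cluster[0]
    return 2 if unicodedata.east_asian_width(first) in ("F", "W") else 1

# The precomposed and decomposed forms of the example now agree.
print(first_scalar_width("\uAC01"))              # 2
print(first_scalar_width("\u1100\u1161\u11A8"))  # 2

# A narrow base letter followed by a combining mark stays at 1 column.
print(first_scalar_width("a\u0301"))             # 1
```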

Thanks,  

Daniel







