Combining Characters
Alex Shpilkin
ashpilkin at gmail.com
Fri Dec 19 15:17:25 CST 2025
On Fri, Dec 19 2025 at 23:02:55 +02:00:00, Alex Shpilkin
<ashpilkin at gmail.com> wrote:
> I haven’t gotten to implementing canonical composition yet
And you can tell because the algorithm I’ve posted is wrong.
Attempted correction (which does introduce a bit of special handling to
account for the starter+starter case):
starter = 0 # sentinel not part of any compositions
starter index = uninitialized
last ccc = 0 # ccc of the last character kept after the starter
index = 0
while index < length of string:
    composition = try to compose (starter, string[index])
    if succeeded and (last ccc < ccc[string[index]]
                      or index == starter index + 1):
        string[starter index] = composition
        starter = composition # later marks combine with the composite
        delete string[index]
    else:
        if ccc[string[index]] == 0: # NB only this late
            starter = string[index]
            starter index = index
        last ccc = ccc[string[index]] # blocks later marks of ccc <= this
        index = index + 1
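
For anyone who wants to poke at it, here is a rough Python sketch of the loop above. It is not a reference implementation. The names compose_canonical and try_compose are made up for this example, and try_compose cheats: it asks unicodedata.normalize("NFC", ...) whether a two-character pair collapses to a single code point, rather than consulting a real composition table. The input is assumed to be canonically decomposed (NFD) already. The sketch also remembers the ccc of the last mark left in place, so that a later mark of equal or lower class is treated as blocked, which is how I read the blocking rule in UAX #15.

```python
import unicodedata


def try_compose(starter, ch):
    """Return the single-code-point composite of (starter, ch), or None.

    Shortcut for illustration only: the pair is run through NFC and
    accepted if it collapses to one code point.
    """
    if starter is None:  # sentinel: no starter seen yet
        return None
    composed = unicodedata.normalize("NFC", starter + ch)
    return composed if len(composed) == 1 else None


def compose_canonical(decomposed):
    """Canonically compose a string that is assumed to be in NFD."""
    chars = list(decomposed)
    starter = None        # sentinel not part of any compositions
    starter_index = -1    # no starter position yet
    last_ccc = 0          # ccc of the last character kept after the starter
    index = 0
    while index < len(chars):
        ch = chars[index]
        ccc = unicodedata.combining(ch)
        composition = try_compose(starter, ch)
        # A mark composes only if no kept mark of equal or higher ccc
        # precedes it; a starter composes only if directly adjacent.
        if composition is not None and (
            last_ccc < ccc or index == starter_index + 1
        ):
            chars[starter_index] = composition
            starter = composition  # later marks combine with the composite
            del chars[index]       # index now points at the next character
        else:
            if ccc == 0:           # NB only this late
                starter = ch
                starter_index = index
            last_ccc = ccc
            index += 1
    return "".join(chars)
```

A quick sanity check is to compare compose_canonical(unicodedata.normalize("NFD", s)) against unicodedata.normalize("NFC", s) for strings with stacked marks, Hangul jamo, and a mark sitting between two characters that would otherwise compose.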
--
Sorry for the noise,
Alex