Proposed Update UAX #9, Unicode Bidirectional Algorithm
CE Whitehead
cewcathar at hotmail.com
Sun Oct 19 13:32:22 CDT 2014
Here are my final comments (which I've also submitted to the feedback page) on TR9,
http://www.unicode.org/reports/tr9/tr9-32.html#BD11 (3.1.2), as well as on sections 3.3 and
4.3.
These are mostly grammar/proofreading nits, but the one on 4.3 is important to fix.
Also, I made an error in my previous comments (September 30) on 3.1.3, on the algorithm for BD16 -- the original text is correct:
"If the current stack element is at the bottom of the stack, and the values match, meaning
the two characters form a bracket pair, then
Append the text position in the current stack element together with the text position
of the closing paired bracket to the list.
Pop the stack through the current stack element inclusively.
Else, if the current stack element is not at the bottom of the stack, advance it to the
next element deeper in the stack and go back to step 2."
{COMMENT: leave as is; my error}
The other proofreading comment I made on September 30 should remain.
* * *
3.1.2
BD11 algorithm
"Initialize a counter to one.
Scan the text following the embedding initiator:
At an isolate initiator, skip past the matching PDI, or if there is no matching PDI, to the end of the paragraph.
At the end of a paragraph, or at a PDI that matches an isolate initiator before the embedding initiator, stop: the embedding initiator has no matching PDF.
At an embedding initiator, increment the counter.
At a PDF, decrement the counter. If its new value is zero, stop: this is the matching PDF."
{COMMENT: a nitpick: in the second bullet you say "at a PDI that matches an isolate initiator before the embedding initiator" -- this use of "before" is confusing to me. You don't mean that you reach the PDI before reaching the embedding initiator; that can't be the case, as you are scanning the text following the embedding initiator. To me the wording is not right; I would change it to: "that matches an isolate initiator that occurred prior to the embedding initiator"}
=>
"Initialize a counter to one.
Scan the text following the embedding initiator:
At an isolate initiator, skip past the matching PDI, or if there is no matching PDI, to the end of the paragraph.
At the end of a paragraph, or at a PDI that matches an isolate initiator that occurred prior to the embedding initiator, stop: the embedding initiator has no matching PDF.
At an embedding initiator, increment the counter.
At a PDF, decrement the counter. If its new value is zero, stop: this is the matching PDF."
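The scan described in BD11 can be sketched in Python as follows. This is only an illustration under stated assumptions: `text` is a single paragraph with no paragraph separators, the helper names `isolate_pairs` and `matching_pdf` are hypothetical, and isolate initiators are paired with PDIs by a simple stack rather than by re-running the BD9 scan for each one.

```python
# Explicit formatting characters involved in BD9/BD11.
EMBEDDING_INITIATORS = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
ISOLATE_INITIATORS = {"\u2066", "\u2067", "\u2068"}              # LRI, RLI, FSI
PDF = "\u202C"
PDI = "\u2069"


def isolate_pairs(text):
    """Map each matched isolate initiator index to its PDI index (hypothetical helper)."""
    pairs = {}
    stack = []
    for i, ch in enumerate(text):
        if ch in ISOLATE_INITIATORS:
            stack.append(i)
        elif ch == PDI and stack:
            pairs[stack.pop()] = i
    return pairs


def matching_pdf(text, start):
    """Return the index of the PDF matching the embedding initiator at `start`, or None."""
    pairs = isolate_pairs(text)
    matched_pdis = set(pairs.values())
    counter = 1
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch in ISOLATE_INITIATORS:
            if i not in pairs:
                return None       # no matching PDI: skip to the end of the paragraph
            i = pairs[i] + 1      # skip past the matching PDI
            continue
        if ch == PDI and i in matched_pdis:
            # Isolates opened after `start` were skipped above, so this PDI matches
            # an isolate initiator that occurred prior to the embedding initiator.
            return None
        if ch in EMBEDDING_INITIATORS:
            counter += 1
        elif ch == PDF:
            counter -= 1
            if counter == 0:
                return i
        i += 1
    return None                   # end of paragraph: no matching PDF
```

In this sketch a PDI that matches no isolate initiator at all is simply passed over, which is how I read the second bullet.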
* * *
3.3.2 "Explicit Embeddings", Rule X2, 1st par, last bullet
"With each RLE, perform the following steps:
Otherwise, this is an overflow RLE. If the overflow isolate count is zero, increment the overflow embedding count by one. Leave all other variables unchanged."
{COMMENT: INSERT HERE FOR CLARITY=>"Otherwise this overflow RLE is within the scope of an overflow isolate initiator, so do nothing."}
* * *
Rule X3, first par, last bullet
"Otherwise, this is an overflow LRE. If the overflow isolate count is zero, increment the overflow embedding count by one. Leave all other variables unchanged." {COMMENT: INSERT HERE FOR CLARITY =>"Otherwise this overflow LRE is within the scope of an overflow isolate initiator, so do nothing."}
{QUESTION: So the embeddings that are done in an overflow isolate are only terminated by the overflow isolate terminator, I gather? No need to reply but my correction only makes sense if this is true.}
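To make the three outcomes in X2/X3 concrete, here is a minimal sketch of how I read the rules, including the "do nothing" branch suggested above. The function name `handle_embedding`, the tuple layout for stack entries, and the `counts` dictionary are all hypothetical; `MAX_DEPTH` is the max_depth value of 125 from the specification.

```python
MAX_DEPTH = 125  # max_depth as defined in UAX #9


def handle_embedding(stack, counts, rtl):
    """Process one RLE (rtl=True) or LRE (rtl=False).

    `stack` is a list of (level, override_status, isolate_status) tuples and
    `counts` holds the overflow counters. Returns a label for the branch taken.
    """
    last_level = stack[-1][0]
    if rtl:
        # Least odd level greater than the last entry's level.
        new_level = last_level + 1 if last_level % 2 == 0 else last_level + 2
    else:
        # Least even level greater than the last entry's level.
        new_level = last_level + 2 if last_level % 2 == 0 else last_level + 1

    if (new_level <= MAX_DEPTH
            and counts["overflow_isolate"] == 0
            and counts["overflow_embedding"] == 0):
        stack.append((new_level, "neutral", False))
        return "valid"
    if counts["overflow_isolate"] == 0:
        counts["overflow_embedding"] += 1
        return "overflow counted"
    # Within the scope of an overflow isolate initiator: leave everything unchanged.
    return "ignored"
```

The last branch is the case the proposed sentence would spell out: the embedding initiator changes neither the stack nor any counter.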
* * *
3.3.2 "Explicit Levels and Directions", "Terminating Isolates", X6A, third bullet, then 2nd sub-bullet:
"While the directional isolate status of the last entry on the stack is false, pop the last entry from the directional status stack. (This terminates the scope of those valid embedding initiators within the scope of the matched isolate initiator whose scopes have not been terminated by a matching PDF, and which thus lack a matching PDF. Given that the valid isolate count is non-zero, the directional status stack must contain an entry with directional isolate status true before this step, and thus after this step the last entry on the stack will indeed have a true directional isolate status, i.e. represent the scope of the matched isolate initiator. This cannot be the stack's first entry, which always belongs to the paragraph level and has a false directional status, so there is at least one more entry before it on the stack.)"
{COMMENT: again, the use of "before" and "after" is confusing; the entry with "directional isolate status" set to "true" was PLACED before this step, but I would not say that "the stack contains it before this step"; to me that is comparing "apples and oranges" (a directional status stack entry versus a step). This may be nitpicking, but I found this tough to read.}
=>
"While the directional isolate status of the last entry on the stack is false, pop the last entry from the directional status stack. (This terminates the scope of those valid embedding initiators within the scope of the matched isolate initiator whose scopes have not been terminated by a matching PDF, and which thus lack a matching PDF. Given that the valid isolate count is non-zero, the directional status stack must contain an entry with directional isolate status true; [this entry must have been placed prior to the PDI], and thus, once all false entries are popped, the last entry on the stack will indeed have a true directional isolate status, i.e. represent the scope of the matched isolate initiator. This cannot be the stack's first entry, which always belongs to the paragraph level and has a false directional status, so there is at least one more entry before it on the stack.)"
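The popping step is perhaps easier to follow in code. The sketch below covers the first three bullets of X6A as I understand them; `handle_pdi`, the tuple layout for stack entries, and the `counts` dictionary are hypothetical names, and the final bullet (assigning the level and resetting the PDI's type) is left out.

```python
def handle_pdi(stack, counts):
    """Update the directional status stack and counters for one PDI.

    `stack` is a list of (level, override_status, isolate_status) tuples whose
    first entry is the paragraph-level entry (isolate_status False).
    """
    if counts["overflow_isolate"] > 0:
        # This PDI matches an overflow isolate initiator.
        counts["overflow_isolate"] -= 1
    elif counts["valid_isolate"] == 0:
        # This PDI matches no isolate initiator: nothing to do.
        pass
    else:
        counts["overflow_embedding"] = 0
        # Pop embeddings that were opened inside the isolate and never closed.
        while not stack[-1][2]:
            stack.pop()
        # The last entry now belongs to the matched isolate initiator; pop it too.
        stack.pop()
        counts["valid_isolate"] -= 1
```

Because the valid isolate count is non-zero in the last branch, an entry with a true isolate status was pushed earlier, so the loop always stops before reaching the paragraph-level entry.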
* * *
3.3.5 "Resolving Neutral and Isolate Formatting Types", N0, 2nd bullet, section c
"Otherwise, if there is a strong type it must be opposite the embedding direction. Therefore, test for an established context with a preceding strong type by checking backwards before the opening paired bracket until the first strong type (L, R, or sos) is found."
{COMMENT: would it be better to say, "by checking backwards within the isolating run sequence in which the bracket pair occurs"? You do mean to check just within the current isolating run sequence, I believe. Is this correct?}
=> ?
"Otherwise, if there is a strong type it must be opposite the embedding direction. Therefore, test for an established context with a preceding strong type by checking backwards from the opening paired bracket until the first strong type (L, R, or sos) is found.
If there is no strong type within the isolating run sequence where the bracket pair occurs, then set the bracket pair to the embedding direction."
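For reference, here is how I picture the backward check in N0 step c, restricted to one isolating run sequence. This is a sketch under assumptions: `types` is the list of bidi classes for that sequence only, `sos` is its start-of-sequence type, the function names are hypothetical, and EN and AN are treated as R, as N0 specifies for this rule.

```python
def preceding_strong_type(types, open_index, sos):
    """Scan backwards from the opening bracket for the first strong type.

    Only positions inside the isolating run sequence are examined; if none
    holds a strong type, the sequence's sos value is returned.
    """
    for i in range(open_index - 1, -1, -1):
        t = types[i]
        if t == "L":
            return "L"
        if t in ("R", "EN", "AN"):
            return "R"
    return sos


def resolve_with_context(types, open_index, sos, embedding_direction):
    """Direction for a bracket pair that encloses only opposite-direction strong types."""
    opposite = "R" if embedding_direction == "L" else "L"
    if preceding_strong_type(types, open_index, sos) == opposite:
        return opposite           # context established
    return embedding_direction
```

Since sos is always L or R, the scan in this sketch always ends with a strong type, and a value matching the embedding direction yields the embedding direction for the pair.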
* * *
X6A, last bullet, last sub-bullet
"If the entry's directional override status is not neutral, reset the current character type from PDI to L if the override status is left-to-right, and to R if the override status is right-to-left."
{Just nitpicking; it's usually clearer to start an "if-then" clause with "if" than it is to start it with "then" but you can ignore this suggestion}
=>?
"If the entry's directional override status is not neutral, then, if the override status is left-to-right, reset the current character type from PDI to L; set it to R if the override status is right-to-left."
* * *
There is one typo you do need to fix:
4.3 "Higher-level Protocols"
"Certain characters that do not have the Bidi_Mirrored property can also be depicted by a mirrored glyph in specialized contexts. Such contexts include, but are not limited to, historic scripts and associated punctuation, private-use characters, and characters in mathematical expressions. (See Section 6, Mirroring.) These characters are those that fit at least one of the following conditions:"
{COMMENT: you mean "section 7", which is what the link goes to.}
=>
"Certain characters that do not have the Bidi_Mirrored property can also be depicted by a mirrored glyph in specialized contexts. Such contexts include, but are not limited to, historic scripts and associated punctuation, private-use characters, and characters in mathematical expressions. (See Section 7, Mirroring.) These characters are those that fit at least one of the following conditions:"
* * * * * *
Best,
-- C. E. Whitehead
cewcathar at hotmail.com