From jcb+unicode at inf.ed.ac.uk  Fri Jan  2 03:23:59 2015
From: jcb+unicode at inf.ed.ac.uk (Julian Bradfield)
Date: Fri,  2 Jan 2015 09:23:59 +0000 (GMT)
Subject: Why is BN weak?
Message-ID: <slrnmacotf.dp4.jcb@home.stevens-bradfield.com>

I've been perusing the Bidi Algorithm, and I am wondering why the
Boundary Neutral class is classified as a weak class rather than a
neutral class. Can somebody explain?

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


From eliz at gnu.org  Fri Jan  2 04:43:16 2015
From: eliz at gnu.org (Eli Zaretskii)
Date: Fri, 02 Jan 2015 12:43:16 +0200
Subject: Why is BN weak?
In-Reply-To: <slrnmacotf.dp4.jcb@home.stevens-bradfield.com>
References: <slrnmacotf.dp4.jcb@home.stevens-bradfield.com>
Message-ID: <834ms9bgzf.fsf@gnu.org>

> From: Julian Bradfield <jcb+unicode at inf.ed.ac.uk>
> Date: Fri,  2 Jan 2015 09:23:59 +0000 (GMT)
> 
> I've been perusing the Bidi Algorithm, and I am wondering why the
> Boundary Neutral class is classified as a weak class rather than a
> neutral class. Can somebody explain?

Because they should take the direction of the surrounding text, I
presume.

What bidi type did you expect them to be?
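
For readers following along, here is a minimal sketch of how one might check which Bidi_Class a character has and which group UAX #9 puts that class in. The grouping table is transcribed from the bidirectional character types table in UAX #9; the helper name `bidi_category` is made up for this illustration, and the values returned by `unicodedata` depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Grouping of Bidi_Class values as listed in UAX #9.
CATEGORY = {
    "L": "strong", "R": "strong", "AL": "strong",
    "EN": "weak", "ES": "weak", "ET": "weak", "AN": "weak",
    "CS": "weak", "NSM": "weak", "BN": "weak",
    "B": "neutral", "S": "neutral", "WS": "neutral", "ON": "neutral",
    "LRE": "explicit", "LRO": "explicit", "RLE": "explicit",
    "RLO": "explicit", "PDF": "explicit", "LRI": "explicit",
    "RLI": "explicit", "FSI": "explicit", "PDI": "explicit",
}


def bidi_category(ch):
    """Return (Bidi_Class, UAX #9 group) for a single character."""
    cls = unicodedata.bidirectional(ch)
    return cls, CATEGORY.get(cls, "unknown")


# SOFT HYPHEN and ZERO WIDTH SPACE are two familiar BN characters.
for ch in ("\u00AD", "\u200B", " ", "1", "a"):
    print(f"U+{ord(ch):04X}", *bidi_category(ch))
```

This only reports the classification; it does not by itself explain the rationale being asked about.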

From richard.wordingham at ntlworld.com  Sat Jan  3 19:29:05 2015
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sun, 4 Jan 2015 01:29:05 +0000
Subject: Encoding Sequence for Tai Tham Matres Lectionis
Message-ID: <20150104012905.6830b17e@JRWUBU2>

I have two questions, but I begin with some preliminaries in case I am
labouring under any misapprehensions.

Firstly, I assume that any legible text in the Tai Tham script with a
well-defined pronunciation in one of the main languages using the Tai
Tham script (Pali, Tai Khün, Tai Lue, Northern Thai and Lao) either:

1) Contains an unencoded character;
2) Has a unique (up to canonical equivalence) correct encoding;
3) Has a glyph with multiple encodings; or
4) Reveals a deficiency in the specification of the encoding of the
script.

Glyphs with multiple encodings most commonly occur with styles that do
not distinguish U+1A62 TAI THAM VOWEL SIGN MAI SAT and U+1A76 TAI THAM
SIGN TONE-2.  These can generally be resolved on the basis of the
pronunciation.

Secondly, what is the definition of the encoding?  Is it just the
Unicode standard, or is it the sequence of approved proposals plus the
Unicode standard (with the latest approval taking precedence)?  I
presume the proposals are relevant, as otherwise there might not be a
defined coding difference between the second syllable of /?a??u?/
'shaky' and the word /hui/ 'to sprinkle'.  The proposals lead to the
encoding <U+1A49 TAI THAM LETTER HIGH HA, U+1A60 TAI THAM SIGN SAKOT,
U+1A3F TAI THAM LETTER LOW YA, U+1A69 TAI THAM VOWEL SIGN U> for the
former and <U+1A49, U+1A69, U+1A60, U+1A3F> for the latter.  The visual
difference lies in the positioning of the vowel; there is no visual
justification for claiming that the dependent consonant is subjoined
to the vowel in either case.

Similarly, there is nothing in TUS itself to specify whether /ku?/
(Lao /k?u?/) 'pair' is spelt <U+1A23 TAI THAM LETTER LOW KA, U+1A6A TAI
THAM VOWEL SIGN UU, U+1A76 TAI THAM SIGN TONE-2> or <U+1A23, U+1A76,
U+1A6A>.  Unlike Thai, these two sequences are not canonically
equivalent.

Before LANNA VOWEL SIGN AM and LANNA VOWEL SIGN TALL AM were rejected,
the basic syllable structure for encoding was
<pre-vowel_consonant_stack, vowels_before, vowels_below, vowels_above,
tones_etc, vowels_after, post-vowel_consonant_stack>.  Apart from the
first element of the pre-vowel consonant stack, each element of the
consonant stacks was either a pair of SAKOT and consonant letter or a
consonant sign.

The script has made use of three consonant letters to indicate
vowels - U+1A3F LETTER LOW YA, U+1A45 LETTER WA and LETTER A.  The
subscript form of LETTER A has for most purposes evolved into a vowel
symbol, U+1A6C TAI THAM VOWEL SIGN OA BELOW, and presents no known
issues.  The combinations <U+1A60, U+1A3F> and <U+1A60, U+1A45>
represent vowels, generally /e/ and /o/ in Tai Khün and Tai Lue
and /i:a/ and /u:a/ in Northern Thai and Lao.  These may reasonably be
regarded as matres lectionis.  The question then arose of how to order
them with respect to any other vowels or tone marks.  Thai suggested
that the mater lectionis should come last, treating the syllable as a
pair of chained syllables, but because of Tai Khün feedback they were
included in the pre-vowel consonant stack. For interaction with other
vowel symbols, this decision is reflected in the 2007 proposal
http://www.unicode.org/L2/L2007/07007r-n3207r-lanna.pdf .

I have been indexing the Lanna script spellings in the 'Northern Thai
Dictionary of Palm-Leaf Manuscripts', and I have encountered puzzles with
some very Siamese spellings.

Q1.  Should I treat the mater lectionis as part of the initial stack or
as starting a chained syllable when an unexpected written vowel
appears to precede it? Specifically:

Q1a. Should I encode a certain writing of /kua?/ 'a wooden or
woven-bamboo tray' as <U+1A20 TAI THAM LETTER HIGH KA, U+1A60, U+1A45,
U+1A62 TAI THAM VOWEL SIGN MAI SAT, U+1A61 TAI THAM VOWEL SIGN A> or as
<U+1A20 TAI THAM LETTER HIGH KA, U+1A62 TAI THAM VOWEL SIGN MAI SAT,
U+1A60, U+1A45, U+1A61 TAI THAM VOWEL SIGN A>?  The usual spelling of
this word would be <U+1A20 TAI THAM LETTER HIGH KA, U+1A60, U+1A45,
U+1A6B TAI THAM VOWEL SIGN O, U+1A61 TAI THAM VOWEL SIGN A>.

Q1b. Should I encode a certain writing of /lu?a/ 'firewood' other than
as <U+1A49 TAI THAM LETTER HIGH HA, U+1A56 TAI THAM CONSONANT SIGN
MEDIAL LA, U+1A62 TAI THAM VOWEL SIGN MAI SAT, U+1A60, U+1A45>, and if
so, how.  The usual writing of the word would be encoded as <U+1A49,
U+1A56, U+1A60, U+1A45, U+1A6B TAI THAM VOWEL SIGN O>.

Q1c. I see three reasonable encodings of the writing of /sawi?an/ 'a
large woven basket for holding unhusked rice'.  The choice between (ii)
and (iii) depends on the answer to Q2.  The three choices are:
(i) <U+1A48 TAI THAM LETTER HIGH SA, U+1A60, U+1A45, U+1A7B TAI THAM
SIGN MAI SAM, U+1A66 TAI THAM VOWEL SIGN II, U+1A60, U+1A3F, U+1A41 TAI
THAM LETTER RA>
(ii) <U+1A48 TAI THAM LETTER HIGH SA, U+1A60, U+1A45, U+1A7B TAI THAM
SIGN MAI SAM, U+1A60, U+1A3F, U+1A66 TAI THAM VOWEL SIGN II, U+1A41 TAI
THAM LETTER RA> and
(iii) <U+1A48 TAI THAM LETTER HIGH SA, U+1A60, U+1A45, U+1A60, U+1A3F,
U+1A7B TAI THAM SIGN MAI SAM, U+1A66 TAI THAM VOWEL SIGN II, U+1A41 TAI
THAM LETTER RA>.
Which encoding should I choose?

Q2. Where should I put the MAI SAM in the encoding of the fuller usual
writing of /sawi?an/?  Should I write
(i) <U+1A48 TAI THAM LETTER HIGH SA, U+1A60, U+1A45, U+1A7B TAI THAM
SIGN MAI SAM, U+1A60, U+1A3F, U+1A41 TAI THAM LETTER RA> or
(ii) <U+1A48 TAI THAM LETTER HIGH SA, U+1A60, U+1A45, U+1A60, U+1A3F,
U+1A7B TAI THAM SIGN MAI SAM, U+1A41 TAI THAM LETTER RA>?

The TUS does not specify where the MAI SAM representing the typically
anaptyctic vowel /a/ should go.  (In this case, /swi?an/ *is* a possible
Northern Thai word.)  The previously cited 2007 proposal says, "it is
stored following the subjoined form to indicate the consonant being at
the start of a new syllable".  However, this moves a mark which is
positioned like a vowel or tone mark into the consonant cluster's
sequence of code points.

The Maefahluang dictionary (p719 of Revision 1) actually writes the mai
sam after the RA.  Should this be regarded as a typographical error?  I
have not been able to discern a pattern in the positioning in that
dictionary of mai sam used to indicate a hidden syllable boundary.

Richard.


From richard.wordingham at ntlworld.com  Sun Jan  4 14:49:05 2015
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sun, 4 Jan 2015 20:49:05 +0000
Subject: Encoding Sequence for Tai Tham Matres Lectionis
In-Reply-To: <20150104012905.6830b17e@JRWUBU2>
References: <20150104012905.6830b17e@JRWUBU2>
Message-ID: <20150104204905.1137f21d@JRWUBU2>

On Sun, 4 Jan 2015 01:29:05 +0000
Richard Wordingham <richard.wordingham at ntlworld.com> wrote:


> Similarly, there is nothing in TUS itself to specify whether /ku?/
> (Lao /k?u?/) 'pair' is spelt <U+1A23 TAI THAM LETTER LOW KA, U+1A6A
> TAI THAM VOWEL SIGN UU, U+1A76 TAI THAM SIGN TONE-2> or <U+1A23, U+1A76,
> U+1A6A>.  Unlike Thai, these two sequences are not canonically
> equivalent.

Correction: Read U+1A75 TAI THAM SIGN TONE-1 for U+1A76 in both
locations.
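
The non-equivalence claim can be checked mechanically. The sketch below compares NFD forms of the two orders, using the corrected U+1A75. The Thai comparison string (<U+0E04, U+0E39, U+0E48>) is my own choice of counterpart for illustration and is not taken from the message above; the helper name `canonically_equivalent` is likewise made up, and results depend on the Unicode data bundled with the Python interpreter.

```python
import unicodedata


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent iff their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Tai Tham: LOW KA + VOWEL SIGN UU + TONE-1, in both orders.
tham_vowel_first = "\u1A23\u1A6A\u1A75"
tham_tone_first = "\u1A23\u1A75\u1A6A"

# Thai counterpart (assumed for illustration): KHO KHWAI + SARA UU + MAI EK.
thai_vowel_first = "\u0E04\u0E39\u0E48"
thai_tone_first = "\u0E04\u0E48\u0E39"

# Canonical reordering only swaps adjacent marks with non-zero, differing
# combining classes, so print the classes involved.
for ch in "\u1A6A\u1A75\u0E39\u0E48":
    print(f"U+{ord(ch):04X} ccc={unicodedata.combining(ch)}")

print("Tai Tham equivalent:", canonically_equivalent(tham_vowel_first, tham_tone_first))
print("Thai equivalent:    ", canonically_equivalent(thai_vowel_first, thai_tone_first))
```

The Tai Tham vowel sign has combining class 0, which blocks reordering, whereas the Thai vowel and tone mark have distinct non-zero classes and so are sorted into a single canonical order.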


From olopierpa at gmail.com  Fri Jan 16 15:53:06 2015
From: olopierpa at gmail.com (Pierpaolo Bernardi)
Date: Fri, 16 Jan 2015 22:53:06 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <54B976D7.7050100@unicode.org>
References: <54B976D7.7050100@unicode.org>
Message-ID: <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>

On Fri, Jan 16, 2015 at 9:38 PM, <announcements at unicode.org> wrote:
>
> (Re-sending with minor correction to price.)

> The Unicode 7.0 core specification is now available in paperback book form.

>  The two volumes may be purchased separately or together. The cost for the pair is US$16.24, plus postage and applicable taxes.

Then why Lulu asks me 22.20 EUR (= 25.68 USD) (+ taxes + shipping)?

And why Lulu says they can ship them only from North America? (so
shipping costs 33.99 EUR, making the total (64.56 EUR = 74.67 USD) not
so cheap anymore)


Puzzled
P.

From olopierpa at gmail.com  Fri Jan 16 16:07:27 2015
From: olopierpa at gmail.com (Pierpaolo Bernardi)
Date: Fri, 16 Jan 2015 23:07:27 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
Message-ID: <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>

On Fri, Jan 16, 2015 at 10:53 PM, Pierpaolo Bernardi
<olopierpa at gmail.com> wrote:

> And why Lulu says they can ship them only from North America? (so
> shipping costs 33.99 EUR, making the total (64.56 EUR = 74.67 USD) not
> so cheap anymore)

As a point of comparison, the Addison-Wesley edition of the 5.0
version, bought from Amazon, cost me 55.69 EUR in 2006, included taxes
and shipping.

P.

From olopierpa at gmail.com  Fri Jan 16 17:21:36 2015
From: olopierpa at gmail.com (Pierpaolo Bernardi)
Date: Sat, 17 Jan 2015 00:21:36 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
 <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>
Message-ID: <CANY8u7FF56UFqYczp_aSFvDUWmgQC+6s=rMioZYdMCw3mGGT2w@mail.gmail.com>

On Fri, Jan 16, 2015 at 11:07 PM, Pierpaolo Bernardi
<olopierpa at gmail.com> wrote:
> On Fri, Jan 16, 2015 at 10:53 PM, Pierpaolo Bernardi
> <olopierpa at gmail.com> wrote:
>
>> And why Lulu says they can ship them only from North America? (so
>> shipping costs 33.99 EUR, making the total (64.56 EUR = 74.67 USD) not
>> so cheap anymore)
>
> As a point of comparison, the Addison-Wesley edition of the 5.0
> version, bought from Amazon, cost me 55.69 EUR in 2006, included taxes
> and shipping.

As another point of comparison, the Lulu edition of 6.1, bought from
Lulu cost me 33.24 USD (including taxes & shipping).

P.

From olopierpa at gmail.com  Fri Jan 16 18:06:16 2015
From: olopierpa at gmail.com (Pierpaolo Bernardi)
Date: Sat, 17 Jan 2015 01:06:16 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <54B9A144.6030204@ix.netcom.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
 <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>
 <54B9A144.6030204@ix.netcom.com>
Message-ID: <CANY8u7EQ_pibc4+vzx8WPrwRMPh+P2s6KyYSvOcgT84G0-rHuQ@mail.gmail.com>

In May 2012 1 USD = 0.78 EUR, and the 6.1 book cost me (all included) US$ 33.24

Today, 1USD = 0.86 EUR, Lulu does not show me prices in USD, only in
EUR, and asks me 64.56 EUR, for the 7.0 books, using the same shipping
method.

The difference in exchange rates does not explain this.

The 2012 price breaks down in this way:

Subtotale $15.96
Spedizione $13.75
IVA: $3.53
Totale $33.24

Today they ask me for the two books:

Subtotale: €22.20
Spedizione: €33.99
IVA: €8,37
Totale: €64.56

P.


On Sat, Jan 17, 2015 at 12:39 AM, Asmus Freytag (t)
<asmus-inc at ix.netcom.com> wrote:
> On 1/16/2015 2:07 PM, Pierpaolo Bernardi wrote:
>>
>> On Fri, Jan 16, 2015 at 10:53 PM, Pierpaolo Bernardi
>> <olopierpa at gmail.com> wrote:
>>
>>> And why Lulu says they can ship them only from North America? (so
>>> shipping costs 33.99 EUR, making the total (64.56 EUR = 74.67 USD) not
>>> so cheap anymore)
>>
>> As a point of comparison, the Addison-Wesley edition of the 5.0
>> version, bought from Amazon, cost me 55.69 EUR in 2006, included taxes
>> and shipping.
>
>
> Interestingly enough, because of currency rates, your equivalent USD price
> in 2006 would have been nearly identical, approximately $72.50 (depending on
> which date your order was converted).
>
> Looks like the difference in the price in EUR is largely accounted for by
> the recent overvaluation of the dollar and undervaluation of the Euro.
>
> A./


From verdy_p at wanadoo.fr  Fri Jan 16 18:23:26 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sat, 17 Jan 2015 01:23:26 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <CANY8u7FF56UFqYczp_aSFvDUWmgQC+6s=rMioZYdMCw3mGGT2w@mail.gmail.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
 <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>
 <CANY8u7FF56UFqYczp_aSFvDUWmgQC+6s=rMioZYdMCw3mGGT2w@mail.gmail.com>
Message-ID: <CAGa7JC3MLHVD_NNMBkPQ-DWPrFbZouT8dFUKHYyD919iSAeZ=A@mail.gmail.com>

For me in France the subtotal is effectively $16.24 but shipping costs
depend on the type of delivery; giving a total of
* $17.66 per regular mail (7 to 17 business days, i.e. up to 3 weeks
**after printing**) or
* $40.05 per "expedited" mail (2 to 5 business days, i.e. up to 1
week **after printing**) or
* $65.95 per "express" delivery (2 to 6 business days **after printing**)

The problem is not the price but the delivery! I understand the term
"after printing" (because it is printed on demand); however:
- for the smallest price, the delivery delay is excessively long even by
regular mail (I suppose they are waiting to get some volume before sending,
but 17 days is really too much).
- the two other delivery options are largely overpriced; Lulu profits
largely there and this is not justified by delivery costs.
- even the most expensive "express" option is longer than the "expedited"
option.

Really, Lulu is certainly not the best partner for the Consortium, and it
would be preferable that the Consortium prints on demand but sells via
a more serious worldwide partner such as Amazon (if Amazon takes a few
percent of sales, just take that into account in the initial price).
Shipping costs by Amazon are far better (and they can also print on demand
with their own partner printers). And there are far better options for
payments, more secure delivery, with more options at much smaller costs.

Lulu seems very inefficient. I've never anywhere seen "shipping and
delivery costs" so expensive for books (and so slow).

Note that taxes are also not included, so expect to also pay the VAT to the
post office or delivery service to get the product (in France this adds
20%, not only to the product's cost but to the full service). But I don't
think that the effective delivery costs are so expensive, so you'll pay
taxes on the books plus on a part of the "shipping costs" actually NOT
included in the effective postal costs. That VAT amount paid on exports is
unpredictable from this reseller.

So don't expect to have people buying these books from Lulu except from US.



From verdy_p at wanadoo.fr  Fri Jan 16 18:48:17 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sat, 17 Jan 2015 01:48:17 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <CAGa7JC3MLHVD_NNMBkPQ-DWPrFbZouT8dFUKHYyD919iSAeZ=A@mail.gmail.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
 <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>
 <CANY8u7FF56UFqYczp_aSFvDUWmgQC+6s=rMioZYdMCw3mGGT2w@mail.gmail.com>
 <CAGa7JC3MLHVD_NNMBkPQ-DWPrFbZouT8dFUKHYyD919iSAeZ=A@mail.gmail.com>
Message-ID: <CAGa7JC3yZiqDDTG=9JBQWxwjYjCSgji32vLAf5Osgk5UEfznqg@mail.gmail.com>

Note: Lulu has other sale points for other countries.

I can find volume 2 in France there

http://www.lulu.com/shop/unicode-consortium/unicode-70-volume-1/paperback/product-21962094.html

at €11.96 (but still not including 20% French VAT and still not the
shipping costs from US only)

And the 3 delivery cost options are at €9.99, €29.99, and €49.99 (same
lengthy delays after printing from US; the third option being completely
useless).

In summary: no advantage at all for using the French Lulu sale point which
is even more expensive (given the EUR/USD conversion plus possible
additional bank fees for currency change operation, or international
transaction for some credit cards) than the US Lulu sale point.

I've not checked the other "European" Lulu sale points but they are
probably similar. Lulu is not really tuned as an international reseller. I
would suggest to the Consortium to take some orders directly; printing in
the US, and sending via UPS or similar, would be much more efficient.

Or interested people here could take a group order in a managed page with
help of a small team to make the delivery from the US, at reasonable costs. A
local foundation could also take a volume order in advance for some region
(EU+EFTA, Japan/Korea/Hong Kong, Russia, South Africa, India, Brazil,
Canada); and if there remain some unsold items after some time (e.g. 6
months), they would be donated to local public libraries or university
libraries.

Individual sales are not the best terms. People may want to give some money
to the Unicode; however they don't want to give it only for Lulu profiting.
They prefer to give by becoming Unicode members, using the online
databases, and printing themselves what they want.



From olopierpa at gmail.com  Fri Jan 16 19:12:57 2015
From: olopierpa at gmail.com (Pierpaolo Bernardi)
Date: Sat, 17 Jan 2015 02:12:57 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <CAGa7JC3yZiqDDTG=9JBQWxwjYjCSgji32vLAf5Osgk5UEfznqg@mail.gmail.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
 <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>
 <CANY8u7FF56UFqYczp_aSFvDUWmgQC+6s=rMioZYdMCw3mGGT2w@mail.gmail.com>
 <CAGa7JC3MLHVD_NNMBkPQ-DWPrFbZouT8dFUKHYyD919iSAeZ=A@mail.gmail.com>
 <CAGa7JC3yZiqDDTG=9JBQWxwjYjCSgji32vLAf5Osgk5UEfznqg@mail.gmail.com>
Message-ID: <CANY8u7Gy+FN3u9w7mrdqO-4Zh+AexMn_UaRhbPZFdehSVR-zmg@mail.gmail.com>

On Sat, Jan 17, 2015 at 1:48 AM, Philippe Verdy <verdy_p at wanadoo.fr> wrote:
> Note: Lulu has other sale points for other countries.

Lulu was redirecting me automatically to the Italian shop.  Switching
to USA, the prices of the books are what they are supposed to be and
the shipping costs too are reduced. Like this

Item Subtotal: $16.24
Shipping Subtotal: $40.04
Value Added Tax: $9.46
Total: $65.74 (= 56.86 EUR today's exchange rate)

Still a lot, but now it is less extravagant. The principal culprit of
the high price seems to be that the shipping cost for two books is
twice that of one book.  No discount.

> I can find volume 2 in France there
>
> http://www.lulu.com/shop/unicode-consortium/unicode-70-volume-1/paperback/product-21962094.html
>
> at €11.96 (but still not including 20% French VAT and still not the shipping
> costs from US only)

If you proceed with the order, you will be shown shipping and VAT
before confirming.

> And the 3 delivery costs options are at €9.99, €29.99, and €49.99 (same
> lengthy delays after printing from US; the third option being completely
> useless).

The cheaper shipping is not traced; I have found it totally unreliable,
at my expense, and Lulu does not take responsibility for missing
deliveries. All the prices I quoted above refer to the cheapest of the
traceable options.

> Or interested people here could take a group order in a managed page with
> help of a small team to make the delivery from US, at reasonable costs. A
> local foundation could also take a volume order in advance for some region
> (EU+EFTA, Japan/Korea/Hong Kong, Russia, South Africa, India, Brazil,
> Canada); and if there remains some unsold items after some time (e.g. 6
> months), they would be donated to local public libraries or university
> library.

I doubt this would be cost effective, even relying on volunteers work.

Best would be to use Amazon logistics.  Many Lulu books are available
from Amazon, I must look into what's involved in making Lulu books
available from Amazon.

>> Note that taxes are also not included; so expect to pay also the VAT to
>> the post office or delivery service to get the product

Nowadays this is handled transparently for you by the carriers. If you
proceed in the checkout you will see your local VAT applied.

Thanks for all the replies. Pardon my rant.
P.


From olopierpa at gmail.com  Fri Jan 16 20:16:54 2015
From: olopierpa at gmail.com (Pierpaolo Bernardi)
Date: Sat, 17 Jan 2015 03:16:54 +0100
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <CANY8u7Gy+FN3u9w7mrdqO-4Zh+AexMn_UaRhbPZFdehSVR-zmg@mail.gmail.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
 <CANY8u7EfaAr9ge0XORyHyyx_bn3nL3NdtswmsV9def4wXuLsYA@mail.gmail.com>
 <CANY8u7FF56UFqYczp_aSFvDUWmgQC+6s=rMioZYdMCw3mGGT2w@mail.gmail.com>
 <CAGa7JC3MLHVD_NNMBkPQ-DWPrFbZouT8dFUKHYyD919iSAeZ=A@mail.gmail.com>
 <CAGa7JC3yZiqDDTG=9JBQWxwjYjCSgji32vLAf5Osgk5UEfznqg@mail.gmail.com>
 <CANY8u7Gy+FN3u9w7mrdqO-4Zh+AexMn_UaRhbPZFdehSVR-zmg@mail.gmail.com>
Message-ID: <CANY8u7H+LkR8GA8-i5gAX7E3W1LTNPXE_9HGzEV=7ikAyTD-2Q@mail.gmail.com>

And BTW, I think I have solved the mystery.

When converting from USD to EUR in the local shops they are using the
*inverse* of the exchange rate.  Instead of *multiplying* the USD
price by the exchange rate, they are *dividing* by it (or vice versa).
This explains with great accuracy the weird differences between the
USA shop and the EU ones.  Sigh.
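
To make the kind of mix-up concrete, here is a small sketch of the two calculations. It uses the rate quoted earlier in the thread (1 USD = 0.86 EUR) and the $16.24 list price; the function names are made up for illustration, and the sketch does not attempt to reproduce Lulu's actual shop prices, only the effect of applying a rate in the wrong direction.

```python
def usd_to_eur(usd, eur_per_usd):
    """Correct direction: multiply a USD amount by the EUR-per-USD rate."""
    return usd * eur_per_usd


def usd_to_eur_inverted(usd, eur_per_usd):
    """The suspected bug: dividing by the rate instead of multiplying."""
    return usd / eur_per_usd


rate = 0.86       # EUR per USD, as quoted earlier in the thread
list_price = 16.24  # USD, the announced price for the pair

correct = usd_to_eur(list_price, rate)
inverted = usd_to_eur_inverted(list_price, rate)

print(f"correct:  {correct:.2f} EUR")
print(f"inverted: {inverted:.2f} EUR")
# The inverted result is too high by a factor of 1 / rate**2.
print(f"overcharge factor: {inverted / correct:.3f}")
```

With a rate below 1, the inverted conversion always yields a larger EUR figure than the correct one.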

P.




From raymond at almanach.co.uk  Sat Jan 17 04:58:30 2015
From: raymond at almanach.co.uk (Raymond Mercier)
Date: Sat, 17 Jan 2015 10:58:30 -0000
Subject: Unicode 7.0 Paperback Available
Message-ID: <646C8978F0A244DB9A1BBA14A5C45EE7@UserPC>

Since the new printed volume is so expensive when shipping is included, why not try one of the commercial binding services, such as
https://www.doxdirect.com/products/specialist-document-printing/pdf-printing/.
The pdf files that make up Unicode 7.0 can all be downloaded from http://www.unicode.org/versions/Unicode7.0.0/.
It would have been easier of course if the individual PDFs had been gathered together into larger groups, although one can do that easily within Acrobat.

Best of all would be a volume (or two ?) like that for Unicode 5 produced by Addison Wesley. When I once asked about that for Unicode 6  I was told that it was just too difficult to get the pages formatted suitably for book production. But if the charts can be presented as pdf, why is it difficult to print and bind them ?

Regards
Raymond Mercier

From raymond at almanach.co.uk  Sat Jan 17 12:35:33 2015
From: raymond at almanach.co.uk (Raymond Mercier)
Date: Sat, 17 Jan 2015 18:35:33 -0000
Subject: Unicode 7.0 Paperback Available
In-Reply-To: <54BAA148.9090705@ix.netcom.com>
References: <646C8978F0A244DB9A1BBA14A5C45EE7@UserPC>
 <54BAA148.9090705@ix.netcom.com>
Message-ID: <75E63C2C335F428BAF7D26CAD3FFE150@UserPC>

Asmus,
Thanks. Indeed I am surprised that a publisher cannot get results as clean and reliable as I do when printing from Acrobat.
R

From: Asmus Freytag (t) 
Sent: Saturday, January 17, 2015 5:52 PM
To: Raymond Mercier ; unicode at unicode.org 
Subject: Re: Unicode 7.0 Paperback Available

Raymond,

even though the source is PDF, the nature of the fonts used for the charts makes this extremely challenging for the printers. Experiments run by some volunteers have determined that you can expect very inconsistent results, because the way these printing services and their contractors handle PDF is just not the same as when you use Acrobat or some browser plug-in to view them on screen.

You may find this a surprising state of affairs, but those are the facts on the ground. It was found that even the same service may get you different results for each order. And by different, I mean, with different discrepancies from the desired output.

These services apparently subcontract with a number of printing presses, all of which may have different software.

A./



From srl at icu-project.org  Sat Jan 17 12:46:46 2015
From: srl at icu-project.org (Steven R. Loomis)
Date: Sat, 17 Jan 2015 10:46:46 -0800
Subject: Unicode 7.0 Paperback Available (correction)
In-Reply-To: <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
References: <54B976D7.7050100@unicode.org>
 <CANY8u7EqRVW8T2fRxKDfwrG99p1zvx122Ss=u7H2FN9BwbRYWg@mail.gmail.com>
Message-ID: <777E7D1D-65AF-4CE1-B440-2AFD8A734A2E@icu-project.org>

So, for reference, as I hadn't seen it discussed yet, here is the cost with shipping to the USA:
 ITEMS: 16.24 $USD
 Shipping cost (mail) 5.24
 Subtotal 21.48
 (Plus California tax? 23.09 USD)

I don't want to downplay the annoyances others have noted, but I was curious and wanted to check the price.

S


Sent from our iPhone.

> On Jan 16, 2015, at 1:53 PM, Pierpaolo Bernardi <olopierpa at gmail.com> wrote:
> 
>> On Fri, Jan 16, 2015 at 9:38 PM, <announcements at unicode.org> wrote:
>> 
>> (Re-sending with minor correction to price.)
> 
>> The Unicode 7.0 core specification is now available in paperback book form.
> 
>> The two volumes may be purchased separately or together. The cost for the pair is US$16.24, plus postage and applicable taxes.
> 
> Then why does Lulu ask me for 22.20 EUR (= 25.68 USD) (+ taxes + shipping)?
> 
> And why does Lulu say they can ship them only from North America? (So
> shipping costs 33.99 EUR, making the total (64.56 EUR = 74.67 USD) not
> so cheap anymore.)
> 
> 
> Puzzled
> P.
> _______________________________________________
> Unicode mailing list
> Unicode at unicode.org
> http://unicode.org/mailman/listinfo/unicode

From raymond at almanach.co.uk  Sat Jan 17 14:46:42 2015
From: raymond at almanach.co.uk (Raymond Mercier)
Date: Sat, 17 Jan 2015 20:46:42 -0000
Subject: Unicode 7.0 Paperback Available
In-Reply-To: <54BAB0CD.6050802@ix.netcom.com>
References: <646C8978F0A244DB9A1BBA14A5C45EE7@UserPC>
 <54BAA148.9090705@ix.netcom.com> <75E63C2C335F428BAF7D26CAD3FFE150@UserPC>
 <54BAB0CD.6050802@ix.netcom.com>
Message-ID: <12CC85DFFE044E28A957FF08B4005465@UserPC>

Well, why not print a good clean copy with Acrobat and a high-quality printer, and do the rest of the volume printing as camera-ready? I have had complex texts published that way.
R.

From verdy_p at wanadoo.fr  Sat Jan 17 16:04:17 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sat, 17 Jan 2015 23:04:17 +0100
Subject: Unicode 7.0 Paperback Available
In-Reply-To: <12CC85DFFE044E28A957FF08B4005465@UserPC>
References: <646C8978F0A244DB9A1BBA14A5C45EE7@UserPC>
 <54BAA148.9090705@ix.netcom.com>
 <75E63C2C335F428BAF7D26CAD3FFE150@UserPC> <54BAB0CD.6050802@ix.netcom.com>
 <12CC85DFFE044E28A957FF08B4005465@UserPC>
Message-ID: <CAGa7JC31DW6A1ed2+oygHWLLthHm__BCvxfv7uNhvF6bt3PCFA@mail.gmail.com>

Could Unicode then create an alternate edition with Amazon Publishing?
(Maybe at a different cost, possibly in a different paper format if you want to
keep the same commercial margins.)
It would be available in electronic form (Kindle and its apps for Android
and iOS) and in paperback, and really available worldwide at a reasonable price
and shipping cost, with many payment and delivery options. And it could be
sold as well in other traditional bookshops.

Does Amazon require some preprint volumes to be paid for first, and charge
additional fees for maintaining stock, or does Amazon offer an option
for print-on-demand for small volumes?

How does ISO handle its own (costly) publications (shipped from
Switzerland)?

Can a national public library contract with Unicode to create its own
edition and distribution?
Or does Unicode want direct control of sales to customers? Or does it
have an exclusive publishing contract with Lulu? Exclusive in the US only or
worldwide?


From public at khwilliamson.com  Sun Jan 25 00:26:09 2015
From: public at khwilliamson.com (Karl Williamson)
Date: Sat, 24 Jan 2015 23:26:09 -0700
Subject: UAX 29 questions
Message-ID: <54C48C81.3080405@khwilliamson.com>

I vaguely recall asking something like this before, but if so, I didn't 
save the answers, and a search of the archives didn't turn up anything.

Some of the rules in UAX #29 don't make sense to me.

For example, rule WB7a
   Hebrew_Letter  ×  Single_Quote

seems to say that a Hebrew_Letter followed by a Single Quote shouldn't 
break.  (And Rule WB4 says that actually there can be Extend and Format 
characters between the two and those should be ignored).

But the earlier rule, WB6

  (ALetter | Hebrew_Letter)  ×  (MidLetter | MidNumLet | Single_Quote)
  (ALetter | Hebrew_Letter)

seems to me to say (among other things) that a Hebrew Letter followed by 
a Single Quote shouldn't break if and only if the latter is also 
followed by either an ALetter or another Hebrew Letter (again modulo 
ignored Format and Extend letters)

This seems contradictory.  One rule says something unconditionally, and 
the other rule adds conditions.

From richard.wordingham at ntlworld.com  Sun Jan 25 01:24:08 2015
From: richard.wordingham at ntlworld.com (Richard Wordingham)
Date: Sun, 25 Jan 2015 07:24:08 +0000
Subject: UAX 29 questions
In-Reply-To: <54C48C81.3080405@khwilliamson.com>
References: <54C48C81.3080405@khwilliamson.com>
Message-ID: <20150125072408.28015391@JRWUBU2>

On Sat, 24 Jan 2015 23:26:09 -0700
Karl Williamson <public at khwilliamson.com> wrote:

> But the earlier rule, WB6
> 
>   (ALetter | Hebrew_Letter)  ×  (MidLetter | MidNumLet
> | Single_Quote) (ALetter | Hebrew_Letter)
> 
> seems to me to say (among other things) that a Hebrew Letter followed
> by a Single Quote shouldn't break if and only if the latter is also 
> followed by either an ALetter or another Hebrew Letter (again modulo 
> ignored Format and Extend letters)
> 
> This seems contradictory.  One rule says something unconditionally,
> and the other rule adds conditions.

There's no 'only if'.  WB6 applies to 6 combinations, one of which is
redundant because of WB7A.  Removing the redundant condition would
require an additional rule.

Richard.
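
A minimal Python sketch of the reading above. It models only WB6 and WB7a, each as an independent sufficient condition for "no break", and ignores WB4 and every other rule. The function names and the string labels for the Word_Break classes are illustrative assumptions, not anything defined by UAX #29.

```python
from itertools import product

LETTERS = ("ALetter", "Hebrew_Letter")
MID = ("MidLetter", "MidNumLet", "Single_Quote")


def wb6(prev, cur, nxt):
    # WB6: (ALetter | Hebrew_Letter) × (MidLetter | MidNumLet | Single_Quote)
    #      (ALetter | Hebrew_Letter)
    return prev in LETTERS and cur in MID and nxt in LETTERS


def wb7a(prev, cur, nxt):
    # WB7a: Hebrew_Letter × Single_Quote (what follows the quote is irrelevant)
    return prev == "Hebrew_Letter" and cur == "Single_Quote"


def no_break(prev, cur, nxt):
    # The boundary between prev and cur is suppressed if ANY rule matches;
    # neither rule says "break" when it fails to match.
    return wb6(prev, cur, nxt) or wb7a(prev, cur, nxt)


# The 2 x 3 = 6 (prev, cur) combinations that WB6 covers, and the subset
# that WB7a already handles without looking at the following character.
wb6_pairs = list(product(LETTERS, MID))
redundant = [(p, c) for p, c in wb6_pairs if wb7a(p, c, None)]

print(len(wb6_pairs), redundant)
print(no_break("Hebrew_Letter", "Single_Quote", "Other"))
print(no_break("ALetter", "Single_Quote", "Other"))
```

Under these assumptions a Hebrew letter followed by a single quote never breaks, while an ALetter followed by a single quote is protected only when another letter follows.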


From verdy_p at wanadoo.fr  Sun Jan 25 06:14:33 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Sun, 25 Jan 2015 13:14:33 +0100
Subject: UAX 29 questions
In-Reply-To: <54C48C81.3080405@khwilliamson.com>
References: <54C48C81.3080405@khwilliamson.com>
Message-ID: <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>

This is not a contradiction.

Combine the two rules and they are equivalent to alternate rules.
WB56 (that is, WB6) can be read as these two:

 (WB56a) ALetter  ×  (MidLetter | MidNumLet | Single_Quote) (ALetter |
Hebrew_Letter)

 (WB56b)  Hebrew_Letter  ×  (MidLetter | MidNumLet | Single_Quote) (ALetter
| Hebrew_Letter)


Then add WB7a, labelled here as:

  (WB57) Hebrew_Letter  ×  Single_Quote

It just removes the condition of a letter following the quote in WB56b,
so that WB56b and WB57 together can be read as equivalent to these two:

 (WB56c)  Hebrew_Letter  ×  (MidLetter | MidNumLet) (ALetter |
Hebrew_Letter)

 (WB57) Hebrew_Letter  ×  Single_Quote

But you cannot merge these last two rules into a single rule for WB56.
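
The claimed equivalence can be checked by brute force. The Python sketch below enumerates every (previous, current, next) triple over a small set of class labels and compares the original pair of rules with the split rules WB56a, WB56c and WB57. The class list, the catch-all label "Other" and the function names are assumptions made for illustration; only these rules are modelled.

```python
from itertools import product

CLASSES = ("ALetter", "Hebrew_Letter", "MidLetter", "MidNumLet",
           "Single_Quote", "Other")
LETTERS = {"ALetter", "Hebrew_Letter"}
MID = {"MidLetter", "MidNumLet", "Single_Quote"}


def original(prev, cur, nxt):
    # WB6 and WB7a as written in UAX #29.
    wb6 = prev in LETTERS and cur in MID and nxt in LETTERS
    wb7a = prev == "Hebrew_Letter" and cur == "Single_Quote"
    return wb6 or wb7a


def split(prev, cur, nxt):
    # The rewritten rules from the message above.
    wb56a = prev == "ALetter" and cur in MID and nxt in LETTERS
    wb56c = (prev == "Hebrew_Letter"
             and cur in {"MidLetter", "MidNumLet"}
             and nxt in LETTERS)
    wb57 = prev == "Hebrew_Letter" and cur == "Single_Quote"
    return wb56a or wb56c or wb57


triples = list(product(CLASSES, repeat=3))
mismatches = [t for t in triples if original(*t) != split(*t)]
print(len(triples), len(mismatches))
```

An empty mismatch list means the two formulations suppress exactly the same boundaries for these classes.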



From rwhlk142 at gmail.com  Sun Jan 25 15:54:44 2015
From: rwhlk142 at gmail.com (Robert Wheelock)
Date: Sun, 25 Jan 2015 16:54:44 -0500
Subject: =?UTF-8?Q?The_NEW_Keyboard_Layout=E2=80=94IEAOU?=
Message-ID: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>

Hello!

I came up with a BRAND-NEW keyboard layout designed to make typing
easier, named the IEAOU (ee-eh-ah-oh-oo) System, based on letter frequencies.

The letters in the new IEAOU layout are arranged as follows:

(TOP):  Digits / Punctuation / Accents
(MEDIAL):  Q Y <:|;> W <"|'> L N D T S H <+|=> <\|!>
(HOME):  X K G F <?|`> P I E A O U
(BOTTOM):  C J Z V B M R <<|,> <>|.> <?|/>

Please respond and air what you think of it.  Thank you!

From duerst at it.aoyama.ac.jp  Sun Jan 25 19:22:49 2015
From: duerst at it.aoyama.ac.jp (=?UTF-8?B?Ik1hcnRpbiBKLiBEw7xyc3Qi?=)
Date: Mon, 26 Jan 2015 10:22:49 +0900
Subject: The NEW Keyboard =?UTF-8?B?TGF5b3V04oCUSUVBT1U=?=
In-Reply-To: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>
References: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>
Message-ID: <54C596E9.2050103@it.aoyama.ac.jp>

What's better about this keyboard when compared to the Dvorak layout?
At first sight, it looks heavily right-handed: all the letters that the 
Dvorak keyboard has on the home row are on the right hand.

Regards,   Martin.

P.S.: I'm a happy Dvorak user.



From philip_chastney at yahoo.com  Mon Jan 26 04:13:47 2015
From: philip_chastney at yahoo.com (philip chastney)
Date: Mon, 26 Jan 2015 02:13:47 -0800
Subject: =?utf-8?B?UmU6IFRoZSBORVcgS2V5Ym9hcmQgTGF5b3V04oCUSUVBT1U=?=
In-Reply-To: <54C596E9.2050103@it.aoyama.ac.jp>
Message-ID: <1422267227.6369.YahooMailBasic@web162605.mail.bf1.yahoo.com>

as anybody who has tried to type with a cat on their lap will confirm, there are times when a left- or right-handed bias in the keyboard layout is a positive advantage

/phil

From marc.blanchet at viagenie.ca  Mon Jan 26 04:51:02 2015
From: marc.blanchet at viagenie.ca (Marc Blanchet)
Date: Mon, 26 Jan 2015 11:51:02 +0100
Subject: =?utf-8?Q?Re=3A_The_NEW_Keyboard_Layout=E2=80=94IEAOU?=
In-Reply-To: <1422267227.6369.YahooMailBasic@web162605.mail.bf1.yahoo.com>
References: <1422267227.6369.YahooMailBasic@web162605.mail.bf1.yahoo.com>
Message-ID: <22C2E6F2-5399-4E36-845C-CDE335714F55@viagenie.ca>


> On 2015-01-26 at 11:13, philip chastney <philip_chastney at yahoo.com> wrote:
> 
> as anybody who has tried to type with a cat on their lap will confirm, there are times when a left- or right-handed bias in the keyboard layout is a positive advantage

I would then suggest the name of the keyboard to be "MEOW"

Marc.




From KalvesmakiJ at doaks.org  Mon Jan 26 07:18:01 2015
From: KalvesmakiJ at doaks.org (Kalvesmaki, Joel)
Date: Mon, 26 Jan 2015 13:18:01 +0000
Subject: =?Windows-1252?Q?Re:_The_NEW_Keyboard_Layout=8BIEAOU?=
In-Reply-To: <54C596E9.2050103@it.aoyama.ac.jp>
Message-ID: <D0EBA73A.B47E%kalvesmakij@doaks.org>

Indeed, Dvorak distributes the home row burden across both hands, vowels
on the left, most common consonants on the right.

Also, there are one-handed variations of Dvorak for both left and right
hands, but unlike the proposal below, the main hand's home row is centered
on the keyboard.

I too am a happy Dvorak user, for 16 years.

jk
--
Joel Kalvesmaki
Editor in Byzantine Studies
Dumbarton Oaks
202 339 6435


On 1/25/15, 8:22 PM, "Martin J. Dürst" <duerst at it.aoyama.ac.jp> wrote:

>What's better on this keyboard when compared to the Dvorak layout?
>At first sight, it looks heavily right-handed, all the letters that the
>Dvorak keyboard has on the homerow are on the right hand.
>
>Regards,   Martin.
>
>P.S.: I'm a happy Dvorak user.
>
>On 2015/01/26 06:54, Robert Wheelock wrote:
>> Hello!
>>
>> I came up with a BRAND-NEW keyboard layout designed to make typing
>> easier, named the IEAOU (ee-eh-ah-oh-oo) System, based on letter
>>frequencies.
>>
>> The letters in the new IEAOU layout are arranged as follows:
>>
>> (TOP):  Digits / Punctuation / Accents
>> (MEDIAL):  Q Y <:|;> W <"|'> L N D T S H <+|=> <\|!>
>> (HOME):  X K G F <?|`> P I E A O U
>> (BOTTOM):  C J Z V B M R <<|,> <>|.> <?|/>
>>
>> Please respond and air what you think of it.  Thank you!
>>
>>
>>


From doug at ewellic.org  Mon Jan 26 10:49:34 2015
From: doug at ewellic.org (Doug Ewell)
Date: Mon, 26 Jan 2015 09:49:34 -0700
Subject: The NEW Keyboard =?UTF-8?Q?Layout=E2=80=94IEAOU?=
Message-ID: <20150126094934.665a7a7059d7ee80bb4d670165c8327d.a211b8c9e5.wbe@email03.secureserver.net>

Robert Wheelock <rwhlk142 at gmail dot com> wrote:

> (TOP):  Digits / Punctuation / Accents

This is too vague. I know this row is not identical to the top (E) row
of the standard U.S. English keyboard, because you moved ` and ! and +
and = to other rows.

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org


From verdy_p at wanadoo.fr  Mon Jan 26 11:36:33 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Mon, 26 Jan 2015 18:36:33 +0100
Subject: =?UTF-8?Q?Re=3A_The_NEW_Keyboard_Layout=E2=80=94IEAOU?=
In-Reply-To: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>
References: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>
Message-ID: <CAGa7JC3b21J4MEAGxaZ-jVaz9VWyuOxCDZdC+Mxpj2tBKFWtLA@mail.gmail.com>

Very strange layout of the bottom row.
Note that your layout is visibly optimized for the right hand (mostly
for those who use only one hand to type, so that the left part contains
only the least-used keys).

I doubt that this really works well for typing with just one hand: those
same users will also not use more than two fingers, and keys for the 5th finger (at
the rightmost part of the keyboard) will still be hard to type.

I don't understand why you placed punctuation mixed between letters on
the left part and the medial row; I would have kept it in the rightmost
part (the colon/semicolon key and the single/double quotes key).

Now if users try to use both hands, then your left/right separation
does not work so well.
And the letter C is evidently badly placed, more difficult to reach than
the letters J Z V B (for English, this C key at least should be shifted to
the middle, where it would also be more accessible for one-hand typists or
one-finger typists).


From cph13 at case.edu  Mon Jan 26 13:43:41 2015
From: cph13 at case.edu (Clive Hohberger)
Date: Mon, 26 Jan 2015 13:43:41 -0600
Subject: =?UTF-8?Q?Re=3A_The_NEW_Keyboard_Layout=E2=80=94IEAOU?=
In-Reply-To: <CAGa7JC3b21J4MEAGxaZ-jVaz9VWyuOxCDZdC+Mxpj2tBKFWtLA@mail.gmail.com>
References: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>
 <CAGa7JC3b21J4MEAGxaZ-jVaz9VWyuOxCDZdC+Mxpj2tBKFWtLA@mail.gmail.com>
Message-ID: <CA+pJFQj_TanEntMCXWA1md35Ze1u=DBzB+kWFqFS1pvJ+0XepQ@mail.gmail.com>

Robert,
I certainly agree with Philip about typing with a cat on my lap! I use "one
hand for the cat, one for the mouse"...

What I don't understand in the right-hand layout is the placement of the
letter P. Given the English letter frequency ETAOIN SHRDLU, the letter T,
N, or S would make more sense. Putting T and N in the same row with E, A,
O, and moving P and U, would minimize row changing in typing English.

An obvious idea to me, if you really want a one-handed keyboard for languages,
is ETAOIN <,> in the Home Row and SHRDLU <.> above it, and the remaining
consonants below or to the left of the upper 2 rows of characters.

Also, my life would be easier if you had dyad keys, such as <TH> or <ES> or
<ED>. Again, look at the dyad frequency maps, but I suggest you try to
minimize row changing during single words as much as possible.
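
The row-changing criterion can be made concrete with a short Python sketch. It counts how often consecutive letters of a word sit on different rows, first for the posted IEAOU rows and then for the ETAOIN/SHRDLU arrangement suggested here. The row names, the "lower" row holding all remaining letters, the function name and the sample words are assumptions made for illustration; punctuation keys are left out.

```python
# Letter rows as posted for the IEAOU layout (punctuation keys omitted).
IEAOU_ROWS = {
    "medial": "QYWLNDTSH",
    "home": "XKGFPIEAOU",
    "bottom": "CJZVBMR",
}

# Hypothetical arrangement: ETAOIN on the home row, SHRDLU above it,
# every remaining letter on one lower row.
ETAOIN_ROWS = {
    "upper": "SHRDLU",
    "home": "ETAOIN",
    "lower": "BCFGJKMPQVWXYZ",
}


def row_changes(word, rows):
    """Count transitions between rows while typing the letters of word."""
    row_of = {letter: name for name, letters in rows.items() for letter in letters}
    sequence = [row_of[ch] for ch in word.upper() if ch in row_of]
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)


for word in ("nation", "the"):
    print(word, row_changes(word, IEAOU_ROWS), row_changes(word, ETAOIN_ROWS))
```

A fair comparison would weight a whole word list by frequency; two words only show that the count depends heavily on which letters share a row.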
- - -
On a side note, since I am progressively losing the use of my outer fingers
to osteoarthritis, Dragon 13.5 is a far better solution for me than a new
keyboard.

Thanks for the stimulus.
Clive




-- 
Clive P. Hohberger, PhD MBA
Managing Director
Clive Hohberger, LLC
+1 847 910 8794
cph13 at case.edu

From verdy_p at wanadoo.fr  Mon Jan 26 14:37:45 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Mon, 26 Jan 2015 21:37:45 +0100
Subject: =?UTF-8?Q?Re=3A_The_NEW_Keyboard_Layout=E2=80=94IEAOU?=
In-Reply-To: <CA+pJFQj_TanEntMCXWA1md35Ze1u=DBzB+kWFqFS1pvJ+0XepQ@mail.gmail.com>
References: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>
 <CAGa7JC3b21J4MEAGxaZ-jVaz9VWyuOxCDZdC+Mxpj2tBKFWtLA@mail.gmail.com>
 <CA+pJFQj_TanEntMCXWA1md35Ze1u=DBzB+kWFqFS1pvJ+0XepQ@mail.gmail.com>
Message-ID: <CAGa7JC3pgHNsJCvuoSTBROUN4AxGNSQXctLxmZzTx2-B-k5ofw@mail.gmail.com>

Well, I also frequently have to type with the left hand only, even though I am
right-handed! My right hand is for the mouse, and I'd like to avoid letting go of
the mouse just to type a few characters or words, such as when filling in or
correcting a web form where I need the mouse to click the item to edit.

Then the most-used keys will be on the wrong side...

Dvorak keyboards have not really addressed this issue. They were ONLY for
typing faster with two hands and all fingers; very few users can do
that, and they are sufficiently trained on existing QWERTY/AZERTY/QWERTZ
or ABCD keyboards not to want to switch to another layout where their
typing speed will be MUCH slower, always looking for keys for a long time.
Dvorak keyboards are then only for the youngest typists who have never
typed on other keyboards.

Note that today many young people first learn to type on the numeric
keypad of their smartphone, with the help of a predictive dictionary to avoid
repeating keys or using long presses... They can't even type efficiently on
QWERTY/AZERTY/QWERTZ or ABCD keyboards!



From verdy_p at wanadoo.fr  Mon Jan 26 14:39:18 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Mon, 26 Jan 2015 21:39:18 +0100
Subject: =?UTF-8?Q?Re=3A_The_NEW_Keyboard_Layout=E2=80=94IEAOU?=
In-Reply-To: <54C6A47E.2030606@ix.netcom.com>
References: <CAPKujtSKyuk+5y5eTjYtwc1f+Z-XMrLB=TJsd3xFO6+UrxF_MA@mail.gmail.com>
 <CAGa7JC3b21J4MEAGxaZ-jVaz9VWyuOxCDZdC+Mxpj2tBKFWtLA@mail.gmail.com>
 <CA+pJFQj_TanEntMCXWA1md35Ze1u=DBzB+kWFqFS1pvJ+0XepQ@mail.gmail.com>
 <54C6A47E.2030606@ix.netcom.com>
Message-ID: <CAGa7JC1VUvrTL257v-2LK3UW6QGK90_ECm+Vjm5j2HiiTSqWwg@mail.gmail.com>

Those one-handed (and even one-fingered) keyboards exist and are widely
used! Look at smartphones!

2015-01-26 21:33 GMT+01:00 Asmus Freytag (t) <asmus-inc at ix.netcom.com>:

> On 1/26/2015 11:43 AM, Clive Hohberger wrote:
>
>> Robert,
>> I certainly agree with Philip about typing with a cat on my lap! I use
>> "one hand for the cat, one for the mouse"...
>>
>
> one handed layouts have their place - as I found out after  an injury a
> while back. I quickly reached the point where the one handed layout was
> getting faster than "hunting" over the familiar one, but it still took more
> concentration, so I was glad when I could resume two-handed operation
> sooner than feared...
>
> A./
>
>

From public at khwilliamson.com  Thu Jan 29 12:52:30 2015
From: public at khwilliamson.com (Karl Williamson)
Date: Thu, 29 Jan 2015 11:52:30 -0700
Subject: UAX 29 questions
In-Reply-To: <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>
References: <54C48C81.3080405@khwilliamson.com>
 <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>
Message-ID: <54CA816E.4020802@khwilliamson.com>

On 01/25/2015 05:14 AM, Philippe Verdy wrote:
> This is not a contradiction.

At the very least it is too sloppy for a standard.  Once there is a 
match in the list of rules, later rules shouldn't have to be looked at. 
  I'll submit a formal feedback form.

But there is another issue as well.  I do not see how the specified 
rules when applied to the sequence of code points:

     U+0041 U+200D U+0020

cause the ZWJ, an Extend, to not break with the "A", an ALetter.

Rule WB4 is

"Ignore Format and Extend characters, except when they appear at the 
beginning of a region of text.".

Not clearly stated, but it appears to me that the ZWJ must be considered 
here to be the beginning of a region of text, as we are looking at the 
boundary between it and the "A".  No rule specifically mentions ALetter 
followed by an Extend, so by the default rule, WB14

"Otherwise, break everywhere (including around ideographs)"

this should be a word break position.  But that is absurd, as the Extend 
is supposed to extend what precedes it.  If I add a rule

"Don't break before Extend or Format"
	 	× (Extend | Format)

my implementation passes all tests.  I added this rule before WB4.
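
A minimal Python sketch of the situation described above. The property
table is written by hand for just these three code points (an assumption
for illustration, not data read from WordBreakProperty.txt), the function
name is hypothetical, and only the added rule and the WB14 default are
modelled:

```python
# Hand-written Word_Break values for the three code points in the example.
# (Illustrative only; a real implementation reads WordBreakProperty.txt.)
WORD_BREAK = {0x0041: "ALetter", 0x200D: "Extend", 0x0020: "Other"}

def breaks_before(cps, i, extra_rule=True):
    """Return True if there is a word break between cps[i-1] and cps[i]."""
    right = WORD_BREAK[cps[i]]
    # Added rule: don't break before Extend or Format.
    if extra_rule and right in ("Extend", "Format"):
        return False
    # WB14: otherwise, break everywhere.
    return True

seq = [0x0041, 0x200D, 0x0020]
print([breaks_before(seq, i) for i in (1, 2)])                    # [False, True]
print([breaks_before(seq, i, extra_rule=False) for i in (1, 2)])  # [True, True]
```

Without the added rule, the sketch falls through to WB14 and reports a
break between the "A" and the ZWJ, which is the behaviour questioned above.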


>
> combine the two rules and they are equivalent to these two alternate rules:
> WB56 can be read as these two:
>
>   (WB56a) ALetter  ×  (MidLetter | MidNumLet | Single_Quote) (ALetter |
> Hebrew_Letter)
>
>   (WB56b) Hebrew_Letter  ×  (MidLetter | MidNumLet | Single_Quote)
> (ALetter | Hebrew_Letter)
>
>
> Then add :
>
>    (WB57) Hebrew_Letter ×  Single_Quote
>
> it just removes the condition of a letter following the quote  in WB56b.
> So that WB56b and WB57 can be read as equivalent to these two:
>
>   (WB56c) Hebrew_Letter  ×  (MidLetter | MidNumLet) (ALetter |
> Hebrew_Letter)
>
>   (WB57) Hebrew_Letter × Single_Quote
>
> But you cannot merge any of these two last rules in a single rule for WB56.
>
>
> 2015-01-25 7:26 GMT+01:00 Karl Williamson <public at khwilliamson.com
> <mailto:public at khwilliamson.com>>:
>
>     I vaguely recall asking something like this before, but if so, I
>     didn't save the answers, and a search of the archives didn't turn up
>     anything.
>
>     Some of the rules in UAX #29 don't make sense to me.
>
>     For example, rule WB7a
>        Hebrew_Letter         ×       Single_Quote
>
>     seems to say that a Hebrew_Letter followed by a Single Quote
>     shouldn't break.  (And Rule WB4 says that actually there can be
>     Extend and Format characters between the two and those should be
>     ignored).
>
>     But the earlier rule, WB6
>
>       (ALetter | Hebrew_Letter)      ×       (MidLetter | MidNumLet |
>     Single_Quote) (ALetter | Hebrew_Letter)
>
>     seems to me to say (among other things) that a Hebrew Letter
>     followed by a Single Quote shouldn't break if and only if the latter
>     is also followed by either an ALetter or another Hebrew Letter
>     (again modulo ignored Format and Extend letters)
>
>     This seems contradictory.  One rule says something unconditionally,
>     and the other rule adds conditions.
>     _________________________________________________
>     Unicode mailing list
>     Unicode at unicode.org <mailto:Unicode at unicode.org>
>     http://unicode.org/mailman/__listinfo/unicode
>     <http://unicode.org/mailman/listinfo/unicode>
>
>


From verdy_p at wanadoo.fr  Thu Jan 29 21:19:44 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 30 Jan 2015 04:19:44 +0100
Subject: UAX 29 questions
In-Reply-To: <54CA816E.4020802@khwilliamson.com>
References: <54C48C81.3080405@khwilliamson.com>
 <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>
 <54CA816E.4020802@khwilliamson.com>
Message-ID: <CAGa7JC07GTaAc82u6w0jFoNuc2kSwbazb1s5t-CgH0a8gKXjeQ@mail.gmail.com>

2015-01-29 19:52 GMT+01:00 Karl Williamson <public at khwilliamson.com>:

> Rule WB4 is
>
> "Ignore Format and Extend characters, except when they appear at the
> beginning of a region of text.".
>
> Not clearly stated, but it appears to me that the ZWJ must be considered
> here to be the beginning of a region of text, as we are looking at the
> boundary between it and the "A".  No rule specifically mentions ALetter
> followed by an Extend, so by the default rule, WB14
>
> "Otherwise, break everywhere (including around ideographs)"


The whole text is aimed at finding candidate positions for breaks. It is
not very clear that "ignore" is definitive and means that there cannot be
any break before Format and Extend characters, except at the beginning of
the text. So the rest of the rules are ignored: there was a match and you
stop there, with no break before:

  Any  ×  (Format | Extend)

This is confirmed by the other rules that use the word "otherwise",
including the last one (WB14) that you quote, which is explicitly not
applicable.


But I agree with you that rules WB56 and WB57 would be better rewritten as

 (WB56a):
 ALetter  ×  (MidLetter | MidNumLet | Single_Quote) (ALetter |
Hebrew_Letter)

 (WB56c+WB57 combined):
 Hebrew_Letter  ×  ((MidLetter | MidNumLet) (ALetter | Hebrew_Letter) |
Single_Quote)
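
The claim that this split form says the same thing as the two original
rules (WB6 and WB7a in the UAX, called WB56 and WB57 here) can be checked
by brute force. The following Python sketch is only an illustration: the
class list is reduced to the classes that matter, "Other" stands in for
everything else, and the function names are hypothetical.

```python
from itertools import product

CLASSES = ["ALetter", "Hebrew_Letter", "MidLetter", "MidNumLet",
           "Single_Quote", "Other"]
LETTER = {"ALetter", "Hebrew_Letter"}
MID = {"MidLetter", "MidNumLet", "Single_Quote"}

def original(a, b, c):
    """No break between a and b (c follows b) under WB6 or WB7a."""
    wb6 = a in LETTER and b in MID and c in LETTER
    wb7a = a == "Hebrew_Letter" and b == "Single_Quote"
    return wb6 or wb7a

def rewritten(a, b, c):
    """Same question under the split form proposed above."""
    first = a == "ALetter" and b in MID and c in LETTER
    second = a == "Hebrew_Letter" and (
        (b in {"MidLetter", "MidNumLet"} and c in LETTER)
        or b == "Single_Quote")
    return first or second

# Every (a, b, c) triple gets the same answer from both formulations.
assert all(original(*t) == rewritten(*t) for t in product(CLASSES, repeat=3))
```

The check covers only these two rules in isolation; it says nothing about
interactions with the other WB rules.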


Note also that in French the single quote is followed by a word break
(but NOT by a line break by default, and NOT by a syllable break for
hyphenation), with very few exceptions such as "aujourd'hui", which is now
treated as a single word. There is an elision there, but also a
contraction of four words, as if it were written "au jour d' hui"; the term
"hui" no longer occurs in isolation anywhere except in that common word,
where all the components are glued together. Most elision apostrophes
normally occur at the end of a word (e.g. after the two apostrophes in
« l'année n'est pas terminée »).

The rare cases where you should not break after an apostrophe are those
where elision occurs in the middle of a word, in some vulgar expressions
like « c't'après-m' », which contains two informal words « c't' » and
« après-m' » abbreviating « cet après-midi » in popular language.

In English you have the case where the elision occurs at the beginning of
a word: « it's » is two words, « it » and « 's » abbreviating « is »; or in
the middle: « aren't » contains two glued words, « are » and « n't »
abbreviating « not ».

In both cases you can use the WB rules, and then treat some of the
candidates as exceptions.

This way a single matching rule is needed and you no longer need to look
for other rules.

But we are not discussing line breaks here, only word breaks (for the
purpose of performing dictionary lookups and grammar analysis): with the
default rules we should be able to "unglue" the words by default, and then
use an exception list to see whether we must reattach them, since the
pieces are not all words.

So first attempt to look for a word terminated by an apostrophe, and then
perform a language-dependent lookup for known exceptions (« aujourd' »
« hui » cannot match because « hui » is not a separate word), for which we
must try something else:

Look for a word starting with an apostrophe. In English, « it's » would
first be treated by the previous rule as « it' » and « s », but « s » alone
is treated as an exception; with this rule it will then correctly identify
« 's » independently of the previous word, except if it is an acronym as in
« GMO's », because in that case the « 's » is not a separate verb or a
genitive particle but a known plural mark.

Word breaks are more complicated to handle than line breaks, as they need
dictionary lookups to confirm them. But that is the purpose of a word
breaking process: to be used in order to perform dictionary lookups. With
it, you can then safely tailor the line breaking algorithm in order to
implement syllable breaking for hyphenation, which also needs these
dictionary lookups to detect exceptions to the normal syllable breaks
(which can be performed only with language-specific lookups for some pairs,
digrams or trigrams).

From public at khwilliamson.com  Thu Jan 29 23:25:14 2015
From: public at khwilliamson.com (Karl Williamson)
Date: Thu, 29 Jan 2015 22:25:14 -0700
Subject: UAX 29 questions
In-Reply-To: <CAGa7JC07GTaAc82u6w0jFoNuc2kSwbazb1s5t-CgH0a8gKXjeQ@mail.gmail.com>
References: <54C48C81.3080405@khwilliamson.com>
 <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>
 <54CA816E.4020802@khwilliamson.com>
 <CAGa7JC07GTaAc82u6w0jFoNuc2kSwbazb1s5t-CgH0a8gKXjeQ@mail.gmail.com>
Message-ID: <54CB15BA.4080001@khwilliamson.com>

On 01/29/2015 08:19 PM, Philippe Verdy wrote:
>
> 2015-01-29 19:52 GMT+01:00 Karl Williamson <public at khwilliamson.com
> <mailto:public at khwilliamson.com>>:
>
>     Rule WB4 is
>
>     "Ignore Format and Extend characters, except when they appear at the
>     beginning of a region of text.".
>
>     Not clearly stated, but it appears to me that the ZWJ must be
>     considered here to be the beginning of a region of text, as we are
>     looking at the boundary between it and the "A".  No rule
>     specifically mentions ALetter followed by an Extend, so by the
>     default rule, WB14
>
>     "Otherwise, break everywhere (including around ideographs)"
>
>
> All the text is targeted at finding candidate positions for breaks. It
> is not very clear that "ignore" is definitive and means that there
> cannot be any further breaks before the Format and Extend characters,
> except at beginning of text. So all the rest of rules is ignored, there
> was a match and you stop there; no break before;
>
>    Any  × (Format | Extend)
>
> This is confirmed in other rules that state the word "otherwise",
> including the last one (WB14) you quote which is explicitly not applicable.

I don't understand you here.  I understand all the words, but I don't 
see what you're trying to say.  My claim is that there should be a rule,
as you give:

  Any  × (Format | Extend)

but there isn't.  I think you are maybe trying to say that the word 
"ignore" in this UAX is tantamount to such a rule.  I am a native 
English speaker, and would never have drawn that inference from the 
text.  There are a lot of passages in the Standard that sound like 
gibberish to me.  I know the words' meanings, but the combination don't 
make any sense.  I don't recall ever having this issue in other 
standards I've looked at.

From verdy_p at wanadoo.fr  Fri Jan 30 00:36:03 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 30 Jan 2015 07:36:03 +0100
Subject: UAX 29 questions
In-Reply-To: <54CB15BA.4080001@khwilliamson.com>
References: <54C48C81.3080405@khwilliamson.com>
 <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>
 <54CA816E.4020802@khwilliamson.com>
 <CAGa7JC07GTaAc82u6w0jFoNuc2kSwbazb1s5t-CgH0a8gKXjeQ@mail.gmail.com>
 <54CB15BA.4080001@khwilliamson.com>
Message-ID: <CAGa7JC1ukfdyZ2B5j=Lj6hsMvE3M9j3K50ar6jSKAAofFqQT-g@mail.gmail.com>

The main reason is that the rest of the text does not test pairs starting
with Format or Extend, but rather Any character that precedes the Format
and Extend characters.
By saying "ignore", it just says: while parsing from start to end of text,
keep in the state variable the WB property of the last non-ignored
character.
In fact the rule is:
Any  × (Format | Extend)+
but this matches more than a simple pair of characters. It is used in all
the rest of the rules.
So effectively none of the other rules contains any reference to Format or
Extend.

2015-01-30 6:25 GMT+01:00 Karl Williamson <public at khwilliamson.com>:

> On 01/29/2015 08:19 PM, Philippe Verdy wrote:
>
>>
>> 2015-01-29 19:52 GMT+01:00 Karl Williamson <public at khwilliamson.com
>> <mailto:public at khwilliamson.com>>:
>>
>>     Rule WB4 is
>>
>>     "Ignore Format and Extend characters, except when they appear at the
>>     beginning of a region of text.".
>>
>>     Not clearly stated, but it appears to me that the ZWJ must be
>>     considered here to be the beginning of a region of text, as we are
>>     looking at the boundary between it and the "A".  No rule
>>     specifically mentions ALetter followed by an Extend, so by the
>>     default rule, WB14
>>
>>     "Otherwise, break everywhere (including around ideographs)"
>>
>>
>> All the text is targeted at finding candidate positions for breaks. It
>> is not very clear that "ignore" is definitive and means that there
>> cannot be any further breaks before the Format and Extend characters,
>> except at beginning of text. So all the rest of rules is ignored, there
>> was a match and you stop there; no break before;
>>
>>    Any  × (Format | Extend)
>>
>> This is confirmed in other rules that state the word "otherwise",
>> including the last one (WB14) you quote which is explicitly not
>> applicable.
>>
>
> I don't understand you here.  I understand all the words, but I don't see
> what you're trying to say.  My claim is that there should be a rule,
> as you give:
>
>  Any  × (Format | Extend)
>
> but there isn't.  I think you are maybe trying to say that the word
> "ignore" in this UAX is tantamount to such a rule.  I am a native English
> speaker, and would never have drawn that inference from the text.  There
> are a lot of passages in the Standard that sound like gibberish to me.  I
> know the words' meanings, but the combination don't make any sense.  I
> don't recall ever having this issue in other standards I've looked at.
>

From mark at macchiato.com  Fri Jan 30 02:32:56 2015
From: mark at macchiato.com (=?UTF-8?B?TWFyayBEYXZpcyDimJXvuI8=?=)
Date: Fri, 30 Jan 2015 09:32:56 +0100
Subject: UAX 29 questions
In-Reply-To: <54CA816E.4020802@khwilliamson.com>
References: <54C48C81.3080405@khwilliamson.com>
 <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>
 <54CA816E.4020802@khwilliamson.com>
Message-ID: <CAJ2xs_Ea9RiVLvpC_dk4PpWLZ+aSMh6wNWzNWFnvcnN5X1iXqQ@mail.gmail.com>

I apologize in advance that I'm running low on time, and didn't go through
all the messages on this thread carefully. So I may not be fully
appreciating people's positions. I'm just making some quick points about 2
items that caught my eye.


1. There are certainly times where two rules in sequence may overlap, just
for simplicity.

X Y* x Z
Y x Z* W

The first rule could trigger on X Y Z W, even though the second would also
trigger on it. This may or may not be "sloppiness"; sometimes it simply
makes the second rule too convoluted to also exclude triggering on
everything that could possibly trigger earlier.

That being said, if there are simplifications in the rules that would make it
clearer, I'd suggest submitting a proposal for that. The UTC is meeting
next week, and could consider it either then or at subsequent meetings.

Note: the HTML files in http://unicode.org/Public/UNIDATA/auxiliary/ have a
number of sample cases (which are also used in the test files). Hovering
over boundaries in those sample cases shows which rule is triggered, such
as in
http://unicode.org/Public/UNIDATA/auxiliary/GraphemeBreakTest.html#samples

We're always open to additional samples that are illustrative of how the
rules work. As I thought about your message, it became clear to me that it
would be useful to have a complete enough set of sample cases that each
rule is triggered by at least one case, if you or anyone else is interested
in helping to add those.


2. Also, the following 2 rules are not equivalent:

a) Any  × (Format | Extend)
b) X (Extend | Format)* → X

(b) implies (a), but not the reverse. The difference is on the right side
of characters. Rule b, affects every subsequent rule, and can be viewed as
a shorthand. After it, we can just say:

A B × C D

And that has the effect of saying:

A (Extend | Format)* B (Extend | Format)* × C (Extend | Format)* D

See also http://unicode.org/reports/tr29/#Grapheme_Cluster_and_Format_Rules

However, it may not be clear that (b) implies (a); that might be what you
are getting at. If so, then we could add an explicit statement to that
effect.
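
As a rough illustration of the shorthand described in point 2, the Python
sketch below expands a rule of the form "A B × C D" into two regular
expressions with (Extend | Format)* inserted as written out above. It is a
sketch under stated assumptions: each character is represented by a
one-letter class code (E for Extend, F for Format), and the helper names
`expand` and `no_break_at` are hypothetical.

```python
import re

IGNORED = "(?:E|F)*"  # E = Extend, F = Format (one letter per class)

def expand(left, right):
    """Expand the shorthand rule `left x right` into a pair of regexes.

    `left` and `right` are lists of one-letter class codes. Ignored
    characters are allowed after every element except the last one on
    the right, mirroring the expansion written out above.
    """
    lhs = "".join(c + IGNORED for c in left) + "$"
    rhs = IGNORED.join(right)
    return re.compile(lhs), re.compile(rhs)

def no_break_at(codes, i, left, right):
    """True if the expanded rule forbids a break before codes[i]."""
    lhs, rhs = expand(left, right)
    return bool(lhs.search(codes[:i]) and rhs.match(codes[i:]))

# A Extend B Format | C Extend D: the rule still applies at position 4.
print(no_break_at("AEBFCED", 4, ["A", "B"], ["C", "D"]))  # True
```

A real implementation would more likely skip the ignored characters while
scanning than build regular expressions; the regex form is used here only
because it mirrors the written expansion.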


Mark <https://google.com/+MarkDavis>

*Il meglio è l'inimico del bene*

On Thu, Jan 29, 2015 at 7:52 PM, Karl Williamson <public at khwilliamson.com>
wrote:

> On 01/25/2015 05:14 AM, Philippe Verdy wrote:
>
>> This is not a contradiction.
>>
>
> At the very least it is too sloppy for a standard.  Once there is a match
> in the list of rules, later rules shouldn't have to be looked at.  I'll
> submit a formal feedback form.
>
> But there is another issue as well.  I do not see how the specified rules
> when applied to the sequence of code points:
>
>     U+0041 U+200D U+0020
>
> cause the ZWJ, an Extend, to not break with the "A", an ALetter.
>
> Rule WB4 is
>
> "Ignore Format and Extend characters, except when they appear at the
> beginning of a region of text.".
>
> Not clearly stated, but it appears to me that the ZWJ must be considered
> here to be the beginning of a region of text, as we are looking at the
> boundary between it and the "A".  No rule specifically mentions ALetter
> followed by an Extend, so by the default rule, WB14
>
> "Otherwise, break everywhere (including around ideographs)"
>
> this should be a word break position.  But that is absurd, as the Extend
> is supposed to extend what precedes it.  If I add a rule
>
> "Don't break before Extend or Format"
>                 × (Extend | Format)
>
> my implementation passes all tests.  I added this rule before WB4.
>
>
>
>> combine the two rules and they are equivalent to these two alternate
>> rules:
>> WB56 can be read as these two:
>>
>>   (WB56a) ALetter  ×  (MidLetter | MidNumLet | Single_Quote) (ALetter |
>> Hebrew_Letter)
>>
>>   (WB56b) Hebrew_Letter  ×  (MidLetter | MidNumLet | Single_Quote)
>> (ALetter | Hebrew_Letter)
>>
>>
>> Then add :
>>
>>    (WB57) Hebrew_Letter ×  Single_Quote
>>
>> it just removes the condition of a letter following the quote  in WB56b.
>> So that WB56b and WB57 can be read as equivalent to these two:
>>
>>   (WB56c) Hebrew_Letter  ×  (MidLetter | MidNumLet) (ALetter |
>> Hebrew_Letter)
>>
>>   (WB57) Hebrew_Letter × Single_Quote
>>
>> But you cannot merge any of these two last rules in a single rule for
>> WB56.
>>
>>
>> 2015-01-25 7:26 GMT+01:00 Karl Williamson <public at khwilliamson.com
>> <mailto:public at khwilliamson.com>>:
>>
>>     I vaguely recall asking something like this before, but if so, I
>>     didn't save the answers, and a search of the archives didn't turn up
>>     anything.
>>
>>     Some of the rules in UAX #29 don't make sense to me.
>>
>>     For example, rule WB7a
>>        Hebrew_Letter         ×       Single_Quote
>>
>>     seems to say that a Hebrew_Letter followed by a Single Quote
>>     shouldn't break.  (And Rule WB4 says that actually there can be
>>     Extend and Format characters between the two and those should be
>>     ignored).
>>
>>     But the earlier rule, WB6
>>
>>       (ALetter | Hebrew_Letter)      ×       (MidLetter | MidNumLet |
>>     Single_Quote) (ALetter | Hebrew_Letter)
>>
>>     seems to me to say (among other things) that a Hebrew Letter
>>     followed by a Single Quote shouldn't break if and only if the latter
>>     is also followed by either an ALetter or another Hebrew Letter
>>     (again modulo ignored Format and Extend letters)
>>
>>     This seems contradictory.  One rule says something unconditionally,
>>     and the other rule adds conditions.
>>     _________________________________________________
>>     Unicode mailing list
>>     Unicode at unicode.org <mailto:Unicode at unicode.org>
>>     http://unicode.org/mailman/__listinfo/unicode
>>     <http://unicode.org/mailman/listinfo/unicode>
>>
>>
>>
> _______________________________________________
> Unicode mailing list
> Unicode at unicode.org
> http://unicode.org/mailman/listinfo/unicode
>

From verdy_p at wanadoo.fr  Fri Jan 30 09:59:16 2015
From: verdy_p at wanadoo.fr (Philippe Verdy)
Date: Fri, 30 Jan 2015 16:59:16 +0100
Subject: UAX 29 questions
In-Reply-To: <CAJ2xs_Ea9RiVLvpC_dk4PpWLZ+aSMh6wNWzNWFnvcnN5X1iXqQ@mail.gmail.com>
References: <54C48C81.3080405@khwilliamson.com>
 <CAGa7JC1a9koUYGz8Hsk0et7P+QUAz_7T=Z6UyMObJuyogdbwYw@mail.gmail.com>
 <54CA816E.4020802@khwilliamson.com>
 <CAJ2xs_Ea9RiVLvpC_dk4PpWLZ+aSMh6wNWzNWFnvcnN5X1iXqQ@mail.gmail.com>
Message-ID: <CAGa7JC0Y3bO8nbbVGo6Fc-KEpaT9SrBvRqOaqVBWtbRbVC8U5w@mail.gmail.com>

2015-01-30 9:32 GMT+01:00 Mark Davis ☕️ <mark at macchiato.com>:

> 2. Also, the following 2 rules are not equivalent:
>
> a) Any  × (Format | Extend)
> b) X (Extend | Format)* → X
>

That's what I replied in the first message, but using an "as if" which was
not clear enough; my second reply reformulated it by being clear about the
right side (the substitution occurring in the next rules, which you view
as a "shortcut").

Your first argument about convolution is not well justified for WB56 and
WB57, which are also clear when rewritten by separating ALetter and
Hebrew_Letter.

But I also note that this case of Hebrew's handling of apostrophes/quotes
also exists in the Latin script (including in English alone) in the context
of word breaking only (this does not apply to line breaking or to syllable
breaking for hyphenation, which are other types of breakers).

The rule about Format and Extend is still kept separate in WB56 and listed
first only because it correctly preserves the canonical equivalences for
extenders, which include all combining characters with non-zero combining
class, and because it includes the golden rule of not breaking in the
middle of default grapheme clusters (which also covers joiners like CGJ and
ZWJ with any breaker algorithm, except code point breakers for some
conforming UTFs like UTF-16).

WB57 is evidently subject to tailoring. It just provides a default
behavior where the single quote/apostrophe is handled as an elision mark
most often used at the end of words, and glued to the next word without
space separation.

WB57 also handles the case where it is followed by spaces or other
punctuation, and the single quote is then not an orthographic elision
mark but a punctuation mark closing a quotation.

One problem is that the Single_Quote class used in WB57 is possibly too
large: it acts as an elision mark (apostrophe) only for a smaller number of
single-quote-like characters.

The other problem with WB57 is that it assumes that elision marked by
apostrophes occurs only at the end of words (not true even for English),
and this is where per-language tailoring is not only possible but most
probably recommended.

Such tailoring will also affect the behavior of WB56 (notably in English,
French, Italian... where the apostrophe is lexicalized and its usage
regulated by their standard grammar).

----

But I wonder if tailoring of WB57 is not also needed for Hebrew. I see WB57
only as an initial default tailoring for the script itself, not for the
actual language (which may also be Yiddish). It could also cover the usual
transcriptions of foreign words, or common but informal
abbreviations/contractions (the apostrophe is highly preferred to the dot
for abbreviating/contracting in the middle of a word, notably when the
abbreviated part is not even pronounced but completely elided).

It also seems that Swedish may use the colon in the middle of a word,
without space separation, instead of an apostrophe.

Other languages may prefer other signs for elisions (including a hyphen,
which does not break words but only syllables, for candidate breaking of
long lines), notably if there is confusion with quote-like letters.

Another common notation (found in French typography) uses superscripts for
the final letters when elision occurs in the middle of a word, but this is
in fact just a written abbreviation (it totally replaces the use of the
abbreviation dot, which is normally never used in the middle and is
completely eliminated in acronyms). This is not really an elision: the
abbreviated word with superscript is still read in full, without the
elision, so the apostrophe cannot be used.

From ken.shirriff at gmail.com  Fri Jan 30 10:55:01 2015
From: ken.shirriff at gmail.com (Ken Shirriff)
Date: Fri, 30 Jan 2015 08:55:01 -0800
Subject: Is there an IBM group mark symbol?
Message-ID: <CALBHtZzhXuWp6SuNCnxVyP0JEqjg2zQCYA57XUEmQeOQ7R_Pxg@mail.gmail.com>

I'm writing about the IBM 1401 and there's one character from its character
set that I couldn't find in Unicode: the group mark. The group mark is
three horizontal lines with a vertical line through it (see attached
image). This character is used in various books and publications, so it's a
"real" symbol that is used in text. Would it make sense for me to submit a
proposal to add this character?

Group mark image (from
https://en.wikipedia.org/wiki/IBM_1401#Character_and_op_codes):


Thank you,
Ken
-------------- next part --------------
A non-text attachment was scrubbed...
Name: IBM_1401_Group_Mark.GIF
Type: image/gif
Size: 855 bytes
Desc: not available
URL: <http://unicode.org/pipermail/unicode/attachments/20150130/5c915c88/attachment.gif>

From roozbeh at unicode.org  Fri Jan 30 11:15:20 2015
From: roozbeh at unicode.org (Roozbeh Pournader)
Date: Fri, 30 Jan 2015 09:15:20 -0800
Subject: Is there an IBM group mark symbol?
In-Reply-To: <CALBHtZzhXuWp6SuNCnxVyP0JEqjg2zQCYA57XUEmQeOQ7R_Pxg@mail.gmail.com>
References: <CALBHtZzhXuWp6SuNCnxVyP0JEqjg2zQCYA57XUEmQeOQ7R_Pxg@mail.gmail.com>
Message-ID: <CABWzK_U-v21tdY1U-n7cr=anVaD6+U26w9J3+=YzN=xggAt8Sg@mail.gmail.com>

There may be something like it in the math symbols sets, but if there's
not, please feel free to submit a proposal.
On Jan 30, 2015 8:59 AM, "Ken Shirriff" <ken.shirriff at gmail.com> wrote:

> I'm writing about the IBM 1401 and there's one character from its
> character set that I couldn't find in Unicode: the group mark. The group
> mark is three horizontal lines with a vertical line through it (see
> attached image). This character is used in various books and publications,
> so it's a "real" symbol that is used in text. Would it make sense for me to
> submit a proposal to add this character?
>
> Group mark image (from
> https://en.wikipedia.org/wiki/IBM_1401#Character_and_op_codes):
>
>
> Thank you,
> Ken
>
> _______________________________________________
> Unicode mailing list
> Unicode at unicode.org
> http://unicode.org/mailman/listinfo/unicode
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: IBM_1401_Group_Mark.GIF
Type: image/gif
Size: 855 bytes
Desc: not available
URL: <http://unicode.org/pipermail/unicode/attachments/20150130/a8b7335a/attachment.gif>

From jf at colson.eu  Fri Jan 30 11:31:14 2015
From: jf at colson.eu (=?UTF-8?B?SmVhbi1GcmFuw6dvaXMgQ29sc29u?=)
Date: Fri, 30 Jan 2015 18:31:14 +0100
Subject: Is there an IBM group mark symbol?
In-Reply-To: <CALBHtZzhXuWp6SuNCnxVyP0JEqjg2zQCYA57XUEmQeOQ7R_Pxg@mail.gmail.com>
References: <CALBHtZzhXuWp6SuNCnxVyP0JEqjg2zQCYA57XUEmQeOQ7R_Pxg@mail.gmail.com>
Message-ID: <54CBBFE2.7080505@colson.eu>


On 30/01/15 18:30, Jean-François Colson wrote:
On 30/01/15 17:55, Ken Shirriff wrote:
> I'm writing about the IBM 1401 and there's one character from its 
> character set that I couldn't find in Unicode: the group mark. The 
> group mark is three horizontal lines with a vertical line through it 
> (see attached image). This character is used in various books and 
> publications, so it's a "real" symbol that is used in text. Would it 
> make sense for me to submit a proposal to add this character?

Why not?
In the meantime, you could approximate it with U+2261 IDENTICAL TO 
U+20D2 COMBINING LONG VERTICAL LINE OVERLAY: ≡⃒
Here is what that looks like in FreeMono: http://colson.eu/??.png
and in DejaVu Sans Mono: http://colson.eu/??..png

>
> Group mark image (from 
> https://en.wikipedia.org/wiki/IBM_1401#Character_and_op_codes):
>
>
> Thank you,
> Ken
>
>
> _______________________________________________
> Unicode mailing list
> Unicode at unicode.org
> http://unicode.org/mailman/listinfo/unicode


-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 855 bytes
Desc: not available
URL: <http://unicode.org/pipermail/unicode/attachments/20150130/b9732ad7/attachment.gif>

From frederic.grosshans at gmail.com  Fri Jan 30 11:45:26 2015
From: frederic.grosshans at gmail.com (=?UTF-8?B?RnLDqWTDqXJpYyBHcm9zc2hhbnM=?=)
Date: Fri, 30 Jan 2015 18:45:26 +0100
Subject: Is there an IBM group mark symbol?
In-Reply-To: <CALBHtZzhXuWp6SuNCnxVyP0JEqjg2zQCYA57XUEmQeOQ7R_Pxg@mail.gmail.com>
References: <CALBHtZzhXuWp6SuNCnxVyP0JEqjg2zQCYA57XUEmQeOQ7R_Pxg@mail.gmail.com>
Message-ID: <54CBC336.3090708@gmail.com>

On 30/01/2015 17:55, Ken Shirriff wrote:
> I'm writing about the IBM 1401 and there's one character from its 
> character set that I couldn't find in Unicode: the group mark. The 
> group mark is three horizontal lines with a vertical line through it 
> (see attached image). This character is used in various books and 
> publications, so it's a "real" symbol that is used in text. Would it 
> make sense for me to submit a proposal to add this character?

In May 2007, Ken Whistler answered a slightly more general question on 
old IBM characters:

http://unicode.org/mail-arch/unicode-ml/y2007-m05/0373.html

The group mark was the most problematic, and his answer was:

  * You can see it as a glyph variant of ␝ U+241D SYMBOL FOR GROUP SEPARATOR
  * You can have a symbol of the same appearance by combining ≡⃒ U+2261
    IDENTICAL TO, U+20D2 COMBINING LONG VERTICAL LINE OVERLAY.

However, neither solution seems really practical, and I didn't find any
corresponding symbol (including among the variants in
http://unicode.org/Public/UCD/latest/ucd/StandardizedVariants.txt ). A 
proposal might help add it to the standard.
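
For anyone who wants to inspect the suggested approximation, here is a
small Python sketch using the standard `unicodedata` module. The variable
name `approx` is just for illustration; how the sequence renders depends
entirely on the font.

```python
import unicodedata

# IDENTICAL TO followed by COMBINING LONG VERTICAL LINE OVERLAY
approx = "\u2261\u20D2"

# Print the approximation and the alternative U+241D with their names.
for ch in approx + "\u241D":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The sequence has no precomposed form, so NFC leaves it as two code points.
assert unicodedata.normalize("NFC", approx) == approx
```

Because the sequence stays two code points, searching or comparing text
that uses it needs to match both characters together.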

     Frédéric


From markus.icu at gmail.com  Sat Jan 31 16:04:24 2015
From: markus.icu at gmail.com (Markus Scherer)
Date: Sat, 31 Jan 2015 14:04:24 -0800
Subject: N'Ko - which character? 02BC vs. 2019
Message-ID: <CAN49p6poA9Agcg835nWVjOeuOfPgkumiSoi9cJ_gBzVQTso3Hw@mail.gmail.com>

Dear Unicoders, which is the proper second character in "N'Ko"?
See below for details.
Thanks,
markus
---------- Forwarded message ----------
From: Doug Ewell <doug at ewellic.org>
Date: Sat, Jan 31, 2015 at 9:16 AM
Subject: Apostrophes (was: Re: ISO 639-3 changes)
To: Philip Newton <philip.newton at gmail.com>
Cc: ietf-languages at iana.org


Philip Newton wrote:

>> 4. For existing subtags, when we add a Description with a real click
>> letter, we can simultaneously "correct" any ASCII apostrophes. We
>> have already used both U+02BC MODIFIER LETTER APOSTROPHE (for
>> Gwichʼin) and U+2019 RIGHT SINGLE QUOTATION MARK (for N’Ko, both
>> language and script), and I would prefer to stick to one of these
>> consistently for the African languages.
>>
>
> Sounds reasonable to me. FWIW, I'd vote for U+02BC MODIFIER LETTER
> APOSTROPHE.
>

According to TUS, U+02BC is preferred over U+2019 if the character in
question is an actual letter of the orthography, which seems to be true
here.

U+2019 is supported in far more fonts than U+02BC. But I think the goal is
to use the "correct" apostrophe character along with the click letters. As
a side note, on my system I have more fonts that support U+02BC than the
click letters, with only two fonts (Ebrima and MPH 2B Damase) that support
click letters but not U+02BC. If you can't see the click letters, it
doesn't really matter if you can't see the apostrophe either; you'll fall
back to the ASCII name anyhow.

So for bundling "better" apostrophes along with the true click letters in
Description fields like Juǀʼhoan (Ju/'hoan), as Kent Karlsson originally
proposed, I agree with Philip that we should use U+02BC instead of U+2019.
What does everyone else think?
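
The practical difference between the two candidates shows up in their
General_Category values. The following Python sketch, using only the
standard `unicodedata` module, is an illustration and not part of the
forwarded discussion:

```python
import unicodedata

# Modifier letter apostrophe, right single quotation mark, ASCII apostrophe.
for ch in ("\u02BC", "\u2019", "\u0027"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

# A letter-category apostrophe keeps the name a single alphabetic token;
# the quotation mark does not.
print("N\u02BCKo".isalpha(), "N\u2019Ko".isalpha())  # True False
```

U+02BC is a modifier letter (Lm), while U+2019 is closing punctuation (Pf),
so software that segments or validates identifiers by category treats the
two spellings differently.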

--
Doug Ewell | Thornton, CO, USA | http://ewellic.org ?
_______________________________________________
Ietf-languages mailing list
Ietf-languages at alvestrand.no
http://www.alvestrand.no/mailman/listinfo/ietf-languages