michaelanortonster at gmail.com
Sat Mar 28 06:45:51 CDT 2015
*Important correction to my last email*:
*Only 34% of the entries in your list exceed 10% of the average percentage (2.9%).*
This is serendipitously common (e.g. the Earth:Moon albedo ratio is 0.36).
A relationship about motion and other natural properties and
characteristics among the local texts begins to emerge.
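
For anyone who wants to check figures like these against their own text, here is a minimal sketch in Python. It is only an illustration, not the method used for the chart: the function name `share_stats` and the `factor` parameter are hypothetical, and it assumes "exceed 10% of the average" means a character's share is greater than 0.10 times the mean share. Note that when the shares sum to 100%, the mean share is simply 100 divided by the number of distinct characters, so an average of about 2.91% would correspond to roughly 34 entries.

```python
from collections import Counter


def share_stats(text, factor=0.10):
    """Return (shares, mean_share, pct_above) for the characters in text.

    shares     : dict mapping each character to its share of the text, in percent
    mean_share : average share per distinct character, in percent
    pct_above  : percent of distinct characters whose share exceeds
                 factor * mean_share (the interpretation assumed here)
    """
    if not text:
        raise ValueError("text must not be empty")

    counts = Counter(text)
    total = sum(counts.values())

    # Share of each character as a percentage of all characters seen.
    shares = {ch: 100.0 * n / total for ch, n in counts.items()}

    # The shares sum to 100, so the mean is 100 / (number of distinct characters).
    mean_share = 100.0 / len(shares)

    threshold = factor * mean_share
    above = [ch for ch, share in shares.items() if share > threshold]
    pct_above = 100.0 * len(above) / len(shares)

    return shares, mean_share, pct_above


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog"
    shares, mean_share, pct_above = share_stats(sample)
    print(f"mean share: {mean_share:.2f}%")
    print(f"share of U+0020: {shares[' ']:.2f}%")
    print(f"entries above 10% of the mean: {pct_above:.0f}%")
```

Changing `factor` to 1.10 would instead test for shares more than 10% above the mean, which is the other plausible reading of the sentence.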
On Sat, Mar 28, 2015 at 7:30 AM, Michael Norton <
michaelanortonster at gmail.com> wrote:
> Thanks Doug. I did not know there exists a *representative* sample of
> the world's text. :) I do know that 400 years ago there were about 10,000
> languages; now there are about 6,500. Time flies!
> Your frequency chart is great. The average char appearance is 2.91%.
> Only 34% from your list exceed 10% of it. Therefore, U+0020 is the
> elephant in the room (i.e. 15.05% is far > 2.91%). In fact, it's
> almost >50% greater than the next most-appearing character.
> So from the two frequency lists you've given me (my email and yours) we
> begin to see some patterns emerge. Provided prior data and observation,
> most useful patterns prevail over other more obscure ones and present a
> provocative opportunity for webbers out there... While this is probably out
> of context for most of the 700 Unicode members, I can report that it's good