About this blog…

I am employed by Netnod as head of engineering, research and development, and am, among other things, chair of the Security and Stability Advisory Committee at ICANN. You can find my CV and photos of me on this page.

As I wear so many hats, I find it necessary to express my personal views somewhere. This is the place where that happens. Postings on this blog, or on Facebook, Twitter etc., fall under this policy.

The views expressed on this post are mine and do not necessarily reflect the views of Netnod or any other of the organisations I have connections to.

I think Python should support UCS-4 by default

I think that on platforms that support enough bits, Python should be compiled with support for all Unicode code points. This of course wastes a bit of memory, and because of that has an impact on I/O etc., but I think the cost is worth it.
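To put a rough number on the memory cost, here is a minimal sketch. It assumes that a narrow build stores strings as 2-byte UTF-16 code units and a wide build as 4-byte code points, and it uses the encoded sizes as a stand-in for the internal buffers. The helper name storage_estimate is made up for this illustration, and newer CPython versions store strings differently, so treat the numbers as an estimate only.

```python
# Rough estimate only: the encoded sizes stand in for the internal buffers
# of a narrow build (2 bytes per code unit) and a wide build (4 bytes per
# code point). storage_estimate is a hypothetical helper, not a Python API.

def storage_estimate(text):
    """Return (narrow_bytes, wide_bytes) for the given string."""
    narrow_bytes = len(text.encode("utf-16-le"))  # 2 bytes per UTF-16 code unit
    wide_bytes = len(text.encode("utf-32-le"))    # 4 bytes per code point
    return narrow_bytes, wide_bytes


sample = "python should support UCS-4"
narrow, wide = storage_estimate(sample)
print(narrow, wide)
```

For text that stays inside the Basic Multilingual Plane, the wide estimate is twice the narrow one. For a character such as U+1D400 both come out the same, since the narrow form needs a surrogate pair.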

I downloaded Python 3.0.1 and ran configure with --with-wide-unicode. That did the trick (it solved the problems I described here and here). More good information about how to install and run parallel versions of Python can, for example, be found at Farm Development.

At the very least, the Mac OS X build should use UCS-4.

$ /usr/local/bin/python
Python 3.0.1 (r301:69556, Apr  6 2009, 20:51:21)
[GCC 4.0.1 (Apple Inc. build 5484)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.maxunicode
1114111
>>> a=chr(0x01D400)
>>> len(a)
1
>>> import unicodedata
>>> unicodedata.name(a)
'MATHEMATICAL BOLD CAPITAL A'
>>> b=unicodedata.normalize('NFKC',a)
>>> hex(ord(b))
'0x41'
>>>
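
The same check can be put in a small script, for example to find out which kind of build a given interpreter is before relying on len(). This is only a sketch: the helper names build_kind and utf16_units are my own for this illustration, and the only interpreter facts it relies on are sys.maxunicode and the standard codecs. As far as I understand, a narrow build would report sys.maxunicode as 65535 and represent the character as a surrogate pair.

```python
import sys
import unicodedata


def build_kind():
    """Classify the interpreter by the largest code point it supports."""
    return "wide" if sys.maxunicode > 0xFFFF else "narrow"


def utf16_units(text):
    """Count the UTF-16 code units needed to represent text."""
    return len(text.encode("utf-16-le")) // 2


a = chr(0x1D400)  # MATHEMATICAL BOLD CAPITAL A, outside the BMP

print(build_kind())
# On a wide build len(a) counts code points; utf16_units(a) shows how many
# 16-bit units the same character needs in UTF-16.
print(len(a), utf16_units(a))
# NFKC folds the mathematical letter to its plain ASCII counterpart.
print(unicodedata.normalize("NFKC", a))
```

If build_kind() returns "narrow", code that indexes or measures strings containing characters above U+FFFF needs extra care, which is exactly the problem the wide build avoids.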
