#146 closed defect (fixed)
ASCII char > 128 fails - Internationalization no more working
Reported by: | anonymous | Owned by: | Alec Thomas |
---|---|---|---|
Priority: | normal | Component: | TocMacro |
Severity: | normal | Keywords: | |
Cc: | | Trac Release: | |
Description
After upgrading from Trac 0.8 to 0.9, the TOC macro no longer supports international characters:
Error: Macro TOC(None) failed 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)
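For reference, this message is what Python's ASCII codec reports when it meets the first byte of a UTF-8 multi-byte sequence. A minimal sketch, assuming a hypothetical heading "Thé" (not taken from the reporter's wiki) and written for Python 3, where the decode has to be requested explicitly. Under Python 2 the same decode happens implicitly whenever 8-bit and unicode strings are combined:

```python
# Hypothetical heading; "é" becomes the two bytes 0xc3 0xa9 in UTF-8,
# so the first non-ASCII byte sits at position 2.
heading_bytes = "Thé".encode("utf-8")


def ascii_decode_error(data):
    """Return the error text produced by decoding data as ASCII, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = ascii_decode_error(heading_bytes)
print(message)
```

A heading made only of 7-bit characters returns None here, which matches the report that only international characters trigger the failure.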
Attachments (0)
Change History (10)
comment:1 Changed 19 years ago by
Status: | new → assigned |
---|
comment:2 Changed 19 years ago by
This version (r382) works great and there is no more internationalization problem.
comment:3 Changed 19 years ago by
Hmm, okay. Well I'm really not sure what the correct solution is. Apparently r382 doesn't work with Japanese character sets, but the fix breaks other international characters :\
Perhaps I need to decode using the character set from the HTTP header...
comment:4 Changed 19 years ago by
The version of the TOC macro offered for download (http://trac-hacks.org/download/tocmacro.zip) does indeed have some difficulties with international chars. But I tried the version taken directly from the repository, source:tocmacro/0.9/TOC.py#382, and it is OK.
comment:5 Changed 19 years ago by
Cc: | marc.zonzon@… added; anonymous removed |
---|
OK, getting rid of this decode('utf-8') solves the problem. Just take the last release and apply the patch below, or take revision 320 (http://trac-hacks.org/browser/tocmacro/0.9/TOC.py?rev=320), but then you lose the docstring!
I suppose that it fails because you have already written unicode (16-bit) strings before this utf-8 one, and StringIO cannot handle both, as specified in http://docs.python.org/lib/module-StringIO.html.
I don't think it is a problem of http header:
- I'm using utf-8, so you would end up with the same code that induces this bug,
- The problem is not a bad 8-bit encoding; it is that your StringIO no longer accepts 8-bit strings.
--- TOC.py.bak 2006-03-17 13:38:26.000000000 +0100
+++ TOC.py 2006-03-17 15:09:52.000000000 +0100
@@ -87,7 +87,7 @@
 out.write('<a href="%s">%s</a> : %s</li>\n' % (env.href.wiki(page), page, formatted_header))
 break
 else:
- default_anchor = anchor = Formatter._anchor_re.sub("", header.decode('utf-8'))
+ default_anchor = anchor = Formatter._anchor_re.sub("", header)
 anchor_n = 1
 while anchor in seen_anchors:
 anchor = default_anchor + str(anchor_n)
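The StringIO behaviour described in this comment can be emulated in a few lines. This is only a sketch for Python 3, where `str` stands in for the Python 2 `unicode` type and `bytes` for the Python 2 8-bit `str`. The function name and the sample heading are hypothetical and do not come from TOC.py:

```python
def getvalue_like_python2(chunks):
    """Join buffered chunks the way Python 2 joins mixed str/unicode values."""
    if any(isinstance(chunk, str) for chunk in chunks):
        # A single unicode chunk forces every 8-bit chunk through the
        # ASCII codec, which rejects any byte above 0x7f.
        return "".join(
            chunk if isinstance(chunk, str) else chunk.decode("ascii")
            for chunk in chunks
        )
    return b"".join(chunks)


utf8_header = "Résumé".encode("utf-8")  # hypothetical 8-bit heading

# Only 8-bit strings in the buffer: the bytes pass through untouched.
only_bytes = getvalue_like_python2([b"<li>", utf8_header, b"</li>"])

# One decoded (unicode) anchor in the buffer: the 8-bit header now fails.
try:
    getvalue_like_python2([utf8_header.decode("utf-8"), utf8_header])
    mixed_failed = False
except UnicodeDecodeError:
    mixed_failed = True

print(only_bytes, mixed_failed)
```

Under these assumptions, removing the decode keeps every chunk 8-bit, which is why the patch avoids the error for utf-8 content.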
comment:6 Changed 19 years ago by
I'm new to this Trac ticket system, and when I filled in the Cc field I didn't expect my email address to appear. How can I erase it? Please save me from a new load of spam.
comment:7 Changed 19 years ago by
Cc: | anonymous added; marc.zonzon@… removed |
---|
All you do is clear the CC field...
comment:8 Changed 19 years ago by
Regarding your previous comment, the thing is, the decode() was necessary to make TOC work for #92. So I have conflicting reports, one where decode() fixes a problem and one where it causes a problem.
Unfortunately I know very little about Python's locale support, so I can't really be of much help fixing it. As usual, patches welcome.
comment:9 Changed 19 years ago by
How this decode can work in Japanese is quite mysterious.
- If you decode utf-8 you obtain a unicode string, so once you have written this unicode string you can no longer write an 8-bit string.
- If you need to decode the header, why not decode the formatted_header?
I'm quite new to Trac, so I don't know how text is stored in the database, but I cannot see why the header is decoded to unicode here.
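One way to read the second point is that every value written to the output should go through the same conversion. A rough sketch for Python 3, assuming the headers arrive as UTF-8 8-bit strings (the thread does not confirm this). `to_text` and `render_entry` are hypothetical helpers, not part of the macro:

```python
import io


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: decode 8-bit input, pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def render_entry(header, formatted_header):
    """Write one TOC entry with both values converted to the same type."""
    out = io.StringIO()
    out.write('<a href="#%s">%s</a>' % (to_text(header), to_text(formatted_header)))
    return out.getvalue()


# Both arguments are 8-bit UTF-8 strings; both are decoded before writing.
entry = render_entry("Thé".encode("utf-8"), "<em>Thé</em>".encode("utf-8"))
print(entry)
```

Whether decoding everything or decoding nothing is the right choice depends on what the rest of Trac 0.9 writes to the same buffer, which this sketch does not try to answer.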
Can you try [download:tocmacro-r382 this] version? This was prior to a change which supposedly fixed #92.
If that doesn't work, please enable Trac logging and add the full traceback to this ticket, along with an example heading that triggers the exception.