Opened 10 years ago

Closed 10 years ago

# Crash when trying to render certain characters

• Reported by: anonymous
• Owned by: athomas
• Priority: normal
• Component: PageToPdfPlugin
• Severity: normal
• Cc: mankoff+pdfplugin@…
• Release: 0.10

### Description

It seems that '´', '', and the Microsoft smart quotes crash my instance of the plugin.

I have the following defined in trac.ini:

[pagetopdf]
size = A4
charset = iso-8859-15


The wiki has the following text:

´


Python Traceback

Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/trac/web/main.py", line 313, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib64/python2.4/site-packages/trac/web/main.py", line 198, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib64/python2.4/site-packages/trac/wiki/web_ui.py", line 126, in process_request
    page.text, format, page.name)
  File "/usr/lib64/python2.4/site-packages/trac/mimeview/api.py", line 550, in send_converted
    content, selector)
  File "/usr/lib64/python2.4/site-packages/trac/mimeview/api.py", line 330, in convert_content
    output = converter.convert_content(req, mimetype, content, ck)
  File "build/bdist.linux-x86_64/egg/pagetopdf/pagetopdf.py", line 20, in convert_content
  File "/usr/lib64/python2.4/encodings/iso8859_15.py", line 18, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\xb4' in position 4: character maps to <undefined>
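
The failure in the last frame can probably be reproduced outside Trac. The following is a minimal sketch, not the plugin's code: the sample string and variable names are made up for illustration, and it is written for a current Python 3 interpreter rather than the Python 2.4 shown in the traceback.

```python
# Minimal reproduction of the failing encode step (illustrative names only,
# not taken from pagetopdf.py).
text = u"test\u00b4"  # ACUTE ACCENT at index 4, matching the traceback

failed = False
bad_char = None
try:
    text.encode("iso-8859-15")
except UnicodeEncodeError as exc:
    failed = True
    # exc.object is the string being encoded; start/end delimit the bad span.
    bad_char = exc.object[exc.start:exc.end]
    print("cannot encode %r at position %d" % (bad_char, exc.start))
```

If this raises in the same way, the problem lies in the chosen charset rather than in htmldoc itself.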


The version of htmldoc is

htmldoc --version
1.8.24 Open Source


I am using the workflow branch from revision 3378.

### comment:1 Changed 10 years ago by athomas

• Resolution set to invalid
• Status changed from new to closed

According to this post, which may or may not be correct, the ACUTE ACCENT cannot be represented in ISO-8859-15 at all:

$ echo -en '\xc2\xb4' | iconv -f utf-8 -t iso-8859-15

iconv: illegal input sequence at position 0


You will have to change your codepage to something else, or use an alternate character.

### comment:2 follow-up: ↓ 3 Changed 10 years ago by mankoff

#1133 was marked as a dup of this, but the page linked to with this bug in #1133 does not have accent characters.

### comment:3 in reply to: ↑ 2 Changed 10 years ago by anonymous

> #1133 was marked as a dup of this, but the page linked to with this bug in #1133 does not have accent characters.

It's not the exact same character, but it's the same basic issue. The character '\u2019' is a RIGHT SINGLE QUOTATION MARK which is also unencodable in iso-8859-15.
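
One way to check this kind of claim is to try encoding each character in a few candidate charsets. This is only a sketch: the helper name `can_encode` and the list of charsets are assumptions chosen for illustration, and the code targets Python 3.

```python
def can_encode(char, charset):
    """Return True if `char` can be encoded in `charset` (hypothetical helper)."""
    try:
        char.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The two characters discussed in this ticket.
chars = {
    "ACUTE ACCENT": u"\u00b4",
    "RIGHT SINGLE QUOTATION MARK": u"\u2019",
}

for name, ch in sorted(chars.items()):
    for charset in ("iso-8859-15", "cp1252", "utf-8"):
        print("%-28s %-12s %s" % (name, charset, can_encode(ch, charset)))
```

Both characters are rejected by iso-8859-15, while cp1252 and utf-8 accept them.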

### comment:4 Changed 10 years ago by mankoff

• Cc mankoff+pdfplugin@… added; anonymous removed

OK, I get it. But what is the solution? I've set my codepage to something other than iso-8859-15, like so:

[pagetopdf]
size = A4
charset = UTF-8

but still get the same error.

Is there an easier fix than editing all pages with those stupid smart-quotes that got pasted in from somewhere else?
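
One possible workaround, short of editing every page, would be to substitute the problem characters just before the text is encoded. The sketch below is an assumption about how that could look, not the plugin's actual code: `SMART_CHARS` and `encode_for_charset` are hypothetical names, the mapping is deliberately incomplete, and where such a step would hook into pagetopdf.py is not shown in this ticket. It is written for Python 3 and does not explain why the error persists with `charset = UTF-8`.

```python
# Hypothetical mapping from troublesome code points to ASCII stand-ins.
SMART_CHARS = {
    0x2018: u"'",   # LEFT SINGLE QUOTATION MARK
    0x2019: u"'",   # RIGHT SINGLE QUOTATION MARK
    0x201C: u'"',   # LEFT DOUBLE QUOTATION MARK
    0x201D: u'"',   # RIGHT DOUBLE QUOTATION MARK
    0x00B4: u"'",   # ACUTE ACCENT
}


def encode_for_charset(text, charset="iso-8859-15"):
    """Replace known smart characters, then encode without raising.

    Anything still unencodable after the substitution becomes '?'
    instead of triggering UnicodeEncodeError.
    """
    cleaned = text.translate(SMART_CHARS)
    return cleaned.encode(charset, "replace")


print(encode_for_charset(u"it\u2019s a \u201cquoted\u201d word"))
```

The trade-off is that the PDF shows plain quotes instead of the typographic ones, while the wiki pages themselves stay unchanged.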