Opened 17 months ago

Last modified 3 months ago

#10974 new defect

sending unicode email fails

Reported by: anonymous
Owned by: hasienda
Priority: normal
Component: AnnouncerPlugin
Severity: normal
Keywords:
Cc: sgeulette, 4glitch@…
Trac Release: 1.0

Description

The AnnouncerPlugin fails to send unicode emails for me. I get the following traceback:

Traceback (most recent call last):
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/api.py", line 584, in _real_send
    evt)
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 488, in _do_send
    msgText = MIMEText(output, msg_format)
  File "/usr/lib/python2.7/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/usr/lib/python2.7/email/message.py", line 226, in set_payload
    self.set_charset(charset)
  File "/usr/lib/python2.7/email/message.py", line 262, in set_charset
    self._payload = self._payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 32: ordinal not in range(128)

I'm using mime_encoding = base64 and the plain text email format (but it seems to happen for HTML email as well). I'm on Python 2.7.3 and Trac 1.0.1.
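
The last frame of the traceback can be reproduced outside Trac. The following is only an illustrative sketch: the sample string is made up, and the code is written to run on Python 3, whereas the report concerns Python 2.7. It assumes, as the traceback suggests, that the payload is encoded with the charset's output_charset, which is us-ascii when no charset is specified.

```python
from email.charset import Charset

# Hypothetical sample text; any non-ASCII character triggers the failure.
sample = u"Rundfunkgeb\xfchren"


def try_encode(text, charset):
    """Return (encoded_bytes, None) on success or (None, error) on failure."""
    try:
        # Mirrors the failing line: self._payload.encode(charset.output_charset)
        return text.encode(charset.output_charset), None
    except UnicodeEncodeError as exc:
        return None, exc


ascii_bytes, ascii_error = try_encode(sample, Charset("us-ascii"))
utf8_bytes, utf8_error = try_encode(sample, Charset("utf-8"))

print(ascii_error)  # the same kind of UnicodeEncodeError as in the traceback
print(utf8_bytes)   # encoding with utf-8 succeeds
```

The point of the sketch is only that the error comes from an ASCII encode of non-ASCII text, not from anything specific to Trac.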

Attachments (2)

formatters_patch.diff (440 bytes) - added by patrick 7 months ago.
patch implied by comment #9
sending-unicode-email-fails.patch (1.1 KB) - added by g1itch 3 months ago.
changed patch from comment:3 which worked for me

Change History (17)

comment:1 Changed 17 months ago by rjollos

I'm a bit surprised by this. Could you paste the [announcer] section from your trac.ini file so we can double-check?

comment:2 follow-up: Changed 17 months ago by rjollos

Hi, would you mind trying out this (untested) patch?

  • announcerplugin/trunk/announcer/distributors/mail.py

    diff --git a/announcerplugin/trunk/announcer/distributors/mail.py b/announcerplu
    index b994502..bd79d1f 100644
    @@ -478,16 +478,14 @@ class EmailDistributor(Component):
                 rootMessage.attach(parentMessage)
     
                 alt_msg_format = 'html' in alternate_style and 'html' or 'plain'
    -            msgText = MIMEText(alternate_output, alt_msg_format)
    -            msgText.set_charset(self._charset)
    +            msgText = MIMEText(alternate_output, alt_msg_format, self._charset)
                 parentMessage.attach(msgText)
             else:
                 parentMessage = rootMessage
     
             msg_format = 'html' in format and 'html' or 'plain'
    -        msgText = MIMEText(output, msg_format)
    +        msgText = MIMEText(output, msg_format, self._charset)
             del msgText['Content-Transfer-Encoding']
    -        msgText.set_charset(self._charset)
             # According to RFC 2046, the last part of a multipart message is best
             #   and preferred.
             parentMessage.attach(msgText)

Based on documentation for MIMEText, we may need to specify the charset in the constructor.

comment:3 in reply to: ↑ 2 ; follow-up: Changed 17 months ago by anonymous

Thanks for your quick response. I'm fine with testing your suggestion. Here's the result:

Traceback (most recent call last):
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/api.py", line 584, in _real_send
    evt)
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/var/lib/trac/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r12503-py2.7.egg/announcer/distributors/mail.py", line 490, in _do_send
    msgText = MIMEText(output, msg_format, self._charset)
  File "/usr/lib/python2.7/email/mime/text.py", line 29, in __init__
    **{'charset': _charset})
  File "/usr/lib/python2.7/email/mime/base.py", line 25, in __init__
    self.add_header('Content-Type', ctype, **_params)
  File "/usr/lib/python2.7/email/message.py", line 408, in add_header
    parts.append(_formatparam(k.replace('_', '-'), v))
  File "/usr/lib/python2.7/email/message.py", line 45, in _formatparam
    if value is not None and len(value) > 0:

(Regarding the line numbers: I did not remove the replaced lines but commented them out.)

I did some further debugging, which is hopefully useful to you. self._charset is an instance of email.charset.Charset with the following attributes (extracted by printing self._charset.__dict__ to the log file): {'input_codec': 'utf-8', 'body_encoding': 2, 'input_charset': 'utf-8', 'header_encoding': 2, 'output_charset': 'utf-8', 'output_codec': 'utf-8'}. As the Python documentation of the MIMEText constructor says that the third parameter defaults to us-ascii, I tried the string utf-8 instead, which, however, creates a broken email like this:

Return-path: <trac@wobsta.de>
Envelope-to: contact@wobsta.de
Delivery-date: Wed, 27 Mar 2013 14:27:38 +0100
Received: from localhost ([127.0.0.1] helo=h2032560.stratoserver.net)
	by h2032560.stratoserver.net with esmtp (Exim 4.76)
	(envelope-from <trac@wobsta.de>)
	id 1UKqO9-0008MK-Tl
	for contact@wobsta.de; Wed, 27 Mar 2013 14:27:37 +0100
Content-Type: multipart/related;
 boundary="===============2509084458330105465=="
MIME-Version: 1.0
Date: Wed, 27 Mar 2013 13:27:37 -0000
To: "undisclosed-recipients:"
Reply-To: trac@wobsta.de
Message-ID: <041.feac52a1e43c7efa8e99991f7bd1ff00@wobsta.de>
From: "wobsta.de" <trac@wobsta.de>
Subject: Blog: rundfunkgebuehren2 comment deleted
Auto-Submitted: auto-generated
Precedence: bulk
X-Announcer-Version: 1.0dev-r12503
X-Mailer: AnnouncerPlugin v1.0dev-r12503 on Trac v1.0.1
X-Trac-Announcement-Realm: blog
X-Trac-Project: wobsta.de
X-Trac-Version: 1.0.1
X-SA-Exim-Connect-IP: 127.0.0.1
X-SA-Exim-Mail-From: trac@wobsta.de
X-SA-Exim-Scanned: No (on h2032560.stratoserver.net); SAEximRunCond expanded to false

This is a multi-part message in MIME format.
--===============2509084458330105465==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64

STNKMWJtUm1kVzVyWjJWaWRXVm9jbVZ1TWpvZ1VuVnVaR1oxYm10blpXTER2R2h5Wlc0Z1pzTzhj
aUJEYjIxd2RYUmxjaUJwYmlCbAphVzVsYlNCSVpXbHRZc084Y204c0lGWmxjbWhoYm1Sc2RXNW5J
SFZ1WkNCRmJuUnpZMmhsYVdSMWJtY2dZbVZwYlNCQ2RXNWtaWE4yClpYSjNZV3gwZFc1bmMyZGxj
bWxqYUhRS0Nnb0sK

--===============2509084458330105465==--

The mail client displays this as seemingly random text (it looks like base64).
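
A quick way to check what the mail client is showing is to decode the body by hand. The sketch below (Python 3, standard library only) takes the base64 body copied from the message above and decodes it twice. That the second decode succeeds suggests the text was base64-encoded twice, although the sketch does not show where in the plugin that happens.

```python
import base64

# Body of the text/plain part, copied from the message above.
body = (
    "STNKMWJtUm1kVzVyWjJWaWRXVm9jbVZ1TWpvZ1VuVnVaR1oxYm10blpXTER2R2h5Wlc0Z1pzTzhj\n"
    "aUJEYjIxd2RYUmxjaUJwYmlCbAphVzVsYlNCSVpXbHRZc084Y204c0lGWmxjbWhoYm1Sc2RXNW5J\n"
    "SFZ1WkNCRmJuUnpZMmhsYVdSMWJtY2dZbVZwYlNCQ2RXNWtaWE4yClpYSjNZV3gwZFc1bmMyZGxj\n"
    "bWxqYUhRS0Nnb0sK\n"
)

# First decode: what a mail client shows after honouring the
# Content-Transfer-Encoding header. It is still base64 text.
once = base64.b64decode(body)

# Second decode: the UTF-8 bytes of the intended message text.
twice = base64.b64decode(once)

print(once[:24])
print(twice.decode("utf-8"))
```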

However, if I encode output using utf-8, i.e. use msgText = MIMEText(output.encode('utf-8'), msg_format), everything looks fine. Maybe the correct code should thus be msgText = MIMEText(output.encode(self._charset.input_codec), msg_format), but I'm not sure.

For the curious, as you asked me, this is the announcer section of my trac.ini:

[announcer]
default_email_format = text/plain
email_address_resolvers = SpecifiedEmailResolver, SessionEmailResolver, DefaultDomainEmailResolver
email_enabled = true
email_from = trac@wobsta.de
email_from_name =
email_replyto = trac@wobsta.de
email_sender = SmtpEmailSender
email_subject_prefix = __default__
email_to = undisclosed-recipients:
mime_encoding = base64
use_public_cc = false
use_threaded_delivery = false

comment:4 in reply to: ↑ 3 Changed 17 months ago by rjollos

Replying to anonymous:

...
However, if I encode output using utf-8, i.e. use msgText = MIMEText(output.encode('utf-8'), msg_format), everything looks fine. Maybe the correct code should thus be msgText = MIMEText(output.encode(self._charset.input_codec), msg_format), but I'm not sure.

That seems like a good workaround for now. It seems like it should be possible to specify the encoding when constructing the MIMEText object, but the documentation was unclear, and I made a bit of an assumption when deciding to pass a Charset object as the third parameter.

Thank you for debugging and providing all of this info. I'll do some more testing in the next day or so, but if all else fails, we can just encode the output that is passed to the MIMEText constructor like you did.
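
The workaround discussed above can be sketched as a small helper. This is an assumption-laden illustration, not the plugin's code: build_text_part is a hypothetical name, the charset is passed as a plain string rather than the plugin's Charset object, and the plugin's removal of the Content-Transfer-Encoding header is not replicated. The Python 2 branch follows the encode-first approach from this thread and was not run here; the Python 3 branch relies on MIMEText encoding text itself when given a charset name.

```python
import sys
from email.mime.text import MIMEText


def build_text_part(output, msg_format, charset_name="utf-8"):
    """Build a MIME text part from a text string (hypothetical helper)."""
    if sys.version_info[0] == 2:
        # Python 2: hand MIMEText bytes so it never attempts an ASCII encode.
        payload = output.encode(charset_name)
    else:
        # Python 3: MIMEText encodes the text itself for the given charset.
        payload = output
    return MIMEText(payload, msg_format, charset_name)


part = build_text_part(u"Rundfunkgeb\xfchren f\xfcr Computer", "plain")
print(part["Content-Type"])
print(part["Content-Transfer-Encoding"])
```

Passing the charset name to the constructor also labels the part with that charset, so the Content-Type header and the bytes in the body stay consistent.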

comment:5 Changed 17 months ago by anonymous

I'm fine with my patched version for now, but I really want to emphasize that this needs fixing upstream. The reason is that it's a really odd failure if you don't monitor your system in detail. Let me explain: if you just set up Trac with the Announcer plugin and test it with some simple (ASCII-only) message, it pretty much works like a charm. (So first of all, thanks for the great plugin, it's awesome.)

I did so as well, but I was lucky to test it with some real data, which fortunately happened to contain non-ASCII characters (as will be rather common in some of my use cases). The problem is that everything still looks fine, except that the email is not sent at all. If you turn on logging in Trac, you will find the traceback in the logs, but for the person who triggered the email, everything looks fine. You do not receive an error, and the item is updated online properly. So at the end of the day, people will just not receive their notification emails, without anybody understanding why this happens! I guess you agree that this is a really serious issue.

I was just poking around to see whether my solution is correct. I agree that the Python documentation is not really useful here. (It's a shame, actually.) However, I found a blog post about it: http://mg.pov.lt/blog/unicode-emails-in-python ... which supports my solution. On the other hand, people have been discussing the problem elsewhere as well: see ht tp://bugs.python.org/issue1368247 ... but it looks like the fix has still not been applied to Python 2.7. On my system, Python still does not contain the suggested patch (ht tp://bugs.python.org/file12190/mimetext-unicode.patch). I'm on Ubuntu 12.04 LTS using the Python from the distribution, which happens to be Python 2.7.3. I was just lucky to try such an encoding myself, using self._charset.input_codec, which happens to be utf-8 in my case as well. Probably self._charset.output_charset is the right thing (although I wonder why it should be the output charset, as the output in the Announcer code is effectively the input to the email system, but I might just be confused about the terminology).

comment:6 Changed 17 months ago by anonymous

PS: I'm sorry for posting broken links. The spam filter flagged my reply as spam due to too many external links, so I slightly broke the link format. I apologize for that workaround (creating an account for myself would probably have satisfied the spam filter as well).

comment:7 Changed 13 months ago by rjollos

  • Cc sgeulette added

#11227 closed as a duplicate. We should consider applying the patch in this ticket.

comment:8 Changed 12 months ago by bormotov@…

I use AnnouncerPlugin from trunk (r13373) and still get this error:

2013-09-02 23:45:45,425 Trac[api] ERROR: AnnouncementSystem failed.
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/announcer/api.py", line 584, in _real_send
    evt)
  File "/usr/lib/python2.5/site-packages/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/usr/lib/python2.5/site-packages/announcer/distributors/mail.py", line 490, in _do_send
    msgText.set_charset(self._charset)
  File "/usr/lib/python2.5/email/message.py", line 262, in set_charset
    self._payload = charset.body_encode(self._payload)
  File "/usr/lib/python2.5/email/charset.py", line 384, in body_encode
    return email.base64mime.body_encode(s)
  File "/usr/lib/python2.5/email/base64mime.py", line 148, in encode
    enc = b2a_base64(s[i:i + max_unencoded])
UnicodeEncodeError: 'ascii' codec can't encode characters in position 6-13: ordinal not in range(128)

Trac 1.0.1

The workaround with MIMEText(output.encode('utf-8'), msg_format) helps.

comment:9 Changed 10 months ago by anonymous

Looks like Trac's notification system fixed this problem in trac:changeset:10176 by letting Genshi encode the body:
trac:source:trunk/trac/notification.py@12085:416,474#L409

Announcer could do this in each formatter:
source:announcerplugin/trunk/announcer/formatters.py@12359:152,248,318#L248

comment:10 follow-up: Changed 9 months ago by rjollos

Issue was raised again on the mailing list.

comment:11 in reply to: ↑ 10 Changed 9 months ago by roger@…

Replying to rjollos:

Issue was raised again on the mailing list.

And suddenly it appears again. No changes to the system. But e-mail stops. I get this in the Trac log:

2013-12-03 11:52:47,761 Trac[api] ERROR: AnnouncementSystem failed.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r0-py2.7.egg/announcer/api.py", line 584, in _real_send
    evt)
  File "/usr/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r0-py2.7.egg/announcer/distributors/mail.py", line 330, in distribute
    self._do_send(transport, event, k, v, fmtdict[k])
  File "/usr/lib/python2.7/site-packages/TracAnnouncer-1.0dev_r0-py2.7.egg/announcer/distributors/mail.py", line 481, in _do_send
    msgText = MIMEText(alternate_output, alt_msg_format)
  File "/usr/lib/python2.7/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/usr/lib/python2.7/email/message.py", line 226, in set_payload
    self.set_charset(charset)
  File "/usr/lib/python2.7/email/message.py", line 262, in set_charset
    self._payload = self._payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 472: ordinal not in range(128)

I have made the edit suggested in comment:8. It is odd that it seemed to solve the problem for a while and then the problem returned.

comment:12 Changed 9 months ago by jun66j5

#11354 was closed as duplicate.

Changed 7 months ago by patrick

patch implied by comment #9

comment:13 Changed 7 months ago by patrick

I ran into exactly the same problem.

It occurred out of the blue after no issues whatsoever for approximately 30 tickets with various comments and modifications along the way. Then, all of a sudden, this issue showed up. My error message concerned a whitespace character (a zero-width space):

UnicodeEncodeError: 'ascii' codec can't encode character u'\u200b' in position 4639: ordinal not in range(128)

The fact that it occurred after some time makes me wonder if this is somehow tied to a counter that is being incremented with each ticket creation, modification, comment etc...

At any rate, I applied the patch implied by comment:9 and it corrected the problem for me, at least for the time being. See: attachment:formatters_patch.diff. I apologize that the description of the attachment links to an incorrect ticket.

comment:14 Changed 3 months ago by g1itch

  • Cc 4glitch@… added

Changed 3 months ago by g1itch

changed patch from comment:3 which worked for me

comment:15 Changed 3 months ago by g1itch

I tried the patch from comment:3 (with output.encode(self._charset.input_codec)): the exception disappeared, but now all parts have charset="us-ascii" in the Content-Type header. It seems that the lines with msgText.set_charset(self._charset) should be retained.

attachment:formatters_patch.diff didn't help at all.

attachment:sending-unicode-email-fails.patch works for me.
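
To check the charset label of each part without reading raw headers, something like the following sketch could be used. It is written for Python 3; text_part_charsets is a hypothetical helper, and the two-part message is a stand-in built by hand, not an email produced by the plugin.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def text_part_charsets(message):
    """Return the declared charset of every text/* part in a message."""
    return [
        part.get_content_charset()
        for part in message.walk()
        if part.get_content_maintype() == "text"
    ]


# Hypothetical two-part message standing in for an Announcer email.
root = MIMEMultipart("alternative")
root.attach(MIMEText(u"Gr\xfc\xdfe", "plain", "utf-8"))
root.attach(MIMEText(u"<p>Gr\xfc\xdfe</p>", "html", "utf-8"))

charsets = text_part_charsets(root)
print(charsets)
```

Any part reporting us-ascii while carrying non-ASCII text would point to the mislabeling described in comment:15.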
