Opened 5 years ago

Last modified 17 months ago

#5357 reopened defect

UnicodeDecodeError on UTF-8 encoded filenames

Reported by: spoke
Owned by: anonymous
Priority: highest
Component: GitPlugin
Severity: blocker
Keywords:
Cc: xelnor
Trac Release: 0.12

Description

I just tried plugin version 0.11 on Trac 0.11.1, 0.11.5 and trunk, and I always get the error shown below.

Please note that Trac works fine when using SVN as the backend.

Trac detected an internal error:

ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
Most recent call last:

    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/web/main.py", line 464, in _dispatch_request
      Code fragment:
       459. try:
       460. if not env and env_error:
       461. raise HTTPInternalError(env_error)
       462. try:
       463. dispatcher = RequestDispatcher(env)
       464. dispatcher.dispatch(req)
       465. except RequestDone:
       466. pass
       467. resp = req._response or []
       468.  
       469. except HTTPException, e:
      Local variables:
      Name	Value
      after 	[u' except RequestDone:', u' pass', u' resp = ...
      before 	[u' try:', u' if not env and env_error:', u' raise ...
      dispatcher 	<trac.web.main.RequestDispatcher object at 0x8bb984c>
      e 	ProgrammingError('You must not use 8-bit bytestrings unless you use a ...
      env 	<trac.env.Environment object at 0x875060c>
      env_error 	None
      exc_info 	(<class 'pysqlite2.dbapi2.ProgrammingError'>, ProgrammingError('You must ...
      filename 	'/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/web/mai ...
      frames 	[{'function': '_dispatch_request', 'lines_before': [u' try:', u' ...
      has_admin 	True
      line 	u' dispatcher.dispatch(req)'
      lineno 	463
      message 	u'ProgrammingError: You must not use 8-bit bytestrings unless you use a ...
      req 	<Request "GET u'/login'">
      resp 	[]
      tb 	<traceback object at 0x8e09694>
      tb_hide 	None
      traceback 	u'Traceback (most recent call last):\n File ...
    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/web/main.py", line 181, in dispatch
      Code fragment:
       176. if not req.path_info or req.path_info == '/':
       177. chosen_handler = self.default_handler
       178. # pre-process any incoming request, whether a handler
       179. # was found or not
       180. chosen_handler = self._pre_process_request(req,
       181. chosen_handler)
       182. except TracError, e:
       183. raise HTTPInternalError(e)
       184. if not chosen_handler:
       185. if req.path_info.endswith('/'):
       186. # Strip trailing / and redirect
      Local variables:
      Name	Value
      chosen_handler 	<trac.web.auth.LoginModule object at 0x8bb9bac>
      chrome 	<trac.web.chrome.Chrome object at 0x8b6de4c>
      err 	(<class 'pysqlite2.dbapi2.ProgrammingError'>, ProgrammingError('You must ...
      handler 	<trac.web.auth.LoginModule object at 0x8bb9bac>
      req 	<Request "GET u'/login'">
      self 	<trac.web.main.RequestDispatcher object at 0x8bb984c>
    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/web/main.py", line 317, in _pre_process_request
      Code fragment:
       312. req.outcookie['trac_form_token']['secure'] = True
       313. return req.outcookie['trac_form_token'].value
       314.  
       315. def _pre_process_request(self, req, chosen_handler):
       316. for filter_ in self.filters:
       317. chosen_handler = filter_.pre_process_request(req, chosen_handler)
       318. return chosen_handler
       319.  
       320. def _post_process_request(self, req, *args):
       321. nbargs = len(args)
       322. resp = args
      Local variables:
      Name	Value
      chosen_handler 	<trac.web.auth.LoginModule object at 0x8bb9bac>
      filter_ 	<trac.versioncontrol.api.RepositoryManager object at 0x8bb9dac>
      req 	<Request "GET u'/login'">
      self 	<trac.web.main.RequestDispatcher object at 0x8bb984c>
    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/versioncontrol/api.py", line 86, in pre_process_request
      Code fragment:
        81.  
        82. def pre_process_request(self, req, handler):
        83. from trac.web.chrome import Chrome, add_warning
        84. if handler is not Chrome(self.env):
        85. try:
        86. self.get_repository(req.authname).sync()
        87. except TracError, e:
        88. add_warning(req, _("Can't synchronize with the repository "
        89. "(%(error)s). Look in the Trac log for more "
        90. "information.", error=to_unicode(e.message)))
        91.
      Local variables:
      Name	Value
      Chrome 	<class 'trac.web.chrome.Chrome'>
      add_warning 	<function add_warning at 0x8803b54>
      handler 	<trac.web.auth.LoginModule object at 0x8bb9bac>
      req 	<Request "GET u'/login'">
      self 	<trac.versioncontrol.api.RepositoryManager object at 0x8bb9dac>
    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/versioncontrol/cache.py", line 213, in sync
      Code fragment:
       208. cursor.execute("INSERT INTO node_change "
       209. " (rev,path,node_type,change_type, "
       210. " base_path,base_rev) "
       211. "VALUES (%s,%s,%s,%s,%s,%s)",
       212. (str(next_youngest),
       213. path, kind, action, bpath, brev))
       214.  
       215. # 1.3. iterate (1.1 should always succeed now)
       216. self.youngest = next_youngest
       217. next_youngest = self.repos.next_rev(next_youngest)
       218.  
      Local variables:
      Name	Value
      action 	'A'
      actionmap 	{'edit': 'E', 'add': 'A', 'move': 'M', 'copy': 'C', 'delete': 'D'}
      authz 	<trac.versioncontrol.api.Authorizer object at 0x8d89bac>
      bpath 	''
      brev 	None
      cset 	<tracext.git.git_fs.GitChangeset object at 0x8cf0aac>
      cursor 	<trac.db.util.IterableCursor object at 0x8e3f12c>
      db 	<trac.db.pool.PooledConnection object at 0x8e0984c>
      feedback 	None
      key 	'youngest_rev'
      kind 	'F'
      kindmap 	{'file': 'F', 'dir': 'D'}
      metadata 	{u'youngest_rev': u'', u'repository_dir': u'git:/home/www/git/test/.git'}
      name 	u'youngest_rev'
      next_youngest 	'b4c60abd772d7c8192b8cf3f46d44978eccdc7b0'
      path 	'doc/analysis/Metodolog\xc3\x83\xc2\xada de almacenamiento de datos en el ...
      repos_youngest 	'73f1a95e8d2a161987c16d56cb223e40666ee63b'
      repository_dir 	u'git:/home/www/git/test/.git'
      self 	<tracext.git.git_fs.CachedRepository2 object at 0x8d5a0ec>
      value 	u''
    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/db/util.py", line 59, in execute
      Code fragment:
        54. return self.cursor.execute(sql)
        55. except Exception, e:
        56. self.log.debug('execute exception: %r', e)
        57. raise
        58. if args:
        59. return self.cursor.execute(sql_escape_percent(sql), args)
        60. return self.cursor.execute(sql)
        61.  
        62. def executemany(self, sql, args=None):
        63. if self.log:
        64. self.log.debug('SQL: %r', sql)
      Local variables:
      Name	Value
      args 	('b4c60abd772d7c8192b8cf3f46d44978eccdc7b0', ...
      self 	<trac.db.util.IterableCursor object at 0x8e3f12c>
      sql 	'INSERT INTO node_change (rev,path,node_type,change_type, ...
    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/db/sqlite_backend.py", line 59, in execute
      Code fragment:
        54. raise
        55. def execute(self, sql, args=None):
        56. if args:
        57. sql = sql % (('?',) * len(args))
        58. return self._rollback_on_error(sqlite.Cursor.execute, sql,
        59. args or [])
        60. def executemany(self, sql, args=None):
        61. if args:
        62. sql = sql % (('?',) * len(args[0]))
        63. return self._rollback_on_error(sqlite.Cursor.executemany, sql,
        64. args or [])
      Local variables:
      Name	Value
      args 	('b4c60abd772d7c8192b8cf3f46d44978eccdc7b0', ...
      self 	<trac.db.sqlite_backend.PyFormatCursor object at 0x880a2fc>
      sql 	'INSERT INTO node_change (rev,path,node_type,change_type, ...
    * File "/usr/lib/python2.5/site-packages/Trac-0.12dev_r8263-py2.5.egg/trac/db/sqlite_backend.py", line 51, in _rollback_on_error 

Attachments (1)

gitplugin.patch (1.5 KB) - added by xelnor 5 years ago.
Patch enabling gitplugin to support non-ascii file names in the git repository

Change History (13)

comment:1 Changed 5 years ago by spoke

Hi!

We managed to work around this by editing trac/db/sqlite_backend.py and adding the following line after the sqlite import, around line 43:

sqlite.register_adapter(str, lambda s: s.decode('utf-8'))

I don't think this is the right way to fix it, but I'm quite short on time and can't spend more on it, sorry.
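
For readers on a current Python, the idea behind this workaround can be sketched as follows. This is only an illustration, assuming Python 3 and the standard-library sqlite3 module rather than the Python 2 and pysqlite2 setup in the traceback. `decode_path` and `store_paths` are hypothetical helpers, not part of Trac or GitPlugin. Byte-string paths are decoded to text before they reach the database, instead of relying on a global adapter:

```python
import sqlite3


def decode_path(path, encoding="utf-8"):
    """Return path as text; bytes are decoded, str is passed through."""
    if isinstance(path, bytes):
        return path.decode(encoding)
    return path


def store_paths(paths):
    """Insert decoded paths into an in-memory table and read them back."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for Trac's node_change table (assumption).
    conn.execute("CREATE TABLE node_change (rev TEXT, path TEXT)")
    conn.executemany(
        "INSERT INTO node_change (rev, path) VALUES (?, ?)",
        [("r1", decode_path(p)) for p in paths],
    )
    rows = [row[0] for row in
            conn.execute("SELECT path FROM node_change ORDER BY rowid")]
    conn.close()
    return rows
```

Decoding at the point where the path enters the application keeps the database layer free of encoding guesses, which is roughly what a per-call fix in the plugin would do.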

comment:2 Changed 5 years ago by xelnor

  • Cc xelnor added

I encountered this exact problem.

I found that it occurs while importing a git repository into the Trac cache, when trac-admin tries to add a file with non-ASCII characters in its name.

I have attached a patch which solved this problem for me.

I hope it fixes the issue for everyone :)

Changed 5 years ago by xelnor

Patch enabling gitplugin to support non-ascii file names in the git repository

comment:3 follow-up: Changed 5 years ago by roadrunner

Two questions regarding the patch:

  1. why the 2nd change, sys.stdout.write('\n')?
  2. shouldn't the 2nd line in the 3rd change be
    p_path = to_unicode(p_path)
    

comment:4 Changed 5 years ago by xelnor

Yes, indeed, you are right :)
The sys.stdout.write is left over from my tests and shouldn't be there; the second one is a typo.

comment:5 Changed 4 years ago by hvr

  • Owner changed from hvr to anonymous
  • Status changed from new to assigned
  • Summary changed from Trac detected an internal error: to UnicodeDecodeError on UTF-8 encoded filenames

comment:6 Changed 4 years ago by hvr

  • Resolution set to fixed
  • Status changed from assigned to closed

(In [7755]) GitPlugin: decode git paths to unicode strings; fixes #5357

comment:7 follow-up: Changed 4 years ago by hou5e@…

  • Resolution fixed deleted
  • Status changed from closed to reopened

Thanks for adding these modifications to the plugin. I am using the latest version of Trac and Git Plugin rev 0.11.0.2-6.20100307svn7755 on Fedora 12 (GNOME), with Postgresql-8.4.2-1.fc12.x86_64.

The offending filename characters no longer leave Trac in an unusable state with an error message, but I still have related problems.

My test file was named: 
Test for bad characters - ' - ` - ‘ - ’ - •.txt
(Test for bad characters - ' - ` - \u2018 - \u2019 - \u2022.txt)

In the Trac browser, it would be displayed as: 
Test for bad characters - ' - ` - � - � - �.txt
(Test for bad characters - ' - ` - \ufffd - \ufffd - \ufffd.txt)
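
One way such replacement characters can arise is an encoding mismatch combined with lossy decoding. The sketch below is illustrative only and is not confirmed as the cause in this ticket. It assumes Python 3, and the cp1252 encoding is a guess chosen for the example; `lossy_decode` is a hypothetical helper:

```python
def lossy_decode(raw, encoding="utf-8"):
    """Decode bytes, substituting U+FFFD for anything that is not valid."""
    return raw.decode(encoding, errors="replace")


name = "\u2018 - \u2019 - \u2022.txt"

# Correct round trip: encode and decode with the same codec.
round_trip = lossy_decode(name.encode("utf-8"))

# Mismatch: bytes written as cp1252, then read back as UTF-8.
# Each of the three special characters becomes a single invalid byte.
mismatched = lossy_decode(name.encode("cp1252"))
```

In the mismatched case every special character comes back as U+FFFD, while the ASCII parts of the name survive.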

My test file can be added to Git, and the filename is shown in the Trac Browser (Test for bad characters - ' - ` - � - � - �.txt). However, viewing or downloading the file fails, although with a nicer error message:

Oops…
Trac detected an internal error:
TypeError: execv() arg 2 must contain only strings

I confirmed that if you leave the file in the Git repo and rename it to remove the offending characters, you can view and download it from the Browser in Trac. This worked.

I then reverted my Git repo to remove the test file, one rev at a time. After reverting the previous rev, I started getting the following warning, which I had not seen before:

Warning:  Can't synchronize with the repository (No changeset 96f335e75e2bed9cc6bf17982908debaa5a362fb in the repository). Look in the Trac log for more information.

After reverting both revisions for the test file, I had to hack the offending entries out of the PostgreSQL database to get rid of the warning.

Any help would be appreciated. Thanks!

comment:8 in reply to: ↑ 7 Changed 4 years ago by anonymous

Replying to hou5e@hotmail.com:

Oops…
Trac detected an internal error:
TypeError: execv() arg 2 must contain only strings

This sounds as if the git execv invocation is being passed unicode strings, which need to be encoded to UTF-8 first.
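
A minimal sketch of that idea, assuming Python 3, where `str` plays the role of Python 2's `unicode`. `encode_argv` is a hypothetical helper, not GitPlugin code, and the git command line is only an example that is never executed here:

```python
def encode_argv(args, encoding="utf-8"):
    """Return a new argument list in which every text item is encoded to bytes."""
    encoded = []
    for arg in args:
        if isinstance(arg, str):
            arg = arg.encode(encoding)
        encoded.append(arg)
    return encoded


# Illustrative argument list; the file name is made up for the example.
argv = encode_argv(["git", "ls-tree", "HEAD", "bad \u2022.txt"])
```

Encoding every argument in one place, just before the process is spawned, avoids relying on an implicit conversion that fails for non-ASCII names.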

comment:9 in reply to: ↑ 3 ; follow-up: Changed 4 years ago by anonymous

When I try to view a diff between two (or more) changesets and the names of the changed files contain non-Latin characters, I get this error message:

Oops…
Trac detected an internal error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128)
There was an internal error in Trac. It is recommended that you inform your local Trac administrator and give him all the information he needs to reproduce the issue.
To that end, you could create a ticket:

==== How to Reproduce ====

While doing a GET operation on `/changeset`, Trac issued an internal error.

''(please provide additional details here)''

Request parameters:
{{{
{'new': u'a5413b762a963ce9d1cafe33915b268bc685d2e4@\u041f\u043e\u043b\u044c\u0437\u043e\u0432\u0430\u0442\u0435\u043b\u044e',
 'old': u'2c5ab623401e75157ba732b8cfd928cc5b5bd6e2@\u041f\u043e\u043b\u044c\u0437\u043e\u0432\u0430\u0442\u0435\u043b\u044e'}
}}}

==== System Information ====
|| '''Trac''' || `0.11.7` ||
|| '''Python''' || `2.5.5 (r255:77872, Feb 1 2010, 19:53:42) ` [[br]] `[GCC 4.4.3]` ||
|| '''setuptools''' || `0.6` ||
|| '''SQLite''' || `3.6.23` ||
|| '''pysqlite''' || `2.3.2` ||
|| '''Genshi''' || `0.5.1` ||
|| '''Pygments''' || `1.3` ||
|| '''GIT''' || `1.7.0` ||

==== Python Traceback ====
{{{
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 450, in _dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 206, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib/python2.5/site-packages/trac/versioncontrol/web_ui/changeset.py", line 321, in process_request
    self._render_html(req, repos, chgset, restricted, xhr, data)
  File "/usr/lib/python2.5/site-packages/trac/versioncontrol/web_ui/changeset.py", line 588, in _render_html
    'new': new_node and node_info(new_node, annotated),
  File "/usr/lib/python2.5/site-packages/trac/versioncontrol/web_ui/changeset.py", line 449, in node_info
    None),
  File "/usr/lib/python2.5/site-packages/trac/web/href.py", line 161, in <lambda>
    self._derived[name] = lambda *args, **kw: self(name, *args, **kw)
  File "/usr/lib/python2.5/site-packages/trac/web/href.py", line 146, in __call__
    if arg is not None])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128)
}}}
The action that triggered the error was:
GET: /changeset
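
The byte 0xd0 in the message is the first UTF-8 byte of the Cyrillic name in the request parameters. In Python 2, combining a byte string with a unicode string triggers an implicit ASCII decode, which fails on such bytes. The sketch below reproduces the same message under Python 3 by calling the decode explicitly, then shows one way to build a link from the decoded text. It is an illustration under those assumptions only: `ascii_decode_error` is a hypothetical helper and the `/changeset/` URL shape is simplified, not Trac's actual Href logic.

```python
from urllib.parse import quote

name = "\u041f\u043e\u043b\u044c\u0437\u043e\u0432\u0430\u0442\u0435\u043b\u044e"
raw = name.encode("utf-8")  # what a byte-oriented wrapper would hand over


def ascii_decode_error(data):
    """Return the error message raised when decoding data as ASCII, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = ascii_decode_error(raw)

# Decoding explicitly as UTF-8 first, then percent-encoding, avoids the error.
href = "/changeset/" + quote(raw.decode("utf-8"))
```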

comment:10 in reply to: ↑ 9 Changed 4 years ago by hvr

  • Resolution set to fixed
  • Status changed from reopened to closed
  • Trac Release changed from 0.11 to 0.12

Replying to anonymous:

When I try to view a diff between two (or more) changesets and the names of the changed files contain non-Latin characters, I get this error message:

This last one should now have been fixed with http://github.com/hvr/gitplugin/tree/v0.12.0.5

comment:11 Changed 17 months ago by anonymous

  • Resolution fixed deleted
  • Status changed from closed to reopened

trac 0.12.2 (installation upgraded from 0.11)
trac-git 0.12.0.5

This still happens when the repository contains a directory with a UTF-8 name that holds a file with a UTF-8 name, e.g.:
/РПЗ/РПЗ.odt (both the directory name and the filename contain Cyrillic letters)

Viewing trac/browser/ works, but accessing trac/browser/РПЗ causes the same error:

Trac detected an internal error: 
TypeError: execv() arg 2 must contain only strings

My installation was upgraded from trac 0.11.

comment:12 Changed 17 months ago by rjollos

This plugin isn't being maintained, but if you move to Trac 1.0 (latest is 1.0.1) you can find support for Git provided as part of Trac.
