Opened 11 years ago
Last modified 11 years ago
#11501 new defect
Error when migrating from sqlite to postgresql
Reported by: | Andrew | Owned by: | ejucovy |
---|---|---|---|
Priority: | normal | Component: | MultiRepoSearchPlugin |
Severity: | normal | Keywords: | |
Cc: | Trac Release: |
Description (last modified by )
Hi,
I'm new to Trac but after using the MultiRepoSearchPlugin I find that the searches are pretty slow. I'm hoping that switching to a real database will speed up searches (full text indexes, etc).
Attempting to issue a migrate to switch over to postresql gives this error:
Copying tables: attachment table... 0 records. auth_cookie table... 1 records. cache table... 3 records. component table... 4 records. enum table... 13 records. mastertickets table... 0 records. milestone table... 0 records. node_change table... 9641 records. permission table... 30 records. report table... 8 records. repository table... 8 records. OperationalError: Could not decode to UTF-8 column 'contents' with text 'OggS' repository_node table...
The same error occurs when attempting to use MySQL as well (which is what I prefer).
-Andrew
Attachments (0)
Change History (6)
comment:1 Changed 11 years ago by
Component: | TracMigratePlugin → MultiRepoSearchPlugin |
---|---|
Description: | modified (diff) |
Owner: | changed from Jun Omae to ejucovy |
comment:2 Changed 11 years ago by
What do you recommend to fix this? Would be safe to truncate the repository_node.contents table then just resync the search plugin (or find another since this one looks like it's not excluding binary files) after the migration?
-Andrew
comment:3 Changed 11 years ago by
So I removed the contents column from repository_node and now I'm getting this error:
TypeError: unsupported operand type(s) for /: 'int' and 'NoneType'
Any ideas?
-Andrew
comment:4 Changed 11 years ago by
Never mind! Got it figured out, I forgot to change the collation of the MySQL DB to utf8mb4_bin.
So if anyone else has this issue remove the repository_node.contents column and if using MySQL then make sure you change the DB collation to utf8mb4_bin.
-Andrew
comment:5 Changed 11 years ago by
Thanks jun66j5 - I should have realized that passing the result of to_unicode
directly to a database cursor wouldn't be a great idea. :-P I'll look into the proper way to fix this for MultiRepoSearchPlugin.
Andrew, just want to (belatedly) confirm that your approach is a good workaround - since MultiRepoSearchPlugin just uses its database tables to cache repository contents, you can safely clear them out altogether (both repository_node
and repository_version
) and then re-run multireposearch reindex_all
after your migration to repopulate those tables.
I'll also look into automatically excluding binary files in the next release. (Note to self: trac.mimeview.api.is_binary
sure looks promising.)
Lastly, I should warn that searches can still be pretty slow -- I also migrated my Trac site from sqlite (to postgres) for the same reason, and it's certainly an improvement over the sqlite performance, but the plugin itself doesn't currently do anything to help you set up full-text indexes etc. So by default even on MySQL or Postgres your searches will basically be iterating over all rows in repository_node
with LIKE statements. If you're comfortable setting up a full-text index directly in your database backend against repository_node.contents
, that will likely give much better performance. I haven't had time to look into this (let alone benchmark it) so I've just been living with slow searches on my own Trac installation, but if you do set this up I'd be very eager to hear what you did to set up the indexes and whether it improved performance, if only to add to the "Installation" docs.
comment:6 Changed 11 years ago by
ejucovy when do you plan on releasing a new version to exclude binary files?
When I originally used the plugin it caused the sqlite DB to increase to 13GiB in size! Once the binary issue is figured out I'll re-enable the plugin and create some indexes on the tables but since Trac uses InnoDB tables, the fulltext searches will work completely different:
http://dev.mysql.com/doc/refman/5.6/en/innodb-table-and-index.html#innodb-fulltext-index http://www.mysqlperformanceblog.com/2013/03/04/innodb-full-text-search-in-mysql-5-6-part-2-the-queries/
There might need to be some rejigging of the SQL queries depending on if the DB is sqlite, MySQL or PostgreSQL.
Also do you have any tips for development of Trac plugins? Should I use easy_install to remove the old version, make changes to a copy then reinstall using easy_install. Or is it possible to modify some Python source of the plugin on the fly? I'm having trouble finding where easy_install puts the plugin files, and the documentation for Trac isn't really the greatest!
-Andrew
That is a MultiRepoSearchPlugin issue. The plugin creates
repository_node.contents
asTEXT
column and stores contents of files in the repositories into the column. The column must be stored utf-8 string cause ofTEXT
column, not binary. However, the plugin would store the contents without changes torepository_node.contents
.