Opened 12 years ago
Closed 12 years ago
#10619 closed defect (fixed)
initial indexing of a repository fails (postgres)
Reported by: | anonymous | Owned by: | ejucovy |
---|---|---|---|
Priority: | highest | Component: | MultiRepoSearchPlugin |
Severity: | blocker | Keywords: | |
Cc: | Trac Release: | 0.12 |
Description
I just installed MultiRepoSearchPlugin and after following the instructions I got to the following error:
omega# trac-admin /var/trac/tcsnp multireposearch reindex_all Repo DOES need reindexing Fetching content at development/tcsnp/Makefile.PL ... ... OperationalError: index row size 7280 exceeds btree maximum, 2713 HINT: Values larger than 1/3 of a buffer page cannot be indexed. Consider a function index of an MD5 hash of the value, or use full text indexing.
Upon further investigation, it seems this is a pretty common error in PostgreSQL, when trying to index a large TEXT column with a btree-type index:
http://archives.postgresql.org/pgsql-general/2011-03/msg00436.php
https://code.djangoproject.com/ticket/14904
Is this related to the way that MultiRepoSearchPlugin stores data in the schema? (e.g. can it be changed to make it work in Postgres?)
Or maybe you can use full text indexing (instead of btree-indexing) to achieve the same?
Attachments (0)
Change History (4)
comment:1 Changed 12 years ago by
Status: | new → assigned |
---|
comment:2 Changed 12 years ago by
Typo in the above comment -- the name of the index that you should drop is probably repository_node_contents_idx
(not repository_node_text_idx
)
comment:3 Changed 12 years ago by
I'll actually be migrating my own Trac installation at work from sqlite to postgres over the next week or so, so I'll have a chance to reproduce this bug and test the fix myself if you prefer to wait.
comment:4 Changed 12 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I've migrated my own Trac instances to postgres after making the change that causes the index to no longer be created for new installations. I've confirmed that this does work.
So, to fix an existing installation that uses postgres and is getting this error, execute the following postgres statement in your trac db:
drop index repository_node_contents_idx;
(I'm not 100% sure that this is always the name of the index, so you might need to use a psql \d repository_node
to find it if the above doesn't work.)
Thanks for the report and very helpful research. The code creates an index on the
repository_node.contents
column since that column is used in the searches, but I admit I didn't give it a lot of thought up front. It actually seems that the default btree index for this text column is completely [worse than] useless for the sorts of text searches the plugin does, since they're not just searching for patterns at the beginning of the string.So, you're completely right that full text indexing would be preferable to btree indexing. I'll need to do more research to figure out how to actually implement that. But in the meantime, I think that removing the indexing altogether won't hurt the performance of the plugin at all, and will bring added benefits like less space consumed, and the ability for you to use it at all. :-)
I've pushed a change to the code to remove the index. I haven't tagged a new release yet, or defined a migration script, because I'd like to make sure this approach helps before releasing it. Could you try re-installing the plugin with this change? Actually the easiest way to test it would be either
drop table repository_node
anddrop table repository_version
, then install the plugin from the git repository's master branch, and re-run the setup;drop index
command -- I'm not sure what the name of the index is (probablyrepository_node_contents_idx
but I'm not 100% sure) so you might need to use a psql\d repository_node
or something to find it.Either way, once you've removed the offending index, could you try re-running the
reindex_all
command and see if the failure goes away?