Ticket #7046 (closed defect: wontfix)

Opened 3 years ago

Last modified 1 year ago

Support multi-repository in 0.12

Reported by: victor
Assigned to: rjollos
Priority: normal
Component: RepoSearchPlugin
Severity: normal
Keywords:
Cc:
Trac Release: 0.12

Description

Can anybody implement multi-repository support in Trac 0.12?

Change History

05/05/10 08:55:57 changed by rjollos

There is not much hope of this happening anytime soon. I adopted this plugin in hopes of getting it to a usable state, but that is bound to be a very time-consuming project. It looks like we need to start from scratch and plan a project that can overcome the limitations of this plugin.

08/24/10 08:03:43 changed by rjollos

You might take a look at #7545, for which a 0.12 patch was just committed, and discuss with the reporter of that ticket whether it is working now under 0.12.

08/24/10 11:48:15 changed by anonymous

We have this plugin working successfully under 0.12 with the #7545 patch, using just a single "default" repository (upgraded from Trac 0.11). The multi-repo support in 0.12 adds new arguments to get_repository():

def get_repository(self, reponame=None, authname=None):

So it would seem you could search multiple repositories too, as long as you have permissions. Thanks for the quick commit, btw!

08/25/10 10:21:38 changed by rjollos

  • summary changed from Support multireport in trac 0.12 to Support multi-repository in 0.12.

07/10/11 05:49:32 changed by rjollos

Hello,

I took over maintainership of this plugin from athomas some time ago. There is a significant amount of work to do on this plugin, and I don't foresee having the time to do it all.

helend has written the TracSuposePlugin, which seems like a much better solution: instead of implementing repository search from scratch, it provides a Trac interface to an existing repository search tool. Rather than throwing more effort at this plugin, I'd prefer to help helend with enhancements to the TracSuposePlugin, or spend my time on other Trac plugin projects altogether.

I'd like to get some feedback and hear if anyone knows of a compelling reason to continue this project rather than moving to the TracSuposePlugin. Is there functionality in this plugin that doesn't exist in the TracSuposePlugin? I'm open to hearing all opinions and suggestions.

I'll leave these tickets open for about a week, but in all likelihood will close all of them and deprecate the plugin.

Thanks for your time, - Ryan

01/05/12 03:22:38 changed by ejucovy

I was able to get the Repo Search plugin working across multiple Git repositories with the following patch:

Index: tracreposearch/search.py
===================================================================
--- tracreposearch/search.py	(revision 11108)
+++ tracreposearch/search.py	(working copy)
@@ -64,7 +64,15 @@
     def get_search_results(self, req, query, filters):
         if 'repo' not in filters:
             return
-        repo = self.env.get_repository(authname=req.authname)
+        from trac.versioncontrol import RepositoryManager
+        repos = RepositoryManager(self.env).get_all_repositories()
+        results = []
+        for repo in repos:
+            results.extend( list(self.get_search_results_for_repo(req, query, filters, repo)) )
+        return results
+
+    def get_search_results_for_repo(self, req, query, filters, reponame):
+        repo = self.env.get_repository(reponame=reponame, authname=req.authname)
         if not isinstance(query, list):
             query = query.split()
         query = [q.lower() for q in query]
@@ -128,6 +136,6 @@
                     if found:
                         break
 
-                yield (self.env.href.browser(node.path) + (found and '#L%i' % found or ''),
-                       node.path, change.date, change.author,
+                yield (self.env.href.browser(repo.reponame, node.path) + (found and '#L%i' % found or ''),
+                       "%s (in %s)" % (node.path, repo.reponame), change.date, change.author,
                        shorten_result(content, query))

I don't recommend committing it as-is, because I haven't yet tested it with a single-repo setup or with backends other than Git. I also haven't tested the Indexer support yet. But, I wanted to post my progress so far in case anyone else finds this useful.
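The patch above collects every repository's hits into a list before returning them. As a rough sketch of an alternative, and not part of the patch, the per-repository generators could be chained lazily so that results are produced as they are consumed. The names `search_all_repositories` and `search_one` are hypothetical stand-ins, with `search_one` playing the role of `get_search_results_for_repo`; nothing here depends on Trac.

```python
from itertools import chain


def search_all_repositories(reponames, search_one):
    """Lazily yield search hits from every repository in turn.

    `search_one` is a hypothetical callable that takes a repository name
    and returns an iterable of hits; it is not a Trac API.
    """
    # The generator expression defers each call until the previous
    # repository's hits have been consumed.
    per_repo = (search_one(name) for name in reponames)
    return chain.from_iterable(per_repo)
```

Whether this is worth doing depends on how the caller consumes the results; if it converts them to a list straight away, the saving is small.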

01/05/12 21:11:13 changed by ejucovy

Here's an updated patch that also makes the Indexer work with multiple repos:

Index: tracreposearch/indexer.py
===================================================================
--- tracreposearch/indexer.py	(revision 11108)
+++ tracreposearch/indexer.py	(working copy)
@@ -39,7 +39,10 @@
         key = key.encode('utf-8')
         if key in self._cache:
             return self._cache[key]
-        return self._cache.setdefault(key, set(self.dbm[key].decode('utf-8').split(pathsep)))
+        try:
+            return self._cache.setdefault(key, set(self.dbm[key].decode('utf-8').split(pathsep)))
+        except:
+            return []
 
     def __setitem__(self, key, value):
         key = key.encode('utf-8')
@@ -102,10 +105,12 @@
 class Indexer:
     _strip = re.compile(r'\S+',re.U)
 
-    def __init__(self, env):
+    def __init__(self, env, reponame):
         self.env = env
-        self.repo = self.env.get_repository()
 
+        from trac.versioncontrol import RepositoryManager
+        self.repo = self.env.get_repository(reponame=reponame)
+
         if not self.env.config.get('repo-search', 'index',
                                    os.getenv('PYTHON_EGG_CACHE', None)):
             raise TracError("Repository search plugin indexer is not " \
@@ -115,9 +120,12 @@
 
         # TODO Should this use the repo location as well?
         env_id = '%08x' % abs(hash(self.env.path))
-        self.index_dir = self.env.config.get('repo-search', 'index',
-                         os.path.join(os.getenv('PYTHON_EGG_CACHE', ''),
-                                      env_id + '.reposearch.idx'))
+        self.index_dir = os.path.join(
+            self.env.config.get('repo-search', 'index',
+                                os.path.join(os.getenv('PYTHON_EGG_CACHE', ''),
+                                             env_id + '.reposearch.idx')),
+            self.repo.reponame)
+            
         self.env.log.debug('Repository search index: %s' % self.index_dir)
         self.minimum_word_length = int(self.env.config.get('repo-search',
                                        'minimum-word-length', 3))
@@ -165,7 +173,7 @@
     def need_reindex(self):
         return not hasattr(self, 'meta') \
             or self.repo.youngest_rev != \
-               int(self.meta.get('last-repo-rev', -1)) \
+               self.meta.get('last-repo-rev', -1) \
             or self.env.config.get('repo-search', 'include', '') \
                != self.meta.get('index-include', '') \
             or self.env.config.get('repo-search', 'exclude', '') \
@@ -247,7 +255,8 @@
                     self._invalidate_file(node.path)
                     self._reindex_node(node)
             new_files.add(node.path)
-        
+            self.sync()
+
         # All files that don't match the new filter criteria must be purged
         # from the index
         invalidated_files = set(self.files.keys())
Index: tracreposearch/search.py
===================================================================
--- tracreposearch/search.py	(revision 11108)
+++ tracreposearch/search.py	(working copy)
@@ -64,7 +64,15 @@
     def get_search_results(self, req, query, filters):
         if 'repo' not in filters:
             return
-        repo = self.env.get_repository(authname=req.authname)
+        from trac.versioncontrol import RepositoryManager
+        repos = RepositoryManager(self.env).get_all_repositories()
+        results = []
+        for repo in repos:
+            results.extend( list(self.get_search_results_for_repo(req, query, filters, repo)) )
+        return results
+
+    def get_search_results_for_repo(self, req, query, filters, reponame):
+        repo = self.env.get_repository(reponame=reponame, authname=req.authname)
         if not isinstance(query, list):
             query = query.split()
         query = [q.lower() for q in query]
@@ -76,7 +84,7 @@
         # Use indexer if possible, otherwise fall back on brute force search.
         try:
             from tracreposearch.indexer import Indexer
-            self.indexer = Indexer(self.env)
+            self.indexer = Indexer(self.env, reponame)
             self.indexer.reindex()
             walker = lambda repo, query: [repo.get_node(filename) for filename
                                           in self.indexer.find_words(query)]
@@ -128,6 +136,6 @@
                     if found:
                         break
 
-                yield (self.env.href.browser(node.path) + (found and '#L%i' % found or ''),
-                       node.path, change.date, change.author,
+                yield (self.env.href.browser(repo.reponame, node.path) + (found and '#L%i' % found or ''),
+                       "%s (in %s)" % (node.path, repo.reponame), change.date, change.author,
                        shorten_result(content, query))

I still don't recommend committing this as-is. I'll see if I can find the time to test it against other repo backends, and against a single-repo setup.

With this patch, each distinct repository gets its own index in a separate subdirectory of the index path, and the search iterates over all the indexes.
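A minimal sketch of that per-repository layout, independent of Trac. It assumes that an unnamed default repository reports an empty name, in which case joining it onto the base path would put its index directly in the base directory alongside the other repositories' subdirectories. The helper names and the `_default` directory name are hypothetical and do not appear in the patch.

```python
import os


def index_dir_for(base_dir, reponame):
    """Return the index directory for one repository (hypothetical helper)."""
    # Assumption: the default repository has an empty name. Give it an
    # explicit subdirectory so it does not share base_dir with the others.
    return os.path.join(base_dir, reponame or '_default')


def search_all_indexes(base_dir, reponames, search_one):
    """Call search_one(index_dir) for each repository and merge the hits."""
    results = []
    for name in sorted(reponames):
        results.extend(search_one(index_dir_for(base_dir, name)))
    return results
```

A call such as `search_all_indexes('/var/idx', ['', 'proj'], my_search)` would visit `/var/idx/_default` and then `/var/idx/proj`, where `my_search` is whatever function queries a single index.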

The initial indexing can become very slow and memory-intensive here. I added an extra self.sync() call after every node is walked, in an attempt to reduce the memory usage, but it is still slow and memory-hungry. I'll see if I can come up with any improvements in this area.
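One possible middle ground between never syncing and syncing after every node is to flush in batches. The sketch below is an illustration only: `BatchedIndexWriter` is a hypothetical class, not part of the plugin, and `store` stands in for any dict-like persistent store. Whether batching actually helps would need to be measured against a real repository.

```python
class BatchedIndexWriter(object):
    """Buffer index entries and write them out every `batch_size` additions."""

    def __init__(self, store, batch_size=100):
        self.store = store            # dict-like target, e.g. a dbm handle
        self.batch_size = batch_size
        self.pending = {}             # entries not yet written to the store
        self.flushes = 0              # number of flushes, for diagnostics

    def add(self, path, words):
        self.pending[path] = words
        # Bound memory use by flushing once the buffer is full.
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        for key, value in self.pending.items():
            self.store[key] = value
        self.pending.clear()
        self.flushes += 1
```

The caller has to invoke `flush()` once more after the last node so that a partially filled batch is not lost.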

(follow-up: ↓ 9 ) 02/19/12 06:13:03 changed by rjollos

ejucovy, since you've forked this plugin, should we close this ticket now? Since you are working on improving the plugin, I'm inclined to contribute my future efforts towards improving your fork rather than continue to work on this plugin.

(in reply to: ↑ 8 ; follow-up: ↓ 10 ) 02/22/12 16:51:19 changed by ejucovy

  • status changed from new to closed.
  • resolution set to wontfix.

Replying to rjollos:

ejucovy, since you've forked this plugin, should we close this ticket now?

Yes, I think that makes sense. As you can probably tell, the MultiRepoSearchPlugin grew out of my patch on this ticket; it ended up feeling like too big a design change to implement incrementally with confidence.

So I think the right resolution to this ticket is "if you are using Trac 0.12+ with non-SVN repos and/or multiple repos, try using MultiRepoSearchPlugin."

Since you are working on improving the plugin, I'm inclined to contribute my future efforts towards improving your fork rather than continue to work on this plugin.

That would be great!

(in reply to: ↑ 9 ) 02/22/12 19:26:13 changed by rjollos

Replying to ejucovy:

So I think the right resolution to this ticket is "if you are using Trac 0.12+ with non-SVN repos and/or multiple repos, try using MultiRepoSearchPlugin."

Great. I added that to the wiki page.

