Opened 4 years ago

Closed 2 years ago

Last modified 2 years ago

#7046 closed defect (wontfix)

Support multi-repository in 0.12

Reported by: victor
Owned by: rjollos
Priority: normal
Component: RepoSearchPlugin
Severity: normal
Keywords:
Cc:
Trac Release: 0.12

Description

Can anybody implement multi-repository support in Trac 0.12?

Attachments (0)

Change History (10)

comment:1 Changed 4 years ago by rjollos

There is not much hope of this happening anytime soon. I adopted this plugin in the hope of getting it to a usable state, but that is bound to be a very time-consuming project. It looks like we need to start from scratch and plan a project that can overcome this plugin's limitations.

comment:2 Changed 4 years ago by rjollos

You might take a look at #7545, for which a 0.12 patch was just committed, and discuss with the reporter of that ticket whether it is working now under 0.12.

comment:3 Changed 4 years ago by anonymous

We have this plugin working successfully under 0.12 with the #7545 patch, using just a single "default" repository (upgraded from Trac 0.11). The multi-repo support adds new arguments to get_repository():

def get_repository(self, reponame=None, authname=None):

So it would seem you could search multiple repositories too, as long as you have permissions. Thanks for the quick commit, btw!

comment:4 Changed 4 years ago by rjollos

  • Summary changed from Support multireport in trac 0.12 to Support multi-repository in 0.12

comment:5 Changed 3 years ago by rjollos

Hello,

I took over maintainership of this plugin from athomas some time ago. There is a significant amount of work to do on this plugin, and I don't foresee having the time to do it all.

helend has written the TracSuposePlugin, which seems like a much better solution. Rather than writing the repository search functionality from scratch, a Trac interface to an existing repository search tool has been created. Rather than throwing more effort at this plugin, I'd prefer to help helend with enhancements to the TracSuposePlugin, or spend my time on other Trac plugin projects altogether.

I'd like to get some feedback and hear if anyone knows of a compelling reason to continue this project rather than moving to the TracSuposePlugin. Is there functionality in this plugin that doesn't exist in the TracSuposePlugin? I'm open to hearing all opinions and suggestions.

I'll leave these tickets open for about a week, but in all likelihood will close all of them and deprecate the plugin.

Thanks for your time,

Ryan

comment:6 Changed 2 years ago by ejucovy

I was able to get the Repo Search plugin working across multiple Git repositories with the following patch:

  • tracreposearch/search.py

     
    @@ -64,7 +64,15 @@
         def get_search_results(self, req, query, filters):
             if 'repo' not in filters:
                 return
    -        repo = self.env.get_repository(authname=req.authname)
    +        from trac.versioncontrol import RepositoryManager
    +        repos = RepositoryManager(self.env).get_all_repositories()
    +        results = []
    +        for repo in repos:
    +            results.extend( list(self.get_search_results_for_repo(req, query, filters, repo)) )
    +        return results
    +
    +    def get_search_results_for_repo(self, req, query, filters, reponame):
    +        repo = self.env.get_repository(reponame=reponame, authname=req.authname)
             if not isinstance(query, list):
                 query = query.split()
             query = [q.lower() for q in query]
    @@ -128,6 +136,6 @@
                         if found:
                             break
     
    -                yield (self.env.href.browser(node.path) + (found and '#L%i' % found or ''),
    -                       node.path, change.date, change.author,
    +                yield (self.env.href.browser(repo.reponame, node.path) + (found and '#L%i' % found or ''),
    +                       "%s (in %s)" % (node.path, repo.reponame), change.date, change.author,
                            shorten_result(content, query))

I don't recommend committing it as-is, because I haven't yet tested it with a single-repo setup or with backends other than Git. I also haven't tested the Indexer support yet. Still, I wanted to post my progress so far in case anyone else finds it useful.

comment:7 Changed 2 years ago by ejucovy

Here's an updated patch that also makes the Indexer work with multiple repos:

  • tracreposearch/indexer.py

     
    @@ -39,7 +39,10 @@
             key = key.encode('utf-8')
             if key in self._cache:
                 return self._cache[key]
    -        return self._cache.setdefault(key, set(self.dbm[key].decode('utf-8').split(pathsep)))
    +        try:
    +            return self._cache.setdefault(key, set(self.dbm[key].decode('utf-8').split(pathsep)))
    +        except:
    +            return []
     
         def __setitem__(self, key, value):
             key = key.encode('utf-8')
    @@ -102,10 +105,12 @@
     class Indexer:
         _strip = re.compile(r'\S+',re.U)
     
    -    def __init__(self, env):
    +    def __init__(self, env, reponame):
             self.env = env
    -        self.repo = self.env.get_repository()
     
    +        from trac.versioncontrol import RepositoryManager
    +        self.repo = self.env.get_repository(reponame=reponame)
    +
             if not self.env.config.get('repo-search', 'index',
                                        os.getenv('PYTHON_EGG_CACHE', None)):
                 raise TracError("Repository search plugin indexer is not " \
    @@ -115,9 +120,12 @@
     
             # TODO Should this use the repo location as well?
             env_id = '%08x' % abs(hash(self.env.path))
    -        self.index_dir = self.env.config.get('repo-search', 'index',
    -                         os.path.join(os.getenv('PYTHON_EGG_CACHE', ''),
    -                                      env_id + '.reposearch.idx'))
    +        self.index_dir = os.path.join(
    +            self.env.config.get('repo-search', 'index',
    +                                os.path.join(os.getenv('PYTHON_EGG_CACHE', ''),
    +                                             env_id + '.reposearch.idx')),
    +            self.repo.reponame)
    +
             self.env.log.debug('Repository search index: %s' % self.index_dir)
             self.minimum_word_length = int(self.env.config.get('repo-search',
                                            'minimum-word-length', 3))
    @@ -165,7 +173,7 @@
         def need_reindex(self):
             return not hasattr(self, 'meta') \
                 or self.repo.youngest_rev != \
    -               int(self.meta.get('last-repo-rev', -1)) \
    +               self.meta.get('last-repo-rev', -1) \
                 or self.env.config.get('repo-search', 'include', '') \
                    != self.meta.get('index-include', '') \
                 or self.env.config.get('repo-search', 'exclude', '') \
    @@ -247,7 +255,8 @@
                         self._invalidate_file(node.path)
                         self._reindex_node(node)
                 new_files.add(node.path)
    -
    +            self.sync()
    +
             # All files that don't match the new filter criteria must be purged
             # from the index
             invalidated_files = set(self.files.keys())
  • tracreposearch/search.py

     
    @@ -64,7 +64,15 @@
         def get_search_results(self, req, query, filters):
             if 'repo' not in filters:
                 return
    -        repo = self.env.get_repository(authname=req.authname)
    +        from trac.versioncontrol import RepositoryManager
    +        repos = RepositoryManager(self.env).get_all_repositories()
    +        results = []
    +        for repo in repos:
    +            results.extend( list(self.get_search_results_for_repo(req, query, filters, repo)) )
    +        return results
    +
    +    def get_search_results_for_repo(self, req, query, filters, reponame):
    +        repo = self.env.get_repository(reponame=reponame, authname=req.authname)
             if not isinstance(query, list):
                 query = query.split()
             query = [q.lower() for q in query]
    @@ -76,7 +84,7 @@
             # Use indexer if possible, otherwise fall back on brute force search.
             try:
                 from tracreposearch.indexer import Indexer
    -            self.indexer = Indexer(self.env)
    +            self.indexer = Indexer(self.env, reponame)
                 self.indexer.reindex()
                 walker = lambda repo, query: [repo.get_node(filename) for filename
                                               in self.indexer.find_words(query)]
    @@ -128,6 +136,6 @@
                         if found:
                             break
     
    -                yield (self.env.href.browser(node.path) + (found and '#L%i' % found or ''),
    -                       node.path, change.date, change.author,
    +                yield (self.env.href.browser(repo.reponame, node.path) + (found and '#L%i' % found or ''),
    +                       "%s (in %s)" % (node.path, repo.reponame), change.date, change.author,
                            shorten_result(content, query))

I still definitely don't recommend committing this as-is. I'll see if I can find the time to test it against other repo backends, and against a single-repo setup.

With this patch, each distinct repository gets its own index in a separate subdirectory of the index path, and the search iterates over all the indexes.
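As a rough illustration of that layout (not the plugin's actual code), the sketch below derives one index directory per repository under a shared base path and merges hits across all of them. The names index_dir_for and search_all, the "(default)" placeholder for an empty repository name, and the in-memory indexes mapping are all hypothetical stand-ins; the patch itself goes through the Indexer class and self.repo.reponame. It is written for a current Python rather than the Python 2 the plugin targets.

```python
import os


def index_dir_for(base_dir, reponame):
    """Return the per-repository index directory under base_dir.

    Assumption: an empty reponame (Trac's default repository) is mapped
    to the placeholder "(default)" so it still gets its own subdirectory.
    The patch above simply joins the base path with repo.reponame.
    """
    return os.path.join(base_dir, reponame or "(default)")


def search_all(indexes, word):
    """Collect (reponame, path) hits for word across every repository index.

    indexes is a hypothetical mapping: reponame -> {word: set of paths}.
    """
    results = []
    for reponame in sorted(indexes):
        for path in sorted(indexes[reponame].get(word, ())):
            results.append((reponame, path))
    return results


# Two made-up repositories, each with its own small word index.
indexes = {
    "projA": {"parser": {"src/parse.py"}},
    "projB": {"parser": {"lib/parser.c", "lib/lexer.c"}, "lexer": {"lib/lexer.c"}},
}
hits = search_all(indexes, "parser")
```

Keeping the indexes separate means one repository can be reindexed or dropped without touching the others, at the cost of one lookup per repository on every search.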

The initial indexing can become very slow and memory-intensive here. I added an extra self.sync() call after every node is walked, in an attempt to reduce the memory usage, but it's still pretty intense and really slow. I'll see if I can come up with any improvements in this area.
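One possible middle ground, sketched here under assumptions rather than taken from the plugin, is to flush the index every N nodes instead of after every single one. BatchedIndex, flush_every, and the dict-backed store are hypothetical; in the patch the flush step corresponds to self.sync(). Whether this actually helps would need measuring against a real repository.

```python
class BatchedIndex:
    """Buffers index entries in memory and writes them out in batches."""

    def __init__(self, store, flush_every=100):
        self.store = store              # stand-in for the on-disk index (a dict here)
        self.flush_every = flush_every  # number of nodes to buffer between flushes
        self.pending = {}               # entries not yet written to the store
        self.flush_count = 0
        self._since_flush = 0

    def add(self, path, words):
        """Buffer the words for one node; flush once the batch is full."""
        self.pending[path] = set(words)
        self._since_flush += 1
        if self._since_flush >= self.flush_every:
            self.sync()

    def sync(self):
        """Write pending entries to the store and release the buffer."""
        if self.pending:
            self.store.update(self.pending)
            self.pending = {}
            self.flush_count += 1
        self._since_flush = 0


store = {}
index = BatchedIndex(store, flush_every=2)
for i in range(5):
    index.add("file%d.py" % i, ["word%d" % i])
index.sync()  # flush whatever is left after the walk
```

A smaller flush_every bounds memory more tightly but pays the write cost more often; syncing after every node, as the patch does, is the flush_every=1 extreme.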

comment:8 follow-up: Changed 2 years ago by rjollos

ejucovy, since you've forked this plugin, should we close this ticket now? Since you are working on improving the plugin, I'm inclined to contribute my future efforts towards improving your fork rather than continue to work on this plugin.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 2 years ago by ejucovy

  • Resolution set to wontfix
  • Status changed from new to closed

Replying to rjollos:

ejucovy, since you've forked this plugin, should we close this ticket now?

Yes, I think that makes sense. As you can probably tell, the MultiRepoSearchPlugin grew out of my patch on this ticket. It ended up feeling like too big a design change to implement incrementally with confidence.

So I think the right resolution to this ticket is "if you are using Trac 0.12+ with non-SVN repos and/or multiple repos, try using MultiRepoSearchPlugin."

Since you are working on improving the plugin, I'm inclined to contribute my future efforts towards improving your fork rather than continue to work on this plugin.

That would be great!

comment:10 in reply to: ↑ 9 Changed 2 years ago by rjollos

Replying to ejucovy:

So I think the right resolution to this ticket is "if you are using Trac 0.12+ with non-SVN repos and/or multiple repos, try using MultiRepoSearchPlugin."

Great. I added that to the wiki page.
