
Opened 10 years ago

Closed 19 months ago

#746 closed task (wontfix)

Improve performance of GitPlugin

Reported by: hvr Owned by: stuge
Priority: high Component: GitPlugin
Severity: critical Keywords:
Cc: gseanmcg@…, stuge Trac Release: 0.10

Description

The current implementation relies heavily on git-core executables, so every information retrieval that needs to access the git repository costs a fork(), which makes it quite an expensive operation. As a result, performance is quite low right now.

Better performance should be achievable by:

  • caching metadata
  • optimizing GitNode.get_content_length()
  • accessing the git repository more efficiently

Attachments (0)

Change History (15)

comment:1 Changed 10 years ago by hvr

  • Priority changed from normal to high
  • Severity changed from normal to critical
  • Status changed from new to assigned

comment:2 Changed 10 years ago by hvr

(In [1321]) performance improvements

addresses #746

comment:3 Changed 10 years ago by hvr

(In [1322]) yet another minor performance improvement for big repositories;

addresses #746

comment:4 Changed 9 years ago by hvr

(In [3185])

  • changed next revision/previous revision to point to the next/previous changeset in a flattened history as provided by git-rev-list --all
  • implemented in-memory commit tree cache in order to speed up typical Trac repository access patterns (addresses #746)
  • allow wrapping GitRepository in a CachedRepository, thus storing metadata in Trac's SQL db (addresses #746)
  • implemented new [git] options to control caching:
    [git]
    cached_repository = true
    persistent_cache = true
  • various other fixes and cleanups
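As an illustration of the in-memory commit tree cache idea, here is a minimal sketch, not the plugin's actual code. It assumes the cache is filled from the text output of `git rev-list --all --parents` (one line per commit: the commit id followed by its parent ids, newest first). The names `build_commit_cache` and `neighbours` are hypothetical.

```python
def build_commit_cache(rev_list_output):
    """Parse `git rev-list --all --parents` output into lookup tables.

    Returns (order, parents, children):
      order    - commit ids in the order git printed them (newest first)
      parents  - commit id -> list of parent ids
      children - commit id -> list of child ids
    """
    order, parents, children = [], {}, {}
    for line in rev_list_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        sha, parent_ids = fields[0], fields[1:]
        order.append(sha)
        parents[sha] = parent_ids
        children.setdefault(sha, [])
        for parent in parent_ids:
            children.setdefault(parent, []).append(sha)
    return order, parents, children


def neighbours(order, sha):
    """Return (newer, older) neighbours of sha in the flattened history."""
    i = order.index(sha)
    newer = order[i - 1] if i > 0 else None
    older = order[i + 1] if i + 1 < len(order) else None
    return newer, older
```

With tables like these, parent, child, and next/previous lookups need no further git process once the single rev-list call has been parsed. A real cache would also have to be refreshed when new commits arrive, which this sketch does not cover.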

comment:5 Changed 9 years ago by hvr

(In [3199]) GitPlugin: call popen3 with sequence instead of string as command, in order to avoid shell-overhead (addresses #746)
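The difference between passing a sequence and passing a string can be sketched as follows. This example uses the `subprocess` module rather than the popen3 call mentioned in the commit, and the helper names `git_argv` and `run_argv` are hypothetical. When the command is a list, the program is executed directly and no intermediate shell is started.

```python
import subprocess


def git_argv(git_bin, git_dir, *args):
    """Build an argument list for a git call (hypothetical helper)."""
    return [git_bin, "--git-dir=" + git_dir] + list(args)


def run_argv(argv):
    """Run a command given as a sequence; no shell is spawned."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out, err
```

A call such as `run_argv(git_argv("/usr/bin/git", "/path/to/repo/.git", "rev-list", "--all"))` (paths are placeholders) would then start git without going through a shell. A side benefit is that arguments need no shell quoting.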

comment:6 Changed 9 years ago by hvr

additional notes:

  • try to create an additional libgit-thin-based Storage in PyGIT.py to evaluate whether the exec+fork+parse overhead is still significant
  • most external git calls can now be avoided thanks to extensive caching of metadata; one of the remaining speed killers is listing directories in the source browser (the cat-file -s calls can be optimized away by using the -l option to ls-tree, available in Git 1.5.3+; but having to call rev-list for each folder element still remains an issue... maybe it will be possible to get an enhancement to ls-tree merged upstream to provide that information as well...)
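To illustrate the ls-tree idea: `git ls-tree -l` prints the blob size next to each entry, so one call per directory can replace one `cat-file -s` call per file. The parser below is a rough sketch with a hypothetical name, `parse_ls_tree_long`. It assumes the default text output (mode, type, object id, and size, then a tab and the file name) and does not handle the quoting git applies to unusual file names unless `-z` is used.

```python
def parse_ls_tree_long(output):
    """Parse the text output of `git ls-tree -l <tree-ish>`.

    Each line looks like: "<mode> <type> <object> <size>\t<name>".
    Entries that are not blobs (e.g. trees) report "-" as size,
    which is mapped to None here.
    """
    entries = []
    for line in output.splitlines():
        if not line:
            continue  # skip blank lines
        meta, name = line.split("\t", 1)
        mode, kind, sha, size = meta.split()
        entries.append({
            "mode": mode,
            "type": kind,
            "sha": sha,
            "size": None if size == "-" else int(size),
            "name": name,
        })
    return entries
```

This covers only the file sizes. The per-entry rev-list call mentioned above (to find the last change of each folder element) is a separate cost that this sketch does not address.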

comment:7 Changed 9 years ago by osmaker+gitplugin@…

I thought I should mention this since hvr is considering a similar thing with libgit-thin.

I've been working on a small C library to help with performance on this and another project. (I chose not to use libgit-thin, as it has memory leaks which can't be fixed without some significant changes to Git itself.)

The goals for it are simple:

  • Read Git objects (no write support).
  • Do so with high performance, providing various methods which can be tuned to the specific application/command.

I've created a Python wrapper around the library and have implemented some basic Git commands with it in Python (ls-tree, rev-list, cat-file, etc). Already, most calls are orders of magnitude faster than using the git binaries via popen3.

I plan to have a public, working version available by the end of April. (And hopefully some simple patches for low-performance areas of GitPlugin.)

comment:8 follow-up: Changed 8 years ago by jarin.franek@…

I tried Trac with the git plugin on the Linux kernel git repo. Without the cache (i.e. cached_repository=false, persistent_cache=false), everything worked well. Turning the cache on, however, gave me a headache: the timeline did not load within 10 minutes of 100% CPU load (I gave up then). Browsing the source showed only revisions 3 or 4 years old (still at 100% CPU).

I guess there are some bugs in the caching code. Should I file a new bug report, or is it fine to leave it on this ticket (as it is about performance)?

To replicate the issue:

  1. trac 0.11.2.1, git-plugin 0.11.0.1 svn5076, git 1.6.0.6, python 2.5.2 installed
  2. git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-2.6
  3. mkdir trac
  4. trac-admin ./trac initenv (fill in data appropriately)
  5. in the trac environment config file:
    1. enable git plugin
    2. enable cache
    3. set git_bin to the full path of the git binary (it ceases to work with only 'git' there, at least for me)
  6. launch tracd --port <your.choice> <path_to_the_trac_env>
  7. try to access the project timeline page via e.g. Firefox,
  8. 100% CPU, no page shown...

Alternatively:

  7b. Try to access the code browser at http://localhost:<your.port>/trac/browser
  8b. Very old revisions are shown, 100% CPU

Note that with a small git repo I had no difficulties with using cache.

comment:9 in reply to: ↑ 8 Changed 7 years ago by georgyo

I was having serious performance issues, and caching was the problem.

Once a repo grows past 500 commits, Trac starts to become very slow if caching is enabled.

comment:10 Changed 6 years ago by anonymous

  • Cc gseanmcg@… added; anonymous removed

comment:11 Changed 6 years ago by anonymous

  • Cc stuge added

comment:12 Changed 5 years ago by anonymous

FYI, https://github.com/hvr/trac-git-plugin/pull/12 was just merged, which is supposed to have a positive effect on performance.

comment:13 Changed 4 years ago by stuge

  • Owner changed from hvr to stuge
  • Status changed from assigned to new

I started work on a libgit2/pygit2-based rewrite of the plugin. The old plugin has meanwhile been included in Trac trunk, and my work to replace it is also being done against Trac trunk. I guess that this ticket should be closed, and interested parties should monitor the following pages for progress:

http://trac.edgewall.org/wiki/TracDev/Performance/Git

http://trac.edgewall.org/ticket/10594#comment:6

comment:14 Changed 4 years ago by stuge

  • Status changed from new to assigned

comment:15 Changed 19 months ago by rjollos

  • Resolution set to wontfix
  • Status changed from assigned to closed

GitPlugin is deprecated. Please upgrade to Trac 1.0 and use TracGit.
