Opened 11 years ago

Closed 2 years ago

#746 closed task (wontfix)

Improve performance of GitPlugin

Reported by: Herbert Valerio Riedel
Owned by: stuge
Priority: high
Component: GitPlugin
Severity: critical
Keywords:
Cc: gseanmcg@…, stuge
Trac Release: 0.10

Description

The current implementation relies heavily on git-core executables, so every information retrieval that needs to access the git repository costs a fork(), which makes it quite an expensive operation. As a result, performance is quite low right now.

Better performance should be achievable by:

  • caching metadata
  • optimizing GitNode.get_content_length()
  • accessing the git repository more efficiently
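
As a rough illustration of the first point, the sketch below memoizes the output of git commands so that a repeated query does not cost another fork(). The names run_git and MetadataCache are hypothetical and are not taken from GitPlugin; the approach is only safe for queries whose answer cannot change, such as lookups keyed by a full SHA-1.

```python
import subprocess
from typing import Callable, Dict, Sequence, Tuple


def run_git(argv: Sequence[str]) -> str:
    """Run one command; every call pays for a fork()/exec()."""
    return subprocess.run(
        list(argv), capture_output=True, text=True, check=True
    ).stdout


class MetadataCache:
    """Hypothetical memoizing wrapper (not GitPlugin's actual cache)."""

    def __init__(self, runner: Callable[[Sequence[str]], str] = run_git) -> None:
        self._runner = runner
        self._results: Dict[Tuple[str, ...], str] = {}

    def query(self, *argv: str) -> str:
        # Only spawn a process on a cache miss.
        if argv not in self._results:
            self._results[argv] = self._runner(argv)
        return self._results[argv]
```

With the default runner, cache.query("git", "cat-file", "-s", sha) would spawn git once per distinct sha and answer later lookups from memory. The runner is injectable so the caching logic can be exercised without a repository.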

Attachments (0)

Change History (15)

comment:1 Changed 11 years ago by Herbert Valerio Riedel

Priority: normal → high
Severity: normal → critical
Status: new → assigned

comment:2 Changed 11 years ago by Herbert Valerio Riedel

(In [1321]) performance improvements

addresses #746

comment:3 Changed 11 years ago by Herbert Valerio Riedel

(In [1322]) yet another minor performance improvement for big repositories;

addresses #746

comment:4 Changed 9 years ago by Herbert Valerio Riedel

(In [3185])

  • changed next revision/previous revision to point to the next/previous changeset in a flattened history as provided by git-rev-list --all
  • implemented in-memory commit tree cache in order to speed up typical Trac repository access patterns (addresses #746)
  • allow wrapping GitRepository in a CachedRepository, and thus storing meta-data in Trac's sql db (addresses #746)
  • implemented new [git] options to control caching:
    [git]
    cached_repository = true
    persistent_cache = true
  • various other fixes and cleanups
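
The next/previous navigation over a flattened history can be sketched roughly as follows. This is an illustration only, not the code from the changeset: build_neighbors is a hypothetical helper, and it assumes the revision list is ordered newest first, which is the default order of git rev-list output.

```python
from typing import Dict, List, Optional, Tuple


def build_neighbors(
    revs: List[str],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each revision to (previous, next) in a newest-first flat list."""
    neighbors: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for i, rev in enumerate(revs):
        # The older changeset follows in the list, the newer one precedes it.
        older = revs[i + 1] if i + 1 < len(revs) else None
        newer = revs[i - 1] if i > 0 else None
        neighbors[rev] = (older, newer)
    return neighbors
```

Building the mapping once and keeping it in memory means a previous/next lookup becomes a dictionary access instead of another external git call.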

comment:5 Changed 9 years ago by Herbert Valerio Riedel

(In [3199]) GitPlugin: call popen3 with sequence instead of string as command, in order to avoid shell-overhead (addresses #746)
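
For readers on current Python, where popen3 is no longer available, the same idea can be sketched with the standard subprocess module: passing the command as a sequence runs the program directly, without an intermediate shell. The helper name run_argv is hypothetical and this is not the plugin's code.

```python
import subprocess
import sys
from typing import Sequence


def run_argv(argv: Sequence[str]) -> str:
    """Run a command given as an argument list; no shell is spawned."""
    completed = subprocess.run(
        list(argv), capture_output=True, text=True, check=True
    )
    return completed.stdout


# With git this would look like: run_argv(["git", "rev-list", "--all"])
# The same call works for any executable, e.g. the running interpreter:
# run_argv([sys.executable, "-c", "print('hello')"])
```

Besides skipping the extra shell process, the list form avoids quoting problems with paths or revision names that contain spaces or shell metacharacters.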

comment:6 Changed 9 years ago by Herbert Valerio Riedel

additional notes:

  • try to create an additional libgit-thin-based Storage in PyGIT.py to evaluate whether the exec+fork+parse overhead is still significant
  • most external git calls can be avoided now thanks to extensive caching of meta-data; one of the remaining speed killers is listing directories in the source browser (cat-file -s calls can be optimized, by using GIT 1.5.3+ -l option to ls-tree; but having to call rev-list for each folder element still remains an issue... maybe it'll be possible to get an enhancement to ls-tree merged upstream to provide that information as well...)
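
The ls-tree -l idea can be sketched as a small parser. With -l (--long), each line has the form "mode type sha size", then a TAB, then the file name, and entries without a blob size show "-". The names TreeEntry and parse_ls_tree_long are hypothetical, and the hashes used in the usage note are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class TreeEntry(NamedTuple):
    mode: str
    kind: str
    sha: str
    size: Optional[int]  # None when git prints "-" (e.g. for sub-trees)
    name: str


def parse_ls_tree_long(output: str) -> List[TreeEntry]:
    """Parse the text produced by `git ls-tree -l <tree-ish>`."""
    entries: List[TreeEntry] = []
    for line in output.splitlines():
        if not line:
            continue
        # Metadata and file name are separated by a single TAB.
        meta, name = line.split("\t", 1)
        mode, kind, sha, size = meta.split()
        entries.append(
            TreeEntry(mode, kind, sha, None if size == "-" else int(size), name)
        )
    return entries
```

For example, the line "100644 blob 0123456789abcdef0123456789abcdef01234567    1234<TAB>README" yields an entry with size 1234, so one ls-tree call provides the sizes that would otherwise need one cat-file -s call per file. File names with unusual characters are quoted by git unless -z is used, which this sketch does not handle.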

comment:7 Changed 9 years ago by osmaker+gitplugin@…

I thought I should mention this since hvr is considering a similar thing with libgit-thin.

I've been working on a small C library to help with performance on this and another project. (I chose not to use libgit-thin, as it has memory leaks which can't be fixed without some significant changes to Git itself.)

The goals for it are simple:

  • Read Git objects (no write support).
  • Do so with high performance, providing various methods which can be tuned to the specific application/command.

I've created a Python wrapper around the library and have implemented some basic Git commands with it in Python (ls-tree, rev-list, cat-file, etc). Already, most calls are orders of magnitude faster than using the git binaries via popen3.

I plan to have a public, working version available by the end of April. (And hopefully some simple patches for low-performance areas of GitPlugin.)

comment:8 Changed 8 years ago by jarin.franek@…

I tried Trac with the git plugin on the Linux kernel git repo. Without the cache (i.e. cached_repository=false, persistent_cache=false), everything worked well. Turning the cache on, however, caused trouble: the timeline did not load within 10 minutes of 100% CPU load (I gave up then). Browsing the source showed only revisions 3 or 4 years old (still at 100% CPU).

I guess there are some bugs in the caching code. Should I file a new bug report for this, or is it fine to leave it on this ticket (as it is about performance)?

To replicate the issue:

  1. trac 0.11.2.1, git-plugin 0.11.0.1 svn5076, git 1.6.0.6, python 2.5.2 installed
  2. git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-2.6
  3. mkdir trac
  4. trac-admin ./trac initenv (fill in the data appropriately)
  5. in the trac environment config file:
    1. enable git plugin
    2. enable cache
    3. set git_bin to the full path to the git binary (it ceases to work with only 'git' there, at least for me)
  6. launch tracd --port <your.choice> <path_to_the_trac_env>
  7. try to access the project timeline page via e.g. Firefox,
  8. 100% CPU, no page shown...

and, alternatively:

  7b. Try to access the code browser at http://localhost:<your.port>/trac/browser
  8b. Very old revisions are shown, 100% CPU

Note that with a small git repo I had no difficulties with using cache.

comment:9 in reply to:  8 Changed 7 years ago by georgyo

I was having serious performance issues, and caching was the problem.

Once a repo grows past 500 commits, Trac becomes very slow if caching is enabled.

comment:10 Changed 7 years ago by anonymous

Cc: gseanmcg@… added; anonymous removed

comment:11 Changed 7 years ago by anonymous

Cc: stuge added

comment:12 Changed 5 years ago by anonymous

FYI, https://github.com/hvr/trac-git-plugin/pull/12 was just merged, which is supposed to have a positive effect on performance.

comment:13 Changed 5 years ago by stuge

Owner: changed from Herbert Valerio Riedel to stuge
Status: assigned → new

I started work on a libgit2/pygit2 based rewrite of the plugin. The old plugin has meanwhile been included in Trac trunk, and my work to replace it is also done against Trac trunk. I guess that this ticket should be closed, and interested parties should monitor the following pages for progress:

http://trac.edgewall.org/wiki/TracDev/Performance/Git

http://trac.edgewall.org/ticket/10594#comment:6

comment:14 Changed 5 years ago by stuge

Status: new → assigned

comment:15 Changed 2 years ago by Ryan J Ollos

Resolution: wontfix
Status: assigned → closed

GitPlugin is deprecated. Please upgrade to Trac 1.0 and use TracGit.
