Ticket #746 (assigned task)

Opened 2 years ago

Last modified 3 months ago

Improve performance of GitPlugin

Reported by: hvr
Assigned to: hvr (accepted)
Priority: high
Component: GitPlugin
Severity: critical
Keywords:
Cc:
Trac Release: 0.10

Description

The current implementation relies heavily on git-core executables, so every retrieval of information from the git repository costs a fork(), which makes it quite an expensive operation. As a result, performance is quite low right now.

Better performance should be achievable by:

  • caching metadata
  • optimizing GitNode.get_content_length()
  • accessing the git repository more efficiently

Change History

09/29/06 05:47:23 changed by hvr

  • priority changed from normal to high.
  • status changed from new to assigned.
  • severity changed from normal to critical.

09/29/06 07:26:50 changed by hvr

(In [1321]) performance improvements

addresses #746

09/29/06 08:03:39 changed by hvr

(In [1322]) yet another minor performance improvement for big repositories;

addresses #746

02/06/08 08:39:50 changed by hvr

(In [3185])

  • changed next revision/previous revision to point to the next/previous changeset in a flattened history as provided by git-rev-list --all
  • implemented an in-memory commit tree cache in order to speed up typical Trac repository access patterns (addresses #746)
  • allow wrapping GitRepository in a CachedRepository, thus storing metadata in Trac's SQL db (addresses #746)
  • implemented new [git] options to control caching:
    [git]
    cached_repository = true
    persistent_cache = true
  • various other fixes and cleanups
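As an illustration of the in-memory commit tree cache mentioned in this changeset, here is a minimal sketch. It is not the code from [3185]; the class name CommitTreeCache and the load_lines callable are hypothetical. It assumes the loader yields lines in the format printed by git rev-list --parents --all (a commit sha followed by its parent shas), so the whole graph is read once and later parent/child lookups need no further git calls.

```python
from collections import defaultdict


class CommitTreeCache:
    """Hypothetical in-memory cache of the commit graph (illustration only)."""

    def __init__(self, load_lines):
        # load_lines: callable returning an iterable of lines shaped like
        # the output of `git rev-list --parents --all`:
        #   "<sha> <parent-sha> <parent-sha> ..."
        self._load_lines = load_lines
        self._parents = None
        self._children = None
        self.loads = 0  # how many times the loader was actually invoked

    def _ensure_loaded(self):
        # Build both maps on first use; afterwards this is a no-op.
        if self._parents is not None:
            return
        parents = {}
        children = defaultdict(list)
        for line in self._load_lines():
            fields = line.split()
            if not fields:
                continue
            sha, commit_parents = fields[0], fields[1:]
            parents[sha] = commit_parents
            for parent in commit_parents:
                children[parent].append(sha)
        self._parents = parents
        self._children = dict(children)
        self.loads += 1

    def parents(self, sha):
        self._ensure_loaded()
        return list(self._parents.get(sha, []))

    def children(self, sha):
        self._ensure_loaded()
        return list(self._children.get(sha, []))
```

In a real setup the loader would run git once and return its output lines; any number of later parents()/children() calls are then answered from the two dictionaries.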

02/08/08 15:02:17 changed by hvr

(In [3199]) GitPlugin: call popen3 with sequence instead of string as command, in order to avoid shell-overhead (addresses #746)
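The same idea can be sketched with the subprocess module, which is an assumption here: the changeset itself used popen3. The helper names run_command and git_argv are hypothetical. Passing the command as a list makes Python execute the program directly, so no shell is started and arguments need no quoting.

```python
import subprocess
import sys


def git_argv(subcommand, *args, git_bin="git"):
    """Build an argument list for a git call (hypothetical helper)."""
    return [git_bin, subcommand, *args]


def run_command(argv):
    """Run argv (a list of strings) without a shell and return its stdout."""
    # With a list and the default shell=False, the program is executed
    # directly; each list element reaches it as one verbatim argument.
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8", "replace")


if __name__ == "__main__":
    # Demo that does not depend on git being installed.
    print(run_command([sys.executable, "-c", "print('no shell involved')"]))
```

A git call would then look like run_command(git_argv("rev-list", "--all")), assuming git is on the PATH and the working directory is inside a repository.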

02/15/08 03:55:03 changed by hvr

additional notes:

  • try to create an additional libgit-thin-based Storage in PyGIT.py to evaluate whether the exec+fork+parse overhead is still significant
  • most external git calls can now be avoided thanks to extensive caching of metadata; one of the remaining speed killers is listing directories in the source browser (the cat-file -s calls can be optimized by using the -l option to ls-tree, available in Git 1.5.3+, but having to call rev-list for each folder element remains an issue; maybe it will be possible to get an enhancement to ls-tree merged upstream to provide that information as well)
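To illustrate the ls-tree -l idea from the note above, here is a minimal sketch of a parser for that output, which carries each blob's size and so removes the need for a cat-file -s call per entry. The function name parse_ls_tree_long is hypothetical. The sketch assumes the documented line layout (mode, type, object and size separated by spaces, then a TAB and the path) and does not handle the quoting git applies to unusual path names when -z is not used.

```python
def parse_ls_tree_long(output):
    """Parse the text output of `git ls-tree -l` into a list of dicts.

    Hypothetical helper for illustration; the size is None for entries
    that are not blobs, where git prints "-" instead of a number.
    """
    entries = []
    for line in output.splitlines():
        if not line:
            continue
        # Metadata and path are separated by a single TAB.
        meta, name = line.split("\t", 1)
        # The size column is space-padded, so split on any whitespace run.
        mode, obj_type, sha, size = meta.split()
        entries.append({
            "name": name,
            "mode": mode,
            "type": obj_type,
            "sha": sha,
            "size": None if size == "-" else int(size),
        })
    return entries
```

One ls-tree -l call per directory then yields the sizes of all files in it; the per-entry rev-list call described above is not addressed by this sketch.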

04/07/08 07:42:37 changed by osmaker+gitplugin@gmail.com

I thought I should mention this since hvr is considering a similar thing with libgit-thin.

I've been working on a small C library to help with performance on this and another project. (I chose not to use libgit-thin, as it has memory leaks which can't be fixed without some significant changes to Git itself.)

The goals for it are simple:

  • Read Git objects (no write support).
  • Do so with high performance, providing various methods which can be tuned to the specific application/command.

I've created a Python wrapper around the library and have implemented some basic Git commands with it in Python (ls-tree, rev-list, cat-file, etc.). Already, most calls are orders of magnitude faster than using the git binaries via popen3.

I plan to have a public, working version available by the end of April. (And hopefully some simple patches for low-performance areas of GitPlugin.)

