Opened 6 years ago

Last modified 17 months ago

#3456 new defect

plugin consumes all memory with large perforce repository

Reported by: bike_head@…
Owned by: lewisbaker
Priority: highest
Component: PerforcePlugin
Severity: critical
Keywords:
Cc: Archon810
Trac Release: 0.11

Description (last modified by rjollos)

I have PerforcePlugin working with Trac 0.11. My Perforce server has a *large* amount of history: about 125,000 changelists. When I start Trac for the first time with this plugin enabled, it starts checking every changelist revision. This 1) takes a huge amount of unnecessary time, since most of these changes are old and irrelevant to the new project I'm creating, and 2) by the time it reaches change 20,000, the plugin has consumed all the swap space on my computer (4G!) and so grinds the machine to a halt.

I'm running SQLite as the DBMS with the following [perforce] config entry:

[perforce]
branches = goblin
charset = none
labels = labels/*
language =
password =
port = dummy:1666
user = foo
workspace = foo-work

I'm going to try MySQL to see if this helps the problem.

Attachments (0)

Change History (7)

comment:1 Changed 6 years ago by volker.simonis@…

Hi,

I have the same problem: over 100,000 changelists. It took over 30 minutes and 90MB in the database to get the first 400 changelists in, so I doubt this process will eventually finish.

Did you or somebody else find a way to limit the perforceplugin to fetching only the last, say, 100 changelists or so?

Regards,
Volker

comment:2 Changed 6 years ago by anonymous

I tried to manipulate how the plugin loaded changelists in two ways:

  1. There is a record in one of the Trac tables that tells the plugin which changelist it loaded last. I seeded this record so that loading started near the end of the sequence. This worked, but the plugin doesn't react well when it can't find a changelist it expects while browsing the source tree, and it throws ugly errors.
  2. I hacked the initial load sequence so that it can restart properly. In its current incarnation, the plugin always initially loads (or syncs) all the changelists from the beginning. If it gets interrupted (say it consumes all memory and dies), it usually ends up in a state where the last changelist it loaded doesn't match some data in its tables, and so it silently stops. I made a change so that it detects this inconsistency, corrects it, and then continues the load. I had 50,000 changelists and had to restart 5 times to get everything into the DBMS.
  3. I also tried to limit the plugin to a portion of the depot, but didn't have the time to figure out all the plugin parameters that might control this. There are a few undocumented parameters that may help here.
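
For readers who want to try the restart approach from item 2, here is a minimal sketch of the idea, not the plugin's actual code. The table names (`changes`, `sync_state`), the `fetch_change` callback, and the in-memory SQLite database are all assumptions made for illustration, and changelist numbers are treated as contiguous for simplicity. Each changelist is committed together with the "last loaded" marker, and the marker is repaired from the stored data before every run.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open a hypothetical changelist cache (not the plugin's real schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS changes "
        "(num INTEGER PRIMARY KEY, description TEXT)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS sync_state "
        "(id INTEGER PRIMARY KEY CHECK (id = 1), last_loaded INTEGER NOT NULL)"
    )
    db.execute("INSERT OR IGNORE INTO sync_state (id, last_loaded) VALUES (1, 0)")
    db.commit()
    return db


def repair_marker(db):
    """Make the marker agree with the highest changelist actually stored."""
    (stored,) = db.execute("SELECT COALESCE(MAX(num), 0) FROM changes").fetchone()
    db.execute("UPDATE sync_state SET last_loaded = ? WHERE id = 1", (stored,))
    db.commit()
    return stored


def sync(db, fetch_change, newest):
    """Load changelists up to `newest`, resuming after any earlier interruption."""
    start = repair_marker(db) + 1
    for num in range(start, newest + 1):
        description = fetch_change(num)  # assumed callback that talks to Perforce
        with db:  # one short transaction per changelist: data and marker together
            db.execute(
                "INSERT INTO changes (num, description) VALUES (?, ?)",
                (num, description),
            )
            db.execute(
                "UPDATE sync_state SET last_loaded = ? WHERE id = 1", (num,)
            )
    return newest
```

Because each changelist is committed on its own, a crash loses at most the changelist in flight, and the next call to `sync` continues from the last one that was stored.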

However, I gave up on this because I was just trying to get around the more critical flaw: the plugin has a huge memory hole that really limits it to ~10,000 changelists. Plus, it is very slow when accessing a large depot, taking upwards of 30 seconds to display a subdirectory. I'm not sure where the leaks and inefficiencies lie because I'm not very fluent in Python.

comment:3 Changed 6 years ago by anonymous

Seeing the same here on ~36,000 changes -- the process uses up all the machine's memory and crashes long before it even gets partway done. Restarting over and over until it's populated the initial cache eventually digests the entire list, but then I need to resync it...which just starts the memory consumption problem over again. Switching from Bugzilla/Perforce to Trac/Perforce is a non-starter because of this bug. Switching to Subversion will be a hard sell.

comment:4 Changed 6 years ago by tobias@…

  • Priority changed from high to highest

Same here, ~36,000 changelists.

Right now I'm using:
for i in $(seq 26517 34022); do sudo trac-admin /var/trac/sites/express resync ${i}; done

to be able to fill in the gap where it crashed. No idea if this will help me, since it needs to resync eventually?
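
A Python variant of the same loop could stop at the first failure and report where to resume, instead of ploughing on through the whole range. This is only a sketch: it reuses the `trac-admin <env> resync <rev>` invocation from the shell loop above, omits `sudo`, and the `run` parameter is a hypothetical hook added here so the loop can be exercised without Trac installed.

```python
import subprocess


def resync_command(env_path, rev):
    """Build the per-revision command used in the shell loop (without sudo)."""
    return ["trac-admin", env_path, "resync", str(rev)]


def resync_range(env_path, first, last, run=subprocess.call):
    """Resync revisions first..last one at a time.

    Returns the first revision whose command exited non-zero, or None if
    every revision succeeded. `run` is an injectable stand-in for
    subprocess.call, assumed here for testing.
    """
    for rev in range(first, last + 1):
        if run(resync_command(env_path, rev)) != 0:
            return rev
    return None
```

If `resync_range("/var/trac/sites/express", 26517, 34022)` returns a revision number, the same call can be repeated with that number as the new `first`.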

comment:5 Changed 5 years ago by Archon810

Same experience here, guys.

I have 103k changes and (thank god) I started playing with this integration on a sandbox copy of production trac. It took more than a day to import and the worst part was that Trac blocked for the whole duration.

I was also observing a huge memory leak and had to kill the httpd thread several times (it hung the server the first time).

Also, after the initial sync was done, it still tries to catch up on missing revisions any time someone loads any Trac page. It blocks if there have been more than a few commits, and syncing even a few revisions still takes ages (as others reported).

I think there is only one solution and that is to make the catchup process async and therefore non-blocking by splitting up each revision sync into a separate process. This way, there would be no need to lock sqlite for long periods of time.
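
A rough sketch of what such a non-blocking catch-up could look like follows. It is written for Python 3 and is not based on the plugin's code: the `CatchupWorker` class, the `command_for` callback, and `placeholder_command` are hypothetical names. A page request only queues the revision; a background thread then runs one child process per revision, one at a time, so that only a single writer touches the database at any moment.

```python
import queue
import subprocess
import sys
import threading


def placeholder_command(rev):
    # Hypothetical stand-in for the real per-revision sync command.
    return [sys.executable, "-c", "pass", str(rev)]


class CatchupWorker:
    """Runs one child process per queued revision, off the request path."""

    def __init__(self, command_for=placeholder_command):
        self._command_for = command_for
        self._pending = queue.Queue()
        self.failed = []  # revisions whose child process exited non-zero
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, rev):
        """Called from a page request: queue the revision and return at once."""
        self._pending.put(rev)

    def wait(self):
        """Block until every queued revision has been processed."""
        self._pending.join()

    def _run(self):
        while True:
            rev = self._pending.get()
            try:
                # Each revision is synced in its own short-lived process.
                if subprocess.call(self._command_for(rev)) != 0:
                    self.failed.append(rev)
            finally:
                self._pending.task_done()
```

Running the children sequentially rather than in parallel is a deliberate choice in this sketch: concurrent writers would contend for the same SQLite lock that causes the blocking in the first place.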

I am pausing p4-trac integration for now and looking forward to a more robust solution, as I can't have random unexpected delays on page loads slowing down our dev team.

Thank you,
Artem from Plaxo
http://beerpla.net
http://twitter.com/ArtemR

comment:6 Changed 5 years ago by Archon810

  • Cc Archon810 added

comment:7 Changed 17 months ago by rjollos

  • Description modified (diff)
