Opened 18 years ago

Closed 18 years ago

Last modified 16 years ago

#639 closed enhancement (duplicate)

Large p4 depots kill virtual memory (patch included) — at Version 3

Reported by: r.blum@…
Owned by: Lewis Baker
Priority: normal
Component: PerforcePlugin
Severity: normal
Keywords: needinfo
Cc:
Trac Release: 0.10

Description (last modified by Alec Thomas)

If you do an initial sync against a large depot (50,000 changelists), the Perforce plugin uses up virtual memory like crazy. (I stopped it once it reached about 1.5 GB.)

A small modification to the sync procedure solves that problem (basically, get changelists in chunks of 1000). Here's my local mod:

    # Override sync to precache data to make it run faster
    def sync(self):
        youngest_stored = self.repos.get_youngest_rev_in_cache(self.db)
        if youngest_stored is None:
            youngest_stored = '0'

        while youngest_stored != str(self.repos.youngest_rev):
            # Need to cache all information for changes since the last
            # sync operation.

            youngest_to_get = self.repos.youngest_rev
            if youngest_to_get > int(youngest_stored) + 1000:
                youngest_to_get = int(youngest_stored) + 1000

            # Obtain a list of changes since the last cache sync
            from p4trac.repos import _P4ChangesOutputConsumer
            output = _P4ChangesOutputConsumer(self.repos._repos)
            self.repos._connection.run('changes', '-l', '-s', 'submitted',
                                       '@>%s,%d' % (youngest_stored, youngest_to_get),
                                       output=output)

            if output.errors:
                from p4trac.repos import PerforceError
                raise PerforceError(output.errors)

            changes = output.changes
            changes.reverse()

            # Perform the precaching of the file history for files in these
            # changes.
            self.repos._repos.precacheFileHistoryForChanges(changes)

            youngest_stored = str(youngest_to_get)

        # Call on to the default implementation now that we've cached
        # enough information to make it run a bit faster.
        CachedRepository.sync(self)
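
For reference, the chunk boundaries that the loop above walks through can be sketched on their own, without a Perforce connection. This is only an illustration: chunk_ranges is a hypothetical helper that is not part of p4trac, and it assumes changelist numbers are plain integers and that each range is exclusive at the start and inclusive at the end, which is what the patch appears to intend.

```python
def chunk_ranges(youngest_stored, youngest_rev, chunk_size=1000):
    """Yield (start, end) pairs covering (youngest_stored, youngest_rev].

    Hypothetical helper, not part of p4trac. Each pair spans at most
    chunk_size changelist numbers; start is exclusive, end is inclusive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = youngest_stored
    # Compare with '<' so the loop also stops if the cache is somehow
    # ahead of the repository, instead of spinning forever.
    while start < youngest_rev:
        end = min(start + chunk_size, youngest_rev)
        yield (start, end)
        start = end
```

Calling list(chunk_ranges(0, 2500)) gives three ranges, the last one shorter than the others. Comparing with '<' instead of '!=' is a design choice of this sketch and differs from the patch.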

Change History (3)

comment:1 Changed 18 years ago by Lewis Baker

Resolution: duplicate
Status: new → closed

Duplicate of #630.

comment:2 Changed 18 years ago by Lewis Baker

Do you know if the excessive virtual memory usage was occurring during the initial call to 'p4 changes' or in the call to precacheFileHistoryForChanges()?

precacheFileHistoryForChanges() basically retrieves all information about the changes (using a combination of 'p4 describe' and 'p4 filelog' commands) and caches it in an internal data structure, ready for the call to CachedRepository.sync().

I'm not sure that batching up calls to precacheFileHistoryForChanges() is going to fix the problem entirely as the same data will all be held in memory by the time the outer loop exits anyway. However, I am curious as to how/why your patch has alleviated your memory usage problems. Any more info you can give would be a great help.
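
The concern about batching can be illustrated with a toy model. This is a sketch only and says nothing about what p4trac actually holds in memory: peak_records_held and its release_between_batches flag are hypothetical, and each "record" is just an integer standing in for one cached change.

```python
def peak_records_held(num_changes, batch_size, release_between_batches):
    """Return the largest number of records held at once in a toy cache.

    Hypothetical model, not p4trac code. When release_between_batches is
    False the cache accumulates every batch, as an in-memory precache would.
    """
    cache = []
    peak = 0
    for first in range(0, num_changes, batch_size):
        batch = range(first, min(first + batch_size, num_changes))
        if release_between_batches:
            # Assumption: earlier results were flushed somewhere else.
            cache = []
        cache.extend(batch)
        peak = max(peak, len(cache))
    return peak
```

In this model, batching alone leaves the peak at the full 50,000 records when the cache accumulates, and only drops it to the batch size when earlier batches are released. If the patch really does help, the saving would have to come from somewhere other than the accumulated cache, for example from the size of each individual 'p4 changes' result.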

comment:3 Changed 18 years ago by Alec Thomas

Description: modified (diff)

Fixed formatting.
