
Opened 2 years ago

Last modified 2 years ago

#9874 new enhancement

PdfImg macro runs with 100% cpu load

Reported by: falkb
Owned by: ursaw
Priority: normal
Component: PdfImagePlugin
Severity: critical
Keywords:
Cc:
Trac Release: 0.11

Description

Displaying PDF files with many pages leads to an internal call of convert that can take very long (a few minutes). During this time, my Windows server was at full capacity (100% CPU load) because of the convert.exe process running in parallel. convert.exe is also very memory hungry, and after a few moments all of the RAM was allocated by convert.exe. This is critical for a responsive Trac service: all web interface reloads tend to time out.

You should set a priority below "normal" for the external call of convert.exe, and maybe limit the memory for that process.

Attachments (0)

Change History (10)

comment:1 Changed 2 years ago by ursaw

I tried the files on my five-year-old Celeron Linux laptop.

Converting to PNG worked for me on all files within seconds.

Because I have no Windows machine (ticket 9842 comment14), I am not able to see whether it is a problem of convert itself, of convert on Windows (64bit Win Server 2003 R2), or of your installation.

comment:2 Changed 2 years ago by falkb

It doesn't matter what the reason is, though it is likely due to something in convert or Ghostscript. I just want you to add something like a "LowerPriorityThanNormal" flag to your system call of 'convert', because otherwise you cannot control how much impact that external call has on the system.

comment:3 follow-up: Changed 2 years ago by falkb

Find out how to start a sub process with lower priority here: http://docs.python.org/release/3.0.1/library/subprocess.html

comment:4 in reply to: ↑ 3 Changed 2 years ago by falkb

Replying to falkb:

Find out how to start a sub process with lower priority here: http://docs.python.org/release/3.0.1/library/subprocess.html

replace

sts = os.system("convert" + " myarg")

with

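# No shell here, so p.pid is the convert process itself, not a cmd.exe wrapper.
p = Popen("convert" + " myarg")
# Lower the priority while the process is running, then wait for it to finish.
setpriority(p.pid, 1)
sts = p.wait()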

by using this helper function from http://code.activestate.com/recipes/496767-set-process-priority-in-windows:

def setpriority(pid=None,priority=1):
    """ Set The Priority of a Windows Process.  Priority is a value between 0-5 where
        2 is normal priority.  Default sets the priority of the current
        python process but can take any valid process ID. """
        
    import win32api,win32process,win32con
    
    priorityclasses = [win32process.IDLE_PRIORITY_CLASS,
                       win32process.BELOW_NORMAL_PRIORITY_CLASS,
                       win32process.NORMAL_PRIORITY_CLASS,
                       win32process.ABOVE_NORMAL_PRIORITY_CLASS,
                       win32process.HIGH_PRIORITY_CLASS,
                       win32process.REALTIME_PRIORITY_CLASS]
    if pid is None:
        pid = win32api.GetCurrentProcessId()
    handle = win32api.OpenProcess(win32con.PROCESS_ALL_ACCESS, True, pid)
    win32process.SetPriorityClass(handle, priorityclasses[priority])
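
An alternative to changing the priority after launch is to start the child process at reduced priority in the first place. The following is only a minimal sketch, not what the plugin does: the function name run_low_priority and the nice increment of 10 are assumptions made for illustration. On Windows it relies on the creationflags parameter of subprocess with the BELOW_NORMAL_PRIORITY_CLASS value (0x00004000); on POSIX systems it raises the niceness in the child before exec.

```python
import os
import subprocess
import sys

# Windows value of BELOW_NORMAL_PRIORITY_CLASS. The subprocess module exposes
# it by name on Windows from Python 3.7 onward; fall back to the raw value.
BELOW_NORMAL = getattr(subprocess, "BELOW_NORMAL_PRIORITY_CLASS", 0x00004000)


def run_low_priority(args, nice_increment=10):
    """Run args (a list, no shell) at reduced priority; return the exit code.

    Hypothetical helper for illustration only.
    """
    if sys.platform == "win32":
        # The priority class is applied when the process is created.
        return subprocess.call(args, creationflags=BELOW_NORMAL)
    # POSIX: raise the niceness in the child just before it execs.
    return subprocess.call(args, preexec_fn=lambda: os.nice(nice_increment))
```

A call such as run_low_priority(["convert", "foo.pdf", "foo.png"]) would then replace the os.system call. Because the priority is set at creation time, no pywin32 dependency is needed and there is no window in which the process runs at normal priority.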

comment:5 follow-up: Changed 2 years ago by falkb

I also asked about the problem (convert on Windows being much slower than on Linux) in another forum, and they say you should use a special parameter to avoid thrashing the workstation:

convert -limit area 0 foo.pdf foo.png

comment:6 in reply to: ↑ 5 Changed 2 years ago by ursaw

Replying to falkb:

I also asked about the problem (convert on Windows much slower than on Linux) in another forum

see [11408]

Can you work on the Windows server now?

I tested on Windows XP SP2 (32-bit) and can't reproduce your problems.

If you are satisfied with the resolution, please close the ticket.

comment:7 Changed 2 years ago by falkb

Thanks, your new [11408] should solve the memory problem, since you can force the entire PDF to disk rather than memory by adding -limit area 1 to the command line call. At least a direct command line call with that parameter no longer made the memory usage explode.
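
For illustration, the command line with that parameter could be assembled as an argument list like this. This is a sketch only: the function name build_convert_command and its parameters are hypothetical and are not taken from [11408]; only the -limit area option itself comes from the discussion above.

```python
def build_convert_command(pdf_path, png_path, area_limit="1"):
    """Return the argument list for a convert call with an area limit.

    Hypothetical helper; area_limit is passed through to ImageMagick's
    -limit area option unchanged.
    """
    return ["convert", "-limit", "area", str(area_limit), pdf_path, png_path]


# Example: the list can be handed to subprocess.call() without a shell.
command = build_convert_command("foo.pdf", "foo.png")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.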

Stay tuned for activity in the ImageMagick forum thread; I gave them the debug output of the slow convert. The conversion of a 441-page PDF still takes 15 minutes at 100% CPU load here, even though you saw a fast conversion of the same file. I'm going to wait for the result of that analysis, and maybe a solution, before I start thinking about closing this ticket.

comment:8 follow-up: Changed 2 years ago by falkb

Take a look at the ImageMagick forum discussion thread. They say it could be sufficient to just call Ghostscript instead of ImageMagick, and they describe how to set the proper Ghostscript commandline call. Just using Ghostscript without IM would speed conversion up, too. What do you think about it?

comment:9 in reply to: ↑ 8 ; follow-up: Changed 2 years ago by falkb

Replying to falkb:

Take a look at the ImageMagick forum discussion thread. They say it could be sufficient to just call Ghostscript instead of ImageMagick, and they describe how to set the proper Ghostscript commandline call. Just using Ghostscript without IM would speed conversion up, too. What do you think about it?

Could you change your plugin to use Ghostscript rather than ImageMagick?

After further discussion, it turns out that using Ghostscript directly is about 8 times faster than going through ImageMagick (see http://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=20574&start=15), e.g. 3 minutes instead of 25 minutes. ImageMagick only passes the data through, and on Windows it is a bad bottleneck. You also get better control over the conversion: for instance, choosing -sDEVICE=pnggray instead of -sDEVICE=pngalpha cuts the conversion time by 2 more minutes, to only 1 minute! Furthermore, memory usage is much lower than when converting via ImageMagick (90MB vs. 800MB++).
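
As a rough sketch of what a direct Ghostscript call could look like from Python: the exact command line recommended in the forum thread is not reproduced here, so the resolution, the output file pattern, the executable name, and the function name build_gs_command are all assumptions. The switches themselves (-dSAFER, -dBATCH, -dNOPAUSE, -sDEVICE, -r, -sOutputFile) are standard Ghostscript options.

```python
def build_gs_command(pdf_path, output_pattern, device="pnggray", dpi=72,
                     executable="gs"):
    """Return the argument list for rendering a PDF to PNG pages with Ghostscript.

    Hypothetical helper. output_pattern should contain a %d-style placeholder
    (e.g. "page-%03d.png") so that each page gets its own file. On Windows the
    executable is typically gswin32c or gswin64c instead of gs.
    """
    return [
        executable,
        "-dSAFER",                  # restrict file access from within the PDF
        "-dBATCH",                  # exit after the last file
        "-dNOPAUSE",                # do not prompt between pages
        "-sDEVICE=%s" % device,     # e.g. pnggray or pngalpha
        "-r%d" % dpi,               # rendering resolution (assumed value)
        "-sOutputFile=%s" % output_pattern,
        pdf_path,
    ]


# Example: one grayscale PNG per page of foo.pdf.
command = build_gs_command("foo.pdf", "foo-%03d.png")
```

The device is a parameter so that pnggray and pngalpha can be compared on the same file.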

comment:10 in reply to: ↑ 9 Changed 2 years ago by anonymous

Replying to falkb:

Could you change your plugin to use Ghostscript rather than ImageMagick?

ursaw, did you try replacing ImageMagick with direct use of Ghostscript, as suggested by falkb in his forum posting of 2012-03-22T06:20:30+00:00? It's true that it's much faster, because IM has a terrible implementation of passing the data to gs. ---Dirk
