Modify

Opened 11 years ago

Last modified 11 years ago

#11287 assigned enhancement

Convert to .executemany() where appropriate

Reported by: Chris Nelson Owned by: Chris Nelson
Priority: low Component: TracJsGanttPlugin
Severity: normal Keywords:
Cc: Trac Release:

Description (last modified by Jun Omae)

As noted in comment:3:ticket:11131 and http://www.python.org/dev/peps/pep-0249/#executemany, .executemany() is preferred over passing a list of tuples. This can clean up a number of database accesses, potentially making them more portable and readable.

Attachments (0)

Change History (20)

comment:1 Changed 11 years ago by Chris Nelson

Description: modified (diff)

comment:2 Changed 11 years ago by Chris Nelson

Status: newassigned

comment:3 Changed 11 years ago by Chris Nelson

It's not clear to me how to leverage Trac's %s handling which properly quotes strings field values in a DSMS-specific way. I have:

    inClause = "IN (%s)" % ','.join(('%s',) * len(owners))
    cursor.execute(("SELECT id FROM ticket "
                    "WHERE status!=%s AND owner " + 
                    inClause),
                    ['closed'] + list(owners))

How do I get that closed where it belongs?

comment:4 Changed 11 years ago by Chris Nelson

In 13369:

Try using executemany(). Refs #11287, 9648.

This is only one place that needs it but I'd like to see if this fixes
falkb's problem. Or at least let's him get further along.

Last edited 11 years ago by Chris Nelson (previous) (diff)

comment:5 Changed 11 years ago by falkb

Could you tell me an action how to trigger that one for a test? I'm not sure what I changed in a ticket as I got the errors.

comment:6 in reply to:  5 Changed 11 years ago by Chris Nelson

Replying to falkb:

Could you tell me an action how to trigger that one for a test? I'm not sure what I changed in a ticket as I got the errors.

Owner, dependencies, estimated or actual time.

    def _affectsSchedule(self, ticket, old_values):
        # If any of the changed values are schedule related
        for f in old_values:
            if f in self.scheduleFields:
                return True

        # If the status changed, check for closing or reopening
        # tickets and making goals active or inactive.
        if 'status' in old_values:
            # If the ticket changed status in or out of closed, reschedule
            if (old_values['status'] == 'closed' \
                    or ticket['status'] == 'closed'):
                return True

            # If the ticket is a goal and changes status in or out of
            # active, reschedule
            if ticket['type'] == self.pm.goalTicketType:
                wasActive = old_values['status'] in self.pm.activeGoalStatuses
                isActive = ticket['status'] in self.pm.activeGoalStatuses
                if (wasActive and not isActive) \
                        or (isActive and not wasActive):
                    return True

        return False

where

        # Built-in fields that can affect scheduling
        self.scheduleFields = ['owner', 'priority', 'milestone']

        # Configurable fields from plugins that can affect scheduling
        for f in ['estimate', 'start', 'finish', 'pred', 'succ', 'parent']:
            # If the user configured this field
            if f in self.pm.fields:
                # Get the name of the configured field
                self.scheduleFields.append(self.pm.fields[f])

        # FIXME - Make this configurable
        self.scheduleFields.append('parents')
        self.scheduleFields.append('blocking')
        self.scheduleFields.append('blockedby')

comment:7 Changed 11 years ago by falkb

I've update to the latest SVN revision, changed owner, estimated hours and milestone back an forth several times but cannot reproduce errors anymore.

comment:8 Changed 11 years ago by Chris Nelson

executemany() has different (perhaps buggy) behavior vs. .execute() when no results are found.

@@ -919,13 +919,17 @@ class TracPM(Component):
         ids = ticketsByID.keys()
 
         # Get dates from precomputed schedule, if any.
-        inClause = "IN (%s)" % ','.join(('%s',) * len(ids))
-        cursor.execute("SELECT ticket, start, finish" +
-                       " FROM schedule WHERE ticket " +
-                       inClause,
-                       ids)
+        query = "SELECT ticket, start, finish" + \
+            " FROM schedule WHERE ticket IN (%s)"
+        values = []
+        for tid in ids:
+            values.append([ tid ])
+        cursor.executemany(query, values)
+        self.env.log.debug('rowcount:%s' % cursor.rowcount)
         for row in cursor:
             tid, start, finish = row
+            self.env.log.debug('row:%s, tid:%s, start:%s, finish:%s' %
+                               (row, tid, start, finish))
             ticketsByID[tid]['_sched_start'] = start
             ticketsByID[tid]['_sched_finish'] = finish

when none of the IDs are in the schedule table gives a key error; None can't be used to index ticketsByID. The log shows:

2013-08-30 08:50:07,595 Trac[tracpm] DEBUG: rowcount:10
2013-08-30 08:50:07,595 Trac[tracpm] DEBUG: row:(None, None, None), tid:None, start:None, finish:None

comment:9 in reply to:  8 ; Changed 11 years ago by Jun Omae

Description: modified (diff)

Replying to ChrisNelson:

executemany() has different (perhaps buggy) behavior vs. .execute() when no results are found.

No. executemany() is needed instead of inserting multiple rows feature. We have no need to change the select statement.

comment:10 in reply to:  9 ; Changed 11 years ago by Chris Nelson

Replying to jun66j5:

Replying to ChrisNelson:

executemany() has different (perhaps buggy) behavior vs. .execute() when no results are found.

No. executemany() is needed instead of inserting multiple rows feature. We have no need to change the select statement.

falkb was having SELECT ... WHERE ticket in (...long list of ids...) fail in SQLite with too many arguments so I believe that the SELECT does need to be changed.

comment:11 in reply to:  10 ; Changed 11 years ago by Jun Omae

Replying to ChrisNelson:

falkb was having SELECT ... WHERE ticket in (...long list of ids...) fail in SQLite with too many arguments so I believe that the SELECT does need to be changed.

Hmmm, that's strange. According to http://www.sqlite.org/limits.html, no limitation about IN (...).

Also, SQLite can works with 10,000 of %s.

>>> db = env.get_read_db()
>>> cursor = db.cursor()
>>> cursor.execute('SELECT COUNT(*) FROM ticket')
<trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c>
>>> [row[0] for row in cursor]
[1812]      # 1,812 tickets
>>> query = 'SELECT id FROM ticket WHERE id IN (%s)' % ','.join(('%s',) * 10000)
>>> cursor.execute(query, [_ for _ in xrange(10000)])
<trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c>
>>> len([row[0] for row in cursor])
1812

comment:12 Changed 11 years ago by Jun Omae

Ah, executemany() cannot be executed with SELECT statement on SQLite.

Python 2.5.6 (r256:88840, Feb 29 2012, 04:03:24)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> sqlite3.sqlite_version_info
(3, 3, 6)
>>> from trac.env import open_environment
>>> env = open_environment('/var/trac/1.0')
>>> db = env.get_read_db()
>>> cursor = db.cursor()
>>> cursor.executemany('SELECT * FROM ticket WHERE id=%s', [(id,) for id in xrange(10000)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jun66j5/src/trac.git/trac/db/util.py", line 77, in executemany
    args)
  File "/home/jun66j5/src/trac.git/trac/db/sqlite_backend.py", line 62, in executemany
    args)
  File "/home/jun66j5/src/trac.git/trac/db/sqlite_backend.py", line 48, in _rollback_on_error
    return function(self, *args, **kwargs)
sqlite3.ProgrammingError: You cannot execute SELECT statements in executemany().

comment:13 in reply to:  11 Changed 11 years ago by Chris Nelson

Replying to jun66j5:

Replying to ChrisNelson:

falkb was having SELECT ... WHERE ticket in (...long list of ids...) fail in SQLite with too many arguments so I believe that the SELECT does need to be changed.

Hmmm, that's strange. According to http://www.sqlite.org/limits.html, no limitation about IN (...).

Also, SQLite can works with 10,000 of %s.

>>> db = env.get_read_db()
>>> cursor = db.cursor()
>>> cursor.execute('SELECT COUNT(*) FROM ticket')
<trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c>
>>> [row[0] for row in cursor]
[1812]      # 1,812 tickets
>>> query = 'SELECT id FROM ticket WHERE id IN (%s)' % ','.join(('%s',) * 10000)
>>> cursor.execute(query, [_ for _ in xrange(10000)])
<trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c>
>>> len([row[0] for row in cursor])
1812

OK. But why did r13369 fix his problem? <shrug>

comment:14 Changed 11 years ago by Chris Nelson

I don't know what to make of comment:82:ticket:9648 and r13369; if falkb is using SQLite and executemany() fixed his problem but executemany() isn't allows with SELECT in SQLite, I guess I have to conclued that falkb didn't trigger the new code in his test. Perhaps he hadn't reenabled the background rescheduler.

comment:15 Changed 11 years ago by Jun Omae

I just read comment:82:ticket:9648 and reproduced "too many SQL variables" issue with Python 2.6 and SQLite 3.6.20. Max number of placeholders is 999. See http://stackoverflow.com/questions/7106016/too-many-sql-variables-error-in-django-witih-sqlite3.

>>> cursor.execute('SELECT * FROM ticket WHERE id IN (%s)' % ','.join(('%s',)*1000), [_ for _ in xrange(1000)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "trac/db/util.py", line 65, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "trac/db/sqlite_backend.py", line 78, in execute
    result = PyFormatCursor.execute(self, *args)
  File "trac/db/sqlite_backend.py", line 56, in execute
    args or [])
  File "trac/db/sqlite_backend.py", line 48, in _rollback_on_error
    return function(self, *args, **kwargs)
sqlite3.OperationalError: too many SQL variables
>>> cursor.execute('SELECT * FROM ticket WHERE id IN (%s)' % ','.join(('%s',)*999), [_ for _ in xrange(999)])
<trac.db.sqlite_backend.EagerCursor object at 0x193cc00>

And using executemany() with SELECT and N of values, it executes the query N times and probably slow. I proposal that executing query per 999 values like this (incomplete patch and untested).

  • tracjsgantt/tracpm.py

     
    927927        ids = ticketsByID.keys()
    928928
    929929        # Get dates from precomputed schedule, if any.
    930         inClause = "IN (%s)" % ','.join(('%s',) * len(ids))
    931         cursor.execute("SELECT ticket, start, finish" +
    932                        " FROM schedule WHERE ticket " +
    933                        inClause,
    934                        ids)
    935         for row in cursor:
     930        for row in _fetch_by_values(db,
     931                                    'SELECT ticket, start, finish '
     932                                    'FROM schedule',
     933                                    'ticket', ids):
    936934            tid, start, finish = row
    937935            ticketsByID[tid]['_sched_start'] = start
    938936            ticketsByID[tid]['_sched_finish'] = finish
     
    29552953
    29562954    def ticket_deleted(self, ticket):
    29572955        self.rescheduleTickets(ticket, {})
     2956
     2957    def _fetch_by_values(self, db, select, column, values):
     2958        cursor = db.cursor()
     2959        for idx in xrange(0, len(values), 100):
     2960            args = values[idx:idx + 100]
     2961            query = select + ' WHERE %s IN (%s)' % \
     2962                             (db.quote(column), ','.join(('%s',) * len(args)))
     2963            cursor.execute(query, args)
     2964            for row in cursor:
     2965                yield row

comment:16 Changed 11 years ago by Chris Nelson

It'd be great if I could tie into the Trac core code that takes id IN (1, 2, 3, 4, 7, 10, 11, 12) and translates it to id BETWEEN 1 and 4 OR id=7 OR id BETWEEN 10 and 12 (or whatever).

comment:17 in reply to:  14 Changed 11 years ago by falkb

Replying to ChrisNelson:

... Perhaps he hadn't reenabled the background rescheduler. ...

comment:ticket:9648:85 shows what I've activated, still not sure if this is the right combination of all those components. I think I had all components activated once as I ran into the problems.

Last edited 11 years ago by falkb (previous) (diff)

comment:18 Changed 11 years ago by lotte@…

I'm using the TracJsGanttPlugin in combination with the Timing and Estimation and the Subtickets plugin with everything enabled and I'm getting the following error when creating a new ticket, which I think is related to this ticket:

Trac detected an internal error:
ProgrammingError: You cannot execute SELECT statements in executemany().

Traceback:

Python Traceback

Most recent call last:
File "/usr/lib/python2.7/site-packages/trac/web/main.py", line 497, in _dispatch_request
File "/usr/lib/python2.7/site-packages/trac/web/main.py", line 214, in dispatch
File "/usr/lib/python2.7/site-packages/trac/ticket/web_ui.py", line 180, in process_request
File "/usr/lib/python2.7/site-packages/trac/ticket/web_ui.py", line 465, in _process_newticket_request
File "/usr/lib/python2.7/site-packages/trac/ticket/web_ui.py", line 1289, in _do_create
File "/usr/lib/python2.7/site-packages/trac/ticket/model.py", line 256, in insert
File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2950, in ticket_created
File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2745, in rescheduleTickets
File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2447, in _findAffected
File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2406, in more
File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2326, in ownersOf
File "/usr/lib/python2.7/site-packages/trac/db/util.py", line 85, in executemany
File "/usr/lib/python2.7/site-packages/trac/db/sqlite_backend.py", line 62, in executemany
File "/usr/lib/python2.7/site-packages/trac/db/sqlite_backend.py", line 48, in _rollback_on_error

comment:19 in reply to:  18 Changed 11 years ago by lotte

Replying to lotte@…:

I'm using the TracJsGanttPlugin in combination with the Timing and Estimation and the Subtickets plugin with everything enabled and I'm getting the following error when creating a new ticket, which I think is related to this ticket: ...

I Like to confirm that I'm able to create tickets without any errors when I disable the TicketRescheduler

comment:20 Changed 11 years ago by Chris Nelson

In 13705:

Revert "Try using executemany(). Refs #11287, 9648, 11385"

This reverts commit 5738e04f43ada2747ec5326db294c58de2ab89ba.

Modify Ticket

Change Properties
Set your email in Preferences
Action
as assigned The owner will remain Chris Nelson.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.