Opened 11 years ago
Last modified 11 years ago
#11287 assigned enhancement
Convert to .executemany() where appropriate
Reported by: | Chris Nelson | Owned by: | Chris Nelson |
---|---|---|---|
Priority: | low | Component: | TracJsGanttPlugin |
Severity: | normal | Keywords: | |
Cc: | Trac Release: |
Description (last modified by )
As noted in comment:3:ticket:11131 and http://www.python.org/dev/peps/pep-0249/#executemany, .executemany()
is preferred over passing a list of tuples. This can clean up a number of database accesses, potentially making them more portable and readable.
Attachments (0)
Change History (20)
comment:1 Changed 11 years ago by
Description: | modified (diff) |
---|
comment:2 Changed 11 years ago by
Status: | new → assigned |
---|
comment:3 Changed 11 years ago by
comment:5 follow-up: 6 Changed 11 years ago by
Could you tell me an action how to trigger that one for a test? I'm not sure what I changed in a ticket as I got the errors.
comment:6 Changed 11 years ago by
Replying to falkb:
Could you tell me an action how to trigger that one for a test? I'm not sure what I changed in a ticket as I got the errors.
Owner, dependencies, estimated or actual time.
def _affectsSchedule(self, ticket, old_values): # If any of the changed values are schedule related for f in old_values: if f in self.scheduleFields: return True # If the status changed, check for closing or reopening # tickets and making goals active or inactive. if 'status' in old_values: # If the ticket changed status in or out of closed, reschedule if (old_values['status'] == 'closed' \ or ticket['status'] == 'closed'): return True # If the ticket is a goal and changes status in or out of # active, reschedule if ticket['type'] == self.pm.goalTicketType: wasActive = old_values['status'] in self.pm.activeGoalStatuses isActive = ticket['status'] in self.pm.activeGoalStatuses if (wasActive and not isActive) \ or (isActive and not wasActive): return True return False
where
# Built-in fields that can affect scheduling self.scheduleFields = ['owner', 'priority', 'milestone'] # Configurable fields from plugins that can affect scheduling for f in ['estimate', 'start', 'finish', 'pred', 'succ', 'parent']: # If the user configured this field if f in self.pm.fields: # Get the name of the configured field self.scheduleFields.append(self.pm.fields[f]) # FIXME - Make this configurable self.scheduleFields.append('parents') self.scheduleFields.append('blocking') self.scheduleFields.append('blockedby')
comment:7 Changed 11 years ago by
I've update to the latest SVN revision, changed owner, estimated hours and milestone back an forth several times but cannot reproduce errors anymore.
comment:8 follow-up: 9 Changed 11 years ago by
executemany()
has different (perhaps buggy) behavior vs. .execute()
when no results are found.
@@ -919,13 +919,17 @@ class TracPM(Component): ids = ticketsByID.keys() # Get dates from precomputed schedule, if any. - inClause = "IN (%s)" % ','.join(('%s',) * len(ids)) - cursor.execute("SELECT ticket, start, finish" + - " FROM schedule WHERE ticket " + - inClause, - ids) + query = "SELECT ticket, start, finish" + \ + " FROM schedule WHERE ticket IN (%s)" + values = [] + for tid in ids: + values.append([ tid ]) + cursor.executemany(query, values) + self.env.log.debug('rowcount:%s' % cursor.rowcount) for row in cursor: tid, start, finish = row + self.env.log.debug('row:%s, tid:%s, start:%s, finish:%s' % + (row, tid, start, finish)) ticketsByID[tid]['_sched_start'] = start ticketsByID[tid]['_sched_finish'] = finish
when none of the IDs are in the schedule table gives a key error; None
can't be used to index ticketsByID
. The log shows:
2013-08-30 08:50:07,595 Trac[tracpm] DEBUG: rowcount:10 2013-08-30 08:50:07,595 Trac[tracpm] DEBUG: row:(None, None, None), tid:None, start:None, finish:None
comment:9 follow-up: 10 Changed 11 years ago by
Description: | modified (diff) |
---|
Replying to ChrisNelson:
executemany()
has different (perhaps buggy) behavior vs..execute()
when no results are found.
No. executemany()
is needed instead of inserting multiple rows feature. We have no need to change the select statement.
comment:10 follow-up: 11 Changed 11 years ago by
Replying to jun66j5:
Replying to ChrisNelson:
executemany()
has different (perhaps buggy) behavior vs..execute()
when no results are found.No.
executemany()
is needed instead of inserting multiple rows feature. We have no need to change the select statement.
falkb was having SELECT ... WHERE ticket in (...long list of ids...)
fail in SQLite with too many arguments so I believe that the SELECT
does need to be changed.
comment:11 follow-up: 13 Changed 11 years ago by
Replying to ChrisNelson:
falkb was having
SELECT ... WHERE ticket in (...long list of ids...)
fail in SQLite with too many arguments so I believe that theSELECT
does need to be changed.
Hmmm, that's strange. According to http://www.sqlite.org/limits.html, no limitation about IN (...)
.
Also, SQLite can works with 10,000 of %s
.
>>> db = env.get_read_db() >>> cursor = db.cursor() >>> cursor.execute('SELECT COUNT(*) FROM ticket') <trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c> >>> [row[0] for row in cursor] [1812] # 1,812 tickets >>> query = 'SELECT id FROM ticket WHERE id IN (%s)' % ','.join(('%s',) * 10000) >>> cursor.execute(query, [_ for _ in xrange(10000)]) <trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c> >>> len([row[0] for row in cursor]) 1812
comment:12 Changed 11 years ago by
Ah, executemany()
cannot be executed with SELECT statement on SQLite.
Python 2.5.6 (r256:88840, Feb 29 2012, 04:03:24) [GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import sqlite3 >>> sqlite3.sqlite_version_info (3, 3, 6) >>> from trac.env import open_environment >>> env = open_environment('/var/trac/1.0') >>> db = env.get_read_db() >>> cursor = db.cursor() >>> cursor.executemany('SELECT * FROM ticket WHERE id=%s', [(id,) for id in xrange(10000)]) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/jun66j5/src/trac.git/trac/db/util.py", line 77, in executemany args) File "/home/jun66j5/src/trac.git/trac/db/sqlite_backend.py", line 62, in executemany args) File "/home/jun66j5/src/trac.git/trac/db/sqlite_backend.py", line 48, in _rollback_on_error return function(self, *args, **kwargs) sqlite3.ProgrammingError: You cannot execute SELECT statements in executemany().
comment:13 Changed 11 years ago by
Replying to jun66j5:
Replying to ChrisNelson:
falkb was having
SELECT ... WHERE ticket in (...long list of ids...)
fail in SQLite with too many arguments so I believe that theSELECT
does need to be changed.Hmmm, that's strange. According to http://www.sqlite.org/limits.html, no limitation about
IN (...)
.Also, SQLite can works with 10,000 of
%s
.>>> db = env.get_read_db() >>> cursor = db.cursor() >>> cursor.execute('SELECT COUNT(*) FROM ticket') <trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c> >>> [row[0] for row in cursor] [1812] # 1,812 tickets >>> query = 'SELECT id FROM ticket WHERE id IN (%s)' % ','.join(('%s',) * 10000) >>> cursor.execute(query, [_ for _ in xrange(10000)]) <trac.db.sqlite_backend.EagerCursor object at 0x8c4a96c> >>> len([row[0] for row in cursor]) 1812
OK. But why did r13369 fix his problem? <shrug>
comment:14 follow-up: 17 Changed 11 years ago by
I don't know what to make of comment:82:ticket:9648 and r13369; if falkb is using SQLite and executemany()
fixed his problem but executemany()
isn't allows with SELECT
in SQLite, I guess I have to conclued that falkb didn't trigger the new code in his test. Perhaps he hadn't reenabled the background rescheduler.
comment:15 Changed 11 years ago by
I just read comment:82:ticket:9648 and reproduced "too many SQL variables" issue with Python 2.6 and SQLite 3.6.20. Max number of placeholders is 999. See http://stackoverflow.com/questions/7106016/too-many-sql-variables-error-in-django-witih-sqlite3.
>>> cursor.execute('SELECT * FROM ticket WHERE id IN (%s)' % ','.join(('%s',)*1000), [_ for _ in xrange(1000)]) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "trac/db/util.py", line 65, in execute return self.cursor.execute(sql_escape_percent(sql), args) File "trac/db/sqlite_backend.py", line 78, in execute result = PyFormatCursor.execute(self, *args) File "trac/db/sqlite_backend.py", line 56, in execute args or []) File "trac/db/sqlite_backend.py", line 48, in _rollback_on_error return function(self, *args, **kwargs) sqlite3.OperationalError: too many SQL variables >>> cursor.execute('SELECT * FROM ticket WHERE id IN (%s)' % ','.join(('%s',)*999), [_ for _ in xrange(999)]) <trac.db.sqlite_backend.EagerCursor object at 0x193cc00>
And using executemany()
with SELECT and N of values, it executes the query N times and probably slow. I proposal that executing query per 999 values like this (incomplete patch and untested).
-
tracjsgantt/tracpm.py
927 927 ids = ticketsByID.keys() 928 928 929 929 # Get dates from precomputed schedule, if any. 930 inClause = "IN (%s)" % ','.join(('%s',) * len(ids)) 931 cursor.execute("SELECT ticket, start, finish" + 932 " FROM schedule WHERE ticket " + 933 inClause, 934 ids) 935 for row in cursor: 930 for row in _fetch_by_values(db, 931 'SELECT ticket, start, finish ' 932 'FROM schedule', 933 'ticket', ids): 936 934 tid, start, finish = row 937 935 ticketsByID[tid]['_sched_start'] = start 938 936 ticketsByID[tid]['_sched_finish'] = finish … … 2955 2953 2956 2954 def ticket_deleted(self, ticket): 2957 2955 self.rescheduleTickets(ticket, {}) 2956 2957 def _fetch_by_values(self, db, select, column, values): 2958 cursor = db.cursor() 2959 for idx in xrange(0, len(values), 100): 2960 args = values[idx:idx + 100] 2961 query = select + ' WHERE %s IN (%s)' % \ 2962 (db.quote(column), ','.join(('%s',) * len(args))) 2963 cursor.execute(query, args) 2964 for row in cursor: 2965 yield row
comment:16 Changed 11 years ago by
It'd be great if I could tie into the Trac core code that takes id IN (1, 2, 3, 4, 7, 10, 11, 12)
and translates it to id BETWEEN 1 and 4 OR id=7 OR id BETWEEN 10 and 12
(or whatever).
comment:17 Changed 11 years ago by
Replying to ChrisNelson:
... Perhaps he hadn't reenabled the background rescheduler. ...
comment:ticket:9648:85 shows what I've activated, still not sure if this is the right combination of all those components. I think I had all components activated once as I ran into the problems.
comment:18 follow-up: 19 Changed 11 years ago by
I'm using the TracJsGanttPlugin in combination with the Timing and Estimation and the Subtickets plugin with everything enabled and I'm getting the following error when creating a new ticket, which I think is related to this ticket:
Trac detected an internal error: ProgrammingError: You cannot execute SELECT statements in executemany().
Traceback:
Python Traceback Most recent call last: File "/usr/lib/python2.7/site-packages/trac/web/main.py", line 497, in _dispatch_request File "/usr/lib/python2.7/site-packages/trac/web/main.py", line 214, in dispatch File "/usr/lib/python2.7/site-packages/trac/ticket/web_ui.py", line 180, in process_request File "/usr/lib/python2.7/site-packages/trac/ticket/web_ui.py", line 465, in _process_newticket_request File "/usr/lib/python2.7/site-packages/trac/ticket/web_ui.py", line 1289, in _do_create File "/usr/lib/python2.7/site-packages/trac/ticket/model.py", line 256, in insert File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2950, in ticket_created File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2745, in rescheduleTickets File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2447, in _findAffected File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2406, in more File "build/bdist.linux-x86_64/egg/tracjsgantt/tracpm.py", line 2326, in ownersOf File "/usr/lib/python2.7/site-packages/trac/db/util.py", line 85, in executemany File "/usr/lib/python2.7/site-packages/trac/db/sqlite_backend.py", line 62, in executemany File "/usr/lib/python2.7/site-packages/trac/db/sqlite_backend.py", line 48, in _rollback_on_error
comment:19 Changed 11 years ago by
Replying to lotte@…:
I'm using the TracJsGanttPlugin in combination with the Timing and Estimation and the Subtickets plugin with everything enabled and I'm getting the following error when creating a new ticket, which I think is related to this ticket: ...
I Like to confirm that I'm able to create tickets without any errors when I disable the TicketRescheduler
It's not clear to me how to leverage Trac's
%s
handling which properly quotes strings field values in a DSMS-specific way. I have:How do I get that
closed
where it belongs?