Modify

Opened 22 months ago

Last modified 20 months ago

#10942 new enhancement

Improvements for plugin's db schema

Reported by: hasienda Owned by: hasienda
Priority: normal Component: VotePlugin
Severity: normal Keywords: schema upgrade db
Cc: Trac Release:

Description

In addition to issues with the db code itself (#10706) I feel that the db schema is flawed too. It should definitely follow Trac's resource concept more closely and store (resource) 'realm' and 'resource_id' of voted resources. This would help to avoid string functions in queries using more portable SQL instead (#4546). And while at it lets track the vote time (stamp) too, because there might be a use case mentioned in #7592.

So how about making a fresh schema version 2 part of the aforementioned db rework then? Count me in, if you need assistance, code review or in-depended testing.

Attachments (0)

Change History (16)

comment:1 Changed 22 months ago by hasienda

  • Type changed from defect to enhancement

comment:2 Changed 22 months ago by rjollos

Sounds like good ideas to me. Feel free to push ahead if you have time, and I will try to keep involved and perhaps even find time to move this forward myself.

comment:3 follow-up: Changed 21 months ago by hasienda

I'm preparing changes to the db schema now.

What I'm worried about is, that these changes are not easy to revert once done. So this is my preliminary schema v2 for discussion:

    schema = [
        Table('votes', key=('realm', 'resource_id', 'username', 'vote'))[
            Column('realm'),
            Column('resource_id'),
            Column('username'),
            Column('vote', 'int'),
            Column('time', type='int64'),
            Column('changetime', type='int64'),
            ]
        ]

Questions:

  • Does the 'vote' belong into primary key? Possible reason: Enforce one vote/user, but this is still not guaranteed now, because in theory there could be two entries, one with vote=-1 and another with vote=1.
  • Do we really need to have POSIX microseconds time stamps? It could ease queries with joins on Trac core tables for Trac>=0.12, but apart from that the extra precision seems irrelevant for this plugin.
  • Do you see the value of changetime too? I plan to not delete votes once submitted, and use a third vote value '0' instead. Therefor one would be able to see unaltered vs. changed votes. OTOH we'll still not track changes, and I see no reason to do so in the future.

Btw, despite talking about stuff mostly relevant for Trac>=0.12 here, I'll take care for backwards-compatibility down to Trac 0.11 as usual.

comment:4 Changed 21 months ago by rjollos

  • Owner changed from rjollos to hasienda

comment:5 follow-ups: Changed 21 months ago by rjollos

Here is my current thinking regarding development of the plugin:

  • Let's plan for the next version with feature enhancements to be 0.2.0, but feel free to bump the 0.1.x line if you fix defects.
  • For 0.2.0, I'm planning to tackle #4546 and #7592.
  • It might be a good time to start a changelog and back-fill it as we go.
  • There was a recent comment on the trac-users mailing list that the SqlQueryForReport.txt wasn't working, so we should take a look at that.

While reviewing the code this weekend I noticed 2 issues:

comment:6 in reply to: ↑ 3 ; follow-up: Changed 21 months ago by rjollos

Replying to hasienda:

I'm preparing changes to the db schema now.

What I'm worried about is, that these changes are not easy to revert once done. So this is my preliminary schema v2 for discussion:

The only additional column I could think of having, and I'm not sure it is even relevant, but some resources have a version. I think that we don't have to worry about this, but we could capture the version of the resource that was voted on.

Questions:

  • Does the 'vote' belong into primary key? Possible reason: Enforce one vote/user, but this is still not guaranteed now, because in theory there could be two entries, one with vote=-1 and another with vote=1.
  • Do we really need to have POSIX microseconds time stamps? It could ease queries with joins on Trac core tables for Trac>=0.12, but apart from that the extra precision seems irrelevant for this plugin.
  • Do you see the value of changetime too? I plan to not delete votes once submitted, and use a third vote value '0' instead. Therefor one would be able to see unaltered vs. changed votes. OTOH we'll still not track changes, and I see no reason to do so in the future.

I have very little experience defining schemas, so I don't think that I'm the best person to answer these questions. You may even want to bring this up on the tracdev mailing list like you did for the timestamp issue. FWIW, the schema looks good to me, and I'd leave vote out of the primary key and use microsecond timestamps!

I do like that you've added the changetime, and I think it will add some value that we might not even see yet.

You may want to keep in mind my proposal to merge the features of the FiveStarVotePlugin (#7615). I can't see that it will drive any schema changes though.

Btw, despite talking about stuff mostly relevant for Trac>=0.12 here, I'll take care for backwards-compatibility down to Trac 0.11 as usual.

It's up to you, but I'd be okay with having the 0.1.x release line be 0.11-compatible, making the schema change for 0.2.0 and dropping supporting for 0.11.x in VotePlugin 0.2.0.

Btw, feel free to set yourself as co-maintainer on the project wiki page. You are doing a lot of work here!

comment:7 in reply to: ↑ 5 Changed 21 months ago by hasienda

Replying to rjollos:

Here is my current thinking regarding development of the plugin:

  • Let's plan for the next version with feature enhancements to be 0.2.0, but feel free to bump the 0.1.x line if you fix defects.
  • For 0.2.0, I'm planning to tackle #4546 and #7592.

As I revealed to you minutes ago, there's already some code to make all that happen on my side. Anyway, glad that we agree on this without speaking about it.

  • It might be a good time to start a changelog and back-fill it as we go.

You seem to like my habit from maintenance of other plugins. So may it be. I've already thought about it too, and will introduce a changelog file with one of the upcoming changes.

  • There was a recent comment on the trac-users mailing list that the SqlQueryForReport.txt wasn't working, so we should take a look at that.

It'll have to be redefined after the schema upgrade, but luckily I've already prepared one here, so no extra work needed on your side. I'll just update that file and include it in at /contrib directory on commit time of the new db schema.

While reviewing the code this weekend I noticed 2 issues:

  • voteable_paths supports globs, but a regex is used for the path match. jun66j5 has previously noted that this might not be a good thing to do (#10226).

Didn't see that, but already had some strange feelings about that part of the code. I'll have to be reworked for making the include/exclude request happen one way or another (#7610).

Yes and no. You'll see, that these statements will even get obsoleted by moving to more appropriate resource references in the schema v2.

comment:8 in reply to: ↑ 6 ; follow-up: Changed 21 months ago by hasienda

Replying to rjollos:

Replying to hasienda:

I'm preparing changes to the db schema now.

What I'm worried about is, that these changes are not easy to revert once done. So this is my preliminary schema v2 for discussion:

The only additional column I could think of having, and I'm not sure it is even relevant, but some resources have a version. I think that we don't have to worry about this, but we could capture the version of the resource that was voted on.

This is an excellent idea. My thinking about a version column stopped after deciding, that a vote wouldn't need a version for itself. While I'm not sure, how to represent votes for previous versions in a minimalistic form right-away, this makes the move to proper resource references complete, sure. I'll implement that in my draft for schema v2 for sure.

Related question: For setting a version on schema upgrade I'd like to use the current (== latest) version, because "silent takeover" is what would happen for votes on any resource update today as well. Do you feel differently?

Note that the meaning of version differs by realm, even for the currently most relevant Trac core realm 'ticket' and 'wiki'. While a new wiki page version could totally change page content, ticket content doesn't change that much by version. It would be sensible to mostly look for changes of summary and description, when deciding on relevance for previous votes.

Questions:

  • Does the 'vote' belong into primary key? Possible reason: Enforce one vote/user, but this is still not guaranteed now, because in theory there could be two entries, one with vote=-1 and another with vote=1.
  • Do we really need to have POSIX microseconds time stamps? It could ease queries with joins on Trac core tables for Trac>=0.12, but apart from that the extra precision seems irrelevant for this plugin.
  • Do you see the value of changetime too? I plan to not delete votes once submitted, and use a third vote value '0' instead. Therefor one would be able to see unaltered vs. changed votes. OTOH we'll still not track changes, and I see no reason to do so in the future.

I have very little experience defining schemas, so I don't think that I'm the best person to answer these questions. You may even want to bring this up on the tracdev mailing list like you did for the timestamp issue. FWIW, the schema looks good to me, and I'd leave vote out of the primary key and use microsecond timestamps!

Ok.

I do like that you've added the changetime, and I think it will add some value that we might not even see yet.

I like this notion. We don't loose much, but could miss a chance for upstream development indeed.

You may want to keep in mind my proposal to merge the features of the FiveStarVotePlugin (#7615). I can't see that it will drive any schema changes though.

While I do agree on the merge plan, this might require an additional column to sanely mark the type of vote done for any give resource. Thinking some more into that direction it would even make that 'vote_type' (preliminary working title) a candidate for inclusion into the primary key for schema v2.

Btw, did I mention, that I like collaboration? It's for the extra spin gained by sharing ideas. Thanks for that again.

Btw, despite talking about stuff mostly relevant for Trac>=0.12 here, I'll take care for backwards-compatibility down to Trac 0.11 as usual.

It's up to you, but I'd be okay with having the 0.1.x release line be 0.11-compatible, making the schema change for 0.2.0 and dropping supporting for 0.11.x in VotePlugin 0.2.0.

Btw, feel free to set yourself as co-maintainer on the project wiki page. You are doing a lot of work here!

Thanks. I'll introduce myself to the wiki page later on, and I'll try out, if it's possible to keep both of us informed about new incoming tickets too.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 21 months ago by rjollos

Replying to hasienda:

Related question: For setting a version on schema upgrade I'd like to use the current (== latest) version, because "silent takeover" is what would happen for votes on any resource update today as well. Do you feel differently?

I think that probably makes sense, but ultimately it depends on how we utilize version. If we intend to record the version of the resource at which the vote was last changed (corresponding to changetime), then I think this definitely makes sense.

The alternative would be to leave the version NULL on schema upgrade. We'd have to take extra care in the code and in queries for the NULL value, but we should be doing that anyway, so it's really just keeping us honest. The advantage is that there would be no ambiguity between votes that occurred pre and post schema upgrade. So far though, I see no harm to this ambiguity.

Btw, did I mention, that I like collaboration? It's for the extra spin gained by sharing ideas. Thanks for that again.

Oh, same here. Working alone, I will learn some. By collaborating, I learn much more, and am more likely to retain what I learned! Besides that, I think we have the opportunity to develop plugins more quickly by collaborating, and just speaking for myself, I'm more likely to stay motivated and less likely to lose focus.

comment:10 in reply to: ↑ 9 Changed 21 months ago by hasienda

Replying to rjollos:

Replying to hasienda:

Related question: For setting a version on schema upgrade I'd like to use the current (== latest) version, because "silent takeover" is what would happen for votes on any resource update today as well. Do you feel differently?

I think that probably makes sense, but ultimately it depends on how we utilize version. If we intend to record the version of the resource at which the vote was last changed (corresponding to changetime), then I think this definitely makes sense.

Yes, this is what I intend to do indeed.

The alternative would be to leave the version NULL on schema upgrade. We'd have to take extra care in the code and in queries for the NULL value, but we should be doing that anyway, so it's really just keeping us honest. The advantage is that there would be no ambiguity between votes that occurred pre and post schema upgrade. So far though, I see no harm to this ambiguity.

Your argument for keeping ambiguous votes like that until next vote update sounds reasonable. I'll remove 'set to latest', that has been implemented for the upgrade script before.

comment:11 in reply to: ↑ 5 Changed 20 months ago by rjollos

Replying to rjollos:

  • voteable_paths supports globs, but a regex is used for the path match. jun66j5 has previously noted that this might not be a good thing to do (#10226).

=> #11037.

comment:12 Changed 20 months ago by rjollos

I finally had a chance to review your patches sent on March 28th, and they look good to me. I've been looking closely at the BookmarkPlugin lately, in order improve the rendering of bookmarks (#9212) and also fix some exceptions that can occur when certain resources are bookmarked. Your patches have given me a lot of suggestions on how to improve the code for BookmarkPlugin as well.

I have the following thoughts regarding the patches:

  • You alerted me to resource_exists not existing prior for Trac <= 0.11.7, which I really appreciated (comment:20:ticket:9785). I will get that issue fixed for the BookmarkPlugin.
  • I'm not sure it is applicable to VotePlugin, but I had to take a lot of care in [13029] to deal with attachments (relevant code starts at line 263 in __init__.py). A bookmark in the attachment realm may or may not have a resource id (for example, the Attachments List page /attachment/ticket/1 would not have a resource id), and it can be difficult to extract the realm, resource id, parent realm and parent id. Consider /attachment/wiki/Page/SubPage/SubSubPage/filename and /attachment/wiki/Page/SubPage/SubSubPage can both be bookmarked. Both may have an arbitrarily long parent resource id, and in the latter case there is no resource id. These complications had me considering whether it might be better to also save the parent realm and id in the table, though I'm unsure of this so far.
  • I see you've taken care to add a lot of compatibility code. I commend you for this, but honestly, when it comes to fixing the same issues with the BookmarkPlugin, I will likely drop support for Trac < 1.0 before making the changes to the DB schema.
  • Really nice that you've take care for resource renaming. I need to do the same for the BookmarkPlugin, caring for at least milestones as well. For the VotePlugin, supporting milestones is probably not so important.

I do more testing and review as you push the changes. I hope you don't mind that I CC you on similar work for the BookmarkPlugin. I see a lot of opportunity here for cross-plugin review.

comment:13 Changed 20 months ago by hasienda

(In [13079]) VotePlugin: Moving to new db schema, refs #4546 and #10942.

Major changes are

  • conversion of internal resource pointers from path to resource identifiers
  • schema version entry in Trac db table system
  • two time stamp columns for both, initial vote and last change/update

Introduce slightly reduced version of common schema upgrade code and unit tests covering possible install/upgrade scenarios. Time stamp columns are prerequisite for something like vote history. This might be available later on as core feature or using a wiki macro.

comment:14 Changed 20 months ago by hasienda

It might not look so, but this is the near-minimum of changes to get to revision-tagged db schema. Especially subtle are required changes for packaging to include upgrade code into Python packages.

Despite of this rather big table extension the discussed merge with FiveStarVotePlugin will require yet another column for vote type (working title). But I heartily agree to not hurry for that and save it for next schema revision instead, if it continues to look like a good idea after some time.

comment:15 Changed 20 months ago by hasienda

As side-effect of internally moving to resources in [13079] is, that previously vote-able realms need to be re-checked now. Today I did that for milestones and will commit required changes in a few minutes.

Thanks to Trac's default milestone names I discovered an insanity in resource_from_path during that development work. At first sight it came as a real surprise, that resource IDs could get accidentally altered, if they contain their realm name: 'milestone/milestone1' --> '1' instead of 'milestone1'.

The regular expression function re.sub() must be limited to one replacement at maximum. Accidental discovery, small change, big gain. Imagine, what nasty bug reports this could have spawn, if undetected.

comment:16 Changed 20 months ago by hasienda

(In [13093]) VotePlugin: Re-enable voting on milestones, refs #4546, #7592 and #10942.

delete_vote obviously needed to not depend on a req object to work. That went unnoticed, because it is called only by change listeners for now.

set_vote was not ready for unversioned resources, and a subtle insanity in resource_from_path surfaced during early testing with milestones too.

Method name changes are done for clarity, that a single method call probably deals with multiple vote entries.

Add Comment

Modify Ticket

Action
as new The owner will remain hasienda.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.