Opened 16 years ago

Closed 16 years ago

Last modified 15 years ago

#2927 closed enhancement (fixed)

import tool for wiki-based blogs (or documentation to help in development of one)

Reported by: nick carrasco
Owned by: John Hampton
Priority: normal
Component: FullBlogPlugin
Severity: trivial
Keywords:
Cc: John Hampton
Trac Release: 0.11

Description

I think that many users of the TracBlogPlugin and SimpleBlogPlugin may be discouraged from switching by the lack of an import tool. I personally feel that FullBlogPlugin is a much nicer tool with many features to offer. I plan on using FullBlogPlugin when I upgrade my server to 0.11, but I would also like to carry along my wiki blog posts without having to repost them all. So from my point of view, either a tool or documentation to help build a tool would be great.

Also while I've only been involved with Trac as a User, I'd be happy to help develop and test any tool. I am currently only using 0.11 as a testing system as we upgrade our website to use Trac as the main engine. So any testing won't currently affect my operations.

Attachments (5)

migrate-tracblog.py (6.4 KB) - added by John Hampton 16 years ago.
Script for migrating TracBlogPlugin to FullBlogPlugin
migrate-tracblog.2.py (6.5 KB) - added by mfisk@… 16 years ago.
With bugfixes to avoid SQL errors
migrate-tracblog3.py (6.8 KB) - added by risto.kankkunen@… 16 years ago.
Avoid 'SQL logic error or missing database'
migrate-tracblog4.py (6.8 KB) - added by risto.kankkunen@… 16 years ago.
Avoid 'SQL logic error or missing database' even with -d option
migrate-tracblog5.py (8.4 KB) - added by risto.kankkunen@… 16 years ago.
Migrate also the blog attachments

Change History (25)

comment:1 Changed 16 years ago by osimons

Thanks for positive feedback on the plugin - glad you like it!

As for importing, others have also hinted at it on IRC, but there has been no firm request before now. I haven't needed to import anything myself, which is why I have no code for it. It should be simple enough to do, and would perhaps just need to be supplemented with some properties so that all information can be copied from the wiki to full blog posts.

It won't be part of the running plugin itself anyway, so I'm fine with either attaching it to the plugin home page or adding it to a new /contrib directory inside the plugin - the latter perhaps being the best idea.

If you want to pick this up and develop a basic import tool, I'll be happy to help you out.

comment:2 in reply to:  1 Changed 16 years ago by John Hampton

Cc: John Hampton added; anonymous removed

Replying to osimons:

OK, I think I've admitted defeat wrt TracBlogPlugin. As such, I'm trying to migrate http://pacopablo.com and am working on this migration script.

The migration does look fairly straightforward, though #2956 needs resolving.

I think that putting the script in a contrib directory and then putting instructions on the wiki is the best way forward.

I'll attach the script when I'm done with it

Changed 16 years ago by John Hampton

Attachment: migrate-tracblog.py added

Script for migrating TracBlogPlugin to FullBlogPlugin

comment:3 Changed 16 years ago by John Hampton

Attached the first rev of the migration script. It worked for my site. Would like feedback regarding features.

Also, can we get this into /contrib ?

comment:4 in reply to:  3 ; Changed 16 years ago by nick carrasco

Replying to pacopablo:

First, Thanks! From what I can tell, everything worked great.

All my posts were migrated and my tags were nicely converted to categories but, as expected, are still accessible through the Tags cloud. Delete-only worked perfectly as well.

That said, for the most part my blog entries with TracBlogPlugin have only been simple Twitter-like one-line entries. I'll try to make some more interesting blog posts and test the tool again a little later.

comment:5 in reply to:  4 Changed 16 years ago by nick carrasco

Continuing ckcin:

I've been playing with some new blog entries; the only issue I've found is that attachments don't transfer. That said, I think it calls for a warning to users more than it is an actual issue.

comment:6 in reply to:  3 ; Changed 16 years ago by guyer@…

Replying to pacopablo:

Attached the first rev of the migration script. It worked for my site. Would like feedback regarding features.

When I attempt to run migrate-tracblog.py, I get

Unable to insert 2007_10_16_08.54 into the FullBlog
Unable to insert 2007_12_13_09.13 into the FullBlog

The script quits after this, without comment, even though I have a lot more blog entries.

Oddly enough, if I examine the database with sqlite3 after this, I do find

INSERT INTO "fullblog_posts" VALUES('2007_12_13_09.13' ...

so the second item is inserted, even though it says it wasn't.

If I have the exception handler in insert_blog_post() print the error, I get

cannot commit transaction - SQL statements in progress

Googling for this error message turned up a number of hits, including http://mail.python.org/pipermail/python-list/2005-May/323328.html, but that's about pysqlite 2.0.2, and I'm running 2.4.1:

>>> import trac.db.sqlite_backend as test
>>> test._ver
(3, 4, 0)
>>> test.have_pysqlite
2
>>> test.sqlite.version
'2.4.1'

Even though that message seems obsolete, on a whim I tried commenting out cnx.commit() and adding some diagnostic print statements. I can see that the script then "handles" all of my blog entries without further error, but obviously nothing actually gets inserted into the database. I'm completely ignorant wrt SQL, so I have no idea where to move the .commit() that's more appropriate. I also have no idea what a "DML statement" is, but it appears that "INSERT" is one, so that message may be a complete red herring.

I'm running Python 2.5, Trac 0.11rc2, Apache 2.2.8 under Mac OS X 10.5.3.

comment:7 in reply to:  6 ; Changed 16 years ago by John Hampton

Owner: changed from osimons to John Hampton
Status: changed from new to assigned

Replying to guyer@nist.gov:

Even though that message seems obsolete, on a whim I tried commenting out cnx.commit() and adding some diagnostic print statements. I can see that the script then "handles" all of my blog entries without further error, but obviously nothing actually gets inserted into the database. I'm completely ignorant wrt SQL, so I have no idea where to move the .commit() that's more appropriate. I also have no idea what a "DML statement" is, but it appears that "INSERT" is one, so that message may be a complete red herring.

I'm running Python 2.5, Trac 0.11rc2, Apache 2.2.8 under Mac OS X 10.5.3.

I can't really see why that statement would be causing issues. All it's doing is committing the INSERT.

You could try the following diff:

migrate-tracblog.py:

@@ -70,10 +70,13 @@
                     (name, version, title, body, epochtime(publish_time),
                     epochtime(version_time), version_comment, version_author,
                     author, categories))
-        cnx.commit()
     except:
         print("Unable to insert %s into the FullBlog" % name)
         cnx.rollback()
+        cnx.close()
+        return
+
+    cnx.commit()
     cnx.close()
 
 def Main(opts):

Should be functionally equivalent, though.

I mostly use PostgreSQL, would you be able to provide me with a sample sqlite db that fails?

comment:8 in reply to:  7 Changed 16 years ago by guyer@…

Replying to pacopablo:

I can't really see why that statement would be causing issues. All it's doing is committing the INSERT.

Agreed. I was hacking around pretty blindly.

You could try the following diff:

Now it spews

2008-06-11 14:46:09,353 Trac[migrate-tracblog] DEBUG: Error loading wiki page 2007/10/16/08.54
Traceback (most recent call last):
  File "research5/migrate-tracblog.py", line 119, in Main
    categories)
  File "research5/migrate-tracblog.py", line 79, in insert_blog_post
    cnx.commit()
OperationalError: cannot commit transaction - SQL statements in progress

to trac.log for seemingly all of my blog entries (not just the two that got reported in the original version).

For what it's worth, in both the original and patched versions, the trac.log entries end with

2008-06-11 14:46:09,449 Trac[ticket] DEBUG: SELECT * FROM (SELECT id, keywords, COALESCE(keywords, '') AS fields FROM ticket WHERE status != 'closed') s WHERE (fields LIKE %s) ORDER BY id
2008-06-11 14:46:09,450 Trac[tags] DEBUG: SELECT bp1.name, bp1.categories, bp1.version FROM fullblog_posts bp1,(SELECT name, max(version) AS ver FROM fullblog_posts GROUP BY name) bp2 WHERE bp1.version = bp2.ver AND bp1.name = bp2.name AND (bp1.categories LIKE %s) ORDER BY bp1.name

I mostly use PostgreSQL, would you be able to provide me with a sample sqlite db that fails?

I'll mail it to you privately.

Changed 16 years ago by mfisk@…

Attachment: migrate-tracblog.2.py added

With bugfixes to avoid SQL errors

comment:9 Changed 16 years ago by mfisk@…

I just attached a second version of the script with some fixes to the database code so that it actually works for me. This successfully migrated a pretty large blog for me.

comment:10 Changed 16 years ago by risto.kankkunen@…

Using Python2.4.4 and pysqlite2.3.2 in Debian Etch I got the same error message when trying to migrate blog posts:

OperationalError: SQL logic error or missing database

I could repro the problem just using pysqlite: if you loop with one cursor and then try to commit changes to the same connection inside the loop (while the original cursor is still active), you get the same error:

cur1 = con.cursor()
for r in cur1:
  cur2 = con.cursor()
  cur2.execute(...)
  con.commit() # <-- this fails

I made the following changes to get the script working:

@@ -56,11 +56,10 @@
     """ Return seconds from epoch from a datetime object """
     return int(time.mktime(t.timetuple()))

-def insert_blog_post(env, name, version, title, body, publish_time,
+def insert_blog_post(cnx, name, version, title, body, publish_time,
                      version_time, version_comment, version_author,
                      author, categories):
     """ Insert the post into the FullBlog tables """
-    cnx = env.get_db_cnx()
     cur = cnx.cursor()
     try:
         cur.execute("INSERT INTO fullblog_posts "
@@ -70,11 +69,9 @@
                     (name, version, title, body, epochtime(publish_time),
                     epochtime(version_time), version_comment, version_author,
                     author, categories))
-        cnx.commit()
-    except:
-        print("Unable to insert %s into the FullBlog" % name)
-        cnx.rollback()
-    cnx.close()
+    except Exception, e:
+        print("Unable to insert %s into the FullBlog: %s" % (name, e))
+        raise

 def Main(opts):
     """ Cross your fingers and pray """
@@ -87,6 +84,7 @@
     req = Mock(perm=MockPerm())
     blog = tags.query(req, ' '.join(tlist + ['realm:wiki']))

+    cnx = env.get_db_cnx()
     for resource, page_tags in blog:
         try:
             page = WikiPage(env, version=1, name=resource.id)
@@ -110,7 +108,8 @@
                 else:
                     title = name
                 body = fulltext
-                insert_blog_post(env, name, version, title, body,
+                print "Adding post %s, v%s: %s" % (name, version, title)
+                insert_blog_post(cnx, name, version, title, body,
                                  publish_time, version_time,
                                  version_comment, version_author, author,
                                  categories)
@@ -121,7 +120,13 @@
         except:
             env.log.debug("Error loading wiki page %s" % resource.id,
                           exc_info=True)
-            continue
+            print "Failed to add post %s, v%s: %s" % (name, version, title)
+            print "Undoing back all changes"
+            cnx.rollback()
+            return 1
+
+    cnx.commit()
+    cnx.close()
     return 0
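The restructuring in the diff above (one shared connection, a single commit after the loop, rollback on any failure) can be sketched in isolation with Python's standard sqlite3 module. This is only a minimal sketch, not the attached script: the `posts` table and the `migrate_posts`, `make_db` and `count_posts` helpers are hypothetical names, and it is written for Python 3 rather than the Python 2 used in this ticket.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical posts table."""
    cnx = sqlite3.connect(":memory:")
    cnx.execute(
        "CREATE TABLE posts ("
        "name TEXT, version INTEGER, title TEXT, "
        "PRIMARY KEY (name, version))")
    cnx.commit()
    return cnx


def count_posts(cnx):
    """Return the number of rows currently in the posts table."""
    return cnx.execute("SELECT COUNT(*) FROM posts").fetchone()[0]


def migrate_posts(cnx, posts):
    """Insert every post on one connection and commit once at the end.

    If any insert fails, roll back everything added in this call and
    return 1; return 0 on success.
    """
    cur = cnx.cursor()
    try:
        for name, version, title in posts:
            cur.execute(
                "INSERT INTO posts (name, version, title) VALUES (?, ?, ?)",
                (name, version, title))
    except sqlite3.Error as e:
        print("Failed to add post %s, v%s: %s" % (name, version, e))
        cnx.rollback()
        return 1
    # No commit happens inside the loop; this is the only one.
    cnx.commit()
    return 0
```

Because nothing is committed until every insert has succeeded, a failed run leaves the table as it was before the call, which is the all-or-nothing behaviour the patched script aims for.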

I'll attach the fixed version of the script here.

comment:11 Changed 16 years ago by risto.kankkunen@…

Bummer, I hadn't refreshed this page and hadn't noticed that mfisk@… had pretty much solved the problem the same way. I added some diagnostic output to the script, so I'll attach it anyway, but the version by mfisk@… should work as well.

Changed 16 years ago by risto.kankkunen@…

Attachment: migrate-tracblog3.py added

Avoid 'SQL logic error or missing database'

comment:12 Changed 16 years ago by risto.kankkunen@…

I just found out that neither migrate-tracblog.2.py nor my version of the script works with the -d option:

Traceback (most recent call last):
  File "migrate-tracblog.2.py", line 168, in ?
    rc = Main(options)
  File "migrate-tracblog.2.py", line 122, in Main
    page.delete()
  File "/tmp/python-stage/Trac-0.11-py2.4.egg/trac/wiki/model.py", line 111, in delete
    db.commit()
pysqlite2.dbapi2.OperationalError: SQL logic error or missing database

I fixed the problem by collecting the matching resources into a list, thus completing the SELECT before there is an attempt to commit.

Changed 16 years ago by risto.kankkunen@…

Attachment: migrate-tracblog4.py added

Avoid 'SQL logic error or missing database' even with -d option

Changed 16 years ago by risto.kankkunen@…

Attachment: migrate-tracblog5.py added

Migrate also the blog attachments

comment:13 Changed 16 years ago by anonymous

Priority: changed from lowest to normal

comment:14 in reply to:  10 ; Changed 16 years ago by John Hampton

Replying to risto.kankkunen@iki.fi:

I could repro the problem just using pysqlite: if you loop with one cursor and then try to commit changes to the same connection inside the loop (while the original cursor is still active), you get the same error:

OK, I understand this; however, I don't see how my original script would be doing this, as I am opening and closing a connection every time I migrate a page. I guess this could possibly be related to the trac connection pooling. Perhaps it doesn't close the cursor, and when I call get_db_cnx() I am getting back a connection with an already open cursor.

I'll commit the latest version of the migration script that you provided as it does work. However, I would really like to understand why my version doesn't work with SQLite.

comment:15 Changed 16 years ago by John Hampton

Resolution: fixed
Status: changed from assigned to closed

(In [4062]) Added migrate-tracblog.py script. Closes #2927

comment:16 in reply to:  15 Changed 16 years ago by guyer@…

Replying to pacopablo:

(In [4062]) Added migrate-tracblog.py script. Closes #2927

This script resolves the problems I was having earlier. Thanks.

comment:17 in reply to:  14 Changed 15 years ago by anonymous

Replying to pacopablo:

OK, I understand this, however, I don't see how my original script would be doing this, as I am opening and closing a connection every time I migrate a page. I guess this could possibly be related to the trac connection pooling.

Sorry for not responding earlier... I'm replying based on my vague recollection, but I think the problem was that tags.query() returns a generator that is actually an open cursor. And because of the connection pooling your cnx refers to the same connection the tags module got. The same phenomenon happened with the WikiPage class.
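The effect described here, two callers in one thread ending up with the same underlying connection, can be illustrated with a toy per-thread pool. This is an assumption-laden sketch and not Trac's actual pool code: `PerThreadPool` and `get_cnx` are hypothetical names, and it uses Python 3 with the standard sqlite3 module.

```python
import sqlite3
import threading


class PerThreadPool:
    """Hypothetical pool that hands each thread one reusable connection."""

    def __init__(self, path):
        self._path = path
        self._local = threading.local()

    def get_cnx(self):
        # Reuse this thread's connection if it already has one.
        cnx = getattr(self._local, "cnx", None)
        if cnx is None:
            cnx = sqlite3.connect(self._path)
            self._local.cnx = cnx
        return cnx


pool = PerThreadPool(":memory:")

# Two "independent" requests from the same thread...
reader = pool.get_cnx()
writer = pool.get_cnx()
# ...turn out to be the very same connection object.
same_in_thread = reader is writer

# A different thread gets a different connection.
results = []


def worker():
    results.append(pool.get_cnx() is reader)


t = threading.Thread(target=worker)
t.start()
t.join()
```

Under a pool like this, code that believes it opened a fresh connection would share any unfinished SELECT that another component in the same thread still has open, which would match the recollection above.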

comment:18 Changed 15 years ago by osimons

Hmm. Interesting. I haven't been involved in this at all, but that is an interesting topic with 2 open cursors - one an open generator and the other updates/changes data. According to the pool code, as this executes in the same thread, you will get the same connection returned for both.

I suppose the behaviour for each db type will be different depending on what amount of caching is done in cursor, and what it needs to keep fetching from data as it iterates - and that changes while iterating. Postgres, SQLite and others may all give different behaviour.

Does it work if you consume the generator first, and then work on the data? Try wrapping the use of the generator in a list(), like list(generator_func(stuff)).
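A minimal sketch of that suggestion, assuming Python 3 and the standard sqlite3 module: `tagged_pages` is a hypothetical stand-in for a query function that yields rows lazily from an open cursor, and the `wiki` and `blog` tables are invented for the illustration.

```python
import sqlite3


def tagged_pages(cnx):
    """Hypothetical lazy query: yields names while its cursor is still open."""
    cur = cnx.cursor()
    cur.execute("SELECT name FROM wiki ORDER BY name")
    for (name,) in cur:
        yield name


cnx = sqlite3.connect(":memory:")
cnx.execute("CREATE TABLE wiki (name TEXT)")
cnx.execute("CREATE TABLE blog (name TEXT)")
cnx.executemany(
    "INSERT INTO wiki VALUES (?)",
    [("2007/10/16/08.54",), ("2007/12/13/09.13",)])
cnx.commit()

# Drain the generator first, so the SELECT has finished before any write.
pages = list(tagged_pages(cnx))

for name in pages:
    cnx.execute("INSERT INTO blog VALUES (?)", (name,))
    cnx.execute("DELETE FROM wiki WHERE name = ?", (name,))
    cnx.commit()  # committing per page, with no read cursor still open

migrated = [row[0] for row in cnx.execute("SELECT name FROM blog ORDER BY name")]
remaining = cnx.execute("SELECT COUNT(*) FROM wiki").fetchone()[0]
```

The trade-off is memory: the whole result set is held in a list, which is unlikely to matter for a one-off migration of blog posts.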

comment:19 in reply to:  18 Changed 15 years ago by anonymous

Replying to osimons:

Does it work if you consume the generator first, and then work on the data? Try wrapping the use of the generator in a list(), like list(generator_func(stuff)).

That is exactly the change I made between migrate-tracblog3.py and migrate-tracblog4.py.

comment:20 Changed 15 years ago by risto.kankkunen@…

Sorry for writing anonymously by accident, it was me who wrote comment:17 and comment:19.
