SEAT plugin: SEarch wiki and ticket ATtachments
This plugin adds full-text search of wiki and ticket attachments (.pdf, .doc, .ppt, ...). The SEAT plugin has the following features:
- a new Attachments source is added to the search page
- an excerpt of the matching documents is presented in the result page
- any format is supported as long as there is a command line tool for plain text conversion (filter command)
Download and Source
How to install?
- Install swish-e, http://swish-e.org/
- Install filter tools for the formats you want to index (for example catdoc, catppt and pdftotext, which the trac.ini example below uses)
- Download the code from http://trac-hacks.org/wiki/SearchAttachmentsPlugin to YOUR_SOURCE_DIR
- Install the plugin
python setup.py bdist_egg
cp dist/TracSearchAttachmentsPlugin-0.1-py2.4.egg /path/to/your/env/plugins/
- The Trac 0.10.x and 0.11.x source code must be modified manually for the SEAT plugin to work. The file to modify is attachment.py; on a Fedora Linux system it is located in /usr/lib/python2.4/site-packages/trac
Comment: This is already done in Trac 0.11.4
def insert(self, filename, fileobj, size, t=None, db=None):
    ...
    ...
    try:
        # Note: `path` is an unicode string because `self.path` was one.
        # As it contains only quoted chars and numbers, we can use `ascii`
        basename = os.path.basename(path).encode('ascii')
        filename = unicode_unquote(basename)

        cursor = db.cursor()
        cursor.execute("INSERT INTO attachment "
                       "VALUES (%s,%s,%s,%s,%s,%s,%s,%s)",
                       (self.parent_type, self.parent_id, filename,
                        self.size, self.time, self.description, self.author,
                        self.ipnr))
        shutil.copyfileobj(fileobj, targetfile)
        self.filename = filename

        self.env.log.info('New attachment: %s by %s', self.title,
                          self.author)

        if handle_ta:
            db.commit()

        targetfile.close()  # << Line to add for SEAT plugin

        for listener in AttachmentModule(self.env).change_listeners:
            listener.attachment_added(self)
    finally:
        targetfile.close()
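The page does not say why the extra close() is needed. One plausible reason (an assumption) is that attachment_added listeners such as SEAT read the attachment from disk, and buffered writes may not be visible there until the file is closed. The standalone sketch below is not Trac code; the function and file names are hypothetical and only illustrate the effect with a temporary file:

```python
import os
import tempfile


def read_before_and_after_close(payload):
    """Write payload through a buffered handle and read the file back
    through a second handle, once before and once after close()."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "attachment.bin")  # hypothetical name
    targetfile = open(path, "wb")
    try:
        targetfile.write(payload)  # a small write usually stays in the buffer
        with open(path, "rb") as reader:
            before = reader.read()
        targetfile.close()  # flushes any buffered data to disk
        with open(path, "rb") as reader:
            after = reader.read()
    finally:
        targetfile.close()  # closing an already closed file is harmless
        os.remove(path)
        os.rmdir(directory)
    return before, after


before, after = read_before_and_after_close(b"hello")
print(len(before), len(after))
```

The second close() in the finally block mirrors the patched insert() above: calling close() twice on a Python file object is safe, so the added line does not conflict with the existing cleanup.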
- Use the trac-seat utility to index existing attachments:
cp YOUR_SOURCE_DIR/searchattachmentsplugin/0.10/trac-seat /path/to/your/env/index
cd /path/to/your/env/index
chmod +x trac-seat
./trac-seat /path/to/your/env meta
./trac-seat /path/to/your/env index -c
cd ..
chown -R apache:apache /path/to/your/env/index
Comment: On Trac 0.11.4 I think that should be /path/to/your/env/attachments/index
- Configure trac.ini
[components]
....
searchattachments.* = enabled

[attachment]
...
# This is the path to the swish-e command on your system
swish = /usr/local/bin/swish-e
seat = /path/to/your/env/trac-seat
# The first %s is the absolute path of the input file.
# The second %s is the absolute path of the text file generated by the command.
filter.doc = /usr/local/bin/catdoc -b "%s" > "%s"
filter.ppt = /usr/local/bin/catppt "%s" > "%s"
filter.pdf = /usr/bin/pdftotext "%s" "%s"
Comment: I have had better success omitting the -b flag to catdoc
Comment: Changed config section to [attachment] which is what's actually used in the sources (not [attachments]). This should also be corrected in the plugin's README.
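To check which filter.* entries a trac.ini fragment declares, the section can be read with Python's standard configparser. This is only a sketch: the plugin itself goes through Trac's own configuration layer, and the read_filters helper and the sample text are hypothetical. Interpolation is disabled so that the literal %s placeholders survive parsing:

```python
import configparser

# Shortened sample modeled on the [attachment] section above.
SAMPLE_INI = """
[components]
searchattachments.* = enabled

[attachment]
swish = /usr/local/bin/swish-e
filter.doc = /usr/local/bin/catdoc "%s" > "%s"
filter.pdf = /usr/bin/pdftotext "%s" "%s"
"""


def read_filters(ini_text):
    """Return {extension: command template} from the [attachment] section."""
    # interpolation=None keeps the literal %s placeholders intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    filters = {}
    for key, value in parser.items("attachment"):
        if key.startswith("filter."):
            filters[key[len("filter."):]] = value
    return filters


filters = read_filters(SAMPLE_INI)
print(sorted(filters))
```

With the default interpolation left on, configparser would reject the %s placeholders when the values are read, which is why it is switched off here.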
There is no need to declare a filter command for .txt or .text files; text files are handled natively. To index a new non-text format, add a filter.* entry that uses the appropriate command-line tool for that format:
filter.EXTENSION = path_to_EXTENSION_to_text_command -infile "%s" -outfile "%s"
- Restart the Trac server
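The filter.* convention (first %s is the input file, second %s is the generated text file) can be tried outside Trac before a new entry goes into trac.ini. The sketch below is not the plugin's code: run_filter is a hypothetical helper, it assumes a POSIX shell, and cat stands in for a real converter such as catdoc or pdftotext:

```python
import os
import subprocess
import tempfile


def run_filter(template, infile, outfile):
    """Fill the two %s placeholders and run the command through the shell."""
    command = template % (infile, outfile)
    # shell=True because templates may use redirection such as > "%s".
    # Only use this with trusted templates and paths.
    subprocess.run(command, shell=True, check=True)
    with open(outfile, "r") as handle:
        return handle.read()


# Stand-in filter: `cat` plays the role of catdoc or pdftotext here.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "input.txt")
target = os.path.join(workdir, "output.txt")
with open(source, "w") as handle:
    handle.write("plain text")

extracted = run_filter('cat "%s" > "%s"', source, target)
print(extracted)
```

A template that leaves the output file empty or raises CalledProcessError here is unlikely to produce indexable text once it is configured in trac.ini.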
03/01/09 06:26:55: I downloaded the plugin, but it does not include a version for trac 0.11. There are also reports on the web that it in fact does not work with trac 0.11. Does anyone know of an update?
10/07/09 16:08:23: Well, this is an unofficial update, but as we use SEAT and are about to upgrade from 0.10 to 0.11, we have updated the source as well. If you download the attachment, it contains the source for both Trac versions.
- I did not use the patch mentioned in the comment above.
- I didn't update files in the 0.10 directory
- I changed the config section name to use 'searchattachments' consistently, both in code and docs, i.e.