
Version 15 (modified by jholg, 21 months ago)

Comment on patch ticket, tried to add some timeline to the few unofficial comments found here

SEAT plugin: SEarch wiki and ticket ATtachments

Description

This plugin adds full-text search of wiki and ticket attachments (.pdf, .doc, .ppt, ...). The SEAT plugin has the following features:

  • a new Attachments source on the search page
  • an excerpt of each matching document is shown on the results page
  • any format is supported as long as a command line tool can convert it to plain text (the filter command)

Bugs/Feature Requests

Existing bugs and feature requests for SearchAttachmentsPlugin are here. If you have any issues, create a new ticket.

Download and Source

Download the zipped source, check out using Subversion, or browse the source with Trac.

How to install?

  • Install the plugin
        python setup.py bdist_egg
        cp dist/TracSearchAttachmentsPlugin-0.1-py2.4.egg /path/to/your/env/plugins/
    
  • Trac 0.10.x and 0.11.x source code must be manually modified for the SEAT plugin to work. The file to modify is attachment.py; on a Fedora Linux system it is located in /usr/lib/python2.4/site-packages/trac

Comment: This is already done in Trac 0.11.4

    def insert(self, filename, fileobj, size, t=None, db=None):
...
...
        try:
            # Note: `path` is an unicode string because `self.path` was one.
            # As it contains only quoted chars and numbers, we can use `ascii`
            basename = os.path.basename(path).encode('ascii')
            filename = unicode_unquote(basename)

            cursor = db.cursor()
            cursor.execute("INSERT INTO attachment "
                           "VALUES (%s,%s,%s,%s,%s,%s,%s,%s)",
                           (self.parent_type, self.parent_id, filename,
                            self.size, self.time, self.description, self.author,
                            self.ipnr))
            shutil.copyfileobj(fileobj, targetfile)
            self.filename = filename

            self.env.log.info('New attachment: %s by %s', self.title,
                              self.author)

            if handle_ta:
                db.commit()

            targetfile.close() # << Line to add for SEAT plugin

            for listener in AttachmentModule(self.env).change_listeners:
                listener.attachment_added(self)
        finally:
            targetfile.close()
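
The page does not say why the extra close() is needed. A plausible reading is that the change listeners (such as this plugin) read the attachment from disk, and buffered data may not be on disk until the file is closed. The sketch below is not Trac code; read_all and demo_close_before_notify are hypothetical names used only to illustrate that effect with standard Python file objects, and to show that the second close() in the finally clause is harmless.

```python
import os
import tempfile


def read_all(path):
    """Read the file through a separate handle, as a listener might."""
    with open(path, "rb") as handle:
        return handle.read()


def demo_close_before_notify():
    """Return what a second reader sees before and after the writer closes."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "attachment.txt")
        target = open(path, "wb")
        try:
            # A small write normally stays in the writer's in-memory buffer.
            target.write(b"hello")
            seen_before_close = read_all(path)

            # Closing flushes the buffer, so readers now see the full content.
            target.close()
            seen_after_close = read_all(path)
        finally:
            # Closing an already closed file object is a no-op.
            target.close()
    return seen_before_close, seen_after_close


if __name__ == "__main__":
    before, after = demo_close_before_notify()
    print("before close:", before)
    print("after close:", after)
```

With default buffering, the first read typically comes back incomplete, which would explain why the file has to be closed before attachment_added is called.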

  • Use the trac-seat utility to index existing attachments:
        cp YOUR_SOURCE_DIR/searchattachmentsplugin/0.10/trac-seat /path/to/your/env/index
        cd /path/to/your/env/index
        chmod +x trac-seat
        ./trac-seat /path/to/your/env meta
        ./trac-seat /path/to/your/env index -c
        cd ..
        chown -R apache:apache /path/to/your/env/index
    

Comment: On Trac 0.11.4 I think that should be /path/to/your/env/attachments/index

  • Configure trac.ini
       [components]
       ....
       searchattachments.* = enabled
    
       [attachment]
       ...
       # This is the path to the swish-e command on your system
       swish = /usr/local/bin/swish-e
       seat  = /path/to/your/env/trac-seat
    
       # The first %s is the absolute path of the input file.
       # The second %s is the absolute path of the text file generated by the command.
       filter.doc = /usr/local/bin/catdoc -b "%s" > "%s"
       filter.ppt = /usr/local/bin/catppt "%s" > "%s"
       filter.pdf = /usr/bin/pdftotext "%s" "%s"
    

Comment: I have had better success omitting the -b flag to catdoc

Comment: Changed config section to [attachment] which is what's actually used in the sources (not [attachments]). This should also be corrected in the plugin's README.
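
As a quick sanity check of the section name and filter entries, the settings can be read back with Python's standard configparser. This is only a sketch: Trac reads trac.ini through its own configuration API, and SAMPLE_INI and load_filters are hypothetical names introduced here. Interpolation is switched off because the filter values contain literal %s placeholders.

```python
import configparser

# Hypothetical sample mirroring the settings shown above.
SAMPLE_INI = """\
[components]
searchattachments.* = enabled

[attachment]
swish = /usr/local/bin/swish-e
seat = /path/to/your/env/trac-seat
filter.doc = /usr/local/bin/catdoc -b "%s" > "%s"
filter.ppt = /usr/local/bin/catppt "%s" > "%s"
filter.pdf = /usr/bin/pdftotext "%s" "%s"
"""


def load_filters(ini_text, section="attachment"):
    """Return a dict mapping file extension to its filter command template."""
    # interpolation=None keeps the literal "%s" placeholders untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    prefix = "filter."
    filters = {}
    for key, value in parser.items(section):
        if key.startswith(prefix):
            filters[key[len(prefix):]] = value
    return filters


if __name__ == "__main__":
    for extension, command in sorted(load_filters(SAMPLE_INI).items()):
        print(extension, "->", command)
```

If the section were misspelled as [attachments], parser.items("attachment") would raise configparser.NoSectionError, which makes the typo easy to spot.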

There is no need to declare a filter command for .txt or .text. Text files are handled natively. To index a new non-text format, just add a filter.* entry using the appropriate command line tool for this format.

       filter.EXTENSION = path_to_EXTENSION_to_text_command -infile "%s" -outfile "%s"

  • Restart the Trac server
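
To make the two %s placeholders in a filter entry concrete, here is a minimal sketch of how such a template could be expanded and run. It assumes plain Python %-formatting and a shell invocation (the redirection in the catdoc and catppt entries needs a shell); the plugin's actual implementation may differ, and build_filter_command and run_filter are hypothetical names.

```python
import subprocess


def build_filter_command(template, source_path, text_path):
    """Fill the first %s with the input file and the second with the output file."""
    return template % (source_path, text_path)


def run_filter(template, source_path, text_path):
    """Run the expanded command through the shell and return its exit status."""
    command = build_filter_command(template, source_path, text_path)
    # shell=True is assumed because some filter entries use '>' redirection.
    # Paths containing double quotes would break the quoting in the template.
    return subprocess.run(command, shell=True, check=False).returncode


if __name__ == "__main__":
    template = '/usr/bin/pdftotext "%s" "%s"'
    print(build_filter_command(template, "/tmp/in.pdf", "/tmp/out.txt"))
```

Expanding a new filter.* entry this way before adding it to trac.ini is a cheap way to confirm that the converter accepts the arguments in that order.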

Unofficial Comments


03/01/09 06:26:55: I downloaded the plugin, but it does not include a version for trac 0.11. There are also reports on the web that it in fact does not work with trac 0.11. Does anyone know of an update?

10/07/09 16:08:23: Well, this is an unofficial update, but as we use SEAT and are about to upgrade from 0.10 to 0.11, we have updated the source as well. If you download the attachment, it contains the source for both Trac versions.


08/02/12 16:28:09: See ticket:10219 for a patch with changes I had to apply to make SearchAttachmentsPlugin work for me (currently running on Trac 0.11.4).

Notes:

  • I did not use the patch mentioned in the comment above.
  • I didn't update files in the 0.10 directory
  • I changed the config section name to use 'searchattachments' consistently, both in code and docs, i.e. [searchattachments]

--jholg


Recent Changes

[13178] by rjollos on 2013-05-20 09:40:01
Fixes #3064: Fixed incorrect import.
[13177] by rjollos on 2013-05-20 09:38:17
Fixes #1857: Fixed typo of section name in README. Thanks to sassyn@… for spotting this.
[2229] by deltroo on 2007-05-09 17:31:22
SearchAttachmentsPlugin:

inital code of 0.10 version

Author/Contributors

Author: deltroo
Maintainer: deltroo
Contributors:

Attachments (1)
