Version 14 (modified by jholg, 12 years ago)

--

SEAT plugin: SEarch wiki and ticket ATtachments

Notice: This plugin is unmaintained and available for adoption.

Description

This plugin adds full-text search of wiki and ticket attachments (.pdf, .doc, .ppt, ...). The SEAT plugin has the following features:

  • it adds a new Attachments source to the search page
  • it presents an excerpt of each matching document on the results page
  • it supports any format for which a command-line tool can convert the file to plain text (the filter command)

Bugs/Feature Requests

Existing bugs and feature requests for SearchAttachmentsPlugin are here. If you have any issues, create a new ticket.

Download and Source

Download the zipped source, check out using Subversion, or browse the source with Trac.

How to install?

  • Install the plugin
        python setup.py bdist_egg
        cp dist/TracSearchAttachmentsPlugin-0.1-py2.4.egg /path/to/your/env/plugins/
    
  • The Trac 0.10.x and 0.11.x source code must be modified manually for the SEAT plugin to work. The file to modify is attachment.py; on a Fedora Linux system it is located in /usr/lib/python2.4/site-packages/trac

Comment: This is already done for Trac 0.11.4

    def insert(self, filename, fileobj, size, t=None, db=None):
...
...
        try:
            # Note: `path` is an unicode string because `self.path` was one.
            # As it contains only quoted chars and numbers, we can use `ascii`
            basename = os.path.basename(path).encode('ascii')
            filename = unicode_unquote(basename)

            cursor = db.cursor()
            cursor.execute("INSERT INTO attachment "
                           "VALUES (%s,%s,%s,%s,%s,%s,%s,%s)",
                           (self.parent_type, self.parent_id, filename,
                            self.size, self.time, self.description, self.author,
                            self.ipnr))
            shutil.copyfileobj(fileobj, targetfile)
            self.filename = filename

            self.env.log.info('New attachment: %s by %s', self.title,
                              self.author)

            if handle_ta:
                db.commit()

            targetfile.close() # << Line to add for SEAT plugin

            for listener in AttachmentModule(self.env).change_listeners:
                listener.attachment_added(self)
        finally:
            targetfile.close()
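
The page does not say why the extra close is needed. One plausible reason is that the attachment listeners read the new file from disk, and data still sitting in the write buffer is not visible to them until the file is closed. The following standalone sketch (modern Python, no Trac involved; `read_back` and `demo` are hypothetical names) illustrates that effect under this assumption:

```python
import os
import tempfile


def read_back(path):
    """Read the file through a second, independent handle."""
    with open(path, "rb") as reader:
        return reader.read()


def demo():
    """Return what a second reader sees before and after the writer closes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        targetfile = open(path, "wb")            # buffered writer
        targetfile.write(b"attachment body")     # small write stays in the buffer
        seen_before_close = read_back(path)      # what a listener would read now
        targetfile.close()                       # flushes the buffer to disk
        seen_after_close = read_back(path)
    finally:
        os.remove(path)
    return seen_before_close, seen_after_close


if __name__ == "__main__":
    before, after = demo()
    print("before close:", before)
    print("after close: ", after)
```

Calling close() a second time in the finally clause is harmless, because closing an already closed file object does nothing.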

  • Use the trac-seat utility to index existing attachments:
        cp YOUR_SOURCE_DIR/searchattachmentsplugin/0.10/trac-seat /path/to/your/env/index
        cd /path/to/your/env/index
        chmod +x trac-seat
        ./trac-seat /path/to/your/env meta
        ./trac-seat /path/to/your/env index -c
        cd ..
        chown -R apache:apache /path/to/your/env/index
    

Comment: On Trac 0.11.4 I think that should be /path/to/your/env/attachments/index

  • Configure trac.ini
       [components]
       ....
       searchattachments.* = enabled
    
       [attachment]
       ...
       # This is the path to the swish-e command on your system
       swish = /usr/local/bin/swish-e
       seat  = /path/to/your/env/trac-seat
    
       # The first %s is the absolute path of the input file.
       # The second %s is the absolute path of the text file generated by the command.
       filter.doc = /usr/local/bin/catdoc -b "%s" > "%s"
       filter.ppt = /usr/local/bin/catppt "%s" > "%s"
       filter.pdf = /usr/bin/pdftotext "%s" "%s"
    

Comment: I have had better success omitting the -b flag to catdoc

Comment: Changed config section to [attachment] which is what's actually used in the sources (not [attachments]). This should also be corrected in the plugin's README.

There is no need to declare a filter command for .txt or .text. Text files are handled natively. To index a new non-text format, just add a filter.* entry using the appropriate command line tool for this format.

       filter.EXTENSION = path_to_EXTENSION_to_text_command -infile "%s" -outfile "%s"

  • Restart the Trac server
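
To make the filter.* convention concrete, here is a minimal sketch of how such entries could be read and expanded into a command line. It is not the plugin's own code: `load_filters` and `build_filter_command` are hypothetical names, the sample configuration only mirrors the entries shown above, and raising an error for an unknown extension is an assumption made for the illustration.

```python
import configparser
import os

# Sample fragment mirroring the [attachment] section shown above.
SAMPLE_INI = """
[attachment]
filter.doc = /usr/local/bin/catdoc "%s" > "%s"
filter.pdf = /usr/bin/pdftotext "%s" "%s"
"""

# Extensions that need no filter command (handled as plain text).
NATIVE_TEXT = {"txt", "text"}


def load_filters(ini_text):
    """Return a dict mapping a file extension to its filter command template."""
    # interpolation=None keeps the literal %s placeholders untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    filters = {}
    for key, value in parser.items("attachment"):
        if key.startswith("filter."):
            filters[key[len("filter."):].lower()] = value
    return filters


def build_filter_command(filters, src_path, txt_path):
    """Expand the template for src_path; return None for native text files."""
    ext = os.path.splitext(src_path)[1].lstrip(".").lower()
    if ext in NATIVE_TEXT:
        return None
    if ext not in filters:
        raise LookupError("no filter command configured for .%s" % ext)
    # First %s: absolute input path. Second %s: generated text file.
    return filters[ext] % (src_path, txt_path)


if __name__ == "__main__":
    filters = load_filters(SAMPLE_INI)
    print(build_filter_command(filters, "/tmp/report.pdf", "/tmp/report.txt"))
```

Because the templates may contain shell redirection (as in the catdoc entry), a command built this way would have to be run through a shell, so the paths substituted into it should be trusted or quoted.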

Unofficial Comments


I downloaded the plugin, but it does not include a version for Trac 0.11. There are also reports on the web that it does not in fact work with Trac 0.11. Does anyone know of an update?

Well, this is an unofficial update, but as we use SEAT and are about to update from 0.10 to 0.11, we have updated the source as well. If you download the attachment, it contains the source for both Trac versions.


See ticket:10219 for a patch with changes I had to apply to make SearchAttachmentsPlugin work for me (currently running on Trac 0.11.4).

Notes:

  • I did not use the patch mentioned in the comment above.
  • I didn't update files in the 0.10 directory

--jholg


Recent Changes

15177 by rjollos on 2016-01-26 18:12:25
0.2dev: Fix indentation using reindent.py
14892 by rjollos on 2015-08-28 00:22:07
Convert to datetime. Untested patch by srl100@…. Fixes #4930.
14470 by rjollos on 2015-02-27 21:50:21
Branching for Trac 1.0.

Author/Contributors

Author: deltroo
Maintainer: deltroo
Contributors: