Extension of Search to linked pages, documents or attachments
|Reported by:||mem||Owned by:||anybody|
|Severity:||normal||Keywords:||Search Links Recursive|
Extended search function
It would be exceptionally useful to extend the search so that it looks inside attachments and linked pages, with a configurable search depth. By this I mean that the search would allow for
- a link (depth 1);
- a link within a link (depth 2);
- a link within a link within a link (depth 3);
to be indexed within the search.
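The depth levels listed above could be collected with a breadth-first walk over links. The following is only a rough sketch of that idea, not part of Trac or of any existing hack: `collect_links`, `LinkExtractor` and the `fetch` callable are hypothetical names, and `fetch` is assumed to return the HTML text of a URL (or `None` for anything that cannot be parsed for links, such as a PDF).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> tags found in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_links(start_url, fetch, max_depth):
    """Return {url: depth} for every URL reachable within max_depth links.

    `fetch` is a hypothetical callable: it takes a URL and returns the
    page's HTML as a string, or None if the target has no links to parse.
    The starting page itself is recorded at depth 0.
    """
    seen = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth >= max_depth:
            continue  # targets at the maximum depth are kept but not followed
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen[target] = depth + 1
                queue.append(target)
    return seen
```

With `max_depth=2`, a wiki page that links to an external page, which in turn links to a PDF, would yield the external page at depth 1 and the PDF at depth 2; anything linked from deeper pages would be left out.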
For example, suppose I link to a page outside my Trac setup, such as the Federal Reserve site, and that page in turn contains links, including one to a PDF file such as http://www.federalreserve.gov/pubs/bulletin/2010/pdf/legalq409.pdf. I could then search for "FDIC" and the search would turn up this paper.
If the external search depth were 2, the search function would follow external links down two levels and index the text "FDIC" within the PDF document linked above. The Unix program lynx is able to recursively locate links to a specified depth. The hack would have to do something similar and then index the pages and files (of allowed types) so that they are included in the search. It would probably be most efficient to detect whether a link target has changed (either by its date or by a hash of its data) and only index it if it is new or changed.
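The hash-based change check could look roughly like the sketch below. It is an illustration under assumptions rather than a proposed implementation: `content_digest` and `needs_reindex` are hypothetical names, SHA-256 is just one reasonable choice of hash, and the `known_digests` dictionary stands in for whatever persistent store the hack would actually use.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a hex SHA-256 digest of the fetched page or file contents."""
    return hashlib.sha256(data).hexdigest()


def needs_reindex(url, data, known_digests):
    """Return True if `data` for `url` is new or differs from the last visit.

    `known_digests` maps URL -> digest of the content indexed last time.
    It is updated in place whenever a new or changed target is seen.
    """
    digest = content_digest(data)
    if known_digests.get(url) == digest:
        return False  # unchanged since last indexing, skip it
    known_digests[url] = digest
    return True
```

A date-based check (for example, comparing the HTTP `Last-Modified` header) would avoid downloading unchanged files at all, but not every server reports it reliably, so a content hash is the safer fallback.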
The beauty of this is that it extends the search from just the Trac website to its neighbouring nodes on the network, so that information on adjacent sites reached by links can also be searched.
The same search could be extended to attached files too.