Modify

Opened 5 years ago

Last modified 3 months ago

#4743 assigned enhancement

Patch: intertrac support, include external html, remote authentication

Reported by: jholg Owned by: rjollos
Priority: normal Component: IncludeMacro
Severity: normal Keywords: intertrac
Cc: Trac Release: 0.11

Description

Hi,

I finally got round to working on IncludeMacro.

Attached proposal code for IncludeMacro supports:

  • intertrac support (also requested in ticket:1196 )
  • support of actual inclusion of remote html pages, with conversion of relative or server-local absolute links to full absolute links (fixes ticket:3979).
  • the possibility to select part of the included html result by using a genshi transformer xpath expression
  • if installed, BeautifulSoup will be used to clean up messy included html. This is regularly the case for external html, but it also often happens that Trac macros/plugins break Trac's xhtml-validity.

This should also fix ticket:2474 (non-standard http port).

Regarding the 2nd point: The actual inclusion of remote sources is not possible in the original IncludeMacro, only html snippets can be used (unless render_unsafe_content is set to true).

My version therefore changes the HTMLSanitizer instantiation to

HTMLSanitizer.SAFE_TAGS|set(["html", "body"])

which allows inclusion of "full" external html. Without allowing html & body, no (x)html documents will ever be rendered if render_unsafe_content is off, only (x)html snippets. Does it hurt to allow body to be parsed? I don't think so, as already non xhtml-valid inclusion may render due to the parser not yielding events in document order. All offensive stuff inside body will be filtered out nonetheless.

I'm attaching this as a full source file as the changes to the original source are by now quite significant.

I'd love to see getting these capabilities into the official IncludeMacro. Let me know if I can help.

Holger

Attachments (5)

macros.py (15.2 KB) - added by jholg 5 years ago.
proposal includemacro code for intertrac support / external html inclusion
macros.py.patch (3.8 KB) - added by griznog 5 years ago.
Patch to enable include of authenticated websites.
macros.2.py (16.9 KB) - added by jholg 16 months ago.
Updated version of (intertrac- and external html supporting) macros.py, with basic auth and config control over HTMLSanitizer configuration
macros.3.py (18.4 KB) - added by jholg 16 months ago.
Updated version of (intertrac- and external html supporting) macros.py, with basic auth and config control over HTMLSanitizer configuration, plus optional encoding hint argument
macros.4.py (18.6 KB) - added by jholg 16 months ago.
Updated version of (intertrac- and external html supporting) macros.py, with basic auth and config control over HTMLSanitizer configuration, plus optional encoding hint argument, plus using http header encoding ("charset") information

Download all attachments as: .zip

Change History (14)

Changed 5 years ago by jholg

proposal includemacro code for intertrac support / external html inclusion

Changed 5 years ago by griznog

Patch to enable include of authenticated websites.

comment:1 Changed 5 years ago by griznog

I have attached a patch to your macro.py which enables support for authenticated websites. Although I may have broken processing the xpath argument (I don't use that so I didn't test it.)

griznog

comment:2 Changed 16 months ago by hasienda

  • Owner changed from coderanger to rjollos
  • Priority changed from high to normal
  • Summary changed from Patch: intertrac support, include external html to Patch: intertrac support, include external html, remote authentication
  • Type changed from defect to enhancement

#1196 has been closed as a duplicate of this ticket.

Other changes are on order to ensure current maintainers attention. Ultimately, let him choose priority for this feature.

comment:3 Changed 16 months ago by hasienda

#6879 refers to this ticket, because it has been created exclusively for adding remote authentication to this plugin.

comment:4 Changed 16 months ago by jholg

Just spotted the activity here so I'm attaching a newer version of my original patch.
This basically adopts the basic auth support as provided by griznog plus options to control the HTMLSanitizers' "safe tags" and "safe attributes" in the macro config section.

Note that my own use case doesn't really expand to using the auth support so this part isn't well tested (by me).

This is of course still based on an ancient 0.11 revision of the original code; I've been using this for years now in our Trac intranet.

Holger

Changed 16 months ago by jholg

Updated version of (intertrac- and external html supporting) macros.py, with basic auth and config control over HTMLSanitizer configuration

Changed 16 months ago by jholg

Updated version of (intertrac- and external html supporting) macros.py, with basic auth and config control over HTMLSanitizer configuration, plus optional encoding hint argument

comment:5 follow-up: Changed 16 months ago by jholg

And while I'm at it attachment:macros.3.py adds a parameter to manually set an encoding hint for the BeautifulSoup HTML cleanup, for situations when the default detection of the included source encoding doesn't work properly.

Changed 16 months ago by jholg

Updated version of (intertrac- and external html supporting) macros.py, with basic auth and config control over HTMLSanitizer configuration, plus optional encoding hint argument, plus using http header encoding ("charset") information

comment:6 in reply to: ↑ 5 Changed 16 months ago by jholg

Replying to jholg:

And while I'm at it attachment:macros.3.py adds a parameter to manually set an encoding hint for the BeautifulSoup HTML cleanup, for situations when the default detection of the included source encoding doesn't work properly.

To better scratch my current itch this now hands the include source http header encoding information from the content-type charset to BeautifulSoup, if no explicit from_encoding is set (attachment:macros.4.py).

comment:7 Changed 16 months ago by rjollos

I'll probably get around to reviewing this more quickly if you attach it as a unified diff. Thanks for the patch!

comment:8 in reply to: ↑ description Changed 16 months ago by jholg

Hi,

please see the ticket description - I originally attached this as a full source file as the changes were quite substantial by the time I finished the functionality I needed.

That being said I can work on a patch against the latest trunk (or released version) if there are chances that this goes into the official version, and if there's agreement on the target functionality.

In particular:

Attached proposal code for IncludeMacro supports:

I can't remember anything that might be controversial about this.

  • support of actual inclusion of remote html pages, with conversion of relative or server-local absolute links to full absolute links (fixes ticket:3979).

Note that to actually include external html without setting render_unsafe_content you need to modify the "safe tags" settings of HTMLSanitizer. More explanations below.

  • the possibility to select part of the included html result by using a genshi transformer xpath expression

I think this is a nice optional add-on and doesn't add any security-relevant concerns in itself.

  • if installed, BeautifulSoup will be used to clean up messy included html. This is regularly the case for external html, but it also often happens that Trac macros/plugins break Trac's xhtml-validity.

While this introduces an (optional) dependency on BeautifulSoup I found that more often than not included external (x)html is pretty broken and needs fixup. Also, sometimes Trac plugins break xhtml conformance. This is an easy way to mostly cover such issues.

This should also fix ticket:2474 (non-standard http port).

Probably not worth discussing.

Regarding the 2nd point: The actual inclusion of remote sources is not possible in the original IncludeMacro, only html snippets can be used (unless render_unsafe_content is set to true).

My version therefore changes the HTMLSanitizer instantiation to

HTMLSanitizer.SAFE_TAGS|set(["html", "body"])

which allows inclusion of "full" external html. Without allowing html & body, no (x)html documents will ever be rendered if render_unsafe_content is off, only (x)html snippets. Does it hurt to allow body to be parsed? I don't think so, as already non xhtml-valid inclusion may render due to the parser not yielding events in document order. All offensive stuff inside body will be filtered out nonetheless.

As mentioned above I guess this is a point that warrants discussion. As I see things there's basically 2 things that can be done if you want to include external html pages:

  1. Set render_unsafe_content to True. Undesirable as a general setting imho.
  2. Do what I propose here. I am neither a Trac nor an http expert but I fail to see any real security concerns; the original macro author unfortunately never commented on that.

Holger

comment:9 Changed 3 months ago by rjollos

  • Status changed from new to assigned

Add Comment

Modify Ticket

Action
as assigned .
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.