Opened 7 years ago

Last modified 7 years ago

#6960 new enhancement

Ignore elements for export by marker in xhtml

Reported by: entend Owned by: Aurélien Bompard
Priority: normal Component: OdtExportPlugin
Severity: normal Keywords:
Cc: Trac Release: 0.11


This idea was first described in ticket #6953.

It would be useful if HTML elements marked with a special marker attribute were not exported.

For example:

Hello <span odtexportignore>burn in hell</span>!

In the ODT file, only the paragraph "Hello !" would appear.

Attachments (0)

Change History (2)

comment:1 Changed 7 years ago by entend

I think it could be something like this:

import re
html = re.sub(r'<(\w+)\s[^>]*?odtexportignore\s*>.*?</\1>', '', html)

This works as long as the ignored <tag>...</tag> element does not contain another <tag> element. A regexp for the general case would be very interesting... :)
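To make the limitation concrete, here is a minimal sketch of a regex of this kind applied to a flat and a nested input. The helper name `strip_ignored`, the constant `IGNORE_RE`, and the `re.DOTALL` flag are assumptions for illustration and are not part of the plugin.

```python
import re

# Hypothetical pattern in the spirit of the one above: an opening tag whose
# last attribute is the marker, then the shortest text up to a closing tag
# with the same name.
IGNORE_RE = re.compile(r'<(\w+)\s[^>]*?odtexportignore\s*>.*?</\1>', re.DOTALL)

def strip_ignored(html):
    """Remove marked elements with a single regex substitution."""
    return IGNORE_RE.sub('', html)

flat = 'Hello <span odtexportignore>burn in hell</span>!'
nested = 'A<div odtexportignore>x<div>y</div>z</div>B'

print(strip_ignored(flat))    # Hello !
print(strip_ignored(nested))  # Az</div>B
```

In the nested case the lazy match stops at the first `</div>`, so the tail `z</div>` of the ignored element leaks into the output. This is why a real parser is the safer route.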

comment:2 Changed 7 years ago by entend

Another concept:

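try:
    from HTMLParser import HTMLParser   # Python 2
except ImportError:
    from html.parser import HTMLParser  # Python 3

# Elements that never have an end tag, so they must not be tracked as open.
VOID_TAGS = frozenset(['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
                       'link', 'meta', 'param', 'source', 'track', 'wbr'])

def strip(html, ignoremarker):
    """Return html without the elements that carry the ignoremarker attribute."""
    class StrippingParser(HTMLParser):
        def __init__(self, ignoremarker):
            HTMLParser.__init__(self)
            self.convert_charrefs = False  # Python 3: report entities separately
            self.stack = []        # output fragments that are kept
            self.openignores = []  # tags currently open inside an ignored element
            self.ignoremarker = ignoremarker

        def handle_data(self, data):
            if not self.openignores:
                self.stack.append(data)

        def handle_entityref(self, name):
            self.handle_data('&%s;' % name)

        def handle_charref(self, name):
            self.handle_data('&#%s;' % name)

        def handle_starttag(self, tag, attrs):
            if not self.openignores and self.ignoremarker not in dict(attrs):
                self.stack.append('<%s%s>' % (tag, self.__html_attrs(attrs)))
            elif tag not in VOID_TAGS:
                # A marked element, or any element nested inside one: track it.
                self.openignores.append(tag)

        def handle_startendtag(self, tag, attrs):
            if not self.openignores and self.ignoremarker not in dict(attrs):
                self.stack.append('<%s%s/>' % (tag, self.__html_attrs(attrs)))

        def handle_endtag(self, tag):
            if not self.openignores:
                self.stack.append('</%s>' % tag)
            elif tag in self.openignores:
                # Close the innermost matching tag and anything left open inside it.
                while self.openignores.pop() != tag:
                    pass

        def __html_attrs(self, attrs):
            # attrs is a list of (name, value) pairs; value is None for a bare attribute.
            _attrs = ''
            for k, v in attrs:
                if v is None:
                    _attrs += ' %s' % k
                else:
                    v = v.replace('&', '&amp;').replace('"', '&quot;')
                    _attrs += ' %s="%s"' % (k, v)
            return _attrs

    stripparser = StrippingParser(ignoremarker)
    stripparser.feed(html)
    stripparser.close()
    return ''.join(stripparser.stack)

teststr = """
        Hello<div odtexportignore> ,
                <span odtexportignore>burn in hell</span>
            </div> and be happy!
"""

print(strip(teststr, "odtexportignore"))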

I think this will work on any valid HTML.
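One case worth checking is an ignored element that contains another element with the same tag name. The following is a minimal sketch of a depth-counting variant, assuming Python 3's `html.parser` and well-formed XHTML-style input in which every non-void element is explicitly closed. The names `MarkerStripper`, `strip_marked`, and `VOID` are hypothetical and not part of the plugin.

```python
from html.parser import HTMLParser

# Elements without an end tag; they must not change the nesting depth.
VOID = {'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
        'link', 'meta', 'param', 'source', 'track', 'wbr'}

class MarkerStripper(HTMLParser):
    def __init__(self, marker):
        # convert_charrefs=False keeps entities such as &amp; as separate events
        super().__init__(convert_charrefs=False)
        self.marker = marker
        self.out = []
        self.depth = 0  # greater than 0 while inside an ignored element

    def handle_starttag(self, tag, attrs):
        if self.depth or self.marker in dict(attrs):
            if tag not in VOID:
                self.depth += 1
        else:
            # Re-emit the start tag exactly as it appeared in the source
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.depth and self.marker not in dict(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth:
            if tag not in VOID:
                self.depth -= 1
        else:
            self.out.append('</%s>' % tag)

    def handle_data(self, data):
        if not self.depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.depth:
            self.out.append('&%s;' % name)

    def handle_charref(self, name):
        if not self.depth:
            self.out.append('&#%s;' % name)

def strip_marked(html, marker='odtexportignore'):
    parser = MarkerStripper(marker)
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)

print(strip_marked('A<div odtexportignore>x<div>y</div>z</div>B'))  # AB
```

Counting depth instead of remembering tag names means the inner `</div>` cannot end the ignored region early. The trade-off is that unclosed non-void tags inside an ignored element would throw the counter off, hence the well-formedness assumption.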

P.S. I got some code from
