Opened 5 years ago

Last modified 5 years ago

#6960 new enhancement

Ignore elements for export by marker in xhtml

Reported by: pnzhdin@…
Owned by: abompard
Priority: normal
Component: OdtExportPlugin
Severity: normal
Keywords:
Cc:
Trac Release: 0.11

Description

This idea was first described in ticket 6953.

It would be useful if HTML elements carrying a special marker attribute were skipped during export.

For example:

{{{
#!html
<p>
Hello <span odtexportignore>burn in hell</span>!
</p>
}}}

The resulting ODT file would contain only the paragraph "Hello !".

Attachments (0)

Change History (2)

comment:1 Changed 5 years ago by pnzhdin@…

I think it could be something like this:

#...
html = re.sub(r'<([\w]+?)\s.*?\s*odtexportignore\s*>.*?</\1>', '', html)
#...

It will work as long as no other <tag> elements are nested inside <tag></tag>. For the general case the regexp would be very interesting... :)
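
To make the limitation concrete, here is a small sketch in Python 3 that applies the same pattern to a simple case and to a case where an element with the same tag name is nested inside the marked one. The helper name `strip_ignored_regex` and the sample strings are made up for illustration and are not part of the plugin.

```python
import re

# Same pattern as in the snippet above, compiled once.
PATTERN = re.compile(r'<([\w]+?)\s.*?\s*odtexportignore\s*>.*?</\1>')

def strip_ignored_regex(html):
    """Hypothetical helper: drop elements marked with odtexportignore."""
    return PATTERN.sub('', html)

# Simple case: the marked <span> contains no other <span>.
simple = '<p>Hello <span odtexportignore>burn in hell</span>!</p>'

# Nested case: the marked <span> contains another <span>.
nested = '<p>Hello <span odtexportignore>a <span>b</span> c</span>!</p>'

simple_result = strip_ignored_regex(simple)
nested_result = strip_ignored_regex(nested)

print(simple_result)  # the marked element is removed completely
print(nested_result)  # the match stops at the first </span>, so " c</span>" is left behind
```

In the nested case the lazy `.*?` ends at the first closing tag with a matching name, which is why part of the ignored content and a stray closing tag survive.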

comment:2 Changed 5 years ago by pnzhdin@…

Another concept:

try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2

def strip(html, ignoremarker):
    """Return html without the elements that carry the ignoremarker attribute."""
    class StrippingParser(HTMLParser):
        def __init__(self, ignoremarker):
            HTMLParser.__init__(self)
            self.stack = []        # output fragments that are kept
            self.openignores = []  # tags currently open inside an ignored element
            self.ignoremarker = ignoremarker

        def handle_data(self, data):
            if not self.openignores:
                self.stack.append(data)

        def handle_starttag(self, tag, attrs):
            # Track every tag opened inside an ignored element, so that a nested
            # end tag cannot close the ignored region too early.
            if self.openignores or self.ignoremarker in dict(attrs):
                self.openignores.append(tag)
            else:
                self.stack.append('<%s%s>' % (tag, self.__html_attrs(attrs)))

        def handle_startendtag(self, tag, attrs):
            if not self.openignores and \
               not (self.ignoremarker in dict(attrs)):
                self.stack.append('<%s%s/>' % (tag, self.__html_attrs(attrs)))

        def handle_endtag(self, tag):
            if not self.openignores:
                self.stack.append('</%s>' % (tag))
            elif tag in self.openignores:
                # Pop up to and including the matching start tag; this also
                # discards unclosed tags such as <br> inside the ignored element.
                while self.openignores.pop() != tag:
                    pass

        def __html_attrs(self, attrs):
            # attrs is a list of (name, value) pairs; value is None for bare attributes.
            _attrs = ''
            if attrs:
                _attrs = ' ' + ' '.join(
                    [k if v is None else '%s="%s"' % (k, v) for k, v in attrs])
            return _attrs

    stripparser = StrippingParser(ignoremarker)
    stripparser.feed(html)
    stripparser.close()
    return ''.join(stripparser.stack)

teststr = """
<html>
    <body>
        Hello<div odtexportignore> ,
                <span odtexportignore>burn in hell</span>
            </div> and be happy!
    </body>
</html>
"""

print(strip(teststr, "odtexportignore"))

I think this will work on any valid HTML.

P.S. I got some code from http://unethicalblogger.com/node/180
