Ticket #2502 (closed defect: fixed)

Opened 5 years ago

Last modified 2 years ago

bad regexp.

Reported by: anonymous Assigned to: rjollos
Priority: normal Component: EmoticonsPlugin
Severity: normal Keywords:
Cc: Trac Release: 0.10

Description

You don't parenthesize your pipe-separated list of tokens, meaning the \b directives that you put at the beginning and end of the regexp only apply to the tokens at either end.

What you want is \b(abc|def|ghi|jkl)\b NOT \babc|def|ghi|jkl\b

-- or better yet \b(?:abc|def|ghi|jkl)\b

(note that the list will be sorted into some arbitrary order by python based upon it's hashing algorithm, the order you specify them in the EMOTICONS map is not relevant)

Basically, two of the emotes in your map will not be rendered, but which two is unpredictable.

With the above list, only the tokens 'def' and 'ghi' are successfully matched and rendered into emotes. abc and jkl are never matched and never rendered into emotes regardless of if they are surrounded by word boundaries or are substrings.

When I tried parenthesizing the expression (thus applying the \b directive to the beginning and end of EVERY token) I found that NOTHING is successfully matched.

Removing the \b directives appears to work just fine - all tokens are matched, and only if they are not substrings (in other words, there seems to be an implicit check for word boundaries anyway).

A quick check of the python tutorial shows that \b can also be considered the sequence for the backspace character. I don't (care to) know python very well, so I don't know if that's what you've done wrong, and I wasn't interested in trying to find out (whitespace sensitivity lol) so I'll leave the investigation into that possibility to you.

Without the \b directives, there is no longer any reason to parenthesize the expression either.

In short, your regexp seems mighty dodgy. This works just fine for me:

pattern = '|'.join([re.escape(pattern) for pattern in EMOTICONS]) yield pattern, _replace

Attachments

Change History

04/30/11 08:42:15 changed by rjollos

  • owner changed from cmlenz to rjollos.

(in reply to: ↑ description ) 07/09/11 11:25:42 changed by rjollos

  • status changed from new to assigned.

Replying to anonymous:

Basically, two of the emotes in your map will not be rendered, but which two is unpredictable.

You are right, defect confirmed.

07/09/11 12:05:13 changed by rjollos

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [10404]) Fixes #2502, #8723: Modified the regular expression to allow escaping and to fix issue such that two emoticons would not be matched.

07/09/11 12:06:59 changed by rjollos

(In [10405]) Refs #2502, #8723: Remove print statement accidentally left in code.


Add/Change #2502 (bad regexp.)




Change Properties
Action