#8266 closed defect (fixed)
If acronym contains a hyphen, it is not linked to correct page
Reported by: | Ryan J Ollos | Owned by: | Ryan J Ollos |
---|---|---|---|
Priority: | normal | Component: | AcronymsPlugin |
Severity: | normal | Keywords: | unicode |
Cc: | Steffen Hoffmann, morpheus.me@… | Trac Release: | 0.11 |
Description
As described in comment:3#857, the following example:
||XXX || XXX Page || XXXPage || ||
||YYY-XXX|| YYY-XXX Page || YYY-XXXPage|| ||
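The symptom is consistent with a word-character pattern that stops at the hyphen. The following is only a minimal sketch: the plugin's actual expression is not quoted in this ticket, so `WORD_ONLY`, `NON_SPACE` and `first_token` are hypothetical names used for illustration.

```python
import re

# Hypothetical patterns for illustration; not the plugin's actual expressions.
WORD_ONLY = re.compile(r'\w+')   # letters, digits, underscore: stops at '-'
NON_SPACE = re.compile(r'\S+')   # any run of non-whitespace: keeps the hyphen


def first_token(pattern, text):
    """Return the text matched at the start of `text`, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


if __name__ == '__main__':
    for sample in ('XXX Page', 'YYY-XXX Page'):
        print(sample, '|', first_token(WORD_ONLY, sample),
              '|', first_token(NON_SPACE, sample))
```

With a pattern like `WORD_ONLY`, only the part of `YYY-XXX` before the hyphen is captured, so a lookup keyed on the full acronym would not find its page.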
Attachments (1)
Change History (9)
Changed 14 years ago by
Attachment: | HyphenExample.png added |
---|
comment:1 Changed 14 years ago by
Owner: | changed from Alec Thomas to Ryan J Ollos |
---|---|
Status: | new → assigned |
comment:2 Changed 14 years ago by
comment:3 follow-up: 5 Changed 14 years ago by
Cc: | Steffen Hoffmann added; anonymous removed |
---|
I've added the UNICODE flag so that acronyms with unicode characters classified as alphanumeric will be matched. An alternative would be to set the LOCALE flag, in which case characters classified as alphanumeric in the environment's locale would be matched. I'm not sure which is better.
hasienda, is this something you'd like to test out, since you have done a lot of work with locales?
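To illustrate what the UNICODE flag changes, here is a small sketch written for Python 3, where `str` patterns are Unicode-aware by default and `re.ASCII` restores the narrower ASCII-only behaviour. The pattern and the names `UNICODE_WORD`, `ASCII_WORD` and `is_word` are assumptions for illustration, not the plugin's code. Note that in current Python 3 releases `re.LOCALE` is only accepted for bytes patterns, so it is not shown here.

```python
import re

# Assumed pattern for illustration only.
UNICODE_WORD = re.compile(r'^\w+$')            # Unicode-aware (Python 3 default for str)
ASCII_WORD = re.compile(r'^\w+$', re.ASCII)    # \w limited to [a-zA-Z0-9_]


def is_word(pattern, text):
    """True if the whole of `text` consists of word characters."""
    return pattern.match(text) is not None


if __name__ == '__main__':
    for sample in ('XXX', 'Gr\u00f6\u00dfe'):
        print(sample, is_word(UNICODE_WORD, sample), is_word(ASCII_WORD, sample))
```

An acronym containing umlauts is accepted only by the Unicode-aware pattern; plain ASCII acronyms are accepted by both.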
comment:4 Changed 14 years ago by
comment:5 Changed 14 years ago by
Keywords: | unicode added |
---|
comment:6 Changed 14 years ago by
Cc: | morpheus.me@… added |
---|
Received a hint about this in #5938, and will submit a fix shortly.
comment:7 Changed 14 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:8 Changed 14 years ago by
I'm re-iterating through issues for this plugin now while preparing for an upcoming Trac application.
The regexp change successfully solved another issue for me: acronyms with Unicode characters like German umlauts. Before coming to this ticket I had run my own experiments on this matter. The results were rather confusing to me: the re.U flag with the r'^\w+$' expression didn't produce the expected matches:
Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
[GCC 4.4.5] on linux2
>>> import re
>>> RE = re.compile(r'^\w+$', re.U)
>>> RE.match('ä')
>>> RE.match('ö')
>>> RE.match('ü')
<_sre.SRE_Match object at 0xb7359d08>
>>> RE.match('ß')
>>> RE.match('Ä')
>>> RE.match('Ö')
>>> RE.match('Ü')
The re.L flag didn't change the matches at all, so the very general \S match is the best option I can see right now. Still, it troubles me; I may not understand these flags correctly...
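One possible explanation, offered only as an assumption about the session above: in Python 2 the literals 'ä', 'ö', ... typed in a UTF-8 terminal are byte strings, so each umlaut reaches the regex as two separate bytes rather than one character. If those bytes are classified individually, the result depends on what the second byte happens to be (for 'ü' it is 0xBC, which read as a code point is '¼'), which might be why only 'ü' matched. The Python 3 sketch below shows the two-byte encoding and that matching on decoded text succeeds for all seven characters; the helper names are hypothetical.

```python
import re

WORD = re.compile(r'^\w+$', re.UNICODE)

CHARS = ['\u00e4', '\u00f6', '\u00fc', '\u00df', '\u00c4', '\u00d6', '\u00dc']


def utf8_length(ch):
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode('utf-8'))


def as_latin1(ch):
    """How the UTF-8 bytes look if each byte is read as its own code point."""
    return ch.encode('utf-8').decode('latin-1')


def matches_as_text(ch):
    """Match against the decoded (text) form of the character."""
    return WORD.match(ch) is not None


if __name__ == '__main__':
    for ch in CHARS:
        print(ch, utf8_length(ch), repr(as_latin1(ch)), matches_as_text(ch))
```

Under this assumption, passing unicode objects (for example u'ä' in Python 2) instead of byte strings would be expected to make re.U behave as intended.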
I've traced this to the regular expression not matching XXX-YYY. We'll need to modify the regular expression: