#8266 closed defect (fixed)

If acronym contains a hyphen, it is not linked to correct page

Reported by:	Ryan J Ollos	Owned by:	Ryan J Ollos
Priority:	normal	Component:	AcronymsPlugin
Severity:	normal	Keywords:	unicode
Cc:	Steffen Hoffmann, morpheus.me@…	Trac Release:	0.11

Description

As described in comment:3#857, the following example:

||XXX    || XXX Page     || XXXPage    || ||
||YYY-XXX|| YYY-XXX Page || YYY-XXXPage|| ||

Results in the rendering shown in the attached HyphenExample.png.

Attachments (1)

HyphenExample.png (8.2 KB) - added by Ryan J Ollos 15 years ago.

Change History (9)

Changed 15 years ago by Ryan J Ollos

Attachment:	HyphenExample.png added

comment:1 Changed 15 years ago by Ryan J Ollos

Owner:	changed from Alec Thomas to Ryan J Ollos
Status:	new → assigned

comment:2 Changed 15 years ago by Ryan J Ollos

I've traced this to the regular expression not matching hyphenated acronyms such as YYY-XXX. We'll need to modify the regular expression:

valid_acronym = re.compile('^\w+$')
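A minimal sketch of the failure, written for Python 3 as an assumption (the ticket itself predates it). The sample acronyms come from the table in the description; only the pattern is taken from the line quoted above.

```python
import re

# Pattern quoted in this comment, written here as a raw string.
valid_acronym = re.compile(r'^\w+$')

# \w covers letters, digits and the underscore, so the hyphen
# in 'YYY-XXX' prevents the whole-string match.
results = {}
for candidate in ('XXX', 'YYY-XXX'):
    results[candidate] = valid_acronym.match(candidate) is not None
    print(candidate, results[candidate])
```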

comment:3 follow-up: 5 Changed 15 years ago by Ryan J Ollos

Cc:	Steffen Hoffmann added; anonymous removed

I've added the UNICODE flag so that acronyms with unicode characters classified as alphanumeric will be matched. An alternative would be to set the LOCALE flag, in which case characters classified as alphanumeric in the environment's locale would be matched. I'm not sure which is better.

hasienda, is this something you'd like to test out, since you have done a lot of work with locales?

comment:4 Changed 15 years ago by Ryan J Ollos

(In [9585]) Refs #8266:

Some minor refactoring.
Set the UNICODE flag when compiling the regular expression used to match acronyms.

comment:5 in reply to: 3 Changed 15 years ago by Steffen Hoffmann

Keywords:	unicode added

Replying to rjollos:

hasienda, is this something you'd like to test out, since you have done a lot of work with locales?

Will do and report back here; thank you for the hint.

comment:6 Changed 15 years ago by Ryan J Ollos

Cc:	morpheus.me@… added

Received a hint about this in #5938, and will submit a fix shortly.

comment:7 Changed 15 years ago by Ryan J Ollos

Resolution:	→ fixed
Status:	assigned → closed

(In [9662]) Use \S in the regular expression that extracts acronym definitions from the /wiki/acronym page. \S will match any non-whitespace character, whereas \w only matches alphanumeric characters and the underscore. Fixes #8266.
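The difference between the two character classes can be sketched as follows. The `line` value is a hypothetical definition line, not the plugin's actual input format, and the patterns are simplified stand-ins for the one changed in [9662].

```python
import re

# Hypothetical line: an acronym followed by its description.
line = 'YYY-XXX some description'

# \w+ stops at the hyphen, so only the first part is captured.
word_prefix = re.match(r'\w+', line).group(0)

# \S+ runs until the first whitespace character, hyphen included.
nonspace_prefix = re.match(r'\S+', line).group(0)

print(word_prefix)      # YYY
print(nonspace_prefix)  # YYY-XXX
```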

comment:8 Changed 15 years ago by Steffen Hoffmann

I'm re-iterating through issues for this plugin now while preparing for an upcoming Trac application.

The regexp change successfully solved another issue for me: acronyms with Unicode characters like German umlauts. Before coming to this ticket I did my own experiments on this matter. The results were rather confusing to me: the re.U flag for the r'^\w+$' expression didn't produce the expected matches:

Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40) 
[GCC 4.4.5] on linux2
>>> import re
>>> RE = re.compile(r'^\w+$', re.U)
>>> RE.match('ä')
>>> RE.match('ö')
>>> RE.match('ü')
<_sre.SRE_Match object at 0xb7359d08>
>>> RE.match('ß')
>>> RE.match('Ä')
>>> RE.match('Ö')
>>> RE.match('Ü')

The re.L flag didn't change the matches at all. So the very general \S match is the best I can see right now. Still, it troubles me that I may not understand those flags correctly...
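One possible explanation, offered as an assumption rather than anything confirmed in this ticket: in Python 2 the literals in the session above are byte strings, so on a UTF-8 terminal each umlaut reaches the regex as two bytes, and re.U classifies each byte value as if it were a code point. The sketch below simulates that in Python 3 by decoding the UTF-8 bytes as Latin-1. The helper name `as_py2_bytes_view` is hypothetical.

```python
import re

RE = re.compile(r'^\w+$', re.U)

def as_py2_bytes_view(text):
    """Return one character per UTF-8 byte of `text`.

    This mimics (by assumption) how a Python 2 byte string
    would be seen by a pattern compiled with re.U.
    """
    return text.encode('utf-8').decode('latin-1')

chars = ['ä', 'ö', 'ü', 'ß', 'Ä', 'Ö', 'Ü']

# Matching against real text strings.
as_text = {c: RE.match(c) is not None for c in chars}

# Matching against the simulated per-byte view.
as_bytes = {c: RE.match(as_py2_bytes_view(c)) is not None for c in chars}

print(as_text)
print(as_bytes)
```

Under this assumption, the second UTF-8 byte of 'ü' is 0xBC, which as a code point is the numeric character '¼', and that would account for the single match in the session above. If that reading is right, passing unicode objects such as u'ä' to the pattern in Python 2 should give the expected matches, though this has not been verified here.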
