Opened 4 years ago

Closed 5 months ago

# users can post or comment on trac-hacks.org in the name of any registered user

Reported by: Owned by: falkb Ryan J Ollos high AccountManagerPlugin critical Ryan J Ollos, Steffen Hoffmann, lkraav

### Description

I noticed I can comment on tickets here on trac-hacks.org as falkb without being asked for my password.

I have no problem with anonymous users posting or commenting with fictive names or email addresses. I think this is even a useful feature for people who are don't forced to register when they want to report something.

But I think if one tries to post or comment with a registered account name or registered email address, such action has to be acknowledged by a password barrier. This way one may still post with fictive names (fine with me), but is not allowed to post in the name of a known user without confirmation.

I think this is important to prevent account abuse.

### comment:1 Changed 4 years ago by Steffen Hoffmann

I agree to both, your statements and the ticket rating. This is not good at all.

### comment:2 follow-up:  3 Changed 4 years ago by falkb

Let's see if this works...

### comment:3 in reply to:  2 Changed 4 years ago by Michael Renzmann

Let's see if this works...

That was me. It obviously works. However, since the same works on teo, too I think this is a generic Trac issue rather than one specific to trac-hacks.org...

### comment:4 follow-up:  6 Changed 4 years ago by Ryan J Ollos

The issue has been known on t.e.o for a long time. I saw one instance of someone pretending to be ecarter in a ticket a long time ago. There have also been some ticket comments and page edits using usernames that are active on t-h.o, such as myself or falkb. The edits were obviously spam in those cases.

I think we can eventually implement a username check, either in AccountManagerPlugin or TracHacksPlugin.

Last edited 4 years ago by Ryan J Ollos (previous) (diff)

### comment:5 Changed 4 years ago by Steffen Hoffmann

Hm, I'm worried to see t:#1890, that exactly reflects the current state: unchanged source regarding this issue for 8 years by now and not even a comment within the last 3 years.

### comment:6 in reply to:  4 Changed 4 years ago by Steffen Hoffmann

The issue has been known on t.e.o for a long time. ...

I do remember some discussion on the issue myself, yes.

I think we can eventually implement a username check, either in AccountManagerPlugin or TracHacksPlugin.

Good point. In the same notion as we prevent changes to invalid email addresses with email validation enabled we could enforce some sort of username clearance too. I'll start to work on this right after the TagsPlugin release, unless someone beats me with an earlier patch attached to an enhancement request for AccountManagerPlugin.

### comment:7 follow-up:  8 Changed 4 years ago by Ryan J Ollos

After reading through trac:#1890, I'm left feeling that all of the proposed solutions will require quite a bit of effort, although a good solution seems to be outlined in trac:comment:53:ticket:1890 and trac:comment:94:ticket:1890.

A reasonably quick solution for trac-hacks.org would be to implement ITicketManipulator and IWikiManipulator: if req.authname == 'anonymous', then check if author matches a sid in the session table with authenticated = 1. If a match is found, reject the change and tell the user they must login in order to save a change with that username.

A minority of users may not like the requirement to login, but I think it's reasonable to require a login if you've previously registered an account.

### comment:8 in reply to:  7 ; follow-up:  9 Changed 4 years ago by Steffen Hoffmann

A reasonably quick solution for trac-hacks.org would be to implement ITicketManipulator and IWikiManipulator: if req.authname == 'anonymous', then check if author matches a sid in the session table with authenticated = 1. If a match is found, reject the change and tell the user they must login in order to save a change with that username.

I'd go one step further and disallow to even change an anonymous session id to a registered username. While this would be easier done in Trac core, we can do so inside AcctMgr too.

A minority of users may not like the requirement to login, but I think it's reasonable to require a login if you've previously registered an account.

Absolutely reasonable for the sake of clarity, yes.

### comment:9 in reply to:  8 ; follow-up:  10 Changed 4 years ago by Ryan J Ollos

I'd go one step further and disallow to even change an anonymous session id to a registered username.

So we just check if the author matches a sid in the session table and don't require authenticated = 1? I think that would work since the sids are hashes in the case of an unauthenticated session.

While this would be easier done in Trac core

If there's an easier fix to the Trac core, I think the issue would be worth pushing there. However we fix this on t-h.o, they'll probably want to fix it on t.e.o as well.

### comment:10 in reply to:  9 Changed 4 years ago by Steffen Hoffmann

I'd go one step further and disallow to even change an anonymous session id to a registered username.

So we just check if the author matches a sid in the session table and don't require authenticated = 1? I think that would work since the sids are hashes in the case of an unauthenticated session.

Yes, that's the plan, and ideally we'll do so in Trac core code for setting non-random, custom session id's, but I think that at least meanwhile we can intercept on a higher level, or at a different end, i.e. by using a specially crafted authenticator to instantly stop such ambiguous sessions from proceeding.

### Changed 4 years ago by Steffen Hoffmann

proposed changes for preventing ambiguous anonymous session IDs using AccountManagerPlugin

### comment:11 Changed 4 years ago by Steffen Hoffmann

In 13756:

AccountManagerPlugin: Prevent skipping creation of a new authenticated session ID, refs #11622.

Research done for resolving another issue suddenly revealed this issue.

### comment:12 follow-up:  20 Changed 4 years ago by Steffen Hoffmann

Proposed changes do not alter existing IDs but prevent everyone from choosing an existing authenticated session ID now and forever.

Hint: Any anonymous session ID that is identical to an authenticated session ID is deleted automatically next time an anonymous session is promoted to authenticated session ID of the same name (see Session.promote_session() in t:source:trunk/trac/web/session.py for details).

### comment:13 Changed 4 years ago by falkb

Instead of "Choose another session ID, please." I'd rather say "Choose another session ID or login, please."

### comment:14 Changed 4 years ago by falkb

even better "Login or choose another session ID, please."

### Changed 4 years ago by Steffen Hoffmann

2nd version including improved warning message according to falkb's recommendation

### comment:15 follow-up:  18 Changed 4 years ago by Steffen Hoffmann

More suggestions welcome. Or should it have been that easy? ;-)

Last edited 4 years ago by Steffen Hoffmann (previous) (diff)

### comment:16 Changed 4 years ago by falkb

fine with me if it simply prevents to post in the name of registered others. I wonder why there was so much discussion in t:#1890 over so many years.

Last edited 4 years ago by Steffen Hoffmann (previous) (diff)

### comment:18 in reply to:  15 ; follow-up:  19 Changed 4 years ago by paul@…

More suggestions welcome. Or should it have been that easy? ;-)

just do activate the patch! We'll see whether it helps or not

### comment:19 in reply to:  18 Changed 4 years ago by Steffen Hoffmann

More suggestions welcome. Or should it have been that easy? ;-)

just do activate the patch! We'll see whether it helps or not

One step at a time. We don't want to introduce bugs by careless action. Even more since discussion on the corresponding ticket for Trac core went for years by now, so I would rather hear a bit more feedback before applying the patch to AccountManagerPlugin's trunk.

Afterwards we could roll it to this Trac instance, if a) I release another minor version of current stable branch of the plugin, effectively forwarding the change to every updated plugin installation, or b) someone patches the plugin source exclusively for the installation here.

Even with broad approval there is still no solution for Trac core. Many Trac application including t.e.o do not utilize AccountManagerPlugin, that we plan to alter here. And the currently suggested approach itself is generally flawed, because it relies on checking the Trac db table session. Partially repeating comments from the other ticket here for the list of shortcomings: There is not yet a way of

• detecting authorized users, that did not log-in to Trac before
• preventing irritation by username-alike session IDs (without knowing the full user list)
• preventing choice of session IDs of accounts created in the future (likely a 'cantfix', but worth to look into for another pre-registration sanity check for AccountManager's registration procedure)

### comment:20 in reply to:  12 ; follow-ups:  21  26 Changed 4 years ago by Ryan J Ollos

Proposed changes do not alter existing IDs but prevent everyone from choosing an existing authenticated session ID now and forever.

I tested out the patch. It does prevent a user from populating the name attribute (SESSION_ATTRIBUTE.name) of an anonymous session with the sid of an authenticated user.

However, it doesn't prevent users from entering the sid of an authenticated user in the reporter field when creating tickets, in the author field when commenting on tickets and in the author field when editing the wiki. It seems that the latter is the more serious issue that needs to be addressed.

One point I did not previously understand is:

For an authenticated session, the sid (SESSION.sid) is used in the author field when changes are made (and a feature long-requested is to instead display the "full username", from SESSION_ATTRIBUTE.name: trac:#7339). However, for an anonymous session, the sid is a hash and the username is generated from SESSION_ATTRIBUTE.name and SESSION_ATTRIBUTE.email, which is then used to populate the author in a form. So the author of a change is the sid of an authenticated user, but for an unauthenticated user it has nothing to do with the sid of the unathenticated session.

Regarding the code, this line confused me a bit:

not (req.authname and req.authname != 'anonymous')


The intention is more clear to me with:

(not req.authname or req.authname == 'anonymous')

Version 2, edited 4 years ago by Ryan J Ollos (previous) (next) (diff)

### comment:21 in reply to:  20 ; follow-up:  22 Changed 4 years ago by Steffen Hoffmann

However, it doesn't prevent users from entering the sid of an authenticated user in the reporter field when creating tickets, in the author field when commenting on tickets and in the author field when editing the wiki. It seems that the latter is the more serious issue that needs to be addressed.

Sorry, I forgot about these places, because I generally do not allow such anonymous contributions in Trac applications I manage and always care to log-in myself before acting anywhere else too. This needs to get addressed by using the manipulator interfaces indeed.

Regarding the code, this line confused me a bit:

not (req.authname and req.authname != 'anonymous')


The intention is more clear to me with:

(not req.authname or req.authname == 'anonymous')


While the expressions are logically equivalent NOT(a AND b) == NOT(a) OR NOT(b), there is a subtle difference, that let python programmers prefer the first term: First and-ed statement will skip further evaluation of req.authname, if the value is empty or None), the 2nd or-ed statement needs to always check both parts.

### comment:22 in reply to:  21 ; follow-up:  23 Changed 4 years ago by Ryan J Ollos

there is a subtle difference, that let python programmers prefer the first term: First and-ed statement will skip further evaluation of req.authname, if the value is empty or None), the 2nd or-ed statement needs to always check both parts.

More specifically I would say, a python programmer would saw the need for the optimization would prefer the first term. It is no matter though, as it may only be me that tends to get confused by the logic, and anything confusing can always be clarified with a comment.

### comment:23 in reply to:  22 ; follow-up:  25 Changed 4 years ago by Steffen Hoffmann

More specifically I would say, a python programmer would saw the need for the optimization would prefer the first term. It is no matter though, as it may only be me that tends to get confused by the logic, and anything confusing can always be clarified with a comment.

I've seen the same pattern in Trac code several times and got confused about the logic as you did, even replaced it. From that experience I tell you, that it is not always about optimization. Instead of req.authname think of a method, that could return a list. We always want to look at the first list element, but in some cases the list might as well be empty. Here the first statement really shines, as it prevents an IndexError: list index out of range for the empty list, so it is in general a safer expression - nice defensive design example IMHO.

### comment:24 follow-up:  31 Changed 4 years ago by lkraav

Not to turn this into a programming theory discussion, but should Yoda conditions be also generally used in Python? Not sure if so much applicable to the AND version, but definitely for the OR version.

not (req.authname and 'anonymous' != req.authname)
(not req.authname or 'anonymous' == req.authname)


### comment:25 in reply to:  23 Changed 4 years ago by anonymous

Regarding the code, this line confused me a bit:

not (req.authname and req.authname != 'anonymous')


The intention is more clear to me with:

(not req.authname or req.authname == 'anonymous')


While the expressions are logically equivalent NOT(a AND b) == NOT(a) OR NOT(b), there is a subtle difference, that let python programmers prefer the first term: First and-ed statement will skip further evaluation of req.authname, if the value is empty or None), the 2nd or-ed statement needs to always check both parts.

The 2nd or-ed statement also skips the req.authname == 'anonymous' part if not req.authname. The or operator gets short-circuited just like and.

Instead of req.authname think of a method, that could return a list. We always want to look at the first list element, but in some cases the list might as well be empty. Here the first statement really shines, as it prevents an IndexError: list index out of range for the empty list, so it is in general a safer expression - nice defensive design example IMHO.

>>> list = []
>>> not(list and list[0] != 'anonymous')
True
>>> (not list or list[0] == 'anonymous')
True


### comment:26 in reply to:  20 Changed 4 years ago by Steffen Hoffmann

However, it doesn't prevent users from entering the sid of an authenticated user in the reporter field when creating tickets, in the author field when commenting on tickets and in the author field when editing the wiki. It seems that the latter is the more serious issue that needs to be addressed.

Updated patch addresses all but the initial reporter field. I've just overlooked that, sorry.

### Changed 4 years ago by Steffen Hoffmann

3rd version including resource manipulator interface implementations as well as hard request deflection for other cases

### comment:27 Changed 4 years ago by Steffen Hoffmann

Right, the bogus anonymous reporter was handled as a generic case. Not as graceful as I wanted, but still a proof for other author input not handled by manipulators yet.

Just updated the patch in-place, because the additional change was only to include 'newticket' as another exception from the generic case. So it shall be complete now. More comments?

### comment:28 follow-up:  29 Changed 4 years ago by Ryan J Ollos

I'll test and get you feedback within the next day or so.

Are you considering creating a 0.4.5 release that includes this change? That would allow us to deploy this soon and not depend on a patch instance of AccountManagerPlugin.

### comment:29 in reply to:  28 Changed 4 years ago by Steffen Hoffmann

I'll test and get you feedback within the next day or so.

Would be great, thanks.

Are you considering creating a 0.4.5 release that includes this change? That would allow us to deploy this soon and not depend on a patch instance of AccountManagerPlugin.

I didn't think about it, but given the potential of confusion caused by current behavior I would be willing to push it, this way or another (finalize acct_mgr-0.5).

### comment:30 Changed 4 years ago by Steffen Hoffmann

In 13914:

DiscussionPlugin: Enable fine-grained resource permission checks, refs #756 and #11622.

Changes base on a patch proposal by Simon Stelling - thank you very much.

Notes on alterations:

• constructing resources from realm resource for clarity and code reduction
• resource ID parsing in different places justifies dedicated private method
• 'DISCUSSION_ADMIN' inherits 'DISCUSSION_MODERATE' permission action, but permission checks need to check minimal requirement only
• more PEP8 changes (excessive white-space inside brackets, line-wrap)
• gradually adopt 'Yoda conditions' recently suggested as good coding style by lkraav

### comment:31 in reply to:  24 Changed 4 years ago by Ryan J Ollos

Not to turn this into a programming theory discussion, but should Yoda conditions be also generally used in Python?

Steffen captured this on the DevGuide#CodingStyle page. It could be discussed on the TracDev mailing list to see if this is something we'd want to adopt in t:TracDev/CodingStyle.

From the cases I've considered thus far, there is stronger justification for yoda conditions in C. In Python, the following is a syntax error:

if a = 3:
print 'yes'


However, the following case would justify their use:

# suppose you intend this
b = a == some_constant
# but instead you type this, which is not a syntax error
b = a = some_constant
# on the other hand, the "yoda condition" form would be a syntax error
b = some_constant = a
# however, using parenthesis would result in a syntax error as well
b = (a = some_constant)

Last edited 3 years ago by Ryan J Ollos (previous) (diff)

### comment:32 follow-ups:  33  34 Changed 4 years ago by Jun Omae

About the patches, I think it is easy to escape the checking of author's name using Unicode zero-width space (U+200B), Right-to-left mark (U+200F), etc.

### comment:33 in reply to:  32 Changed 4 years ago by Steffen Hoffmann

About the patches, I think it is easy to escape the checking of author's name using Unicode

zero-width space (U+200B), Right-to-left mark (U+200F), etc.

Thanks for picking up on this, but you intention is not completely clear to me yet. Do you mean to encourage me to improve towards including Unicode white space, or do you feel that the whole approach is flawed somehow? I vaguely remember checking for such flag to no avail when developing for this feature. Any pointer on where to head from here would be appreciated.

### comment:34 in reply to:  32 Changed 3 years ago by Ryan J Ollos

About the patches, I think it is easy to escape the checking of author's name using Unicode zero-width space (U+200B), Right-to-left mark (U+200F), etc.

That is a good point. I can reproduce by pasting a few ZWSP characters into the author field before typing the name of an authenticated author. We could use stripws from the trac core, which is available since 1.0: trac:browser:/tags/trac-1.0/trac/util/text.py@:101#L98. For AccountManagerPlugin we'll need to copy the function to the compat module.

Btw, I had been wondering for some time why ZWSP had to be explicitly included in the regex along with \s. A possible explanation is: SO:8928710/121694.

Everything else with the patch looks good to me. Revised patch: 20140616_prevent-ambiguous-sid.patch.

It would be great to apply this to the AccountManagerPlugin and roll out a new version that we can use on t-h.o. Thanks for your efforts to implement this, and I apologize for the delay in my reply.

### comment:35 Changed 3 years ago by Jun Omae

I think we should remove the zero-width characters from entire of the author rahter than strip from trailing and leading of the author.

Code point Bytes Example Notes
U+180E "jun" U+180E "66j5" jun᠎66j5 Maybe requires Mongolian Font
U+200B "jun" U+200B "66j5" jun​66j5
U+200C "jun" U+200C "66j5" jun‌66j5
U+200D "jun" U+200D "66j5" jun‍66j5
U+2060 "jun" U+2060 "66j5" jun⁠66j5
U+FEFF "jun" U+FEFF "66j5" jun﻿66j5

Also, Directional Formatting Characters can be used to bypass the checking.

Code point Bytes Example
U+200E "jun" U+200E "66j5" jun‎66j5
U+200F "jun" U+200F "66j5" jun‏66j5
Last edited 3 years ago by Jun Omae (previous) (diff)

### comment:36 Changed 3 years ago by Ryan J Ollos

Removing all zero-width characters from the string seems fairly straightforward. Do you think you've documented all of the zero-width characters, or do we need to do some additional searching? The list of all possible unicode characters is quite overwhelming.

### comment:37 follow-up:  38 Changed 3 years ago by Jun Omae

I cannot find the documentation about ZWSPs. Instead, found FAQ in unicode.org, Q: Which characters should be displayed as invisible, if not supported?.

According to the answer, the following list is all of ZWSP characters.

Category Code point Example
cursive joiners U+200C a‌b
cursive joiners U+200D a‍b
bidirectional format controls U+061C a؜b
bidirectional format controls U+200E a‎b
bidirectional format controls U+200F a‏b
bidirectional format controls U+202A a‪b
bidirectional format controls U+202B a‫b
bidirectional format controls U+202C a‬b
bidirectional format controls U+202D a‭b
bidirectional format controls U+202E a‮b
bidirectional format controls U+2066 a⁥b
bidirectional format controls U+2067 a⁧b
bidirectional format controls U+2068 a⁨b
bidirectional format controls U+2069 a⁩b
word joiners U+2060 a⁠b
word joiners U+FEFF a﻿b
the zero width space U+200B a​b
invisible math operators U+2061 a⁡b
invisible math operators U+2062 a⁢b
invisible math operators U+2063 a⁣b
invisible math operators U+2064 a⁤b
Jamo filler characters U+115F aᅟb
Jamo filler characters U+1160 aᅠb
variation selectors U+180B a᠋b
variation selectors U+180C a᠌b
variation selectors U+180D a᠍b
variation selectors U+FE00 – U+FE0F a︀b a️b
variation selectors U+E0100 – U+E01EF a󠄀b a󠇯b

### comment:38 in reply to:  37 Changed 3 years ago by Ryan J Ollos

I cannot find the documentation about ZWSPs. Instead, found FAQ in unicode.org, Q: Which characters should be displayed as invisible, if not supported?.

According to the answer, the following list is all of ZWSP characters.

Thanks, I'll make a modified version of stripws, and after the patch is complete we can push the change back to the Trac core if the modified function is generally useful. If anyone else wants to take the lead on revising the patch, please fell free.

### comment:39 Changed 3 years ago by Ryan J Ollos

Component: TracHacks → AccountManagerPlugin changed from Michael Renzmann to Steffen Hoffmann

Moving this to AccountManagerPlugin since the patch will be applied there.

### comment:40 Changed 3 years ago by falkb

I've posted this comment as falkb again without being asked for my password. :-(

### comment:41 Changed 3 years ago by Ryan J Ollos

Of course you did. Do you see resolution fixed on this ticket?

### comment:42 Changed 3 years ago by falkb

Sorry, my comment was not meant rude. I just see something has happened here in the meanwhile (which could be partial commits on the way to the 'fixed' state) but currently I don't have time to read it all. So it was just a very fast contribution of my user test, with the idea of better testing too much than to less. Friendly greetings to you all :-)

### comment:43 follow-up:  44 Changed 3 years ago by Ryan J Ollos

I see. Well I apologize and should not have assumed you were just nagging us ;)

The patch needs to be revised per comment:32. It's not currently my highest priority, but I would like to get it fixed for AccountManagerPlugin release 0.5.

### comment:44 in reply to:  43 Changed 3 years ago by Steffen Hoffmann

...
I would like to get it fixed for AccountManagerPlugin release 0.5.

Great. I did not look into Jun's contribution so far as well, but this should really go into next release, given the rather big impact on (anonymous) request handling.

### comment:45 Changed 3 years ago by Ryan J Ollos

Owner: changed from Steffen Hoffmann to Ryan J Ollos new → accepted

### comment:46 Changed 5 months ago by Ryan J Ollos

Resolution: → fixed accepted → closed

In 16662:

TracAccountManager 0.5dev: Prevent anonymous commenting as a registered user

Initial patch by hasienda, with contributions from jun66j5.

Fixes #11622.

### comment:47 Changed 5 months ago by Ryan J Ollos

This ticket was long overdue.

I tried including the variation selectors in the regex, but the tests fail. Do the code points outside plane 0 need to be specified in a different way? Anyway, I don't think that minor point is too important.

• ## acct_mgr/tests/util.py

 def test_remove_zwsp(self): self.assertEqual(u'user', remove_zwsp(u'\u200buser\u200b')) self.assertEqual(u'user', remove_zwsp(u'\u200fu\ufe00ser\u061c')) self.assertEqual(u'user', remove_zwsp(u'fu\ue0100ser')) def test_suite():
• ## acct_mgr/util.py

 _zwsp_re = re.compile(u'[\\s\u200b-\u200f\u061c\u202a-\u202e' u'\u2066-\u2069\u00ad\u2060\ufeff\u2061-\u2064' u'\u115f\u1160\u180b-\u180d\ufe00-\ufe0f]+', re.UNICODE) u'\u115f\u1160\u180b-\u180d\ufe00-\ufe0f' u'\ue0100-\ue01ef]+', re.UNICODE) def remove_zwsp(text):
\$ python -m unittest acct_mgr.tests.util
.F
======================================================================
FAIL: test_remove_zwsp (acct_mgr.tests.util.UtilTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "acct_mgr/tests/util.py", line 48, in test_remove_zwsp
self.assertEqual(u'user', remove_zwsp(u'\u200buser\u200b'))
AssertionError: u'user' != u''
- user
+

----------------------------------------------------------------------
Ran 2 tests in 0.001s

FAILED (failures=1)


I'll deploy changes to trac-hacks.org within a few days and make a blog post.

If anyone complains about the new behavior running TracAccountManager in their Trac instance, we can move the manipulators and request filter into a component that can be disabled.

### comment:48 Changed 5 months ago by Jun Omae

The u'fu\ue0100ser' should be u'fu\U000e0100ser'.

Linux:

Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
>>> [hex(ord(c)) for c in u'fu\ue0100ser']
['0x66', '0x75', '0xe010', '0x30', '0x73', '0x65', '0x72']
>>> [hex(ord(c)) for c in u'fu\U000e0100ser']
['0x66', '0x75', '0xe0100', '0x73', '0x65', '0x72']


Windows:

Python 2.7.10 (default, May 23 2015, 09:44:00) [MSC v.1500 64 bit (AMD64)] on win32
>>> [hex(ord(c)) for c in u'fu\U000e0100ser']
['0x66', '0x75', '0xdb40', '0xdd00', '0x73', '0x65', '0x72']


If Python is narrow build, U+10000 and later characters cannot be used in regular expression. We should use surrogate pair.

>>> import re
>>> re.compile(u'[\U000e0100-\U000e01ef]+', re.UNICODE)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\usr\apps\python27-amd64\lib\re.py", line 194, in compile
return _compile(pattern, flags)
File "C:\usr\apps\python27-amd64\lib\re.py", line 251, in _compile
raise error, v # invalid expression
>>> re.compile(u'\udb40[\udd00-\uddef]+', re.UNICODE)
<_sre.SRE_Pattern object at 0x0000000002806DD8>

Last edited 5 months ago by Jun Omae (previous) (diff)

### comment:49 Changed 5 months ago by Ryan J Ollos

Thanks, I'll commit that change.

### comment:50 Changed 5 months ago by Ryan J Ollos

In 16663:

TracAccountManager 0.5dev: Remove variation selectors

Patch by Jun Omae.

Refs #11622.

### comment:51 Changed 5 months ago by Ryan J Ollos

Changes deployed to trac-hacks.org. Please test and let me know if there are any issues. Changes described in blog:2017-06-15-username-policy.

Some other ideas to consider:

• Don't let user comment with a registered email address in the author field
• Decorate changes made by unauthenticated users with a subtle indicator

### comment:52 Changed 5 months ago by falkb

Works great! Thanks a lot. Smoke-test succeeded.

### comment:53 follow-up:  54 Changed 5 months ago by Jun Omae

In [16663], u'[\udb40[\udd00-\uddef]+ is wired. That should be (?:\udb40[\udd00-\uddef])+.

It is not needed to use surrogate pairs in unicode literal. Python interpreter automatically convert from UCS-4 character to surrogate pair with narrow build.

Linux (wide build):

Python 2.5.6 (r256:88840, Oct 21 2014, 22:49:55)
[GCC 4.8.2] on linux2
>>> import sys
>>> hex(sys.maxunicode)
'0x10ffff'
>>> parse_arg_list('text=%F3%A0%84%80-%F3%A0%87%AF')
[(u'text', u'\U000e0100-\U000e01ef')]
>>> [hex(ord(c)) for c in parse_arg_list('text=%F3%A0%84%80-%F3%A0%87%AF')[0][1]]
['0xe0100', '0x2d', '0xe01ef']
>>> import re
>>> re.compile(u'[\U000e0100-\U000e01ef]+', re.U)
<_sre.SRE_Pattern object at 0x20cbf30>


Windows (narrow build):

Python 2.7.10 (default, May 23 2015, 09:44:00) [MSC v.1500 64 bit (AMD64)] on win32
>>> import sys
>>> hex(sys.maxunicode)
'0xffff'
>>> from trac.web.api import parse_arg_list
>>> parse_arg_list('text=%F3%A0%84%80-%F3%A0%87%AF')
[(u'text', u'\U000e0100-\U000e01ef')]
>>> [hex(ord(c)) for c in parse_arg_list('text=%F3%A0%84%80-%F3%A0%87%AF')[0][1]]
['0xdb40', '0xdd00', '0x2d', '0xdb40', '0xddef']
>>> import re
>>> re.compile(u'[\U000e0100-\U000e01ef]+', re.U)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\usr\src\trac\venv\trac-1.0.12\lib\re.py", line 194, in compile
return _compile(pattern, flags)
File "C:\usr\src\trac\venv\trac-1.0.12\lib\re.py", line 251, in _compile
raise error, v # invalid expression

• ## accountmanagerplugin/trunk/acct_mgr/tests/util.py

diff --git a/accountmanagerplugin/trunk/acct_mgr/tests/util.py b/accountmanagerplugin/trunk/acct_mgr/tests/util.py
index ce89355c5..5d0ab0f69 100644
 a class UtilTestCase(unittest.TestCase): self.assertEqual(u'user', remove_zwsp(u'user')) self.assertEqual(u'user', remove_zwsp(u'\u200buser\u200b')) self.assertEqual(u'user', remove_zwsp(u'\u200fu\ufe00ser\u061c')) self.assertEqual(u'user', remove_zwsp(u'u\udb40\udd00ser')) self.assertEqual(u'u\U000e00ffser', remove_zwsp(u'u\U000e00ffser')) self.assertEqual(u'user', remove_zwsp(u'u\U000e0100ser')) self.assertEqual(u'user', remove_zwsp(u'u\U000e01efser')) self.assertEqual(u'u\U000e01f0ser', remove_zwsp(u'u\U000e01f0ser')) def test_suite():
• ## accountmanagerplugin/trunk/acct_mgr/util.py

diff --git a/accountmanagerplugin/trunk/acct_mgr/util.py b/accountmanagerplugin/trunk/acct_mgr/util.py
index f57a84e1e..3e0a702a4 100644
 a def pretty_precise_timedelta(time1, time2=None, resolution=None, diff=0): % (str(t) != '0' and t or '')).rstrip() _zwsp_re = re.compile(u'([\\s\u200b-\u200f\u061c\u202a-\u202e' u'\u2066-\u2069\u00ad\u2060\ufeff\u2061-\u2064' u'\u115f\u1160\u180b-\u180d\ufe00-\ufe0f]+|' u'[\udb40[\udd00-\uddef]+)', re.UNICODE) def _create_zwsp_re(): ucs2_range = u'\\s\u200b-\u200f\u061c\u202a-\u202e\u2066-\u2069\u00ad' \ u'\u2060\ufeff\u2061-\u2064\u115f\u1160\u180b-\u180d' \ u'\ufe00-\ufe0f' try: pattern = re.compile(u'[%s\U000e0100-\U000e01ef]+' % ucs2_range, re.UNICODE) except re.error: # Narrow build, re cannot use characters >= 0x10000 pattern = re.compile(u'[%s]+|(?:\udb40[\udd00-\uddef])+' % ucs2_range, re.UNICODE) return pattern _zwsp_re = _create_zwsp_re() def remove_zwsp(text):
Last edited 5 months ago by Jun Omae (previous) (diff)

### comment:54 in reply to:  53 Changed 5 months ago by Jun Omae

In [16663], u'[\udb40[\udd00-\uddef]+ is wired. That should be (?:\udb40[\udd00-\uddef])+.

Ah. My example code in comment:48 is incorrect. Sorry.

### comment:55 Changed 5 months ago by Ryan J Ollos

Complicated! Thanks for the patch.

### comment:56 Changed 5 months ago by Ryan J Ollos

In 16666:

TracAccountManager 0.5dev: Fix regex for narrow and wide builds

Patch by Jun Omae.

Refs #11622.

### Modify Ticket

Change Properties
Action
as closed The owner will remain Ryan J Ollos.
The resolution will be deleted.