Opened 14 years ago
Closed 13 years ago
#8487 closed defect (fixed)
[patch] AcctMgr creates blank lines in password_file under Windows
Reported by: | Jeremy Dunn | Owned by: | Steffen Hoffmann |
---|---|---|---|
Priority: | high | Component: | AccountManagerPlugin |
Severity: | major | Keywords: | EOL style Windows |
Cc: | Trac Release: | 0.12 |
Description (last modified by )
Every time the AcctMgr plugin accesses the configured Password_file, it adds an extra blank line after each line in the file. First save, file doubles in size, then triples, etc. I have only 73 users, but the file was 125MB. Eventually trac runs out of memory when trying to do any operation.
AcctMgr plugin 0.3dev r9591 Trac v0.12. Python v2.5. Windows 2008 R2
Trac.ini: [account-manager] account_changes_notify_addresses = <removed> force_passwd_change = true htdigest_realm = trac notify_actions = new password_file = c:\TracRoot\tracusers.txt password_store = HtDigestStore user_lock_max_time = 0
I had every feature enabled except registration. For the moment I've disabled everything, which is why the lines are commented-out:
[components] #acct_mgr.admin.* = enabled #acct_mgr.admin.accountmanageradminpage = enabled #acct_mgr.api.* = enabled #acct_mgr.api.accountmanager = enabled #acct_mgr.db.* = enabled #acct_mgr.db.sessionstore = disabled #acct_mgr.guard.accountguard = enabled #acct_mgr.htfile.* = enabled #acct_mgr.http.* = enabled #acct_mgr.notification.* = enabled #acct_mgr.pwhash.* = enabled #acct_mgr.svnserve.* = enabled #acct_mgr.web_ui.* = enabled #acct_mgr.web_ui.accountmodule = enabled #acct_mgr.web_ui.loginmodule = enabled #acct_mgr.web_ui.registrationmodule = disabled
Attachments (2)
Change History (30)
comment:1 Changed 14 years ago by
comment:3 Changed 14 years ago by
Description: | modified (diff) |
---|
comment:4 Changed 14 years ago by
Sorry to see that issue, and I suspect that changeset too.
It was clearly bad not to have a Windows test bed for these changes. Glad at least you did test and notice it by now. With *nix systems there has been no report by now, and I saw no problems in several Trac application I maintain myself. So this is clearly related to the special line endings used on Windows systems.
Just for clarification: "Every time the AcctMgr plugin accesses ..." means a write (user add/delete, password update), right? A read operation should really change nothing at all. Could you confirm, please?
I'll have to digg much deeper into the aforementioned changeset [9272]. Looks not too obvious, why just Windows systems should do wrong, since we always add line endings, but now we care about append different endings depending on existing line ending style.
comment:5 Changed 14 years ago by
thanks for your attention to this ticket.
Just for clarification: "Every time the AcctMgr plugin accesses ..." means a write (user add/delete, password update)...
yes, that is correct. when any user changes their own password, or a privileged user changes another user's password, etc.
The odd thing is the file grows *very* fast. As I wrote in the ticket, the password file has 77 actual users. When I found the problem, the file had 42,535,673 lines !! <not> kidding. It was 127,611,031 bytes. I still have the file if we need to analyze it.
It seems hard for me to believe that this was caused only by users. Somehow it must be multiplying the number of blank lines, or similar.
I grep'd out the non-blank lines and created a new password file; and then tried the AcctMgr plugin again. Each time I changed a password, it added just one blank line per user in the file. This doesn't add up to 42M lines !
Anyway the problem is easy to recreate on my system so if you want an example or want me to test a patch I'd be happy to. Unfortunately as I said I'm not a Python programmer. Also it appears the person who installed the AcctMgr plugin may have put it in her home directory, and she's not available for a few weeks more. I'm an admin on the system but inexperienced with Python, so if there's a patch I'll need instructions to install.
comment:6 Changed 14 years ago by
one more thing: a little math shows that 3 * 42,535,673 = 127,607,019, which is 4,012 bytes less than the actual file size. 4,012 / 77 = ~52 bytes/ user, which seems about right. This means that each blank line had -3- characters. Not just \r\n, but something else as well. But not a blank space. grep -c " " <file> yielded only 2 lines. Not sure what it is. Very hard to deal with a file that large in any text editor, and I don't have good *nix tools on this particular system. (no cygwin)
comment:7 Changed 14 years ago by
Aha. I opened the 127M file with a hex editor. The sequence of bytes for each blank line is x0d x0d x0a, i.e. \r\r\n
Perhaps this gives a clue as to what has gone wrong?
comment:8 Changed 14 years ago by
I suddenly recognized today, that #8422 might be related if not a duplicate of this ticket or vice versa.
comment:9 follow-up: 10 Changed 14 years ago by
Yes, it seems to be the same.
Also I realized that the file is growing exponentially because each time it adds \r\r\n. Then next time, every \r is turned into another \r\r\n, etc.
comment:10 Changed 14 years ago by
Status: | new → assigned |
---|
Replying to jeremy.j.dunn@gmail.com:
Yes, it seems to be the same.
Also I realized that the file is growing exponentially because each time it adds \r\r\n. Then next time, every \r is turned into another \r\r\n, etc.
There are two places in current code after [9272] that may cause this, if handling is different in Windows that what I had expected from all Python environments. I'll prepare a debug patch, to enable you to help me moving from a guess to facts soon. Thank you for your patience with this nasty bug. This must be fixed ASAP.
comment:11 Changed 14 years ago by
Keywords: | EOL style added |
---|---|
Priority: | normal → high |
I meant this, really soon.
comment:12 Changed 14 years ago by
Any status on a fix? I'm happy to test patches on Windows 2008 R2.
comment:13 Changed 14 years ago by
Ok - as I wrote before I'm not a Python programmer, but I think I've found the problem:
in htfile.py, lines 122-127
elif line.endswith('\n'): if eol == '\n': new_lines.append(line) else: # restore eol new_lines.append(line.rstrip('\n') + eol)
For Windows, all lines end with \r\n, so the first elif is true. Then, eol is NOT \n, due to lines 103-106, which set eol to \r\n for Windows. So we drop thru to the 'else' clause. Strip the \n (leaving \r) and add eol (\r\n).
NET RESULT: every \r\n gets turned into \r\r\n, which is exactly the observed behavior.
comment:14 Changed 14 years ago by
Sure, it is about those lines. But my question is more about, why opening your password file doesn't happen as asked - with universal newline support [1], you see? Current code has been done not without much thought, especially when it comes to the lines in question.
But obviously it seems like your Python is compiled without that highly useful functionality or not using it for other reasons. I've been unable to find existing evidence, that this is a known bug in any specific Windows version.
Anyway, it has become clear by now, that we shouldn't totally rely on working universal newline support but make sure that we get what we expect. So let's add a small test section and act accordingly.
[1] http://docs.python.org/release/2.3.5/whatsnew/node7.html
Changed 14 years ago by
Attachment: | fx_8487_eol-handling_on_univ-newline-supp_miss.patch added |
---|
preview on possible fix based on current trunk
code
comment:15 Changed 14 years ago by
Summary: | AcctMgr creates blank lines in password_file under Windows → [patch] AcctMgr creates blank lines in password_file under Windows |
---|
comment:16 Changed 14 years ago by
Thanks @hasienda. Likewise I appreciate your support. Where is "here"? I am in Massachusetts/USA, GMT-5.
Unfortunately I do not know how to apply this patch. As I wrote earlier someone else did the install; and things are a bit confused on our development server; and I am not a Python programmer.
We are running the Windows build of Subversion distributed by CollabNet. It's in c:/csvn/. Python is in c:/csvn/Python25/. Trac appears to be in c:/csvn/Python25/Lib/site-packages/. The only reference to AcctMgr is c:/csvn/Python25/Lib/site-packages/tracaccountmanager-0.3dev_r9591-py2.5.egg. But the .egg does not seem to be unpacked anywhere: htfile.py is nowhere in the c:/csvn/ path.
I have reached out to some colleagues with more Python experience. In a few days perhaps I can send a reply if the patch worked or not.
comment:17 Changed 14 years ago by
Hi again. I figured out how to install the patch. Strangely, it is -not- working, and I don't understand why.
I added two sets of debug logic. First to determine eol format:
eol = '\n' f = open(filename, 'rU') lines = f.readlines() newlines = f.newlines if newlines is not None: if isinstance(newlines, str): newlines = [f.newlines] elif isinstance(newlines, tuple): newlines = list(f.newlines) if '\n' not in newlines: if '\r\n' in newlines: # Windows newline style eol = '\r\n' new_lines.append('Windows format') elif '\r' in newlines: # MacOS newline style eol = '\r' new_lines.append('MacOS format')
Second to determine which if statement is executed, using my initials 'jjd':
if len(lines) > 0: for line in lines: if line.startswith(prefix): if not matched and userline: new_lines.append(userline + 'jjd0' + eol) matched = True # preserve existing eol and # handle missing universal newline support gracefully too elif line.endswith(eol) or line.endswith('\r') or \ line.endswith('\r\n'): new_lines.append(line + 'jjd1') # unify eol style using universal newline support elif line.endswith('\n'): new_lines.append(line.rstrip('\n') + 'jjd2' + eol ) # make sure the (last) line has a newline anyway else: new_lines.append(line +'jjd3' + eol)
After making one user change with this version of the code, the TracUsers.txt file has 'windows format' at the top, then 'jjd2' plus \r\r\n at the END of each line.
known facts:
- initial state: each line ends with \r\n only (verified in hex editor)
- eol is \r\n (verified by "windows format")
- case 'jjd2' is taken, which implies that Universal Newline Support is working properly, else it would take case 'jjd1'
- the \r\r\n is AFTER 'jjd2' (verified in hex editor)
I don't understand how
new_lines.append(line.rstrip('\n') + 'jjd2' + eol )
can result in <line>jjd2<\r\r\n>, when eol=\r\n, unless there is something strange with append() or with writelines()
comment:18 Changed 14 years ago by
Thinking about this with fresh eyes:
- if Universal Newline Support were in fact working, then 'newlines' should contain only \n, since it is the result of readlines(). per PEP278
Conversion of newlines happens in all calls that read data: read(), readline(), readlines(), etc.
- Presuming UNS is -not- working, then lines[] contains strings ending with \r\n. This matches the observed behavior, where "windows format" is output after newlines contains \r\n
- Presuming UNS is -not- working then, case 'jjd2' is removing the \n from \r\n, and adding \r\n, leaving \r\r\n. This is the observed behavior.
- If the above are true, then the only thing I don't understand is why isn't this case taken?
elif line.endswith(eol) or line.endswith('\r') or \ line.endswith('\r\n'):
comment:19 follow-up: 20 Changed 14 years ago by
Oh drat - never mind the above comment. It's obvious we have UNS because newlines is not None. newlines only exists with UNS. Also if my above comment were true, we'd get <line>\rJJD2\r\n, which is not the observed behavior.
So I'm back to my prior comment, not understanding how the extra \r is added.
comment:20 Changed 14 years ago by
Replying to jeremydunn:
![...] So I'm back to my prior comment, not understanding how the extra \r is added.
You're not alone. Thanks again for your continued debugging. Could not dream of a better help with this issue. Awesome. And frightening, as it contradicts my logic too. Anyway, out of curiosity, would you please test with the following lines replaced:
- elif line.endswith('\n'): - new_lines.append(line.rstrip('\n') + 'jjd2' + eol + elif line.endswith('\n'): + new_lines.append(line.rstrip('\n').rstrip('\r') + 'jjd2' + eol
As this is a generic strip working for any eol style, it should work regardless of the test result.
comment:21 Changed 14 years ago by
Thinking it twice, there really shouldn't be a change from the previous micro-patch. I rule out a glitch with the append() method to the new_lines list as well, so I'll have a look at the writelines() part now.
comment:22 Changed 14 years ago by
tried the newest patch, but there is no change as anticipated in your last comment. still getting \r\r\n at the end of each line, AFTER whatever additional text I add for debugging.
comment:23 Changed 14 years ago by
Don't know why I didn't search before, but there is some known anomaly about Windows handling file writes [1].
So we may want to do a 'binary' write:
try: - f = open(filename, 'w') + f = open(filename, 'wb') f.writelines(new_lines)
or ditch explicit Universal Newline Support as it appears to be used by Windows under the hood in such an unintuitive way. I tend towards using the 'generic strip method' from comment 20 as a new private method and just pass everything else through as it is. And maybe use os.linesep
, if we encounter eol style when actually processing lines without any eol character as a last resort. Well, now I'm sitting here to decide, how to go on, and I'm not finished with this yet. Only I hope we can go away with better cross-os behavior without writing (more) Windows-specific code.
If you feel adventurous, there's another aspect to check from the aforementioned discussion: 'symmetrical' file open arguments. So it could be interesting to know, if the following small patch does anything good too:
try: - f = open(filename, 'w') + f = open(filename, 'Uw') f.writelines(new_lines)
I've seen this before, but don't know, if it works to change the Windows eol handling mess. But I may not require to know or rely on your findings for the fix to be made, so feel free to test this second thing or not.
[1] http://bytes.com/topic/python/answers/627424-file-object-behavior
comment:24 Changed 14 years ago by
This is my conclusion from the reasoning above and some more reading [1][2][3]. Would you kindly check, how it works for you, please?
[1] http://bytes.com/topic/python/answers/453637-detecting-line-endings
[2] http://stackoverflow.com/questions/275018/how-can-i-remove-chomp-a-newline-in-python/275659
[3] http://bytes.com/topic/python/answers/638840-getting-rid-eol-character
Changed 14 years ago by
Attachment: | fx_8487_eol-handling_reworked_more.patch added |
---|
a different approach, largely dumping Universal Newline Support for reading password files, that should be written back - with syntax fix
comment:25 Changed 14 years ago by
@hacienda: Thank you! The new patch is working properly.
I tested add user, delete user, change password. All operations leave the trac user file with correct (Windows \r\n) line endings and no extra lines. For verification, my system is Trac 0.12, Python 2.5, Windows 2008 R2.
From my POV the bug is fixed; but of course it would make sense to regress it on other OS; other versions of Windows; and other versions of Python before closing.
comment:26 Changed 14 years ago by
I've enjoyed our cooperation. And certainly this helped a lot to re-think what I thought was a perfect solution some months ago, and no-one else reported a problem until now.
I've tested with Trac 0.13dev, Python2.6 on Debian GNU/Linux Squeeze (6.0), so ideally we would get reports for Trac 0.11.x, Python2.4 and maybe even some older MacOS for a full cover. But I'll just commit the changes now, so this will get tested by other adventurous users within the following weeks.
comment:26 Changed 14 years ago by
(In [9923]) AccountManagerPlugin: Fix password file handling on Windows systems, refs #8487.
This is a urgent follow-up to my changes introduced in changeset [9272]. It has been possible to rework this after finally figuring out, that Universal Newline Support works not as expected when writing (back) files. Many thanks to Jeremy Dunn for initial report, debugging and testing. Any non-*nix user should upgrade to avoid painful exponential file growth. Sorry for any inconvenience you've suffered meanwhile.
comment:27 Changed 13 years ago by
No more complaints for more than three months - feels like we're done with it, right?
comment:28 Changed 13 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I'm happy with it. I just checked, and my tracusers.txt file remains correctly-formatted, without extra line breaks. Thanks again for your support.
- Jeremy
this might be related to [9272] since it mentions EOL values. I suspect the issue is something different between *nix and Windows. I'm a programmer but not a Python programmer so I'm sorry I cannot contribute a solution.