Django

Ticket #3344 (closed: fixed)

Opened 2 years ago

Last modified 2 years ago

newforms UnicodeEncodeError in EmailField on non-successful validation

Reported by: bartekr
Assigned to: adrian
Milestone:
Component: Forms
Version: SVN
Keywords: UnicodeEncodeError EmailField gettext Polish Norwegian unicode unicode-branch
Cc:
Triage Stage: Accepted
Has patch: 1
Needs documentation: 0
Needs tests: 0
Patch needs improvement: 0

Description (Last modified by adrian)

The newforms EmailField raises a UnicodeEncodeError exception when the submitted value is not valid.

Exception Type:  	UnicodeEncodeError
Exception Value: 	'ascii' codec can't encode character u'\u017a' in position 33: ordinal not in range(128)
Exception Location: 	/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages/django/newforms/forms.py in _html_output, line 103

I spent some time debugging and found that the probable cause of the error is this call:

        RegexField.__init__(self, email_re, max_length, min_length, gettext(u'Enter a valid e-mail address.'), required, widget, label, initial)

newforms/fields.py, line 267 (rev. 4386)

When, for debugging purposes, I changed gettext(u'Enter a valid e-mail address.') to something else (e.g. gettext(u'aaa')), the problem did not occur and the 'aaa' validation error message was displayed as it should be. This is probably a problem with the translation file or something similar, but I'm a Python/Django/gettext newbie just starting out with all of them, so I'm unable to debug it further.
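
The 'aaa' experiment is consistent with an implicit ASCII conversion somewhere in the rendering path: an ASCII-only message survives it, while a message containing a character such as u'\u017a' does not. A minimal sketch of that difference, written in Python 3 syntax where the encode has to be spelled out (Python 2 could perform it implicitly). The helper name survives_ascii and the second sample string are made up for illustration and are not Django code:

```python
def survives_ascii(message):
    """Return True if the message can be encoded with the 'ascii' codec."""
    try:
        message.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


# The debug replacement from the report: pure ASCII, so encoding succeeds.
print(survives_ascii(u'aaa'))

# Placeholder text (not a real translation) containing the character
# named in the exception value above.
print(survives_ascii(u'abc\u017a'))
```

This only shows why the replacement string hides the problem; it does not identify where in newforms the conversion happens.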

Attachments

temporary-fix-until-full-unicode.patch (1.1 kB) - added by Øyvind Saltvik <oyvind@saltvik.no> on 02/15/07 03:37:26.
temporary solution for utf-8 gettext translations with newforms, encodes as xmlcharrefs
decode-before-unicode.patch (0.5 kB) - added by Øyvind Saltvik <oyvind.saltvik@gmail.com> on 02/22/07 13:51:04.
utf-8 decode before unicode, attached patch to wrong ticket
4700-UnicodeEncodeError-newforms.diff (0.7 kB) - added by boxed@killingar.net on 03/11/07 12:15:26.
smaller fix that does not needlessly convert to/from ascii

Change History

01/21/07 15:40:08 changed by bartekr

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

I just found that it happens only if settings.LANGUAGE_CODE == 'pl'.

However, with settings.LANGUAGE_CODE == 'en' followed by translation.activate('pl'), it DOES NOT happen (and I would expect it to, since most of the values are then translated to Polish).

01/21/07 15:55:32 changed by bartekr

  • keywords changed from UnicodeEncodeError EmailField to UnicodeEncodeError EmailField gettext Polish.

01/21/07 18:17:42 changed by adrian

  • description changed.
  • stage changed from Unreviewed to Accepted.

Fixed formatting in description.

01/31/07 09:14:45 changed by Honza Král <Honza.Kral@gmail.com>

See #3395, which contains a patch. It's due to ValidationErrors containing unicode error messages that cannot be handled by some template tags, or by other functions in this case.

01/31/07 10:18:57 changed by Michael Radziej <mir@noris.de>

  • keywords changed from UnicodeEncodeError EmailField gettext Polish to UnicodeEncodeError EmailField gettext Polish unicode.

02/13/07 12:14:05 changed by Øyvind Saltvik <oyvind.saltvik@gmail.com>

  • keywords changed from UnicodeEncodeError EmailField gettext Polish unicode to UnicodeEncodeError EmailField gettext Polish Norwegian unicode.

This happens with the Norwegian translation too. Is there any way to fix this? Should translations not be in UTF-8?

django rev 4490

traceback

Traceback (most recent call last):
File "/var/www/vhosts/amc-info.com/httpdocs/magic-removal/django/template/__init__.py" in render_node
  718. result = node.render(context)
File "/var/www/vhosts/amc-info.com/httpdocs/magic-removal/django/template/__init__.py" in render
  768. output = self.filter_expression.resolve(context)
File "/var/www/vhosts/amc-info.com/httpdocs/magic-removal/django/template/__init__.py" in resolve
  561. obj = resolve_variable(self.var, context)
File "/var/www/vhosts/amc-info.com/httpdocs/magic-removal/django/template/__init__.py" in resolve_variable
  655. current = current()
File "/var/www/vhosts/amc-info.com/httpdocs/magic-removal/django/newforms/forms.py" in as_ul
  135. return self._html_output(u'<li>%(errors)s%(label)s %(field)s%(help_text)s</li>', u'<li>%s</li>', '</li>', u' %s', False)
File "/var/www/vhosts/amc-info.com/httpdocs/magic-removal/django/newforms/forms.py" in _html_output
  116. output.append(normal_row % {'errors': bf_errors, 'label': label, 'field': unicode(bf), 'help_text': help_text})

  UnicodeEncodeError at /spoersmaal/
  'ascii' codec can't encode character u'\xe5' in position 43: ordinal not in range(128)

local vars

bf	<django.newforms.forms.BoundField object at 0xb60ec86c>
bf_errors	[u'Dette feltet er p\xe5krevd.']
error_row	u'<li>%s</li>'
errors_on_separate_row	False
field	<django.newforms.fields.CharField object at 0xb60e8ecc>
help_text	u''
help_text_html	u' %s'
hidden_fields	[]
label	u'Overskrift:'
name	'overskrift'
normal_row	u'<li>%(errors)s%(label)s %(field)s%(help_text)s</li>'
output	[]
row_ender	'</li>'
self	<django.newforms.models.SpoersmaalForm object at 0xb620c48c>
top_errors	[]
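
The reported position 43 can be checked against the local variables above. The sketch below assumes that the error list is rendered as `<ul class="errorlist"><li>...</li></ul>` before being converted to a byte string; that markup is an assumption about this revision of newforms, not something shown in the traceback. It uses Python 3 syntax, so the ASCII encode that Python 2 performed implicitly is written out explicitly:

```python
# The message is taken from bf_errors in the local vars above.
message = u'Dette feltet er p\xe5krevd.'

# Assumed markup for the rendered error list (not confirmed by the traceback).
rendered = u'<ul class="errorlist"><li>%s</li></ul>' % message


def ascii_failure(text):
    """Return the UnicodeEncodeError raised by an ASCII encode, or None."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc
    return None


error = ascii_failure(rendered)
# Index of the first character that the 'ascii' codec cannot encode.
print(error.start)
# The offending character itself.
print(repr(error.object[error.start]))
```

Under that assumption the failing index is 43 and the offending character is u'\xe5', which agrees with the exception message in the traceback.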

02/15/07 03:37:26 changed by Øyvind Saltvik <oyvind@saltvik.no>

  • attachment temporary-fix-until-full-unicode.patch added.

temporary solution for utf-8 gettext translations with newforms, encodes as xmlcharrefs

02/15/07 15:17:30 changed by Michael Radziej <mir@noris.de>

I'm not sure whether StrAndUnicode is the right level to fix the bug, since it's used in many places. Are you sure this doesn't somehow put xmlcharrefs in your database?

02/16/07 02:25:20 changed by Øyvind Saltvik <oyvind.saltvik@gmail.com>

Quite sure; this is only done for validation errors. It may look like StrAndUnicode from the patch, but it's not. :)

02/22/07 13:51:04 changed by Øyvind Saltvik <oyvind.saltvik@gmail.com>

  • attachment decode-before-unicode.patch added.

utf-8 decode before unicode, attached patch to wrong ticket

02/22/07 13:51:42 changed by Øyvind Saltvik <oyvind.saltvik@gmail.com>

  • has_patch set to 1.

03/11/07 12:13:33 changed by boxed@killingar.net

The attached patch produces incorrect output for me (I get XML character entities in the validation message instead of the Swedish characters). I will attach a new patch that fixes the issue without going to/from ASCII encoding for no reason, and it works :P
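
To illustrate the difference between the two approaches discussed here, the sketch below contrasts encoding to ASCII with XML character references against decoding UTF-8 bytes to text once. It is not the content of any attached patch. It reuses the Norwegian message from the earlier traceback rather than a Swedish one, treats the UTF-8 bytes as a stand-in for what a translation catalog might return, and is written in Python 3 syntax:

```python
message = u'Dette feltet er p\xe5krevd.'

# Approach 1: encode to ASCII, replacing non-ASCII characters with XML
# character references. The result is ASCII-safe, but the reference shows
# up literally if the output is not later interpreted as HTML entities.
as_xmlcharrefs = message.encode('ascii', 'xmlcharrefreplace')
print(as_xmlcharrefs)

# Approach 2: treat the translation as UTF-8 bytes and decode it to text
# once, so no ASCII round trip is involved.
utf8_bytes = message.encode('utf-8')  # stand-in for catalog output
decoded = utf8_bytes.decode('utf-8')
print(decoded == message)
```

The first approach avoids the exception but changes the text that reaches the page; the second keeps the original characters intact.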

03/11/07 12:15:26 changed by boxed@killingar.net

  • attachment 4700-UnicodeEncodeError-newforms.diff added.

smaller fix that does not needlessly convert to/from ascii

05/14/07 07:51:21 changed by mtredinnick

  • keywords changed from UnicodeEncodeError EmailField gettext Polish Norwegian unicode to UnicodeEncodeError EmailField gettext Polish Norwegian unicode unicode-branch.

This has been fixed as a result of the changes in the unicode branch, at least as best I can tell from the description, since nobody has posted a test case. I can render forms with invalid Email fields and missing content in languages using non-ASCII characters in the translated errors without problems.

I'll close this ticket once the unicode branch is merged back to trunk.

07/04/07 07:11:05 changed by mtredinnick

  • status changed from new to closed.
  • resolution set to fixed.

(In [5609]) Merged Unicode branch into trunk (r4952:5608). This should be fully backwards compatible for all practical purposes.

Fixed #2391, #2489, #2996, #3322, #3344, #3370, #3406, #3432, #3454, #3492, #3582, #3690, #3878, #3891, #3937, #4039, #4141, #4227, #4286, #4291, #4300, #4452, #4702

