#3611 closed bug (patchwelcome)
Wrong codepage in URL (and in post data) when not UTF-8
Reported by: | samsol | Owned by: | samsol |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | ajax | Version: | 1.2.6 |
Keywords: | charset, url, ajax, encodeURIComponent, ajaxrewrite | Cc: | samsol, jaubourg |
Blocked by: | Blocking: |
Description
When an HTML page uses a charset other than UTF-8, the ajax functions send wrong requests: QUERY_STRING parameters are encoded as UTF-8, whereas a plain form submission encodes them in the page's charset.
Test case.
file test.cgi
#!/usr/bin/perl
print "Content-Type: text/html; charset=Windows-1251\n";
print "Cache-control: private\n";
print "\n";
print << "EOL";
<html>
<head>
<script language='javascript' src='jquery-1.2.6.js'></script>
</head>
<body>
<form>
<input type='text' name='x' id='x'>
<input type='submit'>
<a href="javascript:putRussianFLetter()">Put Russian F letter</a>
</form>
$ENV{'QUERY_STRING'}
<button onclick='doClick()'>JQ</button>
<script language='javascript'>
function doClick() {
    \$.get('test.cgi', {x:\$('#x')[0].value, anticache:Math.random()}, function( data ){\$('#out').html(data);});
}
function putRussianFLetter() {
    \$('#x')[0].value = '\\u0424';
}
</script>
<div id='out'></div>
</body>
</html>
EOL
1. Navigate to http://your_address_here/test.cgi
2. Ensure that the browser detected the charset as Cyrillic (Windows-1251)
3. Click on the "Put Russian F letter" link
4. Ensure the text box contains a one-character string
5. Click on the "Submit Query" button
6. Ensure the text x=%D4 appears on the page
7. Click on the "Put Russian F letter" link once again
8. Ensure the text box contains one letter
9. Click on the "JQ" button
10. Wait until a copy of the page appears at the bottom of the page
11. The text x=%D0%A4 appears, while x=%D4 was expected
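The difference between steps 6 and 11 can be reproduced without a browser. The snippet below is only an illustration, assuming a JavaScript runtime that provides `TextEncoder` (current browsers or Node.js); the variable names are not part of the test case. It shows that `encodeURIComponent` percent-encodes the UTF-8 bytes of U+0424, while Windows-1251 stores the same letter as the single byte 0xD4.

```javascript
// U+0424 is the Cyrillic capital letter Ef used in the test case.
const letter = '\u0424';

// encodeURIComponent always percent-encodes the UTF-8 bytes of the string.
const utf8Escaped = encodeURIComponent(letter);

// The UTF-8 byte sequence itself, for comparison with the escaped form.
const utf8Bytes = Array.from(new TextEncoder().encode(letter));

console.log(utf8Escaped);                          // %D0%A4
console.log(utf8Bytes.map(b => b.toString(16)));   // [ 'd0', 'a4' ]
```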
Browsers tested
- Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080208 Fedora/2.0.0.12-1.fc7 Firefox/2.0.0.12
- Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.6 (like Gecko)
Attachments (1)
Change History (15)
comment:2 Changed 14 years ago by
Cc: | samsol added |
---|
http://code.google.com/p/browsersec/wiki/Part1#Unicode_in_URLs
The table there says that plain links are always encoded in UTF-8, but XMLHttpRequest should use the current page encoding. I wonder whether we might confuse issues more by trying to follow that. As it stands now, jQuery always uses UTF-8 and it's just a question of whether the string created is sent via POST or GET.
Note that the use of UTF-8 for the body on a POST is non-negotiable; I believe that some browsers like Firefox will force it back if you try to override it. http://www.w3.org/TR/XMLHttpRequest/#send "data is a DOMString: Encode data using UTF-8 for transmission."
What is the impact of using UTF-8 in the URL?
comment:3 Changed 14 years ago by
Try running the test case.
The problem that I have found is...
The server-side code (test.cgi) receives different HTTP_1_1_Requests* from the same HTML page.
The HTTP_1_1_Request made by submitting the HTML form is NOT EQUAL to the HTTP_1_1_Request made by calling the jQuery function $.get
I'm talking only about the query part of the http URL (or the body of POST requests).
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
Request | Encoding of query or body |
---|---|
<form method="POST" enctype="application/x-www-form-urlencoded"> | page encoding |
$.post | UTF-8 encoding |
<form method="GET"> | page encoding |
$.get | UTF-8 encoding |
Run test case to understand!
*HTTP_1_1_Request - message sent from client to server. See chapter 5 in http://www.ietf.org/rfc/rfc2616.txt
comment:4 Changed 14 years ago by
More information:
http://xkr.us/articles/javascript/encode-compare/
To get the character encoding right, it *seems* like the right thing to do is use escape()
for the URL and any data added to the URL (e.g., data in a GET request). However, escape()
produces single-byte %XX escapes only for ISO-8859-1 characters (anything beyond that becomes %uXXXX) and leaves "+" unencoded, so it sounds like we'd need to do our own encoding.
See also #4315 which is a duplicate of this bug.
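For reference, a quick check of what the legacy global `escape()` actually emits, next to `encodeURIComponent`. This is only an illustration, assuming a runtime that still ships `escape()` (browsers and Node.js do); the `samples` object is a hypothetical name. It suggests why `escape()` alone cannot produce Windows-1251 output: Cyrillic letters come out in the non-standard `%uXXXX` form.

```javascript
// Compare the legacy escape() with encodeURIComponent on a few inputs.
const samples = {
  latin1: escape('\u00e9'),                // e-acute is within ISO-8859-1
  cyrillic: escape('\u0424'),              // outside ISO-8859-1
  space: escape('a b'),
  plus: escape('+'),
  cyrillicUtf8: encodeURIComponent('\u0424')
};

console.log(samples.latin1);        // %E9
console.log(samples.cyrillic);      // %u0424
console.log(samples.space);         // a%20b
console.log(samples.plus);          // +
console.log(samples.cyrillicUtf8);  // %D0%A4
```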
comment:5 Changed 14 years ago by
This looks like a conflict between standards. Before DOM Level 3 there was, as far as I know, no way to detect the document's input encoding, but since Level 3 it is possible: http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Document3-inputEncoding On the other hand, we have http://www.ecma-international.org/publications/standards/Ecma-262.htm section 15.1.3, which strictly defines UTF-8 for encodeURIComponent.
So it is possible to fix this bug using:
if ( document.inputEncoding ) { // DOM Level 3 only
    part = this.encodeURIComponentWithPropperEncoding( document.inputEncoding, raw );
} else {
    part = encodeURIComponent( raw );
}
It is still too hard to create encodeURIComponentWithPropperEncoding.
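To give an idea of what such a function involves, here is a minimal sketch restricted to a single encoding, Windows-1251, and to the basic Russian alphabet. All function names (`toCp1251Byte`, `encodeURIComponentCp1251`, `encodeForPage`) are hypothetical and not part of jQuery. A general solution would need a full mapping table per supported encoding, which is the hard part; this sketch simply throws on characters it cannot map.

```javascript
// Map a Unicode code point to its Windows-1251 byte, or -1 if this sketch
// does not cover it. Only ASCII and the basic Russian alphabet are handled.
function toCp1251Byte(codePoint) {
  if (codePoint < 0x80) return codePoint;            // ASCII maps to itself
  if (codePoint >= 0x0410 && codePoint <= 0x044F) {
    return codePoint - 0x0350;                       // U+0410..U+044F -> 0xC0..0xFF
  }
  if (codePoint === 0x0401) return 0xA8;             // capital Io
  if (codePoint === 0x0451) return 0xB8;             // small io
  return -1;
}

// Percent-encode `raw` using Windows-1251 bytes instead of UTF-8.
function encodeURIComponentCp1251(raw) {
  let out = '';
  for (const ch of raw) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      out += encodeURIComponent(ch);                 // reuse the ASCII rules
      continue;
    }
    const byte = toCp1251Byte(cp);
    if (byte < 0) {
      throw new RangeError('Not representable in this sketch: U+' + cp.toString(16));
    }
    out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

// Choose the encoder from the page encoding, falling back to UTF-8.
function encodeForPage(raw, pageEncoding) {
  if (String(pageEncoding).toLowerCase() === 'windows-1251') {
    return encodeURIComponentCp1251(raw);
  }
  return encodeURIComponent(raw);
}

console.log(encodeForPage('\u0424', 'Windows-1251')); // %D4
console.log(encodeForPage('\u0424', 'UTF-8'));        // %D0%A4
```

In a browser the second argument would come from `document.inputEncoding`, as in the snippet above.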
Dear developers! If and when you fix this bug, please change xhr.setRequestHeader("X-Requested-With", "XMLHttpRequest") to something like xhr.setRequestHeader("X-Requested-With", "XMLHttpRequest.v2")
comment:6 Changed 12 years ago by
Status: | new → open |
---|
comment:7 Changed 12 years ago by
Keywords: | ajaxrewrite added |
---|
comment:8 Changed 12 years ago by
Cc: | jaubourg added |
---|
Please confirm this bug still exists in jQuery 1.6b1.
comment:9 Changed 12 years ago by
Milestone: | 1.2 |
---|---|
Owner: | set to samsol |
Status: | open → pending |
comment:10 Changed 12 years ago by
Resolution: | → invalid |
---|---|
Status: | pending → closed |
Because we get so many tickets, we often need to return them to the initial reporter for more information. If that person does not reply within 14 days, the ticket will automatically be closed, and that has happened in this case. If you still are interested in pursuing this issue, feel free to add a comment with the requested information and we will be happy to reopen the ticket if it is still valid. Thanks!
comment:11 Changed 12 years ago by
I'm using jQuery JavaScript Library v1.6 (Date: Mon May 2 13:50:00 2011 -0400) and the bug still exists.
comment:13 Changed 11 years ago by
Resolution: | invalid |
---|---|
Status: | closed → reopened |
comment:14 Changed 11 years ago by
Resolution: | → patchwelcome |
---|---|
Status: | reopened → closed |
This was closed invalid due to a lack of response from the OP, but in practical terms I don't think we can fix this due to the conflict in the standards involved. XHR wants UTF-8, period. If anyone has a solution we could consider it.
Workaround:
Because jQuery sets the X-Requested-With request header,
we can use it to detect jQuery requests on the server side
Example java servlet
in Perl you have to use
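The same idea, sketched in JavaScript (Node.js) rather than Java or Perl: pick the decoding of a query value from the X-Requested-With header. This is a hedged illustration only. The function names are hypothetical, the headers object is assumed to use lower-case keys (as Node's http module provides them), and the Windows-1251 decoder covers just ASCII and the basic Russian alphabet.

```javascript
// Turn a raw query value into bytes; '+' is treated as a space.
function percentDecodeToBytes(raw) {
  const s = raw.replace(/\+/g, ' ');
  const bytes = [];
  for (let i = 0; i < s.length; i++) {
    const hex = s.slice(i + 1, i + 3);
    if (s[i] === '%' && /^[0-9a-fA-F]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2;                                  // skip the two hex digits
    } else {
      bytes.push(s.charCodeAt(i) & 0xFF);
    }
  }
  return bytes;
}

// Decode Windows-1251 bytes; unsupported bytes become U+FFFD in this sketch.
function decodeCp1251(bytes) {
  return bytes.map(function (b) {
    if (b < 0x80) return String.fromCharCode(b);
    if (b >= 0xC0) return String.fromCharCode(b + 0x0350);
    if (b === 0xA8) return '\u0401';
    if (b === 0xB8) return '\u0451';
    return '\uFFFD';
  }).join('');
}

// jQuery ajax requests carry X-Requested-With: XMLHttpRequest and use UTF-8;
// plain form submissions from the page are assumed to use Windows-1251.
function decodeQueryValue(raw, headers) {
  const bytes = percentDecodeToBytes(raw);
  if (headers['x-requested-with'] === 'XMLHttpRequest') {
    return new TextDecoder('utf-8').decode(new Uint8Array(bytes));
  }
  return decodeCp1251(bytes);
}

const fromAjax = decodeQueryValue('%D0%A4', { 'x-requested-with': 'XMLHttpRequest' });
const fromForm = decodeQueryValue('%D4', {});
console.log(fromAjax === fromForm); // true
```

Both calls yield the same one-letter string, so the rest of the server code can ignore where the request came from.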