Bug Tracker

Ticket #3611 (closed bug: patchwelcome)

Opened 5 years ago

Last modified 14 months ago

Wrong codepage in URL (and in post data) when not UTF-8

Reported by: samsol
Owned by: samsol
Priority: major
Milestone:
Component: ajax
Version: 1.2.6
Keywords: charset,url,ajax,encodeURIComponent,ajaxrewrite
Cc: samsol, jaubourg
Blocking:
Blocked by:

Description

When an HTML page uses a charset other than UTF-8, the ajax functions send incorrectly encoded requests: QUERY_STRING parameters are encoded as UTF-8, while a plain form submission encodes them in the page's charset.

Test case.

file test.cgi

#!/usr/bin/perl

print "Content-Type: text/html; charset=Windows-1251\n";
print "Cache-control: private\n";
print "\n";

print << "EOL";
<html>
  <head>
    <script language='javascript' src='jquery-1.2.6.js'></script>
  </head>
  <body>
    <form>
      <input type='text' name='x' id='x'>
      <input type='submit'>
      <a href="javascript:putRussianFLetter()">Put Russian F letter</a>
    </form>
$ENV{'QUERY_STRING'}
    <button onclick='doClick()'>JQ</button>
    <script language='javascript'>
    function doClick() {
      \$.get('test.cgi', {x:\$('#x')[0].value, anticache:Math.random()},
          function( data ){\$('#out').html(data);});
    }
    function putRussianFLetter() {
      \$('#x')[0].value = '\\u0424';
    }
    </script>
    <div id='out'></div>
  </body>
</html>
EOL
  1. Navigate to  http://your_address_here/test.cgi
  2. Ensure that the browser detected the charset as Cyrillic (Windows-1251)
  3. Click the "Put Russian F letter" link
  4. Ensure the text box contains a one-character string
  5. Click the "Submit Query" button
  6. Ensure the text x=%D4 appears on the page
  7. Click the "Put Russian F letter" link once again
  8. Ensure the text box contains one letter
  9. Click the "JQ" button
  10. Wait until a copy of the page appears at the bottom of the page
  11. The text x=%D0%A4 appears, while x=%D4 was expected
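
The difference between steps 6 and 11 can be reproduced outside the browser. This is a minimal sketch, not jQuery code: it assumes only the Cyrillic range U+0410 to U+044F, which Windows-1251 maps to the bytes 0xC0 to 0xFF, and the helper name percentEncodeCp1251Letter is hypothetical.

```javascript
// U+0424 is the Cyrillic capital letter EF used in the test case.
const letter = '\u0424';

// encodeURIComponent always percent-encodes the UTF-8 bytes,
// which is what ends up in the URL built by $.get.
const viaJQuery = 'x=' + encodeURIComponent(letter);

// Hypothetical helper: Windows-1251 byte for letters in U+0410..U+044F only.
function percentEncodeCp1251Letter(ch) {
  const code = ch.charCodeAt(0);
  if (code < 0x0410 || code > 0x044f) {
    throw new RangeError('character is outside the range handled by this sketch');
  }
  const byte = code - 0x0350; // U+0410 maps to 0xC0, U+044F maps to 0xFF
  return '%' + byte.toString(16).toUpperCase();
}

// What a form submitted from a Windows-1251 page puts in the query string.
const viaForm = 'x=' + percentEncodeCp1251Letter(letter);

console.log(viaJQuery); // x=%D0%A4
console.log(viaForm);   // x=%D4
```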

Browsers tested

  • Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080208 Fedora/2.0.0.12-1.fc7 Firefox/2.0.0.12
  • Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.6 (like Gecko)

Attachments

test.cgi Download (800 bytes) - added by samsol 5 years ago.

Change History

Changed 5 years ago by samsol

comment:1 Changed 4 years ago by samsol

Workaround:

Because jQuery sets the X-Requested-With request header

xhr.setRequestHeader("X-Requested-With", "XMLHttpRequest");

we can use it to detect jQuery requests on the server side.

Example Java servlet:

if("XMLHttpRequest".equals(req.getHeader("X-Requested-With"))) {
    try {
        req.setCharacterEncoding("UTF-8");
    } catch(UnsupportedEncodingException e){
        throw new ServletException( e );
    }
}

In Perl you can use:

if($ENV{'HTTP_X_REQUESTED_WITH'} eq 'XMLHttpRequest'){
    # UTF-8 encoding
}else{
    # your default encoding
}
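
The same check can be sketched for a JavaScript server. This is an illustration only: requestCharset is a hypothetical helper, and it assumes a headers object with lower-cased names, which is how Node's http module exposes request headers. The header is set by jQuery rather than by the browser, so requests from other XHR clients would fall through to the page charset.

```javascript
// Hypothetical helper: pick the charset used to decode request parameters.
// `headers` is assumed to be a plain object with lower-cased header names.
function requestCharset(headers, pageCharset) {
  const isJQueryAjax = headers['x-requested-with'] === 'XMLHttpRequest';
  // jQuery ajax requests are UTF-8; plain form submissions use the page charset.
  return isJQueryAjax ? 'UTF-8' : pageCharset;
}

console.log(requestCharset({ 'x-requested-with': 'XMLHttpRequest' }, 'windows-1251')); // UTF-8
console.log(requestCharset({}, 'windows-1251')); // windows-1251
```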

comment:2 Changed 4 years ago by dmethvin

  • Cc samsol added

 http://code.google.com/p/browsersec/wiki/Part1#Unicode_in_URLs

The table there says that plain links are always encoded in UTF-8, but XMLHttpRequest should use the current page encoding. I wonder whether we might confuse issues more by trying to follow that. As it stands now, jQuery always uses UTF-8 and it's just a question of whether the string created is sent via POST or GET.

Note that the use of UTF-8 for the body on a POST is non-negotiable; I believe that some browsers like Firefox will force it back if you try to override it.  http://www.w3.org/TR/XMLHttpRequest/#send "data is a DOMString: Encode data using UTF-8 for transmission."

What is the impact of using UTF-8 in the URL?

comment:3 Changed 4 years ago by samsol

Try running the test case.

The problem that I have found is...

The server-side code (test.cgi) receives different HTTP_1_1_Requests* from the same HTML page.

The HTTP_1_1_Request made by submitting the HTML form is NOT EQUAL to the HTTP_1_1_Request made by calling the jQuery function $.get

I'm talking only about the query part of the http URL (or the body of POST requests).

http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

<form method="POST" enctype="application/x-www-form-urlencoded">   page encoding
$.post                                                             UTF-8 encoding
<form method="GET">                                                page encoding
$.get                                                              UTF-8 encoding

Run test case to understand!
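
To illustrate why the mismatch matters, here is a small sketch of what a handler that assumes UTF-8 sees for the two query values from the test case. The variable names are made up for the example.

```javascript
// Query values produced by the two kinds of request in the test case.
const fromJQuery = '%D0%A4'; // $.get: UTF-8 bytes of U+0424
const fromForm = '%D4';      // form submit: Windows-1251 byte of U+0424

// decodeURIComponent assumes UTF-8, so the jQuery value round-trips.
const decodedOk = decodeURIComponent(fromJQuery) === '\u0424';

// The form value is not valid UTF-8, so the same decoder rejects it.
let formDecodeError = null;
try {
  decodeURIComponent(fromForm);
} catch (err) {
  formDecodeError = err;
}

console.log(decodedOk);                            // true
console.log(formDecodeError instanceof URIError);  // true
```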


*HTTP_1_1_Request - message sent from client to server. See chapter 5 in  http://www.ietf.org/rfc/rfc2616.txt

comment:4 Changed 4 years ago by dmethvin

More information:

 http://xkr.us/articles/javascript/encode-compare/

To get the character encoding right, it *seems* like the right thing to do is use escape() for the URL and any data added to the URL (e.g., data in a GET request). However, escape() only handles ISO-8859-1 (anything above U+00FF becomes %uXXXX) and leaves + unencoded, so it sounds like we'd need to do our own encoding.
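
A quick comparison of the two built-ins, as a sketch; the sample characters are chosen for illustration and are not taken from the linked article.

```javascript
// Compare escape() and encodeURIComponent() on a Latin-1 character,
// a Cyrillic character, and a plus sign.
const rows = ['\u00e4', '\u0424', '+'].map((input) => ({
  input,
  viaEscape: escape(input),
  viaEncodeURIComponent: encodeURIComponent(input),
}));

for (const row of rows) {
  console.log(row.viaEscape, row.viaEncodeURIComponent);
}
// %E4 %C3%A4
// %u0424 %D0%A4
// + %2B
```

Neither output for U+0424 matches the %D4 that a Windows-1251 form submission produces.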

See also #4315 which is a duplicate of this bug.

comment:5 Changed 4 years ago by samsol

This looks like a collision between standards. Before DOM Level 3 there was no way to detect the document's input encoding (as far as I know), but since Level 3 it is possible:  http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Document3-inputEncoding On the other hand, section 15.1.3 of  http://www.ecma-international.org/publications/standards/Ecma-262.htm strictly defines UTF-8 for encodeURIComponent.

So this bug could be fixed with something like:

if( document.inputEncoding ) {
  // DOM Level 3 Only
  part = this.encodeURIComponentWithPropperEncoding(document.inputEncoding, raw );
} else {
  part = encodeURIComponent( raw );
}

Writing encodeURIComponentWithPropperEncoding is still the hard part, though.
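
A very rough sketch of what such a function might look like, to show where the difficulty lies. It is not a proposed patch: it assumes a hand-written table that covers only ASCII and the Russian letters of Windows-1251, it falls back to encodeURIComponent for every other charset, and the helper cp1251Byte is hypothetical. A real implementation would need a full table for every supported charset.

```javascript
// Characters that encodeURIComponent also leaves unescaped.
const UNRESERVED = /^[A-Za-z0-9\-_.!~*'()]$/;

// Hypothetical partial table: Windows-1251 byte for a code point, or -1.
function cp1251Byte(code) {
  if (code < 0x80) return code;                               // ASCII
  if (code >= 0x0410 && code <= 0x044f) return code - 0x0350; // U+0410..U+044F
  if (code === 0x0401) return 0xa8;                           // capital IO
  if (code === 0x0451) return 0xb8;                           // small io
  return -1;                                                  // not covered here
}

function encodeURIComponentWithPropperEncoding(encoding, raw) {
  if (!/^(windows-1251|cp1251)$/i.test(encoding)) {
    return encodeURIComponent(raw); // no table for this charset in the sketch
  }
  let out = '';
  for (const ch of raw) {
    if (UNRESERVED.test(ch)) {
      out += ch;
      continue;
    }
    const byte = cp1251Byte(ch.codePointAt(0));
    if (byte < 0) {
      throw new RangeError('character not covered by this sketch: ' + ch);
    }
    out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

console.log(encodeURIComponentWithPropperEncoding('windows-1251', '\u0424')); // %D4
console.log(encodeURIComponentWithPropperEncoding('UTF-8', '\u0424'));        // %D0%A4
```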

Dear developers! When you fix this bug (if you do), please change xhr.setRequestHeader("X-Requested-With", "XMLHttpRequest") to something like xhr.setRequestHeader("X-Requested-With", "XMLHttpRequest.v2"), so that server-side code can tell the old behavior from the new one.

comment:6 Changed 3 years ago by dmethvin

  • Status changed from new to open

comment:7 Changed 2 years ago by rwaldron

  • Keywords charset,url,ajax,encodeURIComponent,ajaxrewrite added; charset url ajax encodeURIComponent removed

comment:8 Changed 2 years ago by timmywil

  • Cc jaubourg added

Please confirm this bug still exists in jQuery 1.6b1.

comment:9 Changed 2 years ago by john

  • Owner set to samsol
  • Status changed from open to pending
  • Milestone 1.2 deleted

comment:10 Changed 2 years ago by trac-o-bot

  • Status changed from pending to closed
  • Resolution set to invalid

Because we get so many tickets, we often need to return them to the initial reporter for more information. If that person does not reply within 14 days, the ticket will automatically be closed, and that has happened in this case. If you still are interested in pursuing this issue, feel free to add a comment with the requested information and we will be happy to reopen the ticket if it is still valid. Thanks!

comment:11 Changed 2 years ago by anonymous

I'm using jQuery JavaScript Library v1.6 (Date: Mon May 2 13:50:00 2011 -0400) and the bug still exists.

comment:12 Changed 16 months ago by sam@…

This bug still exists in jQuery v1.7.1, as of Jan 26, 2012.

comment:13 Changed 16 months ago by dmethvin

  • Status changed from closed to reopened
  • Resolution invalid deleted

comment:14 Changed 16 months ago by dmethvin

  • Status changed from reopened to closed
  • Resolution set to patchwelcome

This was closed invalid due to a lack of response from the OP, but in practical terms I don't think we can fix this due to the conflict in the standards involved. XHR wants UTF-8, period. If anyone has a solution we could consider it.

Please follow the bug reporting guidelines and use jsFiddle when providing test cases and demonstrations instead of pasting the code in the ticket.
