Bug Tracker

#14608 closed bug (notabug)

Opened December 05, 2013 09:19AM UTC

Closed December 06, 2013 05:42AM UTC

Last modified December 06, 2013 12:53PM UTC

Wrong text parsing with &reg (not &reg;)

Reported by: glueck@dozent.net Owned by:
Priority: undecided Milestone: None
Component: manipulation Version: 1.10.2
Keywords: Cc:
Blocked by: Blocking:
Description

If you try to write xxx&reg_xxx in an HTML context, the &reg part will be shown as ® (...e®_b...)

Sample:

$('#report').html("<div>rpt=sls_fleet_stat_tree&omit_olddata=true&ids=false&reg_by_month=false&cco_by_month=false</div>")

Attachments (0)
Change History (5)

Changed December 05, 2013 09:22AM UTC by Frank Glück <glueck@dozent.net> comment:1

or try this:

$('#report').html("&regby")

Changed December 05, 2013 10:19AM UTC by Frank Glück <glueck@dozent.net> comment:2

but it is a little bit difficult:

$('#report').html('&reg &amp &euro &lt &gt &hearts &copy &trade &pound')

Changed December 06, 2013 05:42AM UTC by gibson042 comment:3

component: unfiled → manipulation
resolution: → notabug
status: new → closed

This is not under the control of jQuery, and not even a bug. See http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#consume-a-character-reference for the relevant HTML5 character reference parsing logic:

Consume the maximum number of characters [immediately after the U+0026 AMPERSAND] possible, with the consumed characters matching one of the identifiers in the first column of the named character references table (in a case-sensitive manner).

Note that reg appears in the table, so the first 4 characters of &regby should be consumed and replaced by U+00AE REGISTERED SIGN (®).
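
The longest-match rule quoted above can be sketched in plain JavaScript. This is only an illustration, not the browser's tokenizer: NAMED_REFS is a tiny hand-picked subset of the real table, and decodeTextRefs is a hypothetical helper name. It covers text content only, not attribute values.

```javascript
// Tiny, hand-picked subset of the HTML named character reference table.
// Legacy names such as "reg" and "amp" are listed both with and without
// the trailing semicolon; "euro" is only listed with it.
const NAMED_REFS = {
  "reg": "\u00AE",
  "reg;": "\u00AE",
  "amp": "&",
  "amp;": "&",
  "euro;": "\u20AC",
};

// Hypothetical helper: decode named references in text content by taking
// the longest table entry that matches immediately after each "&".
function decodeTextRefs(input) {
  const maxLen = Math.max(...Object.keys(NAMED_REFS).map((k) => k.length));
  let out = "";
  let i = 0;
  while (i < input.length) {
    if (input[i] === "&") {
      let matched = null;
      // Try the longest possible candidate first, then shrink.
      const limit = Math.min(maxLen, input.length - i - 1);
      for (let len = limit; len > 0; len--) {
        const candidate = input.slice(i + 1, i + 1 + len);
        if (Object.prototype.hasOwnProperty.call(NAMED_REFS, candidate)) {
          matched = candidate;
          break;
        }
      }
      if (matched !== null) {
        out += NAMED_REFS[matched];
        i += 1 + matched.length; // skip "&" plus the matched name
        continue;
      }
    }
    out += input[i];
    i++;
  }
  return out;
}

console.log(decodeTextRefs("&regby"));
console.log(decodeTextRefs("ids=false&reg_by_month=false"));
console.log(decodeTextRefs("&euro &euro;"));
```

With this subset, "&reg" is consumed even when letters follow it, while "&euro" without a semicolon is left alone because only "euro;" is in the table.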

Changed December 06, 2013 12:41PM UTC by Frank Glück <glueck@dozent.net> comment:4

If the character reference is being consumed as part of an attribute, and the last character matched is not a U+003B SEMICOLON character (;), and the next character is either a U+003D EQUALS SIGN character (=) or an alphanumeric ASCII character, then, for historical reasons, all the characters that were matched after the U+0026 AMPERSAND character (&) must be unconsumed, and nothing is returned.
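
As a rough sketch of how that attribute exception differs from text content, the following plain JavaScript models a single lookup. LEGACY_REFS is a small illustrative subset of the table, and consumeNamedRef and its inAttribute flag are hypothetical names, not part of any browser or jQuery API.

```javascript
// Small illustrative subset of the named character reference table.
const LEGACY_REFS = {
  "reg": "\u00AE",
  "reg;": "\u00AE",
  "copy": "\u00A9",
  "copy;": "\u00A9",
};

// Hypothetical helper. input[pos] is expected to be "&".
// Returns { text, length } when a reference is consumed, or null when the
// "&" must be treated as a literal character.
function consumeNamedRef(input, pos, inAttribute) {
  // Longest names first, so "reg;" wins over "reg".
  const names = Object.keys(LEGACY_REFS).sort((a, b) => b.length - a.length);
  const name = names.find((n) => input.startsWith(n, pos + 1));
  if (name === undefined) return null;

  const next = input[pos + 1 + name.length];
  const historicalException =
    inAttribute &&
    !name.endsWith(";") &&
    next !== undefined &&
    /[=A-Za-z0-9]/.test(next);
  if (historicalException) return null; // matched characters are unconsumed

  return { text: LEGACY_REFS[name], length: 1 + name.length };
}

console.log(consumeNamedRef("&regby", 0, false)); // consumed in text content
console.log(consumeNamedRef("&regby", 0, true));  // left alone in an attribute
console.log(consumeNamedRef("&reg=1", 0, true));  // left alone in an attribute
```

The exception applies only inside attribute values, so content passed to .html() as element text still has "&reg" replaced.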

So try:

$('#report').html('&reg=&reg&regby')

Changed December 06, 2013 12:53PM UTC by dmethvin comment:5

The point is, this is not a jQuery bug. If you have issues with the standard, contact the W3C and ask them to change the rules.
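
Since the parsing follows the HTML standard, one common way to display a raw string such as the query string from the report is to escape it before it reaches .html(). A minimal sketch, assuming a hypothetical helper named escapeHtml (not a jQuery function):

```javascript
// Hypothetical helper: escape the characters that are significant to the
// HTML parser so the string is displayed literally.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;") // must run first, or later escapes get re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const query = "rpt=sls_fleet_stat_tree&omit_olddata=true&ids=false&reg_by_month=false";
console.log(escapeHtml(query));

// In a page with jQuery loaded, this might be used as:
//   $('#report').html("<div>" + escapeHtml(query) + "</div>");
// or, when no markup is needed at all:
//   $('#report').text(query);
```

Using .text() avoids HTML parsing of the string altogether, so no escaping is needed in that case.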