Escaping HTML in a JavaScript string: unterminated string literal

Time: 2010-11-08 19:45:01

Tags: javascript string escaping string-literals double-quotes

I am having trouble with a JavaScript string literal when encoding this value:

Unencoded

<!-- Start ValueClick Media 300x250 Code for Test Tag -->
<script language="javascript" src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n"></script>
<noscript><a href="http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1" target="_blank">
<img src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1"width=300 height=250 border=1></a></noscript>
<!-- End ValueClick Media 300x250 Code for Test Tag -->

I end up with this value:

Encoded

"<!-- Start ValueClick Media 300x250 Code for Test Tag -->\r\n<script language=\"javascript\" src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n\"></script>\r\n<noscript><a href=\"http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1\" target=\"_blank\">\r\n<img src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1\"width=300 height=250 border=1></a></noscript>\r\n<!-- End ValueClick Media 300x250 Code for Test Tag -->"

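When this is used as a string literal in some JavaScript code, Firefox complains that the literal is unterminated, but I cannot see why.

Oddly, if I remove the closing "</script>" tag from the HTML above, the encoded version works fine, as shown here:

Unencoded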

<!-- Start ValueClick Media 300x250 Code for Test Tag -->
<script language="javascript" src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n">
<noscript><a href="http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1" target="_blank">
<img src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1"width=300 height=250 border=1></a></noscript>
<!-- End ValueClick Media 300x250 Code for Test Tag -->

Encoded

"<!-- Start ValueClick Media 300x250 Code for Test Tag -->\r\n<script language=\"javascript\" src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n\">\r\n<noscript><a href=\"http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1\" target=\"_blank\">\r\n<img src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1\"width=300 height=250 border=1></a></noscript>\r\n<!-- End ValueClick Media 300x250 Code for Test Tag -->"

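This encoded value works...

Does anyone know what I am missing?

Update

It looks fairly obvious now; I blame lack of sleep. In this case the application depends on an old version of JSON.Net to encode the JavaScript, so I solved the problem by introducing a new JsonConverter that escapes closing tags in a second pass, after the JavaScript escaping has been applied.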

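using System;
using System.Globalization;
using System.IO;
using System.Text;
using Newtonsoft.Json;

public class EscapeTagsStringConverter : JsonConverter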
{
    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        if (value == null)
        {
            writer.WriteNull();
            return;
        }

        string escapedValue = ToEscapedJavaScriptString(value.ToString(), '"').Replace("</", "<\\/");

        writer.WriteRawValue("\"" + escapedValue + "\"");
    }

    public override object ReadJson(JsonReader reader, Type objectType, JsonSerializer serializer)
    {
        return reader.Value.ToString();
    }

    public override bool CanConvert(Type objectType)
    {
        return (objectType == typeof (string));
    }

    public static char IntToHex(int n)
    {
        if (n <= 9)
        {
            return (char)(n + 48);
        }
        return (char)((n - 10) + 97);
    }

    public static void WriteCharAsUnicode(TextWriter writer, char c)
    {
        char h1 = IntToHex((c >> 12) & '\x000f');
        char h2 = IntToHex((c >> 8) & '\x000f');
        char h3 = IntToHex((c >> 4) & '\x000f');
        char h4 = IntToHex(c & '\x000f');

        writer.Write('\\');
        writer.Write('u');
        writer.Write(h1);
        writer.Write(h2);
        writer.Write(h3);
        writer.Write(h4);
    }

    public static void WriteEscapedJavaScriptChar(TextWriter writer, char c, char delimiter)
    {
        switch (c)
        {
            case '\t':
                writer.Write(@"\t");
                break;
            case '\n':
                writer.Write(@"\n");
                break;
            case '\r':
                writer.Write(@"\r");
                break;
            case '\f':
                writer.Write(@"\f");
                break;
            case '\b':
                writer.Write(@"\b");
                break;
            case '\\':
                writer.Write(@"\\");
                break;
            case '\'':
                writer.Write((delimiter == '\'') ? @"\'" : @"'");
                break;
            case '"':
                writer.Write((delimiter == '"') ? "\\\"" : @"""");
                break;
            default:
                if (c > '\u001f')
                    writer.Write(c);
                else
                    WriteCharAsUnicode(writer, c);
                break;
        }
    }

    public void WriteEscapedJavaScriptString(TextWriter writer, string value, char delimiter)
    {
        if (value != null)
        {
            for (int i = 0; i < value.Length; i++)
            {
                WriteEscapedJavaScriptChar(writer, value[i], delimiter);
            }
        }
    }

    public string ToEscapedJavaScriptString(string value)
    {
        return ToEscapedJavaScriptString(value, '"');
    }

    public string ToEscapedJavaScriptString(string value, char delimiter)
    {
        using (StringWriter w = CreateStringWriter(GetLength(value) ?? 16))
        {
            WriteEscapedJavaScriptString(w, value, delimiter);
            return w.ToString();
        }
    }

    public static StringWriter CreateStringWriter(int capacity)
    {
        StringBuilder sb = new StringBuilder(capacity);
        StringWriter sw = new StringWriter(sb, CultureInfo.InvariantCulture);

        return sw;
    }

    public static int? GetLength(string value)
    {
        if (value == null)
            return null;
        return value.Length;
    }
}
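
For comparison, the same two-pass idea can be sketched in plain JavaScript. This is only an illustration, not part of the converter above: `toInlineScriptLiteral` is a hypothetical helper name, and `JSON.stringify` stands in for the first (JavaScript escaping) pass. The snippet is meant to be run standalone, for example in Node, rather than pasted into an inline script block.

```javascript
// Hypothetical helper: build a string literal that is safe to embed
// inside an inline <script> block.
function toInlineScriptLiteral(value) {
  // Pass 1: ordinary escaping of quotes, backslashes and control characters.
  const escaped = JSON.stringify(String(value));
  // Pass 2: rewrite every "</" as "<\/" so the HTML parser never sees
  // anything that looks like a closing tag inside the literal.
  return escaped.replace(/<\//g, "<\\/");
}

const html = '<script src="x.js"></script>\r\n<noscript></noscript>';
const literal = toInlineScriptLiteral(html);

// The literal no longer contains "</", yet it still decodes to the original text.
console.log(literal.includes("</"));
console.log(JSON.parse(literal) === html);
```

The second pass is safe because `\/` is a valid escape for `/` in both JSON and JavaScript string literals, so the decoded value is unchanged.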

2 answers:

Answer 0: (score: 4)

Well, yes. If you have:

<script>
    var s= '</script>';
</script>

how is the browser supposed to know that the first </script> is not the real end of the script element? Every browser, not just Firefox, will read it as:

<script>
    var s= '   // uh-oh! string literal left open!
</script>';    // script element closed. Then some trailing text content
</script>      // close-tag for a script that isn't open, ignore
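
The cut-off can be modelled with a few lines of JavaScript. This is a deliberately simplified sketch: `scriptBodySeenByParser` is a hypothetical name, and it assumes the parser simply stops at the first `</script` it finds, ignoring the finer rules of a real HTML tokenizer. Run it standalone (for example in Node), not inside an inline script block.

```javascript
// Simplified model of how the HTML parser delimits a script element.
// `textAfterOpenTag` is everything that follows the opening <script> tag.
function scriptBodySeenByParser(textAfterOpenTag) {
  // The parser knows nothing about JavaScript string literals; it just
  // looks for the first closing-tag sequence, case-insensitively.
  const end = textAfterOpenTag.search(/<\/script/i);
  return end === -1 ? textAfterOpenTag : textAfterOpenTag.slice(0, end);
}

const source = "\n    var s= '</script>';\n</script>";
const body = scriptBodySeenByParser(source);

// Prints the text handed to the JavaScript engine, as a quoted string:
// it ends right after the opening quote, so the literal is unterminated.
console.log(JSON.stringify(body));
```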

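To avoid ending the script block prematurely, a string literal containing the </ (ETAGO) sequence must be escaped in some way. You could write '<\/script>', or '\x3C/script>', or even '<'+'/script>' (that one is popular, but I find it inelegant).

Answer 1: (score: 0)

The encoded value does not raise an error in Chrome or Firefox 3.6.10. Which version are you using?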
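
A quick check, as a small sketch, that the three spellings are interchangeable at runtime. The variable names are made up for the example, and `String.fromCharCode(60)` is used only to build the reference value without writing the sequence literally.

```javascript
// Three ways to write the same string without "</" appearing in the source.
const backslashEscaped = '<\/script>';   // "\/" is simply "/"
const hexEscaped = '\x3C/script>';       // "\x3C" is "<"
const concatenated = '<' + '/script>';   // split across two literals

// Reference value: character code 60 is "<".
const expected = String.fromCharCode(60) + '/script>';

console.log(
  backslashEscaped === expected &&
  hexEscaped === expected &&
  concatenated === expected
); // prints true
```

Since the source text of each spelling never contains the two characters `<` and `/` side by side, the HTML parser has nothing to mistake for a closing tag.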