Question

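When I run a phrase that contains double quotes through this function, it replaces the quotes with the literal text quot.

I want to remove them completely (single quotes too). How can I change this function to do that?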

function string_sanitize($s) {
    $result = preg_replace("/[^a-zA-Z0-9]+/", "", $s);
    return $result;
}

Update

Example 1: This is 'the' first example 
returns: Thisis030the039firstexample 
Errors: Warning: preg_match_all() [function.preg-match-all]: Unknown modifier '0' in C


Example 2: This is my "second" example
returns: Thisismyquotsecondquotexample
Errors: Invalid express in Xpath

Answer 1

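I wouldn't call that function string_sanitize(), as the name is misleading. You could call it strip_non_alphanumeric().

Your current function strips anything that isn't an uppercase or lowercase letter or a digit.

To strip ' and ", you can use: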

$str = str_replace(array('\'', '"'), '', $str);

Answer 2

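It looks like your original string contained the HTML entity for " (&quot;), so when you sanitize it you simply remove the & and the ;, leaving the rest of the entity, quot, in the string.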
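A minimal sketch of that effect, assuming the input reaches the function already HTML-encoded (the variable name $encoded is only for illustration):

```php
<?php
// The question's original function, unchanged.
function string_sanitize($s) {
    $result = preg_replace("/[^a-zA-Z0-9]+/", "", $s);
    return $result;
}

// Assumed input: the double quotes have already been encoded as &quot;.
$encoded = 'This is my &quot;second&quot; example';

// Only "&", ";" and the spaces are stripped; the letters "quot" survive.
echo string_sanitize($encoded), "\n"; // Thisismyquotsecondquotexample
```

This reproduces the output reported for Example 2 in the question.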

--- EDIT

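Probably the easiest way to remove non-alphanumeric characters is to decode the HTML entities with html_entity_decode and then run the result through the regular expression. Since, in this case, nothing that needs re-encoding will be left, you don't need to call htmlentities afterwards, but it's worth remembering that you started with HTML data and now have raw, unencoded data.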

For example:

function string_sanitize($s) {
    $result = preg_replace("/[^a-zA-Z0-9]+/", "", html_entity_decode($s, ENT_QUOTES));
    return $result;
}

Note that ENT_QUOTES flags the function to "...convert both double and single quotes."
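The effect of the flag can be checked with a small sketch. The helper name sanitize_with_flags and its $flags parameter are hypothetical, added only to compare the two modes; the input assumes single quotes arrive encoded as &#039;.

```php
<?php
// Hypothetical helper: same regex as above, with the decode flags exposed for comparison.
function sanitize_with_flags($s, $flags) {
    return preg_replace("/[^a-zA-Z0-9]+/", "", html_entity_decode($s, $flags));
}

$encoded = 'This is &#039;the&#039; first example';

// ENT_COMPAT decodes &quot; but leaves &#039; alone, so the digits leak through.
echo sanitize_with_flags($encoded, ENT_COMPAT), "\n"; // Thisis039the039firstexample

// ENT_QUOTES decodes both kinds of quote before the regex runs.
echo sanitize_with_flags($encoded, ENT_QUOTES), "\n"; // Thisisthefirstexample
```

On PHP 8.1 and later the default flags of html_entity_decode already include ENT_QUOTES, so passing it explicitly matters mainly on older versions.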

Answer 3

I think your preg_replace call should look like this:

$result = preg_replace("/[^a-zA-Z0-9]+/", "", html_entity_decode($s));

See the html_entity_decode reference for more details.

Answer 4

An easy way to handle both single and double quotes :) It still leaves something similar to look at.

$clean_string = str_replace('"', '``', str_replace("'", "`", $UserInput));

Answer 5

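Your function uses a regular expression to remove any character other than [a-zA-Z0-9], so it will certainly remove any " or '.

EDIT: from Hamish's answer I see that your string is an HTML string, which explains why " (&quot;) gets turned into quot. You could replace &quot; with preg_replace, or run htmlspecialchars_decode first.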
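A sketch of the htmlspecialchars_decode route, assuming the goal is to drop only the quotes and keep the rest of the text; the function name strip_quotes_only is made up for this example.

```php
<?php
// Hypothetical helper: decode the basic HTML entities, then delete quote characters.
function strip_quotes_only($s) {
    // ENT_QUOTES makes the decoder handle &#039; as well as &quot;.
    $decoded = htmlspecialchars_decode($s, ENT_QUOTES);
    return str_replace(array("'", '"'), '', $decoded);
}

echo strip_quotes_only('This is my &quot;second&quot; example'), "\n"; // This is my second example
echo strip_quotes_only('This is &#039;the&#039; first example'), "\n"; // This is the first example
```

Unlike the regular expression in the question, this keeps spaces and punctuation, so it only fits if those characters are acceptable downstream.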

Answer 6

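To be sure of removing all kinds of quotes (including those whose left form differs from the right one), I think it has to be something like this: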

function string_sanitize($s) {
    $result = htmlentities($s);
    $result = preg_replace('/^(&quot;)(.*)(&quot;)$/', "$2", $result);
    $result = preg_replace('/^(&laquo;)(.*)(&raquo;)$/', "$2", $result);
    $result = preg_replace('/^(&#8220;)(.*)(&#8221;)$/', "$2", $result);
    $result = preg_replace('/^(&#39;)(.*)(&#39;)$/', "$2", $result);
    $result = html_entity_decode($result);
    return $result;
}
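The patterns above are anchored with ^ and $, so they only strip a pair of quotes that wraps the whole string. If quotes can appear anywhere, one alternative (not this answer's method) is to decode the entities and delete a fixed set of quote characters. This is a sketch that assumes UTF-8 input; the name strip_all_quotes and the particular set of characters are choices made for this example.

```php
<?php
// Hypothetical helper: remove straight, curly, and angle quotes wherever they occur.
function strip_all_quotes($s) {
    // Turn entities such as &quot;, &#039;, &laquo; or &#8220; into real characters.
    $decoded = html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // U+2018/U+2019 and U+201C/U+201D are curly quotes; U+00AB/U+00BB are guillemets.
    return preg_replace('/["\'\x{2018}\x{2019}\x{201C}\x{201D}\x{00AB}\x{00BB}]/u', '', $decoded);
}

echo strip_all_quotes('&laquo;quoted&raquo; and &#8220;curly&#8221; and &quot;plain&quot;'), "\n";
// quoted and curly and plain
```

The /u modifier is needed so that the \x{...} escapes are matched as Unicode code points rather than single bytes.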

How to remove single and double quotes from a string

6 answers: