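I'm looking for a way to detect and remove quotes nested inside quotes, for example: "something" something "something".
In the example above, as you can see, the italicized part is wrapped in double quotes. I want to strip the quotes around that inner string while keeping the outer ones.
So the expression should look for quoted text that contains another quoted run of text, and then remove the quotes wrapping the inner one.
Here is my current code (PHP):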
preg_match_all('/".*(".*").*"/', $text, $matches);
if (is_array($matches[0])) {
    foreach ($matches[0] as $match) {
        $text = str_replace($match, '"' . str_replace('"', '', $match) . '"', $text);
    }
}
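Answer 0 (score: 1)
If the string starts with " and the double quotes in the string are always balanced, you can use the pattern /^"(*SKIP)(*F)|"([^"]+)"/.
The first alternative matches the double quote at the start of the string and then discards that match with (*SKIP)(*F), so the opening quote is never replaced. The second alternative matches a ", captures whatever lies between it and the next " in group 1, and then matches the closing ".
In the replacement you can use capture group 1, $1: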
$pattern = '/^"(*SKIP)(*F)|"([^"]+)"/';
$str = "\"something \"something something\" and then \"something\" something\"";
echo preg_replace($pattern, "$1", $str);
This outputs:
"something something something and then something something"
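The precondition in this answer (the string must begin with a double quote) is easy to overlook, so here is a minimal sketch that wraps the same pattern in a helper and runs it on two inputs. The function name stripInnerQuotes is hypothetical and not part of the original answer, and the scalar type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helper (the name is an assumption): applies the answer's pattern.
function stripInnerQuotes(string $text): string
{
    // ^"(*SKIP)(*F) : match the quote at the very start, then force a failure
    //                 and skip past it, so that quote is never replaced
    // "([^"]+)"     : any later quoted run; group 1 holds its content
    $pattern = '/^"(*SKIP)(*F)|"([^"]+)"/';

    // preg_replace() returns null on a PCRE error; fall back to the input then.
    return preg_replace($pattern, '$1', $text) ?? $text;
}

// Starts with a quote: the outer pair is kept.
echo stripInnerQuotes('"something "something something" and then "something" something"'), "\n";

// Does not start with a quote: the precondition is violated,
// so every quote ends up being removed.
echo stripInnerQuotes('something "something "something something" something" something'), "\n";
```

The first call prints the string with only its outer quotes left. The second prints the six words with no quotes at all, because without a leading quote the first alternative never matches and the outer pair gets consumed like any other.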
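Answer 1 (score: 1)
You can use strpos() with its third parameter (offset) to find the position of every quote, and then remove all of them except the first and the last (indices 1 through n-2 of the n positions found):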
<?php
$data = <<<DATA
something "something "something something" something" something
DATA;
# set up the needed variables
$needle = '"';
$lastPos = 0;
$positions = array();
# find all quotes
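while (($lastPos = strpos($data, $needle, $lastPos)) !== false) {
    $positions[] = $lastPos;
    $lastPos = $lastPos + strlen($needle);
}
# remove the inner quotes if there are more than 2
if (count($positions) > 2) {
    # go from right to left so the earlier offsets stay valid;
    # $data[$pos] = "" does not delete a character (PHP 8 throws an Error for it)
    for ($i = count($positions) - 2; $i >= 1; $i--) {
        $data = substr_replace($data, '', $positions[$i], strlen($needle));
    }
}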
# check the result
echo $data;
?>
This produces:
something "something something something something" something
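As a variation on the same idea, and not the answer's own method, the first and last quote can be located directly with strpos() and strrpos(), and the quotes in between removed with str_replace(). The sketch below assumes a single-character quote and PHP 7.0 or later; the function name keepOuterQuotes is hypothetical.

```php
<?php
// Hypothetical helper (the name is an assumption): keeps the first and the
// last quote and drops every quote between them.
function keepOuterQuotes(string $text, string $quote = '"'): string
{
    $first = strpos($text, $quote);
    $last  = strrpos($text, $quote);

    // No quote at all, or only one: nothing to strip.
    if ($first === false || $first === $last) {
        return $text;
    }

    $start  = $first + strlen($quote);                // just after the first quote
    $head   = substr($text, 0, $start);               // up to and including the first quote
    $middle = substr($text, $start, $last - $start);  // between the outer quotes
    $tail   = substr($text, $last);                   // the last quote and the rest

    return $head . str_replace($quote, '', $middle) . $tail;
}

echo keepOuterQuotes('something "something "something something" something" something'), "\n";
```

For the sample string this prints the same result as the loop above. It avoids collecting every position, at the cost of building three substrings.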
You can even wrap it in a class:
class unquote {
    # set up the needed variables
    private $data = "";
    private $needle = "";
    private $positions = array();

    public function cleanData($string = "", $needle = '"') {
        $this->data = $string;
        $this->needle = $needle;
        # reset, so the same object can be reused for another string
        $this->positions = array();
        $this->searchPositions();
        $this->replace();
        return $this->data;
    }

    private function searchPositions() {
        $lastPos = 0;
        # find all quotes
        while (($lastPos = strpos($this->data, $this->needle, $lastPos)) !== false) {
            $this->positions[] = $lastPos;
            $lastPos = $lastPos + strlen($this->needle);
        }
    }

    private function replace() {
        # remove the inner quotes if there are more than 2
        if (count($this->positions) > 2) {
            # go from right to left so the earlier offsets stay valid
            for ($i = count($this->positions) - 2; $i >= 1; $i--) {
                $this->data = substr_replace($this->data, '', $this->positions[$i], strlen($this->needle));
            }
        }
    }
}
and call it with:
$q = new unquote();
$data = $q->cleanData($data);