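I have the following problem:
$str="i am a <b>software</b> <span style=\"color:red;\">engineer.</span> i work at a company."; // 10 words in total (counting the inner text only)
I want the output to contain only the first 5 words, with their tags preserved:
$output="i am a <b>software</b> <span style=\"color:red;\">engineer.</span>"; // 5 words
How can this be done? Please help. Thanks.
I already have a function that limits the word count, but it does not take the tags into account: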
function word( $str, $wordCount = 10 ) {
    return implode(
        '',
        array_slice(
            preg_split(
                '/([\s,\.;\?\!]+)/',
                $str,
                $wordCount*2+1,
                PREG_SPLIT_DELIM_CAPTURE
            ),
            0,
            $wordCount*2-1
        )
    );
}
Answer 0 (score: 1)
Here is an example, but you will have to adapt it to the characters you allow in a word:
<?php
// Inside a single-quoted string the double quotes need no escaping.
$input = 'i am a <b>software</b> <span style="color:red;">engineer.</span> i work at a company.';
$pattern = '#((?: \s* (<[^>]*>)* [a-z.-]+ (</[^>]*>)* ){0,5}).*#x';
$result = preg_replace($pattern, '$1', $input);
var_dump($result);
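As one way of adapting the character class, the sketch below wraps the same pattern in a function and treats any run of Unicode letters, digits, and a few punctuation marks as a word. The function name `firstWordsUnicode` and the exact character set are assumptions made for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: same idea as the pattern above, but a "word" is any
// run of Unicode letters, digits, or common punctuation (an assumption).
function firstWordsUnicode(string $html, int $limit): string
{
    // u = treat pattern and subject as UTF-8, s = let "." match newlines too.
    $pattern = sprintf(
        '#^((?:\s*(?:<[^>]*>)*[\p{L}\p{N}.,;:!?-]+(?:</[^>]*>)*){0,%d}).*#us',
        $limit
    );

    // preg_replace returns null on failure (e.g. invalid UTF-8);
    // fall back to the unmodified input in that case.
    return preg_replace($pattern, '$1', $html) ?? $html;
}

var_dump(firstWordsUnicode('Je suis <b>développeur</b> à Paris.', 3));
```

Like the original pattern, this only keeps closing tags that directly follow a word, so a tag left open at the cut-off point is not closed.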
Answer 1 (score: 1)
A more precise solution:
<?php
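$input = 'i am a <b>software</b> <span style="color:red;">engineer. And </span> i work at a company.';
var_dump(customParse($input, 5));
var_dump(customParse($input, 4));
var_dump(customParse($input, 3));

$input = 'i am a <b>software</b> <foo style="color:red;">engineer. And </foo> i work at a company.';
var_dump(customParse($input, 5));

function customParse($input, $limit) {
    // Each match is one "word": optional whitespace, optional opening tags
    // (group 2 holds the tag name), the word itself, then optional closing
    // tags directly after it (group 3). The "i" flag lets words such as
    // "And" match as a whole.
    $pattern = '#(
        \s*
        (?: <(\w+) [^>]* >)*
        [a-z.-]+
        (</[^>]*>)*
    )#xi';
    preg_match_all($pattern, $input, $matches);

    $result = '';
    // Never read past the number of words actually found.
    $count = min($limit, count($matches[0]));
    for ($nbMatch = 0; $nbMatch < $count; $nbMatch++) {
        $capturedText = $matches[0][$nbMatch];
        $openTag = $matches[2][$nbMatch];
        $closeTag = $matches[3][$nbMatch];
        $result .= $capturedText;
        // The word opened a tag that was not closed right after it: close it.
        if ($openTag && !$closeTag) {
            $result .= '</' . $openTag . '>';
        }
    }
    return $result;
}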
Answer 2 (score: 0)
It is possible. You can use preg_match_all like this:
<?php
$input = 'i am a <b>software</b> <span style="color:red;">engineer. And </span> i work at a company.';
$pattern = '#(
\s*
(<[^>]*>)*
[a-z.-]+
(</[^>]*>)*
)#x';
preg_match_all($pattern, $input, $matches);
var_dump($matches);
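Then, for each match, you test whether $matches[2][index] is not empty and $matches[3][index] is empty; if so, you add the closing tag yourself. I think this is incomplete and error-prone, though. You will probably need to modify it to cover every possible case.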
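One possible way to cover more cases, such as nested tags or a cut-off in the middle of an element, is to split the input into tags and text and keep a stack of open tags. The following is only a sketch under stated assumptions: the function name `truncateWordsKeepTags` and the list of void elements are hypothetical, a word is taken to be any run of non-whitespace characters, and the markup is assumed to be reasonably well-formed. For arbitrary HTML a real parser would be a safer choice.

```php
<?php
// Hypothetical helper, not taken from the answers above: keeps the first
// $limit words of $html and closes every tag still open at the cut-off.
function truncateWordsKeepTags(string $html, int $limit): string
{
    $void = ['br', 'hr', 'img', 'input', 'meta', 'link']; // assumed list, never closed
    $flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;
    $tokens = preg_split('#(<[^>]*>)#', $html, -1, $flags); // tags and text runs
    $open = [];   // stack of currently open tag names
    $result = '';
    $count = 0;   // words emitted so far

    foreach ($tokens as $token) {
        if ($token[0] === '<' && substr($token, -1) === '>') {
            if (preg_match('#^</\s*(\w+)#', $token, $m)) {
                // Closing tag: pop it if it matches the innermost open tag.
                if ($open && end($open) === strtolower($m[1])) {
                    array_pop($open);
                }
            } elseif (preg_match('#^<(\w+)#', $token, $m)) {
                if ($count >= $limit) {
                    break; // do not open new elements once the limit is reached
                }
                $name = strtolower($m[1]);
                if (!in_array($name, $void, true) && substr($token, -2) !== '/>') {
                    $open[] = $name;
                }
            }
            $result .= $token;
            continue;
        }
        // Text run: alternate between whitespace and words.
        foreach (preg_split('#(\s+)#', $token, -1, $flags) as $part) {
            if (trim($part) !== '') {
                if ($count === $limit) {
                    break 2; // the next word would exceed the limit
                }
                $count++;
            }
            $result .= $part;
        }
    }

    $result = rtrim($result);
    while ($open) {
        $result .= '</' . array_pop($open) . '>'; // close innermost first
    }
    return $result;
}

$str = 'i am a <b>software</b> <span style="color:red;">engineer.</span> i work at a company.';
var_dump(truncateWordsKeepTags($str, 5));
```

Because the open tags are tracked on a stack, a cut inside `<div><b>one two</b> three</div>` with a limit of 1 is closed as `<div><b>one</b></div>`, which the single-group regex approach cannot do.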