I want to split a block of text into paragraphs, where each paragraph stays under a certain number of words. For example:
Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.
Paragraph 1:
Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.
Paragraph 2:
Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.
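In the example above, the words under the 20-word limit are in paragraph 1 and the rest are in paragraph 2.
Is there any way to achieve this in PHP?
I have tried $abc = explode(' ', $str, 20); which stores 20 words in an array, with the rest of the text going into $abc['21']. How can I take the first 20 elements as the first paragraph and the remaining text as the second paragraph?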
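For reference, here is a short sketch of what `explode` does with a positive limit, and one way to cut off the first 20 words. With a limit of 20 the result has at most 20 elements: the first 19 are single words and the last one holds the rest of the string, so there is no index 21. The variable names and the generated sample string are illustrative only, and this approach cuts strictly by word count, ignoring sentence boundaries.

```php
<?php
// Illustrative input: 25 numbered "words" separated by single spaces.
$words = array();
for ($i = 1; $i <= 25; $i++) {
    $words[] = "w$i";
}
$str = implode(' ', $words);

// With a positive limit, explode() returns at most $limit elements;
// the last element holds the unsplit remainder of the string.
$abc = explode(' ', $str, 20);
$count = count($abc);      // 20 elements, indexes 0..19
$remainder = $abc[19];     // "w20 w21 w22 w23 w24 w25"

// To get exactly 20 words plus the rest, ask for 21 pieces instead.
$parts = explode(' ', $str, 21);
$first = implode(' ', array_slice($parts, 0, 20)); // words 1..20
$rest  = isset($parts[20]) ? $parts[20] : '';      // everything after word 20

echo $first, "\n", $rest, "\n";
```

Because this ignores sentence boundaries, the first paragraph can end mid-sentence, which is why splitting into sentences first may be the better fit for the example in the question.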
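Answer 0 (score: 0)
First split the string into sentences. Then loop over the array of sentences: append each sentence to the current paragraph, then count the words in that paragraph, and if the count is greater than 19, increment the paragraph counter so the next sentence starts a new paragraph.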
$string = 'Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.';
// Split on whitespace that follows . ? ! or ; and precedes an uppercase letter.
$sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $string);
$ii = 0;
$paragraphs = array();
foreach ( $sentences as $value ) {
    // Append the sentence to the current paragraph, separated by a space.
    if ( isset($paragraphs[$ii]) ) { $paragraphs[$ii] .= ' ' . $value; }
    else { $paragraphs[$ii] = $value; }
    // Once the paragraph holds 20 or more words, move on to the next one.
    if ( 19 < str_word_count($paragraphs[$ii]) ) {
        $ii++;
    }
}
print_r($paragraphs);
Output:
Array
(
    [0] => Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.
    [1] => Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.
)
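The same loop can be wrapped in a function so the threshold is adjustable. This is a sketch under assumptions, not part of the answer itself: the name `splitIntoParagraphs` and the `$minWords` parameter are made up for illustration, and words are counted by splitting on whitespace rather than with `str_word_count`, which by default does not count purely numeric tokens such as `45` and `2000`.

```php
<?php
/**
 * Group sentences into paragraphs; a paragraph is closed as soon as it
 * reaches $minWords words. Hypothetical helper, for illustration only.
 */
function splitIntoParagraphs(string $text, int $minWords = 20): array
{
    // Same sentence splitter as in the answer: break on whitespace that
    // follows . ? ! or ; and precedes an uppercase letter.
    $sentences = preg_split(
        '/(?<=[.?!;])\s+(?=\p{Lu})/',
        trim($text),
        -1,
        PREG_SPLIT_NO_EMPTY
    );

    $paragraphs = [];
    $current = [];   // sentences collected for the paragraph being built
    $words = 0;      // running word count of $current

    foreach ($sentences as $sentence) {
        $current[] = $sentence;
        // Count whitespace-separated tokens, so numbers count as words.
        $words += count(preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY));

        if ($words >= $minWords) {
            $paragraphs[] = implode(' ', $current);
            $current = [];
            $words = 0;
        }
    }

    // Keep any trailing sentences that never reached the threshold.
    if ($current) {
        $paragraphs[] = implode(' ', $current);
    }

    return $paragraphs;
}

// Small made-up input with a threshold of 5 words.
$text = 'One two three. Four five six seven. Eight nine. Ten.';
print_r(splitIntoParagraphs($text, 5));
```

Collecting sentences in an array and joining them with `implode` keeps a single space between sentences. With the sample text from the question and the default threshold, this should produce the same two paragraphs as shown in the output.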
The sentence splitter was found here: Splitting paragraphs into sentences with regexp and PHP
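One caveat worth checking before relying on that splitter: because it breaks wherever sentence punctuation is followed by whitespace and an uppercase letter, an abbreviation such as `Dr.` in front of a name is treated as the end of a sentence. The sample text below is made up for illustration.

```php
<?php
// Made-up sample: an abbreviation followed by a capitalized name,
// plus "p.m." followed by a lowercase word.
$sample = 'Dr. Smith arrived at 5 p.m. sharp. He left early.';

// Same pattern as in the answer.
$pieces = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $sample);

// "Dr." ends up as its own piece; "p.m. sharp" is not split because
// the next word starts with a lowercase letter.
print_r($pieces);
```

If the input contains many such abbreviations, the resulting paragraphs may break in unexpected places, so the pattern may need adjusting for the text at hand.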