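I have seen many examples here of how to split a string into sentences on ".", but my question is how to split a string into "sentences" by word count instead, ignoring "." and other punctuation.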
For example:
function splitToSentences($wordsCount){
....
}
$string = "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
print_r( splitToSentences(10) );
Answer 0 (score: 3)
$content = "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
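// Split the text into individual words on single spaces.
$dataSet = explode(' ', $content);
// Group the words into chunks of 10.
$dataSet = array_chunk($dataSet, 10);
// Join each chunk of words back into a single string.
$dataSet = array_map(function ($words) {
    return implode(' ', $words);
}, $dataSet);
var_dump($dataSet);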
Result:
array(7) {
[0]=>
string(62) "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do"
[1]=>
string(62) "eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut"
[2]=>
string(68) "enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi"
[3]=>
string(57) "ut aliquip ex ea commodo consequat. Duis aute irure dolor"
[4]=>
string(64) "in reprehenderit in voluptate velit esse cillum dolore eu fugiat"
[5]=>
string(71) "nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in"
[6]=>
string(54) "culpa qui officia deserunt mollit anim id est laborum."
}
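The same explode / array_chunk / implode idea can be wrapped in a reusable function. The following is only a sketch under assumptions, not the answerer's code: the name splitByWordCount and its two-parameter signature are hypothetical, and explode(' ', ...) is swapped for preg_split on \s+ so that consecutive spaces, tabs, or newlines do not produce empty "words".

```php
<?php
/**
 * Split $text into chunks of at most $wordsPerChunk whitespace-separated words.
 * Hypothetical helper for illustration; not part of the original answer.
 */
function splitByWordCount(string $text, int $wordsPerChunk): array
{
    if ($wordsPerChunk < 1) {
        throw new InvalidArgumentException('wordsPerChunk must be at least 1');
    }
    // Split on any run of whitespace and drop empty entries.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // Group the words, then glue each group back together with single spaces.
    return array_map(
        function (array $chunk): string {
            return implode(' ', $chunk);
        },
        array_chunk($words, $wordsPerChunk)
    );
}

print_r(splitByWordCount("one two  three\nfour five", 2));
```

Passing the text as a parameter, rather than reading a variable from the enclosing scope, keeps the function usable for any input string.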
Answer 1 (score: 2)
I am not familiar with regular expressions in PHP, but I believe this regex does the job:
((?:\s*\S+){10})\s*
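It matches a run of 10 words, each optionally preceded by any amount of whitespace (including line breaks), followed by any amount of trailing whitespace. The 10 is the number of words to match.
Demo: https://regex101.com/r/yR5uZ8/3
This seems to work: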
<?php
function splitToSentences($wordsCount) {
    $str = "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
    // Build the pattern with the requested number of words per match.
    preg_match_all("/((?:\s*\S+){".$wordsCount."})\s*/", $str, $match);
    // $match[0] holds the full matches, including any trailing whitespace.
    return $match[0];
}
print_r(splitToSentences(10));
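One caveat, as far as I can tell: the {10} quantifier asks for exactly ten repetitions, so a shorter final chunk is only matched when the engine can backtrack and split a word across two \S+ repetitions, and a remainder with fewer non-whitespace characters than the word count is dropped entirely. A possible variant, sketched below under assumptions, uses a bounded quantifier instead. The function name splitByWordCountRegex and its signature are hypothetical, and the pattern is my own substitution rather than the one from the answer.

```php
<?php
/**
 * Regex-based chunking into groups of at most $wordsPerChunk words.
 * Hypothetical helper for illustration; not part of the original answer.
 */
function splitByWordCountRegex(string $text, int $wordsPerChunk): array
{
    if ($wordsPerChunk < 1) {
        throw new InvalidArgumentException('wordsPerChunk must be at least 1');
    }
    // One word, then up to ($wordsPerChunk - 1) further words, each
    // preceded by at least one whitespace character.
    $pattern = '/\S+(?:\s+\S+){0,' . ($wordsPerChunk - 1) . '}/';
    preg_match_all($pattern, $text, $matches);
    // Full matches start and end on a word, so they carry no outer whitespace.
    return $matches[0];
}

print_r(splitByWordCountRegex('a b c d e', 2));
```

Unlike the explode-based approach, this keeps whatever whitespace originally separated the words inside each chunk, which may or may not be what you want.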
Answer 2 (score: 0)
Try this:
$string = 'Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.';
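// Split into sentences: after ., ?, ! or ; when followed by whitespace and an uppercase letter.
$sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $string);
$ii = 0;
$paragraphs = array();
foreach ( $sentences as $value ) {
    if ( isset($paragraphs[$ii]) ) {
        // Re-insert the space that preg_split removed between sentences.
        $paragraphs[$ii] .= ' ' . $value;
    } else {
        $paragraphs[$ii] = $value;
    }
    // Start a new paragraph once the current one has at least 10 words.
    if ( str_word_count($paragraphs[$ii]) > 9 ) {
        $ii++;
    }
}
print_r($paragraphs);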
Hope this helps.
Peace! xD
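The sentence-grouping idea in this answer (keep whole sentences together until a group reaches a minimum word count) could also be written as a function. This is a rough sketch under assumptions, not the answerer's code: groupSentencesByWordCount and its parameters are hypothetical names, and words are counted by splitting on whitespace rather than with str_word_count, which skips purely numeric tokens such as "45".

```php
<?php
/**
 * Group whole sentences until each group holds at least $minWords words.
 * Hypothetical helper for illustration; not part of the original answer.
 */
function groupSentencesByWordCount(string $text, int $minWords): array
{
    // Same sentence boundary as the answer: ., ?, ! or ; followed by
    // whitespace and an uppercase letter.
    $sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', trim($text));
    $groups = [];
    $current = [];
    $wordsInCurrent = 0;
    foreach ($sentences as $sentence) {
        if ($sentence === '') {
            continue;
        }
        $current[] = $sentence;
        // Count whitespace-separated tokens in this sentence.
        $wordsInCurrent += count(preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY));
        if ($wordsInCurrent >= $minWords) {
            $groups[] = implode(' ', $current);
            $current = [];
            $wordsInCurrent = 0;
        }
    }
    // Keep any trailing sentences that never reached the threshold.
    if ($current !== []) {
        $groups[] = implode(' ', $current);
    }
    return $groups;
}

print_r(groupSentencesByWordCount('One two. Three four five. Six.', 3));
```

Note that this produces groups of at least the given size rather than exactly that size, so it answers a slightly different question than a strict word-count split.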