Parsing a string by searching for headings and rebuilding it

Date: 2018-07-17 16:06:12

Tags: php

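I am trying to rebuild a string based on a defined set of headings found in it. For each heading found, I need to get the text that belongs to it, up to the next heading. I have added some code below to help explain. Any direction or help is appreciated.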

//defined headings to search on
$search_items ='Summary:,Education:,Experience:,Other:,Qualifications/Requirements:';

$string = 'Summary: It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.Education:Various versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like.Other:Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.';

$headers = explode(',', $search_items);

foreach ($headers as $heading) {
    $substring = substr($string, strpos(utf8_encode($string), utf8_encode($heading)) + 1);   
    if ($substring  !== false) {
       echo "<p><strong>$heading</strong></p><p></p>";
    }
} 

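With the code above I can find all the headings and wrap them in strong tags, and $substring then contains all of the remaining text after the heading. The problem is that I only want the text up to the next heading, so that I can put it in the empty p tag.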

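So with the string above, the output would look something like this:

Summary:

It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

Education:

Various versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like.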

1 answer:

Answer 0 (score: 0)

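First of all, the way you store the text is not great. What happens if the actual text contains one of the headings? That said, the simplest approach is to split the string with a regular expression and collect the headings and their text. Here is some sample code: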

<?php

$search_items ='Summary:,Education:,Experience:,Other:,Qualifications/Requirements:';

$string = 'Summary: It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.Education:Various versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like.Other:Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.';

$headers = explode(',', $search_items);
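// Build an alternation of all headings; passing '#' makes preg_quote escape the delimiter too.
$pattern = '#(' . implode('|', array_map(function($a) { return preg_quote($a, '#'); }, $headers)) . ')#';

// Split on the headings and keep them in the result (heading, text, heading, text, ...).
$data = preg_split($pattern, $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

for ($i = 0; $i < count($data); $i += 2) {
    // Fall back to an empty string if a heading has no text after it.
    $text = $data[$i + 1] ?? '';
    echo "<p><strong>{$data[$i]}</strong></p><p>{$text}</p>";
}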
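
If you need the sections as data rather than printing them right away, the same split can be wrapped in a small function. The following is only a sketch under a few assumptions: the function name split_by_headings, the 'heading' and 'text' array keys, and the shortened sample string are made up for illustration and do not come from the question. It leaves out PREG_SPLIT_NO_EMPTY so that the position of each piece is predictable even when there is text before the first heading, and it escapes the output with htmlspecialchars.

```php
<?php

/**
 * Hypothetical helper: splits $text at each known heading and returns
 * a list of ['heading' => ..., 'text' => ...] pairs in document order.
 */
function split_by_headings(string $text, array $headings): array
{
    // With no headings there is nothing to split on.
    if (count($headings) === 0) {
        return [];
    }

    // Escape each heading for use inside a '#'-delimited pattern.
    $quoted = array_map(function ($h) {
        return preg_quote($h, '#');
    }, $headings);
    $pattern = '#(' . implode('|', $quoted) . ')#';

    // Without PREG_SPLIT_NO_EMPTY the layout is fixed: index 0 holds any
    // text before the first heading, followed by heading/text pairs.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $sections = [];
    for ($i = 1; $i < count($parts); $i += 2) {
        $sections[] = [
            'heading' => $parts[$i],
            'text'    => trim($parts[$i + 1] ?? ''),
        ];
    }
    return $sections;
}

// Shortened sample input, assumed for illustration only.
$headings = ['Summary:', 'Education:', 'Experience:', 'Other:', 'Qualifications/Requirements:'];
$sample   = 'Summary: First part.Education:Second part.Other:Third part.';

foreach (split_by_headings($sample, $headings) as $section) {
    // Escape before output so stray markup in the text cannot break the page.
    echo '<p><strong>' . htmlspecialchars($section['heading']) . '</strong></p>'
       . '<p>' . htmlspecialchars($section['text']) . '</p>' . "\n";
}
```

Note that this does not solve the ambiguity mentioned at the start of the answer: if a heading such as "Other:" also appears inside the body text, it will still be treated as a new section.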