Extract all words changed between two texts in PHP

Asked: 2017-01-20 16:39:48

Tags: php text compare extract words

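I need to compare two texts that are always identical except for 15 or 20 words, which will have been replaced by other words. How can I compare the two texts and print the words that were replaced?

1 Hi my friend, this is a question for stackoverflow
2 Hi men, this is a quoted for web

Result:
my friend -> men
question -> quoted
stackoverflow -> web

Thanks, everyone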

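1 answer:

Answer 0: (score: 0)

One approach is to identify the words that the two strings have in common. Then, for each text, capture the characters that sit between those common words.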

function findDifferences($one, $two)
{
    $one .= ' {end}';  // add a common string to the end
    $two .= ' {end}';  // of each string to end searching on.

    // break string into array of words
    $arrayOne = explode(' ', $one);
    $arrayTwo = explode(' ', $two);

    $inCommon = Array();  // collect the words common in both strings
    $differences = Array();  // collect things that are different in each

    // see which words from str1 exist in str2
    $arrayTwo_temp = $arrayTwo;
    foreach ($arrayOne as $i => $word) {
        // parenthesize the assignment so $key holds the index, not a boolean
        if (($key = array_search($word, $arrayTwo_temp)) !== false) {
            $inCommon[] = $word;
            unset($arrayTwo_temp[$key]);
        }
    }

    $startA = 0;
    $startB = 0;

    foreach ($inCommon as $common) {
        $uniqueToOne = '';
        $uniqueToTwo = '';

        // collect chars between this 'common' and the last 'common'
        $endA = strpos($one, $common, $startA);
        $lenA = $endA - $startA;
        $uniqueToOne = substr($one, $startA, $lenA);

        //collect chars between this 'common' and the last 'common'
        $endB = strpos($two, $common, $startB);
        $lenB = $endB - $startB;
        $uniqueToTwo = substr($two, $startB, $lenB);

        // Add old and new values to array, but not if blank.
        // They should only ever be == if they are blank ''
        if ($uniqueToOne != $uniqueToTwo) {
            $differences[] = Array(
                'old' => trim($uniqueToOne),
                'new' => trim($uniqueToTwo)
            );
        }

        // set the start past the last found common word
        $startA = $endA + strlen($common);
        $startB = $endB + strlen($common);
    }

    // returns false if there aren't any differences
    return $differences ?: false;
}
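
A note on the array_search() check inside the first loop: `!==` binds more tightly than `=`, so the assignment needs its own parentheses if `$key` is meant to hold the index that was found rather than a boolean. A minimal standalone sketch (the `$words`, `$keyA` and `$keyB` names are made up for illustration):

```php
<?php
$words = ['Hi', 'men,', 'this'];

// Without parentheses, the comparison runs first and its result is assigned.
if ($keyA = array_search('this', $words) !== false) {
    var_dump($keyA); // bool(true)
}

// With parentheses, the index returned by array_search() is assigned first.
if (($keyB = array_search('this', $words)) !== false) {
    var_dump($keyB); // int(2)
}
```

Using a boolean as an array key in unset() would remove index 1 (or 0), not the word that was actually matched.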

Displaying the data however you need is then straightforward:

$one = '1 Hi my friend, this is a question for stackoverflow';
$two = '2 Hi men, this is a quoted for web';

$differences = findDifferences($one, $two);

foreach ($differences ?: [] as $diff) {  // findDifferences() returns false when nothing differs
    echo $diff['old'] . ' -> ' . $diff['new'] . '<br>';
}

// 1 -> 2
// my friend, -> men,
// question -> quoted
// stackoverflow -> web
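
One possible limitation of the approach above: strpos() searches for substrings, so a short common word such as "a" may be found inside a longer word in other inputs. As a hedged alternative, and not the method used in this answer, here is a sketch that compares whole words using a longest-common-subsequence table. The function name `wordReplacements` is hypothetical, and the table needs memory proportional to the product of the two word counts, so it is best suited to short texts.

```php
<?php
/**
 * Hypothetical helper: pairs up runs of replaced words using a word-level
 * longest common subsequence (LCS). Returns a list of ['old' => ..., 'new' => ...].
 */
function wordReplacements(string $one, string $two): array
{
    $a = preg_split('/\s+/', trim($one), -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('/\s+/', trim($two), -1, PREG_SPLIT_NO_EMPTY);
    $n = count($a);
    $m = count($b);

    // $lcs[$i][$j] = length of the LCS of $a[$i..] and $b[$j..]
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $lcs[$i][$j] = ($a[$i] === $b[$j])
                ? $lcs[$i + 1][$j + 1] + 1
                : max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
        }
    }

    $differences = [];
    $old = [];
    $new = [];
    $i = 0;
    $j = 0;
    while ($i < $n || $j < $m) {
        if ($i < $n && $j < $m && $a[$i] === $b[$j]) {
            // A shared word closes the current run of differing words.
            if ($old || $new) {
                $differences[] = ['old' => implode(' ', $old), 'new' => implode(' ', $new)];
                $old = [];
                $new = [];
            }
            $i++;
            $j++;
        } elseif ($j >= $m || ($i < $n && $lcs[$i + 1][$j] >= $lcs[$i][$j + 1])) {
            $old[] = $a[$i++];  // word present only in the first text
        } else {
            $new[] = $b[$j++];  // word present only in the second text
        }
    }
    if ($old || $new) {
        $differences[] = ['old' => implode(' ', $old), 'new' => implode(' ', $new)];
    }

    return $differences;
}

$result = wordReplacements(
    '1 Hi my friend, this is a question for stackoverflow',
    '2 Hi men, this is a quoted for web'
);
foreach ($result as $diff) {
    echo $diff['old'] . ' -> ' . $diff['new'] . "\n";
}
```

For the two sample strings this yields the same four pairs as listed above. Unlike findDifferences(), it returns an empty array rather than false when nothing differs.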