Extract all words changed between two texts PHP


Keywords:php 


Question: 

I need to compare two texts that will always be the same except for 15 or 20 words, which will be replaced by others. How can I compare those two texts and print out the words that have been replaced?

1 Hi my friend, this is a question for stackoverflow
2 Hi men, this is a quoted for web

Results:
my friend -> men
question -> quoted
stackoverflow -> web

Thank you all


1 Answer: 

One approach is to identify which words the two strings have in common. Then, for each text, capture the characters that sit between consecutive common words; those spans are the parts that changed.

function findDifferences($one, $two)
{
    $one .= ' {end}';  // add a common string to the end
    $two .= ' {end}';  // of each string to end searching on.

    // break each string into an array of words
    $arrayOne = explode(' ', $one);
    $arrayTwo = explode(' ', $two);

    $inCommon = Array();  // collect the words common in both strings
    $differences = [];    // collect the spans that differ between the strings

    // see which words from str1 exist in str2
    $arrayTwo_temp = $arrayTwo;
    foreach ($arrayOne as $i => $word) {
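        // parenthesize the assignment: !== binds tighter than =, so without
        // the parentheses $key would hold true/false instead of the index
        if (($key = array_search($word, $arrayTwo_temp)) !== false) {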
            $inCommon[] = $word;
            unset($arrayTwo_temp[$key]);
        }
    }

    $startA = 0;
    $startB = 0;

    foreach ($inCommon as $common) {
        $uniqueToOne = '';
        $uniqueToTwo = '';

        // collect chars between this 'common' and the last 'common'
        $endA = strpos($one, $common, $startA);
        $lenA = $endA - $startA;
        $uniqueToOne = substr($one, $startA, $lenA);

        // same for $two: collect chars between this 'common' and the last 'common'
        $endB = strpos($two, $common, $startB);
        $lenB = $endB - $startB;
        $uniqueToTwo = substr($two, $startB, $lenB);

        // Add old and new values to array, but not if blank.
        // They should only ever be == if they are blank ''
        if ($uniqueToOne != $uniqueToTwo) {
            $differences[] = Array(
                'old' => trim($uniqueToOne),
                'new' => trim($uniqueToTwo)
            );
        }

        // set the start past the last found common word
        $startA = $endA + strlen($common);
        $startB = $endB + strlen($common);
    }

    // returns false if there aren't any differences
    return $differences ?: false;
}
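A note on the array_search() check inside the loop: in PHP, !== binds more tightly than =, so an assignment used inside a comparison needs its own parentheses or the variable ends up holding a boolean. The snippet below is only a small illustration of that; $words and the two result variables are made-up names, not part of the function.

```php
<?php
// Illustrative word list (hypothetical data)
$words = ['Hi', 'men,', 'this'];

// Without parentheses this parses as: $key = (array_search(...) !== false)
if ($key = array_search('this', $words) !== false) {
    $unparenthesized = $key;   // boolean true, not an index
}

// With parentheses the index is assigned first, then compared
if (($key = array_search('this', $words)) !== false) {
    $parenthesized = $key;     // integer index 2
}

var_dump($unparenthesized, $parenthesized);
```

With the boolean form, unset($array[$key]) would always target index 1 (true is cast to 1), which only matters when the texts contain repeated words.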

Then it's a trivial matter to display the data however you want:

$one = '1 Hi my friend, this is a question for stackoverflow';
$two = '2 Hi men, this is a quoted for web';

$differences = findDifferences($one, $two);

// findDifferences() returns false when nothing changed, so fall back to []
foreach ($differences ?: [] as $diff) {
    echo $diff['old'] . ' -> ' . $diff['new'] . '<br>';
}

// 1 -> 2
// my friend, -> men,
// question -> quoted
// stackoverflow -> web
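One caveat with the function above is that it locates common words with strpos(), which matches substrings, so a short common word such as "a" could in principle be found inside a longer word. As a possible alternative, here is a minimal sketch that compares whole words using a longest-common-subsequence (LCS) table. This is a different technique from the one in the answer, and wordDiff() is a hypothetical name chosen for illustration.

```php
<?php
// Hypothetical helper: word-level diff built on an LCS table, so words are
// compared as whole tokens instead of being located with strpos().
function wordDiff(string $one, string $two): array
{
    $a = preg_split('/\s+/', trim($one), -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('/\s+/', trim($two), -1, PREG_SPLIT_NO_EMPTY);
    $n = count($a);
    $m = count($b);

    // $lcs[$i][$j] = length of the LCS of $a[$i..] and $b[$j..]
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $lcs[$i][$j] = ($a[$i] === $b[$j])
                ? $lcs[$i + 1][$j + 1] + 1
                : max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
        }
    }

    $differences = [];
    $old = [];
    $new = [];
    $i = 0;
    $j = 0;
    while ($i < $n || $j < $m) {
        if ($i < $n && $j < $m && $a[$i] === $b[$j]) {
            // common word: flush any pending replacement
            if ($old || $new) {
                $differences[] = ['old' => implode(' ', $old), 'new' => implode(' ', $new)];
                $old = [];
                $new = [];
            }
            $i++;
            $j++;
        } elseif ($j >= $m || ($i < $n && $lcs[$i + 1][$j] >= $lcs[$i][$j + 1])) {
            $old[] = $a[$i++];   // word only in the first text
        } else {
            $new[] = $b[$j++];   // word only in the second text
        }
    }
    if ($old || $new) {
        $differences[] = ['old' => implode(' ', $old), 'new' => implode(' ', $new)];
    }

    return $differences;
}

$one = '1 Hi my friend, this is a question for stackoverflow';
$two = '2 Hi men, this is a quoted for web';
foreach (wordDiff($one, $two) as $diff) {
    echo $diff['old'] . ' -> ' . $diff['new'] . "\n";
}
```

This sketch always returns an array (empty when the texts match), so it can be looped over directly. The table needs memory proportional to the product of the two word counts, which should be fine for short texts but may be too much for very long ones.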