Commit 98d993e

Add trimEqual option.
Closes #24. - Both the Diff class and the SequenceMatcher class now accept a 'trimEqual' option. When it is set to false, blocks of equal lines at the start and end of the text are not stripped. - Re-added the long Chinese line to the example and test resources.
1 parent 5d03eae commit 98d993e

18 files changed: 212 additions and 113 deletions

example/a.txt

Lines changed: 3 additions & 1 deletion
@@ -13,10 +13,12 @@
 <h2>This line is the same for both versions.</h2>

 <p>
-It's also compatible with multibyte characters (such as emoji) as shown below:
+It's also compatible with multibyte characters (like Chinese and emoji) as shown below:
+另外我覺得那個評價的白色櫃子有點沒有必要欸。外觀我就不說了 ,怎麼連空間都那麼狹隘。不過倒是從這個地方看出所謂的“改革”
 Do you know what "金槍魚罐頭" means in Chinese?
 🍏🍎🙂
 </p>
+
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>

example/b.txt

Lines changed: 3 additions & 1 deletion
@@ -13,10 +13,12 @@
 <h2>This line is added to version2.</h2>

 <p>
-It's also compatible with multibyte characters (such as emoji) as shown below:
+It's also compatible with multibyte characters (like Chinese and emoji) as shown below:
+另外我覺得那個評鑑的白色櫃子有點沒有必要欸。外觀我就不說了 ,怎麼連空間都那麼狹隘。不過倒是從這個地方看出所謂的“改革”
 Do you know what "魚の缶詰" means in Chinese?
 🍎🍏🙂
 </p>
+
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>

example/example.php

Lines changed: 6 additions & 4 deletions
@@ -17,14 +17,16 @@
 $b = file_get_contents(dirname(__FILE__) . '/b.txt');

 // Options for generating the diff.
-$options = [
+$customOptions = [
+    'context' => 2,
+    'trimEqual' => false,
     'ignoreWhitespace' => true,
     'ignoreCase' => true,
-    'context' => 2,
 ];

-// Initialize the diff class.
-$diff = new Diff($a, $b /*, $options */);
+// Choose one of the initializations.
+$diff = new Diff($a, $b); // Initialize the diff class with default options.
+//$diff = new Diff($a, $b, $customOptions); // Initialize the diff class with custom options.
 ?><!DOCTYPE html>
 <html lang="en">
 <head>

htmlInline.png (4.85 KB)

htmlSideBySide.png (6.18 KB)

htmlUnified.png (6.67 KB)

lib/jblond/Diff.php

Lines changed: 2 additions & 0 deletions
@@ -51,13 +51,15 @@ class Diff
      * @var array Associative array containing the default options available for the diff class and their default
      * value.
      * - context The amount of lines to include around blocks that differ.
+     * - trimEqual Strip blocks of equal lines from the start and end of the text.
      * - ignoreWhitespace When true, tabs and spaces are ignored while comparing.
      * The spacing of version1 is leading.
      * - ignoreCase When true, character casing is ignored while comparing.
      * The casing of version1 is leading.
      */
     private $defaultOptions = [
         'context' => 3,
+        'trimEqual' => true,
         'ignoreWhitespace' => false,
         'ignoreCase' => false,
     ];

lib/jblond/Diff/SequenceMatcher.php

Lines changed: 42 additions & 35 deletions
@@ -68,9 +68,11 @@ class SequenceMatcher
      * @var array
      */
     private $defaultOptions = array(
-        'ignoreNewLines' => false,
+        'context' => 3,
+        'trimEqual' => true,
         'ignoreWhitespace' => false,
-        'ignoreCase' => false
+        'ignoreCase' => false,
+        'ignoreNewLines' => false,
     );

     /**
@@ -84,7 +86,7 @@ class SequenceMatcher
      * @param string|array|null $junkCallback Either an array or string that references a callback function
      * (if there is one) to determine 'junk' characters.
      */
-    public function __construct($old, $new, array $options, $junkCallback = null)
+    public function __construct($old, $new, array $options = [], $junkCallback = null)
     {
         $this->old = array();
         $this->new = array();
@@ -381,9 +383,9 @@ public function getMatchingBlocks(): array

         $matchingBlocks = array();
         while (!empty($queue)) {
-            list($alo, $ahi, $blo, $bhi) = array_pop($queue);
+            [$alo, $ahi, $blo, $bhi] = array_pop($queue);
             $longestMatch = $this->findLongestMatch($alo, $ahi, $blo, $bhi);
-            list($list1, $list2, $list3) = $longestMatch;
+            [$list1, $list2, $list3] = $longestMatch;
             if ($list3) {
                 $matchingBlocks[] = $longestMatch;
                 if ($alo < $list1 && $blo < $list2) {
@@ -524,7 +526,7 @@ public function getOpCodes(): array

     /**
      * Return a series of nested arrays containing different groups of generated
-     * op codes for the differences between the strings with up to $context lines
+     * op codes for the differences between the strings with up to $this->options['context'] lines
      * of surrounding content.
      *
      * Essentially what happens here is any big equal blocks of strings are stripped
* Essentially what happens here is any big equal blocks of strings are stripped
@@ -533,10 +535,10 @@ public function getOpCodes(): array
533535
* content of the different files but can still provide context as to where the
534536
* changes are.
535537
*
536-
* @param int $context The number of lines of context to provide around the groups.
538+
* @param int $this->options['context'] The number of lines of context to provide around the groups.
537539
* @return array Nested array of all of the grouped op codes.
538540
*/
539-
public function getGroupedOpCodes(int $context = 3): array
541+
public function getGroupedOpCodes(): array
540542
{
541543
$opCodes = $this->getOpCodes();
542544
if (empty($opCodes)) {
@@ -551,47 +553,51 @@ public function getGroupedOpCodes(int $context = 3): array
             );
         }

-        if ($opCodes['0']['0'] == 'equal') {
-            $opCodes['0'] = array(
-                $opCodes['0']['0'],
-                max($opCodes['0']['1'], $opCodes['0']['2'] - $context),
-                $opCodes['0']['2'],
-                max($opCodes['0']['3'], $opCodes['0']['4'] - $context),
-                $opCodes['0']['4']
-            );
-        }
+        if ($this->options['trimEqual']) {
+            if ($opCodes['0']['0'] == 'equal') {
+                // Remove sequences at the start which are out of context.
+                $opCodes['0'] = array(
+                    $opCodes['0']['0'],
+                    max($opCodes['0']['1'], $opCodes['0']['2'] - $this->options['context']),
+                    $opCodes['0']['2'],
+                    max($opCodes['0']['3'], $opCodes['0']['4'] - $this->options['context']),
+                    $opCodes['0']['4']
+                );
+            }

-        $lastItem = count($opCodes) - 1;
-        if ($opCodes[$lastItem]['0'] == 'equal') {
-            list($tag, $i1, $i2, $j1, $j2) = $opCodes[$lastItem];
-            $opCodes[$lastItem] = array(
-                $tag,
-                $i1,
-                min($i2, $i1 + $context),
-                $j1,
-                min($j2, $j1 + $context)
-            );
+            $lastItem = count($opCodes) - 1;
+            if ($opCodes[$lastItem]['0'] == 'equal') {
+                [$tag, $i1, $i2, $j1, $j2] = $opCodes[$lastItem];
+                // Remove sequences at the end which are out of context.
+                $opCodes[$lastItem] = array(
+                    $tag,
+                    $i1,
+                    min($i2, $i1 + $this->options['context']),
+                    $j1,
+                    min($j2, $j1 + $this->options['context'])
+                );
+            }
         }

-        $maxRange = $context * 2;
+        $maxRange = $this->options['context'] * 2;
         $groups = array();
         $group = array();

-        foreach ($opCodes as [$tag, $i1, $i2, $j1, $j2]) {
+        foreach ($opCodes as $key => [$tag, $i1, $i2, $j1, $j2]) {
             if ($tag == 'equal' && $i2 - $i1 > $maxRange) {
                 $group[] = array(
                     $tag,
                     $i1,
-                    min($i2, $i1 + $context),
+                    min($i2, $i1 + $this->options['context']),
                     $j1,
-                    min($j2, $j1 + $context)
+                    min($j2, $j1 + $this->options['context'])
                 );
                 $groups[] = $group;
                 $group = array();
-                $i1 = max($i1, $i2 - $context);
-                $j1 = max($j1, $j2 - $context);
+                $i1 = max($i1, $i2 - $this->options['context']);
+                $j1 = max($j1, $j2 - $this->options['context']);
             }
-            echo '';
+
             $group[] = array(
                 $tag,
                 $i1,
@@ -601,7 +607,8 @@ public function getGroupedOpCodes(int $context = 3): array
             );
         }

-        if (!empty($group) && !(count($group) == 1 && $group[0][0] == 'equal')) {
+        if ($this->options['trimEqual'] || (!empty($group) && !(count($group) == 1 && $group[0][0] == 'equal'))) {
+            //Do not add the last sequences. They're out of context.
             $groups[] = $group;
         }
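The effect of the trimEqual branch on the first and last op codes can be sketched in isolation. The function name trimOuterEqualOpCodes and the sample op codes are hypothetical, chosen to mirror the strings used in the test added by this commit; in the library this step lives inside getGroupedOpCodes() and reads its settings from $this->options.

```php
<?php
/**
 * Hypothetical standalone version of the trimming step.
 * Each op code is [tag, i1, i2, j1, j2] with half-open ranges.
 */
function trimOuterEqualOpCodes(array $opCodes, int $context, bool $trimEqual): array
{
    if (!$trimEqual || empty($opCodes)) {
        // With trimEqual disabled the op codes are left untouched.
        return $opCodes;
    }

    if ($opCodes[0][0] === 'equal') {
        // Keep only the last $context items of a leading equal block.
        [$tag, $i1, $i2, $j1, $j2] = $opCodes[0];
        $opCodes[0] = [$tag, max($i1, $i2 - $context), $i2, max($j1, $j2 - $context), $j2];
    }

    $last = count($opCodes) - 1;
    if ($opCodes[$last][0] === 'equal') {
        // Keep only the first $context items of a trailing equal block.
        [$tag, $i1, $i2, $j1, $j2] = $opCodes[$last];
        $opCodes[$last] = [$tag, $i1, min($i2, $i1 + $context), $j1, min($j2, $j1 + $context)];
    }

    return $opCodes;
}

// Assumed op codes for '54321ABXDE12345' vs '54321ABxDE12345'.
$opCodes = [
    ['equal', 0, 7, 0, 7],
    ['replace', 7, 8, 7, 8],
    ['equal', 8, 15, 8, 15],
];

$trimmed = trimOuterEqualOpCodes($opCodes, 3, true);
$untouched = trimOuterEqualOpCodes($opCodes, 3, false);

var_export($trimmed);
```

Under these assumptions the trimmed result matches the single group expected by the default-options test, while the untrimmed op codes are what the grouping loop later splits into separate groups.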

tests/Diff/SequenceMatcherTest.php

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+<?php
+
+namespace Diff\Renderer;
+
+use jblond\Diff\SequenceMatcher;
+use PHPUnit\Framework\TestCase;
+
+class SequenceMatcherTest extends TestCase
+{
+
+    /**
+     * Constructor.
+     *
+     * @param null $name
+     * @param array $data
+     * @param string $dataName
+     */
+    public function __construct($name = null, array $data = [], $dataName = '')
+    {
+        parent::__construct($name, $data, $dataName);
+    }
+
+    public function testGetGroupedOpCodes()
+    {
+        // Test with default options.
+        $sequenceMatcher = new SequenceMatcher('54321ABXDE12345', '54321ABxDE12345');
+        $this->assertEquals(
+            [[['equal', 4, 7, 4, 7], ['replace', 7, 8, 7, 8], ['equal', 8, 11, 8, 11]]],
+            $sequenceMatcher->getGroupedOpCodes()
+        );
+
+        // Test with trimEqual disabled.
+        $sequenceMatcher = new SequenceMatcher('54321ABXDE12345', '54321ABxDE12345', ['trimEqual' => false]);
+        $this->assertEquals(
+            [[['equal', 0, 3, 0, 3]], [['equal', 4, 7, 4, 7], ['replace', 7, 8, 7, 8], ['equal', 8, 11, 8, 11]]],
+            $sequenceMatcher->getGroupedOpCodes()
+        );
+
+        // Test with ignoreWhitespace enabled.
+        // Note: The sequenceMatcher evaluates the string character by character. Option ignoreWhitespace will ignore
+        // the difference if the character is a tab in one sequence and a space in the other.
+        $sequenceMatcher = new SequenceMatcher("\t54321ABXDE12345 ", " 54321ABXDE12345\t", ['ignoreWhitespace' => true]);
+        $this->assertEquals(
+            [[['equal', 14, 17, 14, 17]]],
+            $sequenceMatcher->getGroupedOpCodes()
+        );
+
+        // Test with ignoreCase enabled.
+        $sequenceMatcher = new SequenceMatcher('54321ABXDE12345', '54321ABxDE12345', ['ignoreCase' => true]);
+        $this->assertEquals(
+            [[['equal', 12, 15, 12, 15]]],
+            $sequenceMatcher->getGroupedOpCodes()
+        );
+    }
+}
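To read the expected arrays in this test, it may help to map each op code back to the characters it covers. The sketch below assumes the indices are half-open ranges [start, end) into each string, which is consistent with the adjacent ranges in the assertions; describeOpCodes is a hypothetical helper, not part of the library.

```php
<?php
/**
 * Hypothetical helper: render each op code together with the
 * substrings of $old and $new that it refers to.
 */
function describeOpCodes(string $old, string $new, array $group): array
{
    $lines = [];
    foreach ($group as [$tag, $i1, $i2, $j1, $j2]) {
        $lines[] = sprintf(
            '%s: "%s" -> "%s"',
            $tag,
            substr($old, $i1, $i2 - $i1), // slice [i1, i2) of the old string
            substr($new, $j1, $j2 - $j1)  // slice [j1, j2) of the new string
        );
    }
    return $lines;
}

// Group taken from the default-options assertion in the test.
$group = [['equal', 4, 7, 4, 7], ['replace', 7, 8, 7, 8], ['equal', 8, 11, 8, 11]];
$described = describeOpCodes('54321ABXDE12345', '54321ABxDE12345', $group);

echo implode(PHP_EOL, $described), PHP_EOL;
```

Read this way, the group shows three characters of context on each side of the single replaced character, matching the default 'context' value of 3.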

tests/resources/a.txt

Lines changed: 3 additions & 1 deletion
@@ -13,10 +13,12 @@
 <h2>This line is the same for both versions.</h2>

 <p>
-It's also compatible with multibyte characters (such as emoji) as shown below:
+It's also compatible with multibyte characters (like Chinese and emoji) as shown below:
+另外我覺得那個評價的白色櫃子有點沒有必要欸。外觀我就不說了 ,怎麼連空間都那麼狹隘。不過倒是從這個地方看出所謂的“改革”
 Do you know what "金槍魚罐頭" means in Chinese?
 🍏🍎🙂
 </p>
+
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>

tests/resources/b.txt

Lines changed: 3 additions & 1 deletion
@@ -13,10 +13,12 @@
 <h2>This line is added to version2.</h2>

 <p>
-It's also compatible with multibyte characters (such as emoji) as shown below:
+It's also compatible with multibyte characters (like Chinese and emoji) as shown below:
+另外我覺得那個評鑑的白色櫃子有點沒有必要欸。外觀我就不說了 ,怎麼連空間都那麼狹隘。不過倒是從這個地方看出所謂的“改革”
 Do you know what "魚の缶詰" means in Chinese?
 🍎🍏🙂
 </p>
+
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
 <p>Just some lines to demonstrate the collapsing of a block of lines which are the same in both versions.</p>
