When the input arrays are large, the built-in array_diff() function can be extremely slow. To work around this, we can implement a custom array_diff() on top of a hash map, which in PHP is simply an associative array keyed by the values. Here is the source code:
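function better_array_diff($a, $b) {
    // Use each value of $a as a hash-map key (values must be ints or strings).
    $map = array();
    foreach ($a as $val) {
        $map[$val] = 1;
    }
    // Remove every value that also appears in $b.
    foreach ($b as $val) {
        unset($map[$val]);
    }
    return array_keys($map); // the values themselves; $a's keys and duplicates are lost
}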
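The same hash-map idea can also be applied in the other direction: index $b once and filter $a against it, which keeps $a's original keys and any duplicate values, the way array_diff() does. The following is only a minimal sketch under the assumption that every value is an int or a string (so it can serve as an array key); the name keyed_array_diff is hypothetical and not part of PHP.

```php
<?php
// Hypothetical variant: index $b once, then filter $a against that index.
// Assumes every value is an int or a string, since values become array keys.
function keyed_array_diff(array $a, array $b): array {
    $lookup = array_flip($b);      // value => position in $b
    $result = array();
    foreach ($a as $key => $val) {
        if (!isset($lookup[$val])) {
            $result[$key] = $val;  // keep $a's key, as array_diff() does
        }
    }
    return $result;
}

$a = array(3, 1, 4, 1, 5);
$b = array(1, 9);
var_dump(keyed_array_diff($a, $b) === array_diff($a, $b)); // bool(true)
```

Each lookup is a single isset() on the index, so the work grows roughly with the combined size of the two arrays rather than with their product.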
Let’s compare the performance now:
$a = range(1, 10000);
$b = range(5000, 15000);
shuffle($a);
shuffle($b);

$ts = microtime(true);
array_diff($a, $b);
printf("array_diff=%.4f\n", microtime(true) - $ts);

$ts = microtime(true);
better_array_diff($a, $b);
printf("better_array_diff=%.4f\n", microtime(true) - $ts);
Result (times in seconds):
array_diff=22.9525
better_array_diff=0.0092