Monday, 22 August 2016

arrays - PHP memory exhaused while using array_combine in foreach loop



I'm having a trouble when tried to use array_combine in a foreach loop. It will end up with an error:



PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 85 bytes) in


Here is my code:




$data = array();
$csvData = $this->getData($file);
if ($columnNames) {
$columns = array_shift($csvData);
foreach ($csvData as $keyIndex => $rowData) {
$data[$keyIndex] = array_combine($columns, array_values($rowData));
}
}


return $data;


The source file CSV which I've used has approx ~1,000,000 rows. This row



$csvData = $this->getData($file)


I was using a while loop to read CSV and assign it into an array, it's working without any problem. The trouble come from array_combine and foreach loop.




Do you have any idea to resolve this or simply have a better solution?





Here is the code to read the CSV file (using while loop)



$data = array();
if (!file_exists($file)) {
throw new Exception('File "' . $file . '" do not exists');
}


$fh = fopen($file, 'r');
while ($rowData = fgetcsv($fh, $this->_lineLength, $this->_delimiter, $this->_enclosure)) {
$data[] = $rowData;
}
fclose($fh);
return $data;





The code above is working without any problem if you are playing around with a CSV file <=20,000~30,000 rows. From 50,000 rows and up, the memory will be exhausted.


Answer



You're in fact keeping (or trying to keep) two distinct copies of the whole dataset in your memory. First you load the whole CSV date into memory using getData() and the you copy the data into the $data array by looping over the data in memory and creating a new array.



You should use stream based reading when loading the CSV data to keep just one data set in memory. If you're on PHP 5.5+ (which you definitely should by the way) this is a simple as changing your getData method to look like that:



protected function getData($file) {
if (!file_exists($file)) {
throw new Exception('File "' . $file . '" do not exists');

}

$fh = fopen($file, 'r');
while ($rowData = fgetcsv($fh, $this->_lineLength, $this->_delimiter, $this->_enclosure)) {
yield $rowData;
}
fclose($fh);
}



This makes use of a so-called generator which is a PHP >= 5.5 feature. The rest of your code should continue to work as the inner workings of getData should be transparent to the calling code (only half of the truth).



UPDATE to explain how extracting the column headers will work now.



$data = array();
$csvData = $this->getData($file);
if ($columnNames) { // don't know what this one does exactly
$columns = null;
foreach ($csvData as $keyIndex => $rowData) {
if ($keyIndex === 0) {

$columns = $rowData;
} else {
$data[$keyIndex/* -1 if you need 0-index */] = array_combine(
$columns,
array_values($rowData)
);
}
}
}


return $data;

No comments:

Post a Comment

c++ - Does curly brackets matter for empty constructor?

Those brackets declare an empty, inline constructor. In that case, with them, the constructor does exist, it merely does nothing more than t...