PHP Generators and Laravel Lazy Collections

4th February 2025
A common issue we often encounter when handling data in PHP is the "Allowed Memory Exhausted" error. This typically occurs when processing large files or unpaginated API responses, especially those containing hundreds of thousands of records.
Allowed memory size of 134217728 bytes exhausted
(tried to allocate 268435464 bytes)
Storing data in memory
The error above is essentially an "out of memory" issue—you've attempted to load something into memory, but there simply isn't enough space.
Before you rush to allocate more memory, let's take a step back and examine what's actually happening.
To illustrate this, let's create a simple PHP file called test.php and add the following code:
$numbers = range(1, 5);
When we run this file, the variable $numbers will be assigned a new array of five integers ranging from 1 to 5:
[1, 2, 3, 4, 5]
This makes sense—the range() function in PHP generates an array that starts with the first argument and ends with the second.
As expected, this array is stored in memory, with space allocated for five elements, and it remains there until the script finishes executing.
But what happens if we try the following?
$numbers = range(1, 9000000);
As you might have guessed, we've attempted to allocate memory for nine million integers—yikes! The outcome is exactly what we'd expect.
Allowed memory size of 134217728 bytes exhausted
(tried to allocate 268435464 bytes)
Storing this much data in memory just isn’t practical. We need a better way to iterate over large data sets without loading everything at once. This is where generators come in.
PHP Generators
Let's start by writing some code, running it, and observing the results. After that, I'll explain how it works.
First, our script will include the following code:
function makeNumbers(int $start, int $end)
{
    for ($i = $start; $i <= $end; $i++) {
        yield $i;
    }
}

$numbers = makeNumbers(1, 9000000);

foreach ($numbers as $number) {
    echo "Number " . $number . "\n";
}
When you run this code, you'll notice that while it still takes some time to generate nine million numbers, it doesn’t crash. There's no memory exhaustion issue, and eventually, you’ll see output similar to this:
...
...
Number 8999997
Number 8999998
Number 8999999
Number 9000000
Whew! That’s a lot of numbers!
But what’s actually happening here? How are we able to generate and iterate through nine million numbers using a variable called $numbers without running out of memory?
This is where generators come into play. Notice that in the makeNumbers() function, we’re looping nine million times, and each time, we yield the index. On the first iteration, we yield 1, then 2, then 3… and so on.
In theory, we could have written this function like this:
function makeNumbers()
{
    yield 1;
    yield 2;
    yield 3;
    // oh my god no ...
    yield 9000000;
}

$numbers = makeNumbers();

foreach ($numbers as $number) {
    echo "Number " . $number . "\n";
}
This is obviously a bad idea for many reasons, but it highlights an important concept when understanding generators.
If this were a regular function with a return statement, it could only return once. However, with the yield statement, we can "yield" as many times as needed—in this case, nine million times.
Rather than returning an array of integers (or even a single integer), a function containing the yield keyword returns a Generator object when called.
$numbers = makeNumbers();
get_class($numbers); // "Generator"
The Generator class
If we take a quick look at php.net, we can see something interesting about the Generator class returned from this function—it implements the Iterator interface.
This means we can use foreach (or any other looping construct) to iterate over it seamlessly.
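We can verify this in a standalone script. A small sketch, reusing the same makeNumbers() function from above:

```php
<?php

function makeNumbers(int $start, int $end)
{
    for ($i = $start; $i <= $end; $i++) {
        yield $i;
    }
}

$numbers = makeNumbers(1, 3);

// The Generator class implements Iterator (and therefore Traversable),
// which is exactly what foreach needs.
var_dump($numbers instanceof Iterator);    // bool(true)
var_dump($numbers instanceof Traversable); // bool(true)
```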
The Iterator interface defines several methods, all of which we can call on our Generator object. These methods include:
current()
key()
next()
rewind()
valid()
We won’t cover all of these methods in detail, but let’s take a look at what happens when we call some of them on our returned Generator object.
function makeNumbers(int $start, int $end)
{
    for ($i = $start; $i <= $end; $i++) {
        yield $i;
    }
}
$numbers = makeNumbers(1, 9000000);
$numbers->current(); // 1
$numbers->next(); // null
$numbers->current(); // 2
As you can see, the generator behaves like any other iterator object. We can use current() to get the value at the current position, and next() to move the pointer to the next value—in this case, from 1 to 2.
Notice that calling next() doesn’t return anything; it simply advances the internal pointer to the next value.
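In fact, a foreach over a generator is just sugar for driving these Iterator methods by hand. Here's a small sketch of the equivalent manual loop, using a three-number range to keep the output short:

```php
<?php

function makeNumbers(int $start, int $end)
{
    for ($i = $start; $i <= $end; $i++) {
        yield $i;
    }
}

$numbers = makeNumbers(1, 3);

// What foreach does for us behind the scenes: check validity,
// read the key and value, then advance the pointer.
while ($numbers->valid()) {
    echo $numbers->key() . " => " . $numbers->current() . "\n";
    $numbers->next();
}

// Prints:
// 0 => 1
// 1 => 2
// 2 => 3
```

Note that generator keys auto-increment from 0 when you yield plain values, just like array keys.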
The reason we can iterate over nine million numbers without running out of memory is that at no point do we actually store all those numbers. Instead, we’re just yielding them one by one—first 1, then 2, then 3, and so on.
Laravel LazyCollection
With this in mind, let’s take a look at a commonly used class in Laravel—the Collection.
$numbers = new Collection([1, 2, 3, 4, 5]);
You don’t need to be a rocket surgeon to see what’s happening here—we’re creating an array of five numbers and passing it into a Laravel Collection. Think of collections as "arrays on steroids."
But what happens if we try this with nine million numbers again?
$numbers = new Collection(range(1, 9000000));
Ah yes, the dreaded "Allowed Memory Exhausted" error! But that’s okay—we saw this coming, right?
Allowed memory size of 134217728 bytes exhausted
(tried to allocate 268435464 bytes)
Laravel provides another type of collection called LazyCollection, which leverages PHP’s generators under the hood.
Let’s create one and see what happens.
$numbers = LazyCollection::make([1, 2, 3]);
Notice how we’re creating the collection using LazyCollection::make().
Behind the scenes, this is leveraging a generator, meaning the full array is never stored in memory at once—just like we discussed earlier.
To illustrate this further, let’s see what happens when we try to build a collection of nine million numbers again.
One key difference here is that instead of passing in a full array, we can also pass in a generator function:
$numbers = LazyCollection::make(function () {
    for ($i = 1; $i <= 9000000; $i++) {
        yield $i;
    }
});

foreach ($numbers as $number) {
    echo "Number " . $number . PHP_EOL;
}
The result?
...
...
Number 8999997
Number 8999998
Number 8999999
Number 9000000
Whew! That’s a lot of numbers again!
Under the hood, LazyCollection uses generators just like we demonstrated earlier with plain PHP. The key reason we don’t run into memory issues is that we never store all nine million numbers at once—we simply yield one integer at a time.
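To demystify that a little further, here is a hypothetical, stripped-down sketch of how lazy operations can be composed out of plain generators. The helper names lazyMap() and lazyFilter() are made up for illustration; the real LazyCollection API is richer, but the principle is the same: each step wraps the previous one, and nothing runs until the final loop pulls values through.

```php
<?php

// Hypothetical helpers, not the real LazyCollection internals.
function lazyMap(iterable $items, callable $fn)
{
    foreach ($items as $item) {
        yield $fn($item);
    }
}

function lazyFilter(iterable $items, callable $fn)
{
    foreach ($items as $item) {
        if ($fn($item)) {
            yield $item;
        }
    }
}

function numbers(int $max)
{
    for ($i = 1; $i <= $max; $i++) {
        yield $i;
    }
}

// Nothing is computed yet; we've only stacked generators.
$pipeline = lazyMap(
    lazyFilter(numbers(9000000), fn ($n) => $n % 2 === 0),
    fn ($n) => $n * 10
);

// Values flow through the whole pipeline one at a time.
foreach ($pipeline as $value) {
    echo $value . "\n"; // 20, 40, 60, ...

    if ($value >= 60) {
        break; // only three values were ever produced
    }
}
```

Because the source is a generator, breaking out early means the remaining millions of numbers are simply never generated.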
Of course, generating nine million numbers like this isn’t exactly a real-world scenario. So let’s explore some more practical use cases for LazyCollection.
Using LazyCollections with Files
A common real-world scenario is processing large log files, which can easily reach gigabytes in size. Loading such a file into memory all at once would be impractical—but with a LazyCollection, we can efficiently iterate over its contents.
Once again, at no point do we store the entire file in memory. Instead, we yield one line at a time until we’ve processed everything.
LazyCollection::make(function () {
    $handle = fopen(storage_path('foo.txt'), 'r');

    while (($line = fgets($handle)) !== false) {
        yield $line;
    }

    fclose($handle);
})->each(function ($line) {
    echo "Processing: " . $line . PHP_EOL;
});
Processing: 2025-02-04 10:58:53, WARNING, 3604, Log Entry 1
Processing: 2025-02-04 10:58:53, ERROR, 7586, Log Entry 2
Processing: 2025-02-04 10:58:53, WARNING, 4853, Log Entry 3
Processing: 2025-02-04 10:58:53, WARNING, 1850, Log Entry 4
Processing: 2025-02-04 10:58:53, INFO, 6766, Log Entry 5
Processing: 2025-02-04 10:58:53, INFO, 4979, Log Entry 6
Processing: 2025-02-04 10:58:53, WARNING, 9854, Log Entry 7
Processing: 2025-02-04 10:58:53, WARNING, 3030, Log Entry 8
One important thing to note is that while processing a large file like this may still take a long time, it won’t cause memory exhaustion.
This approach is especially useful for handling large datasets efficiently, and code like this could easily be implemented in a custom console command or a queued job in Laravel.
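For example, if we only cared about the ERROR lines, the same technique lets us filter while streaming. Here is a plain-PHP sketch (the sample log data is made up for the demo; in Laravel you could wrap the same generator in LazyCollection::make() and chain ->filter() instead):

```php
<?php

// Demo data: in practice this would be your real (gigabyte-sized) log file.
$path = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($path, implode("\n", [
    '2025-02-04 10:58:53, WARNING, 3604, Log Entry 1',
    '2025-02-04 10:58:53, ERROR, 7586, Log Entry 2',
    '2025-02-04 10:58:53, INFO, 6766, Log Entry 5',
    '2025-02-04 10:58:53, ERROR, 9854, Log Entry 7',
]) . "\n");

// Reads the file line by line and yields only ERROR entries;
// the whole file is never held in memory at once.
function errorLines(string $path)
{
    $handle = fopen($path, 'r');

    if ($handle === false) {
        throw new RuntimeException("Could not open {$path}");
    }

    try {
        while (($line = fgets($handle)) !== false) {
            if (str_contains($line, 'ERROR')) { // PHP 8+
                yield rtrim($line, "\n");
            }
        }
    } finally {
        fclose($handle);
    }
}

foreach (errorLines($path) as $line) {
    echo "Processing: " . $line . PHP_EOL;
}

unlink($path);
```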
Using LazyCollections with API Responses and Streaming
Not all APIs are designed with efficiency in mind. While some provide pagination and HATEOAS links to navigate through data seamlessly, others simply return a massive JSON payload without any structure for handling large datasets. In this case, we’re dealing with an API that sends back a single, enormous JSON dump, making it far from ideal for efficient processing.
With that challenge in mind, let’s take a look at the following approach for consuming this kind of API response in a way that keeps memory usage low and ensures smooth processing.
use JsonMachine\Items;

LazyCollection::make(function () {
    $response = Http::withoutVerifying()
        ->withHeaders([
            'Accept' => 'application/json',
        ])->get('https://acme.com/api/companies');

    $stream = $response->toPsrResponse()
        ->getBody()
        ->detach();

    $companies = Items::fromStream($stream);

    foreach ($companies as $company) {
        yield (array) $company;
    }
})->each(function (array $company) {
    // do something with a $company
    // which we fetched from the response
});
The example above efficiently processes a massive dataset of fifty thousand companies from an API without overloading the system. Instead of fetching and storing the entire response in memory, we leverage JSON Machine’s streaming capabilities along with Laravel’s LazyCollection.
The Http::get() request fetches the data with the necessary headers. However, rather than calling $response->body() (which would load everything into memory), we convert it into a PSR response stream and then detach it. This ensures that we can process the response as it arrives instead of waiting for the entire dataset to load.
In Laravel, calling toPsrResponse() on an HTTP response converts the Laravel Response object into a PSR-7 compliant response. PSR-7 is a PHP standard for HTTP messages, ensuring interoperability between different libraries and frameworks. This conversion is particularly useful when working with third-party libraries—such as JSON Machine—that require a PSR-7 response instead of Laravel's native response format.
One key advantage of using toPsrResponse() is that it provides direct access to the streamed response body, allowing us to handle large API responses efficiently.
In the example above, getBody()->detach() extracts the underlying stream from the PSR-7 response, which JSON Machine can then process incrementally. This is crucial when dealing with huge JSON payloads, as it avoids loading the entire response into memory.
Using Items::fromStream($stream), we stream the JSON response incrementally, allowing us to process one company at a time. JSON Machine is designed to handle massive JSON datasets efficiently by only keeping one record in memory at any given moment.
We then iterate over each company in the response and yield it as an array, making it compatible with Laravel’s LazyCollection. This approach is crucial for handling large datasets because it eliminates the need to hold millions of records in an array, which would likely crash the application.
Finally, we use ->each(function (array $company) { ... }) to process each company as it is streamed. This allows us to immediately act on each record, whether it's inserting it into a database, performing transformations, or further processing.
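One common refinement when the per-record action is a database insert is to batch the stream, which LazyCollection supports via its chunk() method. The underlying idea can be sketched in plain PHP; inBatches() and companies() below are hypothetical helpers for illustration:

```php
<?php

// Groups a lazily-produced stream into arrays of $size,
// so we can bulk-insert without materialising everything.
function inBatches(iterable $items, int $size)
{
    $batch = [];

    foreach ($items as $item) {
        $batch[] = $item;

        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }

    if ($batch !== []) {
        yield $batch; // final partial batch
    }
}

// Stand-in for the streamed API records from the example above.
function companies(int $total)
{
    for ($i = 1; $i <= $total; $i++) {
        yield ['id' => $i, 'name' => "Company {$i}"];
    }
}

foreach (inBatches(companies(2500), 1000) as $batch) {
    // Hypothetical stand-in for e.g. a single INSERT of 1000 rows.
    echo "Inserting " . count($batch) . " companies" . PHP_EOL;
}
```

At no point is more than one batch of records in memory, yet each database round-trip now carries a thousand rows instead of one.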
Since LazyCollection only retrieves records as needed, it ensures that we never exceed memory limits, even with millions of entries. This method is particularly useful for bulk data imports, real-time API processing, and handling large log files—scenarios where traditional json_decode() would be impractical.
By combining JSON Machine with LazyCollection, we achieve scalability, performance, and efficiency when working with massive JSON responses.
Conclusion
I know this post has been a long one, but hopefully you now have a good understanding of PHP generators: how they work and where you would use them. Hopefully it also managed to demystify some of the Laravel magic around the LazyCollection class.