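Large dataframes can consume a significant amount of memory. By processing data in smaller chunks, you hold only one chunk in memory at a time, which helps you avoid running out of memory and lets you start working with the first rows before the whole file has been read.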
In the code above, using chunksize=100000 is approximately 5495 times faster than not using chunksize.
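As a minimal sketch of the chunked pattern, the example below sums one column chunk by chunk. The in-memory CSV, the column names id and value, and the helper chunked_sum are made up for illustration and do not come from the code above. Note that with chunksize set, pandas read_csv returns an iterator rather than a DataFrame, so the file is read as the loop advances; a timing comparison is only meaningful if it covers the full loop.

```python
import io

import pandas as pd

# Build a small in-memory CSV so the sketch runs without an external file.
# The column names "id" and "value" are hypothetical.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(10))


def chunked_sum(source, column, chunksize):
    """Sum one column while holding only one chunk in memory at a time."""
    total = 0
    rows = 0
    # With chunksize set, read_csv yields DataFrames of at most
    # `chunksize` rows instead of returning one large DataFrame.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        total += chunk[column].sum()
        rows += len(chunk)
    return int(total), rows


total, rows = chunked_sum(io.StringIO(csv_text), "value", chunksize=4)
print(total, rows)
```

For a real file, pass its path instead of the StringIO object and pick a chunksize that fits comfortably in memory, such as the 100000 used above.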