Read large files line by line in node.js
Update
This solution no longer works with the latest versions of Node.js, since support for the bufferSize option was dropped. For an updated solution, see https://coderwall.com/p/ohjerg?&p=8&q=
When working with large data files (500MB+) you don't want to read an entire file into memory. Currently the Node.js API doesn't support reading files line by line using createReadStream().
For this reason we check each character for a newline. This might seem a bit odd, but once you start working with large files, keeping memory usage low is a top priority, and the pros quickly outweigh the cons.
// get the filesystem module
var fs = require('fs'),
    stream = fs.createReadStream("/path/to/large/file", {
        flags: 'r',
        encoding: 'utf-8',
        fd: null,
        bufferSize: 1 // read one character at a time
    }),
    line = '';

// start reading the file
stream.addListener('data', function (char) {
    // pause the stream while we handle this character
    stream.pause();
    if (char == '\n') {
        // do whatever you want to do with the line here
        // ...
        // when done, reset the line and resume the stream
        line = '';
        stream.resume();
    } else {
        // add the new char to the current line
        line += char;
        stream.resume();
    }
});
Written by Stephan Steynfaardt
2 Responses
Does not work any more - the bufferSize option has been removed.
Bala Clark has a simplified update on this, I haven't tested it yet, but it seems pretty easy.
https://coderwall.com/p/ohjerg?&p=8&q=
UPDATE:
Tested Bala's solution on a 2.5G file and it worked perfectly!