Reading Large Files Using Node.js

April 04, 2020

I recently had to analyze a massive dataset of log files. When I tried to open the file in Excel, my laptop simply froze. Since the tools at hand could not cope with it, I decided to parse the file with a Node.js script.


To read a small file, you might use the following script:

var fs = require('fs');

fs.readFile('path/mySmallFile.txt', 'utf-8', (err, data) => {
  if (err) {
    throw err;
  }
  // data holds the entire file content as a single string
  console.log(data);
});

This script reads the content of a small file without trouble. For large files, however, fs.readFile tries to load the whole file into memory at once, and you might encounter a buffer error such as RangeError: Attempt to allocate Buffer larger than maximum size. The script then terminates with an error similar to the following:

Error: "toString" failed
  at stringSlice (buffer.js)
  at Buffer.toString (buffer.js)
  at FSReqWrap.readFileAfterClose [as oncomplete]
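
One way to anticipate this failure is to compare the file's size with the limits Node.js exposes through buffer.constants before calling fs.readFile. The following is a minimal sketch, not part of the original script: fitsInOneRead is a hypothetical helper name, and comparing a byte count against MAX_STRING_LENGTH (which limits string length, not bytes) is only a conservative approximation.

```javascript
const { constants } = require('buffer');

// Hypothetical helper: returns true when a file of `sizeInBytes` bytes is
// small enough to be loaded into one Buffer and decoded into one string.
// A UTF-8 string never has more characters than bytes, so checking the byte
// count against MAX_STRING_LENGTH errs on the safe side.
function fitsInOneRead(sizeInBytes) {
  return sizeInBytes <= constants.MAX_LENGTH &&
         sizeInBytes <= constants.MAX_STRING_LENGTH;
}
```

You could pass it the size reported by fs.statSync for your file and fall back to a streaming approach when it returns false. Both limits vary between Node.js versions and platforms, so the check reads them at runtime instead of hard-coding a number.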


To read a large file, you can use Node.js's built-in readline module, which consumes the file as a stream and hands it to you one line at a time:

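var fs = require('fs');
var readline = require('readline');

const rl = readline.createInterface({
  input: fs.createReadStream('path/largeFile.csv'),
  output: process.stdout,
  terminal: false
});

rl.on('line', (line) => {
  // called once for every line in the file; process the line here
});

rl.on('pause', () => {
  // called when the input ends; 'close' is the documented end-of-input event
});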

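Replace the file path with the path to your large file. Inside the on('line') handler, you can process the file line by line, for example by parsing each line into JSON and incrementing a counter. The final sum can be displayed in the on('pause') handler after the file has been completely read. The readline documentation describes the 'close' event as the one emitted when the input stream ends, so on('close') is the more conventional place for that final step.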
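
The parse-and-count idea described above can be sketched as follows. This is an illustration under assumptions, not the original script: it assumes each line of the file is a standalone JSON object, and the helper name countByLevel and the level field are hypothetical. It reports the total on 'close', the event readline emits once the input stream has ended.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper: counts the lines of `filePath` that parse as JSON
// and whose (assumed) `level` field equals `wantedLevel`.
function countByLevel(filePath, wantedLevel) {
  return new Promise((resolve, reject) => {
    const input = fs.createReadStream(filePath);
    const rl = readline.createInterface({ input, crlfDelay: Infinity });
    let count = 0;

    // Surface stream errors, e.g. when the file does not exist.
    input.on('error', reject);

    rl.on('line', (line) => {
      if (line.trim() === '') return; // skip blank lines
      try {
        const entry = JSON.parse(line);
        if (entry.level === wantedLevel) count += 1;
      } catch (parseError) {
        // This sketch silently ignores lines that are not valid JSON.
      }
    });

    // 'close' fires once the input stream has ended.
    rl.on('close', () => resolve(count));
  });
}
```

Calling countByLevel('path/largeFile.csv', 'error') returns a promise for the number of matching lines. Only the current line and the running counter are kept in memory, which is what lets this approach scale to files that fs.readFile cannot load.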

With this approach, you should now be able to process massive datasets using Node.js. For more information, please refer to the official documentation: Node.js Readline API.

Software development professional with expertise in application architecture, cloud solutions deployment, and financial products development. Possess a Master's degree in Computer Science and an MBA in Finance. Highly skilled in AWS (Certified Solutions Architect, Developer and SysOps Administrator), GCP (Professional Cloud Architect), Microsoft Azure, Kubernetes (CKA, CKAD, CKS, KCNA), and Scrum (PSM, PSPO) methodologies. Happy to connect.