Tuesday, August 25, 2020

The Node.js Event Loop: A Developer’s Guide to Concepts & Code

Asynchrony in any programming language is hard. Concepts like concurrency, parallelism, and deadlocks make even the most seasoned engineers shiver. Code that executes asynchronously is unpredictable and difficult to trace when there are bugs. The problem is inescapable because modern CPUs have multiple cores. Each core has hit a thermal limit, so single-core clock speeds are no longer climbing the way they once did. This puts pressure on the developer to write efficient code that takes advantage of the hardware.

JavaScript is single-threaded, but does this limit Node from utilizing modern architecture? One of the biggest challenges is dealing with multiple threads, because of their inherent complexity. Spinning up new threads and managing context switches between them is expensive. Both the operating system and the programmer must do a lot of work to deliver a solution that has many edge cases. In this take, I’ll show you how Node deals with this quagmire via the event loop. I’ll explore every part of the Node.js event loop and demonstrate how it works. This loop is one of the “killer app” features in Node, because it solved a hard problem in a radical new way.

What is the Event Loop?

The event loop is a single-threaded, non-blocking, and asynchronously concurrent loop. For those without a computer science degree, imagine a web request that does a database lookup. A single thread can only do one thing at a time. Instead of waiting on the database to respond, it continues to pick up other tasks in the queue. In the event loop, the main loop unwinds the call stack and doesn’t wait on callbacks. Because the loop doesn’t block, it’s free to work on more than one web request at a time. Multiple requests can get queued at the same time, which makes it concurrent. The loop doesn’t wait for everything from one request to complete, but picks up callbacks as they come without blocking.
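As a minimal sketch of that idea, the snippet below simulates two web requests that each wait on a database lookup. The names fakeDbLookup and handleRequest are hypothetical, and a timer stands in for the real database call.

```javascript
// Hypothetical stand-in for a database lookup: resolves after `ms` milliseconds.
const fakeDbLookup = (id, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(`row-${id}`), ms));

const log = []; // Records the order in which things happen

// Hypothetical request handler: starts the lookup, then yields to the loop.
const handleRequest = async (id, ms) => {
  log.push(`start ${id}`);
  const row = await fakeDbLookup(id, ms);
  log.push(`done ${id} (${row})`);
};

// Both requests start before either lookup finishes.
const allDone = Promise.all([handleRequest(1, 100), handleRequest(2, 20)])
  .then(() => {
    console.log(log.join('\n'));
    return log;
  });
```

Request 2 finishes first even though request 1 started first, because the single thread never sits idle waiting on the slower lookup.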

The loop itself is semi-infinite, meaning it can exit once the call stack is empty and no callbacks are pending. Think of the call stack as synchronous code that unwinds, like console.log, before the loop polls for more work. Node uses libuv under the covers to poll the operating system for callbacks from incoming connections.
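Here’s a small sketch of that ordering. It assumes nothing beyond a timer and the process 'exit' event, which Node emits once nothing is left to keep the loop alive.

```javascript
const order = [];

// Queued on the callback queue; runs only after the call stack unwinds.
setTimeout(() => order.push('timer callback'), 0);

// Plain synchronous code runs first.
order.push('synchronous code');

// Fires once the call stack is empty and no callbacks are pending.
process.on('exit', () => {
  order.push('loop exited');
  console.log(order.join(' -> '));
  // synchronous code -> timer callback -> loop exited
});
```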

You may be wondering, why does the event loop execute in a single thread? Threads are relatively heavy in memory for the data they need per connection. Threads are operating system resources that have to be spun up, and one thread per connection doesn’t scale to thousands of active connections.

Multiple threads in general also complicate the story. If a callback comes back with data, it must marshal context back to the executing thread. Context switching between threads is slow, because it must synchronize current state like the call stack or local variables. Because it’s single-threaded, the event loop sidesteps the bugs that crop up when multiple threads share resources. A single-threaded loop cuts thread-safety edge cases and can context switch much faster. This is the real genius behind the loop. It makes effective use of connections and threads while remaining scalable.
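To illustrate the thread-safety point, here’s a sketch in which a thousand callbacks update a shared counter without any lock. The counter and increment names are made up for this example.

```javascript
let counter = 0; // Shared state, no lock required

// Each callback runs to completion before the next one starts,
// so this read-modify-write can never interleave with another.
const increment = () => {
  const current = counter;
  counter = current + 1;
};

const tasks = [];
for (let i = 0; i < 1000; i++) {
  tasks.push(new Promise((resolve) => {
    setImmediate(() => {
      increment();
      resolve();
    });
  }));
}

const finished = Promise.all(tasks).then(() => {
  console.log(counter); // 1000
  return counter;
});
```

Shared state can still change between two awaits in the same function, so this guarantee holds per callback, not across a whole async function.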

Enough theory; time to see what this looks like in code. Feel free to follow along in a REPL or download the source code.

Semi-infinite Loop

The biggest question the event loop must answer is whether the loop is alive. If so, it figures out how long to wait on the callback queue. At each iteration, the loop unwinds the call stack, then polls.
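One way to poke at the “is the loop alive?” question is with timer references. In this sketch, unref() and hasRef() are methods on the timer object that setTimeout returns in Node; the variable names and delays are arbitrary.

```javascript
// A normal timer counts as pending work, so the loop stays alive for it.
const keepsAlive = setTimeout(() => console.log('ref timer fired'), 50);

// An unref'd timer does not count: the loop may exit before it fires.
const background = setTimeout(() => console.log('never printed'), 60000);
background.unref();

console.log(keepsAlive.hasRef()); // true
console.log(background.hasRef()); // false
```

The process exits right after the 50 ms timer fires, because the remaining timer no longer keeps the loop alive.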

Here’s an example that blocks the main loop:

setTimeout(
  () => console.log('Hi from the callback queue'),
  5000); // Keep the loop alive for this long

const stopTime = Date.now() + 2000;
while (Date.now() < stopTime) {} // Block the main loop

If you run this code, note the loop gets blocked for two seconds. But the loop stays alive until the callback executes. The timer was scheduled before the block, so the callback still runs about five seconds after the script starts, not seven. Once the main loop unblocks, the polling mechanism figures out how long it waits on callbacks. This loop dies when the call stack unwinds and there are no more callbacks left.
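To check the timing yourself, here’s a sketch of the same experiment with a stopwatch. The durations are scaled down (a 500 ms timer and a 200 ms block) purely so it runs quickly; those numbers are my assumption, not part of the example above.

```javascript
const start = Date.now();

// Schedule the timer first, then block. The clock is already ticking.
const timerFired = new Promise((resolve) =>
  setTimeout(() => resolve(Date.now() - start), 500));

const stopTime = Date.now() + 200;
while (Date.now() < stopTime) {} // Block the main loop

timerFired.then((elapsed) =>
  console.log(`Callback ran after ~${elapsed} ms`)); // Close to 500, not 700
```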

The Callback Queue

Now, what happens when I block the main loop and then schedule a callback? Once the loop gets blocked, it doesn’t put more callbacks on the queue:

const stopTime = Date.now() + 2000;
while (Date.now() < stopTime) {} // Block the main loop

// This takes 7 secs to execute
setTimeout(() => console.log('Ran callback A'), 5000);

This time the loop stays alive for seven seconds, because the timer doesn’t get scheduled until the two-second block ends. The event loop is dumb in its simplicity. It has no way of knowing what might get queued in the future. In a real system, incoming callbacks get queued and execute as the main loop is free to poll. The event loop goes through several phases sequentially when it’s unblocked. So, to ace that job interview about the loop, avoid fancy jargon like “event emitter” or “reactor pattern”. It’s a humble single-threaded loop, concurrent, and non-blocking.
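The phases are easier to see with a sketch. This one schedules three kinds of callbacks from inside an I/O callback; fs.stat on the current directory is used only as a cheap way to get one, and the phaseOrder name is made up.

```javascript
const fs = require('fs');

const phaseOrder = new Promise((resolve) => {
  const seen = [];
  // fs.stat is here only to land inside an I/O callback.
  fs.stat('.', () => {
    // Runs in the timers phase of the next loop iteration.
    setTimeout(() => { seen.push('setTimeout'); resolve(seen); }, 0);
    // Runs in the check phase, which comes right after polling.
    setImmediate(() => seen.push('setImmediate'));
    // Runs as soon as the current callback finishes, before any phase.
    process.nextTick(() => seen.push('nextTick'));
  });
});

phaseOrder.then((seen) => console.log(seen.join(' -> ')));
// nextTick -> setImmediate -> setTimeout
```

Outside an I/O callback, the relative order of setTimeout(fn, 0) and setImmediate isn’t guaranteed, which is why the sketch schedules them from inside one.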

The Event Loop with async/await

To avoid blocking the main loop, one idea is to wrap synchronous I/O in async/await:

const fs = require('fs');
const readFileSync = async (path) => await fs.readFileSync(path);

readFileSync('readme.md').then((data) => console.log(data));
console.log('The event loop continues without blocking...');

Anything that comes after the await comes from the callback queue. The code reads like synchronously blocking code, and the continuation is deferred, which is why the last console.log prints first. Note async/await makes readFileSync thenable, but the fs.readFileSync call inside it still runs synchronously and blocks the main loop while it reads the file. Think of anything that comes after await as non-blocking via a callback.

Full disclosure: the code above is for demonstration purposes only. In real code, I recommend fs.readFile, which fires a callback that can be wrapped in a promise. The general intent is still valid, because an asynchronous read is what takes blocking I/O off the main loop.
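Here’s one way that might look, using util.promisify to wrap fs.readFile in a promise. The sample file in the temp directory and the names readFileAsync, samplePath, and events are assumptions made so the sketch is self-contained.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { promisify } = require('util');

// Wrap the callback-based fs.readFile in a promise.
const readFileAsync = promisify(fs.readFile);

// Hypothetical sample file, created only so there is something to read.
const samplePath = path.join(os.tmpdir(), 'event-loop-sample.txt');
fs.writeFileSync(samplePath, 'Hello from disk');

const events = [];

// The read happens off the main loop; the callback runs when data is ready.
const readDone = readFileAsync(samplePath, 'utf8').then((data) => {
  events.push(`read: ${data}`);
  console.log(events.join('\n'));
  return events;
});

// Runs before the read completes, because nothing above blocks.
events.push('The event loop continues without blocking...');
```

In recent Node versions, fs.promises.readFile offers the same thing without the manual wrapper.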

Continue reading The Node.js Event Loop: A Developer’s Guide to Concepts & Code on SitePoint.


by Camilo Reyes via SitePoint
