
Mistaking Coroutines for Goroutines

Fluentbit is an open-source, multi-platform log processor and forwarder that can collect data/logs from different sources, unify the data, and send it to multiple destinations.

A lightweight counterpart to Fluentd, Fluentbit was created to fill in Fluentd’s gaps. As a result, Fluentbit’s alternate solutions can be somewhat surprising.

Resolving Delays

Since Fluentbit performs log processing, IT teams also use it to perform data analysis. Sorting tagged data for this purpose is relatively straightforward but retrieving data quickly can be problematic.

In particular, output plugins for Fluentbit are prone to delay from network issues. Eduardo Silva explains the issue in a lengthy presentation video:

So one of the challenges that we had when we were creating or designing Fluentbit, we said “okay this will run in a single process, we get the data in and we’re going to send the data out.” But most of the places where the data is going out is network operations. If you have done some programming with TCP, that means create a socket, resolve an address, connect to the address, and wait for notification. And that process sometimes can take what ten times (the length) and that can block all your program. So how do we avoid that?

The same problem shows up as “callback hell” in asynchronous JavaScript, where a computation has to be “frozen” and the remainder of the program executed later via a callback.

Occasionally, this asynchronous operation runs into network problems (e.g. the socket gets closed or the kernel sends a notification). With an output plugin, this can lead to data leaks and memory leaks.

For the application, subsequent tasks become blocked and a longer waiting period ensues.

Note that sending data to a destination is already a lengthy process: a TCP connection has to be created, the data converted to JSON (for the output plugin), the data written over the network, and a response received (i.e. okay, retry, or error).
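To make those four steps concrete, here is a minimal Go sketch of a blocking send. It is not Fluentbit code (Fluentbit is written in C). The record type, the newline-delimited protocol, and the toy "ok" server are all assumptions made for illustration. Each network call stops the caller until it finishes, which is exactly the delay described above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net"
	"strings"
)

// record is a hypothetical log record, used only for this illustration.
type record struct {
	Tag     string `json:"tag"`
	Message string `json:"message"`
}

// send performs the four steps in order. Every network call blocks the caller.
func send(addr string, rec record) (string, error) {
	conn, err := net.Dial("tcp", addr) // 1. create the TCP connection
	if err != nil {
		return "", err
	}
	defer conn.Close()

	payload, err := json.Marshal(rec) // 2. convert the data to JSON
	if err != nil {
		return "", err
	}

	// 3. write the data over the network (newline-delimited, an assumption)
	if _, err := conn.Write(append(payload, '\n')); err != nil {
		return "", err
	}

	// 4. wait for the destination's response
	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(reply), nil
}

// serveOnce is a toy destination: it accepts one connection, reads one
// line, and answers "ok".
func serveOnce(ln net.Listener) {
	conn, err := ln.Accept()
	if err != nil {
		return
	}
	defer conn.Close()
	if _, err := bufio.NewReader(conn).ReadString('\n'); err == nil {
		fmt.Fprintln(conn, "ok")
	}
}

// roundTrip starts the toy destination on a free local port and sends one record.
func roundTrip() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: the OS picks a free port
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serveOnce(ln)
	return send(ln.Addr().String(), record{Tag: "app.log", Message: "hello"})
}

func main() {
	status, err := roundTrip()
	fmt.Println(status, err)
}
```

If the destination is slow to accept the connection or to answer, send simply sits in step 1 or step 4, and a single-threaded program can do nothing else in the meantime.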

The Goroutine Solution

What to do? The solution might lie in temporarily deferring the tasks that are being blocked. As explained in a previous post, Go handles this quite well with goroutines. As golang-book.com explains,

The (program below) consists of two goroutines. The first goroutine is implicit and is the main function itself. The second goroutine is created when we call go f(0). Normally when we invoke a function our program will execute all the statements in a function and then return to the next line following the invocation. With a goroutine we return immediately to the next line and don’t wait for the function to complete. This is why the call to the Scanln function has been included; without it the program would exit before being given the opportunity to print all the numbers.

[Figure 1: code listing of the goroutine example program]
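A program along the lines the quote describes might look like the sketch below. It is an approximation, not a verbatim copy of the golang-book listing. In particular, it waits with a sync.WaitGroup instead of the Scanln call mentioned in the quote, so that it exits without keyboard input, and the line helper is added only to make the output format easy to check.

```go
package main

import (
	"fmt"
	"sync"
)

// line formats one output line the same way fmt.Println(n, ":", i) would.
func line(n, i int) string {
	return fmt.Sprintln(n, ":", i)
}

// f prints ten numbered lines and then signals that it has finished.
func f(n int, wg *sync.WaitGroup) {
	defer wg.Done()
	for i := 0; i < 10; i++ {
		fmt.Print(line(n, i))
	}
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go f(0, &wg) // returns immediately; f runs as a second goroutine
	wg.Wait()    // stands in for the Scanln call described in the quote
}
```

Without the wait at the end of main, the program could exit before the second goroutine has printed anything, which is the point the quote makes about Scanln.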

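In other words, Go lets independent sequences of code run concurrently without the programmer restructuring the program around callbacks. Goroutines are not left entirely unsynchronized, however: when they communicate over an unbuffered channel, a send completes only once a receiver is ready, so both sender and receiver must be present for the exchange to go through.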
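The sender-and-receiver requirement can be seen with an unbuffered channel. The following is a minimal sketch; the function names squares and collect are made up for the example.

```go
package main

import "fmt"

// squares starts a goroutine that sends the first n squares over an
// unbuffered channel and closes the channel when it is done.
func squares(n int) <-chan int {
	out := make(chan int) // unbuffered: each send waits for a receiver
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i * i // the goroutine parks here until the value is received
		}
	}()
	return out
}

// collect receives every value the sending goroutine produces.
func collect(n int) []int {
	var got []int
	for v := range squares(n) {
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(collect(4))
}
```

The code reads as two plain sequential loops, yet the sending goroutine gives up control at every send without the programmer writing any explicit yield.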

How Coroutines Work

Fluentbit, written in the C programming language, refers to its suspendable code sequences as co-routines. They’re similar to goroutines, but more explicit in nature. As Eduardo Silva explains,

So that’s why we come up with implementing co-routines. Maybe if you are using Golang you’re quite familiar with co-routines. Or a way to suspend execution of your function and just wait that somebody wake you up and restore that specific entry point where you were working.  I’m going to explain that with a code example now. So implementing co-routines in the output plugins was a solution. Because we abstract how the developer of the output plugin is handling the connections, how it’s reading data, and so on. So all about the network operations, error status, and that kind of things,  the Fluentbit API takes care of that. So we reduce the risk of memory leaks or any kind of problem.

In his presentation, Eduardo Silva details how Fluentbit puts this into practice:

If we look visually at this code, you will see that this is a blocking code. But internally, every time that you issue any kind of network operation (and I’m focusing on an upstream connection, get a connection to an upstream server- I mean create a socket, connect, and be ready to write data) Fluentbit is going to suspend execution of that function. And then we’ll continue working, processing data. And once the kernel notifies the connection “hey it’s ready” it’s going to resume from that specific entry point. 

So with this kind of model, we avoid many problems with network i/o or leaking data or whatever. And of course, in the beginning, it’s not too straightforward to implement the design internally. But after that, we came up with a solution where you can write your own plugins in a safe way. Of course, you can leak memory if you don’t write good code correct? But that happens to every person. That’s what we say.

We suspended while the connection was done, we resumed later, and then, we issue the HTTP request. That means I’m going to send my payload to the server. I just include the data and I suspend again. Because I will tell the kernel, please let me know when you just flushed the data. I’m not going to sit and wait right? I have things to do. And then when I’m ready I’m going to resume and the internal API is going to validate the status and so on. So this is the most simple example of how this works internally. And it’s very reliable.
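The suspend-and-resume pattern described in the talk can be imitated in Go. The sketch below is an illustration only, under several assumptions. It is not Fluentbit's C API: the coroutine type, start, next, and flush are hypothetical names, and channels stand in for the kernel notifications. The flush function reads like blocking code, yet it hands control back to the engine at each network step.

```go
package main

import "fmt"

// coroutine is a hypothetical, channel-based stand-in for a suspendable
// function. It is not how Fluentbit implements co-routines.
type coroutine struct {
	resume chan struct{} // engine -> coroutine: "continue"
	yield  chan string   // coroutine -> engine: "I am waiting for this event"
}

// start prepares body to run as a coroutine. Nothing executes until next is called.
func start(body func(suspend func(event string))) *coroutine {
	co := &coroutine{resume: make(chan struct{}), yield: make(chan string)}
	go func() {
		<-co.resume // wait for the first resume
		body(func(event string) {
			co.yield <- event // explicit suspension point: hand control back
			<-co.resume       // sleep until the engine resumes us
		})
		close(co.yield) // body returned: the coroutine is finished
	}()
	return co
}

// next resumes the coroutine and reports the event it suspends on.
// ok is false once the body has returned; do not call next after that.
func (co *coroutine) next() (event string, ok bool) {
	co.resume <- struct{}{}
	event, ok = <-co.yield
	return event, ok
}

// flush mimics an output plugin: it looks like blocking code but
// suspends at each (simulated) network operation.
func flush(trace *[]string) func(suspend func(string)) {
	return func(suspend func(string)) {
		*trace = append(*trace, "flush: connecting")
		suspend("connect")
		*trace = append(*trace, "flush: connected, sending payload")
		suspend("write")
		*trace = append(*trace, "flush: payload flushed, checking status")
	}
}

// run plays the role of the engine: it keeps working while flush is suspended.
func run() []string {
	var trace []string
	co := start(flush(&trace))
	for {
		event, ok := co.next()
		if !ok {
			break
		}
		trace = append(trace, "engine: other work while waiting for "+event)
	}
	return trace
}

func main() {
	for _, step := range run() {
		fmt.Println(step)
	}
}
```

The trace alternates between flush and the engine, which mirrors the sequence in the talk: suspend while connecting, resume, send the payload, suspend again, then resume to check the status.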

The Difference Between The Two

What is the difference between a goroutine and a coroutine? On Stack Overflow, a user named Kostix gives a very succinct answer:

IMO, a coroutine implies supporting of explicit means for transferring control to another coroutine. That is, the programmer programs a coroutine in a way when they decide when a coroutine should suspend execution and pass its control to another coroutine (either by calling it or by returning/exiting (usually called yielding)).

Go’s “goroutines” are another thing: they implicitly surrender control at certain indeterminate points which happen when the goroutine is about to sleep on some (external) resource like I/O completion, channel send etc. This approach combined with sharing state via channels enables the programmer to write the program logic as a set of sequential light-weight processes which removes the spaghetti code problem common to both coroutine- and event-based approaches.