Standard IO in Zig

Posted on Apr 13, 2024

Most of the “starter projects” I’d like to tackle in Zig involve at least some IO, so one of the first things I tried to do with the language was to get a feel for how to read from and write to different devices.

I knew that Zig’s “hello world” was more complex than in other languages, and I knew why. Still, I was a bit puzzled by some of the APIs around IO. Here are some notes.

A simple version

// io1.zig
const std = @import("std");

pub fn main() !void {
    const stdin = std.io.getStdIn().reader();
    const stdout = std.io.getStdOut().writer();

    var buf: [48]u8 = undefined;

    while (true) {
        try stdout.print("?> ", .{});
        const res = try stdin.readUntilDelimiter(&buf, '\n');
        try stdout.print("{s}\n", .{res});
    }
}

Save it as io1.zig and run it with zig run io1.zig. You’re greeted by a simple prompt:

$ zig run io1.zig
?>

Just type something and press ENTER. The program will spit out what you typed in and give you a new prompt:

$ zig run io1.zig
?> foo
foo
?>

This will continue until you press Ctrl-C, at which point the program exits:

$ zig run io1.zig
?> foo
foo
?> bar
bar
?> ^C

How does it work?

Let’s dive right into the main function:

const stdin = std.io.getStdIn().reader();
const stdout = std.io.getStdOut().writer();

Ok, so stdin and stdout are structs that implement the Reader and Writer interfaces respectively. They’re const because we don’t mutate them in the program. I’m not going to pretend I understand interfaces in Zig yet.

The rest is pretty self-explanatory. Some notes:

For the first line of the loop, we could also have used try stdout.writeAll("?> "); to avoid having to pass in an empty anonymous struct.
We have to use try everywhere since any of these calls might fail.
You may wonder: how do we know that the reader and writer have methods like writeAll, readUntilDelimiter, print, etc.? Documentation is currently lacking. The best way to learn about these is to read the actual implementation. The code for the reader and writer interfaces is in std/io/Reader.zig and std/io/Writer.zig.
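
To make the first two notes concrete, here is a minimal sketch that uses writeAll for the prompt and inspects one error instead of passing it along with try. The echoLines helper is a name made up for this example, and the code assumes the same std.io API as io1.zig. It treats error.EndOfStream (what the reader reports when the input ends, for instance after Ctrl-D in most terminals) as a normal way to leave the loop:

```zig
const std = @import("std");

/// Echo lines from `reader` to `writer` until the input ends.
/// (`echoLines` is a hypothetical helper, not part of std.)
fn echoLines(reader: anytype, writer: anytype) !void {
    var buf: [48]u8 = undefined;

    while (true) {
        // writeAll takes the bytes directly: no format string, no `.{}`.
        try writer.writeAll("?> ");

        const line = reader.readUntilDelimiter(&buf, '\n') catch |err| {
            // End of input: stop quietly instead of failing.
            if (err == error.EndOfStream) break;
            // Any other error is still passed up to the caller.
            return err;
        };

        try writer.print("{s}\n", .{line});
    }
}

pub fn main() !void {
    try echoLines(std.io.getStdIn().reader(), std.io.getStdOut().writer());
}
```

Every other error, such as error.StreamTooLong for a line that does not fit in buf, still propagates out of main.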

Note: I’ve been pleasantly surprised by how accessible the code for the standard library is. I’ve been able to find my way around it pretty easily despite having zero zig experience, and for simple things like IO, reading the library has been a perfectly adequate replacement for documentation.

A better version

If you do read the std library code as suggested above, you’ll see the following comment above the readUntilDelimiter function:

/// Deprecated: use `streamUntilDelimiter` with FixedBufferStream's writer instead.

Uh.. what does that mean? You get a hint by reading the code for the function:

// In std/io/Reader.zig

pub fn readUntilDelimiter(self: Self, buf: []u8, delimiter: u8) anyerror![]u8 {
    var fbs = std.io.fixedBufferStream(buf);
    try self.streamUntilDelimiter(fbs.writer(), delimiter, fbs.buffer.len);
    const output = fbs.getWritten();
    buf[output.len] = delimiter; // emulating old behaviour
    return output;
}

OK, so under the hood the function actually doing the work is streamUntilDelimiter which, if you check its definition (in the same file), takes a struct implementing the writer interface, a delimiter and an optional maximum size. The writer passed to this function comes from a fixedBufferStream. You can also see that the function does an assignment that we don’t actually need, since it only emulates the old behaviour:

buf[output.len] = delimiter;

which can be skipped. We also don’t need to create a new FixedBufferStream every time: there is a reset method that allows us to reuse the stream. If we extract the relevant parts, we can write the following:

// io2.zig
const std = @import("std");

pub fn main() !void {
    const stdin = std.io.getStdIn().reader();
    const stdout = std.io.getStdOut().writer();

    var buf: [48]u8 = undefined;
    var fbs = std.io.fixedBufferStream(&buf);

    while (true) : (fbs.reset()) {
        // Read
        try stdout.print("?> ", .{});
        try stdin.streamUntilDelimiter(fbs.writer(), '\n', fbs.buffer.len);
        const content = fbs.getWritten();

        // Print
        try stdout.print("{s}\n", .{content});
    }
}

Improved?

I can see why the new version would be more efficient than the previous one (since it does less work). What puzzles me is why we have to contort ourselves to use streamUntilDelimiter. It seems that reading the content of stdin into an array is a common enough task that we would want a single function (like readUntilDelimiter) that does it.. you know, make the common case simple?

For example, the following implementation is closer to the kind of allocation-free read function I would expect (without some of the error checking, such as bounds checks on buf):

pub fn readUntilDelimiter(reader: anytype, buf: []u8, delimiter: u8) anyerror![]u8 {
    var n: usize = 0;
    while (true) {
        const byte: u8 = try reader.readByte();
        if (byte == delimiter) break;
        buf[n] = byte;
        n += 1;
    }
    return buf[0..n];
}

We could then swap out the stdin.readUntilDelimiter call in our first version and replace it with:

const res = try readUntilDelimiter(stdin, &buf, '\n');

which is very close to the version in the standard library, except that it is not a method and skips the unnecessary assignment.
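
The hand-written function above writes past the end of buf if a line is longer than the buffer. Here is a sketch of the same loop with that one check added. The name readUntilDelimiterChecked is made up for this example, and returning error.StreamTooLong is a choice that mirrors the error name the standard library uses for the same situation:

```zig
const std = @import("std");

/// Read bytes into `buf` until `delimiter`, refusing to overflow `buf`.
/// (Hypothetical helper; not part of std.)
fn readUntilDelimiterChecked(reader: anytype, buf: []u8, delimiter: u8) ![]u8 {
    var n: usize = 0;
    while (true) {
        const byte: u8 = try reader.readByte();
        if (byte == delimiter) break;
        // buf is full and the line is not finished yet: give up.
        if (n == buf.len) return error.StreamTooLong;
        buf[n] = byte;
        n += 1;
    }
    return buf[0..n];
}

pub fn main() !void {
    const stdin = std.io.getStdIn().reader();
    const stdout = std.io.getStdOut().writer();

    var buf: [48]u8 = undefined;

    while (true) {
        try stdout.writeAll("?> ");
        const res = try readUntilDelimiterChecked(stdin, &buf, '\n');
        try stdout.print("{s}\n", .{res});
    }
}
```

A line of exactly buf.len bytes still fits, because the check only fires when another non-delimiter byte arrives after the buffer is full.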

Again, I’m not sure whether the use of interfaces in this case is a compromise between maintainability (less code duplication, etc.) and performance (see this post explaining some of the performance implications), or whether there is some other important reason for doing it that way.
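
Whatever the actual motivation, one thing the writer-based signature does allow is changing the destination without touching the reading code. As a sketch, assuming std.ArrayList(u8) has the init(allocator) and writer() methods it had in the std version used for this post, the same streamUntilDelimiter call can fill a growable buffer, so lines are no longer capped at 48 bytes. readLine is a hypothetical helper, and passing null as the maximum size means no limit:

```zig
const std = @import("std");

/// Read one line of any length into `line`, reusing its capacity.
/// (Hypothetical helper; not part of std.)
fn readLine(reader: anytype, line: *std.ArrayList(u8)) ![]const u8 {
    line.clearRetainingCapacity();
    // The ArrayList writer grows the list as bytes arrive.
    try reader.streamUntilDelimiter(line.writer(), '\n', null);
    return line.items;
}

pub fn main() !void {
    const stdin = std.io.getStdIn().reader();
    const stdout = std.io.getStdOut().writer();

    // page_allocator keeps the sketch short; any allocator would do.
    var line = std.ArrayList(u8).init(std.heap.page_allocator);
    defer line.deinit();

    while (true) {
        try stdout.writeAll("?> ");
        const content = try readLine(stdin, &line);
        try stdout.print("{s}\n", .{content});
    }
}
```

The trade-off is that this version allocates, which the fixed-buffer versions above avoid.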

Misc: beware the slice

Take the “improved” version (io2.zig) and add the following print statement below the statement printing the content variable:

try stdout.print("{s}\n", .{buf[0..content.len]});

Compile and run: whatever you type is now printed twice.

$ zig run io2.zig
?> foo
foo
foo
?>

In other words, buf has been mutated inside the FixedBufferStream. Well of course it has.. otherwise why would you pass it as an argument to fixedBufferStream() in the first place? That it feels odd to me just goes to show how used I am to a high-level, alloc-everything-on-the-heap style of programming.
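
The flip side is that the slice returned by getWritten() is only a view into buf, not a copy. The following sketch (same std.io API as above, no stdin involved) shows what that means if you hold on to the slice across a reset():

```zig
const std = @import("std");

pub fn main() !void {
    var buf: [48]u8 = undefined;
    var fbs = std.io.fixedBufferStream(&buf);

    try fbs.writer().writeAll("foo");
    const content = fbs.getWritten();

    // content starts at the same address as buf: it is a view, not a copy.
    std.debug.print("{}\n", .{&content[0] == &buf[0]}); // prints "true"

    // Writing through buf is visible through content.
    buf[0] = 'g';
    std.debug.print("{s}\n", .{content}); // prints "goo"

    // reset() rewinds the stream; the next write overwrites the same bytes.
    fbs.reset();
    try fbs.writer().writeAll("bar");
    std.debug.print("{s}\n", .{content}); // prints "bar"
}
```

So if a line needs to outlive the next iteration of the loop in io2.zig, it has to be copied somewhere else first.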