P1883 (LLFIO) review
std::filesystem brought a nice set of features to C++’s standard library in 2017, and it became much easier to write cross-platform code that deals with paths. Long gone are the days when you had to remember to use / as the path separator on Linux and \ on Windows.
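To make the path part concrete, here’s a minimal sketch of composing a path with std::filesystem. The function name make_log_path and the "logs" directory are made up for illustration; the point is only that operator/ inserts the separator for you.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: builds "<base>/logs/<name>.txt".
// operator/ joins the components with the platform's preferred
// separator, so no '/' or '\\' is hard-coded here.
fs::path make_log_path(const fs::path& base, const std::string& name) {
    return base / "logs" / (name + ".txt");
}
```

If you need a stable textual form (say, for a config file), generic_string() gives you the path with forward slashes regardless of platform.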
While working strictly with paths has become easier and you can pretty much ignore the peculiarities of different systems, you
can’t escape platform-specific code once you start using the files that those paths refer to.
For example, to open a file on Linux you might call open(), while on Windows you might call NtOpenFile(). Each of those
functions returns a different descriptor type that represents the opened resource: an int on Linux, a HANDLE on Windows (NtOpenFile() hands it back through a PHANDLE out-parameter).
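Here’s roughly what the Linux half of that looks like, as a sketch and not production code. It assumes a POSIX system; posix_fd, write_file_posix and read_file_posix are names I made up, and error handling is reduced to throwing on failure (no EINTR retries). A Windows version would need a second implementation built around HANDLE.

```cpp
#include <fcntl.h>      // open, O_* flags
#include <sys/types.h>  // ssize_t, mode_t
#include <unistd.h>     // read, write, close
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around a POSIX file descriptor (a plain int).
class posix_fd {
public:
    posix_fd(const char* path, int flags, mode_t mode = 0644)
        : fd_(::open(path, flags, mode)) {
        if (fd_ < 0) throw std::runtime_error("open failed");
    }
    ~posix_fd() { ::close(fd_); }
    posix_fd(const posix_fd&) = delete;
    posix_fd& operator=(const posix_fd&) = delete;
    int get() const { return fd_; }
private:
    int fd_;
};

// Writes the whole string, looping because write() may be partial.
void write_file_posix(const char* path, const std::string& data) {
    posix_fd fd(path, O_WRONLY | O_CREAT | O_TRUNC);
    std::size_t done = 0;
    while (done < data.size()) {
        ssize_t n = ::write(fd.get(), data.data() + done, data.size() - done);
        if (n < 0) throw std::runtime_error("write failed");
        done += static_cast<std::size_t>(n);
    }
}

// Reads the whole file in 4 KiB chunks until read() reports end of file.
std::string read_file_posix(const char* path) {
    posix_fd fd(path, O_RDONLY);
    std::string out;
    char buf[4096];
    for (;;) {
        ssize_t n = ::read(fd.get(), buf, sizeof buf);
        if (n < 0) throw std::runtime_error("read failed");
        if (n == 0) break;
        out.append(buf, static_cast<std::size_t>(n));
    }
    return out;
}
```

None of this compiles on Windows as-is, which is exactly the pain point: the path handling is portable, the file handling isn’t.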
Then there are also different types of IO that you can perform: “regular” IO, where you read or write one buffer at a time, and “scatter-gather” IO, where you read into or write from multiple buffers in one call. You can also have different types of files: simple/regular files and memory-mapped files.
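For a feel of what scatter-gather IO means at the OS level, here’s a small sketch using POSIX writev() and readv(). It assumes a POSIX system, the function names are mine, and it treats a short read or write as a failure instead of retrying, which is good enough for a demo on a regular file but not for real code.

```cpp
#include <fcntl.h>    // open, O_* flags
#include <sys/uio.h>  // iovec, writev, readv
#include <unistd.h>   // close
#include <cstddef>
#include <string>
#include <utility>

// Gather write: two separate buffers go to the file in one writev() call.
bool write_gathered(const char* path, const std::string& header,
                    const std::string& body) {
    int fd = ::open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;
    iovec iov[2];
    iov[0].iov_base = const_cast<char*>(header.data());
    iov[0].iov_len = header.size();
    iov[1].iov_base = const_cast<char*>(body.data());
    iov[1].iov_len = body.size();
    ssize_t n = ::writev(fd, iov, 2);
    ::close(fd);
    return n == static_cast<ssize_t>(header.size() + body.size());
}

// Scatter read: one readv() call fills two separate buffers.
// Returns a pair of empty strings on any failure or short read.
std::pair<std::string, std::string> read_scattered(const char* path,
                                                   std::size_t header_len,
                                                   std::size_t body_len) {
    std::string header(header_len, '\0');
    std::string body(body_len, '\0');
    int fd = ::open(path, O_RDONLY);
    if (fd < 0) return {};
    iovec iov[2];
    iov[0].iov_base = &header[0];
    iov[0].iov_len = header.size();
    iov[1].iov_base = &body[0];
    iov[1].iov_len = body.size();
    ssize_t n = ::readv(fd, iov, 2);
    ::close(fd);
    if (n != static_cast<ssize_t>(header_len + body_len)) return {};
    return {header, body};
}
```

The win is that a header and a payload living in different buffers don’t have to be copied into one contiguous buffer before hitting the file.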
So each OS provides different APIs and flags that give you access to all those nice features. I guess you can see that we won’t have any easy days dealing with this mess.
But you can use std::fstream for IO⌗
Yeah, std::fstream’s interface works with std::filesystem::path, but how good is it when performance matters and you have
to churn through files at several Gb/s? Turns out it’s not that great, as we can see from examples like this one or this one.
So in the end it seems we still have to grok the POSIX file APIs if we’re developing for Linux, the MS documentation if we aim for Windows, or both if we support both, diving into the depths of low-level code either way. That’s a tough one when you have to deliver something fast and time is running out.
LLFIO⌗
Turns out there’s someone else who experienced this pain too: Niall Douglas, who wrote LLFIO.
I’ll quote Niall on LLFIO’s purpose:
Herein lies my proposed zero whole machine memory copy file i/o and filesystem library for the C++ standard, intended for storage devices with ~1 microsecond 4Kb transfer latencies and those supporting Storage Class Memory (SCM)/Direct Access Storage (DAX).
High expectations, I know!
While going through his library’s source code and documentation I’ve learned that some of its features are slated for arrival in 2026
in the standard library: file_handle and mapped_file_handle.
With those two abstractions, besides the usual read()/write(), you get a bunch of other functionality, like scatter-gather IO,
file locks, support for extents, anonymous/temporary files, very good support for mapped files, etc. And the concepts that we all
know and love still apply. As their names imply, you would want to use a file_handle when working with “regular” files; for
example, if you build an application that handles tons of emails, a file_handle should be good enough. When you need memory
regions backed by a file, a mapped_file_handle should be used, for example in a database where you might optimize for speed,
or maybe in a program packer/loader where you want to map files into memory.
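If memory-mapped files are new to you, this is the OS mechanism that something like mapped_file_handle sits on top of. To be clear, the sketch below is plain POSIX mmap(), not LLFIO’s API, and count_byte_mmap is a name I invented for the example. It assumes a POSIX system.

```cpp
#include <fcntl.h>     // open, O_RDONLY
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close
#include <algorithm>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: counts occurrences of `needle` in a file by mapping
// it read-only and scanning the mapping like an ordinary byte array.
std::size_t count_byte_mmap(const char* path, char needle) {
    int fd = ::open(path, O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed");

    struct stat st{};
    if (::fstat(fd, &st) != 0) {
        ::close(fd);
        throw std::runtime_error("fstat failed");
    }
    if (st.st_size == 0) {  // mmap() rejects zero-length mappings
        ::close(fd);
        return 0;
    }

    const auto len = static_cast<std::size_t>(st.st_size);
    void* addr = ::mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) throw std::runtime_error("mmap failed");

    const char* bytes = static_cast<const char*>(addr);
    const auto hits =
        static_cast<std::size_t>(std::count(bytes, bytes + len, needle));
    ::munmap(addr, len);
    return hits;
}
```

There’s no read() loop and no intermediate buffer: the kernel pages the file in as the scan touches it, which is why databases and loaders like this approach.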
From my time with LLFIO I can tell that the library is easy to use. If you have previous experience with C++, and especially if you have
used the OS-specific APIs and concepts that have to do with IO, it should be easy to translate what you already know to
LLFIO’s interfaces.
Unfortunately I haven’t dived into benchmarking to really check its performance claims, but from the peeks I got into its source code the performance should really be there. There are some benchmarks that the author provides, but as always when you have to do something high-performance, my advice is to test, if possible, on the hardware and OS that you expect your software to run on, and only then make a decision. You’ll get much more realistic results for your specific workload.
But it’s going to be part of the standard library from 2026, and I can’t wait that long?⌗
It should be quite easy to pull it into your project, especially if you use CMake.
But what about Boost?⌗
Unfortunately I don’t have much experience with Boost’s abstractions over file handles and mmapped files. When I had to deal with tasks such as working with files I either rolled my own abstractions or used what was already available in the project.
Closing thoughts⌗
Overall I’m pleased to know that we’re close to having a “standard library” way to interact with files (C’s FILE doesn’t count in my
book). Things like this and the lack of networking support really grind my gears sometimes … and thank God that fmt’s format
made it.