DwarFS
The Deduplicating Warp-speed Advanced Read-only File System.
A fast high compression read-only file system for Linux and Windows.
Table of contents
- Overview
- History
- Building and Installing
- Usage
- Using the Libraries
- Windows Support
- macOS Support
- Use Cases
- Dealing with Bit Rot
- Extended Attributes
- Comparison
- Performance Monitoring
- Other Obscure Features
- Stargazers over Time
Overview
DwarFS is a read-only file system with a focus on achieving very high compression ratios in particular for very redundant data.
This probably doesn't sound very exciting, because if it's redundant, it should compress well. However, I found that other read-only, compressed file systems don't do a very good job at making use of this redundancy. See the Comparison section below for how DwarFS stacks up against other compressed file systems.
DwarFS also doesn't compromise on speed; for my use cases, I've found it to be on par with or faster than SquashFS. For my primary use case, DwarFS compression is an order of magnitude better than SquashFS compression, it's 6 times faster to build the file system, it's typically faster to access files on DwarFS, and it uses fewer CPU resources.
To give you an idea of what DwarFS is capable of, here's a quick comparison of DwarFS and SquashFS on a set of video files with a total size of 39 GiB. The twist is that each unique video file has two sibling files with a different set of audio streams (this is an actual use case). So there's redundancy in both the video and audio data, but as the streams are interleaved and identical blocks are typically very far apart, it's challenging to make use of that redundancy for compression. SquashFS essentially fails to compress the source data at all, whereas DwarFS is able to reduce the size by almost a factor of 3, which is close to the theoretical maximum:
$ du -hs dwarfs-video-test
39G dwarfs-video-test
$ ls -lh dwarfs-video-test.*fs
-rw-r--r-- 1 mhx users 14G Jul 2 13:01 dwarfs-video-test.dwarfs
-rw-r--r-- 1 mhx users 39G Jul 12 09:41 dwarfs-video-test.squashfs
Furthermore, when mounting the SquashFS image and performing a random-read throughput test using fio-3.34, both squashfuse and squashfuse_ll top out at around 230 MiB/s:
$ fio --readonly --rw=randread --name=randread --bs=64k --direct=1 \
--opendir=mnt --numjobs=4 --ioengine=libaio --iodepth=32 \
--group_reporting --runtime=60 --time_based
[...]
READ: bw=230MiB/s (241MB/s), 230MiB/s-230MiB/s (241MB/s-241MB/s), io=13.5GiB (14.5GB), run=60004-60004msec
In comparison, DwarFS manages to sustain random read rates of 20 GiB/s:
READ: bw=20.2GiB/s (21.7GB/s), 20.2GiB/s-20.2GiB/s (21.7GB/s-21.7GB/s), io=1212GiB (1301GB), run=60001-60001msec
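For reference, mounting the DwarFS image for such a test takes a single invocation of the FUSE driver; mnt here is the same mount point the fio job's --opendir option refers to:
$ dwarfs dwarfs-video-test.dwarfs mnt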
Distinct features of DwarFS are:
- Clustering of files by similarity using a similarity hash function. This makes it easier to exploit the redundancy across file boundaries.
- Segmentation analysis across file system blocks in order to reduce the size of the uncompressed file system. This saves memory when using the compressed file system and thus potentially allows for higher cache hit rates as more data can be kept in the cache.
- Categorization framework to categorize files or even fragments of files and then process individual categories differently. For example, this allows you to not waste time trying to compress incompressible files or to compress PCM audio data using FLAC compression (see the sketch after this list).
- Highly multi-threaded implementation. Both the file system creation tool as well as the FUSE driver are able to make good use of the many cores of your system.
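To sketch what the categorization framework looks like in practice (a hedged example: paths are made up and the exact option spelling may differ between versions, so check mkdwarfs --help), PCM audio can be detected and compressed with FLAC while everything else uses the default compression:
# --categorize enables the categorizers; the per-category compression
# spelling (-C pcmaudio/waveform::flac:level=8) may vary between versions
$ mkdwarfs -i audio -o audio.dwarfs --categorize -C pcmaudio/waveform::flac:level=8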
History
I started working on DwarFS in 2013 and my main use case and major motivation was that I had several hundred different versions of Perl that were taking up something around 30 gigabytes of disk space, and I was unwilling to spend more than 10% of my hard drive keeping them around for when I happened to need them.
Up until then, I had been using Cromfs for squeezing them into a manageable size. However, I was getting more and more annoyed by the time it took to build the filesystem image and, to make things worse, more often than not it was crashing after about an hour or so.
I had obviously also looked into SquashFS, but never got anywhere close to the compression rates of Cromfs.
This alone wouldn't have been enough to get me into writing DwarFS, but at around the same time, I was pretty obsessed with the recent developments and features of newer C++ standards and really wanted a C++ hobby project to work on. Also, I've wanted to do something with FUSE for quite some time. Last but not least, I had been thinking about the problem of compressed file systems for a bit and had some ideas that I definitely wanted to try.
The majority of the code was written in 2013, then I did a couple of cleanups, bugfixes and refactors every once in a while, but I never really got it to a state where I would feel happy releasing it. It was too awkward to build with its dependency on Facebook's (quite awesome) folly library and it didn't have any documentation.
Digging out the project again this year, things didn't look as grim as they used to. Folly now builds with CMake and so I just pulled it in as a submodule. Most other dependencies can be satisfied from packages that should be widely available. And I've written some rudimentary docs as well.
Building and Installing
Note to Package Maintainers
DwarFS should usually build fine with minimal changes out of the box.
If it doesn't, please file an issue. I've set up CI jobs using Docker images for Ubuntu (22.04 and 24.04), Fedora Rawhide and Arch that can help with determining an up-to-date set of dependencies.
Note that building from the release tarball requires fewer dependencies than building from the git repository; notably, the ronn tool as well as Python and the mistletoe Python module are not required when building from the release tarball.
There are some things to be aware of:
- There's a tendency to try and unbundle the folly and fbthrift libraries that are included as submodules and are built along with DwarFS. While I agree with the sentiment, it's unfortunately a bad idea. Besides the fact that folly does not make any claims about ABI stability (i.e. you can't just dynamically link a binary built against one version of folly against another version), it's not even possible to safely link against a folly library built with different compile options. Even subtle differences, such as the C++ standard version, can cause run-time errors. See this issue for details. Currently, it is not even possible to use external versions of folly/fbthrift, as DwarFS builds minimal subsets of both libraries; these are bundled in the dwarfs_common library and they are strictly used internally, i.e. none of the folly or fbthrift headers are required to build against DwarFS' libraries.
- Similar issues can arise when using a system-installed version of GoogleTest. GoogleTest itself recommends that it be downloaded as part of the build. However, you can use the system-installed version by passing -DPREFER_SYSTEM_GTEST=ON to the cmake call. Use at your own risk.
- For other bundled libraries (namely fmt, parallel-hashmap, range-v3), the system-installed version is used as long as it meets the minimum required version. Otherwise, the preferred version is fetched during the build.
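For example, opting into the system GoogleTest is done at configure time; the build directory and generator below are just illustrative choices:
# PREFER_SYSTEM_GTEST is the option mentioned above; use at your own risk
$ cmake -S . -B build -GNinja -DPREFER_SYSTEM_GTEST=ON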
Prebuilt Binaries
Each release has pre-built, statically linked binaries for Linux-x86_64, Linux-aarch64 and Windows-AMD64 available for download. These should run without any dependencies and can be useful especially on older distributions where you can't easily build the tools from source.
Universal Binaries
In addition to the binary tarballs, there's a universal binary available for each architecture. These universal binaries contain all tools (mkdwarfs, dwarfsck, dwarfsextract and the dwarfs FUSE driver) in a single executable. These executables are compressed using upx, so they are much smaller than the individual tools combined. However, it also means the binaries need to be decompressed each time they are run, which can have a significant overhead. If that is an issue, you can either stick to the "classic" individual binaries or you can decompress the universal binary, e.g.:
upx -d dwarfs-universal-0.7.0-Linux-aarch64
The universal binaries can be run through symbolic links named after the proper tool, e.g.:
$ ln -s dwarfs-universal-0.7.0-Linux-aarch64 mkdwarfs
$ ./mkdwarfs --help
This also works on Windows if the file system supports symbolic links:
> mklink mkdwarfs.exe dwarfs-universal-0.7.0-Windows-AMD64.exe
> .\mkdwarfs.exe --help
Alternatively, you can select the tool by passing --tool=<name> as the first argument on the command line:
> .\dwarfs-universal-0.7.0-Windows-AMD64.exe --tool=mkdwarfs --help
Note that just like the dwarfs.exe Windows binary, the universal Windows binary depends on the winfsp-x64.dll from the WinFsp project. However, for the universal binary, the DLL is loaded lazily, so you can still use all other tools without the DLL.
See the Windows Support section for more details.
Dependencies
DwarFS uses CMake as a build tool.
It uses both Boost and Folly, though the latter is included as a submodule since very few distributions actually offer packages for it. Folly itself has a number of dependencies, so please check the folly documentation for an up-to-date list.
It also uses Facebook Thrift, in particular the frozen library, for storing metadata in a highly space-efficient, memory-mappable and well-defined format. It's also included as a submodule, and we only build the compiler and a very reduced library that contains just enough for DwarFS to work.
Other than that, DwarFS really only depends on FUSE3 and on a set of compression libraries that Folly already depends on (namely lz4, zstd and liblzma).
The dependency on googletest will be automatically resolved if you build with tests.
A good starting point for apt-based systems is probably:
$ apt install \
gcc \
g++ \
clang \
git \
ccache \
ninja-build \
cmake \
make \
bison \
flex \
fuse3 \
pkg-config \
binutils-dev \
libacl1-dev \
libarchive-dev \
libbenchmark-dev \
libboost-chrono-dev \
libboost-context-dev \
libboost-filesystem-dev \
libboost-iostreams-dev \
libboost-program-options-dev \
libboost-regex-dev \
libboost-system-dev \
libboost-thread-dev \
libbrotli-dev \
libevent-dev \
libhowardhinnant-date-dev \
libjemalloc-dev \
libdouble-conversion-dev \
libiberty-dev \
liblz4-dev \
liblzma-dev \
libzstd-dev \
libxxhash-dev \
libmagic-dev \
libparallel-hashmap-dev \
librange-v3-dev \
libssl-dev \
libunwind-dev \
libdwarf-dev \
libelf-dev \
libfmt-dev \
libfuse3-dev \
libgoogle-glog-dev \
libutfcpp-dev \
libflac++-dev \
nlohmann-json3-dev
Note that when building with gcc, the optimization level will be set to -O2 instead of the CMake default of -O3 for release builds. At least with versions up to gcc-10, the -O3 build is up to 70% slower than a build with -O2.
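With the dependencies installed, a from-git build is a regular out-of-tree CMake build. The following is an illustrative sketch: the recursive clone pulls in the bundled folly/fbthrift submodules described above, and the test option name is an assumption you should verify against the project's build documentation:
$ git clone --recurse-submodules https://github.com/mhx/dwarfs
$ cd dwarfs
$ cmake -S . -B build -GNinja -DCMAKE_BUILD_TYPE=Release
# add -DWITH_TESTS=ON to also build the tests (option name assumed)
$ cmake --build build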