Skip to content

Blog

Moss Unlocked

Well, it’s not all doom and gloom these days. We’ve actually made some significant progress in the last few days, so it seems a good time to share a progress update.

Extracting content from moss archives

Oh yeah, that totally happened. So, we can now successfully build moss packages from boulder and then extract them to disk once again with moss. This might sound totally uninteresting, but it demonstrates that our format is actually working as intended.

Admittedly the code is super rough within moss and somewhat proof of concept, however we’re able to extract the contents of the moss archive and rebuild the layout on disk.

Well, quirky new format for one. A moss archive currently consists of 4 “payloads”, or major sections:

  • MetaPayload

    Contains all package information, with strongly typed keys.

  • IndexPayload

    Contains the IDs of all unique files (hash) and their offsets within the ContentPayload

  • LayoutPayload

    A sequence of specialised structs describing the final “layout” of the package on disk, with attributes, paths, and for regular files, the ID of the file in the ContentPayload to build this file from.

  • ContentPayload

    A binary blob containing every unique file from the package, in an order described by the IndexPayload. The files are stored sequentially with no gaps.

Additionally, each payload is independently compressed using zstd. In order to extract files to disk, we must first decompress ContentPayload to a temporary file. Next, we blit each file from the “megablob” to the cache store, using the IndexPayload to understand the offsets. Finally, we apply the instructions in LayoutPayload to construct the final layout on disk, hardlinking the cache assets into their final locations, setting attributes, etc.

Net result? Deduplication on a per package basis, and a system-wide deduplication policy allowing sharing of identical assets on disk between multiple packages. This will also power our core update mechanism, whereby each update is atomic, and is simply the difference on disk from the previous version, permitting a powerful rollback mechanism.

Multiple

There are areas where we’re doing things inefficiently, and we’ll certainly improve that in future revisions of the important. For example, IndexPayload actually wastes some bytes by storing redundant information that can be calculated at runtime. Additionally, we want to use the zstd C APIs directly to gain the level of control we actually need. We’re also going to wrap the copy_file_range syscall to make extraction of the content payload more efficient and not rely on userspace copies.

However, we’re working towards a prealpha, and some inefficiencies are OK. Our first port of call will be a prealpha image constructed from .stone files, produced by boulder, installed by moss. This will validate our toolchain and tooling and serve as a jumping off point for the project.

Stay tuned, there is a whole bunch of awesome coming now that moss is officially unlocked and progressing.

I want to thank everyone who is currently supporting the project. I also want to personally thank you for your understanding of the setbacks of real life, given the difficult times myself and my family have been going through. I hope it is clear that I remain committed to the project and it’s future, which is why we’re transparently run and funded via OpenCollective.

Despite the rough times, work continues, and awesome people join our ranks on a regular basis. Stability is on the immediate horizon and work with Serpent OS grows exponentially. You can be part of our journey, and help us build an amazing community and project that outlives us all.

Optimising Package Distribution

Getting updates as fast as possible to users has made deltas a popular and sought after feature for distributing packages. Over the last couple of days, I’ve been investigating various techniques we can look at to support deltas in moss.

Minimising the size of updates is particularly valuable where files are downloaded many times and even better if they’re updated infrequently. With a rolling release, packages will be updated frequently, so creating deltas can become resource intensive, especially if supporting updates over a longer period of time. Therefore it’s important to get the right balance between compression speed, decompression memory and minimising file size.

Package prioritiesHow best to meet these needs
DevelopersCreation speedQuickly created packages
UsersFile size and update speedSize minimised deltas

From the users point of view, minimising file size and upgrade time are important priorities, but for a developer, the speed at which packages are created and indexed is vital to progression. Deltas are different to packages in that they aren’t required immediately, so there’s minimal impact in taking longer to minimise their size. By getting deltas right, we can trade-off the size of packages to speed up development, while users will not be affected and fetch only size optimised deltas.

QtWebEngine provides a reasonable test case where the package is a mix of binaries, resources and translations, but above average in size (157.3MB uncompressed). The first trade-off for speed over size has already been made by incorporating zstd in moss over xz, where even with max compression zstd is already 5.6% larger than using xz. This is of course due to the amazing decompression speeds where zstd is magnitudes faster.

Compression levels with zstd

With maximum compression, large packages can take over a minute to compress. With a moderate increase in size, one can reduce compression time by 2-10x. While making me happier as a developer, it does create extra network load during updates.

Full Packagezstd -16 -T0zstd -16zstd -19 -T0zstd -19zstd -22xz -9
Time to create5.4s26.8s27.8s56.0s70.6s66.5s
Size of package52.6MB52.6MB49.2MB49.2MB48.4MB45.9MB

There are two basic methods for deltas. One simple method is to include only files that have been changed since the earlier release. With reproducible builds, it is typical to create the same file from the same inputs. However, with a rolling release model, files will frequently have a small change from dependency changes and new versions of the package itself. In these circumstances the delta size starts to get closer to the full package anyway. As a comparison to other delta techniques, this approach resulted in a 38.2MB delta as it was a rebuild of the same version at a different time so the resources and translations were unchanged (and therefore omitted from the delta).

An alternative is a binary diff, which is a significant improvement when files have small changes between releases. bsdiff has long been used for this purpose and trying it out (without knowing much about it) I managed to create a delta of 33.2MB, not a bad start at all.

To highlight the weakness of the simple method, when you compare the delta across a version change, the simple delta was only a 7% reduction of the full package (as most files have changed), while using an optimal binary diff, it still managed to achieve a respectable 31% size reduction.

While looking into alternatives, I happened to stumble across a new feature in zstd which can be used to create deltas. As we already use zstd heavily it should make integration easier. --patch-from allows zstd to use the old uncompressed package as a dictionary to create the newer package. In this way common parts between the releases will be reused in order to reduce the file size. Playing around I quickly achieved the same size as bsdiff, and with a few tweaks was able to further reduce the delta by a further 23.5%! The best part is that it has the same speedy decompression as zstd, so it will recreate most packages from deltas in the blink of an eye!

Deltaonly changed filesbsdiffzstd -19zstd -22zstd -22 —zstd=chainLog=30
Time to create60.8s153.0s85.5s111.6s131.8s
Size of delta38.2MB33.2MB33.3MB28.5MB25.4MB

There’s certainly a lot of information to digest, but the next step is to integrate a robust delta solution into the moss format. I really like the zstd approach, where you can tune for speed with an increase in size if desired. With minimising on delta size, users can benefit from smaller updates while developers can benefit from faster package creation times.

Some final thoughts for future consideration:

  • zstd has seen many improvements over the years, so I believe that ratios and performance will see incremental improvements over time. Release 1.4.7 already brought significant improvements to deltas (which are reflected in this testing).
  • The highest compression levels (--ultra) are single threaded in zstd, so delta creation can be done in parallel to maximise utilisation.
  • Over optimising the tunables can have a negative impact on both speed and size. As an example, --zstd=targetLength=4096 did result in a 2KB reduction in size at the same speed, but when applied to different inputs (kernel source tree), it not only made it larger by a few hundred bytes, but added 4 minutes to delta creation!
  • Memory usage of applying deltas can be high for large packages (1.74GB for the kernel source tree) as it has to ingest the full size of the original uncompressed package. It is certainly possible to split up payloads with some delta users even creating patches on a per file basis. It is a bit more complicated when library names change version numbers each release with only the SONAME symlink remaining unchanged.
  • There’s always the option to repack packages at higher compression levels later (when deltas are created). This solves getting the package ‘live’ ASAP and minimises the size (eventually), but adds some complication.

Moss Format: Read Write Support

It’s been 8 days since our last blogpost and a lot of development work has happened in that time. Specifically, we’ve completely reworked the internals of the moss-format module to support read/write operations.. Which means installation is coming soon (TM)

Development work on moss-format

So, many commits have been made to the core repositories, however the most important project to focus on right now is moss-format, which we used to define and implement our binary and source formats. This module is shared between boulder, our build tool, and moss, our package manager.

We’ve removed the old enumeration approach from the Reader class, instead requiring that it processes data payloads in-place, deferring reading the content payload streams. We’ve also enforced strong typing to allow safe and powerful APIs:

import moss.format.binary.payload.meta : MetaPayload;
auto reader = new Reader(File(argv[0], "rb"));
auto metadata = reader.payload!MetaPayload();

Right now we can read and write our MetaPayload from and to the stream, allowing us to encode & decode the metadata associated with the package, with strong type information (i.e. Uint64, String, etc.)

We need to restore the IndexPayload, LayoutPayload and ContentPayload definitions. The first two are simply data payloads and will largely follow the design of the newly reimplemented MetaPayload. Then we restore ContentPayload support, and this will allow the next steps: unpack, install.

Many of the babysteps required are done now, which power our binary format. The design of the API is done in a way which will allow powerful manipulation via the std.algorithm and std.range APIs, enabling extremely simple and reliable installation routines.

It might seem odd that we’ve spent so much time on the core format, however I should point out that the design of the format is central to the OS design. Our installation routine is not unpacking of an archive.

With our binary format, the stream contains multiple PayloadHeaders, with fixed lengths, type, tag, version, and compression information. It is up to each Payload implementation to then parse the binary data contained within. Currently our Payloads are compressed using ZSTD, though other compression algorithms are supported.

So, we have the MetaPayload for metadata. Additionally, we encode all unique files in the package in a single compressed payload, the ContentPayload. The offset to each unique file (by hash) is stored within the IndexPayload, and the filesystem layout is encoded as data within the LayoutPayload.

In order to unpack a moss package, the ContentPayload blob will be decompressed to a temporary location. Then, each struct within the IndexPayload can be used to copy portions (copy_file_range) of the blob to the cache store for permanence. We skip each Index if its already present within the cache. Finally, the target filesystem is populated with a view of all installed packages using the LayoutPayload structs, creating a shallow installation using hardlinks and directories.

The net result is deep deduplication, atomic updates, and flexibility for the user. Once we add transactions it’ll be possible to boot an older version of the OS using the deduplication capabilities for offline recovery. Additionally there is no requirement for file deletion, rename or modification for an update to succeed.

Huge progress. Major excitement. Such wow. Soon alphas.

Unlocking Moss

Wait, what? Another blog post? In the same WEEK? Yeah totally doing that now. So, this is just another devlog update but there have been some interesting updates that we’d like to share.

Thanks to some awesome work from Peter, we now have LDC (The LLVM D Lang Compiler) present in the stage3 bootstrap. To simplify the process we use the official binary release of LDC to bootstrap LDC for Serpent OS.

In turn, this has allowed us to get to a point where we can now build moss and boulder within stage3. This is super important, as we’ll use the stage3 chroot and boulder to produce the binary packages that create stage4.

Some patching has taken place to prevent using ld.gold and instead use lld to integrate with our toolchain decisions.

A wild LDC appears
A wild LDC appears

Originally our prototype moss format only contained a ContentPayload for files, and a MetaPayload for metadata, package information, etc. As a result, we opted for simple structs, basic case handling and an iterable Reader implementation.

As the format expanded, we bolted deduplication in as a core feature. To achieve this, we converted the ContentPayload into a “megablob”, i.e. every unique file in the package, one after the other, all compressed in one operation. We then store the offsets and IDs of these files within an IndexPayload to allow splitting the “megablob” into separate, unique assets. Consequently, we added a LayoutPayload which describes the final file system layout of the package, referencing the unique ID (hash) of the asset to install.

So, while the format grew into something we liked, the code supporting it became very limiting. After many internal debates and discussions, we’re going to approach the format from a different angle on the code front.

It will no longer be necessary/possible to iterate payloads in sequence within a moss archive, instead we’ll preload the data (unparsed) and stick it aside when reading the file, winding it to the ContentPayload segment if found. After initial loading of the file is complete, the Reader API will support retrieval (and lazy unpacking ) of a data segment. In turn this will allow code to “grab” the content, index and layout payloads and use advanced APIs to cache assets, and apply them to disk in a single blit operation.

In short, we’ve unlocked the path to installing moss packages while preserving the advanced features of the format. The same new APIs will permit introspection of the archives metadata, and of course, storing these records in a stateful system database.

Oh you’re quite welcome :P Hopefully now you can see our plan, and that we’re on track to meet our not-28th target. Sure, some code needs throwing away, but all codebases are evolutionary. Our major hurdle has been solved (mentally) - now it’s just time to implement it and let the good times roll.

Cleanups Complete

Well, we’re officially back to working around the clock. After spending some time optimising my workflow, I’ve been busy getting the entire codebase cleaned up, so we can begin working towards an MVP.

Lots and lots of work

Since Friday, I’ve been working extensively on cleaning up the codebase for the following projects:

  • boulder
  • moss
  • moss-core
  • moss-format

As it now stands, 1 lint issue (a non-issue, really) exists across all 4 projects. A plethora of issues have been resolved, ranging from endian correctness in the format to correct idiomatic D-lang integration and module naming.

Granted, cleanups aren’t all that sexy. Peter has been updating many parts of the bootstrap project, introducing systemd 247.3, LLVM 11.0.1, etc. We now have all 3 stages building correctly again for the x86_64 target.

Yikes, tough audience! So we’ve formed a new working TODO which will make its way online as a public document at some point. The first stage, cleanups, is done. Next is to restore feature development, working specifically on the extraction and install routines. From there, much work and cadence will be unlocked, allowing us to work towards an MVP.

TODO

I know, it makes me feel all … cute and professional. At the moment we’re cooking up a high level description of how an MVP demonstration could be deployed. While most of the features are ambiguous, our current ambition is to deploy a preinstalled QCOW2 Serpent OS image as a prealpha to help validate moss/boulder/etc.

It’ll be ugly. Likely slow. Probably insecure as all hell. But it’ll be something. It’ll allow super basic interaction with moss, and a small collection of utilities that can be used in a terminal environment. No display whatsoever. That can come in a future iteration :)

ETA? Definitely not by the 28th of February. When it’s done.