Baking Assets: Embedding files into compiled binaries

While developing the chimer web server and the website that runs on it, I have often needed to read files from disk. Reading configuration files and files requested in HTTP URIs are just some examples. For everything served on the website so far, though, I've taken a slightly different approach: embedding the files I would otherwise have loaded (e.g., with fstream or fopen/fread) directly into the executable. I learned about this technique mostly in the context of video game development. We used bin2c in a computer graphics course at UC Davis, but the new C23 #embed directive is probably a more widely known way of doing this kind of thing. In code, the effect is that the file has already been pre-loaded into a string. One example is serving the picture of myself in the top left corner of this page.

/// File: me_jpg_example.cpp
/// Creator: nathan@nathanieljwright.com
/// Copyright (C) 2024 Nathaniel Wright. All Rights Reserved.
#include "chimer/app.hpp"
#include "chimer/net/http/common/header_common.hpp"
#include "chimer/thread/cancellation_token_source.hpp"
#include "img/me_jpg.hpp"

using namespace chimer;
using namespace chimer::thread;
using namespace chimer::net::http;
std::unique_ptr<app> srv;
cancellation_token_source token_source;
int main(int argc, char *argv[])
{
	srv = std::make_unique<app>(token_source.token());
	srv->register_action(
		http_method::get, "/me.jpg", [](const http_request &req, http_response &res, const cancellation_token &token) {
			res.headers[header::content_type] = image::jpeg;
			res.set_body(me_jpg);
		});
	srv->run(argc, argv);
}
The me_jpg variable is just a std::string_view that wraps a std::array holding all the bytes of the jpg file. I won't post the entire me_jpg array here. It is too large and isn't interesting to look at. Instead, here is the generated array and string_view for the code snippet above! It's still large, but a much more manageable size for this post. The astute reader will realize that, since I can post the generated array for this code snippet, the embedding strategy being described is used for these blog posts as well.
#pragma once
#include <array>
#include <string_view>
static const constexpr auto me_jpg_example_arr = std::array<char, 861>{(char)0x2F,
	(char)0x2F, (char)0x2F, (char)0x20, (char)0x46, (char)0x69, (char)0x6C, (char)0x65, (char)0x3A, (char)0x20, (char)0x6D, (char)0x65,
	(char)0x5F, (char)0x6A, (char)0x70, (char)0x67, (char)0x5F, (char)0x65, (char)0x78, (char)0x61, (char)0x6D, (char)0x70, (char)0x6C,
	(char)0x65, (char)0x2E, (char)0x63, (char)0x70, (char)0x70, (char)0x0A, (char)0x2F, (char)0x2F, (char)0x2F, (char)0x20, (char)0x43,
	(char)0x72, (char)0x65, (char)0x61, (char)0x74, (char)0x6F, (char)0x72, (char)0x3A, (char)0x20, (char)0x6E, (char)0x61, (char)0x74,
	(char)0x68, (char)0x61, (char)0x6E, (char)0x40, (char)0x6E, (char)0x61, (char)0x74, (char)0x68, (char)0x61, (char)0x6E, (char)0x69,
	(char)0x65, (char)0x6C, (char)0x6A, (char)0x77, (char)0x72, (char)0x69, (char)0x67, (char)0x68, (char)0x74, (char)0x2E, (char)0x63,
	(char)0x6F, (char)0x6D, (char)0x0A, (char)0x2F, (char)0x2F, (char)0x2F, (char)0x20, (char)0x43, (char)0x6F, (char)0x70, (char)0x79,
	(char)0x72, (char)0x69, (char)0x67, (char)0x68, (char)0x74, (char)0x20, (char)0x28, (char)0x43, (char)0x29, (char)0x20, (char)0x32,
	(char)0x30, (char)0x32, (char)0x34, (char)0x20, (char)0x4E, (char)0x61, (char)0x74, (char)0x68, (char)0x61, (char)0x6E, (char)0x69,
	(char)0x65, (char)0x6C, (char)0x20, (char)0x57, (char)0x72, (char)0x69, (char)0x67, (char)0x68, (char)0x74, (char)0x2E, (char)0x20,
	(char)0x41, (char)0x6C, (char)0x6C, (char)0x20, (char)0x52, (char)0x69, (char)0x67, (char)0x68, (char)0x74, (char)0x73, (char)0x20,
	(char)0x52, (char)0x65, (char)0x73, (char)0x65, (char)0x72, (char)0x76, (char)0x65, (char)0x64, (char)0x2E, (char)0x0A, (char)0x23,
	(char)0x69, (char)0x6E, (char)0x63, (char)0x6C, (char)0x75, (char)0x64, (char)0x65, (char)0x20, (char)0x26, (char)0x71, (char)0x75,
	(char)0x6F, (char)0x74, (char)0x3B, (char)0x63, (char)0x68, (char)0x69, (char)0x6D, (char)0x65, (char)0x72, (char)0x2F, (char)0x61,
	(char)0x70, (char)0x70, (char)0x2E, (char)0x68, (char)0x70, (char)0x70, (char)0x26, (char)0x71, (char)0x75, (char)0x6F, (char)0x74,
	(char)0x3B, (char)0x0A, (char)0x23, (char)0x69, (char)0x6E, (char)0x63, (char)0x6C, (char)0x75, (char)0x64, (char)0x65, (char)0x20,
	(char)0x26, (char)0x71, (char)0x75, (char)0x6F, (char)0x74, (char)0x3B, (char)0x63, (char)0x68, (char)0x69, (char)0x6D, (char)0x65,
	(char)0x72, (char)0x2F, (char)0x6E, (char)0x65, (char)0x74, (char)0x2F, (char)0x68, (char)0x74, (char)0x74, (char)0x70, (char)0x2F,
	(char)0x63, (char)0x6F, (char)0x6D, (char)0x6D, (char)0x6F, (char)0x6E, (char)0x2F, (char)0x68, (char)0x65, (char)0x61, (char)0x64,
	(char)0x65, (char)0x72, (char)0x5F, (char)0x63, (char)0x6F, (char)0x6D, (char)0x6D, (char)0x6F, (char)0x6E, (char)0x2E, (char)0x68,
	(char)0x70, (char)0x70, (char)0x26, (char)0x71, (char)0x75, (char)0x6F, (char)0x74, (char)0x3B, (char)0x0A, (char)0x23, (char)0x69,
	(char)0x6E, (char)0x63, (char)0x6C, (char)0x75, (char)0x64, (char)0x65, (char)0x20, (char)0x26, (char)0x71, (char)0x75, (char)0x6F,
	(char)0x74, (char)0x3B, (char)0x63, (char)0x68, (char)0x69, (char)0x6D, (char)0x65, (char)0x72, (char)0x2F, (char)0x74, (char)0x68,
	(char)0x72, (char)0x65, (char)0x61, (char)0x64, (char)0x2F, (char)0x63, (char)0x61, (char)0x6E, (char)0x63, (char)0x65, (char)0x6C,
	(char)0x6C, (char)0x61, (char)0x74, (char)0x69, (char)0x6F, (char)0x6E, (char)0x5F, (char)0x74, (char)0x6F, (char)0x6B, (char)0x65,
	(char)0x6E, (char)0x5F, (char)0x73, (char)0x6F, (char)0x75, (char)0x72, (char)0x63, (char)0x65, (char)0x2E, (char)0x68, (char)0x70,
	(char)0x70, (char)0x26, (char)0x71, (char)0x75, (char)0x6F, (char)0x74, (char)0x3B, (char)0x0A, (char)0x23, (char)0x69, (char)0x6E,
	(char)0x63, (char)0x6C, (char)0x75, (char)0x64, (char)0x65, (char)0x20, (char)0x26, (char)0x71, (char)0x75, (char)0x6F, (char)0x74,
	(char)0x3B, (char)0x69, (char)0x6D, (char)0x67, (char)0x2F, (char)0x6D, (char)0x65, (char)0x5F, (char)0x6A, (char)0x70, (char)0x67,
	(char)0x2E, (char)0x68, (char)0x70, (char)0x70, (char)0x26, (char)0x71, (char)0x75, (char)0x6F, (char)0x74, (char)0x3B, (char)0x0A,
	(char)0x0A, (char)0x75, (char)0x73, (char)0x69, (char)0x6E, (char)0x67, (char)0x20, (char)0x6E, (char)0x61, (char)0x6D, (char)0x65,
	(char)0x73, (char)0x70, (char)0x61, (char)0x63, (char)0x65, (char)0x20, (char)0x63, (char)0x68, (char)0x69, (char)0x6D, (char)0x65,
	(char)0x72, (char)0x3B, (char)0x0A, (char)0x75, (char)0x73, (char)0x69, (char)0x6E, (char)0x67, (char)0x20, (char)0x6E, (char)0x61,
	(char)0x6D, (char)0x65, (char)0x73, (char)0x70, (char)0x61, (char)0x63, (char)0x65, (char)0x20, (char)0x63, (char)0x68, (char)0x69,
	(char)0x6D, (char)0x65, (char)0x72, (char)0x3A, (char)0x3A, (char)0x74, (char)0x68, (char)0x72, (char)0x65, (char)0x61, (char)0x64,
	(char)0x3B, (char)0x0A, (char)0x75, (char)0x73, (char)0x69, (char)0x6E, (char)0x67, (char)0x20, (char)0x6E, (char)0x61, (char)0x6D,
	(char)0x65, (char)0x73, (char)0x70, (char)0x61, (char)0x63, (char)0x65, (char)0x20, (char)0x63, (char)0x68, (char)0x69, (char)0x6D,
	(char)0x65, (char)0x72, (char)0x3A, (char)0x3A, (char)0x6E, (char)0x65, (char)0x74, (char)0x3A, (char)0x3A, (char)0x68, (char)0x74,
	(char)0x74, (char)0x70, (char)0x3B, (char)0x0A, (char)0x73, (char)0x74, (char)0x64, (char)0x3A, (char)0x3A, (char)0x75, (char)0x6E,
	(char)0x69, (char)0x71, (char)0x75, (char)0x65, (char)0x5F, (char)0x70, (char)0x74, (char)0x72, (char)0x26, (char)0x6C, (char)0x74,
	(char)0x3B, (char)0x61, (char)0x70, (char)0x70, (char)0x26, (char)0x67, (char)0x74, (char)0x3B, (char)0x20, (char)0x73, (char)0x72,
	(char)0x76, (char)0x3B, (char)0x0A, (char)0x63, (char)0x61, (char)0x6E, (char)0x63, (char)0x65, (char)0x6C, (char)0x6C, (char)0x61,
	(char)0x74, (char)0x69, (char)0x6F, (char)0x6E, (char)0x5F, (char)0x74, (char)0x6F, (char)0x6B, (char)0x65, (char)0x6E, (char)0x5F,
	(char)0x73, (char)0x6F, (char)0x75, (char)0x72, (char)0x63, (char)0x65, (char)0x20, (char)0x74, (char)0x6F, (char)0x6B, (char)0x65,
	(char)0x6E, (char)0x5F, (char)0x73, (char)0x6F, (char)0x75, (char)0x72, (char)0x63, (char)0x65, (char)0x3B, (char)0x0A, (char)0x69,
	(char)0x6E, (char)0x74, (char)0x20, (char)0x6D, (char)0x61, (char)0x69, (char)0x6E, (char)0x28, (char)0x69, (char)0x6E, (char)0x74,
	(char)0x20, (char)0x61, (char)0x72, (char)0x67, (char)0x63, (char)0x2C, (char)0x20, (char)0x63, (char)0x68, (char)0x61, (char)0x72,
	(char)0x20, (char)0x2A, (char)0x61, (char)0x72, (char)0x67, (char)0x76, (char)0x5B, (char)0x5D, (char)0x29, (char)0x0A, (char)0x7B,
	(char)0x0A, (char)0x09, (char)0x73, (char)0x72, (char)0x76, (char)0x20, (char)0x3D, (char)0x20, (char)0x73, (char)0x74, (char)0x64,
	(char)0x3A, (char)0x3A, (char)0x6D, (char)0x61, (char)0x6B, (char)0x65, (char)0x5F, (char)0x75, (char)0x6E, (char)0x69, (char)0x71,
	(char)0x75, (char)0x65, (char)0x26, (char)0x6C, (char)0x74, (char)0x3B, (char)0x61, (char)0x70, (char)0x70, (char)0x26, (char)0x67,
	(char)0x74, (char)0x3B, (char)0x28, (char)0x74, (char)0x6F, (char)0x6B, (char)0x65, (char)0x6E, (char)0x5F, (char)0x73, (char)0x6F,
	(char)0x75, (char)0x72, (char)0x63, (char)0x65, (char)0x2E, (char)0x74, (char)0x6F, (char)0x6B, (char)0x65, (char)0x6E, (char)0x28,
	(char)0x29, (char)0x29, (char)0x3B, (char)0x0A, (char)0x09, (char)0x73, (char)0x72, (char)0x76, (char)0x2D, (char)0x26, (char)0x67,
	(char)0x74, (char)0x3B, (char)0x72, (char)0x65, (char)0x67, (char)0x69, (char)0x73, (char)0x74, (char)0x65, (char)0x72, (char)0x5F,
	(char)0x61, (char)0x63, (char)0x74, (char)0x69, (char)0x6F, (char)0x6E, (char)0x28, (char)0x0A, (char)0x09, (char)0x09, (char)0x68,
	(char)0x74, (char)0x74, (char)0x70, (char)0x5F, (char)0x6D, (char)0x65, (char)0x74, (char)0x68, (char)0x6F, (char)0x64, (char)0x3A,
	(char)0x3A, (char)0x67, (char)0x65, (char)0x74, (char)0x2C, (char)0x20, (char)0x26, (char)0x71, (char)0x75, (char)0x6F, (char)0x74,
	(char)0x3B, (char)0x2F, (char)0x6D, (char)0x65, (char)0x2E, (char)0x6A, (char)0x70, (char)0x67, (char)0x26, (char)0x71, (char)0x75,
	(char)0x6F, (char)0x74, (char)0x3B, (char)0x2C, (char)0x20, (char)0x5B, (char)0x5D, (char)0x28, (char)0x63, (char)0x6F, (char)0x6E,
	(char)0x73, (char)0x74, (char)0x20, (char)0x68, (char)0x74, (char)0x74, (char)0x70, (char)0x5F, (char)0x72, (char)0x65, (char)0x71,
	(char)0x75, (char)0x65, (char)0x73, (char)0x74, (char)0x20, (char)0x26, (char)0x61, (char)0x6D, (char)0x70, (char)0x3B, (char)0x72,
	(char)0x65, (char)0x71, (char)0x2C, (char)0x20, (char)0x68, (char)0x74, (char)0x74, (char)0x70, (char)0x5F, (char)0x72, (char)0x65,
	(char)0x73, (char)0x70, (char)0x6F, (char)0x6E, (char)0x73, (char)0x65, (char)0x20, (char)0x26, (char)0x61, (char)0x6D, (char)0x70,
	(char)0x3B, (char)0x72, (char)0x65, (char)0x73, (char)0x2C, (char)0x20, (char)0x63, (char)0x6F, (char)0x6E, (char)0x73, (char)0x74,
	(char)0x20, (char)0x63, (char)0x61, (char)0x6E, (char)0x63, (char)0x65, (char)0x6C, (char)0x6C, (char)0x61, (char)0x74, (char)0x69,
	(char)0x6F, (char)0x6E, (char)0x5F, (char)0x74, (char)0x6F, (char)0x6B, (char)0x65, (char)0x6E, (char)0x20, (char)0x26, (char)0x61,
	(char)0x6D, (char)0x70, (char)0x3B, (char)0x74, (char)0x6F, (char)0x6B, (char)0x65, (char)0x6E, (char)0x29, (char)0x20, (char)0x7B,
	(char)0x0A, (char)0x09, (char)0x09, (char)0x09, (char)0x72, (char)0x65, (char)0x73, (char)0x2E, (char)0x68, (char)0x65, (char)0x61,
	(char)0x64, (char)0x65, (char)0x72, (char)0x73, (char)0x5B, (char)0x68, (char)0x65, (char)0x61, (char)0x64, (char)0x65, (char)0x72,
	(char)0x3A, (char)0x3A, (char)0x63, (char)0x6F, (char)0x6E, (char)0x74, (char)0x65, (char)0x6E, (char)0x74, (char)0x5F, (char)0x74,
	(char)0x79, (char)0x70, (char)0x65, (char)0x5D, (char)0x20, (char)0x3D, (char)0x20, (char)0x69, (char)0x6D, (char)0x61, (char)0x67,
	(char)0x65, (char)0x3A, (char)0x3A, (char)0x6A, (char)0x70, (char)0x65, (char)0x67, (char)0x3B, (char)0x0A, (char)0x09, (char)0x09,
	(char)0x09, (char)0x72, (char)0x65, (char)0x73, (char)0x2E, (char)0x73, (char)0x65, (char)0x74, (char)0x5F, (char)0x62, (char)0x6F,
	(char)0x64, (char)0x79, (char)0x28, (char)0x6D, (char)0x65, (char)0x5F, (char)0x6A, (char)0x70, (char)0x67, (char)0x29, (char)0x3B,
	(char)0x0A, (char)0x09, (char)0x09, (char)0x7D, (char)0x29, (char)0x3B, (char)0x0A, (char)0x09, (char)0x73, (char)0x72, (char)0x76,
	(char)0x2D, (char)0x26, (char)0x67, (char)0x74, (char)0x3B, (char)0x72, (char)0x75, (char)0x6E, (char)0x28, (char)0x61, (char)0x72,
	(char)0x67, (char)0x63, (char)0x2C, (char)0x20, (char)0x61, (char)0x72, (char)0x67, (char)0x76, (char)0x29, (char)0x3B, (char)0x0A,
	(char)0x7D, (char)0x0A, };

static const constexpr std::string_view me_jpg_example(me_jpg_example_arr.begin(), me_jpg_example_arr.end());
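
To make the shape of that generated header less mysterious, here is a minimal sketch of how a generator could produce one. This is not the actual chimer asset baker: the function name bake_header, its parameters, and the choice to wrap lines every 11 bytes are all assumptions made for illustration, and the sketch builds the string_view from data() and size() rather than from an iterator pair.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper, not the actual chimer asset baker: returns the text of
// a header that declares `<name>_arr` and a string_view named `<name>`.
std::string bake_header(std::string_view name, std::string_view bytes)
{
	const std::string id(name);
	std::string out = "#pragma once\n#include <array>\n#include <string_view>\n";
	out += "static constexpr auto " + id + "_arr = std::array<char, " + std::to_string(bytes.size()) + ">{";
	char literal[16];
	for (std::size_t i = 0; i < bytes.size(); ++i) {
		// Convert through unsigned char so bytes >= 0x80 do not sign-extend.
		const unsigned value = static_cast<unsigned char>(bytes[i]);
		std::snprintf(literal, sizeof literal, "(char)0x%02X, ", value);
		out += literal;
		if ((i + 1) % 11 == 0)
			out += "\n\t"; // wrap every 11 bytes to keep lines short
	}
	out += "};\n\n";
	out += "static constexpr std::string_view " + id + "(" + id + "_arr.data(), " + id + "_arr.size());\n";
	return out;
}
```

Calling bake_header("hi", "Hi\n") yields a header declaring a three-element hi_arr and a string_view named hi. A real baker would also read the input file in binary mode and write the result to disk; both steps are left out here.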

Advantages

There's one extra bit of information that I didn't mention in the above example. As part of the baking process, extra processing steps can be applied to the data before it's fully baked. One that I've already implemented is compressing the files with zlib's compress2 function. So, when something like style.css (another real file being served by this website) is embedded and used in HTTP responses, it has already been pre-compressed. No processing, allocation, or caching of results needs to happen at request time. The resource is embedded in exactly the form the client expects, so it can just be put inside the http_response object and serialized. In the happy path, where the browser supports the deflate algorithm (all modern browsers do), the chimer web server does one allocation to store the serialized http_response. That's it. The same is true of the simpler case where a resource was just baked in and not pre-processed in any way (e.g., the jpg image shown above, since jpgs are already compressed).

Disadvantages

There are a couple of disadvantages to this approach. The big one is that an asset baked into an executable cannot be changed at runtime. This is especially true of compressed resources, where you can't reasonably expect to modify the compressed binary string in a useful way. This relates to the happy path I mentioned in the last section. Extra work has to be done to support the sad path for resources that have been processed in a way the client cannot understand. If a client does not support the deflate algorithm for Content-Encoding then, if your web server is nice, a fallback encoding (or none) will be used. To support this in an asset baking system, you may need to bake multiple versions of the same file, each processed in a different way, for the most compatibility. This is not something that I've actually done for the resources being served on this website... so the sad path would be very sad indeed.
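
Here is a sketch of what supporting the sad path could look like if two variants of a file were baked. None of this exists in chimer as far as this post describes: baked_asset, chosen_body, accepts_deflate, and choose are hypothetical names. The Accept-Encoding check is deliberately simplified: it matches the lowercase token "deflate" in a comma-separated list and ignores q-values, so a header like "deflate;q=0" would wrongly count as support.

```cpp
#include <string_view>

// Hypothetical pair of baked variants for a single file.
struct baked_asset {
	std::string_view deflated; // pre-compressed at build time
	std::string_view identity; // raw bytes, used as the fallback
};

struct chosen_body {
	std::string_view body;
	std::string_view content_encoding; // empty: send no Content-Encoding header
};

// Simplified check for the token "deflate" in an Accept-Encoding value.
bool accepts_deflate(std::string_view accept_encoding)
{
	while (!accept_encoding.empty()) {
		const auto comma = accept_encoding.find(',');
		std::string_view token = accept_encoding.substr(0, comma);
		accept_encoding.remove_prefix(comma == std::string_view::npos ? accept_encoding.size() : comma + 1);
		token = token.substr(0, token.find(';')); // drop parameters such as ";q=0.5"
		while (!token.empty() && token.front() == ' ')
			token.remove_prefix(1);
		while (!token.empty() && token.back() == ' ')
			token.remove_suffix(1);
		if (token == "deflate")
			return true;
	}
	return false;
}

chosen_body choose(const baked_asset &asset, std::string_view accept_encoding)
{
	if (accepts_deflate(accept_encoding))
		return {asset.deflated, "deflate"};
	return {asset.identity, ""};
}
```

The cost is that every baked file is stored twice in the executable, which is the trade-off the paragraph above describes.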

The second disadvantage is build complexity! I've essentially traded code complexity, in the form of reading and compressing files for each web server endpoint, for doing the exact same thing at build time. What you don't see in this post is the added lines of CMake code and the entire asset baker utility program I've built to enable the simple usage code above. This is a lot of complexity to add in order to save one or two allocations along a web server's request path. And to be a bit more detailed: at the scale that I'm talking about for this website (and likely anything else relatively small), this asset baking strategy would really only save on allocations. It probably will not save you anything on file reads. This is because at a small scale all of the files that the web server reads are going to end up in the filesystem's cache, so it wouldn't matter that you read them on every request.

Conclusion

So, is asset baking worth it? Well, I had fun implementing it, so just from that standpoint I would say yes. But I've also found more uses for asset baking than I would have thought. Specifically, it has been a useful tool for the code snippets in these blog posts. I can write snippets that are guaranteed to compile with the current chimer source code and embed them into the posts after that validation has occurred. That's so cool! And I don't think I would have stumbled on this kind of workflow without having the asset baker tool already implemented.