Configuration
Introduction
Barely any application runs without a user-provided configuration. At the very least, a distributed system needs to allow users to tell how to reach remote nodes in the network or how other nodes in the network may reach this node.
CAF makes it easy to configure various aspects of applications and also allows users to fine-tune CAF components. As we will see, CAF uses a simple and yet powerful API to handle configurations. Whether the user provides a configuration file, a command line argument, or environment variables, CAF can handle it all.
The Actor System Configuration
The central customization point in CAF is the actor_system_config. On this class, we can register various settings before creating the actor system. Once CAF has parsed the user-provided configuration, it stores the settings in the actor_system_config object. This object is then passed to the actor system constructor.
Of course, the actor_system_config does not only store the user-provided settings. It also allows us to read settings programmatically. Not just for primitive types, but also for arbitrary user-defined types as long as they provide an inspect function.
To retrieve values from a configuration object, CAF offers functions such as get_as and get_or. These functions allow us to convert the stored values to the desired type. If the requested value does not exist or the type conversion fails, CAF either provides an error when using get_as or falls back to a default value when using get_or.
Adding Custom Options
The idiomatic way to run a CAF application with user-defined settings is to implement a subclass of actor_system_config. This subclass may add custom options to the configuration object in the constructor. CAF already adds the default options to the object in the parent constructor.
To see the pieces in motion, let us consider an example service that needs to connect to a database and listen for clients on a specific port. We want the environment variables DB_PORT, DB_HOST, and DB_TABLE to override the default values. To register our custom options, we subclass actor_system_config and add the options in the constructor as follows:
class my_config : public caf::actor_system_config {
public:
  my_config() {
    opt_group{custom_options_, "global"}
      .add<bool>("verbose", "enable additional output");
    opt_group{custom_options_, "server"}
      .add<std::string>("listen-address,l", "the optional listen address")
      .add<uint16_t>("port,p", "the port to listen on");
    opt_group{custom_options_, "database"}
      .add<uint16_t>("port,,DB_PORT", "the port to connect to for the database")
      .add<std::string>("host,,DB_HOST", "the database host")
      .add<std::string>("table,,DB_TABLE", "the name of the table to use");
  }
};
When passing a name for an option, CAF expects a comma-separated list of arguments. The first argument is the long name, the second argument is a series of short names (where each character represents a short name), and the third argument is the environment variable name. Only the long name is mandatory.
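To make the comma-separated format concrete, the following standalone sketch splits such a name into its three parts. It is not CAF's parser: the type option_spec and the function parse_option_spec are hypothetical names introduced only for this illustration, and the sketch uses nothing but the C++ standard library.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper type for illustration; not part of CAF.
struct option_spec {
  std::string long_name;   // mandatory, e.g. "port"
  std::string short_names; // each character is one short name, e.g. "p"
  std::string env_name;    // empty if no environment variable was given
};

// Splits "long,short,ENV" at the first two commas. Missing parts stay empty.
inline option_spec parse_option_spec(std::string_view spec) {
  option_spec result;
  auto first = spec.find(',');
  // substr clamps the length, so npos simply means "until the end".
  result.long_name = std::string{spec.substr(0, first)};
  if (first == std::string_view::npos)
    return result;
  auto rest = spec.substr(first + 1);
  auto second = rest.find(',');
  result.short_names = std::string{rest.substr(0, second)};
  if (second != std::string_view::npos)
    result.env_name = std::string{rest.substr(second + 1)};
  return result;
}
```

With this sketch, "port,,DB_PORT" yields the long name port, no short names, and the environment variable name DB_PORT, whereas "listen-address,l" yields the long name listen-address and the single short name l.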
CAF organizes options in a tree-like structure. Each option group is a node in the tree. The root node is the actor_system_config object itself. When passing global as the group name, the options will be stored in the root node directly.
In our example, the options are organized as follows:
- root:
  - verbose: a boolean flag to enable additional output
  - server:
    - listen-address: the optional listen address
    - port: the port to listen on
  - database:
    - port: the port to connect to for the database
    - host: the database host
    - table: the name of the table to use
Writing Configuration Files
The tree-based structure of the configuration object is also reflected in the format for the configuration files. For our custom configuration object from above, a configuration file would look like this in JSON:
{
  "verbose": true,
  "server": {
    "listen-address": "127.0.0.1",
    "port": 8080
  },
  "database": {
    "port": 3306,
    "host": "localhost",
    "table": "my_table"
  }
}
JSON is a good choice if you generate your configuration programmatically. However, if configurations are written by hand, JSON is not the most user-friendly format. Luckily, CAF supports a more streamlined syntax for configuration files. We simply drop the outermost braces, omit quotes for keys, use '=' instead of ':', and get rid of unnecessary commas:
verbose = true
server = {
  listen-address = "127.0.0.1"
  port = 8080
}
database = {
  port = 3306
  host = "localhost"
  table = "my_table"
}
These two examples are equivalent. The second one is simply more concise. Of course, the user can also provide the configuration via environment variables as well as on the command line. We will get to this later. First, we want to see how we can use custom types in our configuration to consume sub-trees of the configuration in one go.
Reading Custom Types from the Configuration
In order to read custom types from the configuration, we need to implement an inspect function for the type. This is the same mechanism that allows CAF to serialize and deserialize custom types on the network.
First, we define a server_config struct that holds the server configuration.
struct server_config {
  uint16_t port;
  std::optional<std::string> listen_address;
};
template <class Inspector>
bool inspect(Inspector& f, server_config& x) {
  return f.object(x).fields(f.field("port", x.port),
                            f.field("listen-address", x.listen_address));
}
Next, we define a database_config struct that holds the database configuration.
struct database_config {
  uint16_t port;
  std::optional<std::string> host;
  std::string table;
};
template <class Inspector>
bool inspect(Inspector& f, database_config& x) {
  return f.object(x).fields(f.field("port", x.port),
                            f.field("host", x.host),
                            f.field("table", x.table));
}
As you probably have noticed, we have defined some member variables as std::optional. This tells CAF that a missing value is not an error. If the user does not provide a value for server_config::listen_address or database_config::host, CAF will simply leave the member variable empty.
With these two structs in place, we can start playing with our custom configuration. We define a caf_main function that takes the actor system and our custom configuration as arguments. CAF will look at the type of the second parameter and automatically use that configuration type to initialize the actor system. Then, we will use get_as and get_or to give a couple of examples of how to read values from the configuration.
void caf_main(caf::actor_system& sys, const my_config& cfg) {
  // Get the verbose flag, defaults to false.
  sys.println("verbose: {}", caf::get_or(cfg, "verbose", false));
  // Get the server configuration in one go.
  if (auto server = caf::get_as<server_config>(cfg, "server")) {
    sys.println("server: port = {}, listen-address = {}",
                server->port, server->listen_address);
  } else {
    sys.println("no valid server config available");
  }
  // Get the database configuration in one go.
  if (auto db = caf::get_as<database_config>(cfg, "database")) {
    sys.println("database: port = {}, host = {}, table = {}",
                db->port, db->host, db->table);
  } else {
    sys.println("no valid database config available");
  }
}
CAF_MAIN()
Running the Example
If we compile our example application from before as example1 and then run it without arguments, we will see:
$ ./example1
verbose: false
no valid server config available
no valid database config available
The only field required for the server configuration is the port. We can set this parameter on the command line and see how the output changes:
$ ./example1 -p 8080
verbose: false
server: port = 8080, listen-address = null
no valid database config available
We can also use the long name for the port option:
$ ./example1 --server.port=8080
verbose: false
server: port = 8080, listen-address = null
no valid database config available
Note that the long options support both --option=value and --option value:
$ ./example1 --server.port 8080
verbose: false
server: port = 8080, listen-address = null
no valid database config available
Now, let's pass the JSON configuration from above as a command line argument:
$ ./example1 --config-file example1.json
verbose: true
server: port = 8080, listen-address = *"127.0.0.1"
database: port = 3306, host = *"localhost", table = my_table
Note: CAF renders values with a nullable type using a * prefix, e.g., *"localhost" to indicate that this value could be null.
The config-file option is an implicit option that CAF provides. By default, CAF also provides a help option with short names -h and -? as well as a few other default options. Running the binary with -h prints the following help text for our example:
$ ./example1 -h
database options:
--database.port=<uint16_t> : the port to connect to for the database
--database.host=<std::string> : the database host
--database.table=<std::string> : the name of the table to use
global options:
(-h|-?|--help) : print help text to STDERR and exit
--long-help : same as --help but list options that are omitted by default
--dump-config : print configuration and exit
--config-file=<std::string> : sets a path to a configuration file
--verbose : enable additional output
server options:
(-l|--server.listen-address) <std::string> : the optional listen address
(-p|--server.port) <uint16_t> : the port to listen on
The last way to pass the configuration is via environment variables. By default, CAF converts the option name to all-uppercase and separates words with an underscore. We didn't explicitly set an environment variable name for the server.port option, so CAF will auto-generate one:
$ export SERVER_PORT=8080
$ ./example1
verbose: false
server: port = 8080, listen-address = null
no valid database config available
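The naming rule for auto-generated environment variables can be approximated with a few lines of standard C++. This is a sketch under assumptions, not CAF's implementation: the function default_env_name is a hypothetical name, and mapping every non-alphanumeric character (including hyphens) to an underscore is an assumption that goes beyond the server.port example shown here.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper, not CAF's implementation: upper-cases letters and
// turns every non-alphanumeric character into '_'.
// Assumption: hyphens are treated like the '.' separator.
inline std::string default_env_name(std::string_view full_name) {
  std::string result;
  result.reserve(full_name.size());
  for (char c : full_name) {
    // Cast to unsigned char first; <cctype> functions require that.
    auto uc = static_cast<unsigned char>(c);
    if (std::isalnum(uc))
      result += static_cast<char>(std::toupper(uc));
    else
      result += '_';
  }
  return result;
}
```

Calling default_env_name("server.port") returns SERVER_PORT, which matches the variable used in the example.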
For the database options, we did override the default environment variable names with DB_PORT, DB_HOST, and DB_TABLE. When setting these environment variables, we see the following output:
$ export DB_HOST=127.0.0.1
$ export DB_TABLE=foo
$ export DB_PORT=1234
$ ./example1
verbose: false
no valid server config available
database: port = 1234, host = *"127.0.0.1", table = foo
Before we move on to the next section, let's see what happens if we provide a configuration file and environment variables:
$ export DB_HOST=127.0.0.1
$ export DB_TABLE=foo
$ export DB_PORT=1234
$ ./example1 --config-file example1.json
verbose: true
server: port = 8080, listen-address = *"127.0.0.1"
database: port = 1234, host = *"127.0.0.1", table = foo
Remember, our configuration file sets the database port to 3306, the host to localhost, and the table to my_table. As we can see, the environment variables override the configuration file.
When parsing the configuration, CAF will use the following order of precedence:
- Command line arguments
- Environment variables
- Configuration file
The command line arguments have the highest precedence, as we can see in the following example:
$ export DB_HOST=127.0.0.1
$ export DB_TABLE=foo
$ export DB_PORT=1234
$ ./example1 --config-file example1.json --database.port=2200
verbose: true
server: port = 8080, listen-address = *"127.0.0.1"
database: port = 2200, host = *"127.0.0.1", table = foo
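The order of precedence boils down to picking the first source that provides a value. The following sketch shows that rule in isolation with std::optional. The function resolve_setting is a hypothetical helper for illustration and says nothing about how CAF implements the lookup internally.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: returns the value from the highest-precedence source
// that provides one (command line, then environment, then config file).
inline std::optional<std::string>
resolve_setting(const std::optional<std::string>& from_cli,
                const std::optional<std::string>& from_env,
                const std::optional<std::string>& from_file) {
  if (from_cli)
    return from_cli;
  if (from_env)
    return from_env;
  return from_file; // may be empty if no source provides the setting
}
```

For the database port in the last example, the three candidates are 2200 (command line), 1234 (environment), and 3306 (file), so the sketch returns 2200.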
Using and Extending Dump Config
The --dump-config option is a built-in option that prints the configuration and exits. The output of this option is generated from the dump_content member function. The output includes any options that were set by the user, as well as the default values if available.
In order to customize the output of --dump-config, we can override the dump_content member function. For example, we could use port 8080 as the default port for the server (by always using get_or with a default value of 8080). To show this default value in the output of --dump-config, we need to override the dump_content member function and add the default value to the output.
The following code snippet shows how to override the dump_content member function. By calling super::dump_content(), we first retrieve the default output. Then, we extend the output with caf::put_missing (which will not modify the dictionary if a value is already present) and return it.
class my_config : public caf::actor_system_config {
public:
  using super = actor_system_config;
  my_config() {
    opt_group{custom_options_, "global"}
      .add<bool>("verbose", "enable additional output");
    opt_group{custom_options_, "server"}
      .add<std::string>("listen-address,l", "the optional listen address")
      .add<uint16_t>("port,p", "the port to listen on");
    opt_group{custom_options_, "database"}
      .add<uint16_t>("port,,DB_PORT", "the port to connect to for the database")
      .add<std::string>("host,,DB_HOST", "the database host")
      .add<std::string>("table,,DB_TABLE", "the name of the table to use");
  }
  caf::settings dump_content() const override {
    auto result = super::dump_content();
    auto& server = result["server"].as_dictionary();
    caf::put_missing(server, "port", 8080);
    return result;
  }
};
The type settings is a dictionary that maps strings to config_value, which is a recursive type that can hold primitive values as well as lists and dictionaries. We will see how to work with config_value in more detail in the next section.
Now, when we run the application with the --dump-config option, we see the default port value in the output:
$ ./example2 --dump-config
server {
port = 8080
}
Config Values
The class config_value represents a primitive value (numbers, booleans, strings), none (no value), a list of config_value objects, or a dictionary mapping strings to config_value objects.
The type settings that we have seen earlier is an alias for dictionary<config_value>, where dictionary is a std::map-like type that always uses strings as keys.
Usually, we do not work with config_value directly. Instead, we use the functions get_as and get_or directly on the actor_system_config object. However, sometimes we need to work with config_value directly, for example when overriding the dump_content member function. So, if you are interested in the details of the config_value API, read on! Otherwise, feel free to skip right to the conclusion.
Basics
Much like a regular std::variant, a config_value accepts any input in its constructor that it can convert to one of its types. For example, we can define the three config values x, y, and z as integer, floating point, and string as follows:
Source Code
auto x = caf::config_value{1};
auto y = caf::config_value{2.0};
auto z = caf::config_value{"three"};
sys.println("x = {}", x);
sys.println("y = {}", y);
sys.println("z = {}", z);
Output
x = 1
y = 2
z = three
Users may treat config_value as a simple sum type similar to std::variant by using functions like get, get_if, and holds_alternative. However, most of the time we will use the function pair get_as and get_or that we have seen in the previous sections.
On-the-fly Conversions with get_as
Configuration values never exist in a vacuum. They typically represent input by the user. That input may be a string that actually represents a timespan. Luckily, the function get_as is the Swiss Army knife of type conversions. The function takes one template parameter T that represents the target type and returns expected<T>. An expected<T> represents an optional value, but unlike std::optional it carries an error if no value exists or if the conversion fails.
Source Code
auto x = caf::config_value{"5s"};
if (auto ts = caf::get_as<caf::timespan>(x))
  sys.println("ts = {}", *ts);
else
  sys.println("oops: {}", ts.error());
Output
ts = 5s
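To give an idea of what a string-to-timespan conversion involves, here is a minimal standalone sketch that parses an integer followed by a unit suffix into std::chrono::nanoseconds. The function parse_timespan is a hypothetical name, and the set of accepted suffixes (ms, s, min) is an assumption chosen for brevity. The sketch ignores overflow and does not claim to mirror CAF's actual parser.

```cpp
#include <charconv>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses "<integer><unit>" such as "5s" or "250ms".
// Assumption: only the suffixes ms, s, and min are accepted in this sketch.
inline std::optional<std::chrono::nanoseconds>
parse_timespan(std::string_view str) {
  using namespace std::chrono;
  std::int64_t count = 0;
  const char* first = str.data();
  const char* last = first + str.size();
  // Read the leading integer; ptr points at the first unparsed character.
  auto [ptr, ec] = std::from_chars(first, last, count);
  if (ec != std::errc{})
    return std::nullopt;
  std::string_view unit{ptr, static_cast<std::size_t>(last - ptr)};
  if (unit == "ms")
    return nanoseconds{milliseconds{count}};
  if (unit == "s")
    return nanoseconds{seconds{count}};
  if (unit == "min")
    return nanoseconds{minutes{count}};
  return std::nullopt; // unknown or missing unit
}
```

As with get_as, the caller has to check the result before using it: the sketch returns an empty optional for inputs such as "fast" or "5 parsecs".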
CAF knows how to convert the string "5s" into a timespan. It also knows how to convert numbers in case a user typed in an integer while the system expects a double. Basically, CAF may perform all sorts of type conversions as long as the target type may reasonably hold the value. CAF will also perform bound checks for integral types. For example, if the user inputs a number that is too large to fit into a 16-bit integer, the conversion will fail:
Source Code
auto x = caf::config_value{42};
if (auto narrow_x = caf::get_as<uint16_t>(x))
  sys.println("narrow_x = {}", *narrow_x);
else
  sys.println("oops: {}", narrow_x.error());
auto y = caf::config_value{1'000'000};
if (auto narrow_y = caf::get_as<uint16_t>(y))
  sys.println("narrow_y = {}", *narrow_y);
else
  sys.println("oops: {}", narrow_y.error());
Output
narrow_x = 42
oops: conversion_failed("narrowing error")
The function get_as only performs safe conversions, in this case by performing bound checks. Hence, only the conversion for 42 succeeds, because 1,000,000 does not fit into 16 bits!
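The idea behind such a bound check can be shown without CAF. The following sketch narrows a 64-bit integer to a smaller integer type and reports failure instead of silently truncating. The function template checked_narrow is a hypothetical name for this illustration. It returns std::optional rather than CAF's expected<T>, so it carries no error message.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: converts `value` to the narrower integer type `To`
// only if the value is representable in `To`.
template <class To>
std::optional<To> checked_narrow(std::int64_t value) {
  static_assert(std::numeric_limits<To>::is_integer,
                "To must be an integer type");
  static_assert(sizeof(To) < sizeof(std::int64_t),
                "this sketch only covers types narrower than 64 bits");
  // Both limits fit into int64_t because To is narrower than 64 bits.
  constexpr auto lo = static_cast<std::int64_t>(std::numeric_limits<To>::min());
  constexpr auto hi = static_cast<std::int64_t>(std::numeric_limits<To>::max());
  if (value < lo || value > hi)
    return std::nullopt; // out of range: refuse to narrow
  return static_cast<To>(value);
}
```

With this helper, checked_narrow<std::uint16_t>(42) holds 42, whereas checked_narrow<std::uint16_t>(1'000'000) is empty because the largest 16-bit unsigned value is 65,535.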
While converting between builtin types is neat, the real power of get_as comes from the fact that it tightly integrates with the type inspection API! We have already seen this in action when we converted a dictionary to a server_config object. However, CAF can go even further and first convert a string to a dictionary and then convert that dictionary to a custom type. All in one shot.
For our next example, we will use this simple point_2d struct:
struct point_2d {
int32_t x;
int32_t y;
};
template <class Inspector>
bool inspect(Inspector& f, point_2d& x) {
return f.object(x).fields(f.field("x", x.x), f.field("y", x.y));
}
Then, we can read a point_2d directly from a config_value that holds a string (representing a dictionary) as follows:
Source Code
auto x = caf::config_value{"{x = 12, y = 21}"};
if (auto point = caf::get_as<point_2d>(x))
  sys.println("got a point: ({}, {})", point->x, point->y);
else
  sys.println("oops: {}", point.error());
Output
got a point: (12, 21)
We have also used get_or before. The only thing left worth mentioning is that get_or will use get_as internally. Hence, it can do all the conversions that get_as can do. The only difference is that get_or will return the fallback value if the conversion fails.
Lists
Aside from storing single values, the type config_value can also store lists. Each element in the list is a config_value again. Hence, we can nest lists arbitrarily.
Creating Lists
The constructor of config_value is explicit to stop the compiler from automatically converting values to config_value everywhere. However, this makes initializing a list of config values cumbersome:
// Note: caf::config_value::list xs{1, 2, 3}; -- will not compile
auto xs = caf::config_value::list{caf::config_value{1},
caf::config_value{2},
caf::config_value{3}};
sys.println("{}", xs);
The above snippet prints [1, 2, 3], but it requires a lot of boilerplate code to initialize the list. Constructing a config value from a list adds even more boilerplate code, because we need to wrap the entire initialization again:
auto xs = caf::config_value{caf::config_value::list{caf::config_value{1},
caf::config_value{2},
caf::config_value{3}}};
sys.println("{}", xs);
The second snippet also prints [1, 2, 3]. The only difference is that xs is a config value holding a list this time. To make working with lists easier, CAF offers the factory function make_config_value_list:
Source Code
auto xs = caf::make_config_value_list(1, 2, 3);
sys.println("{}", xs);
Output
[1, 2, 3]
Since config value lists are heterogeneous, we can also construct a list with mixed types:
Source Code
auto xs = caf::make_config_value_list(1, "two", 3.0);
sys.println("{}", xs);
Output
[1, "two", 3]
Using as_list
Sometimes, we receive a config value and need to convert it to a list before continuing. If the value already contains a list then we want to make sure not to override it, because we want to keep existing entries. For this particular use case, config_value provides the member function as_list:
Source Code
auto x = caf::config_value{};
auto y = caf::config_value{42};
auto z = caf::make_config_value_list(1, 2, 3);
sys.println("(1) x as list = {}", x.as_list());
sys.println("(2) y as list = {}", y.as_list());
sys.println("(3) z as list = {}", z.as_list());
Output
(1) x as list = []
(2) y as list = [42]
(3) z as list = [1, 2, 3]
In the first case, we convert nothing to a list. The only way CAF could perform this conversion is by creating an empty list. In the second case, the variable y contains the integer 42. Here, CAF simply lifts the single value into a list with one element. Lastly, z already contains a list, so CAF can simply return the stored list in this case without any conversion.
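The three cases map naturally onto a small std::variant. The sketch below is a heavily simplified stand-in, not CAF's config_value: the aliases int_list and mini_value as well as the free function as_list are hypothetical, and the value can only hold nothing, one integer, or a flat list of integers.

```cpp
#include <cstdint>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-in for a config value.
using int_list = std::vector<std::int64_t>;
using mini_value = std::variant<std::monostate, std::int64_t, int_list>;

// Converts `x` to a list in place and returns a reference to the stored list.
inline int_list& as_list(mini_value& x) {
  if (auto* ls = std::get_if<int_list>(&x))
    return *ls; // already a list: keep the existing entries
  int_list tmp;
  if (auto* single = std::get_if<std::int64_t>(&x))
    tmp.push_back(*single); // lift the single value into a one-element list
  // If x held nothing (std::monostate), tmp stays empty.
  return x.emplace<int_list>(std::move(tmp));
}
```

Because the function returns a reference into the value itself, appending to the result also changes the original value, which is the property the next example relies on.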
Working with as_list avoids unnecessary boilerplate code, as we can see in the following example that creates a list of three lists of three integers each:
Source Code
caf::config_value x;
sys.println("(1) x = {}", x);
auto& ls = x.as_list();
sys.println("(2) x = {}", x);
ls.resize(3); // Fills the list with three null elements.
sys.println("(3) x = {}", x);
auto num = int64_t{0};
for (auto& element : ls) {
  auto& nested = element.as_list();
  nested.emplace_back(num);
  for (++num; num % 3 != 0; ++num)
    nested.emplace_back(num);
}
sys.println("(4) x = {}", x);
Output
(1) x = null
(2) x = []
(3) x = [null, null, null]
(4) x = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
Running the example prints x four times:
- The first line prints the default-constructed x. As we can see, initially it is just null.
- The second time we print x is after we have called as_list on it. This member function converts the config value to a list. Hence, the second line shows [].
- After resizing the vector, we have three null objects in the list.
- Finally, the fourth line displays our final result after filling x with the desired content in the for-loop.
Converting to Homogeneous Lists
We already know how to conveniently create a list of config values using make_config_value_list:
auto xs = caf::make_config_value_list(1, 2, 3);
sys.println("{}", xs);
Our container xs from the snippet above consists of integers only. Just like get_as converts to single values, the function also knows how to convert the config value lists to homogeneous list types:
Source Code
auto xs = caf::make_config_value_list(1, 2, 3);
if (auto ints = caf::get_as<std::vector<int32_t>>(xs)) {
  sys.println("xs is a vector of int: {}", *ints);
} else {
  sys.println("xs is not a vector of int: {}", ints.error());
}
Output
xs is a vector of int: [1, 2, 3]
Note that CAF does not limit its users to std::vector. The automatic unboxing supports all types that behave like STL containers such as std::vector, std::list, std::set, and std::unordered_set.
Converting to Tuples
Because config_value may hold a variety of types, config_value::list may hold elements of different types. Such lists cannot convert to std::vector or similar data structures unless the element type can be constructed from all of the stored types. However, heterogeneous lists can convert to tuples:
Source Code
auto xs = caf::make_config_value_list(1, "two", 3.3);
if (auto tup = caf::get_as<std::tuple<int32_t, std::string, double>>(xs)) {
  sys.println("tup: {}", *tup);
} else {
  sys.println("oops: {}", tup.error());
}
Output
tup: [1, "two", 3.3]
This also applies to std::array and std::pair. To CAF, every type that specializes std::tuple_size etc. is treated the same way with respect to get_as and get_or.
Source Code
auto xs = caf::make_config_value_list(1, "two");
if (auto tup = caf::get_as<std::pair<int32_t, std::string>>(xs)) {
  sys.println("tup: {}", *tup);
} else {
  sys.println("oops: {}", tup.error());
}
auto ys = caf::make_config_value_list(1, 2, 3);
if (auto arr = caf::get_as<std::array<int32_t, 3>>(ys)) {
  sys.println("arr: {}", *arr);
} else {
  sys.println("oops: {}", arr.error());
}
Output
tup: [1, "two"]
arr: [1, 2, 3]
Dictionaries and Settings
In CAF, settings is an alias for dictionary<config_value>. A dictionary is a map with string keys. Semantically, a dictionary<T> is equivalent to a std::map<std::string, T>.
Using as_dictionary
Analogous to as_list, CAF also offers an as_dictionary member function that returns the config_value as a dictionary, converting it to a dictionary if needed.
Source Code
auto x = caf::config_value{};
sys.println("(1) x = {}", x);
auto& dict = x.as_dictionary();
sys.println("(2) x = {}", x);
dict.emplace("foo", "bar");
dict.emplace("int-value", 42);
dict.emplace("int-value", 23);
sys.println("(3) x = {}", x);
Output
(1) x = null
(2) x = {}
(3) x = {foo = "bar", "int-value" = 42}
Running the example prints x three times again:
- Initially, x is null once again.
- After calling as_dictionary, x now is an empty dictionary.
- Just like std::map, calling emplace tries to add a new entry to the dictionary and does nothing if the entry already exists. This means that 42 will remain associated to the key int-value (not 23).
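Since the dictionary follows the same insertion rule as std::map, the behavior can be reproduced with the standard container alone. The sketch below contrasts "add only if missing" with "add or replace". The alias int_dict and both function names are hypothetical and exist only for this illustration.

```cpp
#include <map>
#include <string>

// Hypothetical alias for illustration; not a CAF type.
using int_dict = std::map<std::string, int>;

// Adds the entry only if the key is absent. Returns true if it was added.
inline bool add_if_missing(int_dict& dict, const std::string& key, int value) {
  return dict.emplace(key, value).second;
}

// Overwrites an existing entry unconditionally, shown for contrast.
inline void add_or_replace(int_dict& dict, const std::string& key, int value) {
  dict.insert_or_assign(key, value);
}
```

Calling add_if_missing twice with the key int-value and the values 42 and 23 leaves 42 in the map, just like the two emplace calls in the example. Only add_or_replace would change the stored value to 23.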
Just like with as_list, CAF tries to convert the content of a config value to a dictionary. However, dictionaries require two values: key and value. Hence, the only other data structures that CAF can convert into dictionaries are lists of lists, where each nested list has exactly two values, and strings that can be parsed into a valid dictionary:
Source Code
auto x = caf::make_config_value_list(caf::make_config_value_list("one", 1),
caf::make_config_value_list("two", 2),
caf::make_config_value_list("three", 3));
sys.println("(1) x = {}", x);
x.as_dictionary();
sys.println("(2) x = {}", x);
auto y = caf::config_value{"{answer = 42}"};
sys.println("(3) y = {}", y);
y.as_dictionary();
sys.println("(4) y = {}", y);
Output
(1) x = [["one", 1], ["two", 2], ["three", 3]]
(2) x = {one = 1, three = 3, two = 2}
(3) y = {answer = 42}
(4) y = {answer = 42}
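The rule for lists of lists (every nested list must have exactly two values) can be sketched with standard containers. The aliases string_list and string_dict and the function to_dictionary are hypothetical, all values are plain strings for simplicity, and letting the first occurrence of a duplicate key win is a choice made for this sketch rather than documented CAF behavior.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical aliases for illustration; not CAF types.
using string_list = std::vector<std::string>;
using string_dict = std::map<std::string, std::string>;

// Returns a dictionary if every nested list has exactly two elements
// (key, value). Otherwise, returns an empty optional.
inline std::optional<string_dict>
to_dictionary(const std::vector<string_list>& rows) {
  string_dict result;
  for (const auto& row : rows) {
    if (row.size() != 2)
      return std::nullopt; // not a key/value pair
    // Sketch choice: the first occurrence of a key wins.
    result.emplace(row[0], row[1]);
  }
  return result;
}
```

Because std::map keeps its keys sorted, iterating over the result for the keys one, two, and three visits them in the order one, three, two, which is the same order as in the output of the example.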
The automatic parsing of strings to dictionaries also enables the conversion from strings to point_2d we observed earlier. The inspect function exposes the fields inside an object to CAF and we can naturally translate a dictionary to an object by interpreting the keys as field names. So as long as a config_value represents a dictionary, CAF can use the inspect function for trying to construct a C++ object.
Converting to Regular Map Types
At this point, you can probably guess what our next example illustrates. Yes, get_as one last time!
Source Code
auto x = caf::config_value{};
auto& dict = x.as_dictionary();
dict.emplace("1", 10);
dict.emplace("2", 20);
dict.emplace("3", 30);
if (auto m1 = caf::get_as<std::map<double, int32_t>>(x))
  sys.println("m1: {}", *m1);
else
  sys.println("oops: {}", m1.error());
if (auto m2 = caf::get_as<std::unordered_map<int32_t, double>>(x))
  sys.println("m2: {}", *m2);
else
  sys.println("oops: {}", m2.error());
Output
m1: {1 = 10, 2 = 20, 3 = 30}
m2: {3 = 30, 2 = 20, 1 = 10}
As you can see, get_as also performs "deep" conversions by converting the string key of the dictionary to another type. In this case, CAF converts the strings to integers or floating point numbers as needed.
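A key conversion of this kind can be sketched with std::from_chars. The functions parse_int and convert_keys are hypothetical helpers for this illustration. They only handle integer keys and use std::optional instead of CAF's expected<T>, so a failed conversion carries no error message.

```cpp
#include <charconv>
#include <map>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper: parses the whole string as a base-10 integer and
// fails on empty input or trailing characters.
inline std::optional<int> parse_int(const std::string& str) {
  int value = 0;
  const char* first = str.data();
  const char* last = first + str.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  if (ec != std::errc{} || ptr != last)
    return std::nullopt;
  return value;
}

// Hypothetical helper: converts string keys to integers and fails if any
// key is not a valid integer.
inline std::optional<std::map<int, int>>
convert_keys(const std::map<std::string, int>& dict) {
  std::map<int, int> result;
  for (const auto& [key, value] : dict) {
    auto int_key = parse_int(key);
    if (!int_key)
      return std::nullopt;
    result.emplace(*int_key, value);
  }
  return result;
}
```

For a dictionary with the keys "1", "2", and "3", convert_keys returns a map with the integer keys 1, 2, and 3. A single key such as "x" makes the entire conversion fail, which mirrors the all-or-nothing nature of a conversion that either yields a value or an error.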
Conclusion
CAF provides a powerful API for handling configurations. The actor_system_config object is the central customization point in CAF. It allows users to register custom options and read settings programmatically. From the custom options, CAF automatically creates parsers for command line arguments, environment variables, and configuration files. The user can provide a configuration in any of these formats, and CAF will parse it into the actor_system_config object.
If multiple sources provide the same option, CAF follows the order of precedence that most POSIX applications use: command line arguments have the highest precedence, followed by environment variables, and finally configuration files.
To deal with the parsed configuration, CAF provides the get_as and get_or functions. These functions allow users to convert the stored values to the desired type. The type conversions leverage the type inspection API to allow users to convert custom types from the configuration. The conversions will also perform bound checks for integral types.