Support for user-defined merge-strategies
Sometimes you want users to be able to define in their configuration how a certain option should merge with existing values.
For example, if I have:
#[derive(Config)]
struct MyConfig {
#[setting(nested, merge = schematic::merge::replace)]
items: Vec<Item>,
}
I want the user to be able to determine how two MyConfig::items values should be merged by Schematic instead of hard-coding it to replace.
One solution I had in mind was a wrapper enum that deserializes either from a regular Vec or from an object that includes the merge strategy:
#[derive(Config)]
struct MyConfig {
#[setting(nested)]
items: ReplaceOrMergedVec<Item>,
}
#[derive(Serialize, Deserialize, Config)]
#[serde(untagged)]
pub enum ReplaceOrMergedVec<T: Config> {
#[setting(nested, merge = schematic::merge::replace)]
Replace(Vec<T>),
#[setting(merge = merge_vec_with_strategy)]
Merged(MergedVec<T>),
}
#[derive(Serialize, Deserialize)]
pub struct MergedVec<T: Config> {
strategy: Strategy,
items: Vec<T>
}
#[derive(Serialize, Deserialize)]
pub enum Strategy {
Append,
Prepend,
Replace,
}
fn merge_vec_with_strategy<T: Config>(
mut prev: MergedVec<T>,
next: MergedVec<T>,
context: &(),
) -> MergeResult<MergedVec<T>> {
match prev.strategy {
// ...
}
}
I ran into two issues with this:
- It looks like Schematic does not support generic types on configuration structs.
- You cannot define the merge proc-macro attribute on a nested non-collection item in a struct.
I can work around the first issue by not using generics here (although generics would definitely be nice), but the second feature is required, because you need access to the strategy value to know how to merge the items.
I understand that there's the possibility of using Context for this, but it's not closely tied to specific fields of the configuration struct(s), and also would not be set through the configuration files themselves, so it's not really suitable for this use-case.
Any thoughts on supporting these two features, or perhaps supporting user-defined merge strategies differently?
I got fairly far without generic types with this implementation:
/// Vec of items, either defaulting to a merge strategy of `replace`, or
/// defining a specific merge strategy.
#[derive(Debug, Clone, Config)]
#[serde(untagged)]
pub enum VecWithMerge {
/// A list of items that are merged using the [`schematic::merge::replace`]
/// strategy.
#[setting(nested, default)]
Replace(VecReplace),
/// A list of items that are merged using the specified merge strategy.
#[setting(merge = merge_vec_with_strategy)]
Merged(VecWithStrategy),
}
impl VecWithMerge {
/// Get the list of items.
#[must_use]
pub fn get(&self) -> &[MyInnerConfig] {
match self {
Self::Replace(replace) => &replace.items,
Self::Merged(merged) => &merged.items,
}
}
/// Consume the list of items.
#[must_use]
pub fn into_inner(self) -> Vec<MyInnerConfig> {
match self {
Self::Replace(replace) => replace.items,
Self::Merged(merged) => merged.items,
}
}
}
impl std::ops::Index<usize> for VecWithMerge {
type Output = MyInnerConfig;
fn index(&self, index: usize) -> &Self::Output {
self.get().index(index)
}
}
// impl std::ops::Index<usize> for PartialVecWithMerge {
// type Output = PartialMyInnerConfig;
//
// fn index(&self, index: usize) -> &Self::Output {
// match self {
// Self::Replace(replace) => &replace.items[index],
// Self::Merged(merged) => &merged.items[index],
// }
// }
// }
/// A list of items that are merged using the [`schematic::merge::replace`]
/// strategy.
#[derive(Debug, Clone, Config)]
pub struct VecReplace {
/// The list of items.
#[setting(nested, default = default_inner, merge = schematic::merge::replace)]
pub items: Vec<MyInnerConfig>,
}
/// A list of items that are merged using the specified merge strategy.
#[derive(Debug, Clone, Default, PartialEq, Serialize, Deserialize, Schematic)]
pub struct VecWithStrategy {
/// Merge strategy.
strategy: MergeStrategy,
/// The list of items.
items: Vec<MyInnerConfig>,
}
/// Merge strategy for `VecWithStrategy`.
#[derive(Debug, Clone, Copy, Default, PartialEq, Serialize, Deserialize, ConfigEnum)]
pub enum MergeStrategy {
/// See [`schematic::merge::prepend_vec`].
#[default]
Prepend,
/// See [`schematic::merge::append_vec`].
Append,
/// See [`schematic::merge::replace`].
Replace,
}
/// Merge two `VecWithStrategy` values.
#[expect(clippy::trivially_copy_pass_by_ref)]
fn merge_vec_with_strategy(
prev: VecWithStrategy,
next: VecWithStrategy,
context: &(),
) -> MergeResult<VecWithStrategy> {
let items = match prev.strategy {
MergeStrategy::Append => schematic::merge::append_vec(prev.items, next.items, context)?,
MergeStrategy::Prepend => schematic::merge::prepend_vec(prev.items, next.items, context)?,
MergeStrategy::Replace => schematic::merge::replace(prev.items, next.items, context)?,
};
Ok(items.map(|items| VecWithStrategy {
strategy: next.strategy,
items,
}))
}
But one issue I ran into is that I cannot mark VecWithMerge::Merged as nested:
Nested variants do not support merge.
And if I don't, I cannot implement std::ops::Index<usize> for PartialVecWithMerge, because it would have to return either MyInnerConfig or PartialMyInnerConfig, depending on the enum variant. Perhaps I could remove nested from the VecWithMerge::Replace variant, but then I can't use validation and other features provided by Schematic on the inner types, so I want to keep that variant nested (and ideally also make VecWithMerge::Merged support nested).
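The underlying constraint is that Index has a single associated Output type, so one impl cannot hand back two different item types. As a rough sketch of one possible workaround (not something Schematic provides), a plain accessor method can return an enum of references instead. Everything here is a hypothetical stand-in: Full and Partial play the roles of MyInnerConfig and PartialMyInnerConfig, PartialList approximates the generated partial enum, and ItemRef is a name made up for illustration.

```rust
/// Hypothetical stand-in for `MyInnerConfig`.
#[derive(Debug, PartialEq)]
struct Full {
    foo: String,
}

/// Hypothetical stand-in for `PartialMyInnerConfig`.
#[derive(Debug, PartialEq)]
struct Partial {
    foo: Option<String>,
}

/// Rough approximation of the partial enum: the nested variant holds partial
/// items, while the non-nested variant holds full items.
enum PartialList {
    Replace(Vec<Partial>),
    Merged(Vec<Full>),
}

/// A borrowed item of either kind (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum ItemRef<'a> {
    Partial(&'a Partial),
    Full(&'a Full),
}

impl PartialList {
    /// `Index::index` must return `&Self::Output` for one fixed type.
    /// A regular method has no such restriction and can wrap either kind.
    fn item(&self, index: usize) -> Option<ItemRef<'_>> {
        match self {
            Self::Replace(items) => items.get(index).map(ItemRef::Partial),
            Self::Merged(items) => items.get(index).map(ItemRef::Full),
        }
    }
}
```

The cost is that callers have to match on ItemRef, so this only sidesteps the problem rather than solving it.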
Also ideally I could swap VecWithMerge::Replace(VecReplace) with VecWithMerge::Replace(Vec<MyInnerConfig>), but that too resulted in an error, due to the fact that the proc-macro doesn't handle certain tokens correctly:
1 error: comparison operators cannot be chained
--> ...
|
59 | Replace(Vec<MyInnerConfig>),
| ^ ^
|
help: use `::<...>` instead of `<...>` to specify lifetime, type, or const arguments
|
59 | Replace(Vec::< MyInnerConfig >),
| ++
What does this look like on the user side? How would they configure how the strategy works?
Given this:
#[derive(Config)]
struct MyConfig {
#[setting(nested)]
items: VecWithMerge,
}
// ... everything from the other comment
#[derive(Config)]
struct MyInnerConfig {
foo: String
}
You could do either of the following if you want the default merge behaviour:
{
"items": [{ "foo": "bar" }, { "foo": "baz" }]
}
or, if you want to define a custom merge strategy, you would do:
{
"items": {
"strategy": "append",
"items": [{ "foo": "bar" }, { "foo": "baz" }]
}
}
Both variants would deserialize to MyConfig.
Then, if another configuration file gets merged, it would use the merge strategy of prev to merge next on top of it, and then replace the merge strategy of prev with that of next, in case another layer is merged on top of that later.
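To make the hand-off concrete, here is a minimal std-only sketch of that layering, independent of Schematic. Strategy, Layer, merge_prev_dictates and merge_layers are hypothetical names, items are plain strings, and the append/prepend ordering is my assumption about how schematic::merge::append_vec and schematic::merge::prepend_vec order their results.

```rust
/// Hypothetical stand-in for the `MergeStrategy` enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    Append,
    Prepend,
    Replace,
}

/// One configuration layer: a strategy plus its items.
#[derive(Debug, Clone, PartialEq)]
struct Layer {
    strategy: Strategy,
    items: Vec<String>,
}

/// Merge `next` on top of `prev`. The strategy of `prev` decides how the
/// items combine; the strategy of `next` is carried forward so that it
/// governs the merge with the following layer.
fn merge_prev_dictates(prev: Layer, next: Layer) -> Layer {
    let items: Vec<String> = match prev.strategy {
        // Assumed ordering: previous items first, then the next ones.
        Strategy::Append => prev.items.into_iter().chain(next.items).collect(),
        // Assumed ordering: next items first, then the previous ones.
        Strategy::Prepend => next.items.into_iter().chain(prev.items).collect(),
        Strategy::Replace => next.items,
    };
    Layer {
        strategy: next.strategy,
        items,
    }
}

/// Fold all layers, lowest layer first. Returns `None` for no layers.
fn merge_layers(layers: Vec<Layer>) -> Option<Layer> {
    layers.into_iter().reduce(merge_prev_dictates)
}
```

With layers A (append, [a]), B (prepend, [b]) and C (replace, [c]) merged in that order, this yields [c, a, b]: A's append combines A and B, then B's prepend puts C's item in front. C's replace is only stored for a possible fourth layer.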
I do just now realize there's a flaw in this design, in that merge_vec_with_strategy is placed incorrectly. It currently only allows merging two values of the latter variant (i.e. VecWithMerge::Merged), whereas you want to be able to mix and match between the two variants when merging, so the merge_vec_with_strategy function needs to be placed one level higher:
#[derive(Config)]
struct MyConfig {
#[setting(nested, merge = merge_vec_with_strategy)]
items: VecWithMerge,
}
/// Merge two `VecWithMerge` values.
#[expect(clippy::trivially_copy_pass_by_ref)]
fn merge_vec_with_strategy(
    prev: VecWithMerge,
    next: VecWithMerge,
    context: &(),
) -> MergeResult<VecWithMerge> {
    // A bare list (the `Replace` variant) implies the `replace` strategy.
    let (prev_strategy, prev_items) = match prev {
        VecWithMerge::Replace(v) => (MergeStrategy::Replace, v.items),
        VecWithMerge::Merged(v) => (v.strategy, v.items),
    };
    let (next_strategy, next_items) = match next {
        VecWithMerge::Replace(v) => (MergeStrategy::Replace, v.items),
        VecWithMerge::Merged(v) => (v.strategy, v.items),
    };
    // The strategy of `prev` decides how the two lists are combined.
    let items = match prev_strategy {
        MergeStrategy::Append => schematic::merge::append_vec(prev_items, next_items, context)?,
        MergeStrategy::Prepend => schematic::merge::prepend_vec(prev_items, next_items, context)?,
        MergeStrategy::Replace => schematic::merge::replace(prev_items, next_items, context)?,
    };
    // Carry the strategy of `next` forward for the following layer.
    Ok(items.map(|items| VecWithMerge::Merged(VecWithStrategy {
        strategy: next_strategy,
        items,
    })))
}
I'm also thinking you might want to swap the logic of which strategy determines the merge result. That is, if I have files A and B, and I merge B on top of A, I want B to be able to determine what should happen with its values as it gets merged on top of other files (A in this case).
In other words, letting next dictate the merge strategy when merging on top of prev allows for a layered architecture where individual fields in layers can set that they should be the "new canonical value", instead of letting them be merged with existing values.
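A std-only sketch of that swapped behaviour, again independent of Schematic: Items is a hypothetical stand-in for VecWithMerge with strings as items, a bare list is assumed to imply replace, and the append/prepend ordering is assumed to match schematic::merge::append_vec and schematic::merge::prepend_vec.

```rust
/// Hypothetical stand-in for the `MergeStrategy` enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    Append,
    Prepend,
    Replace,
}

/// Hypothetical stand-in for `VecWithMerge`, using strings as items.
#[derive(Debug, Clone, PartialEq)]
enum Items {
    /// A bare list, which implies `Strategy::Replace`.
    Plain(Vec<String>),
    /// A list with an explicit strategy.
    WithStrategy { strategy: Strategy, items: Vec<String> },
}

impl Items {
    /// Split into the effective strategy and the items.
    fn into_parts(self) -> (Strategy, Vec<String>) {
        match self {
            Items::Plain(items) => (Strategy::Replace, items),
            Items::WithStrategy { strategy, items } => (strategy, items),
        }
    }
}

/// Merge `next` on top of `prev`, letting `next` choose the strategy.
fn merge_next_dictates(prev: Items, next: Items) -> Items {
    // The strategy stored in `prev` is ignored in this variant.
    let (_, prev_items) = prev.into_parts();
    let (strategy, next_items) = next.into_parts();
    let items: Vec<String> = match strategy {
        Strategy::Append => prev_items.into_iter().chain(next_items).collect(),
        Strategy::Prepend => next_items.into_iter().chain(prev_items).collect(),
        Strategy::Replace => next_items,
    };
    // No strategy needs to be carried forward: the following layer
    // brings its own, so a plain list is enough as the result.
    Items::Plain(items)
}
```

Here a base layer [a, b] followed by a layer with strategy append and items [c] gives [a, b, c], and a later bare list [d] replaces the whole result with [d]. One consequence of this direction is that the merge function never has to hand a strategy on to the next merge.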