zcbor test cases not working
I have successfully used https://github.com/NordicSemiconductor/zcbor for generating C code from CDDLs, but my CDDLs and their provided test cases are getting error messages when I try to run them through this tool.
Can you provide some clarification on why the errors are occurring? Is this tool missing support?
Example for https://github.com/NordicSemiconductor/zcbor/blob/main/tests/cases/corner_cases.cddl:
-
14 - onetofourbytes: uint .size 0b1..0o4, ; Testing binary and octal
- ^^^^^^^ missing definition for rule b1..0o4
-
34 - TaggedUnion = #6.4321(bool) // #6.2345(uint)
- ^^ expected rule identifier followed by an assignment token '=', '/=' or '//='
-
46 - #6.0(tentothirtybytetstr: tstr .size 10..30),
- ^^^^^^^^^^^^^^^^^^^ invalid tag syntax
-
64 - #6.10(boolval: bool),
- ^^^^^^^ invalid tag syntax
-
73 - Union = Group / MultiGroup / (3,4) / "\"hello\""
- ^ expected rule identifier followed by an assignment token '=', '/=' or '//='
-
77 - union: (7=>uint) / (-8=>uint),
- ^ missing closing delimiter
-
86 - Level1 = [Level2]
- ^^^^^^ missing definition for rule Level2
87 - Level2 = [2**3Level3] ; Use ** here to test backwards compatibility.
- ^ invalid group entry syntax
-
116 - unabstractedunion1: (choice1: 1 // choice2: 2),
- ^^ invalid group entry syntax
-
121 - upto4nils: 0*0x04nil,
- ^^ missing definition for rule il
-
140 - floats: *float,
- ^ invalid group entry syntax
-
196 - "r" : (result:result_code),
- ^ missing closing delimiter
It's not our code that is giving the errors, but the CDDL parsing library we use. I see you already made an issue there (anweiss/cddl/issues/199).
Could you provide your own CDDLs that are not working? That could be more helpful as they're probably more real-world and less edge cases (that file is called corner_cases.cddl.)
There are definitely things not supported by cddl-codegen though that would be able to parse with cddl.
Some things look a bit weird in that CDDL though:
For the line 64 one if you rewrite it a bit it should be accepted by us. e.g. change #6.10(boolval: bool) to boolval: #6.10(bool). I can't find any examples using the formatting there in the CDDL RFC, but I also can't find any using tagged members in general so maybe you can do it both ways.
For line 34 I'm confused as it uses group choice syntax // instead of type choice / but it doesn't seem to be declaring a group? At least the examples I've seen before would be inside of parenthesis marking it as a group e.g. how it's done in line 116.
I appreciate the quick feedback.
Here's a simpler CDDL based on what I'm trying to do:
CommandIDs =
(Zero: 0) /
(One: 1) /
(Two: 2) /
(Three: 3)
MyArray = [
a: uint .le 3,
b: uint .le 2,
c: uint .le 1,
]
Info = [
? MyArray,
? myBStr: bstr .size 10
]
Command = [
cmd: CommandIDs,
myInt: uint,
? Info,
]
Response = [
cmd: CommandIDs,
myInt: uint,
rsp: (Ack: 0) / (NAck: 1),
? Info
]
And here's the error:
error: parser errors
- input:4:15
-
4 - (Zero: 0) /
- ^ expected rule identifier followed by an assignment token '=', '/=' or '//='
-
29 - rsp: (Ack: 0) / (NAck: 1),
- ^ missing closing delimiter
Error: "incremental parsing error"
I welcome any feedback. Again, this is working with zcbor but not with, I guess, the CDDL parsing library in use.
What do you want the output generated to look like for CommandIDs? The way we handle basic groups like that we tend to create a struct with the fields you have there. If you instead remove the fields (they're all single constants anyway) and just make it a type choice on integers we can instead convert it into a simple C-style enum since all the fields of those basic groups are constants.
With these changes:
CommandIDs =
0 ; @name Zero
/ 1 ; @name One
/ 2 ; @name Two
/ 3 ; @name Three
AckRsp =
0 ; @name Ack
/ 1 ; @name NAck
Response = [
...
rsp: AckRsp,
...
]
Those ; @name comments aren't required, it's just a way to tell the tool exactly what you want things to be called. If you don't add them they'll be called CommandIds0, CommandIds1, etc.
Unfortunately there are two unsupported things even after rewriting those:
-
.le on numbers. We've only implemented it on array/map lengths since that's the only place we've needed it so far in our use cases - but it wouldn't be too hard to add into deserialization code. In the meantime if it's not completely necessary you could remove it and either not have the check or add it in afterwards by hand.
-
We don't support deserialization for array-encoded structs with optional fields. The general case for this can be complicated, if not even totally ambiguous e.g. situations like
[ ? uint, uint, ? (uint, text), ? text]. The situation you have there is actually something we have in one place in one of the Cardano CDDL specs that we use this tool to generate from so I might actually take a look at this today and see what I can do, at least in cases where it's totally non-ambiguous. In the meantime we still generate serialization code, we just don't implement theDeserializetrait.
e.g. after running the code I get after the fact:
Not generating Command::deserialize() - reasons:
Array with optional field info: Info
Not generating Info::deserialize() - reasons:
Array with optional field my_array: MyArray
Array with optional field my_b_str: Vec<u8>
Not generating Response::deserialize() - reasons:
field info: Info couldn't generate serialize
Array with optional field info: Info
The resulting C code for CommandIDs is this:
struct CommandIDs_ {
enum {
_CommandIDs_Zero = 0,
_CommandIDs_One = 1,
_CommandIDs_Two = 2,
_CommandIDs_Three = 3,
} _CommandIDs_choice;
};
I don't see anything in the RFC about using "; @ label" but I do seem sentences like: From 2.1.2
[A colon] is also the common way of naming elements of an array just for documentation; see Section 3.4.
This less clean solution works and produces basically the same C code (there's an extra underscore '_' for some reason):
Zero = 0
One = 1
Two = 2
Three = 3
CommandIDs = Zero / One / Two / Three
So, now I am running with this (removed all "labels")
Zero = 0
One = 1
Two = 2
Three = 3
CommandIDs =
Zero /
One /
Two /
Three
MyArray = [
uint,
uint,
uint,
]
Info = [
? MyArray,
? bstr .size 10
]
Command = [
CommandIDs,
uint,
? Info,
]
Ack = 0
NAck = 1
Response = [
CommandIDs,
uint,
Ack / NAck,
? Info
]
and am getting a crash:
-----------------------------------------
- Generating code...
------------------------------------
thread 'main' panicked at 'should not expose Fixed type in member, only needed for serializaiton: Fixed(Uint(0))', src/intermediate.rs:1426:31
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Any ideas?
Can you share what you ran with to get your "Not generating..." output? I think the zcbor solution to an array-encoded struct with option fields is by adding "_present" booleans to the generated struct, since the order has to be maintained. I think Section 3.5.3 of the RFC helps explain things.
@nslowell I got the array struct deserialization with optional fields working: #204
Note: regarding the .le on the integers we still generate without errors, they're just ignored. So the deserialization code looks something like:
let a = Ok(raw.unsigned_integer()? as u64)
.map_err(|e: DeserializeError| e.annotate("a"))?;
let b = Ok(raw.unsigned_integer()? as u64)
.map_err(|e: DeserializeError| e.annotate("b"))?;
let c = Ok(raw.unsigned_integer()? as u64)
.map_err(|e: DeserializeError| e.annotate("c"))?;
without any checks.
So now with the above change to the CommandIds choice + Ack one and the PR I linked your code generates code, just without the .le checks for now. I might take a look later to see if it's a trivial change to add in a quick error check to deserialization code. If it is I can make a PR, if not it's not a high priority for us since we haven't needed to use it for our own generation purposes (outside of ones that just specifying rust primitives like uint .le 4294967295 and such, which we do support as a way to generate the code as u32, etc instead of u64).
I don't see anything in the RFC about using "; @ label" but I do seem sentences like:
It's a non-standard thing. The RFC specifies comments as ; but they don't impact the underlying CBOR. We just use comments as a way to tell cddl-codegen certain things about how we want the generated output code. You could omit them and the underlying CBOR supported by the output generated code wouldn't change, just how the code looks like.
Here's the code I used using the default CLI options:
CommandIDs =
0 ; @name Zero
/ 1 ; @name One
/ 2 ; @name Two
/ 3 ; @name Three
MyArray = [
a: uint, ; .le 3,
b: uint, ; .le 2,
c: uint, ; .le 1,
]
Info = [
? MyArray,
? myBStr: bstr .size 10
]
Command = [
cmd: CommandIDs,
myInt: uint,
? Info,
]
AckRsp =
0 ; @name Ack
/ 1 ; @name NAck
Response = [
cmd: CommandIDs,
myInt: uint,
rsp: AckRsp,
? Info
]
Note: obviously if you run using master you won't get the Deserialize trait generated for some of those types, but if you run on that PR I linked the deserialization should be generated.
[A colon] is also the common way of naming elements of an array just for documentation; see Section 3.4.
Yes, we do that in general. It's just that when you make a basic group like (zero: 0) our generator will then think you want a struct and it will generate pub struct Zero; pub struct One, etc then pub enum CommandIDs { Zero(Zero), One(One), ... }. If this is a common CDDL pattern for enums we could maybe add in a special case for it. The reason it's generating these structs is we just have the general group code generation which makes more sense when the fields aren't fixed e.g. vec3 = [x: uint, y: uint, z: uint] would generate pub struct Vec3 { pub x: u64, pub y: u64, pub z: u64 } etc.
Sorry the delay.
I ran with your exact CDDL and am getting a crash. See below. Any ideas?
> RUST_BACKTRACE=full ~/cddl-codegen/target/release/cddl-codegen --input Test2.cddl --output export
int
int
Switching from scope 'lib' to 'lib'
CommandIDs
MyArray
Info
Command
AckRsp
Response
_CDDL_CODEGEN_EXTERN_TYPE_
_CDDL_CODEGEN_RAW_BYTES_TYPE_
------------------------------------------
- Handling rule: lib:AckRsp
------------------------------------
AckRsp
------------------------------------------
- Handling rule: lib:CommandIDs
------------------------------------
CommandIDs
------------------------------------------
- Handling rule: lib:MyArray
------------------------------------
MyArray
------------------------------------------
- Handling rule: lib:Info
------------------------------------
Info
MyArray
MyArray
------------------------------------------
- Handling rule: lib:Command
------------------------------------
Command
CommandIDs
CommandIDs
Info
Info
------------------------------------------
- Handling rule: lib:Response
------------------------------------
Response
CommandIDs
CommandIDs
AckRsp
AckRsp
Info
Info
------------------------------------------
- Handling rule: lib:_CDDL_CODEGEN_EXTERN_TYPE_
------------------------------------
_CDDL_CODEGEN_EXTERN_TYPE_
------------------------------------------
- Handling rule: lib:_CDDL_CODEGEN_RAW_BYTES_TYPE_
------------------------------------
_CDDL_CODEGEN_RAW_BYTES_TYPE_
-----------------------------------------
- Generating code...
------------------------------------
thread 'main' panicked at 'can't read: static/serialization.rs', src/generation.rs:225:21
stack backtrace:
0: 0x5641b7c903b1 - std::backtrace_rs::backtrace::libunwind::trace::h81b3ec7e24504ef6
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x5641b7c903b1 - std::backtrace_rs::backtrace::trace_unsynchronized::h06b108a8ac6d8321
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x5641b7c903b1 - std::sys_common::backtrace::_print_fmt::hddd2a53a0a5c1518
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:65:5
3: 0x5641b7c903b1 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h8286d7e4527fc2ef
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:44:22
4: 0x5641b7cbb0cc - core::fmt::rt::Argument::fmt::h8fd89467fbfd7337
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/fmt/rt.rs:138:9
5: 0x5641b7cbb0cc - core::fmt::write::h51943d696b01a399
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/fmt/mod.rs:1094:21
6: 0x5641b7c8d68e - std::io::Write::write_fmt::h917c762025be867c
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/io/mod.rs:1714:15
7: 0x5641b7c901c5 - std::sys_common::backtrace::_print::h80196d93ed8f3afc
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:47:5
8: 0x5641b7c901c5 - std::sys_common::backtrace::print::hfc2f9057cab2ca91
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:34:9
9: 0x5641b7c9195a - std::panicking::panic_hook_with_disk_dump::{{closure}}::h044c489edf02dba6
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:278:22
10: 0x5641b7c915f3 - std::panicking::panic_hook_with_disk_dump::hc012686c97dee411
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:312:9
11: 0x5641b7c91f5b - std::panicking::default_hook::h38859ccab48a30da
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:239:5
12: 0x5641b7c91f5b - std::panicking::rust_panic_with_hook::hbec509e90f57ee70
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:729:13
13: 0x5641b7c91e57 - std::panicking::begin_panic_handler::{{closure}}::h24e3657f27ebdc7b
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:621:13
14: 0x5641b7c90816 - std::sys_common::backtrace::__rust_end_short_backtrace::h0fb71023354b550a
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/sys_common/backtrace.rs:151:18
15: 0x5641b7c91ba2 - rust_begin_unwind
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:617:5
16: 0x5641b79be133 - core::panicking::panic_fmt::haa55128da9cd75d4
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/panicking.rs:67:14
17: 0x5641b79ece7c - cddl_codegen::generation::GenerationScope::export::he5da078ac6a96e1a
18: 0x5641b7aa257f - cddl_codegen::main::h579e05b56747a76c
19: 0x5641b7aa71d3 - std::sys_common::backtrace::__rust_begin_short_backtrace::hf7258d21fb51bae9
20: 0x5641b7aac421 - std::rt::lang_start::{{closure}}::hd7b4de7351dc80e1
21: 0x5641b7c86ffb - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h2240a7a2e9d7937c
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/ops/function.rs:284:13
22: 0x5641b7c86ffb - std::panicking::try::do_call::h6d94044a2dac5948
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:524:40
23: 0x5641b7c86ffb - std::panicking::try::hc16721a649b6347e
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:488:19
24: 0x5641b7c86ffb - std::panic::catch_unwind::ha5ec842243b12570
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panic.rs:142:14
25: 0x5641b7c86ffb - std::rt::lang_start_internal::{{closure}}::h718bb866ab67d475
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/rt.rs:148:48
26: 0x5641b7c86ffb - std::panicking::try::do_call::hca57e22a5a3d68fb
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:524:40
27: 0x5641b7c86ffb - std::panicking::try::hfd3cbded253620e6
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:488:19
28: 0x5641b7c86ffb - std::panic::catch_unwind::h1fa2ff1a7cb410d5
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panic.rs:142:14
29: 0x5641b7c86ffb - std::rt::lang_start_internal::h1dfc41da9ec083ca
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/rt.rs:148:20
30: 0x5641b7aa2c85 - main
31: 0x7f4218c2a083 - __libc_start_main
at /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16
32: 0x5641b79be85e - _start
33: 0x0 - <unknown>
@nslowell I think it depends where you're running it from. So those files in /static/ aren't being located. Try running it from the cddl-codegen directory. Or pass in --static-dir=<location of /static/> so it knows where to find them.
cool, thanks. I was able to produce code with your sample you gave me.
So, now I've got this CDDL that's causing a different crash:
CommandIDs = (
Zero: 0 /
One: 1 /
Two: 2 /
Three: 3 /
Four: 4 /
Five: 5 /
Six: 6 /
Seven: 7
)
u32 = uint .size 4
Version = [
major: u32,
minor: u32,
patch: u32,
]
Info = [
? Version,
? myStr1: bstr .size 10,
? myStr2: [*bstr],
? myStr3: [*bstr],
? myStr4: [*bstr],
? myInt: u32,
]
Command = [
cmd: CommandIDs,
value: u32,
? Info,
]
rsp = (Ack: 0 / NAck: 1)
Response = [
cmd: CommandIDs,
resp: rsp,
? Info,
]
backtrace:
-----------------------------------------
- Generating code...
------------------------------------
BytesList
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/intermediate.rs:1129:64
stack backtrace:
0: rust_begin_unwind
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/std/src/panicking.rs:617:5
1: core::panicking::panic_fmt
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/panicking.rs:67:14
2: core::panicking::panic
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/panicking.rs:117:5
3: core::option::Option<T>::unwrap
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/option.rs:935:21
4: cddl_codegen::intermediate::RustType::cbor_types
at ./src/intermediate.rs:1129:39
5: cddl_codegen::intermediate::EnumVariant::cbor_types
at ./src/intermediate.rs:1946:46
6: cddl_codegen::generation::generate_enum
at ./src/generation.rs:6140:31
7: cddl_codegen::generation::GenerationScope::generate_type_choices_from_variants
at ./src/generation.rs:2947:9
8: cddl_codegen::generation::GenerationScope::generate
at ./src/generation.rs:625:25
9: cddl_codegen::main
at ./src/main.rs:152:5
10: core::ops::function::FnOnce::call_once
at /rustc/864bdf7843e1ceabc824ed86d97006acad6af643/library/core/src/ops/function.rs:250:5
@nslowell If you rewrite those enums as mentioned earlier in the issue it won't panic e.g.:
CommandIDs =
0 ; @name Zero
/ 1 ; @name One
/ 2 ; @name Two
/ 3 ; @name Three
/ 4 ; @name Four
/ 5 ; @name Five
/ 6 ; @name Six
/ 7 ; @name Seven
rsp = 0 ; @name Ack
/ 1 ; @name NAck
Unfortunately even with #204 we can't emit deserialization traits (serialization works though) because the optional array fields are ambiguous.
Look here:
? myStr2: [*bstr],
? myStr3: [*bstr],
? myStr4: [*bstr],
Let's say you're deserializing and you get two arrays of bytes. How will the code know if they should be myStr2 and myStr3? Or myStr2 and myStr4? Or myStr3 and myStr4?