Make approach to defining enums consistent across the project
#654 introduces the enum type Ecosystem implemented differently to other enums (e.g. analysis.Mode and staticanalysis.Task and analysisrun.DynamicPhase)
We should develop a consistent approach to how we define enums:
Some areas that should be covered:
- should the base type be string vs numeric
- how do we handle lists of enums
- how do enums interact with flags
- how do enums handle serialization (JSON)
Both:
- require String()
Numeric based enums:
- can use
iota - can consume as little as 8bits (byte)
- very fast to compare equality
-
encoding.TextMarshalerandencoding.TextUnmarshalerneeded to supportflag.TextVarand json serialization - requires strings to be specified for marshaling and unmarshaling (although an array of strings may make this easier)
String based enums:
- consume 128 bits per enum
- simpler String() function and serialization
- equality is more costly (at minimum two comparisons, maximum the size of the shortest string)
Some research:
- https://www.practical-go-lessons.com/chap-27-enum-iota-and-bitmask - proposes int + iota and implementing fmt.Stringer() and serialization interfaces.
- https://threedots.tech/post/safer-enums-in-go/ - documents each style, suggests strings are better for URL slugs
- Go stdlib:
- https://github.com/golang/go/blob/master/src/time/time.go#L302 - uses int, and a
[]stringto associate constant with string - https://github.com/golang/go/blob/master/src/net/interface.go#L39 - uses uint as bitmask, and a
[]stringto associate constant with string
- https://github.com/golang/go/blob/master/src/time/time.go#L302 - uses int, and a
- GitHub:
- https://cs.github.com/TencentBlueKing/bk-cmdb/blob/53fa5996b3ddff2ec8e9dba039452ce46ce26dca/src/common/blog/glog/glog.go - uses int32 and
[]stringto associate constant with string, implementsflag.Value(origin: https://code.google.com/p/google-glog) - https://github.com/uber-go/zap/blob/master/zapcore/level.go - uses int8 and switches to implement various interfaces.
- https://github.com/zhlhahaha/gvisor/blob/master/runsc/config/config.go - uses int and switches to implement
flag.Value - https://github.com/dt/cockroach/blob/9fd4ce118639ff09d11fffd85fbad6d42fa727c4/pkg/cli/flags_util.go#L156 - dumpMode uses int, and switches to implement
flag.Value
- https://cs.github.com/TencentBlueKing/bk-cmdb/blob/53fa5996b3ddff2ec8e9dba039452ce46ce26dca/src/common/blog/glog/glog.go - uses int32 and
As far as flag.TextVar usage go, the usage is currently super low since the API was only recently introduced.
The standard approaches I've seen for enums is either:
type Foo int
const (
A Foo = iota
B
C
)
func (f Foo) String() string {
switch f {
case A:
return "A"
case B:
return "B"
case C:
return "C"
default:
return "<unknown>"
}
}
or
type Foo int
const (
A Foo = iota
B
C
)
fooStrings = []string{"A","B","C"} // this can also be map[Foo]string
func (f Foo) String() string {
// lookup string in fooStrings and return it
}
My proposal:
- use numeric based enum types (e.g.
int,int8) withiota- rationale: this approach is almost universally used
- use
[]stringormap[x]stringto associate const to string, and for parsing- rationale: while switches are more common, this approaches reduces string duplication, and supports producing help text for flags
- implement
encoding.TextMarshalerandencoding.TextUnmarshalerrather thanflag.Valueand JSON marshalers- rationale:
encoding.TextMarshalerandencoding.TextUnmarshalerare both usable with the newflag.TextVarand in-place of JSON marshalers, requiring only 2 methods instead of 4.
- rationale:
We should definitely have a consistent set of functions to work with types used as enums.
Perhaps the decision of type could be decided case-by-case? e.g. in some cases having a string name defined as the enum could be more meaningful, as described in the reference above. But happy to restrict ourselves to int if that's what we want.
For numeric based enums, I think the idea of using a slice or map to define the string values for the enum in a single place is a very good one.
Building on Caleb's proposal, how about the following five functions/methods for type Foo which is used like an enum:
-
String()string implementingfmt.Stringer -
FooFromString(name string) (Foo, error)implemented like this (from the same reference linked above), perhaps with the error replaced by a boolean true/false like in task.go -
MarshalText()(text []byte, err error)implementingencoding.TextMarshalerfor serialisation and flag usages (callsString()) -
UnmarshalText(text []byte) errorimplementingencoding.TextUnmarshalerfor deserialisation and flag interfaces (callsFooFromString()) -
FooValues() []Foowhich returns a slice containing all valid enum values.
Methods 1 and 2 are 'nicer' to use in internal code with strings, while 3 and 4 are implementing external interfaces which are designed to be very general and so are a little bit less readable.
I agree with everything - except for naming of (2) FooFromString.
Usually methods for returning a type from a string are named Parse (i.e when the package is named foo) or ParseFoo (e.g. https://pkg.go.dev/net/url#Parse, https://pkg.go.dev/time#Parse, https://pkg.go.dev/net/mail#ParseAddress, and https://pkg.go.dev/strconv).
Oh, and for FooValues() I'd accept Values() []Foo as well, if the package is named foo (so foo.Values() instead of foo.FooValues()).
I'd also add that implementing ParseFoo(), encoding.TextMarshaler and encoding.TextUnmarshaler could be optional.
But if an enum is used in a flag it must implement all three.
Sure, naming the 2nd method ParseFoo() or foo.Parse() depending on the containing package name and/or size sounds good to me.
Should we consistently define a none/empty value for each enum?