Large output size on jsoo
When using this library with jsoo, a single reference to the library adds 64k to the js output. This is 20% of the total size of the application. This happens for instance with just a single definition of a function like let x = Yojson.Basic.to_string, with no uses. I'm not sure if this is an issue with jsoo that would require a more advanced optimizer, or with dependencies of the library that could perhaps be reduced to avoid pulling in so much code, but since json is a built-in feature of browsers, one would expect it to have relatively low overhead.
Hi, @joprice, I don't know what exactly is creating that much code, but in #74 and #90 we're removing direct and transitive dependencies on Biniou and easy-format which will most likely help reducing the size. I would suggest we get these merged and take a look whether this helped.
If you want, you can try yourself by taking the master version and merging both PRs in and seeing how much the size goes down.
Sounds good. I'll try that out when I get a second.
Short update: both PRs are merged now and scheduled to be released as part of 2.0.0.
I tried out the master branch and I don't see a difference with 1.7.0. However, I am seeing a smaller overall increase from the library of 36k, probably due to optimizations in the newer version jsoo version I'm using since I tried this last year.
splitting yojson into different compilation units (while keeping the same API) would allow the ocaml linker to omit some units if not used (thanks to module alias). It's unlikely that one would need all 3 of Basic, Safe, Raw.
Such change would only improve the size when using separate compilation as jsoo would likely be able to drop theses units during whole program compilation.
That sounds like a very good idea, I'll try to split these apart. In particular, probably next to nobody will need Raw anyway, so it can probably be omitted most of the time.
Splitting took a moment since there was some weirdness in there, but here we go: https://github.com/ocaml-community/yojson/pull/167
After splitting, here is the size (in bytes) of individual units.
A quick look at Basic Raw and Safe reveal that the lexer is responsible for half of the size (maybe more).
270 Yojson__.js
628 Yojson.js
879 Yojson__Common.js
1524 Yojson__Codec.js
16643 Yojson__T.js
79978 Yojson__Basic.js
80284 Yojson__Raw.js
84606 Yojson__Safe.js
Adding a reference to yojson can also force CamlinternalFormat to be included, adding another 50k or so.
Running a very small experiment, I see that using Yojson.Basic.to_string (assuming CamlinternalFormat is already included) make my program jump from 84k to 129k (+45k). It turns out that the current implementation prevent jsoo from eliminating the (dead code) lexer. I've a patch that makes my experiment jump to 88k only (+4k).
A deeper look at read.mll, there are 3 lexers/parsers in there (normal, skip and buffer variants) which are duplicated 3 times (for Basic, Safe and Raw modules). It might be possible to reduce logic duplication. At least, it seems that some code could trivially be shared between the 3 modules, such as
val read_space : lexer_state -> Lexing.lexbuf -> unit
val read_eof : Lexing.lexbuf -> bool
val read_null : lexer_state -> Lexing.lexbuf -> unit
val read_null_if_possible : lexer_state -> Lexing.lexbuf -> bool
val read_bool : lexer_state -> Lexing.lexbuf -> bool
val read_int : lexer_state -> Lexing.lexbuf -> int
val read_int8 : lexer_state -> Lexing.lexbuf -> char
val read_int32 : lexer_state -> Lexing.lexbuf -> int32
val read_int64 : lexer_state -> Lexing.lexbuf -> int64
val read_number : lexer_state -> Lexing.lexbuf -> float
val skip_ident : lexer_state -> Lexing.lexbuf -> unit
...
Thanks @hhugo for your investigation. I'll try to reduce the duplication a bit next time I get to work on it.
I don't think we can remove the reference to CamlinternalFormat easily, since Format.formatter is part of the API (for sensible reasons).
Just to make sure my comment doesn't get lost in the noise above. https://github.com/ocaml-community/yojson/pull/168 allows jsoo to remove lexers if no used.
2.1.1 has been released with all these improvements, thanks for the help!