Can't deserialize a string that looks like a float properly
Under Syck if you register a domain type, the block will get the tag name and the value yielded to it as they are in the actual raw YAML.
In Psych however the value string has already been tokenized by the call at the bottom of Psych::Visitors::ToRuby#deserialize, before #accept is called.
This is a problem because if your serialized value string happens to look like a floating point number, then it will get parsed as a Float and your deserializer won't be able to get the original string so that it can parse it properly (without float's precision problems).
Is there another way to add domain class parsing in a way that won't get tokenized? It seems that at the moment we have to extend the big case statement in #deserialize (or use alias_method_chain to add a special case).
If not could we maybe rearrange slightly so that #tokenize is not called before #accept?
I see. What would you like passed to the block? The scalar value, or the node? I think the node makes more sense, but it's up to you.
Yeah, good question - personally I would only want the string (which IIRC is what we got from Syck), but I can see the case for returning the node.
If there's already other public APIs in Psych that expect callers to work with Node objects then that seems reasonable and is perhaps more future-proof? Otherwise I'd keep it simple and use strings.
Ya, the problem with strings is that you could register a handler for a complex type like a hash. In that case, passing a string wouldn't be possible.
Aaron Patterson http://tenderlovemaking.com/ I'm on an iPhone so I apologize for top posting.
On Nov 29, 2013, at 6:02 PM, Will Bryant [email protected] wrote:
Yeah, good question - personally I would only want the string (which IIRC is what we got from Syck), but I can see the case for returning the node.
If there's already other public APIs in Psych that expect callers to work with Node objects then that seems reasonable and it perhaps more future-proof? Otherwise I'd keep it simple and use strings.
Reply to this email directly or view it on GitHub.
That's interesting - are you thinking of an API that will hand over both the original representation (node) and the default parsed representation (potentially a hash)?
No, just the node. It's pretty easy to convert the node to Ruby (off the top of my head you just call to_ruby on the node), but I can't represent say a has stricture as a string. I'd have to pass you the node.
I'll write an example, and maybe that will clear it up!
Aaron Patterson http://tenderlovemaking.com/ I'm on an iPhone so I apologize for top posting.
On Nov 29, 2013, at 8:58 PM, Will Bryant [email protected] wrote:
That's interesting - are you thinking of an API that will hand over both the original representation (node) and the default parsed representation (potentially a hash)?
Reply to this email directly or view it on GitHub.
Here is an example:
require 'psych'
Psych.add_domain_type 'foo.bar,2002', 'foo' do |type, val|
p [val, val.class]
end
Psych.load('--- !foo.bar,2002/foo hello')
Psych.load('--- !foo.bar,2002/foo { "foo":"hello" }')
The domain type can be used for anything (Scalars, Arrays, Hashes). It would make sense for val to be a consistent type (always a String object, etc). The only consistent type I could provide is an AST node object.
Can I ask what you're using this for? I'd like to improve this API, but I don't have any real-world examples.
We have a BigDecimal serialisation, which we put in in 2007 before there was any standard format. At that time we used Syck:
class BigDecimal
def to_yaml(opts = {}) # if this mysterious stops working when you upgrade, check for new core extensions
YAML::quick_emit(object_id, opts) do |out|
out.scalar("tag:ruby.yaml.org,2002:decimal", to_s) # this is a standard Ruby type but not a standard Ruby YAML type, so this is slightly naughty. if the community ever does set a standard, we can easily convert ours...
end
end
end
YAML.add_ruby_type("decimal") {|type, val| BigDecimal.new(val)}
But now with Psych, to avoid the values getting converted to Float values before we get them (and so losing precision), I found we had to extend the whole method (or monkeypatch to add to the case statement inside Psych::Visitors::ToRuby#deserialize):
class BigDecimal
def encode_with(coder)
coder.scalar "!ruby/decimal", to_s
end
end
class Psych::Visitors::ToRuby
def deserialize_with_ruby_decimal(o)
if o.tag == "!ruby/decimal"
BigDecimal.new(o.value)
else
deserialize_without_ruby_decimal(o)
end
end
alias_method_chain :deserialize, :ruby_decimal
end