
Faster JSON parser

Open madejejej opened this issue 4 years ago • 0 comments

We sometimes fetch a lot of data from ClickHouse, and the default Faraday JSON parser is quite slow. Patching the JSON library with the Oj gem would help, but I can't do that in my project.

I can think of two different solutions to that problem:

  1. Provide a custom middleware that uses MultiJson (https://github.com/intridea/multi_json), which automatically picks the most performant JSON library available. The Elasticsearch gem already does this: https://github.com/elastic/elastic-transport-ruby/blob/50dc0216789aadbe3dc6e2c7283128a7615ed627/lib/elastic/transport/transport/serializer/multi_json.rb#L40
  2. Allow a user to customize building the Faraday client in ClickHouse::Connection#transport. Perhaps that could be a part of the gem configuration? For example:
ClickHouse.config do |config|
  # You can override config.faraday with a proc. If you don't, a default
  # proc in the config builds the Faraday connection.
  config.faraday do |conn|
    conn.options.timeout = config.timeout
    conn.options.open_timeout = config.open_timeout
    conn.headers = config.headers
    conn.ssl.verify = config.ssl_verify
    conn.request(:basic_auth, config.username, config.password) if config.auth?
    conn.response Middleware::RaiseError
    conn.response Middleware::Logging, logger: config.logger!
    conn.response :json, content_type: %r{application/json}
    conn.response Middleware::ParseCsv, content_type: %r{text/csv}
    conn.adapter config.adapter
  end
end

class ClickHouse
  class Connection
    def transport
      @transport ||= config.faraday.call(config)
    end
  end
end
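
For comparison, here is a rough sketch of what the middleware from option 1 might look like. `FastJson` and `FastJsonResponse` are hypothetical names, not part of the gem. The sketch assumes the middleware only needs the `initialize(app, ...)` / `call(env)` shape that Faraday handlers follow, and it falls back to the stdlib parser when MultiJson is not installed.

```ruby
require "json"

# Hypothetical helper: use MultiJson when the gem is available (it picks the
# fastest installed backend, e.g. Oj), otherwise fall back to stdlib JSON.
module FastJson
  begin
    require "multi_json"
    PARSER = ->(body) { MultiJson.load(body) }
  rescue LoadError
    PARSER = ->(body) { JSON.parse(body) }
  end

  def self.parse(body)
    PARSER.call(body)
  end
end

# Hypothetical response middleware. It replaces the response body with the
# parsed structure when the content type matches and the body is non-empty.
class FastJsonResponse
  def initialize(app, content_type: %r{application/json})
    @app = app
    @content_type = content_type
  end

  def call(env)
    @app.call(env).on_complete { |response_env| parse!(response_env) }
  end

  def parse!(env)
    type = env[:response_headers]["Content-Type"].to_s
    body = env[:body]
    return unless type.match?(@content_type)
    return unless body.is_a?(String) && !body.strip.empty?

    env[:body] = FastJson.parse(body)
  end
end
```

If this approach were adopted, the middleware would presumably be registered with something like `conn.use FastJsonResponse` in place of `conn.response :json, content_type: %r{application/json}`. I have not benchmarked it against the default parser.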

I'm happy to start a PR for either solution, but I wanted to get your opinion first.

madejejej, Apr 25 '22 12:04