tech.ml

generalise sobol-gridsearch to work with arbitrary data structures

Open behrica opened this issue 5 years ago • 0 comments

This opens the path to grid search over pipeline definitions.

Specifically, it should be able to transform this:

[[:ds/select-columns [:Text :Score]]
 [:ds/update-column :Score #(map dec %)]
 [:ds-mod/set-inference-target :Score]
 [:nlp/count-vectorize :Text :bow :nlp/default-text->bow
  {:stopwords (ml-gs/categorical [nil :default :google :comprehensive])}]
 [:nb/bow->SparseArray :bow :bow-sparse {:vocab-size (ml-gs/linear 100 10000)}]
 [:ml/train {:model-type   :discrete-naive-bayes
             :discrete-naive-bayes-model :multinomial
             :sparse-column :bow-sparse
             :nb-model-hyper-parameter-x (ml-gs/linear 0.0 1.0)}]]


into a list of "copies" of this data structure, in which the gridsearch definitions are replaced by concrete values.
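
To make the intended transformation concrete, here is a minimal sketch in plain Clojure. It is not the tech.ml implementation: `categorical` and `linear` are hypothetical stand-ins for `ml-gs/categorical` and `ml-gs/linear` (the extra `n-steps` argument is an assumption), and it enumerates a full Cartesian grid instead of sampling a Sobol sequence. It only handles nested vectors and maps.

```clojure
;; Hypothetical markers standing in for ml-gs/categorical and ml-gs/linear.
(defn categorical [values]
  {::gridsearch :categorical :values (vec values)})

(defn linear [start end n-steps]          ; n-steps is an assumption, must be >= 2
  {::gridsearch :linear :start start :end end :n-steps n-steps})

(defn gridsearch-spec? [x]
  (and (map? x) (contains? x ::gridsearch)))

(defn candidates
  "Concrete values a single gridsearch marker can take."
  [spec]
  (case (::gridsearch spec)
    :categorical (:values spec)
    :linear (let [{:keys [start end n-steps]} spec
                  step (/ (- end start) (dec n-steps))]
              (mapv #(+ start (* % step)) (range n-steps)))))

(defn spec-paths
  "Paths (usable with get-in / assoc-in) to every marker in nested vectors and maps."
  ([data] (spec-paths data []))
  ([data path]
   (cond
     (gridsearch-spec? data) [path]
     (map? data)    (mapcat (fn [[k v]] (spec-paths v (conj path k))) data)
     (vector? data) (mapcat (fn [i v] (spec-paths v (conj path i))) (range) data)
     :else [])))

(defn cartesian
  "All combinations that take one element from each collection, in order."
  [colls]
  (reduce (fn [acc coll] (for [combo acc, x coll] (conj combo x)))
          [[]]
          colls))

(defn expand-grid
  "Returns one copy of data per combination, with every marker replaced by a value."
  [data]
  (let [paths   (vec (spec-paths data))
        options (map #(candidates (get-in data %)) paths)]
    (for [combo (cartesian options)]
      (reduce (fn [d [path v]]
                (if (empty? path) v (assoc-in d path v)))
              data
              (map vector paths combo)))))

;; A shortened version of the pipeline above, using the stand-in markers.
(def pipeline
  [[:ds/select-columns [:Text :Score]]
   [:nlp/count-vectorize :Text :bow {:stopwords (categorical [nil :default])}]
   [:ml/train {:model-type :discrete-naive-bayes
               :nb-model-hyper-parameter-x (linear 0.0 1.0 3)}]])

(def pipelines (expand-grid pipeline))
```

Here `pipelines` holds 2 × 3 = 6 concrete pipeline definitions. Addressing the markers by path rather than by value keeps two identical markers in different positions independent of each other; a Sobol-based variant would presumably swap `cartesian` for a sampler over the same paths.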

See discussion here: https://clojurians.zulipchat.com/#narrow/stream/236259-tech.2Eml.2Edataset.2Edev/topic/couple.20tech.2Eml.20to.20tablecloth.20pipeline.20concept.20.3F

behrica, Jan 04 '21 20:01