Failed to calculate profile file from command line
Hi!
I'm trying to calculate a profile file from the command line in a Jupyter Notebook (from a cell), but it returns the following error:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 183, in <module>
parsed.handler(**parsed.__dict__)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 114, in calculate_profile
sampling = __get_not_none(opts_data, "sampling", {})
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 53, in __get_not_none
return default if src.get(key, None) is None else src.get(key)
AttributeError: 'str' object has no attribute 'get'
My code:
import json
config = {
    "data_format": {
        "separator": ",",
        "header": False,
        "date_column": None
    },
    "column_mapping": {},
    "dashboard_tabs": ["regression_perfomance"],
    "pretty_print": True
}
json_string = json.dumps(config)
with open('config.json', 'w') as outfile:
    json.dump(json_string, outfile)
!python -m evidently calculate profile --config config.json --reference reference.csv --current current.csv --output reports --report_name profile.json
UPDATE
I tried adding a "sampling" section to config.json, but the error is still there:
import json
config = {
    "data_format": {
        "separator": ",",
        "header": False,
        "date_column": None
    },
    "column_mapping": {},
    "dashboard_tabs": ["regression_perfomance"],
    "pretty_print": True,
    "sampling": {
        "reference": {
            "type": "none"
        },
        "current": {
            "type": "nth",
            "n": 2
        }
    }
}
json_string = json.dumps(config)
with open('config.json', 'w') as outfile:
    json.dump(json_string, outfile)
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 183, in <module>
parsed.handler(**parsed.__dict__)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 114, in calculate_profile
sampling = __get_not_none(opts_data, "sampling", {})
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 53, in __get_not_none
return default if src.get(key, None) is None else src.get(key)
AttributeError: 'str' object has no attribute 'get'
Hi @jenoOvchi,
In one of the recent updates, we changed the structure of config.json, and dashboard_tabs should be a dictionary.
You can see an example of config.json in the repo (https://github.com/evidentlyai/evidently/blob/main/config.json).
For a clean test I tried this config.json as is, but the error is still there:
import json
config = {
    "data_format": {
        "separator": ",",
        "header": True,
        "date_column": "dteday"
    },
    "column_mapping": {},
    "dashboard_tabs": {
        "data_drift": {
        },
        "cat_target_drift": {
            "verbose_level": 0
        }
    },
    "options": {
        "data_drift": {
            "confidence": 0.95,
            "drift_share": 0.5,
            "nbinsx": None,  # null in the JSON example; Python spells it None
            "xbins": None
        }
    },
    "pretty_print": True,
    "sampling": {
        "reference": {
            "type": "none",
            "n": 1,
            "ratio": 0.1
        },
        "current": {
            "type": "nth",
            "n": 2,
            "ratio": 0.1
        }
    }
}
json_string = json.dumps(config)
with open('config.json', 'w') as outfile:
    json.dump(json_string, outfile)
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 183, in <module>
parsed.handler(**parsed.__dict__)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 114, in calculate_profile
sampling = __get_not_none(opts_data, "sampling", {})
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 53, in __get_not_none
return default if src.get(key, None) is None else src.get(key)
AttributeError: 'str' object has no attribute 'get'
I think the problem is not with "dashboard_tabs" but with the "sampling" section.
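Oh, sorry, my bad. In your code snippet, you are writing the JSON to the file incorrectly: json.dumps already returns a string, and passing that string to json.dump encodes it a second time, so the file contains a single JSON string instead of an object.
It should be something like:
with open('config.json', 'w') as outfile:
    json.dump(config, outfile)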
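To make the difference concrete, here is a minimal sketch that writes the same dictionary both ways and reads it back. It uses only the standard library; the trimmed-down config and the helper name write_and_reload are illustrative assumptions, not part of evidently.

```python
import json
import os
import tempfile

# A trimmed-down config, for illustration only.
config = {"pretty_print": True, "sampling": {"current": {"type": "nth", "n": 2}}}


def write_and_reload(payload, path):
    """Write payload with json.dump, then read it back with json.load."""
    with open(path, "w") as outfile:
        json.dump(payload, outfile)
    with open(path) as infile:
        return json.load(infile)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    # Double encoding: json.dumps makes a string, json.dump encodes it again,
    # so the file holds one JSON string literal.
    double_encoded = write_and_reload(json.dumps(config), path)
    # Single encoding: the file holds a JSON object.
    single_encoded = write_and_reload(config, path)

print(type(double_encoded).__name__)  # str
print(type(single_encoded).__name__)  # dict
```

A str has no .get method, which matches the AttributeError in the tracebacks above.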
That works for me, thanks! But I ran into some other errors before I got a correct config written:
- A "profile_sections" section is still required for the command to run, and its format differs from the documentation examples: "profile_sections":{"data_drift": {}}. Error without "profile_sections":
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 183, in <module>
parsed.handler(**parsed.__dict__)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 117, in calculate_profile
usage["parts"] = opts_data["profile_sections"]
KeyError: 'profile_sections'
- "nbinsx": null, taken from the config in master, is rejected as an invalid value:
INFO:root:reference dataset loaded: 50 rows
INFO:root:current dataset loaded: 25 rows
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 183, in <module>
parsed.handler(**parsed.__dict__)
File "/opt/conda/lib/python3.9/site-packages/evidently/__main__.py", line 146, in calculate_profile
runner.run()
File "/opt/conda/lib/python3.9/site-packages/evidently/runner/profile_runner.py", line 52, in run
profile.calculate(reference_data, current_data, self.options.column_mapping)
File "/opt/conda/lib/python3.9/site-packages/evidently/model_profile/model_profile.py", line 31, in calculate
self.execute(reference_data, current_data, column_mapping)
File "/opt/conda/lib/python3.9/site-packages/evidently/pipeline/pipeline.py", line 45, in execute
instance.calculate(rdata, cdata, column_mapping)
File "/opt/conda/lib/python3.9/site-packages/evidently/analyzers/data_drift_analyzer.py", line 84, in calculate
current_nbinsx = data_drift_options.get_nbinsx(feature_name)
File "/opt/conda/lib/python3.9/site-packages/evidently/options/data_drift.py", line 39, in get_nbinsx
raise ValueError(f"DataDriftOptions.nbinsx is incorrect type {type(self.nbinsx)}")
ValueError: DataDriftOptions.nbinsx is incorrect type <class 'NoneType'>
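Putting the two findings together, a config that sidesteps both errors might look like the sketch below. It assumes the installed version behaves as in the tracebacks above: "profile_sections" is required, and "nbinsx" must not be null. Leaving "nbinsx" and "xbins" out entirely rests on the further assumption that defaults apply when the keys are absent. All other keys are copied from the configs earlier in this thread, and this exact combination has not been verified against the CLI.

```python
import json

config = {
    "data_format": {"separator": ",", "header": True, "date_column": "dteday"},
    "column_mapping": {},
    # Required by the CLI version in the tracebacks (KeyError otherwise).
    "profile_sections": {"data_drift": {}},
    # "nbinsx" and "xbins" are omitted rather than set to null (assumption:
    # the library falls back to defaults when they are absent).
    "options": {"data_drift": {"confidence": 0.95, "drift_share": 0.5}},
    "pretty_print": True,
    "sampling": {
        "reference": {"type": "none"},
        "current": {"type": "nth", "n": 2},
    },
}

# Pass the dict itself to json.dump, not the output of json.dumps.
with open("config.json", "w") as outfile:
    json.dump(config, outfile, indent=2)

# Sanity check: the file must load back as a dict, not a str.
with open("config.json") as infile:
    loaded = json.load(infile)
assert isinstance(loaded, dict)
assert "profile_sections" in loaded
```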
Thanks for reporting this. We will check and fix this.
Thanks!