Support ir extraction in decompression script

Open haiqi96 opened this issue 1 year ago • 1 comments

Description

Validation performed

Launched the package and verified that compression, decompression, IR extraction, and search still work from the command line.

haiqi96 avatar Jul 05 '24 15:07 haiqi96

A high-level concern is which terms we should use to distinguish (1) a general job launched by the decompress.py script, (2) a job that decompresses a file, and (3) a job that extracts an IR.

I am currently using "x" as the decompression command, following CLP and CLO, but internally I still use decompression as the job type, which is inconsistent.

Should we call both jobs decompression jobs, and internally distinguish them by the "extraction" and "ir_extraction" commands?

haiqi96 avatar Jul 06 '24 04:07 haiqi96

How do we specify target-uncompressed-size? I tried ./decompress.sh i --target-uncompressed-size 10240 --orig-file-id daf326b3-ab77-42ec-9fcf-056b541f948a 0 and also manually inserted a msgpack record of

{
	"orig_file_id": "f6fa2faf-d686-4d54-b086-b68c3da04405",
	"msg_ix": 1,
	"target_uncompressed_size": 10240
}

which both yielded

[2024-07-11 06:28:05,643: INFO/ForkPoolWorker-7] job_orchestration.executor.query.extract_ir_task.extract_ir[78f14806-52e2-4e87-81d8-3f00fc2ac0d2]: Started IR extraction task for job 3
[2024-07-11 06:28:05,648: ERROR/ForkPoolWorker-7] Task job_orchestration.executor.query.extract_ir_task.extract_ir[78f14806-52e2-4e87-81d8-3f00fc2ac0d2] raised unexpected: TypeError('sequence item 8: expected str instance, int found')
Traceback (most recent call last):
  File "/opt/clp/lib/python3/site-packages/celery/app/trace.py", line 453, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/clp/lib/python3/site-packages/celery/app/trace.py", line 736, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/clp/lib/python3/site-packages/job_orchestration/executor/query/extract_ir_task.py", line 106, in extract_ir
    return run_query_task(
  File "/opt/clp/lib/python3/site-packages/job_orchestration/executor/query/utils.py", line 62, in run_query_task
    logger.info(f'Running: {" ".join(task_command)}')
TypeError: sequence item 8: expected str instance, int found

junhaoliao avatar Jul 11 '24 06:07 junhaoliao

Sorry, it looks like I forgot to do a type conversion.

I included the fix in https://github.com/y-scope/clp/pull/472/files#diff-c3c708ca5b9cee2be7634fcb5966f6fc622b85ec0ad8575491540e676414cbbaL48. If you need this change urgently, I can also make a separate PR for it.
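
For reference, here is a minimal sketch of the failure and of the kind of conversion that avoids it. It assumes the size is placed into the command list as an int; the list contents are placeholders for illustration, not the actual arguments built in extract_ir_task.py.

```python
# Hypothetical stand-in for the command list passed to run_query_task.
# The real arguments may differ; the int is placed at index 8 to mirror the log.
task_command = [
    "cmd", "arg1", "arg2", "arg3", "arg4", "arg5", "arg6",
    "--size-option",
    10240,  # an int here is what makes str.join fail
]

error_message = None
try:
    # str.join only accepts str items, so this raises TypeError.
    " ".join(task_command)
except TypeError as e:
    error_message = str(e)

# Converting every item to str before joining (or before building the list)
# avoids the error.
fixed_command = [str(arg) for arg in task_command]
log_line = f'Running: {" ".join(fixed_command)}'
```

Converting at the point where the list is built, rather than only in the logging call, should also keep the list safe to hand to anything else that expects string arguments.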

haiqi96 avatar Jul 11 '24 14:07 haiqi96