MII txt2img example runs slower than the baseline!
Today I compared the txt2img examples, baseline-sd.py and mii-sd.py, and got a surprising result: the baseline is actually faster than MII.
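For context, both scripts time ten single-image generations and print the median. A minimal sketch of the kind of timing loop the baseline uses (illustrative only; the real baseline-sd.py may differ in prompt, warm-up, and other details):

```python
import statistics
import time

import torch
from diffusers import StableDiffusionPipeline

# Sketch of the baseline benchmark: load the fp16 pipeline once, then time
# repeated single-image generations (50 steps each) and report the median.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

durations = []
for trial in range(10):
    torch.cuda.synchronize()
    start = time.perf_counter()
    pipe("a photo of an astronaut riding a horse", num_inference_steps=50)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    durations.append(elapsed)
    print(f"trial={trial}, time_taken={elapsed:.4f}")

print(f"median duration: {statistics.median(durations):.4f}")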
The baseline inference result is:
```
(wjtorch2.0.1) lthpc@lthpc-C01:~/nvmessd/wj/DeepSpeed-MII/mii/legacy/examples/benchmark/txt2img$ python baseline-sd.py
[2024-03-06 09:13:56,842] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py:272: FutureWarning: You are loading the variant fp16 from CompVis/stable-diffusion-v1-4 via `revision='fp16'` even though you can load it via `variant=fp16`. Loading model variants via `revision='fp16'` is deprecated and will be removed in diffusers v1. Please use `variant='fp16'` instead.
warnings.warn(
vae/diffusion_pytorch_model.safetensors not found
Keyword arguments {'use_auth_token': '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'} are not expected by StableDiffusionPipeline and will be ignored.
Loading pipeline components...:  29%|██████████████████████████▊ | 2/7 [00:00<00:00, 5.58it/s]
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/diffusers/models/lora.py:300: FutureWarning: `LoRACompatibleConv` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleConv` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
deprecate("LoRACompatibleConv", "1.0.0", deprecation_message)
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/diffusers/models/lora.py:387: FutureWarning: LoRACompatibleLinear is deprecated and will be removed in version 1.0.0. Use of LoRACompatibleLinear is deprecated. Please switch to PEFT backend by installing PEFT: pip install peft.
deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
Loading pipeline components...:  86%|████████████████████████████████████████████████████████████████████████████████▌ | 6/7 [00:02<00:00, 2.35it/s]
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
warnings.warn(
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00, 3.36it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:02<00:00, 22.83it/s]
trial=0, time_taken=1.7606
trial=1, time_taken=1.6723
trial=2, time_taken=1.7453
trial=3, time_taken=1.7477
trial=4, time_taken=1.7076
trial=5, time_taken=1.6630
trial=6, time_taken=1.6846
trial=7, time_taken=1.7243
trial=8, time_taken=1.7357
trial=9, time_taken=1.6982
median duration: 1.7160
```

The MII result is:

```
(wjtorch2.0.1) lthpc@lthpc-C01:~/nvmessd/wj/DeepSpeed-MII/mii/legacy/examples/benchmark/txt2img$ python mii-sd.py
[2024-03-06 09:11:22,567] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-03-06 09:12:06,326] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:06,327] [INFO] [deployment.py:75:deploy] ************* MII is using DeepSpeed Optimizations to accelerate your model *************
[2024-03-06 09:12:06,329] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:06,329] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter trust_remote_code is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:06,329] [INFO] [server.py:38:init] Hostfile /job/hostfile not found, creating hostfile.
[2024-03-06 09:12:06,330] [INFO] [server.py:103:_launch_server_process] MII server server launch: ['deepspeed', '-H', '/tmp/tmpi7kgv35e', '-i', 'localhost:0', '--master_port', '29500', '--master_addr', 'localhost', '--no_ssh_check', '--no_local_rank', '--no_python', '/home/lthpc/anaconda3/envs/wjtorch2.0.1/bin/python', '-m', 'mii.legacy.launch.multi_gpu_server', '--deployment-name', 'sd_deploy', '--load-balancer-port', '50050', '--restful-gateway-port', '51080', '--server-port', '50051', '--model-config', 'eyJtb2RlbCI6ICJDb21wVmlzL3N0YWJsZS1kaWZmdXNpb24tdjEtNCIsICJ0YXNrIjogInRleHQtdG8taW1hZ2UiLCAiZHR5cGUiOiAidG9yY2guZmxvYXQxNiIsICJtb2RlbF9wYXRoIjogIi90bXAvbWlpX21vZGVscyIsICJsb2FkX3dpdGhfc3lzX21lbSI6IGZhbHNlLCAibWV0YV90ZW5zb3IiOiBmYWxzZSwgImRlcGxveV9yYW5rIjogWzBdLCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJyZXBsaWNhX251bSI6IDEsICJyZXBsaWNhX2NvbmZpZ3MiOiBbeyJob3N0bmFtZSI6ICJsb2NhbGhvc3QiLCAidGVuc29yX3BhcmFsbGVsX3BvcnRzIjogWzUwMDUxXSwgInRvcmNoX2Rpc3RfcG9ydCI6IDI5NTAwLCAiZ3B1X2luZGljZXMiOiBbMF19XSwgInByb2ZpbGVfbW9kZWxfdGltZSI6IGZhbHNlLCAic2tpcF9tb2RlbF9jaGVjayI6IHRydWUsICJoZl9hdXRoX3Rva2VuIjogImhmX0VpWHlPa3p2TkJ2dWhpTVZoa3pWVlRYSVpyZHltZ1JtWGwiLCAidHJ1c3RfcmVtb3RlX2NvZGUiOiBmYWxzZSwgInBpcGVsaW5lX2t3YXJncyI6IHt9LCAiZW5hYmxlX2RlZXBzcGVlZCI6IHRydWUsICJlbmFibGVfemVybyI6IGZhbHNlLCAiZHNfY29uZmlnIjoge30sICJ0ZW5zb3JfcGFyYWxsZWwiOiAxLCAiZW5hYmxlX2N1ZGFfZ3JhcGgiOiBmYWxzZSwgInJlcGxhY2Vfd2l0aF9rZXJuZWxfaW5qZWN0IjogdHJ1ZSwgImNoZWNrcG9pbnRfZGljdCI6IG51bGwsICJtYXhfdG9rZW5zIjogMTAyNH0=']
[2024-03-06 09:12:06,331] [INFO] [server.py:103:_launch_server_process] load balancer server launch: ['/home/lthpc/anaconda3/envs/wjtorch2.0.1/bin/python', '-m', 'mii.legacy.launch.multi_gpu_server', '--deployment-name', 'sd_deploy', '--load-balancer-port', '50050', '--restful-gateway-port', '51080', '--load-balancer', '--model-config', 'eyJtb2RlbCI6ICJDb21wVmlzL3N0YWJsZS1kaWZmdXNpb24tdjEtNCIsICJ0YXNrIjogInRleHQtdG8taW1hZ2UiLCAiZHR5cGUiOiAidG9yY2guZmxvYXQxNiIsICJtb2RlbF9wYXRoIjogIi90bXAvbWlpX21vZGVscyIsICJsb2FkX3dpdGhfc3lzX21lbSI6IGZhbHNlLCAibWV0YV90ZW5zb3IiOiBmYWxzZSwgImRlcGxveV9yYW5rIjogWzBdLCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJyZXBsaWNhX251bSI6IDEsICJyZXBsaWNhX2NvbmZpZ3MiOiBbeyJob3N0bmFtZSI6ICJsb2NhbGhvc3QiLCAidGVuc29yX3BhcmFsbGVsX3BvcnRzIjogWzUwMDUxXSwgInRvcmNoX2Rpc3RfcG9ydCI6IDI5NTAwLCAiZ3B1X2luZGljZXMiOiBbMF19XSwgInByb2ZpbGVfbW9kZWxfdGltZSI6IGZhbHNlLCAic2tpcF9tb2RlbF9jaGVjayI6IHRydWUsICJoZl9hdXRoX3Rva2VuIjogImhmX0VpWHlPa3p2TkJ2dWhpTVZoa3pWVlRYSVpyZHltZ1JtWGwiLCAidHJ1c3RfcmVtb3RlX2NvZGUiOiBmYWxzZSwgInBpcGVsaW5lX2t3YXJncyI6IHt9LCAiZW5hYmxlX2RlZXBzcGVlZCI6IHRydWUsICJlbmFibGVfemVybyI6IGZhbHNlLCAiZHNfY29uZmlnIjoge30sICJ0ZW5zb3JfcGFyYWxsZWwiOiAxLCAiZW5hYmxlX2N1ZGFfZ3JhcGgiOiBmYWxzZSwgInJlcGxhY2Vfd2l0aF9rZXJuZWxfaW5qZWN0IjogdHJ1ZSwgImNoZWNrcG9pbnRfZGljdCI6IG51bGwsICJtYXhfdG9rZW5zIjogMTAyNH0=']
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
[2024-03-06 09:12:09,118] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-03-06 09:12:09,249] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-03-06 09:12:09,735] [INFO] [runner.py:568:main] cmd = /home/lthpc/anaconda3/envs/wjtorch2.0.1/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --master_addr=localhost --master_port=29500 --no_python --no_local_rank --enable_each_rank_log=None /home/lthpc/anaconda3/envs/wjtorch2.0.1/bin/python -m mii.legacy.launch.multi_gpu_server --deployment-name sd_deploy --load-balancer-port 50050 --restful-gateway-port 51080 --server-port 50051 --model-config eyJtb2RlbCI6ICJDb21wVmlzL3N0YWJsZS1kaWZmdXNpb24tdjEtNCIsICJ0YXNrIjogInRleHQtdG8taW1hZ2UiLCAiZHR5cGUiOiAidG9yY2guZmxvYXQxNiIsICJtb2RlbF9wYXRoIjogIi90bXAvbWlpX21vZGVscyIsICJsb2FkX3dpdGhfc3lzX21lbSI6IGZhbHNlLCAibWV0YV90ZW5zb3IiOiBmYWxzZSwgImRlcGxveV9yYW5rIjogWzBdLCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJyZXBsaWNhX251bSI6IDEsICJyZXBsaWNhX2NvbmZpZ3MiOiBbeyJob3N0bmFtZSI6ICJsb2NhbGhvc3QiLCAidGVuc29yX3BhcmFsbGVsX3BvcnRzIjogWzUwMDUxXSwgInRvcmNoX2Rpc3RfcG9ydCI6IDI5NTAwLCAiZ3B1X2luZGljZXMiOiBbMF19XSwgInByb2ZpbGVfbW9kZWxfdGltZSI6IGZhbHNlLCAic2tpcF9tb2RlbF9jaGVjayI6IHRydWUsICJoZl9hdXRoX3Rva2VuIjogImhmX0VpWHlPa3p2TkJ2dWhpTVZoa3pWVlRYSVpyZHltZ1JtWGwiLCAidHJ1c3RfcmVtb3RlX2NvZGUiOiBmYWxzZSwgInBpcGVsaW5lX2t3YXJncyI6IHt9LCAiZW5hYmxlX2RlZXBzcGVlZCI6IHRydWUsICJlbmFibGVfemVybyI6IGZhbHNlLCAiZHNfY29uZmlnIjoge30sICJ0ZW5zb3JfcGFyYWxsZWwiOiAxLCAiZW5hYmxlX2N1ZGFfZ3JhcGgiOiBmYWxzZSwgInJlcGxhY2Vfd2l0aF9rZXJuZWxfaW5qZWN0IjogdHJ1ZSwgImNoZWNrcG9pbnRfZGljdCI6IG51bGwsICJtYXhfdG9rZW5zIjogMTAyNH0=
[2024-03-06 09:12:11,200] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:11,200] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter trust_remote_code is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
Starting load balancer on port: 50050
E0306 09:12:11.202508763 2823932 http_proxy_mapper.cc:130] cannot parse value of 'http_proxy' env var. Error: INVALID_ARGUMENT: Could not parse 'scheme' from uri '10.181.196.67:7890'. Scheme must begin with an alpha character [A-Za-z].
About to start server
Started
[2024-03-06 09:12:11,333] [INFO] [server.py:63:_wait_until_server_is_live] waiting for server to start...
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
[2024-03-06 09:12:13,885] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-03-06 09:12:14,515] [INFO] [launch.py:145:main] WORLD INFO DICT: {'localhost': [0]}
[2024-03-06 09:12:14,515] [INFO] [launch.py:151:main] nnodes=1, num_local_procs=1, node_rank=0
[2024-03-06 09:12:14,515] [INFO] [launch.py:162:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2024-03-06 09:12:14,515] [INFO] [launch.py:163:main] dist_world_size=1
[2024-03-06 09:12:14,515] [INFO] [launch.py:165:main] Setting CUDA_VISIBLE_DEVICES=0
[2024-03-06 09:12:14,516] [INFO] [launch.py:253:main] process 2823972 spawned with command: ['/home/lthpc/anaconda3/envs/wjtorch2.0.1/bin/python', '-m', 'mii.legacy.launch.multi_gpu_server', '--deployment-name', 'sd_deploy', '--load-balancer-port', '50050', '--restful-gateway-port', '51080', '--server-port', '50051', '--model-config', 'eyJtb2RlbCI6ICJDb21wVmlzL3N0YWJsZS1kaWZmdXNpb24tdjEtNCIsICJ0YXNrIjogInRleHQtdG8taW1hZ2UiLCAiZHR5cGUiOiAidG9yY2guZmxvYXQxNiIsICJtb2RlbF9wYXRoIjogIi90bXAvbWlpX21vZGVscyIsICJsb2FkX3dpdGhfc3lzX21lbSI6IGZhbHNlLCAibWV0YV90ZW5zb3IiOiBmYWxzZSwgImRlcGxveV9yYW5rIjogWzBdLCAidG9yY2hfZGlzdF9wb3J0IjogMjk1MDAsICJyZXBsaWNhX251bSI6IDEsICJyZXBsaWNhX2NvbmZpZ3MiOiBbeyJob3N0bmFtZSI6ICJsb2NhbGhvc3QiLCAidGVuc29yX3BhcmFsbGVsX3BvcnRzIjogWzUwMDUxXSwgInRvcmNoX2Rpc3RfcG9ydCI6IDI5NTAwLCAiZ3B1X2luZGljZXMiOiBbMF19XSwgInByb2ZpbGVfbW9kZWxfdGltZSI6IGZhbHNlLCAic2tpcF9tb2RlbF9jaGVjayI6IHRydWUsICJoZl9hdXRoX3Rva2VuIjogImhmX0VpWHlPa3p2TkJ2dWhpTVZoa3pWVlRYSVpyZHltZ1JtWGwiLCAidHJ1c3RfcmVtb3RlX2NvZGUiOiBmYWxzZSwgInBpcGVsaW5lX2t3YXJncyI6IHt9LCAiZW5hYmxlX2RlZXBzcGVlZCI6IHRydWUsICJlbmFibGVfemVybyI6IGZhbHNlLCAiZHNfY29uZmlnIjoge30sICJ0ZW5zb3JfcGFyYWxsZWwiOiAxLCAiZW5hYmxlX2N1ZGFfZ3JhcGgiOiBmYWxzZSwgInJlcGxhY2Vfd2l0aF9rZXJuZWxfaW5qZWN0IjogdHJ1ZSwgImNoZWNrcG9pbnRfZGljdCI6IG51bGwsICJtYXhfdG9rZW5zIjogMTAyNH0=']
[2024-03-06 09:12:16,333] [INFO] [server.py:63:_wait_until_server_is_live] waiting for server to start...
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
[2024-03-06 09:12:17,321] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-03-06 09:12:18,162] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:18,162] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter trust_remote_code is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:21,334] [INFO] [server.py:63:_wait_until_server_is_live] waiting for server to start...
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py:272: FutureWarning: You are loading the variant fp16 from CompVis/stable-diffusion-v1-4 via `revision='fp16'` even though you can load it via `variant=fp16`. Loading model variants via `revision='fp16'` is deprecated and will be removed in diffusers v1. Please use `variant='fp16'` instead.
  warnings.warn(
safety_checker/model.safetensors not found
Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/diffusers/models/lora.py:300: FutureWarning: `LoRACompatibleConv` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleConv` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
  deprecate("LoRACompatibleConv", "1.0.0", deprecation_message)
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/diffusers/models/lora.py:387: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
Loading pipeline components...:  71%|███████████████████████████████████████████████████████████████████▏ | 5/7 [00:00<00:00, 5.77it/s]
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
warnings.warn(
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00, 3.07it/s]
--------- MII Settings: ds_optimize=True, replace_with_kernel_inject=True, enable_cuda_graph=False
[2024-03-06 09:12:25,179] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.13.4, git-hash=unknown, git-branch=unknown
[2024-03-06 09:12:25,180] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter replace_method is deprecated. This parameter is no longer needed, please remove from your call to DeepSpeed-inference
[2024-03-06 09:12:25,180] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
**** found and replaced vae w. <class 'deepspeed.model_implementations.diffusers.vae.DSVAE'>
Using /home/lthpc/.cache/torch_extensions/py311_cu121 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/lthpc/.cache/torch_extensions/py311_cu121/transformer_inference/build.ninja...
Building extension module transformer_inference...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module transformer_inference...
Time to load transformer_inference op: 0.10560107231140137 seconds
[2024-03-06 09:12:25,631] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed-Attention config: {'layer_id': 0, 'hidden_size': 320, 'intermediate_size': 1280, 'heads': 8, 'num_hidden_layers': -1, 'dtype': torch.float16, 'pre_layer_norm': True, 'norm_type': <NormType.LayerNorm: 1>, 'local_rank': -1, 'stochastic_mode': False, 'epsilon': 1e-12, 'mp_size': 1, 'scale_attention': True, 'triangular_masking': False, 'local_attention': False, 'window_size': 256, 'rotary_dim': -1, 'rotate_half': False, 'rotate_every_two': True, 'return_tuple': True, 'mlp_after_attn': True, 'mlp_act_func_type': <ActivationFuncType.GELU: 1>, 'specialized_mode': False, 'training_mp_size': 1, 'bigscience_bloom': False, 'max_out_tokens': 4096, 'min_out_tokens': 1, 'scale_attn_by_inverse_layer_idx': False, 'enable_qkv_quantization': False, 'use_mup': False, 'return_single_tuple': False, 'set_empty_params': False, 'transposed_mode': False, 'use_triton': False, 'triton_autotune': False, 'num_kv': -1, 'rope_theta': 10000}
Using /home/lthpc/.cache/torch_extensions/py311_cu121 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/lthpc/.cache/torch_extensions/py311_cu121/spatial_inference/build.ninja...
Building extension module spatial_inference...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module spatial_inference...
Time to load spatial_inference op: 0.0967557430267334 seconds
[2024-03-06 09:12:26,335] [INFO] [server.py:63:_wait_until_server_is_live] waiting for server to start...
/home/lthpc/anaconda3/envs/wjtorch2.0.1/lib/python3.11/site-packages/deepspeed/model_implementations/diffusers/unet.py:17: FutureWarning: Accessing config attribute `in_channels` directly via 'UNet2DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet2DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
  self.in_channels = unet.in_channels
**** found and replaced unet w. <class 'deepspeed.model_implementations.diffusers.unet.DSUNet'>
Starting server on port: 50051
About to start server
Started
[2024-03-06 09:12:31,336] [INFO] [server.py:63:_wait_until_server_is_live] waiting for server to start...
[2024-03-06 09:12:36,337] [INFO] [server.py:63:_wait_until_server_is_live] waiting for server to start...
[2024-03-06 09:12:36,337] [INFO] [server.py:64:_wait_until_server_is_live] server has started on ports [50051]
[2024-03-06 09:12:36,338] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:36,338] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter trust_remote_code is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
E0306 09:12:36.342717891 2823878 http_proxy_mapper.cc:130] cannot parse value of 'http_proxy' env var. Error: INVALID_ARGUMENT: Could not parse 'scheme' from uri '10.181.196.67:7890'. Scheme must begin with an alpha character [A-Za-z].
Free memory : 75.719482 (GigaBytes)
Total memory: 79.151001 (GigaBytes)
Requested memory: 1.078125 (GigaBytes)
Setting maximum total tokens (input + output) to 4096
WorkSpace: 0x7fa652000000
trial=0, time_taken=1.8403
trial=1, time_taken=1.8492
trial=2, time_taken=1.8195
trial=3, time_taken=1.7592
trial=4, time_taken=1.7997
trial=5, time_taken=1.7986
trial=6, time_taken=1.8398
trial=7, time_taken=1.8131
trial=8, time_taken=1.7886
trial=9, time_taken=1.8212
median duration: 1.8163
[2024-03-06 09:12:59,397] [INFO] [terminate.py:12:terminate] Terminating server for sd_deploy
[2024-03-06 09:12:59,397] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:59,397] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter trust_remote_code is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
E0306 09:12:59.398139619 2823878 http_proxy_mapper.cc:130] cannot parse value of 'http_proxy' env var. Error: INVALID_ARGUMENT: Could not parse 'scheme' from uri '10.181.196.67:7890'. Scheme must begin with an alpha character [A-Za-z].
[2024-03-06 09:12:59,401] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
[2024-03-06 09:12:59,401] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter trust_remote_code is deprecated. Parameter will be removed. Please use the pipeline_kwargs field to pass kwargs to the HuggingFace pipeline creation.
(wjtorch2.0.1) lthpc@lthpc-C01:~/nvmessd/wj/DeepSpeed-MII/mii/legacy/examples/benchmark/txt2img$
[2024-03-06 09:13:01,524] [INFO] [launch.py:348:main] Process 2823972 exits successfully.
```
So MII actually takes more time per image: its median (1.8163 s) is about 6% slower than the baseline's (1.7160 s), even though MII reports that DeepSpeed optimizations and kernel injection are active. Why is that, and how can I fix it?
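One thing I notice in the MII log above is `enable_cuda_graph=False`. Would enabling CUDA graphs recover the expected speedup? Roughly what I would try is sketched below; the `enable_cuda_graph` key is taken from the serialized model config in the log, but the exact legacy `mii.deploy` kwargs may differ between versions, so treat this as a guess rather than a verified fix.

```python
import mii

# Sketch (unverified): redeploy the same SD model with CUDA graphs enabled.
# "enable_cuda_graph" appears as false in the model-config dump above;
# flipping it to True is my guess at a fix, not a confirmed remedy.
mii.deploy(
    task="text-to-image",
    model="CompVis/stable-diffusion-v1-4",
    deployment_name="sd_deploy",
    mii_config={"dtype": "fp16", "enable_cuda_graph": True},
)
```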