Timeout issue with Milvus
Getting consistent timeout issues when testing Milvus with standard Cohere 1M vectors, 768 dimensions test On c5a.8xlarge EC2 instance 32vCPU and 64Gb RAM and 128Gb SSD
2024-06-05 10:59:09,978 | WARNING: VectorDB optimize timeout in 900 (task_runner.py:204) (197576) 2024-06-05 10:59:10,041 | WARNING: Failed to run performance case, reason = Performance case optimize timeout (task_runner.py:146) (197576) Traceback (most recent call last): File "/home/ubuntu/repos/vector-dbs/env/lib/python3.12/site-packages/vectordb_bench/backend/task_runner.py", line 202, in _optimize return future.result(timeout=self.ca.optimize_timeout)[1] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3.12/concurrent/futures/_base.py", line 458, in result raise TimeoutError() TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/home/ubuntu/repos/vector-dbs/env/lib/python3.12/site-packages/vectordb_bench/backend/task_runner.py", line 134, in _run_perf_case build_dur = self._optimize() ^^^^^^^^^^^^^^^^ File "/home/ubuntu/repos/vector-dbs/env/lib/python3.12/site-packages/vectordb_bench/backend/task_runner.py", line 207, in _optimize raise PerformanceTimeoutError("Performance case optimize timeout") from e vectordb_bench.models.PerformanceTimeoutError: Performance case optimize timeout
@mattbloke99 PR: #332 increased the optimized time limit, you can update the vdbbench to v0.0.10.
I am also getting this timeout issue with v0.0.11 trying to run Search Performance Test (1M Dataset, 768 Dims), and 10M and 100M as well. Trying to run a test using DiskANN with search_list = 100.
Bare Metal Hardware: Server: HPE ProLiant DL360 Gen11 BIOS: 2.20 CPU: Dual Socket Intel(R) Xeon(R) Gold 6426Y DRAM: 128 GB Milvus Version: 2.4.4 using Docker Image Latest VectorDB Bench Version: 0.0.11 via pip install Python Version: 3.11
2024-07-02 14:31:06,433 | INFO: generated uuid for the tasks: 82c1b66f8ec24aad90dffc606128e746 (interface.py:66) (349602)
2024-07-02 14:31:06,565 | INFO | DB | CaseType Dataset Filter | task_label (task_runner.py:338)
2024-07-02 14:31:06,565 | INFO | ----------- | ------------ -------------------- ------- | ------- (task_runner.py:338)
2024-07-02 14:31:06,565 | INFO | Milvus | Performance Cohere-MEDIUM-1M None | 2024070214 (task_runner.py:338)
2024-07-02 14:31:06,565 | INFO: task submitted: id=82c1b66f8ec24aad90dffc606128e746, 2024070214, case number: 1 (interface.py:231) (349602)
2024-07-02 14:31:07,086 | INFO: [1/1] start case: {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'Cohere', 'size': 1000000, 'dim': 768, 'metric_type': <MetricType.COSINE: 'COSINE'>}}, 'db': 'Milvus'}, drop_old=True (interface.py:164) (349724)
2024-07-02 14:31:07,086 | INFO: Starting run (task_runner.py:100) (349724)
2024-07-02 14:31:07,336 | INFO: Milvus client drop_old collection: VectorDBBenchCollection (milvus.py:45) (349724)
2024-07-02 14:31:07,341 | INFO: Milvus create collection: VectorDBBenchCollection (milvus.py:55) (349724)
2024-07-02 14:31:08,347 | INFO: Read the entire file into memory: test.parquet (dataset.py:207) (349724)
2024-07-02 14:31:08,379 | INFO: Read the entire file into memory: neighbors.parquet (dataset.py:207) (349724)
2024-07-02 14:31:08,412 | INFO: Start performance case (task_runner.py:158) (349724)
2024-07-02 14:31:09,199 | INFO: (SpawnProcess-1:1) Start inserting embeddings in batch 5000 (serial_runner.py:35) (349889)
2024-07-02 14:31:09,199 | INFO: Get iterator for shuffle_train.parquet (dataset.py:225) (349889)
2024-07-02 14:31:22,801 | INFO: (SpawnProcess-1:1) Loaded 100000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:31:45,193 | INFO: (SpawnProcess-1:1) Loaded 200000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:32:09,818 | INFO: (SpawnProcess-1:1) Loaded 300000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:32:33,988 | INFO: (SpawnProcess-1:1) Loaded 400000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:32:58,992 | INFO: (SpawnProcess-1:1) Loaded 500000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:33:21,667 | INFO: (SpawnProcess-1:1) Loaded 600000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:33:45,791 | INFO: (SpawnProcess-1:1) Loaded 700000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:34:07,282 | INFO: (SpawnProcess-1:1) Loaded 800000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:34:31,609 | INFO: (SpawnProcess-1:1) Loaded 900000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:34:55,207 | INFO: (SpawnProcess-1:1) Loaded 1000000 embeddings into VectorDB (serial_runner.py:59) (349889)
2024-07-02 14:34:55,211 | INFO: (SpawnProcess-1:1) Finish loading all dataset into VectorDB, dur=226.0110656730103 (serial_runner.py:61) (349889)
2024-07-02 14:34:58,411 | INFO: Milvus post insert before optimize (milvus.py:101) (352315)
2024-07-02 15:04:56,111 | WARNING: VectorDB optimize timeout in 1800 (task_runner.py:249) (349724)
2024-07-02 15:04:56,185 | WARNING: Failed to run performance case, reason = Performance case optimize timeout (task_runner.py:191) (349724)
Traceback (most recent call last):
File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 247, in _optimize
return future.result(timeout=self.ca.optimize_timeout)[1]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/concurrent/futures/_base.py", line 458, in result
raise TimeoutError()
TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 165, in _run_perf_case
build_dur = self._optimize()
^^^^^^^^^^^^^^^^
File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 252, in _optimize
raise PerformanceTimeoutError("Performance case optimize timeout") from e
vectordb_bench.models.PerformanceTimeoutError: Performance case optimize timeout
2024-07-02 15:04:56,190 | WARNING: [1/1] case {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'Cohere', 'size': 1000000, 'dim': 768, 'metric_type': <MetricType.COSINE: 'COSINE'>}}, 'db': 'Milvus'} failed to run, reason=Performance case optimize timeout (interface.py:179) (349724)
2024-07-02 15:04:56,192 | INFO |Task summary: run_id=82c1b, task_label=2024070214 (models.py:329)
2024-07-02 15:04:56,192 | INFO |DB | db_label case label | load_dur qps latency(p99) recall max_load_count | label (models.py:329)
2024-07-02 15:04:56,192 | INFO |------ | -------- ----------------- ---------- | ----------- ---------- --------------- ------------- -------------- | ----- (models.py:329)
2024-07-02 15:04:56,192 | INFO |Milvus | Performance768D1M 2024070214 | 0.0 0.0 0.0 0.0 0 | ? (models.py:329)
2024-07-02 15:04:56,193 | INFO: write results to disk /home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/results/Milvus/result_20240702_2024070214_milvus.json (models.py:187) (349724)
2024-07-02 15:04:56,197 | INFO: Success to finish task: label=2024070214, run_id=82c1b66f8ec24aad90dffc606128e746 (interface.py:203) (349724)
I am also getting this timeout issue with v0.0.11 trying to run Search Performance Test (1M Dataset, 768 Dims), and 10M and 100M as well. Trying to run a test using DiskANN with search_list = 100.
Bare Metal Hardware: Server: HPE ProLiant DL360 Gen11 BIOS: 2.20 CPU: Dual Socket Intel(R) Xeon(R) Gold 6426Y DRAM: 128 GB Milvus Version: 2.4.4 using Docker Image Latest VectorDB Bench Version: 0.0.11 via pip install Python Version: 3.11
2024-07-02 14:31:06,433 | INFO: generated uuid for the tasks: 82c1b66f8ec24aad90dffc606128e746 (interface.py:66) (349602) 2024-07-02 14:31:06,565 | INFO | DB | CaseType Dataset Filter | task_label (task_runner.py:338) 2024-07-02 14:31:06,565 | INFO | ----------- | ------------ -------------------- ------- | ------- (task_runner.py:338) 2024-07-02 14:31:06,565 | INFO | Milvus | Performance Cohere-MEDIUM-1M None | 2024070214 (task_runner.py:338) 2024-07-02 14:31:06,565 | INFO: task submitted: id=82c1b66f8ec24aad90dffc606128e746, 2024070214, case number: 1 (interface.py:231) (349602) 2024-07-02 14:31:07,086 | INFO: [1/1] start case: {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'Cohere', 'size': 1000000, 'dim': 768, 'metric_type': <MetricType.COSINE: 'COSINE'>}}, 'db': 'Milvus'}, drop_old=True (interface.py:164) (349724) 2024-07-02 14:31:07,086 | INFO: Starting run (task_runner.py:100) (349724) 2024-07-02 14:31:07,336 | INFO: Milvus client drop_old collection: VectorDBBenchCollection (milvus.py:45) (349724) 2024-07-02 14:31:07,341 | INFO: Milvus create collection: VectorDBBenchCollection (milvus.py:55) (349724) 2024-07-02 14:31:08,347 | INFO: Read the entire file into memory: test.parquet (dataset.py:207) (349724) 2024-07-02 14:31:08,379 | INFO: Read the entire file into memory: neighbors.parquet (dataset.py:207) (349724) 2024-07-02 14:31:08,412 | INFO: Start performance case (task_runner.py:158) (349724) 2024-07-02 14:31:09,199 | INFO: (SpawnProcess-1:1) Start inserting embeddings in batch 5000 (serial_runner.py:35) (349889) 2024-07-02 14:31:09,199 | INFO: Get iterator for shuffle_train.parquet (dataset.py:225) (349889) 2024-07-02 14:31:22,801 | INFO: (SpawnProcess-1:1) Loaded 100000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:31:45,193 | INFO: (SpawnProcess-1:1) Loaded 200000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:32:09,818 | INFO: (SpawnProcess-1:1) Loaded 300000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:32:33,988 | INFO: (SpawnProcess-1:1) Loaded 400000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:32:58,992 | INFO: (SpawnProcess-1:1) Loaded 500000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:33:21,667 | INFO: (SpawnProcess-1:1) Loaded 600000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:33:45,791 | INFO: (SpawnProcess-1:1) Loaded 700000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:34:07,282 | INFO: (SpawnProcess-1:1) Loaded 800000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:34:31,609 | INFO: (SpawnProcess-1:1) Loaded 900000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:34:55,207 | INFO: (SpawnProcess-1:1) Loaded 1000000 embeddings into VectorDB (serial_runner.py:59) (349889) 2024-07-02 14:34:55,211 | INFO: (SpawnProcess-1:1) Finish loading all dataset into VectorDB, dur=226.0110656730103 (serial_runner.py:61) (349889) 2024-07-02 14:34:58,411 | INFO: Milvus post insert before optimize (milvus.py:101) (352315) 2024-07-02 15:04:56,111 | WARNING: VectorDB optimize timeout in 1800 (task_runner.py:249) (349724) 2024-07-02 15:04:56,185 | WARNING: Failed to run performance case, reason = Performance case optimize timeout (task_runner.py:191) (349724) Traceback (most recent call last): File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 247, in _optimize return future.result(timeout=self.ca.optimize_timeout)[1] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3.11/concurrent/futures/_base.py", line 458, in result raise TimeoutError() TimeoutError The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 165, in _run_perf_case build_dur = self._optimize() ^^^^^^^^^^^^^^^^ File "/home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/backend/task_runner.py", line 252, in _optimize raise PerformanceTimeoutError("Performance case optimize timeout") from e vectordb_bench.models.PerformanceTimeoutError: Performance case optimize timeout 2024-07-02 15:04:56,190 | WARNING: [1/1] case {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'Cohere', 'size': 1000000, 'dim': 768, 'metric_type': <MetricType.COSINE: 'COSINE'>}}, 'db': 'Milvus'} failed to run, reason=Performance case optimize timeout (interface.py:179) (349724) 2024-07-02 15:04:56,192 | INFO |Task summary: run_id=82c1b, task_label=2024070214 (models.py:329) 2024-07-02 15:04:56,192 | INFO |DB | db_label case label | load_dur qps latency(p99) recall max_load_count | label (models.py:329) 2024-07-02 15:04:56,192 | INFO |------ | -------- ----------------- ---------- | ----------- ---------- --------------- ------------- -------------- | ----- (models.py:329) 2024-07-02 15:04:56,192 | INFO |Milvus | Performance768D1M 2024070214 | 0.0 0.0 0.0 0.0 0 | ? (models.py:329) 2024-07-02 15:04:56,193 | INFO: write results to disk /home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/results/Milvus/result_20240702_2024070214_milvus.json (models.py:187) (349724) 2024-07-02 15:04:56,197 | INFO: Success to finish task: label=2024070214, run_id=82c1b66f8ec24aad90dffc606128e746 (interface.py:203) (349724)
I also have noticed that I do not exhibit this behavior when switching back to v0.0.7 of vectordb-bench on the exact same hardware / OS:
2024-07-03 09:32:25,935 | INFO: generated uuid for the tasks: 16b30ded9e904e4d97e77d59a56c89a3 (interface.py:69) (725651)
2024-07-03 09:32:26,116 | INFO | DB | CaseType Dataset Filter | task_label (task_runner.py:288)
2024-07-03 09:32:26,116 | INFO | ----------- | ------------ -------------------- ------- | ------- (task_runner.py:288)
2024-07-03 09:32:26,116 | INFO | Milvus | Performance Cohere-MEDIUM-1M None | 2024070309 (task_runner.py:288)
2024-07-03 09:32:26,116 | INFO: task submitted: id=16b30ded9e904e4d97e77d59a56c89a3, 2024070309, case number: 1 (interface.py:235) (725651)
2024-07-03 09:32:26,640 | INFO: [1/1] start case: {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'Cohere', 'size': 1000000, 'dim': 768, 'metric_type': <MetricType.COSINE: 'COSINE'>}}, 'db': 'Milvus'}, drop_old=True (interface.py:167) (725751)
2024-07-03 09:32:26,896 | INFO: Milvus client drop_old collection: VectorDBBenchCollection (milvus.py:45) (725751)
2024-07-03 09:32:26,901 | INFO: Milvus create collection: VectorDBBenchCollection (milvus.py:55) (725751)
2024-07-03 09:32:28,136 | INFO: Read the entire file into memory: test.parquet (dataset.py:207) (725751)
2024-07-03 09:32:28,186 | INFO: Read the entire file into memory: neighbors.parquet (dataset.py:207) (725751)
2024-07-03 09:32:29,015 | INFO: (SpawnProcess-1:1) Start inserting embeddings in batch 5000 (serial_runner.py:35) (725918)
2024-07-03 09:32:29,016 | INFO: Get iterator for shuffle_train.parquet (dataset.py:225) (725918)
2024-07-03 09:32:43,467 | INFO: (SpawnProcess-1:1) Loaded 100000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:33:08,294 | INFO: (SpawnProcess-1:1) Loaded 200000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:33:30,597 | INFO: (SpawnProcess-1:1) Loaded 300000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:33:58,060 | INFO: (SpawnProcess-1:1) Loaded 400000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:34:23,968 | INFO: (SpawnProcess-1:1) Loaded 500000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:34:49,068 | INFO: (SpawnProcess-1:1) Loaded 600000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:35:12,599 | INFO: (SpawnProcess-1:1) Loaded 700000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:35:36,397 | INFO: (SpawnProcess-1:1) Loaded 800000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:35:58,726 | INFO: (SpawnProcess-1:1) Loaded 900000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 09:36:21,494 | INFO: Milvus post insert before optimize (milvus.py:100) (725918)
2024-07-03 10:12:29,141 | INFO: (SpawnProcess-1:1) Loaded 1000000 embeddings into VectorDB (serial_runner.py:61) (725918)
2024-07-03 10:12:29,142 | INFO: (SpawnProcess-1:1) Finish loading all dataset into VectorDB, dur=2400.126074722997 (serial_runner.py:63) (725918)
2024-07-03 10:12:30,164 | INFO: Milvus optimizing before search (milvus.py:92) (732543)
2024-07-03 10:12:34,820 | INFO: Finish loading the entire dataset into VectorDB, insert_duration=2401.1505225989968, optimize_duration=4.540166485996451 load_duration(insert + optimize) = 2405.6907 (task_runner.py:136) (725751)
2024-07-03 10:12:35,638 | INFO: SpawnProcess-1:3 start search the entire test_data to get recall and latency (serial_runner.py:173) (732670)
2024-07-03 10:12:46,387 | INFO: SpawnProcess-1:3 search entire test_data: cost=4.3488s, queries=1000, avg_recall=0.977, avg_latency=0.0043, p99=0.0054 (serial_runner.py:208) (732670)
2024-07-03 10:12:47,106 | INFO: Start search 300s in concurrency 60, filters: None (mp_runner.py:93) (725751)
2024-07-03 10:12:57,267 | INFO: Syncing all process and start concurrency search, concurrency=60 (mp_runner.py:102) (725751)
2024-07-03 10:17:57,305 | INFO: SpawnProcess-1:36 search 300s: actual_dur=300.0066s, count=28500, qps in this process: 94.9979 (mp_runner.py:73) (732938)
2024-07-03 10:17:57,308 | INFO: SpawnProcess-1:57 search 300s: actual_dur=300.0074s, count=28074, qps in this process: 93.5777 (mp_runner.py:73) (732959)
2024-07-03 10:17:57,309 | INFO: SpawnProcess-1:10 search 300s: actual_dur=300.0155s, count=28016, qps in this process: 93.3818 (mp_runner.py:73) (732912)
2024-07-03 10:17:57,310 | INFO: SpawnProcess-1:38 search 300s: actual_dur=300.008s, count=28444, qps in this process: 94.8108 (mp_runner.py:73) (732940)
2024-07-03 10:17:57,314 | INFO: SpawnProcess-1:62 search 300s: actual_dur=300.0124s, count=28597, qps in this process: 95.3194 (mp_runner.py:73) (732964)
2024-07-03 10:17:57,314 | INFO: SpawnProcess-1:15 search 300s: actual_dur=300.0097s, count=28575, qps in this process: 95.2469 (mp_runner.py:73) (732917)
2024-07-03 10:17:57,316 | INFO: SpawnProcess-1:56 search 300s: actual_dur=300.0132s, count=27957, qps in this process: 93.1859 (mp_runner.py:73) (732958)
2024-07-03 10:17:57,317 | INFO: SpawnProcess-1:45 search 300s: actual_dur=300.0049s, count=28597, qps in this process: 95.3218 (mp_runner.py:73) (732947)
2024-07-03 10:17:57,319 | INFO: SpawnProcess-1:11 search 300s: actual_dur=300.01s, count=28065, qps in this process: 93.5469 (mp_runner.py:73) (732913)
2024-07-03 10:17:57,321 | INFO: SpawnProcess-1:9 search 300s: actual_dur=300.0064s, count=28598, qps in this process: 95.3246 (mp_runner.py:73) (732911)
2024-07-03 10:17:57,324 | INFO: SpawnProcess-1:34 search 300s: actual_dur=300.0053s, count=28407, qps in this process: 94.6883 (mp_runner.py:73) (732936)
2024-07-03 10:17:57,325 | INFO: SpawnProcess-1:31 search 300s: actual_dur=300.0053s, count=28573, qps in this process: 95.2417 (mp_runner.py:73) (732933)
2024-07-03 10:17:57,324 | INFO: SpawnProcess-1:63 search 300s: actual_dur=300.007s, count=28012, qps in this process: 93.3712 (mp_runner.py:73) (732965)
2024-07-03 10:17:57,326 | INFO: SpawnProcess-1:20 search 300s: actual_dur=300.0051s, count=28461, qps in this process: 94.8684 (mp_runner.py:73) (732922)
2024-07-03 10:17:57,327 | INFO: SpawnProcess-1:52 search 300s: actual_dur=300.0047s, count=28378, qps in this process: 94.5919 (mp_runner.py:73) (732954)
2024-07-03 10:17:57,328 | INFO: SpawnProcess-1:5 search 300s: actual_dur=300.0069s, count=28629, qps in this process: 95.4278 (mp_runner.py:73) (732905)
2024-07-03 10:17:57,328 | INFO: SpawnProcess-1:19 search 300s: actual_dur=300.01s, count=27979, qps in this process: 93.2602 (mp_runner.py:73) (732921)
2024-07-03 10:17:57,329 | INFO: SpawnProcess-1:64 search 300s: actual_dur=300.0107s, count=28614, qps in this process: 95.3766 (mp_runner.py:73) (732966)
2024-07-03 10:17:57,329 | INFO: SpawnProcess-1:44 search 300s: actual_dur=300.0088s, count=28569, qps in this process: 95.2272 (mp_runner.py:73) (732946)
2024-07-03 10:17:57,329 | INFO: SpawnProcess-1:42 search 300s: actual_dur=300.0106s, count=28414, qps in this process: 94.71 (mp_runner.py:73) (732944)
2024-07-03 10:17:57,330 | INFO: SpawnProcess-1:49 search 300s: actual_dur=300.007s, count=28649, qps in this process: 95.4944 (mp_runner.py:73) (732951)
2024-07-03 10:17:57,329 | INFO: SpawnProcess-1:6 search 300s: actual_dur=300.0131s, count=28053, qps in this process: 93.5059 (mp_runner.py:73) (732908)
2024-07-03 10:17:57,329 | INFO: SpawnProcess-1:16 search 300s: actual_dur=300.0117s, count=28277, qps in this process: 94.253 (mp_runner.py:73) (732918)
2024-07-03 10:17:57,330 | INFO: SpawnProcess-1:33 search 300s: actual_dur=300.0108s, count=28267, qps in this process: 94.2199 (mp_runner.py:73) (732935)
2024-07-03 10:17:57,330 | INFO: SpawnProcess-1:48 search 300s: actual_dur=300.0069s, count=28427, qps in this process: 94.7545 (mp_runner.py:73) (732950)
2024-07-03 10:17:57,330 | INFO: SpawnProcess-1:46 search 300s: actual_dur=300.0079s, count=27985, qps in this process: 93.2809 (mp_runner.py:73) (732948)
2024-07-03 10:17:57,331 | INFO: SpawnProcess-1:30 search 300s: actual_dur=300.009s, count=28427, qps in this process: 94.7538 (mp_runner.py:73) (732932)
2024-07-03 10:17:57,331 | INFO: SpawnProcess-1:29 search 300s: actual_dur=300.005s, count=28379, qps in this process: 94.5951 (mp_runner.py:73) (732931)
2024-07-03 10:17:57,331 | INFO: SpawnProcess-1:26 search 300s: actual_dur=300.013s, count=27931, qps in this process: 93.0993 (mp_runner.py:73) (732928)
2024-07-03 10:17:57,331 | INFO: SpawnProcess-1:22 search 300s: actual_dur=300.0065s, count=28648, qps in this process: 95.4913 (mp_runner.py:73) (732924)
2024-07-03 10:17:57,332 | INFO: SpawnProcess-1:25 search 300s: actual_dur=300.0046s, count=28342, qps in this process: 94.4719 (mp_runner.py:73) (732927)
2024-07-03 10:17:57,332 | INFO: SpawnProcess-1:18 search 300s: actual_dur=300.0112s, count=28221, qps in this process: 94.0665 (mp_runner.py:73) (732920)
2024-07-03 10:17:57,332 | INFO: SpawnProcess-1:53 search 300s: actual_dur=300.0027s, count=28641, qps in this process: 95.4691 (mp_runner.py:73) (732955)
2024-07-03 10:17:57,333 | INFO: SpawnProcess-1:43 search 300s: actual_dur=300.0052s, count=28211, qps in this process: 94.035 (mp_runner.py:73) (732945)
2024-07-03 10:17:57,333 | INFO: SpawnProcess-1:7 search 300s: actual_dur=300.0079s, count=28281, qps in this process: 94.2675 (mp_runner.py:73) (732909)
2024-07-03 10:17:57,333 | INFO: SpawnProcess-1:47 search 300s: actual_dur=300.0053s, count=28149, qps in this process: 93.8283 (mp_runner.py:73) (732949)
2024-07-03 10:17:57,335 | INFO: SpawnProcess-1:23 search 300s: actual_dur=300.0107s, count=28526, qps in this process: 95.0833 (mp_runner.py:73) (732925)
2024-07-03 10:17:57,335 | INFO: SpawnProcess-1:39 search 300s: actual_dur=300.0064s, count=28483, qps in this process: 94.9413 (mp_runner.py:73) (732941)
2024-07-03 10:17:57,336 | INFO: SpawnProcess-1:60 search 300s: actual_dur=300.0048s, count=28462, qps in this process: 94.8718 (mp_runner.py:73) (732962)
2024-07-03 10:17:57,337 | INFO: SpawnProcess-1:28 search 300s: actual_dur=300.0071s, count=28591, qps in this process: 95.3011 (mp_runner.py:73) (732930)
2024-07-03 10:17:57,336 | INFO: SpawnProcess-1:14 search 300s: actual_dur=300.0039s, count=27943, qps in this process: 93.1421 (mp_runner.py:73) (732916)
2024-07-03 10:17:57,337 | INFO: SpawnProcess-1:51 search 300s: actual_dur=300.0053s, count=28490, qps in this process: 94.965 (mp_runner.py:73) (732953)
2024-07-03 10:17:57,337 | INFO: SpawnProcess-1:61 search 300s: actual_dur=300.0059s, count=28229, qps in this process: 94.0948 (mp_runner.py:73) (732963)
2024-07-03 10:17:57,338 | INFO: SpawnProcess-1:8 search 300s: actual_dur=300.0024s, count=28498, qps in this process: 94.9926 (mp_runner.py:73) (732910)
2024-07-03 10:17:57,337 | INFO: SpawnProcess-1:50 search 300s: actual_dur=300.0103s, count=28284, qps in this process: 94.2768 (mp_runner.py:73) (732952)
2024-07-03 10:17:57,337 | INFO: SpawnProcess-1:13 search 300s: actual_dur=300.0058s, count=28375, qps in this process: 94.5815 (mp_runner.py:73) (732915)
2024-07-03 10:17:57,338 | INFO: SpawnProcess-1:27 search 300s: actual_dur=300.0016s, count=28323, qps in this process: 94.4095 (mp_runner.py:73) (732929)
2024-07-03 10:17:57,339 | INFO: SpawnProcess-1:59 search 300s: actual_dur=300.0082s, count=28012, qps in this process: 93.3708 (mp_runner.py:73) (732961)
2024-07-03 10:17:57,339 | INFO: SpawnProcess-1:40 search 300s: actual_dur=300.006s, count=28612, qps in this process: 95.3714 (mp_runner.py:73) (732942)
2024-07-03 10:17:57,340 | INFO: SpawnProcess-1:21 search 300s: actual_dur=300.0056s, count=28086, qps in this process: 93.6183 (mp_runner.py:73) (732923)
2024-07-03 10:17:57,340 | INFO: SpawnProcess-1:35 search 300s: actual_dur=300.0054s, count=28016, qps in this process: 93.385 (mp_runner.py:73) (732937)
2024-07-03 10:17:57,341 | INFO: SpawnProcess-1:17 search 300s: actual_dur=300.0014s, count=28462, qps in this process: 94.8729 (mp_runner.py:73) (732919)
2024-07-03 10:17:57,341 | INFO: SpawnProcess-1:55 search 300s: actual_dur=300.0039s, count=28632, qps in this process: 95.4388 (mp_runner.py:73) (732957)
2024-07-03 10:17:57,341 | INFO: SpawnProcess-1:24 search 300s: actual_dur=300.003s, count=27957, qps in this process: 93.1891 (mp_runner.py:73) (732926)
2024-07-03 10:17:57,342 | INFO: SpawnProcess-1:12 search 300s: actual_dur=300.0051s, count=28514, qps in this process: 95.0451 (mp_runner.py:73) (732914)
2024-07-03 10:17:57,342 | INFO: SpawnProcess-1:37 search 300s: actual_dur=300.0012s, count=27994, qps in this process: 93.313 (mp_runner.py:73) (732939)
2024-07-03 10:17:57,342 | INFO: SpawnProcess-1:41 search 300s: actual_dur=300.0026s, count=28504, qps in this process: 95.0125 (mp_runner.py:73) (732943)
2024-07-03 10:17:57,343 | INFO: SpawnProcess-1:58 search 300s: actual_dur=300.0057s, count=28552, qps in this process: 95.1715 (mp_runner.py:73) (732960)
2024-07-03 10:17:57,344 | INFO: SpawnProcess-1:32 search 300s: actual_dur=300.0031s, count=28428, qps in this process: 94.759 (mp_runner.py:73) (732934)
2024-07-03 10:17:57,346 | INFO: SpawnProcess-1:54 search 300s: actual_dur=300.0052s, count=28526, qps in this process: 95.085 (mp_runner.py:73) (732956)
2024-07-03 10:17:57,348 | INFO: End search in concurrency 60: dur=300.07863932699547s, total_count=1700846, qps=5668.0009 (mp_runner.py:109) (725751)
2024-07-03 10:17:58,041 | INFO: Update largest qps with concurrency 60: current max_qps=5668.0009 (mp_runner.py:113) (725751)
2024-07-03 10:17:58,042 | INFO: Performance case got result: Metric(max_load_count=0, load_duration=2405.6907, qps=5668.0009, serial_latency_p99=0.0054, recall=0.977) (task_runner.py:150) (725751)
2024-07-03 10:17:58,042 | INFO: [1/1] finish case: {'label': <CaseLabel.Performance: 2>, 'dataset': {'data': {'name': 'Cohere', 'size': 1000000, 'dim': 768, 'metric_type': <MetricType.COSINE: 'COSINE'>}}, 'db': 'Milvus'}, result=Metric(max_load_count=0, load_duration=2405.6907, qps=5668.0009, serial_latency_p99=0.0054, recall=0.977), label=ResultLabel.NORMAL (interface.py:169) (725751)
2024-07-03 10:17:58,042 | INFO |Task summary: run_id=16b30, task_label=2024070309 (models.py:286)
2024-07-03 10:17:58,042 | INFO |DB | db_label case label | load_dur qps latency(p99) recall max_load_count | label (models.py:286)
2024-07-03 10:17:58,042 | INFO |------ | -------- ----------------- ---------- | ------------ ------------ --------------- ------------- -------------- | ----- (models.py:286)
2024-07-03 10:17:58,042 | INFO |Milvus | Performance768D1M 2024070309 | 2405.6907 5668.0009 0.0054 0.977 0 | :) (models.py:286)
2024-07-03 10:17:58,042 | INFO: write results to disk /home/labuser/.local/lib/python3.11/site-packages/vectordb_bench/results/Milvus/result_20240703_2024070309_milvus.json (models.py:144) (725751)
2024-07-03 10:17:58,042 | INFO: Succes to finish task: label=2024070309, run_id=16b30ded9e904e4d97e77d59a56c89a3 (interface.py:207) (725751)
@mattbloke99 @anrahman4 Your machine needs 2400.126074722997 seconds to load + index the dataset in 0.0.7.
We changed the process in the latest vdbbench, the index part is moved into optimize.
Since it only costs you dur=226.0110656730103s to load data into milvus, you'll probablly need 3000s timeout config for optimize.
Just change this OPTIMIZE_TIMEOUT_768D_1M = 30 * 60 # 30min into 3000 in vectordb_bench/__init__.py and you'll be fine.
We'll check if the older timeout config still works for the latest milvus cases and ajust it accordingly. AFAIK, in our 16c64G tests it costs around 1500s.
Can confirm this worked with v0.0.11, thank you very much!
Can not confirm this worked with v0.0.12, why create this question,the timeout was set 8 hours
@1191510903 8hours is a really long time. Could you please upload logs of Milvus so that I can know why this is taking so much time?
Also I'd like to ask some quick questions:
- What's your Milvus version?
- Distributed or standalone?
- What's the cpu and memory resources for Milvus? Especially IndexNode?