[L0] Enable Immediate Command List by default given Intel DG2
- Enabled per-queue immediate command list usage by default on Intel DG2 hardware.
- Removed the Windows-specific default of false.
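To compare both modes locally, a minimal Python sketch follows. It assumes the Level Zero adapter honors the `UR_L0_USE_IMMEDIATE_COMMANDLISTS` environment variable, with 0 meaning regular command lists and 1 meaning immediate command lists per queue; please verify the name and values against the adapter source for your build. The helper names `env_with_immediate_cmdlists` and `run_with_mode` are hypothetical.

```python
import os
import subprocess

# Assumption: the Level Zero adapter reads this variable at device setup.
ENV_VAR = "UR_L0_USE_IMMEDIATE_COMMANDLISTS"


def env_with_immediate_cmdlists(mode, base_env=None):
    """Return a copy of the environment with the override set.

    mode: 0 = regular command lists, 1 = immediate per queue (assumed meaning).
    """
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_VAR] = str(mode)
    return env


def run_with_mode(cmd, mode):
    """Run a benchmark command (list of strings) under the given mode."""
    result = subprocess.run(
        cmd,
        env=env_with_immediate_cmdlists(mode),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_with_mode(cmd, 0)` and `run_with_mode(cmd, 1)` with the same benchmark command gives a before/after pair that does not depend on the compiled-in default.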
Compute Benchmarks level_zero run (): https://github.com/oneapi-src/unified-runtime/actions/runs/10292370779 Job status: success. Test status: success.
Summary
| Benchmark | This PR | baseline |
|---|---|---|
| api_overhead_benchmark_sycl SubmitKernel out of order | 26.281 | **23.082** |
| api_overhead_benchmark_sycl SubmitKernel in order | 25.33 | **22.972** |
| memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024 | 330.433 | **298.574** |
| memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024 | **204.698** | 222.377 |
| memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024 | 6.699 | **6.408** |
| memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240 | **3.091** | 3.116 |
| api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024 | 2.875 | **2.806** |
| api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024 | 2.403 | **2.322** |
| miscellaneous_benchmark_sycl VectorSum | **858.246** | 859.353 |
| Velocity-Bench Hashtable | 330.956108 | **328.705328** |
| Velocity-Bench Bitcracker | **35.6949** | 35.7419 |
| Velocity-Bench CudaSift | **218.517** | 218.846 |
| Velocity-Bench Easywave | **239** | 246.0 |
| Velocity-Bench QuickSilver | 117.23 | **117.06** |
| Velocity-Bench Sobel Filter | 612.759 | **610.354** |
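To put the latency rows in perspective, the relative change against the baseline can be computed directly from the table. The sketch below is only an illustration: it covers the four largest deltas, all of which are reported in microseconds in the Details output, so a positive percentage means this PR is slower. The function name `relative_change` is hypothetical.

```python
# (benchmark, this_pr_us, baseline_us), copied from the summary table
RESULTS = [
    ("SubmitKernel out of order", 26.281, 23.082),
    ("SubmitKernel in order", 25.33, 22.972),
    ("QueueInOrderMemcpy Device to Device", 330.433, 298.574),
    ("QueueInOrderMemcpy Host to Device", 204.698, 222.377),
]


def relative_change(this_pr, baseline):
    """Percentage change of this PR relative to the baseline."""
    return (this_pr - baseline) / baseline * 100.0


for name, pr, base in RESULTS:
    print(f"{name}: {relative_change(pr, base):+.1f}%")
```

With these inputs the two SubmitKernel cases and the device-to-device in-order memcpy come out roughly 10 to 14 percent slower, while the host-to-device in-order memcpy is about 8 percent faster.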
Charts
api_overhead_benchmark_sycl SubmitKernel out of order
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title api_overhead_benchmark_sycl SubmitKernel out of order
todayMarker off
dateFormat X
axisFormat %s
section SubmitKernel(api=sycl<br>Profiling=0<br>Ioq=0<br>DiscardEvents=0<br>NumKernels=10<br>KernelExecTime=1<br>MeasureCompletion=0)
This PR (26.281 μs) : crit, 0, 26
baseline (23.082 μs) : 0, 23
- : 0, 0
- : 0, 0
api_overhead_benchmark_sycl SubmitKernel in order
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title api_overhead_benchmark_sycl SubmitKernel in order
todayMarker off
dateFormat X
axisFormat %s
section SubmitKernel(api=sycl<br>Profiling=0<br>Ioq=1<br>DiscardEvents=0<br>NumKernels=10<br>KernelExecTime=1<br>MeasureCompletion=0)
This PR (25.33 μs) : crit, 0, 25
baseline (22.972 μs) : 0, 22
- : 0, 0
- : 0, 0
memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024
todayMarker off
dateFormat X
axisFormat %s
section QueueInOrderMemcpy(api=sycl<br>IsCopyOnly=0<br>sourcePlacement=Device<br>destinationPlacement=Device<br>size=1KB<br>count=100)
This PR (330.433 μs) : crit, 0, 330
baseline (298.574 μs) : 0, 298
- : 0, 0
- : 0, 0
memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024
todayMarker off
dateFormat X
axisFormat %s
section QueueInOrderMemcpy(api=sycl<br>IsCopyOnly=0<br>sourcePlacement=Host<br>destinationPlacement=Device<br>size=1KB<br>count=100)
This PR (204.698 μs) : crit, 0, 204
baseline (222.377 μs) : 0, 222
- : 0, 0
- : 0, 0
memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024
todayMarker off
dateFormat X
axisFormat %s
section QueueMemcpy(api=sycl<br>sourcePlacement=Device<br>destinationPlacement=Device<br>size=1KB)
This PR (6.699 μs) : crit, 0, 6
baseline (6.408 μs) : 0, 6
- : 0, 0
- : 0, 0
memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240
todayMarker off
dateFormat X
axisFormat %s
section StreamMemory(api=sycl<br>type=Triad<br>size=10KB<br>useEvents=0<br>contents=Zeros<br>memoryPlacement=Device)
This PR (3.091 GB/s) : crit, 0, 3
baseline (3.116 GB/s) : 0, 3
- : 0, 0
- : 0, 0
api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024
todayMarker off
dateFormat X
axisFormat %s
section ExecImmediateCopyQueue(api=sycl<br>IsCopyOnly=1<br>MeasureCompletionTime=0<br>src=Device<br>dst=Device<br>size=1KB<br>ioq=0)
This PR (2.875 μs) : crit, 0, 2
baseline (2.806 μs) : 0, 2
- : 0, 0
- : 0, 0
api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024
todayMarker off
dateFormat X
axisFormat %s
section ExecImmediateCopyQueue(api=sycl<br>IsCopyOnly=1<br>MeasureCompletionTime=0<br>src=Host<br>dst=Host<br>size=1KB<br>ioq=1)
This PR (2.403 μs) : crit, 0, 2
baseline (2.322 μs) : 0, 2
- : 0, 0
- : 0, 0
miscellaneous_benchmark_sycl VectorSum
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title miscellaneous_benchmark_sycl VectorSum
todayMarker off
dateFormat X
axisFormat %s
section VectorSum(api=sycl<br>numberOfElementsX=512<br>numberOfElementsY=256<br>numberOfElementsZ=256)
This PR (858.246 GB/s) : crit, 0, 858
baseline (859.353 GB/s) : 0, 859
- : 0, 0
- : 0, 0
Velocity-Bench Hashtable
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title Velocity-Bench Hashtable
todayMarker off
dateFormat X
axisFormat %s
section hashtable
This PR (330.956108 M keys/sec) : crit, 0, 330
baseline (328.705328 M keys/sec) : 0, 328
- : 0, 0
- : 0, 0
Velocity-Bench Bitcracker
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title Velocity-Bench Bitcracker
todayMarker off
dateFormat X
axisFormat %s
section bitcracker
This PR (35.6949 s) : crit, 0, 35
baseline (35.7419 s) : 0, 35
- : 0, 0
- : 0, 0
Velocity-Bench CudaSift
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title Velocity-Bench CudaSift
todayMarker off
dateFormat X
axisFormat %s
section cudaSift
This PR (218.517 ms) : crit, 0, 218
baseline (218.846 ms) : 0, 218
- : 0, 0
- : 0, 0
Velocity-Bench Easywave
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title Velocity-Bench Easywave
todayMarker off
dateFormat X
axisFormat %s
section easywave
This PR (239 ms) : crit, 0, 239
baseline (246.0 ms) : 0, 246
- : 0, 0
- : 0, 0
Velocity-Bench QuickSilver
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title Velocity-Bench QuickSilver
todayMarker off
dateFormat X
axisFormat %s
section QuickSilver
This PR (117.23 MMS/CTT) : crit, 0, 117
baseline (117.06 MMS/CTT) : 0, 117
- : 0, 0
- : 0, 0
Velocity-Bench Sobel Filter
---
config:
gantt:
rightPadding: 10
leftPadding: 120
sectionFontSize: 10
numberSectionStyles: 2
---
gantt
title Velocity-Bench Sobel Filter
todayMarker off
dateFormat X
axisFormat %s
section sobel_filter
This PR (612.759 ms) : crit, 0, 612
baseline (610.354 ms) : 0, 610
- : 0, 0
- : 0, 0
Details
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),26.281,26.341,5.63%,23.607,460.934,[CPU],[us]
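The CSV rows in these Details blocks are easy to post-process. The sketch below parses one row into typed fields; it assumes every row has the eight fields seen here (the seven header columns plus a trailing unit) and that test-case names contain no commas. The function name `parse_result` is hypothetical and not part of the benchmark scripts.

```python
import csv
import io

# Row copied from the SubmitKernel (Ioq=0) output above.
ROW = (
    "SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 "
    "KernelExecTime=1 MeasureCompletion=0),"
    "26.281,26.341,5.63%,23.607,460.934,[CPU],[us]"
)


def parse_result(row):
    """Split one benchmark CSV row into a dict of typed values."""
    fields = next(csv.reader(io.StringIO(row)))
    name, mean, median, stddev, lo, hi, kind, unit = fields
    return {
        "name": name,
        "mean": float(mean),
        "median": float(median),
        "stddev_pct": float(stddev.rstrip("%")),
        "min": float(lo),
        "max": float(hi),
        "type": kind.strip("[]"),
        "unit": unit.strip("[]"),
    }


result = parse_result(ROW)
print(result["mean"], result["unit"])
```

Keeping the unit next to the value matters here because some tests report GB/s rather than microseconds.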
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),25.330,25.568,6.15%,22.364,433.367,[CPU],[us]
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100),330.433,332.998,2.91%,300.883,732.033,[CPU],[us]
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100),204.698,197.360,17.86%,190.390,577.874,[CPU],[us]
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB),6.699,6.406,19.81%,6.019,116.584,[CPU],[us]
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device),3.091,3.097,2.84%,0.686,3.368,[CPU],[GB/s]
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0),2.875,2.866,7.42%,2.668,60.993,[CPU],[us]
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1),2.403,2.397,6.06%,2.280,43.808,[CPU],[us]
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256)
Environment Variables:
Command:
/home/test-user/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256
Output:
TestCase,Mean,Median,StdDev,Min,Max,Type
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256),858.246,858.902,0.45%,821.607,871.997,[GPU],bw [GB/s]
hashtable
Environment Variables:
Command:
/home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify
Output:
hashtable - total time for whole calculation: 0.405545 s 330.956108 million keys/second
bitcracker
Environment Variables:
Command:
/home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000
Output:
---------> BitCracker: BitLocker password cracking tool <---------
================================== Retrieving Info
Reading hash file "/home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt"
Attack
================================================ Type of attack: User Password Psw per thread: 1 max_num_pswd_per_read: 60000 Dictionary: /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt MAC Comparison (-m): Yes
Iter: 1, num passwords read: 60000 Kernel execution: Effective passwords: 60000 Passwords Range: npknpByH7N2m3OnLNH1X9DJxLrzIFWk ..... dL_7uuf3QCz-c6K3xDu0
================================================ Bitcracker attack completed Total passwords evaluated: 60000 Password not found!
time to subtract from total: 0.00403515 s bitcracker - total time for whole calculation: 35.6949 s
cudaSift
Environment Variables:
Command:
/home/test-user/bench_workdir/cudaSift/cudaSift
Output:
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1235 1272 33.5324% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1231 1264 33.4238% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1225 1263 33.2609% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1201 1262 32.6093% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1200 1259 32.5821% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1230 1263 33.3967% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1236 1269 33.5596% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1236 1270 33.5596% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1215 1271 32.9894% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1235 1268 33.5324% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1231 1266 33.4238% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1174 1267 31.8762% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1218 1254 33.0709% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1233 1268 33.4781% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1031 1261 27.9935% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1225 1259 33.2609% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1229 1266 33.3695% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1233 1265 33.4781% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1233 1265 33.4781% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1227 1261 33.3152% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1134 1250 30.7901% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1224 1259 33.2338% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1228 1261 33.3424% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1224 1257 33.2338% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1223 1256 33.2066% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1083 1259 29.4054% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1058 1264 28.7266% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1214 1262 32.9623% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1091 1247 29.6226% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1228 1261 33.3424% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1231 1264 33.4238% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1110 1255 30.1385% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1237 1270 33.5868% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1136 1259 30.8444% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1223 1256 33.2066% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1226 1258 33.2881% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1120 1273 30.41% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1129 1261 30.6544% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1225 1261 33.2609% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1240 1274 33.6682% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1232 1266 33.451% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1110 1272 30.1385% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1225 1259 33.2609% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1241 1275 33.6954% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1199 1279 32.555% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1212 1259 32.908% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1055 1267 28.6451% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1082 1264 29.3782% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1217 1264 33.0437% 1 2
Performing data verification Data verification is SUCCESSFUL.
Image size = (1920,1080) Initializing data... Number of original features: 3683 3933 Number of matching features: 1220 1253 33.1252% 1 2
Performing data verification Data verification is SUCCESSFUL.
Avg workload time = 218.517 ms
easywave
Environment Variables:
Command:
/home/test-user/bench_workdir/easywave/easyWave_sycl -grid /home/test-user/bench_workdir/data/easywave/examples/e2Asean.grd -source /home/test-user/bench_workdir/data/easywave/examples/BengkuluSept2007.flt -time 120
Output:
MAIN: Starting SYCL main program MAIN: Attempting to clean up previous eWave tsunami files MAIN: Clean up completed SYCL: SYCL Queue initialization successful SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.29735+27) SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero MAIN: Program successfully completed
QuickSilver
Environment Variables:
QS_DEVICE=GPU
Command:
/home/test-user/bench_workdir/QuickSilver/qs -i /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
Output:
Copyright (c) 2016 Lawrence Livermore National Security, LLC All Rights Reserved Quicksilver Version : Quicksilver Git Hash : MPI Version : 3.0 Number of MPI ranks : 1 Number of OpenMP Threads: 1 Number of OpenMP CPUs : 1
Loading params Finished loading params Simulation: dt: 1e-08 fMax: 0.1 inputFile: /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp energySpectrum: boundaryCondition: octant loadBalance: 1 cycleTimers: 0 debugThreads: 0 lx: 100 ly: 100 lz: 100 nParticles: 10000000 batchSize: 0 nBatches: 10 nSteps: 10 nx: 10 ny: 10 nz: 10 seed: 1029384756 xDom: 0 yDom: 0 zDom: 0 eMax: 20 eMin: 1e-09 nGroups: 230 lowWeightCutoff: 0.001 bTally: 1 fTally: 1 cTally: 1 coralBenchmark: 0 crossSectionsOut:
Geometry: material: sourceMaterial shape: brick xMax: 100 xMin: 0 yMax: 100 yMin: 0 zMax: 100 zMin: 0
Material: name: sourceMaterial mass: 1000 nIsotopes: 10 nReactions: 9 sourceRate: 1e+10 totalCrossSection: 0.1 absorptionCrossSection: flat fissionCrossSection: flat scatteringCrossSection: flat absorptionCrossSectionRatio: 0 fissionCrossSectionRatio: 0 scatteringCrossSectionRatio: 1
CrossSection: name: flat A: 0 B: 0 C: 0 D: 0 E: 1 nuBar: 2.4 setting GPU setting parameters Building partition 0 Building partition 1 Building partition 2 Building partition 3 Building MC_Domain 0 Building MC_Domain 1 Building MC_Domain 2 Building MC_Domain 3 Starting Consistency Check Finished Consistency Check Finished initMesh Started copyMaterialDatabase_device Finished copyMaterialDatabase_device Finished copyNuclearData_device Finished copyDomainDevice cycle start source rr split absorb scatter fission produce collisn escape census num_seg scalar_flux cycleInit cycleTracking cycleFinalize 0 0 1000000 0 9000000 0 18533189 0 0 18533189 1151780 8848220 55527935 1.854923e+09 4.283490e-01 6.218600e-01 0.000000e+00 1 8848220 1000000 0 151478 0 34281997 0 0 34281997 1664159 8335539 94633679 5.047651e+09 3.604760e-01 7.619230e-01 0.000000e+00 2 8335539 1000000 0 663717 0 34354432 0 0 34354432 1366771 8632485 95010375 7.705930e+09 3.576670e-01 7.764070e-01 0.000000e+00 3 8632485 1000000 0 367978 0 34302727 0 0 34302727 1242216 8758247 94953591 9.992076e+09 3.666110e-01 8.343250e-01 0.000000e+00 4 8758247 1000000 0 242076 0 34141236 0 0 34141236 1168452 8831871 94599337 1.199834e+10 3.296530e-01 7.981060e-01 0.000000e+00 5 8831871 1000000 0 168070 0 33948724 0 0 33948724 1121156 8878785 94148236 1.377636e+10 3.294980e-01 7.726620e-01 0.000000e+00 6 8878785 1000000 0 120572 0 33760567 0 0 33760567 1089103 8910254 93689264 1.535668e+10 3.260260e-01 7.726120e-01 0.000000e+00 7 8910254 1000000 0 89810 0 33552179 0 0 33552179 1065203 8934861 93216931 1.676993e+10 3.259680e-01 7.911310e-01 0.000000e+00 8 8934861 1000000 0 65491 0 33384605 0 0 33384605 1047720 8952632 92768273 1.804559e+10 3.279000e-01 7.911720e-01 0.000000e+00 9 8952632 1000000 0 47165 0 33198494 0 0 33198494 1033968 8965829 92324678 1.920208e+10 3.248170e-01 7.645080e-01 0.000000e+00
Timer Cumulative Cumulative Cumulative Cumulative Cumulative Cumulative Name number microSecs microSecs microSecs microSecs Efficiency of calls min avg max stddev Rating main 1 1.116e+07 1.116e+07 1.116e+07 0.000e+00 100.00 cycleInit 10 3.477e+06 3.477e+06 3.477e+06 0.000e+00 100.00 cycleTracking 10 7.685e+06 7.685e+06 7.685e+06 0.000e+00 100.00 cycleTracking_Kernel 104 4.944e+06 4.944e+06 4.944e+06 0.000e+00 100.00 cycleTracking_MPI 117 2.200e+05 2.200e+05 2.200e+05 0.000e+00 100.00 cycleTracking_Test_Done 0 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.00 cycleFinalize 20 4.170e+02 4.170e+02 4.170e+02 0.000e+00 100.00 Figure Of Merit 117.23 [Num Mega Segments / Cycle Tracking Time]
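The reported Figure Of Merit can be cross-checked against the per-cycle tallies in the output above. The sketch below reads the label "Num Mega Segments / Cycle Tracking Time" literally: it sums the `num_seg` column, converts to millions, and divides by the summed per-cycle `cycleTracking` times, which are assumed to be in seconds (their sum matches the 7.685e+06 microseconds in the timer report). This is a reading of the log, not a statement of how QuickSilver computes the value internally, and `figure_of_merit` is a hypothetical name.

```python
# (num_seg, cycleTracking in seconds) for cycles 0..9, copied from the tally output
CYCLES = [
    (55527935, 6.218600e-01),
    (94633679, 7.619230e-01),
    (95010375, 7.764070e-01),
    (94953591, 8.343250e-01),
    (94599337, 7.981060e-01),
    (94148236, 7.726620e-01),
    (93689264, 7.726120e-01),
    (93216931, 7.911310e-01),
    (92768273, 7.911720e-01),
    (92324678, 7.645080e-01),
]


def figure_of_merit(cycles):
    """Millions of segments processed per second of cycle tracking time."""
    total_segments = sum(num_seg for num_seg, _ in cycles)
    tracking_seconds = sum(seconds for _, seconds in cycles)
    return total_segments / 1e6 / tracking_seconds


print(f"{figure_of_merit(CYCLES):.2f}")
```

Under these assumptions the result rounds to the 117.23 shown in the log.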
sobel_filter
Environment Variables:
OPENCV_IO_MAX_IMAGE_PIXELS=1677721600
Command:
/home/test-user/bench_workdir/sobel_filter/sobel_filter -i /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5
Output:
SYMN: Welcome to the SYCL version of Sobel filter workload. SYMN: Input image file: /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png SYMN: Launching SYCL kernel with # of iterations: 5 time to subtract from total: 7.45873 s sobelfilter - total time for whole calculation: 0.612759 s
After rebasing, using immediate command lists exposes a bug when running the new tests added in https://github.com/oneapi-src/unified-runtime/commit/ac7eb1717ab1d373640b62de0fdb2bf2dfa2b087 by @EwanC.
The failures show up in https://github.com/intel/llvm/pull/15054 for memcpy 2D. This PR stays in draft until that is resolved.
@nrspruit Is this PR a WIP?
Hello @omarahmed1111, yes. This patch exposes a problem in an L0 driver, so it is pending and will most likely need to be updated before it can be merged. I will remove the 0.10.x until we can get this resolved.
E2E failure is unrelated to this change.
Based on the CI results in https://github.com/intel/llvm/pull/15031 and https://github.com/intel/llvm/pull/15054, no failures are caused by this change. The Jenkins jobs are failing randomly.
Awaiting one more re-review before this is ready to merge.