Implement getitem backward
- Added the getitem backward kernel and its solver.
- Added a driver test and a gtest for getitem backward.
- Compared to ROCm PyTorch, MIOpen HIP is roughly 2x to 10x faster on the shapes benchmarked below, with most cases between 2x and 4x.
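
For reviewers unfamiliar with the op, here is a minimal reference sketch of what getitem backward computes along `dim = 0`. It assumes the semantics match the gradient of PyTorch advanced indexing (`y = x[index]`), where each row of `dy` is accumulated into the row of `dx` selected by `index`. The function name `getitem_backward_dim0` and its arguments are hypothetical and are not part of the MIOpen API; the actual kernel is a HIP implementation.

```python
def getitem_backward_dim0(dy, index, dx_rows):
    """Reference sketch (assumed semantics): dx[index[i]] += dy[i] along dim 0.

    dy      -- list of rows, one row per entry in index (shape: dy_size)
    index   -- list of row positions into dx (shape: index_size)
    dx_rows -- number of rows in dx (first entry of dx_size)
    """
    row_len = len(dy[0]) if dy else 0
    # dx starts as zeros; rows never selected by index keep a zero gradient.
    dx = [[0.0] * row_len for _ in range(dx_rows)]
    for i, target in enumerate(index):
        for j in range(row_len):
            # Repeated indices accumulate rather than overwrite.
            dx[target][j] += dy[i][j]
    return dx


dy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
index = [0, 2, 0]
print(getitem_backward_dim0(dy, index, dx_rows=3))
# [[6.0, 8.0], [0.0, 0.0], [3.0, 4.0]]
```

Accumulation on repeated indices is why `index_size` can exceed the first dimension of `dx_size` in the tables, for example `dx_size = [1]` with `index_size = [13]`.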
getitem float16
| op_name | dtype | dx_size | index_size | dy_size | dim | model | dir | ROCm PyTorch (op time) | MIOpen HIP (op time) | Improvement (ROCm / MIOpen) |
|---|---|---|---|---|---|---|---|---|---|---|
| getitem | float16 | [128 128] | [128] | [128 128] | 0 | llama2 | bwd | 321654 | 53474 | 6.02 |
| getitem | float16 | [3234 4] | [16] | [16 4] | 0 | ssdlite | bwd | 249380 | 60070 | 4.15 |
| getitem | float16 | [3234 4] | [12] | [12 4] | 0 | ssdlite | bwd | 211604 | 57741 | 3.66 |
| getitem | float16 | [3234 4] | [11] | [11 4] | 0 | ssdlite | bwd | 254389 | 54470 | 4.67 |
| getitem | float16 | [3234 4] | [13] | [13 4] | 0 | ssdlite | bwd | 240549 | 58079 | 4.14 |
| getitem | float16 | [3234 4] | [14] | [14 4] | 0 | ssdlite | bwd | 220932 | 59892 | 3.69 |
| getitem | float16 | [3234 4] | [15] | [15 4] | 0 | ssdlite | bwd | 243876 | 58523 | 4.17 |
| getitem | float16 | [3234 4] | [10] | [10 4] | 0 | ssdlite | bwd | 234308 | 60301 | 3.89 |
| getitem | float16 | [3234 4] | [18] | [18 4] | 0 | ssdlite | bwd | 214548 | 59323 | 3.62 |
| getitem | float16 | [3234 4] | [20] | [20 4] | 0 | ssdlite | bwd | 221380 | 61030 | 3.63 |
| getitem | float16 | [3234 4] | [23] | [23 4] | 0 | ssdlite | bwd | 243877 | 59946 | 4.07 |
| getitem | float16 | [3234 4] | [8] | [8 4] | 0 | ssdlite | bwd | 210948 | 59039 | 3.57 |
| getitem | float16 | [3234 4] | [22] | [22 4] | 0 | ssdlite | bwd | 223524 | 59021 | 3.79 |
| getitem | float16 | [3234 4] | [25] | [25 4] | 0 | ssdlite | bwd | 246389 | 59394 | 4.15 |
| getitem | float16 | [3234 4] | [7] | [7 4] | 0 | ssdlite | bwd | 206372 | 57030 | 3.62 |
| getitem | float16 | [3234 4] | [28] | [28 4] | 0 | ssdlite | bwd | 226452 | 61297 | 3.69 |
| getitem | float16 | [3234 4] | [30] | [30 4] | 0 | ssdlite | bwd | 206788 | 60265 | 3.43 |
| getitem | float16 | [3234 4] | [31] | [31 4] | 0 | ssdlite | bwd | 223364 | 62025 | 3.60 |
| getitem | float16 | [3234 4] | [32] | [32 4] | 0 | ssdlite | bwd | 215620 | 60354 | 3.57 |
| getitem | float16 | [3234 4] | [33] | [33 4] | 0 | ssdlite | bwd | 203548 | 55057 | 3.70 |
| getitem | float16 | [3234 4] | [34] | [34 4] | 0 | ssdlite | bwd | 223060 | 61474 | 3.63 |
| getitem | float16 | [3234] | [16] | [16] | 0 | ssdlite | bwd | 224244 | 60408 | 3.71 |
| getitem | float16 | [149 128] | [1490] | [1490 128] | 0 | llama2_7b | bwd | 287158 | 65972 | 4.35 |
| getitem | float16 | [150 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 235060 | 63465 | 3.70 |
| getitem | float16 | [174 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 266485 | 62950 | 4.23 |
| getitem | float16 | [205 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 247861 | 63604 | 3.90 |
| getitem | float16 | [232 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 257685 | 61379 | 4.20 |
| getitem | float16 | [3234] | [15] | [15] | 0 | ssdlite | bwd | 242532 | 58479 | 4.15 |
| getitem | float16 | [3234 4] | [2] | [2 4] | 0 | ssdlite | bwd | 219716 | 60026 | 3.66 |
| getitem | float16 | [3234 4] | [37] | [37 4] | 0 | ssdlite | bwd | 267941 | 62142 | 4.31 |
| getitem | float16 | [3234] | [10] | [10] | 0 | ssdlite | bwd | 263285 | 62337 | 4.22 |
| getitem | float16 | [3234 4] | [38] | [38 4] | 0 | ssdlite | bwd | 219188 | 63012 | 3.48 |
| getitem | float16 | [3234 4] | [41] | [41 4] | 0 | ssdlite | bwd | 181827 | 60665 | 3.00 |
| getitem | float16 | [3234 4] | [6] | [6 4] | 0 | ssdlite | bwd | 232612 | 64968 | 3.58 |
| getitem | float16 | [3234 4] | [44] | [44 4] | 0 | ssdlite | bwd | 299382 | 55455 | 5.40 |
| getitem | float16 | [3234] | [22] | [22] | 0 | ssdlite | bwd | 265733 | 55438 | 4.79 |
| getitem | float16 | [3234] | [25] | [25] | 0 | ssdlite | bwd | 193556 | 60469 | 3.20 |
| getitem | float16 | [3234 4] | [43] | [43 4] | 0 | ssdlite | bwd | 222116 | 59971 | 3.70 |
| getitem | float16 | [3234] | [21] | [21] | 0 | ssdlite | bwd | 204452 | 59811 | 3.42 |
| getitem | float16 | [3234] | [27] | [27] | 0 | ssdlite | bwd | 187636 | 60967 | 3.08 |
| getitem | float16 | [3234] | [24] | [24] | 0 | ssdlite | bwd | 213284 | 56219 | 3.79 |
| getitem | float16 | [3234 4] | [42] | [42 4] | 0 | ssdlite | bwd | 209716 | 61002 | 3.44 |
| getitem | float16 | [3234 4] | [3] | [3 4] | 0 | ssdlite | bwd | 184708 | 58247 | 3.17 |
| getitem | float16 | [3234 4] | [45] | [45 4] | 0 | ssdlite | bwd | 220964 | 61909 | 3.57 |
| getitem | float16 | [3234] | [40] | [40] | 0 | ssdlite | bwd | 248037 | 54475 | 4.55 |
| getitem | float16 | [1 4] | [8] | [8 4] | 0 | ssdlite | bwd | 214212 | 59774 | 3.58 |
| getitem | float16 | [1] | [13] | [13] | 0 | ssdlite | bwd | 638684 | 61516 | 10.38 |
| getitem | float16 | [3234] | [45] | [45] | 0 | ssdlite | bwd | 214692 | 66920 | 3.21 |
| getitem | float16 | [1 4] | [7] | [7 4] | 0 | ssdlite | bwd | 219844 | 60449 | 3.64 |
| getitem | float16 | [1] | [7] | [7] | 0 | ssdlite | bwd | 199988 | 60822 | 3.29 |
| getitem | float16 | [3234] | [4] | [4] | 0 | ssdlite | bwd | 220484 | 58760 | 3.75 |
| getitem | float16 | [3234 4] | [59] | [59 4] | 0 | ssdlite | bwd | 185812 | 61853 | 3.00 |
| getitem | float16 | [3234] | [5] | [5] | 0 | ssdlite | bwd | 223844 | 59720 | 3.75 |
| getitem | float16 | [2 4] | [15] | [15 4] | 0 | ssdlite | bwd | 229764 | 58048 | 3.96 |
| getitem | float16 | [3234 4] | [62] | [62 4] | 0 | ssdlite | bwd | 228916 | 61035 | 3.75 |
| getitem | float16 | [3234] | [57] | [57] | 0 | ssdlite | bwd | 175347 | 61887 | 2.83 |
| getitem | float16 | [3234 4] | [77] | [77 4] | 0 | ssdlite | bwd | 185524 | 63025 | 2.94 |
| getitem | float16 | [3234 4] | [79] | [79 4] | 0 | ssdlite | bwd | 185380 | 62545 | 2.96 |
| getitem | float16 | [3234 4] | [75] | [75 4] | 0 | ssdlite | bwd | 166003 | 64340 | 2.58 |
| getitem | float16 | [3234] | [59] | [59] | 0 | ssdlite | bwd | 160531 | 60767 | 2.64 |
| getitem | float16 | [8741 4] | [2000] | [2000 4] | 0 | maskrcnn | bwd | 228196 | 60553 | 3.77 |
| getitem | float16 | [8741] | [2000] | [2000] | 0 | maskrcnn | bwd | 198628 | 61833 | 3.21 |
| getitem | float16 | [3234 4] | [89] | [89 4] | 0 | ssdlite | bwd | 138706 | 64713 | 2.14 |
| getitem | float16 | [3234 4] | [84] | [84 4] | 0 | ssdlite | bwd | 135922 | 64038 | 2.12 |
| getitem | float16 | [8741 4] | [8741] | [8741 4] | 0 | maskrcnn | bwd | 211428 | 58420 | 3.62 |
| getitem | float16 | [8741] | [8741] | [8741] | 0 | maskrcnn | bwd | 202004 | 53620 | 3.77 |
| getitem | float16 | [3234] | [58] | [58] | 0 | ssdlite | bwd | 130242 | 61140 | 2.13 |
| getitem | float16 | [3 4] | [20] | [20 4] | 0 | ssdlite | bwd | 137362 | 62900 | 2.18 |
| getitem | float16 | [3] | [20] | [20] | 0 | ssdlite | bwd | 135986 | 60251 | 2.26 |
| getitem | float16 | [3234] | [64] | [64] | 0 | ssdlite | bwd | 138371 | 63077 | 2.19 |
| getitem | float16 | [3234 4] | [83] | [83 4] | 0 | ssdlite | bwd | 132819 | 64002 | 2.08 |
| getitem | float16 | [3 4] | [17] | [17 4] | 0 | ssdlite | bwd | 133779 | 62331 | 2.15 |
| getitem | float16 | [3] | [17] | [17] | 0 | ssdlite | bwd | 131394 | 62953 | 2.09 |
| getitem | float16 | [3234 4] | [86] | [86 4] | 0 | ssdlite | bwd | 144003 | 63131 | 2.28 |
| getitem | float16 | [3234 4] | [88] | [88 4] | 0 | ssdlite | bwd | 135123 | 63682 | 2.12 |
| getitem | float16 | [3234] | [65] | [65] | 0 | ssdlite | bwd | 129378 | 62065 | 2.08 |
| getitem | float16 | [3234] | [73] | [73] | 0 | ssdlite | bwd | 138162 | 64962 | 2.13 |
| getitem | float16 | [3234] | [61] | [61] | 0 | ssdlite | bwd | 133827 | 59540 | 2.25 |
| getitem | float16 | [3 4] | [23] | [23 4] | 0 | ssdlite | bwd | 131506 | 62598 | 2.10 |
| getitem | float16 | [3] | [23] | [23] | 0 | ssdlite | bwd | 136962 | 61229 | 2.24 |
| getitem | float16 | [3 4] | [21] | [21 4] | 0 | ssdlite | bwd | 139315 | 58740 | 2.37 |
| getitem | float16 | [3] | [21] | [21] | 0 | ssdlite | bwd | 133459 | 61975 | 2.15 |
| getitem | float16 | [2 4] | [8] | [8 4] | 0 | ssdlite | bwd | 136290 | 58775 | 2.32 |
| getitem | float16 | [2] | [8] | [8] | 0 | ssdlite | bwd | 135587 | 61975 | 2.19 |
| getitem | float16 | [3 4] | [28] | [28 4] | 0 | ssdlite | bwd | 135827 | 58473 | 2.32 |
| getitem | float16 | [3] | [28] | [28] | 0 | ssdlite | bwd | 137091 | 67487 | 2.03 |
| getitem | float16 | [2 4] | [27] | [27 4] | 0 | ssdlite | bwd | 144339 | 60873 | 2.37 |
| getitem | float16 | [2] | [27] | [27] | 0 | ssdlite | bwd | 141299 | 64411 | 2.19 |
| getitem | float16 | [3 4] | [22] | [22 4] | 0 | ssdlite | bwd | 134851 | 60535 | 2.23 |
| getitem | float16 | [3] | [22] | [22] | 0 | ssdlite | bwd | 139475 | 60517 | 2.30 |
| getitem | float16 | [3234 4] | [78] | [78 4] | 0 | ssdlite | bwd | 135538 | 62011 | 2.19 |
| getitem | float16 | [2 4] | [28] | [28 4] | 0 | ssdlite | bwd | 136659 | 61744 | 2.21 |
| getitem | float16 | [2] | [28] | [28] | 0 | ssdlite | bwd | 144307 | 60198 | 2.40 |
| getitem | float16 | [3234] | [67] | [67] | 0 | ssdlite | bwd | 136146 | 62367 | 2.18 |
| getitem | float16 | [3234] | [74] | [74] | 0 | ssdlite | bwd | 132786 | 61584 | 2.16 |
| getitem | float16 | [3234] | [70] | [70] | 0 | ssdlite | bwd | 147762 | 59860 | 2.47 |
| getitem | float16 | [3 4] | [26] | [26 4] | 0 | ssdlite | bwd | 159059 | 61015 | 2.61 |
| getitem | float16 | [3] | [26] | [26] | 0 | ssdlite | bwd | 152835 | 67380 | 2.27 |
| getitem | float16 | [3 4] | [29] | [29 4] | 0 | ssdlite | bwd | 157571 | 65264 | 2.41 |
| getitem | float16 | [3] | [29] | [29] | 0 | ssdlite | bwd | 161635 | 63255 | 2.56 |
| getitem | float16 | [3 4] | [18] | [18 4] | 0 | ssdlite | bwd | 151955 | 60553 | 2.51 |
| getitem | float16 | [3] | [18] | [18] | 0 | ssdlite | bwd | 155939 | 61246 | 2.55 |
| getitem | float16 | [3234] | [68] | [68] | 0 | ssdlite | bwd | 161587 | 62722 | 2.58 |
| getitem | float16 | [3234] | [66] | [66] | 0 | ssdlite | bwd | 160131 | 62224 | 2.57 |
| getitem | float16 | [3 4] | [19] | [19 4] | 0 | ssdlite | bwd | 154099 | 59024 | 2.61 |
| getitem | float16 | [3] | [19] | [19] | 0 | ssdlite | bwd | 148547 | 56250 | 2.64 |
| getitem | float16 | [3234 4] | [90] | [90 4] | 0 | ssdlite | bwd | 158995 | 63060 | 2.52 |
| getitem | float16 | [3] | [25] | [25] | 0 | ssdlite | bwd | 150099 | 60108 | 2.50 |
| getitem | float16 | [3234 4] | [87] | [87 4] | 0 | ssdlite | bwd | 159315 | 63077 | 2.53 |
| getitem | float16 | [3 4] | [10] | [10 4] | 0 | ssdlite | bwd | 158275 | 59895 | 2.64 |
| getitem | float16 | [3] | [10] | [10] | 0 | ssdlite | bwd | 148851 | 58188 | 2.56 |
| getitem | float16 | [3 4] | [27] | [27 4] | 0 | ssdlite | bwd | 157251 | 60784 | 2.59 |
| getitem | float16 | [3] | [27] | [27] | 0 | ssdlite | bwd | 158787 | 64838 | 2.45 |
| getitem | float16 | [3 4] | [30] | [30 4] | 0 | ssdlite | bwd | 154851 | 61975 | 2.50 |
| getitem | float16 | [3] | [30] | [30] | 0 | ssdlite | bwd | 159859 | 54526 | 2.93 |
| getitem | float16 | [3 4] | [16] | [16 4] | 0 | ssdlite | bwd | 159331 | 61122 | 2.61 |
| getitem | float16 | [3] | [16] | [16] | 0 | ssdlite | bwd | 151491 | 64322 | 2.36 |
| getitem | float16 | [3 4] | [33] | [33 4] | 0 | ssdlite | bwd | 158995 | 63539 | 2.50 |
| getitem | float16 | [3] | [33] | [33] | 0 | ssdlite | bwd | 163427 | 58775 | 2.78 |
| getitem | float16 | [3234 4] | [81] | [81 4] | 0 | ssdlite | bwd | 155011 | 63522 | 2.44 |
| getitem | float16 | [3 4] | [34] | [34 4] | 0 | ssdlite | bwd | 154659 | 63415 | 2.44 |
| getitem | float16 | [3] | [34] | [34] | 0 | ssdlite | bwd | 161091 | 62028 | 2.60 |
| getitem | float16 | [4300 4] | [4261] | [4261 4] | 0 | fasterrcnn | bwd | 208404 | 60410 | 3.45 |
| getitem | float16 | [4300] | [4261] | [4261] | 0 | fasterrcnn | bwd | 208388 | 63860 | 3.26 |
| getitem | float16 | [4261 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 160339 | 61815 | 2.59 |
| getitem | float16 | [4261] | [1000] | [1000] | 0 | fasterrcnn | bwd | 165416 | 61317 | 2.70 |
| getitem | float16 | [3 4] | [31] | [31 4] | 0 | ssdlite | bwd | 168131 | 61975 | 2.71 |
| getitem | float16 | [3] | [31] | [31] | 0 | ssdlite | bwd | 148131 | 63255 | 2.34 |
| getitem | float16 | [2 4] | [29] | [29 4] | 0 | ssdlite | bwd | 157119 | 62206 | 2.53 |
| getitem | float16 | [2] | [29] | [29] | 0 | ssdlite | bwd | 161795 | 60873 | 2.66 |
| getitem | float16 | [3234] | [72] | [72] | 0 | ssdlite | bwd | 152995 | 61139 | 2.50 |
| getitem | float16 | [3234] | [69] | [69] | 0 | ssdlite | bwd | 152755 | 62971 | 2.43 |
| getitem | float16 | [3234 4] | [94] | [94 4] | 0 | ssdlite | bwd | 157939 | 65477 | 2.41 |
| getitem | float16 | [3 4] | [11] | [11 4] | 0 | ssdlite | bwd | 151907 | 54561 | 2.78 |
| getitem | float16 | [3] | [11] | [11] | 0 | ssdlite | bwd | 159091 | 60357 | 2.64 |
| getitem | float16 | [3 4] | [13] | [13 4] | 0 | ssdlite | bwd | 149555 | 59486 | 2.51 |
| getitem | float16 | [3] | [13] | [13] | 0 | ssdlite | bwd | 153779 | 57797 | 2.66 |
| getitem | float16 | [3 4] | [12] | [12 4] | 0 | ssdlite | bwd | 157283 | 59504 | 2.64 |
| getitem | float16 | [3] | [12] | [12] | 0 | ssdlite | bwd | 152963 | 60073 | 2.55 |
| getitem | float16 | [3 4] | [14] | [14 4] | 0 | ssdlite | bwd | 151123 | 67291 | 2.25 |
| getitem | float16 | [3] | [14] | [14] | 0 | ssdlite | bwd | 156915 | 63611 | 2.47 |
| getitem | float16 | [3234] | [71] | [71] | 0 | ssdlite | bwd | 155427 | 61851 | 2.51 |
| getitem | float16 | [3 4] | [24] | [24 4] | 0 | ssdlite | bwd | 155475 | 59770 | 2.60 |
| getitem | float16 | [3] | [24] | [24] | 0 | ssdlite | bwd | 160387 | 62028 | 2.59 |
| getitem | float16 | [3234 4] | [101] | [101 4] | 0 | ssdlite | bwd | 154627 | 63415 | 2.44 |
| getitem | float16 | [1 4] | [4] | [4 4] | 0 | ssdlite | bwd | 148931 | 61566 | 2.42 |
| getitem | float16 | [1] | [4] | [4] | 0 | ssdlite | bwd | 156211 | 61477 | 2.54 |
| getitem | float16 | [1 4] | [6] | [6 4] | 0 | ssdlite | bwd | 154675 | 54366 | 2.85 |
| getitem | float16 | [1] | [6] | [6] | 0 | ssdlite | bwd | 150579 | 59433 | 2.53 |
| getitem | float16 | [3234 4] | [85] | [85 4] | 0 | ssdlite | bwd | 158851 | 62490 | 2.54 |
| getitem | float16 | [3234 4] | [92] | [92 4] | 0 | ssdlite | bwd | 156035 | 64873 | 2.41 |
| getitem | float16 | [4 4] | [4] | [4 4] | 0 | ssdlite | bwd | 154755 | 58668 | 2.64 |
| getitem | float16 | [4] | [4] | [4] | 0 | ssdlite | bwd | 150051 | 62330 | 2.41 |
| getitem | float16 | [3234] | [76] | [76] | 0 | ssdlite | bwd | 159507 | 56588 | 2.82 |
| getitem | float16 | [3 4] | [15] | [15 4] | 0 | ssdlite | bwd | 153107 | 59646 | 2.57 |
| getitem | float16 | [3] | [15] | [15] | 0 | ssdlite | bwd | 144019 | 60001 | 2.40 |
| getitem | float16 | [3234] | [80] | [80] | 0 | ssdlite | bwd | 146691 | 61122 | 2.40 |
| getitem | float16 | [3 4] | [32] | [32 4] | 0 | ssdlite | bwd | 145331 | 61353 | 2.37 |
| getitem | float16 | [3] | [32] | [32] | 0 | ssdlite | bwd | 142178 | 57050 | 2.49 |
| getitem | float16 | [3234 4] | [91] | [91 4] | 0 | ssdlite | bwd | 137411 | 62881 | 2.19 |
| getitem | float16 | [4300 4] | [4262] | [4262 4] | 0 | fasterrcnn | bwd | 202772 | 58241 | 3.48 |
| getitem | float16 | [4300] | [4262] | [4262] | 0 | fasterrcnn | bwd | 188083 | 57939 | 3.25 |
| getitem | float16 | [4262 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 152211 | 60819 | 2.50 |
| getitem | float16 | [4262] | [1000] | [1000] | 0 | fasterrcnn | bwd | 143955 | 61548 | 2.34 |
| getitem | float16 | [1] | [5] | [5] | 0 | maskrcnn_ssdlite | bwd | 133970 | 61139 | 2.19 |
| getitem | float16 | [4 4] | [27] | [27 4] | 0 | ssdlite | bwd | 139363 | 62117 | 2.24 |
| getitem | float16 | [4] | [27] | [27] | 0 | ssdlite | bwd | 135491 | 62028 | 2.18 |
| getitem | float16 | [1 4] | [5] | [5 4] | 0 | ssdlite | bwd | 130291 | 58935 | 2.21 |
| getitem | float16 | [4 4] | [28] | [28 4] | 0 | ssdlite | bwd | 138797 | 59895 | 2.32 |
| getitem | float16 | [4] | [28] | [28] | 0 | ssdlite | bwd | 137699 | 60730 | 2.27 |
| getitem | float16 | [4300 4] | [4194] | [4194 4] | 0 | fasterrcnn | bwd | 194276 | 59468 | 3.27 |
| getitem | float16 | [4300] | [4194] | [4194] | 0 | fasterrcnn | bwd | 190578 | 61886 | 3.08 |
| getitem | float16 | [4194 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 171235 | 60410 | 2.83 |
| getitem | float16 | [2 4] | [30] | [30 4] | 0 | ssdlite | bwd | 158163 | 60837 | 2.60 |
| getitem | float16 | [2] | [30] | [30] | 0 | ssdlite | bwd | 151331 | 62153 | 2.43 |
| getitem | float16 | [4 4] | [33] | [33 4] | 0 | ssdlite | bwd | 157619 | 61459 | 2.56 |
| getitem | float16 | [4] | [33] | [33] | 0 | ssdlite | bwd | 156979 | 61157 | 2.57 |
| getitem | float16 | [3234] | [77] | [77] | 0 | ssdlite | bwd | 152355 | 61993 | 2.46 |
| getitem | float16 | [3234 4] | [93] | [93 4] | 0 | ssdlite | bwd | 149283 | 62935 | 2.37 |
| getitem | float16 | [3234 4] | [105] | [105 4] | 0 | ssdlite | bwd | 158163 | 63344 | 2.50 |
| getitem | float16 | [3234 4] | [113] | [113 4] | 0 | ssdlite | bwd | 150099 | 64464 | 2.33 |
| getitem | float16 | [4194] | [1000] | [1000] | 0 | fasterrcnn | bwd | 160915 | 70206 | 2.29 |
| getitem | float16 | [3 4] | [37] | [37 4] | 0 | ssdlite | bwd | 145811 | 59148 | 2.47 |
| getitem | float16 | [3] | [37] | [37] | 0 | ssdlite | bwd | 146574 | 57595 | 2.54 |
| getitem | float16 | [4 4] | [37] | [37 4] | 0 | ssdlite | bwd | 146515 | 64866 | 2.26 |
| getitem | float16 | [4] | [37] | [37] | 0 | ssdlite | bwd | 148307 | 62217 | 2.38 |
| getitem | float16 | [2 4] | [7] | [7 4] | 0 | ssdlite | bwd | 134354 | 64244 | 2.09 |
| getitem | float16 | [2] | [7] | [7] | 0 | ssdlite | bwd | 150131 | 59159 | 2.54 |
| getitem | float16 | [4 4] | [18] | [18 4] | 0 | ssdlite | bwd | 140434 | 55355 | 2.54 |
| getitem | float16 | [4] | [18] | [18] | 0 | ssdlite | bwd | 134099 | 59800 | 2.24 |
| getitem | float16 | [3 4] | [9] | [9 4] | 0 | ssdlite | bwd | 135986 | 59462 | 2.29 |
| getitem | float16 | [3] | [9] | [9] | 0 | ssdlite | bwd | 154323 | 67159 | 2.30 |
| getitem | float16 | [3234] | [79] | [79] | 0 | ssdlite | bwd | 140083 | 63515 | 2.21 |
| getitem | float16 | [4 4] | [30] | [30 4] | 0 | ssdlite | bwd | 133794 | 59604 | 2.24 |
| getitem | float16 | [4] | [30] | [30] | 0 | ssdlite | bwd | 149619 | 61115 | 2.45 |
| getitem | float16 | [3234] | [75] | [75] | 0 | ssdlite | bwd | 139619 | 60600 | 2.30 |
| getitem | float16 | [3 4] | [35] | [35 4] | 0 | ssdlite | bwd | 136691 | 62360 | 2.19 |
| getitem | float16 | [4 4] | [39] | [39 4] | 0 | ssdlite | bwd | 216420 | 68975 | 3.14 |
| getitem | float16 | [8] | [2008] | [2008] | 0 | maskrcnn | bwd | 263685 | 128797 | 2.05 |
| getitem | float16 | [5] | [14] | [14] | 0 | maskrcnn_ssdlite | bwd | 196612 | 61884 | 3.18 |
| getitem | float16 | [8 4] | [8] | [8 4] | 0 | ssdlite | bwd | 168851 | 60035 | 2.81 |
| getitem | float16 | [4197 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 166435 | 64302 | 2.59 |
| getitem | float16 | [4197] | [1000] | [1000] | 0 | fasterrcnn | bwd | 171219 | 59858 | 2.86 |
| getitem | float16 | [16] | [2016] | [2016] | 0 | maskrcnn | bwd | 238692 | 95396 | 2.50 |
| getitem | float16 | [7 4] | [33] | [33 4] | 0 | ssdlite | bwd | 155811 | 59449 | 2.62 |

getitem float32
| op_name | dtype | dx_size | index_size | dy_size | dim | model | dir | ROCm PyTorch (op time) | MIOpen HIP (op time) | Improvement (ROCm / MIOpen) |
|---|---|---|---|---|---|---|---|---|---|---|
| getitem | float32 | [128 128] | [128] | [128 128] | 0 | llama2 | bwd | 259861 | 61190 | 4.25 |
| getitem | float32 | [3234 4] | [16] | [16 4] | 0 | ssdlite | bwd | 239348 | 61048 | 3.92 |
| getitem | float32 | [3234 4] | [12] | [12 4] | 0 | ssdlite | bwd | 249861 | 59945 | 4.17 |
| getitem | float32 | [3234 4] | [11] | [11 4] | 0 | ssdlite | bwd | 233364 | 59288 | 3.94 |
| getitem | float32 | [3234 4] | [13] | [13 4] | 0 | ssdlite | bwd | 230340 | 58506 | 3.94 |
| getitem | float32 | [3234 4] | [14] | [14 4] | 0 | ssdlite | bwd | 253285 | 60248 | 4.20 |
| getitem | float32 | [3234 4] | [15] | [15 4] | 0 | ssdlite | bwd | 227812 | 58808 | 3.87 |
| getitem | float32 | [3234 4] | [10] | [10 4] | 0 | ssdlite | bwd | 219668 | 60248 | 3.65 |
| getitem | float32 | [3234 4] | [18] | [18 4] | 0 | ssdlite | bwd | 236724 | 63963 | 3.70 |
| getitem | float32 | [3234 4] | [20] | [20 4] | 0 | ssdlite | bwd | 234116 | 55679 | 4.20 |
| getitem | float32 | [3234 4] | [23] | [23 4] | 0 | ssdlite | bwd | 197892 | 60443 | 3.27 |
| getitem | float32 | [3234 4] | [8] | [8 4] | 0 | ssdlite | bwd | 228756 | 54541 | 4.19 |
| getitem | float32 | [3234 4] | [22] | [22 4] | 0 | ssdlite | bwd | 233748 | 60870 | 3.84 |
| getitem | float32 | [3234 4] | [25] | [25 4] | 0 | ssdlite | bwd | 200356 | 62221 | 3.22 |
| getitem | float32 | [3234 4] | [7] | [7 4] | 0 | ssdlite | bwd | 213124 | 58541 | 3.64 |
| getitem | float32 | [3234 4] | [28] | [28 4] | 0 | ssdlite | bwd | 214148 | 60159 | 3.56 |
| getitem | float32 | [3234 4] | [30] | [30 4] | 0 | ssdlite | bwd | 224036 | 62506 | 3.58 |
| getitem | float32 | [3234 4] | [31] | [31 4] | 0 | ssdlite | bwd | 211092 | 59110 | 3.57 |
| getitem | float32 | [3234 4] | [32] | [32 4] | 0 | ssdlite | bwd | 213524 | 61634 | 3.46 |
| getitem | float32 | [3234 4] | [33] | [33 4] | 0 | ssdlite | bwd | 223892 | 60195 | 3.72 |
| getitem | float32 | [3234 4] | [34] | [34 4] | 0 | ssdlite | bwd | 210660 | 59999 | 3.51 |
| getitem | float32 | [3234] | [16] | [16] | 0 | ssdlite | bwd | 214116 | 60426 | 3.54 |
| getitem | float32 | [149 128] | [1490] | [1490 128] | 0 | llama2_7b | bwd | 254165 | 83092 | 3.06 |
| getitem | float32 | [150 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 267893 | 61794 | 4.34 |
| getitem | float32 | [174 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 272789 | 61705 | 4.42 |
| getitem | float32 | [205 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 298054 | 62004 | 4.81 |
| getitem | float32 | [232 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 256661 | 62285 | 4.12 |
| getitem | float32 | [248 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 237956 | 63600 | 3.74 |
| getitem | float32 | [3234] | [15] | [15] | 0 | ssdlite | bwd | 247045 | 57679 | 4.28 |
| getitem | float32 | [3234 4] | [2] | [2 4] | 0 | ssdlite | bwd | 263477 | 63973 | 4.12 |
| getitem | float32 | [3234 4] | [37] | [37 4] | 0 | ssdlite | bwd | 238756 | 62942 | 3.79 |
| getitem | float32 | [3234] | [10] | [10] | 0 | ssdlite | bwd | 258229 | 57803 | 4.47 |
| getitem | float32 | [3234 4] | [38] | [38 4] | 0 | ssdlite | bwd | 218180 | 61021 | 3.58 |
| getitem | float32 | [3234 4] | [41] | [41 4] | 0 | ssdlite | bwd | 277029 | 60487 | 4.58 |
| getitem | float32 | [3234 4] | [6] | [6 4] | 0 | ssdlite | bwd | 269109 | 58425 | 4.61 |
| getitem | float32 | [3234 4] | [44] | [44 4] | 0 | ssdlite | bwd | 156131 | 61536 | 2.54 |
| getitem | float32 | [3234] | [22] | [22] | 0 | ssdlite | bwd | 230196 | 59029 | 3.90 |
| getitem | float32 | [3234] | [25] | [25] | 0 | ssdlite | bwd | 210116 | 60540 | 3.47 |
| getitem | float32 | [3234 4] | [43] | [43 4] | 0 | ssdlite | bwd | 208052 | 58158 | 3.58 |
| getitem | float32 | [3234] | [21] | [21] | 0 | ssdlite | bwd | 206996 | 60007 | 3.45 |
| getitem | float32 | [3234] | [27] | [27] | 0 | ssdlite | bwd | 213444 | 61749 | 3.46 |
| getitem | float32 | [3234] | [24] | [24] | 0 | ssdlite | bwd | 199588 | 59385 | 3.36 |
| getitem | float32 | [3234 4] | [42] | [42 4] | 0 | ssdlite | bwd | 199860 | 55419 | 3.61 |
| getitem | float32 | [3234 4] | [3] | [3 4] | 0 | ssdlite | bwd | 212100 | 58958 | 3.60 |
| getitem | float32 | [3234 4] | [45] | [45 4] | 0 | ssdlite | bwd | 196100 | 56628 | 3.46 |
| getitem | float32 | [3234] | [40] | [40] | 0 | ssdlite | bwd | 240628 | 60752 | 3.96 |
| getitem | float32 | [1 4] | [8] | [8 4] | 0 | ssdlite | bwd | 177076 | 58885 | 3.01 |
| getitem | float32 | [1] | [13] | [13] | 0 | ssdlite | bwd | 192084 | 60449 | 3.18 |
| getitem | float32 | [3234] | [45] | [45] | 0 | ssdlite | bwd | 222484 | 60840 | 3.66 |
| getitem | float32 | [1 4] | [7] | [7 4] | 0 | ssdlite | bwd | 189156 | 62013 | 3.05 |
| getitem | float32 | [1] | [7] | [7] | 0 | ssdlite | bwd | 215748 | 61160 | 3.53 |
| getitem | float32 | [3234] | [4] | [4] | 0 | ssdlite | bwd | 196884 | 59951 | 3.28 |
| getitem | float32 | [3234 4] | [59] | [59 4] | 0 | ssdlite | bwd | 229012 | 61515 | 3.72 |
| getitem | float32 | [3234] | [5] | [5] | 0 | ssdlite | bwd | 213780 | 59239 | 3.61 |
| getitem | float32 | [2 4] | [15] | [15 4] | 0 | ssdlite | bwd | 188436 | 64929 | 2.90 |
| getitem | float32 | [3234 4] | [62] | [62 4] | 0 | ssdlite | bwd | 183491 | 64395 | 2.85 |
| getitem | float32 | [3234] | [57] | [57] | 0 | ssdlite | bwd | 182788 | 60642 | 3.01 |
| getitem | float32 | [3234 4] | [77] | [77 4] | 0 | ssdlite | bwd | 184499 | 65460 | 2.82 |
| getitem | float32 | [3234 4] | [79] | [79 4] | 0 | ssdlite | bwd | 163174 | 63558 | 2.57 |
| getitem | float32 | [3234 4] | [75] | [75 4] | 0 | ssdlite | bwd | 156243 | 65496 | 2.39 |
| getitem | float32 | [3234] | [59] | [59] | 0 | ssdlite | bwd | 148819 | 63007 | 2.36 |
| getitem | float32 | [8741 4] | [2000] | [2000 4] | 0 | maskrcnn | bwd | 199124 | 62598 | 3.18 |
| getitem | float32 | [8741] | [2000] | [2000] | 0 | maskrcnn | bwd | 179811 | 65158 | 2.76 |
| getitem | float32 | [3234 4] | [89] | [89 4] | 0 | ssdlite | bwd | 139987 | 66776 | 2.10 |
| getitem | float32 | [3234 4] | [84] | [84 4] | 0 | ssdlite | bwd | 138851 | 63202 | 2.20 |
| getitem | float32 | [8741 4] | [8741] | [8741 4] | 0 | maskrcnn | bwd | 220772 | 60304 | 3.66 |
| getitem | float32 | [8741] | [8741] | [8741] | 0 | maskrcnn | bwd | 183843 | 60731 | 3.03 |
| getitem | float32 | [3234] | [58] | [58] | 0 | ssdlite | bwd | 133907 | 56109 | 2.39 |
| getitem | float32 | [3 4] | [20] | [20 4] | 0 | ssdlite | bwd | 136386 | 61425 | 2.22 |
| getitem | float32 | [3] | [20] | [20] | 0 | ssdlite | bwd | 131311 | 63095 | 2.08 |
| getitem | float32 | [3234] | [64] | [64] | 0 | ssdlite | bwd | 134962 | 61976 | 2.18 |
| getitem | float32 | [3234 4] | [83] | [83 4] | 0 | ssdlite | bwd | 138754 | 64713 | 2.14 |
| getitem | float32 | [3 4] | [17] | [17 4] | 0 | ssdlite | bwd | 134258 | 63380 | 2.12 |
| getitem | float32 | [3] | [17] | [17] | 0 | ssdlite | bwd | 135427 | 64411 | 2.10 |
| getitem | float32 | [3234 4] | [86] | [86 4] | 0 | ssdlite | bwd | 132386 | 66793 | 1.98 |
| getitem | float32 | [3234 4] | [88] | [88 4] | 0 | ssdlite | bwd | 133282 | 64891 | 2.05 |
| getitem | float32 | [3234] | [65] | [65] | 0 | ssdlite | bwd | 137762 | 63113 | 2.18 |
| getitem | float32 | [3234] | [73] | [73] | 0 | ssdlite | bwd | 135731 | 60037 | 2.26 |
| getitem | float32 | [3234] | [61] | [61] | 0 | ssdlite | bwd | 128850 | 62882 | 2.05 |
| getitem | float32 | [3 4] | [23] | [23 4] | 0 | ssdlite | bwd | 139475 | 64944 | 2.15 |
| getitem | float32 | [3] | [23] | [23] | 0 | ssdlite | bwd | 139315 | 64856 | 2.15 |
| getitem | float32 | [3 4] | [21] | [21 4] | 0 | ssdlite | bwd | 135043 | 64784 | 2.08 |
| getitem | float32 | [3] | [21] | [21] | 0 | ssdlite | bwd | 135714 | 65869 | 2.06 |
| getitem | float32 | [2 4] | [8] | [8 4] | 0 | ssdlite | bwd | 134066 | 61922 | 2.17 |
| getitem | float32 | [2] | [8] | [8] | 0 | ssdlite | bwd | 130682 | 61602 | 2.12 |
| getitem | float32 | [3 4] | [28] | [28 4] | 0 | ssdlite | bwd | 144947 | 63469 | 2.28 |
| getitem | float32 | [3] | [28] | [28] | 0 | ssdlite | bwd | 139859 | 64429 | 2.17 |
| getitem | float32 | [2 4] | [27] | [27 4] | 0 | ssdlite | bwd | 137219 | 70011 | 1.96 |
| getitem | float32 | [2] | [27] | [27] | 0 | ssdlite | bwd | 143363 | 69389 | 2.07 |
| getitem | float32 | [3 4] | [22] | [22 4] | 0 | ssdlite | bwd | 138963 | 60553 | 2.29 |
| getitem | float32 | [3] | [22] | [22] | 0 | ssdlite | bwd | 132994 | 61691 | 2.16 |
| getitem | float32 | [3234 4] | [78] | [78 4] | 0 | ssdlite | bwd | 138707 | 64749 | 2.14 |
| getitem | float32 | [2 4] | [28] | [28 4] | 0 | ssdlite | bwd | 143283 | 68927 | 2.08 |
| getitem | float32 | [2] | [28] | [28] | 0 | ssdlite | bwd | 140947 | 66758 | 2.11 |
| getitem | float32 | [3234] | [67] | [67] | 0 | ssdlite | bwd | 136942 | 60998 | 2.25 |
| getitem | float32 | [3234] | [74] | [74] | 0 | ssdlite | bwd | 140643 | 62331 | 2.26 |
| getitem | float32 | [3234] | [70] | [70] | 0 | ssdlite | bwd | 153427 | 59931 | 2.56 |
| getitem | float32 | [3 4] | [26] | [26 4] | 0 | ssdlite | bwd | 156683 | 63984 | 2.45 |
| getitem | float32 | [3] | [26] | [26] | 0 | ssdlite | bwd | 161795 | 72767 | 2.22 |
| getitem | float32 | [3 4] | [29] | [29 4] | 0 | ssdlite | bwd | 155251 | 61531 | 2.52 |
| getitem | float32 | [3] | [29] | [29] | 0 | ssdlite | bwd | 149539 | 65193 | 2.29 |
| getitem | float32 | [3 4] | [18] | [18 4] | 0 | ssdlite | bwd | 157059 | 62669 | 2.51 |
| getitem | float32 | [3] | [18] | [18] | 0 | ssdlite | bwd | 153619 | 57139 | 2.69 |
| getitem | float32 | [3234] | [68] | [68] | 0 | ssdlite | bwd | 152035 | 65318 | 2.33 |
| getitem | float32 | [3234] | [66] | [66] | 0 | ssdlite | bwd | 150563 | 60997 | 2.47 |
| getitem | float32 | [3 4] | [19] | [19 4] | 0 | ssdlite | bwd | 154019 | 63380 | 2.43 |
| getitem | float32 | [3] | [19] | [19] | 0 | ssdlite | bwd | 160147 | 63949 | 2.50 |
| getitem | float32 | [3234 4] | [90] | [90 4] | 0 | ssdlite | bwd | 152963 | 65335 | 2.34 |
| getitem | float32 | [3] | [25] | [25] | 0 | ssdlite | bwd | 158419 | 58224 | 2.72 |
| getitem | float32 | [3234 4] | [87] | [87 4] | 0 | ssdlite | bwd | 151971 | 63931 | 2.38 |
| getitem | float32 | [3 4] | [10] | [10 4] | 0 | ssdlite | bwd | 147987 | 60162 | 2.46 |
| getitem | float32 | [3] | [10] | [10] | 0 | ssdlite | bwd | 157043 | 57993 | 2.71 |
| getitem | float32 | [3 4] | [27] | [27 4] | 0 | ssdlite | bwd | 159027 | 65228 | 2.44 |
| getitem | float32 | [3] | [27] | [27] | 0 | ssdlite | bwd | 151155 | 61246 | 2.47 |
| getitem | float32 | [3 4] | [30] | [30 4] | 0 | ssdlite | bwd | 158163 | 64891 | 2.44 |
| getitem | float32 | [3] | [30] | [30] | 0 | ssdlite | bwd | 158115 | 64055 | 2.47 |
| getitem | float32 | [3 4] | [16] | [16 4] | 0 | ssdlite | bwd | 149187 | 61993 | 2.41 |
| getitem | float32 | [3] | [16] | [16] | 0 | ssdlite | bwd | 158019 | 64109 | 2.46 |
| getitem | float32 | [3 4] | [33] | [33 4] | 0 | ssdlite | bwd | 162515 | 62242 | 2.61 |
| getitem | float32 | [3] | [33] | [33] | 0 | ssdlite | bwd | 152515 | 64624 | 2.36 |
| getitem | float32 | [3234 4] | [81] | [81 4] | 0 | ssdlite | bwd | 157987 | 65264 | 2.42 |
| getitem | float32 | [3 4] | [34] | [34 4] | 0 | ssdlite | bwd | 159235 | 66651 | 2.39 |
| getitem | float32 | [3] | [34] | [34] | 0 | ssdlite | bwd | 151539 | 65104 | 2.33 |
| getitem | float32 | [4300 4] | [4261] | [4261 4] | 0 | fasterrcnn | bwd | 208548 | 65353 | 3.19 |
| getitem | float32 | [4300] | [4261] | [4261] | 0 | fasterrcnn | bwd | 201396 | 65691 | 3.07 |
| getitem | float32 | [4261 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 171427 | 60748 | 2.82 |
| getitem | float32 | [4261] | [1000] | [1000] | 0 | fasterrcnn | bwd | 164723 | 62242 | 2.65 |
| getitem | float32 | [3 4] | [31] | [31 4] | 0 | ssdlite | bwd | 152595 | 69086 | 2.21 |
| getitem | float32 | [3] | [31] | [31] | 0 | ssdlite | bwd | 156531 | 65726 | 2.38 |
| getitem | float32 | [2 4] | [29] | [29 4] | 0 | ssdlite | bwd | 159283 | 69442 | 2.29 |
| getitem | float32 | [2] | [29] | [29] | 0 | ssdlite | bwd | 150067 | 66828 | 2.25 |
| getitem | float32 | [3234] | [72] | [72] | 0 | ssdlite | bwd | 158467 | 59771 | 2.65 |
| getitem | float32 | [3234] | [69] | [69] | 0 | ssdlite | bwd | 161059 | 61335 | 2.63 |
| getitem | float32 | [3234 4] | [94] | [94 4] | 0 | ssdlite | bwd | 146515 | 64144 | 2.28 |
| getitem | float32 | [3 4] | [11] | [11 4] | 0 | ssdlite | bwd | 155843 | 59166 | 2.63 |
| getitem | float32 | [3] | [11] | [11] | 0 | ssdlite | bwd | 147763 | 60428 | 2.45 |
| getitem | float32 | [3 4] | [13] | [13 4] | 0 | ssdlite | bwd | 154563 | 65334 | 2.37 |
| getitem | float32 | [3] | [13] | [13] | 0 | ssdlite | bwd | 154659 | 59148 | 2.61 |
| getitem | float32 | [3 4] | [12] | [12 4] | 0 | ssdlite | bwd | 149299 | 74989 | 1.99 |
| getitem | float32 | [3] | [12] | [12] | 0 | ssdlite | bwd | 156019 | 59504 | 2.62 |
| getitem | float32 | [3 4] | [14] | [14 4] | 0 | ssdlite | bwd | 155619 | 62259 | 2.50 |
| getitem | float32 | [3] | [14] | [14] | 0 | ssdlite | bwd | 146131 | 61691 | 2.37 |
| getitem | float32 | [3234] | [71] | [71] | 0 | ssdlite | bwd | 157123 | 59788 | 2.63 |
| getitem | float32 | [3 4] | [24] | [24 4] | 0 | ssdlite | bwd | 162227 | 65104 | 2.49 |
| getitem | float32 | [3] | [24] | [24] | 0 | ssdlite | bwd | 151939 | 65264 | 2.33 |
| getitem | float32 | [3234 4] | [101] | [101 4] | 0 | ssdlite | bwd | 153763 | 65637 | 2.34 |
| getitem | float32 | [1 4] | [4] | [4 4] | 0 | ssdlite | bwd | 156499 | 59788 | 2.62 |
| getitem | float32 | [1] | [4] | [4] | 0 | ssdlite | bwd | 150691 | 65228 | 2.31 |
| getitem | float32 | [1 4] | [6] | [6 4] | 0 | ssdlite | bwd | 151811 | 59717 | 2.54 |
| getitem | float32 | [1] | [6] | [6] | 0 | ssdlite | bwd | 154915 | 61050 | 2.54 |
| getitem | float32 | [3234 4] | [85] | [85 4] | 0 | ssdlite | bwd | 152451 | 62704 | 2.43 |
| getitem | float32 | [3234 4] | [92] | [92 4] | 0 | ssdlite | bwd | 152867 | 61762 | 2.48 |
| getitem | float32 | [4 4] | [4] | [4 4] | 0 | ssdlite | bwd | 151299 | 61139 | 2.47 |
| getitem | float32 | [4] | [4] | [4] | 0 | ssdlite | bwd | 152755 | 59593 | 2.56 |
| getitem | float32 | [3234] | [76] | [76] | 0 | ssdlite | bwd | 152019 | 60268 | 2.52 |
| getitem | float32 | [3 4] | [15] | [15 4] | 0 | ssdlite | bwd | 141939 | 63894 | 2.22 |
| getitem | float32 | [3] | [15] | [15] | 0 | ssdlite | bwd | 146403 | 61655 | 2.37 |
| getitem | float32 | [3234] | [80] | [80] | 0 | ssdlite | bwd | 148051 | 60446 | 2.45 |
| getitem | float32 | [3 4] | [32] | [32 4] | 0 | ssdlite | bwd | 139203 | 66419 | 2.10 |
| getitem | float32 | [3] | [32] | [32] | 0 | ssdlite | bwd | 136946 | 67166 | 2.04 |
| getitem | float32 | [3234 4] | [91] | [91 4] | 0 | ssdlite | bwd | 142242 | 66508 | 2.14 |
| getitem | float32 | [4300 4] | [4262] | [4262 4] | 0 | fasterrcnn | bwd | 189828 | 63521 | 2.99 |
| getitem | float32 | [4300] | [4262] | [4262] | 0 | fasterrcnn | bwd | 191028 | 62793 | 3.04 |
| getitem | float32 | [4262 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 144659 | 62153 | 2.33 |
| getitem | float32 | [4262] | [1000] | [1000] | 0 | fasterrcnn | bwd | 149171 | 62313 | 2.39 |
| getitem | float32 | [1] | [5] | [5] | 0 | maskrcnn_ssdlite | bwd | 135635 | 61388 | 2.21 |
| getitem | float32 | [4 4] | [27] | [27 4] | 0 | ssdlite | bwd | 135778 | 65353 | 2.08 |
| getitem | float32 | [4] | [27] | [27] | 0 | ssdlite | bwd | 136578 | 64268 | 2.13 |
| getitem | float32 | [1 4] | [5] | [5 4] | 0 | ssdlite | bwd | 136754 | 59699 | 2.29 |
| getitem | float32 | [4 4] | [28] | [28 4] | 0 | ssdlite | bwd | 133634 | 64890 | 2.06 |
| getitem | float32 | [4] | [28] | [28] | 0 | ssdlite | bwd | 136627 | 65299 | 2.09 |
| getitem | float32 | [4300 4] | [4194] | [4194 4] | 0 | fasterrcnn | bwd | 192116 | 63984 | 3.00 |
| getitem | float32 | [4300] | [4194] | [4194] | 0 | fasterrcnn | bwd | 190356 | 63344 | 3.01 |
| getitem | float32 | [4194 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 159907 | 63450 | 2.52 |
| getitem | float32 | [2 4] | [30] | [30 4] | 0 | ssdlite | bwd | 152867 | 73762 | 2.07 |
| getitem | float32 | [2] | [30] | [30] | 0 | ssdlite | bwd | 159491 | 68517 | 2.33 |
| getitem | float32 | [4 4] | [33] | [33 4] | 0 | ssdlite | bwd | 159107 | 65477 | 2.43 |
| getitem | float32 | [4] | [33] | [33] | 0 | ssdlite | bwd | 153363 | 64997 | 2.36 |
| getitem | float32 | [3234] | [77] | [77] | 0 | ssdlite | bwd | 155235 | 60232 | 2.58 |
| getitem | float32 | [3234 4] | [93] | [93 4] | 0 | ssdlite | bwd | 156259 | 60179 | 2.60 |
| getitem | float32 | [3234 4] | [105] | [105 4] | 0 | ssdlite | bwd | 146979 | 65922 | 2.23 |
| getitem | float32 | [3234 4] | [113] | [113 4] | 0 | ssdlite | bwd | 155555 | 64197 | 2.42 |
| getitem | float32 | [4194] | [1000] | [1000] | 0 | fasterrcnn | bwd | 165587 | 61833 | 2.68 |
| getitem | float32 | [3 4] | [37] | [37 4] | 0 | ssdlite | bwd | 145331 | 71859 | 2.02 |
| getitem | float32 | [3] | [37] | [37] | 0 | ssdlite | bwd | 151763 | 61808 | 2.46 |
| getitem | float32 | [4 4] | [37] | [37 4] | 0 | ssdlite | bwd | 148531 | 63763 | 2.33 |
| getitem | float32 | [4] | [37] | [37] | 0 | ssdlite | bwd | 141762 | 63835 | 2.22 |
| getitem | float32 | [2 4] | [7] | [7 4] | 0 | ssdlite | bwd | 139202 | 61222 | 2.27 |
| getitem | float32 | [2] | [7] | [7] | 0 | ssdlite | bwd | 138098 | 60439 | 2.28 |
| getitem | float32 | [4 4] | [18] | [18 4] | 0 | ssdlite | bwd | 133170 | 62715 | 2.12 |
| getitem | float32 | [4] | [18] | [18] | 0 | ssdlite | bwd | 135507 | 68119 | 1.99 |
| getitem | float32 | [3 4] | [9] | [9 4] | 0 | ssdlite | bwd | 142835 | 62040 | 2.30 |
| getitem | float32 | [3] | [9] | [9] | 0 | ssdlite | bwd | 139107 | 55018 | 2.53 |
| getitem | float32 | [3234] | [79] | [79] | 0 | ssdlite | bwd | 134882 | 59729 | 2.26 |
| getitem | float32 | [4 4] | [30] | [30 4] | 0 | ssdlite | bwd | 139651 | 62466 | 2.24 |
| getitem | float32 | [4] | [30] | [30] | 0 | ssdlite | bwd | 135970 | 64244 | 2.12 |
| getitem | float32 | [3234] | [75] | [75] | 0 | ssdlite | bwd | 135858 | 62840 | 2.16 |
| getitem | float32 | [3 4] | [35] | [35 4] | 0 | ssdlite | bwd | 142610 | 69612 | 2.05 |
| getitem | float32 | [4 4] | [39] | [39 4] | 0 | ssdlite | bwd | 190996 | 57118 | 3.34 |
| getitem | float32 | [8] | [2008] | [2008] | 0 | maskrcnn | bwd | 260261 | 131362 | 1.98 |
| getitem | float32 | [5] | [14] | [14] | 0 | maskrcnn_ssdlite | bwd | 169587 | 60871 | 2.79 |
| getitem | float32 | [8 4] | [8] | [8 4] | 0 | ssdlite | bwd | 174323 | 63111 | 2.76 |
| getitem | float32 | [4197 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 178787 | 62987 | 2.84 |
| getitem | float32 | [4197] | [1000] | [1000] | 0 | fasterrcnn | bwd | 169379 | 63964 | 2.65 |
| getitem | float32 | [16] | [2016] | [2016] | 0 | maskrcnn | bwd | 234628 | 140481 | 1.67 |
| getitem | float32 | [7 4] | [33] | [33 4] | 0 | ssdlite | bwd | 160179 | 64782 | 2.47 |
getitem bfloat16
| op_name | dtype | dx_size | index_size | dy_size | dim | model | dir | ROCm pytorch (op time) | MIOpen HIP | Improvement |
|---|---|---|---|---|---|---|---|---|---|---|
| getitem | bfloat16 | [128 128] | [128] | [128 128] | 0 | llama2 | bwd | 253429 | 65687 | 3.86 |
| getitem | bfloat16 | [3234 4] | [16] | [16 4] | 0 | ssdlite | bwd | 242037 | 61545 | 3.93 |
| getitem | bfloat16 | [3234 4] | [12] | [12 4] | 0 | ssdlite | bwd | 240565 | 61848 | 3.89 |
| getitem | bfloat16 | [3234 4] | [11] | [11 4] | 0 | ssdlite | bwd | 231941 | 58150 | 3.99 |
| getitem | bfloat16 | [3234 4] | [13] | [13 4] | 0 | ssdlite | bwd | 253237 | 61297 | 4.13 |
| getitem | bfloat16 | [3234 4] | [14] | [14 4] | 0 | ssdlite | bwd | 226148 | 59874 | 3.78 |
| getitem | bfloat16 | [3234 4] | [15] | [15 4] | 0 | ssdlite | bwd | 226548 | 58327 | 3.88 |
| getitem | bfloat16 | [3234 4] | [10] | [10 4] | 0 | ssdlite | bwd | 248805 | 58737 | 4.24 |
| getitem | bfloat16 | [3234 4] | [18] | [18 4] | 0 | ssdlite | bwd | 242149 | 61439 | 3.94 |
| getitem | bfloat16 | [3234 4] | [20] | [20 4] | 0 | ssdlite | bwd | 213460 | 59003 | 3.62 |
| getitem | bfloat16 | [3234 4] | [23] | [23 4] | 0 | ssdlite | bwd | 229716 | 55039 | 4.17 |
| getitem | bfloat16 | [3234 4] | [8] | [8 4] | 0 | ssdlite | bwd | 225156 | 61190 | 3.68 |
| getitem | bfloat16 | [3234 4] | [22] | [22 4] | 0 | ssdlite | bwd | 209716 | 62719 | 3.34 |
| getitem | bfloat16 | [3234 4] | [25] | [25 4] | 0 | ssdlite | bwd | 219940 | 59003 | 3.73 |
| getitem | bfloat16 | [3234 4] | [7] | [7 4] | 0 | ssdlite | bwd | 221348 | 59572 | 3.72 |
| getitem | bfloat16 | [3234 4] | [28] | [28 4] | 0 | ssdlite | bwd | 217460 | 61937 | 3.51 |
| getitem | bfloat16 | [3234 4] | [30] | [30 4] | 0 | ssdlite | bwd | 197748 | 55857 | 3.54 |
| getitem | bfloat16 | [3234 4] | [31] | [31 4] | 0 | ssdlite | bwd | 222724 | 61101 | 3.65 |
| getitem | bfloat16 | [3234 4] | [32] | [32 4] | 0 | ssdlite | bwd | 221268 | 61457 | 3.60 |
| getitem | bfloat16 | [3234 4] | [33] | [33 4] | 0 | ssdlite | bwd | 209668 | 59768 | 3.51 |
| getitem | bfloat16 | [3234 4] | [34] | [34 4] | 0 | ssdlite | bwd | 214500 | 64959 | 3.30 |
| getitem | bfloat16 | [3234] | [16] | [16] | 0 | ssdlite | bwd | 220367 | 60425 | 3.65 |
| getitem | bfloat16 | [149 128] | [1490] | [1490 128] | 0 | llama2_7b | bwd | 293670 | 77563 | 3.79 |
| getitem | bfloat16 | [150 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 223060 | 62665 | 3.56 |
| getitem | bfloat16 | [174 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 261013 | 62523 | 4.17 |
| getitem | bfloat16 | [205 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 257141 | 57576 | 4.47 |
| getitem | bfloat16 | [232 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 254901 | 63086 | 4.04 |
| getitem | bfloat16 | [248 128] | [10] | [10 128] | 0 | llama2_7b | bwd | 265013 | 61751 | 4.29 |
| getitem | bfloat16 | [3234] | [15] | [15] | 0 | ssdlite | bwd | 263381 | 59564 | 4.42 |
| getitem | bfloat16 | [3234 4] | [2] | [2 4] | 0 | ssdlite | bwd | 219956 | 60399 | 3.64 |
| getitem | bfloat16 | [3234 4] | [37] | [37 4] | 0 | ssdlite | bwd | 242933 | 61128 | 3.97 |
| getitem | bfloat16 | [3234] | [10] | [10] | 0 | ssdlite | bwd | 272741 | 57821 | 4.72 |
| getitem | bfloat16 | [3234 4] | [38] | [38 4] | 0 | ssdlite | bwd | 192020 | 59545 | 3.22 |
| getitem | bfloat16 | [3234 4] | [41] | [41 4] | 0 | ssdlite | bwd | 226452 | 61607 | 3.68 |
| getitem | bfloat16 | [3234 4] | [6] | [6 4] | 0 | ssdlite | bwd | 178259 | 58283 | 3.06 |
| getitem | bfloat16 | [3234 4] | [44] | [44 4] | 0 | ssdlite | bwd | 247812 | 56842 | 4.36 |
| getitem | bfloat16 | [3234] | [22] | [22] | 0 | ssdlite | bwd | 254565 | 60949 | 4.18 |
| getitem | bfloat16 | [3234] | [25] | [25] | 0 | ssdlite | bwd | 188339 | 58140 | 3.24 |
| getitem | bfloat16 | [3234 4] | [43] | [43 4] | 0 | ssdlite | bwd | 223300 | 62603 | 3.57 |
| getitem | bfloat16 | [3234] | [21] | [21] | 0 | ssdlite | bwd | 222628 | 61553 | 3.62 |
| getitem | bfloat16 | [3234] | [27] | [27] | 0 | ssdlite | bwd | 188820 | 59296 | 3.18 |
| getitem | bfloat16 | [3234] | [24] | [24] | 0 | ssdlite | bwd | 209604 | 60096 | 3.49 |
| getitem | bfloat16 | [3234 4] | [42] | [42 4] | 0 | ssdlite | bwd | 212964 | 62051 | 3.43 |
| getitem | bfloat16 | [3234 4] | [3] | [3 4] | 0 | ssdlite | bwd | 181732 | 58051 | 3.13 |
| getitem | bfloat16 | [3234 4] | [45] | [45 4] | 0 | ssdlite | bwd | 206852 | 62442 | 3.31 |
| getitem | bfloat16 | [3234] | [40] | [40] | 0 | ssdlite | bwd | 183395 | 58120 | 3.16 |
| getitem | bfloat16 | [1 4] | [8] | [8 4] | 0 | ssdlite | bwd | 211396 | 65321 | 3.24 |
| getitem | bfloat16 | [1] | [13] | [13] | 0 | ssdlite | bwd | 218180 | 61498 | 3.55 |
| getitem | bfloat16 | [3234] | [45] | [45] | 0 | ssdlite | bwd | 192628 | 62778 | 3.07 |
| getitem | bfloat16 | [1 4] | [7] | [7 4] | 0 | ssdlite | bwd | 223796 | 58297 | 3.84 |
| getitem | bfloat16 | [1] | [7] | [7] | 0 | ssdlite | bwd | 220084 | 58315 | 3.77 |
| getitem | bfloat16 | [3234] | [4] | [4] | 0 | ssdlite | bwd | 221716 | 60378 | 3.67 |
| getitem | bfloat16 | [3234 4] | [59] | [59 4] | 0 | ssdlite | bwd | 194932 | 60751 | 3.21 |
| getitem | bfloat16 | [3234] | [5] | [5] | 0 | ssdlite | bwd | 200388 | 59293 | 3.38 |
| getitem | bfloat16 | [2 4] | [15] | [15 4] | 0 | ssdlite | bwd | 216948 | 61231 | 3.54 |
| getitem | bfloat16 | [3234 4] | [62] | [62 4] | 0 | ssdlite | bwd | 216724 | 62369 | 3.47 |
| getitem | bfloat16 | [3234] | [57] | [57] | 0 | ssdlite | bwd | 185412 | 61531 | 3.01 |
| getitem | bfloat16 | [3234 4] | [77] | [77 4] | 0 | ssdlite | bwd | 178740 | 65425 | 2.73 |
| getitem | bfloat16 | [3234 4] | [79] | [79 4] | 0 | ssdlite | bwd | 152915 | 66118 | 2.31 |
| getitem | bfloat16 | [3234 4] | [75] | [75 4] | 0 | ssdlite | bwd | 151603 | 64020 | 2.37 |
| getitem | bfloat16 | [3234] | [59] | [59] | 0 | ssdlite | bwd | 157059 | 59842 | 2.62 |
| getitem | bfloat16 | [8741 4] | [2000] | [2000 4] | 0 | maskrcnn | bwd | 216740 | 58811 | 3.69 |
| getitem | bfloat16 | [8741] | [2000] | [2000] | 0 | maskrcnn | bwd | 186244 | 64305 | 2.90 |
| getitem | bfloat16 | [3234 4] | [89] | [89 4] | 0 | ssdlite | bwd | 142355 | 65140 | 2.19 |
| getitem | bfloat16 | [3234 4] | [84] | [84 4] | 0 | ssdlite | bwd | 130803 | 64144 | 2.04 |
| getitem | bfloat16 | [8741 4] | [8741] | [8741 4] | 0 | maskrcnn | bwd | 234148 | 58473 | 4.00 |
| getitem | bfloat16 | [8741] | [8741] | [8741] | 0 | maskrcnn | bwd | 191907 | 59646 | 3.22 |
| getitem | bfloat16 | [3234] | [58] | [58] | 0 | ssdlite | bwd | 133955 | 61389 | 2.18 |
| getitem | bfloat16 | [3 4] | [20] | [20 4] | 0 | ssdlite | bwd | 134866 | 56962 | 2.37 |
| getitem | bfloat16 | [3] | [20] | [20] | 0 | ssdlite | bwd | 136547 | 69478 | 1.97 |
| getitem | bfloat16 | [3234] | [64] | [64] | 0 | ssdlite | bwd | 134850 | 58082 | 2.32 |
| getitem | bfloat16 | [3234 4] | [83] | [83 4] | 0 | ssdlite | bwd | 142067 | 58580 | 2.43 |
| getitem | bfloat16 | [3 4] | [17] | [17 4] | 0 | ssdlite | bwd | 136002 | 65371 | 2.08 |
| getitem | bfloat16 | [3] | [17] | [17] | 0 | ssdlite | bwd | 133619 | 61015 | 2.19 |
| getitem | bfloat16 | [3234 4] | [86] | [86 4] | 0 | ssdlite | bwd | 135385 | 64376 | 2.10 |
| getitem | bfloat16 | [3234 4] | [88] | [88 4] | 0 | ssdlite | bwd | 136114 | 65656 | 2.07 |
| getitem | bfloat16 | [3234] | [65] | [65] | 0 | ssdlite | bwd | 134338 | 60838 | 2.21 |
| getitem | bfloat16 | [3234] | [73] | [73] | 0 | ssdlite | bwd | 136867 | 55646 | 2.46 |
| getitem | bfloat16 | [3234] | [61] | [61] | 0 | ssdlite | bwd | 137171 | 60642 | 2.26 |
| getitem | bfloat16 | [3 4] | [23] | [23 4] | 0 | ssdlite | bwd | 133922 | 65051 | 2.06 |
| getitem | bfloat16 | [3] | [23] | [23] | 0 | ssdlite | bwd | 131427 | 68642 | 1.91 |
| getitem | bfloat16 | [3 4] | [21] | [21 4] | 0 | ssdlite | bwd | 137427 | 59860 | 2.30 |
| getitem | bfloat16 | [3] | [21] | [21] | 0 | ssdlite | bwd | 135763 | 63362 | 2.14 |
| getitem | bfloat16 | [2 4] | [8] | [8 4] | 0 | ssdlite | bwd | 128722 | 63077 | 2.04 |
| getitem | bfloat16 | [2] | [8] | [8] | 0 | ssdlite | bwd | 137171 | 61442 | 2.23 |
| getitem | bfloat16 | [3 4] | [28] | [28 4] | 0 | ssdlite | bwd | 137922 | 60855 | 2.27 |
| getitem | bfloat16 | [3] | [28] | [28] | 0 | ssdlite | bwd | 134530 | 63736 | 2.11 |
| getitem | bfloat16 | [2 4] | [27] | [27 4] | 0 | ssdlite | bwd | 142290 | 69122 | 2.06 |
| getitem | bfloat16 | [2] | [27] | [27] | 0 | ssdlite | bwd | 140739 | 66473 | 2.12 |
| getitem | bfloat16 | [3 4] | [22] | [22 4] | 0 | ssdlite | bwd | 132035 | 62989 | 2.10 |
| getitem | bfloat16 | [3] | [22] | [22] | 0 | ssdlite | bwd | 138658 | 61851 | 2.24 |
| getitem | bfloat16 | [3234 4] | [78] | [78 4] | 0 | ssdlite | bwd | 134595 | 65264 | 2.06 |
| getitem | bfloat16 | [2 4] | [28] | [28 4] | 0 | ssdlite | bwd | 137602 | 71238 | 1.93 |
| getitem | bfloat16 | [2] | [28] | [28] | 0 | ssdlite | bwd | 144419 | 66811 | 2.16 |
| getitem | bfloat16 | [3234] | [67] | [67] | 0 | ssdlite | bwd | 136675 | 56180 | 2.43 |
| getitem | bfloat16 | [3234] | [74] | [74] | 0 | ssdlite | bwd | 133058 | 61068 | 2.18 |
| getitem | bfloat16 | [3234] | [70] | [70] | 0 | ssdlite | bwd | 156723 | 60891 | 2.57 |
| getitem | bfloat16 | [3 4] | [26] | [26 4] | 0 | ssdlite | bwd | 159459 | 68518 | 2.33 |
| getitem | bfloat16 | [3] | [26] | [26] | 0 | ssdlite | bwd | 150451 | 65175 | 2.31 |
| getitem | bfloat16 | [3 4] | [29] | [29 4] | 0 | ssdlite | bwd | 155683 | 65567 | 2.37 |
| getitem | bfloat16 | [3] | [29] | [29] | 0 | ssdlite | bwd | 156915 | 64464 | 2.43 |
| getitem | bfloat16 | [3 4] | [18] | [18 4] | 0 | ssdlite | bwd | 150467 | 64304 | 2.34 |
| getitem | bfloat16 | [3] | [18] | [18] | 0 | ssdlite | bwd | 150723 | 61620 | 2.45 |
| getitem | bfloat16 | [3234] | [68] | [68] | 0 | ssdlite | bwd | 154467 | 62722 | 2.46 |
| getitem | bfloat16 | [3234] | [66] | [66] | 0 | ssdlite | bwd | 155315 | 59059 | 2.63 |
| getitem | bfloat16 | [3 4] | [19] | [19 4] | 0 | ssdlite | bwd | 156019 | 62295 | 2.50 |
| getitem | bfloat16 | [3] | [19] | [19] | 0 | ssdlite | bwd | 159667 | 63237 | 2.52 |
| getitem | bfloat16 | [3234 4] | [90] | [90 4] | 0 | ssdlite | bwd | 160435 | 64589 | 2.48 |
| getitem | bfloat16 | [3] | [25] | [25] | 0 | ssdlite | bwd | 149315 | 58011 | 2.57 |
| getitem | bfloat16 | [3234 4] | [87] | [87 4] | 0 | ssdlite | bwd | 157955 | 62882 | 2.51 |
| getitem | bfloat16 | [3 4] | [10] | [10 4] | 0 | ssdlite | bwd | 154723 | 59468 | 2.60 |
| getitem | bfloat16 | [3] | [10] | [10] | 0 | ssdlite | bwd | 151491 | 59593 | 2.54 |
| getitem | bfloat16 | [3 4] | [27] | [27 4] | 0 | ssdlite | bwd | 148339 | 65778 | 2.26 |
| getitem | bfloat16 | [3] | [27] | [27] | 0 | ssdlite | bwd | 159731 | 63664 | 2.51 |
| getitem | bfloat16 | [3 4] | [30] | [30 4] | 0 | ssdlite | bwd | 152787 | 65140 | 2.35 |
| getitem | bfloat16 | [3] | [30] | [30] | 0 | ssdlite | bwd | 150835 | 64837 | 2.33 |
| getitem | bfloat16 | [3 4] | [16] | [16 4] | 0 | ssdlite | bwd | 158339 | 62882 | 2.52 |
| getitem | bfloat16 | [3] | [16] | [16] | 0 | ssdlite | bwd | 157603 | 63451 | 2.48 |
| getitem | bfloat16 | [3 4] | [33] | [33 4] | 0 | ssdlite | bwd | 151475 | 66651 | 2.27 |
| getitem | bfloat16 | [3] | [33] | [33] | 0 | ssdlite | bwd | 161203 | 64962 | 2.48 |
| getitem | bfloat16 | [3234 4] | [81] | [81 4] | 0 | ssdlite | bwd | 152003 | 64820 | 2.35 |
| getitem | bfloat16 | [3 4] | [34] | [34 4] | 0 | ssdlite | bwd | 148995 | 68002 | 2.19 |
| getitem | bfloat16 | [3] | [34] | [34] | 0 | ssdlite | bwd | 162515 | 62988 | 2.58 |
| getitem | bfloat16 | [4300 4] | [4261] | [4261 4] | 0 | fasterrcnn | bwd | 202516 | 61833 | 3.28 |
| getitem | bfloat16 | [4300] | [4261] | [4261] | 0 | fasterrcnn | bwd | 209684 | 63148 | 3.32 |
| getitem | bfloat16 | [4261 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 162931 | 62597 | 2.60 |
| getitem | bfloat16 | [4261] | [1000] | [1000] | 0 | fasterrcnn | bwd | 182643 | 61957 | 2.95 |
| getitem | bfloat16 | [3 4] | [31] | [31 4] | 0 | ssdlite | bwd | 159603 | 65833 | 2.42 |
| getitem | bfloat16 | [3] | [31] | [31] | 0 | ssdlite | bwd | 154259 | 64197 | 2.40 |
| getitem | bfloat16 | [2 4] | [29] | [29 4] | 0 | ssdlite | bwd | 152947 | 67006 | 2.28 |
| getitem | bfloat16 | [2] | [29] | [29] | 0 | ssdlite | bwd | 159267 | 68553 | 2.32 |
| getitem | bfloat16 | [3234] | [72] | [72] | 0 | ssdlite | bwd | 157891 | 57921 | 2.73 |
| getitem | bfloat16 | [3234] | [69] | [69] | 0 | ssdlite | bwd | 149843 | 62028 | 2.42 |
| getitem | bfloat16 | [3234 4] | [94] | [94 4] | 0 | ssdlite | bwd | 157171 | 63664 | 2.47 |
| getitem | bfloat16 | [3 4] | [11] | [11 4] | 0 | ssdlite | bwd | 151475 | 63486 | 2.39 |
| getitem | bfloat16 | [3] | [11] | [11] | 0 | ssdlite | bwd | 158163 | 56179 | 2.82 |
| getitem | bfloat16 | [3 4] | [13] | [13 4] | 0 | ssdlite | bwd | 152067 | 63557 | 2.39 |
| getitem | bfloat16 | [3] | [13] | [13] | 0 | ssdlite | bwd | 148723 | 60499 | 2.46 |
| getitem | bfloat16 | [3 4] | [12] | [12 4] | 0 | ssdlite | bwd | 159027 | 58775 | 2.71 |
| getitem | bfloat16 | [3] | [12] | [12] | 0 | ssdlite | bwd | 152643 | 60002 | 2.54 |
| getitem | bfloat16 | [3 4] | [14] | [14 4] | 0 | ssdlite | bwd | 149699 | 63219 | 2.37 |
| getitem | bfloat16 | [3] | [14] | [14] | 0 | ssdlite | bwd | 155843 | 61033 | 2.55 |
| getitem | bfloat16 | [3234] | [71] | [71] | 0 | ssdlite | bwd | 159891 | 61744 | 2.59 |
| getitem | bfloat16 | [3 4] | [24] | [24 4] | 0 | ssdlite | bwd | 152115 | 66793 | 2.28 |
| getitem | bfloat16 | [3] | [24] | [24] | 0 | ssdlite | bwd | 158819 | 64766 | 2.45 |
| getitem | bfloat16 | [3234 4] | [101] | [101 4] | 0 | ssdlite | bwd | 163683 | 63220 | 2.59 |
| getitem | bfloat16 | [1 4] | [4] | [4 4] | 0 | ssdlite | bwd | 147381 | 73549 | 2.00 |
| getitem | bfloat16 | [1] | [4] | [4] | 0 | ssdlite | bwd | 155971 | 61602 | 2.53 |
| getitem | bfloat16 | [1 4] | [6] | [6 4] | 0 | ssdlite | bwd | 154931 | 60766 | 2.55 |
| getitem | bfloat16 | [1] | [6] | [6] | 0 | ssdlite | bwd | 150259 | 56659 | 2.65 |
| getitem | bfloat16 | [3234 4] | [85] | [85 4] | 0 | ssdlite | bwd | 157619 | 65602 | 2.40 |
| getitem | bfloat16 | [3234 4] | [92] | [92 4] | 0 | ssdlite | bwd | 156451 | 63166 | 2.48 |
| getitem | bfloat16 | [4 4] | [4] | [4 4] | 0 | ssdlite | bwd | 158499 | 62632 | 2.53 |
| getitem | bfloat16 | [4] | [4] | [4] | 0 | ssdlite | bwd | 155651 | 60588 | 2.57 |
| getitem | bfloat16 | [3234] | [76] | [76] | 0 | ssdlite | bwd | 140515 | 62064 | 2.26 |
| getitem | bfloat16 | [3 4] | [15] | [15 4] | 0 | ssdlite | bwd | 146547 | 62970 | 2.33 |
| getitem | bfloat16 | [3] | [15] | [15] | 0 | ssdlite | bwd | 144627 | 62028 | 2.33 |
| getitem | bfloat16 | [3234] | [80] | [80] | 0 | ssdlite | bwd | 139490 | 61406 | 2.27 |
| getitem | bfloat16 | [3 4] | [32] | [32 4] | 0 | ssdlite | bwd | 143187 | 66135 | 2.17 |
| getitem | bfloat16 | [3] | [32] | [32] | 0 | ssdlite | bwd | 139154 | 75397 | 1.85 |
| getitem | bfloat16 | [3234 4] | [91] | [91 4] | 0 | ssdlite | bwd | 130467 | 64624 | 2.02 |
| getitem | bfloat16 | [4300 4] | [4262] | [4262 4] | 0 | fasterrcnn | bwd | 191188 | 59930 | 3.19 |
| getitem | bfloat16 | [4300] | [4262] | [4262] | 0 | fasterrcnn | bwd | 189482 | 58099 | 3.26 |
| getitem | bfloat16 | [4262 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 149603 | 64677 | 2.31 |
| getitem | bfloat16 | [4262] | [1000] | [1000] | 0 | fasterrcnn | bwd | 144979 | 61282 | 2.37 |
| getitem | bfloat16 | [1] | [5] | [5] | 0 | maskrcnn_ssdlite | bwd | 130802 | 61833 | 2.12 |
| getitem | bfloat16 | [4 4] | [27] | [27 4] | 0 | ssdlite | bwd | 136306 | 61086 | 2.23 |
| getitem | bfloat16 | [4] | [27] | [27] | 0 | ssdlite | bwd | 136626 | 61175 | 2.23 |
| getitem | bfloat16 | [1 4] | [5] | [5 4] | 0 | ssdlite | bwd | 130178 | 60037 | 2.17 |
| getitem | bfloat16 | [4 4] | [28] | [28 4] | 0 | ssdlite | bwd | 137586 | 64624 | 2.13 |
| getitem | bfloat16 | [4] | [28] | [28] | 0 | ssdlite | bwd | 138771 | 64446 | 2.15 |
| getitem | bfloat16 | [4300 4] | [4194] | [4194 4] | 0 | fasterrcnn | bwd | 188211 | 60730 | 3.10 |
| getitem | bfloat16 | [4300] | [4194] | [4194] | 0 | fasterrcnn | bwd | 203940 | 60642 | 3.36 |
| getitem | bfloat16 | [4194 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 164147 | 60712 | 2.70 |
| getitem | bfloat16 | [2 4] | [30] | [30 4] | 0 | ssdlite | bwd | 160111 | 67682 | 2.37 |
| getitem | bfloat16 | [2] | [30] | [30] | 0 | ssdlite | bwd | 156755 | 77442 | 2.02 |
| getitem | bfloat16 | [4 4] | [33] | [33 4] | 0 | ssdlite | bwd | 150579 | 63593 | 2.37 |
| getitem | bfloat16 | [4] | [33] | [33] | 0 | ssdlite | bwd | 157987 | 57797 | 2.73 |
| getitem | bfloat16 | [3234] | [77] | [77] | 0 | ssdlite | bwd | 154387 | 59539 | 2.59 |
| getitem | bfloat16 | [3234 4] | [93] | [93 4] | 0 | ssdlite | bwd | 145795 | 65068 | 2.24 |
| getitem | bfloat16 | [3234 4] | [105] | [105 4] | 0 | ssdlite | bwd | 157283 | 65033 | 2.42 |
| getitem | bfloat16 | [3234 4] | [113] | [113 4] | 0 | ssdlite | bwd | 154243 | 65442 | 2.36 |
| getitem | bfloat16 | [4194] | [1000] | [1000] | 0 | fasterrcnn | bwd | 156547 | 62828 | 2.49 |
| getitem | bfloat16 | [3 4] | [37] | [37 4] | 0 | ssdlite | bwd | 143443 | 69866 | 2.05 |
| getitem | bfloat16 | [3] | [37] | [37] | 0 | ssdlite | bwd | 152179 | 66252 | 2.30 |
| getitem | bfloat16 | [4 4] | [37] | [37 4] | 0 | ssdlite | bwd | 139618 | 63674 | 2.19 |
| getitem | bfloat16 | [4] | [37] | [37] | 0 | ssdlite | bwd | 149107 | 69843 | 2.13 |
| getitem | bfloat16 | [2 4] | [7] | [7 4] | 0 | ssdlite | bwd | 139939 | 61844 | 2.26 |
| getitem | bfloat16 | [2] | [7] | [7] | 0 | ssdlite | bwd | 129730 | 60244 | 2.15 |
| getitem | bfloat16 | [4 4] | [18] | [18 4] | 0 | ssdlite | bwd | 137235 | 62271 | 2.20 |
| getitem | bfloat16 | [4] | [18] | [18] | 0 | ssdlite | bwd | 138933 | 62377 | 2.23 |
| getitem | bfloat16 | [3 4] | [9] | [9 4] | 0 | ssdlite | bwd | 137107 | 60457 | 2.27 |
| getitem | bfloat16 | [3] | [9] | [9] | 0 | ssdlite | bwd | 136803 | 59426 | 2.30 |
| getitem | bfloat16 | [3234] | [79] | [79] | 0 | ssdlite | bwd | 142323 | 61986 | 2.30 |
| getitem | bfloat16 | [4 4] | [30] | [30 4] | 0 | ssdlite | bwd | 131635 | 65079 | 2.02 |
| getitem | bfloat16 | [4] | [30] | [30] | 0 | ssdlite | bwd | 135666 | 64155 | 2.11 |
| getitem | bfloat16 | [3234] | [75] | [75] | 0 | ssdlite | bwd | 137907 | 63124 | 2.18 |
| getitem | bfloat16 | [3 4] | [35] | [35 4] | 0 | ssdlite | bwd | 135746 | 68813 | 1.97 |
| getitem | bfloat16 | [4 4] | [39] | [39 4] | 0 | ssdlite | bwd | 182899 | 69011 | 2.65 |
| getitem | bfloat16 | [8] | [2008] | [2008] | 0 | maskrcnn | bwd | 264645 | 178237 | 1.48 |
| getitem | bfloat16 | [5] | [14] | [14] | 0 | maskrcnn_ssdlite | bwd | 158099 | 61244 | 2.58 |
| getitem | bfloat16 | [8 4] | [8] | [8 4] | 0 | ssdlite | bwd | 187412 | 60266 | 3.11 |
| getitem | bfloat16 | [4197 4] | [1000] | [1000 4] | 0 | fasterrcnn | bwd | 161043 | 56605 | 2.85 |
| getitem | bfloat16 | [4197] | [1000] | [1000] | 0 | fasterrcnn | bwd | 176067 | 62755 | 2.81 |
| getitem | bfloat16 | [16] | [2016] | [2016] | 0 | maskrcnn | bwd | 238645 | 125743 | 1.90 |
| getitem | bfloat16 | [7 4] | [33] | [33 4] | 0 | ssdlite | bwd | 164963 | 63858 | 2.58 |
- Average over all cases
| Op | Type | average |
|---|---|---|
| getitem_backward | float16 | 2.89 |
| getitem_backward | float32 | 2.75 |
| getitem_backward | bfloat16 | 2.76 |
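The Improvement column appears to be the ratio of the ROCm PyTorch op time to the MIOpen HIP time; for example, the float32 maskrcnn row with dx_size [8] gives 260261 / 131362 ≈ 1.98. A minimal C++ sketch of that arithmetic, assuming this interpretation is correct. The helpers `improvement` and `round2` are hypothetical and not part of MIOpen:

```cpp
#include <cmath>

// Hypothetical helper: speedup of the MIOpen kernel over the PyTorch baseline.
// Both times must be in the same unit (the tables do not state the unit).
double improvement(double pytorch_time, double miopen_time)
{
    return pytorch_time / miopen_time;
}

// Round to two decimals, matching the precision used in the tables.
double round2(double x)
{
    return std::round(x * 100.0) / 100.0;
}
```

Calling `round2(improvement(260261.0, 131362.0))` reproduces the 1.98 reported for that row.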
The driver is being modified while resolving merge conflicts.
@junliume Even after a clean build and make check, I cannot determine the cause of the failure in Jenkins - HIP Package. Please check the reason for the failure.
@amberhassaan Have your review comments been addressed ?
No, not quite. I haven't had a chance to leave a detailed review, but I'd like @seungmanhan to
- replace macros with functions that are `__device__ __host__`.
- give better comments. Are we converting a linear index to a 4D or 5D index? Are NCHW components dims or strides?
- It would help me if the documentation clarifies what `getitem backward` does. Maybe it's just my ignorance, but I feel some documentation or a link to one is needed here.
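To make the dims-versus-strides question concrete: a linear offset into a contiguous tensor is decomposed into per-dimension coordinates using the dimension lengths, while strides map coordinates back to a memory offset. The following is a minimal sketch in plain C++, assuming a contiguous row-major 5D layout. The names `Index5D`, `unravel5d`, and `ravel5d` are illustrative and are not taken from the PR; in a HIP kernel such functions would typically also be marked `__host__ __device__`.

```cpp
#include <cstddef>

// Illustrative 5D coordinate (n, c, d, h, w); not an MIOpen type.
struct Index5D
{
    std::size_t n, c, d, h, w;
};

// Decompose a linear index into 5D coordinates using dimension LENGTHS
// (dims = {N, C, D, H, W}), assuming a contiguous row-major layout.
constexpr Index5D unravel5d(std::size_t linear, const std::size_t (&dims)[5])
{
    Index5D idx{};
    idx.w = linear % dims[4];
    linear /= dims[4];
    idx.h = linear % dims[3];
    linear /= dims[3];
    idx.d = linear % dims[2];
    linear /= dims[2];
    idx.c = linear % dims[1];
    linear /= dims[1];
    idx.n = linear;
    return idx;
}

// Map 5D coordinates to a memory offset using STRIDES (in elements).
constexpr std::size_t ravel5d(const Index5D& i, const std::size_t (&strides)[5])
{
    return i.n * strides[0] + i.c * strides[1] + i.d * strides[2] +
           i.h * strides[3] + i.w * strides[4];
}
```

Keeping the two directions in separate, clearly named functions is one way to make it obvious at each call site whether an array holds lengths or strides.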
The macro was changed to a constexpr function. I made the function names easier to understand and added comments. This is the backward pass of getitem, and getitem is a kind of tensor indexing, slicing, and masking (https://pytorch.org/tensordict/stable/tutorials/tensordict_slicing.html?highlight=tensor+slice).
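As a rough illustration of these semantics, consider the case that dominates the benchmark tables: a single 1D index tensor applied along dim 0. The forward pass gathers rows, `y[i] = x[index[i]]`, so the backward pass scatters and accumulates the incoming gradient, `dx[index[i]] += dy[i]`. The CPU reference below is a sketch under that assumption only. It is not the MIOpen kernel, and `getitem_backward_dim0` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference sketch: dx has shape [rows, cols] and
// dy has shape [index.size(), cols], both stored row-major.
std::vector<float> getitem_backward_dim0(const std::vector<float>& dy,
                                         const std::vector<std::int32_t>& index,
                                         std::size_t rows,
                                         std::size_t cols)
{
    std::vector<float> dx(rows * cols, 0.0f);
    for(std::size_t i = 0; i < index.size(); ++i)
    {
        // Assumption: indices are already normalized to the range [0, rows).
        const auto r = static_cast<std::size_t>(index[i]);
        // Duplicate indices accumulate into the same row of dx.
        for(std::size_t c = 0; c < cols; ++c)
            dx[r * cols + c] += dy[i * cols + c];
    }
    return dx;
}
```

Because duplicate indices write to the same row, a parallel GPU version of this loop would need atomic adds or an equivalent mechanism to stay correct.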
@CAHEK7 @amberhassaan @JehandadKhan re-request review
@atamazov Any final opinions before we merge this?
@seungmanhan Please see https://github.com/ROCm/MIOpen/pull/2883#pullrequestreview-2072076879. This review may also apply to other primitives added by your team.
@seungmanhan we have a HIP tidy issue:
[2024-05-23T13:46:38.046Z] /home/jenkins/workspace/MLLIBS_MIOpen_impl_getitem_bwd/src/include/miopen/miopen_internal.h:119:1: warning: function 'miopenConvolutionCKBackwardWeightsGetWorkSpaceSize' has a definition with different parameter names [readability-inconsistent-declaration-parameter-name]
[2024-05-23T13:46:38.046Z] 119 | miopenConvolutionCKBackwardWeightsGetWorkSpaceSize(const miopenAlphaBetaCase_t ab_case,
[2024-05-23T13:46:38.046Z] | ^
[2024-05-23T13:46:38.046Z] /home/jenkins/workspace/MLLIBS_MIOpen_impl_getitem_bwd/src/convolution_api.cpp:200:1: note: the definition seen here
[2024-05-23T13:46:38.046Z] 200 | miopenConvolutionCKBackwardWeightsGetWorkSpaceSize(const miopenAlphaBetaCase_t alpha_beta_case,
[2024-05-23T13:46:38.046Z] | ^
[2024-05-23T13:46:38.046Z] /home/jenkins/workspace/MLLIBS_MIOpen_impl_getitem_bwd/src/include/miopen/miopen_internal.h:119:1: note: differing parameters are named here: ('ab_case'), in definition: ('alpha_beta_case')
[2024-05-23T13:46:38.046Z] 119 | miopenConvolutionCKBackwardWeightsGetWorkSpaceSize(const miopenAlphaBetaCase_t ab_case,
[2024-05-23T13:46:38.046Z] | ^ ~~~~~~~
[2024-05-23T13:46:38.046Z] | alpha_beta_case
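The `readability-inconsistent-declaration-parameter-name` check fires when a declaration and its definition give the same parameter different names; the log above suggests renaming `ab_case` to `alpha_beta_case` in the header. A small stand-alone illustration of the consistent form follows. It uses a hypothetical enum and function, not the real MIOpen signature:

```cpp
#include <cstddef>

// Hypothetical stand-in for an enum such as miopenAlphaBetaCase_t.
enum class AlphaBetaCase
{
    Default,
    Scale,
    Bilinear
};

// Declaration, as it would appear in a header. The parameter name matches
// the definition below, so the clang-tidy check has nothing to report.
std::size_t workspace_size(AlphaBetaCase alpha_beta_case);

// Definition, as it would appear in a .cpp file.
std::size_t workspace_size(AlphaBetaCase alpha_beta_case)
{
    // Illustrative values only; not taken from MIOpen.
    return alpha_beta_case == AlphaBetaCase::Default ? 0 : 1024;
}
```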
@junliume Can you please confirm why the github action is failing?
@junliume When I add miopen::IsUnset(MIOPEN_ENV(MIOPEN_TEST_ALL)) in gtest, the GitHub action fails. Can you please check the log 363dbe2?
@seungmanhan
You can check locally
- Formatting - run this formatting script, check whether any files were modified, and commit the changes.
#!/bin/bash
find . -iname '*.h' \
-o -iname '*.hpp' \
-o -iname '*.cpp' \
-o -iname '*.h.in' \
-o -iname '*.hpp.in' \
-o -iname '*.cpp.in' \
-o -iname '*.cl' \
| grep -v -E '(build/)|(install/)|(fin/)' \
| xargs -n 1 -P $(nproc) -I{} -t clang-format-12 -style=file {} -i 2>/dev/null
- static checks
make -j $((`nproc`-4)) -k analyze
I confirmed that this alone does not resolve the problems that occur in GitHub Actions.
@JehandadKhan @randyspauldingamd @CAHEK7 last ping before last round of CI and then plan to merge
@seungmanhan we still get a few tests persistently failing at Fp32 Hip All gfx90a stage, could you help to check them out?
@junliume, please ignore the Windows build failure; #3043 will fix that.
@junliume Can you find out why PRs keep failing at the build stage? (https://github.com/ROCm/MIOpen/pull/2883/commits/6255e87b863630c107bb830b1c9f9d5f69e58ef4)
@seungmanhan could you help to resolve the conflict? We are having some CI issues and let me know if you need help. Thanks!
@junliume Can you confirm what is being tested in 'Jenkins - Fp32 Hip All gfx90a' or send me the script?
@seungmanhan @apwojcik The Windows build is failing while the Linux side is passing:
lld-link: error: undefined symbol: enum miopenStatus_t __cdecl miopen::GetitemBackward(struct miopen::Handle &, void *, unsigned __int64, struct miopen::TensorDescriptor const &, void const *, unsigned int, struct miopen::TensorDescriptor const *const *, void const **, struct miopen::TensorDescriptor const &, void *, struct miopen::TensorDescriptor const &, void *, unsigned int, int const *, unsigned int, int const *, unsigned int)
Can we either (1) provide this feature on Linux only and guard it accordingly, or (2) @apwojcik, is there any way to support this feature on Windows?
Please export the missing (undefined) symbols with MIOPEN_INTERNALS_EXPORT.
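A link error like the one above is typical of a symbol that is defined inside a shared library but not exported from it; on Windows, DLL symbols are not exported unless they are marked explicitly. Below is a generic sketch of how such an export macro is commonly defined. `DEMO_EXPORT`, `DEMO_BUILDING_LIBRARY`, and `demo::GetitemBackwardStub` are hypothetical names for illustration, and the actual macro definition in MIOpen may differ:

```cpp
// Normally set by the build system for the library target only;
// defined here so that this single-file sketch compiles everywhere.
#define DEMO_BUILDING_LIBRARY

// Hypothetical export macro; not MIOpen's definition.
#if defined(_WIN32)
#if defined(DEMO_BUILDING_LIBRARY)
#define DEMO_EXPORT __declspec(dllexport)
#else
#define DEMO_EXPORT __declspec(dllimport)
#endif
#else
#define DEMO_EXPORT __attribute__((visibility("default")))
#endif

namespace demo {
// Marking the declaration makes the symbol visible to other binaries
// (for example, a test executable) that link against the library.
DEMO_EXPORT int GetitemBackwardStub(int x);
} // namespace demo

// Definition inside the library.
int demo::GetitemBackwardStub(int x) { return x + 1; }
```

With this pattern, an internal function that tests or drivers call directly only needs the macro added to its declaration.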