hostboot icon indicating copy to clipboard operation
hostboot copied to clipboard

Wrong L3 [d|i]-cache-sets on POWER9

Open tuliom opened this issue 7 years ago • 12 comments

POWER9 has an L3 that is 20-way set associative. This changed from POWER8 that had L1, L2 and L3 all with 8-way associativity.

However, after extracting DTS from HDAT, this is what we get:

		l3-cache@30000008 {
...
			d-cache-sets = <0x8>;
			i-cache-sets = <0x8>;
			d-cache-size = <0xa00000>;
			i-cache-size = <0xa00000>;
		};

Notice both L1 and L2 (8-way associative) have correct numbers:

		PowerPC,POWER9@8 {
...
			d-cache-size = <0x8000>;
			i-cache-size = <0x8000>;
			i-cache-sets = <0x8>;
			d-cache-sets = <0x8>;
...
		};
...
		l2-cache@20000008 {
...
			d-cache-sets = <0x8>;
			i-cache-sets = <0x8>;
			d-cache-size = <0x80000>;
			i-cache-size = <0x80000>;
...
		};

tuliom avatar Dec 03 '18 12:12 tuliom

I was discussing this topic with @segher and we didn't agree on what [d|i]-cache-sets mean. However, regardless of the this discussion, the L3 value on POWER9 is wrong.

Option 1 - N-way associativity [d|i]-cache-sets means associativity N from N-way associative. In that case, P9 L3 [d|i]-cache-sets should have 0x14 (20-way associative). All the other values for P8 and P9 (including L1 and L2) are correct.

Option 2 - Number of sets [d|i]-cache-sets should store the number of sets, i.e. sets = size / (N * line_size) If this case is correct, P9 L3 [d|i]-cache-sets is expected to have 0x1000. All the P8 and P9 values (including L1, L2 and L3) also have to be re-calculated.

Notice the Devicetree specification v0.2 specifies the following for d-cache-sets:

Specifies the number of associativity sets in the data cache.

Source: https://github.com/devicetree-org/devicetree-specification/releases/tag/v0.2

tuliom avatar Dec 03 '18 14:12 tuliom

We've got the same value in our PHYP box as well so it is odd nobody has noticed anything wrong for this long now. The change is very simple to make, literally 1 line of target_types.xml, just want to make sure we don't break anybody downstream.

dcrowell77 avatar Dec 03 '18 17:12 dcrowell77

The P9 user manual says:

The L1 D-cache is the first level of cache available to load and store operations. It is 32 KB and organized as 8-way set-associative with 128-byte cache lines.

Where are you getting the 20-way value from?

dcrowell77 avatar Dec 03 '18 18:12 dcrowell77

@dcrowell77, L1I, L1D and L2 are indeed 8-way associative on P9.

But according to the P9 user manual, section 8.3 (L3 Cache - List of features and resources), L3 is 20-way associative.

Anyway, notice the discussion around the definition of [d|i]-cache-sets. There is a chance it's being misused.

tuliom avatar Dec 03 '18 19:12 tuliom

The skiboot folks control the devtree part of the equation so I can't speak much to that. Hostboot fills in the HDAT field of "Number of associativity sets in the data cache." which doesn't specifically say L1/L2/L3. Adding @stewart-ibm @ozbenh

dcrowell77 avatar Dec 03 '18 19:12 dcrowell77

On Mon, 2018-12-03 at 19:47 +0000, Dan wrote:

The skiboot folks control the devtree part of the equation so I can't speak much to that. Hostboot fills in the HDAT field of "Number of associativity sets in the data cache." which doesn't specifically say L1/L2/L3. Adding @stewart-ibm @ozbenh

Offset 0x44 in Cache Size Structure in the CUDA has the L2 associativity, but not the L3, so we assume they are identical.

Cheers, Ben.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

ozbenh avatar Dec 03 '18 22:12 ozbenh

Hi, any updates on this issue?

tuliom avatar Feb 03 '21 15:02 tuliom

Tagging @klauskiwi , based on where the discussion is headed it might make sense to move this to the skiboot list.

dcrowell77 avatar Feb 03 '21 17:02 dcrowell77

@hegdevasant looks something for you...

klauskiwi avatar Feb 03 '21 17:02 klauskiwi

@hegdevasant looks something for you...

Sorry. Somehow I missed this one.. As BenH mentioned earlier , we just assume same associativity for L2 and L3.. as we are not getting L3 associativity information from HDAT.

From skiboot code :

213 
214         /* Assume cache associavitity sets is same for L2, L3 and L3.5 */
215         dt_add_property_cells(node, "d-cache-sets",
216                               be32_to_cpu(cache->l2_cache_assoc_sets));
217         dt_add_property_cells(node, "i-cache-sets",
218                               be32_to_cpu(cache->l2_cache_assoc_sets));

I think fix should come from HDAT side..

-Vasant

hegdevasant avatar Mar 24 '21 07:03 hegdevasant

@hegdevasant Are you proposing that HDAT needs a new field or that the current values are incorrect?

dcrowell77 avatar Mar 25 '21 14:03 dcrowell77

@dcrowell77

@hegdevasant Are you proposing that HDAT needs a new field or that the current values are incorrect?

I think we need new field in HDAT for L3 associativity. Currently we are not getting that. Hence we just use L2 associativity for L3. So on P9 we are displaying L3 associativity as 8

-Vasant

hegdevasant avatar Mar 25 '21 14:03 hegdevasant