fbc icon indicating copy to clipboard operation
fbc copied to clipboard

Boolean bitfields writes treated as 32/64-bit rather than 8-bit; broken on big-endian

Open rversteegen opened this issue 4 years ago • 10 comments

When fbc writes to a boolean bitfield it treats the bitfield address as an integer ptr (32 or 64 bit). So it reads 3/7 bytes past the bitfield and then writes them back as they were. Reading from the bitfield doesn't seem to be affected. This is usually invisible and I don't think there's any unittest that can test for this bug on a little-endian platform, but it breaks big-endian platforms because the bitmask used is wrong. This bug was discovered by running on PowerPC (discussion in #290).

Testcase:

type UDT
        a : 1 as boolean
end type
dim y as integer
dim x as UDT
x.a = 0
y = x.a

32-bit -gen gcc x86 result:

	// dim x as UDT
	struct $3UDT X$0;
	__builtin_memset( &X$0, 0, 1 );
	// x.a = 0
	*(uint32*)&X$0 = *(uint32*)&X$0 & 4294967294u;
	// y = x.a
	Y$0 = -(boolean)((uint32)(int32)*(int8*)&X$0 & 1u);

64-bit -gen gcc is the same except it treats the address as a uint64* and uses the bitmask 0xfffffffffffffffe instead.

-gen gas output:

##dim x as UDT
	mov byte ptr [ebp-8], 0
##x.a = 0
	and dword ptr [ebp-8], 4294967294
##y = x.a
	movsx eax, byte ptr [ebp-8]
	and eax, 1
	mov ebx, eax
	movzx eax, bl
	neg eax
	mov dword ptr [ebp-4], eax

rversteegen avatar Feb 23 '21 05:02 rversteegen

Has there been any progress on this issue? Is this related to the issue reading files with little endian BOM?

lenoil98 avatar May 09 '22 16:05 lenoil98

I haven't looked further into this. I wasn't planning to attempt a fix myself.

It's completely unrelated to the BOM and file I/O bugs. I've just dusted off that branch and did a little on it. I could submit it as a draft PR.

rversteegen avatar May 10 '22 17:05 rversteegen

That would be great. Looking forward to it.

lenoil98 avatar May 14 '22 06:05 lenoil98

Is it possible to get access to the code fix for the BOM and file I/O bugs? FBC is having BOM issues when building VisualFBEditor on big endian.

lenoil98 avatar Jun 12 '22 20:06 lenoil98

OK, I've picked it up again. Sorry for forgetting about it.

rversteegen avatar Jun 13 '22 14:06 rversteegen

Any further progress on this?

lenoil98 avatar Apr 11 '23 04:04 lenoil98

Sorry... I forgot that you'd previously asked about the big endian file BOM bugs in this issue; that's what you're asking about rather than this bitfield bug? I did indeed resume work on that branch for a while (I just pushed some more to https://github.com/rversteegen/fbc/commits/ppc) but looking at it I see I mostly got really sidetracked with tests and fixes for obscure putback buffer bugs. There are some BOM fixes but apparently not including OPEN ENCODING nor fbc's BOM detection, which are likely the two fixes you most need. But those shouldn't be too hard to fix. I'm a bit busy at the moment though.

rversteegen avatar Apr 15 '23 12:04 rversteegen

Thanks for all your work. I understood that you were probably busy, just wanted to see if you’d done any more on the big endian BOM issues. I’ll checkout your ppc branch. At any rate I’ve installed fbsd 13.2 for little endian and may move everything to that platform.

Again, thanks in advance for your help.

lenoil98 avatar Apr 20 '23 04:04 lenoil98

Maybe not the best code generation, but looks like we should be limiting memory reads and writes to one byte only.

32-bit gas, x86:

	##dim y as integer
		mov dword ptr [ebp-8], 0
	##dim x as UDT
		mov byte ptr [ebp-12], 0
	##x.a = 0
		movzx eax, byte ptr [ebp-12]
		and eax, 4294967294
		mov byte ptr [ebp-12], al
	##y = x.a
		movsx eax, byte ptr [ebp-12]
		and eax, 1
		mov ebx, eax
		movzx eax, bl
		neg eax
		mov eax, eax
		mov dword ptr [ebp-8], eax

64-bit gas, x86_64:

	##dim y as integer
	   mov QWORD PTR -72[rbp], 0
	##dim x as UDT
	   mov BYTE PTR -76[rbp], 0
	##x.a = 0
	   movzx r11, BYTE PTR -76[rbp]
	   and r11, -2
	   mov -76[rbp], r11b
	##y = x.a
	   movsx r11, BYTE PTR -76[rbp]
	   and r11, 1
	   cmp r11b, 0
	   setne al
	   neg al
	   movsx r10, al
	   mov -72[rbp], r10

32-bit gcc, x86:

	int32 Y$0;
	__builtin_memset( &Y$0, 0, 4 );
	struct $3UDT X$0;
	__builtin_memset( &X$0, 0, 1 );
	*(uint8*)&X$0 = (uint8)((uint32)(int32)*(uint8*)&X$0 & 4294967294u);
	Y$0 = -(boolean)((uint32)(int32)*(int8*)&X$0 & 1u);

64-bit gcc, x86_64

	int64 Y$0;
	__builtin_memset( &Y$0, 0, 8ll );
	struct $3UDT X$0;
	__builtin_memset( &X$0, 0, 1ll );
	*(uint8*)&X$0 = (uint8)((uint64)(int64)*(uint8*)&X$0 & 18446744073709551614ull);
	Y$0 = (int64)-(boolean)((uint64)(int64)*(int8*)&X$0 & 1ull);

(FYI, the boolean-bitfield branch that was pushed, failed CI due to depletion of build credits on my account, however, CI checks on the PR passed.)

jayrm avatar Aug 14 '23 01:08 jayrm

The first priority was to at least get some changes added so that memory was not overwritten and try to add a few tests for it. Then follow up to explore the code generation.

But at: https://github.com/freebasic/fbc/blob/fef1dfaf592b3b80716e7c15f0fc0c7de7badf30/src/compiler/ast-node-misc.bas#L428

I believe I had intended: return astNewBOP( AST_OP_SHL, hMakeUintMask( bits, dtype ), astNewCONSTi( bitpos ) )

Because, with usage in source linked above, overloaded hMakeUintMask() ultimately resolves to FB_DATATYPE_UINT.

jayrm avatar Aug 16 '23 23:08 jayrm