Remove "copy code and data" in entrypoint
Hi, I want to discuss the need for the following lines: https://github.com/DragonMinded/libdragon/blob/d310a6b3c65aac02fdd53ab2e621a54dbd791dfe/src/entrypoint.S#L22-L40
I came across this because, in my project, I had to remove them, as I load the ROM from the PIF instead of the cartridge. The funny thing is that, after compiling Libdragon without this code, the ROMs were still working from the cart. This is because the bootcode (at least 6102) copies 1 MiB of data using DMA from cart to RAM.
So I see the following problems:
- the copy code is pointless when the ROM is smaller than 1 MiB
- the code copies the same data that the bootcode has just DMA'ed to RAM
- the code is slow because it copies using CPU IO reads instead of DMA
- the code doesn't check memory size
- the code may be handy as soon as the ROM is bigger than 1 MiB
- but it should at least check if the ROM fits into the available memory
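To make the size concerns above concrete, here is a minimal sketch of the two checks in Python. It is only arithmetic, not libdragon code: the function names are hypothetical, the 1 MiB figure is the bootcode DMA size mentioned above, and the 0x400 load offset into RAM is an assumption for illustration.

```python
MIB = 1024 * 1024

# Amount the 6102 bootcode DMAs from cart to RAM, as described above.
BOOT_DMA_SIZE = 1 * MIB


def covered_by_boot_dma(payload_size: int) -> bool:
    """True if the bootcode's DMA already loaded the whole payload."""
    return payload_size <= BOOT_DMA_SIZE


def fits_in_ram(payload_size: int, ram_size: int, load_offset: int = 0x400) -> bool:
    """True if the payload, placed at load_offset, ends inside RAM.

    load_offset is an assumed value for illustration only.
    """
    return load_offset + payload_size <= ram_size


if __name__ == "__main__":
    for size in (512 * 1024, 3 * MIB, 4 * MIB):
        print(size, covered_by_boot_dma(size), fits_in_ram(size, 4 * MIB))
```

With 4 MiB of RAM, a 4 MiB payload fails the second check because of the load offset, which is the case the entrypoint currently does not guard against.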
Most small projects should be in memory after the DMA transfer of the bootcode. If someone needs more code/data then loading should be implemented separately.
Things I am not aware of:
- I don't know whether such a change would break projects using bigger ROMs.
- This would not affect the DFS as it is already treated separately and is read directly from the cart.
- Does someone use the 6105 bootcode? I don't know how this works. It will probably do the DMA transfer too.
- Maybe someone can point out other reasons why we need this.
I actually don't remember if 6105 would break with this. If it works on real HW after removing this, then it does indeed seem redundant. I would be okay with a documented caveat that libdragon works with the N64 boot code and thus expects you to remain within 1 MiB. If you want more, look into gcc overlays or dynamic loading of code using DFS. That matches how lots of systems work in practice.
Yes, it does work on real HW. Do you have an idea where to document it? Then I could prepare this.
I was wondering why libdragon's pif boot code does not touch the RSP when several commercial roms do (conker, majora). Where does it come from? Git history just says "merge alt_libn64", and I can't find much on that.
Its md5sum doesn't seem to match any of 6101, 2, 3, 5, 6 available in mikeryan's repo (last 4032 bytes).
It uses the 6102 bootcode. (The header file in the repo's root.)
But this has nothing to do with the entry code that my question was related to.
The games you are mentioning use bootcodes other than 6102. But I don't know if they do something with the RSP. Are you sure this is happening in the bootcode? I think this is a task for the game code, but I'm not familiar with it.
Where are these md5 checksums?
Btw.: I wouldn't call them PIF boot codes. They are just loaded and executed from the PIF's bootrom.
Yes, not directly related to removing the boot copy loop, sorry for the derail. Just curious about those things.
libdragon's header bootcode is not Nintendo's 6102, that I'm sure of. It looks like someone's reimplementation. You can check the md5sum of such a file with:
dd if=header skip=64 bs=1 count=4032 status=noxfer | md5sum
You'll find it does not match any at https://github.com/mikeryan/n64dev/tree/master/src/boot/bootcodes including 6102.
Conker and majora run things on the RSP before the 0xaaaaae write. I've added printfs to cen64 trying to track down RSP DMA instability.
The 6102 bootcode from the linked repo is identical. You cannot compare the md5 sums directly because that 6102 file is byteswapped (v64) and the header here is not swapped (z64).
Conker and Majora use the 6105 bootcode. Sorry, I don't know what the 0xaaaaae write is. Can you make this more clear?
Oh, quite a surprise that. Half of mikeryan's are swapped and half are not. Yes, the bootcode here matches 6102 when byteswapped.
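For anyone repeating this comparison, a minimal Python sketch of the check follows. It assumes the same layout as the dd command earlier in the thread (64-byte header followed by 4032 bytes of bootcode) and that v64 means every pair of bytes is swapped; the function names are hypothetical.

```python
import hashlib

HEADER_SIZE = 64      # bytes skipped by the dd command in this thread
BOOTCODE_SIZE = 4032  # bytes hashed by the dd command in this thread


def swap16(data: bytes) -> bytes:
    """Swap every pair of bytes (v64 <-> z64 byte order)."""
    if len(data) % 2:
        raise ValueError("length must be even to swap byte pairs")
    out = bytearray(len(data))
    out[0::2] = data[1::2]
    out[1::2] = data[0::2]
    return bytes(out)


def bootcode_md5(image: bytes, byteswapped: bool = False) -> str:
    """md5 of the bootcode area, undoing the v64 swap first if requested."""
    if byteswapped:
        image = swap16(image)
    chunk = image[HEADER_SIZE:HEADER_SIZE + BOOTCODE_SIZE]
    return hashlib.md5(chunk).hexdigest()
```

Calling `bootcode_md5(v64_bytes, byteswapped=True)` and `bootcode_md5(z64_bytes)` on the same bootcode should then give the same digest, while hashing the raw v64 bytes directly will not.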
The 0xaaaaae write is the last RSP CP0 write done by the bootcodes, setting the "base state" before moving to user-side code. Those ROMs run things on the RSP before that write.
I even noticed that the 6101 and 6102 codes are identical. This must be a mistake. Will check this later.
Hm, here is a disassembly of the 6102 bootcode: https://github.com/PeterLemon/N64/blob/master/BOOTCODE/BOOTCODE.asm The jump to game code is here: https://github.com/PeterLemon/N64/blob/63f09b404bef9c0bdba6be9efa87ce78401864d7/BOOTCODE/BOOTCODE.asm#L667
Now, I was looking for an mtc0 instruction and something with "aaaa" but without any luck. Either I still don't know what to look for or this is specific to the 6105 code. Does SM64 show this behaviour too?
Just a few lines above: https://github.com/PeterLemon/N64/blob/63f09b404bef9c0bdba6be9efa87ce78401864d7/BOOTCODE/BOOTCODE.asm#L598
The same 0xaaaaae write is in every bootcode, I think. It's done on the CPU side, not RSP.
SM64 - I couldn't say, I don't have this rom at hand.
Ah, ok. I was confused because you wrote "CP0". The CP0 is in the CPU and the RSP is in the RCP. So I was looking for mtc0 and overlooked this while searching on my phone. Now I see what you mean.
Yes, this is all CPU code. It's a pity that I can't find a disassembly of the 6105 code.
All bootcodes are copied to SP_DMEM and executed from there by the CPU. But this is not done via DMA. A small part of the bootcode is copied to main RAM after RAM init. The writes of 0xaaaaae to the SP_STATUS register are in this part. Later the SP_DMEM and SP_IMEM are cleared to zero.
The 6102 is the most common bootcode and the games will behave like libdragon ROMs do. I agree, if you see these DMA accesses prior to the SP_STATUS init, they are coming from the bootcode.
I'd first check if this only happens with 6105 games. The 6105 is more complex in terms of security features and checks different things. But I have almost no knowledge of either the 6105 bootcode or the RSP.
The RSP has its own CP0, sorry if I was unclear.
Probably the reason for the misunderstanding was on my side.
I've looked into the RSP programmers guide and now I understand that the write to SP_STATUS goes to the RSP's CP0 RSP_STATUS register.
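As a rough illustration of why 0xaaaaae counts as a "base state", the value can be decoded bit by bit. The sketch below assumes the commonly documented write layout of SP_STATUS (paired clear/set bits for halt, broke, interrupt, single-step, interrupt-on-break and the eight signals); treat the table as an assumption and check it against the RSP programmer's guide before relying on it.

```python
# Assumed SP_STATUS write-bit layout, bit 0 first (verify against the RSP docs).
SP_STATUS_WRITE_BITS = [
    "clear halt", "set halt",
    "clear broke",
    "clear interrupt", "set interrupt",
    "clear single-step", "set single-step",
    "clear interrupt-on-break", "set interrupt-on-break",
] + [f"{op} signal {n}" for n in range(8) for op in ("clear", "set")]


def decode_sp_status_write(value: int) -> list[str]:
    """List the actions requested by the bits set in an SP_STATUS write."""
    return [name for bit, name in enumerate(SP_STATUS_WRITE_BITS)
            if value & (1 << bit)]


if __name__ == "__main__":
    for action in decode_sp_status_write(0xAAAAAE):
        print(action)
```

Under that layout the write halts the RSP and clears broke, interrupt, single-step, interrupt-on-break and all eight signals.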
As explained in the description, that entrypoint code is redundant for ROMs < 1 MiB but it's actually very useful for ROMs > 1 MiB. It is extremely useful that they just work, and the feature has been used by several ports of PC games with huge code sizes.
I don't think we should remove this feature, as there is no easy workaround once the ROM becomes bigger than 1 MiB. Also, I believe it is the 6102 IPL3 that is in the wrong: always loading a fixed amount of 1 MiB does not seem like the correct choice. Once we have our own IPL3, it would certainly make sense to load the correct amount of data rather than a hardcoded size.
What I think we can still do even without our own IPL3:
- Use PI DMA to do the copy
- Just copy excess data after the first MiB, rather than everything
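The second point boils down to a small range computation, sketched here in Python. This is not the entrypoint code: the function name is hypothetical and the 1 MiB constant is the amount the 6102 IPL3 is said to load in this thread.

```python
IPL3_LOADED_BYTES = 1024 * 1024  # loaded by the 6102 IPL3, per this thread


def excess_range(payload_size: int):
    """Return (offset, length) of the part IPL3 did not load, or None.

    offset is relative to the start of the payload, so the same value
    applies to both the cart source and the RAM destination.
    """
    if payload_size <= IPL3_LOADED_BYTES:
        return None  # everything is already in RAM, nothing to copy
    return (IPL3_LOADED_BYTES, payload_size - IPL3_LOADED_BYTES)
```

For a 3 MiB payload this yields a single transfer of 2 MiB starting 1 MiB into the payload, and for anything up to 1 MiB no transfer at all.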
I've sent a first PR that switches to PI DMA, but still copies the whole segments. TICKS_READ() at the start of main decreases from ~27267500 to ~22114050 in audioplayer.z64. That's ~110 ms faster.
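The ~110 ms figure can be reproduced from the tick counts. The sketch below assumes that ticks advance at half the 93.75 MHz CPU clock, i.e. 46,875,000 per second; that rate is an assumption stated here, not something taken from the PR.

```python
# Assumption: the tick counter runs at half of the 93.75 MHz CPU clock.
TICKS_PER_SECOND = 93_750_000 // 2


def ticks_to_ms(ticks: int) -> float:
    """Convert a tick count to milliseconds under the assumed tick rate."""
    return ticks * 1000.0 / TICKS_PER_SECOND


before = 27_267_500  # TICKS_READ() at start of main, CPU copy
after = 22_114_050   # TICKS_READ() at start of main, PI DMA copy

saved_ms = ticks_to_ms(before - after)

if __name__ == "__main__":
    print(f"saved about {saved_ms:.1f} ms")
```

The difference of 5,153,450 ticks works out to just under 110 ms, consistent with the number quoted above.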
The second part (transferring just data in excess to the first MiB) has been merged: 77eab033. This can now be closed.