Add coralnpu on asap7
Description
Add https://github.com/google-coral/coralnpu to ORFS on asap7
Suggested Solution
Use fakeram as needed.
Additional Context
No response
From Peter Gadfort:
first pass, yosys gets stuck in a techmapping pass, so will require some debugging to figure out why
I'll share something later today, but it's before the PDK is even referenced, so I don't think that would matter (it's something in the RTL from the Chisel subsystem)
test case: https://drive.google.com/file/d/1Glx6R5E9fm3ldPyH_hJKmE0mLV123_we/view?usp=sharing
How about adding a PR for coralnpu with bazel-orfs?
In ORFS? We have no bazel-orfs test there.
@gadfort , which steps did you follow to generate the SV with the CoralNPUChiselSubsystem top module?
I was looking at the Integration Guide and it gives instructions on how to generate the SV for top modules with the names CoreMiniAxi or RvvCoreMiniAxi.
The issue with yosys stems from the Chisel generated code, so once I isolated that thats the only file included. It was generated by building just the chisel part isolation, not sure if something went wrong when I ran it, but yosys still chokes on it.
I was following some of the FPGA integration since it has a top level with a reasonable number of ports (ie. integrated external memory).
FYI, Chisel allows a lot of control over how code is generated via firtool, such as memories, verilog style, disabling verification layers, disable randomization of registers, etc. This used to be more important before slang as yosys without slang is much more finicky about what Verilog constructs that it supports.
Note that slang is not yet on by default
Note that slang is not yet on by default
Should it ever be?
It is easily enabled per project...
I don't see why not as the sv support in yosys is inferior.
I don't see why not as the sv support in yosys is inferior.
Noted. Timeframe to make slang default? "asymptotic goal"?
More just a matter of getting around to testing it and making sure nothing breaks.
The issue with yosys stems from the Chisel generated code, so once I isolated that thats the only file included. It was generated by building just the chisel part isolation, not sure if something went wrong when I ran it, but yosys still chokes on it.
I was following some of the FPGA integration since it has a top level with a reasonable number of ports (ie. integrated external memory).
Thanks for the clarification. I wasn't looking at the FPGA code before. I'm able to export essentially the same SV as you included in your tarball (only changes appeared to be the ifdef's and some newlines).
So, now the question is what version of the design should be added to ORFS:
- CoralNPUChiselSubsystem (FPGA)
- CoreMiniAxi (scalar-only Coral NPU configuration is provided that can integrate with an AXI-based system)
- RvvCoreMiniAxi (RISC-V vector (RV32IMF_Zve32x) version of Coral NPU)
@maliberty Perhaps it is a bridge too far to try to fake RAM for this design in asap7 initially?
There's interesting stuff to be learned from this design, other than how to set up SRAMs?
Given that the SRAMs will have a mismatch in behavioral model (number of ports, read clock, etc.), wrapper/interfaced logic would have to be written to combine one or more fake RAMs to a single RAM. This is what barstools does in Chipyard and SRAM inference support in FPGA.
I find that setting number of rows in SRAMs to 1 row and then looking at what else there is to learn from a big design first is a good first step.
Realism in SRAM is dessert...
W.r.t. interesting issues to investigate with coralnpu, I think memory instantiation and placement is the least novel/interestin, there are already designs that investigate that.
It would be neat with a feature that could put a pin in memory instantiation and placement to look for other interesting aspects with coralnpu:
https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/pull/3621
@jeffng-or See https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/pull/3621#issuecomment-3508852354, slang is stopping memories from being inferred, which means that the ORFS flow doesn't error out early when it encounters large memories that haven't been dealt with.
@jeffng-or FYI https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/issues/3645
We are already using fakerams so there is no inference needed.
We are already using fakerams so there is no inference needed.
But yosys is still stuck?
My thinking is that the memories are a red herring w.r.t. yosys "running forever".
They save some time but not the bulk.
They save some time but not the bulk.
When you have something you can share and it gets past synthesis in a sensible time, I am curious to understand what "glue verilog" you had to write to make the behavioral model compatible with CoralNPU and what the behavioral model is of the fakeram. actual timing in .lib file is a little bit less interesting to me, I care about how the timing paths terminate correctly in a register.
For instance, if CoralNPU uses a read clock, and fakeram doesnt have a read clock, how is that accounted for?
Specifically https://github.com/google-coral/coralnpu/blob/main/hdl%2Fchisel%2Fsrc%2Fbus%2FSpi2TLUL.scala
// val write_data_buffer = SyncReadMem(kBufferDepth, UInt(kBufferWidth.W))
None they already have support for RAMs, just only setup for private PDKs. We just added support for asap7 fakerams.
None they already have support for RAMs, just only setup for private PDKs. We just added support for asap7 fakerams.
Yes... how do you match up asap7 to SyncReadRam() above?
@jeffng-or please comment on your ram changes
Got through synthesis on CoreAxiMini. Something went wrong with constraints.sdc because I'm not getting a Endpoint Slack Histogram in synthesis, even if I side-stepped the top-level clock gating shenanigans.
https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/issues/3645#issuecomment-3518393304
Some more luck with CoreAxiMini https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/pull/3652#issuecomment-3520202567
@jeffng-or please comment on your ram changes
I only instantiated and hooked up the ports that the FakeRAM had:
`elsif USE_ASAP7
///////////////////////////
//////// ASAP7 SRAM ////////
///////////////////////////
wire [127:0] nwmask;
genvar i;
generate
for (i = 0; i < 16; i++) begin
assign nwmask[8*i +: 8] = {8{wmask[i]}};
end
endgenerate
fakeram_512x128 u_asap7_sram (
.rd_out(rdata),
.addr_in(addr),
.wd_in(wdata),
.we_in(write),
.ce_in(enable),
.clk(clock)
);
`else
Note that this is in CoreMiniAxi.sv, not in CoralNPUChiselSubsystem.sv (see still-open question above on which version of coralnpu to add to ORFS).
There isnt any behavioral Verilog for FakeRAM, is there?
I tried several of the CoralNPU configurations and CoralNPU has many configuration options besides. OpenROAD seems capable of dealing with the various configurations.
To my mind what is missing are clear mission parameters that makes the choice unambigious.... An integration test to test what? A demo? PPA study?
Certainly, like MegaBoom, turnaround times will be long unless the flow is carefully curated.
The FakeRAM is only a black box;
(* blackbox *)
module fakeram_512x128 (
output reg [127:0] rd_out,
input [8:0] addr_in,
input we_in,
input [127:0] wd_in,
input clk,
input ce_in
);
endmodule
And there is a .lib file too, right?
That .lib file will reflect a behavioral model, e.g. whether there is a read clock.
And there is a .lib file too, right?
That .lib file will reflect a behavioral model, e.g. whether there is a read clock.
Yes, there's a Liberty, but FakeRAM only has the one clock.