Detail Routing crash (due to memory use 12Gb for a 7k cells design)

Open Askartos opened this issue 3 years ago • 16 comments

Description

Hi, OpenLane is crashing at the detail routing step: the tool consumes 12 GB of memory and then crashes. According to final_summary_report.csv, the design has only 7k cells. Could you please suggest a way to work around the issue?

Environment

Kernel: Linux v4.15.0-180-generic
Distribution: ubuntu 18.04
Python: v3.6.9 (OK)
Container Engine: docker v20.10.14 (OK)
OpenLane Git Version: 2022.02.23_02.50.41
pip:click: INSTALLED
pip:pyyaml: INSTALLED
pip:venv: INSTALLED
---
PDK Version Verification Status: OK
---
Git Log (Last 3 Commits)

e4bfdd7 2022-02-22T16:59:14-03:00 Remove pip install, no longer needed (#953) - Vitor Bandeira -  (grafted, HEAD, tag: 2022.02.23_02.50.41)

Reproduction Material

openroad_issue_reproducible.tar.gz

Expected behavior

Detail routing for a 7k-cell design should use less than 8 GB of memory. It would also be good if the tool did not crash and instead printed a verbose error such as "The tool has gone out of memory, please run with higher resources or create an issue if the design is small".

Logs

[INFO DRT-0187] Start routing data preparation.
[INFO DRT-0267] cpu time = 00:00:00, elapsed time = 00:00:00, memory = 9384.01 (MB), peak = 12581.42 (MB)
[INFO DRT-0194] Start detail routing.
[INFO DRT-0195] Start 0th optimization iteration.
[ERROR]: during executing openroad script /openlane/scripts/openroad/droute.tcl
[ERROR]: Exit code: 1
[ERROR]: Last 10 lines:
child killed: kill signal

Askartos avatar May 29 '22 00:05 Askartos
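
For anyone comparing runs, the memory figures can be pulled out of the router log mechanically. The following is a minimal Python sketch, assuming the log lines keep the exact `memory = ... (MB), peak = ... (MB)` layout shown in the DRT-0267 line above; the function name `peak_memory_mb` is made up for this illustration.

```python
import re

# Matches the memory/peak fields as they appear in the DRT-0267 line quoted above.
MEM_RE = re.compile(r"memory = ([\d.]+) \(MB\), peak = ([\d.]+) \(MB\)")


def peak_memory_mb(log_text):
    """Return the largest 'peak' value (in MB) found in the log, or None."""
    peaks = [float(m.group(2)) for m in MEM_RE.finditer(log_text)]
    return max(peaks) if peaks else None


# Two lines copied from the log in this issue.
sample = (
    "[INFO DRT-0187] Start routing data preparation.\n"
    "[INFO DRT-0267] cpu time = 00:00:00, elapsed time = 00:00:00, "
    "memory = 9384.01 (MB), peak = 12581.42 (MB)\n"
)
print(peak_memory_mb(sample))
```

Note that the peak reported here is from before the 0th optimization iteration starts, so it is only a lower bound on what detail routing will need.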

@maliberty Thoughts? I don't know if I agree with out-of-memory detection for practical purposes

donn avatar May 29 '22 02:05 donn

Note that the design has fill cells, so:

[INFO ODB-0131] Created 1807364 components and 6756773 component-terminals.

This will add to the memory usage. Usage looks to be around 29-30 GB with the two threads in your test case.

maliberty avatar May 29 '22 05:05 maliberty
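
To see how much of a design's component count comes from fill cells, the COMPONENTS section of the DEF file can be tallied. This is a rough Python sketch, not an OpenROAD feature: it assumes each component statement starts with `- <instance> <macro>` on one line and that fill instances are named with a `FILLER` prefix (as mentioned later in this thread). The sample instance names and coordinates are invented for illustration.

```python
def count_components(def_lines, filler_prefix="FILLER"):
    """Count (filler, other) instances in the COMPONENTS section of a DEF file."""
    in_components = False
    filler = other = 0
    for line in def_lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "COMPONENTS":
            in_components = True
        elif tokens[:2] == ["END", "COMPONENTS"]:
            break
        elif in_components and tokens[0] == "-" and len(tokens) >= 3:
            # tokens[1] is the instance name, tokens[2] the macro name.
            if tokens[1].startswith(filler_prefix):
                filler += 1
            else:
                other += 1
    return filler, other


# Hand-written sample; instance names and coordinates are hypothetical.
sample_def = """\
COMPONENTS 3 ;
    - FILLER_0_1 sky130_fd_sc_hd__decap_12 + PLACED ( 0 0 ) N ;
    - FILLER_0_2 sky130_fd_sc_hd__decap_6 + PLACED ( 5520 0 ) N ;
    - _123_ sky130_fd_sc_hd__inv_2 + PLACED ( 8280 0 ) N ;
END COMPONENTS
""".splitlines()

print(count_components(sample_def))
```

For a real file, pass an open file object instead of the sample list, for example `count_components(open(path))` with `path` pointing at the routed or placed DEF.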

@maliberty Recommended action, then? Just "more RAM"?

donn avatar May 29 '22 15:05 donn

We can look at improving the memory usage, but in the short term, yes: more RAM or more swap.

maliberty avatar May 29 '22 15:05 maliberty

@maliberty do you have any runtime expectation for that testcase? I've increased my swap and it has been running detail routing for about 13 hours.

Askartos avatar May 31 '22 16:05 Askartos

Really, depends on your computer; your CPU, the amount of RAM, the amount of swap...

You could try a larger design area to make routing easier, for now.

@maliberty Can you have someone take a look at this design? If it's 7k cells as they claim and it's taking over 12 GB of RAM, something's not right, methinks.

donn avatar May 31 '22 17:05 donn

@donn if you see my comments above the 7k excludes all the fill cells.

maliberty avatar May 31 '22 17:05 maliberty

I ran it and got a result in: real 16m12.260s

Is your machine thrashing? What step are you on?

maliberty avatar May 31 '22 17:05 maliberty

The process is still running; the last message I got says that detail routing is at "Completing 70%". The process is using 16 GB of RAM and 11 GB of swap. My machine has a Core i5 7300HQ and 16 GB of DDR4 at 2400 MHz.

Askartos avatar May 31 '22 18:05 Askartos

Which iteration is it on? It took 10 iterations for it to complete for me.

maliberty avatar May 31 '22 18:05 maliberty

iteration 0

Askartos avatar May 31 '22 18:05 Askartos

I can only assume the machine is thrashing as the runtime difference is huge.

maliberty avatar May 31 '22 18:05 maliberty
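
One quick way to check whether a Linux machine is leaning heavily on swap is to read `/proc/meminfo`. The Python sketch below parses that file's `Name: value kB` layout and reports swap in use; the helper names are made up here, and the sample numbers are invented to resemble the 11 GB of swap reported above. Heavy swap use alone does not prove thrashing, but it is a strong hint when runtimes differ this much.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap currently in use, in kB."""
    return info["SwapTotal"] - info["SwapFree"]


# Invented sample values; on a real machine use open("/proc/meminfo").read().
sample_meminfo = """\
MemTotal:       16777216 kB
MemAvailable:     262144 kB
SwapTotal:      16777216 kB
SwapFree:        5242880 kB
"""

info = parse_meminfo(sample_meminfo)
print(swap_used_kb(info) / 1024**2, "GiB of swap in use")
```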

On my machine it went to 13 GB of RAM after track assignment, and then, when routing started, it filled the entire available memory (28 GB) and crashed out of memory. Given this, it seems that the fills are not the cause of the high memory usage.

ispd19_test10 is another design that cannot be run on my machine. It has almost 900k components (against 1.8M in this one), and its number of tracks per layer goes up to 21k, while in this design it goes up to 17k. So I don't know if it is really surprising that this design consumes so much memory. Note that, for DR, track density also matters for memory, since it is related to the grid graph size. I think this is the cause of the high memory usage, since the number of nets is small (7.7k).

Stephanommg avatar May 31 '22 19:05 Stephanommg

Despite the above, I tested running the design without all the fill cells, and the problem was solved. The routing finished in 16 seconds, at iteration 12. The peak memory reported by the router was 2.6 GB.

Stephanommg avatar Jun 01 '22 17:06 Stephanommg

@Stephanommg that aligns with my latest run: reducing the area to 1000um x 1000um removes the issue. The fillers' LEF file only has ports on met1 and li1. I don't know if anything can be done here to improve the detailed routing, but 10X memory consumption seems like a lot.

Askartos avatar Jun 01 '22 18:06 Askartos

I implemented a feature to disregard fill cells that are not adjacent to non-fill cells. This didn't make any difference for this design, since it has 1.5M cells named FILLER* and only about 256k of these cells are marked as SPACER in the LEF file (this flag says that the cell is supposed to be treated as a filler). I am not sure whether this is a mistake or intentional, but anyway, I tested adding the SPACER flag to the macros sky130_fd_sc_hd__decap_6 and sky130_fd_sc_hd__decap_12, which seemed to be the ones used by all the other FILLER* instances. As a result, about half of the FILLER* instances were ignored by the router, and so it was possible to route the design using 20 GB.

Thus, if you want the router to try to ignore a cell because you think it will be useless in routing, you should add the SPACER flag to its macro.

The implemented feature will run through additional testing and when it is released on the master branch I will inform you.

Stephanommg avatar Jun 07 '22 18:06 Stephanommg
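
To find out which macros are missing the SPACER flag before editing a LEF file, the `CLASS` statement of each `MACRO` can be listed. The Python sketch below is only an illustration under simple assumptions: each `MACRO` and `CLASS` statement sits on its own line, and the first `CLASS` after a `MACRO` line is the macro's own. The function names are hypothetical, and the sample LEF is hand-written, not copied from the PDK.

```python
def macro_classes(lef_lines):
    """Map each MACRO name to the tokens of its CLASS statement (';' stripped)."""
    classes = {}
    current = None
    for line in lef_lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "MACRO" and len(tokens) >= 2:
            current = tokens[1]
            classes[current] = []
        elif current and tokens[0] == "CLASS" and not classes[current]:
            # Keep only the first CLASS seen inside the macro.
            cleaned = [t.rstrip(";") for t in tokens[1:]]
            classes[current] = [t for t in cleaned if t]
        elif current and tokens[:2] == ["END", current]:
            current = None
    return classes


def missing_spacer(classes, names):
    """Return the macros from `names` whose CLASS does not include SPACER."""
    return [n for n in names if "SPACER" not in classes.get(n, [])]


# Hand-written sample for illustration only.
sample_lef = """\
MACRO sky130_fd_sc_hd__fill_1
  CLASS CORE SPACER ;
END sky130_fd_sc_hd__fill_1
MACRO sky130_fd_sc_hd__decap_6
  CLASS CORE ;
END sky130_fd_sc_hd__decap_6
""".splitlines()

classes = macro_classes(sample_lef)
print(missing_spacer(classes, ["sky130_fd_sc_hd__fill_1", "sky130_fd_sc_hd__decap_6"]))
```

Any macro reported here would need its statement changed to `CLASS CORE SPACER ;` for the router to treat it as a filler, as described in the comment above.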