opentitan [reggen] Add tooling for checking in-line countermeasure annotations

As discussed in the entropy_src D2S review meeting today, it would be great if we could annotate the code with the countermeasure IDs.

These would not be exact annotations, but rather serve as hints where an auditor should look in order to audit a specific countermeasure. I.e., certain countermeasures can be pinpointed exactly - think prim like instances like prim_count, whereas algorithmic countermeasures like masking schemes are a bit less well localized.

The tentative format for such annotations should be // SEC_CM [INSTANCE].ASSET.CM_TYPE in order to facilitate checks in reggen.

In particular, reggen should:

Extract all lines starting with // SEC_CM from ipname/rtl/*.sv files
Match them up against the Hjson list defined, and throw an error if
- the countermeasure annotation does not exist in the Hjson list
- a countermeasure from the Hjson list cannot be matched up with an RTL annotation (it is allowed to use the same countermeasure ID multiple times in RTL)

Possible future enhancements:

Instead of just grepping files under ipname/rtl/*.sv, fusesoc could be invoked to build the file list
In the documentation, we can append a list of links to each countermeasure ID that takes the reviewer/auditor to the code section with the annotation (on GitHub).

Jan 13 '22 21:01 msfschaffner

CC @sriyerg

Jan 13 '22 21:01 msfschaffner

that sounds good. so hypothetically speaking, let's say there's something called CTRL.MUBI, but in the design there are actually multiple instances of such.

Each one would then probably match to the same entry in the hjson?

Jan 13 '22 23:01 tjaychen

That is correct, unless we want to enforce a 1:1 correspondence. Note that such 1:1 correspondence would then probably have to be checked on a per SV module basis, since reggen cannot elaborate generate loops that create multiple instances of the same module. Do we have any preference?

Jan 14 '22 00:01 msfschaffner

i dont think so. I think 1 to many should be fine, especially when it comes to the more standard types. If during the review process we find the need to split them up, then it might be up to the designer to actually tag them with separate prefixes so they can be separately identified.

Jan 14 '22 00:01 tjaychen

Yes, I think I am also leaning towards leaving more flexibility. For the standardized ones, this would actually allow us to collapse them into one item, whereas for more custom stuff we probably want to call them out specifically.

This would eliminate quite some replication in blocks like OTP: https://github.com/lowRISC/opentitan/blob/7df0097b04e67cda6797e3bfef3f9c7b99b638cf/hw/ip/otp_ctrl/data/otp_ctrl.hjson#L892-L909

Jan 14 '22 00:01 msfschaffner

I definitely favor flexibility, too. Collapsing multiple standardized CM instances seems more acceptable with the annotations in the RTL. However, there may still be cases if the list of CM for a particular instance are different in which case giving individual names simplifies things . For example, you may have different FSMs in an IP that all use the same base hardening but only one FSM features global escalation, too.

Jan 14 '22 12:01 vogelpi

Initial thought was that 1:1 would be more effective, but after more consideration, maybe just having the broad CM listed in the hjson and the instance name in the RTL is sufficient.

Jan 14 '22 14:01 mwbranstad

@mwbranstad since we would like to be able to globally aggregate and list CM IDs, I think we should not use different IDs in RTL annotation for consistency.

I.e., either the annotator chooses to make instance names explicit, and then the list in Hjson should reflect that, or they choose to use the general ID, in which case you get multiple tags of the same name in RTL.

Jan 14 '22 19:01 msfschaffner

+1 for flexibility - it will be useful to enforce for example, ALL instances of some critical counter measure foo are properly verified, once we start extracting testplan bits from these.

Jan 14 '22 21:01 sriyerg

As the reviewer, probably 1:1 would be easier to deal with. But that basically makes the countermeasure lists unwieldy and forces a need for manual reconciliation which is not desirable. So 1:many for things like MUBI is ok for me, I think. We'll check the individual MUBIs in D2S and have to keep in mind that downstream changes to anything countermeasure-related (adding or removing) probably mean retriggering a D2S review.

Jan 19 '22 19:01 cdgori

One other comment, what I am finding in the review process is that we are pursuing two goals in parallel:

for annotated/stated countermeasures, are they present / properly used?
are any countermeasures missing that should be present - this usually by enumerating assets and trying to think about attacks / potential design weaknesses or single points of failure.

The second one is harder without fully enumerated 1:1 type of lists. I think the flexibility is still probably worth it, it just means that we spend a little more time on the review (a one-time cost, hopefully) to get everything accounted for.

Jan 19 '22 19:01 cdgori

Thanks for all the feedback here. We now have the tooling in place that can check

whether the canonical names are correct
whether all items annotated in RTL are actually listed in Hjson
whether all items in Hjson are annotated in RTL

Note that 2) produces an error and 3) a warning at the moment since we still have some un-annotated blocks that already have an Hjson list merged. That warning should be bumped to an error once all blocks have been annotated.

As for additional enhancements, one thing that I would find very neat would be to use the extracted info to enhance the autogen'd table and add links to the individual label instaces in the code like this (example from sram_ctrl):

Countermeasure ID	Description	RTL Annotation
SRAM_CTRL.BUS.INTEGRITY	End-to-end bus integrity scheme.	sram_ctrl.sv:96 sram_ctrl.sv:392
SRAM_CTRL.MEM.INTEGRITY	End-to-end data/memory integrity scheme.	sram_ctrl.sv:378
SRAM_CTRL.MEM.SCRAMBLE	Data is scrambled with a keyed reduced-round PRINCE cipher in CTR mode.	sram_ctrl.sv:408
SRAM_CTRL.ADDR.SCRAMBLE	Address is scrambled with a keyed lightweight permutation/diffusion function.	sram_ctrl.sv:408

I don't think it is hard to do, given that we already extract this info as part of the RTL annotation checking. We would just need to append this info to the datastructure stored in the IpBlock object, and expand the rendering step here: https://github.com/lowRISC/opentitan/blob/dc40cb5774741bca306f536b1f5abe1f59cb1319/util/reggen/gen_cfg_html.py#L116-L128

If anyone feels inclined to take a stab at this please go ahead. Otherwise I'll come back to this enhancement once I find some spare cycles in my backyard.

Jan 22 '22 02:01 msfschaffner

btw i think one thing we "might" not be dealing with yet is the templated modules. I've run into a few cases where either the wrong hjson or maybe the wrong rtl was being inspected? do you happen to know?

Jan 22 '22 02:01 tjaychen

Ah, this is something we need to double check.

The regtool does the check on the files that live at ../rtl/ relative to the hjson, so it could be that this currently only works for generated IPs that are generated with IP gen, since that one copies the entire documentation over as well (and the docs are built from the generated folder)...

Could this be the issue?

Jan 22 '22 03:01 msfschaffner

Dang, it looks like the checking currently does not work properly for generated IPs in general, no matter how they have been generated.

Jan 22 '22 03:01 msfschaffner

Ok so #10271 should take care of IPs generated via IP gen. Still need to figure out the ones that are generated with the legacy mechanism.

Jan 22 '22 03:01 msfschaffner

I pushed another patch to #10271 which takes care of legacy-style generated IPs, too.

Jan 22 '22 04:01 msfschaffner

Ok generated IPs should now be covered. However, we still need to figure out how we want to deal with vendored-in IP like Ibex, for instance. We probably need a way to redirect reggen to include and scan vendored-in RTL files.

@rswarbrick @GregAC for thoughts.

This will be relevant for #10341.

Jan 25 '22 20:01 msfschaffner

Testplan support for countermeasure items in Hjson is about to be added in #10556 by @sriyerg.

Feb 02 '22 21:02 msfschaffner

should we close this now? It seems like everything that's needed has been added.

Mar 03 '22 22:03 tjaychen

Were we keeping it open until we'd upgraded the warnings to errors?

Mar 03 '22 23:03 rswarbrick

ah that might be a while until i officially move all my stupid templated blocks over to ip_autogen...

On Thu, Mar 3, 2022 at 3:46 PM Rupert Swarbrick @.***> wrote:

Were we keeping it open until we'd upgraded the warnings to errors?

— Reply to this email directly, view it on GitHub https://github.com/lowRISC/opentitan/issues/10071#issuecomment-1058674430, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAH2RSSEE2BWGDZQU2JAHPDU6FFLRANCNFSM5L462JTQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were assigned.Message ID: @.***>

Mar 03 '22 23:03 tjaychen

@tjaychen I believe we should have support for both IPgen'd and legacy gen'd blocks... or does that not fully work on some of your blocks?

Mar 04 '22 00:03 msfschaffner

@tjaychen I believe we should have support for both IPgen'd and legacy gen'd blocks... or does that not fully work on some of your blocks?

o it's not that. It's because some of my blocks are really dependent on top level (ie, the hjson alone isn't enough, it actually needs to wait for other blocks to be parsed), so the hw/ip/* stuff are not really up to date and will be missing certain statements, leading to warnings.

Mar 04 '22 00:03 tjaychen

ah ok, got it.

Mar 04 '22 19:03 msfschaffner

One other mode that should be added: reggen cannot parse vendored code and prims at the moment, which means that we have to add all the SEC_CM labels in the comportable IP wrappers (see https://github.com/lowRISC/opentitan/pull/12104#discussion_r849935613). This could be solved, e.g., by leveraging the build system to gather the files...

Apr 14 '22 17:04 msfschaffner

@tjaychen we may want to fix and close this one as well when doing the IPgen migration.

Jan 12 '23 00:01 msfschaffner

CC @matutem

We should be able to complete this once everything works with Ipgen.

Oct 06 '23 02:10 msfschaffner

It turns out ipgen is not required: what needs to be done is to avoid running regtool in hw/Makefile for generated blocks, since topgen will run regtool with the right hjson file. This is addressed by https://github.com/lowRISC/opentitan/pull/21053.

There is also an issue (https://github.com/lowRISC/opentitan/issues/20887) to add missing countermeasure annotation to some blocks which issue warnings.

Once these two items are dealt with we could turn missing annotations into errors by default, perhaps with a flag to turn them into warnings for RTL development?

Jan 29 '24 15:01 matutem

@matutem , I've assigned you to this issue because it is my understanding that you are in the process of finishing the last items before we can switch the warnings into errors. Please let me know if this is not correct, thanks!

Mar 25 '24 13:03 vogelpi