boulder icon indicating copy to clipboard operation
boulder copied to clipboard

bad-key-revoker: mitigate potential race condition

Open jprenken opened this issue 4 years ago • 0 comments

I believe there's a race condition involving compromised keys. Consider the following scenario:

  • A certificate issuance begins, and passes its final goodkey/sa.KeyBlocked check before signing.
  • A compromised key is inserted into blockedKeys.
  • bad-key-revoker begins a run, and executes its SELECT on keyHashToSerial.
  • sa.AddKeyHash is only now called for the certificate issuance.

bad-key-revoker would miss this certificate.

This scenario is very unlikely if all or most of the queries run against the primary database node, but we are about to direct SELECTs to replicas much more frequently. Replication lag on replicas could make this scenario more likely.

This could be avoided by wrapping parts of the issuance pipeline and bad-key-revoker in MySQL transactions, such that the transaction would fail and roll back if the data underlying the transaction's SELECTs (e.g. on blockedKeys or keyHashToSerial) had changed before commitment. However, this would require deep refactoring.

I think an easier approach would be to modify bad-key-revoker to run multiple cycles on each row, with a configurable delay between cycles. Operators would ideally set that delay to the same amount of time for which their database layer is set to guarantee consistency.

This would not require a schema change if extantCertificatesChecked were treated as an integer representing the number of cycles that have been run. It's already a tinyint(1), not a boolean.

jprenken avatar Sep 30 '21 22:09 jprenken