cortex-a icon indicating copy to clipboard operation
cortex-a copied to clipboard

Add functions for HVC and SMC calls.

Open qwandor opened this issue 3 years ago • 11 comments

These follow the Arm SMC Calling Convention version 1.3. I've somewhat arbitrarily used function parameters for the HVC32/SMC32 calls and an array of arguments for the HVC64/SMC64 versions, because they take more arguments.

qwandor avatar Mar 14 '22 17:03 qwandor

Thanks!

If you’re up to it, I’d like to brainstorm the API a bit. I am writing this on mobile and didn’t have time yet to browse how others do it, so just putting out something that I noticed.

Is it intentional that for hvc32 and smc32, dedicated parameters are used, but an array for the 64 versions?

andre-richter avatar Mar 14 '22 18:03 andre-richter

It's intentional but somewhat arbitrary. I can use an array for both if you prefer, it just seemed a bit nicer to use dedicated parameters when there aren't so many of them.

qwandor avatar Mar 14 '22 18:03 qwandor

I think that would be more streamlined 👍

I also wonder if the args parameter should be a slice instead. The compiler would optimize stuff away in most cases I presume, but theoretically it means they get copied by value.

andre-richter avatar Mar 14 '22 19:03 andre-richter

Same slice that goes in could be used for the return values.

andre-richter avatar Mar 14 '22 19:03 andre-richter

I did consider a slice, but as far as I know that wouldn't allow the length of the array to be statically checked, and I'd rather not have runtime checks here.

For return values, the problem is that the function ID if an HVC64/SMC64 is still a 32-bit value, but the first return value (from x0) is 64 bits long.

qwandor avatar Mar 14 '22 20:03 qwandor

I did consider a slice, but as far as I know that wouldn't allow the length of the array to be statically checked, and I'd rather not have runtime checks here.

Good point. How about providing a struct SMCCCData32 and struct SMCCCData64 that the respective calls take as mutable reference. This way we have a static check.

For convenience, we could then also throw in a builder for these types using the builder pattern, where all the default register values are zero. This makes it quite expressive and the user must only care to explicitly initialize what he really needs.

For return values, the problem is that the function ID if an HVC64/SMC64 is still a 32-bit value, but the first return value (from x0) is 64 bits long.

The builder/constructor could cast the function to u64 in this case.

Apart from that, I am still unhappy about the current overhead due to all the arguments. In reality, one will seldom need as many registers as is provisioned here. This just adds needless overhead and bloats the machine code. Ideally, we just care about max(num_input_registers, num_output_registers), and nothing more. I'm sure the answer is macros, but I am currently too tired already to sketch it out.

andre-richter avatar Mar 14 '22 21:03 andre-richter

Were you thinking of the structs having individual fields, or containing an array? Either way I'm not convinced this is better. Comparing how this would look for an example PSCI call:

Passing arguments directly:

let result = hvc32(PSCI_SYSTEM_RESET2, reset_type, cookie, 0, 0, 0, 0, 0)[0];

Passing array:

let result = hvc32(PSCI_SYSTEM_RESET2, [reset_type, cookie, 0, 0, 0, 0, 0])[0];

Passing struct with named fields:

let result = hvc32(SMCCCData32 {
    a0: PSCI_SYSTEM_RESET2,
    a1: reset_type,
    a2: cookie,
    ..Default::default()
})
.a0;

Passing struct with an array and a builder:

let result = hvc32(
    SMCCCData32::function(PSCI_SYSTEM_RESET2)
        .a1(reset_type)
        .a2(cookie)
        .build(),
)
.regs[0];

Passing struct with named fields by reference:

let mut args = SMCCCData32 {
    a0: PSCI_SYSTEM_RESET2,
    a1: reset_type,
    a2: cookie,
    ..Default::default()
};
hvc32(&mut args);
let result = args.a0;

I really think it makes more sense to pass by value, as the return value is logically a different thing, not a modification of the arguments, and so it's not very nice to have to create a mutable variable to hold them both.

qwandor avatar Mar 15 '22 13:03 qwandor

Hmm okay, lets park the bikeshedding on the value/reference thing for now and look at the more serious issue: Optimal code generation.

With the current implementation, we inevitably clobber registers x0..x7 or x0..x17, even when we actually need only a select few of them (e.g. the PSCI example you just gave). I think we should really avoid that. SMCs or HVCs are often in a performance critical path and this could hurt.

We are generic over num_input_parameters and num_output_parameters, which unfortunately is a quite huge space to cover. And we should also consider the ergonomics on the user side. I must confess that my macro game is especially weak, so I am not sure to what extend we could automate code generation here.

What do you think about the following?

struct SMCCCData32<const NUM_PARAMS: usize> {
    function: u32,
    params: [u32; NUM_PARAMS],
}

trait Hvc32<const NUM_RETURN_ITEMS: usize> {
    fn hvc32(self) -> [u32; NUM_RETURN_ITEMS];
}

impl Hvc32<3> for SMCCCData32<4> {
    fn hvc32(self) -> [u32; 3] {
        let mut ret = [0; 3];

        unsafe {
            core::arch::asm!(
                "hvc #0",
                inout("w0") self.function => ret[0],
                inout("w1") self.params[0] => ret[1],
                inout("w2") self.params[1] => ret[1],
                in("w3") self.params[2],
                in("w4") self.params[3],
                options(nomem, nostack)
            )
        }

        ret
    }
}

impl Hvc32<2> for SMCCCData32<4> {
    fn hvc32(self) -> [u32; 2] {
        let mut ret = [0; 2];

        unsafe {
            core::arch::asm!(
                "hvc #0",
                inout("w0") self.function => ret[0],
                inout("w1") self.params[0] => ret[1],
                in("w2") self.params[1],
                in("w3") self.params[2],
                in("w4") self.params[3],
                options(nomem, nostack)
            )
        }

        ret
    }
}

#[no_mangle]
unsafe fn _start() -> ! {
    let input = SMCCCData32 {
        function: 1,
        params: [1, 2, 3, 4],
    };

    let _x = Hvc32::<2>::hvc32(input);

    loop {}
}

This would only use as many registers as needed, and it still is kinda okay. Not sure how much of this could be auto-generated.

andre-richter avatar Mar 15 '22 15:03 andre-richter

It's a bit more complicated than that, due to the calling convention and the potential need to support implementations of old versions of the SMCCC by the hypervisor or secure firmware. In particular:

  • SMCCC 1.3 specifies that x0-x3 are not preserved even if they don't contain results, so we always need to treat them as clobbered, even for your Hvc32<2> example above.
  • SMCCC 1.0 specifies that the return state of x4-x17 is unpredictable for HVC64/SMC64 calls, so if it's possible that the callee implements SMCCC 1.0 then the caller needs to assume that they are clobbered too. This might not be known at build time, it would have to be probed at runtime.

I think it's safer and simpler to just assume that all registers which might be used for return values are clobbered. This does mean that caller might have to save them to the stack, but are a few sequential memory accesses really that expensive compared to whatever the call is doing? What performance critical paths do you have in mind?

qwandor avatar Mar 15 '22 16:03 qwandor

What performance critical paths do you have in mind?

Paravirtualization use cases are often performance critical, for example.

andre-richter avatar Mar 15 '22 16:03 andre-richter

I'd like to suggest a few additional aspects to consider:

  • As per SMC32 calling convention, only X18-X30 are guaranteed to be untouched. smc32 and hvc32 should mark W7 and X8-X17 as clobbers to inform the register allocator not to rely on these registers to preserve their contents across a SMC.

  • In terms of flexibility, it is desirable to let users customize the smc/hvc instruction immediate. This however depends on the currently unstable asm_const feature:

pub unsafe fn smc<const ID: usize>(ctx: &mut ctx::SecureMonitorContext) {
    asm!(
        "smc {id}",
        id = const ID,
        // ...
    )
}
  • I think it also makes sense to expose the pre-defined error codes to reduce potential boilerplate on the user end:
pub const UNKNOWN_FUNCTION_ID: u32 = u32::MAX;
  • Users may also appreciate facilities to build Function ID bitmasks from their respective components which often enhances expressiveness of code over the magic values:
pub type FunctionId = u32;

#[inline(always)]
pub const fn make_function_id(
    function: u16,
    service: u8,
    is_smc64: bool,
    fast_call: bool,
) -> FunctionId {
    assert!(service < 64, "Owning Entity Number out of range!");

    (function as FunctionId)
        | (service as FunctionId) << 24
        | (is_smc64 as FunctionId) << 30
        | (fast_call as FunctionId) << 31
}

vbe0201 avatar Jun 13 '22 22:06 vbe0201

Closing as this crate has been renamed and moved to https://github.com/rust-embedded/aarch64-cpu. Please continue there if needed.

andre-richter avatar Nov 06 '22 19:11 andre-richter