Proposal

Problem statement

The standard library on Windows uses API bindings to get process information, but those bindings have higher overhead than reading the TEB (Thread Environment Block) directly. The TEB is an internal Windows struct whose members hold a lot of process information, so we can use it to improve performance.

Motivating examples or use cases

Get stdout/stderr handle
Get command line pointer
Get process id
Get last error (most common case)

Solution sketch

The first step is to define types for TEB and its members: windows.txt
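As a rough illustration of this first step, the sketch below declares only the leading part of the TEB, up to the last-error field. The Rust type and field names (`NtTib`, `ClientId`, `TebPrefix`, `last_error_offset`) are hypothetical and are not taken from windows.txt; the field order follows the commonly published layout, and everything after `last_error_value` is omitted.

```rust
use std::ffi::c_void;
use std::mem::offset_of;

/// Leading block of the TEB (NT_TIB); every member is pointer-sized.
#[repr(C)]
pub struct NtTib {
    pub exception_list: *mut c_void,
    pub stack_base: *mut c_void,
    pub stack_limit: *mut c_void,
    pub sub_system_tib: *mut c_void,
    pub fiber_data: *mut c_void,
    pub arbitrary_user_pointer: *mut c_void,
    /// Pointer back to the start of this structure.
    pub self_ptr: *mut NtTib,
}

/// Process and thread identifiers of the current thread.
#[repr(C)]
pub struct ClientId {
    pub unique_process: *mut c_void,
    pub unique_thread: *mut c_void,
}

/// Hypothetical partial TEB definition; fields past the last error are left out.
#[repr(C)]
pub struct TebPrefix {
    pub nt_tib: NtTib,
    pub environment_pointer: *mut c_void,
    pub client_id: ClientId,
    pub active_rpc_handle: *mut c_void,
    pub thread_local_storage_pointer: *mut c_void,
    pub process_environment_block: *mut c_void,
    pub last_error_value: u32,
}

/// Byte offset of the last-error field, useful as a layout sanity check.
pub const fn last_error_offset() -> usize {
    offset_of!(TebPrefix, last_error_value)
}
```

With this layout, `self_ptr` sits at offset 0x30 on 64-bit targets and 0x18 on 32-bit targets, which are the same offsets used when reading the TEB pointer through `gs` and `fs`. A real definition would need the remaining fields and should be checked against the target Windows versions.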

The second step is to obtain a pointer to the TEB. There are different ways to do it:

Use the NtCurrentTeb binding:

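extern "system" {
    fn NtCurrentTeb() -> *const TEB;
}

fn get_teb() -> *const TEB {
    // SAFETY: NtCurrentTeb takes no arguments and only returns the current thread's TEB pointer.
    unsafe { NtCurrentTeb() }
}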

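Use the register value:

// x86_64: the TEB self-pointer is stored at gs:[0x30].
use std::arch::asm;

fn get_teb() -> *const TEB {
    let teb: *const TEB;
    // SAFETY: on x86_64 Windows, gs:[0x30] holds the current thread's TEB pointer.
    unsafe {
        asm!(
            "mov {}, gs:[0x30]",
            out(reg) teb,
            options(nostack, readonly, preserves_flags),
        );
    }
    teb
}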

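// x86: the TEB self-pointer is stored at fs:[0x18].
use std::arch::asm;

fn get_teb() -> *const TEB {
    let teb: *const TEB;
    // SAFETY: on x86 Windows, fs:[0x18] holds the current thread's TEB pointer.
    unsafe {
        asm!(
            "mov {}, fs:[0x18]",
            out(reg) teb,
            options(nostack, readonly, preserves_flags),
        );
    }
    teb
}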

The final step is to replace those bindings with member accesses on the TEB.

For example:

pub fn get_last_error() -> WinError {
    // SAFETY: This just returns a thread-local u32 and has no other effects.
    unsafe { WinError { code: (*get_teb()).last_error_value } }
}

Alternatives

Links and related work

Zig uses the TEB to get process information and treats it and most of its members as non-null pointers: https://github.com/ziglang/zig/blob/master/lib/std/os/windows.zig
Local benchmark of different ways to get the command line pointer:

TEB/inline-assembly time:   [241.46 ps 243.36 ps 245.64 ps]
TEB/NtCurrentTeb    time:   [1.1975 ns 1.1998 ns 1.2028 ns]
TEB/GetCommandLineW time:   [1.6770 ns 1.6790 ns 1.6817 ns]

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

We think this problem seems worth solving, and the standard library might be the right place to solve it.
We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

Mar 09 '25 09:03 CrazyboyQCD

I think this needs a stronger motivation. You mention performance but I would like to see some real-world numbers. E.g. it's unlikely people are getting the command line in a tight loop and any function call overhead is going to be dwarfed by parsing the command line in any case. The most common (GetLastError) is called only after a much more expensive call into the Windows API.

Mar 09 '25 10:03 ChrisDenton

@ChrisDenton

E.g. it's unlikely people are getting the command line in a tight loop and any function call overhead is going to be dwarfed by parsing the command line in any case.

Indeed, but the TEB also gives access to the length of the command line buffer. The current command line parsing implementation on Windows uses an iterator with a null check; this could turn it into a slice iterator and get a slight improvement. (Caveat: this may break if another process modifies the length, but I think that is fine, since it could also modify the command line pointer, which is what GetCommandLineW returns; see this for detail.)
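To make the slice idea concrete, here is a minimal sketch, assuming the command line is stored as a UNICODE_STRING, whose `Length` field counts bytes. The names `UnicodeString`, `as_wide_slice`, and `scan_to_nul` are hypothetical helpers for illustration and are not existing standard library code.

```rust
use std::slice;

/// Mirrors the layout of the Windows UNICODE_STRING struct.
#[repr(C)]
pub struct UnicodeString {
    /// Length of the string in bytes, not counting any terminating NUL.
    pub length: u16,
    /// Capacity of `buffer` in bytes.
    pub maximum_length: u16,
    pub buffer: *mut u16,
}

/// Hypothetical helper: view the string as a slice using the stored length.
///
/// # Safety
/// `s.buffer` must point to at least `s.length` bytes of valid UTF-16 data
/// that is not modified while the returned slice is alive.
pub unsafe fn as_wide_slice(s: &UnicodeString) -> &[u16] {
    if s.buffer.is_null() {
        return &[];
    }
    // The length is stored in bytes, so halve it to get the number of u16 units.
    unsafe { slice::from_raw_parts(s.buffer, usize::from(s.length) / 2) }
}

/// The pointer-only alternative: walk the buffer until the terminating NUL.
///
/// # Safety
/// `ptr` must point to a NUL-terminated UTF-16 buffer that outlives `'a`.
pub unsafe fn scan_to_nul<'a>(ptr: *const u16) -> &'a [u16] {
    let mut len = 0;
    unsafe {
        while *ptr.add(len) != 0 {
            len += 1;
        }
        slice::from_raw_parts(ptr, len)
    }
}
```

Both helpers yield the same units for a well-formed string; the first skips the scan by trusting the stored length, which is where the caveat about the length being modified applies.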

The most common (GetLastError) is called only after a much more expensive call into the Windows API.

Benchmarking a recursive read_dir (which contains several get_last_error calls) shows no significant difference, so this doesn't have much impact, as expected.

Mar 09 '25 14:03 CrazyboyQCD

As this isn't an API change, only an implementation change, the ACP process is not needed for this. However, as @ChrisDenton said, the motivation isn't sufficiently strong for making this change anyway.

Apr 01 '25 17:04 Amanieu

Use `TEB` to get process information on Windows
