'daxctl list' should display whether a system-ram device is online or offline
It would be very useful to know whether a system-ram device is online or offline. Currently, 'daxctl list' does not display this information, e.g.:
# daxctl list --dev dax0.0
[
{
"chardev":"dax0.0",
"size":1598128390144,
"target_node":2,
"mode":"system-ram",
"movable":true
}
]
For the NUMA node, we can look at the state of each memory block, e.g.:
# cat /sys/bus/node/devices/node2/memory*/state | uniq -c
743 online
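For anyone who wants the same check from a script, here is a minimal Python sketch of the one-liner above. It assumes the sysfs layout shown in the command (a 'state' file under each memory* entry of the node directory); the function name count_block_states is made up for illustration and is not part of daxctl or libdaxctl.

```python
from collections import Counter
from pathlib import Path


def count_block_states(node_dir):
    """Count memory block states under one NUMA node's sysfs directory.

    node_dir is a path such as /sys/bus/node/devices/node2 (taken from the
    command above). Returns a dict mapping each state string to a count.
    """
    counts = Counter()
    # Each memoryN entry under the node directory exposes a 'state' file.
    for state_file in Path(node_dir).glob("memory*/state"):
        counts[state_file.read_text().strip()] += 1
    return dict(counts)
```

Calling count_block_states("/sys/bus/node/devices/node2") gives the same per-state tally as the 'uniq -c' pipeline, without depending on the states being listed in sorted order.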
One proposal would be to include an 'online' boolean in the output, e.g.:
# daxctl list --dev dax0.0
[
{
"chardev":"dax0.0",
"size":1598128390144,
"target_node":2,
"mode":"system-ram",
"movable":true,
"online":true
}
]
We may need to consider the case where some of the memory blocks within the node are online and some are offline. Such a situation may occur if the memory blocks are manipulated outside of daxctl, or when 'daxctl offline-memory' offlines some, but not all, memory blocks because they are in use.
How about, instead of a boolean flag, adding two fields, "total_memblocks" and "online_memblocks", so that it's clear when a memory range is only partially online?
That's a much better idea. Given that a memory block can be offline, or online in either the ZONE_NORMAL (kernel) or ZONE_MOVABLE zone, we could remove the current "movable" boolean state and cover the combinations with:
- online_kernel_memblocks - Number of online memblocks in the NORMAL (Kernel) zone
- online_kernel_size - Number of bytes that are online in the NORMAL (Kernel) zone
- online_movable_memblocks - Number of online memblocks in the MOVABLE zone
- online_movable_size - Number of bytes that are online in the MOVABLE zone
- offline_memblocks - Number of offline memblocks
- offline_size - Number of bytes that are offline
- total_memblocks - Total number of online and offline memory blocks within the NUMA node
A memblock may not mean anything to a system administrator, so we should display the size in bytes (perhaps even dropping the number of memblocks in favour of *_size).
Including "offline_memblocks" makes the partial-online case more obvious for those who need to troubleshoot, monitor, and report the system state.
Example:
# daxctl list --dev dax0.0
[
{
"chardev":"dax0.0",
"size":1598128390144,
"target_node":2,
"online_kernel_memblocks":743,
"online_kernel_size":1598128390144,
"online_movable_memblocks":0,
"online_movable_size":0,
"offline_memblocks":0,
"offline_size":0,
"total_memblocks":743
}
]
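To make the proposed fields concrete, here is a rough Python sketch of how they could be derived from per-block information. This is only an illustration of the proposal, not how daxctl computes anything: the (state, zone) input format, the zone names "Normal" and "Movable", and the function name summarize_blocks are all assumptions for the example.

```python
def summarize_blocks(blocks, block_size):
    """Build the proposed summary fields from per-block information.

    blocks: iterable of (state, zone) pairs. This input format is
            hypothetical, e.g. ("online", "Normal"), ("online", "Movable")
            or ("offline", None).
    block_size: size of one memory block in bytes.
    """
    kernel = movable = offline = 0
    for state, zone in blocks:
        if state != "online":
            # Simplification: anything not fully online counts as offline.
            offline += 1
        elif zone == "Movable":
            movable += 1
        else:
            # Simplification: every other zone is counted as kernel.
            kernel += 1
    return {
        "online_kernel_memblocks": kernel,
        "online_kernel_size": kernel * block_size,
        "online_movable_memblocks": movable,
        "online_movable_size": movable * block_size,
        "offline_memblocks": offline,
        "offline_size": offline * block_size,
        "total_memblocks": kernel + movable + offline,
    }
```

Note that sizes derived this way are always whole multiples of the memory block size, so they will not necessarily equal the device "size" field.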
Could we also include the memory block range in the list output, to make it easy to map the NUMA node to an address range shown in lsmem? For example:
# lsmem
RANGE                                  SIZE   STATE REMOVABLE   BLOCK
0x0000000000000000-0x000000007fffffff    2G  online       yes       0
0x0000000100000000-0x000000307fffffff  190G  online       yes    2-96
0x0000003680000000-0x000001a9ffffffff  1.5T offline           109-851
0x000001aa00000000-0x000001d9ffffffff  192G  online       yes 852-947
Memory block size: 2G
Total online memory: 384G
Total offline memory: 1.5T
Example:
# daxctl list --dev dax0.0
[
{
"chardev":"dax0.0",
"size":1598128390144,
"target_node":2,
"online_kernel_memblocks":743,
"online_kernel_size":1598128390144,
"online_movable_memblocks":0,
"online_movable_size":0,
"offline_memblocks":0,
"offline_size":0,
"total_memblocks":743,
"memblocks":"109-851"
}
]
}
]
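As a sanity check on why a block range is enough to find the lsmem row, here is a small Python sketch that maps a block range to a physical address range. It assumes memory block N starts at N times the memory block size, which is consistent with the lsmem output above; the helper names are made up for illustration.

```python
def parse_block_range(text):
    """Parse a block range such as "109-851" (or a single block, "0")."""
    first, _, last = text.partition("-")
    return int(first), int(last or first)


def block_range_to_addresses(first_block, last_block, block_size):
    """Return (start, end) physical addresses, end inclusive.

    Assumes block N covers [N * block_size, (N + 1) * block_size - 1].
    """
    start = first_block * block_size
    end = (last_block + 1) * block_size - 1
    return start, end


def format_range(text, block_size):
    """Format a block range in the style of the lsmem RANGE column."""
    first, last = parse_block_range(text)
    start, end = block_range_to_addresses(first, last, block_size)
    return f"0x{start:016x}-0x{end:016x}"
```

With the 2G block size reported by lsmem, format_range("109-851", 2 * 1024**3) yields 0x0000003680000000-0x000001a9ffffffff, which is the offline row in the lsmem listing.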
Only took a year.. :) But I was doing something related to this, and remembered this request.
With the libdaxctl APIs we have, it is easy enough to add (and I've got a patch for this that I'll post shortly):
# daxctl list --dev dax0.0
[
{
"chardev":"dax0.0",
"size":1598128390144,
"target_node":2,
"total_memblocks": 743,
"online_memblocks": 743
}
]
I'm hesitant to add fields like 'online_size'. The most common case is that the entire range will be online or offline, and the main device size should be enough for that. Partial online/offline states are a problem state rather than a normal use case, and since daxctl doesn't support actively putting a device into such a state, I'm not sure how useful the extra information would be.
Splitting out online counts by zone (movable vs. kernel) does seem reasonable, but would need some rework in libdaxctl. The example I listed above is easy to implement with current APIs. Is there a strong need to split counts by zone? "movable":true already indicates that /all/ memblocks are movable. If that's not the case, you're already in a state where extra debug might be needed.
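For monitoring scripts, the two counts are enough to tell the fully online, fully offline, and partial cases apart. A minimal Python sketch, assuming the JSON shape from the example above ("chardev", "total_memblocks", "online_memblocks"); the function names and the status strings are made up for illustration.

```python
import json


def online_status(dev):
    """Classify one device entry from 'daxctl list' JSON output."""
    total = dev.get("total_memblocks")
    online = dev.get("online_memblocks")
    if total is None or online is None:
        # Fields absent, e.g. the device is not in system-ram mode.
        return "unknown"
    if online == 0:
        return "offline"
    if online == total:
        return "online"
    return "partial"


def statuses(daxctl_json):
    """Map each chardev name to its status string."""
    return {dev["chardev"]: online_status(dev) for dev in json.loads(daxctl_json)}
```

A "partial" result is the case that would warrant the extra debugging mentioned above.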
Posted patch here: http://lore.kernel.org/r/[email protected]