
Ram usage of netopeer2-server increases after getting huge operational data

Open trentzhou opened this issue 1 year ago • 5 comments

I used netopeer2-cli to test my program with the command get --filter-xpath /interfaces-state/interface --rpc-timeout 3600. The data is populated by a Python script using sysrepo-python, which returns over 10 thousand fake interfaces. The data is returned successfully, but the RSS column in the output of top shows that the memory usage of netopeer2-server increases and never drops.

I'm not sure whether this is a problem. Maybe it's caused by glibc not returning memory to the kernel. But the memory usage of netopeer2-server is too high, about 3 times higher than that of the Python script.

trentzhou avatar May 14 '24 04:05 trentzhou

This will likely be the result of sysrepo caching enabled in netopeer2. You can try removing the flag on main.c:511 and see whether that fixes it. If it is a significant problem, another compilation or run-time option can be added to turn the cache off but it will significantly slow down all data requests.

michalvasko avatar May 14 '24 07:05 michalvasko

I changed SR_CONN_CACHE_RUNNING to SR_CONN_DEFAULT, but nothing changed.

trentzhou avatar May 15 '24 05:05 trentzhou

You can try an explicit measurement tool such as massif. I am not aware of the operational data being stored anywhere after it is returned.

michalvasko avatar May 15 '24 06:05 michalvasko

This problem can be reproduced very easily. I cleanly installed sysrepo and netopeer2, then installed the YANG modules [email protected] and [email protected]. So this is a very simple environment.

Then I used this python script to generate operational data:

#!/usr/bin/env python3
# benchmark.py
import sysrepo
import sysrepo.session
import sys
import signal
from typing import Any


count = 1000

def gen_fake_interfaces(count: int):
    # generate fake operational (state) data, not configuration
    data = {
        "interfaces-state": {
            "interface": []
        }
    }
    for i in range(count):
        data["interfaces-state"]["interface"].append({
            "name": f"eth{i}",
            "type": "iana-if-type:ethernetCsmacd"
        })
    return data

def interface_state_callback(xpath: str, private_data: Any):
    return gen_fake_interfaces(count)

def main():
    global count
    if len(sys.argv) > 1:
        count = int(sys.argv[1])
    with sysrepo.SysrepoConnection() as conn:
        with conn.start_session() as sess:
            sess.switch_datastore("running")
            sess.subscribe_oper_data_request("ietf-interfaces",
                                             "/ietf-interfaces:interfaces-state",
                                             interface_state_callback)
            signal.sigwait([signal.SIGINT, signal.SIGTERM])
            
if __name__ == '__main__':
    main()

While this Python script is running, I can get the operational data with get --rpc-timeout 3600 --filter-xpath /interfaces-state. Running benchmark.py 100000 returns a huge amount of data. After that request, the memory usage of netopeer2-server increases and never drops. Even after the Python script is killed, the memory usage stays very high.
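The RSS observation can also be logged over time instead of read off top by hand. The following is a minimal sketch, assuming a Linux system with /proc mounted. The file name rss_watch.py and the functions parse_vm_rss_kib, read_rss_kib and watch are hypothetical helpers for illustration and are not part of netopeer2 or sysrepo.

```python
# rss_watch.py (hypothetical helper, not part of netopeer2 or sysrepo)
import time
from pathlib import Path


def parse_vm_rss_kib(status_text: str) -> int:
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line has the form "VmRSS:    123456 kB".
            return int(line.split()[1])
    raise ValueError("no VmRSS line found (kernel thread or non-Linux system?)")


def read_rss_kib(pid: int) -> int:
    """Read the current resident set size of a process (Linux only)."""
    return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())


def watch(pid: int, samples: int = 60, interval: float = 1.0) -> list:
    """Print and collect `samples` RSS readings, one every `interval` seconds."""
    readings = []
    for _ in range(samples):
        rss = read_rss_kib(pid)
        readings.append(rss)
        print(f"{time.strftime('%H:%M:%S')} rss={rss / 1024:.1f} MiB", flush=True)
        time.sleep(interval)
    return readings
```

To use it, look up the server's PID (for example with pidof netopeer2-server), call watch(pid) in one terminal and issue the get request in another. The readings correspond to the RSS column of top.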

trentzhou avatar May 17 '24 01:05 trentzhou

[image: massif-visualizer graph] This is the graph displayed by massif-visualizer. I noticed that the memory usage is lower when the program runs under massif and higher without it.

trentzhou avatar May 17 '24 02:05 trentzhou

Then I guess it really is a "feature" of memory allocation and not actual memory consumption of netopeer2-server.
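The allocator explanation can be checked in isolation. The sketch below assumes Linux with glibc and uses a Python process as a stand-in, not netopeer2-server itself. It allocates many blocks, frees all but one, and then calls glibc's malloc_trim(0) through ctypes to ask the allocator to hand free pages back to the kernel. The names rss_kib, trim_heap and demo are hypothetical. Whether netopeer2-server would benefit from such a call was not established in this thread.

```python
import ctypes
import ctypes.util
from pathlib import Path
from typing import Optional, Tuple


def rss_kib() -> Optional[int]:
    """Resident set size of this process in KiB, or None if /proc is unavailable."""
    try:
        text = Path("/proc/self/status").read_text()
    except OSError:
        return None
    for line in text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def trim_heap() -> bool:
    """Call glibc malloc_trim(0); return False if the C library lacks it."""
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return False
    libc = ctypes.CDLL(libc_name)
    if not hasattr(libc, "malloc_trim"):
        return False  # for example a non-glibc C library
    libc.malloc_trim.argtypes = [ctypes.c_size_t]
    libc.malloc_trim.restype = ctypes.c_int
    libc.malloc_trim(0)
    return True


def demo(blocks: int = 100_000, size: int = 1024) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Return RSS at peak, after freeing, and after malloc_trim (KiB)."""
    # Blocks larger than 512 bytes bypass CPython's small-object allocator
    # and are served by the C library's malloc.
    data = [bytes(size) for _ in range(blocks)]
    peak = rss_kib()
    pinned = data[-1]  # keep one late allocation alive near the top of the heap
    del data
    after_free = rss_kib()
    trim_heap()
    after_trim = rss_kib()
    del pinned
    return peak, after_free, after_trim
```

Calling demo() and comparing the three numbers shows how much of the freed memory stays resident until the allocator is explicitly asked to release it. The exact values depend on the glibc version and its tunables. Massif reports heap blocks that are still allocated rather than resident pages, so a process can show a high RSS in top while massif shows little live memory.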

michalvasko avatar May 20 '24 07:05 michalvasko

In my test, when I return 1 million fake interfaces, the memory consumption of netopeer2 goes up to 1 GB. After the result is returned, the memory usage stays at 1 GB. I'm afraid this will cause memory problems on the system. I would like the memory to be released after the result is returned.

trentzhou avatar May 20 '24 08:05 trentzhou

You saw what massif reported, so I do not think I can help you in any way.

michalvasko avatar May 20 '24 09:05 michalvasko

I am still confused. Look at the massif report: [image]

After the oper data is returned, it seems netopeer2 is still holding the data. Can the data be freed immediately?

trentzhou avatar May 20 '24 09:05 trentzhou

I am quite certain no unnecessary memory is being held, and if you still have the netopeer2 cache disabled, there is not much else to do.

michalvasko avatar May 20 '24 14:05 michalvasko

Thanks for your information. This issue can be closed now.

trentzhou avatar May 22 '24 02:05 trentzhou