Wilc1000 driver memory leak after connected to AP and disconnected
I used the development board sama5d27-som1-ek1 of microchip to test the Wilc1000-SD module; I compiled wilc's driver as a ko file, loaded and ran it with a shell script like below:
#!/bin/sh
index=0;
echo "Start to Test"
#wifi reset
wifi_reset_action()
{
echo "WIFI Reset"
#remove wilc-sdio driver
sleep 5
rmmod wilc-sdio
#install wilc-sdio driver
insmod /root/wilc-sdio.ko
sleep 3
}
wifi_reset_action
#wifi runing loop
while true
do
echo "ctrl_interface=/var/run/wpa_supplicant" > /tmp/wilc_wpa_supplicant.conf
echo "ap_scan=1" >> /tmp/wilc_wpa_supplicant.conf
wpa_supplicant -iwlan0 -Dnl80211 -f /tmp/ice1001_wpa_log -c /tmp/wilc_wpa_supplicant.conf & echo $! > /tmp/ice_wifi_scan.pid
sleep 3
wpa_cli -i wlan0 scan
sleep 2
wpa_cli scan_result | grep xie_router
if [ $? -ne 0 ];then
echo "can't scan ap Now................., Try reset wifi module"
wifi_reset_action
continue
fi
wpa_cli -p/var/run/wpa_supplicant add_network
wpa_cli -p/var/run/wpa_supplicant set_network 0 ssid '"xie_router"'
wpa_cli -p/var/run/wpa_supplicant set_network 0 psk '"123456789"'
wpa_cli -p/var/run/wpa_supplicant select_network 0
while true
do
#connect failed because wrong password
cat /tmp/ice1001_wpa_log | grep "4-Way Handshake failed"
if [ $? -ne 0 ];then
echo "Waitting"
else
echo "Get Wifi Connected failed because psk key!!!!!"
wpa_cli -i wlan0 remove_network 0
break
fi
#Connection is associated
cat /tmp/ice1001_wpa_log | grep "CTRL-EVENT-CONNECTED"
if [ $? -ne 0 ];then
echo "Waitting"
else
echo "Connected to AP Successful"
wpa_cli -i wlan0 disconnect
wpa_cli -i wlan0 remove_network 0
break
fi
if [ $index -lt 10 ];then
echo "waitting"
else
echo "Try Connectting to AP TimeOut"
wpa_cli -i wlan0 remove_network 0
break;
fi
index=`expr $index + 1`;
sleep 1
done
sleep 1
#quit wpa_supplicant
kill `cat /tmp/ice_wifi_scan.pid` > /dev/null 2>&1
rm /tmp/ice_wifi_scan.pid
rm /tmp/ice1001_wpa_log
rm /tmp/wilc_wpa_supplicant.conf -rf
sleep 20
free >> /wilc_test/test_info.txt
cat /proc/meminfo | grep SUnreclaim >> /wilc_test/test_info.txt
index=0;
done
After running for 12 hours and found that every time the connection to the AP was initiated and then disconnected, there would be a memory leak;
Some record from /wilc_test/test_info.txt like below:
Start----------------->
total used free shared buffers cached
Mem: 119956 26084 93872 156 2464 9628
-/+ buffers/cache: 13992 105964
Swap: 0 0 0
SUnreclaim: 4308 kB
total used free shared buffers cached
Mem: 119956 26112 93844 172 2472 9648
-/+ buffers/cache: 13992 105964
Swap: 0 0 0
SUnreclaim: 4304 kB
......
total used free shared buffers cached
Mem: 119956 34216 85740 344 6468 10168
-/+ buffers/cache: 17580 102376
Swap: 0 0 0
SUnreclaim: 7384 kB
total used free shared buffers cached
Mem: 119956 34200 85756 356 6468 10180
-/+ buffers/cache: 17552 102404
Swap: 0 0 0
SUnreclaim: 7376 kB
total used free shared buffers cached
Mem: 119956 34236 85720 372 6468 10196
-/+ buffers/cache: 17572 102384
Swap: 0 0 0
SUnreclaim: 7384 kB
......
SUnreclaim: 10260 kB
total used free shared buffers cached
Mem: 119956 37596 82360 416 6516 10520
-/+ buffers/cache: 20560 99396
Swap: 0 0 0
SUnreclaim: 10272 kB
total used free shared buffers cached
Mem: 119956 37624 82332 432 6516 10536
-/+ buffers/cache: 20572 99384
Swap: 0 0 0
SUnreclaim: 10280 kB
total used free shared buffers cached
Mem: 119956 37652 82304 448 6516 10552
-/+ buffers/cache: 20584 99372
Swap: 0 0 0
SUnreclaim: 10272 kB
END------>
Please let me know if there have any solution for this problem. Thanks!
Hi @xiebluelike,
The memory usage can increase/decrease based on the usage and caching scheme. Now the driver doesn't maintain any scan results or connect information. All that information is maintained by the cfg80211 subsystem.
To isolate if this issue observed because of continuous connection and disconnection with AP, it is better to check memory usage with the individual operation. In the above sample script, the wpa_supplicant is also killed for each test iteration. Do you observe this same behavior if the wpa_supplicant is not killed during the operation? i.e performing continuous scan/connect/disconnect without killing wpa_supplicant.
Hi @ajaykathat,
Thank you for your reply. I try to scan loop without killing wpa_supplicant, and there is still a memory leak; My testing script just like below:
#!/bin/sh
index=0;
echo "Start to Test"
#wifi reset
wifi_reset_action()
{
echo "WIFI Reset"
#remove wilc-sdio driver
sleep 5
rmmod wilc-sdio
#install wilc-sdio driver
insmod /root/wilc-sdio.ko
sleep 3
}
wifi_reset_action
echo "ctrl_interface=/var/run/wpa_supplicant" > /tmp/wilc_wpa_supplicant.conf
echo "ap_scan=1" >> /tmp/wilc_wpa_supplicant.conf
wpa_supplicant -iwlan0 -Dnl80211 -f /tmp/ice1001_wpa_log -c /tmp/wilc_wpa_supplicant.conf & echo $! > /tmp/ice_wifi_scan.pid
sleep 3
#wifi runing loop
while true
do
wpa_cli -i wlan0 scan
sleep 10
date >> /wilc_test/test_info.txt
free >> /wilc_test/test_info.txt
cat /proc/meminfo | grep SUnreclaim >> /wilc_test/test_info.txt
index=0;
done
Looking forward to your reply. Thanks!
Hi @xiebluelike,
Do you observe any difference in the rate at which memory is getting consumed between the modified and previous script for the same test time.
From the script, it looks like it's just performing the scan without any connect operation. The same can be simulated without wpa_supplicant with changes like below and it will also help to rule out its involvement.
......
#wpa_supplicant -iwlan0 -Dnl80211 -f /tmp/ice1001_wpa_log -c /tmp/wilc_wpa_supplicant.conf &
ifconfig wlan0 up
echo $! > /tmp/ice_wifi_scan.pid
sleep 3
#wifi runing loop
while true
do
iw wlan0 scan
sleep 10
......
done
Actually, during scan operation, the driver doesn't keep any copy of the scan result. It may be worth validating the system memory usage without the scan operation too.
Dear @ajaykathat
I am xiebluelike's teammate. Recently with slabtop i find that the main memory leak is from kmalloc-512 . so i run a test.sh and then turn on /sys/kernel/slab/kmalloc-512/trace and save the trace log. 15 mintes later i analyze the log and find there is a very high probability(142-not-free / 263--total-malloc) of memory leak in cfg_scan_result()->cfg80211_inform_bss_frame()->__kmalloc(), because i can't find its free operate, see below:
(memory leak in 15 mintes is 142x512=71KB, almost match the leak 4M per 13 hours in the test early )
[ 773.390000] TRACE kmalloc-512 alloc 0xc1e61400 inuse=8 fp=0x (null)
[ 773.390000] CPU: 0 PID: 6 Comm: kworker/u2:0 Tainted: G WC O 3.10.108 #40
[ 773.390000] Workqueue: WILC_wq handle_rcvd_ntwrk_info [wilc_sdio]
[ 773.390000] Backtrace:
[ 773.390000] [
Also kmalloc-256 leak memory too, which also mainly from cfg80211_inform_bss_frame, which leak 415-not-free / 2175--total-malloc in 50 mintes, see below:
[28697.450000] TRACE kmalloc-256 alloc 0xc1ef3100 inuse=16 fp=0x (null)^M
[28697.450000] CPU: 0 PID: 19762 Comm: kworker/u2:0 Tainted: G WC O 3.10.108 #40^M
[28697.450000] Workqueue: WILC_wq handle_rcvd_ntwrk_info [wilc_sdio]^M
[28697.450000] Backtrace: ^M
[28697.450000] [
so i think this found may help for you to deal with wilc driver memory leak issue.
Best Regard, mgkiller
Actually the memory allocated by cfg80211_inform_bss_frame() is maintained by cfg80211 subsystem. cfg8011 uses a reference count mechanism to keep track of allocated memory. There is no direct free for that allocated memory. cfg80211_put_bss() call is used to decrement the reference count which is already existed in the driver. Cfg80211 subsystem free the allocated memory based on an aging algorithm which invalidates the entire and frees the expired entries on the fly. The allocated memory may not be freed immediately but should be gradual.
Thank you for reply @ajaykathat Actually i am sure the memory leak from cfg80211_inform_bss_frame() at least 50 mintes not be freed, is that normal? enven thougt i think this leak can not be freed at all. AS you say those allocated memory are freed so gradually , any advise when use wilc module in a MCU with only 64MB internal DRAM?
I don't think cfg80211_inform_bss_frame() will take more then 50mins to free a single BSS entry if that network is not found in a continuous scan. cfg80211_inform_bss_frame() is used to buildup the scan results based on the received beacons and probe response information during scan. For every scan, if a new network is found it will add an entry otherwise update an existing entry. It is a very common code which is used by all the cfg80211 based driver. The best approach to identify the time taken by an entry to be removed is using iw commands
after loading the module try these commands $ ifconfig wlan0 up $ iw wlan0 scan $ iw wlan0 scan dump -u
Giving "iw wlan0 scan dump -u" command again after some time will not give the results when the entries are expired.