Lubuntu

irqbalance memory leak

tosiara  
Hi,

I am experiencing a memory leak on my BananaPi Lubuntu 3.1.1 server.
The irqbalance process eats all available memory within a few days, after which the kernel reports an out-of-memory error. As a workaround I have scheduled a daily restart of irqbalance. On a Cacti graph it looks like this:



Is there a way to permanently fix this?

Thanks
Alpod  
How do you restart it daily? Did you write a script?

tosiara  
I put a script into cron.daily which executes "service irqbalance restart"
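
For readers who want to reproduce the workaround, a minimal sketch of such a cron.daily script follows. The file name /etc/cron.daily/irqbalance-restart, the SERVICE_CMD override and the "only if running" check are assumptions added for illustration; the thread only states that the script runs "service irqbalance restart".

```shell
#!/bin/sh
# Sketch of a hypothetical /etc/cron.daily/irqbalance-restart script.

# Restart irqbalance. The command prefix can be overridden (for example
# with "echo") to see what would be executed without touching the service.
restart_irqbalance() {
    ${SERVICE_CMD:-service} irqbalance restart
}

# Restart only if irqbalance is actually running, so the job is a no-op
# on machines where the daemon has been stopped or disabled.
if pgrep -x irqbalance >/dev/null 2>&1; then
    restart_irqbalance
fi
```

The file has to be executable for cron to pick it up. Calling `SERVICE_CMD=echo restart_irqbalance` prints the command instead of running it.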

tkaiser  
Can you please provide us with the output of 'cat /proc/interrupts' from your system shortly before the service restarts?

tosiara  
Post Last Edited by tosiara at 2014-10-5 10:17

Here are the logs:
irqbalance disabled and stopped: http://pastebin.com/M5N4TnL9
irqbalance enabled and just started: http://pastebin.com/RnKY0g0N
irqbalance running for 12 hours, having leaked about 100 MB: http://pastebin.com/hSL2jEsu

diff:
--- interrupts_irq_disabled.txt 2014-10-05 18:55:00.080659669 +0300
+++ interrupts_irq_900.txt      2014-10-05 18:55:05.149765847 +0300
@@ -1,10 +1,10 @@
            CPU0       CPU1
- 29:   28809517   18943160       GIC  arch_timer
+ 29:   31828467   20846645       GIC  arch_timer
  30:          0          0       GIC  arch_timer
  32:          0          0       GIC  axp_mfd
  33:        209          0       GIC  serial
- 37:      29550          0       GIC  RemoteIR
- 39:   51315660          0       GIC  sunxi-i2c.0
+ 37:      29810          0       GIC  RemoteIR
+ 39:   56968226          0       GIC  sunxi-i2c.0
  40:          0          0       GIC  sunxi-i2c.1
  41:          0          0       GIC  sunxi-i2c.2
  54:          0          0       GIC  timer0
@@ -12,13 +12,13 @@
  56:          0          0       GIC  sunxi-rtc alarm
  59:          0          0       GIC  dma_irq
  60:          0          0       GIC  sunxi-gpio
- 64:   23609285          0       GIC  sunxi-mmc
- 71:  156628257          0       GIC  ehci_hcd:usb1
- 72:   24296502          0       GIC  ehci_hcd:usb3
- 76:   74968772          0       GIC  sunxi lcd0
+ 64:   26340917          0       GIC  sunxi-mmc
+ 71:  173325506          0       GIC  ehci_hcd:usb1
+ 72:   26892046          0       GIC  ehci_hcd:usb3
+ 76:   82951670          0       GIC  sunxi lcd0
  77:          0          0       GIC  sunxi lcd1
  78:          0          0       GIC  g2d
- 79:   37459652          0       GIC  sunxi scaler0
+ 79:   41450949          0       GIC  sunxi scaler0
  80:          0          0       GIC  sunxi scaler1
  88:          0          0       GIC  sw_ahci
  92:          0          0       GIC  ace_dev
@@ -32,9 +32,9 @@
 107:          0          0       GIC  mali_mmu_irq_handlers
 120:          0          0       GIC  sunxi-i2c.3
 IPI0:          0          0  Timer broadcast interrupts
-IPI1:   10893199   28343494  Rescheduling interrupts
+IPI1:   11828993   31203597  Rescheduling interrupts
 IPI2:          0          0  Function call interrupts
-IPI3:        494        489  Single function call interrupts
+IPI3:        569        549  Single function call interrupts
 IPI4:          0          0  CPU stop interrupts
 IPI5:          0          0  CPU backtrace
 Err:          0
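
A comparison like the one above can be produced by saving copies of /proc/interrupts at different times and running diff -u on them. The following is only a sketch: the function names and the timestamped file naming are assumptions, not what was actually used for the posted logs.

```shell
#!/bin/sh
# Save a timestamped copy of an interrupt table and print its path.
# $1: output directory; $2: source file (defaults to /proc/interrupts).
snapshot_interrupts() {
    out="$1/interrupts_$(date +%Y%m%d_%H%M%S).txt"
    cat "${2:-/proc/interrupts}" > "$out" && printf '%s\n' "$out"
}

# Show which counters changed between two snapshots.
# diff exits with status 1 when the files differ, which is expected here.
compare_snapshots() {
    diff -u "$1" "$2"
}
```

Typical use would be `before=$(snapshot_interrupts /tmp)`, waiting a while, `after=$(snapshot_interrupts /tmp)`, then `compare_snapshots "$before" "$after"`.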

tkaiser  
Why don't you disable irqbalance, since it's doing nothing except eating up your memory (all IRQs are served by CPU0)?

tosiara  
I'm not sure how much of a performance penalty there will be if I disable it completely, so I'm hoping to fix the underlying issue.

man irqbalance:
The purpose of irqbalance is distribute hardware interrupts across processors on a multiprocessor system in order to increase performance.

tkaiser  
How would performance be affected if irqbalance seems to do nothing at all (except leak memory, as versions prior to 1.0.7 do on platforms that lack PCI, such as ARM/BananaPi)? Did I miss it, or were any hardware interrupts served by CPU1?
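
One way to check this is to list every row of /proc/interrupts that has a non-zero count in a column other than CPU0. The helper below is a sketch; the function name is made up for illustration, and it assumes the usual layout of a CPU header row followed by "label: count count ... description" rows.

```shell
#!/bin/sh
# Print the label of every interrupt line that has a non-zero count
# in any CPU column other than CPU0.
# $1: interrupt table to read (defaults to /proc/interrupts).
irqs_beyond_cpu0() {
    awk '
        NR == 1 { ncpu = NF; next }          # header row: one field per CPU
        NF > ncpu {                          # skip short rows such as "Err:"
            for (i = 3; i <= ncpu + 1; i++)  # $2 is CPU0, $3 onward the rest
                if ($i + 0 > 0) {
                    label = $1
                    sub(/:$/, "", label)     # drop the trailing colon
                    print label
                    break
                }
        }
    ' "${1:-/proc/interrupts}"
}
```

In the table posted above, only the arch_timer row (29) and the IPI1 and IPI3 rows show counts on CPU1; every device interrupt is handled by CPU0.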

https://lkml.org/lkml/2012/8/4/51
http://comments.gmane.org/gmane.linux.ports.arm.kernel/102251
https://groups.google.com/d/msg/ ... qX9uz8/r3ouEhRxwPMJ

I would try disabling irqbalance entirely and manually reassigning ehci_hcd:usb1 to CPU1 (maybe ehci_hcd:usb3 too), for example by adding the following to /etc/rc.local:
echo 2 > /proc/irq/$(grep 'ehci_hcd:usb1' /proc/interrupts | cut -f 1 -d ":" | tr -d " ")/smp_affinity
In case there's USB storage on ehci_hcd:usb1 you might see a slight increase in performance.
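
The same idea can be written as small helpers that look up the IRQ number by name and compute the affinity bitmask for a given CPU. This is a sketch under assumptions: the function names are invented here, and smp_affinity is taken to accept a hexadecimal CPU bitmask (bit 0 for CPU0, bit 1 for CPU1), which matches the value 2 used for CPU1 above.

```shell
#!/bin/sh
# Look up the IRQ number whose line contains the given name.
# $1: name to search for; $2: interrupt table (defaults to /proc/interrupts).
irq_for_name() {
    awk -v name="$1" '
        index($0, name) { sub(/:$/, "", $1); print $1; exit }
    ' "${2:-/proc/interrupts}"
}

# Hexadecimal smp_affinity bitmask that selects a single CPU (0-based).
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Pin an IRQ (looked up by name) to one CPU. Needs root.
pin_irq() {
    irq=$(irq_for_name "$1")
    [ -n "$irq" ] || { echo "no IRQ found for $1" >&2; return 1; }
    cpu_mask "$2" > "/proc/irq/$irq/smp_affinity"
}
```

With these definitions, `pin_irq ehci_hcd:usb1 1` writes the mask 2 to the matching smp_affinity file, and `pin_irq ehci_hcd:usb3 1` does the same for the second controller.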

tosiara  
There is a web camera attached that serves a network stream. In fact, irqbalance does not seem to leak until the stream is started.
I will try your suggestions and report back. Thanks!

farshad  
Edited by farshad at Sun Apr 19, 2015 07:26

I can confirm this issue. Memory used by irqbalance makes system unusable after 24-48 hrs.
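
To put numbers on the growth, the resident memory of irqbalance can be sampled periodically from /proc. The helpers below are a sketch only; the function names and the log format are assumptions for illustration.

```shell
#!/bin/sh
# Print the resident set size (in kB) of a process, read from /proc.
# $1: process ID.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Append "unix_timestamp rss_kb" for every running irqbalance process.
# $1: log file to append to.
log_irqbalance_rss() {
    for pid in $(pgrep -x irqbalance); do
        printf '%s %s\n' "$(date +%s)" "$(rss_kb "$pid")" >> "$1"
    done
}
```

Running `log_irqbalance_rss` from cron every few minutes yields a simple time series that shows how quickly the process grows.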

Looks like a general Linux issue which is fixed in newer kernels.
