cubicweb-brainomics #2712072 performance issue [done]

We see serious performance issues after a few days of operation of our Brainomics server. SSH and CubicWeb response times are exceedingly slow, on the order of a few minutes. This does not look like a memory issue because swap is barely used. PostgreSQL, however, does eat lots of CPU. For example:

$ top

top - 13:30:41 up 6 days,  2:24,  1 user,  load average: 1.67, 0.76, 0.33
Tasks:  93 total,   2 running,  91 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  7.2%sy,  0.0%ni, 92.3%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   8178396k total,  6973192k used,  1205204k free,    94948k buffers
Swap:  4190204k total,        0k used,  4190204k free,  1113724k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15268 postgres  20   0  133m 8980 6124 R  100  0.1   1:53.08 postgres
15744 brainomi  20   0 17340 1240  936 R    4  0.0   0:01.95 top
 1033 root      20   0 15980  676  500 S    3  0.0   2:28.49 irqbalance
 2162 brainomi  20   0  909m 185m 3064 S    1  2.3  32:10.20 cubicweb-ctl
 2104 postgres  20   0  124m 1648  312 S    0  0.0   2:08.92 postgres
12954 root      20   0     0    0    0 S    0  0.0   0:16.74 kworker/0:1
[...]
$

Has CubicWeb been thoroughly tested on Ubuntu 12.04 LTS and PostgreSQL 9.1? It could be a compatibility issue between CubicWeb and the version of PostgreSQL shipping with Ubuntu 12.04.

For now I have restarted PostgreSQL and CubicWeb:

$ cubicweb-ctl stop localizer
can't kill process 2162
trying SIGKILL
instance localizer stopped
$
$ sudo service postgresql restart
 * Restarting PostgreSQL 9.1 database server                             [ OK ]
$
$ cubicweb-ctl start localizer
instance localizer started
$

How can I debug this issue the next time it occurs? This is the second time it has happened. It is obviously not related to system or CubicWeb updates as I initially imagined - see ticket #2632425.
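One possible starting point, as a rough sketch rather than a tested procedure: capture which processes are burning CPU before restarting anything. The helper below is hypothetical (hot_procs is my own name, not a CubicWeb or PostgreSQL tool) and assumes the default top column layout shown above, with %CPU in column 9 and COMMAND in column 12.

```shell
# Hypothetical helper: print "PID USER %CPU COMMAND" for every process
# at or above a CPU threshold (default 50).
# Reads `top -b -n 1` output on stdin and assumes the default column
# layout (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND).
hot_procs() {
    awk -v min="${1:-50}" '
        # Keep only process rows: first field is a numeric PID.
        $1 ~ /^[0-9]+$/ && NF >= 12 && ($9 + 0) >= min {
            print $1, $2, $9, $12
        }'
}

# Example use (not run here):
#   top -b -n 1 | hot_procs 50
```

For a postgres PID reported this way, the statement it is running can be looked up on PostgreSQL 9.1 in the pg_stat_activity view, for example with SELECT procpid, query_start, current_query FROM pg_stat_activity WHERE procpid = 15268; (these column names are specific to 9.1 and earlier, and the PID is the one from the top output above).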

priority: important
type: bug
done in: <not specified>
load left: 0.000
closed by: <not specified>