How to find the source of increasing memory usage in a Twisted server?

I have an audio broadcasting server written in Python and based on Twisted. It works fine, but its memory usage increases as more users connect to the server and never goes down after those users go offline, as you can see in the following figure:

You can see that the memory usage curve goes up where the listeners/radios curve goes up, but after the listeners/radios peak the memory usage stays high and never goes down.

I have tried the following to solve this problem:

  1. Upgraded Twisted from 8.2 to 9.0.
  2. Used guppy to dump the heap with heapy, but it didn't help at all.
  3. Switched from the select reactor to the epoll reactor; same problem.
  4. Used objgraph to draw a diagram of the relations between objects, but I couldn't learn anything from it.

Here is the environment I use to run my Twisted server:

  • Python: 2.5.4 r254:67916
  • OS: Linux version 2.6.18-164.9.1.el5PAE (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46))
  • Twisted: 9.0 (under virtualenv)

The dump of guppy:

Partition of a set of 116280 objects. Total size = 9552004 bytes.
 Index  Count   %     Size   % Cumulative  % Type
  0  52874  45  4505404  47   4505404  47 str
  1   5927   5  2231096  23   6736500  71 dict
  2  29215  25  1099676  12   7836176  82 tuple
  3   7503   6   510204   5   8346380  87 types.CodeType
  4   7625   7   427000   4   8773380  92 function
  5    672   1   292968   3   9066348  95 type
  6    866   1    82176   1   9148524  96 list
  7   1796   2    71840   1   9220364  97 __builtin__.weakref
  8   1140   1    41040   0   9261404  97 __builtin__.wrapper_descriptor
  9   2603   2    31236   0   9292640  97 int

As you can see, the total size of 9552004 bytes is 9.1 MB. Compare that with the RSS reported by the ps command:

[xxxx@webxx ~]$ ps -u xxxx -o pid,rss,cmd
  PID   RSS CMD
22123 67492 twistd -y broadcast.tac -r epoll

The RSS of my server is 65.9 MB, which means there are 56.8 MB of memory usage that guppy cannot see. What is it?
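For reference, the arithmetic behind those figures can be reproduced with a short sketch. The function name `rss_gap_mb` is made up for illustration; it assumes the heapy total is in bytes and the ps RSS column is in kilobytes.

```python
def rss_gap_mb(heap_bytes, rss_kb):
    """Return (heap_mb, rss_mb, gap_mb) for a heapy total in bytes and an RSS in KB."""
    heap_mb = heap_bytes / (1024.0 * 1024.0)
    rss_mb = rss_kb / 1024.0
    return heap_mb, rss_mb, rss_mb - heap_mb


# Numbers taken from the guppy dump and the ps output above.
heap_mb, rss_mb, gap_mb = rss_gap_mb(9552004, 67492)
print("heapy: %.1f MB, rss: %.1f MB, unaccounted: %.1f MB" % (heap_mb, rss_mb, gap_mb))
```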

My questions are:

  1. How can I find the source of the increasing memory usage?
  2. Which memory usage is visible to guppy?
  3. What is the memory usage that is invisible to guppy?
  4. Is this caused by memory leaks in some modules written in C? If so, how can I trace and fix them?
  5. How does Python manage memory? With a memory pool? I think this might be caused by the audio data chunks, so that there are small leaks in the memory chunks owned by the Python interpreter.

Update 2010/1/20: Interesting. I downloaded the latest log file, and it shows that from a certain moment on the memory stops increasing. I think the allocated memory space may have become big enough. Here is the latest figure.

Update 2010/1/21: Another figure here. Hmm... it rose a little bit.

Oops... it is still going up.

Answers


As I guessed, it is due to memory fragmentation. The original design kept the audio data chunks in a list, and the chunks were not of a fixed size. Once the total size of the buffering list exceeded the buffer limit, it popped some chunks from the top of the list to limit the size. The list might look like this:

  1. chunk size 511
  2. chunk size 1040
  3. chunk size 386
  4. chunk size 1350
  5. ...
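A minimal sketch of that buffering scheme, to make the allocation pattern concrete. The class name `ChunkListBuffer` and the 3000-byte limit are made up for illustration; this is not the server's actual code.

```python
class ChunkListBuffer(object):
    """Keeps variable-size audio chunks in a list, capped at `limit` bytes."""

    def __init__(self, limit):
        self.limit = limit
        self.chunks = []
        self.size = 0

    def write(self, chunk):
        # Every chunk is a separate string object with its own allocation.
        self.chunks.append(chunk)
        self.size += len(chunk)
        # Drop the oldest chunks until the buffer fits the limit again.
        while self.size > self.limit and self.chunks:
            oldest = self.chunks.pop(0)
            self.size -= len(oldest)


buf = ChunkListBuffer(limit=3000)
for n in (511, 1040, 386, 1350):
    buf.write(b"\x00" * n)
print(len(buf.chunks), buf.size)
```

Each write allocates a new block of an arbitrary size and each pop frees one, which is the pattern that matters for the allocator.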

Most of these chunks are bigger than 256 bytes, and Python allocates objects bigger than 256 bytes with malloc rather than from its own memory pool. Now imagine what happens as those chunks are allocated and released over and over. For example, when the chunk of size 1350 is released, it may leave a free 1350-byte hole in the heap. Then a request for 988 bytes arrives; if malloc picks that hole, a new, smaller free hole of 362 bytes is left behind. After the server has been running for a long time, there are more and more small holes in the heap; in other words, the heap becomes heavily fragmented. The virtual memory page size is usually 4 KB, and these fragments are scattered over a wide range of the heap, so nearly every page still contains live data and the OS cannot reclaim it. Thus, the RSS stays high.
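The hole-splitting step can be illustrated with a toy first-fit free list. This is only a model for the arithmetic above; a real malloc implementation uses size bins and coalescing and behaves differently in detail. The function name `first_fit` is made up for illustration.

```python
def first_fit(holes, request):
    """Serve `request` bytes from the first hole big enough.

    Returns the size of the leftover hole (0 if the hole was used up),
    or None if no hole fits.
    """
    for i, hole in enumerate(holes):
        if hole >= request:
            leftover = hole - request
            if leftover:
                holes[i] = leftover   # the hole shrinks to the unused tail
            else:
                del holes[i]          # the hole was consumed exactly
            return leftover
    return None  # nothing fits; a real allocator would have to grow the heap


holes = [1350]                 # the freed 1350-byte chunk
print(first_fit(holes, 988))   # leaves a 362-byte hole
print(first_fit(holes, 511))   # too big for the remaining hole
print(holes)
```

In this model the 362-byte hole stays unused until a request of that size or smaller arrives, which is how small holes pile up over time.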

After I changed the design of the audio chunk management module of my server, it uses much less memory. You can see the figure and compare it with the previous one.

The new design uses a bytearray rather than a list of strings. It is one big chunk of memory, so there is no more fragmentation.
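One way such a design could look is a ring buffer over a single preallocated bytearray. This is a sketch under assumptions, not the server's actual module: the class name `RingBuffer`, its methods, and the overwrite-oldest policy are all made up for illustration. Note that bytearray is available from Python 2.6 onward.

```python
class RingBuffer(object):
    """Fixed-capacity byte buffer that overwrites the oldest data when full."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)  # allocated once and reused
        self.capacity = capacity
        self.start = 0   # index of the oldest valid byte
        self.size = 0    # number of valid bytes

    def write(self, data):
        # Keep only the newest `capacity` bytes if data is larger than the buffer.
        data = bytearray(data)[-self.capacity:]
        n = len(data)
        end = (self.start + self.size) % self.capacity
        first = min(n, self.capacity - end)
        # Copy in place; equal-length slice assignment never resizes the buffer.
        self.buf[end:end + first] = data[:first]
        self.buf[0:n - first] = data[first:]
        overflow = self.size + n - self.capacity
        if overflow > 0:
            # The oldest `overflow` bytes were overwritten.
            self.start = (self.start + overflow) % self.capacity
            self.size = self.capacity
        else:
            self.size += n

    def read_all(self):
        end = self.start + self.size
        if end <= self.capacity:
            return bytes(self.buf[self.start:end])
        return bytes(self.buf[self.start:] + self.buf[:end - self.capacity])


rb = RingBuffer(8)
rb.write(b"abcde")
rb.write(b"fghij")
print(rb.read_all())
```

The underlying storage is allocated once, so steady-state writes do not ask malloc for new variable-size blocks for the buffered data itself.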

