The 0.2 release adds a dependency on Speex (libspeex1 and the corresponding GStreamer plugin in gstreamer0.10-plugins-good-kulve). Speex is compiled from SVN trunk.
I ran some light performance tests (I only looked at the numbers and did no playback). The two tables below show how many seconds it took to encode a 60-second audio clip (8000 Hz, mono, 16-bit). The first column shows the complexity setting (scale 0-10) and the remaining columns show the encoding time at different quality settings (scale 0-10).
no VBR, no DTX, no VAD:
c/q   1   2   3   5   7
0    12   8  10  14  11
1    12   9  10  15  11
2    13   8  12  16  16
3    13  10  15  22  19
4    13  10  16  23  20
VBR, DTX, VAD:
c/q   1   2   3   5   7
0    10  10  11  14  16
1     9  10  13  14  16
2    10  12  13  15  18
3    11  13  15  19  22
4    11  13  17  20  24
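To put these numbers in perspective, each encoding time can be divided by the 60-second clip length to get the share of real time the encoder needs. Here is a minimal Python sketch of that arithmetic using the VBR/DTX/VAD table; the function and variable names are made up for illustration.

```python
CLIP_SECONDS = 60  # length of the test clip

QUALITIES = [1, 2, 3, 5, 7]  # the quality columns that were measured

# Encoding times in seconds with VBR, DTX and VAD on, keyed by complexity.
VBR_TIMES = {
    0: [10, 10, 11, 14, 16],
    1: [9, 10, 13, 14, 16],
    2: [10, 12, 13, 15, 18],
    3: [11, 13, 15, 19, 22],
    4: [11, 13, 17, 20, 24],
}


def realtime_share(encode_seconds, clip_seconds=CLIP_SECONDS):
    """Fraction of real time spent encoding (1.0 means it only just keeps up)."""
    return encode_seconds / clip_seconds


def share_table(times):
    """Map each (complexity, quality) pair to its real-time share."""
    return {
        (complexity, quality): realtime_share(seconds)
        for complexity, row in times.items()
        for quality, seconds in zip(QUALITIES, row)
    }


shares = share_table(VBR_TIMES)
for (complexity, quality), share in sorted(shares.items()):
    print(f"complexity {complexity}, quality {quality}: {share:.0%} of real time")
```

In this table the most expensive measured setting (complexity 4, quality 7) needs 24 seconds for 60 seconds of audio, which is 40% of real time.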
The next table shows the difference between fixed point and floating point versions.
c   fixed  float
0      11     17
1      11     18
2      13     20
3      19     29
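The gap is easier to judge as a ratio. This short Python sketch divides the floating point time by the fixed point time for each complexity level; the names are illustrative only.

```python
# Encoding times from the table above: complexity -> (fixed, float).
TIMES = {
    0: (11, 17),
    1: (11, 18),
    2: (13, 20),
    3: (19, 29),
}


def float_overhead(fixed_time, float_time):
    """How many times longer the floating point run took."""
    return float_time / fixed_time


for complexity, (fixed_time, float_time) in TIMES.items():
    ratio = float_overhead(fixed_time, float_time)
    print(f"complexity {complexity}: float takes {ratio:.2f}x as long")
```

For these measurements the floating point runs take roughly 1.5 to 1.6 times as long as the fixed point ones.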
So use the fixed point Speex API if possible (the GStreamer plugin doesn't use it). If you use the floating point API, libspeex does the conversion itself.
For VoIP, a good setting could be quality 7 and complexity 2, with VBR, VAD, and DTX on.
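As a rough illustration, the suggested settings can be written as an element description in gst-launch syntax. The sketch below assumes the property names of the stock speexenc element (quality, complexity, vbr, vad, dtx); the patched plugin build mentioned above may differ, and the helper function is hypothetical.

```python
# Suggested VoIP settings from the text. The keys are assumed to match
# the property names of the stock speexenc element.
VOIP_SETTINGS = {
    "quality": 7,
    "complexity": 2,
    "vbr": True,
    "vad": True,
    "dtx": True,
}


def element_description(element, properties):
    """Format an element name and its properties in gst-launch syntax."""
    parts = [element]
    for name, value in properties.items():
        if isinstance(value, bool):
            # gst-launch expects lowercase true/false for booleans
            value = "true" if value else "false"
        parts.append(f"{name}={value}")
    return " ".join(parts)


print(element_description("speexenc", VOIP_SETTINGS))
```

The resulting string can be dropped into a longer pipeline description between an 8000 Hz mono source and a sink.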