I’ve been wanting to run things on OMAP‘s DSP processor since the day I tested an OMAP1510 development board some years ago. Finally after a Proof of concept G.711 dsptask mail by Simon Pickering I decided to cut’n’paste that and try.
I knew the Speex had already been ported to the DSP (TMS320C55) used by OMAP1 (770) and OMAP2 (n800) so I started porting it to the DSP Gateway to be able to use it in my n800.
The DSP Gateway’s DSP tasks are seen as
/dev/dsptask/<taskname> devices on ARM side and are interacted with read, write, and ioctl calls. Very linux stylish and convenient in my opinion. On the DSP code one struct defines callbacks for these calls. The code overhead on DSP side is some tens of lines.
speex_alloc functions that can be overwritten if the default memory allocation is not suitable. In the existing DSP port these were overwritten with functions using preallocated global char arrays that are used for all allocations, freeing is not possible. The DSP char is 16 bits as is int. The long is 32 bits and a pointer is 23 bits. Long long is 40 bits. I had some problems with calculating with pointers (casting a pointer to long didn’t produce valid looking value) in the existing implementation so I rewrote the allocation functions to use arrays and indexes. Also the existing implementation printed pointer values as %x which seemed to print completely different values than %p. I did use dbg() however, not fprintf.
The dbg() prints will be shown in dmesg or syslog on the ARM side if the kernel is compiled with mbox debug. The n800 stock kernel is not, but it’s almost trivial to compile a kernel of your own. My kernel with mbox debugs and built-in ext3 is here.
Other than the allocation functions there was not much to port, couple of functions calling the speex’ init and encode functions. I did have an odd problem though: the DSP seemed to jam after some tens of seconds and the DSP kernel was rebooted. After I added
poll_disable(task) in my encoding callback function, the jamming was gone. The function will disable the polling for 10 seconds. Afaik, that should be called only if the completion of the function takes more that 10 seconds, which it doesn’t.
My speex patch against the git head is available, if somebody wants to test it. There can be some weirdness as this is my first test with the DSP stuff and I don’t even know how to use the git ;)
If my measurements are correct it takes roughly 40 ms to encode one 20 ms frame without compiler optimizations. With -o3 it takes a bit over 20 ms. The speex is meant to be portable and has a header defining some math functions. The functions are ported to different CPUs like ARMv5 and Blackfin. Maybe some DSP guru quickly ports those to the TMS320C55 too? :)
I think the best place for DSP information for the tablets is in maemo.org wiki as I hope it will be kept up to date with the latest projects and tweaks. One thing to be added there would be a note about the skeleton page. I used the Active Block Receiving and Active Block Sending.