Xcore 200 Issue with 3 Bytes per sub slot

Technical questions regarding the XTC tools and programming with XMOS.
MarcAntigny
Active Member
Posts: 61
Joined: Tue Feb 13, 2018 2:50 pm

Xcore 200 Issue with 3 Bytes per sub slot

Post by MarcAntigny »

Hi,
I'm working with the xCORE-200 MC Audio dev kit and the reference software. As mentioned in an earlier topic, we want to use 40 USB channels, and to reach that channel count we need to use 3 bytes per sub slot.
First, I tried 3 bytes per sub slot with only 32 channels, just to validate the 3-byte sub slot configuration on its own.
With that change I get a problem on the PLAY path (computer > USB > XMOS): the output data is periodically chopped (see the capture below, where the input is a sawtooth).
[Image: scope capture of the output - sawtooth with periodic runs of zero samples]
The period of this chopping is 56 samples at 48 kHz (1.17 ms), so there are 56 samples of zero, then 56 samples of data, and so on.
The time taken for each frame (one sample for all the channels) is below the maximum time per sample (20.8 us at 48 kHz). However, the time taken per frame varies periodically.

I didn't modify the reference software, except for the VENDOR_ID/DEVICE_ID/USB descriptors (to work with our Thesycon driver) and the define that selects 3 bytes per sub slot.
With 30 channels at 3 bytes per sub slot, or 32 channels at 4 bytes per sub slot, everything works fine. However, for 40 channels we have to use 3 bytes per sub slot (and that is where the problem appears).
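Roughly, the changes look like this (a sketch only; the channel-count defines are the ones I remember from the reference software, but the VID/PID values and the sub-slot macro name here are placeholders, not the actual defines):

/* customdefines.h - sketch, not copied from the real project */
#define VENDOR_ID                 0xABCD   /* placeholder: the VID used with our Thesycon driver */
#define PID_AUDIO_2               0x1234   /* placeholder PID */

#define NUM_USB_CHAN_OUT          32       /* 40 for the final target */
#define NUM_USB_CHAN_IN           32

/* Hypothetical name for the define selecting 3-byte (24-bit) sub slots */
#define OUTPUT_SUBSLOT_BYTES      3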

Thanks for any help. I created this new topic (rather than continuing the old one) because the issue is linked directly to the reference software, which is stated to support 32 channels.

Marc


infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Looking at your waveform, it looks like underflow, i.e. samples are being provided too slowly, so the buffer underflows and outputs zeros. That doesn't really make sense though... Is the overall period of the sawtooth what you'd expect? If not, perhaps the host isn't keeping up!?

I think this will require greater visibility - possibly code profiling (adding timers or wiggling an IO pin is your friend). What we do know is that the reference design definitely handles endpoints with 10 x 32-bit channels at 192 kHz, which is 61.4 Mbps. This config is failing at 36.9 Mbps, so the problem IS NOT USB transfer. Are you able to profile the unpacking code to see how long extracting the channels takes?
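If it helps, here's a minimal sketch (plain C) of the kind of probe I mean; read_timer_ticks() is a made-up stand-in for reading the xCORE 100 MHz reference timer (in XC you'd declare a timer resource and read it with :>), not a real API:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper standing in for a read of the 100 MHz reference timer
   (10 ns per tick) - swap in whatever timer access your code already has. */
extern uint32_t read_timer_ticks(void);

void profile_unpack(void)
{
    uint32_t start = read_timer_ticks();

    /* ... the code under test, e.g. one frame's worth of sample unpacking ... */

    uint32_t elapsed = read_timer_ticks() - start;  /* unsigned subtraction handles wrap */
    printf("unpack: %lu ticks = %lu ns\n",
           (unsigned long)elapsed, (unsigned long)elapsed * 10);
}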

The way it is architected, the I2S task sends a token to decouple, which fires an interrupt. Decouple then extracts samples one by one and sends them to I2S, so I2S is blocked until the last sample arrives. The interrupt takes about 400 ns and you have roughly half a frame (1/48000 = 20.83 us, so about 10.4 us) to get all of the data across. If you don't, I2S will get stretched. An easy way to check this is to look at LRCLK and see whether it deviates from exactly 48 kHz.

I assume you are using I2S (not TDM)? If you use TDM you only have about 1/8 of the cycle to do the exchange with decouple, because the task needs to tend to its I/O duties (the IO ports only buffer 32 bits at a time, 64 at a pinch if you schedule carefully).

Extracting (getting 3-byte slots for) 40 samples in 10.4 us gives you 260 ns per sample. At 62.5 MIPS, that's 16 instructions per extraction. That may be getting close, but it should be possible.
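For context, the extraction is essentially repacking 3-byte samples into left-justified 32-bit words. A minimal sketch in plain C (not the actual reference code; the buffer names are made up):

#include <stdint.h>

/* Unpack n_samples packed little-endian 24-bit samples from 'src' into
   left-justified (MSB-aligned) 32-bit words in 'dst', as I2S expects. */
static void unpack_24bit(const uint8_t *src, int32_t *dst, unsigned n_samples)
{
    for (unsigned i = 0; i < n_samples; i++) {
        uint32_t s = (uint32_t)src[3*i]
                   | ((uint32_t)src[3*i + 1] << 8)
                   | ((uint32_t)src[3*i + 2] << 16);
        dst[i] = (int32_t)(s << 8);   /* shift into the top 24 bits, sign ends up in bit 31 */
    }
}

A handful of loads, shifts and ORs per sample like that should fit the ~16-instruction budget, but byte-wise accesses can cost more than you'd expect, which is why profiling the real loop is worthwhile.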

One thing worth trying is to switch on the MIXER (with MAX_MIX_COUNT = 0). That effectively puts a thread between audio and decouple, which lets decouple take a whole sample period and serve samples to audio instantly, making its life easier too.
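In customdefines.h terms (assuming your build still configures this in the same place; check your release), that's roughly:

/* Enable the mixer core but with zero actual mixes - it then just forwards
   samples, giving decouple a full sample period of slack. */
#define MIXER          1
#define MAX_MIX_COUNT  0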