Integer clock cycle slip with AVB-DC + Mac Aggregate

Technical discussions related to any XMOS development kit or reference design. Eg XK-1A, sliceKIT, etc.
User avatar
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Integer clock cycle slip with AVB-DC + Mac Aggregate

Post by akp »

Hello

I am observing integer clock cycle slip between my two AVB-DC boards from time to time. I am running them in a daisy chain and using the two as an aggregate 8 in / 8 out device with MacBook Pro running El Capitan. When they first come up they are synchronized; I am testing synchronization by putting the same sinusoid into both XA-SK-AUDIO-PLL boards and then recording to Audacity and viewing the zero crossings are coincident.

However, if I go away and come back and repeat the test, I can sometimes see integer cycle slips between the two boards. What I mean is, the zero crossing from one board will be delayed by an integer number of sampling periods vis-a-vis the other board.

I believe I have found part of the cause: when I view the LRCLK sync between the two boards on an oscilloscope, I have noticed that sometimes they will become unlocked and then repeat the locking process. If this happens, then the audio recorded on the MacBook will show the clock cycle slip issue. I can fix it by rebooting the MacBook.

Has anyone witnessed this behaviour and (if so) is there a good fix for it (other than rebooting the Mac)? I hope to extend this system to a much larger number of channels (boards) and I need to maintain synchronization to the (sub) clock cycle. Clock cycle slips will cause inaccuracies in the analysis we want to do on the recorded audio.

I have attached a screen shot showing the cycle slip, this was with a 1kHz sine wave put into both boards.
Attachments
5_clock_slip.png
Image showing 5 clock cycle slip between two AVB-DC boards
(13.24 KiB) Not downloaded yet
5_clock_slip.png
Image showing 5 clock cycle slip between two AVB-DC boards
(13.24 KiB) Not downloaded yet


User avatar
johned
XCore Addict
Posts: 185
Joined: Tue Mar 26, 2013 12:10 pm
Contact:

Post by johned »

Hi akp,
Thank you very much for raising this.
We will take a look and get back to you shortly.
Best regards,
John
User avatar
infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am
Contact:

Post by infiniteimprobability »

One thing definitely worth doing is running the endpoints from the host (xrun from command line or run from the gui) with xscope enabled. You will see a whole load of useful debug print outputs which will report on the state of ptp roles, status as well as stream info and media clock diagnostics.

Would you be able to capture those and post the results? That will be the first step in getting some visibility of the system in order to debug it.
User avatar
larry
Respected Member
Posts: 275
Joined: Fri Mar 12, 2010 6:03 pm

Post by larry »

If you play a sinewave to an aggregate device on the Mac that includes both boards, do you get synchronised signal on output jacks of both boards?

It would be interesting to establish if it's timestamps of the two streams that are off, or the host's actual presentation times.

As infiniteimprobability suggested, console output might give you insights, as well as packet trace using Wireshark or tcpdump. Do you have that?
User avatar
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

I played an 8ch sine wave to the boards when they had a sample offset and recorded it back (analog loopback of board 0 out->in and board 1 out->in). I thought I might see 2x the delay but in fact the delay didn't change so it seems like the playback is sample synchronized even when there is a record delay. At least in the one test I did.

I haven't tried a packet trace. I will try to get this data.

Here is a log of two boards' xscope and my observations when the sample delay changed from 0 to 12 samples on record. There was nothing interesting (that I saw) that corresponded to the lrclks unlocking / relocking on my scope. If I had been thinking I would have verified my statement above about the clock delays being the same whether recording in analog loopback or from an external sine wave. Something to do next time.

One other note is sometimes the recording is offset by an integer number of lrclks even when I first power up the system. I have seen delays of 1 cycle (not a big deal) to 5 cycles (a bigger deal) in this case. But the longer delays like the 12 cycle delay I just saw seem to correspond to unlock / relock of the lrclk.

I have my observations first, then the board 0 log followed by the board 1 log. The only change I have in my software vis-a-vis 1.0.6rc1 is that I have corrected the code to work properly with El Capitan per a fix posted by infiniteimprobability

0min xrun with xscope enabled
0min power on macbook
1min lrclks locked
2min zero crossings coincident when recording 1kHz sine wave in Audacity
3min oscope shows lrclks unlock and then relock
3min zero crossings on board connected to xtag id 1 delayed 12 clock cycles when recording 1kHz sine wave in Audacity
6min stop xrun

xtag id 0 log:
in>xrun --id 0 --io --xscope app_daisy_chain.xe
PTP Port 0 Role: Master
PTP Port 1 Role: Master
Setting clock source: INPUT_STREAM_DERIVED
MAAP reserved Talker stream #0 address: 91:E0:F0:0:EA:8A
PTP Port 0 Role: Slave
PTP Port 1 Role: Master
PTP sync locked
1722.1 Controller 0FE109532 acquired entity
CONNECTING Talker stream #0 (22970064C00000) -> Listener A9:20:66:10:95:32:0:0
MSRP: Register stream request 229700:64C00000
Added stream:
ID: 22970064C00000
DA:91:E0:F0:0:EA:8A:
max size: 224
interval: 1
Talker stream #0 ready
CONNECTING Listener sink #0 -> Talker stream A820661095320000, DA: 91:E0:F0:0:41
:1E
Listener sink #0 chan map:
0 -> 0
1 -> 1
2 -> 2
3 -> 3
1722 router: Enabled map for stream A820661095320000 (link_num:0, hash:0)
MSRP: Register attach request A8206610:95320000
Added stream:
ID: A820661095320000
Added stream:
ID: 22970064C00000
Increasing port 0 shaper bandwidth to 17024000
Setting stream 0 1722 TX port to 0
Talker stream #0 on
Added stream:
ID: A820661095320000
DA:91:E0:F0:0:41:1E:
max size: 144
interval: 1
AP: MSRP_LISTENER 95320000:1:1:0 MRP_EVENT_DUMMY: MRP_DISABLED -> MRP_UN
USED
Media output 0 locked: 12 samples shorter
Media output 1 locked: 11 samples shorter
Media output 2 locked: 12 samples shorter
Media output 3 locked: 11 samples shorter
Added stream:
ID: 22970064C40000
DA:91:E0:F0:0:DA:F0:
max size: 224
interval: 1
Added stream:
ID: 22970064C40000
Increasing port 0 shaper bandwidth to 34112000
1722 router: Enabled forwarding for stream 22970064C40000
Added stream:
ID: A820661095320001
DA:91:E0:F0:0:A9:4D:
max size: 144
interval: 1
Added stream:
ID: A820661095320001
Increasing port 1 shaper bandwidth to 11968000
1722 router: Enabled forwarding for stream A820661095320001

xtag id 1 log:
in>xrun --id 1 --io --xscope app_daisy_chain.xe
PTP Port 0 Role: Master
PTP Port 1 Role: Master
Setting clock source: INPUT_STREAM_DERIVED
PTP Port 1 Role: Slave
PTP Port 0 Role: Master
PTP sync locked
MAAP reserved Talker stream #0 address: 91:E0:F0:0:DA:F0
PTP threw away Sync outlier (master_diff 184068760)
PTP Port 1 Role: Slave
PTP Port 0 Role: Master
PTP sync locked
Added stream:
ID: 22970064C00000
DA:91:E0:F0:0:EA:8A:
max size: 224
interval: 1
Added stream:
ID: A820661095320000
DA:91:E0:F0:0:41:1E:
max size: 144
interval: 1
1722.1 Controller 0FE109532 acquired entity
CONNECTING Talker stream #0 (22970064C40000) -> Listener A9:20:66:10:95:32:0:1
MSRP: Register stream request 229700:64C40000
Added stream:
ID: 22970064C40000
DA:91:E0:F0:0:DA:F0:
max size: 224
interval: 1
Talker stream #0 ready
CONNECTING Listener sink #0 -> Talker stream A820661095320001, DA: 91:E0:F0:0:A9
:4D
Listener sink #0 chan map:
0 -> 0
1 -> 1
2 -> 2
3 -> 3
1722 router: Enabled map for stream A820661095320001 (link_num:0, hash:0)
MSRP: Register attach request A8206610:95320001
Added stream:
ID: A820661095320001
Added stream:
ID: A820661095320001
DA:91:E0:F0:0:A9:4D:
max size: 144
interval: 1
Added stream:
ID: 22970064C40000
AP: MSRP_LISTENER 95320001:0:1:0 MRP_EVENT_DUMMY: MRP_DISABLED -> MRP_UN
USED
Increasing port 1 shaper bandwidth to 17024000
Setting stream 0 1722 TX port to 1
Talker stream #0 on
Media output 0 locked: 13 samples shorter
Media output 1 locked: 13 samples shorter
Media output 2 locked: 13 samples shorter
Media output 3 locked: 13 samples shorter
Attachments
12_sample_delay.png
(158.03 KiB) Not downloaded yet
12_sample_delay.png
(158.03 KiB) Not downloaded yet
User avatar
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

I have measured another sample delay change event with a bit more information now.

using an externally supplied 1kHz tone:
sample delay starts at 1 sample time, changes, and ends at 7 sample times
when it changed I witnessed LRCLK unlock / relock on my scope
changing to loopback of 1kHz tone:
looping back board 0 to board 0 and board 1 to board 1 still shows 7 sample clocks delay
looping back board 0 to board 1 and board 1 to board 0 still shows 7 sample clocks delay
... so that says to me the delay issue is only in record, not playback

I was able to record the audio during the event and, by graphing the zero crossings delta-t, it clearly shows how the LRCLK unlocks and relocks, changing from a 1 sample delay to a 7 sample delay. I will attach this file (it is a pdf output from GNU Octave plot). The event occurred 24.5 seconds after I started recording an externally supplied 1kHz tone in Audacity. I started recording pretty soon after the boards were recognized by the MacBook.

I also recorded wireshark during this event so I can attach that as well.

I am pretty sure the operating scenario was:
1. Start with MacBook powered down and AVB-DC devices stopped --> AVB-DC devices are in a simple daisy chain: MacBook - AVB-DC 0 - AVB-DC 1 without any other switches, etc.
2. Power up MacBook
3. Start Wireshark capture
4. xrun xscope on id 0 then id 1 (run from my Windows machine)
5. Wait for AVB-DC devices to be recognized
6. Start record in Audacity with externally supplied 1kHz tone
7. Witness LRCLK unlock / lock on scope
8. Stop Audacity record
9. Start Audacity record with same board analog loopback of 1kHz tone (board 0 in 0 records board 0 out 0, board 1 in 0 records board 1 out 0)
10. Start Audacity record with opposite board analog loopback of 1kHz tone (board 0 in 0 records board 1 out 0, board 1 in 0 records board 0 out 0)

Here are the xscope logs... I still don't see much of interest. Hope they are helpful to you.

xtag id 0 log:
in>xrun --id 0 --io --xscope app_daisy_chain.xe
PTP Port 0 Role: Master
PTP Port 1 Role: Master
Setting clock source: INPUT_STREAM_DERIVED
MAAP reserved Talker stream #0 address: 91:E0:F0:0:D7:9
PTP Port 0 Role: Slave
PTP Port 1 Role: Master
PTP sync locked
1722.1 Controller 0FE109532 acquired entity
CONNECTING Talker stream #0 (22970064C00000) -> Listener A9:20:66:10:95:32:0:0
MSRP: Register stream request 229700:64C00000
Added stream:
ID: 22970064C00000
DA:91:E0:F0:0:D7:9:
max size: 224
interval: 1
Talker stream #0 ready
CONNECTING Listener sink #0 -> Talker stream A820661095320000, DA: 91:E0:F0:0:B6
:95
Listener sink #0 chan map:
0 -> 0
1 -> 1
2 -> 2
3 -> 3
1722 router: Enabled map for stream A820661095320000 (link_num:0, hash:0)
MSRP: Register attach request A8206610:95320000
Added stream:
ID: A820661095320000
Added stream:
ID: 22970064C00000
Increasing port 0 shaper bandwidth to 17024000
Setting stream 0 1722 TX port to 0
Talker stream #0 on
Added stream:
ID: A820661095320000
DA:91:E0:F0:0:B6:95:
max size: 144
interval: 1
AP: MSRP_LISTENER 95320000:1:1:0 MRP_EVENT_DUMMY: MRP_DISABLED -> MRP_UN
USED
Media output 0 locked: 12 samples shorter
Media output 1 locked: 11 samples shorter
Media output 2 locked: 12 samples shorter
Media output 3 locked: 11 samples shorter
Added stream:
ID: 22970064C40000
DA:91:E0:F0:0:DD:74:
max size: 224
interval: 1
Added stream:
ID: 22970064C40000
Increasing port 0 shaper bandwidth to 34112000
1722 router: Enabled forwarding for stream 22970064C40000
Added stream:
ID: A820661095320001
DA:91:E0:F0:0:EC:F7:
max size: 144
interval: 1
Added stream:
ID: A820661095320001
Increasing port 1 shaper bandwidth to 11968000
1722 router: Enabled forwarding for stream A820661095320001
PTP threw away Sync outlier (master_diff 250383680)
PTP threw away Sync outlier (master_diff 250840800)
PTP threw away Sync outlier (master_diff 250466080)
PTP threw away Sync outlier (master_diff 250496960)
PTP threw away Sync outlier (master_diff 250355040)
PTP threw away Sync outlier (master_diff 250773600)
PTP threw away Sync outlier (master_diff 250202880)
PTP threw away Sync outlier (master_diff 250432160)
PTP threw away Sync outlier (master_diff 250527200)
PTP threw away Sync outlier (master_diff 250620800)
PTP threw away Sync outlier (master_diff 250565280)
PTP threw away Sync outlier (master_diff 250602720)
PTP threw away Sync outlier (master_diff 250717120)
PTP threw away Sync outlier (master_diff 251001280)
PTP threw away Sync outlier (master_diff 251110080)
PTP threw away Sync outlier (master_diff 250865760)
PTP threw away Sync outlier (master_diff 250885280)
PTP threw away Sync outlier (master_diff 250327200)
PTP threw away Sync outlier (master_diff 250582560)
PTP threw away Sync outlier (master_diff 251157280)
PTP threw away Sync outlier (master_diff 250503200)
PTP threw away Sync outlier (master_diff 250419040)
PTP threw away Sync outlier (master_diff 250826560)
PTP threw away Sync outlier (master_diff 250365920)
PTP threw away Sync outlier (master_diff 250625920)
PTP threw away Sync outlier (master_diff 250871680)
PTP threw away Sync outlier (master_diff 250744000)
PTP threw away Sync outlier (master_diff 250578720)
PTP threw away Sync outlier (master_diff 250822880)
PTP threw away Sync outlier (master_diff 250714080)
PTP threw away Sync outlier (master_diff 251464960)

xtag id 1 log:
in>xrun --id 1 --io --xscope app_daisy_chain.xe
PTP Port 0 Role: Master
PTP Port 1 Role: Master
Setting clock source: INPUT_STREAM_DERIVED
PTP Port 1 Role: Slave
PTP Port 0 Role: Master
MAAP reserved Talker stream #0 address: 91:E0:F0:0:DD:74
PTP sync locked
PTP threw away Sync outlier (master_diff 212967800)
PTP Port 1 Role: Slave
PTP Port 0 Role: Master
PTP sync locked
Added stream:
ID: A820661095320000
DA:91:E0:F0:0:B6:95:
max size: 144
interval: 1
Added stream:
ID: 22970064C00000
DA:91:E0:F0:0:D7:9:
max size: 224
interval: 1
1722.1 Controller 0FE109532 acquired entity
CONNECTING Talker stream #0 (22970064C40000) -> Listener A9:20:66:10:95:32:0:1
MSRP: Register stream request 229700:64C40000
Added stream:
ID: 22970064C40000
DA:91:E0:F0:0:DD:74:
max size: 224
interval: 1
Talker stream #0 ready
CONNECTING Listener sink #0 -> Talker stream A820661095320001, DA: 91:E0:F0:0:EC
:F7
Listener sink #0 chan map:
0 -> 0
1 -> 1
2 -> 2
3 -> 3
1722 router: Enabled map for stream A820661095320001 (link_num:0, hash:0)
MSRP: Register attach request A8206610:95320001
Added stream:
ID: A820661095320001
Added stream:
ID: A820661095320001
DA:91:E0:F0:0:EC:F7:
max size: 144
interval: 1
Added stream:
ID: 22970064C40000
AP: MSRP_LISTENER 95320001:0:1:0 MRP_EVENT_DUMMY: MRP_DISABLED -> MRP_UN
USED
Increasing port 1 shaper bandwidth to 17024000
Setting stream 0 1722 TX port to 1
Talker stream #0 on
Media output 0 locked: 13 samples shorter
Media output 1 locked: 12 samples shorter
Media output 2 locked: 13 samples shorter
Media output 3 locked: 12 samples shorter
Attachments
avb_1khz_delay_1to7.pcapng.gz
Wireshark trace including time when sample delay change event occurred
(114.74 KiB) Downloaded 332 times
avb_1khz_delay_1to7.pcapng.gz
Wireshark trace including time when sample delay change event occurred
(114.74 KiB) Downloaded 332 times
sample_delay_change_event.pdf
Graph showing timing of sample delay change event
(362.89 KiB) Downloaded 360 times
sample_delay_change_event.pdf
Graph showing timing of sample delay change event
(362.89 KiB) Downloaded 360 times
User avatar
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

Another data point for you:

The sampling synchronization slip looks to me like it is being driven by the PLL_SYNC signal. I did a measurement just now where the sampling clocks of both boards started perfectly synchronized, but ended with one of the boards advanced by 7 clock cycles after about a minute or so of being recognized by the MacBook. That is, when I started recording the zero crossings on board 0 and board 1 were coincident, but when I ended about a minute later the zero crossings on board 1 had advanced by 7 cycles with respect to board 0.

In this test, rather than viewing the LRCLK, I viewed the PLL_SYNC signal. The skew between board 0 PLL_SYNC and board 1 PLL_SYNC started at ~250us when the sampling was synchronized. So they were frequency locked. Then, about one minute into my recording, the 2 PLL_SYNC signals lost frequency lock, and when they had regained frequency lock the skew had increased to ~370us. When I stopped my recording this showed a sampling clock cycle slip event at the same time, going from 0 sample offset to 7 sample advance.

This says to me that something in the embedded software must think it needs to drive the PLL_SYNC at a different rate for a while. However, I see no xscope messages proclaiming why it is doing this (or even if it 'knows' it is doing it, not to imbue too much sentience to the processor).

Hope this is helpful.
User avatar
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

I'm at a loss with this, is there any more information I can provide to help with this topic?
User avatar
larry
Respected Member
Posts: 275
Joined: Fri Mar 12, 2010 6:03 pm

Post by larry »

From the console logs it looks like both boards recover media clocks (INPUT_STREAM_DERIVED). Where is source of the media clock? On the Mac?
User avatar
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

Yes, I believe it is on the mac. All I have in the network is the mac, AVB DC boards and sometimes a MOTU AVB switch.
Post Reply