Issues with KSZ9031RNX (Topic is solved)

lorenzochiesi
Active Member
Posts: 33
Joined: Mon Sep 05, 2016 4:20 pm

Post by lorenzochiesi »

Hi All,

I designed a board in which the KSZ9031RNX Ethernet PHY replaces the AR8035.
Our design works smoothly at the 100Mbit rate but does not work reliably at the Gigabit rate!
In particular, with a simple TCP/IP stack including only an ICMP server on the XMOS, we lose about 40% of pings from a directly connected PC.
Using the loopback features and injecting/reading Ethernet frames on the XMOS side, we verified that severe corruption happens randomly between the PHY and the XMOS, so RGMII signal integrity or timing seems to be the issue.
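
A minimal host-side sketch of this kind of loopback comparison, in Python. It is not the XMOS test code from this post; the function names are hypothetical, and it only assumes the standard Ethernet FCS (CRC-32, appended least significant byte first).

```python
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Return the frame with its Ethernet FCS (CRC-32, little-endian) appended."""
    fcs = zlib.crc32(frame) & 0xFFFFFFFF
    return frame + fcs.to_bytes(4, "little")


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """True if the last four bytes match the CRC-32 of the rest of the frame."""
    if len(frame_with_fcs) < 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return (zlib.crc32(body) & 0xFFFFFFFF) == int.from_bytes(fcs, "little")


def corruption_rate(sent, received) -> float:
    """Fraction of frames that came back lost (None) or different from what was sent.

    `sent` and `received` are lists of equal length, aligned frame by frame.
    """
    if len(sent) != len(received):
        raise ValueError("sent and received must be aligned frame by frame")
    if not sent:
        return 0.0
    bad = sum(1 for s, r in zip(sent, received) if r is None or s != r)
    return bad / len(sent)
```

Comparing whole frames catches corruption anywhere in the payload, while `fcs_ok` alone is enough when only the received side is visible.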

From the discussion at viewtopic.php?f=37&t=5778 I understand that many have had success with the KSZ9031RNX, so it would be great to know whether you implemented the RGMII tracks in a similar way to mine, whether you experienced similar issues, and how hard it was for you to tune the RGMII signal delays (or other PHY features).

Attached is a PDF including the relevant schematic and PCB tracks...

As you can see from our schematic, we omitted by mistake the series termination resistors on all RGMII signals except TXCLK and RXCLK.
Do you all have series resistors on all RGMII lines?

We routed the tracks very short, trying to equalize the lengths within each set:
- RX signals are routed on TOP, with lengths between 725 and 825 mils
- Due to the pin arrangement, TX signals are routed through the BOT layer; all have 2 vias and lengths between 1125 and 1131 mils
- The PCB has 4 layers; under TOP there is a solid GND plane, whilst under BOT there is a solid 3V3 plane
Have you all solved the routing in a similar way?

Also attached is the PHY driver code, largely derived from the GitHub sliceKIT app note code...

We implemented the following delay settings in the driver:
+0.36ns on RXCLK which, added to the 1.2ns natively present in the KSZ9031RNX RXCLK path, gives a total shift between RXCLK and the RX signals of about 1.6ns
-0.3ns on RXD3 to better match the delay of the other lines (however, I think this is not essential)
The effectiveness of the driver setup is confirmed by the scope captures (350MHz) I included in the folder, which show alignment as expected and good signal shape, despite the missing series resistors.
On the TX path we rely on the 2ns nominal delay added by the XMOS ETH block with the standard library settings.
We have poor access to the TX path signals, so at the moment we have checked only TXEN vs TXCLK, which looks as expected.
Have you probed the RGMII signals with a high-frequency scope? Do your signals look like mine?
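
The delay arithmetic above can be sketched as follows. This is only an illustration, not the attached driver: it assumes the KSZ9031RNX pad skew fields use 0.06ns per step, with a 5-bit clock field defaulting to 0b01111 and a 4-bit data field defaulting to 0b0111. Please check those values against your datasheet revision. The helper name `skew_code` is hypothetical.

```python
STEP_NS = 0.06          # assumed pad skew resolution per code step
CLK_DEFAULT = 0b01111   # assumed default of the 5-bit clock pad skew field
DATA_DEFAULT = 0b0111   # assumed default of the 4-bit data pad skew field
NATIVE_RXCLK_NS = 1.2   # native RXCLK delay quoted in the post


def skew_code(delta_ns: float, default: int, bits: int) -> int:
    """Convert a wanted skew change in ns into a pad skew field value."""
    code = default + round(delta_ns / STEP_NS)
    if not 0 <= code < (1 << bits):
        raise ValueError(f"skew of {delta_ns} ns does not fit in {bits} bits")
    return code


rxclk_code = skew_code(+0.36, CLK_DEFAULT, 5)   # six steps above the default
rxd3_code = skew_code(-0.30, DATA_DEFAULT, 4)   # five steps below the default
total_rxclk_shift_ns = NATIVE_RXCLK_NS + 0.36   # about 1.6 ns, as stated above
```

Under these assumptions the 4-bit data field can only move about 0.4ns below its default, so a -0.3ns correction is already close to the limit.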

Many thanks for any hints that could point me in the right direction
Lorenzo
CousinItt
Respected Member
Posts: 265
Joined: Wed May 31, 2017 6:55 pm

Post by CousinItt »

Have you looked at the hardware design checklist? link

Also you might find this useful: link

Have you compared the behaviour of your device with the known good explorer kit? I've found that pinging isn't necessarily a great way to check they are working. They seem rock solid when used with frequent traffic, but occasional messages can sometimes not work. I wondered if it's something to do with autotuning in the PHY not having enough to work with, but once I found it was reliable I didn't investigate further.
lorenzochiesi

Post by lorenzochiesi »

Thanks for your answer CousinItt,
I know both documents; however, I found the hardware checklist only after I had designed the PCB :-(

The main difference between my design and the hardware design checklist is the missing series termination resistors on the RGMII data lines (I have them only on the TX and RX clocks).
Do you have series resistors on all lines in your design?

I know that ping isn't a good benchmark; for this reason we are testing the RGMII side with the internal PHY loopback, sending/receiving long Ethernet frames.

I have had no chance to test the explorer kit RGMII signals yet, but I will manage to do so. Thanks for the hints!

Lorenzo
CousinItt

Post by CousinItt »

I've not designed with another PHY - I work with low volume instrumentation gear so where I've needed ethernet I've used an explorer kit.

I would have thought that using series terminations only in some of the lines could be problematic. If the traces are short enough, have you tried replacing the series terminations for the clocks with short circuits?
mon2
XCore Legend
Posts: 1910
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Is the PCB layout impedance controlled?

https://www.protoexpress.com/blog/contr ... y-matters/

Did your PCB shop share the TDR report for the PCB?

Could the issue be linked to the XMOS IP? Perhaps the IP is unable to support this data rate as currently coded?

You can review the PCB layout details of the XMOS Gigabit products to compare notes.

Another suggestion is to perhaps open a ticket with Microchip for some assistance on using their PHY. From our past experience, Microchip has been very helpful.
leobodnar
Member
Posts: 12
Joined: Mon May 07, 2018 9:26 am

Post by leobodnar »

What are the rise and fall times of the XMOS and KSZ9031 RGMII line signals?
Generally, if the trace length is shorter than 20% of the distance the signal propagates during its rise time, then termination and trace impedance are irrelevant.
The trace behaves as a lumped component in that case.
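
As a rough illustration of this rule of thumb, here is a small Python sketch. The propagation velocity (6 in/ns, a common FR-4 ballpark) and the 0.75ns rise time are assumptions chosen only for the example, and the function names are hypothetical; the real rise time has to be measured or taken from the datasheets.

```python
def max_lumped_length_mils(rise_time_ns: float,
                           velocity_in_per_ns: float = 6.0,
                           fraction: float = 0.2) -> float:
    """Longest trace (in mils) that still counts as lumped under the 20% rule."""
    propagation_length_in = rise_time_ns * velocity_in_per_ns
    return propagation_length_in * fraction * 1000.0


def is_lumped(trace_mils: float, rise_time_ns: float,
              velocity_in_per_ns: float = 6.0) -> bool:
    """True if the trace is shorter than the lumped-element limit."""
    return trace_mils < max_lumped_length_mils(rise_time_ns, velocity_in_per_ns)


ASSUMED_RISE_NS = 0.75  # example value only, not a measurement from this board

limit_mils = max_lumped_length_mils(ASSUMED_RISE_NS)
rx_is_lumped = is_lumped(825, ASSUMED_RISE_NS)    # longest RX trace in the first post
tx_is_lumped = is_lumped(1131, ASSUMED_RISE_NS)   # longest TX trace in the first post
```

With these assumed numbers the limit comes out at about 900 mils, so the result is sensitive to the actual edge rate: a faster edge shrinks the limit proportionally.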

I have noticed that you are using a bit-strap on RX_CLK.
I would remove the R53 pull-up and let the XMOS, with its internal pull-downs, bootstrap your PHY address as 0x3 instead. The last trace you want a stub on is a DDR clock.

Also, do you have 10uF bulk caps behind ferrite beads or only 0.1uF?

Leo
lorenzochiesi

Post by lorenzochiesi »

Hi leobodnar,
Many thanks, you finally pointed me in the right direction!

The issue was caused by the bit-strap pull-up (PU) on the RXCLK line!
It was not a question of clock signal distortion but a mystery that depends on the XMOS RGMII receiver block...

XMOS datasheet in RGMII section states:
"The RGMII PHY should be configured so that RX_CLK is low during reset of the
xCORE. This may be achieved by putting a pull-down resistor on the reset of the
PHY, keeping the PHY in reset until the RGMII layer on the xCORE takes the PHY
out of reset."

Of course our PU violates this condition. We replaced it with a pull-down (PD) and this magically solves every issue!
(The PHY address in the driver changes accordingly to 0x03, as it was in the original code available on GitHub)
The funny thing is that with that violation Ethernet works fine at 100Mbit and has a few random errors at Gbit, which makes the issue difficult to track down!
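
To illustrate why the address changes with the strap, here is a small Python sketch. It assumes the RX_CLK strap sets PHYAD2 and that the other two address straps (PHYAD1 and PHYAD0) are pulled high on this board; both points are assumptions to verify against the KSZ9031RNX datasheet and the schematic. The function name is hypothetical.

```python
def phy_address(phyad2: int, phyad1: int, phyad0: int) -> int:
    """Combine the three strap bits into the PHY address (upper bits assumed 0)."""
    for bit in (phyad2, phyad1, phyad0):
        if bit not in (0, 1):
            raise ValueError("strap bits must be 0 or 1")
    return (phyad2 << 2) | (phyad1 << 1) | phyad0


# Assumed strap states: PHYAD1 and PHYAD0 high, RX_CLK strap drives PHYAD2.
addr_with_pull_up = phy_address(1, 1, 1)     # RX_CLK pulled up
addr_with_pull_down = phy_address(0, 1, 1)   # RX_CLK pulled down
```

Under these assumptions the pull-down gives address 0x3, which matches the value used in the driver after the fix.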

I'd like to close this post by sharing this info with the forum, as it could help others designing Gb Ethernet with an alternative PHY:
- The KSZ9031RNX works well at 100Mb and Gbit Ethernet
- The attached layout shown in the PDF was validated and is fully working
- With short tracks there is no need for series resistors on the data lines (at your own risk)
- The attached driver (based on the GitHub code with some changes) works properly