Problems with XTA Timing Analyzer in 10.4.2 tools

MaxFlashrom
Experienced Member
Posts: 82
Joined: Fri Nov 05, 2010 2:59 pm

Post by MaxFlashrom »

I have tried the UART timing tutorial with tools version 10.4.2 (build 1752) on 32-bit Windows and on 64-bit Ubuntu 10.10, and have had problems on both.
Let's start with Windows. I compile the example as a release build, click on the yellow XTA icon, and set the "from" and "to" endpoints at points A and B respectively in the code:

Code: Select all

       // Output start bit
       txd <: 0 @ time; // Endpoint A

       // Output data bits
       for (int i = 0; i < 8; i++) {
          time += BIT_TIME;
          txd @ time <: >> byte; // Endpoint B

I then click on the green "E" icon to analyze between the endpoints. The route is highlighted in the Routes view, as expected, with a red question mark. I right-click and set a timing requirement. This is where the promise and optimism of the tutorial and the whizzy YouTube video demo depart from the cold, harsh reality of the real world, for me. Rather than getting a green tick as promised, I get a red cross. The box says:

Code: Select all

    Node: 0, Core :0
    Fail with 23 unknowns, Num Paths 481
    Worst Case:Unresolved
    Required 8.7ns, 115.21 MHz, 0 thread cycles
    Violation:Unresolved

The Routes window shows routes for the whole program, not just the section between the endpoints, and the Visualizations window also shows a graph for the whole program. This is not what I was expecting.

"Show route in console" (in the right-click menu of the selected route) gives:

Code: Select all

    xta 9>version
    This is an unreleased/unsupported version. (build 1752)
    xta 10>xta 10>print routeinfo 0
    endpoints: C:/Documents and Settings/Administrator/XMOS/workspace/UART-loopback-example/src/uart-loopback.xc:54 to C:/Documents and Settings/Administrator/XMOS/workspace/UART-loopback-example/src/uart-loopback.xc:59
    Node: 0, Core: 0
    Fail with 23 unknowns, Num Paths: 481
    Worst Case:  Unresolved
    Required:      8.7 ns,  115.21 MHz,       0 thread cycles
    Violation:  Unresolved
    xta 3>print trace 0
    * 0.0: (10.0ns) 0x1015a txByte + 10            out (r2r)    res[r1], r0 (P)
      10.0: (10.0ns) 0x1015c txByte + 12            ldw (ru6)    r0, sp[0x0]
    * 20.0: (10.0ns) 0x1015e txByte + 14            syncr (1r)   res[r0] (P)
      30.0: (10.0ns) 0x10160 txByte + 16            ldw (ru6)    r0, sp[0x0]
      40.0: (10.0ns) 0x10162 txByte + 18            getts (2r)   r0, res[r0]
      50.0: (10.0ns) 0x10164 txByte + 20            stw (ru6)    r0, sp[0x2]
      60.0: (10.0ns) 0x10166 txByte + 22            ldc (ru6)    r0, 0x0
      70.0: (10.0ns) 0x10168 txByte + 24            stw (ru6)    r0, sp[0x3]
      80.0: (10.0ns) 0x1016a txByte + 26            ldw (ru6)    r1, sp[0x3]
      90.0: (10.0ns) 0x1016c                        --FNOP--     
      100.0: (10.0ns) 0x1016c txByte + 28            ldc (ru6)    r0, 0x8
      110.0: (10.0ns) 0x1016e txByte + 30            lss (3r)     r0, r1, r0
      120.0: (10.0ns) 0x10170 txByte + 32            bt (ru6)     r0, 0x1
      130.0: (10.0ns) 0x10172 txByte + 34            bu (u6)      0x10
      140.0: (10.0ns) 0x10194 txByte + 68            ldw (ru6)    r1, sp[0x2]
      150.0: (10.0ns) 0x10196                        --FNOP--     
      160.0: (10.0ns) 0x10196 txByte + 70            ldc (lru6)   r0, 0x364
      170.0: (10.0ns) 0x1019a txByte + 74            add (3r)     r0, r1, r0
      180.0: (10.0ns) 0x1019c txByte + 76            stw (ru6)    r0, sp[0x2]
      190.0: (10.0ns) 0x1019e txByte + 78            ldw (ru6)    r1, sp[0x0]
      200.0: (10.0ns) 0x101a0                        --FNOP--     
      210.0: (10.0ns) 0x101a0 txByte + 80            ldw (ru6)    r0, sp[0x2]
    * 220.0: (10.0ns) 0x101a2 txByte + 82            setpt (r2r)  res[r1], r0 (P)
      230.0: (10.0ns) 0x101a4 txByte + 84            ldw (ru6)    r1, sp[0x0]
      240.0: (10.0ns) 0x101a6 txByte + 86            mkmsk (rus)  r0, 0x1
    * 250.0: (10.0ns) 0x101a8 txByte + 88            out (r2r)    res[r1], r0 (P)
    etc.

On typing version at the XTA prompt it reports that
This is an unreleased/unsupported version. (build 1752)
This adds but little reassurance to a process that I feel, already, is not going well.
I asked it to analyze six lines of source and, indeed, one can see that, at some level, this is acknowledged:
example/src/uart-loopback.xc:54 to C:/Documents and Settings/Administrator/XMOS/workspace/UART-loopback-example/src/uart-loopback.xc:59
However, the tool then completely ignores this and tells me it has found 481 paths through this small section of code, with 23 unknowns. There are about 23 lines of assembler between the endpoints, and it seems not to know what it's doing for any of them. Which brings me on to the next feature: it doesn't know how long any I/O instruction takes. It marks them all with a warning triangle (an exclamation mark). This is a little sad, as it's pretty much the whole point of this tool's existence. The ports used here are simple sampled I/O; there is no external handshaking configured on them. There is no unspecified external timing dependency: their timing is determined entirely from within the core and should be completely deterministic. The XN config file specifies all the clock frequencies being used.
The instructions for which the tool could not fathom a timing were, amongst others:

Code: Select all

    0x1015a    out (r2r) res[r1], r0
    0x1015e    syncr (1r) res[r0]
    0x10182    setpt (r2r) res[r1], r0

My experience on Linux64 was a little better. Interestingly, despite it being the same build of the tools, the behaviour differs. When one sets the path endpoints and asks for an analysis it does what one might expect: only the desired path between the endpoints is shown in the routes and visualization window. This is better, but the problem of unknown I/O times remains, making the tool useless.

I'm much deflated by the whole experience. The XTA tool shows promise but, for me, its usefulness is currently dismal. I may be missing something. If you've made it work I'd love to hear from you! If you're involved in developing it I hope this helps and the issues can be resolved. It would be good news for all of us!

Max.


kris
Experienced Member
Posts: 84
Joined: Mon Jan 18, 2010 2:52 pm

Post by kris »

Hi Max,

Thanks for the feedback. I guess there are a number of issues here, some minor and some more important.

The version number issue seems to be a bug in our release mechanism; I'll get this sorted out for the next release.

As for the behaviour you are seeing in the UART tutorial, could you confirm that you are using the release build? In the debug build there is indeed a path from endpoint A that misses the loop entirely, so the analysis shows that the rest of the program is potentially hit. To solve this, an exclusion must be placed at some point after the loop, e.g. at endpoint C. In the release build, however, the compiler generates code in which this path does not exist, so you will see the correct time without requiring the exclusion.

You make a very good point about the tool's lack of intelligence in calculating the pause times of I/O instructions. This is indeed high up on our list of things to do next. However, even without this we think the tool can provide a useful service, as a common use case is to find the time between endpoints, and not over them.

Hope this helps,
Cheers,
Kris.
MaxFlashrom
Experienced Member
Posts: 82
Joined: Fri Nov 05, 2010 2:59 pm

Post by MaxFlashrom »

Hi Kris,
Thanks for the reply. I have looked further into this. The non-reporting of the version number is not a show stopper, but I thought it useful to mention it. I'd like to point out that this behaviour only occurs when one types

Code: Select all

version 
into the window within the XDE environment. On the command line everything is as it should be.

Code: Select all

C:\Program Files\XMOS\DesktopTools\10.4.2>xta
xta 1>version
Version: 10.4.2 (build 1752)
xta 2>

It now appears that both the XDE and I got confused. I originally built a debug version and then, remembering this was wrong, went back and rebuilt it as a release. What seems to have happened is that an XTA time configuration got stuck on the "Debug" version of the code and kept using it even though I had rebuilt as "Release" afterwards. Once I deleted the time configuration and created a new one using the "Release" build, the route shown was only that between the endpoints, as desired. The behaviour of the Windows and Linux versions is consistent, then.

I must confess to making an error when following the tutorial: I entered 8.68 ns, not 8.68 us. It's easy to miss this, as ns is the default unit in the "set timing requirement" box. I then got the incorrect impression that the timing requirement could not be evaluated because the time for the I/O was unknown.
I've now corrected this and things work as they should. :D
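
For anyone else who trips over the units, here is a minimal sketch of the arithmetic, not anything taken from the tools. It assumes the tutorial's UART runs at 115200 baud (an inference from the 8.68 us figure), and the function names are made up for illustration. It also shows why the failed route reported "115.21 MHz": that is simply the frequency whose period is 8.68 ns.

```python
# Assumption: 115200 baud, inferred from the 8.68 us bit time in the tutorial.
BAUD_RATE = 115200

def bit_time_ns(baud):
    """Duration of one UART bit in nanoseconds."""
    return 1e9 / baud

def equivalent_frequency_mhz(period_ns):
    """Frequency, in MHz, of a signal whose period is period_ns."""
    return 1e3 / period_ns

correct_ns = bit_time_ns(BAUD_RATE)   # roughly 8680 ns
mistaken_ns = 8.68                    # same digits, wrong unit

print(f"bit time: {correct_ns / 1000:.2f} us")
print(f"8.68 ns corresponds to {equivalent_frequency_mhz(mistaken_ns):.2f} MHz")
```

With the wrong unit the requirement is a thousand times tighter than intended, which no route through the loop could ever meet.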

One thing I noticed is that once one has launched several XTA traces and then switches to the Debug perspective, there are a lot of old configurations listed in the debug pane. "Remove all terminated" only works for "Run Configurations", it seems; the XTA ones won't go away. I acknowledge this is largely cosmetic, but it does clutter things up.

Although the tool flags the I/O timings as unknown, it reports that the timing constraint is met between the endpoints. It seems to assume that each instruction just takes 10 ns. That's a maximum of 100 MIPS in this thread. Does that mean that on my XS1-L1 I can only have four threads, rather than eight, running at full speed at any one time, if none of them are sleeping? I'll have to read up more deeply in the docs on timing stuff.
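
As a rough sanity check on why the constraint is met, here is a small sketch under stated assumptions: every instruction costs the 10 ns shown in the trace, I/O pause times are ignored (as the tool currently does), and only the 26 trace lines quoted in my first post are counted, so this is not the full route. The names are hypothetical.

```python
INSTRUCTION_TIME_NS = 10.0   # per-instruction time shown in the trace
BIT_TIME_NS = 8680.0         # the 8.68 us timing requirement
TRACE_INSTRUCTIONS = 26      # trace lines quoted earlier (offsets 0.0 to 250.0)

def path_time_ns(n_instructions, per_instruction_ns=INSTRUCTION_TIME_NS):
    """Time for a straight-line path, ignoring any I/O pauses."""
    return n_instructions * per_instruction_ns

def slack_ns(required_ns, actual_ns):
    """Positive slack means the requirement is met."""
    return required_ns - actual_ns

elapsed = path_time_ns(TRACE_INSTRUCTIONS)
budget = int(BIT_TIME_NS // INSTRUCTION_TIME_NS)

print(f"excerpt takes {elapsed:.0f} ns, slack {slack_ns(BIT_TIME_NS, elapsed):.0f} ns")
print(f"instruction slots available per bit: {budget}")
```

So the quoted excerpt uses a small fraction of one bit period, provided the instructions really do issue every 10 ns.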

You said:
You make a very good point about the tool's lack of intelligence in calculating the pause times of I/O instructions. This is indeed high up on our list of things to do next. However, even without this we think the tool can provide a useful service, as a common use case is to find the time between endpoints, and not over them.
I believe that one can do a loopback between pins on a core in the simulator, and possibly between multiple chips? The reason I mention this is that a facility to specify some kind of external input stimulus to pins, in an XML file or similar, rather like the signals in the waveform viewer in your tools, would be handy for testing one's program. Currently, I suppose, one can generate sequences in an XC program on an output and feed them back to the input under test.

Keep up the good work and thanks,
Max
kris
Experienced Member
Posts: 84
Joined: Mon Jan 18, 2010 2:52 pm

Post by kris »

Hi Max,

Good point about the time configurations left hanging around in the debug perspective. I'd not noticed that before, so will look at sorting this out.

In terms of the architecture, you're right: only four threads can run at 100 MIPS at any one time. Extra threads (above four) will cause the others to slow down, i.e. once there are eight threads running, each one will have 50 MIPS.
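
The scaling can be sketched as below. This is an illustration only, assuming a 400 MHz core and a four-stage pipeline (both inferred from the 100 MIPS and 50 MIPS figures above), with a hypothetical function name; it shows the rate for the intermediate thread counts too.

```python
def per_thread_mips(core_mhz, active_threads, pipeline_depth=4):
    """Approximate instruction rate of each active thread.

    Assumption: a thread can issue at most once every pipeline_depth
    core cycles, and with more active threads than pipeline stages the
    core's cycles are shared equally between them.
    """
    if active_threads < 1:
        raise ValueError("need at least one active thread")
    return core_mhz / max(pipeline_depth, active_threads)

for n in (1, 4, 5, 8):
    print(f"{n} threads: {per_thread_mips(400, n):.0f} MIPS each")
```

Under these assumptions the total throughput stays at 400 MIPS from four threads upwards; only the share per thread shrinks.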

If running on the simulator, you have two options for creating external activity. The simulator has a plugin interface, and we supply a plugin which loops back pins, thus allowing test stimulus to be created from XC. The other option is to create your own simulator plugin/testbench in C/C++.

Cheers,
Kris.