Why is select a cyclehog?

Treczoks
Active Member
Posts: 38
Joined: Thu Mar 21, 2013 11:18 am

Post by Treczoks »

Hi!

I'm still fighting with my first project (this is taking waaaay too long for my taste): I want to receive data from a PHY. See http://www.xcore.com/forum/viewtopic.php?f=26&t=2166 for some details.

Now I have started to find my way into XMOS timing analysis, and my face fell. Obviously, the "select" statement is the culprit for the timing problems, devouring a whopping 27 cycles.

I was told that XMOS IO is "oh so effective", and that the select statement would make things "oh so easy". Sorry, but "not so". The select eats so far into the 320ns timing that almost no time is left for doing something sensible with the data.

So, what can I do to improve things? Is it possible to replace the select construct with something handcrafted? I tried something in this direction, but there is an uncovered risk that DataValid might drop in the middle of receiving Data (as I'm only sending, and therefore expecting, longwords, this will only pose a problem when someone pulls out the cable).

Any other ideas? Is there a way to optimize the select statement with some pragmas or whatever?

Yours, Christian Treczoks

UPDATE: I had a closer look at the disassembly of the function. Although I cannot claim to be proficient in XS1 assembler, one thing strikes me: the compiler optimisation is horribly bad. Almost everything gets loaded into r0 (and maybe r1), gets processed there and is stored back into RAM, instead of using a whole range of registers for a function's local variables. It is a festival of loads and stores, with the occasional FNOP to bring things down to a crawl. I'd say that by using some more registers, a lot of instructions and especially some FNOPs could be saved.
Is there a way to force the compiler into higher optimisation levels so it actually uses some of the registers the processor offers?


Ross
XCore Expert
Posts: 966
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

Are you talking about setting up the Select, or the code after the "case" before your event code?

You have optimisations set to O3?

Post by Ross »

Treczoks wrote:
UPDATE: I had a closer look at the disassembly of the function. Although I cannot claim to be proficient in XS1 assembler, one thing strikes me: the compiler optimisation is horribly bad. Almost everything gets loaded into r0 (and maybe r1), gets processed there and is stored back into RAM, instead of using a whole range of registers for a function's local variables
...
Treczoks wrote: Is there a way to force the compiler into higher optimisation levels so it actually uses some of the registers the processor offers?
In answer to my own question: you don't have optimisations turned on at all. You should set the level to -O3 and try again.

(I built your example with the following: xcc -O3 EtherPipe.xc -target=XC-1A)

Post by Treczoks »

Ross wrote:Are you talking about setting up the Select, or the code after the "case" before your event code?
When I'm in the timing analyzer and highlight the big fat block, it highlights the select statement, both case statements and one of the break statements. That's what I'm talking about.
Ross wrote:You have optimisations set to O3?
I'd love to. If I could find it. In the xTIMEcomposer user guide: "optimisation" - no hits. On your web site: "optimisation" - lots of broken links and outdated stuff. I've browsed menus and requesters back and forth in xTIMEcomposer without finding anything that even looks remotely like "compiler parameters".
This is a recurring problem with XMOS - you might have a lot of documentation, but finding something useful is as bad as (or even worse than) on MSDN.

Christian

Post by Treczoks »

Ross wrote:(I built your example with the following: xcc -O3 EtherPipe.xc -target=XC-1A)
You want to tell me that in order to get optimisation, I need to turn back to command line compiling or knitting my own makefiles? There should be a way to set optimisation levels in the GUI.

Astonished,
Christian

Post by Ross »

There is some general information about optimisation levels in the tools user guide here:
https://www.xmos.com/published/tools-user-guide

In the GUI you should double-click on the Makefile in your project - under the XCC Flags section you can set separate optimisation levels for "release" and "debug".
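
For reference, the XCC Flags section described above corresponds to plain variables in the project Makefile, so the same change can be made in a text editor. The fragment below is only a sketch: the variable names `XCC_FLAGS_Debug` and `XCC_FLAGS_Release` are assumed from the standard xcommon project template and are not quoted from this thread, and the default values in your project may differ. Only the `-O3` level itself comes from the build command given earlier.

```make
# Sketch of the flag lines in an xcommon-style project Makefile.
# The variable names are an assumption; check your own Makefile for the exact spelling.

# Debug configuration: no optimisation, so the code is easy to step through.
XCC_FLAGS_Debug = -O0 -g

# Release configuration: -O3, the level suggested earlier in this thread.
XCC_FLAGS_Release = -O3 -g
```

After changing the flags, rebuild the matching configuration and re-run the timing analysis on the new binary, since cycle counts from an unoptimised build say little about the optimised one.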

Post by Ross »

Treczoks wrote:
Ross wrote:(I built your example with the following: xcc -O3 EtherPipe.xc -target=XC-1A)
You want to tell me that in order to get optimisation, I need to turn back to command line compiling or knitting my own makefiles? There should be a way to set optimisation levels in the GUI.

Astonished,
Christian
No, I'm just informing you as to how I produced good output from your code. I'm certainly not suggesting you list all your source files on the command line.

Post by Treczoks »

Ross wrote:There is some general information about optimisation levels in the tools user guide here: https://www.xmos.com/published/tools-user-guide
I've read this file already, but it only explains the command line options per se, not where to set them. I never bothered to click on the Makefile at all, as I expected it to be something built automatically from options and structures pulled from somewhere else. That information, "double-click on the Makefile to edit project options", should definitely go into the manual, though. Is this one of the many odd Eclipse philosophy things?
Ross wrote:In the GUI you should double-click on the Makefile in your project - under the XCC Flags section you can set separate optimisation levels for "release" and "debug".
Ah, there it is! Finally. And behold, the compiler suddenly knows that the processor has registers, and the time has come down dramatically.

Thank you for solving this mystery! Now I can finally proceed with the project.

Yours, Christian