select/case different to wait

RedDave
Experienced Member
Posts: 77
Joined: Fri Oct 05, 2018 4:26 pm

Post by RedDave »

I have written some simple code to read data from a FIFO in an FPGA.

The code waits for the FIFO to not be empty, reads eight nybbles which it then constructs into a 32 bit word. When I do that it works reliably, reading all data correctly.
If I change the code so that instead of waiting it selects on a 'not empty' case, then the data is not read in correctly. Data are misaligned into the nybble array.

I have no idea why. Can anyone enlighten me as to why this code should run any differently based on the value of USE_SELECT?

Code: Select all

#include <platform.h>
#include <string.h>
#include <stdio.h>

#include "max10_data.h"
#include "defs.h"

#define BUFFER_LEN  (32)

#define USE_SELECT (0)

void fifo_task(in port port_data_hi, in port port_data_lo, out port port_clk, out port port_read_req, in port port_empty, clock clk, server max10_data_if i_data)
{
    configure_clock_rate(clk, 100, 4);      // 100/4 = 25MHz
    configure_port_clock_output(port_clk, clk);
    configure_in_port(port_data_hi, clk);
    configure_in_port(port_data_lo, clk);
    start_clock(clk);

    int D[8];
    int data;
    int prev;
    int count = 0;

    while(TRUE)
    {
#if USE_SELECT
        select
        {
            case port_empty when pinseq(0) :> void:
#else
                port_empty when pinseq(0) :> void;  // Wait for FIFO not empty
#endif
                clearbuf(port_data_hi);
                clearbuf(port_data_lo);
                port_read_req <: 1;
                for(int i=0; i<4; i++)
                {
                    port_data_hi :> D[i*2+1];
                    port_data_lo :> D[i*2];
                }
                port_read_req <: 0;

                // Build data into 32 bit value
                data =  D[7] << 28 |
                        D[6] << 24 |
                        D[5] << 20 |
                        D[4] << 16 |
                        D[3] << 12 |
                        D[2] << 8 |
                        D[1] << 4 |
                        D[0];

                // Incoming data counts in lower two bytes.
                if (((((prev & 0xFFFF0000) | ((prev+1) & 0xFFFF)) != data)  // Data is incorrect
                     && count > 10000)   // Ignore errors soon after a printf, which throws things out of sync.
                        || (count == 5000000))                              // All data correct for 5million readings.
                {
                    printf("#%10d\t%08X\t%08X\n", count, prev, data);
                    count = 0;
                }
                prev = data;
                count++;
#if USE_SELECT
            break;
        }
#endif
    }
}
I am aware that this code could perhaps be improved by using 16 bit buffering on each of the 4-bit ports, but first I am trying to understand why adding the select/case stops it from working as it is.


mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am
Contact:

Post by mon2 »

Printf is blocking. Can you comment out this line and test again?

Off topic, we just closed a PCBA design using the world's smallest FPGA (ICE40UL1K) on our SMT line - revived the DIPSY project under our own - ICEPIK. Size of a postage stamp and with dip style fanout.
RedDave
Experienced Member
Posts: 77
Joined: Fri Oct 05, 2018 4:26 pm

Post by RedDave »

I could remove the printf... but how do I then find out if it is working?

It only printf's every 5000000 counts (every few seconds). After a printf block it does not printf again for 10000 counts, even if there are errors.

This code works fine with USE_SELECT=0, but not with USE_SELECT=1. Why the difference in behaviour?

[BTW, it will be Monday before I am near my hardware again].

With USE_SELECT=0 the behaviour is a "no errors" message every 5000000 counts.
With USE_SELECT=1 there is an error report at count = 10001 to 10005, i.e. almost as soon as it is able.
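
As an aside, the reporting condition from the original listing can be modelled on a host in plain C, which makes it easier to see when a printf would fire. This is only a sketch: the helper names expected_next and should_report are hypothetical (they do not appear in the original code), and unsigned types are used here where the original uses int.

```c
#include <stdint.h>

/* Next expected word: the upper 16 bits stay the same and the lower
 * 16 bits increment with wrap-around, as in the original condition. */
static uint32_t expected_next(uint32_t prev)
{
    return (prev & 0xFFFF0000u) | ((prev + 1u) & 0x0000FFFFu);
}

/* Returns 1 when the loop in the original listing would call printf:
 * either a mismatch more than 10000 reads after the last report, or
 * exactly 5000000 reads without a report. */
static int should_report(uint32_t prev, uint32_t data, uint32_t count)
{
    int mismatch = (expected_next(prev) != data);
    return (mismatch && count > 10000u) || (count == 5000000u);
}
```

For example, should_report(1, 5, 10000) is 0 because the mismatch falls inside the 10000-count hold-off, while should_report(1, 5, 10001) is 1, which matches the error appearing almost as soon as it is able.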
CousinItt
Respected Member
Posts: 360
Joined: Wed May 31, 2017 6:55 pm

Post by CousinItt »

I just knocked up a noddy program to check the differences (if any) between select/case and the naked 'p when...'

When using select, the compiler generates a WAITEU instruction, which I think suspends the thread while waiting on a port event. This doesn't happen when using the raw 'p when', but I'm not clear on whether an IN instruction can be used to suspend on a particular input pattern or whether it's producing a polling loop. I can post the disassembly if anyone wants to comment. I used the -O2 switch but didn't check alternatives.

Either the difference in timing is sufficient to give rise to different behaviour, or could it be a glitch on the input pin that can trigger the port event but is not seen by a polling loop?
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

If you post the assembly I can take a look. I know enough xmos assembly to be dangerous. With an IN instruction the thread will pause only long enough to clock in the buffer size of the port. In the case of the OP the port is unbuffered so it will be essentially instantaneous. So I suspect without the select there will be a polling loop.

For the OP's problem, though, I wonder if a strobed slave input port would work. That way nothing would be read from the port until fifo_empty goes low (refer to https://www.xmos.com/download/XC-Serial ... g(1.0).pdf). The readyIn signal is supposed to be active high rather than active low, but it might be possible to avoid an external inverting buffer by using set_port_inv() on the port_empty signal to generate an active-high port_not_empty signal. It's not something I've tried, but it might work and would move a bunch of code directly to the port hardware. Since you're talking to an FPGA, though, you could perhaps invert the signal where it is output.

Plus the port_data_lo and port_data_hi ports should really be buffered ports I think, presumably with 16 bit buffer. Then you could read two 16 bit numbers and use the zip instruction to zip them to a 32 bit number I think. It would be much faster. Of course you might need to do some bitrev or byterev or some things like that.

And then you wouldn't need to call clearbuf() except right when you were configuring everything. And obviously there's more configuration and having to be careful about timing when you configure synchronous input on two ports. But it's possible, a lot can be learned from reading the app notes.

So then the main loop would look like

Code: Select all

while(1) {
    port_data_hi :> dhi;
    port_data_lo :> dlo;
    data = zip(dhi, dlo, 2);
}
 
Or something like that, you probably have to do some bitreving on the inputs first perhaps, or the output later, and maybe take the high word of data (unsigned long long) rather than low word; it can all be tested in the simulator. It would take some work but it all seems doable and substantially faster. I don't know how the port_read_req signal works but you might be able to raise it before you call the input of port_data_hi.
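
To make the interleaving idea concrete, here is a host-side sketch in plain C of what a nibble-wise zip would need to produce. It is not the XMOS zip() builtin: zip_model is a hypothetical stand-in, and it rests on two assumptions that should be checked in the simulator, namely that the first sample clocked into a buffered port ends up in the least significant nibble, and that the second operand supplies the least significant packet of the result.

```c
#include <stdint.h>

/* Host-side model of packet interleaving; NOT the XMOS zip() builtin.
 * Packets are 2^log2_packet_size bits wide (valid for 0..5).
 * Assumption: b supplies the least significant packet of the result,
 * then a, then b again, and so on. */
static uint64_t zip_model(uint32_t a, uint32_t b, unsigned log2_packet_size)
{
    unsigned width = 1u << log2_packet_size;  /* bits per packet */
    uint32_t mask = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint64_t out = 0;

    for (unsigned i = 0; i < 32u / width; i++) {
        uint64_t pb = (b >> (i * width)) & mask;  /* packet i of b */
        uint64_t pa = (a >> (i * width)) & mask;  /* packet i of a */
        out |= pb << (2u * i * width);            /* even packet slot */
        out |= pa << ((2u * i + 1u) * width);     /* odd packet slot */
    }
    return out;
}
```

Under those assumptions, if the eight nybbles of the OP's array are D[k] = k, the buffered reads would give dhi = 0x7531 and dlo = 0x6420, and the low 32 bits of zip_model(dhi, dlo, 2) come out as 0x76543210, the same word the original shift-and-OR expression builds. If the real ordering differs, that is where the bitrev or operand swap mentioned above would come in.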

Good luck!
RedDave
Experienced Member
Posts: 77
Joined: Fri Oct 05, 2018 4:26 pm

Post by RedDave »

I'm trying the strobed slave option now. This was something I did not know about.

I've connected the empty signal to read_req through an inverter inside the FPGA, so that the FIFO will always spit out data when it is not empty. The XMOS then uses read_req as an input port to enable the data ports.

I'm configuring the ports as below, but it is reporting an illegal resource.

Code: Select all

    configure_clock_rate(clk, 100, 4);      // 100/4 = 25MHz
    configure_port_clock_output(port_clk, clk);
    configure_in_port_strobed_slave(port_data_hi, port_read_req, clk);
    configure_in_port_strobed_slave(port_data_lo, port_read_req, clk);
    start_clock(clk);
ERROR:
tile[0] core[3] (Suspended: Signal 'ET_ILLEGAL_RESOURCE' received. Description: Resource exception.)
3 configure_in_port_strobed_slave() 0x0004ae08
Any idea what causes this?
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

To be honest, I've never used strobed ports; it just seemed like the perfect application. I wonder if you can't use a ready signal with more than one port?
In that case you might want to input on a 4 bit port with 32 bit buffering. Then you could read two 32 bit words (each of which would have 16 used bits), use unzip to drop the unused bits, and zip to merge the two words?? Just spitballing now.
RedDave
Experienced Member
Posts: 77
Joined: Fri Oct 05, 2018 4:26 pm

Post by RedDave »

The ready signal not being allowed on two ports was my first thought. Commenting out the second configure_in... still gives the same error.

Reading into 32 bit buffered ports would slow down my maximum data rate by a factor of two and would require some jiggery-pokery* to ensure that data is not read out of the FIFO into the unused sections.

My current thought is that perhaps the strobed clock needs to be an input clock rather than an output, as all the examples seem to do this. I do not understand why this would be the case. I will see what happens when I make that change.

* technical term.
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

Good luck, I hope it works for you. The main place I have encountered strobed I/O is in the 100 Mbps MII master from lib_ethernet. The port setup function here might be illuminating https://github.com/xmos/lib_ethernet/bl ... _master.xc