Cancelling a blocked channel transmission

data
Active Member
Posts: 43
Joined: Wed Apr 06, 2011 8:02 pm

Post by data »

In my application, it is possible for a channel transmit operation to become blocked indefinitely (static routing is involved). I would like to be able to cancel the transmit operation after a timeout expires, either from the same thread (preferred), or from a different thread.

I have thought of a few approaches so far:

* Somehow cause an interrupt on the blocked thread. The interrupt handler would obviously need to avoid returning to the same transmit instruction. I'm new to interrupts so this would potentially be a substantial effort for me, but it seems like a good possibility.

* Set up a "watchdog" thread which forcibly resets the link where the transmit is taking place, in the hope that this causes the transmit to end with an exception. I have tried this in the simulator, but unfortunately no exception seems to result -- as far as I can tell, the transmit keeps hanging on.

* Have the watchdog thread kill the transmit thread using FREER, and then relaunch it. This could accomplish cancelling the transmission, but I'm not sure what effect it would have on other threads which communicate with the killed thread.

It would be nice if a timer could be used to generate an interrupt, as this would save a thread, but sadly the architecture manual seems to state that this is not possible. The wiki has an example of enabling interrupts on a channel, so I suppose that would be the way to do it.

It seems to me that events are not useful here, and I don't see any way to generate an exception on a different thread. DCALL and related are not useful to me since they halt the entire processor.

Hints, comments on anything I've missed, or other comments on this topic from knowledgeable people, will be most welcome!


ers35
Active Member
Posts: 62
Joined: Mon Jun 10, 2013 2:14 pm

Post by ers35 »

Can you give more information about why you are using static routing? Is there a case where the links are sometimes not connected on device boot due to your application use case? Perhaps a change in application design will eliminate the need to cancel a blocked transmission.
data
Active Member
Posts: 43
Joined: Wed Apr 06, 2011 8:02 pm

Post by data »

Our system has a relatively large number of processors, and each system can have a different number of processors. While we would certainly prefer to use a preconfigured network and hardware routing (as indeed we are doing in certain parts of the system), unfortunately the XMOS tools don't really support dynamic configuration in our particular situation.

At least, that's what the XMOS engineers we worked with before advised us; on their advice, we chose this design. Believe me, if there were a way for me to not have to write my own networking layer, I would jump at it!
ers35
Active Member
Posts: 62
Joined: Mon Jun 10, 2013 2:14 pm

Post by ers35 »

You may find this thread useful: https://www.xcore.com/forum/viewtopic.php?f=26&t=1347

Have the links already been established when you start transmitting? Can the links go away after they have been established? I am trying to determine whether the channel blocking is due to your software design or the varying physical configurations under which your application operates. This will determine how robust of a solution is required.

For example, there is a difference between blocking while the links are guaranteed to be established and blocking because the links are physically disconnected.
data
Active Member
Posts: 43
Joined: Wed Apr 06, 2011 8:02 pm

Post by data »

ers35 wrote:
> You may find this thread useful: https://www.xcore.com/forum/viewtopic.php?f=26&t=1347

I am VERY familiar with that thread. ;)

ers35 wrote:
> Have the links already been established when you start transmitting? Can the links go away after they have been established? I am trying to determine whether the channel blocking is due to your software design or the varying physical configurations under which your application operates. This will determine how robust of a solution is required.
>
> For example, there is a difference between blocking while the links are guaranteed to be established and blocking because the links are physically disconnected.

Blocked transmit is not something which should occur as long as nothing goes wrong. Unfortunately this is not a hobby project and, out in the wild world, things will almost certainly go wrong. So, if possible, I would like a way to recover from this situation with at least some amount of grace.

Anyway, it sounds like this is a Hard Problem and I'm probably not going to convince you here that it is something I actually need to do -- I'll just get in touch directly. Thanks!
henk
Respected Member
Posts: 347
Joined: Wed Jan 27, 2016 5:21 pm

Post by henk »

Hi data,

You are off piste here!

Your best bet is probably to interrupt the thread that is hanging.

The way to do that is to allocate a channel end. First set it up beforehand so that the thread that will hang (A) has input a token, and another thread has output a token. Now get thread A to set the channel end to generate interrupts rather than events (SETC), and get thread A to enable interrupts in its SR. Now thread A is ready to be hung up, and any thread can interrupt thread A by sending it a token over that channel.

On interrupt, input the token from that channel first, then store the saved PC onto the stack; modify the stack, reload the saved PC, and you can return from interrupt and thread A will have gone somewhere else.

You will have to set up your kernel stack to be large enough to hold the saved PC and registers that you need in the interrupt routine.

Cheers,
Henk
ers35
Active Member
Posts: 62
Joined: Mon Jun 10, 2013 2:14 pm

Post by ers35 »

I started implementing henk's suggestion and it inspired me to think of a simpler way. Instead of using a channel and interrupts, how about a timer and events?

The following proof of concept implements a function out_nonblocking() that returns 1 instead of blocking if the timeout is reached before the out completes. Keep in mind that data loss occurs if the out does not complete. The code was only tested in the simulator, not on real hardware.

Code: Select all

// out-nonblocking.S
#include <xs1.h>

.text

.global out_nonblocking
.global out_nonblocking.nstackwords
.linkset out_nonblocking.nstackwords, 0
.global out_nonblocking.maxthreads
.linkset out_nonblocking.maxthreads, 0
.global out_nonblocking.maxtimers
.linkset out_nonblocking.maxtimers, 1
.global out_nonblocking.maxchanends
.linkset out_nonblocking.maxchanends, 0
.align 4
#define channel r0
#define data r1
#define timeout r2
#define tmr r3
#define scratch r11
out_nonblocking:
	clre
	getr tmr, XS1_RES_TYPE_TIMER
	in scratch, res[tmr]
	add scratch, scratch, timeout
	setd res[tmr], scratch
	setc res[tmr], XS1_SETC_COND_AFTER
	ldap scratch, out_nonblocking_handler
	setv res[tmr], scratch
	eeu res[tmr]
	// event enable
	setsr (1 << XS1_SR_EEBLE_SHIFT)
	out res[channel], data
	// out did not block or completed before the timer expired.
	// event disable
	clrsr (1 << XS1_SR_EEBLE_SHIFT)
	freer res[tmr]
	ldc r0, 0
	retsp 0
out_nonblocking_handler:
	// the timer expired before the out completed.
	// return instead of blocking.
	// event disable
	clrsr (1 << XS1_SR_EEBLE_SHIFT)
	freer res[tmr]
	ldc r0, 1
	retsp 0
#undef channel
#undef data
#undef timeout
#undef tmr
#undef scratch

Code: Select all

// channel-blocking.xc
// xcc -target=XCORE-200-EXPLORER channel-blocking.xc out-nonblocking.S -o channel-blocking.xe
// xsim channel-blocking.xe

#include <platform.h>
#include <print.h>
#include <xs1.h>

// returns 1 if the timeout expired before the out completed, 0 if the out completed in time
unsigned out_nonblocking(chanend channel, unsigned data, unsigned timeout);

void send_task(chanend c)
{
  while (1)
  {
    for (int i = 0; i < 32; ++i)
    {
      unsigned blocked = out_nonblocking(c, i, 100);
      printuintln(blocked);
    }
  }
}

void recv_task(chanend c)
{
  // alternate between blocking and not blocking
  while (1)
  {
    for (int i = 0; i < 32; i++)
    {
      unsigned n = inuint(c);
      //printuintln(n);
    }

    delay_ticks(4096);
  }
}

int main()
{
  chan c;

  par
  {
    on tile[0]:
    {
      send_task(c);
    }
    
    on tile[1]:
    {
      recv_task(c);
    }
  }
  return 0;
}
henk
Respected Member
Posts: 347
Joined: Wed Jan 27, 2016 5:21 pm

Post by henk »

Hi ers35,

Using timers is a great way of doing so if there is a fixed time-out period.

Two notes on your code.

The code uses event enable; this may not be appropriate if you are using events for other activities in the same thread. For example, your normal code may be doing a WAIT (an XC select { }). If you set the timer resource to produce interrupts instead of events, then you can use interrupts for the timer and normal events for everything else. Having said that, if you use interrupts there are a few complications to take care of: taking an interrupt may switch the thread from single to dual issue or vice versa (controlled by SR), and it will set some bits in SR. This is all detailed in the XS2 architecture manual. The good thing about using events with events enabled is that you can just continue as if nothing has happened with the RETSP, as your code shows.

Whether you use events or interrupts, the handler should input from the timer; that completes the event/interrupt sequence. The timer signals that it is ready, it raises the event or interrupt, then the program INs to complete it.
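
As a rough, untested sketch of that second note, the timeout handler from out-nonblocking.S could read the timer before freeing it. The sketch uses the raw register names from that listing (r3 for the timer, r11 for scratch) because the #defines are undone at the end of the file. Where exactly the IN goes is an assumption here and has not been checked on hardware or in the simulator.

```
// Revised timeout handler (sketch only).
// r3 = timer resource, r11 = scratch, as in out-nonblocking.S.
out_nonblocking_handler:
	// the timer expired before the out completed.
	// input from the timer to complete the event sequence;
	// the value read is not used.
	in r11, res[r3]
	// event disable
	clrsr (1 << XS1_SR_EEBLE_SHIFT)
	freer res[r3]
	// report the timeout to the caller
	ldc r0, 1
	retsp 0
```

Only the IN is new; the rest of the handler is unchanged, so the function still returns 1 on timeout.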

Cheers,
Henk