Making the most of the concurrency

BruceNaylor
Junior Member
Posts: 4
Joined: Tue Sep 07, 2010 5:46 pm

Post by BruceNaylor »

Well, I've tried, and so far failed.

Hooking up all cores to channels and passing a token in a round-robin doesn't make sense. The blocking while waiting for the next token makes a mockery of all this processing power. Every time I think about using an XMOS in a project, I end up using an ARM or a PIC because I can't seem to benefit from the concurrency available.

For instance, a really simple thing to want to do - have different cores perform some process and output an LED indication on the little demo PCB - forget the name, but it has 4 LEDs on one 4-bit port. After much head scratching I resorted to passing a token around the cores, so each core could modify a bit in the token, and a thread would output the token on the LED port. So each thread had to block waiting for the token to come around before it could go back to its task. Where's the concurrency here? I could do the same job a lot quicker and with a smaller memory footprint with a $2 PIC and interrupts.

I can see limited merit of concurrent statement execution, but I want to create state machines on different cores and have them all free-wheeling in parallel, and I just don't get it. And I can't seem to locate the info.

I write multi-threading/multi-core aware code in C++ on Intel processors for a day job; all I want to do is the same for embedded projects.

I've looked at the web, bought the XC manual, but am still none the wiser.

Please please point me in the right direction before I give up on this technology.

Bruce.


bsmithyman
Experienced Member
Posts: 126
Joined: Fri Feb 12, 2010 10:31 pm

Post by bsmithyman »

Hi Bruce,

Forgive me if I'm missing the subtleties, but it sounds like passing control tokens around in this way is the problem. If you start a variety of functions in a par {} block, they run in parallel without synchronizing; there's no need for additional work to get that to happen. In the multi-core case, the cores are completely independent unless you have them talk to each other. If you connect them in a ring topology with channels using blocking I/O, you're forcing them to synchronize and basically making the job sequential on the (4?) processes involved in the communications. Assuming you're talking about a 4-core chip, there's no reason the other ~28 processes can't be working away.

If you definitely want to use this kind of topology, you could always put the communications part of the code in a select statement and make the I/O non-blocking. You could also have a single reporter thread monitor the others, and wait on each of their channels (i.e. more like a star topology). It's also possible to do interesting things with messages in assembly that might allow you to send short messages via streaming channels (if they're smaller than the channel buffer, you should be able to do non-blocking writes).
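To make the star topology concrete, here is a minimal sketch. It is written in Go rather than XC, purely as an assumption for illustration: Go's channels and select follow the same CSP model, but none of this is XC syntax. The names update, worker, applyUpdate and reporter are hypothetical, and the "LED port" is just a callback. Each worker owns one LED bit and sends changes on its own channel; a single reporter waits on all four channels and is the only code that writes the 4-bit value.

```go
package main

import "fmt"

// update is a hypothetical message: which LED bit to change, and to what.
type update struct {
	bit uint
	on  bool
}

// worker stands in for an independent task. It reports its LED state
// n times (on, off, on, ...) and then closes its channel.
func worker(bit uint, n int, out chan<- update) {
	for i := 0; i < n; i++ {
		out <- update{bit: bit, on: i%2 == 0}
	}
	close(out)
}

// applyUpdate sets or clears one bit of the 4-bit LED value.
func applyUpdate(leds uint8, u update) uint8 {
	if u.on {
		return leds | 1<<u.bit
	}
	return leds &^ (1 << u.bit)
}

// reporter waits on all four worker channels at once and passes every
// new LED value to write. It returns the last value once all workers finish.
func reporter(chans [4]chan update, write func(uint8)) uint8 {
	var leds uint8
	open := len(chans)
	for open > 0 {
		var (
			u   update
			ok  bool
			idx int
		)
		select { // whichever worker is ready first gets served
		case u, ok = <-chans[0]:
			idx = 0
		case u, ok = <-chans[1]:
			idx = 1
		case u, ok = <-chans[2]:
			idx = 2
		case u, ok = <-chans[3]:
			idx = 3
		}
		if !ok {
			chans[idx] = nil // a nil channel is never ready, so this case drops out
			open--
			continue
		}
		leds = applyUpdate(leds, u)
		write(leds)
	}
	return leds
}

func main() {
	var chans [4]chan update
	for i := range chans {
		chans[i] = make(chan update)
		go worker(uint(i), 3, chans[i])
	}
	final := reporter(chans, func(v uint8) { fmt.Printf("LEDs: %04b\n", v) })
	fmt.Printf("final: %04b\n", final)
}
```

A worker still blocks for the moment it takes to hand over an update, but only until the reporter picks it up, not until a token has travelled all the way around a ring. The order of the intermediate values varies from run to run because the workers are independent.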

Can you give a bit more information about what jobs you're trying to do? There are times when it's just a case of the right tool for the job, and maybe a $2 PIC makes more sense. On the other hand, the XMOS chips are pretty powerful in the right application.

Cheers,
Brendan
Heater
Respected Member
Posts: 296
Joined: Thu Dec 10, 2009 10:33 pm

Post by Heater »

BruceNaylor,

Gosh, don't give up yet. It's much easier than you think.
BruceNaylor wrote: "I write multi-threading/multi-core aware code in C++ on Intel processors for a day job,"
I can see your problem right there. Generally, multi-threaded apps on multi-core processors have all the threads working in the same memory space, perhaps with mutexes, semaphores or whatever to stop them tripping over each other when accessing shared data structures.

The preferred model for XMOS programming with XC is "Communicating Sequential Processes". Basically threads don't share memory, they communicate with each other via communication channels. There is no need for mutexes, semaphores etc. In this way threads on a core are on the same footing as threads on different cores or even on different chips.

As you see, that presents the problem of how to have a thread listening on a channel whilst at the same time doing useful work.

The solution to that is the "select" statement. It can listen on multiple channels, I/O ports or timers, and whichever one is ready first is read and the processing continues accordingly. Basically "select" is waiting for "events", much as you would see in an event or signal driven system on a PC in C++, the Qt framework for example.

What if there is no channel or port input ready? No problem, just use the "default" case of select and it will fall through with no input and allow other processing to continue.
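Here is a rough sketch of that polling pattern. It is in Go rather than XC, which is an assumption made only for illustration: Go's select also has a default case with the same "fall through if nothing is ready" behaviour, but the syntax is not XC's. The function freeRun and the command strings are hypothetical. The loop checks a command channel without blocking, then gets on with one step of its own work.

```go
package main

import (
	"fmt"
	"time"
)

// freeRun keeps stepping a hypothetical state machine while polling cmds
// without blocking. It returns how many steps ran and which commands it
// handled, and stops when it receives "stop".
func freeRun(cmds <-chan string) (steps int, handled []string) {
	state := 0
	for {
		select {
		case c := <-cmds:
			// A command was waiting: handle it.
			handled = append(handled, c)
			if c == "stop" {
				return steps, handled
			}
		default:
			// Nothing waiting: fall through and carry on working.
		}
		state = (state + 1) % 4 // one step of the task's own work
		steps++
	}
}

func main() {
	cmds := make(chan string)
	go func() {
		// Another task sends a command some time later.
		time.Sleep(10 * time.Millisecond)
		cmds <- "stop"
	}()
	steps, handled := freeRun(cmds)
	fmt.Printf("ran %d steps before handling %v\n", steps, handled)
}
```

The step count depends on timing, so it differs between runs. The point is only that the task never sits idle waiting for a message.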

So, the answer to your problem is to read up on XC and "select" in particular.

Having said all that, if you really want to use traditional shared memory techniques between threads in a core you can always use C instead of XC.

Hope this points you in the right direction. Have fun.

Post by Heater »

BruceNaylor,

Take a look at the tutorial document for the XC-1A; it has an explanation and an example of the use of "select" in section 4.

Conceptually it works like "select" in Unix so I'm sure you will have no trouble with it.

Cheers.

Post by BruceNaylor »

Sorry for the late response to your replies - and thanks for the support within them.

I will have another crack with the XMOS, and try and follow through the 'select' concept.

My first "design" to get me going with the XMOS was simply one of getting different tasks to flash an LED. It didn't matter what the tasks were - I think they were just counters in the end, but they would have evolved into 4 x PWMs, a display driver, and a slave I2C port had all gone to plan. It burnt a weekend, and ended in frustration.

Only last week another job landed on my desk, and my instant reaction was "ideal for an XMOS", but I'm currently looking at using a CPLD tied to a PIC. The XMOS looks like it would be the cheaper option, and possibly the quicker to implement and control in production (only 1 set of firmware etc.), so yes, I'll have another crack at it.

Bruce.