XCore Architecture Block Diagram

pej02
Member
Posts: 11
Joined: Thu Jun 09, 2011 9:04 pm

Post by pej02 »

I'm trying to understand how an XCore works and have been trying to find an architecture block diagram to help me. Can anyone point me at one?

For reference, there's a nice one for the Propeller here: http://www.parallax.com/Portals/0/Image ... lock-L.jpg


leon_heller
XCore Expert
Posts: 546
Joined: Thu Dec 10, 2009 10:41 pm
Location: St. Leonards-on-Sea, E. Sussex, UK.

Post by leon_heller »

pej02
Member
Posts: 11
Joined: Thu Jun 09, 2011 9:04 pm

Post by pej02 »

I agree that the figure on the first page ticks the "block diagram" 'box' in its strictest sense. However, it does not go any way to explaining, in contrast to the Propeller block diagram, how an XCore works. For example, what is the mechanism by which each of the 8 threads gets access to the shared I/O pins?
Berni
Respected Member
Posts: 363
Joined: Thu Dec 10, 2009 10:17 pm

Post by Berni »

Well, it doesn't need any special mechanism to share I/O, since all threads run on the same physical CPU and use the same I/O registers. The 8 threads are simply pipelined along the CPU's 4-step instruction cycle, each using its own separate set of CPU registers so that they run completely independently of each other.
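
To make that concrete, here is a toy model in plain C (not XC, and not cycle-accurate). It assumes round-robin issue and that a thread can have only one instruction in the 4-stage pipeline at a time. The names hw_thread and run_core are made up for illustration, and the single r0 field stands in for a thread's private register set.

```c
#include <assert.h>

#define MAX_THREADS    8   /* hardware threads per core, as described above */
#define PIPELINE_DEPTH 4   /* the "4-step instruction cycle" */

typedef struct {
    unsigned r0;         /* stand-in for this thread's private registers */
    unsigned issued;     /* instructions this thread has issued so far */
    long     last_issue; /* cycle number of its most recent issue */
} hw_thread;

/* Run `cycles` clock cycles with `n` active threads.
   ASSUMPTION: round-robin issue; a thread may issue again only once its
   previous instruction has left the pipeline. */
void run_core(hw_thread t[], int n, long cycles)
{
    assert(n >= 1 && n <= MAX_THREADS);

    for (int i = 0; i < n; i++) {
        t[i].r0 = 0;
        t[i].issued = 0;
        t[i].last_issue = -PIPELINE_DEPTH;  /* free to issue at cycle 0 */
    }

    int next = 0;  /* thread whose turn it is */
    for (long c = 0; c < cycles; c++) {
        hw_thread *th = &t[next];
        if (c - th->last_issue >= PIPELINE_DEPTH) {
            th->r0 += (unsigned)next + 1;  /* touches only this thread's register */
            th->issued++;
            th->last_issue = c;
            next = (next + 1) % n;
        }
        /* otherwise the issue slot stays empty for this cycle */
    }
}
```

In this model, run_core(t, 8, 32) gives each of the 8 threads 4 instructions, while run_core(t, 2, 32) gives each of the 2 threads 8, because a thread cannot issue more often than once every 4 cycles. No thread ever sees another thread's r0.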

You start to get multiple CPUs with the multi-core chips. Those are basically almost independent MCUs linked by high-speed Xlinks, over which the CPUs can talk to each other.

So you can't really draw a diagram like that, because it works in a completely different way from a Propeller.
Heater
Respected Member
Posts: 296
Joined: Thu Dec 10, 2009 10:33 pm

Post by Heater »

Unlike the Propeller there are no shared I/O pins on the xcore.

You will have noticed that xcore pins are grouped into ports of 1, 2, 4, 8 or 16 pins at a time.
When one sets a port to input or output, all the pins in that port are set as input or output; you cannot set the direction of individual pins of a port as you can on the Propeller.

I'm pretty sure I am right in saying that no two threads in XC can access the same pin group (port) at the same time.
That is to say, you can't have one thread setting a couple of pins of an 8-bit port whilst another thread sets a different couple of pins on the same port.

So you really end up with ports (groups of pins) being dedicated to a particular thread. This is much the same as not allowing multiple XC threads to access shared global data.

Of course, when you get to using multiple cores you find that each core has its own pins. You cannot share pins or ports across cores like you can with the Propeller.
pej02
Member
Posts: 11
Joined: Thu Jun 09, 2011 9:04 pm

Post by pej02 »

Heater wrote: "Unlike the Propeller there are no shared I/O pins on the xcore."

In the XCORE-XS1-Architecture-Tutorial(1.1).pdf document it states on page 1/34, in paragraph 3:

"All threads share access to all other resources available on the core"
pej02
Member
Posts: 11
Joined: Thu Jun 09, 2011 9:04 pm

Post by pej02 »

Berni wrote: "Well it doesn't need any special mechanisms to share I/O since all threads run on the same physical CPU and use the same I/O registers. The 8 threads are simply pipelined along the CPU's 4 step instruction cycle while using separate CPU registers for each to make them run completely independent of each other."

Many thanks for this description. It is far more understandable to the layman than the other descriptions I have read in the architecture documentation linked from the XMOS website.

Berni wrote: "So you can't really draw a diagram like that because it works in a completely different way than a Propeller."

But would you agree that the existing XCore block diagram referenced above could be more explicit? Even including some independent CPU registers in each thread would be better than the current arrangement of anodyne coloured boxes.
leon_heller
XCore Expert
Posts: 546
Joined: Thu Dec 10, 2009 10:41 pm
Location: St. Leonards-on-Sea, E. Sussex, UK.

Post by leon_heller »

As it's usually programmed in XC and C, most users don't need details of the registers. The XS1-G4 product brief has a diagram that includes them.
Heater
Respected Member
Posts: 296
Joined: Thu Dec 10, 2009 10:33 pm

Post by Heater »

pej02,

pej02 wrote: "In the XCORE-XS1-Architecture-Tutorial(1.1).pdf document it states on page 1/34, in paragraph 3: All threads share access to all other resources available on the core"

Yes indeed, but how many errors are there in the code below?

#include <platform.h>
on stdcore[0] : port port_8 = XS1_PORT_8D;
char x;

void thread_a()
{
    port_8 <: 1;
    x++;
}

void thread_b()
{
    port_8 <: 2;
    x--;
}

int main(void)
{
    par
    {
        thread_a();
        thread_b();
    }
    return 0;
}

You cannot just share variables in memory in XC. You cannot just share ports in XC and so on.

Even if the XC language allowed port sharing between threads, I suspect the hardware would not like it. (Is that true, anyone?)

Even if the hardware liked it, two threads cannot use different pins of a multi-pin port independently as inputs or outputs.

This could be quite limiting in cases where, for example, you have an 8-pin port free but you want to implement a couple of independent drivers that each need a few inputs and outputs. Can't do it. The Propeller, for example, does allow this, as all pins are available to all cores all the time.
MaxFlashrom
Experienced Member
Posts: 82
Joined: Fri Nov 05, 2010 2:59 pm

Post by MaxFlashrom »

Hi, Heater asked about sharing ports and commented:

"Even if XC language would allow port sharing between threads I suspect the hardware does not like it (Is that true anyone?)
Even if the hardware liked it two threads cannot use different pins of a multi-pin port independently as inputs or outputs.
This could be quite limiting in cases where for example you have an 8 pin port free but you want to implement a couple of independent drivers that need a few inputs and outputs each."

I have written code to successfully share port32A on the L1-128 between multiple threads. As you observe, lines cannot be individually programmed as input or output. I imagine it's possible to have one thread use a whole port as input while another uses the whole port as output, but a far more likely case is that you will use groups of lines independently from multiple threads, either all as inputs or all as outputs. The XS1-L1 does not like resource sharing much: the XS1 Architecture manual says

"Resources are owned and used by a single thread. If multiple threads attempt to access the same resource within 4 cycles of each other, a Resource Dependency exception will be raised.

When ET_RESOURCE_DEP is raised et will be set to 9."

In general, trying to guess the timing dependency between threads such that a concurrent port access never occurs is a recipe for failure. It appears that port accesses separated by enough time do, indeed, work. If one is doing only reads, each thread need only read from the port and ignore any lines it's not interested in. For a write, the typical case is that a thread will want to change only the lines which interest it, without interfering with the state of those controlled by other threads. A thread must read the port, modify the value read, and write it out again. For this to work reliably, this read-modify-write operation must be atomic. (Don't panic, this has nothing to do with Avogadro, Blondie or nuclear physics ;-)
The XS1 architecture does not support single read-modify-write instructions to either memory or ports. These are frequently used to implement locks and semaphores. Fortunately it does provide hardware locks, channels and thread synchronisers.
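
As a rough illustration of why the read-modify-write has to be atomic, here is a plain C sketch (not XC) in which an ordinary integer stands in for the port value. The function names are invented for the example, and the "bad interleaving" is replayed by hand rather than produced by real threads.

```c
#include <assert.h>
#include <stdint.h>

/* Bits selected by `mask` come from `value`; all other bits keep
   their `current` state. This is the "modify" step. */
uint32_t merge_lines(uint32_t current, uint32_t mask, uint32_t value)
{
    return (current & ~mask) | (value & mask);
}

/* Replay of a bad interleaving: both threads read before either writes. */
uint32_t interleaved_update(uint32_t port,
                            uint32_t mask_a, uint32_t val_a,
                            uint32_t mask_b, uint32_t val_b)
{
    assert((mask_a & mask_b) == 0);  /* each thread owns different lines */

    uint32_t seen_by_a = port;                      /* thread A reads */
    uint32_t seen_by_b = port;                      /* thread B reads the same value */
    port = merge_lines(seen_by_a, mask_a, val_a);   /* A writes */
    port = merge_lines(seen_by_b, mask_b, val_b);   /* B writes and discards A's change */
    return port;
}

/* The same two updates with each read-modify-write completed in turn. */
uint32_t serialised_update(uint32_t port,
                           uint32_t mask_a, uint32_t val_a,
                           uint32_t mask_b, uint32_t val_b)
{
    assert((mask_a & mask_b) == 0);

    port = merge_lines(port, mask_a, val_a);
    port = merge_lines(port, mask_b, val_b);
    return port;
}
```

Starting from 0x00 with thread A owning the low nibble (writing 0x05) and thread B the high nibble (writing 0xA0), the interleaved version ends at 0xA0 and A's lines are lost; the serialised version ends at 0xA5.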

Hardware synchronisers, channels, and shared memory flags can ensure thread rendezvous at fixed points. One can use this as a safe way to synchronise concurrent threads that subsequently communicate large buffers via shared memory, for instance. Although shared memory flags work, as memory reads and writes are atomic at word level, they require polling, which is inelegant and wasteful of cycles that would otherwise be available to other threads. Channels and, I presume, synchronisers put threads into a sleeping state until they can continue.

I used hardware locks to enable exclusive atomic read-write access to the port. The comment above about resource sharing appears not to apply to lock acquisition, which is fortunate, as this would defeat their entire purpose. While this scheme works, it is not without issues. As port access is then no longer thread-exclusive, it's difficult to reason about the exact timing of port updates. For software-implemented hardware I/O this can lead to jitter in the signal output timings. For some SPI implementations, UARTs or LED indicators this may not be a problem; it's certainly better than wasting loads of output lines.
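
The shape of that scheme can be sketched on a PC in plain C. This is not the XC code described above: a pthread mutex stands in for an XS1 hardware lock, a global variable stands in for the port, and port_update, driver and run_two_drivers are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static uint32_t port_latch;  /* stand-in for the shared output port */
static pthread_mutex_t port_lock = PTHREAD_MUTEX_INITIALIZER;  /* stand-in for a hardware lock */

/* Change only the lines in `mask`, holding the lock across read, modify and write. */
void port_update(uint32_t mask, uint32_t value)
{
    pthread_mutex_lock(&port_lock);
    uint32_t v = port_latch;            /* read */
    v = (v & ~mask) | (value & mask);   /* modify */
    port_latch = v;                     /* write */
    pthread_mutex_unlock(&port_lock);
}

typedef struct { uint32_t mask; unsigned shift; unsigned count; } driver_args;

/* One "driver" thread: writes 0, 1, 2, ... count (low 4 bits) to its own nibble. */
static void *driver(void *p)
{
    const driver_args *a = p;
    for (unsigned i = 0; i <= a->count; i++)
        port_update(a->mask, (i & 0xFu) << a->shift);
    return NULL;
}

/* Run two drivers on disjoint nibbles and return the final port value. */
uint32_t run_two_drivers(unsigned count)
{
    driver_args lo = { 0x0Fu, 0, count };
    driver_args hi = { 0xF0u, 4, count };
    pthread_t ta, tb;

    port_latch = 0;
    int rc = pthread_create(&ta, NULL, driver, &lo);
    assert(rc == 0);
    rc = pthread_create(&tb, NULL, driver, &hi);
    assert(rc == 0);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return port_latch;
}
```

Because every update holds the lock for the whole read-modify-write, the final value depends only on each driver's last write, whatever the interleaving. The order in which the two drivers get the lock is still not predictable, which is the timing jitter mentioned above.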

This method of port sharing is not in the spirit of the XS1 design, though, and one may wish, instead, to consider a solution where one of one's threads accesses the port while listening on a select statement for requests delegated to it from other threads over channels.
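
A rough PC-side model of that alternative, again in plain C rather than XC, might look like the following. It is a simplification under stated assumptions: a single mutex-and-condition-variable FIFO stands in for the channels (real XC code would have one channel end per client and a select over them), and every name here (request, channel, port_server, client, run_port_server) is invented for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define QUEUE_LEN 8

typedef struct { uint32_t mask, value; int stop; } request;

typedef struct {                       /* hypothetical stand-in for XC channels */
    request slots[QUEUE_LEN];
    unsigned head, tail, used;
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
} channel;

static channel ch = { .mu = PTHREAD_MUTEX_INITIALIZER,
                      .not_empty = PTHREAD_COND_INITIALIZER,
                      .not_full = PTHREAD_COND_INITIALIZER };

static void send_request(channel *c, request r)
{
    pthread_mutex_lock(&c->mu);
    while (c->used == QUEUE_LEN)                  /* sleep while the queue is full */
        pthread_cond_wait(&c->not_full, &c->mu);
    c->slots[c->tail] = r;
    c->tail = (c->tail + 1) % QUEUE_LEN;
    c->used++;
    pthread_cond_signal(&c->not_empty);
    pthread_mutex_unlock(&c->mu);
}

static request receive_request(channel *c)
{
    pthread_mutex_lock(&c->mu);
    while (c->used == 0)                          /* sleep until a client sends */
        pthread_cond_wait(&c->not_empty, &c->mu);
    request r = c->slots[c->head];
    c->head = (c->head + 1) % QUEUE_LEN;
    c->used--;
    pthread_cond_signal(&c->not_full);
    pthread_mutex_unlock(&c->mu);
    return r;
}

/* Only this thread ever touches the port value, so the port needs no lock. */
static void *port_server(void *out)
{
    uint32_t port = 0;
    for (;;) {
        request r = receive_request(&ch);
        if (r.stop)
            break;
        port = (port & ~r.mask) | (r.value & r.mask);
    }
    *(uint32_t *)out = port;                      /* report the final state */
    return NULL;
}

/* A client owns one nibble and asks the server to set it to 0, 1, ... 9. */
static void *client(void *arg)
{
    unsigned shift = *(unsigned *)arg;
    for (uint32_t i = 0; i <= 9; i++)
        send_request(&ch, (request){ 0xFu << shift, i << shift, 0 });
    return NULL;
}

uint32_t run_port_server(void)
{
    pthread_t server, a, b;
    unsigned shift_a = 0, shift_b = 4;
    uint32_t final_port = 0;

    int rc = pthread_create(&server, NULL, port_server, &final_port);
    assert(rc == 0);
    rc = pthread_create(&a, NULL, client, &shift_a);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, client, &shift_b);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    send_request(&ch, (request){ 0, 0, 1 });      /* all clients done: stop the server */
    pthread_join(server, NULL);
    return final_port;
}
```

The point of the design is that the port has exactly one owner, so the read-modify-write problem disappears; the cost is one thread spent on serving requests and the latency of a message per update.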

Hope this is useful
Max.