As the XCore can run 8 threads simultaneously but only 4 at "full speed", I'm wondering if there is some method to make one thread "have higher priority" than the others.
I found set_thread_fast_mode_on/off() but I'm not sure that's what I am looking for.
For example, I have 5 threads running on a single XCore and would like to make sure the one "high priority thread" never gets "paused" to give its time slot to one of the remaining 4 threads. Instead, some other "lower priority thread" should share its time with the others.
I hope I was understandable enough.
regards,
m.culibrk
Thread priority - is there any option?
-
- Active Member
- Posts: 38
- Joined: Tue Jul 13, 2010 2:57 pm
-
- Respected Member
- Posts: 296
- Joined: Thu Dec 10, 2009 10:33 pm
As far as I understand threads are scheduled on an instruction by instruction basis.
So no matter what is going on if you have 5 threads each one of them gets to complete an instruction every 5 instruction cycles. There will never be pauses longer than that for any of them. (unless of course they are waiting on an event).
-
- Active Member
- Posts: 38
- Joined: Tue Jul 13, 2010 2:57 pm
Yeah... I understand that...
But, as you say, there are 5 threads and a round-robin scheduler is used so each thread will get its chance... but is there any setting/option/way to tell the scheduler not to suspend/skip the "high priority one" and rather suspend some other thread and give its slice to the others?
If I have really tight timing requirements this "skipping" will introduce some "glitches" in the thread execution... ok, I could synchronize on some event/message/clock but is there any way to instruct the thread not to "skip" in favor of some other thread?
regards,
m.culibrk
-
- Respected Member
- Posts: 296
- Joined: Thu Dec 10, 2009 10:33 pm
As far as I know there is no such way to prioritize a thread.
Let's look at this a bit. I have just been timing some code here. I have one thread running the task to be timed and then I start from 1 to 7 other threads to see how that slows the timed thread. Here are the results:
THREADS  NANOSECONDS  SLOW DOWN(%)
1        436067       0
2        436067       0
3        436067       0
4        436067       0
5        545084       25
6        654101       50
7        763117       75
8        872134       100
The results are as expected. With 1 to 4 threads each thread runs at the maximum execution speed of one instruction every 4 instruction cycles. Adding a fifth thread adds another instruction cycle to the execution time and gives a slow down of 25%. And so on for 6, 7 and 8 threads.
So a few observations:
If you have up to 4 threads, execution of all of them is pretty much 100% deterministic. No matter if a thread waits on a timer or I/O event, the other threads will not speed up whilst it is "sleeping".
If you have 5 or more threads then execution timing determinism goes out the window. With 5 threads, if one of them "sleeps", waiting for a timer say, the other 4 threads will speed up by 25%. In this way threads going in and out of the waiting state modulate the execution speed of all the running threads.
Perhaps this is the source of the "glitches" you refer to.
The only way to avoid these glitches is to:
a) Only have 4 threads.
b) Make use of the timers, clocked I/O etc hardware features of the chip to ensure accurate timing of your code.
I believe option b) is the preferred solution.
However if you say that you have 5 threads but you really need to guarantee one of those gets maximum execution speed all the time, you have a problem. The only way to go that I can see in that case is to combine two of your threads, thus ensuring you only have 4 threads and they all run at maximum speed.
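The measured times in this post fit a simple model, sketched here in Python purely as arithmetic (it is not XC and does not run on the chip). It assumes, as described above, that a thread issues one instruction every max(N, 4) cycles when N threads are running. The names PIPELINE_SLOTS, BASE_NS, cycles_per_instruction, predicted_ns and slowdown_percent are made up for this illustration.

```python
PIPELINE_SLOTS = 4  # assumption from the thread: up to 4 threads run at full speed
BASE_NS = 436067    # measured time with 1 to 4 threads, taken from the table


def cycles_per_instruction(active_threads):
    """Issue cycles between two instructions of the same thread."""
    return max(active_threads, PIPELINE_SLOTS)


def predicted_ns(active_threads):
    """Predicted run time of the timed task, scaled from the 4-thread figure."""
    return BASE_NS * cycles_per_instruction(active_threads) / PIPELINE_SLOTS


def slowdown_percent(active_threads):
    """Extra execution time relative to the full-speed case, in percent."""
    extra = cycles_per_instruction(active_threads) - PIPELINE_SLOTS
    return 100 * extra // PIPELINE_SLOTS


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, predicted_ns(n), slowdown_percent(n))
```

The predicted times agree with the measured ones to within a nanosecond, and the same model gives the 25% speed-up when a fifth thread goes to sleep (5 cycles per instruction drops back to 4).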
-
- XCore Addict
- Posts: 169
- Joined: Fri Jan 08, 2010 12:13 am
By the way 'fast mode' is an option that makes the chip schedule an instruction for every thread, all the time.
So for example, when a thread is waiting, the waiting instruction will be reissued on every cycle, rather than the thread really pausing. The reason this is called 'fast mode' is that it can reduce the latency in responding to events.
The effect is best seen in the simulator trace...
Paul
On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
-
- Active Member
- Posts: 38
- Joined: Tue Jul 13, 2010 2:57 pm
Thanks for all your comments!
@Heater
Yes, I understand your "demonstration" and it's generally what I expected... but I was hoping there is some "(un)documented parameter" to somehow influence the scheduler...
@Paul
so this "fast mode" just issues "NOP"s instead of actually suspending a thread, right?
but this really does not help as the "high prio" thread does not "sleep/suspend" by itself so its cycles would still be "lost" on other threads.
Yep, I can only limit my code to 4 threads and eventually use more XS chips...
ehm... slowly (but "painfully") learning about XMOS/XC gotchas and quirks.... and observing the increasing number of grey/pulled hair on my head.... :shock:
-
- Respected Member
- Posts: 296
- Joined: Thu Dec 10, 2009 10:33 pm
paul,
Interesting. How does one enable this "fast mode"?
Can I assume this:
If fast mode is in use all threads are effectively executing all the time. No matter what they may be waiting on.
Therefore if I have 5 threads, say, they all get one fifth of the available time, even if one of them is waiting.
If I have one thread that is waiting the majority of the time it effectively slows all the other threads all the time.
Perhaps it should be called "slow mode" and should be used sparingly.
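The assumption in this post can be put into numbers with a small Python sketch. It is only a model of what is described in this thread (a waiting thread in fast mode keeps having its instruction issued, so it keeps taking a slot); it is not confirmed hardware behaviour. The function share_of_cycles and its parameters are hypothetical names for this illustration.

```python
PIPELINE_SLOTS = 4  # assumption: up to 4 threads run at full speed


def share_of_cycles(running, waiting, fast_mode):
    """Fraction of instruction cycles each running thread receives.

    running   -- number of threads that are executing instructions
    waiting   -- number of threads paused on a timer or I/O event
    fast_mode -- True if the waiting threads have fast mode on, so their
                 waiting instruction keeps being issued (assumed behaviour)
    """
    scheduled = running + (waiting if fast_mode else 0)
    return 1 / max(scheduled, PIPELINE_SLOTS)


if __name__ == "__main__":
    # 4 busy threads plus 1 thread that is waiting most of the time
    print(share_of_cycles(4, 1, fast_mode=False))
    print(share_of_cycles(4, 1, fast_mode=True))
```

Under this model, 4 busy threads plus 1 waiting thread each get a quarter of the cycles normally, but only a fifth when the waiting thread is in fast mode, which is the "slow mode" effect suspected above.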
-
- Respected Member
- Posts: 377
- Joined: Thu Dec 10, 2009 6:07 pm
I think it would be fairer to say "fast response" mode, as it enables threads, on average, to potentially shave off a few processor cycles responding to an incoming event.
-
- Member++
- Posts: 28
- Joined: Thu Dec 10, 2009 7:25 pm
mculibrk wrote: so this "fast mode" just issues "NOP"s instead of actually suspending a thread, right?
but this really does not help as the "high prio" thread does not "sleep/suspend" by itself so its cycles would still be "lost" on other threads.
It re-issues the same instruction each time, so that an I/O instruction can keep issuing until it succeeds as quickly as possible.
mculibrk wrote: Yep, I can only limit my code to 4 threads and eventually use more XS chips...
If you can split the workload between multiple threads and still get the performance you need (which may not be 100MIPS, say), then you can use more than 4 threads and cope with the slowdown. If all threads need full performance individually then you would indeed have to limit yourself to 4 threads to avoid the reduction in per-thread performance.
-
- XCore Addict
- Posts: 169
- Joined: Fri Jan 08, 2010 12:13 am
Heater wrote: paul,
Interesting. How does one enable this "fast mode".
It is enabled on a per-thread basis, using the following function calls:
set_thread_fast_mode_on()
set_thread_fast_mode_off()
Paul