FreeRTOS port

akp
Respected Member
Posts: 415
Joined: Thu Nov 26, 2015 11:47 pm

Re: FreeRTOS port

Postby akp » Tue Nov 05, 2019 5:05 pm

Changing the kernel assembly to dual issue on the XS2A seems to reduce the context switch time by roughly 15%.
akp
Respected Member
Posts: 415
Joined: Thu Nov 26, 2015 11:47 pm

Postby akp » Thu Dec 12, 2019 9:20 pm

Not surprisingly, rewriting the kernel assembly in dual-issue yields an improvement.
fabriceo
Active Member
Posts: 40
Joined: Mon Jan 08, 2018 4:14 pm

Postby fabriceo » Tue Dec 31, 2019 9:26 am

Hi, see this RTOS stack implementation of FreeRTOS for VocalFusion. Very active GitHub branch:
https://github.com/xmos/lib_rtos_support
akp
Respected Member
Posts: 415
Joined: Thu Nov 26, 2015 11:47 pm

Postby akp » Tue Dec 31, 2019 12:27 pm

Thanks, I will take a look.
akp
Respected Member
Posts: 415
Joined: Thu Nov 26, 2015 11:47 pm

Postby akp » Tue Dec 31, 2019 7:04 pm

I took a look at the FreeRTOS port and I wonder whether it supports dual-issue mode. It doesn't look like it would work if the FreeRTOS core were running dual-issue code; it might generate an exception when it returns from the kcall. With respect, the kernel assembly I wrote is faster (it takes advantage of dual-issue features to perform the context switch more quickly) and it supports context switching a FreeRTOS core running dual-issue code. The examples are compiled with -Os, so that means they're tested in single-issue mode.

EDIT: Obviously the big advantage of the XMOS FreeRTOS port is that it supports SMP whereas I can run only one FreeRTOS core per tile. So I am not pooh-poohing it. I just don't have a need for SMP FreeRTOS at present so optimizing the single core FreeRTOS for speed seemed to be a better option for me, leaving more MIPS and cores for xc tasks which is where most of my stuff gets done (e.g. time critical stuff or co-operative multitasking using combinable tasks).
mbruno
Posts: 9
Joined: Thu Aug 24, 2017 2:48 pm

Postby mbruno » Mon Mar 02, 2020 10:34 pm

Hi akp,

Thanks for these insights. I have updated our kernel assembly code to support yields from either single or dual issue code. I will update the task context switch code to utilize dual issue as well. Hopefully I will have this released publicly within a few days. I will post here again when it is ready.

Note that we have a single core port without the SMP kernel modifications here:
https://github.com/xmos/FreeRTOS/tree/release/xcore

This single core port will likely soon be integrated into the official FreeRTOS repository, so if you have any more suggestions let me know.

Thanks,
Mike
mbruno
Posts: 9
Joined: Thu Aug 24, 2017 2:48 pm

Postby mbruno » Tue Mar 03, 2020 9:27 pm

Hi akp,

I am wondering what exactly you rewrote in dual issue mode to achieve the 15% context switch speed up. The majority of the context switch assembly is a series of stw/ldw instructions which cannot be dual issued. The rest is primarily the call to vTaskSwitchContext which is written in C in the FreeRTOS file tasks.c, so this can be dual issued by compiling with -mdual-issue, but will not be hand optimized. So I have been able to set it up so that dual issue mode is enabled upon kernel entry, and I have everything compiled and assembled and running successfully with dual issue mode enabled everywhere. It just doesn't look like much, if any, of the kernel port assembly code can be sped up by rewriting it to take advantage of dual issue mode.

Are you using the configUSE_PORT_OPTIMISED_TASK_SELECTION option? We do have this on in our single core port (though not in SMP). This should reduce context switch time as well.
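For readers unfamiliar with that option, here is a minimal sketch of the general idea behind port-optimised task selection: ready priorities are tracked in a bitmap, and the highest one is found with a count-leading-zeros operation rather than a scan over the ready lists. The function names below are hypothetical and this is not the xcore port's actual code; the loop-based CLZ stands in for what would normally be a single instruction on the target.

```c
#include <stdint.h>

/* Hypothetical helpers illustrating bitmap-based priority selection.
   Assumption: at most 32 priority levels, one bit per level. */

/* Mark a priority level as having at least one ready task. */
static void record_ready_priority(uint32_t *ready_bitmap, unsigned priority)
{
    *ready_bitmap |= (UINT32_C(1) << priority);
}

/* Clear a priority level once its ready list becomes empty. */
static void reset_ready_priority(uint32_t *ready_bitmap, unsigned priority)
{
    *ready_bitmap &= ~(UINT32_C(1) << priority);
}

/* Portable count-leading-zeros; a real port would likely use one CLZ instruction. */
static unsigned count_leading_zeros32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & UINT32_C(0x80000000)) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Highest ready priority. The caller must ensure the bitmap is non-zero
   (in FreeRTOS the idle task normally guarantees this). */
static unsigned highest_ready_priority(uint32_t ready_bitmap)
{
    return 31u - count_leading_zeros32(ready_bitmap);
}
```

The selection cost is then independent of how many priority levels are configured, which is where the context switch saving would come from.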

The context switch time could be reduced by replacing the stw/ldw instructions with std/ldd instructions, but this requires that the stack pointer be at an 8 byte boundary upon kernel entry which cannot be guaranteed. I have been thinking about how I could force the alignment, but it seems like this will likely waste more cycles than it will save.
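To make the alignment constraint concrete, here is a small C sketch of the check and of the rounding that forcing alignment would involve. The helper names are hypothetical, and it assumes a stack that grows downward, so rounding the address down is what reserves space. It models the arithmetic only, not the kernel entry code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the address sits on an 8-byte (double word) boundary,
   which is what std/ldd would need. */
static bool is_double_word_aligned(uintptr_t addr)
{
    return (addr & (uintptr_t)7) == 0;
}

/* Round an address down to the previous 8-byte boundary.
   Assumption: the stack grows toward lower addresses. */
static uintptr_t align_down_8(uintptr_t addr)
{
    return addr & ~(uintptr_t)7;
}
```

Since tasks keep the SP word aligned, the adjustment is either 0 or 4 bytes, but the original SP then has to be saved somewhere so it can be restored on exit. That bookkeeping is the extra cost weighed against the cycles std/ldd would save.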

When comparing the assembly code in our XS2 port with the one I believe you have based yours on, I do note a couple significant differences. I'm comparing the following two files:

https://github.com/xmos/FreeRTOS/blob/r ... /portasm.S
https://github.com/BiancoZandbergen/XMO ... port_asm.S

The context save and restore code in ours is shared between interrupts and kcalls rather than duplicated, so this should reduce code size. And the bit of code that adds the context size to the stack pointer at the end of the restore just before the kret is done in only 1 instruction in our port rather than in 4, saving a small amount of both time and space.

Mike
akp
Respected Member
Posts: 415
Joined: Thu Nov 26, 2015 11:47 pm

Postby akp » Wed Mar 04, 2020 7:37 am

Hi Mike

Right. I meant I used the ldd / std instructions rather than ldw / stw instructions. So that's not dual issue, there's only a few instructions that are truly dual issue. But ldd / std does move two words per instruction rather than one.
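As a rough illustration of the "two words per instruction" point, the C model below saves twelve register values (standing in for r0-r11) into a context frame, once with one word per store and once with two words per store, and counts the store operations. It is a host-side sketch with hypothetical names, not xcore assembly, and it says nothing about real cycle counts.

```c
#include <stdint.h>
#include <string.h>

#define NUM_REGS 12 /* stands in for r0-r11 */

/* Model of stw-style saving: one word per store operation. */
static unsigned save_with_word_stores(const uint32_t regs[NUM_REGS],
                                      uint32_t frame[NUM_REGS])
{
    unsigned stores = 0;
    for (unsigned i = 0; i < NUM_REGS; i++) {
        frame[i] = regs[i];
        stores++;
    }
    return stores;
}

/* Model of std-style saving: two adjacent words per store operation. */
static unsigned save_with_double_word_stores(const uint32_t regs[NUM_REGS],
                                             uint32_t frame[NUM_REGS])
{
    unsigned stores = 0;
    for (unsigned i = 0; i < NUM_REGS; i += 2) {
        memcpy(&frame[i], &regs[i], 2 * sizeof(uint32_t));
        stores++;
    }
    return stores;
}
```

Both functions produce the same frame contents; the double-word version simply needs half as many store operations for the same twelve values.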

I will try to get my port together and post it up for you to look at. I don't think I implemented your optimization to add the context size in a single instruction. I will look at it.

Thanks
Akp
mbruno
Posts: 9
Joined: Thu Aug 24, 2017 2:48 pm

Postby mbruno » Wed Mar 04, 2020 2:36 pm

Great, thanks. I'm curious to see what you did. I did actually try using the ldd/std instructions for the r0-r11 registers a while back but quickly realized that it would occasionally crash with an ET_LOAD_STORE exception whenever a task entered the kernel while its SP was not at a double word boundary.

Mike
mbruno
Posts: 9
Joined: Thu Aug 24, 2017 2:48 pm

Postby mbruno » Wed Mar 04, 2020 10:06 pm

Please see the latest single core update here:
https://github.com/xmos/FreeRTOS/tree/6 ... af0028ba86

FreeRTOS 10.3.0 has been merged in and support for dual issue has been added. Compiling the application and/or kernel with -O2 or -mdual-issue works now without issue.

Entering the kernel via a yield (kcall) or interrupt now automatically enables dual issue mode. I was able to modify the context switch assembly code to dual issue 12 instructions which should shave off 6 cycles total. Not much, but better than nothing.

Mike
