PS. It's very interesting, kster!
But when using feed-forward solutions without feedback, I'm not sure that the benefit of running at the maximum frequency outweighs the growing error on each edge, since the transistors are not perfect and do not provide infinite bandwidth.
JAES fellow Malcom has done a lot of work during his life:
Check out http://www.essex.ac.uk/csee/research/au ... tions.html if anyone has missed it
Hmm, I should read this one myself.
http://www.essex.ac.uk/csee/research/au ... lifier.pdf
DS.
Most Efficient Way to Generate Varying Duty Cycle and PWM
-
- XCore Expert
- Posts: 956
- Joined: Fri Dec 11, 2009 3:53 am
- Location: Sweden, Eskilstuna
Probably not the most confused programmer anymore on the XCORE forum.
-
- XCore Expert
- Posts: 956
- Joined: Fri Dec 11, 2009 3:53 am
- Location: Sweden, Eskilstuna
Woody wrote:You may find that you get better results by using timestamping. Each port has a timer which is incremented every time it receives a clock pulse. Timestamping allows you to specify the exact time (clock number) that you want a signal transition to occur on.
Code: Select all
int portTime;
pwmPort <: 0 @ portTime; // Find out the current port time
portTime += 20;
pwmPort @ portTime <: 1;
portTime += 60;
pwmPort @ portTime <: 0;
portTime += 47;
pwmPort @ portTime <: 1;
portTime += 33;
pwmPort @ portTime <: 0;
As you can see from this example you can first issue an output and read the time that it occurred at, then you can set a time in the future when you want the next output to occur and then schedule that.
See section 4.3 "Performing I/O on Specific Clock Edges" of Programming XC on XMOS Devices for more details: http://www.xmos.com/support/documentation
What is the frequency of the clock when it's clocked internally? Does it use the 400 MHz clock? If so, what about a 500 MHz L device?
For example, how does (t + 8) correspond to 12 ns in this XMOS module?
Code: Select all
// read with address.
p_sram_addr <: Adrs @ t;
// read data with 12 ns access time.
p_sram_data @ (t + 8) :> Result;
-
- XCore Addict
- Posts: 165
- Joined: Wed Feb 10, 2010 2:32 pm
lilltroll wrote:What is the frequency of the clock when it's clocked internally? Does it use the 400 MHz clock ? If so, what about a 500 MHz L device.
If ports are clocked internally they use the reference clock. This is 100MHz*. Note that the reference clock is also used for the timers.
*There are occasionally times when you may want the ref. clock to differ from 100MHz. This can be achieved via the .xn file (see the 'XS1-? Clock Frequency Control' documents at http://www.xmos.com/support/documentation for details).
Note that there may be knock-on effects of changing the ref. clock because code blocks may assume a 100MHz timer.
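To relate the port-time increments in the timestamping example to real time, here is a small host-side sketch in plain C (not XC, so it can be run off-target). It assumes the default 100 MHz reference clock described above and a 16-bit port counter that wraps around; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Assumed default reference clock; adjust if the .xn file overrides it. */
#define REF_CLOCK_MHZ 100u

/* Port counters are 16 bits wide, so a scheduled time wraps modulo 65536. */
uint16_t port_time_after(uint16_t now, uint16_t ticks)
{
    return (uint16_t)(now + ticks);
}

/* Convert a number of port-clock ticks to nanoseconds. */
uint32_t ticks_to_ns(uint32_t ticks, uint32_t clock_mhz)
{
    return (ticks * 1000u) / clock_mhz;
}
```

Under those assumptions, the increments 20, 60, 47 and 33 in the quoted example would put 200 ns, 600 ns, 470 ns and 330 ns between successive edges.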
lilltroll wrote:For an example, how does (t+8) correlate to 12 ns in this XMOS module ?
Code: Select all
// read with address.
p_sram_addr <: Adrs @ t;
// read data with 12 ns access time.
p_sram_data @ (t + 8) :> Result;
For +8 to correspond to a 12ns delay, a ref clock with a period of 1.5ns would be required (666MHz). This is out of range for an XS1 device, so I suspect that the comment is either invalid or out of context. Where did you get the code?
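The arithmetic behind the 1.5ns / 666MHz figure can be checked with a few lines of plain C. This is only a sketch of the calculation; the function names are invented here and integer picoseconds are used to keep the numbers exact.

```c
#include <stdint.h>

/* Clock period (in ps) needed for `ticks` port clocks to span `delay_ps`. */
uint32_t required_period_ps(uint32_t delay_ps, uint32_t ticks)
{
    return delay_ps / ticks;
}

/* Frequency in kHz for a given period in ps (a 1 kHz clock has a 1e9 ps period). */
uint32_t period_ps_to_khz(uint32_t period_ps)
{
    return 1000000000u / period_ps;
}
```

For 12 ns spread over 8 ticks this gives a 1500 ps period, i.e. roughly 666 MHz, which matches the figure above.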
-
- XCore Expert
- Posts: 956
- Joined: Fri Dec 11, 2009 3:53 am
- Location: Sweden, Eskilstuna
Woody wrote:For +8 to correspond to a 12ns delay a ref clock with a period of 1.5ns would be required (666MHz). This is out of range for an XS1 device, so I suspect that the comment is either invalid or out of context. Where did you get the code?
It's from the SRAM module http://www.xmos.com/applications/memory/sram-controller
I wanted to see if I could come closer to 50 Mreads/s, i.e. 50 Mbytes/s.
Also check this thread: http://www.xcore.com/forum/viewtopic.php?f=15&t=512
-
- XCore Addict
- Posts: 165
- Joined: Wed Feb 10, 2010 2:32 pm
That comment is wrong. It is really an 80ns access (8*100MHz cycles). Thanks for pointing it out, I'll put a bug on the comments in that code.
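As a rough check of what an 80ns access means for the 50 Mreads/s target mentioned earlier, here is a plain C sketch. It assumes the 100MHz reference clock and treats the wait between address output and data input as the only cost of a read, which is a simplification; the function names are hypothetical.

```c
#include <stdint.h>

/* Duration in ns of `ticks` port clocks at `clock_mhz`. */
uint32_t access_time_ns(uint32_t ticks, uint32_t clock_mhz)
{
    return (ticks * 1000u) / clock_mhz;
}

/* Upper bound on reads per second if every read costs `access_ns` and nothing else. */
uint32_t max_reads_per_sec(uint32_t access_ns)
{
    return 1000000000u / access_ns;
}

/* Port clocks available per read for a target rate given in millions of reads per second. */
uint32_t ticks_per_read(uint32_t clock_mhz, uint32_t target_mreads)
{
    return clock_mhz / target_mreads;
}
```

With these numbers, 8 ticks at 100MHz is 80ns, or at most 12.5 Mreads/s; reaching 50 Mreads/s would leave only 2 ticks per read.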
-
- XCore Legend
- Posts: 1274
- Joined: Thu Dec 10, 2009 10:20 pm
@infiniteimprobability
I am curious why, in your chart, using the buffered serialised port is more expensive thread-wise (3 times more, to be specific). Can you explain that?
Also I'm interested in seeing what happens when you increase the frequency resolution up to 100s of kHz; how does it then change the table balance? I assume some of these methods duck out when you hit certain frequencies and multiple instances?
regards
Al
-
- XCore Legend
- Posts: 1126
- Joined: Thu May 27, 2010 10:08 am
Folknology wrote:I am curious why in your chart using the buffered serialised port is more expensive thread wise, 3 times more to be specific can you explain that.
I'll have a go. The buffered serialised method basically requires you to shovel the next 32b of data before the buffer empties. If you are running at 12b (2^12=4096), 20kHz then this period will be:
(1/20E3) / 4096 * 32 = 390ns. Running the thread at 50MHz (20ns instruction time) means you've 19 cycles to work out whether to transmit all zeros (0x00000000), all ones (0xFFFFFFFF) or something in between from a lookup table of the 30 transition patterns.
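The budget above can be reproduced with a short plain C sketch. The function and parameter names are made up for illustration, and the 20ns instruction time is simply the figure assumed in this post.

```c
#include <stdint.h>

/* Time in ns to shift out one buffered word of `word_bits` bits,
   for a PWM of `pwm_hz` with `resolution_bits` of resolution. */
uint32_t word_period_ns(uint32_t pwm_hz, uint32_t resolution_bits, uint32_t word_bits)
{
    uint64_t bit_rate = (uint64_t)pwm_hz << resolution_bits; /* output bits per second */
    return (uint32_t)(((uint64_t)word_bits * 1000000000ull) / bit_rate);
}

/* Whole instructions that fit into one word period. */
uint32_t instructions_per_word(uint32_t period_ns, uint32_t instr_ns)
{
    return period_ns / instr_ns;
}
```

For 20kHz at 12b with 32b words this gives 390ns, and at 20ns per instruction that is 19 instructions per word.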
You've also got to take care of updating the duty register although shared memory will do you favours here.
So as a rough guess (haven't done the calcs), you can probably only get away with 2 outputs per thread running at full pelt. Relaxing the PWM frequency or resolution would change things..
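The per-word decision described above (all zeros, all ones, or a transition pattern) could look roughly like this in plain C. This sketch computes the pattern with a shift instead of the lookup table mentioned above, and it assumes the port shifts out the least significant bit first and that the output is high at the start of each PWM period; `pwm_word` and its parameters are hypothetical names.

```c
#include <stdint.h>

/* Return the 32-bit word for chunk `word_index` of a PWM period.
   `duty` is the number of bit-times the output stays high, counted
   from the start of the period. Assumes LSB-first serialisation. */
uint32_t pwm_word(uint32_t duty, uint32_t word_index)
{
    uint32_t start = word_index * 32u;   /* first bit-time covered by this word */
    uint32_t ones;

    if (duty <= start)
        return 0x00000000u;              /* falling edge already passed */

    ones = duty - start;
    if (ones >= 32u)
        return 0xFFFFFFFFu;              /* falling edge lies in a later word */

    return (1u << ones) - 1u;            /* the edge falls inside this word */
}
```

A duty of 40 bit-times, for example, would give 0xFFFFFFFF for the first word, 0x000000FF for the second and zero for the rest of the period.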
I've seen code that manages 6 outputs per thread using this method, but it does it differently by having a server thread which just shovels the data and the client thread (which updates the PWM duty) does the hard work. Nice idea....
Folknology wrote:Also I'm interested in seeing what happens when you increase the frequency resolution up to 100s of kHz, then how does it change the table balance? I assume some of these methods duck out when you hit certain frequencies and multiple instances
Well, it all comes down to how many cycles you have: doubling the frequency would require a drop in resolution of one bit, so 160kHz should be doable at 9b, 320kHz at 8b and so forth.
What requires such high PWM frequency?
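The frequency versus resolution trade-off can be sketched in plain C by holding the output bit rate fixed at the 20kHz, 12b case (20000 * 4096 bits per second). The function name is invented for illustration.

```c
#include <stdint.h>

/* Largest resolution in bits such that pwm_hz * 2^bits does not exceed bit_rate_hz. */
uint32_t max_resolution_bits(uint32_t bit_rate_hz, uint32_t pwm_hz)
{
    uint32_t bits = 0;

    if (pwm_hz == 0)
        return 0;                        /* avoid looping forever on a zero frequency */

    while (((uint64_t)pwm_hz << (bits + 1)) <= bit_rate_hz)
        bits++;

    return bits;
}
```

With a bit rate of 81920000, this returns 12 bits at 20kHz, 9 bits at 160kHz and 8 bits at 320kHz, in line with the figures above.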
-
- XCore Legend
- Posts: 1274
- Joined: Thu Dec 10, 2009 10:20 pm
Interesting stuff and definitely worth more investigation, particularly where to put which pieces in the control and drive parts of the closed loop and still get good thread value.
Quote:I've seen code that manages 6 outputs per thread using this method, but it does it differently by having a server thread which just shovels the data and the client thread (which updates the PWM duty) does the hard work. Nice idea....
I think this idea makes a lot of sense, with the server being the motor driver and the controller handling more complex things like the feedback transforms, which it needs the threads for, etc.
Quote:What requires such high PWM frequency?
It's more of a mathematical and performance curiosity than a practical application from a motor POV, as most high-speed units top out at around 50 kHz. Obviously getting above the audio range is beneficial, but going higher will likely degrade rather than benefit performance. I suppose it could be useful for other applications such as piezoelectric motor/driving etc.
regards
Al
Last edited by Folknology on Wed Jun 23, 2010 5:44 pm, edited 1 time in total.
-
- XCore Addict
- Posts: 162
- Joined: Thu Dec 31, 2009 8:51 am
Those 32-bit buffered ports are supposed to be double buffered.
At 100 MHz output, you only need to write once every 32 operations.
Supposing I'm at 400 MHz, I should be able to do 32*4 operations between writes.
So I can do:
for loop
porta <: mynumber;
portb <: mynumber;
portc <: mynumber;
portd <: mynumber;
and have time to spare with a bunch of calculations since it should only block when the buffer is full (which happens only once in 32 operations).
I currently have PWM code running in 1 thread that can update at least 8 motors with 8-bit PWM at 100 MHz while computing the next value to write in the same thread on the fly.
Or am I missing something?
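The "operations between writes" estimate depends on how many instructions per second the thread actually issues, so this plain C sketch treats that rate as a parameter rather than assuming a value. The function name is hypothetical.

```c
#include <stdint.h>

/* Instructions a thread can issue while one buffered word is shifted out.
   word_bits:   bits per buffered write (32 here)
   port_mhz:    port output bit rate in MHz
   thread_mips: instruction rate available to the thread, in millions per second */
uint32_t instructions_between_writes(uint32_t word_bits, uint32_t port_mhz, uint32_t thread_mips)
{
    return (word_bits * thread_mips) / port_mhz;
}
```

The 32*4 figure above corresponds to a thread issuing 400 million instructions per second; with the 50MHz thread rate assumed earlier in the thread, the same formula gives 16 per word, and that budget is shared between all the ports serviced in the loop.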
-
- XCore Expert
- Posts: 956
- Joined: Fri Dec 11, 2009 3:53 am
- Location: Sweden, Eskilstuna
Has no-one applied dither to PDM?
I took a very quick look at the "Class D Audio Power Amplifier".
Shouldn't a first-order sigma-delta look something like this:
Code: Select all
#include <xs1.h>

void speaker(streaming chanend c_in, out buffered port:1 p, clock clk) {
    const unsigned short delay = 35;   // port clocks between output bits
    unsigned short time = 0;           // 16-bit port time, wraps naturally
    int x, y;
    unsigned dither = 1;               // CRC state used as dither source (arbitrary nonzero seed)
    int qe = 0;                        // accumulated quantisation error
    set_clock_ref(clk);
    configure_port_clock_output(p, clk);
    configure_out_port_no_ready(p, clk, 0);
    start_clock(clk);
    while (1) {
        c_in :> x;
        for (int i = 0; i < 64; i++) { // fs=44.6 kHz Use maximum oversampling
            if (x >= qe) {
                y = 65536;  time += delay; p @ time <: 1;
            } else {
                y = -65536; time += delay; p @ time <: 0;
            }
            crc32(dither, x, 0xEB31D82E); // Magic poly
            qe = qe + y - x + (dither >> 25);
        }
    }
}
The for loop is just a very ugly oversampling for testing, but there is a reason why I use ±2^16 and not ±2^15, due to the nature of PDM. The value 25 controls the amount of applied dither. Even the XC-1 speaker will sound nicer with dither. (The use of x in the CRC32 is overkill.)
PS. I am testing a 5th-order noise shaper, since the demo code uses 8 times oversampling. DS
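To check the loop logic off-target, here is a plain C sketch of the same first-order error-feedback loop with the port output replaced by a counter and the dither left out. It is only a host-side approximation of the XC code above; `sigma_delta_ones` is a made-up name.

```c
#include <stdint.h>

/* Run the first-order loop for `steps` output bits with a constant input x
   (expected range -65536..65536) and return how many of those bits were 1. */
uint32_t sigma_delta_ones(int32_t x, uint32_t steps)
{
    int32_t qe = 0;      /* accumulated quantisation error */
    uint32_t ones = 0;

    for (uint32_t i = 0; i < steps; i++) {
        int32_t y;
        if (x >= qe) {
            y = 65536;   /* output bit 1 */
            ones++;
        } else {
            y = -65536;  /* output bit 0 */
        }
        qe += y - x;     /* feed the quantisation error back */
    }
    return ones;
}
```

Without dither, an input of 0 gives half the bits set and an input of 32768 gives three quarters, so the average of the bit stream follows the input as expected for PDM.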