Most Efficient Way to Generate Varying Duty Cycle and PWM

Post by **lilltroll** » Thu Jun 10, 2010 4:23 pm

PS. It's very interesting kster!
But when using feed-forward solutions without feedback, I'm not sure that the benefit of using the maximum frequency outweighs the probably increasing problems with the error generated on each edge by the transistors, since they are not perfect and do not provide infinite bandwidth.

JAES fellow Malcolm has done a lot of work during his life:
Check out http://www.essex.ac.uk/csee/research/au ... tions.html if anyone has missed it

Hmm, I should read this one myself.
http://www.essex.ac.uk/csee/research/au ... lifier.pdf

DS.

Post by **lilltroll** » Thu Jun 10, 2010 9:30 pm

Woody wrote:You may find that you get better results by using timestamping. Each port has a timer which is incremented every time it receives a clock pulse. Timestamping allows you to specify the exact time (clock number) that you want a signal transition to occur on.
Code: Select all
int portTime;

pwmPort <: 0 @ portTime;  // Find out the current port time
portTime += 20;
pwmPort @ portTime <: 1;
portTime += 60;
pwmPort @ portTime <: 0;
portTime += 47;
pwmPort @ portTime <: 1;
portTime += 33;
pwmPort @ portTime <: 0;
As you can see from this example, you can first issue an output and read the time at which it occurred, then set a time in the future at which you want the next output to occur and schedule that.

See section 4.3 "Performing I/O on Specific Clock Edges" of Programming XC on XMOS Devices for more details: http://www.xmos.com/support/documentation
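To make the timestamping example easier to follow, here is a small host-side sketch in Python (not XC, and not code from the XMOS documentation) that works out which port times the quoted snippet would schedule. The function name `schedule_edges`, the starting time of 1000 and the 16-bit wrap of the port counter are assumptions made for illustration only.

```python
PORT_COUNTER_BITS = 16  # assumption: the port counter is 16 bits wide and wraps around


def schedule_edges(start_time, steps):
    """Return (port_time, level) pairs for a list of (delta_ticks, level) steps."""
    mask = (1 << PORT_COUNTER_BITS) - 1
    t = start_time
    edges = []
    for delta, level in steps:
        # Mirrors `portTime += delta`, keeping only the bits the counter is assumed to hold.
        t = (t + delta) & mask
        edges.append((t, level))
    return edges


# Deltas and output levels taken from the quoted example.
steps = [(20, 1), (60, 0), (47, 1), (33, 0)]

# 1000 stands in for whatever `pwmPort <: 0 @ portTime` would have returned.
edges = schedule_edges(1000, steps)
for port_time, level in edges:
    print(port_time, level)
```

With these numbers the output is high for 60 ticks, low for 47 and high again for 33, regardless of the starting time.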

What is the frequency of the clock when it's clocked internally? Does it use the 400 MHz clock?
If so, what about a 500 MHz L device?

For example, how does (t+8) correlate to 12 ns in this XMOS module?

Code: Select all

   // read with address.
   p_sram_addr <: Adrs @ t;
   // read data with 12 ns access time.
   p_sram_data @ (t + 8) :> Result;

Post by **Woody** » Fri Jun 11, 2010 9:53 am

lilltroll wrote:What is the frequency of the clock when it's clocked internally? Does it use the 400 MHz clock ? If so, what about a 500 MHz L device.

If ports are clocked internally they use the reference clock. This is 100MHz*. Note that the reference clock is also used for the timers.

*There are occasionally times when you may want the ref. clock to differ from 100MHz. This can be achieved via the .xn file (see the 'XS1-? Clock Frequency Control' documents at http://www.xmos.com/support/documentation for details).

Note that there may be knock-on effects of changing the ref. clock, because code blocks may assume a 100MHz timer.

lilltroll wrote:For an example, how does (t+8) correlate to 12 ns in this XMOS module ?
Code: Select all
   // read with address.
   p_sram_addr <: Adrs @ t;
   // read data with 12 ns access time.
   p_sram_data @ (t + 8) :> Result;

For +8 to correspond to a 12ns delay a ref clock with a period of 1.5ns would be required (666MHz). This is out of range for an XS1 device, so I suspect that the comment is either invalid or out of context. Where did you get the code?

Post by **lilltroll** » Fri Jun 11, 2010 4:11 pm

Woody wrote:
Code: Select all
   p_sram_addr <: Adrs @ t;
   // read data with 12 ns access time.
   p_sram_data @ (t + 8) :> Result;

For +8 to correspond to a 12ns delay a ref clock with a period of 1.5ns would be required (666MHz). This is out of range for an XS1 device, so I suspect that the comment is either invalid or out of context. Where did you get the code?

It's from the SRAM module http://www.xmos.com/applications/memory/sram-controller
I wanted to see if I could come closer to 50 Mreads/s, i.e. 50 Mbytes/s.

Also check this thread: http://www.xcore.com/forum/viewtopic.php?f=15&t=512

Post by **Woody** » Fri Jun 11, 2010 4:58 pm

That comment is wrong. It is really an 80ns access (8 cycles of the 100MHz reference clock, at 10ns each). Thanks for pointing it out, I'll raise a bug against the comments in that code.
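The arithmetic behind this correction can be checked with a short Python sketch (host-side only; the helper names `ticks_to_ns` and `clock_hz_for` are made up for this illustration). It converts a port-time offset into a delay for a given port clock, and inverts the relation to show which clock the original "12 ns" comment would have implied.

```python
REF_CLOCK_HZ = 100_000_000  # reference clock frequency stated earlier in the thread


def ticks_to_ns(ticks, clock_hz):
    """Delay in nanoseconds for a number of port clock ticks."""
    return ticks * 1e9 / clock_hz


def clock_hz_for(ticks, delay_ns):
    """Clock frequency at which `ticks` ticks would last `delay_ns` nanoseconds."""
    return ticks * 1e9 / delay_ns


delay_ns = ticks_to_ns(8, REF_CLOCK_HZ)  # the (t + 8) offset from the SRAM code
needed_hz = clock_hz_for(8, 12)          # clock implied by the "12 ns" comment

print(delay_ns)
print(needed_hz / 1e6)
```

The first figure is the 80ns access time; the second is the roughly 667MHz clock that a 12ns delay over 8 ticks would require.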

Post by **Folknology** » Tue Jun 22, 2010 3:57 pm

@infiniteimprobability

I am curious why, in your chart, using the buffered serialised port is more expensive thread-wise (3 times more, to be specific). Can you explain that?

Also, I'm interested in seeing what happens when you increase the frequency resolution up to 100s of kHz: how does that change the balance of the table? I assume some of these methods duck out when you hit certain frequencies and multiple instances?

regards
Al

Post by **infiniteimprobability** » Tue Jun 22, 2010 6:46 pm

Folknology wrote:I am curious why in your chart using the buffered serialised port is more expensive thread wise, 3 times more to be specific can you explain that.

I'll have a go. The buffered serialised method basically requires you to shovel in the next 32b of data before the buffer empties. If you are running at 12b (2^12 = 4096 steps) and 20kHz, then this period will be:

(1/20E3) / 4096 * 32 = 390ns. Running the thread at 50MHz (20ns instruction time) means you've got 19 cycles to work out whether to transmit all zeros (0x00000000), all ones (0xFFFFFFFF) or something in between from a lookup table of the 30 transition patterns.
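The budget above can be reproduced with a few lines of Python (a host-side calculation, not device code; `refill_period_ns` and the constant names are hypothetical). The PWM frequency, resolution, 32-bit buffer and 20ns instruction time are the figures quoted in this post.

```python
def refill_period_ns(pwm_hz, resolution_bits, buffer_bits=32):
    """Time in ns that one buffer word lasts at the given PWM rate and resolution."""
    port_tick_ns = 1e9 / (pwm_hz * 2 ** resolution_bits)  # one PWM step
    return port_tick_ns * buffer_bits


INSTRUCTION_TIME_NS = 20  # thread running at 50MHz, as assumed in the post

period_ns = refill_period_ns(20_000, 12)
cycles = int(period_ns // INSTRUCTION_TIME_NS)

print(period_ns)  # time before the next 32-bit word must be supplied
print(cycles)     # whole instruction slots available in that time
```

Changing `pwm_hz` or `resolution_bits` shows how quickly the budget shrinks or grows.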

You've also got to take care of updating the duty register although shared memory will do you favours here.

So as a rough guess (haven't done the calcs), you can probably only get away with 2 outputs per thread running at full pelt. Relaxing the PWM frequency or resolution would change things.

I've seen code that manages 6 outputs per thread using this method, but it does it differently by having a server thread which just shovels the data and the client thread (which updates the PWM duty) does the hard work. Nice idea....

Folknology wrote:Also I'm interested in seeing what happens when you increase the frequency resolution up to 100s of Khz, then how does it change the table balance? I assume some of these methods duck out when you hit certain frequencies and multiple instances

Well, it all comes down to how many cycles you have. Doubling the frequency would require a drop in resolution of one bit, so 160kHz should be doable at 9b, 320kHz at 8b and so forth.
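As a quick check of that scaling rule, the following Python sketch (host-side only; `pwm_hz_for` is a made-up helper) keeps the step rate fixed at the 20kHz, 12b starting point used earlier in this post and doubles the PWM frequency for every bit of resolution given up.

```python
def pwm_hz_for(bits, base_hz=20_000, base_bits=12):
    """PWM frequency with the same step rate as base_hz at base_bits resolution."""
    return base_hz * 2 ** (base_bits - bits)


table = {bits: pwm_hz_for(bits) for bits in range(12, 7, -1)}

for bits, hz in table.items():
    print(f"{bits}b -> {hz / 1000:g} kHz")
```

Each row has the same per-word cycle budget as the 12b, 20kHz case, which is why the trade-off is one bit per doubling.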

What requires such a high PWM frequency?

Post by **Folknology** » Tue Jun 22, 2010 9:03 pm

Interesting stuff and definitely worth more investigation, particularly where to put which pieces in the control and drive parts of the closed loop and still get good thread value.

infiniteimprobability wrote:I've seen code that manages 6 outputs per thread using this method, but it does it differently by having a server thread which just shovels the data and the client thread (which updates the PWM duty) does the hard work. Nice idea....

I think this idea makes a lot of sense, with the server being the motor driver and the controller handling more complex things like the feedback transforms, which it needs the threads for.

infiniteimprobability wrote:What requires such high PWM frequency?

It's more of a mathematical and performance curiosity than a practical application from a motor POV, as most high-speed units top out around 50kHz. Obviously getting above the audio range is beneficial, but going higher will likely degrade rather than benefit performance. I suppose it could be useful for other applications such as piezoelectric motor driving, etc.

regards
Al

Post by **kster59** » Wed Jun 23, 2010 7:57 am

Those 32-bit buffered ports are supposed to be double buffered.

At a 100MHz output rate, you only need to write once every 32 output clocks.

Supposing I'm at 400MHz, I should be able to do 32*4 operations between writes.

So I can do:

while (1) {
    porta <: mynumber;
    portb <: mynumber;
    portc <: mynumber;
    portd <: mynumber;
}

and have time to spare for a bunch of calculations, since an output should only block when the port's buffer is full (which happens only once in 32 operations).

I currently have PWM code running in 1 thread that can update at least 8 motors with 8-bit PWM at 100MHz while computing the next value to write in the same thread on the fly.

Or am I missing something?
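For reference, the figures in this post can be laid out in a short Python calculation (host-side only; the function and variable names are made up for this sketch). It counts core clock cycles between buffer writes, which is not necessarily the number of instructions a single thread gets to execute in that time; the earlier estimate in this thread assumed a 50MHz thread rate.

```python
def refill_interval_ns(port_clock_hz, buffer_bits=32):
    """How long one buffer word lasts when one bit is shifted out per port clock."""
    return buffer_bits * 1e9 / port_clock_hz


PORT_CLOCK_HZ = 100_000_000  # output rate quoted in the post
CORE_CLOCK_HZ = 400_000_000  # core clock quoted in the post
PWM_BITS = 8                 # resolution quoted in the post

interval_ns = refill_interval_ns(PORT_CLOCK_HZ)
core_cycles = 32 * CORE_CLOCK_HZ // PORT_CLOCK_HZ  # core clock cycles per 32-bit word
words_per_period = 2 ** PWM_BITS // 32             # 32-bit words per PWM period
pwm_hz = PORT_CLOCK_HZ / 2 ** PWM_BITS             # resulting PWM frequency

print(interval_ns, core_cycles, words_per_period, pwm_hz)
```

So an 8-bit PWM period at a 100MHz step rate is made up of eight 32-bit words per output, each of which has to be supplied within one refill interval.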

Post by **lilltroll** » Tue Nov 02, 2010 11:35 am

Has no-one applied dither to PDM?

I took a very quick look at the "Class D Audio Power Amplifier".

Shouldn't a first-order sigma-delta look something like this?

Code: Select all

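#include <xs1.h>

void speaker(streaming chanend c_in, out buffered port:1 p, clock clk) {
    const unsigned short delay = 35;  // port clock ticks between output bits
    unsigned short time = 0;          // port time of the next output
    int x, y;
    unsigned dither = 0xFFFFFFFF;     // CRC state reused as a pseudo-random dither source
    int qe = 0;                       // accumulated quantisation error

    // Clock the port from the reference clock.
    set_clock_ref(clk);
    configure_port_clock_output(p, clk);
    configure_out_port_no_ready(p, clk, 0);
    start_clock(clk);

    while (1) {
        c_in :> x;
        for (int i = 0; i < 64; i++) {  // fs=44.6 kHz Use maximum oversampling
            time += delay;
            if (x >= qe) {
                y = 65536;
                p @ time <: 1;
            } else {
                y = -65536;
                p @ time <: 0;
            }
            crc32(dither, x, 0xEB31D82E);  // Magic poly
            qe = qe + y - x + (dither >> 25);
        }
    }
}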

The for loop is just a very ugly form of oversampling for testing, but there is a reason why I use ±2^16 and not ±2^15, due to the nature of PDM. The value 25 controls the amount of applied dither. Even the XC-1 speaker will sound nicer with dither. (The use of x in the CRC32 is overkill.)

PS. I am testing a 5th-order noise shaper, since the demo code uses 8 times oversampling. DS
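For anyone who wants to experiment off-target, here is a host-side Python model of a textbook first-order sigma-delta modulator with optional dither. It is a sketch under stated assumptions, not a port of the XC code above: the dither is added at the comparator, it comes from Python's `random` module rather than a CRC, and the names `sigma_delta` and `decoded_mean` are made up for this example.

```python
import random


def sigma_delta(x, n, full_scale=65536, dither_amp=0, seed=0):
    """Return n output bits of a first-order sigma-delta modulator for a constant input x."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    acc = 0                    # integrator: running sum of (input - fed-back output)
    bits = []
    for _ in range(n):
        d = rng.randint(-dither_amp, dither_amp) if dither_amp else 0
        bit = 1 if acc + d >= 0 else 0  # 1-bit quantiser with dither at its input
        y = full_scale if bit else -full_scale
        acc += x - y
        bits.append(bit)
    return bits


def decoded_mean(bits, full_scale=65536):
    """Average of the +/- full_scale levels the bit stream represents."""
    return sum(full_scale if b else -full_scale for b in bits) / len(bits)


plain = sigma_delta(12345, 64)
dithered = sigma_delta(12345, 64, dither_amp=64)
print(decoded_mean(plain), decoded_mean(dithered))
```

Because the integrator stays bounded, the average of the bit stream approaches the input as the oversampling ratio grows; with 64 bits per sample the mean lands within about 2050 of the input in this model, with or without dither.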
