Quickest Comm

kster59
XCore Addict
Posts: 162
Joined: Thu Dec 31, 2009 8:51 am

Post by kster59 »

Shouldn't you use the fastest one?

Look in the datasheet to find out what your options are.


rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

OK, this is what I need to do. I am working on reading a camera module, which requires precise timing using the camera's PCLK signal. If I miss one of those triggers, the data will be corrupt.

Code:

// wait for VSYNC to go high, then low again
VSYNC_Pin when pinsneq ( 0 ) :> void;
VSYNC_Pin when pinsneq ( 1 ) :> void;

for(int y = 0; y<240; y++)
{
	// wait for HREF to go low, then high again
	HREF_Pin when pinsneq ( 1 ) :> void;
	HREF_Pin when pinsneq ( 0 ) :> void;

	for(int r = 0;r<320;r++) {
		// wait for PCLK to go low, then high again
		PCLK_Pin when pinsneq ( 1 ) :> void;
		PCLK_Pin when pinsneq ( 0 ) :> void;
		// read the 8-bit port for the y data
		yData_Pin :> yData;
		// transmit the data through a channel (blocks until the receiver takes it)
		c <: yData;
	}
}
The transmission code:

Code:

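#define BIT_RATE 19200
// parenthesized so the macro expands safely inside larger expressions
#define BIT_TIME (XS1_TIMER_HZ / BIT_RATE)
...
...
...
unsigned d;
char buffer[5];	// up to 3 digits plus the terminating NUL
while (1) {
	// blocks until the camera thread sends a sample
	c :> d;

	// d is unsigned, so format it with %u
	sprintf(buffer, "%u", d);
	if(d > 99) {
		txByte(TXD, (unsigned)buffer[0]);
		txByte(TXD, (unsigned)buffer[1]);
		txByte(TXD, (unsigned)buffer[2]);
	}
	else if(d > 9) {
		txByte(TXD, (unsigned)buffer[0]);
		txByte(TXD, (unsigned)buffer[1]);
	}
	else {
		txByte(TXD, (unsigned)buffer[0]);
	}
	// one value per line
	txByte(TXD, (unsigned)'\n');
}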

c is the channel that connects to a communication method to send this data to the computer. The problem is that sending this data (c <: yData) is taking way too long, and something that should occur in 0.0064512 seconds is taking more than a minute. How can I make the transmission of data take less time? I tried the UART over USB at up to 921600 baud, but I am missing some values. I don't know if it's the program I am using or the XMOS.
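
A rough sanity check on the numbers in this post, sketched in plain C rather than XC. It assumes 8N1 framing (10 bits on the wire per byte), which the post does not state, and the helper names `uart_time_ms` and `frame_bytes` are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: milliseconds needed to push `bytes` bytes through a
   UART, assuming 8N1 framing (1 start + 8 data + 1 stop = 10 bits per byte). */
uint32_t uart_time_ms(uint32_t bytes, uint32_t baud)
{
    uint64_t bits = (uint64_t)bytes * 10u;
    return (uint32_t)((bits * 1000u) / baud);
}

/* Bytes on the wire for one 320x240 frame.
   bytes_per_pixel is 1 for raw binary, up to 4 for ASCII such as "255\n". */
uint32_t frame_bytes(uint32_t bytes_per_pixel)
{
    return 320u * 240u * bytes_per_pixel;
}
```

Under these assumptions, `uart_time_ms(frame_bytes(4), 19200)` comes to 160000 ms, which is consistent with "more than a minute". Even raw binary at 921600 baud, `uart_time_ms(frame_bytes(1), 921600)`, is roughly 833 ms per frame, so the UART cannot keep up with a frame that arrives in about 6.45 ms.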
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

Your camera sends out data at around 100 Mbit/s. You do not have a chance of transferring that bandwidth to the host with the FTDI chip that's on board the XMOS card.
I do not know your final objective for the video, but if you just want to "see an image" on the host over UART, you have to skip 99% of the data. A simple method would be to buffer one line of data into SRAM, stamp it with its line number, send it over to the host, and meanwhile skip all the lines in between. Thereafter, cache the next line into SRAM...

This is how it could look over the UART.
Send:
Frame 1, Line 0
Frame 1, Line 120
Frame 1, Line 240
Frame 1, Line 360
Frame 2, Line 1
Frame 2, Line 121
Frame 2, Line 241
Frame 2, Line 361
Frame 3, Line 2
...

This will of course take 120 frames, i.e. 4 s, to send over one image, and if you move the camera during that time the picture will look "funny". But at least you have a chance to send over a picture.
A much more convenient way would be to buffer a whole frame, but that would need 3*640*480 ~1 Mbyte of cache memory. It's possible to connect an external SRAM/SDRAM to an XMOS chip, but the XC-1 card does not bring out enough pins.
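
The line-skipping scheme above can be sketched as a single test per line. This is plain C, not XC, and it assumes a 480-line frame with 4 lines forwarded per frame (a stride of 120 lines) and frames numbered from 1; the name `line_is_selected` is hypothetical.

```c
#include <stdint.h>

#define LINES_PER_FRAME 480u  /* assumed frame height */
#define LINES_SENT      4u    /* lines forwarded to the host per camera frame */
#define STRIDE (LINES_PER_FRAME / LINES_SENT)  /* 120 lines between sent lines */

/* Hypothetical helper: returns 1 if `line` (0-based) of camera frame `frame`
   (1-based, must be >= 1) should be buffered and sent, 0 if it is skipped. */
int line_is_selected(uint32_t frame, uint32_t line)
{
    uint32_t offset = (frame - 1u) % STRIDE;  /* advances by one line per frame */
    return (line % STRIDE) == offset;
}
```

After `STRIDE` frames every line has been sent once and the pattern repeats, which is where the 120-frame figure comes from.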

Is this for the helicopter project? If so, what are your objectives with the camera?
"30fps camera, with body tracking" Can you make that statement more specific?


PS. When I was a kid, I saw a movie where a boy with a very rich father had a radio-controlled helicopter with an onboard camera. The picture was sent back wirelessly to the "base", and they used it to spy on a criminal or something in the movie.
I thought the whole technology was soooo cool back then. DS.
Probably not the most confused programmer anymore on the XCORE forum.
rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

This is for the helicopter project. I was hoping to experiment with image analysis for motion tracking to provide another set of data to the IMU filter, giving more accurate results for the speed being traveled and the 3D orientation. In addition, I wanted to try terrain analysis to detect, for example, where best to land. I was actually hoping for 2 cameras eventually, as I am working on a purely image-based 3D reconstruction program. 3D reconstruction is possible with 1 camera, but it is a lot more difficult to program and gives less accurate results. The stereo images would be derived through motion tracking and adjacent frames.

4 seconds is way too slow, as the platform will be moving. Would monochrome composite video (60 Hz) be low enough bandwidth to stream? Assuming, of course, I could get a fast enough ADC.

Is there a relatively easy way to get that much RAM? I was hoping to cache it all, do on-board image compression, then stream it. The resolution is, by default (or at least can be set to), 320x240, so that is about 230 kB of RAM.

What's the maximum amount of RAM you can have with 1 core (12 connections)? I have no idea how to use or wire up the RAM yet.
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

One core has 64 kB of on-chip SRAM, and the code must fit in that RAM as well. The common way to add more is to use an XMOS 32-bit port as the address lines and an XMOS 8- or 16-bit port as the data lines. The G4-512BGA chip has every internal pin connected to a BGA ball, so that is a feasible task, but unfortunately the XC cards expose only a limited number of connections.
You can use either SRAM or SDRAM. SDRAM is cheaper if you need megabytes of memory, but it may have longer read/write latency (CAS and RAS; you have probably seen those settings in a PC BIOS at some point).

A 32-bit address bus can address 4 GB of memory, so you will not run out of address space.

The alternative is that someone is working on a small FPGA "lookalike" solution that has the memory on one side and an XLINK on the other. The XC language is prepared to access services like that.
There has been talk of this on IRC, but I do not know the status of such a project.
One nice thing about the XLINK is that it uses far fewer pins for the same bandwidth, since it is clocked much faster. You also would not need to run one thread as a RAM server, since the FPGA handles that, and all cores can access such a memory.

A simple alternative is to use 4-bit grayscale to start with in this prototype version. At 320x240 that would fit in 38.4 kB of RAM.
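
The buffer sizes quoted in this thread can be checked with a small helper. This is a sketch in plain C; `frame_buffer_bytes` is a hypothetical name, and the 24 bits per pixel used for the colour cases is an assumption taken from the "3*640*480" figure.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed to hold one frame at a given bit depth,
   rounded up to a whole byte. */
uint32_t frame_buffer_bytes(uint32_t width, uint32_t height,
                            uint32_t bits_per_pixel)
{
    uint64_t bits = (uint64_t)width * height * bits_per_pixel;
    return (uint32_t)((bits + 7u) / 8u);
}
```

With these inputs, `frame_buffer_bytes(320, 240, 4)` is 38400 bytes, `frame_buffer_bytes(320, 240, 24)` is 230400 bytes, and `frame_buffer_bytes(640, 480, 24)` is 921600 bytes. Only the first fits in 64 kB, and it still has to share that space with the code.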
rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

That sounds like a great solution. How would I store it as a 4-bit number? I was looking at a table of primitives in C, and 8 bits is the minimum. For conversion from 8-bit to 4-bit, do I simply "round" to the nearest number in the palette? I found this, and a 4-bit grayscale image seems perfectly acceptable.
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

One way that's maybe easy to understand and that will compile is to use shifting:

Double_nibble = ((Byte1>>4)<<4) + (Byte2 >>4);

You can also mask out the bits you are interested in (note the extra parentheses, since + binds tighter than >>):

Double_nibble = (Byte1 & 0xF0) + ((Byte2 & 0xF0) >> 4);

IRL you are interested in code that compiles to as few instructions as possible.

The >>4 can be translated to the LSRI instruction for unsigned values (logical shift right immediate)

Masking can be translated to MKMSKI (Make n-bit mask immediate) and/or maybe
ZEXTI (Zero extend immediate)

Check the disassembly to make sure the result uses as few instructions as possible.
Since a register is 32 bits long, you should probably fill it with 8 nibbles before you store it to SRAM.
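
Filling a 32-bit word with 8 nibbles could look like the following. This is a sketch in plain C rather than XC; the function names are hypothetical, and the choice to put the first pixel in the most significant nibble is an assumption, not something the post specifies.

```c
#include <stdint.h>

/* Pack eight 8-bit gray samples into one 32-bit word, keeping only the top
   4 bits of each sample. px[0] ends up in the most significant nibble. */
uint32_t pack8_gray4(const uint8_t px[8])
{
    uint32_t word = 0;
    for (int i = 0; i < 8; i++) {
        /* make room for the next nibble, then append the sample's high 4 bits */
        word = (word << 4) | (uint32_t)(px[i] >> 4);
    }
    return word;
}

/* Recover the 4-bit value of pixel i (0..7) from a packed word. */
uint8_t unpack_gray4(uint32_t word, int i)
{
    return (uint8_t)((word >> (4 * (7 - i))) & 0xFu);
}
```

Note that `>> 4` truncates rather than rounds, so each 8-bit sample maps to the 4-bit level at or below it.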
rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

I'll be honest, I have no idea what you're talking about. I have never worked with binary numbers directly, nor with bitwise shifting (I'll look into it more).

When you say:

Code:

Double_nibble = ((Byte1>>4)<<4) + (Byte2 >>4);
Does this mean you combine two 4-bit values (Byte1, Byte2) to get the resulting byte (8 bits?), and then take 4 of these and store them?

What is the LSRI instruction?

What exactly does "masking" do, and which is preferable?

What is "disasm"? Does this just mean extracting the original values from the combined result?
kster59
XCore Addict
Posts: 162
Joined: Thu Dec 31, 2009 8:51 am

Post by kster59 »

Honestly, I think this project has little chance of success if you want to do such high speed / bandwidth operations but don't even know about masks and bit shifting.

The biggest problem with XMOS devices is a lack of usable libraries. Easy to use SDRAM isn't really available. When I get around to it/I need SDRAM I will write a good SDRAM library.

Also, you want to do operations at 12 MHz and 12 MB/s. Try doing something basic like toggling a single port on and off at 12 MHz and run it in the simulator, and you will see that it doesn't work without more advanced concepts like buffering and serialization.

Also I don't understand why you want to read a camera then send it to the PC frame by frame.
A better solution is to buy something like an Atom netbook, SBC or FIT2PC to do your heavy DSP lifting, and use the XMOS for motor control etc. You can then run something like a USB webcam and use libraries like OpenCV for image processing.

If you want to do all the image processing in the XMOS chip, then you don't need to send the data, but you are constrained to 64 kB of memory (more like 30-50 kB once your code is in there). You also know that XMOS doesn't have an FPU, right?
rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

Image processing is just an idea I am playing around with. I know the other, more crucial systems are doable (and have been done on much weaker systems), and I have started on these. I may eventually try out a BeagleBoard, which happens to be very good at image processing. Where did you get 12 MHz from? The way I was planning it, the helicopter is autonomous (very low speed comparatively, with servo control probably the fastest part) and sends data when requested. I didn't actually want the processing portion on the XMOS, but in the GCS.