Internal Flash memory slow-down?

leifR
Junior Member
Posts: 6
Joined: Fri Sep 25, 2015 9:41 am

Post by leifR »

Hello,

We have an XMOS XEF216-512-TQ128-I20 processor in our prototype and see the following issue:

We use the processor's internal flash to persist data, via quadflashlib.h (with the -lquadflash flag in the XCC flags). On a new board with a new processor, reading a single integer from flash one hundred times, triggered over serial communication, takes under one second. After some time, say 600+ power cycles, the same read takes about 40 seconds. On boards that have been in use longer, it can approach 60 seconds. Our tests do not show what slows down the flash reads. In normal use we write nothing to flash after the initial programming.

This is a tricky problem that could limit the product, because it is a considerable delay in serial communication.

Any idea what could have happened? How can we tackle or solve the issue?

br,

\leif


mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Hi. Only some ideas to review:

1) What is the temperature of the equipment (PCB / CPU / power supply feeding the CPU) under test? Is it higher on the PCBs that take longer to respond?

2) For your serial communication, what is the baud rate? Is the clock to the serial IP stable and accurate? For standard PC baud rates, the clock should be 1.8432 MHz or a multiple of this value (e.g. 14.7456 MHz is good).

3) If possible, consider using an IR thermometer intended for electronics and point it at different parts of your design to measure their temperatures. Conversely, try some freeze spray to cool down the PCB under test and see if this alters the results in your favour.

4) Aside from serial communication, can you rework your code to perform the flash read and then just blink a local LED? How are the results without the serial communication part of your code? For example, code up a small routine that performs your flash read and then blinks the LED, and take time stamps to measure how long the read takes. Then repeat, and if a measurement is out of line with the earlier ones, latch the LED on to signal that something is wrong.

Post your results.
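
To put numbers on point 2, here is a minimal sketch in plain C. It assumes a conventional UART that oversamples each bit 16 times (an assumption, not something stated about this design) and computes the nearest integer divisor and the resulting baud-rate error. `uart_divisor` and `uart_error_ppm` are illustrative names only.

```c
#include <stdint.h>

/* Assumption: a conventional UART that samples each bit 16 times. */
#define OVERSAMPLE 16u

/* Nearest integer clock divisor for the requested baud rate. */
uint32_t uart_divisor(uint32_t clock_hz, uint32_t baud)
{
    uint32_t denom = OVERSAMPLE * baud;
    return (clock_hz + denom / 2u) / denom;   /* round to nearest */
}

/* Signed baud-rate error in parts per million for the chosen divisor.
   Returns INT32_MAX if the clock is too slow to reach the baud rate. */
int32_t uart_error_ppm(uint32_t clock_hz, uint32_t baud)
{
    uint32_t div = uart_divisor(clock_hz, baud);
    if (div == 0u) {
        return INT32_MAX;
    }
    /* Clock frequency that would give exactly the requested baud rate. */
    int64_t ideal_hz = (int64_t)OVERSAMPLE * (int64_t)div * (int64_t)baud;
    int64_t diff_hz  = (int64_t)clock_hz - ideal_hz;
    return (int32_t)((diff_hz * 1000000) / ideal_hz);
}
```

Under these assumptions, 1.8432 MHz and 14.7456 MHz reach 115200 baud exactly (divisors 1 and 8), while a 100 MHz clock lands on divisor 54 with an error of roughly 0.47%, which is usually considered tolerable for a UART.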

Post by leifR »

Answers to a few of the points.
1) The temperature a few cm from the processor is around 40 °C. There are large copper pours in the PCB to carry heat away.
2) For serial communication we use a UART at 115200 baud.

3) and 4) still to do.
br,
\leif

Post by leifR »

3) Temperature should not be the issue, because units are fast initially. Also, on slow units the slowness is visible immediately after power-up.
4) I ran the test on a used unit (one week of power cycles) and on a fresh unit. In the test I read the flash memory 100 times in a row and blinked the LEDs afterwards.
Result: 23 s for the used unit, <0.5 s for the fresh unit.

Next I plan to run the fresh unit for one week and check how it slows down. In the meantime, any help in understanding the issue, and whether a solution exists, is welcome.
br,
\leif

Code: Select all

void flash_leds()
{
    int curve_nr = 0;
    unsigned char int_buf[4];
    int value;
    while (1) {

        p_led1 <: 0b0000;
        delay_milliseconds(500);
        p_led1 <: 0b1111;
        delay_milliseconds(500);
        p_led1 <: 0b0110;
        delay_milliseconds(500);
        for (curve_nr = 0; curve_nr < 100; curve_nr++) {
            bf_flash_read(flash_ports, int_buf, 4, INDEX_X_TYPE(curve_nr));  // read x_type
            value = chars_to_int(int_buf);
        }
    }
}

int bf_flash_read(fl_QSPIPorts &ports,
        unsigned char read_buf[], unsigned n_bytes, unsigned offset)
{
  // Connect to the QuadSPI device using the quadflash library function fl_connectToDevice.
  //if(fl_connectToDevice(ports, deviceSpecs, spec_lenght) != 0)
  if(fl_connectToDevice(ports, _deviceSpecs, dev_spec_lenght) != 0)
  {
    return 1;
  }

  // read bytes
  if(fl_readData(offset, n_bytes, read_buf) != 0)
  {
    return 1;
  }
  // Disconnect from the QuadSPI device.
  fl_disconnect();
  return 0;
}

fl_QuadDeviceSpec _deviceSpecs[] =
{
  FL_QUADDEVICE_SPANSION_S25FL116K,
  FL_QUADDEVICE_SPANSION_S25FL132K,
  FL_QUADDEVICE_SPANSION_S25FL164K,
  FL_QUADDEVICE_ISSI_IS25LQ080B,
  FL_QUADDEVICE_ISSI_IS25LQ016B,
  FL_QUADDEVICE_ISSI_IS25LQ032B,
};
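
For anyone repeating this measurement with time stamps rather than a stopwatch, here is a minimal host-side sketch in plain C. It assumes the time stamps come from a free-running 32-bit counter at 100 MHz (the usual xCORE reference timer rate, but check your clock setup). Such a counter wraps about every 42.9 s, so a single start/end difference cannot measure the 40 to 60 s delays reported in this thread. Timing each read separately and summing in 64 bits avoids that. `read_stats`, `stats_add` and the other names are hypothetical.

```c
#include <stdint.h>

/* Assumption: time stamps come from a 32-bit counter running at 100 MHz. */
#define TICKS_PER_MS 100000u

/* Wrap-safe difference between two 32-bit time stamps. Valid as long as
   the real interval is shorter than one full counter period. */
uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);   /* unsigned arithmetic is modulo 2^32 */
}

/* Running totals over many reads; the sum is 64-bit so it can exceed
   one counter period even though each single read cannot. */
typedef struct {
    uint64_t total_ticks;
    uint32_t max_ticks;
    uint32_t reads;
} read_stats;

/* Record one read, given the time stamps taken just before and after it. */
void stats_add(read_stats *s, uint32_t start, uint32_t end)
{
    uint32_t dt = elapsed_ticks(start, end);
    s->total_ticks += dt;
    if (dt > s->max_ticks) {
        s->max_ticks = dt;
    }
    s->reads++;
}

/* Total time over all recorded reads, in milliseconds. */
uint64_t stats_total_ms(const read_stats *s)
{
    return s->total_ticks / TICKS_PER_MS;
}

/* Mean time per read in milliseconds (0 if nothing was recorded). */
uint64_t stats_mean_ms(const read_stats *s)
{
    return s->reads ? stats_total_ms(s) / s->reads : 0u;
}
```

Feeding in 100 reads of 23,000,000 ticks each gives a total of 23000 ms and a mean of 230 ms per read, which corresponds to the 23 s figure for the used unit.
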
akp
XCore Expert
Posts: 578
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

I don't think you should call fl_connectToDevice() every time you call bf_flash_read(). When I use the flash, I connect to it once at power-up and then leave the connection open. In our testing, fl_connectToDevice() is quite time-consuming. You can see why if you look at the old SPI flash routines on the xcore GitHub and make the educated guess that XMOS essentially ported the same thing to their closed-source QuadSPI routines.

I suggest you see if you can change it as follows and post the result.

Code: Select all

void flash_leds()
{
    int curve_nr = 0;
    unsigned char int_buf[4];
    int value;

    // Connect to the QuadSPI device once, using the quadflash library function fl_connectToDevice.
    if (fl_connectToDevice(flash_ports, _deviceSpecs, dev_spec_lenght) != 0)
    {
        p_led1 <: 0b0001; // whatever to signal connection failure
        return;
    }

    while (1) {

        p_led1 <: 0b0000;
        delay_milliseconds(500);
        p_led1 <: 0b1111;
        delay_milliseconds(500);
        p_led1 <: 0b0110;
        delay_milliseconds(500);
        for (curve_nr = 0; curve_nr < 100; curve_nr++) {
            bf_flash_read(flash_ports, int_buf, 4, INDEX_X_TYPE(curve_nr));  // read x_type
            value = chars_to_int(int_buf);
        }
    }

    // Disconnect from the QuadSPI device; does nothing so far as I know and shouldn't get here
    fl_disconnect();
}

int bf_flash_read(fl_QSPIPorts &ports,
        unsigned char read_buf[], unsigned n_bytes, unsigned offset)
{
  // read bytes
  if(fl_readData(offset, n_bytes, read_buf) != 0)
  {
    return 1;
  }
  return 0;
}

fl_QuadDeviceSpec _deviceSpecs[] =
{
  FL_QUADDEVICE_SPANSION_S25FL116K,
  FL_QUADDEVICE_SPANSION_S25FL132K,
  FL_QUADDEVICE_SPANSION_S25FL164K,
  FL_QUADDEVICE_ISSI_IS25LQ080B,
  FL_QUADDEVICE_ISSI_IS25LQ016B,
  FL_QUADDEVICE_ISSI_IS25LQ032B,
};
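
If you would rather keep the bf_flash_read() call sites unchanged, another way to get the same connect-once behaviour is to connect lazily on the first read. The sketch below is plain C with host-side mocks standing in for the flash library, so it only demonstrates the control flow. `flash_handle`, `flash_read_lazy`, `mock_connect` and `mock_read` are hypothetical names, not part of libquadflash.

```c
#include <stdbool.h>

/* Hypothetical driver hooks. Each returns 0 on success. */
typedef int (*connect_fn)(void);
typedef int (*read_fn)(unsigned offset, unsigned n_bytes, unsigned char *dst);

typedef struct {
    connect_fn connect;
    read_fn    read;
    bool       connected;   /* set after the first successful connect */
} flash_handle;

/* Read n_bytes at offset, connecting only if not already connected.
   Returns 0 on success, 1 on any failure. */
int flash_read_lazy(flash_handle *h, unsigned offset,
                    unsigned n_bytes, unsigned char *dst)
{
    if (!h->connected) {
        if (h->connect() != 0) {
            return 1;        /* stay disconnected so a later call retries */
        }
        h->connected = true;
    }
    return h->read(offset, n_bytes, dst) != 0 ? 1 : 0;
}

/* Host-side mocks: count connects and return a recognisable pattern. */
int mock_connect_calls = 0;

int mock_connect(void)
{
    mock_connect_calls++;
    return 0;
}

int mock_read(unsigned offset, unsigned n_bytes, unsigned char *dst)
{
    for (unsigned i = 0; i < n_bytes; i++) {
        dst[i] = (unsigned char)(offset + i);
    }
    return 0;
}
```

Calling flash_read_lazy() 100 times on a fresh handle invokes the connect hook exactly once. On target, the hooks would wrap fl_connectToDevice() and fl_readData().
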

Post by leifR »

According to the libquadflash documentation (https://www.xmos.com/developer/publishe ... dflash-api):
"The program must explicitly open a connection to the Quad-SPI device before attempting to use it, and must disconnect once finished accessing the device."
I guess there is some risk in leaving the connection open for a long time and then, for example, removing power.
br,
\leif

Post by akp »

I've not had a problem with it. Just try it and see what happens.

If XMOS hadn't closed the source we could say for sure, but compare https://github.com/xcore/sc_flash/blob/ ... flashlib.c, where fl_disconnect does nothing. It's probably the same in the quad SPI flash library, because the primary difference is that it uses quad SPI rather than plain old SPI.

Post by leifR »

Thanks akp!

It looks like this makes reading the flash much faster (the expected behaviour). I wonder whether there is any harm in it. For example, does power-off need some closing command behind the scenes?

If I look at the older referenced file (https://github.com/xcore/sc_flash/blob/ ... flashlib.c) and assume that quadflash is similar, there may be a write to flash every time the device is connected:
fl_connectToDevice ->
fl_initProtection ->
fl_setProtection ->
- fl_int_issueShortCommand
- fl_int_waitWhileWriting

br,
\leif

Post by akp »

Hi leif.

Glad you got it working. I ran xobjdump on some code of mine; here's what I got for fl_disconnect:

Code: Select all

<fl_disconnect>:
             0x0005c250: 00 f0 42 77: entsp (lu6)     0x2
             0x0005c254: 03 f0 c4 d0: bl (lu10)       0xcc4 <fl_int_qspiFinish>
             0x0005c258: 00 68:       ldc (ru6)       r0, 0x0
             0x0005c25a: c2 77:       retsp (u6)      0x2

So all it does is call fl_int_qspiFinish and then return 0, indicating success. Here's fl_int_qspiFinish:

Code: Select all

<fl_int_qspiFinish>:
             0x0005dbe0: 00 f0 40 77: entsp (lu6)     0x0
             0x0005dbe4: 1b f0 18 58: ldw (lru6)      r0, dp[0x6d8]
             0x0005dbe8: 94 a7:       mkmsk (rus)     r1, 0x4
             0x0005dbea: c4 ae:       out (r2r)       res[r0], r1
             0x0005dbec: f0 87:       syncr (1r)      res[r0]
             0x0005dbee: 1b f0 5b 58: ldw (lru6)      r1, dp[0x6db]
             0x0005dbf2: 40 e8:       setc (ru6)      res[r1], 0x0
             0x0005dbf4: 00 e8:       setc (ru6)      res[r0], 0x0
             0x0005dbf6: 1b f0 19 58: ldw (lru6)      r0, dp[0x6d9]
             0x0005dbfa: 00 e8:       setc (ru6)      res[r0], 0x0
             0x0005dbfc: 1b f0 1a 58: ldw (lru6)      r0, dp[0x6da]
             0x0005dc00: 00 e8:       setc (ru6)      res[r0], 0x0
             0x0005dc02: c0 77:       retsp (u6)      0x0

It looks to me like it's disabling some resources, probably the clock and I/O for the quad SPI. I infer that from fl_int_qspiInit(), which seems to just set up these same resources (see below). So it should have no effect on the data in the flash itself, and keeping the resources enabled shouldn't matter as long as you don't need them elsewhere.

Code: Select all

<fl_int_qspiInit>:
             0x0005db90: 00 f0 42 77: entsp (lu6)     0x2
             0x0005db94: 1b f0 58 58: ldw (lru6)      r1, dp[0x6d8]
             0x0005db98: 48 e8:       setc (ru6)      res[r1], 0x8
             0x0005db9a: 00 f0 86 68: ldc (lru6)      r2, 0x6
             0x0005db9e: d9 fe ec 0f: setclk (lr2r)   res[r1], r2
             0x0005dba2: 1b f0 99 58: ldw (lru6)      r2, dp[0x6d9]
             0x0005dba6: 88 e8:       setc (ru6)      res[r2], 0x8
             0x0005dba8: 00 f0 c6 68: ldc (lru6)      r3, 0x6
             0x0005dbac: de fe ec 0f: setclk (lr2r)   res[r2], r3
             0x0005dbb0: 1b f0 da 58: ldw (lru6)      r3, dp[0x6da]
             0x0005dbb4: c8 e8:       setc (ru6)      res[r3], 0x8
             0x0005dbb6: 80 f0 cf e8: setc (lru6)     res[r3], 0x200f
             0x0005dbba: e0 6a:       ldc (ru6)       r11, 0x20
             0x0005dbbc: 5f ff ec 27: settw (lr2r)    res[r3], r11
             0x0005dbc0: 00 f0 c6 6a: ldc (lru6)      r11, 0x6
             0x0005dbc4: 5f ff ec 0f: setclk (lr2r)   res[r3], r11
             0x0005dbc8: 9c a7:       mkmsk (rus)     r3, 0x4
             0x0005dbca: cd ae:       out (r2r)       res[r1], r3
             0x0005dbcc: f1 87:       syncr (1r)      res[r1]
             0x0005dbce: 1b f0 5b 58: ldw (lru6)      r1, dp[0x6db]
             0x0005dbd2: 48 e8:       setc (ru6)      res[r1], 0x8
             0x0005dbd4: 30 47:       zext (rus)      r0, 0x8
             0x0005dbd6: d1 16:       setd (r2r)      res[r1], r0
             0x0005dbd8: 08 90:       add (2rus)      r0, r2, 0x0
             0x0005dbda: 00 f0 2f d7: bl (lu10)       -0x32f <configure_port_clock_output>
             0x0005dbde: c2 77:       retsp (u6)      0x2