Ifrim Ciprian

Final System - Code Optimisation

Updated: May 16, 2022

As previously noted, the code for the 2 modules is written in C++, while the ML, CNN and Linear Regression are performed in Python.


As the C++ code needs to be deployed on small edge devices, it needs to be optimised for the limited RAM and flash available. It is important to note that there is enough RAM/flash even without these optimisations, but they free up resources for future updates and show consideration when building the code.


The following steps have been taken to reduce the RAM/FLASH usage:


1) Give every variable the lowest number of bits required for the value it stores.

For example, using an int8_t instead of an int, which by default needs 16 bits on 8-bit AVR boards (32 bits on 32-bit ARM boards such as the two used here), so at least halving the requirement.

Here is a list with the typical byte requirements per variable type:

char  : 1 byte
short : 2 bytes
int   : 2 bytes (8-bit AVR boards), 4 bytes (32-bit ARM boards)
long  : 4 bytes
float : 4 bytes
double: 8 bytes
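
As a rough illustration of step 1, the sketch below compares a struct built from default-width types with one built from fixed-width types. The struct names, fields and value ranges are hypothetical and not taken from the project code; the exact sizes depend on the compiler and board.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sensor sample written with default-width types.
struct SampleWide {
    int  temperature_c;  // only ever holds about -40..85
    int  humidity_pct;   // only ever holds 0..100
    long pressure_pa;    // needs more than 16 bits
};

// Same fields, each narrowed to the smallest fixed-width type that fits.
struct SampleNarrow {
    std::int8_t   temperature_c;
    std::uint8_t  humidity_pct;
    std::uint32_t pressure_pa;
};

// Sizes are reported by the compiler, so they reflect the actual target.
constexpr std::size_t wide_bytes()   { return sizeof(SampleWide); }
constexpr std::size_t narrow_bytes() { return sizeof(SampleNarrow); }
```

Calling wide_bytes() and narrow_bytes() on the target board shows how much each instance saves; the gain is multiplied when the struct is stored in an array.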

2) Using types that guarantee a given number of bits but are stored in memory differently. int8_t means it's an 8-bit signed type. int_fast8_t means it's the fastest signed int with at least 8 bits. int_least8_t means it's the smallest signed int with at least 8 bits.


These have been used depending on whether the system needs to access the value very fast, or if memory optimisation is more important.
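
A small sketch of how the two families can be mixed, assuming a hypothetical sum_readings helper that is not part of the project code: the stored data uses the exact-width type to save memory, while the loop counter uses the fast type so the compiler may pick whatever width is quickest on the target.

```cpp
#include <cstdint>

// Exact width: always one byte, so arrays of readings stay small.
static_assert(sizeof(std::int8_t) == 1, "int8_t is exactly one byte");
// The fast type has at least 8 bits but may be wider if that is quicker.
static_assert(sizeof(std::int_fast8_t) >= sizeof(std::int_least8_t),
              "fast type is never narrower than the least type");

// Hypothetical helper: storage favours size, the counter favours speed.
int sum_readings(const std::int8_t* readings, std::uint_fast8_t count) {
    int total = 0;
    for (std::uint_fast8_t i = 0; i < count; ++i) {
        total += readings[i];
    }
    return total;
}
```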


3) Using signed/unsigned depending on the values stored.

Different variables can hold the following values (for the built-in types, these are the minimum ranges guaranteed by the standard):

  1. signed char: -127 to 127 (note, not -128 to 127; this accommodates 1's-complement and sign-and-magnitude platforms)

  2. unsigned char: 0 to 255

  3. "plain" char: same range as signed char or unsigned char, implementation-defined

  4. signed short: -32767 to 32767

  5. unsigned short: 0 to 65535

  6. signed int8: -128 to 127

  7. unsigned int8: 0 to 255

  8. signed int16: -32768 to 32767

  9. unsigned int16: 0 to 65535

  10. signed long: -2147483647 to 2147483647

  11. unsigned long: 0 to 4294967295

  12. signed long long: -9223372036854775807 to 9223372036854775807

  13. unsigned long long: 0 to 18446744073709551615
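
The ranges of the fixed-width types can be checked at compile time rather than memorised. The sketch below does this with std::numeric_limits; fits_in is a hypothetical helper introduced here for illustration, limited to types of up to 32 bits so the comparison in long long stays exact.

```cpp
#include <cstdint>
#include <limits>

// Compile-time checks of the fixed-width ranges.
static_assert(std::numeric_limits<std::int8_t>::min() == -128, "int8_t min");
static_assert(std::numeric_limits<std::int8_t>::max() == 127, "int8_t max");
static_assert(std::numeric_limits<std::uint8_t>::max() == 255, "uint8_t max");
static_assert(std::numeric_limits<std::int16_t>::min() == -32768, "int16_t min");
static_assert(std::numeric_limits<std::uint16_t>::max() == 65535, "uint16_t max");

// Hypothetical helper: true if value can be stored in T without loss.
template <typename T>
constexpr bool fits_in(long long value) {
    static_assert(sizeof(T) <= 4, "sketch only covers types up to 32 bits");
    return value >= static_cast<long long>(std::numeric_limits<T>::min()) &&
           value <= static_cast<long long>(std::numeric_limits<T>::max());
}
```

For example, fits_in<std::uint8_t>(200) is true, so a humidity percentage can safely live in an unsigned 8-bit variable, whereas a signed 8-bit variable would overflow at 128.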


4) For loops are extremely well optimised on Arduino and perform better than the alternatives; as an example, printing a simple 10-element array with a for loop takes a mere 6 milliseconds. IFs, by comparison, take circa 50 ms, although this depends on the condition.


Therefore, I have translated as many lines of code as possible to for loops in order to improve readability, improve speed and reduce code footprint.
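
As a minimal sketch of that kind of rewrite (the function name and data are hypothetical, not taken from the project), a run of near-identical statements such as total += readings[0]; total += readings[1]; and so on collapses into a single loop:

```cpp
#include <cstddef>
#include <cstdint>

// Sums every element of a fixed-size array; N is deduced from the argument,
// so the loop bound can never drift out of step with the array length.
template <std::size_t N>
std::int32_t sum_all(const std::int16_t (&readings)[N]) {
    std::int32_t total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += readings[i];
    }
    return total;
}
```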


5) Have as few global variables as possible. All the variables in the code are declared in the narrowest scope possible, and made global only when necessary.


6) Not using the String data type. In C/C++ on Arduino/AVR systems, Strings can cause memory fragmentation over time. This has since been optimised so that the system does not crash because the low available memory is taken up by Strings, but to be on the safe side I have decided to remove all uses of String.


There is 1 necessary String use case, because the Bosch library outputs the value as a String and returning the actual value has not been properly implemented in the library.
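
One common way to avoid String is to format text into a fixed-size char buffer. The sketch below is an assumption about how that could look, not the project's actual code; format_temperature and the "T=..C" message format are made up for illustration.

```cpp
#include <cstddef>
#include <cstdio>

// Formats "T=<value>C" into a caller-provided buffer, with no heap allocation.
// Returns true only if the whole message (plus terminator) fitted.
bool format_temperature(char* out, std::size_t out_size, int temperature_c) {
    const int written = std::snprintf(out, out_size, "T=%dC", temperature_c);
    return written >= 0 && static_cast<std::size_t>(written) < out_size;
}
```

Because the buffer is sized once by the caller, memory use is fixed and known up front, which is the property that String does not give.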


7) Using as many functions as needed to move repeating sub-routines out of the main code, which helps with readability and code footprint.


8) As the Nicla Sense ME and Xiao Sense run the Arm Mbed operating system, the system can be put to sleep instead of using delays.

So in my case, the following code:

delay(300);

Becomes:

ThisThread::sleep_for(300);

This improves the battery life of the system.


9) On the subject of battery life, another improvement was to add a flag: if the battery goes below 10%, the flag is set and the voice recognition is disabled until the battery is recharged. Moreover, the system refresh, which normally happens every 1 second, is changed to every 5 minutes.
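
The low-battery behaviour can be sketched as a small function that maps the battery level to the settings described above. All names here (PowerPolicy, policy_for and the constants) are hypothetical, and the sketch assumes voice recognition is re-enabled as soon as the level is back at 10% or above; the real code may use a different recharge threshold.

```cpp
#include <cstdint>

// Settings that depend on the battery level.
struct PowerPolicy {
    bool voice_recognition_enabled;
    std::uint32_t refresh_interval_ms;
};

constexpr std::uint8_t  kLowBatteryPercent = 10;
constexpr std::uint32_t kNormalRefreshMs   = 1000;                   // every 1 second
constexpr std::uint32_t kLowPowerRefreshMs = 5UL * 60UL * 1000UL;    // every 5 minutes

// Below 10%: voice recognition off and a 5-minute refresh; otherwise normal.
PowerPolicy policy_for(std::uint8_t battery_percent) {
    const bool low = battery_percent < kLowBatteryPercent;
    return PowerPolicy{!low, low ? kLowPowerRefreshMs : kNormalRefreshMs};
}
```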


The current draw per refresh is 4 uA (micro amps) for the Nicla sensors, 1.3 mA (milli amps) for the MLX90614 and 9 uA (micro amps) for the SI1145. The Seeed Xiao draws circa 5 uA when sleeping and waiting for a new command from the I2C bus.


NICLA SENSE:

With all of these optimisations, the Nicla code uses 312284 bytes of flash, which is 59% of the total 527616 bytes available.

For the RAM, it uses 39376 bytes, equal to 61% of the 64288 total. Free: 24912 bytes.


XIAO SENSE:

The total code uses 775544 bytes, representing 95% of the total 811008 bytes of available flash.

For the RAM, it uses 54872 bytes, only 23% of the total 237568 bytes, leaving 182696 bytes available.
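
For reference, the percentages above are the used/total ratios rounded down to a whole percent. The sketch below reproduces them; percent_used is a hypothetical helper, and the figures are the ones reported for the two boards.

```cpp
#include <cstdint>

// Integer percentage of `used` out of `total`, rounded down.
constexpr std::uint32_t percent_used(std::uint32_t used, std::uint32_t total) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(used) * 100u) / total);
}

// Nicla Sense ME: flash and RAM.
static_assert(percent_used(312284, 527616) == 59, "Nicla flash");
static_assert(percent_used(39376, 64288) == 61, "Nicla RAM");
// Xiao Sense: flash and RAM.
static_assert(percent_used(775544, 811008) == 95, "Xiao flash");
static_assert(percent_used(54872, 237568) == 23, "Xiao RAM");
```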
