Thursday, 3 May 2018

Is low level programming still relevant these days?

The levels of abstraction have made application programming much easier and faster. But everything comes at a price.

This is a new type of article here; I hope to write more like it, describing programming problems and the practices used to solve them.


Contents

Top


Unlimited Resources Myth


The overhead of various frameworks and libraries sometimes (or should I say always?) makes memory usage and CPU utilization less efficient. Fast hardware and loads of RAM lead to sloppy coding practices too, luring developers into the mirage of endless resources.

Sure, most developers will place a check now and then (e.g. making sure malloc does not return NULL, even though most of the time it won't). But what happens when memory gets fragmented? Large data structures make fragmentation even worse, and allocating more memory becomes slower and slower.

And yes, eventually the available RAM will shrink.
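
For reference, the kind of defensive check mentioned above costs almost nothing to write - a minimal, self-contained sketch of my own, not taken from any particular project:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  size_t count = 10000;
  int *data = malloc(count * sizeof *data);
  if (data == NULL) {  /* rarely triggers, but costs only one comparison */
    fprintf(stderr, "failed to allocate %zu ints\n", count);
    return 1;
  }
  /* ... use data ... */
  free(data);
  return 0;
}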

Top


Basic Data Structure


Consider this data structure:

typedef struct{
  int id; //MAX id < 10000
  int array1[10000]; //MAX element value <= 3 (fits in 2 bits)
  int array2[10000]; //MAX element value <= 3 (fits in 2 bits)
} my_struct_type;

my_struct_type my_array[10000];

On a 64-bit machine with the usual 4-byte int, the structure takes 4 (id) + 4 x 10000 x 2 (array1, array2) = 80004 bytes. Creating an array of 10000 such structures will occupy about 800MB. And this is C, a fairly low-level language - just imagine what size the equivalent structure would have in Java or C#.
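
A quick way to check these numbers on your own machine is to print the sizes directly (a minimal sketch; the exact figures depend on your compiler's int size and struct padding):

#include <stdio.h>

typedef struct{
  int id; //MAX id <= 10000
  int array1[10000];
  int array2[10000];
} my_struct_type;

int main(void)
{
  printf("one struct: %zu bytes\n", sizeof(my_struct_type));
  printf("10000 structs: %zu bytes\n", sizeof(my_struct_type) * 10000u);
  return 0;
}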
 
Top


Improved Data Structure


Let's take a closer look at the struct. Notice that the arrays hold very small values, so it is natural to change the structure to use the smallest possible data type for the array elements:


typedef struct{
  int id; //MAX id <= 10000
  uint8_t array1[10000]; //MAX element value <= 3 (fits in 2 bits)
  uint8_t array2[10000]; //MAX element value <= 3 (fits in 2 bits)
} my_struct_type;

Now the size of the structure is 4 (id) + 1 x 10000 x 2 (array1, array2) = 20004 bytes, and the total for 10000 structures is about 200MB. That is much more manageable.

But wait! What happens if we need more structures - say 100K? That takes us to 2GB - oops, too much again (to be honest, 200MB is too much anyway).

Most developers (I hope) know that a byte consists of 8 bits. From the data structure comments we can see that an array element needs no more than 2 bits. So why don't we use bitfields instead of a full uint8_t? Well, it's better not to - bitfields are compiler dependent and not that efficient (more on that later).

Theoretically we can reduce the data structure's footprint to a quarter of its size, and that is possible using bitmasks and bitwise operators. Let's do it step by step.


Top


Optimized Data Structure


Start with the observation that array1 and array2 are the same size, so we can pack each element of array2 into the unused bits of the corresponding array1 element:

typedef struct{
  int id; //MAX id <= 10000
  uint8_t array12[10000]; //two arrays combined
} my_struct_type;

Now an array element in binary form looks like array12[0] = 0b0000BBAA, where AA is the array1 element and BB is the array2 element. The size is halved, but there is still enough room to insert more data. So why not store one more element of array1 and one of array2 in the same byte of array12, like this: array12[0] = 0bB1A1B0A0?

This way our struct becomes:

typedef struct{
  int id; //MAX id <= 10000
  uint8_t array12[5000]; //four values in one element
} my_struct_type;

Now the memory footprint is roughly 50MB. And this is not the only benefit of the new data structure, but before discussing that let's consider the possible overhead of the bit manipulation.

Data Packing Algorithm


To store the value val in the array element el (where el is array12[index / 2]) we need to perform the following operations:

1. Put the value in its position within the byte's lower nibble (val_pos is 0 for array1 values and 2 for array2 values):

val = val << val_pos;

2. Depending on the index, shift the value into the upper half of the byte (for odd indices):

val = val << (index % 2 ? 4 : 0);

The modulus can be replaced with a bitwise index & 1, and that bit can then be shifted left by 2 to get rid of the conditional jump:

val = val << ((index & 1 ) << 2);

3. Insert the value into the element:

el |= val;

That makes 5 bitwise operations in total. Bitwise operators are the simplest and most "inexpensive" ones for the CPU, so this will take about the same time as (or only slightly more than) storing a value into a plain array.
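
Putting the steps together, here is a minimal sketch of what packing and unpacking helpers might look like. The function names, the which parameter (0 for array1, 1 for array2) and the clearing step before the OR are my own additions - the steps above assume the target byte starts out zeroed - so the code in the repository linked below may differ:

#include <stdint.h>
#include <stdio.h>

typedef struct{
  int id;                 //MAX id <= 10000
  uint8_t array12[5000];  //four 2-bit values per byte: 0bB1A1B0A0
} my_struct_type;

/* which = 0 selects array1, which = 1 selects array2; val must be 0..3 */
static void pack_value(my_struct_type *s, int which, int index, uint8_t val)
{
  int shift = (which << 1) | ((index & 1) << 2);  /* 0, 2, 4 or 6 */
  uint8_t *el = &s->array12[index >> 1];          /* two logical indices per byte */
  *el &= (uint8_t)~(0x3u << shift);               /* clear the old 2-bit slot */
  *el |= (uint8_t)((val & 0x3u) << shift);        /* insert the new value */
}

static uint8_t unpack_value(const my_struct_type *s, int which, int index)
{
  int shift = (which << 1) | ((index & 1) << 2);
  return (s->array12[index >> 1] >> shift) & 0x3u;
}

int main(void)
{
  static my_struct_type s = { .id = 42 };
  pack_value(&s, 0, 123, 3);  /* array1[123] = 3 */
  pack_value(&s, 1, 123, 1);  /* array2[123] = 1 */
  printf("%d %d\n", unpack_value(&s, 0, 123), unpack_value(&s, 1, 123));
  return 0;
}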

Faster Memory Allocation


As I mentioned earlier, the reduced memory footprint is not the only good thing about the new data structure. It is much easier for the memory manager to allocate lots of small memory chunks than lots of large ones. By reducing the structure's size we increase the allocation speed (possibly compensating for the additional CPU cycles taken by the bitwise operators).
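
A quick (and admittedly crude) way to get a feel for this on your own machine is to time a burst of large allocations against a burst of small ones - a sketch of my own, not part of the example project below:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double time_allocs(size_t chunk, int count)
{
  void **p = malloc((size_t)count * sizeof *p);
  if (p == NULL)
    return -1.0;
  clock_t start = clock();
  for (int i = 0; i < count; i++)
    p[i] = malloc(chunk);               /* error handling omitted for brevity */
  double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
  for (int i = 0; i < count; i++)
    free(p[i]);
  free(p);
  return secs;
}

int main(void)
{
  printf("10000 x 80004-byte chunks: %.4f s\n", time_allocs(80004, 10000));
  printf("10000 x  5004-byte chunks: %.4f s\n", time_allocs(5004, 10000));
  return 0;
}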

Sporadic nature of new data insertion


Another thing worth noting is that new elements are not added all at once; insertion happens more or less sporadically. From the CPU's point of view the creation of a new element will go unnoticed, overshadowed by more CPU-intensive operations. As for the occupied memory, there is no way around it: at some stage the program will struggle with subsequent memory allocations due to RAM fragmentation (if dynamic memory allocation is used, of course).

Top


Example program


A working example can be found here: https://github.com/droukin-jobs/packer - a proof of concept showing that bitwise data placement is not much slower than the traditional approach and is much more memory efficient.

Top


Conclusion


Even with seemingly "unlimited" memory and CPU resources, it is still important to keep track of your resource usage. How many times have you had to sit and wait until some program loads a new screen or processes a new request? (A good example here is SolidWorks - the behemoth is slow even on the fastest systems, despite being a well-known and respected brand.) When programming, it is good to create robust and easy-to-understand code, but after all the testing is done, why not optimize some obvious parts - if you know how?
Top

Tuesday, 1 May 2018

There is no master in modern team

 Of course this is sarcasm, don't take it seriously!

Economics of efficiency

Meet the team

How many software engineers do you need to screw in a light bulb? One? You are wrong! Let's calculate the risks and how we can mitigate them.

What if the Engineer, instead of screwing the light bulb in, starts unscrewing it? Someone will need to watch out for that. An extra pair of eyes will definitely help to catch the wrong direction of turning at an early stage, saving a lot of time in the long run. This is called Peer Review.

But we also need to test the light bulb - what if it's not working? Let's have the Engineers perform a set of tests on the light bulb. Since they are busy testing, we will have to hire two more Engineers to keep up with the work.

What if the light bulb cannot be screwed in at all, e.g. its connector has a different size or shape? We need someone to tell the Engineer what kind of light bulb to use - the Architect will help with that.

What if the light bulb gets changed in the wrong room? Or the customer wants a different light bulb? The Product Manager will help with this.

How about the history of successful bulb replacements - that should definitely help with future installations! Let's hire an Agile coach to teach the team to reflect on their mistakes and re-use successful solutions.



So far we have 4 Engineers + a System Architect + a Product Manager + an Agile Coach = 7 people. We now have a TEAM!


Quality calculations

After the project is done we can safely assume the quality of the final solution is going to be very high. But still, let's calculate the approximate quality and the time spent on an average light-bulb project.

Suppose the average Engineer takes 1 hour to screw in the light bulb and has a quality rate of about 70% - i.e. about 70% of the work is acceptable. Peer review (in theory) will raise this to 100% - 30% x 70% ≈ 80%. The time spent increases by 1 hour.

A meeting with the Architect to discuss which screwing method to use will take 0.5 hours. If the Engineers are less competent with that method, it will reduce their accuracy by, say, 10%.

Let's also see what the customer wants and consult with the Product Manager for 0.5 hours.

And don't forget to listen to what the other team members are doing, and to reflect on previous work, in a short stand-up meeting: 0.25 hours.

The overall quality is now 80% (70% if a different screwing method is used), and the time spent is 3.25 hours.


Imagine the unthinkable


What if, at the very beginning, we hire an Engineer with an 80% quality rating? And let him do a bit of product management, allowing him to decide on the screwing methods?

It still takes 1 hour to screw in the light bulb. Testing might take 0.25 hours, and 1 time out of 5 it will result in re-fixing the light bulb - so add 0.25 hours. Customer interaction will take 1 hour.

So the quality is still 80%, the time spent is 2.5 hours, and - quelle horreur! - it's all done by one person!

And if we don't pay this person the equivalent of at least 3 people from the team above, he will eventually quit. And the project will halt. That's why it is so important to have a team!


New technologies vs Old logic

The levels of abstraction

A master often invents some kind of system to minimize the time spent on trivial tasks. Other people may employ this system to their benefit. Some even improve it. And some - the majority - just blindly copy it without understanding the root causes and the original problems the system was supposed to solve.

Then there are people who wrap the system nicely and start to sell it to others. In order to increase the value these people add some beautiful decorations and supply shiny manuals on how to use the system.

And eventually some other people start to train others on how to sell the system.

In the end the reason the system was created has long been forgotten, the final solution is bloated and complex, and no one quite understands why it should be used - apart from the obvious "it saves time".

What is missing?

Not what, but who - the master is left out of the whole thing. Remember, it was he who invented the system to solve _his_ problems. Your problems may or may not be the same.

So who is missing then?

A person with old-fashioned logic. What is logic? What, you did not study it? And I am not talking about boolean logic - that is only good for binary devices. Humans are not binary, even the simplest ones. But most humans are not complex either: they jump to conclusions based on inadequate information, and this leads them to mistakes.

A person with old-fashioned logic normally tries to understand the deeper layers of a problem. Seeing the wider picture (i.e. having broad experience well outside the main skill set) helps enormously during decision making. All this gives more confidence and leaves less room for error.

So what?

Thorough logical thinking lets you spot the snake-oil salesmen early. With enough experience, a few seconds is sometimes enough to recognize hype and stop wasting time trying to employ it. Remembering that there is nothing new under the Sun helps to critically assess all the "new" technologies. Some of them are indeed new and useful, but the majority are just a variety of haphazardly sewn-together pieces of useless dirty laundry.

It is easy for an average person to be restricted by a set of rules - and that definitely helps a lot during the 'infancy' of a profession. But then experience must take over, saying that different rules apply to different contexts. Even within a given context there are times when it is much easier to bend a rule in order to achieve the result. But only a master can do that. And only a master can then create a system that will take care of the trivial tasks.



Modern problem

There are not many masters around. Moreover, the software engineering philosophy is based on anti-master principles: there is no I in the TEAM, we follow the rules, make sure others understand your code. This is good for a junior developer, but it will not help him grow into a master unless he logically comprehends the various levels of abstraction and why they were created in the first place.

Old solution

It's hard to break the circle of never-ending "improvement techniques" and "efficiency programs". Most people lack common sense and prefer ready-made solutions. That works for some, but the overhead is usually hard to calculate, and it is not necessarily the best solution - and definitely not the simplest one. However, it would be unwise to do nothing about it. I am not talking about destroying the whole software development model, but about rethinking the approach to implementing some parts of it. And for everyone the final solution will be different - the master should decide what to leave in place and what to discard.

Parkinson (of Parkinson's Law) observed that work expands so as to fill the time available for its completion. The same goes for team size - it does not really matter how many members your team has; a small team and a big team will take approximately the same time to finish the same task. Choosing the right time and the right people for the work again requires a master.

The solution is simple. There is only one problem - where to find the master?

Tuesday, 10 April 2018

Project status as of 11-Apr-2018

It has been a while since I wrote anything about my research. Now it's time for a major update, since a lot has happened since October 2017.

New projects and their status

Supercapacitors for automotive use - Active, High priority

After figuring out that my car battery cannot hold its charge for long, I decided to install supercapacitors to take the burden of powering the starter motor off the battery. It took way too long for the capacitors to arrive, and the battery has since lost the ability to hold a charge even for a day. But not all is lost - later I will share some of my findings on this matter.

Touchscreen control Version 2 - Active, Low priority

Having a touchscreen to control stuff may be a cool thing at first, but it becomes a bit boring later. So why not add some really good features to it, such as aggregated feedback and PoE (Power over Ethernet)? Oh yes, and also a GUI-based end-user interface design for smart home automation. At the moment I am stuck with the Cortex-M3 connectivity line due to the crazy amount of datasheets required for a bare-metal setup. It's OK, I will get through it someday.

Custom LED driver - Active, Medium priority

After ordering some cheap 9W-12W MR16 LEDs from China I realized that the power output is not what it should be. Some LEDs were faulty, which gave me one more reason to dismantle them and look inside. As expected, the soldering job on the rectifier bridge was poor. The boost converter also seemed too weak for a 12W LED - instead of 36V it provided only 22V. That means one more project for me: design a robust and cheap power controller for the LEDs.

Old projects put on hold

Time is extremely hard to refund. Some even say it's impossible. So I had to put some projects on hold due to a busy lifestyle and some unexpected difficulties.

Digital DI box

The ATxmega could not cope with real-time sound conversion - it was just capable of converting the analogue data into a digital signal, but converting it back to analogue was too much of a task. I will have to use either a dedicated DAC or a more powerful uC (a Cortex-M3, maybe?).

Abandoned projects

Technically I have not abandoned any projects yet, only changed priorities and some key components. For example, I will no longer use the ATxmega for high-computation/high-data-throughput projects; there are other microcontrollers out there better suited for this, and at a lower cost too. At the same time I am switching mainly to PIC at the low end due to cost and driver availability.

Do you really need to know the first principles to...

... become an engineer? Before today I was 100% sure that you do need the base knowledge of the engineering field you are specializing in. You have to understand some basics of electronics, for example, to be able to design electronic systems. Or you need some basic knowledge of mechanics to build robots. And in both cases you will most likely use computers, and therefore need to know computer basics too.

So what happened today?

A few hours ago (on very short notice) I helped to set up an Arduino-based class for schoolchildren visiting the Uni on a field trip. Some PhD students were organizing it and helping with the technical side: connecting Arduino-based robots, copying libraries onto computers, setting up logins and workspaces for the school students to practice in.

The first problem occurred when the PhDs tried to install a particular Arduino library onto the computers' hard drives. I usually frown upon anything that gets installed on top of the base image, especially if it requires admin rights. Well, I am probably just too lazy to type the admin password into 20+ computers, but I also know that this should not be necessary in order to use the library: it can be done by simply copying the library to the desktop and using an absolute path in the program source code. So the students - intelligent Mechatronics PhDs in their early 20s! - copied the files from a network drive onto the local desktop and asked whether doing it once was enough for the files to (magically?) appear on the rest of the computers.

They were surprised when I told them they would have to manually copy the files onto every computer. And no, I did not want to automate the process - why make their life easier, especially when they were not prepared in the first place?

Another problem was with specifying the local path in the #include statement. I would expect them to know that #include takes a file name as an argument, not a directory name. But no! The students did not realize that a library (in the Arduino sense) is just a bunch of source files, with a .h file as its entry point. I wonder what will happen when they meet a real uC development toolchain?
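
To illustrate the point (the path and library name below are made up), the #include must name the header file itself, not the folder the library lives in:

// this will not compile - it points at a directory:
// #include "C:/Users/student/Desktop/SomeLibrary"

// this is what was actually needed:
#include "C:/Users/student/Desktop/SomeLibrary/SomeLibrary.h"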

Once again, these were intelligent (and some actually quite bright) people who did not fully understand the basics of 1) desktop computing; 2) Arduino programming; 3) file structure. I am pretty sure they are quite successful at what they do despite these knowledge gaps - thanks to the several layers of abstraction imposed by software developers.

Then what is the problem?

The problem lies in their ability to create something non-mainstream, something that requires advanced skills. What looks to me like very basic and necessary information will be an intimidating and alien concept to them. They will spend too much time fighting through wrongly understood principles and improperly learned skills.

Or they will just hire someone to do all the dirty work for them.

Tuesday, 31 October 2017

Power control 2-01 - Testing


Sensors arrived

Finally! The sensors are here. I ordered ACS712T sensors from AliExpress almost a month ago and yesterday received 10 of them. I quickly soldered the sensors to an adapter board, with separately soldered wires for the higher-current path (up to 5A).

Test Setup

According to the ACS712T datasheet the device provides a continuous output of instantaneous current measurements. I also found out that, since the resolution is roughly 30mA with a 10-bit ADC, I will not be able to measure small currents. I am still waiting for some single-supply op-amps; these will help me expand the current range.
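
As a rough sanity check of that figure, here is a tiny sketch assuming the 5A ACS712 variant's 185mV/A sensitivity and a 5V reference on the 10-bit ADC (both are assumptions on my part):

#include <stdio.h>

int main(void)
{
  double vref_mv       = 5000.0;  /* ADC reference voltage, mV (assumed) */
  double adc_steps     = 1024.0;  /* 10-bit ADC */
  double sens_mv_per_a = 185.0;   /* ACS712 5A variant sensitivity, mV/A (assumed) */

  double ma_per_step = (vref_mv / adc_steps) / sens_mv_per_a * 1000.0;
  printf("resolution: about %.0f mA per ADC step\n", ma_per_step);
  return 0;
}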

Due to the instantaneous nature of the current measurement it would be hard to sample high-frequency AC. Since I don't need high precision, I decided to use some analogue circuitry to maintain the voltage level. In the datasheet I found the following rectifier circuit.

Since I wanted the full range I omitted the resistors, leaving only the diode and capacitor.

First impressions

After running a small DC motor with the current sensor and rectifier circuit I was able to measure the current. Of course the data was not accurate - I will need to work out a formula to compensate for the influence of the capacitor and diode. But the general increase/decrease in current was consistent with the motor's behaviour.

The next step will be running an AC load (up to 1A to start with) and plotting the data in real time. I will also add a power calculation, but first I need to verify it against a proper power meter.

Tuesday, 24 October 2017


Data Acquisition Real Time Visualization

This is a new sub-project for the future version of Touchscreen Control (2.0). I won't reveal many details about it right now; suffice to say, once I have all the components I will start bringing everything together for the testing phase.

Setup

After playing with real-time audio conversion/visualization in my Digital DI Box project, I shifted the focus to a slower sampling rate in order to keep up with the changing data. USB FTDI drivers are not suitable for a 2Mb/s continuous data flow, and I am not ready yet to write my own USB driver. Instead I decided to use a simpler MCU (PIC16F1827) with a reasonable number of ADC channels to see if the web interface can cope with a moderate amount of data. I set the sampling period to 100ms, starting with one ADC channel and slowly moving to 5 channels (there will be more, I just ran out of sensors). Right now I get the following info:

  • Potentiometer
  • Potentiometer (inverse)
  • Magnetic sensor
  • Photo cell
  • Temperature sensor

There was also a vibration sensor, but it requires amplification and my op-amps need a dual power supply. The breadboard is already full, and there is not much room for a virtual ground and a whole bunch of additional wires.

Results

The web interface was able to produce some nice-looking real-time graphs. Unfortunately I killed the photo cell - its sensitivity dropped a lot after some miswiring. Otherwise the system is doing great. The next step will be to test the timing by gradually reducing the sampling period in order to capture finer detail (as you can see on the screenshot below, the magnetic sensor - the black line - is not very smooth when displaying a slowly rotating magnetic flux).
