Programming for Recruiters - Machine-Level Architecture

Machine-level architecture

As we’ve mentioned above, when we talk about machine-level architectures, we are literally talking about physical systems that interact with each other over wires using electricity. These electric impulses trigger simple computations on the atomic items of software that we call bits.

What the flip-flop is a bit? Good question! A bit is a basic building block in information theory (not my words). It can have 2 values: 0 and 1. Multiple bits can describe more complex entities, such as integers (101010 means 42 in binary) or text (01100001 01110000 01110000 01101100 01100101 01110100 01110010 01100101 01100101 stands for appletree in ASCII binary). Why do we do something so unreadable? The answer is simple: 0 and 1 match perfectly with electrical impulses. When there is electricity, the bit says 1; when there’s none, it says 0. Everything in modern software is composed of bits. We can thank Claude Shannon for the bit as a unit of information, and John von Neumann for the stored-program computer model that most modern machines still follow.
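If you would like to see this with your own eyes, here is a minimal Python sketch that reads the two bit patterns from the paragraph above. The variable names are just my own picks for illustration.

```python
# Interpret a string of 0s and 1s as a whole number written in base 2.
number = int("101010", 2)
print(number)  # 42

# Each group of 8 bits (one byte) maps to one ASCII character.
bits = "01100001 01110000 01110000 01101100 01100101 01110100 01110010 01100101 01100101"
text = "".join(chr(int(byte, 2)) for byte in bits.split())
print(text)  # appletree

# And back again: turn each character into its 8-bit pattern.
encoded = " ".join(format(ord(character), "08b") for character in text)
print(encoded == bits)  # True
```

The same bits mean different things depending on how we agree to read them: as a number in the first case, as letters in the second.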

Bits can be stored in many ways depending on how we want to use them. Before we jump into the usage, we might want to take a higher-level look at how a computer actually works.

The Central Processing Unit

The Central Processing Unit (from now on, the CPU) is the part of the computer that does the computation and all the calculations that your machine needs to operate. The CPU receives inputs from the rest of the system and calculates the next state the system should be in. Here are some of the things that the CPU calculates for us:

  • When you move your mouse, where the pointer should show up next on your screen.
  • When you open LinkedIn Recruiter and search for “data scientist”, which people it should show you.
  • When you are preparing a report in Excel and want to aggregate data, what the result should be.

As you can see, this is a wide variety of problems to solve, and CPUs do most of the heavy lifting here. The architecture of a CPU is quite complex and is out of scope for this article. But here is how you can imagine it:

  • There is a part that does the math and the logical comparisons, this we call the Arithmetic Logic Unit (ALU)
  • There is a part that reads the instructions and tells the other parts what to do and when, this we call the Control Unit (CU)
  • There are parts that we call the registers, this is where we store temporary information about calculations
  • There are parts that we call buses, these are the systems that carry information between the CPU and memory
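To make the four parts a bit more tangible, here is a toy model in Python. Everything in it is made up for illustration: the instruction names (LOAD, STORE, ADD, SUB, AND), the register names R0 and R1, and the functions alu and run. A real CPU is far more involved, so treat this as a sketch of the idea only.

```python
def alu(operation, a, b):
    """Arithmetic Logic Unit: does the actual math and logic."""
    if operation == "ADD":
        return a + b
    if operation == "SUB":
        return a - b
    if operation == "AND":
        return a & b
    raise ValueError(f"unknown operation: {operation}")


def run(program, memory):
    """Control Unit: reads each instruction and directs the other parts."""
    registers = {"R0": 0, "R1": 0}  # tiny, fast scratch space inside the CPU
    for instruction in program:  # fetch the next instruction
        name, *args = instruction  # decode it
        if name == "LOAD":  # the bus carries a value from memory to a register
            register, address = args
            registers[register] = memory[address]
        elif name == "STORE":  # the bus carries a value from a register to memory
            register, address = args
            memory[address] = registers[register]
        else:  # hand the work to the ALU, keep the result in the first register
            target, source = args
            registers[target] = alu(name, registers[target], registers[source])
    return memory


memory = [40, 2, 0]
program = [
    ("LOAD", "R0", 0),
    ("LOAD", "R1", 1),
    ("ADD", "R0", "R1"),
    ("STORE", "R0", 2),
]
result = run(program, memory)
print(result[2])  # 42
```

The program loads two numbers from memory into registers, asks the ALU to add them, and writes the answer back to memory.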

Speaking of memory, it is probably time to talk about that as well. If you’d like to learn more about CPUs and how they operate, I highly recommend checking out the Wikipedia article, which contains a bunch of useful information on this topic.

The Memory

Let’s zoom back to bits for a second. We have checked out how the CPU takes bits and turns them into new ones through complex math. When we think about our brain and how we operate in real life, this is usually the process that happens:

  1. Something around us is happening
  2. We get information about previous similar situations from our memory
  3. We do something according to the memories and the current situation

Just like our real-life decisions, computations in software usually require access to memory as well. The idea is quite simple, but physical space is limited, so it takes some tricks to make things actually fast and reliable. The closer the memory is to the CPU, the faster we can access it. However, the space around the CPU is limited, so the memory we can access quickly is usually small. The farther we go from the CPU, the slower we get, but the more we can store. Something for something I guess. Here are the different types of memory that we like to differentiate:

  • Caches - The fastest type of memory. When we are talking about hardware, we are mostly referring to caches that live inside or right by the CPU. We categorize these caches as Level 1, Level 2 and so on, Level 1 being the fastest of them. Caches are very tiny compared to the rest of the memory, typically ranging from tens of kilobytes at Level 1 to a few megabytes at the higher levels. However, access to them is rapid beyond belief, making them very valuable assets for people who are trained to use them.
  • Random-access memory - The long name for RAM. Slower than the cache, RAM is still very fast to read from and write to. The name “random-access” comes from the past, when software engineers could only read and write things in memory in order, making software very rigid and difficult to work with. From the 1940s there was development to make things a bit easier using magnets (how flipping cool) and it kinda worked, making early versions of random-access memory viable. RAM is the memory that is used by most software that is running on your machine. The important state of your software is stored in RAM, such as your photos when you open iCloud or the text that you are typing when you are working on a Word document.
  • Disks, drives, the storage - Now we have gotten really far from the CPU, into the land of cold storage. This is where your photographs are stored when you are not working with them. This is where your games are installed. This is where the data that is collected about you is stored when you are browsing the web unsuspectingly. Cold storage is slow to access compared to caches and RAM.
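The trade-off between small-and-fast and big-and-slow can be sketched in a few lines of Python. The TinyCache class, its capacity of 2 and the file names are all hypothetical. Hardware caches work on memory addresses rather than names, and the rule used here for what to throw out (drop whatever was used least recently) is one common choice that I am assuming for the example.

```python
from collections import OrderedDict


class TinyCache:
    """A small, fast store that keeps only the most recently used items."""

    def __init__(self, slow_storage, capacity=2):
        self.slow_storage = slow_storage  # stands in for RAM or a disk
        self.capacity = capacity  # caches are small on purpose
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.items:
            self.hits += 1
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]
        self.misses += 1
        value = self.slow_storage[key]  # the slow trip far away from the CPU
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item
        return value


disk = {"photo": "beach.jpg", "game": "chess.exe", "report": "q3.xlsx"}
cache = TinyCache(disk, capacity=2)
for key in ["photo", "photo", "game", "report", "photo"]:
    cache.read(key)
print(cache.hits, cache.misses)  # 1 4
```

Only the second read of the photo is a hit. By the time we ask for the photo a third time, it has already been pushed out to make room for the game and the report, which is exactly the price we pay for the cache being small.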

You still wouldn’t notice the difference with your eyes between a single cache read and a single cold storage read. In relative terms, however, the gap between the different types of memory is huge.

Some cool numbers can be read in this StackOverflow post (although I don’t necessarily agree that every programmer needs to know that light travels 1ft in 1ns).
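To get a feel for the gap, here is a small Python sketch that compares the layers. The numbers are rough, order-of-magnitude assumptions in the spirit of the commonly quoted latency lists, not measurements. Real values depend heavily on the hardware and the year.

```python
# Rough, order-of-magnitude latencies in nanoseconds (assumed, not measured).
latency_ns = {
    "Level 1 cache read": 1,
    "RAM read": 100,
    "SSD random read": 100_000,
    "Hard disk seek": 10_000_000,
}

baseline = latency_ns["Level 1 cache read"]
for name, nanoseconds in latency_ns.items():
    slowdown = nanoseconds // baseline
    print(f"{name}: about {nanoseconds:,} ns, {slowdown:,}x the Level 1 cache")
```

Put differently: if a Level 1 cache read took one second, then under these assumptions a RAM read would take under two minutes, while a hard disk seek would take months.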

There would be a lot more to say about the different types of memory. I recommend checking out this article for some more reading on how to differentiate between them.

Conclusions

We have gone quite far into hardware-land, so I am going to zoom away from the hardware architectures for now. I will probably dig deeper into this area in a future post. However, in our next post, we are going to explore the horizons of the architecture that most developers are familiar with: code-level architectures.