Chapter 8
Programming Basics - Variables and Types
Coding is within everyone's reach.
The real challenge lies in the architecture of the code. It needs to be testable, adaptable in case of new feature ideas, maintainable over the long term, and understandable by the 50 developers who will work on it in succession.
Although each language has its own peculiarities, the most well-known ones have enormous similarities in basic concepts and syntax. This can be explained by a common ancestor that has greatly influenced modern languages: the C language. This allows us, as developers, to switch from one language to another without too much difficulty.
That's why sharp companies don't care about hiring developers who already know their technologies. During interviews, they focus on reasoning, architectural concepts, and domain knowledge. Even in a new language, a developer who is comfortable with these topics will, within just a few weeks, become much better than a simple "code monkey" who has been using the same language for 10 years.
The language is just one of the many tools in a developer's toolkit.
Today, you start learning the basics.
The goal is, of course, not to make you a developer. It is to help you better understand what we do day to day and perhaps open the door to a different way of thinking. 💪
The examples will use JavaScript. It is one of the most widely used languages in the world, with syntax inspired by C. Plus, you can use your web browser's console to execute JavaScript without installing anything.
All browsers offer a console. In Chrome, press F12 and then go to the "Console" tab.
Variables
A variable is a box in which you store a value for reuse later.
a = 1 + 2
b = a * 3
The variable "a" is assigned the value 3, so the variable "b" is 3 * 3 = 9.
As the name suggests, a variable can vary; it can change value.
numberOfTechGuideReaders = 200
// You share it with your friends and then:
numberOfTechGuideReaders = 220
Conversely, there is the concept of a constant: once a value is stored, it cannot be changed.
const bestTechGuide = "Nico"
// This will cause an error
bestTechGuide = "Someone else"
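To try the difference in the console, here is a minimal sketch. It assumes the `let` keyword to declare a variable that may change (the short examples above skip it) and wraps the forbidden reassignment in `try`/`catch` so the rest of the script keeps running. The names are reused from the examples above; `errorName` is added for illustration only.

```javascript
// "let" declares a variable: its value may change later.
let numberOfTechGuideReaders = 200;
numberOfTechGuideReaders = 220; // allowed

// "const" declares a constant: reassigning it throws a TypeError.
const bestTechGuide = "Nico";
let errorName = "none"; // illustration only: remembers which error occurred
try {
  bestTechGuide = "Someone else";
} catch (error) {
  errorName = error.name;
}

console.log(numberOfTechGuideReaders); // 220
console.log(bestTechGuide); // "Nico"
console.log(errorName); // "TypeError"
```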
Variables are physically stored in the machine's memory, primarily in those famous RAM sticks. So you have to be careful not to overload it.
In a variable, you can put all sorts of things: text, numbers, and much more complicated stuff. In C, the developer must master each variable and its content to manually reserve the necessary space in memory and then free it when it becomes unnecessary.
Imagine 32 gigabytes of memory as a piece of furniture with 32 billion shelves. You want to assign the value "Nico" to a variable. The developer must count, "There are 4 characters in this word, each character takes up 1 byte in memory, so I need to reserve 4 bytes, plus 1 byte that C uses to mark the end of the text." And woe to them if they try to store something larger than the space they reserved!
🥳 Good news for you!
Modern languages, through the technologies that execute them, manage memory allocation automatically.
However, this is not an excuse to not know what’s inside your variables!
Types
For system coherence and maintainability, mastering the content of variables is essential. We try to explicitly type our variables in the code as much as possible.
There are many types. Some are natively integrated into languages, and some are added by developers. The most common primitive types are:
- Integer, called an int. Examples: 5, 42, -12...
Some systems require some precision on the size of the number because they store it differently. Is it a number less than 256? Thus, there are more precise subtypes: short int, long int...
- Floating-point number, called a float. Examples: 3.14, 0.00001...
Processors are optimized to work with integers, which they store exactly, but they can only approximate most decimal numbers. Therefore, languages differentiate them to adapt their processing and memory representation.
For example, in Python, 0.1 + 0.2 is not equal to 0.3 but to 0.30000000000000004 (JavaScript gives the same result). It’s much less trivial than it seems if you’re working on millions of bank transactions or planetary alignments. Fortunately, the community has developed tools to compensate for the native shortcomings of processors.
Float also has its subtype: the double (double-precision float). It allows for double precision and the representation of very small or very large numbers. However, it takes up more memory and can be slower to process.
- Character, called a char. Examples: 'a', '?', '5'...
Yes, a number is also a character. We’re talking here about its representation as a symbol, for display somewhere, for example. It’s the same difference as between the number 5 and its Roman representation V. A character is surrounded by quotes, allowing the distinction between 5 and '5'.
- String, called a string. Examples: "Nico", ..., "5" 😛.
Some languages, such as Java, distinguish between a character and a string. The two types don’t offer the same features. For example, you can count the number of characters in a string.
Single quotes versus double quotes mark the distinction between a single character and a string of a single character: 'a' vs "a".
The concept of a standalone character tends to fade away in recent languages. The developer directly manipulates strings. So don't worry, in JavaScript, you can use single or double quotes interchangeably.
A quick aside: Java and JavaScript, aside from four letters in their names, have absolutely nothing in common.
- Boolean or bool. It’s either true or false. Some systems use true or false, others use 0 or 1.
- Array. It’s a collection of values. For example: [1, 2, 3, "toto"].
They are generally represented by square brackets.
Languages offer numerous structures for handling sets of data. Some ensure the uniqueness of a value, like a set (or a dictionary, for its keys). Some ensure that all values are of the same type. Some allow you to store anything and everything. Some link the values together so they can be traversed in order, etc.
The array in JavaScript is one of the very (too) permissive structures. You can store cabbages and carrots in it. I could rant for hours about devs who use arrays haphazardly to transmit sets of data without caring about what they contain.
In the following, we will use it to represent a collection of values of the same type.
- Void: nothingness. When you know there’s no value. We’ll get back to this.
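To make this list concrete, here is a short sketch you can paste into the console. It relies on JavaScript's `typeof` operator and on the built-in `Set` structure. The variable names (`sum`, `totalInCents`, `mixed`, `unique`) are made up for illustration, and counting money in integer cents is only one common workaround among others.

```javascript
// "typeof" reports the type of a value at runtime.
console.log(typeof 42); // "number"
console.log(typeof 3.14); // "number": JavaScript has no separate int and float
console.log(typeof "Nico"); // "string"
console.log(typeof 'a'); // "string": there is no char type either
console.log(typeof true); // "boolean"
console.log(typeof undefined); // "undefined": one of JavaScript's ways to say "no value"
console.log(Array.isArray([1, 2, 3])); // true

// The number 5 and the text "5" are different values.
console.log(5 === "5"); // false

// Floating-point rounding: 0.1 + 0.2 is not exactly 0.3.
const sum = 0.1 + 0.2;
console.log(sum); // 0.30000000000000004
console.log(sum === 0.3); // false

// One common workaround for money: count in integer cents.
const totalInCents = 10 + 20;
console.log(totalInCents / 100); // 0.3

// Arrays accept mixed types; a Set keeps each value only once.
const mixed = [1, 2, 3, "toto"];
const unique = new Set([1, 1, 2, 3, 3]);
console.log(mixed.length); // 4
console.log(unique.size); // 3
```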
Some languages are strongly typed (more precisely, statically typed), meaning the developer declares types in the code and they are checked before the program runs. Java, C++, etc.
Other languages are not, or are very loosely typed. The runtime environment figures out the content of the variables on its own, while the program runs. Python, JavaScript, etc.
Strongly typed languages offer more robustness and often more performance. It’s indispensable in critical or embedded environments: planes, robots...
Typing also allows for greater project maintainability in the long term. It forces the developer to understand their data and construct their code to adhere to contracts. If a piece of code expects a number and you pass it text, it will crash!
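In JavaScript, nothing checks that contract for you. The sketch below first shows what the language does silently when types get mixed, then a hand-written check. `addShippingFee` and its flat fee of 5 are hypothetical, chosen only for illustration.

```javascript
// Without any check, JavaScript silently converts types.
console.log("100" + 5); // "1005": the number is turned into text
console.log("5" * 2); // 10: the text is turned into a number

// Hypothetical function that enforces its contract by hand.
function addShippingFee(price) {
  if (typeof price !== "number") {
    throw new TypeError("price must be a number");
  }
  return price + 5; // assumed flat fee, for illustration only
}

console.log(addShippingFee(100)); // 105

let message = "no error";
try {
  addShippingFee("100");
} catch (error) {
  message = error.message;
}
console.log(message); // "price must be a number"
```

In a language with explicit types, this kind of check is written once in the function's declaration and verified before the program even runs.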
This weak typing is largely what gave languages like JavaScript or PHP a bad reputation. They are easy to use, with a quick learning curve, and project development goes faster... But it goes too fast! The flexibility of these languages opens the door to poor practices:
- Unintentional: developers with low skills can create convoluted architectures. And since it seems to work, they have no reason to further educate themselves and improve their practices.
- Intentional: even seasoned developers are tempted to take bad shortcuts if they are under too much pressure with impossible deadlines.
This explains a large part of the massive technical debt we encounter in web companies based on these technologies. I’m not saying that there’s no debt with strongly typed languages, but it’s often more limited and easier to pay off because architectures are, by necessity, better designed.
The communities that, 20 years ago, promoted the flexibility of weak typing ("Hey, it’s easier, why make things harder for yourself!"), have realized their mistakes and are now trying to introduce typing.
Thus, PHP is becoming more strongly typed with each version. There are still a few things that annoy me, but it’s getting much closer to what strongly typed languages do.
The JavaScript ecosystem now includes TypeScript, an overlay that provides very solid typing. Personally, I love TypeScript; it’s simple and effective. It will be even better when it becomes completely independent of JavaScript and spreads natively in web browsers.
Python, for its part, has introduced a module that allows for the explicit declaration of types in code. For now, it’s mainly used to make life easier for developers; it doesn’t have a real impact on script execution: there won’t be an error if the typing isn’t respected during execution.
All these examples offer a hybrid mode: the developer can type if they want to, but nothing forces them to. This allows for some controlled flexibility and, in some cases, to stick with the philosophy of Duck typing:
When I see a bird that walks like a duck, swims like a duck, and quacks like a duck, I call that bird a duck. <James Whitcomb Riley>
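As a rough illustration of duck typing, the sketch below never asks what an object is, only whether it can do what is needed. The objects `duck`, `robot`, and `stone` and the helper `makeItQuack` are all invented for this example.

```javascript
// Two unrelated objects that both happen to have a "quack" method.
const duck = { quack: () => "Quack!" };
const robot = { quack: () => "Beep... quack." };
// An object without that method.
const stone = { weight: 3 };

// Hypothetical helper: it only checks for the behavior it needs.
function makeItQuack(thing) {
  if (typeof thing.quack !== "function") {
    return "This thing cannot quack.";
  }
  return thing.quack();
}

console.log(makeItQuack(duck)); // "Quack!"
console.log(makeItQuack(robot)); // "Beep... quack."
console.log(makeItQuack(stone)); // "This thing cannot quack."
```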
Objects
Developers create their own types representing the business concepts they handle.
That’s why it’s crucial, before starting a project, to brief the developers as much as possible on the business domain: its vocabulary, concepts, organization, current needs, and future ideas... All of this can deeply affect the design of the code.
This approach to design, aiming to mirror the real-world business domain, is called Domain-Driven Design (DDD).
In a company, there is a marketing department and a finance department: there will be two distinct software modules.
Both departments track sales. Marketing wants to know the number of payments and the revenue generated.
Finance needs the transaction details to calculate VAT returns, payment partner fees...
As a developer, at first glance:
- These two teams naturally don’t use the same vocabulary: payment vs transaction. Is it really the same concept?
For example, one possible view: it's time to renew the customer’s subscription. They need to pay: make a payment. For that, we trigger a bank transaction to debit their account. It fails. In 2 days, we’ll initiate another transaction. Both transactions incurred bank fees. In this example, a payment consists of a set of transactions.
We refine the model until it is coherent and establish a ubiquitous language: one common to everyone.
- The marketing team doesn’t seem interested in refunds, fraud, etc. They only look at sales.
- The marketing payment concept only includes a date and an amount. Finance needs VAT, the name of the partner used, etc.
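The observations above can be sketched with plain JavaScript objects. Everything here is hypothetical: the field names, the fee of 25 cents, and the partner name "BankA" are invented to show one payment made of two transactions, seen differently by each department.

```javascript
// One payment (a subscription renewal) made of two bank transactions.
const payment = {
  date: "2024-02-04",
  amount: 10,
  currency: "€",
  transactions: [
    { date: "2024-02-04", status: "failed", bankFeeInCents: 25, partner: "BankA" },
    { date: "2024-02-06", status: "succeeded", bankFeeInCents: 25, partner: "BankA" },
  ],
};

// Marketing view: only a date and an amount.
function toMarketingView(p) {
  return { date: p.date, amount: p.amount };
}

// Finance view: every transaction counts, since each one incurred a fee.
function totalBankFeesInCents(p) {
  return p.transactions.reduce((total, t) => total + t.bankFeeInCents, 0);
}

console.log(toMarketingView(payment)); // { date: "2024-02-04", amount: 10 }
console.log(totalBankFeesInCents(payment)); // 50
```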
This modeling can take a lot of time. Generally, a business team that has been using a vocabulary for years doesn’t care to deepen or correct it if it turns out to be wrong or full of misnomers. For them, it’s a waste of time that doesn’t add anything to their daily life. It’s not always easy to organize sessions, explain that they are corrupting concepts, and agree that everyone speaks the same language. In theory, the software should adapt to the business. In practice, if the business is too inconsistent, developers might push back a bit to ensure system reliability.
My favorite example: product vs offer. I think I encounter it at least twice a year 👌.
Once all these questions are answered, we can identify the involved entities, their links, and the possible actions. In my example, for finance:
- I have a client.
- This client can pay and get refunded through bank transactions.
- These transactions have a nominal amount: numerical amount + currency.
- I have several banking partners.
- Each partner has a contract.
- In this contract, there are fees calculated according to rules: €0.01 for 100 transactions, etc.
To represent all this in code, we create dedicated types. We thus have Client, Partner, Transaction, Fee, Contract, Amount, etc., types.
To create a type, we must describe it. This description, called a “class,” will then allow us to build concrete instances: objects.
Imagine an architect drawing a plan for a house. From this plan, they can build an entire housing development of almost identical houses.
Languages diverge a bit on this part. In most, we’ll have something like this:
A client consists of an email and an array of transactions.
// A minimal "Transaction" type: an amount, a currency, and a date.
class Transaction {
  constructor(
    public amount: number,
    public currency: string,
    public date: string
  ) {}
}

class Client {
  // Declaration of the "email" attribute as a "string".
  email: string
  /*
   * Declaration of the "renewalHistory" attribute,
   * which starts as an empty array of "Transaction" objects.
   */
  renewalHistory: Transaction[] = []

  // The constructor runs on "new Client(...)" and stores the email.
  constructor(email: string) {
    this.email = email
  }

  /*
   * Declaration of the "requestRefund" function.
   * It takes as input a parameter "transactionToRefund"
   * of type "Transaction".
   * It returns nothing: void.
   */
  requestRefund(transactionToRefund: Transaction): void {
    // Processing to refund.
  }
}

// Creating the "robert" object, which is a "Client".
const robert = new Client('robert@gmail.com')
// Creating a transaction
const subscriptionRenewal = new Transaction(10, '€', '2024-02-04')
robert.requestRefund(subscriptionRenewal)
This is demonstration code: the type annotations are TypeScript, so it won’t run as is in the browser console 😛.
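If you want something to paste into the console, here is a plain JavaScript sketch of the same idea. The `Transaction` class, the `refundRequested` flag, the second client `alice`, and the rule that a client can only refund their own transactions are all assumptions made for illustration, not a real refund process.

```javascript
// Hypothetical "Transaction" type: an amount, a currency, and a date.
class Transaction {
  constructor(amount, currency, date) {
    this.amount = amount;
    this.currency = currency;
    this.date = date;
    this.refundRequested = false;
  }
}

class Client {
  constructor(email) {
    this.email = email;
    this.renewalHistory = []; // will hold Transaction objects
  }

  // Assumed rule: a client can only refund one of their own transactions.
  requestRefund(transactionToRefund) {
    if (!this.renewalHistory.includes(transactionToRefund)) {
      throw new Error("Unknown transaction for this client");
    }
    transactionToRefund.refundRequested = true;
  }
}

// One class (the plan), two objects (the houses).
const robert = new Client("robert@gmail.com");
const alice = new Client("alice@example.com");

const subscriptionRenewal = new Transaction(10, "€", "2024-02-04");
robert.renewalHistory.push(subscriptionRenewal);
robert.requestRefund(subscriptionRenewal);

console.log(subscriptionRenewal.refundRequested); // true
console.log(robert instanceof Client); // true
console.log(alice.renewalHistory.length); // 0
```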
The double slash "//" starts a single-line comment, and "/*" ... "*/" encloses a multi-line one. This comment syntax is shared by most C-inspired languages.
In theory, the code should be clear and explicit enough to be read by anyone, but in practice, comments save a lot of situations.
In the example, we see that a client has a possible action: requestRefund. The declaration of this action is called a "function" and, more specifically, in the context of an object: a "method." I have no idea why we use two different words 🤷🏻‍♂️.
Robert is a client just like 42 is a number and “toto” is a string.
By the way, in some languages, primitive types are also represented as objects.
On the string type, you can do things like "toto".length to directly get the length of a text.
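A few more calls of the same kind, as a quick sketch to try in the console. All of these are standard JavaScript methods and properties; the values are arbitrary.

```javascript
// Strings, numbers, and arrays all come with built-in properties and methods.
console.log("toto".length); // 4
console.log("toto".toUpperCase()); // "TOTO"
console.log((42).toString()); // "42"
console.log((3.14159).toFixed(2)); // "3.14"
console.log([1, 2, 3].includes(2)); // true
```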