Chapter 5
Databases

Analysis, decision-making, customer journey, they're everywhere.
Reading time : 5 minutes


Computer systems need to keep in memory a plethora of information. Customer accounts, subscription status, or even simply the website's content.

These pieces of information are stored via software called "database management systems." A large-scale website utilizes multiple database technologies, each specialized for a particular use. Among others:

  • Working with huge batches of data.
  • Or conversely, on a single data point.
  • Search engine.
  • Identifying relationships.
  • Storage of documents and media.
  • Ultra-fast access.

Today, we will focus on relational database management systems. These are the technologies most commonly used by developers in their daily lives. You might have already heard of MySQL, Oracle, SQL Server, or PostgreSQL.

Let's build a newsletter 💪

Imagine a database as an Excel file.
A tab in the file represents what we call a "table".
A column in a table is called a "field".
A row in a table is called a "tuple", "entry", "row", or "record".


For example, if you want to model in a database this newsletter, the wonderful people who have subscribed to it, and the mail sendings:


Table: chapter

identifier title publication_date content
database-sql The Tech Guide #5 - Databases 2024-01-05 Welcome to…
frontend-backend The Tech Guide #4 - Front-end and Back-end 2023-09-10 👋, welcome…

Table: subscription

identifier email
1 foo@bar.com
2 hello@world.com

Table: sent_mail

chapter_identifier subscription_identifier send_date
database-sql 1 2024-01-05
database-sql 2 2024-01-06

You get two tables that represent concrete objects: chapter and subscription. Each line is manually or automatically assigned a unique identifier. It's called a "primary key".

The third table, sent_mail, uses these keys to relate chapters and subscriptions to each other. It manipulates "foreign keys".

This way of organizing data explains the name "relational database".

SQL

To manipulate these tables, we use Structured Query Language.

SQL, pronounced S.Q.L and formerly known as sequel, aims to be a natural language close to English. It very easily allows writing queries to carry out the 4 basic operations of data processing:

  • Creation. Example: someone subscribes to the newsletter.
  • Reading. Example: check who has already received the chapter to avoid sending it twice.
  • Updating. Example: correcting a typo in the content.
  • Deletion. Example: unsubscribing from the newsletter.

For example, to obtain the details of this chapter:
SELECT title, content FROM chapter WHERE identifier = “database-sql”

Easy, isn't it? 😉
We can do crazy stuff in SQL, but the basic syntax is really accessible to everyone.

It's exactly this query that is played if you display the page https://the-tech.guide/chapter/database-sql/. The URL is analyzed to extract the chapter's identifier. It is then injected into the SQL query allowing to construct the page.

The database serves in this case to code only one page and to automatically generate all the variants.

😱 And if someone tampers with the URL hoping to modify the underlying SQL query?

That's called SQL injection. It's the flaw exploited in 90% of website hacks.
Every use of data public to users must be secured. Every day, robots randomly try thousands of injections on the most known sites.

NoSQL

Relational databases, despite their power, show limits on large volumes of data.

The advent of Cloud and BigData has seen the emergence of new technologies categorized as “NoSQL”.

Broadly, NoSQL refers to databases that are not relational. It’s a super broad designation encompassing a whole range of technologies.
Fun fact: NoSQL technologies often use SQL or a derivative 🤪. They tried to make up for this poor naming with Not Only SQL.

In a NoSQL database, data is organized completely differently, at least in the internal engine of the software. This allows for easier distribution across different machines and thus infinitely increases computing power and available memory.