Database Basics

Hi! You might want to know that this post continues ideas from the following.

Recutils — Small Technology Notes from Feb 5, 2020, 6:48am

My original plan for this Sunday was an outline for people sheltering in place who would like a path towards changing careers. That’s running slow, though, and part of it requires understanding what a database is and how one works.

Conveniently, when I posted my recutils rundown to DEV, I added some general information on databases, so that’ll be this post and the career-hopping post will come next Sunday.

Not that you needed to know any of this, since I never announced my plans…

Background on Databases

Before I jump into recutils itself, if you’re not already conversant in databases, I’ll try to give a capsule version. If you’ve taken the undergraduate database class or have written more than a few lines of SQL, you can probably safely skip down to Recutils, in General.

For the rest of you, here are some definitions, and then I’ll go over some basics.

A database is a collection of related data, regardless of the form.

A relational database is a database—surprise!—where we store data elements based on their relationships (surprise again!) to other elements. Sometimes those relationships are described as records, and sometimes they’re described by metadata that connects separate records. The full definition centers on Codd’s Rules.

A database management system (DBMS) is software that mediates access to a database to minimize the chances of a user or program shooting itself in the foot. For example, it might (depending on the system) moderate concurrent access or prevent you from deleting a record that another record points to.
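To make the “shooting itself in the foot” point concrete, here’s a minimal sketch using SQLite through Python’s standard sqlite3 module. The author and book tables and their contents are made up for illustration, and SQLite is only one DBMS among many; others enforce the same kind of rule with their own defaults.

```python
import sqlite3

# Hypothetical mini-world: authors, and books that point back to an author.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this check off by default
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE book ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " author_id INTEGER NOT NULL REFERENCES author(id))"
)
conn.execute("INSERT INTO author (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO book (id, title, author_id) VALUES (10, 'First Book', 1)")

# Deleting the author would leave the book pointing at nothing,
# so the DBMS refuses and raises an error instead.
try:
    conn.execute("DELETE FROM author WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

remaining_authors = conn.execute("SELECT COUNT(*) FROM author").fetchone()[0]
print(delete_blocked, remaining_authors)
```

If you deleted the book first, the same DELETE on the author would go through, because nothing would point to that record anymore.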

Every database has a universe of discourse or mini-world that explains how to interpret the data, whether or not anybody bothered to write it down. We all somehow agree to never talk about this, even though “what does this data actually mean?” is one of the most important questions we can be asking, so I’m bringing it up here…

Instead of a universe of discourse, the thing we really get is a schema, the layout of what we store in each kind of record and how they relate to each other through foreign keys. I’ll explain foreign keys in a bit, rather than try to assemble a coherent definition.
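As a rough illustration of what a schema looks like in practice, the sketch below declares two hypothetical tables in SQLite (via Python’s standard sqlite3 module) and then asks the DBMS to describe them back. The table and column names are my own inventions for the example, and the PRAGMA statements are specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: what each kind of record stores, and how the records relate.
conn.executescript("""
CREATE TABLE author (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE book (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES author(id)  -- the foreign key
);
""")

# table_info rows are (cid, name, type, notnull, dflt_value, pk).
book_columns = [row[1] for row in conn.execute("PRAGMA table_info(book)")]

# foreign_key_list rows are (id, seq, table, from, to, ...).
book_links = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(book)")
]

print(book_columns)  # column names of the book table
print(book_links)    # (local column, referenced table, referenced column)
```

Notice that nothing here says what a “book” or an “author” actually means. That part still lives in the universe of discourse.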

With that out of the way, record sets are called “relations” by theoreticians and implemented as tables, with records implemented as rows and individual elements as columns. The terms are mostly interchangeable, but you’ll find table/row/column used by just about everybody in industry.

The last thing you probably need to know (unless you want to implement a DBMS) is that finding data (“querying”) has what amounts to three steps.

First, we join the relations/tables we need, by which I mean pairing up records/rows that are connected through a common element/field/column. That commonality is (and should be declared as) a “foreign key,” a kind of pointer into the other table’s space. Even if we only care about one table, we still need to name it.

Second, we filter the available records down to those whose elements are interesting to us, similar to writing an if statement in general programming. We might want all the records, of course, which makes this step trivial. Or this might be extremely complicated, depending on what we want.

Third, we project the remaining records to just the fields that interest us. Again, we might want every field/column, making this part easier to specify.
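The three steps above can be sketched in plain Python, with lists of dictionaries standing in for tables. This is only an illustration of the idea, using made-up data and a naive nested loop for the join. A real DBMS is free to reorder and optimize these steps however it likes.

```python
# Hypothetical tables: each list is a table, each dict is a row.
authors = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
]
books = [
    {"id": 10, "title": "First Book", "year": 1990, "author_id": 1},
    {"id": 11, "title": "Second Book", "year": 2010, "author_id": 2},
    {"id": 12, "title": "Third Book", "year": 1995, "author_id": 1},
]

# Step 1: join. Pair every book with the author its foreign key points to.
joined = [
    (book, author)
    for book in books
    for author in authors
    if book["author_id"] == author["id"]
]

# Step 2: filter. Keep only the pairs that interest us, like an if statement.
filtered = [(book, author) for book, author in joined if book["year"] < 2000]

# Step 3: project. Keep only the fields that interest us.
projected = [(book["title"], author["name"]) for book, author in filtered]

print(projected)  # [('First Book', 'Ada'), ('Third Book', 'Ada')]
```

Wanting every record would mean dropping the condition in step 2, and wanting every field would mean returning the joined pairs as they are in step 3.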

If you’re familiar with SQL, you’ll probably recognize the SELECT column-or-columns FROM table-or-tables WHERE conditions-are-true format, where the FROM might list tables arbitrarily or might explicitly use JOIN to connect the tables. If you’re not, recutils doesn’t use SQL, so don’t worry about it.
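For the curious, here’s a small sketch of that format, run against SQLite through Python’s standard sqlite3 module. The tables and rows are hypothetical. It writes the same query twice, once with an explicit JOIN and once by listing the tables in FROM and putting the connection in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE book (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    year      INTEGER NOT NULL,
    author_id INTEGER NOT NULL REFERENCES author(id)
);
INSERT INTO author (id, name) VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO book (id, title, year, author_id) VALUES
    (10, 'First Book', 1990, 1),
    (11, 'Second Book', 2010, 2),
    (12, 'Third Book', 1995, 1);
""")

# Explicit JOIN: the connection between the tables sits in the ON clause.
explicit = conn.execute("""
    SELECT book.title, author.name
    FROM book JOIN author ON book.author_id = author.id
    WHERE book.year < 2000
    ORDER BY book.id
""").fetchall()

# Tables listed in FROM: the connection is just another WHERE condition.
implicit = conn.execute("""
    SELECT book.title, author.name
    FROM book, author
    WHERE book.author_id = author.id AND book.year < 2000
    ORDER BY book.id
""").fetchall()

print(explicit)
print(explicit == implicit)
```

SELECT does the projecting, FROM (with or without JOIN) does the joining, and WHERE does the filtering. The ORDER BY is only there so the two results come back in the same order.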

In a full database course, from here, you’d talk more about query languages like SQL and then look at how to implement a database management system. But for purposes of reading more about databases and tools that use databases, this should give you a decent footing.

Credits: The header image is untitled by an anonymous PxHere photographer, made available under the CC0 1.0 Universal Public Domain Dedication.
