SQL lessons for NoSQL developers

Because no software developer is complete without at least a passing knowledge of the world’s most popular query language

SQL lessons for NoSQL developers
Jamesbrey / Getty Images

If you were raised on MongoDB or learned full stack development in a coding bootcamp, then you might not know SQL. You might have even been told that SQL was bad. However, NoSQL databases are a bit like Hadoop — they had early promise but it fizzled. After 10 years of the NoSQL “revolution,” SQL databases remain the bulk of the database market.

There are several reasons for this. First, many applications require real transactional integrity, which NoSQL databases (despite their claims) do not offer. Second, the relational model is an incredibly useful way to represent data. Third, SQL is still the best-thought-out and most capable query language. Fourth, GraphQL and object-relational mapping (ORM) technologies made developer challenges with relational databases largely moot. And finally, we have the emergence of distributed SQL databases, sometimes called NewSQL databases. (Full disclosure: I work for Yugabyte, provider of an open source distributed SQL database.)

With the COVID-19 pandemic, transactional applications that never would have been trusted to a NoSQL database are rapidly moving to the cloud as they require more scalability and resilience. These are typically SQL database applications. So many developers who learned on document databases or other NoSQL databases now need to know SQL. Get started!

Choose a SQL database

Pick your favorite SQL database. If you pick a PostgreSQL-compatible database such as YugabyteDB the code samples should work without modification. If you pick MariaDB or another MySQL derivative, then you will probably have to change the data types and make minor modifications. The same can be said for Oracle Database or SQL Server. While SQL is a standard there are differences between the underlying database implementations and minor dialect choices that can be non-ecumenical. Regardless of your RDBMS choice, install it and get it running!

A note on file handles

This is not related to SQL, but really if you install any database you need to increase the file handle limits on your OS. For MacOS you can follow the instructions below. For Linux please consult your distribution’s documentation.


launchctl limit maxfiles

And check to see if the limit is a high enough number. I recommend 1048576 for most databases.

If not then edit /etc/sysctl.conf and include the following:


Create or edit /Library/LaunchDaemons/limit.maxfiles.plist and insert the following:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  <plist version="1.0">


sudo launchctl unload -w /Library/LaunchDaemons/limit.maxfiles.plist
sudo launchctl load -w /Library/LaunchDaemons/limit.maxfiles.plist

In whatever terminal session is used to run the database this command needs to be run:

ulimit -n 1048576

Northwind sample database

In order to run some queries you need a database schema and some data. Grab the Northwind sample database:

This is basically the same kind of demo database that came with Microsoft Access back in the day, and that Microsoft still provides for SQL Server: 

northwind sample database IDG

The Northwind database contains products, customers, orders, and order details among other things. It should be intuitive to anyone who has seen a business run.

To create and load the database on YugabyteDB type ./bin/ysqlsh from your install directory. On PostgreSQL type ./bin/psql. Then from this shell type:

\c northwind
\i northwind_ddl.sql
\i northwind_data.sql

This enters the shell, creates the database, and loads the data. Note the slash commands are specific to the shell of PostgreSQL and compatible database systems. They are not SQL.

If you are not a command-line junkie like me you can also use a tool like pgAdmin for either database.

pgadmin ui IDG

The us_states table

In PostgreSQL-compatible databases, from the shell you type \dt to list the tables in the database. The output should look be something like this:

                 List of relations
 Schema |          Name          | Type  |  Owner  
 public | categories             | table | yugabyte
 public | customer_customer_demo | table | yugabyte
 public | customer_demographics  | table | yugabyte
 public | customers              | table | yugabyte
 public | employee_territories   | table | yugabyte
 public | employees              | table | yugabyte
 public | order_details          | table | yugabyte
 public | orders                 | table | yugabyte
 public | products               | table | yugabyte
 public | region                 | table | yugabyte
 public | shippers               | table | yugabyte
 public | suppliers              | table | yugabyte
 public | territories            | table | yugabyte
 public | us_states              | table | yugabyte

To view a table definition you type \d and the name of the table, e.g. \d us_states. In MySQL derivatives the comand is show columns from us_states and in Oracle Database it is describe us_states. Regardless of the database your output should be something like this:

                        Table "public.us_states"
    Column    |          Type          | Collation | Nullable | Default
 state_id     | smallint               |           | not null |
 state_name   | character varying(100) |           |          |
 state_abbr   | character varying(2)   |           |          |
 state_region | character varying(50)  |           |          |
    "us_states_pkey" PRIMARY KEY, lsm (state_id HASH)

SQL select statement

From the shell type select * from us_states. You will see something like:

select * from us_states;
 state_id |      state_name      | state_abbr | state_region
        4 | Arkansas             | AR         | south
       46 | Vermont              | VT         | east
       29 | Nevada               | NV         | west
       25 | Mississippi          | MS         | south
       14 | Illinois             | IL         | midwest
       23 | Michigan             | MI         | north
        1 | Alabama              | AL         | south
       47 | Virginia             | VA         | east
       37 | Oklahoma             | OK         | midwest
       13 | Idaho                | ID         | midwest
       20 | Maine                | ME         | north
       27 | Montana              | MT         | west
       51 | Wyoming              | WY         | west
       39 | Pennsylvania         | PA         | east
       32 | New Mexico           | NM         | west
       45 | Utah                 | UT         | west
        2 | Alaska               | AK         | north
        7 | Connecticut          | CT         | east
       11 | Georgia              | GA         | south

select is what tells the database that we’re retrieving records. The * means all columns. The from clause specifies which tables should be used to retrieve records. In this case only one table is used, us_states. All SQL statements end with a semicolon.

Here is the command dissected into its constituent parts:

select * from us_states;
  • select – retrieve records
  • * – all columns
  • from – list of tables
  • us_states – the table selected
  • ; – end of statement

SQL projection with a where clause

Try a projection and a condition. This query retrieves all of the state abbreviations from the us_states table where the region is south — my favorite part of the US.

From the shell run:

select state_abbr from us_states where state_region='south';
(9 rows)

This time instead of signifying all columns with *, the statement uses what is called a projection, which is just a list of columns. This query includes only the state_abbr field in the list. where is how a list of conditions is specified — only records matching these conditions will be included. In this query, the condition is state_region equals the string literal south, which is delineated by the single quotes.

Here are the constituent parts:

select state_abbr from us_states where state_region='south';
  • state_abbr – a field from the table
  • where – query conditions
  • state_region='south' – a condition, return all rows where state_region has the string literal value south.

Make sure the query has 'real' single quotes. "Double quotes" are used for something else and the stylized or ‘smart quotes’ are not interpreted as the same character.

The products table

For some of the next steps, a different table is required. Take a look at the products table:

\d products
                          Table "public.products"
      Column       |         Type          | Collation | Nullable | Default
 product_id        | smallint              |           | not null |
 product_name      | character varying(40) |           | not null |
 supplier_id       | smallint              |           |          |
 category_id       | smallint              |           |          |
 quantity_per_unit | character varying(20) |           |          |
 unit_price        | real                  |           |          |
 units_in_stock    | smallint              |           |          |
 units_on_order    | smallint              |           |          |
 reorder_level     | smallint              |           |          |
 discontinued      | integer               |           | not null |
    "products_pkey" PRIMARY KEY, lsm (product_id HASH)
Foreign-key constraints:
    "products_category_id_fkey" FOREIGN KEY (category_id) REFERENCES categories(category_id)
    "products_supplier_id_fkey" FOREIGN KEY (supplier_id) REFERENCES suppliers(supplier_id)
Referenced by:
    TABLE "order_details" CONSTRAINT "order_details_product_id_fkey" FOREIGN KEY (product_id) REFERENCES products(product_id)
1 2 3 Page 1
Page 1 of 3