Python - Getting started with SQLite3

As we build projects, we'll soon run across a need to store data. At first, we may try to use CSV files or another form of a text file. Maybe even an Excel file. While they can work, they may not be able to handle the amount of data we are using. We could move to a full SQL database such as MySQL or PostgreSQL. This would add a lot of complexity that may not be worth it.

Please note: I'm going to assume you have some knowledge of the SQL language for this post.

That leads us to SQLite3.

No server required
No configuration needed
Easy upgrade path to MySQL or PostgreSQL
Already installed with Python

Once we import SQLite3 into our Python script we have full access to a complete database system. Don't let the lite fool you. This has plenty of power to get everything up and running fast. If you do outgrow it, never fear, because it's SQL interface compliant it'll allow quick shifting over to a server-based SQL Database.

import sqlite3

Connecting to the Database

SQLite3 provides the .connect() method to connect to the database. A filename is passed into the method which SQLite3 will use. If the file doesn't exist, it will be created. 😀

conn = sqlite3.connect("mydatabase.db")

Another option is to pass ":memory:" instead of a file name. This will create the database in memory. This will be faster but is only temporary. Could be useful if you were pulling data from an API and wasn't planning on keeping it local.

conn = sqlite3.connect(":memory:")

A brand new SQLite3 database is just an empty file (or empty space in memory). Think of it like a filing cabinet without labels, folders or even drawers. It's up to us to define the tables that go inside and store the data.

The connection stored in conn is a hallway that gives us access to the data store. To transport data and commands we'll need a waiter that will go through the connection and do our bidding. That waiter is called a cursor.

A database cursor is an object that allows us to execute SQL statements, like creating tables, inserting data and fetching results.

cur = conn.cursor()

I'm using conn and cur to hold the connection and the cursor. But you could use anything you want. These are common names for these objects. You could use sql_connection and sql_cursor. While longer they are more descriptive. Completely up to you and your team.

Creating a Table

The cursor object has a method called .execute() which we can pass an SQL statement that will be executed on the database.

cur.execute("CREATE TABLE people(name, address, city, state, zip)")

Here we CREATE a TABLE named people with the column (field) names of name, address, city, state, and zip.

Side note: In case the table already exists, we can add IF NOT EXISTS to the CREATE TABLE statement so we don't get an error if we run the code over again.

cur.execute("CREATE TABLE IF NOT EXISTS people(name, address, city, state, zip)")

If you wanted to start with an empty database we can drop the table first.

cur.execute("DROP TABLE people")

In case you want to delete the data from the table but not the table itself, the DELETE FROM can be useful.

cur.execute("DELETE FROM people")

Something you might have noticed we didn't declare the data types. Most SQL databases require the datatypes. SQLite3 uses dynamic typing. Similar to Python, we don't have the data type set in stone at the start. It's common though to put a data type in place so other developers know what to expect and also some developer tools will also pick up on the types.

Expression Affinity	Column Declared Type
TEXT	"TEXT"
NUMERIC	"NUM"
INTEGER	"INT"
REAL	"REAL"
BLOB or NONE	"" (empty string)

Affinity means SQLite3 will attempt to put the data into that data type. For example, if we tried to store "3" into a column with the type of INT, SQLite3 will convert it to an integer for us. Affinity goes beyond conversion, but that's outside the scope of this post.

cur.execute("""CREATE TABLE IF NOT EXISTS people (
                    name TEXT NOT NULL UNIQUE,
                    address TEXT NOT NULL,
                    city TEXT NOT NULL,
                    state TEXT NOT NULL,
                    zip TEXT NOT NULL
        );""")

Note: The triple quotes are not part of the SQL but instead I'm using the multi-line string so I can make the SQL code easier to read.

Now we've provided the types we are designing for. Also, by specifying NOT NULL SQLite3 will raise an error if we try to create a record without that field. UNIQUE will require all records to hold a unique value in that field.

Inserting data

SQL has the INSERT INTO command which we can use to insert data into the table. There are a few ways of doing this.

Here we are insert into the data table, into the columns (name, age, gender) the values of (Russ, 55 and Male).

cur.execute("""INSERT INTO data 
                         (name, age, gender) 
                         VALUES
                         ("Russ", 55, "Male")""")

We could use an f-string to build this. But something very common is:

row = ["russ",55,"male"]

cur.execute("INSERT INTO data (name, age, gender) VALUES (?, ?, ?)", row)

The ? marks will be replaced by each element in the list row. The order is very important. There must be the same number of columns, same number of ? and same number of elements in the list.

Insert Many

Chances are we don't want to enter each record one by one. SQLite3 has a method executemany() the works very close to execute except it takes a list or lists. Or tuples. Some sort of sequence. You get the idea.

For example:

# Sample data for sales table
sales_data = [
    (1, 'Product A', 100, 25.00),
    (2, 'Product B', 50, 30.00),
    (3, 'Product C', 75, 20.00),
]

We'll create a new table and then insert the entire list in one command.

# Create the sales table
cur.execute("""
      CREATE TABLE IF NOT EXISTS sales 
      (product_id INTEGER PRIMARY KEY, 
       product_name TEXT, 
       quantity INTEGER, 
        unit_price REAL)"""
)

# Insert sample data into the sales table
cur.executemany('INSERT INTO sales VALUES (?, ?, ?, ?)', sales_data)

Inserting using a Dictionary

Using a dictionary object, we will use named placeholders instead of ?.

We'll use this dictionary to insert the data.

sales_data =[
    {'product':'Product A', 'quantity':100,'unit_price': 25.00},
    {'product':'Product B', 'quantity':50, 'unit_price':30.00},
    {'product':'Product C', 'quantity':75, 'unit_price':20.00},
]

cursor.executemany(""INSERT INTO sales VALUES 
           (:product, :quantity, :unit_price)
            """, sales_data)

The placeholder are the keys in the dictionary. Place a : in front of the placeholder.

Please Note: Starting with 3.12 a Deprecation Warning will be raised if named placeholders with a sequence. (non-dictionary) The named placeholders will only be useable with a dictionary.

fetching the data

The first step is to execute a SELECT command on the cursor object.

cur.execute("SELECT * FROM data")

Now that we've primed the cursor with the SELECT command we can now fetch the data.

fetchall()

Using fetchall() will store in results a list of each record, each stored in a tuple.

results = cur.fetchall()

for result in results:
     print(result)

fetchone()

Using fetchone() will retrieve the next row. If there is no more rows, it will return None.

while True:
    record = cur.fetchone()

    if record is None:
         break

    print(record)

We're using an infinite loop that will break out when there are no more records.

fetchmany()

fetchmany will retrieve a specific number of records. If the number is more than the records available, it will retrieve just what is left. When called a second time, it'll pull the next batch.

while True:
     results = cur.fetchmany(5)

     if (len(results) == 0:
          break

    for result in results:
        print(results)

The infinite loop with continue until the number of records is zero.

Closing the database

Two steps should happen when you're finished with the database. First, using the connection; conn; the changes must be committed, and then the connection needs to be closed.

# Commit the changes
conn.commit()

# Close the connection
conn.close()

ID

Database tables commonly have an ID present. These are normally designated as the PRIMARY KEY. SQLite3 automatically adds one called rowid column. This is very fast and can be called using a SELECT statement.

We can create our own ID by specifying INTEGER PRIMARY KEY. By specifying this ourselves, SQLite3 won't use the rowid as the primary key.

Let's look at two pieces of code where the difference is one uses the built in ID and the other one is using an ID we put in place.

This is common between the two code examples:

import sqlite3

conn = sqlite3.connect('data/rowid_test.db')

cur = conn.cursor()

sales_data = [
    ('Product A', 100, 25.00),
    ('Product B', 50, 30.00),
    ('Product C', 75, 20.00),
]

Here we are creating our own ID.

# Adding a product_id with `INTEGER PRIMARY KEY` will use auto-increment
cur.execute("""CREATE TABLE IF NOT EXISTS sales
                  (product_id INTEGER PRIMARY KEY, 
                   product_name TEXT, 
                   quantity INTEGER, 
                   unit_price REAL)""")

# Passing NULL tells SQLite3 to use the autoincremented number
cur.executemany('INSERT INTO sales VALUES (NULL, ?, ?, ?)', sales_data)

cur.execute("SELECT * FROM sales")

We can also rely on SQLite3's rowid.

# We don't need to mention the id
cur.execute("""CREATE TABLE IF NOT EXISTS sales (
                          product_name TEXT, 
                          quantity INTEGER, 
                          unit_price REAL)""")

# it'll automatically handle the `rowid` creation
cur.executemany('INSERT INTO sales VALUES (?, ?, ?)', sales_data)

# Even though we used `*`, we do have to specify that we also want the `rowid`.
cur.execute("SELECT rowid, * FROM sales")

The fetch and displaying will be the same.

results = cur.fetchall()

for result in results:
    print(result)

Both pieces of code out the exact same thing:

(1, 'Product A', 100, 25.0)
(2, 'Product B', 50, 30.0)
(3, 'Product C', 75, 20.0)

Now why would we want to specify our own id when SQLite3 creates a perfectly fine id by itself? One reason is rowid cannot be used as part of a FOREIGN KEY.

Note: A FOREIGN KEY links two tables together. This is more advanced SQL and until you need to make a choice, use whichever you are comfortable with.

There is an AUTOINCREMENT argument that can be used. In the code above I didn't use AUTOINCREMENT. By default, SQLite3 will use the next highest number that currently exists. For example, if id 10 was used and then deleted, 10 could be reused if the highest existing number was 9. Using AUTOINCREMENT changes the algorithm and it'll be the next highest number that ever existed. Whether you use AUTOINCREMENT is another choice since it does use more computing power. If your project requires never using an ID again, it's best to create your own ID and mark it as AUTOINCREMENT.

Context Manager

SQLite3 can use a context manager which makes using SQLite3 a bit easier. Creating the connection is performed by a with context block. When the code exits the with context block., all transactions are committed. This does not close the database.

Instead of:

conn = sqlite3.connect("mydatabase.db")

Use this:

with sqlite3.connect(db_file) as conn:
    # code block

A connection is created and stored in conn. The connection is available outside of the with context block.

Why doesn't this close the connection when leaving the with context block? Opening and closing are expensive actions. In most cases it's fine to leave them open. It's the transactions that you are performing that need to be committed or else they can be lost.

For example, let's say you've opened a connection, wrote some records out but during processing to get more records, the script crashes. If you haven't committed the transactions, those will be lost.

A better solution is to use a context manager so you can see exactly the code writing to the Database. Each time it writes you enter a new context. Leave the context and the transactions are committed. Crash the script while outside of the context manager, the transactions are saved. And since we weren't opening and closing the database, we keep the speed up.

You could of course just commit after each execute without using a context manager. Which method should we use? It's completely up to your (and team's) coding style.

Here is an example of a context manager:

import sqlite3

db_file = "example.db"

# Use the with statement to create a context manager
with sqlite3.connect(db_file) as conn:

    cur = conn.cursor()
    cur.execute("SELECT * FROM table_name")

    # Fetch all the results as a list of tuples
    results = cur.fetchall()

# The connection has been committed

# Print the results
for row in results:
    print(row)

# Only need to close, since the commit already happened.
cur.close()

Updating data

After we've inserted records there is always the possibility of needing to update the data.

Earlier we stored the sales_data in the sales table. Turns out Product A sold another 500 items, so we need to update the quantity to 600.

with sqlite3.connect('data/sales_database.db') as conn:
    cur = conn.cursor()
    cur.execute("""UPDATE sales
                             SET quantity = 600
                             WHERE product_id = "Product A"
                         """)

What we are telling SQLite3 is to UPDATE the table sales and SET the quantity to 600 for all records WHERE product_id is Product A.

The WHERE clause is very important without it, we're not limiting SQLite3 as to which rows. Without the WHERE clause it will do all rows.

Handling Errors

As with anything, errors can happen. With a try-except block we can catch those errors and handle them. In some cases, we can even recover without interrupting the user's workflow. At the very least we can provide a useful message to the user to enable them to fix the issue.

try:
    # Perform database operations
except sqlite3.Error as e:
    print("Error:", e)

SQLite3 has a list of result and error codes. With any try-except block we'll want to go through the error codes that may appear. Finding those errors that are common and those that we can write code to skirt around and catching those.

A common error that might come up is sqlite3.IntegrityError. This would appear if during an INSERT a unique field was about to have an already used value.

try:
    # SQL INSERT command
except sqlite3.IntegrityError as e:
    # Take care of the error

As the developer it's up to you as to how to deal with the Error. Should the script stop? Should the step be skipped? Should the user be asked to reenter the information? Should it be logged, and the process continue without raising an error to the user?

`sqlite_schema` the first table

When SQLite3 creates the database, there is a table created before ours. It's where SQLite3 stores information about the database. Most of the time you won't be changing this table. This is one of those tidbits that might be useful and might not. For me sometimes I like having the full picture. During debugging you might see this table appear. If so it's nice to understand where it came from.

The table can go by other names, they are all pointing at the same table. This depends on the version of SQLite3 you're using.

sqlite_master
sqlite_temp_schema
sqlite_temp_master

result = cur.execute("SELECT name FROM sqlite_schema")

When we execute a SELECT SQL command an object is returned. This object is an iterator that can provide the results.

The .fetchone() method will return the next item in the iterator. If none are left, None will be returned.

>>> res.fetchone()
('movie',)

What is returned is a tuple with a single item, the name of a table in the database. Each time we call .fetchone() we will get the next name until we run out.

Upsert

What the heck is an Upsert? It's not SQL command like UPDATE or INSERT. But instead, a way of putting them together.

Let's say you have a database where the product name or number is the PRIMARY KEY or UNIQUE. This is so you know you'll always have unique product information. You've just been handed new pricing. You might think, you'll need to go through find the products that already exists and update them and insert the rest. This seems doable and something expected. Why not let SQLite3 handle this work?

The first step is to make sure the product_name will be unique. Being set as PRIMARY KEY will cause SQLite3 to error out when we try to INSERT one already existing. UNIQUE can also be used if we wanted to leave rowid as the PRIMARY KEY.

CREATE TABLE sales (
            product_name TEXT PRIMARY KEY, 
            quantity INTEGER, 
            unit_price REAL);

Next is the INSERT.

INSERT INTO sales(product_name, quantity, unit_price)
            VALUES("Product A", 100, 125.00)

Great but since "Product A" already exists, we'll get a conflict. So, we tell SQLite what to do on a conflict.

ON CONFLICT(product_name) DO UPDATE SET
        quantity = excluded.quantity,
        unit_price = excluded.unit_price
WHERE product_name = excluded.product_name;

Putting this all together along with passing multiple records.

cur.executemany(
    """
            INSERT INTO sales(product_name, quantity, unit_price)
            VALUES(?, ?, ?)
            ON CONFLICT(product_name) DO UPDATE SET
                quantity = excluded.quantity,
                unit_price = excluded.unit_price
            WHERE product_name = excluded.product_name
            """, new_data
)

The new_data list can contain both new records to insert plus existing product_names to update. SQLite3 will automatically handle it all based on uniqueness of the product_name.

Conclusion

We've covered the basics of using SQLite3. I'd highly suggest play around with SQLite3 to get the hang of it. Since the Database is a single file, it's easy to copy to others. It's also a standard data structure that other programming languages can also access.

Resources

Below is links to web sites I used for this post. These can also be used to extend your studies..