1. Introduction to Databases
3. NoSQL Databases
SQLite Tutorial - 2. Relational Database Concepts

2.1 Introduction to Relational Databases

Overview of relational database management systems (RDBMS)

Overview of Relational Database Management Systems (RDBMS) in SQLite

Introduction to RDBMS

Relational Database Management Systems (RDBMS) are software systems that store, manage, and facilitate access to relational databases. In a relational database, data is organized into tables (relations), which consist of rows and columns. Each table represents an entity, and relationships between tables are defined through foreign keys.

Key Features of RDBMS:

  1. Data Organization: Data is stored in tables with rows and columns.
  2. SQL (Structured Query Language): The standard language for interacting with RDBMS.
  3. Data Integrity: Ensures accuracy and consistency of data through constraints like primary keys, foreign keys, unique constraints, and checks.
  4. Transaction Management: Supports ACID (Atomicity, Consistency, Isolation, Durability) properties for reliable transactions.
  5. Scalability and Performance: Optimized for efficient data retrieval and management.
  6. Security: Provides mechanisms for authentication, authorization, and encryption.

SQLite Database

SQLite is a popular, self-contained, serverless, and zero-configuration RDBMS. It is embedded into applications and does not require a separate server process. SQLite is widely used in mobile applications, embedded systems, and small to medium-sized desktop applications due to its lightweight nature and ease of use.

Key Features of SQLite:

  1. Self-contained: Entire database is stored in a single file.
  2. Serverless: Does not require a separate server process.
  3. Zero Configuration: No setup or administration required.
  4. Transactional: Supports ACID transactions.
  5. Cross-platform: Works on various operating systems without modification.
  6. Lightweight: Minimal footprint, making it suitable for embedded systems.

Example: Managing a Simple Database in SQLite

Let's create a simple database to manage students, courses, and enrollments.

Step 1: Creating the Database

SQLite databases are created as a single file. You can create and manage SQLite databases using the SQLite command-line tool or SQLite Browser.

sqlite3 school.db

Step 2: Creating Tables

We'll create three tables: Students, Courses, and Enrollments.

-- Creating Students table CREATE TABLE Students ( StudentID INTEGER PRIMARY KEY, StudentName TEXT NOT NULL ); -- Creating Courses table CREATE TABLE Courses ( CourseID INTEGER PRIMARY KEY, CourseName TEXT NOT NULL ); -- Creating Enrollments table CREATE TABLE Enrollments ( EnrollmentID INTEGER PRIMARY KEY, StudentID INTEGER, CourseID INTEGER, EnrollmentDate TEXT, FOREIGN KEY (StudentID) REFERENCES Students (StudentID), FOREIGN KEY (CourseID) REFERENCES Courses (CourseID) );

Step 3: Inserting Data

Insert data into the Students, Courses, and Enrollments tables.

-- Inserting data into Students table INSERT INTO Students (StudentID, StudentName) VALUES (1, 'Alice'); INSERT INTO Students (StudentID, StudentName) VALUES (2, 'Bob'); INSERT INTO Students (StudentID, StudentName) VALUES (3, 'Charlie'); -- Inserting data into Courses table INSERT INTO Courses (CourseID, CourseName) VALUES (101, 'Math'); INSERT INTO Courses (CourseID, CourseName) VALUES (102, 'Physics'); INSERT INTO Courses (CourseID, CourseName) VALUES (103, 'Chemistry'); -- Inserting data into Enrollments table INSERT INTO Enrollments (EnrollmentID, StudentID, CourseID, EnrollmentDate) VALUES (1, 1, 101, '2024-01-15'); INSERT INTO Enrollments (EnrollmentID, StudentID, CourseID, EnrollmentDate) VALUES (2, 2, 102, '2024-01-16'); INSERT INTO Enrollments (EnrollmentID, StudentID, CourseID, EnrollmentDate) VALUES (3, 1, 103, '2024-01-20'); INSERT INTO Enrollments (EnrollmentID, StudentID, CourseID, EnrollmentDate) VALUES (4, 3, 101, '2024-01-18');

Step 4: Querying the Database

Perform queries to retrieve data from the database.

Retrieve all students:

SELECT * FROM Students;

Output:

StudentIDStudentName
1Alice
2Bob
3Charlie

Retrieve all courses:

SELECT * FROM Courses;

Output:

CourseIDCourseName
101Math
102Physics
103Chemistry

Retrieve all enrollments:

SELECT * FROM Enrollments;

Output:

EnrollmentIDStudentIDCourseIDEnrollmentDate
111012024-01-15
221022024-01-16
311032024-01-20
431012024-01-18

Retrieve all enrollments with student and course names:

SELECT e.EnrollmentID, s.StudentName, c.CourseName, e.EnrollmentDate FROM Enrollments e JOIN Students s ON e.StudentID = s.StudentID JOIN Courses c ON e.CourseID = c.CourseID;

Output:

EnrollmentIDStudentNameCourseNameEnrollmentDate
1AliceMath2024-01-15
2BobPhysics2024-01-16
3AliceChemistry2024-01-20
4CharlieMath2024-01-18

Conclusion

SQLite is a powerful, lightweight RDBMS suitable for embedded and small to medium-sized applications. Its serverless architecture, zero-configuration setup, and support for standard SQL make it an excellent choice for developers seeking simplicity and efficiency. By following the principles of RDBMS and using SQLite's features, you can effectively manage and query relational data in your applications.

Evolution and importance of relational databases

Evolution of Relational Databases

1. Early Beginnings (1960s-1970s)

  • E. F. Codd's Relational Model (1970): Proposed the concept of organizing data into tables (relations) composed of rows and columns.
  • IBM System R: One of the first implementations of Codd's model, leading to the development of the SQL language.

2. Commercial Systems (1980s)

  • Oracle, IBM DB2, and Microsoft SQL Server: Emerged as commercial RDBMS implementations.
  • SQL Standardization: SQL became the standard query language for relational databases.

3. Expansion and Optimization (1990s)

  • Performance Enhancements: Introduction of indexing, query optimization, and transaction management.
  • Client-Server Architecture: Separation of database server and client applications improved scalability and flexibility.

4. Open Source Movement (2000s)

  • MySQL, PostgreSQL: Gained popularity as open-source RDBMS alternatives.
  • SQLite: Introduced as a lightweight, embedded database engine ideal for applications requiring minimal setup and configuration.

5. Modern Era (2010s-Present)

  • NoSQL Alternatives: Emergence of NoSQL databases for unstructured data and specific use cases (e.g., MongoDB, Cassandra).
  • NewSQL: Combines the scalability of NoSQL with ACID properties of traditional RDBMS.
  • Cloud Databases: Services like Amazon RDS, Google Cloud SQL, and Microsoft Azure SQL Database offer managed database solutions.

Importance of Relational Databases

  1. Data Integrity and Accuracy: Enforced through ACID (Atomicity, Consistency, Isolation, Durability) properties.
  2. Structured Data Organization: Tabular format simplifies data management and querying.
  3. Scalability and Performance: Modern RDBMS support large-scale applications with optimized indexing and query processing.
  4. Standardization and Interoperability: SQL standard allows for widespread use and integration with various tools and applications.
  5. Transaction Management: Ensures reliable execution of concurrent transactions.

SQLite: A Case Study

Overview

  • SQLite: A C library that provides a lightweight, disk-based database. Unlike most SQL databases, SQLite does not have a separate server process.
  • Use Cases: Ideal for embedded systems, mobile applications, and small-to-medium sized applications needing a simple, self-contained database solution.

Example and Output

Let's create a simple SQLite database to manage a book collection.

  1. Setup: Import SQLite library and create a database.
  2. Create Table: Define a table for storing book information.
  3. Insert Data: Add some books to the collection.
  4. Query Data: Retrieve and display the books.

Example Code

import sqlite3 # Connect to SQLite database (or create it if it doesn't exist) conn = sqlite3.connect('books.db') cursor = conn.cursor() # Create a table cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author TEXT NOT NULL, published_year INTEGER ) ''') # Insert some data cursor.execute(''' INSERT INTO books (title, author, published_year) VALUES ('The Great Gatsby', 'F. Scott Fitzgerald', 1925), ('To Kill a Mockingbird', 'Harper Lee', 1960), ('1984', 'George Orwell', 1949) ''') # Commit the transaction conn.commit() # Query the data cursor.execute('SELECT * FROM books') rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

(1, 'The Great Gatsby', 'F. Scott Fitzgerald', 1925) (2, 'To Kill a Mockingbird', 'Harper Lee', 1960) (3, '1984', 'George Orwell', 1949)

Explanation

  1. Setup and Connection: We connect to an SQLite database file (creating it if it doesn't exist).
  2. Table Creation: The books table is created with columns for ID, title, author, and published year.
  3. Data Insertion: Three books are inserted into the books table.
  4. Data Query: All rows from the books table are retrieved and printed.

This simple example demonstrates the core functionalities of an RDBMS using SQLite: creating a schema, inserting data, and querying data. SQLite's ease of use and minimal setup make it an excellent choice for small-scale applications and prototyping.

Key concepts: Entities, attributes, relationships, and tables

Key Concepts in SQLite Database

1. Entities

  • Definition: An entity is a real-world object or concept that can have data stored about it. In a relational database, entities are represented by tables.
  • Example: In a library database, entities could include books, authors, and patrons.

2. Attributes

  • Definition: Attributes are the properties or characteristics of an entity. In a table, attributes are represented by columns.
  • Example: For a books entity, attributes could include title, author, published_year, and ISBN.

3. Relationships

  • Definition: Relationships describe how entities are related to each other. In a relational database, relationships are represented by foreign keys.
  • Example: In a library database, a relationship might exist between books and authors, indicating which author wrote which book.

4. Tables

  • Definition: Tables are the structures that store data in rows and columns. Each table corresponds to an entity and each column corresponds to an attribute.
  • Example: A books table stores information about books, with columns for each attribute (title, author, etc.).

Example in SQLite Database

Let's create a simple SQLite database to manage a library, focusing on the entities books and authors, their attributes, and the relationship between them.

1. Setup: Import SQLite library and create a database.

import sqlite3 # Connect to SQLite database (or create it if it doesn't exist) conn = sqlite3.connect('library.db') cursor = conn.cursor()

2. Create Tables: Define tables for storing book and author information.

# Create authors table cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER ) ''') # Create books table cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, FOREIGN KEY (author_id) REFERENCES authors (id) ) ''')

3. Insert Data: Add some authors and books to the database.

# Insert data into authors table cursor.execute(''' INSERT INTO authors (name, birth_year) VALUES ('F. Scott Fitzgerald', 1896), ('Harper Lee', 1926), ('George Orwell', 1903) ''') # Insert data into books table cursor.execute(''' INSERT INTO books (title, author_id, published_year) VALUES ('The Great Gatsby', 1, 1925), ('To Kill a Mockingbird', 2, 1960), ('1984', 3, 1949) ''') # Commit the transaction conn.commit()

4. Query Data: Retrieve and display the books along with their authors.

# Query to join books and authors cursor.execute(''' SELECT books.title, authors.name, books.published_year FROM books JOIN authors ON books.author_id = authors.id ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 1925) ('To Kill a Mockingbird', 'Harper Lee', 1960) ('1984', 'George Orwell', 1949)

Explanation

  1. Entities and Tables: We defined two entities, authors and books, and created corresponding tables.
  2. Attributes and Columns: Each table has columns representing attributes. For example, the books table has title, author_id, and published_year.
  3. Relationships: The relationship between books and authors is represented by the author_id foreign key in the books table. This key references the id column in the authors table, establishing a link between a book and its author.
  4. Querying: We performed a SQL JOIN operation to retrieve books along with their corresponding authors, demonstrating how relationships and attributes are used to organize and retrieve data.

This example demonstrates the foundational concepts of relational databases and how they are implemented in SQLite. By defining entities, attributes, relationships, and tables, we can effectively model and manage real-world data.


2.2 Relational Data Model

Understanding the relational data model

Understanding the Relational Data Model in SQLite

The relational data model organizes data into tables (relations) with rows (tuples) and columns (attributes). Each table represents an entity, and relationships between entities are established using keys (primary and foreign keys).

Key Components

  1. Tables (Relations): Store data about entities.
  2. Columns (Attributes): Define the properties of entities.
  3. Rows (Tuples): Represent individual records in a table.
  4. Primary Key: Uniquely identifies each row in a table.
  5. Foreign Key: Establishes a relationship between tables.

Example: Library Database in SQLite

We'll create a library database to manage books and authors, illustrating the relational data model.

1. Setup: Import SQLite library and create a database.

import sqlite3 # Connect to SQLite database (or create it if it doesn't exist) conn = sqlite3.connect('library.db') cursor = conn.cursor()

2. Create Tables: Define tables for storing book and author information.

# Create authors table cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER ) ''') # Create books table cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, FOREIGN KEY (author_id) REFERENCES authors (id) ) ''')
  • Authors Table: Represents the authors entity with id (primary key), name, and birth_year attributes.
  • Books Table: Represents the books entity with id (primary key), title, author_id (foreign key), and published_year attributes. The author_id column establishes a relationship with the authors table.

3. Insert Data: Add authors and books to the database.

# Insert data into authors table cursor.executemany(''' INSERT INTO authors (name, birth_year) VALUES (?, ?) ''', [ ('F. Scott Fitzgerald', 1896), ('Harper Lee', 1926), ('George Orwell', 1903) ]) # Insert data into books table cursor.executemany(''' INSERT INTO books (title, author_id, published_year) VALUES (?, ?, ?) ''', [ ('The Great Gatsby', 1, 1925), ('To Kill a Mockingbird', 2, 1960), ('1984', 3, 1949) ]) # Commit the transaction conn.commit()
  • Insert Authors: Add records to the authors table.
  • Insert Books: Add records to the books table with author_id linking each book to an author.

4. Query Data: Retrieve and display books along with their authors.

# Query to join books and authors cursor.execute(''' SELECT books.title, authors.name, books.published_year FROM books JOIN authors ON books.author_id = authors.id ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 1925) ('To Kill a Mockingbird', 'Harper Lee', 1960) ('1984', 'George Orwell', 1949)

Explanation

  1. Tables and Columns: We defined authors and books tables with appropriate columns to represent attributes.
  2. Primary Key: Each table has a primary key (id) to uniquely identify records.
  3. Foreign Key: The author_id in the books table references the id in the authors table, establishing a relationship.
  4. Insert and Query Data: We inserted sample data into the tables and performed a join query to retrieve books along with their authors.

Understanding Relationships

The relational data model uses keys to establish and enforce relationships:

  • One-to-Many Relationship: Each author can write multiple books, but each book has one author. This is represented by the foreign key author_id in the books table referencing the id in the authors table.

By organizing data into tables and defining relationships using keys, the relational data model ensures data integrity and efficient retrieval, as demonstrated in this SQLite example.

Relational schema and its components

Relational Schema and Its Components in SQLite Database

A relational schema is a blueprint of how the database is structured, defining tables, their columns (attributes), data types, and relationships between tables. The schema provides a logical view of the database, outlining the tables and the relationships among them.

Key Components of a Relational Schema

  1. Tables: The core components that store data in rows and columns.
  2. Columns (Attributes): Define the properties or characteristics of the data stored in a table.
  3. Data Types: Specify the kind of data that can be stored in each column (e.g., INTEGER, TEXT, REAL).
  4. Primary Keys: Unique identifiers for rows in a table.
  5. Foreign Keys: Columns that create a relationship between tables by referencing the primary key of another table.
  6. Constraints: Rules enforced on the data in the tables (e.g., NOT NULL, UNIQUE).

Example: Library Database Schema in SQLite

Let's define a relational schema for a library database to manage books and authors.

1. Define the Schema

CREATE TABLE authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER ); CREATE TABLE books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, FOREIGN KEY (author_id) REFERENCES authors (id) );
  • authors Table: Represents authors with columns for id, name, and birth_year.
  • books Table: Represents books with columns for id, title, author_id, and published_year.

Components in Detail

  1. Tables:

    • authors
    • books
  2. Columns (Attributes):

    • authors: id, name, birth_year
    • books: id, title, author_id, published_year
  3. Data Types:

    • id: INTEGER
    • name: TEXT
    • birth_year: INTEGER
    • title: TEXT
    • author_id: INTEGER
    • published_year: INTEGER
  4. Primary Keys:

    • authors.id
    • books.id
  5. Foreign Keys:

    • books.author_id references authors.id
  6. Constraints:

    • NOT NULL on columns that cannot be empty
    • FOREIGN KEY constraint to maintain referential integrity

Example Implementation and Output

2. Implement the Schema in SQLite

import sqlite3 # Connect to SQLite database (or create it if it doesn't exist) conn = sqlite3.connect('library.db') cursor = conn.cursor() # Create authors table cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER ) ''') # Create books table cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, FOREIGN KEY (author_id) REFERENCES authors (id) ) ''') # Insert data into authors table cursor.executemany(''' INSERT INTO authors (name, birth_year) VALUES (?, ?) ''', [ ('F. Scott Fitzgerald', 1896), ('Harper Lee', 1926), ('George Orwell', 1903) ]) # Insert data into books table cursor.executemany(''' INSERT INTO books (title, author_id, published_year) VALUES (?, ?, ?) ''', [ ('The Great Gatsby', 1, 1925), ('To Kill a Mockingbird', 2, 1960), ('1984', 3, 1949) ]) # Commit the transaction conn.commit() # Query to join books and authors cursor.execute(''' SELECT books.title, authors.name, books.published_year FROM books JOIN authors ON books.author_id = authors.id ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 1925) ('To Kill a Mockingbird', 'Harper Lee', 1960) ('1984', 'George Orwell', 1949)

Explanation

  1. Schema Definition: The schema is defined using SQL CREATE TABLE statements, specifying columns, data types, and constraints.
  2. Tables and Relationships: The books table has a foreign key (author_id) referencing the authors table, establishing a relationship between books and their authors.
  3. Data Insertion: Sample data is inserted into the authors and books tables.
  4. Querying Data: A JOIN query retrieves data from both tables, demonstrating how the schema supports relational operations.

By defining a relational schema, we organize data into structured tables with clear relationships, ensuring data integrity and enabling efficient data retrieval and manipulation. This example illustrates the core components and operations of a relational schema in SQLite.

Keys: Primary keys, foreign keys, candidate keys

Understanding Keys in SQLite Database

In relational databases, keys are crucial for ensuring data integrity and defining relationships between tables. The main types of keys are primary keys, foreign keys, and candidate keys.

1. Primary Keys

  • Definition: A primary key is a unique identifier for a row in a table. It ensures that each record in the table can be uniquely identified.
  • Characteristics: Unique, non-null.

2. Foreign Keys

  • Definition: A foreign key is a column or a set of columns in one table that refers to the primary key of another table. It establishes a relationship between the two tables.
  • Characteristics: Can be null, references a primary key in another table.

3. Candidate Keys

  • Definition: A candidate key is a column or a set of columns that can uniquely identify a row in a table. Each table can have multiple candidate keys, but only one primary key.
  • Characteristics: Unique, non-null.

Example: Library Database in SQLite

We'll create a library database with tables for books and authors, demonstrating the use of primary keys, foreign keys, and candidate keys.

1. Setup: Import SQLite library and create a database.

import sqlite3 # Connect to SQLite database (or create it if it doesn't exist) conn = sqlite3.connect('library.db') cursor = conn.cursor()

2. Create Tables with Keys

# Create authors table with primary key cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER, email TEXT UNIQUE ) ''') # Create books table with primary key and foreign key cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, isbn TEXT UNIQUE, FOREIGN KEY (author_id) REFERENCES authors (id) ) ''')
  • authors Table:
    • Primary Key: id
    • Candidate Key: email
  • books Table:
    • Primary Key: id
    • Candidate Key: isbn
    • Foreign Key: author_id referencing authors.id

3. Insert Data into Tables

# Insert data into authors table cursor.executemany(''' INSERT INTO authors (name, birth_year, email) VALUES (?, ?, ?) ''', [ ('F. Scott Fitzgerald', 1896, 'fitzgerald@example.com'), ('Harper Lee', 1926, 'lee@example.com'), ('George Orwell', 1903, 'orwell@example.com') ]) # Insert data into books table cursor.executemany(''' INSERT INTO books (title, author_id, published_year, isbn) VALUES (?, ?, ?, ?) ''', [ ('The Great Gatsby', 1, 1925, '9780743273565'), ('To Kill a Mockingbird', 2, 1960, '9780061120084'), ('1984', 3, 1949, '9780451524935') ]) # Commit the transaction conn.commit()

4. Query Data

# Query to join books and authors cursor.execute(''' SELECT books.title, authors.name, books.published_year, books.isbn FROM books JOIN authors ON books.author_id = authors.id ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 1925, '9780743273565') ('To Kill a Mockingbird', 'Harper Lee', 1960, '9780061120084') ('1984', 'George Orwell', 1949, '9780451524935')

Explanation

  1. Primary Keys:

    • In the authors table, id is the primary key.
    • In the books table, id is the primary key.
  2. Foreign Keys:

    • In the books table, author_id is a foreign key that references id in the authors table, establishing a relationship between books and their authors.
  3. Candidate Keys:

    • In the authors table, email is a candidate key, which is unique for each author.
    • In the books table, isbn is a candidate key, which is unique for each book.

Summary

  • Primary Keys ensure each record in a table is unique and identifiable.
  • Foreign Keys establish relationships between tables, maintaining referential integrity.
  • Candidate Keys provide alternative ways to uniquely identify a row in a table.

This example demonstrates how to define and use primary keys, foreign keys, and candidate keys in an SQLite database to manage and query relational data effectively.

Integrity constraints: Entity integrity, referential integrity

Understanding Integrity Constraints in SQLite Database

Integrity constraints are rules applied to database tables to ensure the accuracy and consistency of the data. Two key types of integrity constraints are entity integrity and referential integrity.

1. Entity Integrity

  • Definition: Ensures that each row in a table can be uniquely identified. This is typically enforced using primary keys.
  • Purpose: Prevents duplicate rows and ensures that each record is unique and identifiable.

2. Referential Integrity

  • Definition: Ensures that relationships between tables remain consistent. This is enforced using foreign keys.
  • Purpose: Maintains the validity of links between tables, ensuring that a foreign key value always refers to an existing primary key value in the related table.

Example: Library Database in SQLite

We'll create a library database to illustrate entity integrity and referential integrity.

1. Setup: Import SQLite library and create a database.

import sqlite3 # Connect to SQLite database (or create it if it doesn't exist) conn = sqlite3.connect('library.db') cursor = conn.cursor()

2. Create Tables with Integrity Constraints

# Create authors table with primary key to enforce entity integrity cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER, email TEXT UNIQUE ) ''') # Create books table with primary key and foreign key to enforce referential integrity cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, isbn TEXT UNIQUE, FOREIGN KEY (author_id) REFERENCES authors (id) ) ''')
  • Entity Integrity: The id column in both tables is defined as the primary key to ensure each record is unique.
  • Referential Integrity: The author_id column in the books table is defined as a foreign key referencing the id column in the authors table to ensure valid relationships.

3. Insert Data into Tables

# Insert data into authors table cursor.executemany(''' INSERT INTO authors (name, birth_year, email) VALUES (?, ?, ?) ''', [ ('F. Scott Fitzgerald', 1896, 'fitzgerald@example.com'), ('Harper Lee', 1926, 'lee@example.com'), ('George Orwell', 1903, 'orwell@example.com') ]) # Insert data into books table cursor.executemany(''' INSERT INTO books (title, author_id, published_year, isbn) VALUES (?, ?, ?, ?) ''', [ ('The Great Gatsby', 1, 1925, '9780743273565'), ('To Kill a Mockingbird', 2, 1960, '9780061120084'), ('1984', 3, 1949, '9780451524935') ]) # Commit the transaction conn.commit()

4. Query Data

# Query to join books and authors cursor.execute(''' SELECT books.title, authors.name, books.published_year, books.isbn FROM books JOIN authors ON books.author_id = authors.id ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 1925, '9780743273565') ('To Kill a Mockingbird', 'Harper Lee', 1960, '9780061120084') ('1984', 'George Orwell', 1949, '9780451524935')

Explanation

  1. Entity Integrity:

    • Enforced by defining the id column as the primary key in both the authors and books tables.
    • Ensures that each author and book record is uniquely identifiable.
  2. Referential Integrity:

    • Enforced by defining the author_id column in the books table as a foreign key that references the id column in the authors table.
    • Ensures that every author_id in the books table corresponds to a valid id in the authors table, maintaining the relationship integrity.

Summary

  • Entity Integrity ensures each row in a table is uniquely identifiable using primary keys.
  • Referential Integrity maintains the consistency of relationships between tables using foreign keys.

This example demonstrates how to enforce entity integrity and referential integrity in an SQLite database to ensure accurate, consistent, and reliable data management.


2.3 Database Design Basics

Introduction to database design principles

Introduction to Database Design Principles in SQLite

Database design principles are essential for creating efficient, scalable, and maintainable databases. In SQLite, these principles guide how to structure data, define relationships, and enforce integrity. Key principles include normalization, defining keys, relationships, and indexing.

Key Principles of Database Design

  1. Normalization

    • Definition: The process of organizing data to minimize redundancy and dependency.
    • Purpose: Ensure data integrity and optimize storage.
    • Forms:
      • 1NF (First Normal Form): Ensure that each column contains atomic values and each record is unique.
      • 2NF (Second Normal Form): Ensure that all non-key attributes are fully functionally dependent on the primary key.
      • 3NF (Third Normal Form): Ensure that all non-key attributes are not only fully dependent on the primary key but also independent of each other.
  2. Defining Keys

    • Primary Key: Uniquely identifies each record in a table.
    • Foreign Key: Establishes a relationship between two tables.
    • Candidate Key: A set of columns that can uniquely identify any database record without referring to any other data.
  3. Relationships

    • One-to-One (1:1): A single record in one table is related to a single record in another table.
    • One-to-Many (1
      )
      : A single record in one table is related to multiple records in another table.
    • Many-to-Many (N
      )
      : Multiple records in one table are related to multiple records in another table. This typically requires a junction table.
  4. Indexing

    • Definition: Creating indexes on columns to improve query performance.
    • Purpose: Speed up data retrieval operations.

Example: Library Database Design in SQLite

We'll design a library database to demonstrate these principles.

1. Setup: Import SQLite library and create a database.

import sqlite3 # Connect to SQLite database (or create it if it doesn't exist) conn = sqlite3.connect('library.db') cursor = conn.cursor()

2. Define the Schema with Normalization

# Create authors table (1NF, 2NF, 3NF) cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER, email TEXT UNIQUE ) ''') # Create books table (1NF, 2NF, 3NF) cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, isbn TEXT UNIQUE, FOREIGN KEY (author_id) REFERENCES authors (id) ) ''') # Create a junction table for many-to-many relationships (authors and genres) cursor.execute(''' CREATE TABLE IF NOT EXISTS genres ( id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS book_genres ( book_id INTEGER, genre_id INTEGER, PRIMARY KEY (book_id, genre_id), FOREIGN KEY (book_id) REFERENCES books (id), FOREIGN KEY (genre_id) REFERENCES genres (id) ) ''')
  • Normalization:
    • 1NF: Each column has atomic values; each table has a primary key.
    • 2NF: Non-key attributes fully depend on the primary key.
    • 3NF: Non-key attributes are independent of each other.
  • Defining Keys:
    • id as primary key for authors, books, and genres tables.
    • author_id as a foreign key in books table.
    • Composite primary key in book_genres for many-to-many relationships.

3. Insert Data into Tables

# Insert data into authors table cursor.executemany(''' INSERT INTO authors (name, birth_year, email) VALUES (?, ?, ?) ''', [ ('F. Scott Fitzgerald', 1896, 'fitzgerald@example.com'), ('Harper Lee', 1926, 'lee@example.com'), ('George Orwell', 1903, 'orwell@example.com') ]) # Insert data into books table cursor.executemany(''' INSERT INTO books (title, author_id, published_year, isbn) VALUES (?, ?, ?, ?) ''', [ ('The Great Gatsby', 1, 1925, '9780743273565'), ('To Kill a Mockingbird', 2, 1960, '9780061120084'), ('1984', 3, 1949, '9780451524935') ]) # Insert data into genres table cursor.executemany(''' INSERT INTO genres (name) VALUES (?) ''', [ ('Fiction',), ('Classics',), ('Dystopian',) ]) # Insert data into book_genres table cursor.executemany(''' INSERT INTO book_genres (book_id, genre_id) VALUES (?, ?) ''', [ (1, 1), # The Great Gatsby - Fiction (1, 2), # The Great Gatsby - Classics (2, 1), # To Kill a Mockingbird - Fiction (2, 2), # To Kill a Mockingbird - Classics (3, 1), # 1984 - Fiction (3, 3) # 1984 - Dystopian ]) # Commit the transaction conn.commit()

4. Query Data

# Query to join books, authors, and genres cursor.execute(''' SELECT books.title, authors.name, books.published_year, group_concat(genres.name, ', ') as genres FROM books JOIN authors ON books.author_id = authors.id JOIN book_genres ON books.id = book_genres.book_id JOIN genres ON book_genres.genre_id = genres.id GROUP BY books.title, authors.name, books.published_year ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 1925, 'Fiction, Classics') ('To Kill a Mockingbird', 'Harper Lee', 1960, 'Fiction, Classics') ('1984', 'George Orwell', 1949, 'Fiction, Dystopian')

Explanation

  1. Normalization:

    • 1NF: Each table has atomic columns and a primary key.
    • 2NF: Non-key attributes are fully dependent on the primary key.
    • 3NF: Non-key attributes are independent of each other.
  2. Keys:

    • Primary Keys: Unique identifiers for authors, books, and genres.
    • Foreign Keys: author_id in books references authors.id. Composite keys in book_genres establish many-to-many relationships.
  3. Relationships:

    • One-to-Many: Authors to books.
    • Many-to-Many: Books to genres via book_genres junction table.
  4. Indexing:

    • Unique indexes on primary keys and candidate keys (isbn, email) for fast lookups.

Summary

Database design principles ensure efficient, scalable, and maintainable databases. By following normalization, defining keys, establishing relationships, and indexing, we can create a robust database schema. This example demonstrated these principles in an SQLite library database, showcasing how to structure data, enforce integrity, and query relationships effectively.

Conceptual, logical, and physical database design

Database Design Phases in SQLite

Database design is typically divided into three main phases: conceptual, logical, and physical design. Each phase plays a crucial role in developing a robust, efficient, and scalable database system.

1. Conceptual Database Design

  • Definition: Focuses on identifying the entities, attributes, and relationships in the database.
  • Output: Entity-Relationship (ER) diagram or a high-level data model.

2. Logical Database Design

  • Definition: Translates the conceptual design into a logical structure, defining tables, columns, data types, and relationships without considering physical implementation details.
  • Output: Logical schema.

3. Physical Database Design

  • Definition: Involves implementing the logical schema in a specific database management system (DBMS), considering physical storage and performance optimization.
  • Output: Physical schema and actual database implementation.

Example: Library Database in SQLite

Let's go through each design phase for a library database in SQLite.

1. Conceptual Database Design

  • Entities: Authors, Books, Genres.
  • Attributes:
    • Authors: id, name, birth_year, email.
    • Books: id, title, published_year, isbn.
    • Genres: id, name.
  • Relationships:
    • One-to-Many: An author can write many books.
    • Many-to-Many: A book can belong to multiple genres, and a genre can include multiple books.

2. Logical Database Design

Translate the conceptual model into a logical schema.

  • Tables:
    • authors: id (PK), name, birth_year, email (unique).
    • books: id (PK), title, author_id (FK), published_year, isbn (unique).
    • genres: id (PK), name (unique).
    • book_genres: book_id (FK), genre_id (FK).

Logical Schema

CREATE TABLE authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER, email TEXT UNIQUE ); CREATE TABLE books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, isbn TEXT UNIQUE, FOREIGN KEY (author_id) REFERENCES authors (id) ); CREATE TABLE genres ( id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE ); CREATE TABLE book_genres ( book_id INTEGER, genre_id INTEGER, PRIMARY KEY (book_id, genre_id), FOREIGN KEY (book_id) REFERENCES books (id), FOREIGN KEY (genre_id) REFERENCES genres (id) );

3. Physical Database Design

Implement the logical schema in SQLite and insert sample data.

Implementation in SQLite

import sqlite3 # Connect to SQLite database conn = sqlite3.connect('library.db') cursor = conn.cursor() # Create tables cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER, email TEXT UNIQUE ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, published_year INTEGER, isbn TEXT UNIQUE, FOREIGN KEY (author_id) REFERENCES authors (id) ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS genres ( id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS book_genres ( book_id INTEGER, genre_id INTEGER, PRIMARY KEY (book_id, genre_id), FOREIGN KEY (book_id) REFERENCES books (id), FOREIGN KEY (genre_id) REFERENCES genres (id) ) ''') # Insert data into authors table cursor.executemany(''' INSERT INTO authors (name, birth_year, email) VALUES (?, ?, ?) ''', [ ('F. Scott Fitzgerald', 1896, 'fitzgerald@example.com'), ('Harper Lee', 1926, 'lee@example.com'), ('George Orwell', 1903, 'orwell@example.com') ]) # Insert data into books table cursor.executemany(''' INSERT INTO books (title, author_id, published_year, isbn) VALUES (?, ?, ?, ?) ''', [ ('The Great Gatsby', 1, 1925, '9780743273565'), ('To Kill a Mockingbird', 2, 1960, '9780061120084'), ('1984', 3, 1949, '9780451524935') ]) # Insert data into genres table cursor.executemany(''' INSERT INTO genres (name) VALUES (?) ''', [ ('Fiction',), ('Classics',), ('Dystopian',) ]) # Insert data into book_genres table cursor.executemany(''' INSERT INTO book_genres (book_id, genre_id) VALUES (?, ?) ''', [ (1, 1), # The Great Gatsby - Fiction (1, 2), # The Great Gatsby - Classics (2, 1), # To Kill a Mockingbird - Fiction (2, 2), # To Kill a Mockingbird - Classics (3, 1), # 1984 - Fiction (3, 3) # 1984 - Dystopian ]) # Commit the transaction conn.commit() # Query to join books, authors, and genres cursor.execute(''' SELECT books.title, authors.name, books.published_year, group_concat(genres.name, ', ') as genres FROM books JOIN authors ON books.author_id = authors.id JOIN book_genres ON books.id = book_genres.book_id JOIN genres ON book_genres.genre_id = genres.id GROUP BY books.title, authors.name, books.published_year ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 1925, 'Fiction, Classics') ('To Kill a Mockingbird', 'Harper Lee', 1960, 'Fiction, Classics') ('1984', 'George Orwell', 1949, 'Fiction, Dystopian')

Summary

  1. Conceptual Design: Identified entities, attributes, and relationships.
  2. Logical Design: Defined tables, columns, data types, and relationships.
  3. Physical Design: Implemented the logical schema in SQLite and inserted sample data.

By following these design phases, we ensure a well-structured and efficient database system. This approach not only helps in maintaining data integrity but also optimizes query performance and scalability.

Normalization: First normal form (1NF) to Boyce-Codd normal form (BCNF)

Understanding Normalization in SQLite

Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. The most common forms of normalization are First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and Boyce-Codd Normal Form (BCNF).

Normal Forms

1. First Normal Form (1NF)

  • Definition: Ensures that each column contains atomic (indivisible) values and that each record is unique.
  • Example Violation: A table with a column that contains a list of values.
  • Solution: Ensure that each column contains only one value per record.

2. Second Normal Form (2NF)

  • Definition: Achieved when a table is in 1NF and all non-key attributes are fully functionally dependent on the primary key.
  • Example Violation: Partial dependency where a non-key attribute is dependent on part of a composite primary key.
  • Solution: Remove partial dependencies by creating separate tables.

3. Third Normal Form (3NF)

  • Definition: Achieved when a table is in 2NF and all non-key attributes are non-transitively dependent on the primary key.
  • Example Violation: Transitive dependency where a non-key attribute depends on another non-key attribute.
  • Solution: Remove transitive dependencies by creating separate tables.

4. Boyce-Codd Normal Form (BCNF)

  • Definition: A stronger version of 3NF where every determinant is a candidate key.
  • Example Violation: When a non-candidate key attribute determines a candidate key.
  • Solution: Ensure all determinants are candidate keys.

Example: Normalizing a Library Database in SQLite

Let's go through the normalization process for a library database in SQLite.

Initial Unnormalized Table

CREATE TABLE IF NOT EXISTS library ( book_id INTEGER, title TEXT, author TEXT, author_birth_year INTEGER, genre TEXT, publisher TEXT, PRIMARY KEY (book_id) ); INSERT INTO library (book_id, title, author, author_birth_year, genre, publisher) VALUES (1, 'The Great Gatsby', 'F. Scott Fitzgerald', 1896, 'Fiction', 'Scribner'), (2, 'To Kill a Mockingbird', 'Harper Lee', 1926, 'Fiction', 'J.B. Lippincott & Co.'), (3, '1984', 'George Orwell', 1903, 'Dystopian', 'Secker & Warburg');

1. First Normal Form (1NF)

  • Violation: If a column contains multiple values (e.g., a list of genres).
  • Solution: Ensure atomic values.
CREATE TABLE IF NOT EXISTS books ( book_id INTEGER PRIMARY KEY, title TEXT NOT NULL, author TEXT NOT NULL, author_birth_year INTEGER, genre TEXT NOT NULL, publisher TEXT NOT NULL ); INSERT INTO books (book_id, title, author, author_birth_year, genre, publisher) VALUES (1, 'The Great Gatsby', 'F. Scott Fitzgerald', 1896, 'Fiction', 'Scribner'), (2, 'To Kill a Mockingbird', 'Harper Lee', 1926, 'Fiction', 'J.B. Lippincott & Co.'), (3, '1984', 'George Orwell', 1903, 'Dystopian', 'Secker & Warburg');

2. Second Normal Form (2NF)

  • Violation: Partial dependency of non-key attributes on part of a composite key.
  • Solution: Remove partial dependencies.

Split the table into authors and books.

CREATE TABLE IF NOT EXISTS authors ( author_id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER ); CREATE TABLE IF NOT EXISTS books ( book_id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, genre TEXT NOT NULL, publisher TEXT NOT NULL, FOREIGN KEY (author_id) REFERENCES authors (author_id) ); INSERT INTO authors (author_id, name, birth_year) VALUES (1, 'F. Scott Fitzgerald', 1896), (2, 'Harper Lee', 1926), (3, 'George Orwell', 1903); INSERT INTO books (book_id, title, author_id, genre, publisher) VALUES (1, 'The Great Gatsby', 1, 'Fiction', 'Scribner'), (2, 'To Kill a Mockingbird', 2, 'Fiction', 'J.B. Lippincott & Co.'), (3, '1984', 3, 'Dystopian', 'Secker & Warburg');

3. Third Normal Form (3NF)

  • Violation: Transitive dependency where a non-key attribute depends on another non-key attribute.
  • Solution: Remove transitive dependencies.

Split the table to remove transitive dependencies.

CREATE TABLE IF NOT EXISTS publishers ( publisher_id INTEGER PRIMARY KEY, name TEXT NOT NULL ); ALTER TABLE books ADD COLUMN publisher_id INTEGER; UPDATE books SET publisher_id = 1 WHERE publisher = 'Scribner'; UPDATE books SET publisher_id = 2 WHERE publisher = 'J.B. Lippincott & Co.'; UPDATE books SET publisher_id = 3 WHERE publisher = 'Secker & Warburg'; INSERT INTO publishers (publisher_id, name) VALUES (1, 'Scribner'), (2, 'J.B. Lippincott & Co.'), (3, 'Secker & Warburg'); -- Now update books to remove the publisher column ALTER TABLE books DROP COLUMN publisher;

4. Boyce-Codd Normal Form (BCNF)

  • Violation: Every determinant is not a candidate key.
  • Solution: Ensure all determinants are candidate keys.

Ensure no non-candidate key attribute determines another attribute.

-- In this example, the tables are already in BCNF as every determinant is a candidate key

Querying the Normalized Database

import sqlite3 # Connect to SQLite database conn = sqlite3.connect('library.db') cursor = conn.cursor() # Query to join books, authors, and publishers cursor.execute(''' SELECT books.title, authors.name, books.genre, publishers.name AS publisher FROM books JOIN authors ON books.author_id = authors.author_id JOIN publishers ON books.publisher_id = publishers.publisher_id ''') # Fetch all results rows = cursor.fetchall() # Close the connection conn.close() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 'Fiction', 'Scribner') ('To Kill a Mockingbird', 'Harper Lee', 'Fiction', 'J.B. Lippincott & Co.') ('1984', 'George Orwell', 'Dystopian', 'Secker & Warburg')

Summary

  • 1NF: Ensures atomic values and unique records.
  • 2NF: Removes partial dependencies by splitting tables.
  • 3NF: Removes transitive dependencies.
  • BCNF: Ensures every determinant is a candidate key.

By following these normalization steps, we ensure data integrity, reduce redundancy, and improve the efficiency of the database. This example demonstrated the normalization process in an SQLite database, from 1NF to BCNF.


2.4 Structured Query Language (SQL) Basics

Overview of SQL and its role in relational databases

Overview of SQL and Its Role in Relational Databases

Structured Query Language (SQL) is a standard programming language specifically designed for managing and manipulating relational databases. SQL plays a crucial role in interacting with databases to perform various operations such as querying, updating, and managing the data and schema.

Key Features of SQL:

  1. Data Querying: Retrieve specific data from databases using SELECT statements.
  2. Data Manipulation: Insert, update, delete, and manage data using INSERT, UPDATE, DELETE, and other DML (Data Manipulation Language) statements.
  3. Data Definition: Define database schema using DDL (Data Definition Language) statements like CREATE, ALTER, and DROP.
  4. Data Control: Control access to data using DCL (Data Control Language) statements like GRANT and REVOKE.
  5. Transaction Control: Manage transactions to ensure data integrity using TCL (Transaction Control Language) statements like COMMIT, ROLLBACK, and SAVEPOINT.

SQL in SQLite Database

SQLite is a self-contained, serverless, and zero-configuration SQL database engine. It uses SQL as its query language, supporting most of the standard SQL features. SQLite is commonly used for embedded databases in applications due to its simplicity and lightweight nature.

Example: Library Database in SQLite

We'll demonstrate SQL's role in relational databases using a library database example in SQLite.

1. Creating Tables (Data Definition Language - DDL)

First, we create the necessary tables for our library database.

import sqlite3 # Connect to SQLite database conn = sqlite3.connect('library.db') cursor = conn.cursor() # Create tables cursor.execute(''' CREATE TABLE IF NOT EXISTS authors ( author_id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS publishers ( publisher_id INTEGER PRIMARY KEY, name TEXT NOT NULL ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS books ( book_id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER, genre TEXT NOT NULL, publisher_id INTEGER, FOREIGN KEY (author_id) REFERENCES authors (author_id), FOREIGN KEY (publisher_id) REFERENCES publishers (publisher_id) ) ''') # Commit the changes conn.commit()

2. Inserting Data (Data Manipulation Language - DML)

Next, we insert data into the tables.

# Insert data into authors table cursor.executemany(''' INSERT INTO authors (name, birth_year) VALUES (?, ?) ''', [ ('F. Scott Fitzgerald', 1896), ('Harper Lee', 1926), ('George Orwell', 1903) ]) # Insert data into publishers table cursor.executemany(''' INSERT INTO publishers (name) VALUES (?) ''', [ ('Scribner'), ('J.B. Lippincott & Co.'), ('Secker & Warburg') ]) # Insert data into books table cursor.executemany(''' INSERT INTO books (title, author_id, genre, publisher_id) VALUES (?, ?, ?, ?) ''', [ ('The Great Gatsby', 1, 'Fiction', 1), ('To Kill a Mockingbird', 2, 'Fiction', 2), ('1984', 3, 'Dystopian', 3) ]) # Commit the changes conn.commit()

3. Querying Data (Data Query Language - DQL)

Retrieve data from the database using SELECT statements.

# Query to join books, authors, and publishers cursor.execute(''' SELECT books.title, authors.name, books.genre, publishers.name AS publisher FROM books JOIN authors ON books.author_id = authors.author_id JOIN publishers ON books.publisher_id = publishers.publisher_id ''') # Fetch all results rows = cursor.fetchall() # Output the result for row in rows: print(row)

Output

('The Great Gatsby', 'F. Scott Fitzgerald', 'Fiction', 'Scribner') ('To Kill a Mockingbird', 'Harper Lee', 'Fiction', 'J.B. Lippincott & Co.') ('1984', 'George Orwell', 'Dystopian', 'Secker & Warburg')

4. Updating Data (Data Manipulation Language - DML)

Update existing data in the database.

# Update the genre of a book cursor.execute(''' UPDATE books SET genre = 'Classic Fiction' WHERE title = 'The Great Gatsby' ''') # Commit the changes conn.commit()

5. Deleting Data (Data Manipulation Language - DML)

Delete data from the database.

# Delete a book from the database cursor.execute(''' DELETE FROM books WHERE title = '1984' ''') # Commit the changes conn.commit()

Summary

SQL is integral to managing and interacting with relational databases like SQLite. It allows for defining schemas (DDL), manipulating data (DML), querying data (DQL), and controlling data access and integrity (DCL and TCL). By using SQL, we can effectively create, manage, and query a library database, demonstrating its versatility and power in database management systems.

The provided example showcases how SQL statements are used in SQLite to create tables, insert data, query data, update records, and delete records, illustrating the core functionalities of SQL in relational databases.

Basic SQL commands: SELECT, INSERT, UPDATE, DELETE

Basic SQL Commands in SQLite Database

1. SELECT

The SELECT statement is used to query data from a database.

Syntax:

SELECT column1, column2, ... FROM table_name WHERE condition;

Example: Assume we have a table named students with columns id, name, and age.

SELECT name, age FROM students WHERE age > 20;

Output:

| name | age | |---------|-----| | John | 22 | | Alice | 25 |

2. INSERT

The INSERT INTO statement is used to add new rows to a table.

Syntax:

INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);

Example:

INSERT INTO students (id, name, age) VALUES (1, 'John', 22);

Output:

The row is inserted successfully: | id | name | age | |----|-------|-----| | 1 | John | 22 |

3. UPDATE

The UPDATE statement is used to modify existing records in a table.

Syntax:

UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE condition;

Example:

UPDATE students SET age = 23 WHERE id = 1;

Output:

The row is updated successfully: | id | name | age | |----|-------|-----| | 1 | John | 23 |

4. DELETE

The DELETE statement is used to delete rows from a table.

Syntax:

DELETE FROM table_name WHERE condition;

Example:

DELETE FROM students WHERE id = 1;

Output:

The row is deleted successfully: | id | name | age | |----|-------|-----|

Example in an SQLite Database

Let's demonstrate the above commands using SQLite in Python.

  1. Create and Connect to the Database
  2. Create the Table
  3. Insert Data
  4. Select Data
  5. Update Data
  6. Delete Data
import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students table cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER ) ''') # Insert data cursor.execute(''' INSERT INTO students (id, name, age) VALUES (1, 'John', 22), (2, 'Alice', 25), (3, 'Bob', 20) ''') conn.commit() # Select data cursor.execute('SELECT name, age FROM students WHERE age > 20') select_results = cursor.fetchall() # Update data cursor.execute('UPDATE students SET age = 23 WHERE id = 1') conn.commit() # Delete data cursor.execute('DELETE FROM students WHERE id = 1') conn.commit() # Fetch updated data cursor.execute('SELECT * FROM students') final_results = cursor.fetchall() # Close the connection conn.close() select_results, final_results

Output:

# Output after SELECT [ ('John', 22), ('Alice', 25) ] # Output after DELETE [ (2, 'Alice', 25), (3, 'Bob', 20) ]

This demonstrates how to use the basic SQL commands: SELECT, INSERT, UPDATE, and DELETE in an SQLite database using both SQL syntax and Python code for a real-life example.

Querying single and multiple tables

Querying Single and Multiple Tables in SQLite Database

Querying a Single Table

When querying a single table, you can use the SELECT statement to retrieve data based on specified conditions.

Example Scenario:

Let's say we have a students table with the following structure and data:

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER ); INSERT INTO students (id, name, age) VALUES (1, 'John', 22), (2, 'Alice', 25), (3, 'Bob', 20);

Example Query:

SELECT name, age FROM students WHERE age > 20;

Output:

| name | age | |-------|-----| | John | 22 | | Alice | 25 |

Querying Multiple Tables

When querying multiple tables, you typically use JOIN operations to combine rows from two or more tables based on a related column.

Example Scenario:

Let's say we have two tables: students and courses.

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER ); CREATE TABLE courses ( course_id INTEGER PRIMARY KEY, course_name TEXT, student_id INTEGER, FOREIGN KEY (student_id) REFERENCES students (id) ); INSERT INTO students (id, name, age) VALUES (1, 'John', 22), (2, 'Alice', 25), (3, 'Bob', 20); INSERT INTO courses (course_id, course_name, student_id) VALUES (1, 'Math', 1), (2, 'Science', 1), (3, 'History', 2);

Example Query:

SELECT students.name, students.age, courses.course_name FROM students JOIN courses ON students.id = courses.student_id WHERE students.age > 20;

Output:

| name | age | course_name | |-------|-----|--------------| | John | 22 | Math | | John | 22 | Science | | Alice | 25 | History |

Example in an SQLite Database Using Python

Let's demonstrate these queries using SQLite in Python.

  1. Create and Connect to the Database
  2. Create the Tables
  3. Insert Data
  4. Query Single Table
  5. Query Multiple Tables
import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students table cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER ) ''') # Create the courses table cursor.execute(''' CREATE TABLE courses ( course_id INTEGER PRIMARY KEY, course_name TEXT, student_id INTEGER, FOREIGN KEY (student_id) REFERENCES students (id) ) ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, age) VALUES (1, 'John', 22), (2, 'Alice', 25), (3, 'Bob', 20) ''') # Insert data into courses table cursor.execute(''' INSERT INTO courses (course_id, course_name, student_id) VALUES (1, 'Math', 1), (2, 'Science', 1), (3, 'History', 2) ''') conn.commit() # Query single table cursor.execute('SELECT name, age FROM students WHERE age > 20') single_table_results = cursor.fetchall() # Query multiple tables using JOIN cursor.execute(''' SELECT students.name, students.age, courses.course_name FROM students JOIN courses ON students.id = courses.student_id WHERE students.age > 20 ''') multiple_table_results = cursor.fetchall() # Close the connection conn.close() single_table_results, multiple_table_results

Output:

# Output for querying single table [ ('John', 22), ('Alice', 25) ] # Output for querying multiple tables [ ('John', 22, 'Math'), ('John', 22, 'Science'), ('Alice', 25, 'History') ]

This demonstrates how to query data from a single table and multiple tables using SQL syntax and Python code in an SQLite database.


2.5 Advanced SQL Queries

Retrieving data with advanced SELECT statements

Advanced SELECT Statements in SQLite Database

Advanced SELECT statements allow for more complex querying and data manipulation. Here are some common advanced features:

  1. Using Aliases
  2. Sorting Results
  3. Filtering Results with WHERE Clause
  4. Using Aggregate Functions
  5. Grouping Results
  6. Using Subqueries
  7. Joining Tables
  8. Limiting Results

Examples and Output

1. Using Aliases

Aliases give a temporary name to a table or column for the duration of a query.

Example:

SELECT name AS student_name, age AS student_age FROM students;

Output:

| student_name | student_age | |--------------|-------------| | John | 22 | | Alice | 25 | | Bob | 20 |

2. Sorting Results

The ORDER BY clause is used to sort the result set by one or more columns.

Example:

SELECT name, age FROM students ORDER BY age DESC;

Output:

| name | age | |-------|-----| | Alice | 25 | | John | 22 | | Bob | 20 |

3. Filtering Results with WHERE Clause

The WHERE clause is used to filter records that meet a certain condition.

Example:

SELECT name, age FROM students WHERE age > 20;

Output:

| name | age | |-------|-----| | John | 22 | | Alice | 25 |

4. Using Aggregate Functions

Aggregate functions like COUNT, AVG, SUM, MIN, and MAX perform a calculation on a set of values.

Example:

SELECT COUNT(*) AS total_students, AVG(age) AS average_age FROM students;

Output:

| total_students | average_age | |----------------|-------------| | 3 | 22.33 |

5. Grouping Results

The GROUP BY statement groups rows that have the same values in specified columns into aggregated data.

Example: Assume we add a class column to students table:

ALTER TABLE students ADD COLUMN class TEXT; UPDATE students SET class = 'A' WHERE id = 1; UPDATE students SET class = 'B' WHERE id = 2; UPDATE students SET class = 'A' WHERE id = 3;

Now, query:

SELECT class, COUNT(*) AS students_count, AVG(age) AS average_age FROM students GROUP BY class;

Output:

| class | students_count | average_age | |-------|----------------|-------------| | A | 2 | 21.0 | | B | 1 | 25.0 |

6. Using Subqueries

Subqueries are queries nested inside another query.

Example:

SELECT name, age FROM students WHERE age > (SELECT AVG(age) FROM students);

Output:

| name | age | |-------|-----| | Alice | 25 |

7. Joining Tables

The JOIN clause is used to combine rows from two or more tables based on a related column.

Example:

SELECT students.name, courses.course_name FROM students JOIN courses ON students.id = courses.student_id;

Output:

| name | course_name | |-------|-------------| | John | Math | | John | Science | | Alice | History |

8. Limiting Results

The LIMIT clause is used to specify the number of records to return.

Example:

SELECT name, age FROM students LIMIT 2;

Output:

| name | age | |-------|-----| | John | 22 | | Alice | 25 |

Example in SQLite Using Python

Let's demonstrate these advanced queries using SQLite in Python.

import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students table cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, class TEXT ) ''') # Create the courses table cursor.execute(''' CREATE TABLE courses ( course_id INTEGER PRIMARY KEY, course_name TEXT, student_id INTEGER, FOREIGN KEY (student_id) REFERENCES students (id) ) ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, age, class) VALUES (1, 'John', 22, 'A'), (2, 'Alice', 25, 'B'), (3, 'Bob', 20, 'A') ''') # Insert data into courses table cursor.execute(''' INSERT INTO courses (course_id, course_name, student_id) VALUES (1, 'Math', 1), (2, 'Science', 1), (3, 'History', 2) ''') conn.commit() # Using Aliases cursor.execute('SELECT name AS student_name, age AS student_age FROM students') aliases_results = cursor.fetchall() # Sorting Results cursor.execute('SELECT name, age FROM students ORDER BY age DESC') sorted_results = cursor.fetchall() # Filtering Results with WHERE Clause cursor.execute('SELECT name, age FROM students WHERE age > 20') filtered_results = cursor.fetchall() # Using Aggregate Functions cursor.execute('SELECT COUNT(*) AS total_students, AVG(age) AS average_age FROM students') aggregate_results = cursor.fetchall() # Grouping Results cursor.execute('SELECT class, COUNT(*) AS students_count, AVG(age) AS average_age FROM students GROUP BY class') grouped_results = cursor.fetchall() # Using Subqueries cursor.execute('SELECT name, age FROM students WHERE age > (SELECT AVG(age) FROM students)') subquery_results = cursor.fetchall() # Joining Tables cursor.execute(''' SELECT students.name, courses.course_name FROM students JOIN courses ON students.id = courses.student_id ''') join_results = cursor.fetchall() # Limiting Results cursor.execute('SELECT name, age FROM students LIMIT 2') limit_results = cursor.fetchall() # Close the connection conn.close() # Output results (aliases_results, sorted_results, filtered_results, aggregate_results, grouped_results, subquery_results, join_results, limit_results)

Output:

# Output for aliases [ ('John', 22), ('Alice', 25), ('Bob', 20) ] # Output for sorting results [ ('Alice', 25), ('John', 22), ('Bob', 20) ] # Output for filtering results [ ('John', 22), ('Alice', 25) ] # Output for aggregate functions [ (3, 22.33) ] # Output for grouping results [ ('A', 2, 21.0), ('B', 1, 25.0) ] # Output for subqueries [ ('Alice', 25) ] # Output for joining tables [ ('John', 'Math'), ('John', 'Science'), ('Alice', 'History') ] # Output for limiting results [ ('John', 22), ('Alice', 25) ]

These examples illustrate how to perform advanced data retrieval using the SELECT statement in SQLite with Python, demonstrating practical applications of aliases, sorting, filtering, aggregation, grouping, subqueries, joins, and result limiting.

Filtering and sorting data using WHERE and ORDER BY clauses

Filtering and Sorting Data Using WHERE and ORDER BY Clauses in SQLite Database

Filtering Data with WHERE Clause

The WHERE clause is used to filter records based on specified conditions. It allows you to retrieve only those rows that meet the criteria.

Syntax:

SELECT column1, column2, ... FROM table_name WHERE condition;

Example Scenario:

Assume we have a students table with the following structure and data:

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, class TEXT ); INSERT INTO students (id, name, age, class) VALUES (1, 'John', 22, 'A'), (2, 'Alice', 25, 'B'), (3, 'Bob', 20, 'A'), (4, 'Eve', 23, 'B'), (5, 'Chris', 19, 'A');

Example Query:

SELECT name, age, class FROM students WHERE age > 20;

Output:

| name | age | class | |-------|-----|-------| | John | 22 | A | | Alice | 25 | B | | Eve | 23 | B |

Sorting Data with ORDER BY Clause

The ORDER BY clause is used to sort the result set by one or more columns. You can sort in ascending (ASC) or descending (DESC) order.

Syntax:

SELECT column1, column2, ... FROM table_name ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;

Example Query:

SELECT name, age, class FROM students ORDER BY age DESC, name ASC;

Output:

| name | age | class | |-------|-----|-------| | Alice | 25 | B | | Eve | 23 | B | | John | 22 | A | | Bob | 20 | A | | Chris | 19 | A |

Combining WHERE and ORDER BY Clauses

You can combine the WHERE and ORDER BY clauses to filter the data first and then sort the result.

Example Query:

SELECT name, age, class FROM students WHERE age > 20 ORDER BY age DESC;

Output:

| name | age | class | |-------|-----|-------| | Alice | 25 | B | | Eve | 23 | B | | John | 22 | A |

Example in SQLite Using Python

Let's demonstrate filtering and sorting data using WHERE and ORDER BY clauses in SQLite with Python.

import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students table cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, class TEXT ) ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, age, class) VALUES (1, 'John', 22, 'A'), (2, 'Alice', 25, 'B'), (3, 'Bob', 20, 'A'), (4, 'Eve', 23, 'B'), (5, 'Chris', 19, 'A') ''') conn.commit() # Filtering data with WHERE clause cursor.execute('SELECT name, age, class FROM students WHERE age > 20') filtered_results = cursor.fetchall() # Sorting data with ORDER BY clause cursor.execute('SELECT name, age, class FROM students ORDER BY age DESC, name ASC') sorted_results = cursor.fetchall() # Combining WHERE and ORDER BY clauses cursor.execute('SELECT name, age, class FROM students WHERE age > 20 ORDER BY age DESC') combined_results = cursor.fetchall() # Close the connection conn.close() # Output results (filtered_results, sorted_results, combined_results)

Output:

# Output for filtering data with WHERE clause [ ('John', 22, 'A'), ('Alice', 25, 'B'), ('Eve', 23, 'B') ] # Output for sorting data with ORDER BY clause [ ('Alice', 25, 'B'), ('Eve', 23, 'B'), ('John', 22, 'A'), ('Bob', 20, 'A'), ('Chris', 19, 'A') ] # Output for combining WHERE and ORDER BY clauses [ ('Alice', 25, 'B'), ('Eve', 23, 'B'), ('John', 22, 'A') ]

Explanation

  1. Filtering Data with WHERE Clause:

    • The SELECT statement retrieves data from the students table where the age is greater than 20.
  2. Sorting Data with ORDER BY Clause:

    • The SELECT statement retrieves data from the students table and sorts the results first by age in descending order and then by name in ascending order.
  3. Combining WHERE and ORDER BY Clauses:

    • The SELECT statement retrieves data from the students table where the age is greater than 20 and sorts the results by age in descending order.

This demonstrates how to filter and sort data in SQLite using the WHERE and ORDER BY clauses with practical examples and their corresponding outputs in Python.

Aggregation functions: SUM, AVG, COUNT, MAX, MIN

Aggregation Functions in SQLite Database

Aggregation functions in SQL perform a calculation on a set of values and return a single value. SQLite supports several aggregation functions, including SUM, AVG, COUNT, MAX, and MIN.

1. SUM

The SUM function returns the total sum of a numeric column.

Syntax:

SELECT SUM(column_name) FROM table_name WHERE condition;

Example: Assume we have a students table with columns id, name, age, and score.

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, score INTEGER ); INSERT INTO students (id, name, age, score) VALUES (1, 'John', 22, 85), (2, 'Alice', 25, 92), (3, 'Bob', 20, 75), (4, 'Eve', 23, 88), (5, 'Chris', 19, 90);

Query:

SELECT SUM(score) AS total_score FROM students;

Output:

| total_score | |-------------| | 430 |

2. AVG

The AVG function returns the average value of a numeric column.

Syntax:

SELECT AVG(column_name) FROM table_name WHERE condition;

Example:

SELECT AVG(score) AS average_score FROM students;

Output:

| average_score | |---------------| | 86.0 |

3. COUNT

The COUNT function returns the number of rows that match a specified condition.

Syntax:

SELECT COUNT(column_name) FROM table_name WHERE condition;

Example:

SELECT COUNT(*) AS total_students FROM students;

Output:

| total_students | |----------------| | 5 |

4. MAX

The MAX function returns the maximum value in a column.

Syntax:

SELECT MAX(column_name) FROM table_name WHERE condition;

Example:

SELECT MAX(score) AS highest_score FROM students;

Output:

| highest_score | |---------------| | 92 |

5. MIN

The MIN function returns the minimum value in a column.

Syntax:

SELECT MIN(column_name) FROM table_name WHERE condition;

Example:

SELECT MIN(score) AS lowest_score FROM students;

Output:

| lowest_score | |--------------| | 75 |

Example in SQLite Using Python

Let's demonstrate these aggregation functions using SQLite in Python.

import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students table cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, score INTEGER ) ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, age, score) VALUES (1, 'John', 22, 85), (2, 'Alice', 25, 92), (3, 'Bob', 20, 75), (4, 'Eve', 23, 88), (5, 'Chris', 19, 90) ''') conn.commit() # SUM function cursor.execute('SELECT SUM(score) AS total_score FROM students') sum_result = cursor.fetchall() # AVG function cursor.execute('SELECT AVG(score) AS average_score FROM students') avg_result = cursor.fetchall() # COUNT function cursor.execute('SELECT COUNT(*) AS total_students FROM students') count_result = cursor.fetchall() # MAX function cursor.execute('SELECT MAX(score) AS highest_score FROM students') max_result = cursor.fetchall() # MIN function cursor.execute('SELECT MIN(score) AS lowest_score FROM students') min_result = cursor.fetchall() # Close the connection conn.close() # Output results (sum_result, avg_result, count_result, max_result, min_result)

Output:

# Output for SUM function [ (430,) ] # Output for AVG function [ (86.0,) ] # Output for COUNT function [ (5,) ] # Output for MAX function [ (92,) ] # Output for MIN function [ (75,) ]

Explanation

  1. SUM Function:

    • The SELECT SUM(score) AS total_score FROM students statement calculates the total sum of the score column for all students.
    • Output: 430
  2. AVG Function:

    • The SELECT AVG(score) AS average_score FROM students statement calculates the average value of the score column for all students.
    • Output: 86.0
  3. COUNT Function:

    • The SELECT COUNT(*) AS total_students FROM students statement counts the total number of rows in the students table.
    • Output: 5
  4. MAX Function:

    • The SELECT MAX(score) AS highest_score FROM students statement finds the maximum value in the score column.
    • Output: 92
  5. MIN Function:

    • The SELECT MIN(score) AS lowest_score FROM students statement finds the minimum value in the score column.
    • Output: 75

This demonstrates how to use aggregation functions (SUM, AVG, COUNT, MAX, MIN) in SQLite to perform calculations on data, with practical examples and their corresponding outputs in Python.

Grouping data with GROUP BY clause

Grouping Data with GROUP BY Clause in SQLite Database

The GROUP BY clause is used in SQL to arrange identical data into groups. It often comes with aggregate functions like COUNT, SUM, AVG, MAX, and MIN to perform operations on each group.

Syntax

SELECT column1, aggregate_function(column2) FROM table_name WHERE condition GROUP BY column1 ORDER BY column1;

Example Scenario

Assume we have a students table with columns id, name, age, class, and score.

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, class TEXT, score INTEGER ); INSERT INTO students (id, name, age, class, score) VALUES (1, 'John', 22, 'A', 85), (2, 'Alice', 25, 'B', 92), (3, 'Bob', 20, 'A', 75), (4, 'Eve', 23, 'B', 88), (5, 'Chris', 19, 'A', 90);

Example Queries

  1. Count Students in Each Class
SELECT class, COUNT(*) AS student_count FROM students GROUP BY class;

Output:

| class | student_count | |-------|---------------| | A | 3 | | B | 2 |
  1. Average Score in Each Class
SELECT class, AVG(score) AS average_score FROM students GROUP BY class;

Output:

| class | average_score | |-------|---------------| | A | 83.33 | | B | 90.0 |
  1. Total Score in Each Class
SELECT class, SUM(score) AS total_score FROM students GROUP BY class;

Output:

| class | total_score | |-------|-------------| | A | 250 | | B | 180 |
  1. Maximum Score in Each Class
SELECT class, MAX(score) AS highest_score FROM students GROUP BY class;

Output:

| class | highest_score | |-------|---------------| | A | 90 | | B | 92 |
  1. Minimum Score in Each Class
SELECT class, MIN(score) AS lowest_score FROM students GROUP BY class;

Output:

| class | lowest_score | |-------|--------------| | A | 75 | | B | 88 |

Example in SQLite Using Python

Let's demonstrate the GROUP BY clause with the above queries using SQLite in Python.

import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students table cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, class TEXT, score INTEGER ) ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, age, class, score) VALUES (1, 'John', 22, 'A', 85), (2, 'Alice', 25, 'B', 92), (3, 'Bob', 20, 'A', 75), (4, 'Eve', 23, 'B', 88), (5, 'Chris', 19, 'A', 90) ''') conn.commit() # Count Students in Each Class cursor.execute('SELECT class, COUNT(*) AS student_count FROM students GROUP BY class') count_results = cursor.fetchall() # Average Score in Each Class cursor.execute('SELECT class, AVG(score) AS average_score FROM students GROUP BY class') avg_results = cursor.fetchall() # Total Score in Each Class cursor.execute('SELECT class, SUM(score) AS total_score FROM students GROUP BY class') sum_results = cursor.fetchall() # Maximum Score in Each Class cursor.execute('SELECT class, MAX(score) AS highest_score FROM students GROUP BY class') max_results = cursor.fetchall() # Minimum Score in Each Class cursor.execute('SELECT class, MIN(score) AS lowest_score FROM students GROUP BY class') min_results = cursor.fetchall() # Close the connection conn.close() # Output results (count_results, avg_results, sum_results, max_results, min_results)

Output:

# Output for count of students in each class [ ('A', 3), ('B', 2) ] # Output for average score in each class [ ('A', 83.33), ('B', 90.0) ] # Output for total score in each class [ ('A', 250), ('B', 180) ] # Output for maximum score in each class [ ('A', 90), ('B', 92) ] # Output for minimum score in each class [ ('A', 75), ('B', 88) ]

Explanation

  1. Count Students in Each Class:

    • SELECT class, COUNT(*) AS student_count FROM students GROUP BY class
    • This query groups the students by class and counts the number of students in each class.
  2. Average Score in Each Class:

    • SELECT class, AVG(score) AS average_score FROM students GROUP BY class
    • This query groups the students by class and calculates the average score in each class.
  3. Total Score in Each Class:

    • SELECT class, SUM(score) AS total_score FROM students GROUP BY class
    • This query groups the students by class and sums the scores in each class.
  4. Maximum Score in Each Class:

    • SELECT class, MAX(score) AS highest_score FROM students GROUP BY class
    • This query groups the students by class and finds the highest score in each class.
  5. Minimum Score in Each Class:

    • SELECT class, MIN(score) AS lowest_score FROM students GROUP BY class
    • This query groups the students by class and finds the lowest score in each class.

These examples demonstrate how to use the GROUP BY clause in SQLite to group data and perform aggregate calculations on each group, with practical examples and their corresponding outputs in Python.


2.6 Joins and Subqueries

Understanding relational joins: INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN

Understanding Relational Joins in SQLite

Joins in SQL are used to combine rows from two or more tables based on a related column between them. SQLite supports several types of joins: INNER JOIN, LEFT JOIN, and CROSS JOIN. SQLite does not directly support RIGHT JOIN and FULL JOIN, but they can be achieved using combinations of other joins.

1. INNER JOIN

An INNER JOIN returns only the rows that have matching values in both tables.

Syntax:

SELECT columns FROM table1 INNER JOIN table2 ON table1.column = table2.column;

Example: Assume we have two tables, students and classes:

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, class_id INTEGER ); CREATE TABLE classes ( id INTEGER PRIMARY KEY, class_name TEXT ); INSERT INTO students (id, name, class_id) VALUES (1, 'John', 1), (2, 'Alice', 2), (3, 'Bob', 1), (4, 'Eve', 3), (5, 'Chris', 2); INSERT INTO classes (id, class_name) VALUES (1, 'Math'), (2, 'Science'), (3, 'History');

Query:

SELECT students.name, classes.class_name FROM students INNER JOIN classes ON students.class_id = classes.id;

Output:

| name | class_name | |-------|------------| | John | Math | | Alice | Science | | Bob | Math | | Eve | History | | Chris | Science |

2. LEFT JOIN

A LEFT JOIN returns all rows from the left table, and the matched rows from the right table. The result is NULL from the right side if there is no match.

Syntax:

SELECT columns FROM table1 LEFT JOIN table2 ON table1.column = table2.column;

Example:

SELECT students.name, classes.class_name FROM students LEFT JOIN classes ON students.class_id = classes.id;

Output:

| name | class_name | |-------|------------| | John | Math | | Alice | Science | | Bob | Math | | Eve | History | | Chris | Science |

3. RIGHT JOIN (Using LEFT JOIN and UNION)

SQLite does not support RIGHT JOIN directly, but you can achieve it using a combination of LEFT JOIN and UNION.

Syntax:

SELECT columns FROM table1 LEFT JOIN table2 ON table1.column = table2.column UNION ALL SELECT columns FROM table2 LEFT JOIN table1 ON table2.column = table1.column WHERE table1.column IS NULL;

Example: Assume we have another student who is not enrolled in any class.

INSERT INTO students (id, name, class_id) VALUES (6, 'Frank', NULL); -- Query: SELECT students.name, classes.class_name FROM students LEFT JOIN classes ON students.class_id = classes.id UNION ALL SELECT students.name, classes.class_name FROM classes LEFT JOIN students ON students.class_id = classes.id WHERE students.class_id IS NULL;

Output:

| name | class_name | |-------|------------| | John | Math | | Alice | Science | | Bob | Math | | Eve | History | | Chris | Science | | NULL | Math | | NULL | Science | | NULL | History |

4. FULL JOIN (Using LEFT JOIN, UNION, and RIGHT JOIN Emulation)

SQLite does not support FULL JOIN directly, but you can achieve it using a combination of LEFT JOIN, UNION, and a RIGHT JOIN emulation.

Syntax:

SELECT columns FROM table1 LEFT JOIN table2 ON table1.column = table2.column UNION ALL SELECT columns FROM table2 LEFT JOIN table1 ON table2.column = table1.column WHERE table1.column IS NULL;

Example: We will use the same students and classes tables.

Query:

SELECT students.name, classes.class_name FROM students LEFT JOIN classes ON students.class_id = classes.id UNION SELECT students.name, classes.class_name FROM classes LEFT JOIN students ON students.class_id = classes.id WHERE students.class_id IS NULL;

Output:

| name | class_name | |-------|------------| | John | Math | | Alice | Science | | Bob | Math | | Eve | History | | Chris | Science | | NULL | Math | | NULL | Science | | NULL | History |

Example in SQLite Using Python

Let's demonstrate these joins using SQLite in Python.

import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students and classes tables cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, class_id INTEGER ) ''') cursor.execute(''' CREATE TABLE classes ( id INTEGER PRIMARY KEY, class_name TEXT ) ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, class_id) VALUES (1, 'John', 1), (2, 'Alice', 2), (3, 'Bob', 1), (4, 'Eve', 3), (5, 'Chris', 2), (6, 'Frank', NULL) ''') # Insert data into classes table cursor.execute(''' INSERT INTO classes (id, class_name) VALUES (1, 'Math'), (2, 'Science'), (3, 'History') ''') conn.commit() # INNER JOIN cursor.execute(''' SELECT students.name, classes.class_name FROM students INNER JOIN classes ON students.class_id = classes.id ''') inner_join_results = cursor.fetchall() # LEFT JOIN cursor.execute(''' SELECT students.name, classes.class_name FROM students LEFT JOIN classes ON students.class_id = classes.id ''') left_join_results = cursor.fetchall() # RIGHT JOIN emulation using LEFT JOIN and UNION cursor.execute(''' SELECT students.name, classes.class_name FROM students LEFT JOIN classes ON students.class_id = classes.id UNION ALL SELECT students.name, classes.class_name FROM classes LEFT JOIN students ON students.class_id = classes.id WHERE students.class_id IS NULL ''') right_join_results = cursor.fetchall() # FULL JOIN emulation using LEFT JOIN, UNION, and RIGHT JOIN emulation cursor.execute(''' SELECT students.name, classes.class_name FROM students LEFT JOIN classes ON students.class_id = classes.id UNION SELECT students.name, classes.class_name FROM classes LEFT JOIN students ON students.class_id = classes.id WHERE students.class_id IS NULL ''') full_join_results = cursor.fetchall() # Close the connection conn.close() # Output results (inner_join_results, left_join_results, right_join_results, full_join_results)

Output:

# Output for INNER JOIN [ ('John', 'Math'), ('Alice', 'Science'), ('Bob', 'Math'), ('Eve', 'History'), ('Chris', 'Science') ] # Output for LEFT JOIN [ ('John', 'Math'), ('Alice', 'Science'), ('Bob', 'Math'), ('Eve', 'History'), ('Chris', 'Science'), ('Frank', None) ] # Output for RIGHT JOIN emulation [ ('John', 'Math'), ('Alice', 'Science'), ('Bob', 'Math'), ('Eve', 'History'), ('Chris', 'Science'), (None, 'Math'), (None, 'Science'), (None, 'History') ] # Output for FULL JOIN emulation [ ('John', 'Math'), ('Alice', 'Science'), ('Bob', 'Math'), ('Eve', 'History'), ('Chris', 'Science'), (None, 'Math'), (None, 'Science'), (None, 'History') ]

Explanation

  1. INNER JOIN:

    • The INNER JOIN combines rows from students and classes where the class_id matches the id in the classes table. Only the matching rows are returned.
  2. LEFT JOIN:

    • The LEFT JOIN combines rows from students and classes, returning all rows from the students table and matching rows from the classes table. If there is no match, NULL is returned for columns from the classes table.
  3. RIGHT JOIN Emulation:

    • Since SQLite does not support RIGHT JOIN directly, it is emulated using a combination of LEFT JOIN and UNION ALL. This ensures that all rows from the classes table are included, with NULL for non-matching rows from the students table.
  4. FULL JOIN Emulation:

    • Since SQLite does not support FULL JOIN directly, it is emulated using a combination of LEFT JOIN, UNION, and RIGHT JOIN emulation. This ensures that all rows from both tables are included, with NULL for non-matching rows from either table.

These examples demonstrate how to use different types of joins in SQLite to combine data from multiple tables, with practical examples and their corresponding outputs in Python.

Using subqueries in SQL queries

Using Subqueries in SQL Queries in SQLite Database

A subquery, also known as an inner query or nested query, is a query within another SQL query. Subqueries can be used in various places within a SQL statement, including the SELECT, FROM, WHERE, and HAVING clauses. They provide a powerful way to perform complex queries.

Types of Subqueries

  1. Scalar Subqueries: Return a single value.
  2. Row Subqueries: Return a single row.
  3. Column Subqueries: Return a single column.
  4. Table Subqueries: Return a result set (multiple rows and columns).

Examples of Subqueries

Example Scenario

Assume we have two tables, students and scores.

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER ); CREATE TABLE scores ( student_id INTEGER, subject TEXT, score INTEGER, FOREIGN KEY (student_id) REFERENCES students(id) ); INSERT INTO students (id, name, age) VALUES (1, 'John', 22), (2, 'Alice', 25), (3, 'Bob', 20), (4, 'Eve', 23), (5, 'Chris', 19); INSERT INTO scores (student_id, subject, score) VALUES (1, 'Math', 85), (2, 'Math', 92), (3, 'Math', 75), (4, 'Math', 88), (5, 'Math', 90), (1, 'Science', 80), (2, 'Science', 85), (3, 'Science', 70), (4, 'Science', 78), (5, 'Science', 95);

1. Scalar Subquery

Find the student with the highest score in Math.

SELECT name FROM students WHERE id = ( SELECT student_id FROM scores WHERE subject = 'Math' ORDER BY score DESC LIMIT 1 );

Output:

| name | |-------| | Alice |

2. Row Subquery

Find the name and age of the student with the highest average score across all subjects.

SELECT name, age FROM students WHERE id = ( SELECT student_id FROM scores GROUP BY student_id ORDER BY AVG(score) DESC LIMIT 1 );

Output:

| name | age | |-------|-----| | Chris | 19 |

3. Column Subquery

Find the students who scored more than 80 in all subjects.

SELECT name FROM students WHERE id IN ( SELECT student_id FROM scores GROUP BY student_id HAVING MIN(score) > 80 );

Output:

| name | |-------| | Alice | | Chris |

4. Table Subquery

Find the average score for each subject and list subjects with average score above 80.

SELECT subject, avg_score FROM ( SELECT subject, AVG(score) AS avg_score FROM scores GROUP BY subject ) AS avg_scores WHERE avg_score > 80;

Output:

| subject | avg_score | |---------|-----------| | Math | 86.0 | | Science | 81.6 |

Example in SQLite Using Python

Let's demonstrate these subqueries using SQLite in Python.

import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students and scores tables cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER ) ''') cursor.execute(''' CREATE TABLE scores ( student_id INTEGER, subject TEXT, score INTEGER, FOREIGN KEY (student_id) REFERENCES students(id) ) ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, age) VALUES (1, 'John', 22), (2, 'Alice', 25), (3, 'Bob', 20), (4, 'Eve', 23), (5, 'Chris', 19) ''') # Insert data into scores table cursor.execute(''' INSERT INTO scores (student_id, subject, score) VALUES (1, 'Math', 85), (2, 'Math', 92), (3, 'Math', 75), (4, 'Math', 88), (5, 'Math', 90), (1, 'Science', 80), (2, 'Science', 85), (3, 'Science', 70), (4, 'Science', 78), (5, 'Science', 95) ''') conn.commit() # Scalar Subquery cursor.execute(''' SELECT name FROM students WHERE id = ( SELECT student_id FROM scores WHERE subject = 'Math' ORDER BY score DESC LIMIT 1 ) ''') scalar_subquery_result = cursor.fetchall() # Row Subquery cursor.execute(''' SELECT name, age FROM students WHERE id = ( SELECT student_id FROM scores GROUP BY student_id ORDER BY AVG(score) DESC LIMIT 1 ) ''') row_subquery_result = cursor.fetchall() # Column Subquery cursor.execute(''' SELECT name FROM students WHERE id IN ( SELECT student_id FROM scores GROUP BY student_id HAVING MIN(score) > 80 ) ''') column_subquery_result = cursor.fetchall() # Table Subquery cursor.execute(''' SELECT subject, avg_score FROM ( SELECT subject, AVG(score) AS avg_score FROM scores GROUP BY subject ) AS avg_scores WHERE avg_score > 80 ''') table_subquery_result = cursor.fetchall() # Close the connection conn.close() # Output results (scalar_subquery_result, row_subquery_result, column_subquery_result, table_subquery_result)

Output:

# Output for Scalar Subquery [ ('Alice',) ] # Output for Row Subquery [ ('Chris', 19) ] # Output for Column Subquery [ ('Alice',), ('Chris',) ] # Output for Table Subquery [ ('Math', 86.0), ('Science', 81.6) ]

Explanation

  1. Scalar Subquery:

    • The subquery finds the student_id with the highest score in Math, and the outer query selects the name of that student.
  2. Row Subquery:

    • The subquery finds the student_id with the highest average score, and the outer query selects the name and age of that student.
  3. Column Subquery:

    • The subquery finds the student_id of students who scored more than 80 in all subjects, and the outer query selects their name.
  4. Table Subquery:

    • The subquery calculates the average score for each subject, and the outer query selects subjects with an average score above 80.

These examples demonstrate how to use subqueries in SQLite to perform complex queries, with practical examples and their corresponding outputs in Python.

Correlated vs. non-correlated subqueries

Correlated vs. Non-Correlated Subqueries in SQLite

1. Non-Correlated Subqueries

A non-correlated subquery is a subquery that can be executed independently of the outer query. It does not rely on values from the outer query and can be run on its own to produce a result.

Example:

Find students whose age is above the average age of all students.

First, let's create and populate the students table:

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER ); INSERT INTO students (id, name, age) VALUES (1, 'John', 22), (2, 'Alice', 25), (3, 'Bob', 20), (4, 'Eve', 23), (5, 'Chris', 19);

Query:

SELECT name, age FROM students WHERE age > (SELECT AVG(age) FROM students);

Output:

| name | age | |-------|-----| | John | 22 | | Alice | 25 | | Eve | 23 |

In this example, the subquery (SELECT AVG(age) FROM students) is non-correlated because it does not depend on the outer query. It calculates the average age of all students, and the outer query uses this result to filter students.

2. Correlated Subqueries

A correlated subquery is a subquery that depends on the outer query for its values. It is executed repeatedly, once for each row processed by the outer query.

Example:

Find students who have a higher age than the average age of students in their class.

First, let's extend our example by adding a class_id to the students table and creating a classes table:

CREATE TABLE classes ( id INTEGER PRIMARY KEY, class_name TEXT ); INSERT INTO classes (id, class_name) VALUES (1, 'Math'), (2, 'Science'), (3, 'History'); ALTER TABLE students ADD COLUMN class_id INTEGER; UPDATE students SET class_id = 1 WHERE id IN (1, 3); UPDATE students SET class_id = 2 WHERE id IN (2, 5); UPDATE students SET class_id = 3 WHERE id = 4;

Query:

SELECT name, age FROM students s1 WHERE age > (SELECT AVG(age) FROM students s2 WHERE s1.class_id = s2.class_id);

Output:

| name | age | |-------|-----| | John | 22 | | Alice | 25 | | Eve | 23 |

In this example, the subquery (SELECT AVG(age) FROM students s2 WHERE s1.class_id = s2.class_id) is correlated because it depends on the class_id of each row in the outer query students s1. For each student in the outer query, the subquery calculates the average age of students in the same class and then compares it to the age of the student from the outer query.

Example in SQLite Using Python

Let's demonstrate these subqueries using SQLite in Python.

import sqlite3 # Connect to the SQLite database conn = sqlite3.connect(':memory:') # Use in-memory database for demonstration cursor = conn.cursor() # Create the students and classes tables cursor.execute(''' CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT, age INTEGER, class_id INTEGER ) ''') cursor.execute(''' CREATE TABLE classes ( id INTEGER PRIMARY KEY, class_name TEXT ) ''') # Insert data into classes table cursor.execute(''' INSERT INTO classes (id, class_name) VALUES (1, 'Math'), (2, 'Science'), (3, 'History') ''') # Insert data into students table cursor.execute(''' INSERT INTO students (id, name, age, class_id) VALUES (1, 'John', 22, 1), (2, 'Alice', 25, 2), (3, 'Bob', 20, 1), (4, 'Eve', 23, 3), (5, 'Chris', 19, 2) ''') conn.commit() # Non-Correlated Subquery cursor.execute(''' SELECT name, age FROM students WHERE age > (SELECT AVG(age) FROM students) ''') non_correlated_subquery_result = cursor.fetchall() # Correlated Subquery cursor.execute(''' SELECT name, age FROM students s1 WHERE age > (SELECT AVG(age) FROM students s2 WHERE s1.class_id = s2.class_id) ''') correlated_subquery_result = cursor.fetchall() # Close the connection conn.close() # Output results (non_correlated_subquery_result, correlated_subquery_result)

Output:

# Output for Non-Correlated Subquery [ ('John', 22), ('Alice', 25), ('Eve', 23) ] # Output for Correlated Subquery [ ('John', 22), ('Alice', 25), ('Eve', 23) ]

Explanation

  1. Non-Correlated Subquery:

    • The subquery calculates the average age of all students independently of the outer query.
    • The outer query selects students whose age is greater than this average.
  2. Correlated Subquery:

    • The subquery calculates the average age of students in the same class for each student in the outer query.
    • The outer query selects students whose age is greater than the average age of their classmates.

These examples demonstrate how to use correlated and non-correlated subqueries in SQLite to perform complex queries, with practical examples and their corresponding outputs in Python.


2.7 Views and Stored Procedures

Introduction to database views

Introduction to Database Views in SQLite

A view in SQLite is a virtual table created by a query that selects data from one or more tables. Views are useful for a variety of reasons:

  • Simplifying complex queries: Views can encapsulate complex queries and present them as simple tables.
  • Security: Views can restrict access to certain columns or rows, enhancing security.
  • Consistency: Views ensure that complex queries are standardized and consistently used across the application.

Creating a View

You create a view using the CREATE VIEW statement. Here’s the basic syntax:

CREATE VIEW view_name AS SELECT column1, column2, ... FROM table_name WHERE condition;

Example

Let's work through an example. Assume we have a database with the following table called Employees:

CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL );

We can populate the Employees table with some data:

INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000);

Now, let's create a view called FinanceEmployees that shows the first name, last name, and salary of employees in the Finance department:

CREATE VIEW FinanceEmployees AS SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance';

Querying the View

Once the view is created, you can query it just like a regular table:

SELECT * FROM FinanceEmployees;

Example Output

Assuming the above data, the query on the FinanceEmployees view would produce the following output:

FirstNameLastNameSalary
JaneSmith75000
EmilyDavis72000

Complete Example

Here is a complete SQLite script to create the table, insert data, create a view, and query the view:

-- Create the Employees table CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); -- Insert data into the Employees table INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000); -- Create a view for Finance employees CREATE VIEW FinanceEmployees AS SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance'; -- Query the view SELECT * FROM FinanceEmployees;

This script will create the table, insert the data, define the view, and then query the view to show employees in the Finance department. This demonstrates how views can be used to simplify access to specific subsets of data in SQLite.

Creating and managing views in SQL

Creating and Managing Views in SQLite

Views in SQLite provide a powerful way to present data in a structured and easily accessible manner. They allow you to encapsulate complex queries, provide a simplified interface to the data, and help enforce security by limiting access to specific data.

Creating a View

The CREATE VIEW statement is used to create a view in SQLite. Here’s the basic syntax:

CREATE VIEW view_name AS SELECT column1, column2, ... FROM table_name WHERE condition;

Example

Let's work with a sample Employees table:

CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL );

Insert sample data into the Employees table:

INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000);

Now, create a view to show employees from the Finance department:

CREATE VIEW FinanceEmployees AS SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance';

Querying the View

You can query the view just like a regular table:

SELECT * FROM FinanceEmployees;

This will output:

FirstNameLastNameSalary
JaneSmith75000
EmilyDavis72000

Managing Views

Updating a View

Views in SQLite cannot be directly updated. To modify a view, you need to drop it and recreate it.

Dropping a View

To drop a view, use the DROP VIEW statement:

DROP VIEW IF EXISTS FinanceEmployees;

Creating a New View After Dropping

After dropping the view, you can recreate it with any modifications needed:

CREATE VIEW FinanceEmployees AS SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance' AND Salary > 70000;

Complete Example

Here is a full script to demonstrate creating, querying, managing, and updating a view in SQLite:

-- Create the Employees table CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); -- Insert data into the Employees table INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000); -- Create a view for Finance employees CREATE VIEW FinanceEmployees AS SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance'; -- Query the view SELECT * FROM FinanceEmployees; -- Drop the view if it exists DROP VIEW IF EXISTS FinanceEmployees; -- Create the view again with a modified query CREATE VIEW FinanceEmployees AS SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance' AND Salary > 70000; -- Query the modified view SELECT * FROM FinanceEmployees;

Output of Queries

  1. Initial SELECT * FROM FinanceEmployees; query:
FirstNameLastNameSalary
JaneSmith75000
EmilyDavis72000
  1. After recreating the view with the new condition Salary > 70000:
FirstNameLastNameSalary
JaneSmith75000

This demonstrates how to create, manage, and update views in an SQLite database, providing a clear and simplified interface for data retrieval.

Overview of stored procedures and their advantages

Overview of Stored Procedures in SQLite

In many relational database systems, stored procedures are used to encapsulate complex business logic in reusable and executable routines. However, SQLite, being a lightweight, file-based database, does not natively support stored procedures like other more heavyweight database systems (e.g., MySQL, PostgreSQL, SQL Server).

Emulating Stored Procedures in SQLite

Although SQLite does not have built-in support for stored procedures, similar functionality can be achieved through:

  1. User-defined functions (UDFs): Custom functions written in C, Python, or other programming languages and loaded into SQLite.
  2. SQL scripts: Combining multiple SQL statements into a single script that can be executed as a transaction.
  3. Triggers: Used to automatically execute a predefined set of actions in response to certain events on a table (e.g., INSERT, UPDATE, DELETE).

Advantages of Stored Procedure-Like Implementations in SQLite

  1. Encapsulation of Logic: By emulating stored procedures, complex logic can be encapsulated, making it easier to manage and reuse.
  2. Performance: Precompiled logic (in the form of UDFs) can improve performance by reducing the overhead of repeated SQL parsing and execution.
  3. Security: Encapsulating logic within UDFs or triggers can limit direct access to the underlying tables, enhancing security.
  4. Maintainability: Centralized business logic improves maintainability, as changes need to be made in only one place.

Example: Emulating a Stored Procedure with a SQL Script

Let's create an example where we simulate a stored procedure to increase the salary of employees in a particular department by a given percentage. We'll use a SQL script to achieve this.

Step 1: Create the Employees Table and Insert Data

CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000);

Step 2: Create a SQL Script to Simulate a Stored Procedure

The following script increases the salary of all employees in a specified department by a given percentage:

-- SQL script to increase salary by a percentage for a specific department -- Parameters: :dept (department), :percent (percentage increase) BEGIN TRANSACTION; -- Update salaries UPDATE Employees SET Salary = Salary * (1 + :percent / 100) WHERE Department = :dept; -- Select updated rows SELECT * FROM Employees WHERE Department = :dept; COMMIT;

Step 3: Execute the Script with Parameters

Assume we want to increase the salary of all employees in the "Finance" department by 10%. Here’s how you would execute the script:

-- Set parameters PRAGMA dept = 'Finance'; PRAGMA percent = 10; -- Execute the script BEGIN TRANSACTION; UPDATE Employees SET Salary = Salary * (1 + (SELECT 10) / 100) WHERE Department = (SELECT 'Finance'); SELECT * FROM Employees WHERE Department = (SELECT 'Finance'); COMMIT;

Example Output

Before executing the script, the Finance department employees’ data is:

EmployeeIDFirstNameLastNameDepartmentSalary
2JaneSmithFinance75000
4EmilyDavisFinance72000

After executing the script, the Finance department employees’ data becomes:

EmployeeIDFirstNameLastNameDepartmentSalary
2JaneSmithFinance82500
4EmilyDavisFinance79200

Conclusion

While SQLite does not support stored procedures directly, you can emulate similar functionality using SQL scripts, user-defined functions, and triggers. This approach provides several advantages, including encapsulation of logic, performance improvements, security, and maintainability. The example provided demonstrates how to use a SQL script to simulate a stored procedure to update employee salaries in a specific department.


2.8 Indexes and Query Optimization

Understanding database indexes and their role in query optimization

Understanding Database Indexes and Their Role in Query Optimization in SQLite

Introduction to Database Indexes

A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data. Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed.

Role of Indexes in Query Optimization

  1. Faster Query Execution: Indexes can significantly reduce the time it takes to retrieve records. Instead of scanning the entire table, the database can search the index.
  2. Efficient Sorting: Indexes can help with sorting data efficiently. When executing ORDER BY clauses, indexed columns can speed up the sorting process.
  3. Unique Constraints: Indexes are used to enforce uniqueness of columns, typically via unique indexes.
  4. Joins Optimization: Indexes on join columns can enhance the performance of join operations.

Types of Indexes in SQLite

  1. Single-column Indexes: Indexes that are created on a single column.
  2. Multi-column Indexes: Indexes that span multiple columns, useful for queries filtering by multiple columns.
  3. Unique Indexes: Ensure the uniqueness of the values in the indexed columns.

Creating and Using Indexes in SQLite

Example Table: Employees

Let's start by creating a table called Employees:

CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL );

Insert some sample data:

INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000);

Creating an Index

To create an index on the LastName column:

CREATE INDEX idx_lastname ON Employees (LastName);

Querying with and without Index

Let's compare the performance of a query with and without the index.

Without Index
EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE LastName = 'Smith';

Output (without index):

0|0|0|SCAN TABLE Employees

This indicates a full table scan.

With Index
-- After creating the index CREATE INDEX idx_lastname ON Employees (LastName); EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE LastName = 'Smith';

Output (with index):

0|0|0|SEARCH TABLE Employees USING INDEX idx_lastname (LastName=?)

This indicates that the query is using the index, which is more efficient.

Dropping an Index

If an index is no longer needed, it can be dropped:

DROP INDEX IF EXISTS idx_lastname;

Complete Example

Here's a complete script demonstrating the creation of an index, its usage in a query, and the performance improvement:

-- Create the Employees table CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); -- Insert data into the Employees table INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000); -- Create an index on the LastName column CREATE INDEX idx_lastname ON Employees (LastName); -- Query the Employees table using the index EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE LastName = 'Smith'; -- Drop the index DROP INDEX IF EXISTS idx_lastname; -- Query plan without the index EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE LastName = 'Smith';

Conclusion

Indexes in SQLite play a crucial role in optimizing query performance by enabling faster data retrieval and efficient sorting. By creating appropriate indexes on columns that are frequently used in queries, especially in WHERE clauses and joins, you can significantly improve the performance of your database operations. However, it is essential to balance the use of indexes, as they come with the cost of additional storage and maintenance overhead.

Index types: B-tree indexes, hash indexes, bitmap indexes

SQLite primarily supports B-tree indexes, which are the default and most commonly used type of index in SQLite. Other types of indexes, such as hash indexes and bitmap indexes, are not natively supported in SQLite. Here's a detailed explanation of the types of indexes and their characteristics:

B-Tree Indexes

Overview

  • Structure: B-tree (Balanced tree) indexes store keys in a sorted order, which allows for efficient retrieval, insertion, and deletion operations.
  • Usage: Suitable for a wide range of queries, including equality searches, range searches, and sorting.
  • Default: B-tree indexes are the default index type in SQLite.

Creating a B-tree Index

Let's create a B-tree index on the LastName column of the Employees table.

CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000); -- Create a B-tree index on the LastName column CREATE INDEX idx_lastname ON Employees (LastName);

Query Using the Index

EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE LastName = 'Smith';

Output:

0|0|0|SEARCH TABLE Employees USING INDEX idx_lastname (LastName=?)

This output shows that the query uses the B-tree index to search the Employees table.

Hash Indexes

Overview

  • Structure: Hash indexes use a hash table where the keys are hashed into a hash value, providing very fast access for equality searches.
  • Usage: Best suited for equality searches, but not suitable for range queries or sorting.
  • Support in SQLite: Hash indexes are not natively supported in SQLite.

Bitmap Indexes

Overview

  • Structure: Bitmap indexes use a bitmap (array of bits) to represent the presence of a value in a column, providing fast access for low-cardinality columns (columns with few distinct values).
  • Usage: Best suited for queries involving multiple columns with AND/OR conditions.
  • Support in SQLite: Bitmap indexes are not natively supported in SQLite.

Example and Output for B-Tree Index in SQLite

Here is a complete example demonstrating the creation and usage of a B-tree index in SQLite:

-- Create the Employees table CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); -- Insert data into the Employees table INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000); -- Create a B-tree index on the LastName column CREATE INDEX idx_lastname ON Employees (LastName); -- Query the Employees table using the index EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE LastName = 'Smith';

Output of the EXPLAIN QUERY PLAN statement:

0|0|0|SEARCH TABLE Employees USING INDEX idx_lastname (LastName=?)

This output indicates that SQLite is using the B-tree index idx_lastname to optimize the query.

Conclusion

While SQLite supports only B-tree indexes natively, understanding different types of indexes like hash indexes and bitmap indexes can provide insights into their advantages and appropriate use cases. In SQLite, B-tree indexes are powerful tools for optimizing query performance, especially for equality and range queries. By creating indexes on frequently queried columns, you can significantly enhance the efficiency of your database operations.

Strategies for optimizing SQL queries for performance

Optimizing SQL queries for performance in SQLite involves a combination of best practices, appropriate indexing, query rewriting, and understanding how SQLite executes queries. Here are some strategies with detailed examples and their impact on performance:

Strategies for Optimizing SQL Queries

  1. Use Indexes Wisely
  2. **Avoid SELECT ***
  3. Use EXPLAIN QUERY PLAN
  4. Limit the Number of Subqueries
  5. Optimize Joins
  6. Avoid Unnecessary Columns in SELECT
  7. Use Transactions
  8. Analyze and Vacuum Database

Example Scenario

Let's use a sample Employees table and demonstrate these strategies:

CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000);

1. Use Indexes Wisely

Creating indexes on columns that are frequently used in WHERE, JOIN, and ORDER BY clauses can significantly speed up queries.

CREATE INDEX idx_lastname ON Employees (LastName); CREATE INDEX idx_department_salary ON Employees (Department, Salary);

Example Query with Index

EXPLAIN QUERY PLAN SELECT * FROM Employees WHERE LastName = 'Smith';

Output:

0|0|0|SEARCH TABLE Employees USING INDEX idx_lastname (LastName=?)

2. Avoid SELECT *

Selecting only the necessary columns reduces the amount of data transferred and processed.

Non-Optimized Query

SELECT * FROM Employees WHERE Department = 'Finance';

Optimized Query

SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance';

3. Use EXPLAIN QUERY PLAN

Use this statement to understand how SQLite executes your queries and to identify potential bottlenecks.

EXPLAIN QUERY PLAN SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance';

Output:

0|0|0|SCAN TABLE Employees

4. Limit the Number of Subqueries

Excessive subqueries can degrade performance. Optimize by rewriting queries to use joins or temporary tables where appropriate.

Non-Optimized Query

SELECT * FROM Employees WHERE EmployeeID IN (SELECT EmployeeID FROM Employees WHERE Salary > 70000);

Optimized Query

SELECT * FROM Employees WHERE Salary > 70000;

5. Optimize Joins

Ensure columns used in joins are indexed.

Non-Optimized Query

SELECT e.FirstName, e.LastName, d.DepartmentName FROM Employees e JOIN Departments d ON e.DepartmentID = d.DepartmentID;

Optimized Query

CREATE INDEX idx_departmentid ON Employees (DepartmentID); CREATE INDEX idx_dept_departmentid ON Departments (DepartmentID); SELECT e.FirstName, e.LastName, d.DepartmentName FROM Employees e JOIN Departments d ON e.DepartmentID = d.DepartmentID;

6. Avoid Unnecessary Columns in SELECT

Select only the columns you need to reduce I/O overhead.

SELECT FirstName, LastName FROM Employees WHERE Department = 'IT';

7. Use Transactions

Grouping multiple operations into a single transaction can reduce the overhead of committing multiple times.

Non-Optimized

BEGIN TRANSACTION; INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('Alice', 'Brown', 'HR', 70000); INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('Bob', 'White', 'IT', 75000); COMMIT;

8. Analyze and Vacuum Database

Regularly analyze the database to update statistics for the query planner and vacuum to clean up the database file.

ANALYZE; VACUUM;

Complete Example with Optimizations

Here's a complete example combining multiple optimizations:

-- Create the Employees table CREATE TABLE Employees ( EmployeeID INTEGER PRIMARY KEY, FirstName TEXT NOT NULL, LastName TEXT NOT NULL, Department TEXT NOT NULL, Salary REAL NOT NULL ); -- Insert data into the Employees table INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('John', 'Doe', 'HR', 60000), ('Jane', 'Smith', 'Finance', 75000), ('Mike', 'Johnson', 'IT', 80000), ('Emily', 'Davis', 'Finance', 72000), ('James', 'Wilson', 'IT', 82000); -- Create indexes CREATE INDEX idx_lastname ON Employees (LastName); CREATE INDEX idx_department_salary ON Employees (Department, Salary); -- Use EXPLAIN QUERY PLAN to understand query execution EXPLAIN QUERY PLAN SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance'; -- Optimized Query SELECT FirstName, LastName, Salary FROM Employees WHERE Department = 'Finance'; -- Avoid unnecessary columns SELECT FirstName, LastName FROM Employees WHERE Department = 'IT'; -- Use Transactions for batch operations BEGIN TRANSACTION; INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('Alice', 'Brown', 'HR', 70000); INSERT INTO Employees (FirstName, LastName, Department, Salary) VALUES ('Bob', 'White', 'IT', 75000); COMMIT; -- Analyze and vacuum the database ANALYZE; VACUUM;

Conclusion

Optimizing SQL queries in SQLite involves using indexes effectively, selecting only necessary columns, understanding query execution plans, minimizing subqueries, optimizing joins, and using transactions appropriately. Regular maintenance tasks like analyzing and vacuuming the database also contribute to performance improvements. By following these strategies, you can significantly enhance the performance of your SQLite database queries.


2.9 Transactions and Concurrency Control

Introduction to transactions in relational databases

Introduction to Transactions in Relational Databases

What is a Transaction?

A transaction in the context of a relational database is a sequence of one or more SQL operations that are executed as a single unit. The purpose of a transaction is to ensure that all operations within it are either fully completed or fully rolled back, maintaining the integrity and consistency of the database.

ACID Properties

Transactions are characterized by four key properties known as ACID properties:

  1. Atomicity: Ensures that all operations within the transaction are completed successfully. If any operation fails, the entire transaction fails and the database state is left unchanged.
  2. Consistency: Ensures that a transaction transforms the database from one consistent state to another consistent state. The database should always meet all predefined rules.
  3. Isolation: Ensures that transactions are securely and independently processed at the same time without interference, maintaining the consistency of the database.
  4. Durability: Ensures that once a transaction has been committed, it remains so, even in the event of a system failure.

SQLite Transactions

SQLite is a popular relational database management system that supports transactions. In SQLite, you can manage transactions using three main commands:

  1. BEGIN TRANSACTION: Starts a new transaction.
  2. COMMIT: Saves the changes made during the transaction.
  3. ROLLBACK: Undoes all changes made during the transaction.

Example of Transactions in SQLite

Let's go through an example to illustrate how transactions work in SQLite.

Example

Suppose we have a simple database with a users table. Here’s how we can perform a transaction:

-- Create a table CREATE TABLE users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL ); -- Insert initial data INSERT INTO users (name, balance) VALUES ('Alice', 1000); INSERT INTO users (name, balance) VALUES ('Bob', 1500);

Transaction: Transfer Money from Alice to Bob

  1. Begin the Transaction:

    BEGIN TRANSACTION;
  2. Perform Operations:

    -- Deduct 200 from Alice's balance UPDATE users SET balance = balance - 200 WHERE name = 'Alice'; -- Add 200 to Bob's balance UPDATE users SET balance = balance + 200 WHERE name = 'Bob';
  3. Commit the Transaction:

    COMMIT;
  4. If something goes wrong, Rollback the Transaction:

    ROLLBACK;

Output

Let’s verify the changes by selecting the data from the users table after the transaction.

SELECT * FROM users;

Expected Output:

idnamebalance
1Alice800
2Bob1700

This output shows that the transaction was successful: $200 was transferred from Alice's account to Bob's account.

Example Code with Output

Below is a complete example with Python code using SQLite3 to demonstrate transactions:

import sqlite3 # Connect to SQLite database conn = sqlite3.connect(':memory:') cursor = conn.cursor() # Create users table cursor.execute('''CREATE TABLE users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL )''') # Insert initial data cursor.execute("INSERT INTO users (name, balance) VALUES ('Alice', 1000)") cursor.execute("INSERT INTO users (name, balance) VALUES ('Bob', 1500)") conn.commit() # Start a transaction try: conn.execute("BEGIN TRANSACTION") # Perform operations cursor.execute("UPDATE users SET balance = balance - 200 WHERE name = 'Alice'") cursor.execute("UPDATE users SET balance = balance + 200 WHERE name = 'Bob'") # Commit the transaction conn.commit() except sqlite3.Error as e: print("Transaction failed: ", e) # Rollback the transaction in case of error conn.rollback() # Fetch and display data cursor.execute("SELECT * FROM users") rows = cursor.fetchall() for row in rows: print(row) # Close connection conn.close()

Output:

(1, 'Alice', 800.0) (2, 'Bob', 1700.0)

This Python script creates a SQLite database in memory, performs a transaction to transfer money from Alice to Bob, and prints the updated balances, demonstrating the concepts of transactions in relational databases using SQLite.

ACID properties of transactions

ACID Properties of Transactions in SQLite Database

Overview of ACID Properties

The ACID properties ensure reliable processing of database transactions. Here’s a detailed look at each property:

  1. Atomicity: Ensures that all operations within a transaction are treated as a single unit. If one operation fails, the entire transaction fails, and all changes made are rolled back.
  2. Consistency: Ensures that a transaction brings the database from one valid state to another, maintaining database invariants.
  3. Isolation: Ensures that transactions are executed in isolation from each other. Intermediate states of a transaction are invisible to other transactions.
  4. Durability: Ensures that once a transaction is committed, the changes are permanent, even in the event of a system failure.

SQLite and ACID Properties

SQLite adheres to the ACID properties to ensure reliable transactions. Let’s explore each property with an example.

Example Scenario

We will use a users table to demonstrate the ACID properties. The table has the following structure:

CREATE TABLE users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL );

Atomicity Example

In this example, we perform a transaction to transfer $200 from Alice to Bob. If any part of the transaction fails, the entire transaction is rolled back.

import sqlite3 # Connect to SQLite database conn = sqlite3.connect(':memory:') cursor = conn.cursor() # Create users table cursor.execute('''CREATE TABLE users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL )''') # Insert initial data cursor.execute("INSERT INTO users (name, balance) VALUES ('Alice', 1000)") cursor.execute("INSERT INTO users (name, balance) VALUES ('Bob', 1500)") conn.commit() # Start a transaction try: conn.execute("BEGIN TRANSACTION") # Perform operations cursor.execute("UPDATE users SET balance = balance - 200 WHERE name = 'Alice'") cursor.execute("UPDATE users SET balance = balance + 200 WHERE name = 'Bob'") # Commit the transaction conn.commit() except sqlite3.Error as e: print("Transaction failed: ", e) conn.rollback() # Fetch and display data cursor.execute("SELECT * FROM users") rows = cursor.fetchall() for row in rows: print(row) # Close connection conn.close()

Output:

(1, 'Alice', 800.0) (2, 'Bob', 1700.0)

If any operation within the transaction fails (e.g., due to a database constraint violation), the entire transaction would be rolled back, leaving the balances unchanged.

Consistency Example

Consistency ensures that only valid data is written to the database. Suppose we have a constraint that balances must be non-negative:

CREATE TABLE users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL CHECK (balance >= 0) );

If we attempt to transfer more money than Alice has, the transaction should fail and rollback:

import sqlite3 # Connect to SQLite database conn = sqlite3.connect(':memory:') cursor = conn.cursor() # Create users table with constraint cursor.execute('''CREATE TABLE users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL CHECK (balance >= 0) )''') # Insert initial data cursor.execute("INSERT INTO users (name, balance) VALUES ('Alice', 1000)") cursor.execute("INSERT INTO users (name, balance) VALUES ('Bob', 1500)") conn.commit() # Start a transaction try: conn.execute("BEGIN TRANSACTION") # Attempt to perform invalid operations cursor.execute("UPDATE users SET balance = balance - 1200 WHERE name = 'Alice'") cursor.execute("UPDATE users SET balance = balance + 1200 WHERE name = 'Bob'") # Commit the transaction conn.commit() except sqlite3.Error as e: print("Transaction failed: ", e) conn.rollback() # Fetch and display data cursor.execute("SELECT * FROM users") rows = cursor.fetchall() for row in rows: print(row) # Close connection conn.close()

Output:

Transaction failed: CHECK constraint failed: users (1, 'Alice', 1000.0) (2, 'Bob', 1500.0)

The transaction fails due to the CHECK constraint, ensuring the database remains in a consistent state.

Isolation Example

Isolation ensures that concurrent transactions do not interfere with each other. In SQLite, the default isolation level is SERIALIZABLE, which ensures strict isolation.

To demonstrate isolation, we would need to simulate concurrent transactions. This is more complex and requires threading or multiprocessing, but the principle is that one transaction's changes are not visible to another until they are committed.

Durability Example

Durability ensures that committed transactions are saved permanently, even in case of a crash. SQLite achieves durability by writing changes to the database file and ensuring they are flushed to disk.

For a simple demonstration, we can check that data persists after closing and reopening the connection:

import sqlite3 # Connect to SQLite database conn = sqlite3.connect('test.db') cursor = conn.cursor() # Create users table cursor.execute('''CREATE TABLE IF NOT EXISTS users ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL )''') # Insert initial data cursor.execute("INSERT INTO users (name, balance) VALUES ('Alice', 1000)") cursor.execute("INSERT INTO users (name, balance) VALUES ('Bob', 1500)") conn.commit() # Close connection conn.close() # Reopen connection to check durability conn = sqlite3.connect('test.db') cursor = conn.cursor() # Fetch and display data cursor.execute("SELECT * FROM users") rows = cursor.fetchall() for row in rows: print(row) # Clean up cursor.execute("DROP TABLE users") conn.commit() conn.close()

Output:

(1, 'Alice', 1000.0) (2, 'Bob', 1500.0)

This shows that the data persists across database connections, demonstrating durability.

These examples cover the ACID properties in SQLite, illustrating how transactions are managed to ensure reliable and consistent database operations.

Concurrency control mechanisms: Locking, timestamp-based protocols

Concurrency Control Mechanisms in SQLite: Locking and Timestamp-Based Protocols

Concurrency control in databases ensures that multiple transactions can occur simultaneously without causing data inconsistency. SQLite uses different mechanisms to achieve this, primarily focusing on locking.

Locking in SQLite

SQLite uses a locking mechanism to control access to the database. There are several types of locks in SQLite:

  1. SHARED Lock: Multiple readers can acquire a shared lock simultaneously, allowing them to read the database but not write.
  2. RESERVED Lock: A reserved lock indicates that a transaction intends to write to the database. It can only be acquired if no other transaction holds a pending lock.
  3. PENDING Lock: Indicates that a transaction wants to acquire an exclusive lock. It can only be acquired if there are no active shared locks.
  4. EXCLUSIVE Lock: Only one transaction can acquire an exclusive lock, allowing it to read and write to the database. It ensures no other transactions can access the database.

Example of Locking in SQLite

To demonstrate locking, let's use a scenario where multiple transactions try to access and modify the same database.

import sqlite3 import threading import time def read_db(): conn = sqlite3.connect('test_locking.db') cursor = conn.cursor() print("Reader: Acquiring shared lock") cursor.execute("BEGIN") cursor.execute("SELECT * FROM users") rows = cursor.fetchall() print("Reader: ", rows) time.sleep(2) print("Reader: Releasing shared lock") conn.close() def write_db(): conn = sqlite3.connect('test_locking.db') cursor = conn.cursor() print("Writer: Acquiring exclusive lock") cursor.execute("BEGIN EXCLUSIVE") cursor.execute("UPDATE users SET balance = balance + 100 WHERE name = 'Bob'") print("Writer: Committing transaction") conn.commit() conn.close() # Initialize the database conn = sqlite3.connect('test_locking.db') cursor = conn.cursor() cursor.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL NOT NULL)") cursor.execute("INSERT INTO users (name, balance) VALUES ('Alice', 1000)") cursor.execute("INSERT INTO users (name, balance) VALUES ('Bob', 1500)") conn.commit() conn.close() # Create threads for concurrent access reader_thread = threading.Thread(target=read_db) writer_thread = threading.Thread(target=write_db) # Start threads reader_thread.start() time.sleep(1) # Ensure the reader starts first writer_thread.start() # Wait for threads to finish reader_thread.join() writer_thread.join() # Check the final state of the database conn = sqlite3.connect('test_locking.db') cursor = conn.cursor() cursor.execute("SELECT * FROM users") rows = cursor.fetchall() print("Final state: ", rows) conn.close()

Output:

Reader: Acquiring shared lock Reader: [(1, 'Alice', 1000.0), (2, 'Bob', 1500.0)] Writer: Acquiring exclusive lock Reader: Releasing shared lock Writer: Committing transaction Final state: [(1, 'Alice', 1000.0), (2, 'Bob', 1600.0)]

In this example:

  • The reader thread acquires a shared lock and reads the database.
  • The writer thread waits for the reader to release its lock before acquiring an exclusive lock to update the database.

Timestamp-Based Protocols

While SQLite primarily uses locking for concurrency control, timestamp-based protocols are another method used in some database systems. These protocols use timestamps to order transactions and ensure serializability.

Each transaction is given a timestamp when it starts. Transactions are ordered based on these timestamps, and the database ensures that the operations of each transaction appear to be executed in this order.

SQLite doesn't natively support timestamp-based concurrency control, but we can illustrate the concept with a simplified example.

Example of Timestamp-Based Protocol

In this simplified example, we will simulate timestamp ordering by manually assigning timestamps and checking conditions before executing operations.

import time class Transaction: def __init__(self, name, timestamp): self.name = name self.timestamp = timestamp self.read_set = set() self.write_set = set() def read(self, key): self.read_set.add(key) # Simulate read operation print(f"{self.name} reads {key}") def write(self, key): self.write_set.add(key) # Simulate write operation print(f"{self.name} writes {key}") # Simulate transactions with timestamps t1 = Transaction("T1", time.time()) time.sleep(1) # Ensure different timestamps t2 = Transaction("T2", time.time()) # Transaction T1 t1.read('Alice') t1.write('Bob') # Transaction T2 t2.read('Bob') t2.write('Alice') # Simulate timestamp ordering if t1.timestamp < t2.timestamp: print("T1 commits before T2") else: print("T2 commits before T1")

Output:

T1 reads Alice T1 writes Bob T2 reads Bob T2 writes Alice T1 commits before T2

In this example:

  • Transactions T1 and T2 are given timestamps.
  • Operations are executed in the order of their timestamps.
  • The system checks the timestamps to ensure serializability.

Summary

SQLite primarily uses locking mechanisms to manage concurrency control, ensuring ACID properties through different types of locks. Timestamp-based protocols, while not natively supported in SQLite, provide an alternative method for ensuring serializability in other database systems. The examples provided demonstrate how locking works in SQLite and illustrate the concept of timestamp-based protocols.


2.10 Database Integrity and Security

Ensuring data integrity with constraints: Primary keys, foreign keys, unique constraints

Ensuring Data Integrity with Constraints in SQLite

Data integrity is crucial in relational databases to ensure the accuracy and consistency of data. SQLite provides several types of constraints to enforce data integrity:

  1. Primary Key: Ensures that each row in a table has a unique identifier.
  2. Foreign Key: Ensures that a value in one table matches a value in another table, establishing a relationship between the two tables.
  3. Unique Constraint: Ensures that all values in a column (or a group of columns) are unique across the table.

Primary Key

A primary key uniquely identifies each row in a table. It must contain unique values and cannot contain NULLs.

Example

CREATE TABLE students ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER NOT NULL );

In this example, the id column is the primary key for the students table.

Foreign Key

A foreign key in one table points to a primary key in another table, creating a relationship between the two tables.

Example

CREATE TABLE classes ( id INTEGER PRIMARY KEY, name TEXT NOT NULL ); CREATE TABLE enrollments ( student_id INTEGER, class_id INTEGER, FOREIGN KEY (student_id) REFERENCES students(id), FOREIGN KEY (class_id) REFERENCES classes(id) );

In this example, the enrollments table has foreign keys referencing the students and classes tables, establishing relationships between them.

Unique Constraint

A unique constraint ensures that all values in a column are unique across the table.

Example

CREATE TABLE teachers ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL UNIQUE );

In this example, the email column in the teachers table must contain unique values.

Full Example with Details and Output

Let's put all these constraints together in a more comprehensive example.

import sqlite3 # Connect to SQLite database conn = sqlite3.connect('school.db') cursor = conn.cursor() # Create tables with constraints cursor.execute(''' CREATE TABLE IF NOT EXISTS students ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER NOT NULL ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS classes ( id INTEGER PRIMARY KEY, name TEXT NOT NULL ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS enrollments ( student_id INTEGER, class_id INTEGER, FOREIGN KEY (student_id) REFERENCES students(id), FOREIGN KEY (class_id) REFERENCES classes(id) ) ''') cursor.execute(''' CREATE TABLE IF NOT EXISTS teachers ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL UNIQUE ) ''') # Insert data cursor.execute("INSERT INTO students (name, age) VALUES ('Alice', 20)") cursor.execute("INSERT INTO students (name, age) VALUES ('Bob', 22)") cursor.execute("INSERT INTO classes (name) VALUES ('Math')") cursor.execute("INSERT INTO classes (name) VALUES ('Science')") cursor.execute("INSERT INTO enrollments (student_id, class_id) VALUES (1, 1)") cursor.execute("INSERT INTO enrollments (student_id, class_id) VALUES (2, 2)") cursor.execute("INSERT INTO teachers (name, email) VALUES ('Dr. Smith', 'smith@example.com')") conn.commit() # Query data cursor.execute("SELECT * FROM students") students = cursor.fetchall() print("Students:", students) cursor.execute("SELECT * FROM classes") classes = cursor.fetchall() print("Classes:", classes) cursor.execute("SELECT * FROM enrollments") enrollments = cursor.fetchall() print("Enrollments:", enrollments) cursor.execute("SELECT * FROM teachers") teachers = cursor.fetchall() print("Teachers:", teachers) # Close connection conn.close()

Expected Output:

Students: [(1, 'Alice', 20), (2, 'Bob', 22)] Classes: [(1, 'Math'), (2, 'Science')] Enrollments: [(1, 1), (2, 2)] Teachers: [(1, 'Dr. Smith', 'smith@example.com')]

Detailed Breakdown

  1. Creating Tables with Constraints:

    • students table: Primary key on id.
    • classes table: Primary key on id.
    • enrollments table: Foreign keys referencing students(id) and classes(id).
    • teachers table: Primary key on id and unique constraint on email.
  2. Inserting Data:

    • Inserts records into students, classes, enrollments, and teachers tables.
  3. Querying Data:

    • Retrieves and prints data from all tables to verify the constraints and data integrity.

Conclusion

By using primary keys, foreign keys, and unique constraints, SQLite ensures data integrity and enforces relationships between tables. This example demonstrates how these constraints can be implemented and verified in a SQLite database.

Database security concepts: Authentication, authorization, encryption

Database Security Concepts in SQLite

Database security is crucial to protect data from unauthorized access and breaches. The key security concepts in SQLite include authentication, authorization, and encryption.

1. Authentication

Authentication is the process of verifying the identity of a user or application accessing the database. SQLite does not natively support user authentication since it is designed to be embedded within applications. Instead, authentication is typically managed by the application that uses the SQLite database.

Example: Simple Authentication in a Python Application

To illustrate, we can implement a basic authentication mechanism in a Python application that uses an SQLite database.

import sqlite3 import hashlib # Function to create a user table with hashed passwords def create_user_table(conn): cursor = conn.cursor() cursor.execute(''' CREATE TABLE IF NOT EXISTS users ( username TEXT PRIMARY KEY, password TEXT NOT NULL ) ''') conn.commit() # Function to add a user with a hashed password def add_user(conn, username, password): cursor = conn.cursor() hashed_password = hashlib.sha256(password.encode()).hexdigest() cursor.execute('INSERT INTO users (username, password) VALUES (?, ?)', (username, hashed_password)) conn.commit() # Function to authenticate a user def authenticate_user(conn, username, password): cursor = conn.cursor() hashed_password = hashlib.sha256(password.encode()).hexdigest() cursor.execute('SELECT * FROM users WHERE username = ? AND password = ?', (username, hashed_password)) return cursor.fetchone() is not None # Connect to SQLite database conn = sqlite3.connect('secure.db') create_user_table(conn) # Add a user add_user(conn, 'alice', 'password123') # Authenticate user if authenticate_user(conn, 'alice', 'password123'): print("Authentication successful!") else: print("Authentication failed!") conn.close()

Output:

Authentication successful!

In this example, passwords are hashed using SHA-256 before storing them in the database. Authentication is performed by comparing the hashed passwords.

2. Authorization

Authorization determines what an authenticated user is allowed to do. SQLite itself does not support role-based access control (RBAC) or fine-grained permissions. Authorization is typically enforced at the application level.

Example: Simple Authorization in a Python Application

Let's extend the previous example to include basic authorization.

# Function to create roles table def create_roles_table(conn): cursor = conn.cursor() cursor.execute(''' CREATE TABLE IF NOT EXISTS roles ( username TEXT PRIMARY KEY, role TEXT NOT NULL ) ''') conn.commit() # Function to assign a role to a user def assign_role(conn, username, role): cursor = conn.cursor() cursor.execute('INSERT OR REPLACE INTO roles (username, role) VALUES (?, ?)', (username, role)) conn.commit() # Function to check user's role def check_role(conn, username, required_role): cursor = conn.cursor() cursor.execute('SELECT role FROM roles WHERE username = ?', (username,)) result = cursor.fetchone() return result is not None and result[0] == required_role # Connect to SQLite database conn = sqlite3.connect('secure.db') create_roles_table(conn) # Assign a role to user assign_role(conn, 'alice', 'admin') # Authenticate and authorize user if authenticate_user(conn, 'alice', 'password123'): if check_role(conn, 'alice', 'admin'): print("User authorized as admin!") else: print("User not authorized!") else: print("Authentication failed!") conn.close()

Output:

User authorized as admin!

In this example, a roles table is added to manage user roles, and the application checks if the user has the required role for a specific action.

3. Encryption

Encryption ensures that data stored in the database is protected from unauthorized access. SQLite itself does not provide built-in encryption, but there are extensions like SQLite Encryption Extension (SEE) and SQLCipher that add encryption capabilities.

Example: Using SQLCipher for Encryption

To use SQLCipher, you need to install the library. Below is an example of how to use SQLCipher for encrypting an SQLite database.

import sqlite3 # Encrypting an SQLite database with SQLCipher def encrypt_database(db_path, password): conn = sqlite3.connect(db_path) cursor = conn.cursor() cursor.execute(f"PRAGMA key = '{password}';") cursor.execute("PRAGMA cipher_version;") cursor.execute(''' CREATE TABLE IF NOT EXISTS secure_data ( id INTEGER PRIMARY KEY, data TEXT NOT NULL ) ''') cursor.execute("INSERT INTO secure_data (data) VALUES ('Sensitive Information')") conn.commit() conn.close() # Decrypt and access data def access_encrypted_database(db_path, password): conn = sqlite3.connect(db_path) cursor = conn.cursor() cursor.execute(f"PRAGMA key = '{password}';") cursor.execute("SELECT * FROM secure_data") rows = cursor.fetchall() for row in rows: print(row) conn.close() # Path to the encrypted database db_path = 'encrypted.db' # Encrypt the database encrypt_database(db_path, 'my_secret_password') # Access the encrypted database access_encrypted_database(db_path, 'my_secret_password')

Output:

(1, 'Sensitive Information')

In this example, SQLCipher is used to encrypt the database with a password. The encrypted database can only be accessed by providing the correct password.

Conclusion

SQLite provides basic security through application-level authentication and authorization. For encryption, extensions like SQLCipher can be used to secure the database. These examples demonstrate how to implement these security concepts in an SQLite database.

Best practices for securing relational databases

Securing a relational database, including SQLite, involves a combination of best practices that address authentication, authorization, encryption, and general security principles. Below are detailed best practices along with examples and expected outputs for securing an SQLite database.

1. Use Secure Connections

SQLite databases are file-based, and ensuring secure access to these files is critical. Secure your connections by controlling file access permissions.

Example

Ensure the SQLite database file has appropriate permissions. On a Unix-based system, you can use the chmod command:

chmod 600 /path/to/your/database.db

Output:

This command restricts the file access to the owner, preventing unauthorized users from accessing the database file.

2. Encrypt the Database

Using an extension like SQLCipher to encrypt your database ensures that your data is secure even if the database file is accessed by unauthorized users.

Example

import sqlite3 # Encrypting an SQLite database with SQLCipher def encrypt_database(db_path, password): conn = sqlite3.connect(db_path) cursor = conn.cursor() cursor.execute(f"PRAGMA key = '{password}';") cursor.execute("PRAGMA cipher_version;") cursor.execute(''' CREATE TABLE IF NOT EXISTS secure_data ( id INTEGER PRIMARY KEY, data TEXT NOT NULL ) ''') cursor.execute("INSERT INTO secure_data (data) VALUES ('Sensitive Information')") conn.commit() conn.close() # Decrypt and access data def access_encrypted_database(db_path, password): conn = sqlite3.connect(db_path) cursor = conn.cursor() cursor.execute(f"PRAGMA key = '{password}';") cursor.execute("SELECT * FROM secure_data") rows = cursor.fetchall() for row in rows: print(row) conn.close() # Path to the encrypted database db_path = 'encrypted.db' # Encrypt the database encrypt_database(db_path, 'my_secret_password') # Access the encrypted database access_encrypted_database(db_path, 'my_secret_password')

Output:

(1, 'Sensitive Information')

3. Control Access to the Database

Ensure that only authorized applications and users can access the database. This is typically managed at the application level since SQLite does not support built-in user management.

Example: Basic User Authentication in a Python Application

import sqlite3 import hashlib def create_user_table(conn): cursor = conn.cursor() cursor.execute(''' CREATE TABLE IF NOT EXISTS users ( username TEXT PRIMARY KEY, password TEXT NOT NULL ) ''') conn.commit() def add_user(conn, username, password): cursor = conn.cursor() hashed_password = hashlib.sha256(password.encode()).hexdigest() cursor.execute('INSERT INTO users (username, password) VALUES (?, ?)', (username, hashed_password)) conn.commit() def authenticate_user(conn, username, password): cursor = conn.cursor() hashed_password = hashlib.sha256(password.encode()).hexdigest() cursor.execute('SELECT * FROM users WHERE username = ? AND password = ?', (username, hashed_password)) return cursor.fetchone() is not None # Connect to SQLite database conn = sqlite3.connect('secure.db') create_user_table(conn) add_user(conn, 'alice', 'password123') # Authenticate user if authenticate_user(conn, 'alice', 'password123'): print("Authentication successful!") else: print("Authentication failed!") conn.close()

Output:

Authentication successful!

4. Regular Backups

Regular backups are essential to protect against data loss. Ensure your backups are secure and stored in a separate location.

Example: Simple Backup Script

#!/bin/bash # Path to the database file DB_PATH="/path/to/your/database.db" # Path to the backup directory BACKUP_DIR="/path/to/your/backup/dir" # Create a backup with a timestamp BACKUP_FILE="$BACKUP_DIR/database_$(date +'%Y%m%d%H%M%S').db" # Copy the database file to the backup directory cp $DB_PATH $BACKUP_FILE # Output the result echo "Backup created at $BACKUP_FILE"

Output:

Backup created at /path/to/your/backup/dir/database_20230619123045.db

5. Use Transactions

Using transactions ensures data consistency and integrity. Ensure that operations that modify the database are wrapped in transactions.

Example

import sqlite3 # Connect to SQLite database conn = sqlite3.connect('secure.db') cursor = conn.cursor() try: cursor.execute('BEGIN TRANSACTION') cursor.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1") cursor.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2") conn.commit() print("Transaction committed successfully.") except sqlite3.Error as e: conn.rollback() print("Transaction failed, rolling back.", e) conn.close()

Output:

Transaction committed successfully.

6. Regular Audits and Monitoring

Regularly audit your database and monitor access logs to detect and respond to suspicious activities.

Example: Logging Access Attempts

import sqlite3 import datetime def log_access(conn, username, status): cursor = conn.cursor() cursor.execute(''' CREATE TABLE IF NOT EXISTS access_logs ( id INTEGER PRIMARY KEY, username TEXT, timestamp TEXT, status TEXT ) ''') timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S") cursor.execute('INSERT INTO access_logs (username, timestamp, status) VALUES (?, ?, ?)', (username, timestamp, status)) conn.commit() # Connect to SQLite database conn = sqlite3.connect('secure.db') log_access(conn, 'alice', 'success') log_access(conn, 'bob', 'failed') # Retrieve logs cursor = conn.cursor() cursor.execute("SELECT * FROM access_logs") logs = cursor.fetchall() for log in logs: print(log) conn.close()

Output:

(1, 'alice', '2024-06-19 12:45:00', 'success') (2, 'bob', '2024-06-19 12:46:00', 'failed')

7. Least Privilege Principle

Ensure that applications and users have the minimum level of access necessary to perform their tasks. This can be managed by the application logic interacting with the SQLite database.

Conclusion

Securing an SQLite database involves a combination of practices, including securing access to the database file, encrypting the database, managing user authentication and authorization, regular backups, and auditing. While SQLite does not have built-in support for all these features, they can be implemented at the application level to ensure robust security.


2.11 Backup and Recovery

Importance of database backup and recovery

Importance of Database Backup and Recovery in SQLite

Database backup and recovery are critical components of data management. They ensure that data is protected against loss or corruption, allowing it to be restored in case of failures, disasters, or accidental deletions. In SQLite, which is a lightweight, file-based database, backup and recovery are particularly important due to the following reasons:

Importance of Backup and Recovery

  1. Data Protection: Backups protect against data loss due to hardware failures, software bugs, or human errors.
  2. Disaster Recovery: In case of catastrophic events (e.g., fire, flood, or cyber-attacks), backups ensure that data can be recovered.
  3. Accidental Deletion: Users or applications may accidentally delete important data, which can be restored from backups.
  4. Corruption Recovery: Databases may get corrupted due to various reasons (e.g., power failures, file system issues), and backups can help recover from such scenarios.
  5. Testing and Development: Backups can be used to create test and development environments without affecting the production data.

Backup and Recovery Mechanisms in SQLite

SQLite supports several methods for creating backups and recovering data:

  1. Using the SQLite Command-Line Tool
  2. Using the Backup API
  3. Manual File Copy

1. Using the SQLite Command-Line Tool

The SQLite command-line tool provides a simple way to create and restore backups.

Example: Creating a Backup

sqlite3 original.db ".backup backup.db"

Explanation:

  • original.db is the source database.
  • backup.db is the backup copy of the database.

Output:

A file named backup.db is created, which is a backup of original.db.

Example: Restoring from a Backup

sqlite3 backup.db ".restore original.db"

Explanation:

  • backup.db is the backup database.
  • original.db is the target database to be restored.

Output:

The original.db is restored with the data from backup.db.

2. Using the Backup API

SQLite's Backup API allows for programmatic backup and restore operations, which can be integrated into applications.

Example: Using the Backup API in Python

import sqlite3 def backup_database(source_db, backup_db): # Connect to the source database source_conn = sqlite3.connect(source_db) # Connect to the backup database backup_conn = sqlite3.connect(backup_db) with backup_conn: # Perform the backup source_conn.backup(backup_conn) # Close the connections source_conn.close() backup_conn.close() # Backup the database backup_database('original.db', 'backup.db') # Restore the database backup_database('backup.db', 'restored.db') # Verify the backup by connecting to the restored database conn = sqlite3.connect('restored.db') cursor = conn.cursor() cursor.execute("SELECT name FROM sqlite_master WHERE type='table';") tables = cursor.fetchall() print("Tables in restored database:", tables) conn.close()

Output:

Tables in restored database: [('your_table_name',)]

This script backs up the original.db to backup.db and then restores it to restored.db.

3. Manual File Copy

Since SQLite databases are just files, you can manually copy the database file to create a backup.

Example: Creating a Manual Backup

cp original.db backup.db

Explanation:

  • cp is the copy command.
  • original.db is the source database file.
  • backup.db is the backup copy of the database.

Output:

A file named backup.db is created, which is a copy of original.db.

Best Practices for Backup and Recovery

  1. Regular Backups: Schedule regular backups to ensure that data can be recovered from the most recent state.
  2. Automate Backups: Use scripts or backup tools to automate the backup process and reduce the risk of human error.
  3. Offsite Storage: Store backups in a different physical location or cloud storage to protect against local disasters.
  4. Test Restores: Regularly test the restore process to ensure that backups are valid and can be used for recovery.
  5. Versioning: Keep multiple versions of backups to allow recovery from different points in time.

Conclusion

Backup and recovery are vital for maintaining the integrity and availability of data in an SQLite database. Using the SQLite command-line tool, Backup API, or manual file copy, backups can be created and restored efficiently. Implementing regular and automated backups, along with proper testing and offsite storage, ensures that data is protected against various risks and can be recovered when needed.

Strategies for backing up and restoring relational databases

Backing up and restoring relational databases in SQLite involves several strategies, each suited to different use cases and requirements. Below, I'll cover various methods along with examples and expected outputs for each strategy:

1. Using the SQLite Command-Line Tool

The SQLite command-line tool (sqlite3) provides a straightforward way to create and restore backups.

Example: Creating a Backup

sqlite3 original.db ".backup backup.db"

Explanation:

  • original.db is the source database.
  • .backup backup.db is the command to create a backup file named backup.db.

Output:

A file named backup.db is created as a backup of original.db.

Example: Restoring from a Backup

sqlite3 original.db ".restore backup.db"

Explanation:

  • original.db is the target database to be restored.
  • .restore backup.db is the command to restore the database from the backup file named backup.db.

Output:

The original.db is restored with the data from backup.db.

2. Using the Backup API in SQLite

SQLite's Backup API allows for programmatic backup and restore operations, which can be integrated into applications.

Example: Using the Backup API in Python

import sqlite3 def backup_database(source_db, backup_db): # Connect to the source database source_conn = sqlite3.connect(source_db) # Connect to the backup database backup_conn = sqlite3.connect(backup_db) with backup_conn: # Perform the backup source_conn.backup(backup_conn) # Close the connections source_conn.close() backup_conn.close() def restore_database(backup_db, target_db): # Connect to the backup database backup_conn = sqlite3.connect(backup_db) # Connect to the target database target_conn = sqlite3.connect(target_db) with target_conn: # Perform the restore backup_conn.backup(target_conn) # Close the connections backup_conn.close() target_conn.close() # Backup the database backup_database('original.db', 'backup.db') # Restore the database restore_database('backup.db', 'restored.db') # Verify the backup by connecting to the restored database conn = sqlite3.connect('restored.db') cursor = conn.cursor() cursor.execute("SELECT name FROM sqlite_master WHERE type='table';") tables = cursor.fetchall() print("Tables in restored database:", tables) conn.close()

Output:

Tables in restored database: [('your_table_name',)]

This script demonstrates how to back up original.db to backup.db and then restore it to restored.db.

3. Manual File Copy

Since SQLite databases are single files, you can manually copy the database file to create a backup.

Example: Creating a Manual Backup

cp original.db backup.db

Explanation:

  • cp is the copy command.
  • original.db is the source database file.
  • backup.db is the backup copy of the database.

Output:

A file named backup.db is created, which is a copy of original.db.

4. Automated Backups with Scripts

Automating backups with scripts can help ensure regular backups without manual intervention.

Example: Backup Script in Bash

#!/bin/bash # Path to the database file DB_PATH="/path/to/your/database.db" # Path to the backup directory BACKUP_DIR="/path/to/your/backup/dir" # Create a backup with a timestamp BACKUP_FILE="$BACKUP_DIR/database_$(date +'%Y%m%d%H%M%S').db" # Copy the database file to the backup directory cp $DB_PATH $BACKUP_FILE # Output the result echo "Backup created at $BACKUP_FILE"

Output:

Backup created at /path/to/your/backup/dir/database_20240619123045.db

5. Using SQLite Online Backup API

The SQLite Online Backup API allows for creating a backup without interrupting the database's operations. This is particularly useful for applications that require high availability.

Example: Using the Online Backup API in Python

import sqlite3 def online_backup(source_db, backup_db): # Connect to the source database source_conn = sqlite3.connect(source_db) # Connect to the backup database backup_conn = sqlite3.connect(backup_db) with backup_conn: # Perform the backup source_conn.backup(backup_conn, pages=1, progress=None) # Close the connections source_conn.close() backup_conn.close() # Backup the database online_backup('original.db', 'backup.db') # Verify the backup by connecting to the backup database conn = sqlite3.connect('backup.db') cursor = conn.cursor() cursor.execute("SELECT name FROM sqlite_master WHERE type='table';") tables = cursor.fetchall() print("Tables in backup database:", tables) conn.close()

Output:

Tables in backup database: [('your_table_name',)]

Best Practices for Backup and Recovery

  1. Regular Backups: Schedule backups to occur regularly (e.g., daily, weekly) to ensure recent data is available for recovery.
  2. Automate Backups: Use scripts or tools to automate the backup process, reducing the risk of human error.
  3. Store Backups Offsite: Store backups in a separate physical location or in cloud storage to protect against local disasters.
  4. Test Restores: Regularly test the restore process to ensure that backups are valid and usable.
  5. Versioning: Keep multiple versions of backups to allow recovery from different points in time, protecting against data corruption and accidental deletions.
  6. Encryption: Encrypt backups to protect sensitive data from unauthorized access.
  7. Monitoring and Alerts: Implement monitoring and alerting for backup processes to ensure they are completed successfully.

Conclusion

Effective backup and recovery strategies are essential for maintaining data integrity and availability in SQLite databases. By using the SQLite command-line tool, Backup API, manual file copying, automated scripts, and the Online Backup API, you can ensure your data is protected against loss or corruption. Implementing regular backups, testing restores, and following best practices will help safeguard your data and ensure business continuity.

Disaster recovery planning and procedures

Disaster recovery planning is a crucial aspect of database management, ensuring that data can be restored and normal operations can be resumed after a catastrophic event such as hardware failure, software corruption, cyberattacks, or natural disasters. For SQLite, a lightweight and file-based database, disaster recovery involves a set of strategies and procedures to back up data, restore databases, and minimize downtime.

Disaster Recovery Planning for SQLite

  1. Risk Assessment and Identification:

    • Identify potential risks and threats to the database.
    • Assess the impact of data loss and downtime on the business.
  2. Backup Strategy:

    • Determine the frequency of backups based on data criticality and update frequency.
    • Automate backup processes to ensure consistency and reduce human error.
    • Store backups in multiple locations, including offsite and cloud storage.
  3. Recovery Procedures:

    • Develop clear procedures for restoring databases from backups.
    • Test recovery procedures regularly to ensure they work correctly.
    • Document the steps involved in recovery and train relevant personnel.
  4. Monitoring and Alerts:

    • Implement monitoring to detect database issues promptly.
    • Set up alerts for backup failures and other critical events.
  5. Security Measures:

    • Encrypt backups to protect sensitive data.
    • Implement access controls to ensure only authorized personnel can perform backups and restores.

Backup and Restore Procedures with Examples

Example 1: Automated Backup with Shell Script

Backup Script:

#!/bin/bash # Path to the database file DB_PATH="/path/to/your/database.db" # Path to the backup directory BACKUP_DIR="/path/to/your/backup/dir" # Create a backup with a timestamp BACKUP_FILE="$BACKUP_DIR/database_$(date +'%Y%m%d%H%M%S').db" # Copy the database file to the backup directory cp $DB_PATH $BACKUP_FILE # Output the result echo "Backup created at $BACKUP_FILE"

Usage:

  • Schedule this script to run at regular intervals using cron on Unix-based systems or Task Scheduler on Windows.

Output:

Backup created at /path/to/your/backup/dir/database_20240619123045.db

Example 2: Using the SQLite Command-Line Tool for Backup

Creating a Backup:

sqlite3 /path/to/original.db ".backup /path/to/backup/backup.db"

Output: A file named backup.db is created as a backup of original.db.

Restoring from a Backup:

sqlite3 /path/to/restore.db ".restore /path/to/backup/backup.db"

Output: The restore.db is restored with the data from backup.db.

Example 3: Programmatic Backup Using Python

Backup and Restore with SQLite Backup API:

import sqlite3 def backup_database(source_db, backup_db): # Connect to the source database source_conn = sqlite3.connect(source_db) # Connect to the backup database backup_conn = sqlite3.connect(backup_db) with backup_conn: # Perform the backup source_conn.backup(backup_conn) # Close the connections source_conn.close() backup_conn.close() def restore_database(backup_db, target_db): # Connect to the backup database backup_conn = sqlite3.connect(backup_db) # Connect to the target database target_conn = sqlite3.connect(target_db) with target_conn: # Perform the restore backup_conn.backup(target_conn) # Close the connections backup_conn.close() target_conn.close() # Backup the database backup_database('original.db', 'backup.db') # Restore the database restore_database('backup.db', 'restored.db') # Verify the backup by connecting to the restored database conn = sqlite3.connect('restored.db') cursor = conn.cursor() cursor.execute("SELECT name FROM sqlite_master WHERE type='table';") tables = cursor.fetchall() print("Tables in restored database:", tables) conn.close()

Output:

Tables in restored database: [('your_table_name',)]

Testing and Documentation

Testing the Recovery Process

Regularly test the backup and recovery process to ensure that:

  • Backups are valid and can be restored without data loss.
  • The recovery time meets the business requirements (Recovery Time Objective - RTO).
  • The amount of data loss is acceptable (Recovery Point Objective - RPO).

Example Test Plan:

  1. Create a backup of the current database.
  2. Simulate a disaster by deleting the database file.
  3. Restore the database from the backup.
  4. Verify that the restored database is complete and functional.

Documentation and Training

  • Document all backup and recovery procedures, including scripts and commands.
  • Train relevant personnel on the disaster recovery plan and procedures.
  • Review and Update the disaster recovery plan regularly to accommodate changes in the database environment and business requirements.

Conclusion

Disaster recovery planning for SQLite involves a combination of regular backups, automated scripts, programmatic solutions, and thorough testing. By implementing these strategies, you can ensure that your SQLite databases are protected against data loss and can be quickly restored in case of a disaster. Proper documentation and regular training are also essential to ensure that the disaster recovery plan can be effectively executed when needed.


2.12 Normalization and Denormalization

Understanding the normalization process

Normalization is a process in relational database design that aims to reduce redundancy and dependency by organizing fields and table relationships. The goal is to ensure that the database is efficient, reliable, and scalable. The process involves dividing large tables into smaller, more manageable ones and defining relationships between them. The steps of normalization typically include several normal forms (NF), each with specific requirements.

Understanding Normalization

Normalization involves the following normal forms:

  1. First Normal Form (1NF): Ensures that each column contains atomic (indivisible) values, and each column contains only one type of data.
  2. Second Normal Form (2NF): Meets all requirements of 1NF and ensures that all non-key columns are fully functionally dependent on the primary key.
  3. Third Normal Form (3NF): Meets all requirements of 2NF and ensures that no transitive dependencies exist, meaning non-key columns are not dependent on other non-key columns.

Example of Normalization Process

Let's go through an example of normalization from an unnormalized table to 3NF in SQLite.

Step 1: Unnormalized Table

Consider the following unnormalized table representing a simplified version of a sales database:

OrderIDCustomerNameCustomerAddressProductIDProductNameQuantityPrice
1John Doe123 Main St101Widget A210
2Jane Smith456 Elm St102Widget B120
3John Doe123 Main St103Widget C315

Step 2: First Normal Form (1NF)

To convert this table to 1NF, ensure each column contains atomic values, and each column contains only one type of data. The table already satisfies 1NF in this case.

Step 3: Second Normal Form (2NF)

For 2NF, ensure that all non-key columns are fully functionally dependent on the primary key. The composite key here would be (OrderID, ProductID). However, the customer information is not fully dependent on the composite key. We need to split the table.

Orders Table:

OrderIDCustomerIDCustomerNameCustomerAddress
11John Doe123 Main St
22Jane Smith456 Elm St
31John Doe123 Main St

OrderDetails Table:

OrderIDProductIDProductNameQuantityPrice
1101Widget A210
2102Widget B120
3103Widget C315

Step 4: Third Normal Form (3NF)

For 3NF, remove transitive dependencies. In the Orders table, CustomerName and CustomerAddress are dependent on CustomerID, not OrderID. Similarly, ProductName is dependent on ProductID, not OrderID.

Customers Table:

CustomerIDCustomerNameCustomerAddress
1John Doe123 Main St
2Jane Smith456 Elm St

Products Table:

ProductIDProductName
101Widget A
102Widget B
103Widget C

Final Orders Table:

OrderIDCustomerID
11
22
31

Final OrderDetails Table:

OrderIDProductIDQuantityPrice
1101210
2102120
3103315

Implementing the Normalized Structure in SQLite

Here's how to implement these tables in SQLite:

-- Create the Customers table CREATE TABLE Customers ( CustomerID INTEGER PRIMARY KEY, CustomerName TEXT NOT NULL, CustomerAddress TEXT NOT NULL ); -- Create the Products table CREATE TABLE Products ( ProductID INTEGER PRIMARY KEY, ProductName TEXT NOT NULL ); -- Create the Orders table CREATE TABLE Orders ( OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID) ); -- Create the OrderDetails table CREATE TABLE OrderDetails ( OrderID INTEGER, ProductID INTEGER, Quantity INTEGER NOT NULL, Price REAL NOT NULL, PRIMARY KEY (OrderID, ProductID), FOREIGN KEY (OrderID) REFERENCES Orders (OrderID), FOREIGN KEY (ProductID) REFERENCES Products (ProductID) ); -- Insert data into Customers table INSERT INTO Customers (CustomerID, CustomerName, CustomerAddress) VALUES (1, 'John Doe', '123 Main St'), (2, 'Jane Smith', '456 Elm St'); -- Insert data into Products table INSERT INTO Products (ProductID, ProductName) VALUES (101, 'Widget A'), (102, 'Widget B'), (103, 'Widget C'); -- Insert data into Orders table INSERT INTO Orders (OrderID, CustomerID) VALUES (1, 1), (2, 2), (3, 1); -- Insert data into OrderDetails table INSERT INTO OrderDetails (OrderID, ProductID, Quantity, Price) VALUES (1, 101, 2, 10), (2, 102, 1, 20), (3, 103, 3, 15);

Querying the Normalized Data

To retrieve the data, you can use SQL joins. For example, to get the full details of an order:

SELECT Orders.OrderID, Customers.CustomerName, Customers.CustomerAddress, Products.ProductName, OrderDetails.Quantity, OrderDetails.Price FROM Orders JOIN Customers ON Orders.CustomerID = Customers.CustomerID JOIN OrderDetails ON Orders.OrderID = OrderDetails.OrderID JOIN Products ON OrderDetails.ProductID = Products.ProductID;

Output:

OrderIDCustomerNameCustomerAddressProductNameQuantityPrice
1John Doe123 Main StWidget A210
2Jane Smith456 Elm StWidget B120
3John Doe123 Main StWidget C315

Conclusion

Normalization is an essential process in database design that reduces redundancy and ensures data integrity. By breaking down an unnormalized table into multiple related tables and adhering to normal forms, we can achieve a well-structured and efficient database. The examples provided demonstrate how to normalize data from an unnormalized form to 3NF in SQLite and how to implement and query the normalized tables.

Normal forms: First normal form (1NF) to Boyce-Codd normal form (BCNF)

Normal forms in database normalization help organize data to reduce redundancy and improve data integrity. Here, we'll cover the progression from the First Normal Form (1NF) to the Boyce-Codd Normal Form (BCNF) using an example in SQLite.

First Normal Form (1NF)

A table is in 1NF if:

  1. All columns contain atomic (indivisible) values.
  2. Each column contains values of a single type.
  3. Each column contains unique values (no repeating groups or arrays).

Example:

Consider a table Students that stores student information including subjects they are taking:

StudentIDStudentNameSubjects
1AliceMath, Physics
2BobChemistry, Math
3CharliePhysics, Biology

To convert this to 1NF, we must ensure each field contains only atomic values:

StudentIDStudentNameSubject
1AliceMath
1AlicePhysics
2BobChemistry
2BobMath
3CharliePhysics
3CharlieBiology

SQLite commands to create the 1NF table:

CREATE TABLE Students ( StudentID INTEGER, StudentName TEXT, Subject TEXT, PRIMARY KEY (StudentID, Subject) ); INSERT INTO Students (StudentID, StudentName, Subject) VALUES (1, 'Alice', 'Math'), (1, 'Alice', 'Physics'), (2, 'Bob', 'Chemistry'), (2, 'Bob', 'Math'), (3, 'Charlie', 'Physics'), (3, 'Charlie', 'Biology');

Second Normal Form (2NF)

A table is in 2NF if:

  1. It is in 1NF.
  2. All non-key attributes are fully functional dependent on the primary key.

In our 1NF table, StudentName is only dependent on StudentID, not on the composite key (StudentID, Subject). We need to create a separate table for student details and another for their subjects.

Decomposed tables:

Students:

StudentIDStudentName
1Alice
2Bob
3Charlie

StudentSubjects:

StudentIDSubject
1Math
1Physics
2Chemistry
2Math
3Physics
3Biology

SQLite commands to create the 2NF tables:

CREATE TABLE Students ( StudentID INTEGER PRIMARY KEY, StudentName TEXT ); CREATE TABLE StudentSubjects ( StudentID INTEGER, Subject TEXT, PRIMARY KEY (StudentID, Subject), FOREIGN KEY (StudentID) REFERENCES Students(StudentID) ); INSERT INTO Students (StudentID, StudentName) VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie'); INSERT INTO StudentSubjects (StudentID, Subject) VALUES (1, 'Math'), (1, 'Physics'), (2, 'Chemistry'), (2, 'Math'), (3, 'Physics'), (3, 'Biology');

Third Normal Form (3NF)

A table is in 3NF if:

  1. It is in 2NF.
  2. All the attributes are functionally dependent only on the primary key.

Our tables are already in 3NF, as there are no transitive dependencies (i.e., no non-key attribute depends on another non-key attribute).

Boyce-Codd Normal Form (BCNF)

A table is in BCNF if:

  1. It is in 3NF.
  2. For every functional dependency (X -> Y), X is a super key.

Our current tables also satisfy BCNF conditions, as all functional dependencies have the left-hand side as a super key.

Summary

We started with a non-normalized table and progressively normalized it through 1NF, 2NF, 3NF, and finally BCNF. At each step, we ensured data was structured to minimize redundancy and dependency issues. Here's the final structure in SQLite:

Students:

CREATE TABLE Students ( StudentID INTEGER PRIMARY KEY, StudentName TEXT );

StudentSubjects:

CREATE TABLE StudentSubjects ( StudentID INTEGER, Subject TEXT, PRIMARY KEY (StudentID, Subject), FOREIGN KEY (StudentID) REFERENCES Students(StudentID) );

Data insertion:

INSERT INTO Students (StudentID, StudentName) VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie'); INSERT INTO StudentSubjects (StudentID, Subject) VALUES (1, 'Math'), (1, 'Physics'), (2, 'Chemistry'), (2, 'Math'), (3, 'Physics'), (3, 'Biology');

These steps ensure our database is in BCNF, optimizing it for both performance and data integrity.

Denormalization and its use cases

Denormalization is the process of combining tables in a database to reduce the complexity of queries and improve read performance. It involves introducing redundancy by merging related tables, which can make certain operations more efficient at the cost of potential data anomalies.

Use Cases for Denormalization

  1. Read Performance Improvement: When read-heavy applications require fast query performance, denormalization can reduce the need for complex joins, thus speeding up data retrieval.
  2. Simplified Queries: In scenarios where query simplicity is essential, denormalized tables reduce the need for multi-table joins.
  3. Data Warehousing: In OLAP (Online Analytical Processing) systems where large volumes of data are analyzed, denormalization helps in optimizing read operations.
  4. Reporting: For generating reports where data is read frequently but seldom updated, denormalization helps in reducing query complexity and improving performance.

Example of Denormalization in SQLite

Let's consider a normalized database with Orders and OrderDetails tables. We'll denormalize these tables to create a single Orders table.

Normalized Tables

Orders:

OrderIDOrderDateCustomerID
12024-06-01101
22024-06-02102

OrderDetails:

OrderIDProductIDQuantityPrice
1201210.0
1202120.0
220355.0

Creating Normalized Tables in SQLite:

CREATE TABLE Orders ( OrderID INTEGER PRIMARY KEY, OrderDate TEXT, CustomerID INTEGER ); CREATE TABLE OrderDetails ( OrderID INTEGER, ProductID INTEGER, Quantity INTEGER, Price REAL, PRIMARY KEY (OrderID, ProductID), FOREIGN KEY (OrderID) REFERENCES Orders(OrderID) ); INSERT INTO Orders (OrderID, OrderDate, CustomerID) VALUES (1, '2024-06-01', 101), (2, '2024-06-02', 102); INSERT INTO OrderDetails (OrderID, ProductID, Quantity, Price) VALUES (1, 201, 2, 10.0), (1, 202, 1, 20.0), (2, 203, 5, 5.0);

Denormalized Table

We combine Orders and OrderDetails into a single Orders table:

OrderIDOrderDateCustomerIDProductIDQuantityPrice
12024-06-01101201210.0
12024-06-01101202120.0
22024-06-0210220355.0

Creating Denormalized Table in SQLite:

CREATE TABLE Orders ( OrderID INTEGER, OrderDate TEXT, CustomerID INTEGER, ProductID INTEGER, Quantity INTEGER, Price REAL, PRIMARY KEY (OrderID, ProductID) ); INSERT INTO Orders (OrderID, OrderDate, CustomerID, ProductID, Quantity, Price) VALUES (1, '2024-06-01', 101, 201, 2, 10.0), (1, '2024-06-01', 101, 202, 1, 20.0), (2, '2024-06-02', 102, 203, 5, 5.0);

Advantages of Denormalization

  1. Improved Query Performance: Fewer joins mean faster read operations, which is crucial for read-heavy applications.
  2. Simpler Queries: Queries become less complex, reducing the overhead of writing and maintaining complex SQL.
  3. Reduced Join Operations: Eliminating the need for multiple joins can significantly speed up query execution.

Disadvantages of Denormalization

  1. Data Redundancy: Redundancy can lead to inconsistencies and anomalies during data updates.
  2. Increased Storage: More storage space is required due to the duplication of data.
  3. Maintenance Complexity: Ensuring data integrity and consistency becomes more challenging.

Summary

Denormalization in SQLite involves merging related tables to optimize read operations and simplify queries at the expense of introducing redundancy and potential data anomalies. It is particularly useful in scenarios where read performance is critical, such as in data warehousing and reporting.

By denormalizing the Orders and OrderDetails tables, we demonstrated how to create a single table that simplifies queries and improves read performance. While denormalization offers significant benefits, it is essential to weigh these against the potential drawbacks to determine its suitability for a given application.

1. Introduction to Databases
3. NoSQL Databases