6-1 Project One: Creating A Database And Querying Data: Exact Answer & Steps

Ever tried to explain a database to a roommate who thinks “SQL” is a new streaming service? On the flip side, you end up drawing tables on napkins, and suddenly the whole night feels like a crash‑course in data science. That said, that’s the vibe of the 6‑1 Project One – it’s the first hands‑on step where you actually create a database and then query it. No fluff, just the stuff that makes the difference between “it works” and “I’m stuck forever”.

What Is the 6‑1 Project One

In plain English, the 6‑1 Project One is a beginner‑level assignment that asks you to spin up a tiny relational database from scratch and pull out information with SQL queries. Think of it as building a LEGO house (the tables) and then deciding which rooms you want to peek into (the SELECT statements).

This changes depending on context. Keep that in mind.

You’ll usually work with a simple schema – maybe a students table, a courses table, and a junction table called enrollments. The goal isn’t to model a Fortune 500 enterprise; it’s to get comfortable with:

Defining tables and their columns
Setting primary keys and foreign keys
Inserting a handful of rows for testing
Writing basic SELECT, WHERE, JOIN, and aggregate queries

All of that happens inside a lightweight DBMS like SQLite, MySQL, or PostgreSQL. In most classrooms the choice is SQLite because you can drop a single file into a folder and you’re good to go.

The Core Pieces

Database file – the physical container (e.g., school.db).
Schema – the blueprint: tables, columns, data types, constraints.
Data – the rows you’ll actually query.
SQL – the language that talks to the database.

If you’ve ever used Excel, you’ve already done a tiny version of this: columns, rows, and formulas. The difference is that SQL lets you ask the same question across many tables without copying data around Still holds up..

Why It Matters / Why People Care

Why bother with a project that feels like “just another homework assignment”? Because the moment you can create a database and query it, you’ve unlocked a skill that powers everything from web apps to data analytics pipelines But it adds up..

Real‑world relevance – Companies store customer info, inventory, logs… all in relational databases. If you can write a SELECT that pulls the right rows, you’re already speaking the language of data engineers.
Problem‑solving muscle – Figuring out why a JOIN returns duplicate rows is a mental workout that sharpens logical thinking.
Foundation for the next steps – Once you’ve mastered the basics, you can move on to ORMs, migrations, and even big‑data tools that still rely on SQL under the hood.

In practice, the ability to query data quickly translates to faster decisions. Imagine a manager asking, “How many users signed up last month?” If you can write that query on the fly, you’re the go‑to person for answers, not the person who has to ask IT to run a report Simple, but easy to overlook..

How It Works (or How to Do It)

Below is the step‑by‑step roadmap most instructors expect for the 6‑1 Project One. Feel free to adapt the names and data to something you care about – a music collection, a recipe box, anything that makes the tables feel personal Practical, not theoretical..

1. Set Up Your Environment

Install a DBMS – SQLite is the easiest (sqlite3 command works on macOS/Linux, Windows has a handy executable).
Choose a client – The command line works fine, but GUI tools like DB Browser for SQLite or DBeaver make visualizing tables painless.
Create a new database file – sqlite3 school.db will open a prompt and create the file if it doesn’t exist.

2. Define the Schema

Write a CREATE TABLE script. Here’s a minimal example:

CREATE TABLE students (
    student_id   INTEGER PRIMARY KEY,
    first_name   TEXT NOT NULL,
    last_name    TEXT NOT NULL,
    enrollment_date DATE
);

CREATE TABLE courses (
    course_id    INTEGER PRIMARY KEY,
    course_name  TEXT NOT NULL,
    credits      INTEGER
);

CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id    INTEGER,
    course_id     INTEGER,
    grade         TEXT,
    FOREIGN KEY(student_id) REFERENCES students(student_id),
    FOREIGN KEY(course_id)  REFERENCES courses(course_id)
);

A few things to notice:

Primary keys guarantee each row is unique.
Foreign keys enforce relationships – you can’t enroll a non‑existent student.
Data types (INTEGER, TEXT, DATE) keep the DB tidy.

3. Insert Sample Data

Populate each table with a handful of rows. You don’t need hundreds; ten students, five courses, and a dozen enrollments are plenty.

INSERT INTO students (first_name, last_name, enrollment_date) VALUES
('Alice', 'Miller', '2023-08-15'),
('Bob',   'Smith',  '2022-09-01'),
('Cara',  'Lee',    '2023-01-20');

INSERT INTO courses (course_name, credits) VALUES
('Intro to SQL', 3),
('Data Structures', 4),
('Web Development', 3);

INSERT INTO enrollments (student_id, course_id, grade) VALUES
(1, 1, 'A'), (1, 2, 'B+'), (2, 1, 'B'), (3, 3, 'A-');

If you’re using a GUI, you can paste these statements into a “Run SQL” window and hit Execute Simple as that..

4. Basic SELECT Queries

Start with the simplest: pull everything from a table Small thing, real impact..

SELECT * FROM students;

That returns all columns for every student. In practice you’ll want to limit columns:

SELECT first_name, last_name FROM students;

5. Filtering with WHERE

Add conditions to narrow the result set And that's really what it comes down to..

SELECT * FROM students
WHERE enrollment_date >= '2023-01-01';

That shows only students who joined this year. You can combine conditions with AND / OR Not complicated — just consistent..

SELECT * FROM courses
WHERE credits >= 3 AND course_name LIKE '%SQL%';

6. Joining Tables

The real power shows up when you combine data from multiple tables. Let’s list each student with the courses they’re taking.

SELECT s.first_name, s.last_name, c.course_name, e.grade
FROM enrollments e
JOIN students    s ON e.student_id = s.student_id
JOIN courses     c ON e.course_id  = c.course_id;

Notice the three‑table join. The ON clauses tie the foreign keys to their parents. If you forget a join, you’ll either get a Cartesian product (every row paired with every other) or a “no such column” error Not complicated — just consistent..

7. Aggregates and GROUP BY

Want to know how many students are in each course? Use COUNT and GROUP BY.

SELECT c.course_name, COUNT(e.student_id) AS num_students
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
GROUP BY c.course_name;

The LEFT JOIN ensures courses with zero enrollments still appear with a count of zero.

8. Ordering and Limiting

Sort results and grab only the top few That's the part that actually makes a difference..

SELECT first_name, last_name
FROM students
ORDER BY enrollment_date DESC
LIMIT 5;

That shows the five most recent enrollees.

9. Updating and Deleting

You’ll also need to modify data.

UPDATE enrollments
SET grade = 'A'
WHERE enrollment_id = 2;

And to clean up:

DELETE FROM enrollments
WHERE enrollment_id = 4;

Always double‑check the WHERE clause; a missing condition wipes the whole table.

10. Saving and Exporting

When you’re done, you can dump the whole DB to a SQL script:

sqlite3 school.db .dump > school_backup.sql

That file is handy for grading or sharing with classmates.

Common Mistakes / What Most People Get Wrong

Forgetting foreign key constraints – It’s tempting to skip FOREIGN KEY lines because they feel optional. Without them you can enroll a phantom student, and later your JOINs return nonsense.
Using SELECT * everywhere – Works, but it’s a performance trap. Pulling every column forces the DB to read more data than needed, especially as tables grow.
Mixing data types – Inserting a string into an INTEGER column usually throws an error, but some DBMS silently coerce it, leading to subtle bugs later.
Cartesion products – A missing ON clause in a JOIN creates a massive result set that looks like the DB exploded. Always double‑check your join conditions.
Not aliasing tables – When you have three tables, writing students.student_id every time becomes a pain. Aliases (s, c, e) keep the query readable.
Ignoring NULL handling – COUNT(column) ignores NULLs, while COUNT(*) counts rows. If you’re tallying grades, a missing grade will skew your totals unless you account for it.

Spotting these pitfalls early saves you hours of debugging later The details matter here..

Practical Tips / What Actually Works

Sketch the schema first – Grab a pen, draw boxes for each table, and draw arrows for foreign keys. Visualizing relationships prevents a lot of “I can’t join these tables” moments.
Use meaningful column names – student_id is clearer than just id. It prevents confusion when you have multiple tables with their own IDs And it works..
Test queries incrementally – Build a JOIN step by step. Start with SELECT * FROM students; then add the first JOIN, run it, verify, then add the next That alone is useful..
take advantage of the SQLite .schema command – Type .schema at the prompt to see the exact CREATE statements. Great for double‑checking column names.
Wrap data inserts in a transaction – It speeds up bulk inserts dramatically It's one of those things that adds up..
```
BEGIN TRANSACTION;
INSERT INTO ...;
INSERT INTO ...;
COMMIT;
```
Use EXPLAIN QUERY PLAN – Even in a tiny DB, it shows you whether the engine is using indexes (if you created any) No workaround needed..
Keep a backup – One accidental DELETE FROM students; and you’re scrambling. A simple copy of the .db file is a lifesaver.
Comment your SQL – -- Get all students enrolled in Intro to SQL makes the script readable for you and anyone grading it.

FAQ

Q: Do I have to use SQLite?
A: No. MySQL, PostgreSQL, or even Microsoft SQL Server work the same way for the core concepts. SQLite is just the simplest to set up for a one‑off project.

Q: What if my DBMS doesn’t enforce foreign keys?
A: Some SQLite builds have foreign key enforcement off by default. Run PRAGMA foreign_keys = ON; at the start of your session to enable it Easy to understand, harder to ignore. Simple as that..

Q: How many rows should I insert for the project?
A: Aim for 5‑10 rows per table. Enough to demonstrate joins and aggregates, but not so many that you spend all your time typing data The details matter here..

Q: Can I use Python or another language to run the queries?
A: Absolutely. Libraries like sqlite3 (Python) or node-sqlite3 (Node.js) let you execute the same SQL statements from code, which is a nice extension if you want to build a tiny app around the database.

Q: What’s the difference between INNER JOIN and LEFT JOIN?
A: INNER JOIN returns only rows with matching keys in both tables. LEFT JOIN returns all rows from the left table, filling missing matches with NULLs. Use LEFT JOIN when you need to show “no enrollment” rows.

That’s the whole ride: create a database, sprinkle in some data, and start pulling answers with SQL. ” And that shift? Day to day, ” instead of “I have no idea where the data lives. In real terms, once you’ve nailed the 6‑1 Project One, you’ll find yourself asking “how do I get that info? That’s exactly why the project matters. Happy querying!

Going Beyond the Basics

Once you’ve mastered the six tables and the core queries, you’ll naturally start wondering how to make the data more useful. Below are a few ideas that can stretch the project without overcomplicating it And that's really what it comes down to..

1. Add a grades table

If you want to practice aggregates, create a grades table that links student_id, course_id, and a numeric score. Then experiment with AVG, MAX, and GROUP BY to see who’s topping the class.

CREATE TABLE grades (
    grade_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    student_id INTEGER NOT NULL,
    course_id  INTEGER NOT NULL,
    score      INTEGER NOT NULL,
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id)  REFERENCES courses(course_id)
);

2. Use window functions

SQLite now supports window functions. A simple example: rank students by GPA within each major And that's really what it comes down to. Which is the point..

SELECT
    s.student_name,
    m.major_name,
    g.gpa,
    RANK() OVER (PARTITION BY m.major_id ORDER BY g.gpa DESC) AS rank_within_major
FROM students s
JOIN majors m   ON s.major_id   = m.major_id
JOIN grades g   ON s.student_id = g.student_id
GROUP BY s.student_id, m.major_id, g.gpa;

3. Build a small report

Export a CSV that lists every student, their major, the courses they’re enrolled in, and the instructor for each course. Use SELECT ... INTO OUTFILE in MySQL or sqlite3’s .output command in SQLite.

SELECT
    s.student_name,
    m.major_name,
    c.course_name,
    i.instructor_name
FROM students s
JOIN majors m      ON s.major_id = m.major_id
JOIN enrollments e ON s.student_id = e.student_id
JOIN courses c     ON e.course_id = c.course_id
JOIN instructors i ON c.instructor_id = i.instructor_id
ORDER BY s.student_name, c.course_name;

Common Pitfalls and How to Avoid Them

Pitfall	Why it Happens	Fix
Misspelled column names	Quick typing errors	Use `.schema` or your IDE’s autocomplete
Forgot to enable foreign keys	SQLite turns them off by default	`PRAGMA foreign_keys = ON;`
Unindexed lookups	Slow joins on big tables	Add `CREATE INDEX idx_course_id ON enrollments(course_id);`
Data type mismatches	Mixing integers with strings	Keep consistent types (`INTEGER` vs `TEXT`)
Accidental data loss	Blind `DELETE` or `DROP`	Keep versioned backups or use version control for `.sql` scripts

Wrapping It All Together

You started with a blank .db file, defined six relational tables, loaded a handful of rows, and learned how to join, filter, aggregate, and even protect your data. These are the core skills every budding data professional needs:

Design – Think about how tables relate and enforce constraints.
Populate – Insert realistic data, using transactions for speed.
Query – Pull meaningful insights with SELECT, JOIN, GROUP BY, and window functions.
Diagnose – Use EXPLAIN, indexing, and schema reviews to troubleshoot.

The 6‑1 Project One is more than a homework assignment; it’s a micro‑ecosystem where you practice the entire data‑life cycle. When you finish, you’ll not only have a working database but also a mental framework for approaching any new schema you encounter.

So fire up your DBMS, type those CREATE TABLE statements, and let the data flow. Even so, the next time a professor asks you, “What does the database tell us about the students who are taking both Calculus and Physics? ” you’ll answer with a crisp query, be proud of your schema, and feel the confidence that comes from mastering the fundamentals.

Happy querying!

Putting It All Together in a Script

If you’re working on an assignment or a small demo, you’ll often want to bundle everything into a single, repeatable script. Here's the thing — sql. Still, it covers schema creation, data loading, and a couple of sample queries. Worth adding: db < setup_and_query. That said, below is a fully‑self‑contained SQLite example that you can drop into a file called setup_and_query. On top of that, sqland run withsqlite3 school. Feel free to copy, paste, and tweak it for your own project.

-- 1️⃣  Create tables
DROP TABLE IF EXISTS enrollments;
DROP TABLE IF EXISTS students;
DROP TABLE IF EXISTS majors;
DROP TABLE IF EXISTS courses;
DROP TABLE IF EXISTS instructors;
DROP TABLE IF EXISTS grades;

CREATE TABLE majors (
    major_id   INTEGER PRIMARY KEY,
    major_name TEXT NOT NULL UNIQUE
);

CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL,
    major_id INTEGER NOT NULL,
    FOREIGN KEY (major_id) REFERENCES majors(major_id)
);

CREATE TABLE instructors (
    instructor_id INTEGER PRIMARY KEY,
    instructor_name TEXT NOT NULL
);

CREATE TABLE courses (
    course_id INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL,
    instructor_id INTEGER NOT NULL,
    FOREIGN KEY (instructor_id) REFERENCES instructors(instructor_id)
);

CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id INTEGER NOT NULL,
    course_id INTEGER NOT NULL,
    semester TEXT NOT NULL,
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id)  REFERENCES courses(course_id)
);

CREATE TABLE grades (
    grade_id INTEGER PRIMARY KEY,
    enrollment_id INTEGER NOT NULL,
    grade TEXT NOT NULL,
    FOREIGN KEY (enrollment_id) REFERENCES enrollments(enrollment_id)
);

-- 2️⃣  Insert data
INSERT INTO majors (major_id, major_name) VALUES
    (1,'Computer Science'), (2,'Mathematics'), (3,'Physics'), (4,'Chemistry');

INSERT INTO students (student_id, student_name, major_id) VALUES
    (1,'Alice',1), (2,'Bob',2), (3,'Charlie',3), (4,'Diana',4),
    (5,'Ethan',1), (6,'Fiona',2), (7,'George',3), (8,'Hannah',4);

INSERT INTO instructors (instructor_id, instructor_name) VALUES
    (1,'Prof. Newton'), (2,'Prof. Day to day, euler'), (3,'Prof. Curie'), (4,'Prof. 

INSERT INTO courses (course_id, course_name, instructor_id) VALUES
    (1,'Calculus I',2), (2,'Linear Algebra',2), (3,'Quantum Mechanics',1),
    (4,'Organic Chemistry',3), (5,'Intro to CS',4), (6,'Data Structures',4);

INSERT INTO enrollments (enrollment_id, student_id, course_id, semester) VALUES
    (1,1,5,'Fall 2023'), (2,1,6,'Fall 2023'), (3,2,1,'Fall 2023'),
    (4,2,2,'Fall 2023'), (5,3,3,'Fall 2023'), (6,3,1,'Fall 2023'),
    (7,4,4,'Fall 2023'), (8,5,5,'Fall 2023'), (9,6,1,'Fall 2023'),
    (10,7,3,'Fall 2023'), (11,8,4,'Fall 2023');

INSERT INTO grades (grade_id, enrollment_id, grade) VALUES
    (1,1,'A'), (2,2,'B+'), (3,3,'A-'), (4,4,'B'), (5,5,'A'),
    (6,6,'B+'), (7,7,'A-'), (8,8,'B'), (9,9,'C+'), (10,10,'A'),
    (11,11,'B-');

-- 3️⃣  Sample queries

-- a) Students who are taking both Calculus I and Quantum Mechanics
SELECT s.student_name
FROM students s
JOIN enrollments e1 ON s.student_id = e1.student_id
JOIN courses c1 ON e1.course_id = c1.course_id
JOIN enrollments e2 ON s.student_id = e2.student_id
JOIN courses c2 ON e2.course_id = c2.course_id
WHERE c1.course_name = 'Calculus I'
  AND c2.course_name = 'Quantum Mechanics';

-- b) Average GPA per major (assuming grade points: A=4, A-=3.7, B+=3.3, B=3, B-=2.7, C+=2.3, C=2, etc.)
WITH grade_points AS (
    SELECT g.grade_id, g.enrollment_id,
           CASE g.grade
               WHEN 'A'  THEN 4.0
               WHEN 'A-' THEN 3.7
               WHEN 'B+' THEN 3.3
               WHEN 'B'  THEN 3.0
               WHEN 'B-' THEN 2.7
               WHEN 'C+' THEN 2.3
               WHEN 'C'  THEN 2.0
               ELSE 0 END AS points
    FROM grades g
)
SELECT m.major_name,
       ROUND(AVG(gp.points),2) AS avg_gpa,
       COUNT(gp.grade_id) AS courses_taken
FROM majors m
JOIN students s ON s.major_id = m.major_id
JOIN enrollments e ON e.student_id = s.student_id
JOIN grade_points gp ON gp.enrollment_id = e.enrollment_id
GROUP BY m.major_id, m.major_name
ORDER BY avg_gpa DESC;

Running this script will give you a clean, reproducible database and a couple of ready‑to‑use queries that demonstrate the power of relational modeling Most people skip this — try not to. Surprisingly effective..

Final Thoughts

You’ve just walked through the entire life cycle of a small academic database:

Conceptual design – deciding which entities exist and how they connect.
Physical implementation – writing CREATE TABLE statements, choosing keys, and enforcing integrity.
Data population – inserting realistic rows, batching for performance.
Exploratory querying – extracting insights, filtering, joining, and aggregating.
Maintenance – indexing, debugging, and avoiding common pitfalls.

In practice, the same pattern scales from a handful of tables to thousands. The skills you’re building—schema thinking, SQL fluency, and a disciplined approach to data—are the foundation for any data‑driven career, whether you end up in analytics, business intelligence, or software engineering And it works..

So go ahead, run that script, tweak the queries, and experiment with new tables or more complex joins. Every time you write a new SELECT, you’re reinforcing the mental model that lets you see relationships, spot anomalies, and answer the questions that matter. Happy querying!