Programming:SQL PDF Print E-mail

Programming:SQL

From Wikibooks

SQL, which is an acronym for Structured Query Language is a method for specifying the means of either requesting data from a database, of updating data in a database, of adding data to a database, or of manipulating metadata about the database. Getting a database to work with

You will need a text editor to write your SQL scripts and you will need a database to work with. The Postgresql and MySQL databases are free, open-source databases that you can use while studying this text. Both run on a number of operating systems, including Linux, Mac OS/X, and Windows.

todo: test all code samples with any databases we recommend people use.


Introduction to SQL

SQL is a high-level specification for a set of instructions to a database to perform an action on one or more tables in the database, data in one or more tables, or both. The specification is designed to be what is called non-programmable in that the instructions given are generally applicable to the entire database or table as opposed to a single table in the database, single record in a table or single field in a record as a typical programming language would look at data.

The instructions are given in the form of commands, consisting of a specific SQL command and additional parameters and operands that apply to that command.

The typical commands used in SQL are:
{moscode} SELECT - Used to retrieve certain records from one or more tables.
INSERT - Used to create a record.
UPDATE - Used to change certain records.
DELETE - Used to delete certain records.
CREATE - Used to create a new table, a view of a table, or other object in database.
ALTER - Used to modify an existing database object, such as a table.
DROP - Used to delete an entire table, a view of a table or other object in the database.
GRANT - Used to give a privilege to someone.
REVOKE - Used to take back privileges granted to someone.
{/moscode}
Additional commands may be available, and some databases support stored procedures which allow what are essentially programs written using SQL or a special programming language designed for use with SQL commands, to manipulate data or tables on the server without necessarily returning data from the server.

Each one of these commands has various modifiers, both based upon standard SQL command rules, and certain extensions to SQL that each database provider implements.
[edit]

Relational Databases


Tables, rows and columns

The easiest and fun way I find to think about database structure is to compare it to a spreadsheet.

Think of the entire spreadsheet as a representation of the database. A sheet in a spreadsheet is the same as a table in a database. Just as each sheet in a spreadsheet has columns and rows, so does a table.

Columns in database tables are the type of each bit of data contained in a row.

Think of a row of data like a sentence. A sentence has different words of different types (nouns, adjectives, etc) likewise a column can be a different type. Different column types include (but are not limited to) text, varchar (that is a variable number of characters - so VARCHAR(20) would be able to contain up to 20 ASCII values), date or blob (large binary data).


Creating Tables

Example code for a table that describes the produce section of the grocery store:

create table fruit (
    fruit_name text,
    price_per_pound numeric,
    quantity_available numeric
);

This would produce an empty table that has field names of fruit_name, price_per_pound and quantity_available. Their type is also specified and any data entered must be of that type.


Retrieving Data

Select column name (,column name,...)

From table name (,table name,...)

Where condition

(and condition ...);

For example if a table named pets has a field id and a field name,
SELECT * FROM pets;
would print out the entire table.
SELECT name FROM pets;
would print out only the names. and
SELECT name FROM pets WHERE id=1;
would print out only the name of the first pet in the table.


Adding Data

Extending the fruit store metaphor:

insert into fruit ( fruit_name, price_per_pound, quantity ) values ('apples', 0.90, 400);
insert into fruit ( fruit_name, price_per_pound, quantity ) values ('bananas', 0.59, 800);

select * from fruit;

 fruit_name | price_per_pound | quantity
------------+-----------------+----------
 bananas    |            0.59 |      800
 apples     |            0.90 |      400
(2 rows)


Foreign Keys

In Postgresql (and possibly other databases) you can require that values in one column match values in a column in another table. For example you have a table that tracks all the dog owners in the neighborhood:

owner            dog name      dog breed
Joe Smith        Rex           mastiff
Rosa Gomez       Gato          chihuahua
Smitty Jones     Spot          mastiff

And let's pretend you have another table that describes the attributes of different dog breeds:

dog breed       average size       poop description
mastiff         enormous           horrible; will kill most grasses and flowers.
chihuahuah      tiny               surprisingly pungent


Ok, so you want to make sure that when you add new neighbors to the list, you only add valid dog breeds off your dog breed table. You can use the following sql code to enforce that:

create table dog_types (
    dog_breed text,
    average_size text,
    poop_description text
);

create table dog_owners (
    owners text,
    dog_name text,
    dog_breed text references dog_types
);


This is a foreign key in SQL jargon. There are much more interesting applications than this.


Exercise

Write a set of tables that a teacher can use to create an online multiple-choice testing program. One table will hold questions and the answer_id of the correct answer. Another table will hold a list of answers. A third table will hold the available answers for each question.

For example, imagine two questions exist in the question table:

1. what is 6 * 30?
2. who was the first US president?

And imagine the following four answers exist in the answer table:

1. 45
2. George Washington
3. Thomas Jefferson
4. 180

Now create a third table that lists answers one and four as possible answers for question 1 and answers 2 and 3 for question 2.

Got it?


Exercise solution

This is code from a project I wrote, and it needs to be tweaked for this textbook, but it illustrates a compound foreign key.

/*

This program illustrates one use of a compound foreign key constraint.  I
needed a way to track answers to multiple-choice questions, and to limit
possible answers to each question.  For example, I want to make sure that if I
ask the question "what is your favorite color?" then the answer is a color,
and not something silly like "yes."

*/

/* this table holds each question. */
create table questions (
    q_id serial primary key,
    q_text text
);

/* here are some questions.  Since I used the serial data type, I'm not going
 * to assign the q_id column.  Postgresql will assign the q_id automatically.
 */
insert into questions (q_text) values ('What is your favorite color?'); /* q_id 1 */
insert into questions (q_text) values ('Is it raining outside?');       /* q_id 2 */

/* after inserting, this is what the database looks like now:

matt=> select * from questions;
 q_id |            q_text            
------+------------------------------
    1 | What is your favorite color?
    2 | Is it raining outside?
(2 rows)

*/


/* this table lists every possible answer, each with a unique id. */
create table all_answers (
   a_id serial primary key,
   a_text text
);

/* these are some answers. */
insert into all_answers (a_text) values ('red');      /* a_id 1 */
insert into all_answers (a_text) values ('yes');      /* a_id 2 */
insert into all_answers (a_text) values ('green');    /* a_id 3 */
insert into all_answers (a_text) values ('no');       /* a_id 4 */

/* here's what all_answers looks like after adding data:

matt=> select * from all_answers;
 a_id | a_text
------+--------
    1 | red
    2 | yes
    3 | green
    4 | no
(4 rows)

*/insert into fruit ( fruit_name, price_per_pound, quantity ) values ('apples', 0.90, 400);

/* this table links each question to meaningful possible answers. I am using
 * question ID (q_id) and answer ID (a_id) numbers in order to save space.
 */
create table possible_answers (
    q_id integer references questions (q_id),
    a_id integer references all_answers (a_id),
    primary key (q_id, a_id)
);

/* now, I'll link certain questions with certain answers. */
insert into possible_answers (q_id, a_id) values (1, 1);
/* this statement linked 'What is your favorite color?' with 'red'. */

insert into possible_answers (q_id, a_id) values (1, 3);
insert into possible_answers (q_id, a_id) values (2, 2);
insert into possible_answers (q_id, a_id) values (2, 4);
/* this statement linked 'Is it raining outside?' with 'no'. */

/* And, just to continue the trend...

matt=> select * from possible_answers;
 q_id | a_id
------+------
    1 |    1
    1 |    3
    2 |    2
    2 |    4
(4 rows)

*/

/*finally, this is the table that will record the actual answer given for a
 * question and make sure that the answer is appropriate for the question.
 */

create table real_answers (
   q_id integer references questions (q_id),
   a_id integer references all_answers (a_id),
   foreign key (q_id, a_id) references possible_answers (q_id, a_id)
);

/* now watch what happens when I try to insert an answer that doesn't match
 * the question into the database.  I'm going to try to answer the question:

"quot;What is your favorite color?"quot;

with the answer

"yes"

Hopefully the database will prevent me.


matt=> select q_text from questions where q_id = 1;
            q_text            
------------------------------
What is your favorite color?
(1 row)

matt=> select a_text from all_answers where a_id = 2;
a_text
--------
 yes
(1 row)

matt=> insert into real_answers (q_id, a_id) values (1, 2);
ERROR:  $3 referential integrity violation - key referenced from real_answers
not found in possible_answers

Hurray!  We are prevented from entering the data!  Now, let's try storing an
acceptable answer.

matt=> insert into real_answers (q_id, a_id) values (1,1);
INSERT 17196 1

That means it worked!

*/