Relational Algebra: A Foundation for Database Queries
Learn about relational algebra, a procedural query language used in database management systems (DBMS). Understand the fundamental operations of relational algebra, including select, project, join, union, intersection, difference, and Cartesian product, and how they are used to manipulate and retrieve data from relational databases.
DBMS - Relational Algebra
Relational database systems utilize query languages to allow users to interact with and query database instances. The two main types of query languages are relational algebra and relational calculus.
Relational Algebra
Relational algebra is a procedural query language that processes relations as inputs and returns relations as outputs. It operates through a set of fundamental operators, which can be either unary or binary, taking relations as inputs and providing relations as outputs. Relational algebra is performed recursively, treating intermediate results as relations as well.
Fundamental Operations of Relational Algebra
- Select
- Project
- Union
- Set Difference
- Cartesian Product
- Rename
Select Operation (σ)
The Select operation retrieves tuples that satisfy a specific condition.
Notation: σp(r)
Where σ
stands for selection, p represents the condition or predicate, and r is the relation. The predicate may include operators like =, ≠, ≥, <, >, ≤, as well as logical connectors like AND, OR, and NOT.
Examples:
- σsubject = "database"(Books) – Selects tuples where subject is "database".
- σsubject = "database" AND price = 450(Books) – Selects tuples where subject is "database" and price is 450.
- σsubject = "database" AND price = 450 OR year > 2010(Books) – Selects tuples where the subject is "database" and price is 450 or those published after 2010.
Project Operation (∏)
The Project operation retrieves specified columns from a relation.
Notation: ∏A1, A2, ..., An(r)
Where A1, A2, ..., An are attribute names from relation r. Duplicate rows are automatically eliminated, as the result is treated as a set.
Example:
- ∏subject, author(Books) – Projects columns "subject" and "author" from the Books relation.
Union Operation (∪)
Performs a binary union between two relations.
Notation: r ∪ s
Where r and s are relations or result sets. For the union to be valid, both relations must have the same number of attributes with compatible domains. Duplicate tuples are eliminated.
Example:
- ∏author(Books) ∪ ∏author(Articles) – Projects authors who have either written a book or an article, or both.
Set Difference (−)
The Set Difference operation finds tuples in one relation that are not present in the second relation.
Notation: r − s
Example:
- ∏author(Books) − ∏author(Articles) – Retrieves authors who have written books but not articles.
Cartesian Product (Χ)
The Cartesian Product operation combines information from two different relations into one.
Notation: r Χ s
Example:
- σauthor = 'tutorialsarena'(Books Χ Articles) – Shows all books and articles by the author "tutorialsarena".
Rename Operation (ρ)
The Rename operation assigns a name to the output of an expression. It is denoted by the Greek letter rho (ρ).
Notation: ρx(E)
Where x is the new name for the result of expression E.
Additional Operations
- Set Intersection
- Assignment
- Natural Join
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language that specifies what to retrieve without detailing how to retrieve it.
Tuple Relational Calculus (TRC)
TRC filters variables over tuples.
Notation: { T | Condition }
Example:
- { T.name | Author(T) AND T.article = 'database' } – Retrieves tuples with "name" from Author who has written about "database".
Domain Relational Calculus (DRC)
DRC filters variables over the domains of attributes.
Notation: { a1, a2, ..., an | P(a1, a2, ..., an) }
Example:
- {< article, page, subject > | ∈ tutorialsarena ∧ subject = 'database' } – Retrieves "article", "page", and "subject" where the subject is "database".