Referential Integrity

October 26, 2022 | by Arround The Web | No comments

Referential Integrity is discussed in this tutorial in the relational database context. Referential integrity is a data property, stating that all its references are valid. Two tables are concerned here. One is called the referenced or parent table. This referenced table has a primary key, which may be made up of one or more columns. The other table has a foreign key whose values are the same as those of the primary key of the referenced table. The other table is referred to as a child table to the parent table.

The number of columns that make up the foreign key in the child table is the same as those that make up the primary key in the referenced (parent) table. All the foreign key values are found in the column of the primary key. However, the values in the foreign key column may repeat in the child table.

The child table has its own primary key that is different from the foreign key column that it has. The child table is also called the Referencing Table.

“all its references are valid” means that a foreign key in the child table points to one row in the referenced table. That foreign key in the child table and the primary key in the referenced (parent) table are the same value. More than one foreign key in the child table may point to just one row in the referenced table.

This article explains the constraints that are naturally associated with the foreign key of the child table and the same primary key of the referenced table. If the constraints are respected, then the data (corresponding rows) in both tables form integrity, with no errors, everything being equal; hence the term Referential Integrity.”

Illustration With Two Tables

The table notation of the two tables can be:

Sales(saleID, dateSold, customerID, employee)

and

Customers(customerID, customerName, phone, address, city, region, postalCode, country)

customerID in the Sales table is a foreign key with a dash underline. CustomerID in the Customer table is a primary key with a single underline. The primary key for the Sales table is saleID, with a single underline. saleID in the sales table depends on customerID as well as it depends on the dateSold attribute and employee attribute (column).

The tables with data are as follows:

Sales Table

saleID	dateSold	customerID	employee
1	07/06/23	1	John Jones
2	07/07/23	2	Barbara Baker
3	07/07/23	2	Peter Lewis
4	07/08/23	1	John Jones
5	07/09/23	3	Barbara Baker
6	07/10/23	1	Barbara Baker
7	07/11/23	1	Barbara Baker
8	07/12/23	1	John Jones
9	07/13/23	2	Peter Lewis

Customers Table

For both tables, customers (individuals) go to a shop and buy different products on different days. The products bought are not shown. The Customers table has the addresses of the customers in the same town. The Sales table has the names of the employees who served the individual customers and has other data.

In the Sales table, customerID is a foreign key. In the Customers table, customerID is a primary key. Every foreign key in the Sales table occurs in the Customers table once. That is, every foreign key in the Sales table references a primary key and hence references a distinct row in the Customers table. The data (referential) integrity fails if this is not the case.

NULL Value and Foreign Keys

A foreign key in the child table must reference a row, through the primary key, of the parent table. All the primary keys of the parent table are unique. The foreign keys in the foreign key column in the child table, all of which are found in the parent table, do not have to be unique. Repetitions of foreign keys should be expected there. Above, the Sales table is a child table, and the Customers table is a parent table. Consider these two tables as belonging to the database of a convenience shop. The customers are individuals who buy from the shop.

Imagine that the customer with customerID 2 has offended you, the shop proprietor; for example, he tried to steal, and you have forbade him from ever coming to buy in your shop again. This means that his name has to be removed from your list of customers. His buying in the Sales table is in the past, happened without any issue, and so has to remain.

In this case, all customerID of 2 in the sales table have to be replaced by NULL (nothing), which means no reference to the Customers table of a row that has been deleted. For the above Customers table, two rows will remain after his row (customerID and the rest of his credentials) has been deleted.

There are two ways of annulling the row in the Customers table, by the computer (Database Management System) as follows:

actually deleting the row, and the remaining customerID’s are 1 and 3, missing the auto-incremented value of 2 and its row;
actually deleting the row and renumbering the rest of the rows below it from 2, auto-incrementing, and there is no missing 2 in the Customers table. In this case, all the foreign keys from 3 in the Sales table are renumbered, auto-incrementing from 2, and there is no missing 2 in the Sales table.

Whatever way is chosen, the NULL value remains in the Sales table as the foreign key for the deleted customer corresponding rows. The NULL value references (points to) no row in the Customers table.

So, for such reasons, a foreign key can have a NULL value. That is to do with referential integrity!

Updating the Child and Parent Tables

Updating means modifying. Modifying any cell in the child (Sales) table (even the saleID), except the foreign key, does not really cause any harm to the child table or the parent (Customers) table correspondences. For example, if the date was wrongly typed by mistake, the correct date should be typed.

Modifying any cell in the parent (Customers) table, except the primary key, does not really cause any harm to the parent (Customers) table or the child (Sales) table correspondences. For example, if the customer has changed his street but is still in the neighborhood, changing his address in the Customers table does not cause any harm to the child and parent table correspondences.

However, changing the value of the foreign key in the child table means that all the same foreign key values in the child table have to be changed to that particular value, and the one corresponding primary key value in the parent table also has to be changed, to that same value as well. This is cascading.

Can a NULL value foreign key be changed? – Well, yes: if the above-expelled customer comes back after six months and apologizes, and you, the proprietor, forgive him, and if only his rows in the Sales table had NULL values, then his name and credentials will be entered in the Customers table, at the bottom, with a new customerID; and all the NULL values as foreign keys in the child (Sales) table, will be replaced by this new customerID. However, if there were other rows in the child table with NULL values not belonging to him, then all the NULL values, including his, will remain; and all his new sales will be entered in the Sales table anew, at the bottom of the Sales table.

Deleting in the Child and Parent Tables

It does not make sense to delete any column cell value in the child or parent table unless it is for a column whose cell value can be empty. Empty does not mean NULL. Empty means void.

However, a whole row in the child table or parent table can be deleted. Assume that there are some fake rows in the child (Sales) table; if these rows are deleted (including their SaleIDs and foreign keys entirely), then there will be no problem. The customers either have other real rows in the child table or have just not shown their willingness to buy from the shop and have never bought anything. – They will buy someday in the near future.

On the other hand, if there is a fake customer with fake sales when his row in the customer table is deleted, all his fake sale rows in the Sales table have to be deleted. This is cascading. This can happen when an employee does real sales, records false information, and signs as another employee in the same table. Since the corresponding foreign keys in the child table are the same as the primary key in the parent table, then that same value links the two tables.

To solve this problem, each employee should have his own Sales exercise book if the computer is not being used. If the computer is being used, then each employee should have a login username and password to the Database Management System application, which automatically records the employee’s name against all his actions.

Quotations

The following three paragraphs are quotations from Wikipedia:

“Referential integrity is a property of data, stating that all its references are valid.

In the context of relational databases, it requires that if a value of one attribute (column) of a relation (table) references a value of another attribute (either in the same or a different relation), then the referenced value must exist.

Referential integrity is a database concept that ensures that relationships between tables remain consistent. When one table has a foreign key to another table, the concept of referential integrity states that you may not add a record to the table that contains the foreign key unless there is a corresponding record in the linked table.”

Referential Actions

Without the computer, it is difficult to enforce Referential Integrity. This leads to accounting errors. With the computer, referential integrity can be enforced by enforcing cascading and other actions by programming the Database Management System.

CASCADE

This means: updating or deleting cascading.

RESTRICT

This means: do not update or delete the connected rows in the child and parent table.

NO ACTION

RESTRICT checks the connections of the rows in both tables before trying to update or delete. NO ACTION is similar to restrict but tries to update or delete before checking the connections in both tables.

RESTRICT will issue an error message that an attempt to update or delete a row was made, despite the fact that there is a corresponding row in the other table that would not have been updated or deleted. NO ACTION would update or delete the row and then issue an error message that the row was updated or deleted, despite the fact that a corresponding row exists in the other table. Of course, this would lead to wrong accounting unless the user was in the process of handling both table changes manually.

If no corresponding row exists in the other table, then update or delete the row, whether or not the enforcement was RESTRICT or NO ACTION.

SET NULL

With this action enforced, if a row in the parent table is deleted, all the foreign keys of the corresponding rows in the child table would be set to NULL.

SET DEFAULT

With this action enforced, if a row in the parent table is deleted, all the foreign keys of the corresponding rows in the child table would be set to the default value chosen by the database user.

Conclusion

When the tables are kept in ledgers, it becomes difficult to enforce referential integrity. However, when a DBMS keeps the tables in a computer, those referential actions are enforced by a programmer.

Source: linuxhint.com