relational database (original) (raw)
What is a relational database?
A relational database is a type of database that organizes data points with defined relationships for easy access. In the relational database model, the data structures -- data tables, indexes and views -- remain separate from the physical storage structures, letting database administrators edit the physical data storage without affecting the logical data structure.
In the enterprise, relational databases are used to organize data and identify relationships between key data points. They make it easy to sort and find information, which helps organizations make business decisions more efficiently and minimize costs. They work well with structured data.
How does a relational database work?
Relational databases organize data into tables. The data tables used store information about related objects in a tabular structure. This structure is what makes it easy to find relationships between data points.
The tables are broken down into rows and columns. Each row represents an individual record, and each column contains attributes and values.
Each row contains a record with a unique identifier known as a primary key. Aside from uniquely identifying each record in a table, primary keys also ensure there are no duplicate rows. Each record assigns a value to each feature, making relationships between data points easy to identify.
Together, both the rows and columns in the table represent a collection of related values of one object or entity.
Different tables in a relational database can also be joined together to gain deeper insights between different data points. This is done using foreign keys. A foreign key connects one or more columns in a table with a primary key in another table. It essentially creates a link between two tables. The foreign key in one table is what defines the relationships in another table.
Primary and foreign keys are just two of the multiple key types that can be found in relational databases.
A database management system (RDBMS) is used to manage relational databases. Key functions of an RDBMS include data storage, retrieval, manipulation, data security and backup.
The standard user and application program interface (API) of a relational database is the Structured Query Language. SQL code statements are used both for interactive queries for information from a relational database and for gathering data for reports. Defined data integrity rules must be followed to ensure the relational database is accurate and accessible.
What is the structure of a relational database model?
E. F. Codd, then a programmer at IBM, invented the relational database in 1970. In his paper "A Relational Model of Data for Large Shared Data Banks," Codd proposed shifting from storing data in hierarchical or navigational structures to organizing data in a tabular structure of tables containing rows and columns.
Each table, sometimes called a relation, in a relational database contains one or more data categories in columns, or attributes. Each row, also called a record or tuple, contains a unique instance of data, or key, for the categories defined by the columns. Each table has a unique primary key that identifies the information in a table. The relationship between tables can be set via the use of foreign keys -- a field in a table that links to the primary key of another table.
A relational database includes tables containing rows and columns.
For example, a typical business order entry database would include a table that describes a customer with columns for name, address, phone number and so forth. Another table would describe an order, including information like the product, customer, date and sales price.
A user can get a database report showing the data they need. For example, a branch office manager might want a report on all customers who bought products after a certain date. A financial services manager in the same company could, from the same tables, obtain a report on accounts that need to be paid.
When creating a relational database, users define the domain of possible values in a data column and constraints that may apply to that data value. For example, a domain of possible customers could allow up to 10 possible customer names, but it is limited in one table to allowing only three of these customer names to be specifiable.
Two constraints relate to data integrity and the primary and foreign keys:
- Entity integrity ensures that the primary key in a table is unique and the value is not set to null.
- Referential integrity requires that every value in a foreign key column will be found in the primary key of the table from which it originated.
In addition, relational databases possess physical data independence. This refers to a system's capacity to make changes to the inner schema without altering the external schemas or application programs. Inner schema alterations include the following:
- Using new storage devices.
- Modifying indexes.
- Changing from a specific access method to a different one.
- Using different data structures.
- Using various storage structures or file organizations.
Logical data independence is a system's ability to manage the conceptual schema without altering the external schema or application programs. Conceptual schema alterations may include the addition or deletion of new relationships, entities or attributes without altering existing external schemas or rewriting application programs.
What are the types of databases?
There are numerous available database categories, from basic flat files that aren't relational to not only SQL (NoSQL) and newer graph databases that are considered even more relational than standard relational databases. Some database types include the following:
Flat file database
These databases consist of a single table of data that has no interrelation -- typically text files. This type of file lets users specify data attributes, such as columns and data types.
Find out about the advantages and disadvantages of flat file and relational databases.
NoSQL database
This type of database is an alternative that's especially useful for large, distributed data sets. NoSQL databases support a variety of data models, including key-value, document, columnar and graph formats.
Graph database
Expanding beyond traditional column- and row-based relational data models, this NoSQL database uses nodes and edges that represent connections between data relationships and can discover new relationships between the data. Graph databases are more sophisticated than relational databases. They are used for fraud detection or web recommendation engines.
See how graph and relational databases compare.
Object relational database
An ORD is composed of both an RDBMS and an object-oriented database management system (OODBMS). It contains characteristics of both the RDBMS and OODBMS models. A traditional database is used to store the data. It is then accessed and manipulated using queries written in a query language, such as SQL. Therefore, the basic approach of an ORD is based on a relational database.
However, an ORD can also be considered object storage, particularly for software written in the object-oriented programming language, thus pulling on object-oriented characteristics. In this situation, APIs are used in the storage and retrieval of data.
See the characteristics of an RDBMS vs. a DBMS and where they overlap.
Distributed
A distributed database is a type of database that stores records or files in multiple different physical locations. Data processing is also spread out and replicated in different parts of the network.
Cloud
Cloud databases are built in a public, private or hybrid cloud for a virtualized environment. Cloud databases offer high scalability and availability on demand. Users are charged based on how much storage and bandwidth they use. These databases, including relational databases, can also be offered as a service. The as-a-service offering is called database as a service, or DBaaS.
Multimodel
A multimodel database is a type of database that supports multiple data models. Being multimodel lets IT teams meet various application requirements without needing to deploy different database systems.
Self-driving
A self-driving, or autonomous, database is a type of database that automates regular data management tasks, including backups, updates, tuning and security. These databases are cloud-based and use machine learning processes in their automation.
What are the advantages of relational databases?
The key advantages of relational databases include the following:
- Categorizing data. Database administrators can easily categorize and store data in a relational database that can then be queried and filtered to extract information for reports. Relational databases are also easy to extend and aren't reliant on physical organization. After the original database creation, a new data category can be added without having to modify the existing applications.
- Accuracy. Data is stored just once, eliminating data deduplication in storage procedures.
- Ease of use. Complex queries are easy for users to carry out with SQL, the main query language used with relational databases.
- Collaboration. Multiple users can access the same database.
- Security. Direct access to data in tables within an RDBMS can be limited to specific users.
- ACID. Relational databases support ACID -- atomicity, consistency, isolation, and durability.
- Stored procedures. Relational databases also support stored procedures, which helps ensure that specific data functions are implemented in a detailed way.
- Redundancy. Normalization and stored procedures help reduce instances of redundancy. Likewise, primary keys also ensure there are no duplicate rows.
What are the disadvantages of relational databases?
The disadvantages of relational databases include the following:
- Structure. Relational databases require a lot of structure and a certain level of planning because columns must be defined and data needs to fit correctly into somewhat rigid categories. The structure is good in some situations, but it creates issues related to the other drawbacks, such as maintenance and lack of flexibility and scalability.
- Maintenance issues. Developers and other personnel responsible for the database must spend time managing and optimizing the database as data gets added to it.
- Inflexibility. Relational databases are not ideal for handling large quantities of unstructured data. Data that is largely qualitative, not easily defined or dynamic is not optimal for relational databases. As the data changes or evolves, the schema must evolve with it, which takes time.
- Lack of scalability. Relational databases do not horizontally scale well across physical storage structures with multiple servers. It is difficult to handle relational databases across multiple servers because as a data set gets larger and more distributed, the structure is disrupted, and the use of multiple servers has effects on performance -- such as application response times -- and availability.
- Performance over time. Complex relational databases contain many different tables. Over time, with an increase of data, performance might start to decrease, and response times queries might start to slow.
Examples of relational databases
Standard relational databases enable users to manage predefined data relationships across multiple databases. Popular examples of standard relational databases include Microsoft SQL Server, Oracle Database, MySQL and IBM DB2.
Cloud-based relational databases are also widely used because they enable companies to outsource database maintenance, patching and infrastructure support requirements. Cloud relational databases include Amazon Relational Database Service, Google Cloud SQL, IBM DB2 on Cloud, SQL Azure and Oracle Cloud.
What are the differences between relational databases, non-relational databases and NoSQL?
The most important difference between relational database systems and non-relational database systems is that relational databases are normalized. That is, they store data in a tabular form, arranged in a table with rows and columns. A non-relational database stores data as files.
Other differences include the following:
- Use of primary keys. Relational database tables each have a primary key identifier. In a non-relational database, data is normally stored in hierarchical or navigational form, without the use of primary keys.
- Data value relationships. Since data in a relational database is stored in tables, the relationship between these data values is stored as well. Since a non-relational database stores data as files, there is no relationship between the data values.
- Integrity constraints. In a relational database, the integrity constraints are any constraint that ensures database integrity. They are defined for the purpose of ACID. Non-relational databases do not use integrity constraints.
- Structured vs. unstructured data. Relational databases work well for structured data that conforms to a predefined data model and doesn't change much. Non-relational databases are better for unstructured data, which doesn't conform to a predefined data model and can't be stored in an RDBMS. Examples of unstructured data include text, emails, photos, videos and web pages.
Relational and non-relational databases have unique strengths and weaknesses.
Non-relational databases are also called NoSQL databases. The terms are used interchangeably, but there are differences.
SQL is the query language that is used with relational databases. Relational databases and their management systems almost always use SQL as their underlying query language. NoSQL, or not only SQL, databases use SQL and other query languages. For example, the NoSQL database management program MongoDB uses JSON-like documents to store and organize data. (Technically, it uses a variant of JSON called BSON, or binary JSON.)
Referring to databases as non-relational vs. relational categorizes them based on their architecture. Referring to them as SQL vs. NoSQL categorizes them based on the query language, whether it is solely SQL or not only SQL. Often, a relational database can be referred to as a SQL database, as many of them use SQL, and non-relational databases can be referred to as NoSQL databases. NoSQL and non-relational databases work well with more fluid data models, such as in engineering parts and molecular modeling, where the data is always changing.
Both relational and non-relational database platforms have their drawbacks. NewSQL databases seek to provide the benefits of both types, by offering the data integrity and application access control that relational databases offer and the horizontal scalability that non-relational or NoSQL platforms provide.
Choosing the right database
Relational databases work for structured data with defined relationships that can be organized in a tabular format. However, there is a lot more to selecting the right database architecture than just choosing between relational and non-relational. The type of data and application being used or developed are key factors to consider. Learn some of the other factors to consider when choosing a database model for an enterprise application.
Additional factors to consider include the amount of data, how many users need simultaneous access, preferred programming language, the geographic distribution of data, budget, integration needs, availability needs and scaling requirements.
Certain initiatives require specific considerations when choosing database software. For instance, with IoT initiatives, SQL vs. NoSQL is an issue, as is static vs. streaming. Find out what to assess when selecting a database for an IoT project.