What is Structured Query Language (SQL)? (original) (raw)

Structured Query Language (SQL) is a standardized programming language that is used to manage relational databases and perform various operations on the data in them. Initially created in the 1970s, SQL is regularly used not only by database administrators but also by developers writing data integration scripts and data analysts looking to set up and run analytical queries.

The term SQL is pronounced ess-kew-ell or sequel.

SQL is used for the following:

SQL queries and other operations take the form of commands written as statements and are aggregated into programs that enable users to add, modify or retrieve data from database tables.

A table is the most basic unit of a database and consists of rows and columns of data. A single table holds records, and each record is stored in a row of the table. Tables are the most used type of database objects or structures that hold or reference data in a relational database. Other types of database objects include the following:

Each column in a table corresponds to a category of data -- for example, customer name or address -- while each row contains a data value for the intersecting column.

How do relational databases work?

Relational databases are relational because they are composed of tables that relate to each other. Non-relational databases, on the other hand, don't use rows and tables, but instead use data models such as key-value pairs, graphs and column families to store and organize data.

Relational databases use a tabular layout to store information, with rows and columns representing different data qualities and the various relationships between the data values. For example, a SQL database used for customer service can have one table for customer names and addresses and other tables that hold information about specific purchases, product codes and customer contacts. A table used to track customer contacts usually uses a unique customer identifier called a key or primary key to reference the customer's record in a separate table used to store customer data, such as name and contact information.

SQL became the de facto standard programming language for relational databases after they emerged in the late 1970s and early 1980s.

Relational vs. non-relational databases diagram.

The SQL query language can be used for relational or non-relational databases, but it offers advantages for relational databases.

Why is SQL important?

SQL is a great language to learn because it's the primary database language used for data processing tasks and is used across various industries.

The following reasons highlight the importance of SQL:

SQL standard and proprietary extensions

An official SQL standard was adopted by the American National Standards Institute (ANSI) in 1986, with the International Organization for Standardization (ISO) adopting the standard in 1987. New versions of the SQL standard are published every few years, the most recent in 2022.

ISO/IEC 9075 is the ISO SQL standard developed jointly by ISO and the International Electrotechnical Commission. The standard way of referring to an ISO standard version is to use the standards organizations -- ISO/IEC -- followed by the ISO standard number, a colon and the publication year. The current ISO standard for SQL is ISO/IEC 9075:2022.

Both proprietary and open source RDBMSes built around SQL are available for use by organizations. SQL-compliant database server products include the following:

Some versions of SQL include proprietary extensions to the standard language for procedural programming and other functions. For example, Microsoft offers a set of extensions called Transact-SQL, while Oracle's extended version of the standard is Procedural Language for SQL. Commercial vendors offer proprietary extensions to differentiate their product offerings by giving customers additional features and functions. As a result, the different variants of extended SQL offered by vendors are not fully compatible with one another.

SQL commands and syntax

SQL is, fundamentally, a programming language designed for accessing, modifying and extracting information from relational databases. As a programming language, SQL has commands and a syntax for issuing those commands.

SQL commands are divided into several different types, including the following:

SQL syntax, the set of rules for how SQL statements are written and formatted, is similar to other programming languages. Some components of SQL syntax include the following.

SELECT * FROM customers;

This SELECT statement extracts all of the contents of a table called customers.

SELECT * FROM customers;
select * from CUSTOMERS;

These two statements are different:

SELECT * FROM customers;
SELECT * FROM "Customers";
SELECT name, telephone, age
FROM customers;

This command selects the contents of the columns name, telephone and age in the table customers.

SELECT name, telephone, age
FROM customers
WHERE age > 21;

Most SQL implementations include support for issuing statements at the command line, through a graphical user interface, by using SQL programs, or through application programming interfaces to access SQL databases using other programming languages.

Commonly used SQL commands with examples

Most SQL commands are used with operators to modify or reduce the scope of data operated on by the statement. Some commonly used SQL commands, along with examples of SQL statements using those commands include the following:

SQL SELECT. The SELECT command is used to get some or all data in a table. SELECT can be used with operators to narrow down the amount of data selected.

SELECT title, author, pub_date
FROM catalog
WHERE pub_date = 2021;

This example could be used by a publisher to select the title, author and publication date columns from a table named catalog.

SQL CREATE. The CREATE command is used to create a new SQL database or SQL table. Most versions of SQL create a new database by creating a new directory, in which tables and other database objects are stored as files.

The following CREATE DATABASE statement creates a new SQL database named Human_Resources:

CREATE DATABASE Human_Resources;

The CREATE TABLE command is used to create a table in SQL. The following statement creates a table named Employees that has three columns: employee_ID, last_name and first_name, with the first column storing integer (int) data and the other columns storing variable character data of type varchar and a maximum of 255 characters.

CREATE TABLE Employees (
    employee_ID int,
    last_name varchar(255),
    first_name varchar(255)
);

SQL DELETE. The DELETE command removes rows from a named table. In this example, all records of employees with the last name Smithee are deleted:

DELETE FROM Employees WHERE last_name='Smithee';

This delete statement returns the number of rows deleted when it finishes running.

SQL INSERT INTO. The INSERT INTO command is used to add records into a database table. The following statement adds a new record into the Employees table:

INSERT INTO Employees (
    last_name,
    first_name
)
VALUES (
    'Alan',
    'Smithee'
);

SQL UPDATE. The UPDATE command is used to make changes to rows or records in a specific table. For example, the following update statement updates all records that include a last_name value of Smithee by changing the name to Smith:

UPDATE Employees
SET last_name = 'Smith',
WHERE last_name = 'Smithee';

SQL statements can use loops, variables and other components of a programming language to update records based on different criteria.

SQL-on-Hadoop tools

SQL-on-Hadoop query engines are a newer offshoot of SQL that enable organizations with big data architectures built around Hadoop data stores to use SQL as a querying language and let database professionals use a familiar query language instead of having to use more complex and less familiar languages -- in particular, the MapReduce programming environment for developing batch processing applications. These tools are designed to bridge the gap between SQL and Hadoop, enabling programmers and analysts to utilize their existing SQL expertise to query and analyze data stored in Hadoop.

More than a dozen SQL-on-Hadoop tools are available from Hadoop distribution providers and other vendors; many of them are open source software or commercial versions. In addition, the Apache Spark processing engine, which is often used in conjunction with Hadoop, includes a Spark SQL module that similarly supports SQL-based programming.

Not all SQL-on-Hadoop tools support all of the functionality offered in relational implementations of SQL. But SQL-on-Hadoop tools are a regular component of Hadoop deployments, as companies look to get developers and data analysts with SQL skills involved in programming big data applications.

Common SQL-on-Hadoop tools include the following:

SQL security

SQL databases are subject to most of the same vulnerabilities as any other enterprise application, including weak authentication, insecure design, misconfiguration and other application security issues. However, SQL injection, first reported in 1998, continues to dominate security issues for SQL systems.

The following are the common SQL security threats and their causes.

SQL injection attacks

SQL injection attacks usually exploit weaknesses in database systems where data submissions are not scanned and sanitized to remove potentially malicious code incorporated or injected into data.

The best-known example of a SQL injection exploit is documented in the "Little Bobby Tables" comic by Randall Munroe, in which a SQL injection is perpetrated by a mom who filled out a SQL form with her son's name followed by malicious SQL code.

In the comic, the son's name is entered as the following:

Robert'); DROP TABLE Students; --

Following the valid data (Robert), the name continues with characters that SQL interprets as ending the data -- single quote, close parenthesis and semicolon -- followed by the DROP TABLE command.

Verbose errors

SQL databases can sometimes produce error messages containing internal information, potentially aiding attackers in identifying vulnerabilities or strategizing an attack.

It's imperative to ensure that all procedural code incorporates proper error handling mechanisms, thereby averting the exposure of default error messages to the user.

Privilege escalation

Privilege escalation attacks in SQL databases involve unauthorized attempts to elevate user privileges, enabling individuals to access data or execute functions beyond their designated level of authority.

To mitigate privilege escalation attacks, it's important to enforce preventative measures. These can include strict access controls, adhering to the principle of least privilege, regularly updating and patching the SQL Server software and conducting thorough security assessments to identify and address potential weaknesses.

Denial of service (DoS) attacks

In production environments, SQL databases often face the risk of application-layer DoS attacks, involving the flooding of fake queries by attackers. This slows down performance for legitimate users, leading to eventual downtime.

When using SQL in the cloud, one can utilize DoS protection services to detect and redirect malicious traffic, safeguarding the database from potential disruptions.

Lack of encryption

If a database or the data it contains is not encrypted properly, hackers can identify the points of connection between the machine and the network, which could result in database intrusions.

To lessen these risks, database administrators must maintain encryption for both the primary and backup SQL databases.

The use of best practices for database security can help protect an organization's most valuable digital assets. SQL security is an ongoing process and it's important to stay updated with the latest security practices and measures and conduct security audits to identify and address any new vulnerabilities.

History of SQL

With roots going back to the early 1970s, SQL continues to make key milestones as one of the most successful ideas in computing history.

The following is a timeline of the SQL language and its key milestones:

SQL skills can boost many careers, not just those of database administrators, data warehouse architects, database programmers and others whose roles directly use SQL.

Some other roles that can benefit from SQL experience include the following:

SQL skills can be helpful in a surprisingly broad range of disciplines. For example, journalists who reported in 2013 on offshore tax dodgers and money laundering had to learn SQL to help them understand the significance of the millions of emails and files that were leaked.

NoSQL systems are becoming more prevalent in the cloud, with numerous cloud providers and vendors offering them. Learn about the various types of NoSQL databases and their pros and cons.