Understanding data storage

Data storage has come a long way since the days of disk systems. Sure, those disk systems might still be used here and there—but now all that data is attached to a network and software-defined.

What is data storage?

Data storage is the collection and retention of digital information—the bits and bytes behind applications, network protocols, documents, media, address books, user preferences, and more. Data storage is a central component of big data and data management.

Think about it like this. Computers are like brains. Both have short-term and long-term memories. Brains handle short-term memory in the prefrontal cortex, while computers handle it with random-access memory (RAM).

Brains and RAM process and remember things while awake, and both get tired after a while. Your brain converts working memories into long-term memories while you sleep, and a computer transfers active memory into storage volumes when it sleeps. Computers also distribute data by type, much as brains sort memories into semantic, spatial, emotional, and procedural categories.

A brief history of data storage devices

Perhaps the best consolidated history of data storage devices is contained within the first dozen pages of Gordon Haff and William Henry’s From Pots and Vats to Programs and Apps: How Software Learned to Package Itself.

In it, Haff and Henry describe how a 1725 textile worker programmed looms using punchcards that were inspired by automated organs’ cylinders. Punchcards fed information into a 19th-century computer as part of the 1890 U.S. Census and remained popular until the era of magnetic tape drives began in the 1950s. From there, magnetic tape drives shrank until they became cassette tapes.

Right before the 1970s, IBM released the floppy disk, which was used for almost everything. Floppies initialized mainframes, stored software applications, and were the only persistent storage device available until hard disk drives (HDDs) dropped in price. Compact discs (CDs) followed in the 1980s, and solid-state drives (SSDs) later replaced spinning disks with flash memory chips. Flash storage now fits in our pockets as flash drives that hold copies of everything we want or need.

Software-defined storage (SDS) uses abstraction management software to decouple data from hardware before reformatting and organizing it for network use. SDS works particularly well with container and microservice workloads that use unstructured data, since it can scale in ways hardwired storage solutions simply can’t.

Cloud storage is the organization of data kept somewhere that can be accessed through the internet by anyone—given the right permissions. You don’t need to be connected to an internal network (that’s known as NAS) and aren’t accessing the data from hardware directly attached to your computer. Popular cloud storage providers include Microsoft, Google, and IBM.

Network-attached storage (NAS) makes data more accessible to internal networks by installing a lightweight operating system onto a server that turns it into something called a NAS box, unit, or head. The NAS box becomes an important part of intranets because it processes every single storage request.

Object storage, also known as object-based storage, is a flat structure in which files are broken into pieces and spread out among hardware. In object storage, the data is broken into discrete units called objects and is kept in a single repository, instead of being kept as files in folders or as blocks on servers.
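To make the "single flat repository" idea concrete, here is a minimal Python sketch of an in-memory object store. The class name `FlatObjectStore`, its methods, and the choice of a SHA-256 hash as the object ID are all assumptions made for illustration; real object storage systems choose their own ID schemes and keep data on distributed hardware rather than in a dictionary.

```python
import hashlib


class FlatObjectStore:
    """Toy object store: one flat namespace, no folders (hypothetical example)."""

    def __init__(self):
        # Maps object ID -> (data, metadata). There is no hierarchy at all.
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # Assumption: derive the ID from the content; real systems vary.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        # Retrieval needs only the ID, never a path.
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        return self._objects[object_id][1]


store = FlatObjectStore()
photo_id = store.put(b"raw image bytes", content_type="image/png")
```

Every object sits at the same level of the repository, so the caller keeps the returned ID and uses it for later lookups instead of navigating folders.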

File storage arranges data as hierarchical files that users can open and navigate from top to bottom. Since files are stored on back ends and front ends the same way, users can request files by unique identifiers such as names, locations, or URLs. This is the predominant human-readable storage format.
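The hierarchy described above can be sketched with Python's standard `pathlib` and `tempfile` modules. The directory and file names here (`reports/2024/summary.txt`, `media/logo.txt`) are invented for the example, and the tree lives in a temporary directory that is deleted when the block finishes.

```python
import tempfile
from pathlib import Path


def build_example_tree(root: Path) -> None:
    """Create a small folder hierarchy under root (names are hypothetical)."""
    (root / "reports" / "2024").mkdir(parents=True)
    (root / "reports" / "2024" / "summary.txt").write_text("quarterly summary")
    (root / "media").mkdir()
    (root / "media" / "logo.txt").write_text("placeholder")


def list_files(root: Path) -> list[str]:
    """Walk the tree from the top and return each file's path relative to root."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_example_tree(root)
    files = list_files(root)
    # A file is requested by its location in the hierarchy.
    summary = (root / "reports" / "2024" / "summary.txt").read_text()
```

The path itself acts as the identifier: knowing the folder names and the file name is enough to find the data.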

Block storage splits storage volumes into individual instances known as blocks. Each block exists independently, which gives users complete configuration autonomy. Because blocks aren’t burdened with the same unique identifier requirements as files, blocks are a faster storage system—making them ideal formats for rich media databases.
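As a rough illustration of block addressing, the sketch below models a volume as a run of fixed-size blocks that are read and written by index. The `BlockVolume` class and the 4-byte `BLOCK_SIZE` are assumptions chosen to keep the example readable; real block devices use much larger blocks and sit behind an operating system or storage network.

```python
BLOCK_SIZE = 4  # bytes per block; unrealistically small, for illustration only


class BlockVolume:
    """Toy volume made of fixed-size blocks addressed by index (hypothetical)."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _offset(self, index: int) -> int:
        if not 0 <= index < self.num_blocks:
            raise IndexError(f"block {index} is out of range")
        return index * BLOCK_SIZE

    def write_block(self, index: int, payload: bytes) -> None:
        if len(payload) > BLOCK_SIZE:
            raise ValueError("payload does not fit in one block")
        start = self._offset(index)
        # Pad short payloads with zero bytes so every block stays the same size.
        self._data[start:start + BLOCK_SIZE] = payload.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index: int) -> bytes:
        start = self._offset(index)
        return bytes(self._data[start:start + BLOCK_SIZE])


volume = BlockVolume(num_blocks=3)
volume.write_block(1, b"ab")
```

Each block is reached by simple arithmetic on its index, with no name or path lookup, and writing to one block leaves the others untouched.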

Why Red Hat?

Software-defined storage is inherently open. It decouples hardware from software, freeing you from vendor lock-in. Red Hat has taken "open" a step further. Our software-defined storage is also open source. It draws on the innovations of a community of developers, partners, and customers. This gives you control over exactly how your storage is formatted and used—based on your business’ unique workloads, environments, and needs.
