Just how much data is out there? Well, first it is important to understand that not all data that is created is stored. In fact, data created is two orders of magnitude higher than data stored. In 2012, we were creating 2.5 exabytes of data every day, and that number was doubling approximately every 40 months.
In 2016 we created 218 zettabytes (ZB) of data, and by 2021 that number is expected to be 847 ZB per year. How much is a zettabyte anyway? One zettabyte equals 1 billion terabytes (TB). Even if only half of that data were stored, we would soon be storing somewhere north of 420 billion TB every year.
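To make the unit conversion concrete, here is a minimal Python sketch of the arithmetic behind that "420 billion TB" figure. It assumes decimal (SI) units, and the function and variable names are illustrative only; the 847 ZB and one-half values come straight from the paragraph above.

```python
# Decimal (SI) storage units, expressed in bytes (an assumption; binary units would differ slightly).
TB = 10**12
ZB = 10**21


def zb_to_tb(zettabytes: float) -> float:
    """Convert zettabytes to terabytes using SI (power-of-ten) units."""
    return zettabytes * ZB / TB


created_2021_zb = 847   # projected yearly figure quoted in the article
stored_fraction = 0.5   # the article's "even if only half" scenario

stored_tb = zb_to_tb(created_2021_zb * stored_fraction)
print(f"{stored_tb / 1e9:.1f} billion TB stored per year")
```

Changing `stored_fraction` shows how sensitive the storage estimate is to the share of created data that is actually kept.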
Of course, the point of collecting this data is not just to store it, but to turn it into actionable information that is useful to one industry or several at once, aiding the continued development of predictive analytics. Big data can be used in everything from healthcare to employee training. Here are some reasons we won’t reach a data overload anytime soon, even with all of the data we are collecting.
Data Center Design
One way we are staving off a data overload is through data center design. Those who think that going paperless at the office and relying exclusively on cloud or digital storage is better for the environment are right, to a certain extent. That is only because many data centers, even though they use a lot of power for running servers and cooling (more on that in a moment), are powered by green energy like solar and wind.
These data centers have a mean capacity of 7.5 petabytes (PB), but in the next decade 30 percent of data centers will have a 10 PB capacity, and 18 percent will be moving into the 50+ PB range. How are they handling this increased amount of storage? There are three ways: storage virtualization, flash storage, and storage consolidation.
Take the simple example of patient health data: the 2013 figure of 153 exabytes is growing at a rate of 48 percent per year. At that pace, health data alone will reach the zettabyte level within a few years and, if the growth holds, eventually the yottabyte level. However, over 80 percent of this data is unstructured, leaving room for a lot of technological advances.
This is where storage virtualization and consolidation come in. These two methods look for efficient ways to store the same amount of data with a lot less hardware. Structuring data is just a part of this solution, but it’s one that also makes that same data more actionable and easily accessible.
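As a rough check on the health data growth figures quoted above, the following Python sketch estimates how long compound growth takes to reach each threshold. It assumes the 48 percent rate stays constant every year (a simplification) and uses SI units; the function and constant names are illustrative.

```python
import math


def years_to_reach(start: float, target: float, annual_growth: float) -> float:
    """Years for `start` to grow to `target` at a constant compound annual rate."""
    return math.log(target / start) / math.log(1 + annual_growth)


START_EB = 153          # exabytes of health data in 2013, as quoted in the article
GROWTH = 0.48           # 48 percent per year, as quoted in the article
EB_PER_ZB = 1_000       # SI units: 1 ZB = 1,000 EB
EB_PER_YB = 1_000_000   # SI units: 1 YB = 1,000,000 EB

years_to_zb = years_to_reach(START_EB, EB_PER_ZB, GROWTH)
years_to_yb = years_to_reach(START_EB, EB_PER_YB, GROWTH)
print(f"about {years_to_zb:.1f} years to 1 ZB, about {years_to_yb:.1f} years to 1 YB")
```

Under these assumptions the zettabyte mark arrives a little under five years after 2013, while the yottabyte mark is a bit more than two decades out.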
All-flash is another approach, using only flash storage instead of spinning drives, but it greatly disrupts the current design of data centers. Flash takes up much less room than spinning drives and is more reliable, which is a great advantage, except that replacing 40 racks of drives with two racks changes cooling design, airflow, the width of aisles, and where fans must be positioned.
Still, by moving to all-flash, a data center could increase its capacity to around 20 times its current level while using less power (a savings of close to $25,000 annually), and future structures could be designed in an entirely different way.
The Real Data Limit
So is there a limit to how much data we can store? The answer is a resounding yes, until you add the question “How much?” or “When will we reach that limit?” The reason is simple. Technology is advancing quickly, so as soon as we say “We can store this much data with current technology,” the technology we were speaking of will be obsolete.
DNA Storage: This is not a real storage solution yet. Why? Because it takes a long time to retrieve data from DNA. Still, we can store a lot of data in a small space there: 700 terabytes per gram, which occupies around one cubic centimeter. If we find a faster way to access this data, an entire data center would fit inside your two-car garage.
Electronic Quantum Holography: Technology is currently being developed to store data on electrons. A single electron could store 66 million bits of data, and the atom it was circling would determine how much space it would take up. Many feel that, for speed of access, silver is a good candidate. How much room would this take up? That’s not definitive yet, but the capacity of a current data center could conceivably fit in a single room.
Data Centers in Space: With private space programs on the rise, there is a move toward putting data centers in space. Powering them using solar would be relatively easy, and cooling would be much less of a factor. Much of what makes this feasible is the work of Elon Musk and SpaceX, and the prospect of reducing the cost of putting infrastructure into space by employing reusable rockets.
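To put the DNA storage density quoted above in perspective, here is a small Python sketch that estimates how many grams of DNA would hold the data center capacities mentioned earlier. It is a back-of-the-envelope calculation only: it assumes SI units and the quoted 700 TB per gram, and it ignores redundancy, encoding overhead, and the equipment needed to read and write the data. The names are illustrative.

```python
TB_PER_GRAM = 700   # DNA storage density quoted in the article
TB_PER_PB = 1_000   # SI units: 1 PB = 1,000 TB


def dna_grams_needed(capacity_pb: float) -> float:
    """Grams of DNA needed to hold `capacity_pb` petabytes at the quoted density."""
    return capacity_pb * TB_PER_PB / TB_PER_GRAM


mean_center_grams = dna_grams_needed(7.5)    # mean data center capacity quoted earlier
large_center_grams = dna_grams_needed(50)    # the 50+ PB range quoted earlier
print(f"7.5 PB: about {mean_center_grams:.1f} g, 50 PB: about {large_center_grams:.1f} g")
```

On these numbers the raw storage medium weighs only grams, so the garage in the article's comparison would be filled mostly by supporting hardware rather than by the DNA itself.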
Could we reach a data overload at some point? It’s possible, but not probable anytime in the near future.