It's doubtful that even the most voracious downloader of MP3s would ever need a petabyte -- a million gigabytes -- worth of disk space to house their collection.
But some people do have really serious storage requirements, and a mere petabyte worth of storage doesn't even begin to meet their needs.
IBM's new data storage system, codenamed Storage Tank, is designed for the most voracious data warehouser. Storage Tank uses software to link servers in multiple locations over an IP network, creating a sort of mega-server capable of connecting thousands of computers and processing multiple petabytes of data.
Storage Tank also makes a distributed storage network look and behave just like a local network. No matter where or on what operating system any piece of stored data might reside, it can be located quickly and used by anyone else on the network.
"Storage Tank has the potential to become to an organization's data what the Dewey Decimal system is to a library," said Dan Colby, general manager of storage systems at IBM. "It reinvents the way information is filed, managed, shared and accessed within an organization."
Storage Tank already is at work in beta form. At the European Organization for Nuclear Research, better known as CERN, Storage Tank is being used as part of the world's largest computing grid to help CERN physicists virtually recreate the first moments of the "big bang."
By smashing protons together at high energies in a particle accelerator called the Large Hadron Collider, scientists hope to recreate conditions thought to have existed shortly after the big bang occurred.
"The data generated by the experiment is expected to fill the equivalent of more than 20 million CDs a year," said Wolfgang von Rüden, information-technology division leader and head of the CERN open lab. "Some 70,000 computers will be needed to analyze it."
CERN's computing grid can take care of the computing needs. It allows researchers in CERN's Geneva lab to tap into the processing power of hundreds of computers in 12 countries, and its capacity is expected to grow over the next few years.
Storage Tank, CERN hopes, will let researchers house and work with the vast amounts of data generated by the experiment. CERN is now testing Storage Tank in the first year of a planned three-year collaboration with IBM.
Despite all the spiffy new features offered by Storage Tank, IBM executives know that it's hard to get most people excited over storage systems.
"We knew the potential of SAN File System was huge, but we were unsure of the degree to which customers would fully comprehend this potential," said Jeffrey Barnett, manager of IBM's storage strategy division. "So we worked directly with some of our customers to convey the full potential of this technology."
IBM expects Storage Tank eventually will be able to handle 10 to 20 terabytes of CERN data. By 2007, when the proton smashing is scheduled to commence in earnest, CERN will be generating data at a minimum rate of 5 to 8 petabytes a year.
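To put those yearly volumes in more familiar terms, the rough sketch below converts 5 to 8 petabytes a year into the average write speed a storage system would have to sustain around the clock, and into a stack of CDs. It uses decimal units (a petabyte as a million gigabytes, as above) and assumes a 700-megabyte disc; the disc size and the function names are illustrative choices, not figures from IBM or CERN.

```python
# Back-of-the-envelope conversion of a yearly data volume.
# Decimal (SI) units: 1 petabyte = 10**15 bytes = a million gigabytes.
PETABYTE = 10**15
MEGABYTE = 10**6
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Assumption: a 700 MB disc. The article does not say which CD size was meant.
CD_CAPACITY = 700 * MEGABYTE


def sustained_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average throughput (MB/s) needed to absorb the data evenly over a year."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / MEGABYTE


def cds_per_year(petabytes_per_year: float) -> float:
    """Number of discs of CD_CAPACITY needed to hold one year of data."""
    return petabytes_per_year * PETABYTE / CD_CAPACITY


if __name__ == "__main__":
    for pb in (5, 8):
        rate = sustained_rate_mb_per_s(pb)
        discs = cds_per_year(pb) / 1e6
        print(f"{pb} PB/year: about {rate:.0f} MB/s sustained, "
              f"about {discs:.1f} million CDs")
```

These are averages only. An experiment that produces data in bursts would need considerably more headroom than the steady rate suggests.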
IBM's Barnett expects that major research facilities such as CERN and industries that generate large amounts of data, like technology equipment manufacturers, will benefit the most from Storage Tank.
Storage Tank can connect as few as 10 computers. Some systems administrators found the product intriguing, with the potential to help them get the most out of the scattered pockets of storage space on their networks. But the price -- $90,000 for a starter configuration -- provoked groans and laughter.
"Graphics files gobble up a huge amount of storage space, but I could buy a whole lot of servers for $90,000," said Mike O'Keefe, a systems administrator for a Manhattan design firm.
"I guess if you're recreating the birth of the universe, you'd need something like Storage Tank," O'Keefe said. "But all of us regular folks will just have to wait for the designer knockoff adaptation of this technology."