As libraries’ cloud-computing options multiply, LJ looks at the pros and cons of jumping in
By Edward M. Corrado & Heather Lea Moulaison
Cloud computing can be defined in a number of ways, making it confusing for librarians to understand what is available and what it delivers. Although computer scientists and technologists may have a strict definition of cloud computing involving “on-demand network access to a shared pool of configurable computing resources” (see the NIST Definition of Cloud Computing), for non-IT librarians it’s enough to think about cloud computing as library data and services hosted beyond the library’s walls and accessible via the web. More and more electronic resources and software used in libraries are hosted in the cloud.
Why the cloud?
There are many uses for cloud-based systems in libraries, from discovery layers to citation management to mobile apps (see “What’s in the Cloud?” p. 51), and the future holds even further possibilities. Cloud-based offerings such as the HathiTrust digital repository, discovery layers, and library management systems implemented on top of large, shared, community bibliographic databases have the potential to revolutionize library systems. In a September 2011 Computers in Libraries article (ow.ly/8OAhW), Vanderbilt University director for innovative technology and research Marshall Breeding predicted the upcoming demise of the integrated library system (ILS) and its replacement with a “library services platform” that will be cloud-based and, supposedly, egalitarian. Any size library will be able to implement such cloud solutions, if they can afford them. From a technological and access standpoint, a large portion of what a library does could be done in the cloud, freeing librarians’ time for other pursuits. For some libraries this may be a boon, but one size may not fit all. Some libraries will want to make sure they can mitigate any potential downside of the cloud by simultaneously hosting everything locally. Outsourcing mundane operations to the cloud, however, could allow librarians to provide more access to local and unique content.
Cloud pros and cons
One of the advantages of cloud computing is greater efficiency. Because servers are typically shared in the cloud environment, they can run multiple program instances simultaneously, leading to a better use of resources. In a traditional environment, a server may run at only five percent capacity most of the time. Cloud-computing providers, typically concerned about electricity costs, invest in energy-efficient equipment that may help create a greener computing environment.
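The efficiency argument can be made concrete with a little arithmetic. The Python sketch below is illustrative only: it takes the five percent figure from the paragraph above, while the 60 percent target load and the count of 20 servers are hypothetical values chosen for the example, and the function names are invented here.

```python
def workloads_per_host(avg_load_pct: int, target_load_pct: int) -> int:
    """How many workloads averaging avg_load_pct fit on one shared host
    that should not exceed target_load_pct."""
    if avg_load_pct <= 0 or target_load_pct <= 0:
        raise ValueError("percentages must be positive")
    return target_load_pct // avg_load_pct


def hosts_needed(workloads: int, per_host: int) -> int:
    """Number of shared hosts required, rounded up to a whole machine."""
    if per_host <= 0:
        raise ValueError("per_host must be positive")
    return -(-workloads // per_host)  # integer ceiling division


AVG_LOAD_PCT = 5      # from the article: a server at about five percent capacity
TARGET_LOAD_PCT = 60  # hypothetical ceiling that leaves headroom for peaks
SERVERS = 20          # hypothetical number of lightly loaded local servers

per_host = workloads_per_host(AVG_LOAD_PCT, TARGET_LOAD_PCT)
print(f"Workloads per shared host: {per_host}")
print(f"Shared hosts needed for {SERVERS} workloads: {hosts_needed(SERVERS, per_host)}")
```

Under these assumptions, 20 lightly used servers would fit on two shared machines. The estimate considers average processor load only; memory, storage, and simultaneous peaks would reduce the savings in practice.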
The cloud is also touted for its flexibility and scalability. While most libraries do not have the same high peaks in usage that, say, an online retailer might have during the holidays, they can take advantage of the same flexibility. For instance, if they are implementing new software in the cloud, they can quickly add computing resources if the services become more popular, instead of ordering and installing new servers. Cloud computing’s flexibility may also be used for testing upgrades or new services. Instead of purchasing servers for testing, a virtual server can be started in the cloud and operated only as long as needed.
Cloud computing also allows for new uses of data that may not have been possible before. An article recommender based on data from one library might not be very valuable, but Ex Libris’s bX combines usage data from millions of researchers to create a scholarly recommender service. Ex Libris soon plans to offer Hot Articles, a free service employing bX data that shows what articles are trending in a particular subject.
Hosted cloud solutions offer a way to deal with the lack of technical expertise or a small systems staff. The vendor can take care of hardware, operating system upgrades, and software upgrades, for example, and do this at scale with shared hardware. As a result, in many cases cloud computing may be less expensive than traditional computing methods.
For nonlibrary-specific programs such as email, cloud-based solutions like Google’s Gmail may be available for low or no cost. (Los Angeles Public Library is the largest system currently using Gmail as its in-house email service.) That said, cloud computing is not always cheaper when all factors are considered, so libraries considering a migration to the cloud should carefully evaluate all of the costs involved, such as network bandwidth, transition costs, and backup storage costs.
Library vendors that provide cloud-based solutions strive to keep downtime low. At the 2011 European Library Automation Group (ELAG) conference, representatives from OCLC and Ex Libris stated that they try to have at least 99.5 percent uptime. This may be better than local computing staff would be able to provide. Conversely, after a migration to the cloud, local staff may lose control over when planned downtime occurs and be restricted in what they can do when unplanned downtime takes place. Access to high-speed broadband is becoming more common, but a library without a reliable high-speed Internet connection may find some cloud computing services impractical.
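To put the 99.5 percent figure in perspective, the short Python sketch below converts an uptime percentage into the downtime it still permits. The function name, the 365-day year, and the 30-day month are simplifying assumptions made for this illustration, not terms any vendor specified.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)
HOURS_PER_MONTH = 30 * 24  # 720 hours in a 30-day month (assumption)


def allowed_downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime permitted over period_hours at the given uptime."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


yearly = allowed_downtime_hours(99.5, HOURS_PER_YEAR)
monthly = allowed_downtime_hours(99.5, HOURS_PER_MONTH)
print(f"99.5% uptime allows about {yearly:.1f} hours of downtime per year")
print(f"99.5% uptime allows about {monthly:.1f} hours of downtime per month")
```

By this arithmetic, 99.5 percent uptime still leaves room for roughly 43.8 hours of downtime a year, or about 3.6 hours in a 30-day month. Whether planned maintenance counts toward that total depends on the vendor's agreement.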
Many librarians moving data to the cloud are concerned about security and privacy. This is a real issue, and librarians need to do their due diligence before moving data, especially patron data, to the cloud, but most cloud computing providers go to great lengths to ensure security. Iowa State University researcher Qing Hu told the Iowa State News Service (ow.ly/8OzDF) in July 2009 “that internal computer fraud is a more significant issue than external hacking.” Libraries in the United States may also have data that falls under Health Insurance Portability and Accountability Act (HIPAA) or Family Educational Rights and Privacy Act (FERPA) privacy regulations, and other countries have similar laws.
In almost all cases, taking the proper precautions can minimize risk. Rackspace Server Backup, for instance, offers robust encryption; numerous other businesses offer Amazon-cloud-based services for medical records, which must conform to strict HIPAA regulation.
Data ownership is an important matter in the library cloud. What rights do the library and the vendor have to cloud-based data? Can the vendor “mine” patron data? If the library chooses to leave a cloud-based service, what data, in what format, will be returned? What happens if a vendor goes out of business or if a library doesn’t pay its bills on time? Will the library be cut off from its data? These are important questions to ask before migrating to the cloud, and the answers should be spelled out in a service-level agreement (SLA). Terms covering uptime, response time, system speed, maintenance schedules, and support should also be included. Librarians will likely have an easier time negotiating an SLA with a library vendor than with an Internet giant like Amazon or Google. But even if they cannot negotiate SLA terms with a cloud provider, librarians should make sure they understand the terms, and can live with them, before signing.
Two cloud cases
Not everyone using a cloud-based service has had a positive experience. When the University of Washington Health Services (UWHS) Library was looking for a cloud-based web-conferencing solution to offer library instruction and online reference, it thought it had found one: a free solution, DimDim, apparently had everything necessary. After the library spent considerable time and effort implementing DimDim successfully, the product was sold to another company, and service was discontinued. UWHS had only three months’ notice to save its files and find another solution. Although the library was able to use Adobe Connect to continue the program, library systems head Ann Whitney Gleason said last year that they “will probably not use a free cloud application for a critical project again.” (Gleason’s full account is available in Getting Started with Cloud Computing: A LITA Guide.)
On the other hand, Wake Forest University’s Z. Smith Reynolds Library, Winston-Salem, NC, has had great success moving to the cloud. It has migrated key IT services to cloud-based environments. Besides moving its ILS to the cloud (hosted by its vendor), it has migrated vital services using Amazon’s Elastic Compute Cloud (EC2) service. Cloud-based services include its website and VuFind discovery layer. Wake Forest’s assistant director for tech services, Erik Mitchell, reported favorably in a March 2010 Code4Lib Journal article (ow.ly/8OzxR) on EC2’s quality of service and overall found moving to the cloud “had similar costs but offered operational benefits.” Among those benefits were the minimization of hardware-related downtimes and the ability to implement new technology-based library services more quickly.
What’s in the Cloud?
There are a variety of cloud-based services in the library world. The most obvious is cloud-based access to a library’s book and AV collections through the online catalog (OPAC) that is part of the library’s integrated library system (ILS). OPACs can be overlaid with cloud-based front ends or recommender systems to make them more user-friendly. Bibliocommons is an example of a cloud-based front end for public libraries that works in tandem with a variety of ILSs. Bibliocommons and competing offerings from other vendors not only replace the search and discovery functionality of the OPAC but can also replace some patron account-related tasks, such as placing holds, paying fines, and updating user profiles. Some also provide a discovery experience based on community-contributed content, such as user-generated tags and reviews. Discovery layers like Serials Solutions’ Summon, EBSCO’s EDS, Ex Libris’s Primo Central, and others are meant to access all of a library’s data silos, not just resources cataloged in the ILS. Such discovery layers can provide access to special collections in the institutional repository and to products hosted outside of the library. For example, scans of public domain books in the HathiTrust digital repository can be found via the discovery layer of its academic library partners.
If a library wants more than the discovery layer in the cloud, library technology vendors including Innovative Interfaces and OCLC have either implemented or are in the process of launching completely new ILSs in the cloud, and open source providers, such as ByWater Solutions, can deliver cloud-based hosting services for the Koha ILS.
Electronic resources can also be made available through extramural repositories. Google Scholar incorporates metadata from journal indexes, article repositories, and other sources to offer web-scale discovery of scholarship that patrons can then retrieve through their library of choice. OverDrive, the most popular library ebook vendor, works with public and academic libraries and is making strides to integrate seamlessly with online library systems. 3M’s new ebook service, unveiled last year, permits users to sync their reading on multiple devices but still requires an initial download of the entire ebook file (that is, patrons don’t read a streaming book but a downloaded one).
Citation management software in the cloud can double as a platform for sharing content, forming communities around research topics, and recommending resources. Mendeley, for example, offers citation management through a web browser, though, to be fully functional, users need to download a client to their local computer.
New services may offer innovative approaches to managing scholarly communication electronically. Third Iron will let academic researchers browse and save new journal content through a service called BrowZine, available for the iPad.
Other services and products may also be of interest. Mobile phone apps can add value to cloud-based library data. OCLC’s WorldCat mobile site aims to direct patrons to the closest library owning a certain book by mashing up data from WorldCat holdings, library locations, and user locations. StackMap shelf-mapping software is a new service that allows libraries to show users a map of a book’s physical location in the library based on a prerecorded call number range. Unlike radio-frequency identification (RFID) chips, which potentially allow for real-time search of a book via location tracking, this service is less dynamic but nonetheless useful.
Lastly, backups to the cloud can protect all kinds of library data, from repository contents to blog posts, against loss owing to fire, flood, local power blackouts, or other natural or computer-related disasters that could cause data to disappear. Amazon’s Elastic Compute Cloud (EC2) platform is a cloud service in the technologist’s sense of the term: it is scalable, metered, dynamic, and completely hosted by the bookselling giant. Libraries with technology staff can use EC2 to implement virtual servers.
Edward M. Corrado is Director of Library Technology at Binghamton University Libraries, NY, and Heather Lea Moulaison is Assistant Professor at the University of Missouri’s School of Information Science and Learning Technologies, Columbia. They are coeditors of Getting Started with Cloud Computing: A LITA Guide (2011).