Monday, April 14, 2014

The Bleeding Heart of Computer Science

Who is to blame for the Heartbleed bug? Perhaps, it does not matter. Just fix it, and move on. Until the next bug, and the next, and the next.

The Heartbleed bug is different from other Internet scares. It is a vulnerability at the core of the Internet infrastructure, a layer that provides the foundation for secure communication, and it went undetected for years. It should be a wake-up call. Instead, the problem will be patched. Some government and industry flacks will declare the crisis over. We will move on and forget about it.

There is no easy solution. No shortcut. We must redevelop our information infrastructure from the ground up. Everything. Funding and implementing such an ambitious plan may become feasible only after a major disaster strikes that leaves no other alternative. But even if a complete redesign were to become a debatable option, it is not at all clear that we are up to the task.

The Internet is a concurrent and asynchronous system. A concurrent system consists of many independent components like computers and network switches. An asynchronous system operates without a central clock. In synchronous systems, like single processors, a clock provides the heartbeat that tells every component when state changes occur. In asynchronous systems, components are interrupt driven. They react to outside events, messages, and signals as they happen. The thing to know about concurrent asynchronous systems is this: It is impossible to debug them. It is impossible to isolate components from one another for testing purposes. The cost of testing quickly becomes prohibitive for each successively smaller marginal reduction in the probability of bugs. Unfortunately, when a system consists of billions of components, even extremely low-probability events are a daily occurrence. These unavoidable fundamental problems are exacerbated by continual system changes in hardware and software and by bad actors seeking to introduce and/or exploit vulnerabilities.

When debugging is not feasible, mathematical rigor is required. Current software-development environments are all about pragmatism, not rigor. Programming infrastructure is built to make programming easy, not rigorous. Most programmers develop their programs in a virtual environment and have no idea how their programs really function. Today's computer-science success stories are high-school geniuses who develop multimillion-dollar apps and college dropouts who start multibillion-dollar businesses. These are built on fast prototypes and viral marketing, not mathematical rigor. Who in their right mind would study computer science from people who made a career writing research proposals that never led to anything worth leaving a paltry academic job for?

Rigor in programming is the domain of Edsger W. Dijkstra, the most (in)famous, admired, and ignored computer-science eccentric. In 1996, he laid out his vision of Very Large Scale Application of Logic as the basis for the next fifty years of computer science. Although the examples are dated, his criticism of the software industry still rings true:
Firstly, simplicity and elegance are unpopular because they require hard work and discipline to achieve and education to be appreciated. Secondly we observe massive investments in efforts that are heading in the opposite direction. I am thinking about so-called design aids such as circuit simulators, protocol verifiers, algorithm animators, graphical aids for the hardware designers, and elaborate systems for version control: by their suggestion of power, they rather invite than discourage complexity. You cannot expect the hordes of people that have devoted a major part of their professional lives to such efforts to react kindly to the suggestion that most of these efforts have been misguided, and we can hardly expect a more sympathetic ear from the granting agencies that have funded these efforts: too many people have been involved and we know from past experience that what has been sufficiently expensive is automatically declared to have been a great success. Thirdly, the vision that automatic computing should not be such a mess is obscured, over and over again, by the advent of a monstrum that is subsequently forced upon the computing community as a de facto standard (COBOL, FORTRAN, ADA, C++, software for desktop publishing, you name it).
[The next fifty years, Edsger W. Dijkstra, circulated privately, 1996,
Document 1243a of the E. W. Dijkstra Archive,
https://www.cs.utexas.edu/users/EWD/ewd12xx/EWD1243a.PDF,
or, for fun, a version formatted in the Dijkstra handwriting font]

The last twenty years were not kind to Dijkstra's vision. The hordes turned into horsemen of the apocalypse that trampled, gored, and burned any vision of rigor in software. For all of us, system crashes, application malfunctions, and software updates are daily occurrences. It is built into our expectations.

In today's computer science, the uncompromising radicals that prioritize rigor do not stand a chance. Today's computer science is the domain of genial consensus builders, merchants of mediocrity that promise everything to everyone. Computer science has become a social construct that evolves according to political rules.

A bottom-up redesign of our information infrastructure, if it ever becomes debatable, would be defeated before it even began. Those who could accomplish a meaningful redesign would never be given the necessary authority and freedom. Instead, the process would be taken over by political and business forces, resulting in an effective status quo.

In 1996, Dijkstra believed this:
In the next fifty years, Mathematics will emerge as The Art and Science of Effective Formal Reasoning, and we shall derive our intellectual excitement from learning How to Let the Symbols Do the Work.
There is no doubt that he would still cling to this goal, but even Dijkstra may have started to doubt his fifty-year timeline.

Monday, March 31, 2014

Creative Problems

The open-access requirement for Electronic Theses and Dissertations (ETDs) should be a no-brainer. At virtually every university in the world, there is a centuries-old public component to the doctoral-degree requirement. With digital technology, that public component is implemented more efficiently and effectively. Yet, a small number of faculty fight the idea of Open Access for ETDs. The latest salvo came from Jennifer Sinor, an associate professor of English at Utah State University.
[One Size Doesn't Fit All, Jennifer Sinor, The Chronicle of Higher Education, March 24, 2014]

According to Sinor, Creative Writing departments are different and should be exempted from open-access requirements. She illustrates her objection to Open Access ETDs with an example of a student who submitted a novel as his master's thesis. He was shocked when he found out his work was being sold online by a third party. Furthermore, according to Sinor, the mere existence of the open-access thesis makes it impossible for that student to pursue a conventional publishing deal.

Sinor offers a solution to these problems, which she calls a middle path: Theses should continue to be printed, stored in libraries, accessible through interlibrary loan, and never digitized without the author's approval. Does anyone really think it is a common-sense middle path of moderation and reasonableness to pretend that the digital revolution never happened?

Our response could be brief. We could just observe that it does not matter whether or not Sinor's Luddite approach is tenable, and it does not matter whether or not her arguments hold water. Society will not stop changing because a small group of people pretend reality does not apply to them. Reality will, eventually, take over. Nevertheless, let us examine her arguments.

Multiyear embargoes are a routine part of Open Access policies for ETDs. I do not know of a single exception. After a web search that took less than a minute, I found the ETD policy of Sinor's own institution. The second and third sentences of USU's ETD policy read as follows [ETD Forms and Policy, DigitalCommons@usu.edu]:
“However, USU recognizes that in some rare situations, release of a dissertation/thesis may need to be delayed. For these situations, USU provides the option of embargoing (i.e. delaying release) of a dissertation or thesis for five years after graduation, with an option to extend indefinitely.”
How much clearer can this policy be?

The student in question expressly allowed third parties to sell his work by leaving a checkbox unchecked in a web form. Sinor excuses the student for his naïveté. However, anyone who hopes to make a living from creative writing in a web-connected world should have advanced knowledge of the business of selling one's works, of copyright law, and of publishing agreements. Does Sinor imply that a master's-level student in her department never had any exposure to these issues? If so, that is an inexcusable oversight in the department's curriculum.

This leads us to Sinor's final argument: that conventional publishers will not consider works that are also available as Open Access ETDs. This has been thoroughly studied and debunked. See:
"Do Open Access Electronic Theses and Dissertations Diminish Publishing Opportunities in the Social Sciences and Humanities?" Marisa L. Ramirez, Joan T. Dalton, Gail McMillan, Max Read, and Nan Seamans. College & Research Libraries, July 2013, 74:368-380.

This should put to rest the most pressing issues. Yet, for those who cannot shake the feeling that Open Access robs students of an opportunity to monetize their work, there is another way out of the quandary. It is within the power of any Creative Writing department to solve the issue once and for all.

All university departments have two distinct missions: to teach a craft and to advance scholarship in their discipline. As a rule of thumb, the teaching of craft dominates up to the masters-degree level. The advancement of scholarship, which goes beyond accepted craft and into the new and experimental, takes over at the doctoral level.

When submitting a novel (or a play, a script, or a collection of poetry) as a thesis, the student exhibits his or her mastery of craft. This is appropriate for a master's thesis. However, when Creative Writing departments accept novels as doctoral theses, they put craft ahead of scholarship. It is difficult to see how any novel by itself advances the scholarship of Creative Writing.

The writer of an experimental masterpiece should have some original insights into his or her craft. Isn't it the role of universities to reward those insights? Wouldn't it make sense to award the PhD, not based on a writing sample, but based on a companion work that advances the scholarship of Creative Writing? Such a thesis would fit naturally within the open-access ecosystem of other scholarly disciplines without compromising the work itself in any way.

This is analogous to any number of scientific disciplines, where students develop equipment or software or a new chemical compound. The thesis is a description of the work and the ideas behind it. After a reasonable embargo to allow for patent applications, any such thesis may be made Open Access without compromising the commercial value of the work at the heart of the research.

A policy that is successful for most may fail for some. Some disciplines may be so fundamentally different that they need special processes. Yet, Open Access is merely the logical extension of long-held traditional academic values. If this small step presents such a big problem for one department and not for others, it may be time to re-examine existing practices at that department. Perhaps, the Open Access challenge is an opportunity to change for the better.

Monday, March 17, 2014

Textbook Economics

The impact of royalties on a book's price, and its sales, is greater than you think. Lower royalties often end up better for the author. That was the publisher's pitch when I asked him about the details of the proposed publishing contract. Then, he explained how he prices textbooks.

It was the early 1990s, I had been teaching a course on Concurrent Scientific Computing, a hot topic then, and several publishers had approached me about writing a textbook. This was an opportunity to structure a pile of course notes. Eventually, I would sign on with a different publisher, a choice that had nothing to do with royalties or book prices. [Concurrent Scientific Computing, Van de Velde E., Springer-Verlag New York, Inc., New York, NY, 1994.]

He explained that a royalty of 10% increases the price by more than 10%. To be mathematical about it: With a royalty rate r, a target revenue per book C, and a retail price P, we have that C = P-rP (retail price minus royalties). Therefore, P = C/(1-r). With a target revenue per book of $100, royalties of 10%, 15%, and 20% lead to retail prices of $111.11, $117.65, and $125.00, respectively.

In a moment of candor, he also revealed something far more interesting: how he sets the target revenue C. Say the first printing of 5000 copies requires an up-front investment of $100,000. (All numbers are for illustrative purposes only.) This includes the cost of editing, copy-editing, formatting, cover design, printing, binding, and administrative overhead. Estimating library sales at 1000 copies, this publisher would set C at $100,000/1,000 = $100. In other words, he recovered his up-front investment from libraries. Retail sales were pure profit.
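The two formulas above can be combined into a short sketch. This is purely illustrative, using the hypothetical numbers from the text (a $100,000 first printing, 1,000 library sales), not real publisher data:

```python
# Sketch of the pricing logic described above.
# All figures are the illustrative ones from the text, not real publisher data.

def target_revenue(upfront_investment: float, library_sales: int) -> float:
    """Revenue per book C needed to recover the investment from library sales alone."""
    return upfront_investment / library_sales

def retail_price(revenue_per_book: float, royalty_rate: float) -> float:
    """Retail price P such that P - r*P equals the target revenue C, i.e. P = C/(1-r)."""
    return revenue_per_book / (1 - royalty_rate)

C = target_revenue(100_000, 1_000)  # $100 per book, recovered from libraries
for r in (0.10, 0.15, 0.20):
    print(f"royalty {r:.0%}: retail price ${retail_price(C, r):.2f}")
```

Running this reproduces the prices in the text: $111.11, $117.65, and $125.00. The point of the sketch is how sensitive the retail price is to the royalty rate once the library break-even is fixed.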

The details are, no doubt, more complicated. Yet, even without relying on a recollection of an old conversation, it is safe to assume that publishers use the captive library market to reduce their business risk. In spite of increasingly recurrent crises, library budgets remain fairly predictable, both in size and in how the money is spent. Any major publisher has reliable advance estimates of library sales for any given book, particularly if published as part of a well-known series. It is just good business to exploit that predictability.

The market should be vastly different now, but textbooks have remained stuck in the paper era longer than other publications. Moreover, the first stage of the move towards digital, predictably, consists of replicating the paper world. This is what all constituents want: Librarians want to keep lending books. Researchers and students like getting free access to quality books. Textbook publishers do not want to lose the risk-reducing revenue stream from libraries. As a result, everyone implements the status quo in digital form. Publishers produce digital books and rent their collections to libraries through site licenses. Libraries intermediate electronic-lending transactions. Users get the paper experience in digital form. Universities pay for site licenses and the maintenance of the digital-lending platforms.

After the disaster of site licenses for scholarly journals, repeating the same mistake with books seems silly. Once again, take-it-or-leave-it bundles force institutions into a false choice between buying too much for everyone or nothing at all. Once again, site licenses eliminate the unlimited flexibility of digital information. Forget about putting together a personal collection tailored to your own requirements. Forget about pricing per series, per book, per chapter, unlimited in time, one-day access, one-hour access, readable on any device, or tied to a particular device. All of these options are eliminated to maintain the business models and the intermediaries of the paper era.

Just by buying/renting books as soon as they are published, libraries indirectly pay for a significant fraction of the initial investment of producing textbooks. If libraries made that initial investment explicitly and directly, they could produce those same books and set them free. Instead of renting digital books (and their multimedia successors), libraries could fund authors to write books and contract with publishers to publish those manuscripts as open-access works. Authors would be compensated. Publishers would compete for library funds as service providers. Publishers would be free to pursue the conventional pay-for-access publishing model, just not with library dollars. Prospective authors would have a choice: compete for library funding to produce an open-access work or compete for a publishing contract to produce a pay-for-access work.

The Carnegie model of libraries fused together two distinct objectives: subsidize information and disseminate information by distributing books to many different locations. In web-connected communities, spending precious resources on dissemination is a waste. Inserting libraries in digital-lending transactions only makes those transactions more inconvenient. Moreover, it requires expensive-to-develop-and-maintain technology. By reallocating these resources towards subsidizing information, libraries could set information free without spending part of their budget on reducing publishers' business risk. The fundamental budget questions that remain are: Which information should be subsidized? What is the most effective way to subsidize information?

Libraries need not suddenly stop site licensing books tomorrow. In fact, they should take a gradual approach, test the concept, make mistakes, and learn from them. A library does not become a grant sponsor and/or publisher overnight. Several models are already available: from grant competition to crowd-funded ungluing. [Unglue.it for Libraries] By phasing out site licenses, any library can create budgetary space for sponsoring open-access works.

Libraries have a digital future with almost unlimited opportunities. Yet, they will miss out if they just rebuild themselves as a digital copy of the paper era.

Monday, January 20, 2014

A Cloud over the Internet

Cloud computing could not have existed without the Internet, but it may make Internet history by making the Internet history.

Organizations are rushing to move their data centers to the cloud. Individuals have been using cloud-based services, like social networks, cloud gaming, Google Apps, Netflix, and Aereo. Recently, Amazon introduced WorkSpaces, a comprehensive personal cloud-computing service. The immediate benefits and opportunities that fuel the growth of the cloud are well known. The long-term consequences of cloud computing are less obvious, but a little extrapolation may help us make some educated guesses.

Personal cloud computing takes us back to the days of remote logins with dumb terminals and modems. Like the one-time office computer, the cloud computer does almost all of the work. Like the dumb terminal, a not-so-dumb access device (anything from the latest wearable gadget to a desktop) handles input/output. Input evolved beyond keystrokes and now also includes touch-screen gestures, voice, image, and video. Output evolved from green-on-black characters to multimedia.

When accessing a web page with content from several contributors (advertisers, for example), the page load time depends on several factors: the performance of computers that contribute web-page components, the speed of the Internet connections that transmit these components, and the performance of the computer that assembles and formats the web page for display. By connecting to the Internet through a cloud computer, we bypass the performance limitations of our access device. All bandwidth-hungry communication occurs in the cloud on ultra-fast networks, and almost all computation occurs on a high-performance cloud computer. The access device and its Internet connection just need to be fast enough to process the information streams into and out of the cloud. Beyond that, the performance of the access device hardly matters.

Because of economies of scale, the cloud-enabled net is likely to be a highly centralized system dominated by a small number of extremely large providers of computing and networking. This extreme concentration of infrastructure stands in stark contrast to the original Internet concept, which was designed as a redundant, scalable, and distributed system without a central authority or a single point of failure.

When a cloud provider fails, it disrupts its own customers, and the disruption immediately propagates to the customers' clients. Every large provider is, therefore, a systemic vulnerability with the potential of taking down a large fraction of the world's networked services. Of course, cloud providers are building infrastructure of extremely high reliability with redundant facilities spread around the globe to protect against regional disasters. Unfortunately, facilities of the same provider all have identical vulnerabilities, as they use identical technology and share identical management practices. This is a setup for black-swan events, low-probability large-scale catastrophes.

The Internet is overseen and maintained by a complex international set of authorities. [Wikipedia: Internet Governance] That oversight loses much of its influence when most communication occurs within the cloud. Cloud providers will be tempted to deploy more efficient custom communication technology within their own facilities. After all, standard Internet protocols were designed for heterogeneous networks. Much of that design is not necessary on a network where one entity manages all computing and all communication. Similarly, any two providers may negotiate proprietary communication channels between their facilities. Step by step, the original Internet will be relegated to the edges of the cloud, where access devices connect with cloud computers.

Net neutrality is already on life support. When cloud providers compete on price and performance, they are likely to segment the market. Premium cloud providers are likely to attract high-end services and their customers, relegating the rest to second-tier low-cost providers. Beyond net neutrality, there may be a host of other legal implications when communication moves from public channels to private networks.

When traffic moves to the cloud, telecommunication companies will gradually lose the high-margin retail market of providing organizations and individuals with high-bandwidth point-to-point communication. They will not derive any revenue from traffic between computers within the same cloud facility. The revenue from traffic between cloud facilities will be determined by a wholesale market with customers that have the resources to build and/or acquire their own communication capacity.

The existing telecommunication infrastructure will mostly serve to connect access devices to the cloud over relatively low-bandwidth channels. When TV channels are delivered to the cloud (regardless of technology), users select their channel on the cloud computer. They do not need all channels delivered to the home at all times; one TV channel at a time per device will do. When phones are cloud-enabled, a cloud computer intermediates all communication and provides the functional core of the phone.

Telecommunication companies may still come out ahead as long as the number of access devices keeps growing. Yet, they should at least question whether it would be more profitable to invest in cloud computing instead of ever higher bandwidth to the consumer.

The cloud will continue to grow as long as its unlimited processing power, storage capacity, and communication bandwidth provide new opportunities at irresistible price points. If history is any guide, long-term and low-probability problems at the macro level are unlikely to limit its growth. Even if our extrapolated scenario never completely materializes, the cloud will do much more than increase efficiency and/or lower cost. It will change the fundamental character of the Internet.

Wednesday, January 1, 2014

Market Capitalism and Open Access

Is it feasible to create a self-regulating market for Open Access (OA) journals where competition for money is aligned with the quest for scholarly excellence?

Many proponents of the subscription model argue that a competitive market provides the best assurance for quality. This ignores that the relationship between a strong subscription base and scholarly excellence is tenuous at best. What if we created a market that rewards journals when a university makes its most tangible commitment to scholarly excellence?

While the role of journals in actual scholarly communication has diminished, their role in academic career advancement remains as strong as ever. [Paul Krugman: The Facebooking of Economics] The scholarly-journal infrastructure streamlines the screening, comparing, and short-listing of candidates. It enables the gathering of quantitative evidence in support of the hiring decision. Without journals, the work load of search committees would skyrocket. If scholarly journals are the headhunters of the academic-job market, let us compensate them as such.

There are many ways to structure such compensation, but we only need one example to clarify the concept. Consider the following scenario:

  • The new hire submitted a bibliography of 100 papers.
  • The search committee selected 10 of those papers to argue the case in favor of the appointment. This subset consists of 6 papers in subscription journals, 3 papers in the OA journal Theoretical Approaches to Theory (TAT), and 1 paper in the OA journal Practical Applications of Practice (PAP).
  • The university's journal budget is 1% of its budget for faculty salaries. (In reality, that percentage would be much lower.)

Divide the new faculty member's share of the journal budget, 1% of his or her salary, into three portions:

  • (6/10) x 1% = 0.6% of salary to subscription journals,
  • (3/10) x 1% = 0.3% of salary to the journal TAT, and
  • (1/10) x 1% = 0.1% of salary to the journal PAP.

The first portion (0.6%) remains in the journal budget to pay for subscriptions. The second (0.3%) and third (0.1%) portion are, respectively, awarded yearly to the OA journals TAT and PAP. The university adjusts the reward formula every time a promotion committee determines a new list of best papers.
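The arithmetic of the split generalizes to any mix of venues. A minimal sketch, using the hypothetical journal names and counts from the scenario above:

```python
# Splitting a new hire's journal-budget share across the venues of the
# short-listed papers, proportional to paper counts.
# Journal names and all figures are the hypothetical ones from the scenario.

def reward_shares(selected_papers: dict[str, int],
                  budget_fraction: float = 0.01) -> dict[str, float]:
    """Map each venue to its fraction of the new hire's salary."""
    total = sum(selected_papers.values())
    return {venue: budget_fraction * count / total
            for venue, count in selected_papers.items()}

shares = reward_shares({"subscription": 6, "TAT": 3, "PAP": 1})
for venue, fraction in shares.items():
    print(f"{venue}: {fraction:.1%} of salary")
```

This prints 0.6% for subscription journals, 0.3% for TAT, and 0.1% for PAP, matching the list above. The same function handles any number of venues and any negotiated base percentage.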

To move beyond a voluntary system, universities should give headhunting rewards only to those journals with whom they have a contractual relationship. Some Gold OA journals are already pursuing institutional-membership deals that eliminate or reduce article processing charges (APCs). [BioMed Central] [PeerJ] [SpringerOpen] Such memberships are a form of discounting for quantity. Instead, we propose a pay-for-performance contract that eliminates APCs in exchange for headhunting rewards. Before signing such a contract, a university would conduct a due-diligence investigation into the journal. It would assess the publisher's reputation, the journal's editorial board, its refereeing, editing, formatting, and archiving standards, its OA licensing practices, and its level of participation in various abstracting-and-indexing and content-mining services. This step would all but eliminate predatory journals.

Every headhunting reward would enhance the prestige (and the bottom line) of a journal. A reward citing a paper would be a significant recognition of that paper. Such citations might be even more valuable than citations in other papers, thereby creating a strong incentive for institutions to participate in the headhunting system. Nonparticipating institutions would miss out on publicly recognizing the work of their faculty, and their faculty would have to pay APCs. There is no Open Access free ride.

Headhunting rewards create little to no extra work for search committees. Academic libraries are more than capable of performing due diligence, negotiating the contracts, and administering the rewards. Our scenario assumed a base percentage of 1%. The actual percentage would be negotiated between universities and publishers. With rewards proportional to salaries, there is a built-in adjustment for inflation, for financial differences between institutions and countries, and for differences in the sizes of various scholarly disciplines.

Scholars retain the right to publish in the venue of their choice. The business models of journals are used when distributing rewards, but this occurs well after the search process has concluded. The headhunting rewards gradually reduce the subscription budget in proportion to the number of papers published in OA journals by the university's faculty. A scholar who wishes to support a brand-new journal should not pay APCs, but lobby his or her university to negotiate a performance-based headhunting contract.

The essence of this proposal is the performance-based contract that exchanges APCs for headhunting rewards. All other details are up for discussion. Every university would be free to develop its own specific performance criteria and reward structures. Over time, we would probably want to converge towards a standard contract.

Headhunting contracts create a competitive market for OA journals. In this market, the distributed and collective wisdom of search/promotion committees defines scholarly excellence and provides the monetary rewards to journals. As a side benefit, this free-market system creates a professionally managed open infrastructure for the scholarly archive.

Monday, December 16, 2013

Beall's Rant

Jeffrey Beall of Beall's list of predatory scholarly publishers recently made some strident arguments against Open Access (OA) in the journal tripleC (ironically, an OA journal). Beall's comments are part of a non-refereed section dedicated to a discussion on OA.

Michael Eisen takes down Beall's opinion piece paragraph by paragraph. Stevan Harnad responds to the highlights/lowlights. Roy Tennant has a short piece on Beall in The Digital Shift.

Beall takes a distinctly political approach in his attack on OA:
“The OA movement is an anti-corporatist movement that wants to deny the freedom of the press to companies it disagrees with.”
“It is an anti-corporatist, oppressive and negative movement, [...]”
“[...] a neo-colonial attempt to cast scholarly communication policy according to the aspirations of a cliquish minority of European collectivists.”
“[...] mandates set and enforced by an onerous cadre of Soros-funded European autocrats.”
This is the rhetorical style of American extremist right-wing politics that casts every problem as a false choice between freedom and – take your pick – communism or totalitarianism or colonialism or slavery or... European collectivists like George Soros (who became a billionaire by being a free-market capitalist).

For those of us more comfortable with technocratic arguments, politics is not particularly welcome. Yet, we cannot avoid the fact that the OA movement is trying to reform a large socio-economic system. It would be naïve to think that that can be done without political ideology playing a role. But is it really too much to ask to avoid the lowest level of political debate, politics by name-calling?

The system of subscription journals has an internal free-market logic to it that no proposed or existing OA system has been able to replace. In a perfect world, the subscription system uses an economic market to assess the quality of editorial boards and the level of interest in a particular field. Economic viability acts as a referee of sorts, a market-based minimum standard. Some editorial boards deserve the axe for doing poor work. Some fields of study deserve to go out of business for lack of interest. New editorial boards and new fields of study deserve an opportunity to compete. Most of us prefer that these decisions are made by the collective and distributed wisdom of free-market mechanisms.

Unfortunately, the current scholarly-communication marketplace is far from a free market. Journals hardly compete directly with one another. Site licenses perpetuate a paper-era business model that forces universities to buy all content for 100% of the campus community, even those journals that are relevant only to a sliver of the community. Site licenses limit competition between journals, because end users never get to make the price/value trade-offs critical to a functional free market. The Big Deal exacerbates the problem. Far from providing a service, as Beall contends, the Big Deal gives big publishers a platform to launch new journals without competition. Consortial deals are not discounts; they introduce peer networks to make it more difficult to cancel existing subscriptions. [What if Libraries were the Problem?] [Libraries: Paper Tigers in a Digital World]

If Beall believes in the free market, he should support competition from new methods of dissemination, alternative assessment techniques, and new journal business models. Instead, he seems to be motivated more by a desire to hold onto his disrupted job description:
“Now the realm of scholarly communication is being removed from libraries, and a crisis has settled in. Money flows from authors to publishers rather than from libraries to publishers. We've disintermediated libraries and now find that scholarly system isn't working very well.”
In fact, it is the site-license model that reduced the academic library to the easy-to-disintermediate dead-end role of subscription manager. [Where the Puck won't Be] Most librarians are apprehensive about the changes taking place, but they also realize that they must re-interpret traditional library values in light of new technology to ensure long-term survival of their institution.

Thus far, scholarly publishing has been the only type of publishing not disrupted by the Internet. In his seminal work on disruption [The Innovator's Dilemma], Clayton Christensen characterizes the defenders of the status quo in disrupted industries. Like Beall, they are blinded by traditional quality measures, dismiss and/or denigrate innovations, and retreat into a defense of the status quo.

Students, researchers, and the general public deserve a high-quality scholarly-communication system that satisfies basic minimum technological requirements of the 21st century. [Peter Murray-Rust, Why does scholarly publishing give me so much technical grief?] In the last 20 years of the modern Internet, we have witnessed innovation after innovation. Yet, scholarly publishing is still tied to the paper-imitating PDF format and to paper-era business models.

Open Access may not be the only answer [Open Access Doubts], but it may very well be the opportunity that this crisis has to offer. [Annealing the Library] In American political terms, Green Open Access is a public option. It provides free access to author-formatted versions of papers. Thereby, it serves the general public and the scholarly poor. It also serves researchers by providing a platform for experimentation without having to go through onerous access negotiations (for text mining, for example). It also serves as an additional disruptive trigger for free-market reform of the scholarly market. Gold Open Access in all its forms (from PLOS to PeerJ) is a set of business models that deserves a chance to compete on price and quality.

The choice is not between one free-market option and a plot of European collectivists. The real choice is whether to protect a functionally inadequate system or whether to foster an environment of innovation.

Monday, December 2, 2013

Amazon Floods the Information Commons

Amazon is bringing cloud computing to the masses. Any individual with access to a browser now has access to almost unlimited computing power and storage. This may be the moment that marks the official beginning of the end of the desktop computer, which was already on a downward slide because of the rise of notebooks, netbooks, tablets, and smartphones.

For managers of computer labs, this technology eliminates a slew of nitty-gritty management problems without good solutions. When a shared computer is idle, do you take action after 5, 10, or 15 minutes? If you wait too long, you annoy users who are waiting for their turn, and you invite unauthorized users to sneak into someone else's session. If you act too soon, you ruin the experience for the current user. Should you immediately log off an idle user, or do you lock the screen for a while before logging off? Again, you balance the interests of the current user against those of the next user. Which software do you install where? Installing all software on every computer is usually too expensive. But if each computer in the lab has its own configuration, how do you communicate those differences to the users? The ultimate challenge of the shared computer is how to let students install software that they themselves are developing while keeping the computer relatively secure, usable to others, and free from pirated software.

Amazon has solved all of this and more. With cloud-based computers, there is no such thing as an idle computer, only idle screens. Shutting down a screen and turning it over to another user does not ruin a session in progress. It is more like turning over a printer. The cloud-based personal computer is configured for one user according to his or her requirements. Students and faculty can install whatever software they need, including their own research software. As to the usual suite of standard applications, cloud services like Adobe Creative Cloud, Google Apps, and Windows Azure have eliminated software installation and maintenance entirely.

The potential of cloud computing in the Information Commons is more than substituting one technology for another. Students and faculty suddenly have their own custom computing laboratory with an unlimited number of computers over which they have complete control. One can imagine projects in which cloud-based computers harvest measurements from sensors across the globe (weather-related, for example), read and analyze the news, and data mine social networks. All of this data can then be fed to high-performance servers running research software for analysis and visualization.
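The shape of such a project can be sketched in a few lines. This is only an illustration: the weather stations, readings, and worker pool below are all simulated stand-ins for real cloud instances, and an actual deployment would use a cloud provider's SDK to launch harvesters rather than local threads.

```python
# Hypothetical sketch: fan out "harvester" workers (stand-ins for
# on-demand cloud instances) that each collect a sensor reading, then
# feed the combined data to a single analysis step. All data is simulated.
from concurrent.futures import ThreadPoolExecutor
import random

def harvest(station_id):
    """Pretend to poll one weather station; returns (id, temperature in C)."""
    rng = random.Random(station_id)          # deterministic for the sketch
    return station_id, round(rng.uniform(-30.0, 45.0), 1)

def analyze(readings):
    """A minimal 'analysis server': compute the mean temperature."""
    temps = [t for _, t in readings]
    return sum(temps) / len(temps)

# Spin up workers for all stations -- the cloud analogue of adding
# and dropping computers as needed.
stations = range(100)
with ThreadPoolExecutor(max_workers=20) as pool:
    readings = list(pool.map(harvest, stations))

mean_temp = analyze(readings)
print(f"{len(readings)} stations harvested, mean temperature {mean_temp:.1f} C")
```

The point of the pattern, not the code, is what matters: the harvesting tier scales out elastically with the number of sensors, while the analysis tier stays a single, more powerful machine.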

Currently, retail pricing for a cloud-based personal computer starts at $35 per month. This is already a very good price point, considering that it eliminates the hardware replacement cycle, software maintenance, security issues, etc. One can also add and drop computers as needed. Moreover, this is a price point established before competitors have even entered the market. 

When computing and storage become relatively inexpensive on-demand commodity services, computing labs are no longer in the business of sharing computing devices, storage, and software; they are in the business of sharing visualization devices. Currently, Information Commons provide large-screen high-resolution monitors attached to a computer. As large-scale, high-performance, big-data projects grow in popularity across many disciplines, there will be increasing demand for more advanced equipment to visualize and render the results. Today's computing labs will morph into advanced visualization labs. They will provide the capacity to use multiple large high-resolution screens. They may provide access to CAVEs (CAVE Automatic Virtual Environment) and/or additive-manufacturing equipment (which includes 3-D printing). The support requirements for such equipment are radically different from those for current computer labs. CAVEs need large rooms with no windows, multiple projectors, and a sound system. Additive manufacturing may be loud and may require specialized venting systems.

For managers of Information Commons, it is not too early to start planning for this transition. They may look forward to getting rid of the nitty-gritty unsolvable problems mentioned above, but integrating these technologies into the real estate currently used for computing labs and libraries will require all of the organizational and management skills they can muster.