Tuesday, June 5, 2012

The Day After


On Sunday, the Open Access petition to the White House reached the critical number of 25,000 signatures: President Obama will take a stand on the issue. Yesterday was Open Access Monday, a time to celebrate an important milestone. Today is a time for libraries to reflect on their new role in a post-site-licensed world.

Imagine success beyond all expectations: The President endorses Open Access. There is bipartisan support in Congress. Open Access to government-sponsored research is enacted. The proposal seeks only Green Open Access: the deposit in an open repository of scholarly articles that are also conventionally published. With similar legislation being enacted world-wide, imagine all scholarly publishers deciding that the best way forward for them is to convert all journals to the Gold Open Access model. In this model, authors or their institutions pay publishing costs up front to publish scholarly articles under an open license.

Virtually overnight, universal Open Access is a reality.

9:00am

When converting to Gold Open Access, publishers replace site-license revenue with author-paid page charges. They use data from the old business model to estimate revenue-neutral page charges. The estimate is a bit rough, but as long as scholars keep publishing at the same rate and in the same journals as before, the initial revenue from page charges should be comparable to that from site licenses. Eventually, the market will settle around a price point influenced by the real costs of open-access publishing, by publishing behavior of scholars who must pay to get published, and by publishers deciding to get in or get out of the scholarly-information market.

10:00am

Universities re-allocate the libraries' site-license budgets and create accounts to pay for author page charges. Most universities assign the management of these accounts to academic departments, which are in the best position to monitor expenses charged by faculty.

11:00am

Publishers make redundant their sales teams catering to libraries. They cancel vendor exhibits at library conferences. They terminate all agreements with journal aggregators and other intermediaries between libraries and publishers.

12:00pm

Libraries eliminate electronic resource management, which includes everything involved in the acquisition and maintenance of site licenses. No more tracking of site licenses. No more OpenURL servers. No more proxy servers. No more cataloging electronic journals. No more maintaining databases of journals licensed by the library.

1:00pm

For publishers, the editorial boards and the authors they attract are more important than ever. These scholars have always created the core product from which publishers derived their revenue streams. Now, these same scholars, not intermediaries like libraries and journal aggregators, are the direct source of the revenue. Publishers expand the marketing teams that target faculty and students. They also strengthen the teams that develop editorial boards.

2:00pm

Publishers' research portals like Elsevier's Scopus start incorporating full-text scholarly output from all of their competitors.

Scholarly societies provide specialized digital libraries for every niche imaginable.

Some researchers develop research tools that data mine the open scholarly literature. They create startup ventures and commercialize these tools.

Google Scholar and Microsoft Academic Search each announce comprehensive academic search engines that have indexed the full text of the available open scholarly literature.

3:00pm

While some journal aggregators go out of business, others retool and develop researcher-oriented products.

ISI's Web of Knowledge, EBSCO, OCLC, and others create research portals catering to individual researchers. Of course, these new portals incorporate full-text papers, not just abstracts or catalog records.

Overnight, full-text scholarly search turns into a competitive market. Developing viable business models proves difficult, because juggernauts Google and Microsoft are able to provide excellent search services for free. Strategic alliances are formed.

4:00pm

No longer tied to their institutions' libraries by site licenses, researchers use whichever is the best research portal for each particular purpose. Web sites of academic libraries experience a steep drop-off in usage. The number of interlibrary loan requests tumbles: only requests for nondigital archival works remain.

5:00pm

Libraries lose funding for those institutional repositories that duplicate scholarly research available through Gold Open Access. Faculty are no longer interested in contributing to these repositories, and university administrators do not want to pay for this duplication.

Moral

By just about any measure, this outcome would be far superior to the current state of scholarly publishing. Scholars, researchers, professionals in any discipline, students, businesses, and the general population would benefit from access to original scholarship unfettered by pay walls. The economic benefit of commercializing research faster would be immense. Tuition increases may not be as steep because of savings in the library budget.

If librarians fear a steadily diminishing role for academic libraries (and they should), they must make a compelling value proposition for the post-site-licensed world now. The only choice available is to be disruptive or to be disrupted. The no-disruption option is not available. Libraries can learn from Harvard Business School Professor Clayton M. Christensen, who has analyzed scores of disrupted industries. They can learn from the edX project or Udacity, major initiatives of large-scale online teaching. These projects are designed to disrupt the business model of the very institutions that incubated them. But if they succeed, they will be the disrupting force. Those on the sidelines will be the disrupted victims.

Libraries have organized or participated in Open Access discussions, meetings, negotiations, petitions, boycotts... Voluntary submission to institutional repositories has been proven insufficient. Enforced open-access mandates are a significant improvement. Yet, open-access mandates are not a destination. They are, at most, a strategy for creating change. The current scholarly communication system, even if complemented with open repositories that cover 100% of the scholarly literature, is hopelessly out of step with current technology and society.

In the words of Andy Grove, former chairman and chief executive officer of Intel: “To understand a company’s strategy, look at what they actually do rather than what they say they will do.” Ultimately, only actions that involve significant budget reallocations are truly credible. As long as pay walls are the dominant item in library budgets, libraries retain the organizational structure appropriate for a site-licensed world. As long as pay-wall management dominates the libraries' day-to-day operations, libraries hire, develop, and promote talent for a site-licensed world. This is a recipe for success for only one scenario: the status quo.

Thursday, May 10, 2012

Lowest Common Denominator


A divisor of an integer divides that integer without leaving a remainder. The divisors of 28 are 1, 2, 4, 7, 14, and 28. The divisors of 60 are 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, and 60.

A common divisor of two integers divides both without leaving a remainder. The common divisors of 28 and 60 are 1, 2, and 4.

The greatest common divisor of two integers is the common divisor that is greater than all of the other common divisors. The greatest common divisor of 28 and 60 is 4.

The concept of a least common divisor is meaningless, as it is always 1.
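
For readers who prefer to check the arithmetic, here is a minimal Python sketch of the definitions above. The helper names `divisors` and `common_divisors` are my own, chosen for illustration; only `math.gcd` comes from the standard library.

```python
from math import gcd


def divisors(n):
    """Return the positive divisors of a positive integer n, in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def common_divisors(a, b):
    """Return the positive integers that divide both a and b without remainder."""
    return [d for d in divisors(a) if b % d == 0]


print(divisors(28))             # [1, 2, 4, 7, 14, 28]
print(common_divisors(28, 60))  # [1, 2, 4]
print(gcd(28, 60))              # 4, the largest entry of the list above
```

The smallest entry of any such list is 1, which is why a least common divisor carries no information.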

A fraction, such as 5/8 or 3/10, consists of a numerator and a denominator. Any integer can be a numerator. Any non-zero integer can be a denominator.

“Lower” and “lowest” compare altitudes, not magnitudes.

Anyone using the phrase Lowest Common Denominator reduces the Greatest Common Divisor of human knowledge.

Please educate your pundits.

Friday, April 27, 2012

Annealing the Library: Follow up


Here are responses to some of the off-line reactions to the previous blog.


-

“Annealing the Library” did not contain any statements about abandoning paper books (or journals). Each library needs to assess the value of paper for its community. This value assessment is different from one library to the next and from one collection to the next.

The main point of the post is that the end of paper acquisitions should NOT be the beginning of digital licenses. E-lending is not an adequate substitute for paper-based lending. E-lending is not a long-term investment. Libraries will not remain relevant institutions by being middlemen in digital-lending operations.

I neglected to concede the point that licensing digital content could be a temporary bandaid during the transition from paper to digital.

-

In the case of academic libraries, the bandaid of site licensing scholarly journals is long past its expiration date. It is time to phase out of the system.

If the University of California and California State University jointly announced a cancellation of all site licenses over the next three to five years, the impact would be felt immediately. The combination of the UC and Cal State systems is so big that publishers would need to take immediate and drastic actions. Some closed-access publishers would convert to open access. Others would start pricing their products appropriate for the individual-subscription market. Some publishers might not survive. Start-up companies would find a market primed to accept innovative models.

Unfortunately, most universities are too small to have this kind of immediate impact. This means that some coordinated action is necessary. This is not a boycott. There are no demands to be met. It is the creation of a new market for open-access information. It is entirely up to the publishers themselves to decide how to respond. There is no need for negotiations. All it takes is the gradual cancellation of all site licenses at a critical mass of institutions.

-

Annealing the Library does not contradict an earlier blog post, in which I expressed three Open Access Doubts. (1) I expressed disappointment in the quality of existing Open Access repositories. The Annealing proposal pumps a lot of capital into Open Access, which should improve quality. (2) I doubted the long-term effectiveness of institutional repositories in bringing down the total cost of access to scholarly information. Over time, the Annealing proposal eliminates duplication between institutional repositories and the scholarly literature, and it invests heavily into Open Access. (3) I wondered whether open-access journals are sufficiently incentivized to maintain quality over the long term. This doubt remains. Predatory open-access journals without discernible quality standards are popping up right and left. This is an alarming trend to serious open-access innovators. We urgently need a mechanism to identify and eliminate underperforming open-access journals.

-

If libraries cut off subsidies to pay-walled information, some information will be out of reach. By phasing in the proposed changes gradually, temporary disruption of access to some resources will be minimal. After the new policies take full effect, they will create many new beneficiaries, open up many existing information resources, and create new open resources.


Tuesday, April 17, 2012

Annealing the Library


The path of least resistance and least trouble is a mental rut already made. It requires troublesome work to undertake the alteration of old beliefs.
John Dewey

What if a public library could fund a blogger of urban architecture to cover in detail all proceedings of the city planning department? What if it could fund a local historian to write an open-access history of the town? What if school libraries could fund teachers to develop open-access courseware? What if libraries could buy the digital rights of copyrighted works and set them free? What if the funds were available right now?

Unfortunately, by not making decisions, libraries everywhere merely continue to do what they have always done, but digitally. The switch from paper-based to digital lending is well under way. Most academic libraries have already converted to digital lending for virtually all scholarly journals. Scores of digital-lending services are expanding digital lending to books, music, movies, and other materials. These services let libraries pretend that they are running a digital library, and they can do so without disrupting existing business processes. Publishers and content distributors keep their piece of the library pie. The libraries' customers obtain legal free access to quality content. The path of least resistance feels good and buries the cost of lost opportunity under blissful ignorance.

The value propositions of paper-based and digital lending are fundamentally different. A paper-based library builds permanent infrastructure: collections, buildings, and catalogs are assets that continue to pay dividends far into the future. In contrast, resources spent on digital lending are pure overhead. This includes staff time spent on negotiating licenses, development and maintenance of authentication systems, OpenURL, proxy, and web servers, and the software development to give a unified interface to disparate systems of content distributors. (Some expenses are hidden in higher fees for the Integrated Library System.) These expenses do not build permanent infrastructure and merely increase the cost of every transaction.

Do libraries add value to the process? If so, do libraries add value in excess of their overhead costs? In fact, library-mediated lending is more cumbersome and expensive than direct-to-consumer lending, because content distributors must incorporate library business processes in their lending systems. If the only real value of the library's meddling is to subsidize the transactions, why not give the money to users directly? These are the tough questions that deserve an answer.

Libraries cannot remain relevant institutions by being meaningless middlemen who serve no purpose. Libraries around the world are working on many exciting digital projects. These include digitization projects and the development of open archives for all kinds of content. Check out this example. Unfortunately, projects like these will be underfunded or cannot grow to scale as long as libraries remain preoccupied with digital lending.

Libraries need a different vision for their digital future, one that focuses on building digital infrastructure. We must preserve traditional library values, not traditional library institutions, processes, and services. The core of any vision must be long-term preservation of and universal open access to important information. Yet, we also recognize that some information is a commercial commodity, governed by economic markets. Libraries have never covered all information needs of everyone. Yet, independent libraries serving their respective communities and working together have established a great track record of filling global information needs. This decentralized model is worth preserving.

Some information, like most popular music and movies, is obviously commercial and should be governed by copyright, licenses, and prices established by the free market. Other information, like many government records, either belongs in the public domain or should be governed by an open license (Creative Commons, for example). Most information falls somewhere in between, with passionate advocates on both sides of the argument for every segment of the information market. Therefore, let us decentralize the issue and give every creator a real choice.

By gradually converting acquisition budgets into grant budgets, libraries could become open-access patrons. They could organize grant competitions for the production of open-access works. By sponsoring works and creators that further the goals of its community, each library contributes to a permanent open-access digital library for everyone. Publishers would have a role in the development of grant proposals that cover all stages of the production and marketing of the work. In addition to producing the open-access works, publishers could develop commercial added-value services. Finally, innovative markets like the one developed by Gluejar allow libraries (and others) to acquire the digital rights of commercial works and set them free.

The traditional commercial model will remain available, of course. Some authors may not find sponsors. Others may produce works of such potential commercial value that open access is not a realistic option. These authors are free to sell their work with any copyright restrictions deemed necessary. They are free to charge what the market will bear. However, they should not be able to double-dip. There is no need to subsidize closed-access works when open access is funded at the level proposed here. Libraries may refer customers to closed-access works, but they should not subsidize access. Over time, the cumulative effect of committing every library budget to open access would create a world-changing true public digital library.

Other writers have argued the case against library-mediated digital lending. No one is making the arguments in support of the case. The path of least resistance does not need arguments. It just goes with the flow. Into oblivion.

Friday, March 16, 2012

Annealing Elsevier

Through a bipartisan pair of shills, Elsevier introduced a bill that would have abolished the NIH open-access mandate and prevented other government research-funding agencies from requiring open access to government-sponsored research. In this Research Works Act (RWA) episode, Elsevier showed its hand. Twice. When it pushed for this legislation, and when it withdrew.

Elsevier was one of the first major publishers to support green open access. By pushing RWA, Elsevier confirmed the suspicion that this support is, at most, a short-term tactic to appease the scholarly community. Its real strategy is now in plain sight. RWA was not done on a whim. They cultivated at least two members of the House of Representatives and their staff. Just to get it out of committee, they would have needed several more. No one involved could possibly have thought they could sneak in RWA without anyone noticing. Yet, after an outcry from the scholarly community, they dropped the legislation just as suddenly as they introduced it. If Elsevier executives had a strategy, it is in tatters.

Elsevier’s RWA move and its subsequent retrenchment have more than a whiff of desperation. I forgive your snickering at this suggestion. After all, by its own accounting, Elsevier’s adjusted operating margin for 2010 was 35.7% and has been growing monotonically at least since 2006. These are not trend lines of a desperate company. (Create your own Elsevier reports here. Thanks to Nalini Joshi, @monsoon0, for tweeting the link and the graph!)

Paradoxically, its past success is a problem going forward. Elsevier’s stock-market shares are priced to reflect the company’s consistently high profitability. If it were to deteriorate, even by a fraction, share prices would tumble. To prevent that, Elsevier must raise revenue from a client base of universities that face at least several more years of extremely challenging budgets. For universities, the combination of price increases and budget cuts puts options on the table once thought unthinkable. Consider, for example, the University of California and the California State University systems. These systems have already cut to the bone, and they may face even more dire cuts, unless voters approve a package of tax increases. Because of their size, just these two university systems by themselves have a measurable impact on Elsevier’s bottom line. This is repeated across the country and the world.

Clearly, RWA was intended to make cancelling site licenses a less viable option for universities, now and in the future. When asked to deposit their publications in institutional repositories, it is an unfortunate fact that most scholars ignore their own institutions. They cannot ignore their funding agencies. Over time, funder-mandated repositories will become a fairly comprehensive compilation of the scholarly record. They may also erode the prestige factor of journals. After all, what is more prestigious? That two anonymous referees and an editor approved the paper or that the NIH funded it to the tune of a few million dollars? Advanced web-usage statistics of the open-access literature may further erode the value of impact factor and other conventional measures. Recently, I expressed some doubts that the open access movement could contribute to reining in journal prices. I may rethink some of this doubt, particularly with respect to funder-mandated open access.

Elsevier’s quick withdrawal from RWA is quite remarkable. Tim Gowers was uniquely effective, and deserves a lot of credit. When planning for RWA, Elsevier must have anticipated significant pushback from the scholarly community. It has experience with boycotts and protests, as it has survived several. Clearly, the size and vehemence of the reaction were way beyond Elsevier's expectations. One can only speculate how many of its editors were willing to walk away over this issue.

Long ago, publishers figured out how to avoid becoming a low-profit commodity-service business: they put themselves at the hub of a system that establishes a scholarly pecking order. As beneficiaries of this system, current academic leaders and the tenured professoriate assign great value to it. Given the option, they would want everything the same, except cheaper, more open, without restrictive copyrights, and available for data mining. Of course, it is absurd to think that one could completely overhaul scholarly publishing by tweaking the system around the edges and without disrupting scholars themselves. Scholarly publishers survived the web revolution without disruption, because scholars did not want to be disrupted. That has changed.

Because of ongoing budget crises, desperate universities are cutting programs previously considered untouchable. To the dismay of scholars everywhere, radical options are on the table as a matter of routine. Yet, in this environment, publishers like Elsevier are chasing revenue increases. Desperation and anger are creating a unique moment. In Simulated Annealing terms (see a previous blog post): there is a lot of heat in the system, enabling big moves in search of a new global minimum.

Disruption: If not now, when?


Wednesday, February 22, 2012

Annealing the Information Market




When analyzing complex systems, applied mathematicians often turn to Monte Carlo simulations. The concept is straightforward. Change the state of the system by making a random move. If the new state is an improvement, make a new random move in a direction suggested by extrapolation. Otherwise, make a random move in a different direction. Repeat until a certain variable is optimized.

A commodity market is a real-life concurrent Monte Carlo system. Market participants make sequences of moves. Each new move is random, though it incorporates experience gained from previous moves. The resulting system is a remarkably effective mechanism to produce commodities at the lowest possible cost while adjusting to changing market conditions. Adam Smith called it the invisible hand of the free market.

In severely disrupted markets, the invisible hand may take an unacceptably long time, because Monte Carlo systems may remain stuck in local minima. We may understand this point by visualizing a mountain range with many peaks and valleys. An observer inside one particular valley thinks the lowest point is somewhere on that valley’s floor. He is unaware of other valleys at lower altitudes. To see these, he must climb to the rim of the valley, far away from the observed local minimum. This takes a very long time with small random steps that are biased in favor of going towards the observed local minimum.
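
The valley picture can be tried out in a few lines of Python. This is only a sketch under my own assumptions: the function `landscape` is a hypothetical one-dimensional mountain range with a shallow valley near x = 1 and a deeper one near x = -1, and `greedy_walk` is an illustrative name for an observer who takes small random steps and keeps only those that lead downhill.

```python
import random


def landscape(x):
    """Hypothetical altitude profile: a shallow valley near x = 1,
    a deeper valley near x = -1, and a ridge between them near x = 0."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def greedy_walk(start, step=0.05, n_steps=5000, seed=0):
    """Take small random steps; keep a step only if it lowers the altitude."""
    rng = random.Random(seed)
    x = start
    for _ in range(n_steps):
        candidate = x + rng.uniform(-step, step)
        if landscape(candidate) < landscape(x):
            x = candidate
    return x


x_final = greedy_walk(start=1.5)
print(f"walk ends at x = {x_final:.3f}, altitude = {landscape(x_final):.3f}")
print(f"altitude in the other valley, at x = -1: {landscape(-1.0):.3f}")
```

Starting at x = 1.5, the walk settles on the floor of the nearer valley. Steps this small can neither climb the ridge nor jump over it, so the deeper valley on the other side is never discovered, no matter how long the walk continues.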

For this reason, Monte Carlo simulations use strategies that incorporate large random moves. One such strategy, Simulated Annealing, is inspired by a metallurgical technique that improves the crystallographic structure of metals. During the annealing process, the metal is heated and cooled in a controlled fashion. The heat provides energy to change large-scale crystal structures in the metal. As the metal cools, restructuring occurs only at gradually smaller scales. In Simulated Annealing, the simulation is run “hot” when large random moves are used to optimize the system at coarse granularity. When sufficiently near a global minimum, the system is “cooled”, and smaller moves are used for precision at fine granularity. Note that, from a Monte Carlo perspective, large moves are just as random as small moves. Each individual move may succeed or fail. What matters is the strategy that guides the sequence of moves.
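
Here is a minimal Python sketch of such a strategy, with several assumptions of my own. The function `landscape` is a hypothetical two-valley altitude profile, with the deeper valley near x = -1. The move size shrinks in proportion to the temperature. Uphill moves are accepted with the Metropolis probability exp(-Δ/T), which is the conventional textbook rule rather than anything specific to this post. The starting temperature, the cooling schedule, and the step count are arbitrary illustrative values.

```python
import math
import random


def landscape(x):
    """Hypothetical altitude profile: a shallow valley near x = 1,
    a deeper valley near x = -1, and a ridge between them near x = 0."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def simulated_annealing(start, t_start=2.0, t_end=0.001, n_steps=20000, seed=0):
    """Run hot with large moves, cool geometrically, and return the lowest point seen."""
    rng = random.Random(seed)
    cooling = (t_end / t_start) ** (1.0 / n_steps)  # per-step cooling factor
    x, temperature = start, t_start
    best_x = x
    for _ in range(n_steps):
        # Large moves while hot, gradually smaller moves as the system cools.
        candidate = x + rng.gauss(0.0, 1.0) * temperature / t_start
        delta = landscape(candidate) - landscape(x)
        # Metropolis rule: always accept a descent, sometimes accept a climb.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            x = candidate
            if landscape(x) < landscape(best_x):
                best_x = x
        temperature *= cooling
    return best_x


best = simulated_annealing(start=1.5)
print(f"lowest point seen: x = {best:.3f}, altitude = {landscape(best):.3f}")
```

Started in the shallow valley, the hot phase throws the walker across the ridge often enough that it should find the deeper valley. The cool phase then refines the position within that valley. Returning the lowest point seen, rather than the final position, is a common convenience and not an essential part of the method.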

When major market disruptions occur, resistance to change breaks down and large moves become possible. (The market runs “hot” in the Simulated Annealing sense.) Sometimes, government leaders or tycoons of industry initiate large moves, because they believe, right or wrong, that they can take the market to a new global minimum. Politicians enact new laws, or they orchestrate bailouts. Tycoons make large bets that are risky by conventional measures. Sometimes, unforeseen circumstances force markets into making large moves.

The music industry experienced such an event in late 1999, when Napster, the illegal music-sharing site, suddenly became popular. Eventually, this disruption enabled then-revolutionary business models like iTunes, which could compete with illegal downloading. This stopped the hemorrhaging, though not without leaving a disastrous trail. Traditional music retailers, distributors, and other middlemen were forced out. Revenue streams never recovered. With the Stop Online Piracy Act (SOPA), the music industry, joined by the entertainment industry, was trying to undo some of the damage. If enacted, it would have caused significant collateral damage, but it would have done nothing to reduce piracy. This is covered widely in the blogosphere. For example, consider blog posts by Eric Hellman [1] [2] and David Post [3].

While SOPA is dead, other attempts at antipiracy legislation are in the works. Some may succeed legislatively and may be enacted. In the end, however, heavy-handed legislation will fail. The evolution towards ubiquitous information availability (pirated or not) is irreversible. Even the cruelest of dictators cannot contain the flow of information. Why would anyone think democracies could? Eventually, laws follow society’s major trends. They always do.

When Napster became popular, the music industry was unable to fight back, because its existing distribution channels had become technologically obsolete. Napster was the large random move that made visible a new valley at lower altitude. Without Napster, some other event, circumstance, or product would eventually have come along, caused havoc, and been blamed. Antipiracy legislation might have delayed the music industry’s problems in 1999, but it will not solve the entertainment industry’s problems in 2012.

In the new market, piracy may no longer be the problem it once was. Consumers are willing to pay for convenience, quality of service, and security (absence of malware). Piracy may still depress revenues, but there are at least three other reasons for declining revenues. (1) Revenues no longer support many middlemen, and this is reflected in lower music prices through free-market competition. (2) Some consumers are interested in discovering new artists themselves, not in listening to artists discovered on their behalf by record labels. (3) The recession has reduced discretionary income.

It is difficult to assess the relative importance of disintermediation, behavior change, recession, and piracy. But the effect of piracy on legal downloads is probably much less than thought. This may be good news for the music industry. After many large and disruptive moves, the music market may be near a new global minimum. Here, it can rebuild and find new profit-making ventures. These are the kinds of conventional “small” moves for a normal, non-disrupted market.

Other information markets are not that lucky.



Friday, October 28, 2011

Open Access Doubts


Science embraces the concept of weakly held strong ideas. This was illustrated by the excited reaction of the High-Energy Physics (HEP) community to a recent experiment. ("Measurement of the neutrino velocity with the OPERA detector in the CNGS beam", arXiv:1109.4897v1) If confirmed, it puts into doubt the speed of light as an absolute limit. The relevant paper is available through arXiv, which started as a HEP preprint repository and blazed a trail for Open Access. In light of the origins of the Open Access Movement, let us again be inspired by the HEP community and its willingness to follow experiments, wherever they may lead. Assessing the ongoing Open Access experiment, where are our doubts? I have three.

Is Affordable Better than Free?

All else being equal, open is better than closed. But… all else is not equal. A robust and user-friendly network of open scholarly systems seems farther away than ever because of inexpertly formatted content and bad, incomplete, and non-public (!) metadata. While there is always room for improvement, pay-walled journals provide professionally formatted and organized content with excellent metadata and robust services. The problem is cost. Unfortunately, we did nothing to reduce cost. We only negotiated prices.

What if we could significantly reduce cost by implementing pay walls differently? The root of the problem is site licenses. For details, see “What if Libraries were the Problem?”, “Libraries: Paper Tigers in a Digital World”, “The Fourth Branch Library”, and “The Publisher’s Dilemma”. Site licenses are market-distorting products that preserve paper-era business processes of publishers, aggregators, and libraries.

Universities can cut the Gordian knot right now by replacing site licenses with direct subsidies to researchers. After a few months of chaos, consumer-oriented services with all kinds of pricing models would emerge. Researchers, empowered to make individual price-value judgments, would become consumers in a suddenly competitive market for content and information services. The inception of a vibrant marketplace is impossible as long as universities mindlessly renew site licenses.

What are the Goals of Institutional Repositories?

Open Access advocates have articulated at least five goals for institutional repositories: (1) release hidden information, (2) rein in journal prices, (3) archive an institution’s scholarly record, (4) enable fast research communication, and (5) provide free access to author-formatted articles.

Institutional repositories are ideal vehicles for releasing hidden information that, until recently, had no suitable distribution platform (1). For example, archives must protect original pieces, but they can distribute the digitized content.

The four remaining goals, all related to scholarly journals, are more problematic. Institutional repositories fall short as a mechanism to rein in journal prices (2), because they are not a credible alternative for the current archival scholarly record. Without (2), goals (3), (4), and (5) are irrelevant. If we pay for journals anyway, we can achieve (3) by maintaining a database of links to the formal literature. Secure in the knowledge that their journals are not in jeopardy, publishers would be happy to provide (4) and (5).

A scenario consistent with this analysis is unfolding right now. The HEP community launched a rescue mission for HEP journals, which lost much of their role to arXiv. The SCOAP3 initiative pools funds currently spent on site-licensing HEP journals. This strikes me as a heavy-handed approach to protect existing revenue streams of established journals. On the other hand, SCOAP3 protects the quality of the HEP archival scholarly record and converts HEP journals to the open-access model.

Are Open-Access Journals a Form of Vanity Publishing?

If a journal’s scholarly discipline loses influence or if its editorial board lowers its standards, the journal’s standing diminishes and various quality assessments fall. In these circumstances, a pay-walled journal loses subscribers and, eventually, fails. An open-access journal, on the other hand, survives as long as it attracts a sufficient number of paying authors (perhaps by lowering standards even further). Financial viability of a pay wall is a crude measure of quality, but it is nonnegotiable and cannot be rationalized away: the journal fails, its editorial board disappears, its scholarly discipline loses some of its stature, and its authors must publish elsewhere.

We should not overstate this particular advantage of the pay wall. Publishers have kept marginal pay-walled journals alive through bundling and consortium incentives, effectively using strong journals to shore up weak ones. Open-access journals may not be perfect, but we happily ignore some flaws in return for free access to the scholarly record. For now, open-access journals are managed by innovators out to prove a point. Can successive generations maintain quality despite a built-in incentive to the contrary?