In the previous post in this series on how scholarly communications must transform, I spoke about the need for scholarly communications to be syndicated. In this post, I discuss the need for scholarship to be integrated into the cyberinfrastructure.
"Cyberinfrastructure" is a mouthful, but a vital concept today. Scholars, librarians, and all stakeholders in academic knowledge production need to understand the concept of cyberinfrastructure and come to see the generation of scholarship as something participating within and building this emerging structure for learned communication.
The cyberinfrastructure relates in part to the physical, technical, globally networked system of permanently maintained digital archives and repositories, but saying that makes it sound as though it is mostly the concern of IT specialists or librarians. Those people play crucial roles, but so do scholars in general and those who oversee them or evaluate their contributions. The cyberinfrastructure is built as much upon social parameters, intellectual property provisions, and academic evaluation systems as computer systems. It requires us to reconceptualize what is consequential about scholarly work beyond traditional genres or methods of academic publishing.
Those unfamiliar with this concept should begin with the 2003 Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure (known informally as "the Atkins report"), where the term was coined. The cyberinfrastructure, explains the report,
will become as fundamental and important as an enabler for the [scientific] enterprise as laboratories and instrumentation, as fundamental as classroom instruction, and as fundamental as the system of conferences and journals for dissemination of research outcomes. Through cyberinfrastructure we strongly influence the conduct of science and engineering research (and ultimately engineering development) in the coming decades. (Appendix A)
The report begins with the sciences, but anticipates comparable changes across all disciplines, and this has been explored by the American Council of Learned Societies in their companion report, Our Cultural Commonwealth: The report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences. Provosts, Academic Vice Presidents, and every dean, chair or promotion and tenure committee member should know these reports and how they describe the new conditions for scholarly communication.
Why? Because we are no longer in the world of paper research or trafficking in the one-off article or monograph. We are publishing media, software, datasets, simulations, and interactive, socially-optimized research and learning structures. We are not distributing knowledge commodities in the print paradigm mode; we are constructing knowledge environments to accommodate ongoing and dynamic researching, teaching, and learning, and we have not finished the job by simply setting something loose on the world as a "publication." We are all becoming cyber-engineers, social constructionists, and curators of data both informal and formal.
I recommend the recent book by scholarly communications front runner, Christine L. Borgman, Scholarship in the Digital Age: Information, Infrastructure, and the Internet (one might also view this video lecture of the same title).
These resources begin to tackle the important issues beyond merely technical concerns. Obviously the move toward Open Access publishing is central, as is the role of institutional repositories and disciplinary repositories in hosting and curating learned communication. But I would emphasize that the cyberinfrastructure concept is critical for scholars themselves. I don't think most scholars think in terms of its concepts, and they should.
Why should scholars choke down a portmanteau word like "cyberinfrastructure"? Because the distinction between our research processes and our publishing processes is blurring. Because scholars are becoming necessarily multidisciplinary and depending more upon data and databased resources (in all disciplines). Because scholarship is becoming more collaborative as people share large computing resources or cooperate in the construction and maintenance of large-scale online archives. Because what is starting to matter most with scholarly work does not align neatly with those quantifiables of the print paradigm, publications in scholarly journals.
The cyberinfrastructure includes not only large computing installations, but also large data sets that need intelligent articulation and interpretation through combinations of machine and human intelligence. It includes social knowledge that is being piped through social media tools and integrated into research tools and methodologies. It includes the accumulation and aggregation of informal modes of reporting results or discussing knowledge projects. It is much, much more than a discrete set of high-impact journals and the fact of publication within them.
In fact, as I've said elsewhere, traditional publications stand in danger of becoming less and less relevant to the emerging scholarly commons because of their static and isolated nature and their lack of articulating features like links, rich metadata, and informal metadata folksonomies.
Let me give an example of how something not valued highly in traditional scholarly publishing can be highly valued within the concept of the cyberinfrastructure. Within the print paradigm, interpretive scholarship has been more highly prized than the development of tools that make such interpretations possible. A scientific paper is aimed at conclusions; a paper in the humanities might do a close reading or theoretical interpretation of a work of literature. But preparing a scholarly edition of a work of literature has been considered second-tier work. Not in the cyberinfrastructure. In fact, those who are structuring data intelligently so that others in the future can make use of it in a variety of ways are contributing as much to the growth of knowledge as those who produce interpretive scholarship in discursive form, or more.
This is evident in something like the very successful Perseus Project. What began as a way to get classical texts digitized and online has turned into a project more focused on structuring data and data tools for other scholars to use:
The Perseus digital library project has developed a generalizable toolset to manage XML (Extensible Markup Language) documents of varying DTDs (Document Type Definitions); to extract structural and descriptive metadata from these documents and deliver document fragments on demand; and to support other tools that analyze linguistic and conceptual features and manage document layout. (source)
Not only is the content openly available, but Perseus has also evolved a set of portable services as part of its "Perseus Hopper" open data effort: linguistic support for corpus research of texts in classical languages; a contextualized reading module for customizing secondary texts to accompany primary texts and for soliciting user input; and a searching module enabling sophisticated searches of the morphologically complex ancient languages.
What this means is that other projects can piggyback on years of work parsing and structuring and marking up this data. One example of such secondary scholarly work is the Diogenes project, a separate tool for searching and browsing ancient texts that imports Perseus Project data (along with other sources). Another example is the Archimedes project, a more focused database looking just at the history of mechanics, which nevertheless draws upon the Perseus Project data to the extent it is useful for that purpose. A very different use of the Perseus Project data is HandHeldClassics, which drew upon Perseus Project open data to bring classical texts to handheld devices. This last application appears to be somewhat dated (intended for early Palm Pilot-type handhelds). But the data, built on XML and openly available, is ready for someone to make it flow into the smart phones of today, or their successors a year or a decade from now.
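For readers who want to see what "piggybacking" on structured text looks like in practice, here is a minimal sketch in Python. It is not Perseus code and does not use the real Perseus schema or data: the tiny XML sample, its element names, and the two helper functions (extract_metadata and get_fragment) are all hypothetical, invented only to illustrate the two services described above, namely extracting structural and descriptive metadata and delivering a document fragment on demand.

```python
import xml.etree.ElementTree as ET

# Hypothetical, TEI-like sample. This is NOT actual Perseus data or its schema;
# the element names and placeholder lines are assumptions for illustration.
SAMPLE = """
<text>
  <header>
    <title>Iliad</title>
    <author>Homer</author>
  </header>
  <body>
    <div type="book" n="1">
      <l n="1">placeholder line 1.1</l>
      <l n="2">placeholder line 1.2</l>
    </div>
    <div type="book" n="2">
      <l n="1">placeholder line 2.1</l>
    </div>
  </body>
</text>
"""


def extract_metadata(root):
    """Collect descriptive metadata (title, author) and structural
    metadata (number of lines in each book) from the parsed document."""
    lines_per_book = {
        div.get("n"): len(div.findall("l"))
        for div in root.iter("div")
        if div.get("type") == "book"
    }
    return {
        "title": root.findtext("header/title"),
        "author": root.findtext("header/author"),
        "lines_per_book": lines_per_book,
    }


def get_fragment(root, book, line):
    """Return the text of a single line addressed by book and line number,
    or None if no such line exists in the document."""
    node = root.find(f"body/div[@type='book'][@n='{book}']/l[@n='{line}']")
    return node.text if node is not None else None


# Parse once; any downstream tool can then reuse the same structure.
root = ET.fromstring(SAMPLE.strip())

if __name__ == "__main__":
    print(extract_metadata(root))
    print(get_fragment(root, "1", "2"))
```

The point of the sketch is that neither function needs to know anything about how the text was digitized or displayed. Because the markup records what each piece of text is (a book, a numbered line), a second project can address and reuse it without redoing the original editorial work.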
How does one place a value on the Perseus Project? One values it in terms of its contribution to the cyberinfrastructure. Its makers did not simply publish valuable content; they published structures and schemas that were later combined with the efforts of other scholars. Their content has become, in effect, a platform for others' uses. We have always had the notion of "standing on the shoulders of giants" in the history of scholarship. Conventions for citation mark the debts that we owe to our predecessors. But obviously the Perseus Project offers a more direct and material way of building future scholarship upon earlier work. It is more of an exponential contribution than merely an additive one.
There are less monumental ways of adding to the cyberinfrastructure than a large, years-long project like the Perseus Project. But Perseus illustrates in an obvious way a different mode of scholarship that is arguably as significant as, or more significant than, the standard article.
What do you think? Do you know other examples of scholarship helping to build the cyberinfrastructure? What do you think we need to do to retool the way we work to promote this more sophisticated kind of knowledge building?
In my next post on this theme, I will talk about scholarship in terms of something usually thought of only in connection with optimizing web content for commercial purposes. When an entity like Google is proclaiming its intention to organize the world's knowledge (and has developed sophisticated mechanisms for directing our attention to and retrieval of that knowledge), then maybe we should think about scholarship having web metrics and analytics, too.