CSCW 2022 Synthesis Infrastructures Workshop

An all-virtual workshop at CSCW 2022 (Nov 12-13, 2022)

Apply to participate by September 30th, 2022, and/or read on for more details!

Growing New Scholarly Communication Infrastructures for Sharing, Reusing, and Synthesizing Knowledge

Overview

Sharing, reuse, and synthesis of knowledge are central to the research process. In theory, these core functions are served by our formal scholarly communication infrastructure: the system of monographs, abstracts, and papers in journals and proceedings, together with the citation indices and search databases built on top of them. Yet converging lines of empirical and anecdotal evidence suggest that this system does not adequately act as infrastructure for synthesis. Emerging developments in new institutions for science, along with new technical infrastructures and tooling for decentralized knowledge work, offer opportunities to prototype new technical infrastructures on top of a different installed base than the publish-or-perish, neoliberal academy. For more details, see the Background section below.

This all-virtual, interactive workshop aims to bring CSCW’s deep roots in knowledge infrastructures and in collaborative and distributed sensemaking together with these new developments in science institutions and tooling, and with the communities behind them. The goal is to stimulate and accelerate progress towards prototyping new scholarly communication infrastructures that are actually optimized for sharing, reusing, and synthesizing knowledge. For more details, see the Workshop Structure section below.

Some examples of concrete topics we would like to work on together include (but are not limited to):

“living” syntheses that respond appropriately to retracted, outdated, or inconsistent findings as knowledge bases evolve
integrating formality and machine readability — such as ontologies or semantic markup — in graceful and appropriate ways that respect the limits of machine intelligence, and maintain diversity of perspectives, evidence, and epistemologies
exposing granular, semantic forms of relationships between lines of work, such as argumentation patterns
integrating tacit and contextual knowledge required for synthesis into more shareable knowledge infrastructures
integrating novel forms of scholarly authorship and knowledge sharing into everyday individual and collaborative scholarly practices and workflows; novel assemblages, bricolages, and repurposing of existing tooling to enable these integrations
the role of AI and machine learning systems in assisting — vs. automating — the construction of new synthesis-optimized knowledge sharing infrastructures
the role of decentralized peer-to-peer technologies, and new developments in hypertext and personal knowledge graphs, in lowering barriers to community-owned and maintained syntheses of scientific knowledge
applications and design patterns from crowdsourcing and social computing for new forms of scholarly communication infrastructure

Workshop Structure

This all-virtual workshop will be structured around synchronous and asynchronous work sessions in working groups. This will include scheduled time to form and refine proposals, and short lightning talks about concrete proposals for artifacts to create, such as new syntheses/aggregations of resources and research theories and findings, pilot user studies and prototypes with novel tooling, and new designs and lo-fi prototypes. We will also schedule an extended synchronous session at the end of the workshop to share progress, with ample time for discussion, feedback, and planning of next steps. We plan to organize a venue for invited publication of the mature results of the collaborations, approximately 3-6 months after the conclusion of the workshop. We envision the venue being a combination of community-owned open access publications, such as on PubPub, and a co-authored overview article in a more traditional publication such as CSCW or CACM.

Submission and pre-workshop activities

Prospective participants will submit proposed materials by late September and receive notifications by mid-October. Pre-workshop activities will begin in the month leading up to the workshop and will take place in the GitHub repository and over our hosted chat platform. They include commenting and iterating on materials, as well as initial formation of working groups.

Workshop Schedule

We plan to limit synchronous joint sessions to ~3 hours in the morning, US Eastern time, to maximize overlap in schedules across Asia, Europe, and the US. This schedule will be adjusted as needed based on the final set of participants.

To maximize informal and spontaneous interactions, we plan to host the workshop on gather.town. All sessions will also be recorded and shared with participants to allow for continued participation if time zones conflict.

The following is a proposed rough schedule (in US Eastern time).

Day 1:

09:30 - 09:40 - Welcome and kickoff
09:40 - 10:10 - Working groups finalize proposals
10:10 - 11:40 - Work sprint 1
11:40 - 12:25 - Working group proposal lightning talks
Various times (depending on time zone): Work sprint 2

Day 2:

09:30 - 12:30 - Working group proposal progress reports and discussion
12:40 - 12:50 - Closing call to action, next steps

Call for Participants

Given the time constraints and goals of producing and sharing work outputs, we are looking for 20-30 participants (to aim for no more than 5-6 working groups) to join this workshop.

We invite prospective participants to submit materials for consideration, appropriate for their background and interests:

researchers should submit a short summary of their past relevant research, and/or detailed annotated bibliographies;
tool-builders should submit video demos, and/or links to learn and try out their tools;
practitioners should submit case studies of their attempts to shift practice (along with problems and opportunities/solutions they have discovered), and/or training materials for practice innovations, and example datasets.

We invite participants to self-identify along these broad categories, according to how they wish to participate in the workshop.

Submissions will be accepted at the following submission form.

The organizers will review these submissions for relevance to the core workshop themes, as well as for balance across communities of research, tooling, and practice. Accepted submissions will then be curated into a shared GitHub repository of materials. Prior to the workshop, participants will comment and iterate on the ideas and resources in the repository. The materials in this repository will serve as important context and resources to fuel impactful collaborative work during the workshop.

Key Dates and Logistical Information

September 30th, 2022 - participant submission deadline
October 14th, 2022 - notification of acceptance
October 15th, 2022 - pre-workshop activities begin
November 12-13, 2022 - workshop @CSCW

Organizers

Please direct any questions to joelchan@umd.edu

Joel Chan is an Assistant Professor in the University of Maryland’s College of Information Studies (iSchool) and Human-Computer Interaction Lab (HCIL). His research investigates systems that support creative knowledge work, such as scientific discovery and innovative design. His recent work focuses on studies of scientific thinking (including synthesis practices), and tools for searching and synthesizing scientific literature. His research has received funding from the National Science Foundation, the Office of Naval Research, the Institute of Museum and Library Services, Adobe Research, and Protocol Labs.

Wayne Lutters is a professor in the University of Maryland’s College of Information Studies (iSchool). Wayne’s research interests are at the nexus of computer-supported cooperative work (CSCW), social computing, and social informatics. He specializes in field studies of IT-mediated work, from a socio-technical perspective, to better inform the design and evaluation of collaborative systems. Recent projects have focused on the human side of information infrastructure for distributed science. He has served as a Program Director for Human-Centered Computing at the National Science Foundation. He earned his M.S. and Ph.D. in Information and Computer Science from the University of California, Irvine.

Jodi Schneider is Associate Professor at the School of Information Sciences, University of Illinois at Urbana-Champaign where she runs the Information Quality Lab. She studies the science of science through the lens of arguments, evidence, and persuasion with a special interest in controversies in science. Her recent work has focused on systematic review automation, semantic publication, and the citation of retracted papers. She has held research positions across the U.S. as well as in Ireland, England, France, and Chile. Her work has been funded by the Alfred P. Sloan Foundation, the European Commission, IMLS, NIH, Science Foundation Ireland, and an NSF CAREER award.

Karola Kirsanow is a Research Program Manager at Protocol Labs, an open-source research, development, and deployment lab creating new internet technologies. There she leads a team that builds research public goods, identifying and supporting high-impact research projects in the distributed systems space and designing experiments to align researchers and research funders. Her previous research background is in human evolutionary biology and palaeogenetics, including work funded by the Leakey Foundation and the European FP7 framework programme.

Sílvia Bessa is a Research Program Manager in the Network Research team at Protocol Labs, where she designs new mechanisms to incentivise and accelerate research to build public goods. She’s a strong believer that community-driven research is the best-known way to protect humanity’s knowledge from individual interests. Her previous research background is in computer vision and machine learning applied to breast cancer imaging, including work funded by national and European Programs, in close collaboration with the Portuguese National League Against Cancer and Champalimaud Foundation.

Jonny Saunders is a PhD candidate at the University of Oregon’s Institute for Neuroscience. They are a transdisciplinary research worker studying ill-defined categories of complex sounds in a mouse model of phonetics, embedding distributed systems of knowledge sharing in experimental tooling, and applied strategy for information liberation from the history of digital social movements. They search between the seams of technology, labor, and politics for points of leverage to pry apart the systems of hierarchy, extraction, and privatization that structure knowledge work. Their hope is that, by organizing with researchers across disciplines, we might be able to contribute our diverse skills towards building liberatory digital infrastructures of communication and collaboration, and realize the role we might play in building a better world beyond the broader digital enclosure movement.

Background

How do researchers, scholars, and scientists share, reuse, and synthesize knowledge? Alongside informal channels of communication such as personal communications and interactions ¹ ², scholars today rely heavily on a central infrastructure of scholarly publishing: a system of monographs, abstracts, and papers in journals and proceedings, with citation indices and search databases overlaid on top that comprise the formal scholarly record. Through this scholarly communication infrastructure, scholars have access to a vast and rapidly growing literature, across disciplines — on the order of hundreds of millions of documents accessible through just a few clicks and keystrokes from a single search engine — from which they can draw to construct new knowledge.

However, this scholarly communication infrastructure is not acting as infrastructure for synthesis. The term infrastructure is meant to evoke reliability, sustainability, and “smooth functioning” that enables people to focus on the task at hand instead of fussing over preparatory overhead ³ ⁴. Instead of this smooth functioning, researchers in today’s infrastructure resort to laborious “hacks” and workarounds to “mine” publications for what they need ⁵. Search engines operate over documents and their metadata (authorship, date, publication outlet), not claims and their context and relationships; scholarly documents are primarily archived as unstructured text, often in PDFs. If researchers are lucky, they might come across a published synthesis that is on topic, has sufficient coverage, and is up to date; that would likely be difficult for interdisciplinary questions (e.g., public health policy, cybersecurity and misinformation, creativity and metascience), which are where synthesis is most needed! Researchers might also be able to piece together leads on papers and authors, and verbal statements of key ideas, from knowledgeable colleagues. Failing all this, they would need to do the laborious work of citation tracing (manually checking references, because citation databases typically do not surface information about why and in what way scholarly works cite each other), collecting documents through keyword searches, screening first the titles and abstracts and then the full text, and finally constructing their own database of claims and data, often in a bespoke system of notes, annotations, and spreadsheets.

While these hacks often work well enough for the task at hand, they are rarely transferred in systematic ways across projects and people, violating the dimensions of “reach or scope” and “embodiment of standards” of infrastructure ⁴. These are also not “one-time” costs: scientific problems require many such queries, and likely spawn new queries as projects evolve and more is learned. It is unsurprising, then, that synthesis of prior literature in published works is often subpar: studies of literature reviews in doctoral dissertations ⁶ ⁷ and even published papers ⁸ ⁹ ¹⁰ have found them frequently lacking key aspects of synthesis quality, such as critical engagement with and generative integration of prior work and theory. Similarly, systematic reviews are increasingly struggling to keep up with the pace of knowledge: many become outdated soon after they are published ¹¹, yet they are rarely updated ¹².

In this workshop, we ask: how might we design scholarly communication infrastructures that are actually optimized for sharing, reusing, and synthesizing knowledge? And where might these issues intersect with CSCW, to stimulate fresh sociotechnical progress, as well as theoretical development within CSCW? This set of questions has deep roots within CSCW. For example, CSCW recently gave a lasting impact award to Star and Ruhleder’s classic study of worm scientists grappling with the design and implementation of a new knowledge sharing infrastructure ⁴. This line of work has continued to the present, studying the collaborative, practical, and social work of constructing and maintaining scientific knowledge infrastructures ¹³ ¹⁴ ¹⁵ ¹⁶, yielding and developing powerful concepts such as boundary objects ⁴ and infrastructure ¹⁶, and interacting with rich theories of situated and distributed cognition and organizational memory ¹⁷. Knowledge sharing in teams and organizations, too, is a central concern in CSCW ¹⁸ ¹⁷, as are systems for peer production of knowledge, such as Wikipedia ¹⁹. There is also a related rich tradition of studying complex sensemaking in collaborative and distributed settings ²⁰ ²¹ ²² ²³, and social computing and human-machine systems for synthesis ²⁴ ²⁵ ²⁶. In the cognate field of library and information science, too, there are decades of standards and platform-level work on new scholarly communication infrastructures that incorporate more sophisticated models of semantics and scientific argumentation to support synthesis ²⁷ ²⁸ ²⁹.

We aim to stimulate fresh progress by bringing these deep roots into conversation with emerging trends in science and infrastructure reform and innovation. We are especially excited to engage with a range of new institutions for science that have recently emerged outside the academy, often deliberately structured to pursue different incentive structures. Some examples include the ML Collective, a nonprofit organization that supports open collaboration and accessible mentorship for aspiring machine learning researchers both inside and outside the academy; Invisible College, a prototype distributed and collocated set of communities for independent researchers; the Arcadia Institute, a new bioscience research institute that is experimenting with completely open science workflows and micropublishing ³⁰; the Citizens and Technology Lab, where online platform users and moderators co-create research on online communities with researchers from Cornell University; and LabDAO, an open, community-run network of wet and dry laboratory services for fueling decentralized collaborative research in the life sciences. These institutions are fueled in part by new coordination structures and technologies, including open-source community publishing platforms like PubPub; patterns from the emerging technology around decentralized science and decentralized autonomous organizations (DAOs); decentralized, content-addressed, peer-to-peer knowledge infrastructures, such as IPFS; and local-first software ³¹. Co-evolving alongside these efforts are new experiments in science funding, from crowdfunding to larger investments in creating new kinds of scientific institutes. Scientists are also increasingly experimenting with novel publication formats that are more tuned to synthesis, such as micropublishing or semantic publication of results ³².

The potential intersection of these communities and technological developments offers new opportunities to grow new technical infrastructures on top of a different installed base than the publish-or-perish, neoliberal academy. We aim for these new infrastructural patterns to synergize and combine with long-running advances and innovations in open science (sharing of code, data, and protocols) and open access and preprints. We hope this new wave of sociotechnical innovation will catalyze bottom-up evolution and growth towards new infrastructures for sharing, reusing, and synthesizing knowledge for the good of humanity.

References

1. Herbert Menzel. 1959. Planned and unplanned scientific communication. In Proceedings of the International Conference on Scientific Information.
2. Diana Crane. 1972. Invisible Colleges: Diffusion of Knowledge in Scientific Communities. Chicago: University of Chicago Press.
3. Paul N Edwards, Steven J Jackson, Geoffrey C Bowker, and Cory P Knobel. 2007. Understanding infrastructure: Dynamics, tensions, and design. Technical Report.
4. Susan Leigh Star and Karen Ruhleder. 1996. Steps toward an ecology of infrastructure: Design and access for large information spaces. Information systems research 7, 1 (1996), 111–134. https://pubsonline.informs.org/doi/abs/10.1287/isre.7.1.111
5. Ian A. Knight, Max L. Wilson, David F. Brailsford, and Natasa Milic-Frayling. 2019. Enslaved to the Trapped Data: A Cognitive Work Analysis of Medical Systematic Reviews. In Proceedings of the 2019 Conference on Human Information Interaction and Retrieval (CHIIR ’19). ACM, New York, NY, USA, 203–212. https://doi.org/10.1145/3295750.3298937
6. Barbara E. Lovitts. 2007. Making the Implicit Explicit: Creating Performance Expectations for the Dissertation. Stylus Publishing, Sterling, Va.
7. Allyson Holbrook, Sid Bourke, Terence Lovat, and Kerry Dally. 2004. Investigating PhD thesis examination reports. International Journal of Educational Research 41, 2 (Jan. 2004), 98–120. https://doi.org/10.1016/j.ijer.2005.04.008
8. Adrienne Alton-Lee. 1998. A Troubleshooter’s Checklist for Prospective Authors Derived from Reviewers’ Critical Feedback. Teaching and Teacher Education 14, 8 (1998), 887–90.
9. Jonathon McPhetres, Nihan Albayrak-Aydemir, Ana Barbosa Mendes, Elvina C. Chow, Patricio Gonzalez-Marquez, Erin Loukras, Annika Maus, Aoife O’Mahony, Christina Pomareda, Maximilian Primbs, Shalaine Sackman, Conor Smithson, and Kirill Volodko. 2020. A decade of theory as reflected in Psychological Science (2009-2019). Technical Report. PsyArXiv. https://doi.org/10.31234/osf.io/hs5nx
10. Padhraig S. Fleming, Jadbinder Seehra, Argy Polychronopoulou, Zbys Fedorowicz, and Nikolaos Pandis. 2013. Cochrane and non-Cochrane systematic reviews in leading orthodontic journals: a quality paradigm? European Journal of Orthodontics 35, 2 (April 2013), 244–248. https://doi.org/10.1093/ejo/cjs016
11. Kaveh G. Shojania, Margaret Sampson, Mohammed T. Ansari, Jun Ji, Steve Doucette, and David Moher. 2007. How Quickly Do Systematic Reviews Go Out of Date? A Survival Analysis. Annals of Internal Medicine 147, 4 (Aug. 2007), 224. https://doi.org/10.7326/0003-4819-147-4-200708210-00179
12. Ann-Margret Ervin. 2008. Motivating authors to update systematic reviews: practical strategies from a behavioural science perspective. Paediatric and perinatal epidemiology 22, 0 1 (Jan. 2008), 33–37. https://doi.org/10.1111/j.1365-3016.2007.00910.x
13. Andrea K. Thomer, Michael Bernard Twidale, and Matthew J. Yoder. 2018. Transforming Taxonomic Interfaces: “Arm’s Length” Cooperative Work and the Maintenance of a Long-lived Classification System. Proceedings of the ACM on Human-Computer Interaction 2, CSCW (Nov. 2018), 173:1–173:23. https://doi.org/10.1145/3274442
14. Dave Randall, Rob Procter, Yuwei Lin, Meik Poschen, Wes Sharrock, and Robert Stevens. 2011. Distributed ontology building as practical work. International Journal of Human-Computer Studies 69, 4 (April 2011), 220–233. https://doi.org/10.1016/j.ijhcs.2010.12.011
15. Alyson L. Young and Wayne G. Lutters. 2017. Infrastructuring for Cross-Disciplinary Synthetic Science: Meta-Study Research in Land System Science. Computer Supported Cooperative Work (CSCW) 26, 1 (April 2017), 165–203. https://doi.org/10.1007/s10606-017-9267-z
16. David Ribes and Charlotte P. Lee. 2010. Sociotechnical Studies of Cyberinfrastructure and e-Research: Current Themes and Future Trajectories. Computer Supported Cooperative Work (CSCW) 19, 3 (Aug. 2010), 231–244. https://doi.org/10.1007/s10606-010-9120-0
17. Mark S. Ackerman, Juri Dachtera, Volkmar Pipek, and Volker Wulf. 2013. Sharing Knowledge and Expertise: The CSCW View of Knowledge Management. Computer Supported Cooperative Work (CSCW) 22, 4-6 (Aug. 2013), 531–573. https://doi.org/10.1007/s10606-013-9192-8
18. Mark S. Ackerman and David W. McDonald. 1996. Answer Garden 2: merging organizational memory with collaborative help. In Proceedings of the 1996 ACM conference on Computer supported cooperative work - CSCW ’96. ACM Press, Boston, Massachusetts, United States, 97–105. https://doi.org/10.1145/240080.240203
19. Guo Li, Haiyi Zhu, Tun Lu, Xianghua Ding, and Ning Gu. 2015. Is It Good to Be Like Wikipedia?: Exploring the Trade-offs of Introducing Collaborative Editing Model to Q&A Sites. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW ’15). ACM, New York, NY, USA, 1080–1091. https://doi.org/10.1145/2675133.2675155
20. Marcela Borge, Craig H. Ganoe, Shin-I Shih, and John M. Carroll. 2012. Patterns of team processes and breakdowns in information analysis tasks. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work (CSCW ’12). Association for Computing Machinery, New York, NY, USA, 1105–1114. https://doi.org/10.1145/2145204.2145369
21. Nitesh Goyal and Susan R. Fussell. 2016. Effects of Sensemaking Translucence on Distributed Collaborative Analysis. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing (CSCW ’16). ACM, New York, NY, USA, 288–302. https://doi.org/10.1145/2818048.2820071
22. Kristie Fisher, Scott Counts, and Aniket Kittur. 2012. Distributed Sensemaking: Improving Sensemaking by Leveraging the Efforts of Previous Users. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’12). ACM, New York, NY, USA, 247–256. https://doi.org/10.1145/2207676.2207711
23. Sijia Xiao, Coye Cheshire, and Amy Bruckman. 2021. Sensemaking and the Chemtrail Conspiracy on the Internet: Insights from Believers and Ex-believers. Proceedings of the ACM on Human-Computer Interaction 5, CSCW2 (Oct. 2021), 1–28. https://doi.org/10.1145/3479598
24. Pao Siangliulue, Joel Chan, Bernd Huber, Steven P. Dow, and Krzysztof Z. Gajos. 2016. IdeaHound: Self-sustainable Idea Generation in Creative Online Communities. In Proceedings of the 19th ACM Conference on Computer Supported Cooperative Work and Social Computing Companion (CSCW ’16 Companion). ACM, New York, NY, USA, 98–101. https://doi.org/10.1145/2818052.2874335
25. Nathan Hahn, Joseph Chang, Ji Eun Kim, and Aniket Kittur. 2016. The Knowledge Accelerator: Big Picture Thinking in Small Pieces. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16). ACM, New York, NY, USA, 2258–2270. https://doi.org/10.1145/2858036.2858364
26. Joel Chan, Steven C. Dang, and Steven P. Dow. 2016. Improving crowd innovation with expert facilitation. In Proceedings of the ACM Conference on Computer-Supported Cooperative Work & Social Computing. https://doi.org/10.1145/2818048.2820023
27. Simon Buckingham Shum, Enrico Motta, and John Domingue. 2000. ScholOnto: an ontology-based digital library server for research documents and discourse. International Journal on Digital Libraries 3, 3 (Oct. 2000), 237–248. https://doi.org/10.1007/s007990000034
28. Tobias Kuhn and Michel Dumontier. 2017. Genuine semantic publishing. Data Science 1, 1-2 (Jan. 2017), 139–154. https://doi.org/10.3233/DS-170010
29. Anita de Waard. 2010. From Proteins to Fairytales: Directions in Semantic Publishing. IEEE Intelligent Systems (2010). https://ieeexplore.ieee.org/document/5456415
30. P Avasthi and M. L. Hochstrasser. 2022. The experiment begins: Arcadia publishing 1.0. Arcadia Science (May 2022). https://doi.org/10.57844/arcadia-050a-q254
31. Martin Kleppmann, Adam Wiggins, Peter van Hardenberg, and Mark McGranaghan. 2019. Local-first software: you own your data, in spite of the cloud. In Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software. ACM, Athens, Greece, 154–178. https://doi.org/10.1145/3359591.3359737
32. Cristina-Iulia Bucur, Tobias Kuhn, Davide Ceolin, and Jacco van Ossenbruggen. 2022. Nanopublication-Based Semantic Publishing and Reviewing: A Field Study with Formalization Papers. arXiv:2203.01608 [cs] (March 2022). http://arxiv.org/abs/2203.01608