Citizen Science: Beyond the Laboratory @ 4S/EASST 2016

Spinosaurus on the hunt in Barcelona!

Every 4 years, the Society for Social Studies of Science (4S) and the European Association for the Study of Science & Technology (EASST) co-locate their meetings. This year we met in Barcelona, where a huge crowd of STS (science & technology studies) scholars presented more than 1100 talks in over 100 tracks in just 3 days. The citizen science track, Citizen Science: Beyond the Laboratory, that we (Gabe Mugar, Carsten Østerlund, & I) organized had a whopping 3 sessions and 13 papers! Two of the papers weren’t technically on citizen science, so I’ve included only a brief description of their main themes, but the presenters did an amazing job of fitting their work into the conversation quite artfully.

As usual, these are fairly raw notes, unembellished and only minimally corrected. I did remove the honorifics that were inserted in the program, as distinguishing between those who have and have not finished their PhDs seemed very crass to me.


Session 1: Negotiating tensions at sites of investigation

Landscapes and Property Lines: the contradictory practices of citizen scientists
Karin Patzke (Rensselaer Polytechnic Institute)

Milam Co, TX: everyday environmental conservation with formal & informal citizen science practices. Uses concept of extramural knowledge production to define citizen science to highlight the roles/relationships; focus on participants in formal programs.

1990s – state task force on nature tourism, involving multiple agencies to create rural nature-oriented tourism. Recommendations included tax incentives to transfer land management toward wildlife management. Formal category of property appraisal situated in ag, not based on productive value of the land, e.g. resource management, but with properties part of a larger ecoregion & network. 20 yrs later, this vision hasn’t succeeded; agricultural production is still in decline due to urban encroachment. Cultural value of land in TX is situated around the land-rich & cash-poor status that has been the case since TX was an independent republic. Wildlife management tax status lets them keep land that’s less productive.

Focusing on bureaucracy in formal tax evaluation. Framing from legal studies & STS. Legal fictions are how facts are used to create legality. Wildlife management is a practice that encourages contradictory things: land as public space that species pass through, and as managed spaces like parks, but also required to be managed as though private, without consideration of neighbors.

TX legal regime is hybrid of other traditions; legal fictions are genres of fact to create action, in this case, legal bureaucracy. Always legal action, not illegal because not adjudication. For this case, fiction is wildlife management as agriculture in tax law.

Landowners work with biologists for land management practices & to legitimate practices, often through citizen science like documenting migrations and species presence/absence. Works as a form of legitimacy for cultural practices, still through a productive lens, attained through production & participation in citizen science. Dominant form is observation census: NestWatch, iNat, seed production counts. When taxes are due, landowners file a 10-pg form from the appraiser’s office and add up to 50 pgs documenting their work in citizen science & workshops. Wildlife management isn’t just about bluebird boxes, but active participation as documented in paperwork, which establishes the legitimacy of wildlife management claims. Wildlife is only seen as a productive function of agricultural practices. Citizen science thus legitimizes their practices.

But participation doesn’t lead to sustainable practices; the geography of the landscape doesn’t change because the value of land is based on what the property produces. So novel practices on one property don’t expand to adjacent properties; fences divide traditional agriculture from wildlife management, the only way to get legal value for the land. Consuming nature is rhetoric, but it’s the translation, through legality, of consumption into production that produces value for the state.

Participation in citizen science is how legitimacy is constructed to maintain property rights of ownership and keep land separate from a whole. By relying on the production of scientific knowledge to legitimate wildlife management, there hasn’t been much effort to get past the land-rich, cash-poor status. Reinforces divisions between properties.

Who are the citizens in citizen science? Public participation in distributed computing
Elise Tancoigne (Université de Genève)

Interested in amateur production of knowledge, their definition of citizen science. Starts with OSTP quote from Jenn Gustetic on 9/9/2015, asserting ability for people to engage in science even when not formally trained. Also highlights popular media focusing on democratization of discovery. Democracy has multiple meanings, here understood as egalitarian.

RQ: Do the citizen sciences democratize sciences? Or not? Case is distributed computing. Why? Distributed computing has millions of participants, which is convenient for demographic studies. Also considered not “true” citizen science, which makes it interesting. Gives some history of SETI@home, example of why they are focusing on BOINC, which has over 540K active volunteers and over 1.1M computers involved in 40 ongoing projects.

Graph of new active users/month in BOINC projects. Several papers have tried to characterize distributed computing, but few exceed 1K survey respondents & they don’t include demographics about profession. Didn’t run a survey due to methods problems; instead examined BOINC user profiles, sampled from 6 projects (SETI, LHC, Climate, Malaria, PrimeGrid, Rosetta) for 2K profiles coded for age, gender, profession, education, hobby & then ran descriptive stats, planning machine learning in future.
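The coding-and-descriptive-stats step could look roughly like the sketch below. This is only an illustration of the method as described; the keyword lists, field handling, and sample data are my own assumptions, not the study’s actual codebook.

```python
from collections import Counter

# Hypothetical keyword lists for coding free-text profile fields into the
# study's categories (S&E, IT, other); the real codebook is not public here.
SE_KEYWORDS = {"physicist", "biologist", "engineer", "chemist", "astronomer"}
IT_KEYWORDS = {"programmer", "sysadmin", "developer", "it", "software"}

def code_profession(text):
    """Assign a coarse category to a free-text profession field."""
    words = set(text.lower().split())
    if words & SE_KEYWORDS:
        return "S&E"
    if words & IT_KEYWORDS:
        return "IT"
    return "other"

def descriptive_stats(profiles):
    """Proportion of sampled profiles falling in each profession category."""
    counts = Counter(code_profession(p) for p in profiles)
    total = len(profiles)
    return {cat: counts[cat] / total for cat in ("S&E", "IT", "other")}

# Toy sample standing in for the ~2K coded BOINC profiles
sample = ["software developer", "retired physicist", "truck driver",
          "chemist and teacher", "sysadmin"]
print(descriptive_stats(sample))
```

In practice, of course, human coders handle the ambiguous cases that keyword matching misses, which is presumably why machine learning was planned only as a follow-up.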

Categories for profession/education are science & engineering (S&E), IT, and other; same for hobby, except with scifi added. Current population trends: 90% male, and the majority of people in their 20s, below the median age of the population. So age inequality is less important than gender inequality.

For profession, found 60-80% had job in S&E or IT, pretty evenly distributed but lower in SETI and higher in others. Global population is less than 1% in these professions. These participants are already engaged in science as a hobby – excerpts from profiles demonstrating geekery.

So the participants are young men with a science background, job, or hobby. A new avenue for people already engaged in S&E or IT activities. Only 1 participant in 5 corresponds to “the public” as portrayed by citizen science advocates. Example of natural history museum visitors to show the imbalance: distributed computing is not good for attracting new people to citizen science.

Q: ethics of downloading? Institutions also participating in distributed computing? How does it promote that concept?

Ethics: contacted projects & asked for permission to download; got very different answers from projects because, with global participation, law doesn’t apply in the same way everywhere. Had to do it project by project; most of them didn’t know how to answer – not sure whether they could give permission, but said it’s OK to take the data.

Traditional Knowledge, Citizenship, and the Conditions of Scientific Participation
Sarah Blacker (Max Planck Institute for the History of Science)

Focus in northern Saskatchewan, Lake Athabasca, north of Athabasca oil sands and far north of Edmonton. Question of chemical composition of water flowing into the lake that spurred public investigations.

Athabasca oil sands one of the largest industrial projects in the world in terms of land area, 2 other oil sands adjacent. Bitumen separated from sands chemically, contaminants deposited in tailings ponds, hard to represent the scale of them. They contain pollutants that leak into the river & enter food chain. Company has acknowledged leakage of millions of gallons of polluted water daily.

Collaborative study between First Nations communities & academics, producing evidence of extensive contamination; they’ve documented higher cancer rates. Developing evidence in 3 forms: unmediated, untranslated textual knowledge about contamination in communities; measurements of contaminant levels using current industry standards to make it legible & credible for policy; & a synthesis of the first two forms. Is the 3rd form a type of knowledge reflective of democratic knowledge, or does it reproduce colonial power relations and privilege Western science? Exploration of the epistemic consequences of a hybrid study, pointing to hierarchies and privilege.

Particular attention to role of knowledge mediator; ecologist has worked with community for years & was invited by them to collaborate. Environment Canada has also funded studies into contamination in region, but didn’t incorporate traditional knowledge & didn’t stay in the area long enough to acquire that depth. Government lacks capacity to incorporate traditional knowledge.

Definition is focused on public participation in science, due to problematics of terminology. First competing study by EC looked at cancer rates, elevated by 30% in area but attributed to other risk factors for community such as obesity, alcoholism, etc. Another study found 17-453x higher contamination than “safe” in animals but no relation to human cancers.

Locals reached out to McLachlan because their experiences weren’t being considered. Funded by Health Canada & SSHRC, peer reviewed by Health Canada. First time science aligned with First Nations leadership (according to McLachlan). Benefits of integration include pointing inquiry in the right direction. Research challenges include that cancer develops on a delay; traditional knowledge can register changes in different ways that are detectable and precede cancer detection.

Shows examples of reports – arsenic levels in meat animals such as moose, beaver. Reports produced for Mikisew Cree and Athabasca Chipewyan communities; notable that many community members are employed by the oil sands, so they don’t want to shut the oil sands down. Locals want to work with industry & government to manage cooperatively: the aim is modest, yet even with a social democratic government in Alberta, the provincial government is shutting down monitoring, citing unnecessary duplication of studies, etc.

Multiple barriers to indigenous participation in science. Argument that Western science and traditional knowledge are incompatible within an individual, but that they can be collaboratively engaged in communities. Attending university, gaining scientific expertise & ability to work with policy, removes indigenous peoples from their traditions.

Q: Other examples in Canada – why is it that communities invite scientists to participate, doesn’t happen that way elsewhere.
A: In this context, it’s due to the contamination & physician reporting higher cancer rates to mobilize resources, physician got fired. Community reached out to McLachlan because of his reputation for working with indigenous communities in Manitoba, they trusted him more than other scientists.

Architecture and social sciences’ spatial turn: dialogue or monologue?
Leandro Rodríguez-Medina (Universidad de las Américas Puebla)

Absorptive capacity and routines to appropriate knowledge. Relationship between architects’ practices & social science knowledge: architects accept SSK when they recognize their project must take context into consideration. Artful translation of stakeholder relationships in architecture to relate to citizen science, very thoughtfully worked out.

Toward an inherently collaborative rhetoric of science communication
Erika Szymanski (University of Edinburgh)

Studied wine industry in NZ and science communication. Material semiotics important for dialogue between scientists & non-scientists. Tech transfer models fail to make connections to winemakers’ knowledge despite lots of participation. Again, excellent refocusing of original material to fit into the themes of the discussion and added a complementary view on scientist/non-scientist working relationships.

Session 1 Discussion

Q: Elise, what’s answer to the question?
A: In distributed computing, not really democratizing per se, but 20% are in fact “amateurs” drawn into the project.
Q: Others speakers?
A: Sarah – the fact that it doesn’t work says a lot about the context of her study.
A: Erika – failing because it’s being used as a tool, not a source of knowledge from outside of traditional science.

Q: 3-track model in Sarah’s talk, what was the synthesis?
A: Scientist asked her not to show images because of challenges with publishing, they’re still trying to get published & ran into problems with climate of censorship of science in the country at the time. It’s not so different from the representation, incorporates text & measurements on a page.

Q: Me – Sarah, is Antonio Gramsci’s organic intellectual concept applicable?
A: yes, and view expressed in talk was on the extreme end; other members of the community would agree with that conceptualization.

Q: Static style of science, dynamics of knowledge in these problem spaces. How is this present in these cases?
A: Sarah – McLachlan & a collaborator were in community for over 2 years, described it as more than dialogue, wouldn’t have been comfortable doing the project if he hadn’t lived there long enough to understand the situation in depth. Why that quote on that page? He felt it best represented the concern over the potential mis-contextualization of measurement. Definitely problems romanticizing contextual knowledge.
A: Erika – understanding of the methods is very strong, scientists & winemakers share similar social spaces in NZ, are friends, understand one another’s practices well. Surprised by the dynamics in the transfer space as a result. Rhetorical framing of how science communication is done reinforces the practices.
A: Karin – long generational history of immigrants in TX that’s created agriculture practices that reflect intercultural knowledge of legality, especially around moving cattle through spaces & planting certain crops. When wildlife management as agriculture came around, it challenged people to reconceptualize their land and properties. Notion of prairie has emerged as a romantic representation that challenges agriculture production frame – an imaginary about traditional knowledge, which is very problematic.
A: Leandro – hard to see how traditional Western knowledge can communicate; usual platforms, e.g. science papers & conferences, practices, he looked for new “products” that might have more influence than the professional version. One way to consider dialogues is seeking new platforms for presentation, questioning, etc.
A: Sarah – they did make a documentary!

Session 2: Ecosystems of participation

Trading Zones, Citizen Science and New Infrastructures for Knowledge Production
Per Hetland (University of Oslo)

Context: part of a larger project, Mediascapes; this is just one component. Main goal is looking at links between museums & the public. Citizen science is 2 case studies, plus one in the humanities, & bridging studies. Variety of partners for citizen science: Natural History Museum in Oslo, GBIF, SABIMA, Norwegian Biodiversity Information Centre. SABIMA represents the “amateur” organizations.

RQ: How does NH Museum & stakeholders interact with communities of interest outside of professional institutions & engage amateurs/volunteers in citizen science? Has debated which label/concept to use, prefers amateurs or volunteers as better reflects the space.

Definition: project in which amateurs/volunteers partner with scientists to answer real-world questions (from CLO). Qs then are what are we doing, who volunteers and why, who decides what to ask, what do the answers look like? Bonney/Shirk models of citizen science. Focusing here on 3 middle models; collegial is interesting in historical sense but not sure it’s so common anymore.

5 cases. #1: crowdsourcing and transcription. Very common format. National historical records in Norway include 6M records with 60-70% already digitized, volunteers working on the rest. Motivations for this activity? Lit suggests contributing to science is most important, usually another one besides – save the whales, etc.

#2: Validation & expertise in Species Gateway (Artsobservasjoner) – observational data submissions on species reporting. Amateurs were disappointed with getting only footnotes on transcription work, felt they were left out and not visible anymore. So they asked for the species reporting portal, going on since 2008, 15M observations in a population of 5M, core group does most of the work, most only contribute a few. Challenge is validation with 5K records/day. Concept of apomediation (Eysenbach, 2008) to validate the work.

#3: social networking & amateur communities around grasshoppers, using Facebook; interested in how social aspect of participation is understood.

#4 User engagement & youth engagement; biodiversity mapping involves a lot of gray hair. Interested in attitude formation & science career trajectories.

#5: Amateur-institutional relationship & new technology – are museums still getting all the physical contributions that they once did, or only digital contributions? (Hetland, 2011)

Observational matrices for each case with Actors, Rules, Activities. Describes each of the cases with this framing. In case #3, curators do the validation; the biodiversity network paid to have “professional amateurs” validate the species records. They have a “red list” that needs close attention and a “black list” of invasive species that are especially important. The #4 curation role involves taxonomic maintenance, e.g., 8 different names for each plant.
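The triage logic described above (red-list records needing close attention, black-list invasives flagged as especially important) can be sketched as a simple priority queue for validators. The record structure, flag names, and species are my own illustrative assumptions, not drawn from the Species Gateway itself.

```python
# Hypothetical incoming species records as (species, flag) pairs; flags
# mirror the "red list" (needs close attention) and "black list"
# (invasive, especially important) mentioned in the talk.
records = [
    ("common gull", None),
    ("king eider", "red"),
    ("garden lupine", "black"),
    ("house sparrow", None),
]

# Assumed ordering for the validators' queue: black-list first,
# then red-list, then everything else.
PRIORITY = {"black": 0, "red": 1, None: 2}

def triage(records):
    """Sort records so the most sensitive ones reach validators first."""
    return sorted(records, key=lambda r: PRIORITY[r[1]])

for species, flag in triage(records):
    print(species, flag)
```

With ~5K records arriving per day, some ordering like this seems necessary so that paid “professional amateur” validators spend their attention where mistakes matter most.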

Scientists, Citizen Scientists, and the People in the Middle
Hined Rafeh (Drexel University, Rensselaer Polytechnic Institute)

Self-reflective examination of the “person in the middle.” Interned with SciStarter & did a lot of promotion at public events. Was part of a partnership with the NASA GLOBE program’s SMAP Mission. Goal is ground-truthing satellite data; recruited among schools, universities, tribes, prisoners, kids, etc. Had 286 express interest, 110 trained & collected data; equipment was provided for some who didn’t have any. Soil collection protocol took a 2 hr training but 5-10 min to complete: weighed soil, dried it, and then re-weighed it.
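The weigh–dry–reweigh protocol amounts to computing gravimetric soil moisture. A minimal sketch of the standard calculation is below; note this is the generic gravimetric formula, and I’m assuming (not confirming) that GLOBE/SMAP’s exact procedure matches it.

```python
def gravimetric_moisture(wet_g, dry_g):
    """Gravimetric water content: mass of water per unit mass of dry soil."""
    if dry_g <= 0 or wet_g < dry_g:
        raise ValueError("weights must satisfy wet >= dry > 0")
    return (wet_g - dry_g) / dry_g

# Example: a 120 g field-moist sample dries down to 100 g
print(gravimetric_moisture(120.0, 100.0))  # 0.2, i.e. 20% by dry mass
```

The simplicity of the arithmetic helps explain the 2-hour-training / 10-minute-task asymmetry she describes: the training is about consistent sampling and drying, not the math.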

When talking with scientists from NASA, it was all about data quality. They weren’t concerned with things like time zones and struggles using Hangouts. Surveyed participants with 42 responses; majority content with training, but also issues with details like choosing sites, how to use protocol properly, and not happy with GLOBE’s level of engagement.

Got to go to OSTP for 9/30/15 event. Nice to talk with people but felt there were lots of barriers to being taken seriously. Next steps for this project include expanding to new audiences; lending libraries for participation kits; partnership with eco-skills.

Felt that GLOBE scientists had nothing to lose & got a lot out of it. Couldn’t understand why people were participating because it’s free labor; she believed they got more frustration than enjoyment out of it, though they continued participating nonetheless.

Future research directions: 3rd party players; knowledge creation or data collection; motivation, empowerment, & social value of science. No one talking about groups other than scientists & contributors such as 3rd party organizations.

3rd parties: Claims they are all for-profit, which is an interesting interpretation because it’s only true of SciStarter, & mistakenly named other organizations which are not actually for-profit. Claims they are all financially backed separately from scientists & that they provide easy access to projects & resources.

Knowledge creation vs data collection: experienced emphasis on data quality & not much else. Believes that there’s no attention to other outcomes. Feels that people are barred from participation in the “real discussion”. Many projects say they intend to improve something, but what? What are real outcomes?

Motivations: quotes a person who enjoyed participating; then second-guesses that experience by questioning if the participant is “really doing science”, and if data collection is enough.

Empowerment: analytical framework from Corbett & Keller. If this is considered empowerment, what does it tell us about science & society?

Values: Kids are first ones to adopt “science is cool” perspectives. How does this figure in to other people’s relations to science?

[NB: This presenter’s comments on future directions were based on short-term personal experience with a project rather than formal inquiry.]

Enrolling scientists, citizens and lichens for knowing the chronic effects of pollution in the Fos-sur-Mer industrial area (France)
Christelle Gramaglia (IRSTEA); Philippe Chamaret (Institut Ecocitoyen pour la Connaissance des Pollutions)

Focus on ignorance, undone science & citizen sciences. Local communities in polluted post-industrial spaces get no info about consequences on health & environment. Concerns are not taken into account as related research is unfunded, incomplete, neglected or undone. Citizen initiatives develop in reaction so that the missing/needed data are gathered. Variety of formats and outcomes.

Fos-sur-Mer area in southeast FR: industrial harbor, one of the largest in Europe, with 20K ha of heavy industry. Lack of knowledge about pollution impacts & the siting of a waste incinerator led to local community concerns about environmental & health risks. A nonprofit organization runs a citizen observatory to address these concerns. One of the foci is citizen engagement in biomonitoring: not just gathering data with established methods, but also trying other methods to augment that and evaluating them for regulatory practice.

VOCE: volunteers for the environment. Collecting lay observations & insights, elaborating protocols, providing access to training & knowledge, encouraging engagement in monitoring. Monitoring includes observations & claims, measurements & data collection, etc. About 50 ongoing volunteers, many retired but more and more younger volunteers, they want to understand pollution impacts on the area. 2 cases of studies.

Researching water quality with conger eels: looking at chemical contamination of the marine habitat by measuring contaminant uptake in the fish. Citizens’ input was important for finding a bio-indicator that respected the balance between scientific and social stakes – couldn’t choose a species with economic importance, so they landed on conger eels – and for bringing technical know-how through specialized volunteers. Results showed high concentrations of mercury, arsenic, & chlorine compounds.

Research air quality with lichens: implementing & maintaining register of exposure to pollution in industrial areas. Lichens have differential response to pollution, so they make a good bio-indicator in terms of biodiversity. Good public participation among non-specialized participants leading to high density of measurement sites. Results found complementary measures of air quality; data integrated into formal studies. Quantified impacts of pollution on lichens which showed higher concentration near to industrial sites (and lower biodiversity).

Data quality & access: close collaboration between scientists & volunteers because volunteers involved in many steps of the research process. They receive training for independent work, but scientists help model proper methods, answer questions, and do data validation. Provides scientifically useful data, validated by scientists, & results returned to contributors at public meetings.

Benefits of citizen science: important because it can address research needs that are otherwise overlooked. Sharing the scientific culture is possible & can help local organizations in formulating strategic claims e.g. during public hearings. Addresses important issues without offending locals’ sensibilities & interests while taking into account local knowledge & know-how. Accomplishes work that scientists alone could not do.

Co‐Creating Research Agendas through Multi‐Actor Engagement
Niklas Gudowsky (Austrian Academy of Sciences)

Developing models for co-created research agendas for future engagement. Expectations are important in shaping emerging tech & the promise of progress. Future-oriented perspectives harness expectations and influence discussions & policy, which may impact funding paradigms. Anticipation beyond short-term prediction is very arbitrary, often contradictory – basically just educated guesses. The experts’ dilemma says you can elicit contradictory expert opinions at almost any time, so you can get the science opinion you want; similarly, policymakers have “pet experts” whom they engage over and over to portray a certain view of an issue. Imaginaries shape the present such that alternate futures are less likely.

Public engagement in STI issues: lots of criticism that desired objectives are not met, failing to open up debate, didn’t yield rationality gains desired, results of public engagement don’t translate to policy, people are engaged too late in the process. Research agendas seem like a viable early entry point to evade these issues. Open framing of early agenda & “blue sky engagement” doesn’t require expert background. Collective utopian thought doesn’t require that framing prior to engaging, process orientation of thinking about desirable future which doesn’t require prior knowledge and everyone can respond to that, deliberatively engaging these future visions.

Basic framing: Citizens’ framings of visions go to experts & stakeholders, who generate recommendations, which are returned to citizens to evaluate whether they meet their visions; once validated, they can move to policy-making. Tested the process in 8 EU projects early in Horizon 2020. Have conducted several local and national-scale projects, focusing here on CIMULACT.

Diagram of the process – vision workshops; catalogued visions & the needs they represented; co-created research programs around those areas of need; created research program scenarios, discussed through open online consultation and face-to-face meetings, which helped with enriching and prioritizing research program options; defined research topics in a pan-European conference, with outcomes of policy options and research topics that are expected to shape future EU research. The process starts by grounding in the past before moving to visions for 2050, which engages older participants, and progressively more of the group engages as they can contribute, discussing changes from the past. Used to spark creativity, using typical facilitation techniques.

Examples of need areas and the visions, research scenarios that include directions, questions, concerns, expert view & citizen view. Now doing online consultation, everyone can participate in process, takes about 20 minutes.

Discussant: Alan Irwin

Nothing he says is critical, just wants to ask questions that cut across presentations. 3 general questions & then a few words about each paper.

Theme 1: fascinating to see the rise of citizen science & ECSA, exciting movement now, when 20 years ago it was all unknown. Instead of being a general talking point, we now have empirical work. What is going on in the rise of citizen science? How do we come to terms with that? Is what we now call citizen science something we’d previously have called counter-expertise, radical science, along the lines of scientist-activists in 70s? What does this say about nature of citizens & of science?

Theme 2: Models of citizen science that are used, not to critique modeling which requires certain assumptions, but interesting reflection on the perspectives that models emerge from. Gives examples of Muki Haklay’s model. Isn’t it true that these models are assuming that citizen science is intended to influence science? That’s one dimension, but expects that many participants don’t care about the “real” science at all, that publication isn’t important to them, and they really want to solve problems and learn more about the world. What other models of citizen science could we have that don’t assume a knowledge dimension?

Theme 3: What do participants themselves get out of this? Especially when we acknowledge that the knowledge generation part is complex, what else are they getting? Emotional, learning, fascination – people just want to talk about things they’re passionate about. Moving away from epistemology – and bringing in the politics – this is a focal point that opens it up to broader discussion.

Comments on the papers:

Hetland: using Collins & Evans classification of contributory & interactional expertise, fits with the question of whether it’s intended to create knowledge. It was always both, as claimed, but how do you know that’s the case & whose perspective does that reflect?

Rafeh: invokes questions about empowerment, which was an underlying theme, and goes back to motivation, which assumes some power to do something. What was the power they were getting & what did they get out of it? Also suggests she’s not a middle person, a more active contributor than that. Makes us question what our role is in STS? We may be driving the discussion even as we analyze it.

Fos-sur-Mer team: Can be seen as counter-expertise; referred a lot to scientists and volunteers, want to consider that relationship more deeply: can’t the scientists also be volunteers & vice versa? This false dichotomy builds in potential to distance people rather than bringing together.

Gudowsky: How do you know that thinking about the future changes the present? Seems plausible, but what’s the evidence? For example, did sci-fi visions from the 50s and 60s shape today? Need to challenge the idea because it’s profound in the space of technology assessment. Tech is more complicated than that.

Return to 1st 3 questions:

What about nature of citizen science in general?
What about the models, do they over-emphasize epistemic models to neglect of other aspects?
What can you say about what contributors get out of it?

Hined: these are connected themes, e.g. nature of citizen science & motivations. Historically clear distinctions but now can’t separate science from the way we look at the world, science is the new religion. Breaking down barriers as to who can participate.
Per: knowledge comment is relevant; knowledge to contribute to science is very important, but much is passion-driven, which is very interesting.

Christelle: in early discussions about this work, enthusiasm about opening up participation & opportunities was discouraged, that’s not what they felt they wanted. When earlier citizen work was dismissed, their engagement with professional scientists was seen as a way to get solid science & aspiration/ambition toward scientific credibility. That’s what the strategic approach was for this institute, to share labor, and the data is to be used by multiple parties. Maybe related to French epistemic culture, high respect for science & expert authority, not challenging that frame.

Niklas: was asking that first question himself too: is what we’re doing citizen science? A lot of prior work focuses on hard sciences, in soft sciences not as much is happening. Looking at current work through future studies & saw active participation & co-creation in that space, felt that makes it a type of citizen science, can open up and include humanities & social science in citizen science. Asked what people get out of it: in workshops, they see it’s hard to find participants & they use a lot of different strategies, but once you get them in the room & encourage open discussion & ideation, it builds a lot of passion among participants and the networks of individuals persist for years. Empowerment of thinking that they could influence the future & they certainly don’t care about publishing papers. Asking about future changing present – not the thinking about it, but the results that come out of it have some impact on directions moving forward. If you look at corporate oversight, that has been strongly shaped by public engagement.

Hined: to return to epistemic monitoring, got called out for calling people citizen scientists – you’re the one giving them that label. Recognizes that she’s projecting her values on participants and that’s an active role that changes dynamics.

Christelle: what else volunteers get out of it – people with no prior political commitment; many do surveillance for forest fires or cleaning up the harbor, but they didn’t mobilize over the incinerator & watched passively, thinking nothing could be done. Participating in biomonitoring is a way to act on their concerns and enact attachment to a place that has been degraded but could be rehabilitated.

Per: Knowledge is very important in biodiversity mapping. The museum’s records are seen as higher status than records from amateurs. They’re essentially competing for validation as accepted knowledge. Some researchers are using the data; others refuse & consider them unvalidated & unreliable, despite errors in the professional science work – we still hold expert knowledge at a higher status than amateurs’.

Q: Not criticism, but Alan’s point about what models of citizen science are at play links to broad, old question of the relationship between experience & knowledge. How is knowledge experienced? When you use terms like volunteers/amateurs, and training instead of education, that sets a specific framing about power relationship.
A: Hined – how to write a paper about own experiences? The experiences aren’t standard research data, but figuring out what it is often comes from groups who prefer the label of layperson or volunteer. It doesn’t come from one place & that makes it hard to answer.

Q: A way to reformulate the question: when things don’t go as expected, interesting to see what happens when participants resist the way science is to be done. In Fos-sur-Mer, did anything happen to change protocol to address different interests in ways that modify research?
A: Christelle – Question of model; it was difficult to apply just one model, it’s so contextual. Model is enlightening but reductionist when giving an account of what’s happening. Recalls prior research where locals knew the ecology but it took several years to demonstrate its accuracy through scientific means. Conger eel case – if you work with oysters, you’d get in trouble & locals will resist because the work may stigmatize the area & destroy the way they make a living – so collaborating on indicator species selection was a useful move.

Q: Remark about sociological citizen science & social feedback: last year did a quant study & sent thank-you letters that invited their thoughts on the long interviews, was surprised at extent of response & some people even sent back the thank-you money.

Q: Claudia – for tech assessment, those formats like consensus conferences on tech issues, so you were doing engagement for STI policy-making, what are mechanisms to ensure the recommendations are taken up? And if we think future changes present, it’s essential to think & reflect about whether it counts or not, whether it’s a fiction of the participatory future-making, & the consequences that may have.
A: Christian – Uptake depends on the project, depends on who solicits the feedback. On the one for food security, they were both the participants and the people asking for the work to be done. Working with policy-makers takes a lot of effort, many letters, meetings to share results, etc. Also really depends on the interests of individuals – people asking for results so they can integrate them, others only slowly getting interested.

Q: Credibility by citizen scientists, tension in terminology of amateurs vs professionals. Something to consider.
A: Me – I never use the term because of implied assumptions, but Rick Bonney observed that the etymology of “amateur” speaks to work done out of love, and that sort of passion was a clear trend in the stories we heard from speakers. Despite the baggage we associate with the term when it’s placed into a false dichotomy with “experts” – and it really is a false dichotomy, these are not opposite concepts – it’s actually quite apt at capturing the sentiment expressed by participants.
A: Christelle – there are not 1 but 2 words in French for this set of notions that represent these sensibilities well, amateur and connoisseur [the Old French root means “to know”].

Session 3: Infrastructures, technologies, & policies

Citizen Science by Other Means: Technological Appropriation & iNaturalist
Anne Bowser (Woodrow Wilson Center) & Andrea Wiggins (UMD)

Departure from original goal of developing a typology of infrastructure: noticed a few things in an earlier research exercise. Components of infrastructure were not easily arranged in a hierarchical fashion, but represented more of a networked assemblage. Huge range of tech in use, like social media, in-house tools, etc. So zoomed out to understand citizen science tech as complex sociotechnical system, thinking of them as system assemblages with both technical & social components.

Key components are borrowed or bought through appropriation as defined by Bar, Pisani & Webster, 2007. New pilot method was Dix 2006 framework to analyze the potential for appropriation in iNaturalist. Overview of iNat: goals of socialization, education & data collection. Components include app, website, community & researchers. Process of participation is uploading observations, crowdsourced validation, sharing for research grade, and data access. Theoretically a global research infrastructure, primarily funded in the US and data collection reflects it.
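The participation process described above (upload, crowdsourced validation, promotion to research grade) can be sketched as a toy pipeline. This is an illustrative sketch only – the class, field names, and promotion rule are invented for clarity, not iNaturalist’s actual data model or quality-grade algorithm:

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    """Toy model of an iNat-style pipeline: upload -> community ID -> quality grade."""
    species_guess: str
    has_photo: bool = True
    has_date_and_location: bool = True
    community_ids: list = field(default_factory=list)  # identifications from other users

    def add_identification(self, taxon: str):
        self.community_ids.append(taxon)

    @property
    def grade(self) -> str:
        # Simplified stand-in for the real quality-grade rules.
        if not (self.has_photo and self.has_date_and_location):
            return "casual"
        agreeing = self.community_ids.count(self.species_guess)
        # Rough proxy: community agreement promotes the record to "research grade".
        if agreeing >= 2:
            return "research grade"
        return "needs ID"

obs = Observation(species_guess="Danaus plexippus")
assert obs.grade == "needs ID"          # no community validation yet
obs.add_identification("Danaus plexippus")
obs.add_identification("Danaus plexippus")
assert obs.grade == "research grade"    # crowdsourced agreement promotes the record
```

The point of the sketch is that the grade labels sit at fixed points in a structured flow – which is exactly what makes them redefinable (or breakable) when the platform is appropriated.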

iNat supports appropriation in a couple ways. On-site projects, partners for data collection campaigns like NPS Bioblitz, and OSS code base on GitHub that can be adopted, e.g. in Natusfera. So how does iNat support appropriation by design? Outlines the criteria from Dix with examples from iNat.

Allow interpretation: multiple labels for grade of data that can be redefined in other adoptions or uses. Research grade is commonly understood by certain stakeholders, but if reworked, then data may become incommensurate.

Promote visibility: make data flows explicit, make data processes apparent, & make code open. Scaling up led to project aggregator functionality which streamlines process, but also obscures data flow.

Expose intentions: make the primary goals clear. iNat does this well in some ways; consequences of decisions around geoprivacy are clear, but data quality assessment is more opaque. Confusion about the “needs ID” flag, for example, can have downstream impacts on which data are considered research grade and used for research.

Support not control: reusable framework is flexible and allows a lot of options; while the platform could technically be used for things like H2O quality, for example, current configuration doesn’t support it, nor integration of complementary data to biodiversity content.

Plugability & configuration: highly structured process suggests that collecting a lot of data is the primary goal. Only certain configurations are supported.

Learn from appropriation: pay attention to use & whether tools can be better suited to end user needs. Natusfera was developed because iNat declined to support their needs for European projects with language support, etc.

Encourage sharing: documenting uses for supporting shared protocols and comparable data is very uneven right now. It suggests some preferences around which projects are using the platform “right” or in the preferred way.

Summary: prioritizing direct appropriation without modification of existing data flows over more complex code-based forms of appropriation. There’s little middle ground for intermediate forms of appropriation, e.g. if partners wanted to collect air/water quality data in addition to biodiversity data.

Discussion: situatedness – are there situations where appropriating a tech & adopting inherent values in design is antithetical to project goals? Tech components like Google Maps are subject to some caveats that seem to align with values of iNat, e.g., around openness, but other users may not recognize they’re adopting 2nd & 3rd hand values through the design of the tool they appropriate.

Dynamics: environments evolve; appropriation enables those dynamics. Example of Public Labs: once a kite is appropriated for monitoring, it’s unlikely to change, but software components’ materiality is different. Software is more mutable and gets changed when a code base is appropriated.

Ownership: appropriation brings a sense of ownership when people feel in control over a tech & its uses. DIY/maker communities may be able to contribute to this aspect in some cases. In iNat environment, appropriator is essentially a mediator for iNat, and that has implications for groups like NPS.

Summary: need to carefully consider impacts of appropriation on both the appropriated and appropriators.

Microfluidic systems: challenges and opportunities for citizen science
Mary Amasia (Madeira Interactive Technologies Institute)

Chemical engineer by training, working on critical tech. Early stage work & welcomes feedback. Example of the Flint water crisis: the citizen science promise was only partially met; participants took their own samples, but the samples were shipped to an EPA-accredited lab in Virginia. Issues of scale also impact things like whether adequate evidence for action can be collected.

Has worked in microfluidics for 10 yrs, tools for portable chemical testing, e.g. Lab on a CD. Multidisciplinary field with several specializations required – physics needed to understand how to manipulate a testing substance via capillary action, chemistry for analysis, etc. Combined with features of optical disk readers, it’s a tiny powerful lab, but not being used in places where it would have most impact, in places where there’s no access to those resources. Examples of paper, droplet generator, and CD based systems.

Developing BlueLightsLab to use BluRay drives for chemical testing. Few computers even have BluRay or DVD players anymore – obsolescent tech being underutilized that can be repurposed and reused for low-cost diagnostics. BluRay tech includes spinner, blue laser, and reader – so it could be used for sensing at scales of under 1 micron – applications like microplastics, invasive species, larvae. The laser can scan the surface of the disc for imaging instead of reading pits in the disc for data. One application is low-cost HIV testing, since it can detect particles at sizes ~1 micron & white blood cells are 10-15 microns. Another use is testing for DNA linked to GMO crops. Some of these are being used in a modified drive with an extra sensor above the disc surface to image through refraction or absorption.
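Absorption-based detection of this kind is typically reasoned about with the Beer-Lambert law. The sketch below is purely illustrative – the values (molar absorptivity, channel depth, concentration) are invented for the example and are not from the BlueLightsLab work:

```python
def absorbance(molar_absorptivity: float, path_length_cm: float, concentration_M: float) -> float:
    """Beer-Lambert law: A = epsilon * l * c."""
    return molar_absorptivity * path_length_cm * concentration_M

def transmittance(A: float) -> float:
    """Fraction of incident light that passes through the sample: T = 10^-A."""
    return 10 ** (-A)

# Hypothetical assay: epsilon = 5000 L/(mol*cm), 0.01 cm microfluidic channel, 1 mM analyte
A = absorbance(5000, 0.01, 1e-3)   # A = 0.05
T = transmittance(A)               # ~0.89 of the laser light transmitted
```

The shallow path length of a microfluidic channel is what makes a consumer-grade laser and sensor plausible here: absorbances stay small enough that a cheap detector can resolve the difference.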

How can this tech integrate with citizen science? Most emphasis on microfluidics has been on developing for biotech, or huge scale testing. Not much on the challenges for real-world samples as in citizen science. Microfluidics that are $5/test are too expensive for medical uses, but acceptable for citizen science. Good potential for DIY testing or crowdsourced monitoring, but has to be integrated into citizen science frameworks.

Just starting to work on open hardware platform for DIY bio. Also not assuming one best approach and looking at how to develop low-cost test kits for crowdsourced monitoring. Prior work is proprietary & siloed, but they hope to make it open source.

More Qs: what is testable? What counts as direct evidence? What makes a particular sensing method accountable or auditable within the existing legal & political context? How can expert & lay methods be leveraged together?

Privacy and Responsible Research in Citizen Science projects
Gemma Galdon Clavell (Eticas Research and Consulting)

Policy researcher in private company, working with industry, government agencies & admin. A year ago, got to work with a citizen science project, looking into ethics & legal aspects. Examined whether it’s being done as citizen science without the citizen. Focus on science, needs of researchers, but not as much the people providing data – data despotism.

Worked with 3 projects: Atrapa el Tigre, Bee-Path, and Observadores del Mar (marine observations) – 2 app-based, 1 web app. Were basic principles of data protection being taken into account? Processing personal data in the context of citizen science blurs lines between subject and object, so data privacy becomes an issue. Our increasingly digital lives make it hard to know what happens to one’s data later, including risks of re-identification. Saw a need to protect participants.

The process is analysis of the data life cycle – the process steps where things can go wrong. Incorporate principles of privacy by design and privacy by default to make sure the final product is in line with legal requirements and expectations of clients. Private contract, so can’t go into full details, but general findings follow.

Found general concern for privacy with specific remedy mechanisms. All were using their own servers and storage without cloud, which is important, and were also well protected. There were access protections with logs for accountability and figuring out who had access and how data might have been leaked. All had relevant privacy & cookie notifications – a good starting point, though sometimes such notices have nothing to do with what projects really do.

Vulnerabilities: data collection through apps is the main issue area – usually fairly minimal, through app permissions & web forms, especially with social media, plus excessive fields of unnecessary details sometimes required. Most projects relied on passive consent – users had to opt out rather than opt in – and most privacy-enhancing functionality was not on by default; they recommend the opposite. User password management was also a problem – passwords emailed in plain text!!! Metadata in pictures can tell you about the device, owner, locations, etc, so photos need to be cleaned in a way that’s not commonly done. Also data transfers, willingly to 3rd parties but unwittingly to search engines & web repos – that means people can’t delete data because it still exists elsewhere online, shared with repositories you don’t control. Sharing and reuse had no risk assessment – what if the data were exploited, what are the chances of re-identification? No one looks at it. Exploitation is also an issue – they ran an adversarial attack & the outcomes were not good. Re-identification is generally an issue. Data deletion is also not done in practice but is legally required, although it could be automated. Also, projects should always use the https protocol, securing all interactions with encryption.

Assessments of projects: Bee-Path only collects data when in the research area, which is great, but did background tracking by default. An adversarial attack let them use the name of someone in the project and find their data in Google, even after the data were deleted.

Specific recommendations: Determine beforehand which data categories may be shared or revealed; notify about privacy settings, implication & risks; option to hide data or anonymously publish it; allow people to rectify & erase data including stuff that may seem non-personal; request minimum info about participants including metadata; adopt transparent practices about sharing and notification.

Lessons learned: can’t decide unilaterally what privacy protections should be in place – should involve volunteers; transparency, open data, and participation not in conflict with data & privacy protection, not a tradeoff; data sets that are published need to be properly dissociated & analyzed; impossible to have single recipe so context is important.

[NB: as authors of a paper on this topic in Human Computation, Anne & I both approved of this talk!]

Awareness and attitudinal change in participatory air pollution monitoring
Christian Oltra (CIEMAT); Ainhoa Jorcano; Irene Eleta (CREAL‐ISGlobal); Roser Sala

Study as part of a larger project on air pollution & public perception & change. Air pollution is at harmful levels in Europe and has a high cost in terms of health care and economic loss. It’s mainly treated as a localized challenge using info systems based on reporting: websites, indices, alerts, advisories, and even apps. Traditional public info systems on air pollution rely on awareness of the availability of data.

Participatory sensing based on increasing availability of new capacity for public engagement on environmental risks. In air quality, more development in terms of hardware and sensors; others have looked at how the tech can engage the public. Very few empirical studies on how mobile sensors impact attitudes and behaviors about air pollution.

RQ: how does perception of local air pollution change with use of sensor? How are attitudinal dimensions changed by using sensor? Do participants feel helpless or empowered to act based on experience with sensor?

Asked small groups of people to register air quality for a week using a sensor. Started with a focus group, then a week-long usage diary study, and then a follow-up focus group. The sensor measured NO2, which is associated with particulates, so even if not the most dangerous pollutant, it’s a good indicator and provides useful resolution of data. 4 groups of 6, half recruited & half self-selected.

Results: experience with the sensor: main patterns of recording were recording levels in their surroundings; making comparisons between polluted & non-polluted places; looking for patterns; testing intuitions, etc. Half of the groups did a lot more than the other half, which tracked with recruited vs self-selected. Responses cited interest, curiosity, surprise.

Impacts on perception: awareness of NO2 presence, but surprised that levels were higher/lower than expected in different places. Were more aware of existence of NO2 levels in city, problem and data were more visible to them, and problem was considered more specific due to focus on what was measured. Understanding improved for NO2 but not of impacts or other air pollution issues.

Impacts on risk perception: for beliefs about severity & susceptibility, saw little evidence of changes in perceived susceptibility and very limited evidence of change in perceived severity, even when the levels were very high. Controllability & self-efficacy: little evidence of improved perception of controllability, e.g., “I can’t do anything that would change my exposure.” One person reported feeling empowered by personally collecting data rather than passively consuming it.

Impacts on behavioral intention: very little change to beliefs about issues and intention to act, but one participant reported using the start-stop system in her car to reduce air pollution, another reported intention to buy a mask for riding a motorcycle, and a third reported he wouldn’t use a butane gas heater anymore due to indoor air quality being worse than outdoor.

The experience generates some level of emotional interest and potential to generate more engagement, especially compared to just a focus group. The effect of sensors was mostly related to understanding NO2 levels & awareness; limited change in terms of perceived susceptibility and self-efficacy to reduce exposure; there seem to be significant differences between recruited & self-selected volunteers in terms of volume of contribution.

Q: Christian, next step in research?
A: No next steps yet; others working on improving sensors but they are the only social scientists working on it.

Q: For those participants who found opportunities for changes, were their experiences different from those who didn’t?
A: Christian – Provided suggestions from agencies on reduction & protection, and some took action, but it’s a poorly covered topic. No info on self-protection, generally on reduction. For them, the main Q is risk communication & improvement in engagement – should sensors be used by local agencies? Can they improve public engagement on air pollution?

Q: Christian, emphasis was on individual reaction/action, you can wear a mask but this is a problem that needs a collective response. Resilience is individual and asking if people were pushing for collective action?
A: Usually local governments emphasize different regulatory, political, and behavioral strategies. Haven’t gotten to that level yet, but they’re all important. In terms of behavioral, there’s room for self-protection – for example, checking whether standing further back from the street helps reduce exposure – and also for reducing polluting practices, and then engagement. The public must be more engaged in projects with sensors.
Q: Anne, talked about ownership a lot, mostly toward the technology, but how about the environment, e.g. stewardship?
A: ownership is personal responsibility and stewardship is shared responsibility. Interesting to think about tech as shared in terms of stewardship; could imagine OSS platform but it seems unrealistic. More realistic is looking at convergence between maker/hacker and citizen science to advance practices through progressive evolution.

Q: Feeling that often the projects are focused on different analysis equipment & kits, led by research needs? What kind of experiences or ideas for mobilizing creativity of participants to formulate questions, generate ideas on conducting research, etc?
A: Gemma – no need to mobilize it, need instead to get people at the top to listen to it. Often no one is really listening, so innovation at the neighborhood level isn’t often chosen as a subject of study. People are doing things bottom-up so the question is how to make that work visible, and that’s true of privacy too. Demand exists but market doesn’t want to respond, more about learning to listen & stop imposing needs of other agents into communities.
A: Anne – earlier there was a presentation on future-oriented planning of engagement; can build on that through citizen science – not just contributing ideas and setting priorities, but also working with policymakers & scientists based on priorities of communities. Framework conditions need to be improved.

Q: To connect to last point, she is based in Netherlands, doing action research with public in project using platform in (energy?) transitions. At start of research, did interviews to see and listen to their needs. Argues it’s a combination of bottom-up and top-down approaches – if you ask people what they want to research, you will need to facilitate that process to make it a meaningful research project. Project leader focusing on transition management, giving people room to discuss and construct the problems they want to solve at the end. Wants to ask Gemma & Anne, she asks about very particular types of data, e.g., financial. Of course, in context of energy, this is considered sensitive. Finds the tension interesting – discussion with umbrella organization, she wanted to get access for research, and on one hand there’s a natural feeling that data needs to be protected due to sensitivities, but the other side of the coin is, it’s a community and we should share everything to learn and grow. Comments on this tension?
A: Gemma – a specific issue with environmental research: effects of issues are felt individually down the line, so you don’t need to promote organizing because it’s detectable, but if it’s not immediately obvious, then it’s a bigger issue, especially if impacts are delayed. Privacy & transparency are not opposites; you can release data but would need to anonymize properly and analyze in aggregates – open data can expose private data that could be harvested by insurance companies to charge higher premiums. A researcher doesn’t need an exact salary, just a salary category – details can be obscured, but not many people are doing that.
A: Anne – major tension is the need to collect data locally for global-scale research, which means making big questions relevant locally, figuring out how to make questions resonate locally & designing for that. There’s some level of dichotomy with the open sharing, decisions need to be made more collectively and communicated so that local questions and data collection can be meaningfully understood in global contexts.
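Gemma’s point about salary categories rather than exact values is a simple generalization step before release. A hypothetical sketch – the bin edges and labels are invented for illustration, not from any of the projects discussed:

```python
def salary_category(salary_eur: float) -> str:
    """Generalize an exact salary into a coarse band before sharing data."""
    bands = [(20_000, "<20k"), (40_000, "20-40k"), (60_000, "40-60k")]
    for upper, label in bands:
        if salary_eur < upper:
            return label
    return ">=60k"

# The published record keeps only the band, never the raw figure.
assert salary_category(35_500) == "20-40k"
assert salary_category(72_000) == ">=60k"
```

The same pattern applies to any quasi-identifier (age, postcode, income): coarsen it to the resolution the research question actually needs, which lowers re-identification risk without closing the data.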

Q: Mary, what would you like to see happen next? What resources would move this forward?
A: She’s here because in the time she spent in microfluidics she became critical of the narrative: developing for certain settings, getting patents and licensing. Got tired of that, and really wants to see the tech being used more broadly. Trying to work with existing, reusable components, and trying to consider how to keep it as open as possible while making responsible decisions about the technology, particularly not to limit or determine how it’s used in the future. Thinking about that differently from traditional engineers. One of the initial projects is invasive species monitoring on Madeira, with lots of local investment, using it as a test case; wants it to be as open, usable, and adoptable as possible. Considering working with Public Lab.

Q: Some social scientists are reluctant about citizen science. Not clear on participation, insisting on the need for change beyond more awareness – change more directly, as an outcome. Maybe more work on research in action – citizen science as political action? What are the limits of privacy and ethics at that stage? In European Commission discussion, reluctant to push on it because they believe it’s trivial engagement.
A: Anne – In some motivation studies, one thing volunteers look for is the outcome & evidence that contributions achieve something.
Q clarifies: more than just submitting data.
A: Anne – So for projects with online participation, it’s on the science team to have the volunteers named in a published paper, e.g. in the discovery of a planet – not just aggregated data points, but acknowledging a role in something bigger, which meets their motivations.
A: Me – Speaks to values underlying participation expectations, why are we imposing our values about participation on people? They don’t always want that.
Q clarifies: sees reluctance where belief is that citizen science participation should be something bigger than it has been.
…Extended and very lively further discussion with entire audience on tensions around values, funding, roles, etc…

Citizen Science & Health Data Donation: Health Data Exploration Project 2016

This week I had the privilege of joining the Health Data Exploration network meeting as an invited speaker. I found it a really interesting and eye-opening experience, since biomedical and health domains are not the most common focus in citizen science, and so rather new to me. The interaction of academic and clinical practice, and the implications for opening up participation, are a notably different paradigm than in some other sciences.

My talk was (very) well received, and I had a lot of really great conversations. My notes are pasted below, unembellished and minimally corrected, to give the broader citizen science community some insight into what the movers and shakers in the health data community are discussing as relates to engaging the public. Enjoy!

Health Data Exploration Network, May 17, 2016

“Projects that Have Advanced the Use of Personal Health Data for Research”

Stephen J Downs, Robert Wood Johnson Foundation
mPower Parkinson’s study with Apple ResearchKit – returning data to contributors was important & empowering. Sadly, some of the controls in the study turned out to have Parkinson’s symptoms too.

Emil Chiauzzi, PatientsLikeMe
Multiple Sclerosis study, small sample, using wearables to manage disease – how do people use data? Behavioral adaptations, day-to-day management of disease. Lots of tacit knowledge that isn’t captured, e.g., schedule shifting to avoid impact of heat on MS symptoms. Planning for rest time after more exhausting days. Avoiding heat with environmentally controlled settings. Pacing themselves, reducing activity intensity strategically. Developed a course in how to use the data from a wearable with concept of a “sweet spot” or targeted activity levels, contingent on the conditions.
Implications for data donation/cit sci: consider role of data within broader context of individuals’ uses, provide tools for action. Tools for wellness not always applicable for disease communities. Learn about “health hacks” that people adopt. Implications for precision medicine: put out survey about precision medicine, lots of people unaware, patients not included in the set-up.

Eric Hekler, ASU
Proximity sensors for adaptive interventions. Goal is just-in-time intervention to nudge toward positive or away from negative actions. Also looking into receptivity, at what moment is someone receptive to your message? iBeacon sensors: Bluetooth device tracks proximity & user accesses some aspect of it with an app, being used in retail, they are developing other uses, try it and donate data. Questions from this data: styles of participation, influence of program, re-identification behavior, predictions of future collaborations, etc. Interesting questions about data donation practices as well: how do you feel about the data collection, value that should be returned, who do you trust with this kind of data, what are ethics issues, are you willing to re-identify, locations of proximity sensors. Planning to validate measurements with users, did we measure the right things, is this useful or meaningful?

Michelle DeMooy, Center for Democracy & Technology
Working with Fitbit on internal R&D and ethics. Very interesting work. Report coming out tomorrow. Have a human subject researcher on the team, not just looking at data but helping provide context. More public-private partnerships are needed, making the debates public could be valuable, Fitbit wants to do the right thing but is stuck in a paradigm that makes it difficult.

Question about Fitbit adopting HIPAA – is that really a win? The privacy and security standards are pretty good; it gives them a standard to work to and check against, which is a victory. That didn’t exist before; many wearable companies have nothing like it.

Julie Kientz, UW
Research on alertness: impacts of body clock, time of day, stimulant intake. Self-reported fatigue, sleep diaries, psychomotor vigilance task reaction time. Alertness varies over time, with different patterns for different circadian rhythms. Also saw rhythms in app use by app category over time – very interesting and logical. Wednesday was the low-productivity day; people hit a wall. If they got enough sleep, people used 61% more productivity apps; with inadequate sleep, 33% more entertainment apps. Late phone use correlates with sleep disturbance. Future work: circadian-aware tech for planning and scheduling. Automated sensing doesn’t tell you everything; need the human-in-the-loop for understanding intent and meaning with as little burden as possible. Also the notion of reward: forced reflection – we collect data and never look at it; put markers on your day and then annotate later, this is a benefit.

Barbara Evans, University of Houston Law Center, commissioned paper
Consumer-Driven Data Commons: Health Data and the Transformation of Citizen Science

Data ownership: uses ownership of the space around an airline seat to demonstrate the tendency to feel like we own something that we don’t. Feelings of data ownership are strong & intense, but legally we don’t own our data, and if we did, it would be different from owning a house. Most of our data are under a regime of shared control. Data ownership has been debated too long: the critical issue is control & access.

“People-driven data commons” – uses the term commons in the sense of natural resources, e.g. work by Elinor Ostrom. It’s not the data but the institutional arrangements for data management and stewardship: a set of rules and arrangements to allow collective self-governance. Granular consent won’t get us where we need to be: acting autonomously puts power in individuals’ hands, but collective action is needed to achieve these outcomes, and a mere collection of individuals can’t act together.

Normative ethics is not admissible as testimony. Can get the data from either consumers or data holders. People-driven data commons means working with consumers, so people work on getting their data in order to bring it together. 2×2 table of consent to data use from individuals vs willingness to share data from data holders. So far we have neglected quadrant 3, which gives people access in order to contribute data.

“Personal Data Donation: Barriers and Paths to Success”

Anne Waldo, Waldo Law Offices: You can’t share data you can’t access – HIPAA. Rights to access data even include sending via unsecure email, but providers are not fully compliant. Providers are suspicious if you request records: are you leaving me or suing me? Refusal to use email despite the law requiring it, up to $600 (or $6K) to get records, ridiculous fees, and no estimates of cost until data are delivered. Data only available as PDF or on paper, not very usable.

Jason Bobe, Icahn Institute: health is the #3 priority for people worldwide, but people don’t participate in organized health research: why? How do we overcome that and provide a good model for participating in health research? Really important for rare traits, e.g., “resilience” genes: need millions of people’s records to find the ones who are healthy where the medical prediction is that they would be sick. Founded DIYbio, which has exploded since he started it. Key insight is not the $1K genome sequence, but the $1K genome sequencer. At the time, if you wanted sequencing, you were beholden to organizations that wouldn’t let you get your own data. Roles are changing, it’s no longer binary: not just participants and researchers, participants can be the researchers. This influences research governance. Harvard Personal Genome Project: making genome data available as a public benefit and resource. The question at the time was whether people would share; the answer is yes. Variable across the population, but people will share sensitive data even publicly. Treating people as collaborators from the start was key, important to invoke reciprocity. Benefits beyond altruism and social good: people find meaning, community, and education in new roles. Retrofitting sharing is too hard, have to completely rework it.

Nick Anderson, UC Davis: The assumption of precision medicine is that we need lots of heterogeneous data of all kinds, not just genomic. Acceptance of the data sources is different. Value of data is differential; some uses are, if not accepted, at least understood, e.g., the NSA. Health data is different: how does a million people contributing all their data shift the question of when data acquired for one purpose is used for another? Even though promised to be for the benefit of all, it will probably only benefit a few to begin with. Social benefit is currently lightweight and hard to understand; we’re not just asking people to change patterns of how we do things, but to use that for things we’ll probably never be involved in. How is precision medicine going to shift the discussion? No clear acknowledgement of secondary purposes for data use in some systems, but we’re planning to do that anyway.

Camille Nebeker, UCSD: people were more thoughtful about sharing when she asked. They wanted to be motivated, for there to be a powerful purpose, for it to make a difference, and to know what it would be used for. It would be a motivator if it was of value to the community they were from, an ethnicity or patient community. They wanted to have control over sharing: give permission and know who they’re sharing with. Commercial entities may be less likely to get data than academics. GINA’s genetic data protections are important, protecting identity and preventing discrimination. Trust is important; they would readily give data if they trusted the person. De-identification was key – odd, since this group should know it’s hard to do, but they still believed it could be done. Loopholes, for-profit use, rights, etc. are all barriers; using policies imposes a huge cognitive burden, and removing that will increase trust.

Aaron Coleman, Fitabase: sometimes you’re in a data sharing experiment and don’t know it. Understanding granularity of data and what can be inferred from it – for example, in an employer-based study, do you want them to know your sleep patterns? Do you want them knowing activity patterns that may suggest health cost differences? Once you dig into APIs, trying to pull data out, the barriers pile up. The more granular, frequent, and broad in scope you want your data, the harder it is to get. Permissions for access to data are now getting more specific; we need the granularity and options to turn things off and revoke permission when we’re no longer comfortable with the data use.

Waldo: New guidance from HHS: you have the right to a paper copy of records, to an electronic copy if they’re in that form, and to send them to a 3rd party of your choice regardless of who it is, and they can’t ask you why. You have a right to get records by unsecure email, though they have to warn you about the security issue. They have to give it in the form/format you request as long as it is reasonably producible in that form. Right to a standing request. Within 30 days max, and should be almost instantaneous or very prompt. Providers can’t demand you come to the office, require you to use a portal, or use their own authorization form – an access request is different from an authorization form. Can’t deny the request if a bill is unpaid, can’t deny it if they don’t think a certain recipient should get it. Fees still exist: a reasonable cost-based fee, only for copying and mailing, nothing related to processing or storage or retrieval. Illegal to charge for download format or page fees for electronic copies, and must give up-front cost estimates. Best practice is to charge nothing.

These have been rights for a long time, is there any indication they will enforce it? HHS has been saying this clearly in conferences. Grace period on enforcement action since policy was released.

Question on the issue of de-identification: how close are we to that being impossible? News article recently about Stanford profs getting phone records and being able to find the people. Ethnographic work – still hard to obscure who a family is to anyone who knows them. One of the issues with precision medicine is that you can do tangential re-identification. In the technical world, it’s only as good as HIPAA can make it. Movement toward considering the probability of re-identification, why, and can we quantify that to make it easier to communicate the risk? Right now the probability is pretty low. Need to educate people; from an IRB member perspective, quantify the risk of harm for a study and tell them that in the consent form. If we could quantify the likelihood of re-identification and start moving away from guarantees of anonymity, then people start to understand and we become more transparent.

Important to spread word about new guidance from HHS, this is huge culture change. Some safety in numbers, this is a tipping point. Feels like change agents, nice to be together with others who rock the boat the same way.

In policy and technical community, looking at linkability versus identifiability. Different paradigm for protections. Safe Harbor requires removing 18 identifiers, but 19th is “any other thing that would make it clear to someone else who that is.” Have to scrub it harder for case studies, maybe even change facts that aren’t meaningful. Expert statistician method-risk has to be very small, not zero. Should stick to standard of very small risk, because we deal with small risks in every other part of life.

Disconnect in talking about these precision medicine studies recruiting a million people, and many others like them: are we ready for this scale? Seems like there’s a huge disconnect. Jason says the PM initiative is bold, shooting for a platinum record without having produced a single song yet. But there are ways to cheat, e.g. the Million Veteran Program. He sees the mental model as the bigger barrier: people think medical research is for when you’re sick and need experimental treatment. Need to shift perspective to research being a diverse experience, such that people seek out studies that meet their tastes. Danger of promising equal benefit to everyone, that’s not likely, so scale is going to be an issue.

Patti Brennan, UW-Madison
Citizen Engagement: Informatics in the service of health

4th director of NLM (upcoming). Goal of health for all, in part through data donation. Think of “citizens” broadly, not the people you know or people of privilege. We need everyone engaged. Citizens, not patients, is important framing for health services. People need rights to advocate for themselves; there are rights and responsibilities associated with being a citizen, much more so than with engagement. Citizens as people of the world who engage, take part in the world.

Identifying the health data life cycle – origination points for data, and also the points in between clinical treatment. Assertions:
1. Citizens are sometimes patients, but always citizens
2. Citizen engagement improves health
3. Citizen science provides a data-driven model to guide the next-generation of technical innovations, clinical practice, and biomedical & health knowledge

These needs drive informatics in the service of health. Emergence of new tools and tech, etc.

Engagement: participation by multiple parties achieving mutually established and jointly accountable goals. Also a promise, obligation, binding condition; giving someone a job; etc. Multiple definitions all have their own analogies in health. Arises from perspective of mutual recognition and respect; requires deliberate and intentional strategies.

Direct involvement in policy and public service delivery – in cooperation with rather than in place of experts. Discussion of citizen engagement along the lines of public engagement in science. Lots of ways that engagement can happen: extension, collaboration, co-creation.

Experiment: view videos of pond & list signs of life.
I saw and heard: water skeeters, Northern rough-winged swallow, multiple species of warblers (yellow, yellow-rumped?), shrubs, trees, water, song sparrow, tadpoles, frogs, grasses.

Citizen roles in citizen science: Extenders extend the reach of professionals, with perspective and goal defined by pros (contributory); Collaborative: professionals and policy define scope, citizens help set priorities; co-creation. Example of Audubon CBC. Rules and definitions, shared meanings, are all possible and useful.

Sharing is more important than storing. As important to seek normal patterns as unique things. Redoes the citizen science definition with citizen engagement in health. Australia defines effective engagement with: inform, consult, involve, collaborate, empower – likes this definition better than what is used in US.

CAISE typology parallel for health: gathering health data as guided by professionals, guiding priorities in policy, and in co-creation of balances of public health, patient care, and personal wellness. Potential benefits are personal, professional practice improvements, and public policy improvements. To achieve the goals, need information infrastructure.

Concept of doing a discharge simulation: a virtual tour of the home along with air quality data and other records, to identify places for post-surgical care activities.

Expand the phenomenon of interest to health, measure what must be interpreted, rather than interpreting what we can measure. If data, storage, interpretation and use are separated, then provenance becomes more important. Metadata needs to be captured along with data and definitions created at point of use, transitioning to ontological formulation of what a data definition is. Need broader set of communication and information tech. Sets out necessary soft skills along with tech needs.
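To make the point about provenance concrete, here is a minimal sketch of a datum that carries its metadata and definition with it; the structure and field names are mine, not from the talk:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Observation:
    """A health observation that carries its own provenance, so it can
    still be interpreted when storage, interpretation, and use are
    separated (definition fixed at the point of capture)."""
    value: float
    unit: str        # e.g. "mmHg", "steps/day"
    definition: str  # what the value means, set when captured
    source: str      # device, app, or person that produced it
    captured_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# The definition and source travel with the datum.
bp = Observation(value=118.0, unit="mmHg",
                 definition="systolic blood pressure, seated, rested",
                 source="home cuff (self-reported)")
```

This is only an illustration of the idea that metadata must be captured along with the data, not a proposal for an actual health-data schema.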

What if citizens are wrong? When clinicians are wrong, we call it differential diagnosis and don’t throw out the data, recorded as pathway that’s been abandoned. So why expect more from patients? Begin with what people believe is important and work from that.

How about rights that are under threat or restricted? Building systems based on privilege. Medical informatics can’t resolve it, but can be designed within social constraints to formalize impacts on what constitutes health, practices, and acceptable use of data.

Big Issues in CSCW session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX


Ingrid Erickson – Designing Collaboration: Comparing Cases Exploring Cultural Probes as Boundary-Negotiating Objects

Cultural probes around boundary negotiating objects – prompts via Twitter, e.g. “Take a picture of sustainability (post it with #sustainability and #odoi11)” – leveraged existing platforms. Lots of very interesting images came from this. Content not profound, but prompts engendered communication with people on the street, people in teams, dialogues that generated new hashtags besides those requested. Led into a design workshop.

Another instance of using cultural probes with Light in Winter event (in Ithaca, NY). Found that probes have several properties that make them generative.

  • Exogenous: probes act like exogenous shocks to small organizational systems, an interruption to normal practice that requires attention, an initiating mechanism for collaboration.
  • Scaffolding: directed but unbounded tasks; hashtags and drawings act as scaffolds, directed boundary work to prompt engagement, informal structure that supports exploration over accuracy/specificity.
  • Diversity: Outputs improved by diversity – diverse inputs increased value, acted as funnel for diversity to become collective insight.

Think about designing collaboration – taboo topic with inherent implication of social engineering, but we’ve been doing it all along. As designed activities, cultural probes were oblique tasks to invite interpretation and meaning-making, build on exogenous shock value, give enough specificity for mutual direction, salient to context but easy to understand.

Potential to use distributed boundary probes? Online interaction space/s – assemble, display inputs; organize w/ hashtag/metadata; easy way to revise organizational schemes as they are negotiated; allow collaborators to hear thinking-aloud of fellow collaborators; can be designed as a game or casually building engagement over longer periods of time.


Steve Jackson – Why CSCW Needs Science Policy (and Vice-Versa)

CSCW impact means making findings relevant to new and broader communities, make the work more effective and meaningful in the world.

We’re all used to “implications for design” and maybe even “implications for practice”, but need to start including more “implications for policy” in our work moving forward. Often fail to make connections in a useful way, need to learn from policy and policy research. Not immediately relevant to all CSCW research, but relevant at the higher level. The connections are just underdeveloped relative to potential value.

Particularly important around collaborative science, scientific practices – Atkins report as a prime example. Separate European trajectory covered in the paper along with history of science policy as relates to CSCW. So-called “supercomputing famine” in the 1970’s (drew laughs) reflected ambition of transforming science with technologies. Leading examples – CI generation projects – may also be misleading as these are the big money CIs. Ethnographic studies now including up to 250 informants but all projects are examples from MREFC projects – major research equipment funding something something.

CSCW & policy gap – institutional tensions in funding, data sharing practices and policies, software provision & circulation.

Social contract with science – support, autonomy, and self-governance in exchange for goods, expertise, and the (applied) fruits of (basic) science – this was the attitude after WWII. Stepping away from the pipeline model, moved toward post-normal science. Identified 3 modes of science, which are culturally specific. Can’t wait to read this paper!

Wikipedia-Supported Cooperative Work session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX


Jonathan Morgan – Tea & Sympathy: crafting positive new user experiences on Wikipedia

New user experiences on Wikipedia are intimidating – unintuitive markup, impersonal support, overwhelming load of rules. Everything about the new user experience seems to be intended to drive people away. Women’s experiences as a minority – dissatisfaction with abrasive norms, wish for more collaborative participation – are a good parallel for newbies in general, same issues.

Teahouse: created to meet needs of new users, want to help address deficiencies for newcomers. Tried to figure out what they want and how to support that. Newbies usually hesitant due to concern over humiliation, intimidated by making first move to socialize when you don’t know norms.

Teahouse has host profiles and guest profiles, Q&A forums. Goal of host profiles is making it more personable, showing that Wikipedians are real people. Also tried to address some usability constraints, but didn’t want it to work fundamentally differently from Wikipedia. Decided not to make people edit a page to ask how to edit a page, however – used a JavaScript widget, which makes it easier while providing a learning opportunity.

Most of the experience isn’t scripted. Welcome – personal welcome, another host gives a tip with less overwhelming links than they would get for a “getting started” comment in Wikipedia proper. Very different sensibility than Wikipedia, and the difference lies in the norms. Set up list of guidelines for interaction – not enforceable rules – if they work, hope that it’s because they remind Wikipedians what it was like to be a newbie.

In engaging new editors effectively, what they valued about the Teahouse wasn’t usability but sociability. Human-human interactions were more salient. Looked at retention as well: Teahouse guests made more edits on average than non-Teahouse control groups, with more of them becoming high-volume contributors. They stuck around longer, and were substantially more active than those who didn’t do Teahouse. Wikipedian hosts enjoyed it too.

Still going strong after a year, looking at using badges, data set available upon request.


Aaron Halfaker – Making Peripheral Participation Legitimate: Reader engagement experiments in Wikipedia

Power law participation even among the small number of regular editors. LPP (legitimate peripheral participation) lens – how do newcomers enter communities of practice? Initial new member tasks should be simple, low risk of causing problems in the community, and productive so it’s not useless – same as in Bryant et al. 2005, but the participation experience has changed a lot since then.

Newcomers’ first edits are now much more complex, with a higher rate of failure (immediate reversion), and it’s getting worse. Wikipedia is very, very complicated in terms of rules – lots and lots of policies. The verifiability policy alone is a 6-page document.

Tried setting up new task for newbie that’s much easier to do – added suggestion box on bottom for feedback, which will hopefully help editors improve articles and also get new people involved. It’s simple and low-risk, but is it productive? 3 experiments to find out.

RQ1: How do requests for participation affect quality and quantity of feedback? Some requests engineered toward reader concerns, some toward editor concerns. Found that engineering the request toward reader concerns boosted contribution rate by 45% over just asking for a rating, with no loss of utility.

RQ2: How does prominence of the request affect quality and quantity of feedback? (placement @ end of article versus top, affects visibility) The more prominent button that scrolled along with the article boosted feedback contribution rate by 108% (on top of prior 45%) with no loss of utility – very surprised by this, ran a bunch of confirmatory experiments.

RQ3: How does presence of feedback form affect new editor conversion? Could cannibalize primary contributions (edits) but could be a stepping stone. Found that asking people to edit after feedback submission increased new conversions by 151%, but 20% drop in probability of editing (after?) that first week.

Conclusions: no tradeoff between quantity and quality of participation; inviting readers to convert increased rate of new editors but at lower success rate; need to balance value of contribution against cost of moderating. If that equation is wrong, it won’t work. Article feedback tool almost being taken down because Wikipedia has no way to moderate the feedback.


Stuart Geiger – Using Edit Sessions to Measure Participation in Wikipedia (or: Edit Counts Considered Harmful)

Started w/ discussion of cross-methods collaboration – what each of the authors see when they look at Wikipedia (very funny comparison). Bringing together ethnographic understandings with more computational approach to iteratively, inductively develop a quantitative measure.

Big difference in how academics evaluate participation and how Wikipedians do. Many Wikipedians think edit counts are inadequate, even harmful – researchers focus on discrete units of work but those don’t capture the work experience. Need to expand our idea of what counts as work. We measure work in hours in organizational settings – 20 hours of work for SVs, not 20 questions answered or 20 problems solved. Time-based metrics more relevant.

So, edit session: graph of Jimmy Wales’ edits over time. Wikipedians edit in punctuated bursts over short periods of time. Tasks are therefore segmented, and can use the same data as counting edits: look at the time between 2 sequential actions, and when they’re less than a cutoff of an hour apart, count the gap as time spent in an editing session.

Determined the cutoff by looking at a histogram of time between edits. Also found 3 distributions – within-session breaks, between-session breaks, and what Wikipedians call “wiki-breaks” – long-term departures from the community, usually a few months.
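The segmentation idea is simple enough to sketch in a few lines; this is a hypothetical implementation (the one-hour cutoff is from the talk, the function names are mine):

```python
from datetime import datetime, timedelta

def edit_sessions(timestamps, cutoff=timedelta(hours=1)):
    """Segment a user's edit timestamps into sessions: any gap
    between consecutive edits longer than the cutoff starts a
    new session."""
    sessions, current = [], []
    for t in sorted(timestamps):
        if current and t - current[-1] > cutoff:
            sessions.append(current)
            current = []
        current.append(t)
    if current:
        sessions.append(current)
    return sessions

def time_editing(sessions):
    """Total time spent editing: sum of each session's span."""
    return sum((s[-1] - s[0] for s in sessions), timedelta())
```

Note that a single-edit session has a span of zero here; a real analysis would need some convention for crediting time to those bursts.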

Looking at Wikipedia in terms of time spent editing, there’s actually a lot more growth in participation than edit counts suggest. Total concrete bursts of editing on Wikipedia (not including research, off-wiki interaction, etc.) add up to over 100M hours. That’s over 11,700 human years – greater than the population of Tuvalu and 14 other small countries!

Limits to measures like these – it misses lots of work that doesn’t directly make an edit (e.g., background research), kind of creepy and invasive as every interaction in Wikipedia is an edit, acts a little like a diary study. Similar issues elsewhere, but edit sessions contribute to a more holistic measure.

Crowdsourcing session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX


Tammy Waterhouse – Pay by the Bit: Information-theoretic metric for collective human judgment

Collective human judgment: using people to answer well-posed objective questions [RIGHT/WRONG]. Collective human computation in this context – related questions grouped into tasks, e.g. birthdays of each Texan legislator.

Gave example of Galaxy Zoo. Issues of measuring human computation performance. Fast? Encourages poor quality. Better? Percent correct isn’t always useful/meaningful.

Using info entropy – self-information of random outcome (surprise associated w/ outcome); entropy of random variable is its expected information. Resolving collective judgment – model uses Bayesian techniques. Then looked at entropy remaining after conditional information – conditional entropy. Used data from Galaxy Zoo to look at question scheduling; new approach improved overall performance.
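For reference, the information-theoretic quantities named above are easy to compute; this is a generic sketch of the standard definitions, not Waterhouse’s actual code:

```python
import math

def self_information(p):
    """Surprise (in bits) of an outcome with probability p."""
    return -math.log2(p)

def entropy(dist):
    """Expected self-information of a discrete distribution,
    given as a list of probabilities summing to 1."""
    return sum(p * self_information(p) for p in dist if p > 0)

# A fair coin flip carries 1 bit of information; a judgment the
# crowd has nearly resolved (e.g. 90/10) carries much less, so
# further questions about it are worth little.
```

On this metric, scheduling questions to minimize remaining conditional entropy directs effort where collective judgment is still uncertain.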


Shih-Wen Huang – Enhancing reliability using peer consistency evaluation in human computation

Human computation is not reliable – when tested, many people couldn’t count the nouns in a 15-word list. Without quality control, they have 70% accuracy. Believes quality control is the most important thing in human computation.

Gold standard evaluation: objectively determined correct answer [notably, not always possible]. Favored by researchers but not scalable because gold standard answers are costly to generate.

Peer consistency in GWAP: sometimes use inter-player consistency to reward/score. Mechanism significantly improves outcomes. Using peer consistency evaluation as scalable mechanism – can it work? Used AMT to test it. Concludes peer consistency is scalable and effective for quality control.
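A minimal sketch of what a peer-consistency score might look like – score each worker by agreement with a randomly chosen peer, with no gold-standard answers needed. The names and details here are mine; the paper’s actual mechanism may differ:

```python
import random

def peer_consistency_scores(answers, rng=random):
    """Score each worker by agreement with a randomly chosen peer
    on each task they answered. `answers` maps worker -> {task: answer}."""
    workers = list(answers)
    scores = {}
    for w in workers:
        agree = total = 0
        for task, ans in answers[w].items():
            peers = [p for p in workers
                     if p != w and task in answers[p]]
            if not peers:
                continue  # no one else did this task
            total += 1
            agree += (answers[rng.choice(peers)][task] == ans)
        scores[w] = agree / total if total else 0.0
    return scores
```

The appeal is scalability: unlike gold-standard evaluation, the scoring signal comes for free from other workers’ answers.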


Derek Hansen – Quality Control Mechanisms for Crowdsourcing: Peer Review, Arbitration, & Expertise at FamilySearch Indexing

FamilySearch Indexing is one of the largest crowdsourcing projects around. Volunteers transcribe old records – 400K contributors.

Looked at several models to improve efficiency while reducing added time. Contributors use a downloaded package to do tasks, so keystroke logging with idle time can be used to evaluate task efficiency. Compared the arbitration process with a simple review. A-B agreement by form field varied. Experienced contributors had improved agreement.

Implications: retention is important – experienced workers are faster and more accurate; encourages novices and experts to do more; contextualized knowledge and specialized skills are needed for some tasks. Tension between recruitment and retention with crowdsourcing – the assumption that more people makes up for losing an experienced person is not always true. In this context it would take 4 new recruits to replace 1 experienced volunteer.

Findings: no need for a second round of review/arbitration – only slight reduction of error and arbitration adds more time (than it’s really worth).

Implications: peer review has considerable efficiency gains, nearly as good quality as arbitration process. Can prime reviewers to find errors, highlight potential problems (e.g., flagging), etc. Integrate human and algorithmic transcription – use algorithms on easy fields integrated with human reviews.

Citizen Science session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
27 February, 2013
San Antonio, TX


Sunyoung Kim – Sensr

Intro to types of citizen science, diversity of project types. Common underlying characteristic: using volunteers’ time to advance science. Many typologies; projects can be divided by activity type into primarily data collection and data analysis/processing. Focus here is field observation, which has great opportunities for mobile technologies.

Problem is that most citizen science projects are resource-poor and can’t handle mobile technologies on their own. Goal is supporting people with no technical expertise in creating mobile data collection apps for their own citizen science projects. Terms used: campaign – project; author – person who creates a campaign; volunteer – someone who contributes to collecting data/analysis.

Design considerations include: 1) current tech use, similar available tools, needs of practitioners. Reviewed 340+ existing projects (campaigns), found only 11% provide mobile tools for data collection. Looked at the types of data they’re collecting – primarily location, pictures, and text data entry. 2) Data quality is paramount, and data also contains personal information. 3) How to recruit volunteers. Looked at similar mobile data collection tools like EpiCollect and ODK. They’re pretty similar in terms of available functionality, but Sensr is simplest to use. Most comparable platforms are open source, so you need programming skills to make them work (free as in puppies!) – even the term open source can be very techie to the target users.

Built Sensr as visual environment combined with mobile app to author mobile data collection tools for citizen science. Demo video demonstrates setting up data collection form for “eBird”, pick fields to have on form. Just a few steps, creates back end database and front end mobile interface. Very straightforward interface to assemble a mobile app for citizen science data collection.

A couple of features: can define a geographic boundary, but can’t prevent people outside the boundary from joining (the App Store is global) – you can, however, help users target the correct places. Can review the data before it is publicly viewable or goes into the scientific data set.

Did case studies to see how nontechnical users did with it, betas with existing projects, before launching tool. Strong enthusiasm for the app, especially for projects with interest in attracting younger participants. Main contribution: Sensr lowers barriers for implementing mobile data collection for citizen science.

Question about native apps versus HTML5 mobile browser apps due to need for cross-OS support.

Question if there’s a way to help support motivation; not the focus in this study. Case study projects didn’t ask for it because they were so thrilled to have an app at all.


Christine Robson – Comparing use of social networking and social media channels for citizen science

One of the main questions from practitioners at the Minnowbrook workshop on Design for Citizen Science (organized by Kevin Crowston and me) was how to get people to adopt technologies for citizen science, and how to engage them. These were questions that could be tested, so she ran some experiments.

Built a simple platform (sponsored by IBM Research) to address big picture questions about water quality for a local project; the app development was advised by the California EPA. The app went global, and has gotten data from around the world for 3 years now. Data can be browsed on the site, and you can also download it in CSV if you want to work on it. The “Available on the App Store” button on the website was important for tracking adoption.

Creek Watch iPhone app asks for only 3 data points: water level, flow rate, presence of trash. Taken from CA Water Rapid Assessment survey, used those definitions to help guide people on what to put in the app, timestamped images, can look for nearby points as well. More in the CHI 2011 paper. Very specific use pattern: almost everyone submits data in the morning, probably while walking the dog, taking a run, something like that.

Ran 3 experimental campaigns to investigate mobile app adoption for citizen science.

Experiment #1: Big international press release – listed by IBM as one of the top 5 things that were going to change the world. It’s a big worldwide thing when IBM makes press releases – 23 original news articles were generated, not including republication in smaller venues. Lots of press; could track how many new users came from it by comparing the normal rate of signups versus post-article signups. +233 users

Experiment #2: Local recruitment with campaign “snapshot day”, driven by two groups in CA and Korea. Groups used local channels, mailing lists, and flyers. +40 users

Experiment #3: Social networking campaign: launched new version of app with new feature, spent a day sending messages via FB and Twitter, guest speaker blog posts, YouTube video, really embedded social media campaign. Very successful, +254 new users.

Signups aren’t the full story – Snapshot Day generated the most data in one day. So if you want more people, go for the social media campaign, but if you want more data, just ask for more data.

Implemented sharing on Twitter and Facebook – simple updates as usually seen in both systems. Conversions from the sharing feature were tracked with the App Store button. Can’t link clickthrough to actual download, just know that they went to iTunes to look at it, but it’s a good conversion indicator. Lots more visits resulted from FB than Twitter. Conversion by social media platform was dramatically different – 2.5x more from FB versus Twitter or web, which were pretty much the same.

Effects of these sharing posts over time – posts are transient, almost all of the clicks occur in the first 2-5 hours, after that its effect is nearly negligible. Most people clicked through from posts in the morning, there are also peaks later in the evening when people check FB after work; then next morning they do data submission.

However, social media sharing was not that popular – only 1 in 5 wanted to use the Twitter/FB feature. Did a survey to find out why. The problem wasn’t that they didn’t know about the sharing feature; 50% just didn’t want to use it for a variety of reasons. Conversely, those uninterested in contributing data were happy to “like” Creek Watch and be affiliated on Facebook, but didn’t want to clutter their FB wall with it.

The Facebook campaign was as effective as – or more effective than – a massive international news campaign from a major corporation (though the corporate affiliation may have some effect there), and much easier to conduct. Obviously there are some generalizability questions, but if you want more data, then a participation campaign would be the way to go. The sharing feature shows some promise, but it was also a lot of work for a smaller payoff. With limited resources, it would be more useful to cultivate a Facebook community than to build social media sharing into a citizen science app.