Citizen Science: Beyond the Laboratory @ 4S/EASST 2016

Spinosaurus on the hunt in Barcelona!

Every 4 years, the Society for Social Studies of Science (4S) and the European Association for the Study of Science and Technology (EASST) co-locate their meetings. This year we met in Barcelona, where a huge crowd of STS (science & technology studies) scholars presented more than 1100 talks in over 100 tracks in just 3 days. The citizen science track, Citizen Science: Beyond the Laboratory, that we (Gabe Mugar, Carsten Østerlund, & I) organized had a whopping 3 sessions and 13 papers! Two of the papers weren't technically on citizen science, so I've included only a brief description of their main themes, but the presenters did an amazing job of fitting their work into the conversation quite artfully.

As usual, these are fairly raw notes, unembellished and only minimally corrected. I did remove the honorifics that were inserted in the program, as distinguishing between those who have and have not finished their PhDs seemed very crass to me.

Enjoy!


Session 1: Negotiating tensions at sites of investigation

Landscapes and Property Lines: the contradictory practices of citizen scientists
Karin Patzke (Rensselaer Polytechnic Institute)

Milam Co, TX: everyday environmental conservation with formal & informal citizen science practices. Uses concept of extramural knowledge production to define citizen science to highlight the roles/relationships; focus on participants in formal programs.

1990’s – state task force on nature tourism, involving multiple agencies to create rural nature-oriented tourism. Recommendations included tax incentive to transfer land management toward wildlife management. Formal category of property appraisal situated in ag, not based on productive value of the land eg resource management, but with properties part of larger ecoregion & network. 20 yrs later, this vision hasn’t succeeded; agriculture production is still in decline due to urban encroachment. Cultural value of land in TX is situated around land-rich & cash-poor status that has been the case since TX was independent republic. Wildlife management tax status lets them keep land that’s less productive.

Focusing on bureaucracy in formal tax evaluation. Framing from legal studies & STS. Legal fictions are how facts are used to create legality. Wildlife management is a practice that encourages contradictory things: land as public space that species pass through, and as managed space like parks, but also required to be managed as though private, without consideration of neighbors.

TX legal regime is hybrid of other traditions; legal fictions are genres of fact to create action, in this case, legal bureaucracy. Always legal action, not illegal because not adjudication. For this case, fiction is wildlife management as agriculture in tax law.

Landowners work with biologists for land management practices & to legitimate practices, often through citizen science like documenting migrations and species presence/absence. Works as form of legitimacy for cultural practices, still through a productive lens, attained through production & participation in citizen science. Dominant form is observation census: NestWatch, iNat, seed production counts. When taxes are due, landowners file a 10-pg form from appraisers office and add up to 50 pgs documenting their work in citizen science & workshops. Wildlife management isn’t just about bluebird boxes, but active participation as documented in paperwork which establishes legitimacy of wildlife management claims. Wildlife only seen as productive function of agricultural practices. Cit science thus legitimizes their practices.

But participation doesn't lead to sustainable practices; the geography of the landscape doesn't change because the value of land is based on what the property produces. So novel practices on one property don't extend to adjacent properties; fences divide traditional agriculture from wildlife management, the only way to get legal value for the land. Consuming nature is rhetoric, but it's the translation, through legality, of consumption into production that produces value for the state.

Participation in citizen science is how legitimacy is constructed to maintain property rights of ownership and keep land separate from a whole. By relying on production of science knowledge to legitimate wildlife management, hasn’t been much effort to get past land-rich cash-poor style. Reinforces divisions between properties.


Who are the citizens in citizen science? Public participation in distributed computing
Elise Tancoigne (Université de Genève) www.citizensciences.net

Interested in amateur production of knowledge, their definition of citizen science. Starts with OSTP quote from Jenn Gustetic on 9/9/2015, asserting ability for people to engage in science even when not formally trained. Also highlights popular media focusing on democratization of discovery. Democracy has multiple meanings, here understood as egalitarian.

RQ: Do the citizen sciences democratize sciences? Or not? Case is distributed computing. Why? Distributed computing has millions of participants, which is convenient for demographic studies. Also considered not “true” citizen science, which makes it interesting. Gives some history of SETI@home, example of why they are focusing on BOINC, which has over 540K active volunteers and over 1.1M computers involved in 40 ongoing projects.

Graph of new active users/month in BOINC projects. Several papers that try to characterize Distributed Comp, but few exceed 1K survey respondents & don’t include demographics about profession. Didn’t run a survey due to methods problems; instead examined BOINC user profiles, sampled from 6 projects (SETI, LHC, Climate, Malaria, PrimeGrid, Rosetta) for 2K profiles coded for age, gender, profession, education, hobby & then ran descriptive stats, planning machine learning in future.
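The coding-then-descriptive-stats step described above can be sketched roughly as below. This is a hypothetical illustration, not Tancoigne's actual pipeline: the field names and the tiny sample are invented, while the real study hand-coded ~2K profiles for age, gender, profession, education, and hobby before tallying.

```python
# Hypothetical sketch of coding BOINC profiles into categories, then
# computing a first descriptive statistic (each category's share of the
# sample). Sample data and field names are invented for illustration.
from collections import Counter

profiles = [
    {"gender": "M", "profession": "IT"},
    {"gender": "M", "profession": "S&E"},
    {"gender": "F", "profession": "other"},
    {"gender": "M", "profession": "IT"},
]

def share(profiles, field):
    """Return each coded value's share of the sample, as a fraction."""
    counts = Counter(p[field] for p in profiles)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

print(share(profiles, "gender"))
```

The same tally generalizes to any coded field, which is all a first descriptive pass needs before moving on to the machine learning the talk mentions as planned future work.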

Categories for profession/education are science & engineering (S&E), IT, other; same for hobby except with scifi added. Current population trends: 90% male, and the majority of people in their 20s, below the median age of the population. So age inequality is less important than gender inequality.

For profession, found 60-80% had job in S&E or IT, pretty evenly distributed but lower in SETI and higher in others. Global population is less than 1% in these professions. These participants are already engaged in science as a hobby – excerpts from profiles demonstrating geekery.

So the participants are young men with a science background, job, or hobby. A new avenue for people already engaged in S&E or IT activities. 1 participant in 5 corresponds to "the public" as portrayed by citizen science advocates. Example of natural history museum visitors to show the imbalance: distributed computing is not good at attracting new people to citizen science.

Q: ethics of downloading? Institutions also participating in distributed computing? How does it promote that concept?

Ethics: contacted projects & asked for permission to download, got very different answers from projects due to global participation, law doesn’t apply in the same way. Had to do it project by project; most of them didn’t know how to answer – not sure whether they could give it, but said it’s OK to take it.


Traditional Knowledge, citizenship & the conditions of scientific participation
Sarah Blacker, Max Planck Institute for the History of Science

Focus in northern Saskatchewan, Lake Athabasca, north of Athabasca oil sands and far north of Edmonton. Question of chemical composition of water flowing into the lake that spurred public investigations.

Athabasca oil sands one of the largest industrial projects in the world in terms of land area, 2 other oil sands adjacent. Bitumen separated from sands chemically, contaminants deposited in tailings ponds, hard to represent the scale of them. They contain pollutants that leak into the river & enter food chain. Company has acknowledged leakage of millions of gallons of polluted water daily.

Study is a collaboration between First Nations communities & academics, producing evidence of extensive contamination; they've documented higher cancer rates. Developing evidence in 3 forms: unmediated, untranslated textual knowledge about contamination in communities; measurements of contaminant levels using current industry standards to make it legible & credible for policy; & synthesis of the first two forms. Is the 3rd form a type of knowledge reflective of democratic knowledge, or does it reproduce colonial power relations and privilege Western science? Exploration of the epistemic consequences of a hybrid study, pointing to hierarchies and privilege.

Particular attention to role of knowledge mediator; ecologist has worked with community for years & was invited by them to collaborate. Environment Canada has also funded studies into contamination in region, but didn’t incorporate traditional knowledge & didn’t stay in the area long enough to acquire that depth. Government lacks capacity to incorporate traditional knowledge.

Definition is focused on public participation in science, due to problematics of terminology. First competing study by EC looked at cancer rates, elevated by 30% in area but attributed to other risk factors for community such as obesity, alcoholism, etc. Another study found 17-453x higher contamination than “safe” in animals but no relation to human cancers.

Locals reached out to McLachlan because their experiences weren’t being considered. Funded by Health Canada & SSHRC, peer reviewed by Health Canada. First time science aligned with First Nations leadership (according to McLachlan). Benefits of integration include pointing inquiry in the right direction. Research challenges include that cancer develops on a delay; traditional knowledge can register changes in different ways that are detectable and precede cancer detection.

Shows examples of reports – arsenic levels in meat animals such as moose, beaver. Reports produced for Mikisew Cree and Athabasca Chipewyan communities; notable that many community members are employed by the oil sands, so they don't want to shut the oil sands down. Locals want to work with industry & government to manage cooperatively. The aim is modest, yet even with a social democratic government in Alberta, the provincial government is shutting down monitoring, citing unnecessary duplication of studies, etc.

Multiple barriers to indigenous participation in science. Argument that Western science and traditional knowledge are incompatible within an individual, but that they can be collaboratively engaged in communities. Attending university, gaining scientific expertise & ability to work with policy, removes indigenous peoples from their traditions.

Q: Other examples in Canada – why is it that communities invite scientists to participate, doesn’t happen that way elsewhere.
A: In this context, it’s due to the contamination & physician reporting higher cancer rates to mobilize resources, physician got fired. Community reached out to McLachlan because of his reputation for working with indigenous communities in Manitoba, they trusted him more than other scientists.


Architecture and social sciences' spatial turn: dialogue or monologue?
Leandro Rodriguez‐Medina (Universidad de las Americas Puebla)

Absorptive capacity and routines to appropriate knowledge. Relationship between architects’ practices & social science knowledge: architects accept SSK when they recognize their project must take context into consideration. Artful translation of stakeholder relationships in architecture to relate to citizen science, very thoughtfully worked out.


Toward an inherently collaborative rhetoric of science communication
Erika Szymanski (University of Edinburgh)

Studied wine industry in NZ and science communication. Material semiotics important for dialogue between scientists & non-scientists. Tech transfer models fail to make connections to winemakers’ knowledge despite lots of participation. Again, excellent refocusing of original material to fit into the themes of the discussion and added a complementary view on scientist/non-scientist working relationships.


Session 1 Discussion

Q: Elise, what’s answer to the question?
A: In distributed computing, not really democratizing per se, but 20% are in fact “amateurs” drawn into the project.
Q: Other speakers?
A: Sarah – the fact that it doesn’t work says a lot about the context of her study.
A: Erika – failing because it’s being used as a tool, not a source of knowledge from outside of traditional science.

Q: 3-track model in Sarah’s talk, what was the synthesis?
A: Scientist asked her not to show images because of challenges with publishing, they’re still trying to get published & ran into problems with climate of censorship of science in the country at the time. It’s not so different from the representation, incorporates text & measurements on a page.

Q: Me – Sarah, is Antonio Gramsci’s organic intellectual concept applicable?
A: yes, and view expressed in talk was on the extreme end; other members of the community would agree with that conceptualization.

Q: Static style of science, dynamics of knowledge in these problem spaces. How is this present in these cases?
A: Sarah – McLachlan & a collaborator were in community for over 2 years, described it as more than dialogue, wouldn’t have been comfortable doing the project if he hadn’t lived there long enough to understand the situation in depth. Why that quote on that page? He felt it best represented the concern over the potential mis-contextualization of measurement. Definitely problems romanticizing contextual knowledge.
A: Erika – understanding of the methods is very strong, scientists & winemakers share similar social spaces in NZ, are friends, understand one another’s practices well. Surprised by the dynamics in the transfer space as a result. Rhetorical framing of how science communication is done reinforces the practices.
A: Karin – long generational history of immigrants in TX that’s created agriculture practices that reflect intercultural knowledge of legality, especially around moving cattle through spaces & planting certain crops. When wildlife management as agriculture came around, it challenged people to reconceptualize their land and properties. Notion of prairie has emerged as a romantic representation that challenges agriculture production frame – an imaginary about traditional knowledge, which is very problematic.
A: Leandro – hard to see how traditional Western knowledge can communicate; usual platforms, e.g. science papers & conferences, practices, he looked for new “products” that might have more influence than the professional version. One way to consider dialogues is seeking new platforms for presentation, questioning, etc.
A: Sarah – they did make a documentary!


Session 2: Ecosystems of participation

Trading Zones, Citizen Science and New Infrastructures for Knowledge Production
Per Hetland (University of Oslo)

Context: part of larger project, Mediascapes, this is just one component; main goal is looking at links between museums & public. Citizen science is 2 case studies, plus one in humanities, & bridging studies. Variety of partners for citizen science: Nat History Museum in Oslo, GBIF, SABIMA, Norwegian Biodiversity Info Centre. SABIMA is “amateur” organizations.

RQ: How does NH Museum & stakeholders interact with communities of interest outside of professional institutions & engage amateurs/volunteers in citizen science? Has debated which label/concept to use, prefers amateurs or volunteers as better reflects the space.

Definition: project in which amateurs/volunteers partner with scientists to answer real-world questions (from CLO). Qs then are what are we doing, who volunteers and why, who decides what to ask, what do the answers look like? Bonney/Shirk models of citizen science. Focusing here on 3 middle models; collegial is interesting in historical sense but not sure it’s so common anymore.

5 cases. #1: crowdsourcing and transcription. Very common format. National historical records in Norway include 6M records with 60-70% already digitized, volunteers working on the rest. Motivations for this activity? Lit suggests contributing to science is most important, usually another one besides – save the whales, etc.

#2: Validation & expertise in Species Gateway (Artsobservasjoner) – observational data submissions on species reporting. Amateurs were disappointed with getting only footnotes on transcription work, felt they were left out and not visible anymore. So they asked for the species reporting portal, going on since 2008, 15M observations in a population of 5M, core group does most of the work, most only contribute a few. Challenge is validation with 5K records/day. Concept of apomediation (Eysenbach, 2008) to validate the work.

#3: social networking & amateur communities around grasshoppers, using Facebook; interested in how social aspect of participation is understood.

#4: User engagement & youth engagement; biodiversity mapping involves a lot of gray hair. Interested in attitude formation & science career trajectories.

#5: Amateur-institutional relationship & new technology – are museums still getting all the physical contributions that they once did, or only digital contributions? (Hetland, 2011)

Observational matrices for each case with Actors, Rules, Activities. Describes each of the cases with this framing. In case #3, curators do the validation; the biodiversity network paid "professional amateurs" to validate the species records. They have a "red list" that needs close attention and a "black list" of invasive species that are especially important. In case #4, the curation role involves taxonomic maintenance, e.g., 8 different names for each plant.

www.academia.edu/1132386/Science_2.0?Briding_Science_and_the_Public


Scientists, Citizen Scientists, and the People in the Middle
Hined Rafeh (Drexel University, Rensselaer Polytechnic Institute)

Self-reflective examination of the "person in the middle." Interned with SciStarter & did a lot of promotion at public events. Was part of a partnership with NASA's GLOBE program SMAP Mission. Goal is ground-truthing satellite data; recruited among schools, universities, tribes, prisoners, kids, etc. Had 286 express interest, 110 trained & collected data, equipment provided for some who didn't have any. The soil collection protocol took a 2 hr training but 5-10 min to complete: weigh the soil, dry it, and then re-weigh it.
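The weigh/dry/re-weigh protocol is the standard gravimetric soil moisture measurement. A minimal sketch of the arithmetic, assuming the textbook definition (water mass lost on drying, as a fraction of dry soil mass) rather than GLOBE's exact worksheet, with invented masses:

```python
# Hedged illustration of the weigh -> dry -> re-weigh arithmetic.
# The talk didn't give the project's exact formula; this is the
# standard gravimetric definition of soil moisture.
def gravimetric_moisture(wet_g: float, dry_g: float) -> float:
    """Mass of water lost on drying, as a fraction of dry soil mass."""
    if dry_g <= 0 or wet_g < dry_g:
        raise ValueError("expected wet mass >= dry mass > 0")
    return (wet_g - dry_g) / dry_g

# Invented example masses: a 125 g wet sample drying to 100 g.
print(gravimetric_moisture(125.0, 100.0))  # 0.25, i.e. 25% by dry mass
```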

When talking with scientists from NASA, it was all about data quality. They weren’t concerned with things like time zones and struggles using Hangouts. Surveyed participants with 42 responses; majority content with training, but also issues with details like choosing sites, how to use protocol properly, and not happy with GLOBE’s level of engagement.

Got to go to OSTP for 9/30/15 event. Nice to talk with people but felt there were lots of barriers to being taken seriously. Next steps for this project include expanding to new audiences; lending libraries for participation kits; partnership with eco-skills.

Felt that GLOBE scientists had nothing to lose & got a lot out of it. Couldn't understand why people were participating because it's free labor; she believed they got more frustration than enjoyment out of it, despite the fact that they continued participating.

Future research directions: 3rd party players; knowledge creation or data collection; motivation, empowerment, & social value of science. No one talking about groups other than scientists & contributors such as 3rd party organizations.

3rd parties: Claims they are all for-profit, an interesting interpretation because it's only true of SciStarter, & she mistakenly named other organizations that are not actually for-profit. Claims they are all financially backed separately from scientists & that they provide easy access to projects & resources.

Knowledge creation vs data collection: experienced emphasis on data quality & not much else. Believes that there’s no attention to other outcomes. Feels that people are barred from participation in the “real discussion”. Many projects say they intend to improve something, but what? What are real outcomes?

Motivations: quotes a person who enjoyed participating; then second-guesses that experience by questioning if the participant is “really doing science”, and if data collection is enough.

Empowerment: analytical framework from Corbett & Keller. If this is considered empowerment, what does it tell us about science & society?

Values: Kids are first ones to adopt “science is cool” perspectives. How does this figure in to other people’s relations to science?

[NB: This presenter’s comments on future directions were based on short-term personal experience with a project rather than formal inquiry.]


Enrolling scientists, citizens and lichens for knowing the chronic effects of pollution in the Fos‐sur‐mer industrial area (France)
Christelle Gramaglia (IRSTEA); Philippe Chamaret (Institut Ecocitoyen pour la Connaissance des Pollutions)

Focus on ignorance, undone science & citizen sciences. Local communities in polluted post-industrial spaces get no info about consequences on health & environment. Concerns are not taken into account as related research is unfunded, incomplete, neglected or undone. Citizen initiatives develop in reaction so that the missing/needed data are gathered. Variety of formats and outcomes.

Fos-sur-Mer area in southeast FR: industrial harbor, one of the largest in Europe with 20K ha of heavy industry. Lack of knowledge about pollution impacts & the siting of a waste incinerator led to local community concerns about environmental & health risks. A nonprofit organization runs a citizen observatory to address these concerns. One of the foci is citizen engagement in biomonitoring: not just gathering data with established methods, but also trying other methods to augment that and evaluating them for regulatory practice.

VOCE: volunteers for the environment. Collecting lay observations & insights, elaborating protocols, providing access to training & knowledge, encouraging engagement in monitoring. Monitoring includes observations & claims, measurements & data collection, etc. About 50 ongoing volunteers, many retired but more and more younger ones; they want to understand pollution impacts on the area. 2 case studies.

Researching water quality with conger eels: assessing chemical contamination of the marine habitat by measuring fishes' impregnation. Citizens' input was important for finding a bio-indicator that respected the balance between science and social stakes (couldn't choose a species with economic importance, so landed on conger eels) and for bringing technical know-how through specialized volunteers. Results showed high concentrations of mercury, arsenic, & chlorine.

Researching air quality with lichens: implementing & maintaining a register of exposure to pollution in industrial areas. Lichens have a differential response to pollution, so they make a good bio-indicator in terms of biodiversity. Good public participation among non-specialized participants led to a high density of measurement sites. Results provided complementary measures of air quality; data were integrated into formal studies. Quantified impacts of pollution on lichens, which showed higher concentrations near industrial sites (and lower biodiversity).

Data quality & access: close collaboration between scientists & volunteers because volunteers involved in many steps of the research process. They receive training for independent work, but scientists help model proper methods, answer questions, and do data validation. Provides scientifically useful data, validated by scientists, & results returned to contributors at public meetings.

Benefits of citizen science: important because it can address research needs that are otherwise overlooked. Sharing the scientific culture is possible & can help local organizations in formulating strategic claims e.g. during public hearings. Addresses important issues without offending locals’ sensibilities & interests while taking into account local knowledge & know-how. Accomplishes work that scientists alone could not do.

www.institut-ecocitoyen.fr


Co‐Creating Research Agendas through Multi‐Actor Engagement
Niklas Gudowsky (Austrian Academy of Sciences)

Developing models for co-created research agendas for future engagement. Expectations are important in shaping emerging tech & the promise of progress. Future-oriented perspectives harness expectations and influence discussions & policy, which may impact funding paradigms. Anticipation beyond short-term prediction is very arbitrary, often contradictory – basically just educated guesses. The experts' dilemma says you can elicit contradictory expert opinions at almost any time, so you can get the science opinion you want; similarly, policymakers have "pet experts" whom they engage over and over to portray a certain view of an issue. Imaginaries shape the present such that alternate futures become less likely.

Public engagement in STI issues: lots of criticism that desired objectives are not met, failing to open up debate, didn’t yield rationality gains desired, results of public engagement don’t translate to policy, people are engaged too late in the process. Research agendas seem like a viable early entry point to evade these issues. Open framing of early agenda & “blue sky engagement” doesn’t require expert background. Collective utopian thought doesn’t require that framing prior to engaging, process orientation of thinking about desirable future which doesn’t require prior knowledge and everyone can respond to that, deliberatively engaging these future visions.

Basic framing: Citizens' framings of visions go to experts & stakeholders, who generate recommendations, which are returned to citizens to evaluate whether they meet their visions; once validated, they can move to policy-making. Tested the process in 8 EU projects for the early Horizon 2020 programme. Have conducted several local and national-scale projects, focusing here on CIMULACT, www.cimulact.eu.

Diagram of the process: vision workshops; cataloguing the visions & the needs they represent; co-creating research programs around those areas of need; creating research program scenarios, discussed through open online consultation and face-to-face meetings, which helped with enriching and prioritizing research program options; and defining research topics in a pan-European conference, with outcomes of policy options and research topics that are expected to shape future EU research. The process starts by grounding in the past before moving to visions for 2050, which engages older participants; progressively more of the group engages as they find they can contribute, discussing changes from the past. Used to spark creativity, with typical facilitation techniques.

Examples of need areas and the visions, research scenarios that include directions, questions, concerns, expert view & citizen view. Now doing online consultation, everyone can participate in process, takes about 20 minutes.


Discussant: Alan Irwin

Nothing he says is critical, just wants to ask questions that cut across presentations. 3 general questions & then a few words about each paper.

Theme 1: fascinating to see the rise of citizen science & ECSA, exciting movement now, when 20 years ago it was all unknown. Instead of being a general talking point, we now have empirical work. What is going on in the rise of citizen science? How do we come to terms with that? Is what we now call citizen science something we’d previously have called counter-expertise, radical science, along the lines of scientist-activists in 70s? What does this say about nature of citizens & of science?

Theme 2: Models of citizen science that are used, not to critique modeling which requires certain assumptions, but interesting reflection on the perspectives that models emerge from. Gives examples of Muki Haklay’s model. Isn’t it true that these models are assuming that citizen science is intended to influence science? That’s one dimension, but expects that many participants don’t care about the “real” science at all, that publication isn’t important to them, and they really want to solve problems and learn more about the world. What other models of citizen science could we have that don’t assume a knowledge dimension?

Theme 3: What do participants themselves get out of this? Especially when we acknowledge that the knowledge generation part is complex, what else are they getting? Emotional, learning, fascination – people just want to talk about things they’re passionate about. Moving away from epistemology – and bringing in the politics – this is a focal point that opens it up to broader discussion.

Comments on the papers:

Hetland: using Collins & Evans classification of contributory & interactional expertise, fits with the question of whether it’s intended to create knowledge. It was always both, as claimed, but how do you know that’s the case & whose perspective does that reflect?

Rafeh: invokes questions about empowerment, which was an underlying theme, and goes back to motivation, which assumes some power to do something. What was the power they were getting & what did they get out of it? Also suggests she’s not a middle person, a more active contributor than that. Makes us question what our role is in STS? We may be driving the discussion even as we analyze it.

Fos-sur-Mer team: Can be seen as counter-expertise; referred a lot to scientists and volunteers, want to consider that relationship more deeply: can’t the scientists also be volunteers & vice versa? This false dichotomy builds in potential to distance people rather than bringing together.

Gudowsky: How do you know that thinking about the future changes the present? Seems plausible but what's the evidence? For example, did science fiction visions from the 50s and 60s shape today? Need to challenge the idea because it's profound in the space of technology assessment. Tech is more complicated than that.

Return to 1st 3 questions:

What about nature of citizen science in general?
What about the models, do they over-emphasize epistemic models to neglect of other aspects?
What can you say about what contributors get out of it?

Hined: these are connected themes, e.g. nature of citizen science & motivations. Historically clear distinctions but now can’t separate science from the way we look at the world, science is the new religion. Breaking down barriers as to who can participate.
Per: knowledge comment is relevant; knowledge to contribute to science is very important, but much is passion-driven, which is very interesting.

Christelle: in early discussions about this work, enthusiasm about opening up participation & opportunities was discouraged; that's not what they felt they wanted. When earlier citizen work was dismissed, their engagement with professional scientists was seen as a way to get solid science & an aspiration/ambition toward scientific credibility. That's what the strategic approach was for this institute: to share labor, with the data to be used by multiple parties. Maybe related to French epistemic culture, with its high respect for science & expert authority, not challenging that frame.

Niklas: was asking that first question himself too: is what we’re doing citizen science? A lot of prior work focuses on hard sciences, in soft sciences not as much is happening. Looking at current work through future studies & saw active participation & co-creation in that space, felt that makes it a type of citizen science, can open up and include humanities & social science in citizen science. Asked what people get out of it: in workshops, they see it’s hard to find participants & they use a lot of different strategies, but once you get them in the room & encourage open discussion & ideation, it builds a lot of passion among participants and the networks of individuals persist for years. Empowerment of thinking that they could influence the future & they certainly don’t care about publishing papers. Asking about future changing present – not the thinking about it, but the results that come out of it have some impact on directions moving forward. If you look at corporate oversight, that has been strongly shaped by public engagement.

Hined: to return to epistemic monitoring, got called out for calling people citizen scientists – you’re the one giving them that label. Recognizes that she’s projecting her values on participants and that’s an active role that changes dynamics.

Christelle: what else volunteers get out of it – people with no prior political commitment, many do surveillance for forest fires or cleaning up the harbor, they didn’t mobilize over the incinerator & watched passively thinking nothing could be done. But participating in biomonitoring is a way to act on their concerns and enact their attachment to a place that has been degraded but could be rehabilitated.

Per: Knowledge very important in biodiversity mapping. The museum’s records are seen as higher status than records from amateurs. They’re essentially competing for validation as accepted knowledge. Some researchers are using the data, others refuse & consider it unvalidated & unreliable; despite errors in the professional science work, we still hold expert knowledge at a higher status than amateur knowledge.

Q: Not criticism, but Alan’s point about what models of citizen science are at play links to broad, old question of the relationship between experience & knowledge. How is knowledge experienced? When you use terms like volunteers/amateurs, and training instead of education, that sets a specific framing about power relationship.
A: Hined – how to write a paper about own experiences? The experiences aren’t standard research data, but figuring out what it is often comes from groups who prefer the label of layperson or volunteer. It doesn’t come from one place & that makes it hard to answer.

Q: A way to reformulate the question: when things don’t go as expected, interesting to see what happens when participants resist the way science is to be done. In Fos-sur-Mer, did anything happen to change protocol to address different interests in ways that modify research?
A: Christelle – Question of model; it was difficult to apply just one model, it’s so contextual. Model is enlightening but reductionist when giving an account of what’s happening. Recalls prior research where locals knew the ecology but it took several years to demonstrate its accuracy through scientific means. Conger eel case – if you work with oysters, you’d get in trouble & locals will resist because the work may stigmatize the area & destroy the way they make a living – so collaborating on indicator species selection was a useful move.

Q: Remark about sociological citizen science & social feedback: last year did a quant study & sent thank-you letters that invited their thoughts on the long interviews, was surprised at extent of response & some people even sent back the thank-you money.

Q: Claudia – for tech assessment, those formats like consensus conferences on tech issues, so you were doing engagement for STI policy-making, what are mechanisms to ensure the recommendations are taken up? And if we think future changes present, it’s essential to think & reflect about whether it counts or not, whether it’s a fiction of the participatory future-making, & the consequences that may have.
A: Christian – Uptake depends on the project, depends on who solicits the feedback. On the one for food security, they were both the participants and the people asking for the work to be done. Working with policy-makers takes a lot of effort, many letters, meetings to share results, etc. Also really depends on the interests of individuals – people asking for results so they can integrate them, others only slowly getting interested.

Q: Credibility by citizen scientists, tension in terminology of amateurs vs professionals. Something to consider.
A: Me – I never use the term because of implied assumptions, but Rick Bonney observed that the etymology of “amateur” speaks to work done out of love, and that sort of passion was a clear trend in the stories we heard from speakers. Despite the baggage we associate with the term when it’s placed into a false dichotomy with “experts” – and it really is a false dichotomy, these are not opposite concepts – it’s actually quite apt at capturing the sentiment expressed by participants.
A: Christelle – there are not 1 but 2 words in French for this set of notions that represent these sensibilities well, amateur and connoisseur [the Old French root means “to know”].


Session 3: Infrastructures, technologies, & policies

Citizen Science by Other Means: Technological Appropriation & iNaturalist
Anne Bowser (Woodrow Wilson Center) & Andrea Wiggins (UMD)

Departure from original goal of developing a typology of infrastructure: noticed a few things in an earlier research exercise. Components of infrastructure were not easily arranged in a hierarchical fashion, but represented more of a networked assemblage. Huge range of tech in use, like social media, in-house tools, etc. So zoomed out to understand citizen science tech as complex sociotechnical system, thinking of them as system assemblages with both technical & social components.

Key components are borrowed or bought through appropriation as defined by Bar, Pisani & Webster, 2007. New pilot method was Dix 2006 framework to analyze the potential for appropriation in iNaturalist. Overview of iNat: goals of socialization, education & data collection. Components include app, website, community & researchers. Process of participation is uploading observations, crowdsourced validation, sharing for research grade, and data access. Theoretically a global research infrastructure, primarily funded in the US and data collection reflects it.

iNat supports appropriation in a couple ways. On-site projects, partners for data collection campaigns like NPS Bioblitz, and OSS code base on GitHub that can be adopted, e.g. in Natusfera. So how does iNat support appropriation by design? Outlines the criteria from Dix with examples from iNat.

Allow interpretation: multiple labels for grade of data that can be redefined in other adoptions or uses. Research grade is a label commonly understood by certain stakeholders, but if it is reworked, the data may become incommensurate.

Promote visibility: make data flows explicit, make data processes apparent, & make code open. Scaling up led to project aggregator functionality, which streamlines the process but also obscures data flow.

Expose intentions: make the primary goals clear. iNat does this well in some ways; consequences of decisions around geoprivacy are clear, but data quality assessment is more opaque. Confusion about the “needs ID” flag, for example, can have downstream impacts on which data are considered research grade and used for research.

Support not control: the reusable framework is flexible and allows a lot of options; while the platform could technically be used for things like water quality, for example, the current configuration doesn’t support it, nor the integration of data complementary to biodiversity content.

Pluggability & configuration: the highly structured process suggests that collecting a lot of data is the primary goal. Only certain configurations are supported.

Learn from appropriation: pay attention to use & whether tools can be better suited to end user needs. Natusfera was developed because iNat declined to support their needs for European projects with language support, etc.

Encourage sharing: documenting uses for supporting shared protocols and comparable data is very uneven right now. It suggests some preferences around which projects are using the platform “right” or in the preferred way.

Summary: prioritizing direct appropriation without modification of existing data flows over more complex code-based forms of appropriation. There’s little middle ground for intermediate forms of appropriation, e.g. if partners wanted to collect air or water quality data in addition to biodiversity data.
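As context for the “allow interpretation” point above: the research-grade label is also exposed as a filter in iNaturalist’s public API, so downstream users inherit whatever the label means at the source. A minimal sketch of building such a query – the endpoint and parameter names are my assumption based on the public v1 API, not something presented in the talk:

```python
from urllib.parse import urlencode

# Sketch: querying research-grade observations from the public
# iNaturalist API. Endpoint and parameter names are assumptions
# based on the public v1 API, not details from the talk.
BASE = "https://api.inaturalist.org/v1/observations"

def observations_url(taxon_name: str, quality_grade: str = "research") -> str:
    """Build an observations query filtered by data-quality label."""
    params = {"taxon_name": taxon_name, "quality_grade": quality_grade}
    return BASE + "?" + urlencode(params)

url = observations_url("Quercus ilex")
# The quality_grade filter is exactly the kind of label that becomes
# incommensurate if a fork like Natusfera redefines what "research" means.
```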

Discussion: situatedness – are there situations where appropriating a tech & adopting the values inherent in its design is antithetical to project goals? Tech components like Google Maps are subject to some caveats that seem to align with iNat’s values, e.g., around openness, but other users may not recognize they’re adopting second- and third-hand values through the design of the tool they appropriate.

Dynamics: environments evolve; appropriation enables those dynamics. Example of Public Lab: once a kite is appropriated for monitoring, the kite itself is unlikely to change, but software components’ materiality is different. Software is more mutable and gets changed when a code base is appropriated.

Ownership: appropriation brings a sense of ownership when people feel in control over a tech & its uses. DIY/maker communities may be able to contribute to this aspect in some cases. In iNat environment, appropriator is essentially a mediator for iNat, and that has implications for groups like NPS.

Summary: need to carefully consider impacts of appropriation on both the appropriated and appropriators.


Microfluidic systems: challenges and opportunities for citizen science
Mary Amasia (Madeira Interactive Technologies Institute)

Chemical engineer by training, working on critical tech. Early-stage work & welcomes feedback. Example of Flint water crisis: the citizen science promise was only partial; participants took their own samples, but the samples were shipped to Virginia for analysis in an EPA-accredited lab. Issues of scale also impact things like whether adequate evidence for action can be collected.

Has worked in microfluidics for 10 yrs, tools for portable chemical testing, e.g. Lab on a CD. Multidisciplinary field with several specializations required – physics needed to understand how to manipulate a testing substance via capillary action, chemistry for analysis, etc. Combined with features of optical disk readers, it’s a tiny powerful lab, but not being used in places where it would have most impact, in places where there’s no access to those resources. Examples of paper, droplet generator, and CD based systems.

Developing BlueLightsLab to use BluRay drives for chemical testing. Fewer and fewer computers have BluRay or DVD players – obsolete tech being underutilized; can repurpose and reuse them for low-cost diagnostics. BluRay tech includes a spinner, blue laser, and reader – so it could be used for sensing at scales of under 1 micron – applications like microplastics, invasive species, larvae. The laser can scan the surface of the disc for imaging instead of reading pits in the disc for data. One application is low-cost HIV testing, since it can detect particles at sizes ~1 micron & white blood cells are 10-15 microns. Another use is testing for DNA linked to GMO crops. Some of these are being used in a modified drive with an extra sensor above the disc surface to image through refraction or absorption.

How can this tech integrate with citizen science? Most emphasis on microfluidics has been on developing for biotech, or huge scale testing. Not much on the challenges for real-world samples as in citizen science. Microfluidics that are $5/test are too expensive for medical uses, but acceptable for citizen science. Good potential for DIY testing or crowdsourced monitoring, but has to be integrated into citizen science frameworks.

Just starting to work on open hardware platform for DIY bio. Also not assuming one best approach and looking at how to develop low-cost test kits for crowdsourced monitoring. Prior work is proprietary & siloed, but they hope to make it open source.

More Qs: what is testable? What counts as direct evidence? What makes a particular sensing method accountable or auditable within the existing legal & political context? How can expert & lay methods be leveraged together?


Privacy and Responsible Research in Citizen Science projects
Gemma Galdon Clavell (Eticas Research and Consulting)

Policy researcher in private company, working with industry, government agencies & admin. A year ago, got to work with a citizen science project, looking into ethics & legal aspects. Examined whether it’s being done as citizen science without the citizen. Focus on science, needs of researchers, but not as much the people providing data – data despotism.

Worked with 3 projects: Atrapa el Tigre, Bee-Path, Observadores del Mar (marine observations) – 2 app-based, 1 web app. Basic principles of data protection – were they being taken into account? Processing personal data in the context of citizen science blurs the lines between subject and object, so data privacy becomes an issue. Increasingly digital lives make it hard to know what happens to one’s data later, including risks of re-identification. Saw a need to protect participants.

Process is through analysis of the data life cycle – the process steps where things can go wrong. Incorporate principles of privacy by design and privacy by default to make sure the final product is in line with legal requirements and the expectations of clients. Private contract, so can’t go into full details, but general findings follow.

Found general concern for privacy with specific remedy mechanisms. All were using their own servers and storage without cloud, which is important, and were also well protected. There were access protections with logs for accountability and figuring out who had access and how data might have been leaked. All had relevant privacy & cookie notifications – a good starting point, though sometimes such notices have nothing to do with what projects really do.

Vulnerabilities: data collection through apps is the main issue area – usually fairly minimal, through app permissions & web forms, especially with social media, plus excessive fields of unnecessary details sometimes required. Most projects relied on passive consent: participants had to opt out rather than opt in, and most privacy-enhancing functionality was not on by default; they recommend the opposite. User password management is also a problem – passwords emailed in plain text!!! Metadata in pictures can tell you about the device, owner, locations, etc., so photos need to be cleaned in a way that’s not commonly done. Also data transfers: willingly to 3rd parties but unwillingly to search engines & web repos – that means people can’t delete data because it still exists elsewhere online, shared with repos you don’t control. Sharing and reuse had no risk assessment – what if the data were exploited? What are the chances of re-identification? No one looks at it. Exploitation is also an issue – they ran an adversarial attack & the outcomes were not good. Re-identification is generally an issue. Data deletion is also not done in practice but is legally required, although it could be automated. Also, should always use the https protocol, securing all interactions with encryption.
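The photo-metadata cleaning mentioned above follows a general pattern: keep only an explicit allowlist of fields rather than trying to enumerate everything to strip. A minimal illustrative sketch – the field names are hypothetical examples, not taken from the audited projects:

```python
# Sketch: scrub submission metadata down to an allowlist before sharing.
# Field names are hypothetical examples, not from the audited projects.
ALLOWED_FIELDS = {"species", "date_observed", "coarse_location"}

def scrub_metadata(record: dict) -> dict:
    """Return a copy of the record with only explicitly allowed fields.

    An allowlist (keep what you name) fails safer than a blocklist
    (drop what you name): new fields added later by a device or library
    are removed by default instead of leaking."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

raw = {
    "species": "Aedes albopictus",
    "date_observed": "2016-09-01",
    "coarse_location": "Barcelona",
    "gps_exact": (41.3851, 2.1734),   # precise location: privacy risk
    "device_id": "a1b2c3",            # identifies the owner's phone
}
clean = scrub_metadata(raw)
```

For image files themselves, EXIF tags would also need stripping before upload; the allowlist principle is the same.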

Assessments of projects: Bee Path only collects data when in the research area, which is great, but did background tracking by default. Adversarial attack let them use name of someone in project and find their data in Google, even if data were deleted.

Specific recommendations: Determine beforehand which data categories may be shared or revealed; notify about privacy settings, implications & risks; offer the option to hide data or publish it anonymously; allow people to rectify & erase data, including data that may seem non-personal; request the minimum info about participants, including metadata; adopt transparent practices about sharing and notification.

Lessons learned: can’t decide unilaterally what privacy protections should be in place – should involve volunteers; transparency, open data, and participation not in conflict with data & privacy protection, not a tradeoff; data sets that are published need to be properly dissociated & analyzed; impossible to have single recipe so context is important.

[NB: as authors of a paper on this topic in Human Computation, Anne & I both approved of this talk!]


Awareness and attitudinal change in participatory air pollution monitoring
Christian Oltra (CIEMAT); Ainhoa Jorcano; Irene Eleta (CREAL‐ISGlobal); Roser Sala

Study as part of larger project on air pollution & public perception & change. Air pollution at harmful levels in Europe and it has a high cost in terms of health care and economic loss. Mainly a localized challenge using info systems based on reporting, websites, indices, alerts, advisories, and even apps. Traditional public info systems on air pollution rely on awareness of availability of data.

Participatory sensing based on increasing availability of new capacity for public engagement on environmental risks. In air quality, more development in terms of hardware and sensors; others have looked at how the tech can engage the public. Very few empirical studies on how mobile sensors impact attitudes and behaviors about air pollution.

RQ: how does perception of local air pollution change with use of sensor? How are attitudinal dimensions changed by using sensor? Do participants feel helpless or empowered to act based on experience with sensor?

Asked small groups of people to register air quality for a week using a sensor. Started with a focus group, then a week-long usage diary study, and then a follow-up focus group. The sensor measured NO2, which is associated with particulates, so even if it is not the most dangerous pollutant, it’s a good indicator and provides useful resolution of data. 4 groups of 6, half recruited & half self-selected.

Results: experience with the sensor – the main patterns of recording were: recording levels in their surroundings; making comparisons between polluted & non-polluted places; looking for patterns; testing intuitions; etc. Half of the groups did a lot more than the other half, which tracked with recruited vs self-selected. Responses reported interest, curiosity, surprise.

Impacts on perception: awareness of NO2 presence, but surprised that levels were higher/lower than expected in different places. Were more aware of existence of NO2 levels in city, problem and data were more visible to them, and problem was considered more specific due to focus on what was measured. Understanding improved for NO2 but not of impacts or other air pollution issues.

Impacts on risk perception: beliefs about severity & susceptibility – saw little evidence of change in perceived susceptibility and very limited evidence of change in perceived severity, even when levels are very high. Controllability & self-efficacy: little evidence of improved perception of controllability, e.g., “I can’t do anything that would change my exposure.” One person reported feeling empowered by personally collecting data rather than passively consuming it.

Impacts on behavioral intention: very little change to beliefs about issues and intention to act, but did find some evidence: one participant reported using the start & stop system in her car to reduce air pollution, another reported an intention to buy a mask for riding a motorcycle, and a third reported he wouldn’t use a butane gas heater anymore due to indoor air quality being worse than outdoor.

Experience generates some level of emotional interest and potential to generate more engagement, especially compared to just a focus group. The effect of sensors was mostly related to understanding NO2 levels & awareness; limited change in terms of perceived susceptibility and self-efficacy to reduce exposure; there seem to be significant differences between recruited & self-selected volunteers in terms of volume of contribution.


Q: Christian, next step in research?
A: No next steps yet; others working on improving sensors but they are the only social scientists working on it.

Q: For those participants who found opportunities for changes, were their experiences different from those who didn’t?
A: Christian – Provided suggestions from agencies on reduction & protection, and some took action, but it’s a poorly covered topic. No info on self-protection, generally on reduction. For them, main Q is risk communication & improvement in engagement, so should they be used by local agencies? Can it improve public engagement on air pollution?

Q: Christian, emphasis was on individual reaction/action, you can wear a mask but this is a problem that needs a collective response. Resilience is individual and asking if people were pushing for collective action?
A: Usually local governments emphasize different regulatory, political, and behavioral strategies. Haven’t gotten to that level yet, but they’re all important. In terms of behavioral, there’s room for self-protection, for example checking if standing back further from street helps reduce exposure, and also reduce polluting practices, and then engagement. Public must be more engaged in project with sensor.
Q: Anne, talked about ownership a lot, mostly toward the technology, but how about the environment, e.g. stewardship?
A: Anne – ownership is personal responsibility and stewardship is shared responsibility. Interesting to think about tech as shared in terms of stewardship; could imagine an OSS platform but it seems unrealistic. More realistic is looking at convergence between maker/hacker and citizen science to advance practices through progressive evolution.

Q: Feeling that often the projects are focused on different analysis equipment & kits, led by research needs? What kind of experiences or ideas for mobilizing creativity of participants to formulate questions, generate ideas on conducting research, etc?
A: Gemma – no need to mobilize it, need instead to get people at the top to listen to it. Often no one is really listening, so innovation at the neighborhood level isn’t often chosen as a subject of study. People are doing things bottom-up so the question is how to make that work visible, and that’s true of privacy too. Demand exists but market doesn’t want to respond, more about learning to listen & stop imposing needs of other agents into communities.
A: Anne – earlier there was a presentation on future-oriented planning of engagement; can build on that through citizen science – not just contributing ideas and setting priorities, but also working with policymakers & scientists based on priorities of communities. Framework conditions need to be improved.

Q: To connect to last point, she is based in Netherlands, doing action research with public in project using platform in (energy?) transitions. At start of research, did interviews to see and listen to their needs. Argues it’s a combination of bottom-up and top-down approaches – if you ask people what they want to research, you will need to facilitate that process to make it a meaningful research project. Project leader focusing on transition management, giving people room to discuss and construct the problems they want to solve at the end. Wants to ask Gemma & Anne, she asks about very particular types of data, e.g., financial. Of course, in context of energy, this is considered sensitive. Finds the tension interesting – discussion with umbrella organization, she wanted to get access for research, and on one hand there’s a natural feeling that data needs to be protected due to sensitivities, but the other side of the coin is, it’s a community and we should share everything to learn and grow. Comments on this tension?
A: Gemma – specific issue with environmental research: the effects of issues are felt individually down the line, so you don’t need to promote organizing when the problem is detectable, but if it’s not immediately obvious, then it’s a bigger issue, especially if impacts are delayed. Privacy & transparency are not opposites: you can release data, but you would need to anonymize properly and analyze in aggregates; open data can expose private data that could be harvested by insurance companies to charge higher premiums. A researcher doesn’t need an exact salary, just a salary category – details can be obscured, but not many people are doing that.
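Gemma’s salary-category point amounts to a simple generalization step applied before release. A minimal illustrative sketch – the bucket boundaries are hypothetical, not from the talk:

```python
# Sketch of generalization: replace an exact salary with a category
# before data release, so aggregate analysis stays possible but the
# exact value is obscured. Bucket boundaries are hypothetical examples.
BUCKETS = [
    (20_000, "under 20k"),
    (40_000, "20k-40k"),
    (60_000, "40k-60k"),
]

def salary_category(salary: float) -> str:
    """Map an exact salary to its coarse category."""
    for upper, label in BUCKETS:
        if salary < upper:
            return label
    return "60k and over"

# A researcher studying energy poverty needs the category, not the value:
category = salary_category(35_000)
```

The same idea extends to other quasi-identifiers (age ranges instead of birth dates, neighborhoods instead of addresses).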
A: Anne – major tension is the need to collect data locally for global-scale research, which means making big questions relevant locally, figuring out how to make questions resonate locally & designing for that. There’s some level of dichotomy with the open sharing, decisions need to be made more collectively and communicated so that local questions and data collection can be meaningfully understood in global contexts.

Q: Mary, what would you like to see happen next? What resources would move this forward?
A: She’s here because in the time she spent in microfluidics she became critical of the narrative: developing for certain settings, getting patents and licensing. Got tired of that, and really wants to see it being used more broadly. Trying to work with existing, reusable components, and trying to consider how to keep it as open as possible while making responsible decisions about the technology, particularly not limiting or determining how it’s used in the future. Thinking about that differently from traditional engineers. One of the initial projects is invasive species monitoring on Madeira, with lots of local investment, using it as a test case; wants it to be as open, usable, and adoptable as possible. Considering working with Public Lab.

Q: Some social scientists are reluctant about citizen science. Not clear on participation, insisting on need for there to be a change beyond more awareness, but more directly, as an outcome. Maybe more work on research in action – citizen science as a political action? What are limits of privacy and ethics at that stage? European Commission discussion, reluctant to push on it because they believe it’s trivial engagement.
A: Anne – In some motivation studies, one thing volunteers look for is the outcome & evidence that contributions achieve something.
Q clarifies: more than just submitting data.
A: Anne – So in projects with online participation, it’s on the science team to have the volunteers named in a published paper, e.g. in the discovery of a planet – not just as aggregated data points, but acknowledging their role in something bigger, which meets their motivations.
A: Me – Speaks to values underlying participation expectations, why are we imposing our values about participation on people? They don’t always want that.
Q clarifies: sees reluctance where belief is that citizen science participation should be something bigger than it has been.
…Extended and very lively further discussion with entire audience on tensions around values, funding, roles, etc…

Citizen Science & Health Data Donation: Health Data Exploration Project 2016

This week I had the privilege of joining the Health Data Exploration network meeting as an invited speaker. I found it a really interesting and eye-opening experience, since biomedical and health domains are not the most common focus in citizen science, and so rather new to me. The interaction of academic and clinical practice, and the implications for opening up participation, are a notably different paradigm than in some other sciences.

My talk was (very) well received, and I had a lot of really great conversations. My notes are pasted below, unembellished and minimally corrected, to give the broader citizen science community some insight into what the movers and shakers in the health data community are discussing as relates to engaging the public. Enjoy!


Health Data Exploration Network, May 17, 2016

“Projects that Have Advanced the Use of Personal Health Data for Research”

Stephen J Downs, Robert Wood Johnson Foundation
mPower Parkinson’s study with Apple ResearchKit – returning data to contributors was important & empowering. Sadly, some of the controls in the study turned out to have Parkinson’s symptoms too.

Emil Chiauzzi, PatientsLikeMe
Multiple Sclerosis study, small sample, using wearables to manage disease – how do people use data? Behavioral adaptations, day-to-day management of disease. Lots of tacit knowledge that isn’t captured, e.g., schedule shifting to avoid impact of heat on MS symptoms. Planning for rest time after more exhausting days. Avoiding heat with environmentally controlled settings. Pacing themselves, reducing activity intensity strategically. Developed a course in how to use the data from a wearable with concept of a “sweet spot” or targeted activity levels, contingent on the conditions.
Implications for data donation/cit sci: consider role of data within broader context of individuals’ uses, provide tools for action. Tools for wellness not always applicable for disease communities. Learn about “health hacks” that people adopt. Implications for precision medicine: put out survey about precision medicine, lots of people unaware, patients not included in the set-up.

Eric Hekler, ASU
Proximity sensors for adaptive interventions. Goal is just-in-time intervention to nudge toward positive or away from negative actions. Also looking into receptivity, at what moment is someone receptive to your message? iBeacon sensors: Bluetooth device tracks proximity & user accesses some aspect of it with an app, being used in retail, they are developing other uses, try it and donate data. Questions from this data: styles of participation, influence of program, re-identification behavior, predictions of future collaborations, etc. Interesting questions about data donation practices as well: how do you feel about the data collection, value that should be returned, who do you trust with this kind of data, what are ethics issues, are you willing to re-identify, locations of proximity sensors. Planning to validate measurements with users, did we measure the right things, is this useful or meaningful?

Michelle DeMooy, Center for Democracy & Technology
Working with Fitbit on internal R&D and ethics. Very interesting work. Report coming out tomorrow. Have a human subject researcher on the team, not just looking at data but helping provide context. More public-private partnerships are needed, making the debates public could be valuable, Fitbit wants to do the right thing but is stuck in a paradigm that makes it difficult.

Question about Fitbit adopting HIPAA – is that really a win? The privacy and security standards are pretty good; it gives them a standard to work to and check against, which is a victory. That didn’t exist before – many wearable companies have nothing like it.

Julie Kientz, UW
Research on alertness: impacts of body clock, time of day, stimulant intake. Self-reported fatigue, sleep diaries, psychomotor vigilance task reaction time. Alertness varies over time, with different patterns for different circadian rhythms. Also saw rhythms in app use by app category over time – very interesting and logical. Wednesday was the low-productivity day; people hit a wall. If they got enough sleep, people used 61% more productivity apps; with inadequate sleep, 33% more entertainment apps. Late phone use correlates with sleep disturbance. Future work: circadian-aware tech for planning and scheduling. Automated sensing doesn’t tell you everything; need the human-in-the-loop for understanding intent and meaning with as little burden as possible. Also the notion of reward: forced reflection – we collect data and never look at it; put markers on your day and annotate later, this is a benefit.


Barbara Evans, University of Houston Law Center, commissioned paper
Consumer-Driven Data Commons: Health Data and the Transformation of Citizen Science

Data ownership: uses airline seat ownership of space to demonstrate the tendency to feel like we own something that we don’t. Feelings of data ownership are strong & intense, but legally we don’t own our data, and if we did, it would be different from owning a house. Most of our data are under a regime of shared control. Data ownership has been debated too long: the critical issue is control & access.

“People-driven data commons” – uses the term commons in the sense of natural resources, e.g. work by Elinor Ostrom. It’s not the data but the institutional arrangements for data management and stewardship: a set of rules and arrangements to allow collective self-governance. Granular consent won’t get us where we need to be: acting autonomously puts power with individuals, but collective action is needed to achieve the outcomes, and mere collections of individuals acting autonomously can’t act together.

Normative ethics not admissible as testimony. Can get the data from either consumers or data holders. A people-driven data commons means working with consumers, so people work on getting their own data to bring it together. 2×2 table: consent to data use from individuals × willingness to share data from data holders. So far we have neglected quadrant 3, which gives people access in order to contribute data.


“Personal Data Donation: Barriers and Paths to Success”

Anne Waldo, Waldo Law Offices: You can’t share data you can’t access–HIPAA. Access rights even include having data sent via unsecured email, but providers are not fully compliant. www.getmyhealthdata.org Providers are suspicious if you request records: are you leaving me or suing me? Refusal to use email despite the law requiring it; ridiculous fees of up to $600 (or $6K) to get records, with no estimate of cost until the data are delivered. Data only available as PDF or on paper, not very usable.

Jason Bobe, Icahn Institute: health is the #3 priority for people worldwide, but people don’t participate in organized health research: why? How do we overcome that and provide a good model for participating in health research? Really important for rare traits, e.g., “resilience” genes: you need millions of people’s records to find the ones who are healthy where medical prediction says they would be sick. Founded DIYbio, which has exploded since he started it. The key insight is not the $1K genome sequence, but the $1K genome sequencer. At the time, if you wanted sequencing, you were beholden to organizations that wouldn’t let you get your own data. Roles are changing; it’s no longer binary. Not just participants and researchers–participants can be the researchers. This influences research governance. Harvard Personal Genome Project: making genome data available as a public benefit and resource. The question at the time was whether people would share; the answer is yes. It’s variable across the population, but people will share sensitive data even publicly. Treating people as collaborators from the start was key; it’s important to invoke reciprocity. Benefits go beyond altruism and social good: people find meaning, community, and education in new roles. Retrofitting sharing is too hard; you have to completely rework it.

Nick Anderson, UC Davis: The assumption of precision medicine is that we need lots of heterogeneous data of all kinds, not just genomic. Acceptance of the data sources differs; the value of data is differential, and some uses are, if not accepted, at least understood, e.g., the NSA. Health data is different: how does a million people contributing all their data shift the question of when data acquired for one purpose is used for another? Even though it’s promised to be for the benefit of all, it will probably only benefit a few to begin with. The social benefit is currently lightweight and hard to understand; we’re not just asking people to change patterns of how we do things, but to contribute to things they’ll probably never be involved in. How is precision medicine going to shift the discussion? There’s no clear acknowledgement of secondary purposes for data use in some systems, but we’re planning to do that anyway.

Camille Nebeker, UCSD: people were more thoughtful about sharing when she asked. They wanted to be motivated, for there to be a powerful purpose, for it to make a difference, and to know what it would be used for. It would be a motivator if it was of value to the community they were from, whether an ethnic or patient community. They wanted to have control over sharing, to give permission and know who they’re sharing with. Commercial entities may be less likely to get data than academics. GINA’s genetic data protections are important, protecting identity and guarding against discrimination. Trust is important: people would readily give data if they trusted the person. De-identification was key–odd, since this group should know it’s hard to do, but they still believed it could be done. Loopholes, for-profit use, rights, etc. are all barriers; there’s a huge cognitive burden to using policies, and removing that will increase trust.

Aaron Coleman, Fitabase: sometimes you’re in a data sharing experiment and don’t know it. Understanding the granularity of data and what can be inferred from it matters–for example, in an employer-based study, do you want them to know your sleep patterns? Do you want them knowing activity patterns that may suggest health cost differences? Once you dig into APIs, trying to pull data out, the barriers pile up. The more granular, frequent, and broad in scope you want your data, the harder it is to get. Permissions for access to data are now getting more specific; we need the granularity and options to turn things off and revoke permission when we’re no longer comfortable with their data use.

Waldo: New guidance from HHS: you have the right to a paper copy of records, to an electronic copy if they’re in that form, and to send them to a 3rd party of your choice regardless of who it is, and they can’t ask you why. You have a right to get records by unsecured email, though they have to warn you about the security issue. They have to give it in the form/format you request as long as it is reasonably producible in that form. Right to a standing request. Within 30 days max, and it should be almost instantaneous or very prompt. Providers can’t demand you come to the office, require you to use a portal, or use their own authorization form–an access request is different from an authorization form. They can’t deny a request if a bill is unpaid, and can’t deny it because they don’t think a certain recipient should get it. Fees still exist: a reasonable cost-based fee, only for copying and mailing, nothing related to processing, storage, or retrieval. It’s illegal to charge for download format or page fees for electronic records, and they must give up-front cost estimates. Best practice is to charge nothing.

These have been rights for a long time, is there any indication they will enforce it? HHS has been saying this clearly in conferences. Grace period on enforcement action since policy was released.

Question on de-identification: how close are we to that being impossible? A recent news article described Stanford profs getting phone records and being able to find the people. In ethnographic work, it’s still hard to obscure who a family is to anyone who knows them. One of the issues with precision medicine is that you can do tangential re-identification. In the technical world, it’s only as good as HIPAA can make it. There’s a movement toward considering the probability of re-identification and why, and asking whether we can quantify that to make the risk easier to communicate. Right now the probability is pretty low. Need to educate people: from the IRB member perspective, quantify the risk of harm for a study and state it in the consent form. If we could quantify the likelihood of re-identification and start moving away from guarantees of anonymity, then people would start to understand and we would become more transparent.

Important to spread word about new guidance from HHS, this is huge culture change. Some safety in numbers, this is a tipping point. Feels like change agents, nice to be together with others who rock the boat the same way.

In policy and technical community, looking at linkability versus identifiability. Different paradigm for protections. Safe Harbor requires removing 18 identifiers, but 19th is “any other thing that would make it clear to someone else who that is.” Have to scrub it harder for case studies, maybe even change facts that aren’t meaningful. Expert statistician method–risk has to be very small, not zero. Should stick to standard of very small risk, because we deal with small risks in every other part of life.

There’s a disconnect in talking about these precision medicine studies recruiting a million people, and many others. The question is whether we are ready for this scale; it seems like there’s a huge disconnect. Jason says the PM initiative is bold: shooting for a platinum record without having produced a single song yet. But there are ways to cheat, e.g. the Million Veteran Program. He sees the mental model as the bigger barrier: people think medical research is for when you’re sick and need experimental treatment. Need to shift perspective so that research is a diverse experience, such that people seek out studies that meet their tastes. There’s a danger in promising equal benefit to everyone–that’s not likely–so scale is going to be an issue.


Patti Brennan, UW-Madison
Citizen Engagement: Informatics in the service of health

4th director of NLM (upcoming). Goal of health for all, in part through data donation. Think of “citizens” broadly–not just the people you know or people of privilege. We need everyone engaged. Citizens, not patients: an important framing for health services. People need rights to advocate for themselves; rights and responsibilities are associated with citizenship, much more so than with engagement. Citizens as people of the world who engage and take part in the world.

Identifying health data life cycle–origination points for data, and also the points in between clinical treatment. Assertions:
1. Citizens are sometimes patients, but always citizens
2. Citizen engagement improves health
3. Citizen science provides a data-driven model to guide the next-generation of technical innovations, clinical practice, and biomedical & health knowledge

Needs to drive informatics in service of health. Emergence of new tools and tech, etc.

Engagement: participation by multiple parties achieving mutually established and jointly accountable goals. Also a promise, obligation, binding condition; giving someone a job; etc. Multiple definitions all have their own analogies in health. Arises from perspective of mutual recognition and respect; requires deliberate and intentional strategies.

Direct involvement in policy and public service delivery–in cooperation with rather than in place of experts. Discussion of citizen engagement along the lines of public engagement in science. Lots of ways that engagements can happen: extension, collaboration, co-creation.

Experiment: view videos of pond & list signs of life.
I saw and heard: water skeeters, Northern rough-winged swallow, multiple species of warblers (yellow, yellow-rumped?), shrubs, trees, water, song sparrow, tadpoles, frogs, grasses.

Citizen roles in citizen science: Extenders extend the reach of professionals, with perspective and goals defined by the pros (contributory); Collaborators help set priorities while professionals and policy define the scope; and co-creation. Example of the Audubon CBC. Rules and definitions–shared meanings–are all possible and useful.

Sharing is more important than storing. As important to seek normal patterns as unique things. Redoes the citizen science definition with citizen engagement in health. Australia defines effective engagement with: inform, consult, involve, collaborate, empower – likes this definition better than what is used in US.

CAISE typology parallel for health: gathering health data as guided by professionals, guiding priorities in policy, and in co-creation of balances of public health, patient care, and personal wellness. Potential benefits are personal, professional practice improvements, and public policy improvements. To achieve the goals, need information infrastructure.

Concept of doing a discharge simulation: a virtual tour of the home along with air quality data and other records, to identify places for post-surgical care activities.

Expand the phenomenon of interest to health, measure what must be interpreted, rather than interpreting what we can measure. If data, storage, interpretation and use are separated, then provenance becomes more important. Metadata needs to be captured along with data and definitions created at point of use, transitioning to ontological formulation of what a data definition is. Need broader set of communication and information tech. Sets out necessary soft skills along with tech needs.

What if citizens are wrong? When clinicians are wrong, we call it differential diagnosis and don’t throw out the data, recorded as pathway that’s been abandoned. So why expect more from patients? Begin with what people believe is important and work from that.

How about rights that are under threat or restricted? Building systems based on privilege. Medical informatics can’t resolve it, but can be designed within social constraints to formalize impacts on what constitutes health, practices, and acceptable use of data.

Citizen Science at AGU 2015 Fall Meeting

I went to my first meeting of the American Geophysical Union in December. It was quite the experience; I’ve never seen academic conferencing on such a scale before. I liked the primarily-posters format because it was much more interactive overall–I could linger and discuss where I wanted to and skip the stuff that was less interesting to me. And to my surprise, there was a lot more that interested me (mostly in the earth and space informatics section) than I had initially expected.

However, it was hard to find the citizen science content, aside from that which was labeled as “education” despite a primary focus on science over outreach. With such a massive program, it’s pretty important to be able to search effectively, and I missed a lot of good stuff just because I didn’t know how or where to find or look for it. I made it to just 2 oral presentation sessions that featured citizen science in the session title; most other citizen science presentations were unobtrusively tucked into sessions with titles that presumably focused more on the science than the process and participants.


 

Climate Literacy: Improving Climate Literacy through Informal Learning & Citizen Science 1
December 12, 2015

Realizing the Value of Citizen Science Data, Waleed Abdalati

Perspective matters: diversity of the public is part of benefits. He was NASA Chief Scientist at time of Curiosity landing.

4 part series of TV segments on citizen science, starting with CBC at Everglades. Then gives the example of NPN & Nature’s Notebook. Points out that these are good data because people really care, as much or more than professional scientists. CoCoRaHS is another example, video of NWS staff setting up an alert based on CoCoRaHS data, process between report and radio alert is 2-3 minutes.

Another series – the crowd & the cloud. #1 Even big data starts small; #2 Viral vs. virus; #3 Feet in the field, eyes in the sky; #4 Citizens 4 Earth. Smartfin – surfer science.

Fantastic high quality video, really compelling teaser for the series. Will air in 2017 on PBS.

Q: this takes skill, how is training done?
A: CoCoRaHS has training protocol.


Citizen Science Contributions: Local-scale Resource Management and National-scale Data Products, Jake Weltzin

“from kilometers to continents”

Monitoring for decision-making at Valle de Oro NWR, the first urban wildlife refuge, in Albuquerque. Needed decision info and also public engagement. Data presented with a bar chart that shows when migratory species are present at the refuge so they know when to manage for them. Also working on wetland restoration–reducing Siberian elm and increasing Rio Grande cottonwoods. Checking the flowering and fruiting: Siberian elm is leafing and flowering about a month ahead of cottonwood, and you need bare ground for cottonwood to propagate, so they need to remove the Siberian elm in the month before the cottonwoods in order to prompt cottonwood growth.

Product framework: phenology status data goes to phenometrics; climate and meteorological data goes to phenology models and integrated data sets; remote sensing data also goes into integrated data sets; phenometrics goes into phenology models; final products are gridded maps and datasets, and short-term forecasts.

Showing NCA annual start of spring based on lilac data. Very pretty maps of the “PRISM data set” for start of spring, a 4 km-scale national map. The local version maintains granularity and scales down to NPS locations, so you can see the first leaf index for park locations. But NPS cares less about when things happen than about change from the historic record; since the data go back to 1900, they can show biological response to climate change at the level of individual national parks.


Putting citizen-collected observations to work — CoCoRaHS, Nolan Doesken

Starts with funny 2-minute animated intro “each measurement is like a pixel in a picture”. Talks about 1997 flood in Fort Collins–60% of library holdings were destroyed as they were in the basement due to work on upper floors. Recent expansion into Canada and Bahamas; now has over 20K volunteers.

Goals are quality precipitation, and also education & outreach. Easy low-cost equipment is important–gauge is equivalent to that used for historic climate monitoring that NOAA does, therefore can fit into long history of measurements. Mobile app for data submission as well as web forms; permanently archive data and provide raw data and summary reports. Data are good for supplementing other sources like COOP.

Data tend to be accurate, spatially detailed (except in Nevada–not enough people), timely, etc. Who uses the data? Weather forecasters, hydrologists, water management, researchers, agriculture, climatologists, health, the insurance industry, tourism. Data are fed into the weekly US Drought Monitor process; drought conditions are easing. Snow data is hard to get, so their sources are valuable. The National Hurricane Center uses the data in post-storm summaries to describe impacts.

Challenges: an owl sitting on top of a rain gauge! Volunteers are much more male than female, very white, mostly college educated. Age demographics lean older; those who stick with it tend to be from that demographic, even though signup rates for younger and more diverse demographics are good. Recruiting at national scale is tough. They have over 250 local volunteer leaders and need to recruit and train 3K new volunteers per year to balance attrition.
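Those recruitment figures imply a rough annual attrition rate. A back-of-the-envelope check, assuming the ~20K-volunteer base mentioned earlier in these notes (the exact active count is an assumption):

```python
# Rough attrition check for CoCoRaHS, using the figures in these notes:
# ~20K volunteers (assumed active base), 3K new recruits needed per year.
volunteers = 20_000
new_recruits_per_year = 3_000

attrition_rate = new_recruits_per_year / volunteers
print(f"Implied annual attrition: {attrition_rate:.0%}")  # Implied annual attrition: 15%
```

Losing roughly one in seven observers each year is why recruitment has to run continuously rather than as a one-off campaign.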

Cost effective but not free; after 18 years, still hanging on. Photos of a bear checking a rain gauge.

Q: GEO group looking for improving in situ precipitation measurements, especially in Africa. How to export to Africa?
A: It’s a matter of finding local leaders who care about local precip. Putting local face on project is more compelling than most other options. Subsidize the rain gauge cost, and then communication is the next consideration–need infrastructure.


Crowdsourcing science to promote human health: new tools to promote sampling of mosquito populations by citizen scientists, Rebecca Boger

GLOBE program–international citizen science in the classroom, 20 years old. Discussing how materials are developed, new mosquito larvae protocol.

Train-the-trainer model with F2F workshops–big backlog and long waits to join program, so moving into an LMS. Developing training slides for 50+ protocols, will be available in 2016, emphasis on knowing how to conduct protocol and not pedagogy. They have to pass quizzes before being able to set up a login and get full access.

Developing a mosquito monitoring protocol: can do genus ID with a hand lens, species ID with a microscope and experts. Sampling from containers as well as ponds, streams, and puddles. Lots of research questions students can explore with the data. Have to get it up by the end of the year; will be doing a field campaign early next year to launch the new protocol.


Era of Citizen Science and Big Data: Intersection of Outreach, Crowd-Sourced Data, and Scientific Research 1
December 18, 2015

The Citizen CATE Experiment for the 2017 Total Solar Eclipse, Matthew Penn

Working with 3 government research labs, 3 corporate partners, 4 universities, 3 K12 teachers, and participants. Donating telescopes to observers after event, sponsors include the companies who make software and filters.

The upcoming eclipse on August 21, 2017 will drive tourism and will be the most viewed eclipse in history. A total eclipse opens up a window for viewing the inner corona in a way we can’t from space: the part easily viewed during an eclipse is the hardest part to study from a spacecraft. Planning to look at what is happening with polar plumes–they’re interesting, but they need more data than 3.5 minutes of observations from one location. Looking at the eclipse in Mongolia in 2009, they knew they would be able to see scientifically interesting events.

The path of totality goes from the PNW to South Carolina; the plan is to provide identical telescopes for volunteers to use at specific locations, transfer ownership after the event, and support ongoing use of the telescopes. While the eclipse will be visible for only 2.5 minutes at any single location, it takes 90 minutes to cross the entire path of totality.

Funding needs about $180K for equipment alone: 60 sets of telescopes, filters, software, mounts and drives; they still need $ to cover cameras and laptops. Expecting about 26 GB of data per site, 1560 GB (about 1.5 TB) in total. Sending data via 3-day priority mail is equivalent to about 6 MB/second, plus an upload of about 2 GB on the day of the eclipse itself.
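The quoted figures hang together arithmetically; a minimal sketch checking them, using only the site count and per-site volume from the talk:

```python
# Sanity check of the Citizen CATE data-volume figures quoted above.
sites = 60
gb_per_site = 26
total_gb = sites * gb_per_site        # 60 * 26 = 1560 GB, about 1.5 TB

mail_seconds = 3 * 24 * 3600          # duration of a 3-day priority-mail shipment
effective_mb_per_s = total_gb * 1000 / mail_seconds

print(total_gb)                       # 1560
print(round(effective_mb_per_s, 1))   # 6.0
```

This is the classic "never underestimate the bandwidth of a station wagon full of tapes" point: mailing the drives delivers ~6 MB/s sustained, far more than most 2017 field-site uplinks.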

Afterward, they’re looking to develop additional projects for work on comets (can’t get major telescope time), solar programs on sunspots, and variable stars with prototype equipment in partnership with AAVSO.

Proof of concept: Did one day of training with a volunteer who was going to the Faroe Islands in Mar 2015; conditions were lousy, but for 15 seconds they got data on the inner corona. The harder job was shipping it around the world and using crummy software. For a more prepared test, they’re doing a train-the-trainer exercise with 5 locations in Indonesia for March 2016 to verify the process.

Interested? mpenn@nso.edu; mpenn@noao.edu; sites.google.com/site/citizencateexperiment

Q: how much does weather matter? A: weather isn’t great for about half the range; it tends to be 60% cloudy. But with 60 sites they should get good coverage, and they’re hoping for 100–though if they add more, they have to add sites where it’s cloudy, to hopefully get more data from the sparser areas. Sites have some range of +/- 10 miles to move within, but expect some gaps.


Synergetic Use of Crowdsourcing for Environmental Science Research, Applications & Education, Udaysankar Nair

Motivated by needs for data that aren’t collected by agencies but suited to crowdsourcing with compute platforms like Google Earth Engine.

Using ODK, “end to end design” of system, that pushes data to Fusion Tables and Google Earth Engine, merged with sat imagery from NASA via a maps engine.

Land Use & Land Cover Change data currently relies on remote sensing data, but it needs ground truthing for contextual information. Many potential uses for data.

Claims 4 m accuracy for GPS in the app. Can use ODK offline to collect data; the step-by-step form is overly simplified, and usability could be problematic.

Tested with a middle school classroom, introducing it with the topic of biomes. Requires a lesson plan, including learning standards. Had kids use mobiles with ODK to track land cover in their neighborhood. Also did some work with student teachers in India, mapping small water bodies to support the Kerala State Biodiversity Board. Also looking at collecting data on open water containers for vector-borne disease research, frost occurrence, and damage after severe weather. Doesn’t mention how this is fed back to students.


 

LastQuake: Comprehensive Strategy for Rapid Engagement of Global Earthquake Eyewitnesses, Massive Crowdsourcing, & Risk Reduction, Remy Bossu

Points to eBird: you can’t do this for earthquakes because the target reporters are eyewitnesses. Focus on felt earthquakes, looking at social media (SM) activity and speed of feedback, so info needs to be available across SM platforms. QuakeBot, apps & add-ons are intended to automatically merge direct & indirect eyewitness contributions, seismic data, and other sources.

Can’t identify “felt EQ” with instruments, but can via SM. Just look at tweets with earthquake in them in US. But not every place uses Twitter that much. They use real-time web traffic to their authoritative site to figure this out based on IP addresses, could tell Kathmandu had not been flattened b/c they continued getting visits after several minutes.

citizenseismology.eu, @LastQuake

During the Nepal event, they made an automatic map but did not predict intensity until about 19 minutes and confirmed damage at 20 minutes, publishing 38 tweets in that time, during which there were the main shock and 5 felt aftershocks. Working to develop the app: UI improvements, getting better geolocated pics & videos, sharing comments to SM, and push notifications. Got decent data despite the fact that the quakes they recorded were in areas where LastQuake isn’t well known. They validate pics for lack of IP infringement, respect for human dignity, and accuracy with respect to known issues.

Quick rise in app downloads 10 min after Nepal. After 9 days, had identified most of quakes post-main-event. 85% of access from Nepal via Mobile, with 1/3 via app & 70% of reports. Traffic picks up in under a minute of shocks. Case on December 7: 110K downloads, 82K in operation (75% retention). Saw app launched within 1 minute of event and notifications: immediate response worldwide.

One KPI is number of responses within 30 minutes. Examples where they aren’t well known: Afghanistan, Arizona, England, Malaysia–hundreds of responses in each case, 2400 for AZ.

How were people finding them? This is the only app providing info on felt earthquakes, and it only takes hours for info to be shared. So they asked for feedback: what improvements? Users wanted help–what to do in earthquakes. So they’re developing visual pop-ups with do’s and don’ts (stay away from buildings, don’t call 911 unless injured) and adding an “I am safe” button. Seeing this as risk reduction information for the public: reduce inappropriate behaviors and fatal errors.

Q: Implications of using this for catastrophes, like anthropogenic disasters like shooters? How can you verify the truthing, veracity of content?

A: Rapid-onset events are easy to tell: eyewitnesses hit the website within 2 minutes and others don’t know of the event yet to falsify, but floods are much harder to tell. They don’t see a lot of people messing with the comments, the “not so bright” people are easy to spot. Can easily remove outliers, likely not because they are lying but because they are so emotional. With pictures, it’s not about reliability–photo of small crack in the wall isn’t useful to them, care more about larger damage.


CosmoQuest: Building community around citizen science collaboration, Pamela Gay

Data landscape for space science data is changing dramatically–“horrific data flood flying down upon our heads and across our internet connections”. Need help handling tons of data. Can’t get enough postdocs, have to open the doors to the ivory tower. Open data and open access will help, but requires supporting community: curricula, projects specific to grade level, adult learning, planetarium & science on sphere content to recruit and disseminate, crowdsourced podcasts, “guerrilla science” at science-related events.

Current projects focus on surface science. CitizenScienceBuilder for image annotation. TransientTracker for photometry and other products. Building data products and simulations. Portals like Moon Mappers, Asteroid Mappers, etc. Funded through 2020 with some pre-selected projects, but if all goes well, there will be an RFP for projects with details on how to ensure that results get published for small grants up to $60K. Providing educational materials, curricula, etc.

They partner with a lot of programs for podcasts, live YouTube events with up to 5K attendees, in-person events. “Come science with me”.


A method to harness global crowdsourced behavior to understand something about avalanches, Jordy Hendrikx

Snow avalanches cause 30 deaths/yr in the US and up to 500 fatalities worldwide, with $1B in damage in the US alone. There’s also a dramatic uptick in backcountry users, and fatalities are increasing more slowly than usage, so education is likely helping. Historically there are 4 parts to risk: snowpack, weather, terrain, and most importantly people. Need to understand their decisions.

Research into the causes of avalanche fatalities tries to understand accidents based on the result, but fatalities are usually a cascade of errors, so it’s hard to pinpoint a causal factor. Rather than working back from the consequence of a series of decisions, try to go to the “top of the cliff” to figure out which groups are more likely to be at risk in the future due to behavior, and then use targeted education. The goal is prevention via behavioral understanding.

Crowdsourcing by taking realtime GPS tracks on a smartphone app, then do Internet surveys about decision-making they can connect to it. Using a marketing approach to decisions. They describe & quantify travel practices in concert with group decision making dynamics and participant demographics, using GPS track as expression of decisions and terrain use.

Sending people to a webpage sounds easy but is hard: you need to advertise and get the word out, which is harder than scientists think it will be. Then a simple flowchart–sign up, download the app, track trips, get an auto-reply afterward, fill in the survey. Have been doing this since the 2013/14 season, noticing that there’s self-selection bias in who participates, and trying to grow the sample to a broader range so as to get more behavioral insight. Using snowball sampling via SM and word of mouth, but you have to reflect the culture of a crowd, not a stuffy white lab coat. Getting thousands of tracks from around the world.

Outreach is critical–presenting at workshops & public events, publications in popular press.

 

“Citizen Science in Context” at 4S2015

Attending the annual meeting of 4S (the Society for the Social Studies of Science) in Denver this week has been lovely. It’s a delight to reconnect with colleagues across diverse spaces and make new acquaintances, all the while talking about science.

In the last 2 days alone, I’ve discussed killer robots, citizenship in citizen science, scientific conference cultures, the ups and downs of academia, the Federal Toolkit, and how PCS algorithms are invisibly affecting scientific careers by pre-assigning the wrong people to reviewers based on vocabulary problems.

Below I’ve posted my minimally-edited session notes from November 13’s session on Citizen Science in Context. Enjoy?


From the citizens’ point of view: Small scale and locally anchored models of citizen science
Lorna Heaton, Florence Millerand, Patricia Dias da Silva

Background focusing on large-scale growth of citizen science, usual themes around potential for exploitation. Sees smaller, locally anchored models as productive of new opportunities for meaningful engagement.

Alerta Ambiental: reporting around land-based activities for legal action, and environmental monitoring.

ONEM: species observations in France, basic wiki-based observation form for species of interest. Participant benefit in awareness of local habitat.

Flora Quebeca: knowledge exchange on Quebecois flowering species. Initial concerns around rare species harvesting. Discussions on Facebook around photos of rare species. Lots of learning via moderation. They provide ID keys, quizzes, etc.

Engagement that is specific to localized projects, distinct from larger-scale (so-called…) “decontextualized” projects. Tech mediation but strong local situation around sense of place. Shapes how knowledge is produced. Online and offline interactions are interrelated. Local citizen science supports understanding world nearby, public engagement beyond the local, and tech mediation that complements colocated participation and interaction. Sees online as potentially valuable for inclusion, learning, empowerment.

Q about how it’s science, not just activity.

A: Some of the data were used by researchers.


Citizen Science and Science Citizenship: same words, different meanings?

Alan Irwin

Points to development of ECSA, Fondation Science Citoyennes, explosion of growth. Questions around semantics of the terminology.

Agenda of European Environment Agency is dramatically different from Zooniverse. Many different meanings, term with interesting ability to capture attention.

He was at CSA 2015. The contrast in meanings of the term used there among 600 participants was interesting. Highlights Chris Filardi’s talk: “they picked me up and put me inside their questioning community”. Contrast to Amy Robinson’s talk on Eyewire, with enormous enthusiasm about what they’re doing (NB: one of the coolest keynotes I’ve seen in years). Notes the variation in scale–an intense ethnographic experience on an island vs. 160K people in a gamified environment online. Both are connected to citizen science, but do they have something in common or not?

Yes, in that understandings and knowledge connect with epistemologies. Cites Haklay 2013 levels of participation in citizen science, not critiquing it but noting that it attempts to categorize things that are dramatically different. But “categorizations are not innocent” in how they define the space. Extreme according to whom? Reflects a view from the ivory tower focusing on the human-knowledge interface, and overlooks the organizational work needed to create these systems. Nothing about how resistance can be the substance of it, how it can be a provocation or challenge.

Form (style) is less important than goals: a sense of movement is more valid, moving toward scientific citizenship. What if Eyewirers started asking questions about how the platform and the people there create a type of academic capitalism? What if the relationships on the island were hoovered up, with people treated as standardized sensors? Change can go both ways, and can lead toward richer development.

Concept around scientific citizenship–focused on more controversial areas of science and tech development, raises questions about relationships between knowledge and democracy. Cognitive justice as a keyword. Potential for scientific citizenship via distributed expertise, opening up science to society, practiced engagement, scientific-institutional-citizenship learning?

Potential of citizen science for scientific citizenship: is there evidence of it? Relatively little. More low-level engagement right now. Is the potential there? Yes, but:
1. Citizen science needs to be seen as a challenge, disturbance, or provocation to science, not solely an extension.
2. Questions of control: it can’t always be science-led.
3. Citizen dimensions should be taken as seriously as the science. What’s the model of citizenship and purpose of engagement?
4. Concepts like “epistemic justice” should be brought into the discussion.
5. Institutional learning needs to be addressed in structural terms.
6. Citizen science must be taken in the wider context of sociotechnical relations.

Feels STS can bring important elements to discussion, but right now STS is very marginal to the discussion.

Q: usage of term “citizen” implying both responsibilities and rights.
A: more attention has gone to the scientific perspective than to the question of what we mean by citizenship, what its possible implications are, and whether it could be a way to open things up.


Negotiating the concept of data quality in citizen science

Todd Suomela

RQs: what is discourse around data (quality) in citizen science, how is that negotiated?

Background with dissertation on framing citizen science in journalism, DataONE internship, and data quality panel at CSA conference. Internship announcement came out of a working group, reflecting a perceived need to justify the value of the work. Panel at CSA 2015, with the summary that many projects use multiple mechanisms to influence data quality; methodological iteration is common in developmental stages; methods sections in published papers capture only part of the mechanism decisions made by researchers, e.g., confusion matrices.
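To make the confusion-matrix mechanism mentioned above concrete, here is a minimal sketch of how a project might tabulate volunteer classifications against expert re-identifications and compute per-species accuracy. The species names and data are hypothetical illustrations, not from any project discussed here.

```python
from collections import Counter

def confusion_matrix(expert, volunteer):
    """Count (expert_label, volunteer_label) pairs from paired records."""
    return Counter(zip(expert, volunteer))

def per_class_accuracy(matrix, label):
    """Fraction of expert-labeled `label` records that volunteers matched."""
    total = sum(n for (e, _), n in matrix.items() if e == label)
    correct = matrix.get((label, label), 0)
    return correct / total if total else 0.0

# Hypothetical paired records: expert re-identification vs. volunteer ID
expert    = ["oak", "oak", "maple", "maple", "maple", "birch"]
volunteer = ["oak", "maple", "maple", "maple", "oak", "birch"]

m = confusion_matrix(expert, volunteer)
print(per_class_accuracy(m, "oak"))    # 0.5
print(per_class_accuracy(m, "maple"))  # 0.666...
```

Off-diagonal cells (e.g., oak recorded as maple) are exactly the kind of mechanism detail that, per the talk, rarely survives into published methods sections.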

Theoretical interlude: social worlds and situational analyses. Publics and sciences: responding to consequences of actions and the dependency of science on communication between scientists and the public.

Positional mapping with a split between public and science, and an orthogonal relationship between social worlds: insiders to regulars to tourists to outsiders. Project scientists/staff, educators, external scientists, journalists-writers. Themes include data and visible feedback, and positioning individuals’ work in the bigger picture.

For some, citizen science is a new label for an old thing. Promoting engagement with data in deeper ways is a key goal for many project staff. Visible and rapid feedback makes it easier for volunteers to see the value, and is important in design conversations. Quality is an obsession for insiders working on citizen science, but strangers to this social world, both scientific and public, remain skeptical.

Calls for future work on data quality perceptions among scientists outside of current citizen science communities, links to more work on science studies.

Why Sabotage is Rarely an Issue in Citizen Science

Following a recent Nature editorial, the Citizen Science researcher-practitioner community has been having a lively discussion. Muki Haklay made a great initial analysis of the editorial, and you should read that before continuing on.

OK, now that you know the back story, a related comment from Sam Droege on the cit-sci-discuss listserv included the following observation:

Statistically, to properly corrupt a large survey without being detected would require the mass and secret work of many of the survey participants and effectively would be so complicated and nuanced that it would be impossible to manage when you have such complex datasets as the Breeding Bird Survey.

I agree, and I frequently have to explain this to reviewers in computing, who are often concerned about the risk of vandalism (as seen all over Wikipedia).

Based on a very small number of reports from projects with very large contributor bases—projects that are statistically more likely to attract malcontents due to size and anti-science saboteurs due to visibility—only around 0.0001% of users (if that) are blacklisted for deliberately (and repeatedly) submitting “bad” data.

If we presume that we’re failing to detect such behavior for at least a few more people than the ones we actually catch, say at the level of a couple orders of magnitude, we’d still only be talking about 0.01% of the users, who pretty much always submit less than 0.01% of the data (these are not your more prolific “core” contributors). In no project that I’ve ever encountered has this issue been considered a substantial problem; it’s just an annoyance. Most ill-intentioned individuals quickly give up their trolling ways when they are repeatedly shut down without any fanfare. From a few discussions with project leaders, it seems that each of those individuals has a rather interesting story and their unique participation profiles make their behaviors obvious as…aberrant.
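The back-of-envelope arithmetic above can be written out explicitly. The rates here are the illustrative figures from the text, not measurements from any particular project:

```python
# Illustrative estimate: scale the observed blacklist rate up by an
# assumed undetected-saboteur factor of two orders of magnitude.
caught_rate = 0.0001 / 100        # ~0.0001% of users ever blacklisted
undetected_factor = 100           # assume 100x more slip past detection
est_bad_user_rate = caught_rate * undetected_factor

print(f"{est_bad_user_rate:.2%} of users")  # 0.01% of users
```

Even under that deliberately generous assumption, the estimated saboteur population stays at roughly one user in ten thousand, contributing a similarly tiny share of the data.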

In fact, the way most citizen science projects work makes it unlikely that they would be seen as good targets for malicious data-bombing anyway. Why? For better or worse, a lot of citizen science sites provide relatively little support for social interaction: less access to an audience means they’re not going to get a rise out of people. Those projects that do have vibrant online communities rarely tolerate that kind of thing; their own participants quickly flag such behavior and if the project is well-managed, the traces are gone in no time, further disincentivizing additional vandalism.

From a social psychological standpoint, it seems that the reality of the situation is actually more like this:

  1. convincingly faking scientific data is (usually a lot) more work than collecting good data in the first place;
  2. systematically undermining data quality for any specific nefarious purpose requires near-expert knowledge and skill to accomplish, and people fitting that profile are unlikely to be inclined to pull such shenanigans;
  3. anyone who genuinely believes their POV is scientifically sound should logically be invested in demonstrating it via sound science and good data quality;
  4. most citizen science projects do not reward this kind of behavior well enough to encourage ongoing sabotage, as discussed above; and
  5. as Sam noted, effectively corrupting a large-scale project’s data without detection requires a lot of smarts and more collaboration than is reasonable to assume anyone would undertake, no matter how potentially contentious the content of the project. They’d be more likely to succeed in producing misleading results by starting their own citizen-counter-science project than trying to hijack one. And frankly, such a counter-science project would probably be easy to identify for what it was.

Seriously, under those conditions, who’s going to bother trying to ruin your science?

Citizen Science Data Quality is a Design Problem

I’ve been giving talks for years that boil down to, “Hey citizen science organizers, it’s up to you to design things so your volunteers can give you good data.” I genuinely believe that most data quality issues in citizen science are either 1) mismatched research question and methodology, or 2) design problems. In either case, the onus should fall on the researcher to know when citizen science is not the right approach or to design the project so that participants can succeed in contributing good data.

So it’s disheartening to see a headline like this in my Google alerts: Study: Citizen scientist data collection increases risk of error.

Well. I can only access the abstract for the article, but in my opinion, the framing of the results is all wrong. I think that the findings may contribute a useful summary–albeit veiled–of the improvements to data quality that can be achieved through successive refinements of the study design. If you looked at it that way, the paper would say what others have: “after tweaking things so that normal people could successfully follow procedures, we got good data.” But that’s not particularly sensational, is it?

Instead, the news report makes it sound like citizen science data is bad data. Not so, I say! Bad citizen science project design makes for bad citizen science data. Obviously. (So I was really excited to see this other headline recently: Designing a Citizen Science and Crowdsourcing Toolkit for the Federal Government!)

The framing suggests that the authors, like most scientists and by extension most reviewers, probably aren’t very familiar with how most citizen science actually works. This is also completely understandable. We don’t yet have much in the way of empirical literature warning of the perils, pitfalls, and sure-fire shortcuts to success in citizen science. I suspect a few specific issues probably led to the unfortunate framing of the findings.

The wrong demographic: an intrinsically-motivated volunteer base is typically more attentive and careful in its work. The authors saw this in better results from students in thematically aligned science classes than from general science classes. The usual self-selection that occurs in most citizen science projects drawing volunteers from the general public might have yielded even better results. My take-away: high school students are a special participant population. They are not intrinsically-motivated volunteers, so they must be managed differently.

The wrong trainers and/or training requirements: one of the results was that university researchers were the best trainers for data quality. That suggests that the bar was too high to begin with, because train-the-trainer works well in many citizen science projects. My take-away: if you can’t successfully train the trainer, your procedures are probably too complicated to succeed at any scale beyond a small closely-supervised group.

The wrong tasks: students struggled to find and mark the right plots; they also had lower accuracy in more biodiverse areas. There are at least four problems here.

  1. Geolocation and plot-marking are special skills. No one should be surprised that students had a hard time with those tasks. As discussed in gory detail in my dissertation, pre-marking plots is a much smarter approach; using distinctive landmarks like trail junctions is also reasonable.
  2. Species identification is hard. Some people are spectacularly good at it, but only because they have devoted substantial time and attention to a taxon of interest. Most people have limited skills and interest in species identification, and therefore probably won’t get enough practice to retain any details of what they learned.
  3. There was no mention of the information resources the students were provided, which would also be very important to successful task completion.
  4. To make this task even harder, it appears to be a landscape survey in which every species in the plot is recorded. That means that species identification is an extra-high-uncertainty task; the more uncertainty you allow, the more ways you’re enabling participants to screw up.

On top of species identification, the students took measurements, and there was naturally some variation in accuracy there too. There are a lot of ways the project could have supported data quality, but I didn’t see enough detail to assess how well they did. My take-away: citizen science project design usually requires piloting several iterations of the procedures. If there’s an existing protocol that you can adopt or adapt, don’t start from scratch!

To sum it up, the citizen science project described here looks like a pretty normal start-up, despite the slightly sensational framing of the news article. Although one of the authors inaccurately claims that no one is keeping an eye on data quality (pshah!), the results are not all that surprising given some project design issues, and most citizen science projects are explicitly structured to overcome such problems. For the sharp-eyed reader, the same old message shines through: when we design it right, we can generate good data.