Citizen Science Data Quality is a Design Problem

I’ve been giving talks for years that boil down to, “Hey citizen science organizers, it’s up to you to design things so your volunteers can give you good data.” I genuinely believe that most data quality issues in citizen science are either 1) mismatched research question and methodology, or 2) design problems. In either case, the onus should fall on the researcher to know when citizen science is not the right approach or to design the project so that participants can succeed in contributing good data.

So it’s disheartening to see a headline like this in my Google alerts: Study: Citizen scientist data collection increases risk of error.

Well. I can only access the abstract for the article, but in my opinion, the framing of the results is all wrong. I think that the findings may contribute a useful summary–albeit veiled–of the improvements to data quality that can be achieved through successive refinements of the study design. If you looked at it that way, the paper would say what others have: “after tweaking things so that normal people could successfully follow procedures, we got good data.” But that’s not particularly sensational, is it?

Instead, the news report makes it sound like citizen science data is bad data. Not so, I say! Bad citizen science project design makes for bad citizen science data. Obviously. (So I was really excited to see this other headline recently: Designing a Citizen Science and Crowdsourcing Toolkit for the Federal Government!)

The framing suggests that the authors, like most scientists and by extension most reviewers, probably aren’t very familiar with how most citizen science actually works. This is also completely understandable. We don’t yet have much in the way of empirical literature warning of the perils, pitfalls, and sure-fire shortcuts to success in citizen science. I suspect a few specific issues probably led to the unfortunate framing of the findings.

The wrong demographic: an intrinsically-motivated volunteer base is typically more attentive and careful in their work. The authors saw this in the better results from students in thematically aligned science classes than from students in general science classes. The usual self-selection that occurs in most citizen science projects that draw upon volunteers from the general public might have yielded even better results. My take-away: high school students are a special participant population. They are not intrinsically-motivated volunteers, so they must be managed differently.

The wrong trainers and/or training requirements: one of the results was that university researchers were the best trainers for data quality. That suggests that the bar was too high to begin with, because train-the-trainer works well in many citizen science projects. My take-away: if you can’t successfully train the trainer, your procedures are probably too complicated to succeed at any scale beyond a small closely-supervised group.

The wrong tasks: students struggled to find and mark the right plots; they also had lower accuracy in more biodiverse areas. There are at least four problems here.

  1. Geolocation and plot-marking are special skills. No one should be surprised that students had a hard time with those tasks. As discussed in gory detail in my dissertation, marking plots in advance is a much smarter approach; using distinctive landmarks like trail junctions is also reasonable.
  2. Species identification is hard. Some people are spectacularly good at it, but only because they have devoted substantial time and attention to a taxon of interest. Most people have limited skills and interest in species identification, and therefore probably won’t get enough practice to retain any details of what they learned.
  3. There was no mention of the information resources the students were provided, which would also be very important to successful task completion.
  4. To make this task even harder, it appears to be a landscape survey in which every species in the plot is recorded. That means that species identification is an extra-high-uncertainty task; the more uncertainty you allow, the more ways you’re enabling participants to screw up.

On top of species identification, the students took measurements, and there was naturally some variation in accuracy there too. There are a lot of ways the project could have supported data quality, but I didn’t see enough detail to assess how well they did. My take-away: citizen science project design usually requires piloting several iterations of the procedures. If there’s an existing protocol that you can adopt or adapt, don’t start from scratch!

To sum it up, the citizen science project described here looks like a pretty normal start-up, despite the slightly sensational framing of the news article. Although one of the authors inaccurately claims that no one is keeping an eye on data quality (pshah!), the results are not all that surprising given some project design issues, and most citizen science projects are explicitly structured to overcome such problems. For the sharp-eyed reader, the same old message shines through: when we design it right, we can generate good data.

Case Study Writing Strategies

This is a tale of two approaches I took to writing up case study research based on fieldwork and qualitative coding.

When I started writing up my dissertation case studies, I really had no idea how to do it. I’d read plenty of case studies but never tried to emulate them. I did, however, have a handy-dandy theoretical framework that needed to be worked into the findings.

I had three cases to report and more than enough data. Multiple case studies are typically used for comparative purposes, meaning that not only does this research design require writing up the individual cases, but also a cross-case comparison. I ended up writing four chapters to cover all of that material, with about 184 pages for the three cases and around 50 pages for the cross-case comparison.

I started off by writing up the case that had the most data – might as well get the big one out of the way, right? I wish I’d taken the reverse approach so that I would have saved some work when I found that my first try at writing up a case fell flat!

Method 1: Theoretical Framework Laundry List

I was told to be thorough in my dissertation writing. That may have been a mistake on my advisor’s part, as the final document was over 400 pages long, but I was determined to be as methodical and thorough as I could.

I structured my case description around the theoretical framework that I had developed. I went through every code in my framework and pulled out illustrative quotes that I organized under each heading, and then wrote up what I found for each concept in the framework. Even with rich and interesting empirical data to draw upon, however, it was deadly dull. It turned into a horrific laundry list in which readers became lost, much like one of those freaky hedge mazes you see in horror movies. It was ponderous and really soporific.

Repeating that two more times for the cases? No way. It was extremely slow and laborious writing, jerky and discordant, and there was no way I could meet my writing deadlines with that strategy. Fortunately, my writing group set me straight and offered suggestions of alternative structures. I listened, as one should when others are kind enough to read through drafts of heavy academic material and give thoughtful comment thereupon. Then I started over.

Method 2: Semi-Structured Thematic Template

I started over by cutting the chapter into strips and then physically coding and rearranging them into themes. Suddenly, there was a story and a flow to the material!

[Photo: the first draft of the case study, cut into shreds and reassembled into a new structure.]

It was done in a day. I remembered (just in time) to mark each strip of paper with the page number from which the material originated so that I could find it in the digital document to cut and paste. The process of cutting, pasting, and smoothing over transitions took another couple of days. I had every theoretical concept covered, and the material took on a much more palatable and interesting shape.

As I wrote the next two cases up, I started again with quotes, retrieving them systematically and writing up notes on the insights gleaned from them. Next, I organized them thematically rather than by conceptual framework constructs. It was easy to write the material that connected the quotes into a (mostly) coherent story, and much more interesting as the writing process generated more insights. I actually had fun with a lot of that writing!

I structured each case study chapter to start with sections providing the history and organizational setting of the case, an overview of the technologies and participation processes, and then continued from there with the thematic sections. At the end of each chapter, I included a summary with the main themes from each case and linked the highlights back to the research questions and constructs therein.

The overly-structured approach to writing a case study was painful and frustrating, but going with my intuition (while remaining steadfastly systematic) produced better results much faster. It also reduced repetition from linking concepts together and made those relationships much clearer. I expect every researcher will have to figure out an individual writing strategy, but it’s valuable to remember that the first approach may not be the best, and taking a different tack does not mean throwing out all the work you’ve already done.

The strategy for constructing the case comparison chapter, however, was a different matter entirely and a story for another day.

Qualitative Research: Why Do Participant Observation?

Writing up the case studies for my dissertation research on citizen science has required taking some time to reflect on the experience of doing qualitative research. I used a comparative case study approach, with fieldwork methods that included data collection through interviews, documents, and participant observation.

Participant observation, in particular, is time-consuming and challenging. In retrospect, however, I can’t imagine having done this research without participant observation, particularly for my “intensive” case, eBird. Why?

Here are a couple of snippets from the case study that explain what fieldwork contributed to the study:

“My participant observation in eBird involved birding, monitoring and participating in birding listservs, recording my own usage of eBird over time, and attending meetings at the Cornell Lab of Ornithology. This experience was an integral part of this research. While I am not an ‘average’ eBirder, I match its new user demographics in terms of gender, memberships with related institutions, birding equipment owned, and level of education (nearly a third of new eBird users have a postgraduate degree.) At the time that I began fieldwork, I was younger than most new users of eBird and had no birding experience whatsoever.

Genuine participation in eBird meant that I had to learn how to bird. While I put up a bird feeder in my backyard in February 2009 when I first became interested in citizen science, I could identify only a handful of the most visible species in my area prior to participating in eBird. Learning to bird required a substantial time investment in learning how to identify wild birds, and additional investment in binoculars, field guides, audio recordings of bird calls, and backyard bird feeding supplies. As I developed basic bird identification skills and came to enjoy birding as a pastime for its own merits, I added time (and expense) to my business travels so that I could go birding in new and exotic locations. Field notes related to these birding experiences were made periodically throughout this study.”

What this translates to: birding is hard! It was much more difficult than I initially expected, and a lot more expensive.

“All of these forms of participation and observation contributed to substantially strengthening the research. I experienced the common challenges and triumphs of developing bird identification skills, learned the vocabulary of birding, and developed the same fascination with both birds and keeping lists of them that is typical of birders. Perhaps most telling in this respect, others started to describe me as an “avid birder” and friends began to come to me with questions about birds. It was a transformative experience that provided a deep appreciation of birders’ interests and enthusiasm for eBird. As a fellow birder, I now understand why each new feature elicits such excitement and gratitude from the birding community.

Following multiple email listservs provided a more thorough understanding of the broader context of the birding community and contextualized the community practices that interviewees discussed. In addition, many aspects of the birding experience are universal, and these interactions demonstrated that my birding and eBirding experiences are not unique.

A final benefit of participation was developing a genuine appreciation of the pleasure of birding. My daily life has been enriched by a heightened awareness of birds in my surroundings, and the rewards of birding – and more specifically, eBirding – continue to motivate me to further explore the world around me.”

What this translates to: I understand the context of this case in a way that would have been simply impossible without participant observation. And I had fun with it as well – how could I ask for anything more?

Oh yeah, let’s not forget – I got a postdoc out of it too. Not half bad, plus I have a pretty respectable life list after only a year and a half: 281 species, and counting!

Qualitative Analysis Tools

In part three of my review of software that I use for my academic work, I’m covering that all-time favorite, qualitative analysis tools! I have never seen a topic that gets so many requests for help (mostly from doctoral students) with so few useful answers. So here are a handful of tools that I have found helpful for my dissertation work, which involves qualitative analysis of semi-structured interviews, field notes, and documents.

As always, my main caveat is that these are Mac OS X programs. In fact, almost exclusively. If you’re spending a lot of time with a piece of software, it’s not worth compromising on something that doesn’t behave like a native OS application. And as usual, I tend to favor open source, free, or low-cost options. For the work that I’ve done, the applicable categories include data capture, transcription, coding, and theorizing (which might also apply to some quantitative work, depending on the nature of the beast).

Data Capture

Sometimes you need screenshots. For this, I just use the Mac OS X built-in tool, Grab (it may be under “Utilities” in your Applications folder), which works with keyboard shortcuts – my favorite! However, it grabs TIFFs, which aren’t the friendliest format, and no matter what tool you use, screen captures are almost always 72 dpi, which is not print quality. So I resize to 300 dpi with Photoshop, making sure not to exceed the original file size (interpolated bits look just as bad as low dpi).
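
If you’d rather script that last step than round-trip through Photoshop, an image library can rewrite the DPI metadata without resampling any pixels, which is the whole point of not exceeding the original size. Here’s a minimal sketch using Python and the Pillow library (the file names are made up, and this is just an alternative to the Photoshop route, not the workflow I actually used):

    # Minimal sketch: retag a screenshot at 300 dpi without resampling,
    # so no interpolated pixels are introduced. File names are placeholders.
    from PIL import Image

    img = Image.open("screenshot.tiff")                 # Grab saves TIFF by default
    img.save("screenshot_300dpi.png", dpi=(300, 300))   # same pixels, new DPI metadata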

Sometimes you need to record a whole session of computer-based interaction. For that, nothing rivals Silverback for functionality and cost. It’s pretty cheap, works like a dream, and is great for capturing your own experiences or those of your participants. It uses your Mac’s built-in camera and mics to pick up the image and sound of the person at the keyboard, while logging and displaying keystrokes and mouse clicks. And it doesn’t make you record your settings until the end, so that’s one less thing to screw up when you’re setting up your session. Brilliant! I have to thank the WattBot CHI 2009 student design competition finalists from Indiana State for this discovery, since I never would have thought to look for something like this. I use Silverback to log my own use of online tools for participant observation. It’s really entertaining to watch myself a year ago as I was just starting to use eBird. OK, more like painful. But compared to now, it’s really valuable to have those records of what the experience used to be like.

Transcription

I record all my interviews with a little Olympus digital recorder. It’s probably no longer on the market, but it was about $80 in 2007 and well worth every penny, even though at that time I mistakenly thought that I’d never do qualitative research. It was the second-best product in the Olympus lineup at the time, and it has a built-in USB connector to move the files to a computer. Great. Except that all the files are in WMA format. All2MP3 to the rescue – free software is hard to beat. For a while, I used a different audio converter, but it stopped working with an OS update, and then I found this one. It’s dead simple, and despite the warnings it always gives me about suboptimal formats, it works like a charm, every time.
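
These days, if you’re comfortable at the command line, ffmpeg can handle the same WMA-to-MP3 conversion. This isn’t the All2MP3 workflow I describe above, just an alternative sketch driven from Python, with made-up file names, and it assumes ffmpeg is already installed:

    # Alternative conversion sketch: call ffmpeg to convert a WMA recording to MP3.
    # Assumes ffmpeg is installed and on the PATH; file names are hypothetical.
    import subprocess

    subprocess.run(
        ["ffmpeg", "-i", "interview01.wma", "interview01.mp3"],
        check=True,  # raise an error if the conversion fails
    )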

But once those interviews are converted into a playable format, I still have to transcribe them. It’s good data review, of course, besides being cheaper than hiring someone – depending on your calculations. MacSpeech Dictate (now called Dragon Dictate) is my tool of choice for this task; it’s the Mac equivalent of Dragon Naturally Speaking, for you Windows users out there. Both products are owned by the same company, and you basically shouldn’t waste your time with anything else, because they are the market leader for a reason.

I listen to my audio recordings with earbuds and dictate the interview into the voice recognition software through the included headset. The separate audio and voice systems are truly necessary, because if I can hear myself talking, it distracts me from what I’m dictating. It’s not flawless, but once the software was trained (and so was I), it worked pretty well. The big drawback is that it costs about $200. The big plus is that I went from 4-5 hours of transcription time for each hour of recording to 2-3 hours, and that’s a nontrivial improvement! I have definitely saved enough hours to make it a good deal for the grant that paid for it.

If you’re using dictation software, you have to dictate into some other application. And something has to play your audio files, too. Surprisingly enough (or not?), I have found open source software from IBM that works pretty well: it’s called IBM Video Note Taking Utility. Although it was originally PC-native, I begged the developer to add Mac keyboard shortcuts as well, which he did – awesome!

The software was created for video transcription, but I just use it for audio. It’s very simple: you load up an mp3, it makes a text file, and you can use keyboard shortcuts to skip forward, backward, pause, and speed up or slow down the recording (plus some other stuff I don’t use). There are a couple of quirks, but the price is right and it does exactly what I want without lots of extra confusing stuff going on. Most of my transcription happens at 0.6 times normal speed, so playback alone takes well over an hour and a half per hour of audio; once you factor in correction time, getting through an hour of recording in 2-3 hours means there’s very little overhead beyond the listening itself. It’s just not possible to transcribe at normal speaking speed, because unless you’re a court reporter, you just can’t keep up with what people are saying!

Coding/Annotation

When I first started working on qualitative research, one of my initial tasks was finding coding software that I liked. If you’re not using software for this task, consider joining the digital age. There are better options out there than innumerable 3×5 cards or sticky notes, even if you have to pay for them and spend a little time learning how to use them; the time you save is worth much more than the software costs. After some fairly comprehensive web searching, I was kind of horrified at how bad the options were for Mac-native software. $200 for what? Not much, I’ll tell you that. And from what I’ve seen looking over others’ shoulders, I don’t think the PC offerings are a ton better.

But there was something better than the modernized-HyperCard option I had found, and better than pretty much everything else. And it, too, is open source! TAMS Analyzer has got my back when it comes to qualitative data analysis. It’s super-flexible, has a lot of power for searching, matching, and even visualizing your code sets, and can produce all the same intercoder reliability stats as the pricey licensed software. There’s a bit of a learning curve, but I expect that’s true of any fully-featured annotation software. Plus, there’s a server version with check-in/check-out control, which is awesome if you have multiple coders working on the same texts, and it’s pretty easy to set up (all things considered; you do have to be able to set up a MySQL database). I have barely scratched the surface in terms of using its full capabilities. I’m constantly finding yet another awesome thing it can do, and I learn the functionality as I need it – all the really powerful stuff doesn’t interfere with using it out of the box, so to speak.
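
For anyone curious about what an intercoder reliability statistic actually measures, here’s a toy illustration of Cohen’s kappa, one common agreement measure, computed by hand in Python on made-up codes. The point is just that raw agreement gets discounted by whatever two coders would match on by chance; TAMS and the pricey packages report this kind of statistic for you.

    # Toy illustration of Cohen's kappa, a common intercoder reliability statistic.
    # The codes and excerpts are made up; real coding software computes this for you.
    from collections import Counter

    coder_a = ["motivation", "data_quality", "motivation", "training", "motivation"]
    coder_b = ["motivation", "data_quality", "training", "training", "motivation"]

    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement, estimated from each coder's label frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in labels)

    kappa = (observed - expected) / (1 - expected)
    print(f"Observed agreement: {observed:.2f}, Cohen's kappa: {kappa:.2f}")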

And after you’ve spent some quality time with your coding, the time will come to sort those codes. For this, I use OmniOutliner, another product from the awesome OmniGroup. Once you have a huge heap of codes, the drag-and-drop hierarchical outline functionality is a highly convenient, fairly scalable way to handle getting your codes in order. I’ve done this with note cards, and it’s a big mess, excessively time-consuming by comparison to using digital tools, and wastes a lot of paper that is then hard to store. I also like keeping an “audit trail” of my work, so having the digitally sorted codes (in versioned documents) is a great way to do it.

Theorizing

Ah, theory. That’s what we’re all doing this academic thing for, right? Well, that or fame and glory, and we all know which one of those is more likely.

Everyone has their own way of thinking about this. I draw diagrams. And when I draw diagrams, whether for a poster, paper, or to sort out my own thinking, I use OmniGraffle. I can’t begin to say how awesome this software is, and how much mileage I’ve gotten out of my license cost. Enough that I should pay for it again, that’s how good it is. My husband calls OmniGraffle my “theory software” because when I’m using it, he knows I’m probably working on theory. I find it really useful for diagramming relationships between concepts and thinking visually about abstractions. Depending on the way you approach theorizing, it might be worth a try (free trials!)

So that’s the end of my three-part series on software to support academic work. I hope someone out there finds it useful, and if you do, please give one of these posts a shout-out on your social network of choice. You’ll be doing your academic pals a favor, because we all know that’s how people find information these days. :)

Tools of the Trade: Quantitative Analysis

Following up on my last post about the tools that I prefer for organizing and writing in academic work, today I’m going to review my preferred software for quantitative analysis. Yep, there’s enough that falls under “analysis” to merit two posts. This will be the easier of the two posts to write on analysis tools, because I find that qualitative analysis takes a much more complex assembly of technical tools to support the work.

All of these tools are cross-platform (except the SNA software) so although the view on my Mac OS X screen may look a little different than it would on other platforms, the essential functionality is all the same. Isn’t that nice? So let’s begin with the tool that makes the research world go ’round: Excel.

Yes, Excel is a Microsoft product, which I usually avoid. But it’s so functional that it’s hard to use anything else, and I have extensive experience doing some very fancy tricks with Excel. You know, the “power user” kind of stuff, like PivotTables in linked workbooks with embedded ODBC lookups (yep, fancy!) The simple fact of the matter is that a lot of science is done with Excel, so almost no one doing quantitative research can completely avoid it. However, the advice that I offer when working with a spreadsheet tool for research is:

  1. Keep a running list of the manipulations you’ve done on your data, and embed explanations on your worksheets. It’s way too easy for a worksheet to become decontextualized, leaving you with no idea how you got those results, or why you have two sets of results and which one is right. This is a pain to do, but trust me, keeping a record like this will save your hide at some point.
  2. Take the time to learn how to use named ranges and linked worksheets. This dramatically improves your ability to do data manipulation in a separate worksheet without touching the original copy, meaning you always have the initial version to return to. This is more important than I can possibly emphasize. Don’t mess with your raw data in Excel unless you have another (preferably uneditable) copy elsewhere!
  3. Customize your toolbars for maximum utility if you’re a frequent user. For example, I have added a button on the toolbar for “paste values” because this is a really useful function that doesn’t have an adequate keyboard shortcut, even though I’ve tried to program one. And for that matter, programming custom keyboard shortcuts for commonly used commands is also a really good idea if you use Excel often.
  4. Install the Analysis Toolpak for grown-up statistics. Use the Formula Viewer to understand what the heck is supposed to go into the formulae. I’ve found this helpful for data interpretation on more than one occasion.
  5. VLOOKUP. Learn it. Love it. (A rough scripting equivalent is sketched just after this list.)
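
Since item 5 is easier to show than to describe, here’s a rough sketch of the same lookup pattern outside Excel, using Python and pandas. The file and column names are hypothetical; the point is just the idea of matching each row of your data against a lookup table by a shared key, without ever editing the raw file.

    # Rough pandas equivalent of a VLOOKUP: join a data table to a lookup table
    # on a shared key column. File and column names are hypothetical.
    import pandas as pd

    observations = pd.read_csv("observations.csv")      # raw data, never edited in place
    species_codes = pd.read_csv("species_codes.csv")    # lookup table: code -> species name

    # A left join keeps every observation and adds the matching name (like VLOOKUP)
    merged = observations.merge(species_codes, on="species_code", how="left")
    merged.to_csv("observations_with_names.csv", index=False)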

R is my go-to tool for statistical analysis, including network analysis. If you don’t know R, it’s basically a robust, free answer to the very expensive, time-limited licenses for SAS or SPSS. It can do just about anything you want, and it has a core-and-package structure that lets you download and activate packages at will to do specialized kinds of analysis. R is well supported in the research community, and you’re sure to find a package that does what you need. Like the other major statistical analysis tools, it has its own syntax, but I suspect it’s no harder to learn than the alternatives. R is a great tool, and it hooks into other analysis tools very nicely.

Taverna, for example, is a scientific workflow tool. I’ve used it for replicable, self-documenting, complex data retrieval, manipulation, and analysis routines. I’ve written papers about it and spent time with the myGrid team in the UK helping them evaluate usability. I’m definitely a fan of Taverna, and I found it really useful for the kind of complex secondary data analysis that I worked on for free/libre open source software research. I’ll even be teaching a course this fall on eScience workflow tools, including Taverna.

Protege is an ontology editor. Ontologies aren’t exactly quantitative analysis, but they can be really useful in doing quantitative analysis of large data sets with semantic properties. If for any reason you need to build an ontology, Protege is a really nice tool.

Finally, the ultimate irony – buying proprietary software to run open source software. I use VMWare Fusion to run Windows XP so I can use Pajek for social network analysis. VMWare Fusion is extremely satisfactory software for the purpose and doesn’t cost much; I have been very happy with it. Windows XP is, well, Windows.

Pajek is nothing but ugly, interface-wise, but don’t let that put you off because it does the job well and has a lot of really detailed options for SNA. It has the most insanely deep menus I’ve ever seen, but to be fair, there’s a lot of analytical complexity under the hood. It also does visualizations, but they aren’t the prettiest thing you’ve ever seen. There are a lot of tools that you can choose for SNA, and this software choice reflects the fact that what I usually need is statistics, not pretty pictures. There’s even a great book for learning how to use Pajek – it was worth every penny when I was learning SNA, because it not only shows you how to use the software, but explains the SNA concepts pretty effectively as well.
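
Just to give a concrete sense of what “statistics, not pretty pictures” means here, the sketch below computes a few whole-network measures in Python with networkx rather than Pajek. It’s purely illustrative, run on a built-in example network, and not part of my actual Pajek workflow.

    # Toy sketch of the kinds of SNA statistics I care about, using networkx
    # instead of Pajek (purely illustrative; the karate club graph is a built-in example).
    import networkx as nx

    G = nx.karate_club_graph()

    print("Nodes:", G.number_of_nodes(), "Edges:", G.number_of_edges())
    print("Density:", round(nx.density(G), 3))
    print("Average clustering:", round(nx.average_clustering(G), 3))

    # The five most central actors by degree centrality
    centrality = nx.degree_centrality(G)
    top = sorted(centrality, key=centrality.get, reverse=True)[:5]
    print("Most central nodes:", top)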