Responding to Reviewers

“Revise and resubmit” is really the best outcome of academic peer review – acceptance for publication as submitted is so rare it may as well not exist, and most papers are genuinely improved through the peer review and revision processes. Generally speaking, an additional document detailing changes must accompany the revised submission, but the conventions for writing these “change logs” are a little opaque because they’re not typically part of the public discussion of the research.

San Antonio Botanical Gardens during CSCW 2013

There are a couple of great examples of change logs from accepted CSCW 2013 papers from Merrie Morris, and I’m offering my own example below as well. It’s no secret that my CSCW 2013 paper was tremendously improved by the revision process. I wrote the initial submission in the two weeks between submitting my final dissertation revisions and graduation. For a multitude of reasons, it wasn’t the ideal timing for such an endeavor, so I’m glad the reviewers saw a diamond in the rough.

My process for making revisions starts with not getting upset about criticism to which I willingly subjected myself – happily, a practice that becomes easier with time and exposure. (If needed, you can substitute “get upset/rant/cry in private, have a glass of wine, cool off, sleep on it, and then come back to it later,” which is a totally valid way to get started on paper revisions too.) Hokey as it sounds, I find it helpful to remind myself to be grateful for the feedback. And that I asked for it.

Then I print out the reviews, underline or highlight the items that need attention, and summarize them in a few words in the margin. Next, I annotate a copy of the paper to identify any passages that are specifically mentioned, and start to figure out where I need to make changes or could implement reviewers’ suggestions. I find these tasks much easier to do on paper, since being able to spread out all the pages around me sometimes helps when working on restructuring and identifying problem points.

During or after that step, I create a new word processing document with a table and fill it in with terse interpretations of the comments, as you’ll see in the example below. In the process, I sort and group the various points of critique so that I’m only responding to each point once. This also ensures that I’m responding at the right level, e.g., “structural problems” rather than a more specific indicator of structural problems.

The actual columns of the table can vary a little, depending on the context – for example, a table accompanying a 30-page journal manuscript revision in which passages are referenced by line number would naturally include a column with the affected line numbers to make it easier for the reviewer to find and evaluate the updated text. In the example below, I made such substantial changes to the paper’s structure that there was no sense in getting specific about section number, paragraph, and sentence.
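
For anyone who prefers to start from something generated rather than a blank document, a few lines of code can scaffold the table as a CSV that any word processor or spreadsheet will open; this is just an illustrative sketch, and the example rows are placeholders rather than real comments:

```python
# Minimal sketch: scaffold a response-to-reviewers table as a CSV file,
# with one row per grouped reviewer comment and an empty Revisions column
# to fill in as changes are made. Example rows are placeholders only.
import csv

comments = [
    ("AC", "No clear research question/s"),
    ("R1", "Weak title"),
    ("R2", "Typos & grammatical errors"),
]

with open("response_to_reviewers.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Reviewer", "Issue", "Revisions"])
    for reviewer, issue in comments:
        writer.writerow([reviewer, issue, ""])
```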

As a reviewer, I’m all for process efficiency; I strongly prefer concise documentation of revisions. At that stage, my job is to evaluate whether my concerns have been addressed, and the documentation of changes should make that easier for me, rather than making me wade through unnecessary detail. Likewise, as an author, I consider it a problem with my writing if I need to include a lengthy explanation of why I’ve revised the text, as opposed to the text explaining itself. That heuristic holds under most circumstances, unless the change defies expectations in some fashion, or runs counter to a reviewer’s comment — which is fine when warranted, and the response to reviewers is the right place to make that argument.

Therefore, the response to reviewers is primarily about guiding the reviewer to the changes you’ve made in response to their feedback, as well as highlighting any other substantive changes and any points of polite disagreement. The persuasive style of CHI rebuttals – the closest parallel practice with which many CSCW authors have experience – seems inappropriate to me in a response to reviewers, because the authors are no longer persuading me that they can make appropriate revisions; they are demonstrating that they have done so. Ergo, I expect (their/my) revisions to stand up to scrutiny without additional argumentation.

Finally, once all my changes are made and my table is filled in, I provide a summary of the changes, which includes any other substantive changes that were not specifically requested by the reviewers, and note my appreciation for the AC/AE and reviewers’ efforts. A jaded soul might see that as an attempt at flattering the judges, but it’s not. I think that when the sentiment is genuine, expressing gratitude is good practice. In my note below, I really meant it when I said I was impressed by the reviewers’ depth of knowledge. No one but true experts could have given such incisive feedback and their insights really did make the paper much better.

——————————

Dear AC & Reviewers,

Thank you for your detailed reviews on this submission. The thoroughness and depth of understanding that is evident in these reviews is truly impressive.

To briefly summarize the revisions:

  • The paper was almost completely rewritten and the title changed accordingly.
  • The focus and research question for the paper are now clearly articulated in the motivations section.
  • The research question makes the thematic points raised by reviewers the central focus.
  • The analytical framework is discussed in more depth in the methods section, replacing less useful analysis process details, and is followed up at the close of the discussion section.
  • The case comparison goes into greater depth, starting with discussion of case selection.
  • The case descriptions and comparison have been completely restructured.
  • The discussion now includes an implications section that clarifies the findings and applicability to practice.

Detailed responses to the primary points raised in the reviews appear below; I hope these changes meet with your approval. Regardless of the final decision, the work has unquestionably benefited from your attention and suggestions, for which I am deeply appreciative.

Reviewer | Issue | Revisions
AC | No clear research question/s | A research question is stated toward the end of page 2.
AC, R1, R3 | Findings are “obvious” | The focus of the work is reframed as addressing obvious assumptions that only apply to a limited subset of citizen science projects, and the findings – while potentially still somewhat obvious – provide a more useful perspective.
AC, R2 | Conclusions not strong/useful | A section addressing implications was added to the discussion.
AC | Improve comparisons between cases | Substantial additional comparison was developed around a more focused set of topics suggested by the reviewers.
AC | Structural problems | The entire paper was restructured.
R1 | Weak title | The title was revised to more accurately describe the work.
R1 | Does not make case for CSCW interest | Several potential points of interest for CSCW are articulated at the end of page 1.
R1 | Needs stronger analytic frame & extended analysis | The analytic framework is described in further detail in the methods section, and followed up in the discussion. In addition, a section on case selection criteria sets up the relevance of these cases for the research question within this framework.
R1 | Quotes do not add value | Most of this content was removed; new quotes are included to support new content.
R1, R3 | Answer the “so what?” question & clarify contributions to CSCW | The value of the work and implications are more clearly articulated. While these implications could well be seen as common sense, in practice there is little evidence that they are given adequate consideration.
R1 | Include case study names in abstract | Rewritten abstract includes project names.
R1 | Describe personally rewarding outputs in eBird | These are described very briefly in passing, but with the revised focus are less important to the analysis.
R2 | Compare organizational & institutional differences | Including these highly relevant contrasts was a major point of revision. A new case selection criteria section helps demonstrate the importance of these factors, with a table clarifying these contrasts. The effects of organizational and institutional influences are discussed throughout the paper.
R2 | Highlight how lessons learned can apply to practice | The implications section translates findings into recommendations for strategically addressing key issues. Although these are not a bulleted list of prescriptive strategies, the reminder they provide is currently overlooked in practice.
R2 | Comparison to FLOSS is weak | This discussion was eliminated.
R2 | Typos & grammatical errors | These errors were corrected; hopefully new ones were not introduced in the revision process (apologies if so!).
R3 | Motivation section does not cite related work | Although the rewritten motivation section includes relatively few citations, they are more clearly relevant. For some topics, there is relatively little research (in this domain) to cite.
R3 | Motivation section does not discuss debated issues | The paper now focuses primarily on issues of participation and data quality.
R3 | Consistency in case description structure | The case descriptions are split into multiple topics, within which each case is discussed. The structure of case descriptions and order of presentation are consistent throughout.
R3 | Include key conclusions about each case with descriptions | The final sentence of the initial description for each case summarizes important characteristics. I believe the restructuring and refocusing of these revisions should address this concern.
R3 | Does not tie back to theoretical framework used for analysis | The Implications section specifically relates the findings back to the analytical framework, now discussed in greater detail in the methods section.
R3 | No discussion of data quality issues | This is now one of the primary topics of the paper and is discussed extensively. In addition, I humbly disagree that expert review is unusual in citizen science (although the way it was conducted in Mountain Watch is undoubtedly unique). Expert data review has been shown to be one of the most common data validation techniques in citizen science.
R3 | No discussion of recruitment issues | Recruitment is now one of the primary topics of the paper and is discussed extensively.
R3 | Introduce sites before methods | The case selection criteria section precedes the methods and includes overview descriptions of the cases. They are also given a very brief mention in the motivation section. More detailed description as relevant to the research focus follows the methods section.
R3 | Do not assume familiarity with example projects | References to projects other than the cases are greatly reduced and include a brief description of the project’s focus.
R3 | Tie discussion to data and highlight new findings | While relatively few quotes are included in the rewritten discussion section, the analysis hopefully demonstrates the depth of its empirical foundation. The findings are clarified in the Implications section.
R3 | Conclusions inconsistent with other research, not tied to case studies, or both | To the best of my knowledge, the refocused analysis and resultant findings are no longer inconsistent with any prior work.


Big Issues in CSCW session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX

Big Issues in CSCW session

——

Ingrid Erickson – Designing Collaboration: Comparing Cases Exploring Cultural Probes as Boundary-Negotiating Objects

Cultural probes around boundary negotiating objects – prompts via Twitter, e.g. “Take a picture of sustainability (post it with #sustainability and #odoi11)” – leveraged existing platforms. Lots of very interesting images came from this. Content not profound, but prompts engendered communication with people on the street, people in teams, dialogues that generated new hashtags besides those requested. Led into a design workshop.

Another instance of using cultural probes with Light in Winter event (in Ithaca, NY). Found that probes have several properties that make them generative.

  • Exogenous: probes act like exogenous shocks to small organizational systems, an interruption to normal practice that requires attention, an initiating mechanism for collaboration.
  • Scaffolding: directed but unbounded tasks; hashtags and drawings act as scaffolds, directed boundary work to prompt engagement, informal structure that supports exploration over accuracy/specificity.
  • Diversity: Outputs improved by diversity – diverse inputs increased value, acted as funnel for diversity to become collective insight.

Think about designing collaboration – taboo topic with inherent implication of social engineering, but we’ve been doing it all along. As designed activities, cultural probes were oblique tasks to invite interpretation and meaning-making, build on exogenous shock value, give enough specificity for mutual direction, salient to context but easy to understand.

Potential to use distributed boundary probes? Online interaction space/s – assemble, display inputs; organize w/ hashtag/metadata; easy way to revise organizational schemes as they are negotiated; allow collaborators to hear thinking-aloud of fellow collaborators; can be designed as a game or casually building engagement over longer periods of time.

—-

Steve Jackson – Why CSCW Needs Science Policy (and Vice-Versa)

CSCW impact means making findings relevant to new and broader communities, and making the work more effective and meaningful in the world.

We’re all used to “implications for design” and maybe even “implications for practice”, but need to start including more “implications for policy” in our work moving forward. Often fail to make connections in a useful way, need to learn from policy and policy research. Not immediately relevant to all CSCW research, but relevant at the higher level. The connections are just underdeveloped relative to potential value.

Particularly important around collaborative science and scientific practices – the Atkins report is a prime example. A separate European trajectory is covered in the paper, along with the history of science policy as it relates to CSCW. The so-called “supercomputing famine” of the 1970s (drew laughs) reflected the ambition of transforming science with technologies. Leading examples – CI generation projects – may also be misleading, as these are the big-money CIs. Ethnographic studies now include up to 250 informants, but all the examples are from MREFC (Major Research Equipment and Facilities Construction) projects.

CSCW & policy gap – institutional tensions in funding, data sharing practices and policies, software provision & circulation.

Social contract with science – support, autonomy, and self-governance in exchange for goods, expertise, and the (applied) fruits of (basic) science – this was the attitude after WWII. Stepping away from the pipeline model, moving toward post-normal science. Identified 3 modes of science, which are culturally specific. Can’t wait to read this paper!

Crowdsourcing session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX

Crowdsourcing session

——

Tammy Waterhouse – Pay by the Bit: Information-theoretic metric for collective human judgment

Collective human judgment: using people to answer well-posed objective questions [RIGHT/WRONG]. Collective human computation in this context – related questions grouped into tasks, e.g. birthdays of each Texan legislator.

Gave example of Galaxy Zoo. Issues of measuring human computation performance. Fast? Encourages poor quality. Better? Percent correct isn’t always useful/meaningful.

Using information entropy – the self-information of a random outcome (the surprise associated with the outcome); the entropy of a random variable is its expected self-information. Resolving collective judgment – the model uses Bayesian techniques. Then looked at the entropy remaining after conditioning on the collected judgments – conditional entropy. Used data from Galaxy Zoo to look at question scheduling; the new approach improved overall performance.
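
As a rough illustration of those quantities (not the paper’s actual model), here is a minimal sketch of self-information and entropy, with the label probabilities invented for the example:

```python
# Minimal sketch of the information-theoretic quantities mentioned above:
# self-information of an outcome and entropy (expected self-information).
# The probabilities below are invented for illustration only.
import math

def self_information(p):
    """Surprise associated with an outcome of probability p, in bits."""
    return -math.log2(p)

def entropy(dist):
    """Expected self-information of a distribution given as {outcome: probability}."""
    return sum(p * self_information(p) for p in dist.values() if p > 0)

# Hypothetical belief about an item's label before any worker judgments...
prior = {"spiral": 0.5, "elliptical": 0.3, "merger": 0.2}
# ...and after conditioning on a worker's answer (again, made-up numbers).
posterior = {"spiral": 0.85, "elliptical": 0.10, "merger": 0.05}

print(round(entropy(prior), 2))      # ~1.49 bits of uncertainty
print(round(entropy(posterior), 2))  # ~0.75 bits remaining after one judgment
```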

——

Shih-Wen Huang – Enhancing reliability using peer consistency evaluation in human computation

Human computation is not always reliable – when tested, many people couldn’t count the nouns in a 15-word list. Without quality control, accuracy was around 70%. Believes quality control is the most important thing in human computation.

Gold standard evaluation: objectively determined correct answer [notably, not always possible]. Favored by researchers but not scalable because gold standard answers are costly to generate.

Peer consistency in GWAPs (games with a purpose): these sometimes use inter-player consistency to reward/score, and the mechanism significantly improves outcomes. Can peer consistency evaluation work as a scalable mechanism? Used Amazon Mechanical Turk (AMT) to test it. Concludes peer consistency is scalable and effective for quality control.
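
The core idea of peer-consistency scoring (sketched here generically, not as the paper’s exact mechanism) is to score each answer by how often other workers on the same item agree with it:

```python
# Minimal sketch of peer-consistency scoring for quality control: a worker's
# answer to an item is scored by the fraction of *other* workers who gave the
# same answer. Toy data for illustration only.
from collections import defaultdict

# answers[item] = {worker: answer}
answers = {
    "item1": {"w1": "7", "w2": "7", "w3": "6"},
    "item2": {"w1": "3", "w2": "3", "w3": "3"},
}

scores = defaultdict(list)
for item, by_worker in answers.items():
    for worker, ans in by_worker.items():
        peers = [a for w, a in by_worker.items() if w != worker]
        agreement = sum(a == ans for a in peers) / len(peers) if peers else 0.0
        scores[worker].append(agreement)

for worker, worker_scores in scores.items():
    print(worker, sum(worker_scores) / len(worker_scores))  # average peer agreement
```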

——

Derek Hansen – Quality Control Mechanisms for Crowdsourcing: Peer Review, Arbitration, & Expertise at FamilySearch Indexing

FamilySearch Indexing is one of the largest crowdsourcing projects around. Volunteers transcribe old records – 400K contributors.

Looked at several models to improve efficiency while reducing added time. Volunteers use a downloaded package to do tasks, so keystroke logging with idle time can be used to evaluate task efficiency. Compared the arbitration process with a simple review. A-B agreement varied by form field. Experienced contributors had better agreement.
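
One way to compute that kind of field-level A-B agreement, sketched with invented records rather than the study’s data:

```python
# Sketch: field-level agreement between two independent transcriptions (A and B)
# of the same records. Field names and values below are invented for illustration.
def field_agreement(transcripts_a, transcripts_b, fields):
    """Fraction of records on which transcribers A and B agree, per field."""
    rates = {}
    for field in fields:
        matches = sum(a[field] == b[field] for a, b in zip(transcripts_a, transcripts_b))
        rates[field] = matches / len(transcripts_a)
    return rates

a = [{"surname": "Smith", "birth_year": "1854"}, {"surname": "Jones", "birth_year": "1882"}]
b = [{"surname": "Smith", "birth_year": "1851"}, {"surname": "Jones", "birth_year": "1882"}]

print(field_agreement(a, b, ["surname", "birth_year"]))
# {'surname': 1.0, 'birth_year': 0.5} -- agreement varies by field, as in the talk
```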

Implications: retention is important – experienced workers faster, more accurate; encourages novices and experts to do more; contextualized knowledge, specialized skills needed for some tasks.  Tension between recruitment and retention with crowdsourcing – assumption that more people makes up for losing an experienced person, which is not always true. In this context it would take 4 new recruits to replace 1 experienced volunteer.

Findings: no need for a second round of review/arbitration – only slight reduction of error and arbitration adds more time (than it’s really worth).

Implications: peer review has considerable efficiency gains, nearly as good quality as arbitration process. Can prime reviewers to find errors, highlight potential problems (e.g., flagging), etc. Integrate human and algorithmic transcription – use algorithms on easy fields integrated with human reviews.

Citizen Science session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
27 February, 2013
San Antonio, TX

Citizen Science session

——

Sunyoung Kim – Sensr

Intro to types of citizen science and the diversity of project types. Common underlying characteristic: using volunteers’ time to advance science. Among the many typologies, projects can be divided by activity type into primarily data collection and data analysis/processing. The focus here is field observation, which has great opportunities for mobile technologies.

Problem is that most citizen science projects are resource-poor and can’t handle mobile technologies on their own. Goal is supporting people with no technical expertise to create mobile data collection apps for their own citizen science projects. Terms used: campaigns – projects, author – person who creates campaign, volunteer – someone who contributes to collecting data/analysis.

Design considerations include: 1) current tech use, similar available tools, and the needs of practitioners. Reviewed 340+ existing projects (campaigns) from scistarter.com and found only 11% provide mobile tools for data collection. Looked at the types of data they’re collecting – primarily location, pictures, and text data entry. 2) Data quality is paramount, and the data also contains personal information. 3) How to recruit volunteers. Looked at similar mobile data collection tools like EpiCollect and ODK. They’re pretty similar in terms of available functionality, but Sensr is the simplest to use. Most comparable platforms are open source, so you need programming skills to make them work (free as in puppies!) – even the term “open source” can be very techie to the target users.

Built Sensr as a visual environment combined with a mobile app for authoring mobile data collection tools for citizen science. The demo video shows setting up a data collection form for “eBird” by picking the fields to include on the form. In just a few steps, it creates a back-end database and a front-end mobile interface. Very straightforward interface for assembling a mobile app for citizen science data collection.

A couple of features: you can define a geographic boundary – it can’t prevent people from outside the boundary from joining (the App Store is global), but it helps users target the correct places. You can also review the data before it becomes publicly viewable or goes into the scientific data set.
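
Sensr itself is authored through a visual interface, so there is no public code API to show; purely as an illustration of the pieces just described, a campaign definition might carry something like the following (all names hypothetical):

```python
# Hypothetical sketch (not Sensr's actual API) of what a campaign definition
# carries: the form fields mentioned in the talk (location, photo, free text),
# a suggested geographic boundary, and a flag for reviewing submissions before
# they become publicly visible.
from dataclasses import dataclass

@dataclass
class Campaign:
    name: str
    fields: list                 # e.g., ["location", "photo", "text:notes"]
    bounding_box: tuple          # (min_lat, min_lon, max_lat, max_lon); a hint, not a hard fence
    review_before_publish: bool = True

backyard_birds = Campaign(
    name="Backyard Bird Count",                 # made-up example campaign
    fields=["location", "photo", "text:species_notes"],
    bounding_box=(29.3, -98.7, 29.7, -98.3),    # roughly San Antonio, for illustration
)
```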

Did case studies to see how nontechnical users did with it, betas with existing projects, before launching tool. Strong enthusiasm for the app, especially for projects with interest in attracting younger participants. Main contribution: Sensr lowers barriers for implementing mobile data collection for citizen science.

Question about native apps versus HTML5 mobile browser apps due to need for cross-OS support.

Question if there’s a way to help support motivation; not the focus in this study. Case study projects didn’t ask for it because they were so thrilled to have an app at all.

——

Christine Robson – Comparing use of social networking and social media channels for citizen science

One of the main questions from practitioners at the Minnowbrook workshop on Design for Citizen Science (organized by Kevin Crowston and me) was how to get people to adopt technologies for citizen science, and how to engage them. These were questions that could be tested, so she ran some experiments.

Built simple platform (sponsored by IBM Research) to address big picture questions about water quality for a local project, and this app development was advised by California EPA. App went global, have gotten data from around the world for 3 years now. Data can be browsed at creekwatch.org, you can also download it in CSV if you want to work on it. “Available on the App Store” button on the website was important for tracking adoption.

Creek Watch iPhone app asks for only 3 data points: water level, flow rate, presence of trash. Taken from CA Water Rapid Assessment survey, used those definitions to help guide people on what to put in the app, timestamped images, can look for nearby points as well. More in the CHI 2011 paper. Very specific use pattern: almost everyone submits data in the morning, probably while walking the dog, taking a run, something like that.

Ran 3 experimental campaigns to investigate mobile app adoption for citizen science.

Experiment #1: Big international press release – listed by IBM as one of the top 5 things that were going to change the world. It’s a big worldwide thing when IBM makes press releases – 23 original news articles were generated, that’s not including republication in smaller venues. Lots of press, could track how many more new users came from it by evaluating normal rate of signups versus post-article signup. +233 users

Experiment #2: Local recruitment with campaign “snapshot day”, driven by two groups in CA and Korea. Groups used local channels, mailing lists, and flyers. +40 users

Experiment #3: Social networking campaign: launched new version of app with new feature, spent a day sending messages via FB and Twitter, guest speaker blog posts, YouTube video, really embedded social media campaign. Very successful, +254 new users.

Signups aren’t the full story – Snapshot Day generated the most data in one day. So if you want more people, go for the social media campaign, but if you want more data, just ask for more data.

Implemented sharing on Twitter and Facebook – simple updates as usually seen in both systems. To track the sharing feature, conversions were tracked with the App Store button. Can’t link a clickthrough to an actual download, only know that they went to iTunes to look at it, but it’s a good conversion indicator. Many more visits resulted from FB than from Twitter. Conversion by social media platform was dramatically different – 2.5x more from FB versus Twitter or the web, which were pretty much the same.
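
The comparison comes down to simple per-channel ratios of referred visits to App Store clickthroughs; with invented counts (not Creek Watch’s actual numbers), the bookkeeping looks like this:

```python
# Sketch of per-channel conversion bookkeeping: visits referred from each channel
# versus clicks on the "Available on the App Store" button. Counts are invented.
referrals = {
    "facebook": {"visits": 400, "store_clicks": 50},
    "twitter": {"visits": 150, "store_clicks": 8},
    "web": {"visits": 300, "store_clicks": 15},
}

for channel, counts in referrals.items():
    rate = counts["store_clicks"] / counts["visits"]
    print(f"{channel}: {rate:.1%} of referred visits clicked through to the App Store")
```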

Effects of these sharing posts over time – posts are transient, almost all of the clicks occur in the first 2-5 hours, after that its effect is nearly negligible. Most people clicked through from posts in the morning, there are also peaks later in the evening when people check FB after work; then next morning they do data submission.

However, social media sharing was not that popular – only 1 in 5 wanted to use the Twitter/FB feature. Did a survey to find out why. The problem wasn’t that they didn’t know about the sharing feature; 50% just didn’t want to use it for a variety of reasons. Conversely, those uninterested in contributing data were happy to “like” Creek Watch and be affiliated with it on Facebook, but also didn’t want to clutter their FB walls with it.

The Facebook campaign was as effective as – or more effective than – a massive international news campaign from a major corporation (though the corporate affiliation may have some effect there), and much easier to conduct. Obviously there are some generalizability questions, but if you want more data, then a participation campaign would be the way to go. The sharing feature shows some promise, but it was also a lot of work for a smaller payoff. With limited resources, it would be more useful to cultivate a Facebook community than to build social media sharing into a citizen science app.