Big Issues in CSCW session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX

Big Issues in CSCW session

——

Ingrid Erickson – Designing Collaboration: Comparing Cases Exploring Cultural Probes as Boundary-Negotiating Objects

Cultural probes around boundary-negotiating objects – prompts via Twitter, e.g. “Take a picture of sustainability (post it with #sustainability and #odoi11)” – leveraged existing platforms. Lots of very interesting images came from this. The content wasn’t profound, but the prompts engendered communication with people on the street and people in teams, and dialogues that generated new hashtags besides those requested. Led into a design workshop.

Another instance of using cultural probes with the Light in Winter event (in Ithaca, NY). Found that probes have several properties that make them generative.

  • Exogenous: probes act like exogenous shocks to small organizational systems – an interruption to normal practice that requires attention, an initiating mechanism for collaboration.
  • Scaffolding: directed but unbounded tasks; hashtags and drawings act as scaffolds, directed boundary work to prompt engagement, informal structure that supports exploration over accuracy/specificity.
  • Diversity: Outputs improved by diversity – diverse inputs increased value, acted as funnel for diversity to become collective insight.

Think about designing collaboration – taboo topic with inherent implication of social engineering, but we’ve been doing it all along. As designed activities, cultural probes were oblique tasks to invite interpretation and meaning-making, build on exogenous shock value, give enough specificity for mutual direction, salient to context but easy to understand.

Potential to use distributed boundary probes? Online interaction space(s) – assemble and display inputs; organize with hashtags/metadata; an easy way to revise organizational schemes as they are negotiated; let collaborators hear the thinking-aloud of fellow collaborators; could be designed as a game or as casual engagement-building over longer periods of time.
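
As a thought experiment on what that online space might look like, here is a minimal sketch of organizing probe inputs by hashtag so the scheme stays easy to renegotiate. Everything here (the ProbeResponse structure, function names, sample data) is hypothetical, not from the talk:

```python
# Hypothetical sketch of a distributed boundary-probe space: inputs tagged with
# hashtags, grouped so the organizational scheme can be revised by re-tagging.
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ProbeResponse:
    author: str
    text: str
    image_url: Optional[str] = None
    hashtags: List[str] = field(default_factory=list)

def organize_by_tag(responses: List[ProbeResponse]) -> dict:
    """Group responses under every hashtag they carry; rerunning this after
    collaborators add or rename tags is what keeps the scheme negotiable."""
    buckets = defaultdict(list)
    for r in responses:
        for tag in r.hashtags:
            buckets[tag.lower()].append(r)
    return buckets

inputs = [
    ProbeResponse("ana", "Community garden on 5th St", hashtags=["#sustainability", "#odoi11"]),
    ProbeResponse("ben", "Bike-share station", hashtags=["#sustainability", "#transit"]),
]
for tag, items in organize_by_tag(inputs).items():
    print(tag, [r.author for r in items])
```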

——

Steve Jackson – Why CSCW Needs Science Policy (and Vice-Versa)

CSCW impact means making findings relevant to new and broader communities, and making the work more effective and meaningful in the world.

We’re all used to “implications for design” and maybe even “implications for practice”, but we need to start including more “implications for policy” in our work moving forward. We often fail to make these connections in a useful way and need to learn from policy and policy research. Not immediately relevant to all CSCW research, but relevant at a higher level. The connections are just underdeveloped relative to their potential value.

Particularly important around collaborative science and scientific practices – the Atkins report as a prime example. A separate European trajectory is covered in the paper, along with the history of science policy as it relates to CSCW. The so-called “supercomputing famine” of the 1970s (drew laughs) reflected the ambition of transforming science with technologies. Leading examples – CI-generation projects – may also be misleading, as these are the big-money CIs. Ethnographic studies now include up to 250 informants, but all the projects are examples from MREFC (Major Research Equipment and Facilities Construction) projects.

CSCW & policy gap – institutional tensions in funding, data sharing practices and policies, software provision & circulation.

Social contract with science – support, autonomy, and self-governance in exchange for goods, expertise, and the (applied) fruits of (basic) science – this was the attitude after WWII. Stepping away from the pipeline model, moved toward post-normal science. Identified 3 modes of science, which are culturally specific. Can’t wait to read this paper!

Wikipedia-Supported Cooperative Work session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX

Wikipedia-Supported Cooperative Work session

——

Jonathan Morgan – Tea & Sympathy: crafting positive new user experiences on Wikipedia

New user experiences on Wikipedia are intimidating – unintuitive markup, impersonal support, an overwhelming load of rules. Everything about the new user experience seems to be intended to drive people away. Women’s experiences as a minority – dissatisfaction with abrasive norms, a wish for more collaborative participation – are a good parallel for newbies in general; same issues.

Teahouse: created to meet the needs of new users and help address deficiencies for newcomers. Tried to figure out what they want and how to support that. Newbies are usually hesitant due to concern over humiliation, intimidated by making the first move to socialize when they don’t know the norms.

Teahouse has host profiles and guest profiles, plus Q&A forums. The goal of host profiles is making it more personable, showing that Wikipedians are real people. Also tried to address some usability constraints, but didn’t want it to work fundamentally differently from Wikipedia. Decided not to make people edit a page to ask how to edit a page, however – used a JavaScript widget, which makes it easier while still providing a learning opportunity.

Most of the experience isn’t scripted. Welcome – a personal welcome, then another host gives a tip with a less overwhelming set of links than they would get from a “getting started” comment in Wikipedia proper. Very different sensibility than Wikipedia, and the difference lies in the norms. Set up a list of guidelines for interaction – not enforceable rules – and if they work, the hope is that it’s because they remind Wikipedians what it was like to be a newbie.

In terms of engaging new editors effectively, what they valued about the Teahouse wasn’t usability but sociability. Human-human interactions were more salient. Looked at retention as well: Teahouse guests made more edits on average than non-Teahouse control groups, with more of them becoming high-volume contributors. They stuck around longer and were substantially more active than those who didn’t do Teahouse. Wikipedian hosts enjoyed it too.

Still going strong after a year, looking at using badges, data set available upon request.

——

Aaron Halfaker – Making Peripheral Participation Legitimate: Reader engagement experiments in Wikipedia

Power-law participation, even among the small number of regular editors. Legitimate peripheral participation (LPP) lens – how do newcomers enter communities of practice? Initial new-member tasks should be simple, low-risk (of causing problems in the community), and productive so it’s not useless – same thing as in Bryant et al. 2005, but the participation experience has changed a lot since then.

Newcomers’ first edits are now much more complex, with a higher rate of failure (immediate reversion), and it’s getting worse. Wikipedia is very, very complicated in terms of rules – lots and lots of policies. The verifiability policy alone is a 6-page document.

Tried setting up a new task for newbies that’s much easier to do – added a suggestion box at the bottom of articles for feedback, which will hopefully help editors improve articles and also get new people involved. It’s simple and low-risk, but is it productive? 3 experiments to find out.

RQ1: How do requests for participation affect the quality and quantity of feedback? Some requests were engineered toward reader concerns, some toward editor concerns. Found that engineering the request toward reader concerns boosted the contribution rate by 45% over just asking for a rating, with no loss of utility.

RQ2: How does the prominence of the request affect the quality and quantity of feedback? (Placement at the end of the article versus the top affects visibility.) The more prominent button that scrolled along with the article boosted the feedback contribution rate by 108% (on top of the prior 45%) with no loss of utility – very surprised by this, ran a bunch of confirmatory experiments.

RQ3: How does the presence of the feedback form affect new editor conversion? It could cannibalize primary contributions (edits) but could also be a stepping stone. Found that asking people to edit after feedback submission increased new conversions by 151%, but with a 20% drop in the probability of editing (after?) that first week.
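
A quick back-of-the-envelope on how the RQ1 and RQ2 boosts compound (my arithmetic, not a figure from the paper):

```python
# Back-of-the-envelope arithmetic (mine, not the authors'): RQ1's +45% and RQ2's
# +108% are relative boosts, so combined they multiply rather than add.
baseline_rate = 1.0                            # normalized contribution rate
reader_oriented = baseline_rate * 1.45         # RQ1: +45% from reader-oriented wording
prominent_placement = reader_oriented * 2.08   # RQ2: +108% on top of RQ1
print(prominent_placement)                     # ~3.0x the baseline contribution rate
```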

Conclusions: no tradeoff between quantity and quality of participation; inviting readers to convert increased the rate of new editors but at a lower success rate; need to balance the value of contributions against the cost of moderating them. If that equation is wrong, it won’t work. The article feedback tool is almost being taken down because Wikipedia has no way to moderate the feedback.

——

Stuart Geiger – Using Edit Sessions to Measure Participation in Wikipedia (or: Edit Counts Considered Harmful)

Started with a discussion of cross-methods collaboration – what each of the authors sees when they look at Wikipedia (very funny comparison). Bringing together ethnographic understandings with a more computational approach to iteratively, inductively develop a quantitative measure.

Big difference between how academics evaluate participation and how Wikipedians do. Many Wikipedians think edit counts are inadequate, even harmful – researchers focus on discrete units of work, but those don’t capture the work experience. Need to expand our idea of what counts as work. We measure work in hours in organizational settings – 20 hours of work for SVs, not 20 questions answered or 20 problems solved. Time-based metrics are more relevant.

So, the edit session: graph of Jimmy Wales’ edits over time. Wikipedians edit in short, punctuated bursts. Tasks are therefore segmented; can use the same techniques as counting edits – look at the time between two sequential actions and, when they are less than a cutoff of an hour apart, count that as time spent in an editing session.

Determined the cutoff by looking at a histogram of time between edits. Also found 3 distributions – within-session gaps, between-session breaks, and what Wikipedians call “wiki-breaks” – long-term departures from the community, usually a few months.
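
A minimal sketch of the session-splitting idea as described in the talk. The one-hour cutoff comes from the talk; the function names and the zero-length handling of single-edit sessions are my assumptions, not necessarily the paper’s exact method:

```python
from datetime import datetime, timedelta

CUTOFF = timedelta(hours=1)  # inter-edit gap that starts a new session (from the talk)

def edit_sessions(timestamps, cutoff=CUTOFF):
    """Split a user's edit timestamps into sessions: a new session starts whenever
    the gap to the previous edit exceeds the cutoff."""
    sessions, current = [], []
    for ts in sorted(timestamps):
        if current and ts - current[-1] > cutoff:
            sessions.append(current)
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions

def total_time_editing(timestamps):
    """Sum session lengths (first to last edit; a single-edit session counts as 0)."""
    return sum((s[-1] - s[0] for s in edit_sessions(timestamps)), timedelta())

edits = [datetime(2013, 2, 26, 9, 0), datetime(2013, 2, 26, 9, 20),
         datetime(2013, 2, 26, 14, 0), datetime(2013, 2, 26, 14, 5)]
print(len(edit_sessions(edits)), total_time_editing(edits))  # 2 sessions, 0:25:00
```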

Looking at Wikipedia in terms of time spent editing, there’s actually a lot more growth in participation than edit counts suggest. Total concrete bursts of editing on Wikipedia (not including research, off-wiki interaction, etc.) add up to over 100M hours. That’s over 11,700 human years – greater than the population of Tuvalu and 14 other small countries!

Limits to measures like these: it misses lots of work that doesn’t directly make an edit (e.g., background research), and it’s kind of creepy and invasive, since every interaction in Wikipedia is an edit – it acts a little like a diary study. Similar issues exist elsewhere, but edit sessions contribute to a more holistic measure.

Crowdsourcing session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
26 February, 2013
San Antonio, TX

Crowdsourcing session

——

Tammy Waterhouse – Pay by the Bit: Information-theoretic metric for collective human judgment

Collective human judgment: using people to answer well-posed objective questions [RIGHT/WRONG]. Collective human computation in this context – related questions grouped into tasks, e.g., the birthday of each Texas legislator.

Gave example of Galaxy Zoo. Issues of measuring human computation performance. Fast? Encourages poor quality. Better? Percent correct isn’t always useful/meaningful.

Using information entropy – the self-information of a random outcome (the surprise associated with that outcome); the entropy of a random variable is its expected self-information. Resolving collective judgment – the model uses Bayesian techniques. Then looked at the entropy remaining after conditioning on the collected information – conditional entropy. Used data from Galaxy Zoo to look at question scheduling; the new approach improved overall performance.
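
For reference, a small sketch of the information measures involved – standard definitions of entropy and conditional entropy, not the authors’ Bayesian model; the Galaxy-Zoo-style labels are just illustrative:

```python
import math
from collections import Counter

def entropy(probs):
    """H(X) = -sum p * log2(p): the expected self-information, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(X|Y) = H(X,Y) - H(Y), for a dict {(x, y): p(x, y)}."""
    h_xy = entropy(joint.values())
    p_y = Counter()
    for (x, y), p in joint.items():
        p_y[y] += p
    return h_xy - entropy(p_y.values())

# Illustrative binary question: truth X vs. one worker's answer Y.
joint = {("smooth", "smooth"): 0.4, ("smooth", "spiral"): 0.1,
         ("spiral", "smooth"): 0.1, ("spiral", "spiral"): 0.4}
print(entropy([0.5, 0.5]))         # 1.0 bit of uncertainty about X a priori
print(conditional_entropy(joint))  # ~0.72 bits remain after seeing the answer
```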

——

Shih-Wen Huang – Enhancing reliability using peer consistency evaluation in human computation

Human computation is not reliable – when tested, many people couldn’t count the nouns in a 15-word list. Without quality control, workers had 70% accuracy. Believes quality control is the most important thing in human computation.

Gold standard evaluation: objectively determined correct answer [notably, not always possible]. Favored by researchers but not scalable because gold standard answers are costly to generate.

Peer consistency in GWAPs (games with a purpose): sometimes use inter-player consistency to reward/score. The mechanism significantly improves outcomes. Using peer consistency evaluation as a scalable mechanism – can it work? Used AMT to test it. Concludes peer consistency is scalable and effective for quality control.
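
A minimal sketch of what a peer-consistency check can look like – scoring each worker by agreement with the peer majority on redundantly assigned tasks. This is my simplification for illustration, not the paper’s exact mechanism:

```python
from collections import Counter

def peer_consistency_scores(answers_by_task):
    """answers_by_task: {task_id: {worker_id: answer}}.
    Score each worker by how often their answer matches the majority of their
    peers (computed excluding that worker), with no gold-standard labels needed."""
    agree, total = Counter(), Counter()
    for answers in answers_by_task.values():
        for worker, answer in answers.items():
            peers = [a for w, a in answers.items() if w != worker]
            if not peers:
                continue
            majority, _ = Counter(peers).most_common(1)[0]
            agree[worker] += int(answer == majority)
            total[worker] += 1
    return {w: agree[w] / total[w] for w in total}

answers = {
    "t1": {"w1": 4, "w2": 4, "w3": 5},
    "t2": {"w1": 2, "w2": 2, "w3": 2},
}
print(peer_consistency_scores(answers))  # w3 scores lower: disagrees with peers on t1
```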

——

Derek Hansen – Quality Control Mechanisms for Crowdsourcing: Peer Review, Arbitration, & Expertise at FamilySearch Indexing

FamilySearch Indexing is one of the largest crowdsourcing projects around. Volunteers transcribe old records – 400K contributors.

Looked at several models to improve efficiency while reducing added time. Volunteers use a downloaded package to do tasks, so keystroke logging with idle time can be used to evaluate task efficiency. Compared the arbitration process with a simple peer review. A-B agreement varied by form field. Experienced contributors had higher agreement.
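
A rough sketch of how per-field A-B agreement could be computed from double-keyed transcriptions (the field names, sample data, and exact-match criterion are my assumptions):

```python
def field_agreement(pairs):
    """pairs: list of (transcription_a, transcription_b) dicts keyed by form field.
    Returns the fraction of records where A and B agree exactly, per field."""
    counts, matches = {}, {}
    for a, b in pairs:
        for form_field in a.keys() & b.keys():
            counts[form_field] = counts.get(form_field, 0) + 1
            matches[form_field] = matches.get(form_field, 0) + int(a[form_field] == b[form_field])
    return {f: matches[f] / counts[f] for f in counts}

pairs = [
    ({"surname": "Garza", "birth_year": "1887"}, {"surname": "Garza", "birth_year": "1881"}),
    ({"surname": "Ames",  "birth_year": "1902"}, {"surname": "Ames",  "birth_year": "1902"}),
]
print(field_agreement(pairs))  # e.g. surname: 1.0, birth_year: 0.5
```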

Implications: retention is important – experienced workers are faster and more accurate; encourage novices and experts to do more; contextualized knowledge and specialized skills are needed for some tasks. Tension between recruitment and retention with crowdsourcing – the assumption that more people makes up for losing an experienced person is not always true. In this context it would take 4 new recruits to replace 1 experienced volunteer.

Findings: no need for a second round of review/arbitration – only slight reduction of error and arbitration adds more time (than it’s really worth).

Implications: peer review has considerable efficiency gains, nearly as good quality as arbitration process. Can prime reviewers to find errors, highlight potential problems (e.g., flagging), etc. Integrate human and algorithmic transcription – use algorithms on easy fields integrated with human reviews.

Citizen Science session, CSCW 2013

ACM Conference on Computer Supported Cooperative Work and Social Computing
27 February, 2013
San Antonio, TX

Citizen Science session

——

Sunyoung Kim – Sensr

Intro to types of citizen science and the diversity of project types. Common underlying characteristic: using volunteers’ time to advance science. Many typologies; projects can be divided by activity type into primarily data collection and data analysis/processing. Focus here is field observation, which has great opportunities for mobile technologies.

The problem is that most citizen science projects are resource-poor and can’t handle mobile technologies on their own. The goal is supporting people with no technical expertise in creating mobile data collection apps for their own citizen science projects. Terms used: campaign – a project; author – the person who creates a campaign; volunteer – someone who contributes data collection/analysis.

Design considerations include: 1) current technology use, similar available tools, and practitioners’ needs. Reviewed 340+ existing projects (campaigns) from scistarter.com and found only 11% provide mobile tools for data collection. Looked at the types of data they’re collecting – primarily location, pictures, and text entry. 2) Data quality is paramount, and the data also contains personal information. 3) How to recruit volunteers. Looked at similar mobile data collection tools like EpiCollect and ODK. They’re pretty similar in terms of available functionality, but Sensr is simplest to use. Most comparable platforms are open source, so you need programming skills to make them work (free as in puppies!) – even the term “open source” can be very techie for the target users.

Built Sensr as a visual environment combined with a mobile app for authoring mobile data collection tools for citizen science. The demo video shows setting up a data collection form for “eBird”: pick the fields to have on the form, and in just a few steps it creates the back-end database and front-end mobile interface. Very straightforward interface for assembling a mobile app for citizen science data collection.
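
The campaign an author assembles seems to reduce to a small schema along these lines – a hypothetical sketch of the idea, not Sensr’s actual data model:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class FormField:
    name: str
    kind: str  # "location", "photo", or "text" - the data types most projects needed

@dataclass
class Campaign:
    name: str
    author: str
    fields: List[FormField]
    # Optional bounding box (min_lat, min_lon, max_lat, max_lon); advisory only,
    # since the App Store is global and out-of-area volunteers can still join.
    boundary: Optional[Tuple[float, float, float, float]] = None
    moderate_before_publish: bool = True  # author reviews data before it goes public

# Hypothetical example campaign, not one of the case studies from the talk.
birds = Campaign(
    name="Backyard bird sightings",
    author="local-naturalist",
    fields=[FormField("where", "location"), FormField("photo", "photo"), FormField("notes", "text")],
    boundary=(42.3, -76.7, 42.6, -76.3),
)
print(birds.name, [f.kind for f in birds.fields])
```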

A couple of features: authors can define a geographic boundary, but can’t prevent people from outside the boundary from joining (the App Store is global) – you can, however, help users target the correct places. Authors can review the data before it is publicly viewable or goes into the scientific data set.

Did case studies to see how nontechnical users did with it – betas with existing projects – before launching the tool. Strong enthusiasm for the app, especially from projects interested in attracting younger participants. Main contribution: Sensr lowers barriers to implementing mobile data collection for citizen science.

Question about native apps versus HTML5 mobile browser apps due to need for cross-OS support.

Question about whether there’s a way to help support motivation; not the focus of this study. Case study projects didn’t ask for it because they were so thrilled to have an app at all.

——

Christine Robson – Comparing use of social networking and social media channels for citizen science

One of the main questions from practitioners at the Minnowbrook workshop on Design for Citizen Science (organized by Kevin Crowston and me) was how to get people to adopt technologies for citizen science, and how to engage them. These were questions that could be tested out, so she ran some experiments.

Built a simple platform (sponsored by IBM Research) to address big-picture questions about water quality for a local project; the app development was advised by the California EPA. The app went global and has gotten data from around the world for 3 years now. The data can be browsed at creekwatch.org, and you can also download it as CSV if you want to work on it. The “Available on the App Store” button on the website was important for tracking adoption.

The Creek Watch iPhone app asks for only 3 data points: water level, flow rate, and presence of trash. These were taken from the CA Water Rapid Assessment survey, and those definitions were used to help guide people on what to put in the app; images are timestamped, and you can look for nearby points as well. More in the CHI 2011 paper. Very specific use pattern: almost everyone submits data in the morning, probably while walking the dog, taking a run, something like that.

Ran 3 experimental campaigns to investigate mobile app adoption for citizen science.

Experiment #1: Big international press release – listed by IBM as one of the top 5 things that were going to change the world. It’s a big worldwide thing when IBM makes press releases – 23 original news articles were generated, not including republication in smaller venues. Lots of press; could track how many new users came from it by comparing the normal rate of signups with post-article signups. +233 users

Experiment #2: Local recruitment with a “snapshot day” campaign, driven by two groups in CA and Korea. The groups used local channels, mailing lists, and flyers. +40 users

Experiment #3: Social networking campaign: launched a new version of the app with a new feature, spent a day sending messages via FB and Twitter, guest speaker blog posts, a YouTube video – a really embedded social media campaign. Very successful: +254 new users.

Signups aren’t the full story – Snapshot Day generated the most data in one day. So if you want more people, go for the social media campaign, but if you want more data, just ask for more data.

Implemented sharing on Twitter and Facebook – simple updates as usually seen in both systems. Tracked the sharing feature – conversions tracked with the App Store button. Can’t link a clickthrough to an actual download; we just know that they went to iTunes to look at it, but it’s a good conversion indicator. Lots more visits resulted from FB than Twitter, and a lot more visitors in general from FB as a result. Conversion by social media platform was dramatically different – 2.5x more from FB versus Twitter or the web, which were pretty much the same.
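
A tiny sketch of the conversion comparison being described; the visit and click counts below are invented, and only the roughly 2.5x Facebook advantage comes from the talk:

```python
# Hypothetical conversion-by-channel tracking: site visits by referral source vs.
# clicks on the "Available on the App Store" button. Counts are invented.
visits = {"facebook": 1000, "twitter": 800, "web": 1200}
appstore_clicks = {"facebook": 50, "twitter": 16, "web": 24}

conversion = {src: appstore_clicks[src] / visits[src] for src in visits}
for src, rate in sorted(conversion.items(), key=lambda kv: -kv[1]):
    print(f"{src}: {rate:.1%}")
# facebook: 5.0%, twitter: 2.0%, web: 2.0%  -> roughly the reported 2.5x gap
```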

Effects of these sharing posts over time: posts are transient – almost all of the clicks occur in the first 2-5 hours, and after that the effect is nearly negligible. Most people clicked through from posts in the morning; there are also peaks later in the evening when people check FB after work, and then the next morning they do data submission.

However, social media sharing was not that popular – only 1 in 5 wanted to use the Twitter/FB feature. Did a survey to find out why. The problem wasn’t that they didn’t know about the sharing feature; 50% just didn’t want to use it, for a variety of reasons. Conversely, those uninterested in contributing data were happy to “like” Creek Watch and be affiliated on Facebook, but also didn’t want to clutter their FB wall with it.

The Facebook campaign was as effective as – or more effective than – a massive international news campaign from a major corporation (though the corporate affiliation may have some effect there), and much easier to conduct. Obviously there are some generalizability questions, but if you want more data, then a participation campaign would be the way to go. The sharing feature shows some promise, but it was also a lot of work for a smaller payoff. With limited resources, it would be more useful to cultivate a Facebook community than to build social media sharing into a citizen science app.