Citizen Science & Health Data Donation: Health Data Exploration Project 2016

This week I had the privilege of joining the Health Data Exploration network meeting as an invited speaker. I found it a really interesting and eye-opening experience, since biomedical and health domains are not the most common focus in citizen science and so were rather new to me. The interaction of academic and clinical practice, and the implications for opening up participation, make for a notably different paradigm than in some other sciences.

My talk was (very) well received, and I had a lot of really great conversations. My notes are pasted below, unembellished and minimally corrected, to give the broader citizen science community some insight into what the movers and shakers in the health data community are discussing as it relates to engaging the public. Enjoy!


Health Data Exploration Network, May 17, 2016

“Projects that Have Advanced the Use of Personal Health Data for Research”

Stephen J Downs, Robert Wood Johnson Foundation
mPower Parkinson's study with Apple ResearchKit – returning data to contributors was important & empowering. Sadly, some of the controls in the study turned out to have Parkinson's symptoms too.

Emil Chiauzzi, PatientsLikeMe
Multiple Sclerosis study, small sample, using wearables to manage disease – how do people use data? Behavioral adaptations, day-to-day management of disease. Lots of tacit knowledge that isn't captured, e.g., schedule shifting to avoid impact of heat on MS symptoms. Planning for rest time after more exhausting days. Avoiding heat with environmentally controlled settings. Pacing themselves, reducing activity intensity strategically. Developed a course on how to use the data from a wearable, with the concept of a "sweet spot" or targeted activity level, contingent on the conditions.
Implications for data donation/cit sci: consider role of data within broader context of individuals’ uses, provide tools for action. Tools for wellness not always applicable for disease communities. Learn about “health hacks” that people adopt. Implications for precision medicine: put out survey about precision medicine, lots of people unaware, patients not included in the set-up.

Eric Hekler, ASU
Proximity sensors for adaptive interventions. Goal is just-in-time intervention to nudge toward positive or away from negative actions. Also looking into receptivity, at what moment is someone receptive to your message? iBeacon sensors: Bluetooth device tracks proximity & user accesses some aspect of it with an app, being used in retail, they are developing other uses, try it and donate data. Questions from this data: styles of participation, influence of program, re-identification behavior, predictions of future collaborations, etc. Interesting questions about data donation practices as well: how do you feel about the data collection, value that should be returned, who do you trust with this kind of data, what are ethics issues, are you willing to re-identify, locations of proximity sensors. Planning to validate measurements with users, did we measure the right things, is this useful or meaningful?
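(A quick aside from me, not from the talk: the field names, calibration constants, and UUID below are generic assumptions rather than the Hekler team's actual pipeline, but they sketch the kind of proximity record an iBeacon-based app collects and might ask a user to donate. A beacon advertises a UUID/major/minor triple; the receiving app estimates distance from signal strength.)

```python
def estimate_distance(rssi_dbm, tx_power_dbm=-59, path_loss_exponent=2.0):
    """Rough distance estimate (meters) from a beacon's received signal
    strength, using the log-distance path-loss model. tx_power_dbm is the
    calibrated RSSI at 1 m; both constants vary by hardware and environment."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))

# A hypothetical proximity event the app might log before asking the user
# whether to donate it: which beacon was seen, how strong, and roughly how far.
sighting = {
    "beacon": {"uuid": "f7826da6-4fa2-4e98-8024-bc5b71e0893e",  # illustrative UUID
               "major": 1, "minor": 42},
    "rssi_dbm": -72,
    "estimated_distance_m": round(estimate_distance(-72), 1),
}
print(sighting)
```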

Michelle DeMooy, Center for Democracy & Technology
Working with Fitbit on internal R&D and ethics. Very interesting work. Report coming out tomorrow. Have a human subject researcher on the team, not just looking at data but helping provide context. More public-private partnerships are needed, making the debates public could be valuable, Fitbit wants to do the right thing but is stuck in a paradigm that makes it difficult.

Question about Fitbit adopting HIPAA – is that really a win? The privacy and security standards are pretty good, gives them a standard to work to and check against, which is a victory. That didn't exist before; many wearable companies have nothing like it.

Julie Kientz, UW
Research on alertness: impacts of body clock, time of day, stimulant intake. Self-reported fatigue, sleep diaries, psychomotor vigilance task reaction time. Alertness varies over time, different patterns for different circadian rhythms. Also saw rhythms in app use by app category, over time, very interesting and logical. Wednesday was low productivity day, people hit a wall. With enough sleep, people used 61% more productivity apps; with inadequate sleep, 33% more entertainment apps. Late use of phone correlates with sleep disturbance. Future work – circadian-aware tech for planning and scheduling. Automated sensing doesn't tell you everything, need the human-in-the-loop for understanding intent and meaning with as little burden as possible. Also notion of reward: forced reflection, we collect data and never look at it, put markers on your day and then annotate later, this is a benefit.


Barbara Evans, University of Houston Law Center, commissioned paper
Consumer-Driven Data Commons: Health Data and the Transformation of Citizen Science

Data ownership: uses the sense of owning one's airline seat space to demonstrate the tendency to feel like we own something that we don't. Feelings of data ownership are strong & intense, but legally we don't own our data, and if we did, it would be different from owning a house. Most of our data are under a regime of shared control. Data ownership has been debated too long: the critical issue is control & access.

"People-driven data commons" – uses the term commons in the sense of natural resources, e.g. work by Elinor Ostrom. It's not the data but the institutional arrangements for data management and stewardship. Set of rules and arrangements to allow collective self-governance. Granular consent won't get us where we need to be: acting autonomously gives power to individuals, but collective action is needed to achieve the outcomes, and collections of individuals acting separately can't act together.

Normative ethics not admissible as testimony. Can get the data from either consumers or data holders. People-driven data commons means working with consumers so that people get their own data and bring it together. 2×2 table of individuals' consent to data use versus data holders' willingness to share data. So far we have neglected quadrant 3, which gives people access in order to contribute data.
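(To make that 2×2 concrete, this is my reconstruction; only the "quadrant 3" label above comes from the talk, the rest of the layout is assumed. Where individuals consent and data holders are willing to share, sharing runs through existing institutional channels. Where individuals do not consent but holders are willing, use proceeds without individual buy-in. Where neither consents, nothing is shared. The neglected quadrant is where individuals are willing but holders are not set up to share, which is exactly where giving people access to their own records lets them contribute the data themselves.)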


“Personal Data Donation: Barriers and Paths to Success”

Anne Waldo, Waldo Law Offices: You can’t share data you can’t access–HIPAA. Rights to access data even include sending via unsecure email, but providers are not fully compliant. www.getmyhealthdata.org Providers are suspicious if you request records: are you leaving me or suing me? Refusal to use email despite the law requiring it, up to $600 (or $6K) to get records, ridiculous fees, and without estimates of cost until data are delivered. Data only available as PDF or on paper, not very usable.

Jason Bobe, Icahn Institute: health is the #3 priority for people worldwide, but people don't participate in organized health research: why? How do we overcome that and provide a good model for participating in health research? Really important for rare traits, e.g., "resilience" genes; need millions of people's records to find the ones who have this, where the medical prediction is they would be sick, but they're healthy. Founded DIYbio, which has exploded since he started it. Key insight is not the $1K genome sequence, but the $1K genome sequencer. At the time, if you wanted sequencing, you were beholden to organizations that wouldn't let you get your own data. Roles are changing, it's no longer binary: not just participants and researchers, participants can be the researchers. This influences research governance. Harvard Personal Genome Project: making genome data available as a public benefit and resource. Question at the time was whether people would share; the answer is yes. Variable across the population, but people will share sensitive data even publicly; treating people as collaborators from the start was key, important to invoke reciprocity. Benefits beyond altruism and social good: people find meaning, and community, and education in new roles. Retrofitting sharing is too hard, have to completely rework it.

Nick Anderson, UC Davis: The assumption of precision medicine means we need lots of heterogeneous data of all kinds, not just genomic. Acceptance of the data sources is different. Value of data is differential; some uses are, if not accepted, at least understood, e.g., NSA. Health data is different; how does a million people contributing all their data shift the question of when data acquired for one purpose is used for another? Even though promised to be for the benefit of all, it will probably only benefit a few to begin with. Social benefit is currently lightweight, hard to understand; we're not just asking people to change patterns of how we do things, but to use that for things we'll probably never be involved in. How is precision medicine going to shift the discussion? No clear acknowledgement of secondary purposes for data use in some systems, but we're planning to do that anyway.

Camille Nebeker, UCSD: people were more thoughtful about sharing when she asked. They wanted to be motivated: for there to be a powerful purpose, for it to make a difference, and to know what it would be used for. It would be a motivator if it was of value to the community they were from, ethnicity or patient community. They wanted to have control over sharing, give permission and know who they're sharing with. Commercial entities may be less likely to get data than academics. GINA's genetic data protections are important, protecting identity and preventing discrimination. Trust is important: they would readily give data if they trusted the person. De-identification was key, odd since this group should know it's hard to do, but they still believed it could be done. Loopholes, for-profit use, rights, etc. are all barriers; data use policies impose a huge cognitive burden, and removing that will increase trust.

Aaron Coleman, Fitabase: sometimes you're in a data sharing experiment and don't know it. Understanding granularity of data and what can be inferred from that, for example, an employer-based study: do you want them to know your sleep patterns? Do you want them knowing activity patterns that may suggest health cost differences? Once you dig into APIs, trying to pull data out, the barriers pile up. The more granular and frequent the data you want, with more scope, the harder it is to get. Permissions for access to data are now getting more specific; we need the granularity and options to turn things off and revoke permission when we're no longer comfortable with their data use.
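(An aside from me, not the panel: to make the "barriers pile up" point concrete, here is a minimal sketch of what pulling one day of step data from a wearable vendor's REST API tends to look like. The endpoint, scopes, and token handling are hypothetical stand-ins, not any specific vendor's actual API.)

```python
import requests  # third-party: pip install requests

API_BASE = "https://api.example-wearable.com"  # hypothetical vendor endpoint
ACCESS_TOKEN = "user-granted-oauth-token"      # obtained through an OAuth consent flow

def get_daily_steps(date_iso):
    """Fetch one day of minute-level step counts. Each extra need (finer
    granularity, longer history, more data types) typically means another
    scope the user must grant and another rate-limited endpoint to page through."""
    resp = requests.get(
        f"{API_BASE}/v1/users/me/activities/steps",
        params={"date": date_iso, "granularity": "1min"},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(get_daily_steps("2016-05-17"))
```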

Waldo: New guidance from HHS: you have the right to a paper copy of records, to an electronic copy if they're in that form, and to send them to a 3rd party of your choice regardless of who it is, and they can't ask you why. You have a right to get records by unsecure email, though they have to warn you about the security issue. They have to give it in the form/format you request as long as it is reasonably producible in that form. Right to a standing request. Within 30 days max, and should be almost instantaneous or very prompt. Providers can't demand you come to the office, require you to use a portal, or use their own authorization form: an access request is different from an authorization form. Can't deny the request if a bill is unpaid, can't deny if they don't think a certain recipient should get it. Fees still exist: a reasonable cost-based fee, only for copying and mailing, nothing related to the processing or storage or retrieval. Illegal to charge for download format or page fees for electronic copies, and they must give up-front cost estimates. Best practice is to charge nothing.

These have been rights for a long time; is there any indication they will be enforced? HHS has been saying this clearly at conferences. There has been a grace period on enforcement action since the policy was released.

Question on the issue of de-identification: how close are we to that being impossible? News article recently about Stanford profs getting phone records and being able to find the people. Ethnographic work: still hard to obscure who a family is to anyone who knows them. One of the issues with precision medicine is that you can do tangential re-identification. In the technical world, it's only as good as HIPAA can make it. Movement toward considering the probability of re-identification, why, and can we quantify that to make it easier to communicate that risk? Right now the probability is pretty low. Need to educate people; from an IRB member perspective, quantify the risk of harm for a study and tell them that in the consent form. If we could quantify the likelihood of re-identification and start moving away from guarantees of anonymity, then people start to understand and we become more transparent.
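(My illustration, not something the panel endorsed: one common way to put a number on re-identification risk is the k-anonymity framing. If a released record is indistinguishable from k-1 others on its quasi-identifiers, a naive estimate of the re-identification probability is 1/k, which is the kind of figure one could put in a consent form. The sample records are invented.)

```python
from collections import Counter

# Invented sample of released records, quasi-identifiers only.
records = [
    {"zip3": "537", "age_band": "30-39", "sex": "F"},
    {"zip3": "537", "age_band": "30-39", "sex": "F"},
    {"zip3": "537", "age_band": "30-39", "sex": "F"},
    {"zip3": "921", "age_band": "60-69", "sex": "M"},
]

def reidentification_risk(records, quasi_identifiers):
    """Estimate per-record re-identification risk as 1/k, where k is the size
    of each record's equivalence class on the chosen quasi-identifiers."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    return [1 / class_sizes[k] for k in keys]

risks = reidentification_risk(records, ["zip3", "age_band", "sex"])
print(risks)       # 1/3 for each of the three matching records, 1.0 for the unique one
print(max(risks))  # the worst-case figure one might disclose to participants
```

This naive 1/k ignores what an attacker may already know from outside data, which is part of why the expert-determination standard discussed below insists on "very small, not zero" rather than treating any single number as a guarantee.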

Important to spread word about new guidance from HHS, this is huge culture change. Some safety in numbers, this is a tipping point. Feels like change agents, nice to be together with others who rock the boat the same way.

In the policy and technical community, looking at linkability versus identifiability. Different paradigm for protections. Safe Harbor requires removing 18 identifiers, the last of which is a catch-all: "any other thing that would make it clear to someone else who that is." Have to scrub it harder for case studies, maybe even change facts that aren't meaningful. Expert statistician method: risk has to be very small, not zero. Should stick to the standard of very small risk, because we deal with small risks in every other part of life.

Disconnect in talking about these precision medicine studies recruiting a million people, and many others like them. The question is whether we are ready for this scale; seems like there's a huge disconnect. Jason says the PM initiative is bold, shooting for a platinum record without having produced a single song yet. But there are ways to cheat, e.g., the Million Veteran Program. He sees the mental model as the bigger barrier: people think medical research is for when you're sick and need experimental treatment. Need to shift perspective to research being a diverse experience, such that people seek out studies that meet their tastes. Danger of promising equal benefit to everyone; that's not likely, so scale is going to be an issue.


Patti Brennan, UW-Madison
Citizen Engagement: Informatics in the service of health

4th director of NLM (upcoming). Goal of health for all, in part through data donation. Think of "citizens" broadly, not just the people you know or people of privilege. We need everyone engaged. Citizens, not patients, is an important framing for health services. People need rights to advocate for themselves; rights and responsibilities are associated with citizenship, much more so than with engagement. Citizens as people of the world who engage, take part in the world.

Identifying health data life cycle–origination points for data, and also the points in between clinical treatment. Assertions:
1. Citizens are sometimes patients, but always citizens
2. Citizen engagement improves health
3. Citizen science provides a data-driven model to guide the next-generation of technical innovations, clinical practice, and biomedical & health knowledge

Needs to drive informatics in service of health. Emergence of new tools and tech, etc.

Engagement: participation by multiple parties achieving mutually established and jointly accountable goals. Also a promise, obligation, binding condition; giving someone a job; etc. Multiple definitions all have their own analogies in health. Arises from perspective of mutual recognition and respect; requires deliberate and intentional strategies.

Direct involvement in policy and public service delivery–in cooperation with rather than in place of experts. Discussion of citizen engagement along the lines of public engagement in science. Lots of ways that engagements can happen: extension, collaboration, co-creation.

Experiment: view videos of pond & list signs of life.
I saw and heard: water skeeters, Northern rough-winged swallow, multiple species of warblers (yellow, yellow-rumped?), shrubs, trees, water, song sparrow, tadpoles, frogs, grasses.

Citizen roles in citizen science: Extenders: extend the reach of professionals, perspective and goal defined by pros (contributory); Collaborative: professionals and policy define scope, citizens help set priorities; Co-creation. Example of Audubon CBC. Rules and definitions, shared meanings, are all possible and useful.

Sharing is more important than storing. As important to seek normal patterns as unique things. Redoes the citizen science definition with citizen engagement in health. Australia defines effective engagement with: inform, consult, involve, collaborate, empower – likes this definition better than what is used in US.

CAISE typology parallel for health: gathering health data as guided by professionals, guiding priorities in policy, and in co-creation of balances of public health, patient care, and personal wellness. Potential benefits are personal, professional practice improvements, and public policy improvements. To achieve the goals, need information infrastructure.

Concept of doing a discharge simulation: a virtual tour of the home along with air quality data and other records, to identify places for post-surgical care activities.

Expand the phenomenon of interest to health, measure what must be interpreted, rather than interpreting what we can measure. If data, storage, interpretation and use are separated, then provenance becomes more important. Metadata needs to be captured along with data and definitions created at point of use, transitioning to ontological formulation of what a data definition is. Need broader set of communication and information tech. Sets out necessary soft skills along with tech needs.
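(A sketch of my own, to make the point tangible: the field names and vocabulary URL below are assumptions, not a proposed standard, but they show what it means for a data point to carry its definition and provenance with it once collection, storage, interpretation, and use are separated.)

```python
import datetime

# A self-describing observation: the value travels with its unit, the
# definition applied at the point of collection, and provenance, so that
# later interpretation and reuse do not depend on the original system.
observation = {
    "value": 7.2,
    "unit": "hours",
    "definition": {
        "label": "self-reported sleep duration",
        "vocabulary": "https://example.org/health-terms/sleep-duration",  # assumed reference
        "version": "2016-05",
    },
    "provenance": {
        "source": "consumer sleep-tracking app",       # who/what produced it
        "method": "self-report, evening diary entry",  # how it was produced
        "collected_at": datetime.datetime(2016, 5, 17, 21, 30).isoformat(),
        "donated_for": ["circadian-aware scheduling research"],
    },
}
print(observation["definition"]["label"], observation["value"], observation["unit"])
```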

What if citizens are wrong? When clinicians are wrong, we call it differential diagnosis and don't throw out the data; it's recorded as a pathway that's been abandoned. So why expect more from patients? Begin with what people believe is important and work from that.

How about rights that are under threat or restricted? Building systems based on privilege. Medical informatics can’t resolve it, but can be designed within social constraints to formalize impacts on what constitutes health, practices, and acceptable use of data.