iDigBio workshop on Public Participation in Digitization of Museum Specimens, 9/28/12, Gainesville, FL
Introduction to NSF’s Advancing Digitization of Biodiversity Collections Program
Anne Maglia (U.S. National Science Foundation).
ADBC – challenge is to mobilize “dark data” in collections. Most data inaccessible, inconsistent, can’t be captured, etc. Feds get it, have been working on figuring it out for a few years now. NIBA (Network Integrated Biocollections Alliance) goal is centralized integration through research-based thematic networks. ADBC (Advancing Digitization of Biodiversity Collections) is NSF program, 10 year, $10M/year – coordinating resource (iDigBio), TCNs (thematic collection networks). Lots of acronyms. Goals from the recommendations: understanding and appreciation of biodiversity through education and outreach, drive well-informed environmental and economic policies.
Updated view of NSF Broader Impacts: encompasses benefit to society and achievement of specific desired societal outcomes. New broader impacts should be achieved through the research itself, activities directly related to research, or activities that are supported by but complementary to the project. Meaningful assessment should be based on appropriate metrics, sometimes assessing effectiveness of activities is best done at a higher, more aggregated level, than the individual project – e.g. via a community.
Opportunity for engaging society in iDigBio – through products and processes, involvement in data capture, verification, and meta-analysis. Assessment at multiple levels can create model for community-wide impact.
Introduction to iDigBio by Larry Page (iDigBio; Florida Museum of Natural History, Univ. of Florida).
The thematic collections networks: overview of project goals and digitization methods, with recognition of steps that could involve the public
ADBC intended to facilitate use of biodiversity data for scientific, environmental, and economic challenges. 7 TCNs are involving 130 institutions. Goal of iDigBio is enabling digitization of biodiversity collections data, with efficient/effective digitization standards and workflows. Digitization – specimen-based, label data (georeferenced), images, metadata and ancillary data. Activities include databasing, georeferencing (major activity), imaging (sometimes how databasing is being accomplished).
Data portal includes specimen search, linking collections to ecology, paleontology, genomics, living collections (zoos?), other repos. NSF requiring collaboration with iDigBio on collections-based projects. Doing tool development and integration, host workshops, convene working groups, visiting scholar program, education and outreach.
Nash, Thomas. Lichens and Bryophytes Thematic Collections Network Project. Univ. of Wisconsin, Madison, Wisconsin.
The lichen and bryophyte and climate change (LBCC) TCN: an overview, current progress and relationship to the American Bryological and Lichenological Society.
2.3 million specimens, 65 institutions, 1 year after TCN founded. Bryophytes and lichens dominate arctic and northern boreal regions, are common in many other ecosystems, and store a major part of world’s organic carbon. Related society: American Bryological and Lichenological Society (ABLS). Specimens include 900K lichens, 1.4M bryophytes, 16 digitization centers working with 65 institutions (herbaria) that include 95% of non-governmental collections. Complex workflow for digitization includes OCR, NLP, geo-referencing, transcription, etc.
Imaging stations required camera stands, jewelry lighting boxes, black coverings; imaging resolution requirement is 20px height for a lowercase “x”, with camera connected to computer and adequate battery life. Using barcode reader as well. Setup took several months to establish. Imaged over 40K specimens during first 6 months with 2 undergrads. Imaging 50K-80K records/year is doable.
Metadata – barcode each specimen, latest species name (requires taxonomic knowledge), collector and collection number. Sometimes also a few other fields, e.g. major geographic region, not particularly detailed. Now have two portals, one for lichens and one for bryophytes. Portal allows management of collections and access to stuff, you have to be an expert to use them.
Transcription – extensive data forms to get labels transcribed. Crowdsourcing transcription – national coordinator and Missouri Botanical Garden, volunteer programs, members of ABLS. Sophisticated user and workflow management system in SYMBIOTA – transcription, geo-referencing, professional quality control. Sounds too complicated for average contributor.
Brinda, John. Lichens and Bryophytes Thematic Collections Network Project. Missouri Botanical Garden, St. Louis, Missouri.
Digitization of Bryophyte Labels at the Missouri Botanical Garden.
Institutions have individual collections representing different collectors, time periods, geographical regions, taxonomic groups. Challenges: Exsiccatae – duplicates; recognizing historically important collections; handwritten labels. Connecting labels to collections allows inference of contextual metadata.
Historically important records – no one without a PhD can recognize them. Wildly variable information on them, only people who know what to look for will be able to identify them. Wants to work with crowdsourcing for handwriting analysis, language translation, georeferencing, nomenclature.
Speelman, Julie A. InvertNet Thematic Collections Network Project. Purdue Univ., West Lafayette, Indiana.
Community assisted digital imaging of insect specimens.
InvertNet – entomology. Staff includes “systemicist.” Goals are digitizing over 50M specimens at 22 Midwestern collections plus Hawaii. Specimen images and metadata (label info) – specimens drawers, vials, and slides. Advanced imaging – including 3D – target goal of $0.10/image. Want everyone to be able to browse/search/view specimens through web interface. Developing tools for data mining and analysis; community building, collaboration & support; education, outreach, & reference.
Different workflows for slides, vials, drawers. Scanning for slides includes loading slides into tray, scanning, saving, uploading to InvertNet. Vials are more complicated – curate specimens (taxonomy outdated), remove labels, replace parts if needed, place on scanner tray, scan, save, upload. Think volunteers could help with all steps except curation. Drawer workflows – have a robot for scanning! Curate specimens, digitize image, upload metadata.
Hard to recruit/retain volunteers. What communities can they tap? High school students, organizations like Audubon, Master naturalist (“citizen science groups”), retirees (but many don’t like IT!), undergrads. Vol program: needs assessment, determine objectives, written proposal, volunteer coordinator staff support, job descriptions, recruitment & selection of volunteers, training and implementation, reward staff and volunteers. (missing: ongoing care and feeding of volunteers)
Expect that volunteers can be integral to digitization with potential for huge cost savings, but requires organization and coordination.
Seltmann, Katja. Tri-Trophic Thematic Collections Network Project. American Museum of Natural History, New York, NY.
Plants, Herbivores, and Parasitoids: A Model System for the study of Tri-Trophic Associations
Goal is digitizing 3.5M specimens – transform data on specimen labels and get records georeferenced. Specimens mostly insect-related, 30+ institutions across US. Working with volunteers and paid interns, lots of experience with this across collaborating institutions. A lot of the work requires being onsite, workflow for bugs includes multiple specific steps organizing of specimens, identifying sex and exemplars, barcoding.
Volunteers can be managed to include easy entry level work with minimal supervision, identify the skills and step them up in the process with more autonomy and responsibility. One-day volunteers could do work like cutting apart labels. If they want to have more involvement, they can do more interesting stuff. Most of AMNH’s volunteers are recruited through social media incl. “dorkbot”, radio interviews, etc.
Potential for mutual benefit with other TCNs – software development, crowdsourcing georeferencing and transcribing label data from images. Outreach in terms of professional participation – symposium at Entomological Society of America, specimen-level data information management course at AMNH (opportunity for reusing DataONE materials and customizing them), workshop using collection level data in research – hands-on working through a small project.
Training participants takes 1-2 hours to start, then constant supervision and trying to get them chitter-chattering with one another through online chat to relieve supervisor burden. Need to start with listing volunteer programs at TCN groups.
Thiers, Barbara. Macrofungi Thematic Collections Network Project. New York Botanical Garden, New York, New York.
The Macrofungi Collection Consortium TCN and North American mycophiles: enhancing a long-standing relationship.
Macrofungi are mushrooms and related stuff – used for food, pharma, recreation, forest health, products, etc. Need to digitize specimen data, fieldnotes, photos – 700K specimen records, 70K specimen images, many ancillary data that need to be linked to specimen records. Working with MyCoPortal for searching and e-publications, can host a lot of different types of info about the organisms.
Amateur mycologists – mycophiles, help with data editing and adding content to portal. Hope they will use data from portal for their own education and biodiversity documentation projects – offer opportunity to share their work. Crowdsourcers for help with transcription of specimen labels, opportunity to learn more about fungi and natural history collections. Objectives for public participation – develop a corps of expert volunteers for specimen record digitization, outlet for publication of info gathered by public, build closer relationships for mutual benefit.
Amateur Mycology Community – 72 nationwide clubs, often do field outings, documenting fungal biota through observations and collections, educate general public, communicate through meetings and publish. “Fungus Fairs” involve collecting a bunch of material to show grand displays to interested public.
Relationship between amateur and professional communities: pros serve as lecturers, identifiers at club events, publish field guides for amateur use. They all share info through Mushroom Observer – brilliant use of social media to make pro-am connections. Amateurs make and maintain collections.
Objectives of crowdsourcing: opportunity for participation beyond amateur mycologists, opportunity to expand these groups beyond gray-hairs. Hope to help MaCC project meet and exceed promised deliverables, sustain regional herbaria, improve science literacy and appreciation of value of collections. Challenges: volunteers not interested in participating, or participate but don’t feel appreciated; mission creep – enthusiasm feeds ambition for larger initiatives that eclipse main objectives.
Digitization projects never go away. In the process of introducing communities and incorporating amateur content into portal; next steps include implementing crowdsourcing component and preparing guidelines and best practices for incorporating crowdsourcer feedback into collections records.
Sweeney, Patrick. New England Vascular Plants Thematic Collections Network Project. Yale Univ., New Haven, Connecticut.
Mobilizing New England vascular plant data to track environmental change: an overview and preliminary thoughts on engaging the public.
Volunteer pool – regional and state level botanical clubs and societies. Rationale for TCN: goal is providing data to support study of consequences of climate change and land use history in New England. Themes are climate change – plant phenology – and land use history – herbarium specimens, habitat data for subset of target taxa, developing vocabularies for both phenology and habitat data.
Developing organizational network to support these activities. Workflow plan: collection preparation (pre-capture), primary digitization, data enhancement (secondary digitization). Precapture involves developing labels and barcodes to associate with folders; need some “special” volunteers to work with specimens directly, but that’s not impossible – still not a place for huge numbers of volunteers.
Primary digitization is image capture with barcoding and subset of label info and additional details. Currently testing a high throughput digitization apparatus in collaboration with an industrial engineer for conveyor belt system to mediate process. Can use voice recognition software to improve process.
Secondary digitization: georeferencing, town level probably adequate. May not need volunteers to do this for New England due to density of towns and size of states. Mobilization – all images and data available through Symbiota portal. Training and outreach – undergrad and grad students (paid) and interns. Plan to establish network of citizen science observers across New England for phenology data collection; Primack working with this. Key issues with public participation – recruitment, management, training, turn-over, quality control.
Basham, Melody. Southwest Collections of Arthropods Network Thematic Collections Network Project. Arizona State Univ., Phoenix, Arizona.
SCAN survey results: engaging the public with insect digitization workflows.
SCAN – 10 institutions. Arthropods – focus on ground-dwelling insects like beetles, as opposed to butterflies. Plans to develop strategy and sustainable model to allow for more specimens to be entered into database, increase rates of identification, adopt and encourage broader virtual collaboration. Two websites include informative site and a Symbiota portal.
Survey of 10 respondents from SCAN community – Challenges for engaging volunteers in insect digitization – meaning/purpose, task limitations, QC/training, need for verification, skills/temperament for task. What would be easiest to engage public: data entry, imaging. Most difficult: taxonomy clearest challenge. Most potential for integration as a citizen science project: data entry & imaging. Most viewed it important to do crowdsourcing/citizen science. To what extent should it have meaningful significance – less agreement about importance. Specific groups that would be valuable – retired systematists, taxonomists, mostly retired professionals.
Interesting comments from open response items – capturing specimen label images; make participation fun/meaningful; separate database or class of data for cit sci data; digitization workflow separate from citizen science.
Results to focus on: task should be purposeful and meaningful; data entry & imaging easiest to integrate people, most viewed citizen science as important component of public involvement; most groups not currently engaging the public; need to make insect labels accessible and user friendly.
Looking at potential for mobile app – iphoneographers – taking macro images that could be contributed to SCAN collections.
Hendricks, Jonathan. Paleoniches Thematic Collections Network Project. San Jose State Univ., San Jose, California.
Digital atlases of fossil collections: new resources for the public to identify and understand ancient biodiversity.
Goals are databasing several major invertebrate fossil collections, georeferencing to enable study of biogeographic patterns over time, generate digital atlases of fossil life for general public and scientific community – digital images of ancient biodiversity, paleogeographic maps for individual species at multiple time intervals.
Interested in 3 different time periods from the Phanerozoic eon: Neogene, Pennsylvanian, Ordovician. Specimens for each period come from several different collections. Digital atlas goals – field guides to the past for several fossil-rich regions, with online webpages, mobile app. Intended audiences are scientists and avocational fossil collectors. These resources currently don’t exist. Hope to include over 800 species in the atlases.
Martin, Elizabeth. Core Science Analytics and Synthesis Program, U.S. Geological Survey, Gainesville, FL.
Biodiversity Information Serving Our Nation (BISON): a national resource for species occurrence data.
National unified resource for discovery, linkage, and reuse of species occurrence data. Goals: develop large integrated data store of fully indexed species occurrence records for the US; incorporate federal and non-federal datasets of observations and specimens; develop resource capabilities tailored to US needs. BISON will serve as repo for digitized federal collections data, biodiversity hub of EcoINFORMA. Staff participate in IWGSC – Interagency Working Group on Scientific Collections.
BISON will include: powerful, flexible data search and filter capabilities; easy download of data; GIS and data viz component w/ high res base layers for US; easily spun off web & map services, widgets, templates, stats for partners & users; social science component; citizen science component.
Data: over 106M records, mostly from GBIF. Initial new data input emphases – federal data sets, invasive species data. New data to be added this year – amphibians, birds, fish, invasive species. Can download data and documentation.
Cit Sci observation platform – mobile devices and social media for recording and delivering data, based on curated Twitter submissions and Twitter data mining/stream API – funded by USGS CDI. Demo projects include USGSted, DC/Baltimore Cricket Crawl with DiscoverLife. Hawaii Bee Bowl surveys – K-12 students for bee collection, specimens sent to Patuxent for IDs, some donated to museums, data integrated in USGS Native Bee Inventory DB and will be integrated into BISON.
Biodiversity collections software tools: primary purpose and unique contribution of each tool, as well as functionality that could involve members of the public
Georeferencing in FishNet 2
Global network of fish collections, 48 data providers, 2M jars of fish. GeoLocate – software and services for georeferencing of biodiversity data. Performs well in US with automatic identification – 95% of locations found within 6km. Outside the US, could find most localities in Australia but huge distance off w/ high standard error, even after refinement.
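A figure like "95% of locations found within 6km" can be checked against a set of manually verified points. The sketch below is only an illustration of that kind of check, not GeoLocate's own evaluation code; the function names and the sample coordinates are hypothetical, and it assumes error is measured as great-circle (haversine) distance.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fraction_within(pairs, threshold_km=6.0):
    """Share of (automatic, verified) coordinate pairs at most threshold_km apart."""
    if not pairs:
        raise ValueError("need at least one coordinate pair")
    hits = sum(1 for auto, verified in pairs if haversine_km(*auto, *verified) <= threshold_km)
    return hits / len(pairs)


# Hypothetical sample: (automatic result, manually verified point)
sample = [
    ((29.6516, -82.3248), (29.6520, -82.3250)),  # tens of metres apart
    ((29.6516, -82.3248), (29.7000, -82.3248)),  # roughly 5.4 km apart
    ((29.6516, -82.3248), (30.3322, -81.6557)),  # roughly 100 km apart
]
print(fraction_within(sample))
```

Two of the three hypothetical pairs fall inside the 6 km threshold, so the function reports about two thirds for this sample.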
Collaborative georeferencing is the next step – take advantages of similarities across collections, distribute workloads appropriately. Working to build georeferencing communities, create data sources, add new users easily. Task assignment, e.g. assigning records from African regions to known experts.
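The task assignment idea (route records from a region to people who know that region) can be sketched roughly as below. This is an assumption about how such routing might look, not the FishNet 2 or GeoLocate implementation; `assign_records`, the record fields, and the expert names are all hypothetical. Records for a region rotate among that region's experts so the workload is spread out, and anything without a known expert goes to a general pool.

```python
from collections import defaultdict
from itertools import cycle


def assign_records(records, experts_by_region, fallback="general-pool"):
    """Map each assignee to the list of record ids they should georeference."""
    # One round-robin rotation per region that has at least one expert.
    rotations = {region: cycle(names) for region, names in experts_by_region.items() if names}
    queues = defaultdict(list)
    for rec in records:
        rotation = rotations.get(rec["region"])
        assignee = next(rotation) if rotation else fallback
        queues[assignee].append(rec["id"])
    return dict(queues)


# Hypothetical records and experts
records = [
    {"id": 1, "region": "Africa"},
    {"id": 2, "region": "Africa"},
    {"id": 3, "region": "Africa"},
    {"id": 4, "region": "Australia"},
    {"id": 5, "region": "Asia"},
]
experts = {"Africa": ["expert_a", "expert_b"], "Australia": ["expert_c"]}
print(assign_records(records, experts))
```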
Beach, Jim. Biodiversity Institute, Univ. of Kansas, Lawrence, Kansas.
Specify & Lifemapper: breaking away from narcissistic science.
Specify: bio collections data management platform, modular for plug-ins. Well funded by NSF, pretty good staffing. Represents “all natural history disciplines”, 15% annual adoption rate with 435 collections in 29 countries and 247 institutions using the tool. Several related applications, including a version without MySQL installation, others: Schema Mapper, collection wizard to define new databases, iReport for designing labels and reports, Scatter Gather Reconcile to find dupes in GBIF. Under development: thin client, portal upgrade pipeline with Specify/Symbiota/FilteredPush, image management plugin, specify insight – mobile platform for “consuming activities.”
Lifemapper project: copy of GBIF with web services for geospatial data, computing niche models, presence-absence matrices for biodiversity pattern analysis. Emphasis on researcher workflow, tools, and metadata archives. Several related grants for further development. Had BOINC style screensavers for a while with similar outcomes in terms of competing sysadmins.
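For readers unfamiliar with presence-absence matrices: rows are sites (e.g. grid cells), columns are species, and each entry is 1 if the species was recorded at that site and 0 otherwise. A minimal sketch of building one from occurrence records follows. It is not Lifemapper code; the function name, cell ids, and species labels are placeholders.

```python
def presence_absence(occurrences):
    """Build a sites x species 0/1 matrix from (site, species) occurrence pairs."""
    sites = sorted({site for site, _ in occurrences})
    species = sorted({sp for _, sp in occurrences})
    seen = set(occurrences)  # duplicate records collapse to a single presence
    matrix = [[1 if (site, sp) in seen else 0 for sp in species] for site in sites]
    return sites, species, matrix


# Hypothetical occurrence records: (grid cell, species)
occurrences = [
    ("cell_1", "species_a"),
    ("cell_1", "species_b"),
    ("cell_2", "species_b"),
    ("cell_3", "species_c"),
    ("cell_1", "species_a"),  # repeat record of the same species in the same cell
]
sites, species, matrix = presence_absence(occurrences)
richness = [sum(row) for row in matrix]  # species count per site
print(sites, species, matrix, richness)
```

Row sums give species richness per site and column sums give the number of sites each species occupies, which are typical starting points for pattern analysis.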
Gilbert, Edward. Symbiota Software Project.
Symbiota: using specimen data to support community inventories
FLOSS biodiversity portals – specimen search engine, allows creation of biotic inventories, e.g. species checklists and BioBlitz surveys. Also includes ID key, image library, distribution maps, descrips, taxonomic info. Specimen-based model – collections are central partners, focus on scientific integrity and other priority features. Is actually a CMS for specimen data management, includes stuff like specimen processing data entry form that displays image to be processed, the OCR results, etc.
SEINet – plants of the southwest with flora inventory projects. Multiple types of biotic inventories, including student lists, native plant societies, and personal checklists. Can do personal checklist management, create own checklists (public or private), become editor for other checklists. Being used for several community projects in AZ. Challenges to date include correct identifications, misspellings, and coordinating volunteers.
Specimens are considered central – backbone of biodiversity research with vouchers and verification, etc. Need to “prove” that something was where you claimed it was. Voucher conflicts – expert review by herbarium staff, visiting taxonomists, exchanges; annotations; identification changes; checklist vouchers; ID conflicts. Can manage personal specimen collections prior to herbarium submission, can handle both specimen and observation data; functionality includes data entry, data management, label printing, cloud management with browser-based tool with no special software. Linked records in voucher network between original observation and physical specimens.
Giddens, Michael. SilverBiology Software Project.
HelpingScience.org label processing method.
7-step workflow for identifying and assembling label data for herbarium specimens. Step 1: click-and-drag image annotation for labels – can get 300 labels/hr/person. Step 2: OCR with Evernote, costs about $0.001 per label, includes NLP. Step 3: ID words and associate them with Darwin Core fields for basic details. Step 4: lexical grouping and analysis – compare words to OCR values and if distinct, assign to lexical set, then send image to data entry. Bulk validation step based on value similarity of related images. Software gets better at seeing variations with time due to training. Data entry through multiple interfaces, users get virtual tokens to use in the store for correct words. Visual choices offered for lat-long to pick right format. Fields then have to be verified – some by computer, some by volunteers.
Once the data are received, they can be exported in CSV, RESTful services for export into other software, Darwin Core Archive, others on request. Also lets you filter by DarwinCore fields. Sustainability is a major consideration – symbiotic relationships with fee-for-service, so volunteers receive tokens to spend at HS store – not paid to people who contribute, that causes problems. Instead, people can direct funding to fundraisers like micro loans to botany undergrads for research, sponsorships for students to attend conferences, K-12 equipment funding for science departments. Also allows donations to charitable orgs and funding small herbaria digitization.
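Filtering by Darwin Core fields and exporting to CSV can be pictured with a short sketch. This is not the HelpingScience.org export code; the helper names and the sample records are hypothetical. The column names are standard Darwin Core terms.

```python
import csv
import io

# A small subset of Darwin Core terms used as CSV columns.
DWC_FIELDS = ["catalogNumber", "scientificName", "recordedBy", "eventDate", "stateProvince"]


def filter_records(records, **criteria):
    """Keep records whose Darwin Core fields equal every given criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def to_csv(records, fields=DWC_FIELDS):
    """Serialize records to CSV text; missing fields are left blank, extras are dropped."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Hypothetical transcribed records
records = [
    {"catalogNumber": "HS-0001", "scientificName": "Bryum argenteum",
     "recordedBy": "A. Collector", "eventDate": "1998-05-14", "stateProvince": "Missouri"},
    {"catalogNumber": "HS-0002", "scientificName": "Bryum argenteum",
     "recordedBy": "B. Collector", "stateProvince": "Texas", "internalNote": "not exported"},
]
csv_text = to_csv(filter_records(records, stateProvince="Missouri"))
print(csv_text)
```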
Denslow, Michael. Appalachian State Univ.
Notes from Nature: a scalable citizen science platform for transcribing records from natural history collections.
Challenges: natural history collection data not used to full potential, and only about 1/3 is digitized. Lots of heterogeneity. Want to promote public engagement, most people know nothing about this stuff, and want to use specimens in research. Solutions for transcription – success in other domains, e.g. Zooniverse.
Definition of citizen science – contribution of data, analyses, or solutions toward scientific research by volunteers – not a new thing. Not just a way to gather data, but mutually beneficial partnership. Trying to engage people in a new way using technology. Existing efforts for transcription of natural history collections – herbaria@home for UK herbaria, Atlas of Living Australia portal. Excellent models to build upon, but want to create generalized solution that works at high volume and is scalable.
Phase 1 of development: proposal to Citizen Science Alliance, who asked several proposers to work together on similar project. No money, but some software development time. Initial goals: transcription interface prototype for simple, direct interaction with specimens; address complexities of multiple collections; plan for recruiting new pool of volunteers. Progress so far – private beta, expect a public beta prototype in November 2012. Currently focusing on project needs for SERNEC, CalBug, Natural History Museum London. More of an interactive task, trying to improve on transcription tool and find ways to engage new volunteers.
Phase 2 ambitions: proposed innovations – accuracy assessment, user engagement, OCR integration, scalable solution (and more). Model based on multiple transcriptions with different volunteers repeating the task for accuracy. When there is low agreement, report the field as needing further attention. [what is rate of flagging?] But data is pretty complex, low likelihood of identical entry for longer text strings. Another goal is more engagement – badging system, inline tutorials, advanced interfaces, etc. Create/select downloadable curricular materials based on grade, locations, etc.
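The multiple-transcription model can be sketched as a simple vote per field. This is an illustration only, not the Notes from Nature algorithm; the function names, the normalization rule, and the 0.6 agreement threshold are assumptions. It also shows why longer text strings are harder: small differences between volunteers split the vote.

```python
from collections import Counter


def normalize(text):
    """Collapse case and whitespace so trivially different entries compare equal."""
    return " ".join(text.lower().split())


def consensus(transcriptions, min_agreement=0.6):
    """Return (value, agreement, needs_review) for one field's repeated transcriptions."""
    if not transcriptions:
        return None, 0.0, True
    counts = Counter(normalize(t) for t in transcriptions)
    value, votes = counts.most_common(1)[0]
    agreement = votes / len(transcriptions)
    return value, agreement, agreement < min_agreement


# Hypothetical entries from three volunteers for the same label
collector = consensus(["J. Smith", "j. smith ", "J Smith"])
locality = consensus([
    "2 mi N of town on Hwy 12",
    "2 miles north of town, Hwy 12",
    "2 mi. N of town on highway 12",
])
print(collector)
print(locality)
```

In this sample the short collector field reaches two-thirds agreement and passes, while the three locality strings all differ and the field is flagged for further attention. A fuzzier comparison (e.g. edit distance) would likely reduce flagging for long strings, at the cost of more tuning.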
OCR integration – two strategies, in the wild or word spotting. Each has advantages. Workflow includes machine readable parts and human-in-the-loop approaches. Also want OCR web API. Scalability – want to set up partner portals to engage people in missions to complete tasks, so new content can be entered into the system easily.
Want to make sure this works in parallel with other activities going on, e.g. iDigBio, TCNs, other OCR efforts like SALIX.
Best, Jason. Botanical Research Institute of Texas, Fort Worth, Texas.
The Apiary Project: a workflow for herbarium specimen digitization.
Doesn’t really have much to do with apiaries! Funded in part by IMLS. BRIT has 312 active volunteers this year with over 10K hours in 2012, about 15 actively involved in digitization. Goal is transcribing data into structured format, bringing together people and machines to leverage best abilities of each. Currently have 3700 specimens in the workflow, beta launching Apiary Lite.
15 minute intro video for training, other training for various workflows at different levels of complexity. Main workflow is analyze specimen, transcribe text, and parse text into fields (using keyboard shortcuts). Uses on-screen markup to highlight text matched to fields. Verbatim parsing versus inferred. Currently working with on-site interns and volunteers, would like to involve people more ad hoc but it would likely require a different approach than the current interface.
Main concern is how to keep people engaged. Takes about 5 minutes to fully transcribe all labels for one specimen, without any OCR.
Flemons, Paul. Team Lead, Atlas of Living Australia Biodiversity Volunteer Portal. Australian Museum, Sydney, Australia.
Atlas of Living Australia’s Volunteer Portal: open model for crowdsourced capture of biodiversity information.
Portal concept: open scalable, distributed, standards (DwC) compliant, browser based, asynchronous application for crowdsourcing capture, and enabling the digital repatriation, of biodiversity data. Supports: template-based creation and management of transcription expeditions, 3 levels of permissions-based activity, including transcription, validation, and administration. Tutorials for getting started in each expedition.
Virtual expeditions – theme-based tasks. Includes leaderboard, expedition stats of number of tasks, volunteer transcribers, level of completion, progress bars for each expedition. [when you show completion info, do more complete expeditions attract more attention – accumulative advantage?] Roles in expeditions are similar to Old Weather.
Templates include field notes, issue is that they have to generate a new template for each different type that comes up. Existing templates are reusable, but new ones require ad hoc development. Originally wanted to make the template wizard-based for selecting fields and laying out templates, but not enough resources to do that. Showed some interfaces.
Engaging the public in science
Wiggins, Andrea. DataONE, Univ. of New Mexico, and Cornell Lab of Ornithology, Cornell Univ.
Citizen science phenotypes: typologies and implications of project design.
Zelt, Jessica. North American Bird Phenology Program, U.S. Geological Survey, Laurel, Maryland.
How to successfully engage the public in science.
See notes from USGS workshop.
Wilson, Nathan. Director of Biodiversity Informatics, Encyclopedia of Life, Marine Biological Laboratory, Woods Hole, Massachusetts.
Mushroom Observer and the Role of Observers
Created Mushroom Observer for himself – software professional and naturalist by avocation, brought him his dream job. Focus is Western US mushrooms, 3500 observers in the last 6 years, 100K observations, 250K photos. Scratch your own itch, start with an existing crowd – OSS concepts apply more broadly. “If you treat them as your most important asset, they will return the favor by becoming your most important asset.”
Engaging the public: Embrace laziness, accept garbage, deal gently with conflicts, avoid anonymity and privacy – don’t let people hide, make it easy to become an expert. Important to evolution of Mushroom Observer: start off with licensing and data reuse policies in place, automatic data sharing. Noncommercial CC licenses not accepted by Wikipedia! Offers people reuse options for licenses, both NC and not.
Issues with observations: didn’t keep herbarium specimen, best guess ID may not be accurate but is based on current knowledge. Most people don’t collect herbarium specimens. Rule of thumb: anything that takes some work loses 90% of the population – clicking one button will cause that level of drop out. Thought very few Mushroom Observer users would have specimens, it’s a lot of extra work. Turned out 15-20% of Mushroom Observer data had herbarium specimens. 28% had made herbarium specimens – truly amazing. Next steps – more collaboration with professional herbaria; validate and assess the numbers.
Another goal – improving IDs in the system – 28% of observations above species level, but 65% of those are below “expert” confidence. 56% have no notes at all, average note length is 179 characters. Don’t really know a whole lot about fungi, actually, and people have recognized new taxa that have not yet been described – organism from CA with 50 observations, can’t figure out what it is – they’ve ruled out all possibilities. Needs better review process, documenting of observations [do you tell people what notes are useful?], computable descriptions, standardization of “provisional” names (one was named “Carl”).
What is an observation? You see a specimen, not a species! Just the facts; observed features – macroscopic, microscopic, molecular. Concept/definition of a species are somewhat divergent – type specimen is the definition of the species. But there are lookalikes, cryptic species, paraphyletic & polyphyletic groups, convergent evolution, and changing circumscriptions. Diagram connecting people to observations, observations to circumscriptions and barcodes, magical jump to scientific names; barcodes go to types to species names; circumscriptions to scientific names. Maybe there’s a way to do something in between the observations and scientific names to keep the distinction of an observation separate from their species label.
Needs: names for shared observational experiences; peer reviewed, distinct, unique, and memorable. Computable definitions – duck typing (looks like a duck, quacks like a duck), semantic web technology. Connections to traditional scientific names – moving target.
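One way to picture a "computable definition" in the duck-typing sense: a shared name is defined by the features an observation must show, and any observation showing them qualifies, whatever label it carries. The sketch below is only an illustration of that idea, not anything Mushroom Observer implements; the feature vocabulary, the provisional name, and the function names are hypothetical.

```python
def matches(observation, definition):
    """True if the observation shows every feature value the definition requires."""
    return all(observation.get(feature) == value for feature, value in definition.items())


# Hypothetical feature vocabulary and provisional name, for illustration only.
PROVISIONAL_DEFINITIONS = {
    "provisional-name-1": {"cap_color": "brown", "spore_print": "white", "gills": "free"},
}


def candidate_names(observation, definitions=PROVISIONAL_DEFINITIONS):
    """List every shared name whose definition the observed features satisfy."""
    return [name for name, definition in definitions.items() if matches(observation, definition)]


# Hypothetical observations: only the observed features are recorded, no species label.
obs_1 = {"cap_color": "brown", "spore_print": "white", "gills": "free", "habitat": "oak"}
obs_2 = {"cap_color": "brown", "spore_print": "pink", "gills": "free"}
print(candidate_names(obs_1))
print(candidate_names(obs_2))
```

Keeping the observation as a bag of features and deriving names from definitions is one way to keep the observation separate from its species label, since the link to a traditional scientific name can then change without touching the observation itself.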