Meeting Notes Workshops Leiden February 2013


Days 1 & 2

Introduction

pro-iBiosphere aims at creating a more efficient system for managing biodiversity information. Why do we need open access and exchangeable taxonomic information?

Copyright barriers hinder open access; we may need to create our own legal framework. How do we prevent exploitation? Closed access is worse than exploitation.

Existing digital standards – use URIs as names for things. How do you move the science community up the understanding ladder? For the move from PDF to XML, we need systems that hide the complexity from the user.

Workshop #1

The workshop was organised with the aim of connecting the synthesisers/producers of taxonomic knowledge to the producers of IT tools for taxonomy. The audience contained representatives of many scientific institutions, mainly from Europe but also from Africa, Asia and the Americas. The workshop allowed software developers to present their tools to editors and taxonomists of Floras and Faunas who are interested in adopting e-platforms and e-tools.

e-platforms for taxonomy: Scratchpads, EDIT, Linnaeus, Biowikifarm
Avoid cottage industries and networks of one. Provide the opportunity to work alone, but with the facility to connect later.
Create a collection of tools to support all parts of taxonomic workflow.
There is a lot of overlap between tools. Some duplication is good for innovation.
"If it’s not on the web it does not exist". True?
How do we provide technical support in the future? Users want to trust the tools they are using.
How to fill in funding gaps? Citizen science may be used for this purpose.
Wikis can share management load and allow people to follow the editorial process. Parallel development possible. They allow a shift from developers to users and more contributors than programmers.
Break out of the conventional wiki format with Semantic MediaWiki (semantic-mediawiki.org). Use user-friendly forms and templates: they save users from learning the syntax and are extremely customizable.
Purposes & uses of e-tools: Pensoft Writing Tool, EDIT, XPer2, Charaparser, Apps
Avoid creating a legacy of data that cannot be reused. Do up-front mark-up, for instance with online templates.
The possibility to link and transfer data needs to be there.
How to monitor mark-up tool performance for large-scale imports? Domain- or journal-specific ontologies could be one method. The more data you have, the more ways there are to check it.
How fine does mark-up need to be to be useful?
What do the users/"kids" want (e.g. apps)? What does a scientist in the field need?
Examples of projects using tools
Dedicated permanent funding for Scratchpads and CDM support is a new and positive step.
The Alien plants project is using Scratchpads. It allows blogging and linking plant data to herbarium collections. Misidentified species are rare. The information is accessible, but not reusable. A relaxed copyright management has worked.
Online portals with dynamic checklists and repositories. Can CDM portals communicate with each other? At present, no. How are reviews in years' time embedded in the process? CDM store archive?
Could you pull down data from another organization and create something new?
Bottom-up approach of modelling – look at your data and then start modelling a tool.
Use analytics software to monitor statistics on your site.
How to preserve a site? Web-archiving software? Crowd-funding?

Discussion

Why are taxonomists not using e-tools? Is it because of time or lack of training? Awareness of tools should start at university and as early as possible in the scientific workflow. More training sessions could be offered. Competition between systems is very confusing. Can we produce one stable system? Work with e-tools is not supported by management and/or not seen as a useful way to spend staff time.

Taxonomists need to meet IT half-way; some standards need to be used.

It is very important to have institutional commitment on the use of e-tools/platforms for taxonomy. Educate and promote tools among taxonomists.

Day 3

Biosystematic literature: where are we and where do we want to go?

We need more awareness, availability, accessibility and promotion of Floras.

Where are our users?

Maintaining links keeps knowledge in the system. How to establish a persistent system in the absence of persistent identifiers? DOIs cost money. GBIF links keep breaking (changing URLs).

Mark-up and semantic enhancement

Many Flora/Fauna editors are science institutions. It is important to educate and convince editors to move into XML publishing. Editors are often isolated and their activities are frequently a part-time staff effort. One of the advantages of XML publishing is that it increases dissemination.

Make your document open enough to be shared through annotations.

Establish encoding guidelines for publishers. Web publishing should include mark-up of important core info: description of article, machine-readable reference list. Scoring system for publishers?

Referencing sites is often too expensive for small institutions. At present, complete automation of mark-up is not possible.

A special purpose parser (Charaparser) was created for descriptive data. The general purpose parser is good for language pattern recognition. Users are good at characterizing terms to help the parser.

How to define data quality? This needs monitoring. There need to be rules and it needs to be transparent.

Crowdsourcing
Crowd involvement by task. People actually participate by transcribing text.

Collective or collaborative intelligence? Solutions: pay, entertain, give recognition and access. Crowd agreements can be relatively certain, not much less than expert agreements.
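
As an illustration of how crowd agreement on transcriptions could be measured, here is a minimal sketch assuming a simple majority vote. The function name `crowd_consensus` and the sample label texts are hypothetical and were not presented at the workshop.

```python
from collections import Counter

def crowd_consensus(transcriptions):
    """Return (most common transcription, share of volunteers who gave it)."""
    if not transcriptions:
        raise ValueError("need at least one transcription")
    # Normalise whitespace and case so trivial differences
    # do not count as disagreement.
    normalised = [" ".join(t.split()).lower() for t in transcriptions]
    winner, votes = Counter(normalised).most_common(1)[0]
    return winner, votes / len(normalised)

# Hypothetical transcriptions of one specimen label by four volunteers.
labels = ["Leiden,  1913", "leiden, 1913", "Leiden, 1918", "Leiden, 1913"]
consensus, agreement = crowd_consensus(labels)
print(consensus, agreement)
```

A low agreement share could be used to route a record to an expert for checking instead of accepting the crowd answer.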

Discussion

DOIs vs. LSIDs

Links to other websites citing the same information should always be included on sites.

Day 4

Why do we need Flora and Fauna works?

We need annotations. We need a network of trust. Defined by author?

Shared data storage is a solution.

What’s the first step when wanting to expose your institutional data? Good guidelines.


Uses of extracting semantic data

Define ‘open’. Make an inventory of which things/sites are open and which are not. Open content is a scientific tradition and the reason why we are publicly funded.


Improving XML workflow between registers

There should be ways to find out what has been published on one taxon. Get user reports of missing information.

Multi-lingual approach to attract more users.

In order to avoid duplication, repositories need to be synchronised. This keeps data consistent and minimizes the data set.

ID-system for entering new data, to avoid mistakes. Remote desktop system useful.

Synergies between pIB and other global initiatives

Use names to connect data. Name recognition should be available for Excel or PWT, perhaps suggesting newer names. Separate names and taxonomy.
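
A minimal sketch of name recognition that suggests newer names, assuming a plain lookup table. The placeholder names ("Aus bus" and so on), the `ACCEPTED` table and the `suggest_names` function are hypothetical; a real tool would query a names service instead of a hard-coded dictionary.

```python
import re

# Hypothetical synonym table: name string -> currently accepted name.
# Keeping this table apart from the text keeps names and taxonomy separate.
ACCEPTED = {
    "Aus bus": "Aus bus",
    "Aus cus": "Aus bus",  # treated here as an older synonym
    "Xus yus": "Xus yus",
}

# Very rough pattern for a binomial: capitalised genus plus lowercase epithet.
BINOMIAL = re.compile(r"\b[A-Z][a-z]+ [a-z]+\b")

def suggest_names(text, accepted=ACCEPTED):
    """Return (name found in text, suggested accepted name) pairs."""
    results = []
    for match in BINOMIAL.finditer(text):
        name = match.group(0)
        if name in accepted:
            results.append((name, accepted[name]))
    return results

found = suggest_names("Specimens of Aus cus and Xus yus were compared.")
print(found)
```

Because the suggestions come from the table rather than from the pattern, the taxonomic opinion can be updated without touching the recognition step.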

Need ontologies to structure unstructured data.

Use one flora to fill the gaps in another one?

How do you show aligning concepts?

Is the user able to see, after a search, how many records were searched successfully?

Moving Floras into an advanced XML based production workflow

For small journals, collaboration often means disappearance.
What is a reason to join a collaboration? Rapid, less cost-intensive publishing. Collaborative printing not favoured. Automated submission saves money. Publishing through repositories, enriched by indexing.

Demonstration of workflows for publishing

Advantages of online publishing: all versions stored, fast review, no software, reminders, easy editing, credits.

If it’s not free, it’s not open.

Give your money to a publisher, who does what you want. Non-commercial. How do they perform in mark-up? Score them.

Discussion

Why do we publish? Support science.

Issues for publishing and maximum dissemination: format, archiving.

Subscription fees should be lower.

Give a recommendation to the EU to make it complimentary to publish this and that way. Project proposals should include open access.

What do you want from pIB? Reusable data, recognition of our work, time & cost-saving procedure.

All prospective articles should be recoverable, have identifiers.

Should we focus on new literature to keep a new backlog from appearing?

Same tools for legacy and prospective literature?

Open Access will come (UK example)!

Image of participants

The participants of the pro-iBiosphere workshops