Jhu-Jyun Jhang, Melissa Jean-Yi Liu, Daphne Z Hoh, Chun-I Chang, Mao-Ning Tuanmu
Taiwan Biodiversity Information Facility, Biodiversity Research Centre, Academia Sinica, Taipei, Taiwan
Abstract
The Darwin Core standard (DwC) and the GBIF data framework provide a flexible set of biodiversity data fields that accommodates a variety of thematic datasets. However, this flexibility can make it hard for data providers to get started, and mapping original data fields to DwC terms is often frustrating. Data cleaning is crucial for data quality, but it requires significant expertise and effort, which can hinder the mobilization of high-quality data. To address these common pain points, the Taiwan Biodiversity Information Facility (TaiBIF) developed the TaiBIF Open Data Toolkit, which integrates thematic dataset templates, DwC terms, the Nansen Legacy Excel Template Generator, Excel data editing interfaces, the GBIF Data Validator, and OpenRefine. We combined the strengths of each tool to create a straightforward workflow that runs from data sheet generation through data validation and data cleaning to dataset packaging. We hope the toolkit meets the needs of data publishers and makes data management and publishing smoother and more user-friendly.
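
As an illustration of the field-mapping step described above, the following is a minimal Python sketch, not the toolkit's actual implementation. The source column names, the `FIELD_MAP` dictionary, the helper functions `map_to_dwc` and `missing_terms`, and the sample rows are all hypothetical; the set of required terms is an assumption chosen for occurrence-type data and should be checked against the relevant GBIF template.

```python
import csv
import io

# Hypothetical mapping from a provider's original column headers to DwC terms.
FIELD_MAP = {
    "record_id": "occurrenceID",
    "species": "scientificName",
    "date": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}

# Terms this sketch treats as mandatory (an assumption for illustration).
REQUIRED_TERMS = {"occurrenceID", "scientificName", "eventDate", "basisOfRecord"}


def map_to_dwc(rows, field_map=FIELD_MAP, defaults=None):
    """Rename source columns to DwC terms; columns without a mapping are dropped."""
    defaults = defaults or {}
    mapped = []
    for row in rows:
        record = dict(defaults)  # start from dataset-wide default values
        for source_name, value in row.items():
            term = field_map.get(source_name)
            if term is not None:
                record[term] = (value or "").strip()
        mapped.append(record)
    return mapped


def missing_terms(record, required=REQUIRED_TERMS):
    """Return the required terms that are absent or empty, in sorted order."""
    return sorted(term for term in required if not record.get(term))


# Invented sample data; the second row deliberately lacks a species name.
SAMPLE = """record_id,species,date,lat,lon
obs-1,Pycnonotus taivanus,2023-05-01,22.76,121.15
obs-2,,2023-05-02,22.80,121.10
"""

records = map_to_dwc(
    csv.DictReader(io.StringIO(SAMPLE)),
    defaults={"basisOfRecord": "HumanObservation"},
)
for record in records:
    print(record["occurrenceID"], missing_terms(record))
```

In this sketch the first record passes the check and the second is flagged for a missing `scientificName`, which is the kind of issue a validation step would surface before data cleaning and packaging.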