
Audit finds biodiversity data aggregators 'lose and confuse' data

Date: April 23, 2018
Source: Pensoft Publishers
Summary: Two online repositories, the Atlas of Living Australia (ALA) and the Global Biodiversity Information Facility (GBIF), were found to 'lose and confuse' portions of the data provided to them, according to an independent audit of about 800,000 records from three Australasian museums. Genus and species names had been changed in up to 1 in 5 records, and programming errors caused up to 100 percent data loss in some data categories.
FULL STORY

In an effort to improve the quality of biodiversity records, the Atlas of Living Australia (ALA) and the Global Biodiversity Information Facility (GBIF) use automated data processing to check individual data items. The records are provided to the ALA and GBIF by museums, herbaria and other biodiversity data sources.

However, an independent analysis of such records reports that ALA and GBIF data processing also leads to data loss and unjustified changes in scientific names.

The study was carried out by Dr Robert Mesibov, an Australian millipede specialist who also works as a data auditor. Dr Mesibov checked around 800,000 records retrieved from the Australian Museum, Museums Victoria and the New Zealand Arthropod Collection. His results are published in the open access journal ZooKeys, and also archived in a public data repository.

"I was mainly interested in changes made by the aggregators to the genus and species names in the records," said Dr Mesibov.

"I found that names in up to 1 in 5 records were changed, often because the aggregator couldn't find the name in the look-up table it used."

Another worrying result concerned type specimens -- the reference specimens upon which scientific names are based. On a number of occasions, the aggregators were found to have replaced the name of a type specimen with a name tied to an entirely different type specimen.

The biggest surprise, according to Dr Mesibov, was the major disagreement on names between aggregators.

"There was very little agreement," he explained. "One aggregator would change a name and the other wouldn't, or would change it in a different way."

Furthermore, dates, names and locality information were sometimes lost from records, mainly due to programming errors in the software used by aggregators to check data items. In some data fields the loss reached 100%, with no original data items surviving the processing.

"The lesson from this audit is that biodiversity data aggregation isn't harmless," said Dr Mesibov. "It can lose and confuse perfectly good data."

"Users of aggregated data should always download both original and processed data items, and should check for data loss or modification, and for replacement of names," he concluded.

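The check Dr Mesibov recommends can be sketched in a few lines of Python. This is only an illustration, not the method used in the audit: the function name audit_fields, the "_original" and "_processed" key suffixes, and the sample records are all hypothetical, and real ALA or GBIF downloads use their own file layouts and field names.

```python
from collections import Counter


def audit_fields(records, fields):
    """Compare original and processed values, field by field.

    Each record is a dict whose keys follow a hypothetical
    "<field>_original" / "<field>_processed" naming scheme.
    """
    report = {}
    for field in fields:
        counts = Counter()
        for rec in records:
            original = (rec.get(f"{field}_original") or "").strip()
            processed = (rec.get(f"{field}_processed") or "").strip()
            if not original:
                counts["empty_in_source"] += 1  # nothing to lose or change
            elif not processed:
                counts["lost"] += 1             # value dropped in processing
            elif processed != original:
                counts["changed"] += 1          # value replaced or modified
            else:
                counts["unchanged"] += 1
        report[field] = dict(counts)
    return report


# Made-up records, for illustration only.
records = [
    {"name_original": "Aus bus", "name_processed": "Aus bus",
     "date_original": "1990-01-02", "date_processed": ""},
    {"name_original": "Aus cus", "name_processed": "Aus",
     "date_original": "1991-03-04", "date_processed": ""},
    {"name_original": "Dus eus", "name_processed": "",
     "date_original": "", "date_processed": ""},
]

report = audit_fields(records, ["name", "date"])
for field, counts in report.items():
    print(field, counts)
```

A field in which every non-empty original value is counted as "lost" corresponds to the 100% loss described above. A plain string comparison also flags harmless reformatting as "changed", so flagged records would still need to be inspected by hand.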

Story Source:

Materials provided by Pensoft Publishers. Note: Content may be edited for style and length.


Journal Reference:

  1. Robert Mesibov. An audit of some processing effects in aggregated occurrence records. ZooKeys, 2018; 751: 129 DOI: 10.3897/zookeys.751.24791

Cite This Page:

Pensoft Publishers. "Audit finds biodiversity data aggregators 'lose and confuse' data." ScienceDaily. ScienceDaily, 23 April 2018. <www.sciencedaily.com/releases/2018/04/180423085406.htm>.
