Novel methods to correct for observer and sampling bias in presence-only species distribution models

Aim: While species distribution models (SDMs) are standard tools to predict species distributions, they can suffer from observation and sampling biases. This applies particularly to presence-only SDMs, which often rely on species observations from non-standardized sampling efforts. A common remedy is to sample background points with a target-group strategy, although more robust strategies and refinements could be implemented. Here, we used a dataset of plant species from the European Alps to propose and demonstrate efficient ways to correct for observer and sampling bias in presence-only models.

Innovation: Recent methods correct for observer bias by using covariates related to accessibility during model calibration (classic bias covariate correction, Classic-BCC). However, depending on how species are sampled, accessibility covariates may not sufficiently capture observer bias. Here, we introduced BCCs more directly related to sampling effort, as well as a novel corrective method based on stratified resampling of the observational dataset before model calibration (environmental bias correction, EBC). We compared, individually and jointly, the effects of EBC and different BCC strategies when modelling the distributions of 1,900 plant species. We evaluated model performance with spatial block split-sampling and independent test data, and assessed the accuracy of plant diversity predictions across the European Alps.
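To illustrate the general idea of stratified resampling before model calibration, the following is a minimal Python sketch. It is not the authors' R function and does not reproduce their EBC procedure. The function name `stratified_resample`, the equal-width strata along a single environmental gradient, and the fixed number of draws per stratum are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_resample(observations, env_values, n_strata=4, per_stratum=50, seed=0):
    """Resample observations so each environmental stratum is equally represented.

    observations: list of records (hypothetical format, e.g. coordinates).
    env_values:   one environmental value per observation (e.g. elevation).
    """
    if len(observations) != len(env_values):
        raise ValueError("observations and env_values must have the same length")
    if not observations:
        return []

    lo, hi = min(env_values), max(env_values)
    width = (hi - lo) / n_strata or 1.0  # fall back to 1.0 if all values are equal

    # Assign each observation to an equal-width stratum along the gradient.
    strata = defaultdict(list)
    for obs, value in zip(observations, env_values):
        index = min(int((value - lo) / width), n_strata - 1)
        strata[index].append(obs)

    # Draw the same number of records from every non-empty stratum,
    # sampling with replacement so sparsely sampled strata are up-weighted.
    rng = random.Random(seed)
    resampled = []
    for index in sorted(strata):
        resampled.extend(rng.choices(strata[index], k=per_stratum))
    return resampled


# Synthetic example: 90 low-elevation and 10 high-elevation records.
obs = [("low", i) for i in range(90)] + [("high", i) for i in range(10)]
elev = [500.0] * 90 + [2500.0] * 10
balanced = stratified_resample(obs, elev, n_strata=2, per_stratum=20)
```

In this toy example the over-sampled low-elevation stratum and the under-sampled high-elevation stratum each contribute 20 records to `balanced`. A real application would presumably stratify on several environmental covariates and choose the resampling size per species, which is the kind of species-specific tuning the abstract cautions about.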

Main conclusions: Implementing EBC together with BCC gave the best results under every evaluation method. In particular, adding the observation density of a target group as a bias covariate (Target-BCC) produced the most realistic modelled species distributions, with a clear positive correlation (r≃0.5) between predicted and expert-based species richness. Although EBC must be carefully implemented in a species-specific manner, this limitation may be addressed via automated diagnostics included in a provided R function. Implementing EBC and bias covariate correction together may allow future studies to efficiently address observer bias in presence-only models and to overcome the standard need for an independent test dataset for model evaluation.

Funding Information:

This work was supported by:
  • Agence Nationale de la Recherche (Grant/Award: ANR-10-LAB-56, ANR-16-CE93-004)
  • Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (Grant/Award: 310030L_170059)

Related Datasets

Karger, D.N., Conrad, O., Böhner, J., et al. (2017) Climatologies at high resolution for the earth’s land surface areas. Scientific Data, 4(1), 1-20.

Aeschimann, D., Lauber, K., Moser, D.M. & Theurillat, J.P. (2004) Flora alpina: ein Atlas sämtlicher 4500 Gefässpflanzen der Alpen. Haupt.

Chauvier, Yohann; Zimmermann, Niklaus; Poggiato, Giovanni; Bystrova, Daria; Brun, Philipp; Thuiller, Wilfried (2021). Novel methods to correct for observer and sampling bias in presence-only species distribution models. EnviDat. doi:10.16904/envidat.226.

Field: Value
DOI: 10.16904/envidat.226
Publication State: Published
Authors:
  • Email: yohann.chauvierfoo(at); Given Name: Yohann; Family Name: Chauvier; Affiliation: WSL; DataCRediT: Software, Curation, Collection, Validation, Publication
  • Email: niklaus.zimmermannfoo(at); Given Name: Niklaus; Family Name: Zimmermann; Affiliation: WSL; DataCRediT: Supervision, Validation, Publication
  • Email: giov.poggiatofoo(at); Given Name: Giovanni; Family Name: Poggiato; Affiliation: LECA; DataCRediT: Software, Validation, Publication
  • Email: daria.bystrovafoo(at); Given Name: Daria; Family Name: Bystrova; Affiliation: LECA; DataCRediT: Validation, Publication, Software
  • Email: philipp.brunfoo(at); Given Name: Philipp; Family Name: Brun; Affiliation: WSL; DataCRediT: Validation, Publication
  • Email: wilfried.thuillerfoo(at); Given Name: Wilfried; Family Name: Thuiller; Affiliation: LECA; DataCRediT: Supervision, Software, Validation, Publication
Contact Person: Given Name: Yohann; Family Name: Chauvier; Email: yohann.chauvierfoo(at); Affiliation: WSL
Publication: Publisher: EnviDat; Year: 2021
Dates:
  • Type: Created; Date: 2021-06-02; End Date: 2021-06-02
Version: 1.0
Type: dataset
General Type: Dataset
Language: English
Location: European Alps
Content License: Creative Commons Attribution Share-Alike (CC-BY-SA) [Open Data]
Last Updated: February 13, 2023, 10:46 (UTC)
Created: June 2, 2021, 13:35 (UTC)