LNCC
MCTI

Integration of Data Supported by Machine Learning

Scientists often need to combine multiple data sources in biodiversity analysis and synthesis activities. Although biodiversity data have many gaps, such as the limited availability of data on species characteristics, a large amount of data is machine-readable and freely available. However, these data are highly heterogeneous and dispersed, and may lack metadata or schemas. A promising approach to overcoming these limitations is to use machine learning techniques to support open data integration tasks such as entity mapping. We aim to leverage and extend these techniques to integrate biodiversity data with other related datasets.

Institutions
Data Extreme Lab