Abstract

This paper proposes using synthetic training data generated by large language models to improve machine-learning classifiers for the Sustainable Development Goals (SDGs). It shows that supplementing existing training data with synthetic data produced by ChatGPT improves the performance of the SDGClassy classifier. Synthetic data are especially useful for building SDG classifiers because properly labeled data are scarce and the SDGs are complex and interconnected. Synthetic data thus enable more effective machine-learning applications in this context.

Related Subject(s): Economic and Social Development
JEL: C88 (Mathematical and Quantitative Methods / Data Collection and Data Estimation Methodology; Computer Programs / Other Computer Software); O20 (Economic Development, Innovation, Technological Change, and Growth / Development Planning and Policy / General)

/content/papers/10.18356/25206656-180
  • Published online: 30 Nov 2023