Abstract
In the early stages of a disaster caused by a natural hazard (e.g., a flood), little useful information is available. To fill this gap, emergency responders increasingly use geo-social media data to gain insights from eyewitnesses, build a better understanding of the situation, and design effective responses. However, filtering relevant content for this purpose is challenging. This work therefore compares several machine learning models (Naïve Bayes, Random Forest, Support Vector Machine, Convolutional Neural Networks, BERT) for semantic relevance classification of flood-related, German-language Tweets. The models were trained on a four-category data set created with the help of experts from humanitarian aid organisations. We identified fine-tuned BERT as the most suitable model, with a precision of 71% on average and most misclassifications occurring between similar classes. We thus demonstrate that our methodology helps identify relevant information for more efficient disaster management.
Original language | English |
---|---|
Journal | Information (Switzerland) |
Volume | 15 |
Issue number | 3 |
Publication status | Published - 7 Mar 2024 |
Bibliographical note
Publisher Copyright: © 2024 by the authors.
Keywords
- BERT
- disaster management
- relevance classification
- semantic analysis
- social media
Fields of Science and Technology Classification 2012
- 211 Other Technical Sciences