Abstract
Cloud services for natural language processing (NLP) are increasingly establishing themselves as viable alternatives to self-maintained and self-trained NLP pipelines. In particular, they feature low access barriers and management overhead, a pay-as-you-go pricing model, and elastic scalability that allows large amounts of natural language data to be processed ad hoc. Any deliberation about employing cloud NLP services in practice does, however, face the challenge that, so far, little is known about the accuracy provided by such services or about how to conduct the corresponding quality assessments. In this paper, we therefore present a method for evaluating the accuracy provided by cloud NLP services and apply it to cloud services for three prominent NLP tasks offered by Amazon, Google, Microsoft, and IBM. Our results show significantly different accuracies as well as different dependencies on the specifics of input data among the covered providers. Our insights therefore allow for a more evidence-based, quality-driven choice of the provider to be used for NLP in practice. Furthermore, the general approach employed may also serve as a blueprint for future evaluations of cloud NLP services for other tasks or from other providers.
Original language | English |
---|---|
Title | 2020 IEEE International Conference on Big Data (Big Data) |
Publisher | IEEE |
Pages | 341-350 |
Number of pages | 10 |
ISBN (Print) | 978-1-7281-6252-2 |
DOIs | |
Publication status | Published - 13 Dec 2020 |
Event | 2020 IEEE International Conference on Big Data (Big Data) - Atlanta, GA, USA Duration: 10 Dec 2020 → 13 Dec 2020 |
Conference
Conference | 2020 IEEE International Conference on Big Data (Big Data) |
---|---|
Period | 10/12/20 → 13/12/20 |
Keywords
- Text recognition
- Scalability
- Pipelines
- Big Data
- Natural language processing
- Internet
- Task analysis
Austrian Fields of Science 2012
- 102 Computer Sciences