New Weighting Schemes for Meta-blocking

Nikolaus Augsten, Roland Kwitt, Matteo Lissandrini, Willi Mann, Themis Palpanas, George Papadakis

Publication: Other contributions › Other contribution › Research

Abstract

Entity Resolution constitutes a core data integration task that relies on Blocking to tame its quadratic time complexity. Schema-agnostic blocking comes at the cost of many irrelevant candidate pairs (i.e., comparisons), which can be significantly reduced with Meta-blocking. In Meta-blocking, a weighting scheme first assigns every pair of candidate entities a score proportional to the likelihood that they match, and a pruning algorithm then discards the pairs with the lowest scores.
In this work, we briefly discuss the existing Meta-blocking weighting schemes, and then propose four new weighting schemes that can be used by Meta-blocking techniques.
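The two Meta-blocking steps described above (weight every candidate pair, then prune the lowest-scoring pairs) can be illustrated with a minimal Python sketch. It is not taken from this work and does not implement any of the four proposed schemes: it assumes the simple common-blocks count as the weight and a mean-weight threshold as the pruning rule. The names `weight_pairs` and `prune_pairs`, as well as the toy `blocks` data, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def weight_pairs(blocks):
    """Score each candidate pair by the number of blocks its two entities share.

    Assumption: `blocks` maps a block key to the list of entity ids in it.
    """
    weights = Counter()
    for entities in blocks.values():
        # Sort and deduplicate so every pair gets one canonical (a, b) key.
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b)] += 1
    return weights


def prune_pairs(weights):
    """Keep only the pairs whose weight is at least the mean weight.

    Assumption: a global mean threshold; other pruning rules are possible.
    """
    if not weights:
        return {}
    threshold = sum(weights.values()) / len(weights)
    return {pair: w for pair, w in weights.items() if w >= threshold}


# Toy, made-up blocks produced by a schema-agnostic blocking step.
blocks = {
    "apple": ["e1", "e2", "e3"],
    "iphone": ["e1", "e2"],
    "phone": ["e2", "e3", "e4"],
    "2021": ["e1", "e2", "e4"],
}

weights = weight_pairs(blocks)
kept = prune_pairs(weights)
print(kept)
```

In this toy input, six distinct candidate pairs are weighted and only the three pairs that share more blocks than average survive pruning, which is the reduction in comparisons that Meta-blocking aims for.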
Original language: English
Publication status: Published - 1 Oct. 2021

Fields of Science and Technology Classification 2012

  • 102 Computer Sciences
