Abstract
Memory disaggregation separates compute (CPU) and main memory resources into disjoint physical units
to enable elastic and independent scaling. Connected via high-speed RDMA-enabled networks, compute
nodes can directly access remote memory. This setting often requires complex protocols with many network
roundtrips as memory nodes have near-zero compute power.
In this paper, we design a scalable distributed inverted list index for disaggregated memory architectures.
An inverted list index maps each term to the list of documents that contain that term. Current solutions either
partition the index horizontally or vertically, with severe limitations in the disaggregated memory setting due to
data and access skew, high network latency, or out-of-memory errors. Our method partitions lists into fixed-size
blocks and spreads them across the memory nodes to balance skewed accesses. Block-based list processing
keeps the memory footprint of compute nodes low and masks latency by interleaving remote accesses with
expensive list operations. In addition, we propose efficient updates with optimistic concurrency control and
read-write conflict detection. Our experiments confirm the efficiency and scalability of our method.
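
The abstract describes the approach only at a high level; the sketch below illustrates the core ideas with a small, self-contained simulation. It is not the authors' implementation: names such as BLOCK_SIZE, MemoryNode, build_index, and intersect_lists are illustrative assumptions, the memory nodes are plain Python objects rather than RDMA-attached servers, and the "remote read" is a synchronous method call where a real system would issue asynchronous one-sided RDMA reads. The sketch shows fixed-size blocking of posting lists, round-robin placement of blocks across memory nodes, and a block-at-a-time merge intersection that keeps only one block per list on the compute node.

```python
# Conceptual sketch (assumed names, not the paper's code): fixed-size blocks,
# round-robin block placement across memory nodes, block-at-a-time intersection.

BLOCK_SIZE = 4          # doc ids per block (assumed, for illustration)
NUM_MEMORY_NODES = 3    # number of simulated memory nodes


class MemoryNode:
    """Stands in for a passive memory node reachable via one-sided reads."""
    def __init__(self):
        self.blocks = {}                        # (term, block_no) -> sorted doc ids

    def write_block(self, term, block_no, doc_ids):
        self.blocks[(term, block_no)] = doc_ids

    def read_block(self, term, block_no):       # models a one-sided remote read
        return self.blocks[(term, block_no)]


def build_index(postings, nodes):
    """Split each sorted posting list into fixed-size blocks and place the
    blocks round-robin across the memory nodes to balance skewed accesses."""
    directory = {}                              # term -> list of (node_id, block_no)
    for term, doc_ids in postings.items():
        placement = []
        for block_no, start in enumerate(range(0, len(doc_ids), BLOCK_SIZE)):
            node_id = block_no % len(nodes)     # round-robin placement
            nodes[node_id].write_block(term, block_no, doc_ids[start:start + BLOCK_SIZE])
            placement.append((node_id, block_no))
        directory[term] = placement
    return directory


def fetch_blocks(term, directory, nodes):
    """Yield the blocks of a term in order; a real system would issue the next
    remote read asynchronously while the current block is being processed."""
    for node_id, block_no in directory[term]:
        yield nodes[node_id].read_block(term, block_no)


def intersect_lists(term_a, term_b, directory, nodes):
    """Block-at-a-time merge intersection: only one block per list is resident
    on the compute node at any time, keeping its memory footprint low."""
    result = []
    blocks_a = fetch_blocks(term_a, directory, nodes)
    blocks_b = fetch_blocks(term_b, directory, nodes)
    a, b = next(blocks_a, []), next(blocks_b, [])
    i = j = 0
    while a and b:
        if a[i] == b[j]:
            result.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
        if i == len(a):                         # current block exhausted: fetch next
            a, i = next(blocks_a, []), 0
        if j == len(b):
            b, j = next(blocks_b, []), 0
    return result


if __name__ == "__main__":
    nodes = [MemoryNode() for _ in range(NUM_MEMORY_NODES)]
    postings = {"db": [1, 3, 4, 7, 9, 12, 15], "rdma": [3, 4, 9, 10, 15, 20]}
    directory = build_index(postings, nodes)
    print(intersect_lists("db", "rdma", directory, nodes))   # -> [3, 4, 9, 15]
```

In a real deployment, the fetch of the next block would overlap with processing of the current one to mask network latency, and updates would additionally carry per-block versioning so that readers can detect read-write conflicts optimistically; both aspects are simplified away in this sketch.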
| Original language | English |
|---|---|
| Number of pages | 27 |
| DOIs | |
| Publication status | Published - 30 May 2024 |
| Event | ACM SIGMOD International Conference on Management of Data, Santiago de Chile, Chile; Duration: 9 Jun 2024 → 14 Jun 2024 |
Conference
| Conference | ACM SIGMOD International Conference on Management of Data |
|---|---|
| Abbreviated title | SIGMOD |
| Country/Territory | Chile |
| City | Santiago de Chile |
| Period | 9/06/24 → 14/06/24 |
Keywords
- Distributed Database Management Systems
- Disaggregated Memory
- Inverted Index
- RDMA
Fields of Science and Technology Classification 2012
- 102 Computer Sciences
Projects
- DESQ - Declarative and Efficient Similarity Queries (active)
  Augsten, N. (Principal Investigator)
  1/12/21 → 30/11/25
  Project: Research
Research output
- Scalable Distributed Inverted List Indexes in Disaggregated Memory (Source Code)
  Widmoser, M. (Photographer), 2024
  Research output: Non-textual form › Software
Activities
- Scalable Distributed Inverted List Indexes in Disaggregated Memory
  Widmoser, M. (Speaker)
  13 Jun 2024
  Activity: Talk or presentation › Poster presentation › science to science / art to art
- Scalable Distributed Inverted List Indexes in Disaggregated Memory
  Widmoser, M. (Speaker)
  13 Jun 2024
  Activity: Talk or presentation › Oral presentation › science to science / art to art
- Scalable Distributed Inverted List Indexes in Disaggregated Memory
  Widmoser, M. (Speaker)
  10 Feb 2024
  Activity: Talk or presentation › Oral presentation › science to science / art to art