Batch neighbour retrieval in single server case #21862
Open
This is the first PR to retrieve neighbours in batches in traversals. This can drastically improve limited traversal runtimes on graphs with supernodes.
In the graph's `SingleServerProvider`, this PR changes how the neighbours of a specific vertex are read in the `expand` function. It introduces a neighbour provider that is responsible for providing the neighbours in batches, which moves a lot of code out of the `SingleServerProvider`. For now, `expand` loops over all batches to get all neighbours. In a later PR (once the cluster case is also implemented), `expand` should return one batch per call.

The neighbour provider is set to a specific vertex and provides one batch of neighbours per call to its `next` function. Internally, it saves the neighbour batches it has read to a cache. If the neighbour provider is set to a vertex for which the cache already contains all neighbours, it provides these cached batches instead of reading them again from memory.
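
To make the described behaviour concrete, here is a minimal sketch in Python rather than the project's own code. Only `expand` and `next` are names taken from the description. The class name `NeighbourProvider`, the `rearm` method, the `batch_size` parameter, the `source_reads` counter and the in-memory adjacency dict are assumptions made for illustration. The sketch assumes that only completely read vertices are served from the cache, as the description suggests.

```python
from typing import Dict, Iterator, List, Optional


class NeighbourProvider:
    """Hypothetical sketch of a batched neighbour provider with a cache."""

    def __init__(self, adjacency: Dict[str, List[str]], batch_size: int) -> None:
        self._adjacency = adjacency      # stand-in for the real edge source
        self._batch_size = batch_size
        # vertex -> all of its batches; filled only once a vertex is fully read
        self._cache: Dict[str, List[List[str]]] = {}
        self._vertex: Optional[str] = None
        self._offset = 0
        self._pending: List[List[str]] = []
        self._cached_batches: Optional[Iterator[List[str]]] = None
        self.source_reads = 0            # counts batch reads that bypass the cache

    def rearm(self, vertex: str) -> None:
        """Point the provider at a vertex (assumed name for 'set to a vertex')."""
        self._vertex = vertex
        self._offset = 0
        self._pending = []
        cached = self._cache.get(vertex)
        self._cached_batches = iter(cached) if cached is not None else None

    def next(self) -> Optional[List[str]]:
        """Return the next batch of neighbours, or None when exhausted."""
        if self._cached_batches is not None:
            # All neighbours of this vertex were read before: replay the cache.
            return next(self._cached_batches, None)
        neighbours = self._adjacency.get(self._vertex, [])
        if self._offset >= len(neighbours):
            return None
        batch = neighbours[self._offset:self._offset + self._batch_size]
        self.source_reads += 1
        self._offset += len(batch)
        self._pending.append(batch)
        if self._offset >= len(neighbours):
            # The vertex is now completely read, so its batches become reusable.
            self._cache[self._vertex] = self._pending
        return batch


def expand(provider: NeighbourProvider, vertex: str) -> List[str]:
    """Current behaviour per the description: loop over all batches."""
    provider.rearm(vertex)
    result: List[str] = []
    batch = provider.next()
    while batch is not None:
        result.extend(batch)
        batch = provider.next()
    return result
```

In this sketch, a second `expand` on the same vertex is answered entirely from the cache, while a vertex that was abandoned after a partial read is read again from the start. A limited traversal would benefit once `expand` returns a single batch per call, since it could stop after the first batch of a supernode instead of reading all of its neighbours.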