java.lang.Object
org.apache.lucene.util.hnsw.HnswUtil
Utilities for use in tests involving HNSW graphs
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescription(package private) static final record
A component (also "connected component") of an undirected graph is a collection of nodes that are connected by neighbor links: every node in a connected component is reachable from every other node in the component. -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescription(package private) static List
<HnswUtil.Component> components
(HnswGraph hnsw, int level, FixedBitSet notFullyConnected, int maxConn) componentSizes
(HnswGraph hnsw) Returns the sizes of the distinct graph components on level 0.componentSizes
(HnswGraph hnsw, int level) Returns the sizes of the distinct graph components on the given level.static boolean
graphIsRooted
(IndexReader reader, String vectorField) In graph theory, "connected components" are really defined only for undirected (ie bidirectional) graphs.(package private) static boolean
Returns true if every node on every level is reachable from node 0.private static HnswUtil.Component
markRooted
(HnswGraph hnswGraph, int level, FixedBitSet connectedNodes, FixedBitSet notFullyConnected, int maxConn, int entryPoint) Count the nodes in a rooted component of the graph and set the bits of its nodes in connectedNodes bitset.private static int
nextClearBit
(FixedBitSet bits, int index)
-
Constructor Details
-
HnswUtil
private HnswUtil()
-
-
Method Details
-
isRooted
Returns true if every node on every level is reachable from node 0.- Throws:
IOException
-
componentSizes
Returns the sizes of the distinct graph components on level 0. If the graph is fully-rooted the list will have one entry. If it is empty, the returned list will be empty.- Throws:
IOException
-
componentSizes
Returns the sizes of the distinct graph components on the given level. The forest starting at the entry points (nodes in the next highest level) is considered as a single component. If the entire graph is rooted in the entry points--that is, every node is reachable from at least one entry point--the returned list will have a single entry. If the graph is empty, the returned list will be empty.- Throws:
IOException
-
components
static List<HnswUtil.Component> components(HnswGraph hnsw, int level, FixedBitSet notFullyConnected, int maxConn) throws IOException - Throws:
IOException
-
markRooted
private static HnswUtil.Component markRooted(HnswGraph hnswGraph, int level, FixedBitSet connectedNodes, FixedBitSet notFullyConnected, int maxConn, int entryPoint) throws IOException Count the nodes in a rooted component of the graph and set the bits of its nodes in connectedNodes bitset. Rooted means nodes that can be reached from a root node.- Parameters:
hnswGraph
- the graph to checklevel
- the level of the graph to checkconnectedNodes
- a bitset the size of the entire graph with 1's indicating nodes that have been marked as connected. This method updates the bitset.notFullyConnected
- a bitset the size of the entire graph. On output, we mark nodes visited having fewer than maxConn connections. May be null.maxConn
- the maximum number of connections for any node (aka M).entryPoint
- a node id to start at- Throws:
IOException
-
nextClearBit
-
graphIsRooted
In graph theory, "connected components" are really defined only for undirected (ie bidirectional) graphs. Our graphs are directed, because of pruning, but they are *mostly* undirected. In this case we compute components starting from a single node so what we are really measuring is whether the graph is a "rooted graph". TODO: measure whether the graph is "strongly connected" ie there is a path from every node to every other node.- Throws:
IOException
-