@@ -41,8 +41,7 @@ def rwr(
     Parameters
     ----------
     G : GraphV2
-        The input graph on which the Random Walk with Restart (RWR) will be
-        performed.
+        The input graph to be sampled.
     graph_name : str
         The name of the new graph that is stored in the graph catalog.
     start_nodes : list of int, optional
@@ -106,55 +105,59 @@ def cnarw(
     job_id: Optional[str] = None,
 ) -> GraphWithSamplingResult:
     """
-    Computes a set of Random Walks with Restart (RWR) for the given graph and stores the result as a new graph in the catalog.
+    Common Neighbour Aware Random Walk (CNARW) samples the graph by taking random walks from a set of start nodes

-    This method performs a random walk, beginning from a set of nodes (if provided),
-    where at each step there is a probability to restart back at the original nodes.
-    The result is turned into a new graph induced by the random walks and stored in the catalog.
+    CNARW is a graph sampling technique that involves optimizing the selection of the next-hop node. It takes into
+    account the number of common neighbours between the current node and the next-hop candidates. On each step of a
+    random walk, there is a probability that the walk stops, and a new walk from one of the start nodes starts
+    instead (i.e. the walk restarts). Each node visited on these walks will be part of the sampled subgraph. The
+    resulting subgraph is stored as a new graph in the Graph Catalog.

     Parameters
     ----------
     G : GraphV2
-        The input graph on which the Random Walk with Restart (RWR) will be
-        performed.
+        The input graph to be sampled.
     graph_name : str
-        The name of the new graph in the catalog.
+        The name of the new graph that is stored in the graph catalog.
     start_nodes : list of int, optional
-        A list of node IDs to start the random walk from. If not provided, all
-        nodes are used as potential starting points.
+        IDs of the initial set of nodes in the original graph from which the sampling random walks will start.
+        By default, a single node is chosen uniformly at random.
     restart_probability : float, optional
-        The probability of restarting back to the original node at each step.
-        Should be a value between 0 and 1. If not specified, a default value is used.
+        The probability that a sampling random walk restarts from one of the start nodes.
+        Default is 0.1.
     sampling_ratio : float, optional
-        The ratio of nodes to sample during the computation. This value should
-        be between 0 and 1. If not specified, no sampling is performed.
+        The fraction of nodes in the original graph to be sampled.
+        Default is 0.15.
     node_label_stratification : bool, optional
-        If True, the algorithm tries to preserve the label distribution of the original graph in the sampled graph.
+        If true, preserves the node label distribution of the original graph.
+        Default is False.
     relationship_weight_property : str, optional
-        The name of the property on relationships to use as weights during
-        the random walk. If not specified, the relationships are treated as
-        unweighted.
+        Name of the relationship property to use as weights. If unspecified, the algorithm runs unweighted.
     relationship_types : list of str, optional
-        The relationship types used to select relationships for this algorithm run.
+        Filter the named graph using the given relationship types. Relationships with any of the given types will be
+        included.
     node_labels : list of str, optional
-        The node labels used to select nodes for this algorithm run.
+        Filter the named graph using the given node labels. Nodes with any of the given labels will be included.
     sudo : bool, optional
-        Override memory estimation limits. Use with caution as this can lead to
-        memory issues if the estimation is significantly wrong.
+        Bypass heap control. Use with caution.
+        Default is False.
     log_progress : bool, optional
-        If True, logs the progress of the computation.
+        Turn `on/off` percentage logging while running procedure.
+        Default is True.
     username : str, optional
-        The username to attribute the procedure run to
+        Use Administrator access to run an algorithm on a graph owned by another user.
+        Default is None.
     concurrency : int, optional
-        The number of concurrent threads used for the algorithm execution.
+        The number of concurrent threads used for running the algorithm.
+        Default is 4.
     job_id : str, optional
-        An identifier for the job that can be used for monitoring and cancellation
+        An ID that can be provided to more easily track the algorithm’s progress.
+        By default, a random job id is generated.

     Returns
     -------
     GraphSamplingResult
-        Tuple of the graph object and the result of the Random Walk with Restart (RWR), including the sampled
-        nodes and their scores.
+        Tuple of the graph object and the result of the Common Neighbour Aware Random Walk (CNARW), including the dimensions of the sampled graph.
     """
     pass

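
For readers unfamiliar with the technique, the behaviour described in the new `cnarw` docstring can be illustrated with a small in-memory sketch. This is not the library's implementation. The function name `cnarw_sample`, the `adjacency` dictionary input, and the `seed` and `max_steps` arguments are hypothetical. The acceptance rule `1 - common / min(degree)` is one common formulation of common-neighbour awareness and is an assumption here, not something stated in the diff. The sketch also assumes an undirected graph without self-loops and ignores relationship weights and label stratification.

```python
import random
from typing import Dict, List, Optional, Set


def cnarw_sample(
    adjacency: Dict[int, Set[int]],
    start_nodes: Optional[List[int]] = None,
    restart_probability: float = 0.1,
    sampling_ratio: float = 0.15,
    seed: Optional[int] = None,
    max_steps: int = 100_000,
) -> Set[int]:
    """Toy common-neighbour-aware random walk sampler (illustrative only)."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    # Mirrors the documented default: one start node chosen uniformly at random.
    starts = list(start_nodes) if start_nodes else [rng.choice(nodes)]
    # Number of nodes to collect, derived from the sampling ratio.
    target = max(1, round(sampling_ratio * len(nodes)))

    current = rng.choice(starts)
    sampled = {current}
    for _ in range(max_steps):
        if len(sampled) >= target:
            break
        neighbours = sorted(adjacency[current])
        # Restart: abandon the current walk and begin again at a start node.
        if not neighbours or rng.random() < restart_probability:
            current = rng.choice(starts)
            sampled.add(current)
            continue
        candidate = rng.choice(neighbours)
        common = len(adjacency[current] & adjacency[candidate])
        smaller_degree = min(len(adjacency[current]), len(adjacency[candidate]))
        # Assumed rule: candidates sharing many neighbours with the current
        # node are accepted less often, which pushes the walk outwards.
        accept = 1.0 if smaller_degree == 0 else 1.0 - common / smaller_degree
        if rng.random() < accept:
            current = candidate
            sampled.add(current)
    return sampled


# Usage: a 20-node ring where each node also links to its second neighbours.
graph = {i: {(i - 2) % 20, (i - 1) % 20, (i + 1) % 20, (i + 2) % 20} for i in range(20)}
print(sorted(cnarw_sample(graph, start_nodes=[0], seed=7)))
```

With the default `sampling_ratio` of 0.15, the sketch stops once 3 of the 20 nodes have been visited. A rejected candidate leaves the walk on the current node for that step, which is a simplification chosen to keep the loop short.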