Overview
GraphSAGE (SAmple and aggreGatE) is a general inductive framework. Instead of training individual embeddings for each node, it learns a function that generates embeddings by sampling and aggregating features from a node’s local neighborhood, and can thus efficiently generate embeddings for previously unseen nodes. GraphSAGE was proposed by W.L. Hamilton et al. of Stanford University in 2017.
Related material of the algorithm:
 W.L. Hamilton, R. Ying, J. Leskovec, Inductive Representation Learning on Large Graphs (2017)
Basic Concept
Transductive Framework and Inductive Framework
Most traditional graph embedding approaches (such as those based on matrix factorization or random walks) learn node embeddings using the information of all nodes during training; whenever unseen nodes join the network, the model has to be retrained on the whole dataset, as these transductive frameworks do not naturally generalize.
GraphSAGE, as a general inductive framework, trains a set of aggregator functions instead of individual embeddings for each node. Embeddings of newly joined nodes can be obtained from their feature and structural information without repeating the whole training process. This inductive capability is essential for high-throughput, production machine learning systems.
Embedding Generation Algorithm of GraphSAGE
Assume that we have learned the parameters of K aggregator functions (AGGREGATE_{k}), which aggregate information from node neighbors, as well as K weight matrices W^{k}, which propagate information between the different layers, or search depths, of the model. These parameters are trained as described in GraphSAGE Train. The following is the GraphSAGE embedding generation process (i.e., forward propagation) in the minibatch setting:
1. Sample neighborhood
 Fix the number of nodes sampled at the i-th layer of the neighborhood of the target nodes to S_{i} (i = 1, 2, ..., K)
 For each target node: perform uniform sampling at each layer of its neighborhood according to the set numbers, i.e., all neighbors have the same probability of being selected
 If the number of neighbors at a certain layer is less than the set number, sampling is repeated (i.e., done with replacement) until the set number of nodes is reached
Fixed-size sampling makes the algorithm suitable for minibatch training, as the memory footprint and expected runtime of a single batch are fixed. The authors of GraphSAGE found that K does not need to be large; in practice, high performance is achieved with K = 2 and S_{1}·S_{2} < 500.
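As a minimal sketch of the fixed-size uniform sampling rule described above (not the product's internal implementation; `sample_neighbors` is a hypothetical helper name):

```python
import random

def sample_neighbors(neighbors, size, rng=random):
    """Uniformly sample a fixed number of neighbors of one node.

    All neighbors have the same probability of being selected; when the
    node has fewer than `size` neighbors, sampling is repeated (i.e.,
    done with replacement) so the result always has `size` entries.
    """
    if len(neighbors) >= size:
        return rng.sample(neighbors, size)                  # without replacement
    return [rng.choice(neighbors) for _ in range(size)]     # with replacement
```

Because the output always has exactly `size` entries, the memory and runtime of each minibatch are fixed regardless of node degree.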
Take the graph above as an example: when the parameter sample_size is set to [5,3], sampling of the target node a proceeds from a toward the farther neighborhood. Neighborhood set N_{2}(a) = {b,c,d} is obtained during the first round of sampling, and N_{1}(a) = {f,g,h,i,j} during the second round.
The parameter sample_size is an array of integers; the length of the array is the maximum search depth K, and its elements are the numbers of nodes sampled at layer-K, layer-(K-1), ..., layer-1 successively.
Sampling is performed from layer-1 to layer-K, while the subscript order of the neighborhood sets is exactly the opposite (K, K-1, ..., 1), which matches the next step of aggregating feature information of neighbors from the outside in.
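The layered sampling order can be sketched as follows. This is an illustrative assumption-laden sketch, not the product's implementation: `adj` is a hypothetical adjacency dictionary, and each frontier node is sampled independently (as in the original GraphSAGE paper), so the sets are returned without deduplication:

```python
import random

def sample_layers(adj, target, sample_size, rng=random):
    """Sample the layered neighborhood of `target`.

    sample_size[0], ..., sample_size[K-1] are the counts for
    layer-K, ..., layer-1. Sampling proceeds from layer-1 (immediate
    neighbors) outward, while the subscript of each sampled set runs
    from K down to 1, matching the outside-in aggregation order.
    """
    K = len(sample_size)
    layers = {}
    frontier = [target]
    for k in range(K, 0, -1):          # subscripts K, K-1, ..., 1
        size = sample_size[k - 1]      # layer-1 count first, then layer-2, ...
        sampled = []
        for node in frontier:
            nbrs = adj.get(node, [])
            if not nbrs:
                continue               # lonely node: nothing to sample
            if len(nbrs) >= size:
                sampled.extend(rng.sample(nbrs, size))
            else:                      # fewer neighbors than the set number:
                sampled.extend(rng.choice(nbrs) for _ in range(size))
        layers[k] = sampled            # neighborhood set N_k
        frontier = sampled
    return layers
```

With sample_size = [5,3] and K = 2, the first round draws 3 immediate neighbors into N_{2} and the second round draws 5 farther neighbors per sampled node into N_{1}, mirroring the example above.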
2. Aggregate feature information from neighbors
Graph G = (V, E) and features X_{v} for all nodes v ∈ V are provided as input. X_{v} is formed by several properties of the node and is used as the node's initial representation vector: h^{0}_{v} = X_{v}.
Each iteration (k = 1, 2, ..., K) proceeds as follows:

For all target nodes and the nodes in their neighborhood sets (sets with subscripts ≤ k are eliminated, as those representations are no longer needed), each node is denoted as u and calculated as below:
 Aggregate the (k-1)-th representations of the nodes in u's sampled neighborhood N(u) into a single neighborhood vector:

h^{k}_{N(u)} = AGGREGATE_{k}({h^{k-1}_{u'}, ∀u' ∈ N(u)})
 Concatenate the (k-1)-th representation of node u with the aggregated neighborhood vector, and feed the concatenated vector through a fully connected layer with a nonlinear activation function σ (such as the Sigmoid function):

h^{k}_{u} = σ(W^{k} · CONCAT(h^{k-1}_{u}, h^{k}_{N(u)}))
 Normalize the representation of node u that is generated:

h^{k}_{u} = h^{k}_{u} / ‖h^{k}_{u}‖_{2}
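A single iteration for one node can be sketched in pure Python. This is a minimal illustration under stated assumptions: the element-wise mean is used as AGGREGATE_{k} (one of several valid aggregators), the activation is the Sigmoid function, and all names are hypothetical:

```python
import math

def mean_aggregate(neighbor_vecs):
    """AGGREGATE_k as an element-wise mean over neighbor representations."""
    n = len(neighbor_vecs)
    return [sum(col) / n for col in zip(*neighbor_vecs)]

def dense_sigmoid(W, x):
    """Fully connected layer with Sigmoid activation: sigma(W · x)."""
    return [1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
            for row in W]

def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v

def sage_layer(W, h_self, h_neighbors):
    """One GraphSAGE iteration for node u at depth k: aggregate the
    neighbors' (k-1)-th representations, concatenate with h_u^(k-1),
    project through the dense layer, then normalize."""
    h_agg = mean_aggregate(h_neighbors)
    concat = h_self + h_agg            # CONCAT(h_u^(k-1), h_N(u)^k)
    return l2_normalize(dense_sigmoid(W, concat))
```

Note that W must have one column per entry of the concatenated vector, i.e., twice the representation width when the layers keep a constant dimension.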
For the target node a in the above example:
 First iteration (k = 1): representations h^{1}_{u} are computed for the target node a and for the nodes b, c, d in N_{2}(a), each aggregating the initial features of its sampled neighbors
 Second iteration (k = 2): the final embedding h^{2}_{a} is computed by aggregating the first-round representations of b, c, d
Special Case
Lonely Node, Disconnected Graph
A lonely node has no adjacent nodes, hence it cannot aggregate feature information from any other node.
Nodes only aggregate feature information from neighbors within the same connected component.
Selfloop Edge
Self-loop edges are ignored in the calculation.
Directed Edge
The GraphSAGE algorithm ignores the direction of edges and treats them as undirected.
Command and Configuration
 Command:
algo(graph_sage)
 Configurations for the parameter params():
| Name | Type | Default | Specification | Description |
| --- | --- | --- | --- | --- |
| model_name | string | / | / | Name of the model trained by the algorithm algo(graph_sage_train) |
| model_task_id | int | / | / | Task ID of the algorithm task that trained the model |
| ids | []_id | / | / | IDs of the nodes to generate embeddings for; all nodes are selected if not set |
| node_property_names | []@<schema>?.<property> | Read from the model | Node property, LTE needed | Node property/properties to be used as features |
| edge_property_name | @<schema>?.<property> | Read from the model | Numeric edge property, LTE needed | Edge weight property/properties; schema can be either carried or not. During sampling, nodes are only sampled along edges with the specified properties, and the probability of sampling through an edge is proportional to the edge weight; if an edge has multiple specified properties, its weight is the sum of these property values; the weight of every edge is 1 if not set |
| sample_size | []int | [25, 10] | Read from the model | Length of the array is the maximum search depth K; elements in the array are the numbers of nodes sampled at layer-K, layer-(K-1), ..., layer-1 successively |
Algorithm Execution
Task Writeback
1. File Writeback
Not supported by this algorithm.
2. Property Writeback
| Configuration | Writeback Content | Type | Data Type |
| --- | --- | --- | --- |
| property | Embedding result | Node property | string |
Example: Calculate embeddings of all nodes with the GraphSAGE model named model_1, and write the embedding results back to the node property named embedding:
algo(graph_sage).params({
  model_task_id: 1,
  model_name: 'model_1'
}).write({
  db: {
    property_name: "embedding"
  }
})
3. Statistics Writeback
This algorithm has no statistics.
Direct Return
Not supported by this algorithm.
Streaming Return
Not supported by this algorithm.
Real-time Statistics
This algorithm has no statistics.