$graphLookup (aggregation)

$graphLookup

Changed in version 5.1.

Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.

The $graphLookup search process is summarized below:

  1. Input documents flow into the $graphLookup stage of an aggregation operation.
  2. $graphLookup targets the search to the collection designated by the from parameter (see below for full list of search parameters).
  3. For each input document, the search begins with the value designated by startWith.
  4. $graphLookup matches the startWith value against the field designated by connectToField in other documents in the from collection.
  5. For each matching document, $graphLookup takes the value of the connectFromField and checks every document in the from collection for a matching connectToField value. For each match, $graphLookup adds the matching document in the from collection to an array field named by the as parameter.
    This step continues recursively until no more matching documents are found, or until the operation reaches a recursion depth specified by the maxDepth parameter. $graphLookup then appends the array field to the input document. $graphLookup returns results after completing its search on all input documents.
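
The steps above can be sketched in plain JavaScript as a breadth-first traversal over an in-memory array. This is only an illustration of the described behavior, not MongoDB's implementation: the function name graphLookupSketch, the depth property, and the staff sample data are all hypothetical.

```javascript
// In-memory sketch of the traversal described above (illustrative only).
// `docs` stands in for the `from` collection.
function graphLookupSketch(docs, startWith, connectFromField, connectToField, maxDepth = Infinity) {
  // Treat missing values as "nothing to follow" and scalars as one-element arrays.
  const toArray = (v) => (v === undefined || v === null ? [] : Array.isArray(v) ? v : [v]);
  const visited = new Map(); // document -> depth at which it was first reached
  let frontier = toArray(startWith);

  for (let depth = 0; frontier.length > 0 && depth <= maxDepth; depth++) {
    const next = [];
    for (const doc of docs) {
      if (visited.has(doc)) continue; // each document is collected at most once
      const targets = toArray(doc[connectToField]);
      if (targets.some((t) => frontier.includes(t))) {
        visited.set(doc, depth);
        // Values to match against connectToField in the next round.
        next.push(...toArray(doc[connectFromField]));
      }
    }
    frontier = next;
  }
  return [...visited].map(([doc, depth]) => ({ ...doc, depth }));
}

// Hypothetical sample data: a three-level reporting chain.
const staff = [
  { name: "Ana" },
  { name: "Ben", reportsTo: "Ana" },
  { name: "Cy", reportsTo: "Ben" }
];

// Start from Cy's manager and follow reportsTo -> name.
const chain = graphLookupSketch(staff, "Ben", "reportsTo", "name");
console.log(chain.map((d) => `${d.depth}:${d.name}`).join(", ")); // 0:Ben, 1:Ana

// With maxDepth 0, only the initial match is collected.
const direct = graphLookupSketch(staff, "Ben", "reportsTo", "name", 0);
console.log(direct.map((d) => d.name).join(", ")); // Ben
```

Tracking visited documents is what keeps the sketch from looping forever on cyclic data, such as two documents that reference each other.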

$graphLookup has the following prototype form:


{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}

$graphLookup takes a document with the following fields:

Field Description
from Target collection for the $graphLookup operation to search, recursively matching the connectFromField to the connectToField. The from collection must be in the same database as any other collections used in the operation. Starting in MongoDB 5.1, the collection specified in the from parameter can be sharded.
startWith Expression that specifies the value of the connectFromField with which to start the recursive search. If startWith evaluates to an array, $graphLookup performs the search simultaneously from all array elements.
connectFromField Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is individually followed through the traversal process.
connectToField Field name in other documents against which to match the value of the field specified by the connectFromField parameter.
as Name of the array field added to each output document. Contains the documents traversed in the $graphLookup stage to reach the document. Documents returned in the as field are not guaranteed to be in any order.
maxDepth Optional. Non-negative integral number specifying the maximum recursion depth.
depthField Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth for the document, represented as a NumberLong. Recursion depth value starts at zero, so the first lookup corresponds to zero depth.
restrictSearchWithMatch Optional. A document specifying additional conditions for the recursive search. The syntax is identical to query filter syntax. You cannot use any aggregation expression in this filter. For example, you can't use the following document to find documents in which the lastName value is different from the lastName value of the input document: { lastName: { $ne: "$lastName" } } You can't use the document in this context, because "$lastName" will act as a string literal, not a field path.
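
Because restrictSearchWithMatch cannot reference fields of the input document, one possible workaround is to compare the traversed documents with the input document in a later stage. The sketch below assumes a hypothetical people collection with name, lastName, and friends fields, and a hypothetical output field named connections. Note that this is not equivalent to restricting the search: the traversal still passes through documents with the same lastName, and $filter only removes them from the final array.

```javascript
// Sketch: compare each traversed document with the input document AFTER
// $graphLookup. Collection and field names here are hypothetical.
const pipeline = [
  {
    $graphLookup: {
      from: "people",
      startWith: "$friends",
      connectFromField: "friends",
      connectToField: "name",
      as: "connections"
    }
  },
  {
    // $filter is an aggregation expression, so "$lastName" is a field path
    // into the input document and "$$this" is the current array element.
    $set: {
      connections: {
        $filter: {
          input: "$connections",
          cond: { $ne: [ "$$this.lastName", "$lastName" ] }
        }
      }
    }
  }
];

// In mongosh: db.people.aggregate( pipeline )
```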

Starting in MongoDB 5.1, you can specify sharded collections in the from parameter of $graphLookup stages.

You cannot use the $graphLookup stage within a transaction while targeting a sharded collection.

Setting the maxDepth field to 0 is equivalent to a non-recursive $graphLookup search stage.

The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate() operation, the $graphLookup stage ignores the option. If there are other stages in the aggregate() operation, the allowDiskUse: true option is in effect for these other stages.

See aggregation pipeline limitations for more information.

The $graphLookup stage does not return sorted results. To sort your results, use the $sortArray operator.

If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.

A collection named employees has the following documents:


db.employees.insertMany( [
   { _id: 1, name: "Dev" },
   { _id: 2, name: "Eliot", reportsTo: "Dev" },
   { _id: 3, name: "Ron", reportsTo: "Eliot" },
   { _id: 4, name: "Andrew", reportsTo: "Eliot" },
   { _id: 5, name: "Asya", reportsTo: "Ron" },
   { _id: 6, name: "Dan", reportsTo: "Andrew" }
] )

The following $graphLookup operation recursively matches on the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:


db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )

The output resembles the following results:


{
   _id: 1,
   name: "Dev",
   reportingHierarchy: [ ]
}
{
   _id: 2,
   name: "Eliot",
   reportsTo: "Dev",
   reportingHierarchy: [
      { _id: 1, name: "Dev" }
   ]
}
{
   _id: 3,
   name: "Ron",
   reportsTo: "Eliot",
   reportingHierarchy: [
      { _id: 2, name: "Eliot", reportsTo: "Dev" },
      { _id: 1, name: "Dev" }
   ]
}
{
   _id: 4,
   name: "Andrew",
   reportsTo: "Eliot",
   reportingHierarchy: [
      { _id: 2, name: "Eliot", reportsTo: "Dev" },
      { _id: 1, name: "Dev" }
   ]
}
{
   _id: 5,
   name: "Asya",
   reportsTo: "Ron",
   reportingHierarchy: [
      { _id: 2, name: "Eliot", reportsTo: "Dev" },
      { _id: 3, name: "Ron", reportsTo: "Eliot" },
      { _id: 1, name: "Dev" }
   ]
}
{
   _id: 6,
   name: "Dan",
   reportsTo: "Andrew",
   reportingHierarchy: [
      { _id: 4, name: "Andrew", reportsTo: "Eliot" },
      { _id: 2, name: "Eliot", reportsTo: "Dev" },
      { _id: 1, name: "Dev" }
   ]
}

The following table provides a traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:

Start value The reportsTo value of the document: { ... reportsTo: "Ron" }
Depth 0 { _id: 3, name: "Ron", reportsTo: "Eliot" }
Depth 1 { _id: 2, name: "Eliot", reportsTo: "Dev" }
Depth 2 { _id: 1, name: "Dev" }

The output generates the hierarchy Asya -> Ron -> Eliot -> Dev.
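
The reportingHierarchy arrays in the output above are not returned in path order (Asya's array lists Eliot before Ron). One way to order them, sketched below, is to record the recursion depth with depthField and then sort on it with $sortArray. The depth field name level is a hypothetical choice, and the sketch assumes a MongoDB version that supports $sortArray (5.2 or later).

```javascript
// Sketch: order each reportingHierarchy array from nearest to farthest manager.
// "level" is a hypothetical name for the depth field.
const pipeline = [
  {
    $graphLookup: {
      from: "employees",
      startWith: "$reportsTo",
      connectFromField: "reportsTo",
      connectToField: "name",
      depthField: "level",
      as: "reportingHierarchy"
    }
  },
  {
    // Replace the unordered array with a copy sorted by ascending depth.
    $set: {
      reportingHierarchy: {
        $sortArray: { input: "$reportingHierarchy", sortBy: { level: 1 } }
      }
    }
  }
];

// In mongosh: db.employees.aggregate( pipeline )
```

With this pipeline, Asya's array would be expected to list Ron (depth 0), then Eliot (depth 1), then Dev (depth 2), each with an added level field.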

Like $lookup, $graphLookup can access another collection in the same database.

For example, create a database with two collections:

db.airports.insertMany( [
   { _id: 0, airport: "JFK", connects: [ "BOS", "ORD" ] },
   { _id: 1, airport: "BOS", connects: [ "JFK", "PWM" ] },
   { _id: 2, airport: "ORD", connects: [ "JFK" ] },
   { _id: 3, airport: "PWM", connects: [ "BOS", "LHR" ] },
   { _id: 4, airport: "LHR", connects: [ "PWM" ] }
] )

db.travelers.insertMany( [
   { _id: 1, name: "Dev", nearestAirport: "JFK" },
   { _id: 2, name: "Eliot", nearestAirport: "JFK" },
   { _id: 3, name: "Jeff", nearestAirport: "BOS" }
] )

For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field to the airport field. The operation specifies a maximum recursion depth of 2.


db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )

The output resembles the following results:


{
   _id: 1,
   name: "Dev",
   nearestAirport: "JFK",
   destinations: [
      { _id: 3,
        airport: "PWM",
        connects: [ "BOS", "LHR" ],
        numConnections: NumberLong(2) },
      { _id: 2,
        airport: "ORD",
        connects: [ "JFK" ],
        numConnections: NumberLong(1) },
      { _id: 1,
        airport: "BOS",
        connects: [ "JFK", "PWM" ],
        numConnections: NumberLong(1) },
      { _id: 0,
        airport: "JFK",
        connects: [ "BOS", "ORD" ],
        numConnections: NumberLong(0) }
   ]
}
{
   _id: 2,
   name: "Eliot",
   nearestAirport: "JFK",
   destinations: [
      { _id: 3,
        airport: "PWM",
        connects: [ "BOS", "LHR" ],
        numConnections: NumberLong(2) },
      { _id: 2,
        airport: "ORD",
        connects: [ "JFK" ],
        numConnections: NumberLong(1) },
      { _id: 1,
        airport: "BOS",
        connects: [ "JFK", "PWM" ],
        numConnections: NumberLong(1) },
      { _id: 0,
        airport: "JFK",
        connects: [ "BOS", "ORD" ],
        numConnections: NumberLong(0) }
   ]
}
{
   _id: 3,
   name: "Jeff",
   nearestAirport: "BOS",
   destinations: [
      { _id: 2,
        airport: "ORD",
        connects: [ "JFK" ],
        numConnections: NumberLong(2) },
      { _id: 3,
        airport: "PWM",
        connects: [ "BOS", "LHR" ],
        numConnections: NumberLong(1) },
      { _id: 4,
        airport: "LHR",
        connects: [ "PWM" ],
        numConnections: NumberLong(2) },
      { _id: 0,
        airport: "JFK",
        connects: [ "BOS", "ORD" ],
        numConnections: NumberLong(1) },
      { _id: 1,
        airport: "BOS",
        connects: [ "JFK", "PWM" ],
        numConnections: NumberLong(0) }
   ]
}

The following table provides a traversal path for the recursive search, up to depth 2, where the starting airport is JFK:

Start value The nearestAirport value from the travelers collection: { ... nearestAirport: "JFK" }
Depth 0 { _id: 0, airport: "JFK", connects: [ "BOS", "ORD" ] }
Depth 1 { _id: 1, airport: "BOS", connects: [ "JFK", "PWM" ] }, { _id: 2, airport: "ORD", connects: [ "JFK" ] }
Depth 2 { _id: 3, airport: "PWM", connects: [ "BOS", "LHR" ] }
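
The table above can be reproduced from the query output by grouping the destinations array on its depthField. The following client-side sketch does this in plain JavaScript for the first result document (traveler Dev). The variable names are hypothetical, and plain numbers stand in for the NumberLong values shown in the output.

```javascript
// Dev's destinations, copied from the output above.
// Plain numbers stand in for NumberLong values.
const destinations = [
  { _id: 3, airport: "PWM", connects: [ "BOS", "LHR" ], numConnections: 2 },
  { _id: 2, airport: "ORD", connects: [ "JFK" ], numConnections: 1 },
  { _id: 1, airport: "BOS", connects: [ "JFK", "PWM" ], numConnections: 1 },
  { _id: 0, airport: "JFK", connects: [ "BOS", "ORD" ], numConnections: 0 }
];

// Group airport codes by the recursion depth at which each was reached.
const byDepth = {};
for (const { airport, numConnections } of destinations) {
  byDepth[numConnections] = byDepth[numConnections] || [];
  byDepth[numConnections].push(airport);
}

// The array order is not guaranteed, so sort the codes within each depth.
for (const codes of Object.values(byDepth)) codes.sort();

console.log(JSON.stringify(byDepth));
// {"0":["JFK"],"1":["BOS","ORD"],"2":["PWM"]}
```

LHR is absent because it would first be reached at depth 3, which exceeds the maxDepth of 2 when the search starts at JFK.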

The following example uses a collection with a set of documents containing names of people along with arrays of their friends and their hobbies. An aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.

A collection named people contains the following documents:


db.people.insertMany( [
   {
      _id: 1,
      name: "Tanya Jordan",
      friends: [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
      hobbies: [ "tennis", "unicycling", "golf" ]
   },
   {
      _id: 2,
      name: "Carole Hale",
      friends: [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
      hobbies: [ "archery", "golf", "woodworking" ]
   },
   {
      _id: 3,
      name: "Terry Hawkins",
      friends: [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
      hobbies: [ "knitting", "frisbee" ]
   },
   {
      _id: 4,
      name: "Joseph Dennis",
      friends: [ "Angelo Ward", "Carole Hale" ],
      hobbies: [ "tennis", "golf", "topiary" ]
   },
   {
      _id: 5,
      name: "Angelo Ward",
      friends: [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
      hobbies: [ "travel", "ceramics", "golf" ]
   },
   {
      _id: 6,
      name: "Shirley Soto",
      friends: [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
      hobbies: [ "frisbee", "set theory" ]
   }
] )

The following aggregation operation uses three stages:


db.people.aggregate( [
  { $match: { "name": "Tanya Jordan" } },
  { $graphLookup: {
      from: "people",
      startWith: "$friends",
      connectFromField: "friends",
      connectToField: "name",
      as: "golfers",
      restrictSearchWithMatch: { "hobbies" : "golf" }
    }
  },
  { $project: {
      "name": 1,
      "friends": 1,
      "connections who play golf": "$golfers.name"
    }
  }
] )

The operation returns the following document:


{
   _id: 1,
   name: "Tanya Jordan",
   friends: [
      "Shirley Soto",
      "Terry Hawkins",
      "Carole Hale"
   ],
   'connections who play golf': [
      "Joseph Dennis",
      "Tanya Jordan",
      "Angelo Ward",
      "Carole Hale"
   ]
}

A collection named employees has the following documents:


{ _id: 1, name: "Dev" },
{ _id: 2, name: "Eliot", reportsTo: "Dev" },
{ _id: 3, name: "Ron", reportsTo: "Eliot" },
{ _id: 4, name: "Andrew", reportsTo: "Eliot" },
{ _id: 5, name: "Asya", reportsTo: "Ron" },
{ _id: 6, name: "Dan", reportsTo: "Andrew" }

The following Employee class models documents in the employees collection:


public class Employee
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
    public Employee ReportsTo { get; set; }
    public List<Employee> ReportingHierarchy { get; set; }
    public List<string> Hobbies { get; set; }
}

To use the MongoDB .NET/C# driver to add a $graphLookup stage to an aggregation pipeline, call the GraphLookup() method on a PipelineDefinition object.

The following example creates a pipeline stage that recursively matches on the ReportsTo and Name fields in the employees collection, returning the reporting hierarchy for each person:


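// Same collection handle the later examples use; client is an existing MongoClient.
var employeeCollection = client.GetDatabase("aggregation_examples").GetCollection<Employee>("employees");

var pipeline = new EmptyPipelineDefinition<Employee>()
    .GraphLookup<Employee, Employee, Employee, Employee, string, Employee, List<Employee>, Employee>(
        from: employeeCollection,
        connectFromField: e => e.ReportsTo,
        connectToField: e => e.Name,
        startWith: e => e.ReportsTo,
        @as: e => e.ReportingHierarchy);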

You can use an AggregateGraphLookupOptions object to specify the depth to recurse and the name of the depth field. The following code example performs the same $graphLookup operation as the previous example, but specifies a maximum recursion depth of 1:


var employeeCollection = client.GetDatabase("aggregation_examples").GetCollection<Employee>("employees");

var pipeline = new EmptyPipelineDefinition<Employee>()
    .GraphLookup<Employee, Employee, Employee, Employee, string, Employee, List<Employee>, Employee>(
        from: employeeCollection,
        connectFromField: e => e.ReportsTo,
        connectToField: e => e.Name,
        startWith: e => e.ReportsTo,
        @as: e => e.ReportingHierarchy,
        new AggregateGraphLookupOptions<Employee, Employee, Employee>
        {
            MaxDepth = 1
        });

You can also use an AggregateGraphLookupOptions object to specify a filter that documents must match in order for MongoDB to include them in your search. The following code example performs the same $graphLookup operation as the previous examples, but includes only Employee documents where the Hobbies field contains "golf":


var employeeCollection = client.GetDatabase("aggregation_examples").GetCollection<Employee>("employees");

var pipeline = new EmptyPipelineDefinition<Employee>()
    .GraphLookup<Employee, Employee, Employee, Employee, string, Employee, List<Employee>, Employee>(
        from: employeeCollection,
        connectFromField: e => e.ReportsTo,
        connectToField: e => e.Name,
        startWith: e => e.ReportsTo,
        @as: e => e.ReportingHierarchy,
        new AggregateGraphLookupOptions<Employee, Employee, Employee>
        {
            MaxDepth = 1,
            RestrictSearchWithMatch = Builders<Employee>.Filter.AnyEq(e => e.Hobbies, "golf")
        });

Webinar: Working with Graph Data in MongoDB