How to best handle mongoose references

I’m looking for advice on how to model references in a mongoose / Feathers / DoneJS application.

Consider a schema that looks like:

const monthlyOsProjectSchema = new Schema({
      osProjectId: { type: Schema.Types.ObjectId, ref: 'os_project' },
      significance: Number
});

const contributionMonthSchema = new Schema({
      date: { type: Date, default: Date.now },
      monthlyOSProjects: [ monthlyOsProjectSchema ]
})

In short, a contribution month looks like:

{
  date: "10/20/1982",
  monthlyOSProjects: [
    {osProjectId: "23nh2dso8s", significance: 100}
  ]
}

The problem I’m dealing with is that if you use $populate, as in $populate=monthlyOSProjects.osProjectId, then osProjectId comes back as an osProject object instead of a string:

{
  date: "10/20/1982",
  monthlyOSProjects: [
    {osProjectId: {_id: "23nh2dso8s", name: "CanJS"}, significance: 100}
  ]
}

If I’m trying to create useful models that work regardless of whether $populate is there, I might have something like:

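const MonthlyOSProjects = DefineList.extend({
  // Returns the index of the entry whose osProjectId matches, or -1.
  // The strict comparison assumes osProjectId is a plain string.
  indexOfProjectById: function(osProjectId){
    for(var i = 0; i < this.length; i++){
      if(this.get(i).osProjectId === osProjectId) {
        return i;
      }
    }
    return -1;
  }
});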

If osProjectId suddenly becomes an object instead of a string, this no longer works. Ideally, there would be a:

  • monthlyOSProjects[0].osProjectId no matter what, and a
  • monthlyOSProjects[0].osProject if someone wrote $populate=monthlyOSProjects.osProject

I could accomplish some of this right now by using parseInstanceData to check whether osProjectId is an object and, if it is, create an osProject on the response and set osProjectId = osProjectId._id. However, I don’t like writing $populate=monthlyOSProjects.osProjectId. So I could rewrite my schema and parseInstanceData to account for this.
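A minimal sketch of that parseInstanceData idea, written as plain functions so it runs outside can-connect. The names normalizeMonthlyOSProject and parseContributionMonth are hypothetical, and the data shapes are the ones shown above:

```javascript
// Hypothetical helper: split a possibly populated reference into an id
// field (always a string) and an optional populated object.
function normalizeMonthlyOSProject(data) {
  const ref = data.osProjectId;
  if (ref !== null && typeof ref === 'object') {
    // $populate replaced the id with the full document.
    return Object.assign({}, data, { osProject: ref, osProjectId: ref._id });
  }
  return data;
}

// Sketch of a parseInstanceData-style hook for a contribution month.
function parseContributionMonth(data) {
  return Object.assign({}, data, {
    monthlyOSProjects: (data.monthlyOSProjects || []).map(normalizeMonthlyOSProject)
  });
}

const populated = parseContributionMonth({
  date: '10/20/1982',
  monthlyOSProjects: [
    { osProjectId: { _id: '23nh2dso8s', name: 'CanJS' }, significance: 100 }
  ]
});
console.log(populated.monthlyOSProjects[0].osProjectId); // "23nh2dso8s"
console.log(populated.monthlyOSProjects[0].osProject.name); // "CanJS"
```

With the data normalized this way, a strict comparison against osProjectId keeps working whether or not $populate was used.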

I’m wondering if there are other solutions to this. I considered having a ref type where you could do:

  • monthlyOSProjects[0].osProjectRef._id no matter what, and a
  • monthlyOSProjects[0].osProjectRef.osProject if someone wrote $populate=monthlyOSProjects.osProjectRef; if they didn’t, a getter could retrieve it.
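To make the ref-type idea concrete, here is a rough sketch in plain JavaScript. The Ref class and its load callback are hypothetical, not an existing CanJS or mongoose API, and the loader is synchronous only to keep the example short (a real one would presumably return a promise):

```javascript
// Hypothetical Ref wrapper: `_id` is always available; the referenced
// object is either supplied by $populate or fetched lazily on first access.
class Ref {
  constructor(raw, load) {
    const isPopulated = raw !== null && typeof raw === 'object';
    this._id = isPopulated ? raw._id : raw;
    this._value = isPopulated ? raw : undefined;
    this._load = load; // assumed: function(id) returning the referenced object
  }

  get osProject() {
    if (this._value === undefined) {
      // Not populated by the server, so retrieve it once and remember it.
      this._value = this._load(this._id);
    }
    return this._value;
  }
}

// In-memory stand-in for a service call, counting how often it is hit.
const projects = { '23nh2dso8s': { _id: '23nh2dso8s', name: 'CanJS' } };
let loads = 0;
const loadProject = (id) => { loads += 1; return projects[id]; };

const lazyRef = new Ref('23nh2dso8s', loadProject);
const populatedRef = new Ref({ _id: '23nh2dso8s', name: 'CanJS' }, loadProject);

console.log(lazyRef._id === populatedRef._id); // true
console.log(lazyRef.osProject.name); // "CanJS"
```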

Welcome to the NoSQL side! About time :wink:

First of all, you should reconsider your schema conventions. The model name should be PascalCase, whereas the property name on the foreign schema should simply be the camelCase of the model name.

When you’re saving data, these are identical:

[
  {
    "_id" : "57567abf5a3d660100dcff34",
    "segments" : [
      {
        "clip" : "56b998783679ac958bdb8fd7"
      },
      {
        "clip" : "56b998793679ac958bdb8fe3"
      },
      {
        "clip" : "56b9987b3679ac958bdb8ff3"
      },
      {
        "clip" : "56b998783679ac958bdb8fd4"
      }
    ],
    "visibility" : 3,
    "__v" : 133
  }
]
// expanded _id
[
  {
    "_id" : "57567abf5a3d660100dcff34",
    "segments" : [
      {
        "clip" : {
          "_id": "56e97aee3679ac958bdca023"
        }
      },
      {
        "clip" : {
          "_id": "56b98cbb803c017d717727d7"
        }
      },
      {
        "clip" : {
          "_id": "56b9995d3679ac958bdb947d"
        }
      }
    ],
    "visibility" : 3,
    "__v" : 133
  }
]

model.set will accept either the expanded or the collapsed syntax. The clients should also be able to handle either format, and can use parseInstanceData to convert string IDs into an expanded object with _id set.

This approach has the advantage that you’re always passing an object into Type converters, and your serialization is consistent. You can also always expect something like osProject._id.

define: {
  osProject: {
    Type: OsProject,
    serialize: function( val ) {
      if (val && !val.isNew()) {
        return val._id;
      }
    }
  }
}

Always convert to instances, always serialize as strings

parseInstanceData: function( data ) {
  if (typeof data.osProject === 'string') {
    data.osProject = {
      _id: data.osProject
    };
  }

  return data;
}

Ideally, the code above would also look up pre-existing references with the same _id, meaning you could lazily side-load the data client-side. That also has the benefit of hooking up cached copies, assuming all saves return de-populated data.
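One way to picture that lookup is a small client-side store keyed by _id. This is only a sketch: osProjectStore and hydrateOsProject are hypothetical names, and a real app would more likely lean on whatever instance store its connection layer already provides:

```javascript
// Hypothetical client-side store of OsProject data, keyed by _id.
const osProjectStore = new Map();

// Accepts either a string id or an (expanded) object and returns the
// single shared copy for that _id, merging in any new fields.
function hydrateOsProject(ref) {
  const data = typeof ref === 'string' ? { _id: ref } : ref;
  const existing = osProjectStore.get(data._id);
  if (existing) {
    // A bare { _id } adds nothing, so cached fields are never lost here.
    return Object.assign(existing, data);
  }
  const copy = Object.assign({}, data);
  osProjectStore.set(copy._id, copy);
  return copy;
}

const full = hydrateOsProject({ _id: '23nh2dso8s', name: 'CanJS' });
const fromId = hydrateOsProject('23nh2dso8s');

console.log(fromId === full); // true
console.log(fromId.name); // "CanJS"
```

Because a de-populated save response only carries the id, it resolves to the cached copy instead of replacing it.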

Additionally, you might want to reconsider this part:

const contributionMonthSchema = new Schema({
  date: { type: Date, default: Date.now },
  monthlyOSProjects: [ monthlyOsProjectSchema ]
})

People often get overzealous about subdocument arrays for problems they can otherwise solve by querying the primary collection with a findAll. If you can model your data to accomplish this sort of thing with a findAll, even if that means you’re including additional fields, you’re likely better off. Not only could this accommodate more flexible queries, but your data will likely be easier to run through an aggregate pipeline / mapReduce. I might add, in your specific case, simply adding a date field, say createdAt, would let you aggregate your first schema into the format of your second already.
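To illustrate that last point, here is the grouping done in plain JavaScript over flat documents. On the server this would presumably be a $group stage in an aggregate pipeline instead. The groupByMonth helper, the sample ids, and the month key format are all assumptions made for the sketch:

```javascript
// Groups flat documents (one per project per month, each with createdAt)
// into the nested "contribution month" shape.
function groupByMonth(docs) {
  const months = new Map();
  for (const doc of docs) {
    const d = new Date(doc.createdAt);
    // Key by UTC year and month, e.g. "1982-10".
    const key = d.getUTCFullYear() + '-' + String(d.getUTCMonth() + 1).padStart(2, '0');
    if (!months.has(key)) {
      months.set(key, { month: key, monthlyOSProjects: [] });
    }
    months.get(key).monthlyOSProjects.push({
      osProject: doc.osProject,
      significance: doc.significance
    });
  }
  return Array.from(months.values());
}

const months = groupByMonth([
  { osProject: 'a1', significance: 100, createdAt: '1982-10-20T00:00:00Z' },
  { osProject: 'b2', significance: 50, createdAt: '1982-10-25T00:00:00Z' },
  { osProject: 'a1', significance: 75, createdAt: '1982-11-02T00:00:00Z' }
]);

console.log(months.map((m) => m.month)); // [ '1982-10', '1982-11' ]
console.log(months[0].monthlyOSProjects.length); // 2
```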

Forgot to mention this, but the convention would generally be to use OsProject as the name of the constructor, the ref, and the name passed into the mongoose.model('OsProject', schema); call.

const monthlyOsProjectSchema = new Schema({
  osProjectId: { type: Schema.Types.ObjectId, ref: 'os_project' },
  significance: Number
});

====>

const monthlyOsProjectSchema = new Schema({
  osProject: {
    type: Schema.Types.ObjectId,
    ref: 'OsProject'
  },
  significance: {
    type: Number,
    min: 0,
    default: 0,
    required: true
  }
});

mongoose.model('MonthlyOsProject', monthlyOsProjectSchema);