node.js - What are the efficiency costs associated with using a custom ID in MongoDB?


I plan on using this npm package (shortid) to produce shorter IDs for use in URLs, and I wish to use them directly as the MongoDB _id (at least for some collections).

What are the costs associated with using custom IDs? Will it affect lookup time, write time, etc. in any significant way?

These types of questions can wander off into a battle of opinions, so rather than stating my opinion, I think providing the pros and cons and letting you decide what is better for your application makes more sense.

Assuming the "shortid" is stored as a string, I think this response by Abigail Watson to a similar question on Google Groups sums up most of the larger points. The response is aimed at Meteor apps, and some of the pros/cons are tied to design decisions made by the Meteor team, but you can see how you should be thinking about whether to use ObjectId or "shortid" as an application-based decision.

Her entire response:

ObjectId pros

  • It has an embedded timestamp in it.
  • It's the default Mongo _id type; ubiquitous
  • Interoperability with other apps and drivers

ObjectId cons

  • It's an object, and a little more difficult to manipulate in practice.
  • There are times when you forget to wrap your string in new ObjectId()
  • It requires server-side object creation to maintain _id uniqueness
  • Which makes generating them client-side in minimongo problematic

String pros

  • Developers can create domain-specific _id topologies

String cons

  • Developer has to ensure uniqueness of _ids
  • findAndModify() and getNextSequence() queries may be invalidated

Meteor's choice to go with String, as I understand it, boils down to latency compensation and being able to generate an _id on the client side in Mini-Mongo. The default ObjectId implementation didn't lend itself to being generated on the client as part of the latency compensation framework, so they decided to roll their own _id scheme.

Personally, I find the embedded timestamps in ObjectIds invaluable later in an application's lifecycle. They're more difficult to manipulate, and they add more debugging time to an application's development cycle. But the 10 or 20 hours put into debugging ObjectIds can return 10x or 100x savings down the road. Example: at work, we salvaged a year's worth of production data because of the embedded timestamps, which has saved hundreds of thousands of dollars of R&D time and effort.
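To make the embedded-timestamp point concrete, here is a minimal sketch of reading the creation time out of an ObjectId's hex string. It assumes the standard 12-byte ObjectId layout, in which the first 4 bytes are a big-endian count of seconds since the Unix epoch. The function name `objectIdToDate` and the sample id are illustrative only; with the Node.js driver you would more likely call `getTimestamp()` on the ObjectId itself.

```javascript
// Sketch: recover the creation time embedded in an ObjectId hex string.
// Assumption: standard 12-byte layout, first 4 bytes = seconds since epoch.
function objectIdToDate(hex) {
  if (typeof hex !== 'string' || !/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError('expected a 24-character hex string');
  }
  // The first 8 hex characters are the 4-byte timestamp.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Illustrative id, not taken from real data.
const createdAt = objectIdToDate('507f1f77bcf86cd799439011');
console.log(createdAt.toISOString());
```

Because the timestamp leads the id, sorting by _id also sorts roughly by creation time, which is part of why ObjectIds suit time-series collections.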

ObjectIds are great if you can ensure there's one central authority generating them. They're the preferred index type for any type of time-series data. And while it may seem tempting to try to make a one-or-the-other decision for the entire app, I find that choosing String vs ObjectId (vs some other index scheme) boils down to the topology of the data in the collection.

Some useful questions to maybe ask when choosing an _id for a collection:

  • Does the data in the collection need latency compensation?
  • Is it time-series data?
  • Will other applications or worker utilities be accessing the collection?
  • What is the topology of the data in the collection?

https://groups.google.com/d/msg/meteor-talk/f-ljbdzowpk/oqyzqxcakn8j

My 2 cents to throw into the mix: if the main reason to use "shortid" is shorter URLs, why not create a URL property that is indexed and used for fetching documents by URL id? That way you keep the ObjectId and don't have to worry about sharding or dependency issues down the road, while still having a shorter URL id value.
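A rough sketch of that approach follows. The field name `urlId` and the helper names are hypothetical, and `collection` is assumed to be a Node.js driver Collection (only its `createIndex` and `findOne` methods are used). The id generator below uses random hex for simplicity; the shortid package could be swapped in instead.

```javascript
const crypto = require('crypto');

// Hypothetical helper: a short random id for use in URLs.
// 6 random bytes give a 12-character hex string.
function makeUrlId(bytes = 6) {
  return crypto.randomBytes(bytes).toString('hex');
}

// Return a copy of the document with a `urlId` field added.
// `_id` is left untouched, so MongoDB still assigns an ObjectId on insert.
function withUrlId(doc) {
  return { ...doc, urlId: makeUrlId() };
}

// Unique index so lookups by urlId are fast and collisions are rejected.
function ensureUrlIdIndex(collection) {
  return collection.createIndex({ urlId: 1 }, { unique: true });
}

// Fetch a document by the short id taken from the URL.
function findByUrlId(collection, urlId) {
  return collection.findOne({ urlId });
}
```

The trade-off is one extra index per collection, which adds some write and storage overhead in exchange for keeping the default _id.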

