ArangoDB CPU performance differences while running the same application on two separate machines


I have developed a Foxx application that runs on machine A. CPU utilization there is below 3-4%, with spikes to 20%. It has close to 6 million records.

The same application is deployed on machine B (an exact replica of machine A), which has only 100k records, yet CPU utilization is at around 200%.

How can I debug this? It is only happening on machine B. Both machines have the same application, the same ArangoDB version, and the same configuration. Disk I/O is the same, and memory utilization on machine B is 1/6th of that on machine A.

Any pointers? This is happening in a production environment, so it is important for me to debug it quickly.

We were able to reproduce such an issue ourselves. We found there was a situation in which a scheduler thread would go into a busy wait state, resulting in the following loop being executed over and over:

  1. a scheduler thread calling epoll_wait()
  2. epoll_wait() returning instantly, signalling an event for a file descriptor
  3. the correct event handling callback being called, but not removing the file descriptor from the list of watched descriptors
  4. goto 1

As the file descriptor was not cleared from the list of watched descriptors, epoll_wait() kept signalling that an event for that file descriptor was available. This made it return instantly, and the whole loop above was executed many times per second. This caused the CPU spikes in the threads named scheduler.
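The busy wait described above can be illustrated with a small sketch. This is not ArangoDB's scheduler code; it is a Python analogue that uses the standard `selectors` module (backed by epoll on Linux) and a socket pair. The function name `count_wakeups` and its parameters are made up for illustration. One end of the pair is closed to simulate a client going away, and the loop either leaves the dead descriptor registered (the buggy behaviour) or unregisters it (the fix):

```python
import selectors
import socket


def count_wakeups(unregister_on_close, max_iterations=1000):
    """Count how often the selector reports the socket as ready.

    Hypothetical helper for illustration only, not ArangoDB code.
    """
    sel = selectors.DefaultSelector()
    server_side, client_side = socket.socketpair()
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ)

    # The "client" disconnects; the server-side descriptor now reports EOF.
    client_side.close()

    wakeups = 0
    for _ in range(max_iterations):
        events = sel.select(timeout=0.05)
        if not events:
            break
        for key, _mask in events:
            wakeups += 1
            data = key.fileobj.recv(4096)
            # An empty read means the peer closed the connection.
            # Without the unregister call, the level-triggered selector
            # keeps reporting the same descriptor on every iteration.
            if data == b"" and unregister_on_close:
                sel.unregister(key.fileobj)
        if not sel.get_map():
            break  # nothing left to watch

    sel.close()
    server_side.close()
    return wakeups
```

Calling `count_wakeups(False)` runs until `max_iterations` is exhausted, because the selector returns instantly every time, while `count_wakeups(True)` wakes up once and stops. The cap on iterations is only there so that the buggy variant terminates; a real event loop would keep spinning and burn CPU.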

We found that one reason for this was a client-side connection timing out while the operation triggered by that connection was still being executed on the server side. For example, if a client called a server route that took 5 seconds to complete and respond, and the client disconnected after 3 seconds, this might have happened. What made it hard to reproduce is that it did not affect all such client connections, only some of them, and which ones is still unclear.
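A rough way to recreate the client side of this scenario is sketched below. It is an assumption-laden stand-in, not a reproduction against ArangoDB: the "server" is a local thread that sleeps instead of running a Foxx route, the `/slow` path and the function name `simulate_early_disconnect` are hypothetical, and the 5 and 3 second durations are scaled down so the example finishes quickly.

```python
import socket
import threading
import time


def simulate_early_disconnect(work_seconds=0.5, client_timeout=0.2):
    """Client gives up before a slow server-side operation finishes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}

    def server():
        conn, _addr = listener.accept()
        with conn:
            conn.recv(1024)           # read the request
            time.sleep(work_seconds)  # stand-in for the slow route
            # The client has gone away by now, so a read reports EOF.
            result["peer_closed"] = conn.recv(1024) == b""

    worker = threading.Thread(target=server)
    worker.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.settimeout(client_timeout)
    client.sendall(b"GET /slow HTTP/1.0\r\n\r\n")
    try:
        client.recv(1024)
        result["client_timed_out"] = False
    except socket.timeout:
        result["client_timed_out"] = True
    finally:
        client.close()  # disconnect while the server is still working

    worker.join()
    listener.close()
    return result
```

With the default arguments the client times out first and the server later sees the closed connection. Pointing a similar client at a slow route on the affected server, while watching the CPU usage of the scheduler threads, may help confirm whether this is the cause on machine B.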

This particular issue is fixed in ArangoDB 2.6.5, so you may want to give it a try when it is released.

