module RocketJob::Server::Model

Model attributes

Public Class Methods

counts_by_state()

Returns [Hash<Symbol:Integer>] with the number of servers in each state. Note: if there are no servers in a particular state, the hash will not contain a key for that state.

Example servers in every state:

RocketJob::Server.counts_by_state
# => {
       :aborted => 1,
       :completed => 37,
       :failed => 1,
       :paused => 3,
       :queued => 4,
       :running => 1,
       :queued_now => 1,
       :scheduled => 3
     }

Example no servers active:

RocketJob::Server.counts_by_state
# => {}
# File lib/rocket_job/server/model.rb, line 59
def self.counts_by_state
  counts = {}
  collection.aggregate([{"$group" => {_id: "$state", count: {"$sum" => 1}}}]).each do |result|
    counts[result["_id"].to_sym] = result["count"]
  end
  counts
end
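The `$group` stage above buckets server documents by `state` and adds one per document. The same counting can be sketched in memory, assuming plain hashes stand in for the MongoDB documents. The `count_by_state` helper and the sample data are hypothetical and not part of RocketJob:

```ruby
# Hypothetical stand-in for the server documents stored in MongoDB.
SAMPLE_SERVERS = [
  {"name" => "host1:1001", "state" => "running"},
  {"name" => "host2:1002", "state" => "running"},
  {"name" => "host3:1003", "state" => "paused"}
].freeze

# Mirrors the $group stage: one key per state seen, incremented per document.
# States with no documents never get a key, as in counts_by_state.
def count_by_state(documents)
  documents.each_with_object({}) do |doc, counts|
    key = doc["state"].to_sym
    counts[key] = counts.fetch(key, 0) + 1
  end
end

counts = count_by_state(SAMPLE_SERVERS)
puts counts[:running]         # 2
puts counts.key?(:stopping)   # false
```

Because absent states have no key, callers that need a zero may prefer `counts.fetch(:stopping, 0)` over `counts[:stopping]`.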
destroy_zombies()

Destroys all zombie servers and requeues any jobs still “running” on those servers. Returns the number of servers destroyed.

# File lib/rocket_job/server/model.rb, line 69
def self.destroy_zombies
  count = 0
  each do |server|
    next unless server.zombie?

    logger.warn "Destroying zombie server #{server.name}, and requeueing its jobs"
    server.destroy
    count += 1
  end
  count
end
zombies(missed = 4)

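Scope for all zombie servers: servers in the stopping, running, or paused state whose heartbeat timestamp is missing or is at least `missed` heartbeat intervals old.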

# File lib/rocket_job/server/model.rb, line 82
def self.zombies(missed = 4)
  dead_seconds        = Config.heartbeat_seconds * missed
  last_heartbeat_time = Time.now - dead_seconds
  where(
    :state.in => %i[stopping running paused],
    "$or"     => [
      {"heartbeat.updated_at" => {"$exists" => false}},
      {"heartbeat.updated_at" => {"$lte" => last_heartbeat_time}}
    ]
  )
end
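The cutoff used by the scope is plain time arithmetic. A minimal sketch, assuming a heartbeat interval of 15 seconds; that value and the `zombie_cutoff` helper are illustrative only, and the real interval comes from `Config.heartbeat_seconds`:

```ruby
# Illustrative interval only; not necessarily the configured value.
EXAMPLE_HEARTBEAT_SECONDS = 15

# Latest heartbeat timestamp that still marks a server as a zombie.
# A heartbeat at or before this time matches the "$lte" condition.
def zombie_cutoff(now, missed = 4, heartbeat_seconds = EXAMPLE_HEARTBEAT_SECONDS)
  now - (heartbeat_seconds * missed)
end

now = Time.utc(2024, 1, 1, 12, 0, 0)
puts zombie_cutoff(now)      # 2024-01-01 11:59:00 UTC
puts zombie_cutoff(now, 2)   # 2024-01-01 11:59:30 UTC
```

Raising `missed` moves the cutoff further into the past, so fewer servers match the scope.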

Public Instance Methods

refresh(worker_count)

Updates the heartbeat timestamp and worker count, and returns a refreshed server instance.

# File lib/rocket_job/server/model.rb, line 110
def refresh(worker_count)
  SemanticLogger.silence(:info) do
    find_and_update(
      "heartbeat.updated_at" => Time.now,
      "heartbeat.workers"    => worker_count
    )
  end
end
zombie?(missed = 4)

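Returns [true|false] whether this server has missed at least the last `missed` heartbeats (default: 4). Servers that are not running, stopping, or paused are never considered zombies; servers in those states with no recorded heartbeat always are.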

Possible causes for a server to miss its heartbeats:

  • The server process has died

  • The server process is “hanging”

  • The server is no longer able to communicate with the MongoDB Server

# File lib/rocket_job/server/model.rb, line 101
def zombie?(missed = 4)
  return false unless running? || stopping? || paused?
  return true if heartbeat.nil? || heartbeat.updated_at.nil?

  dead_seconds = Config.heartbeat_seconds * missed
  (Time.now - heartbeat.updated_at) >= dead_seconds
end
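The decision order in `zombie?` (state check, missing heartbeat, then heartbeat age) can be exercised without MongoDB. This is a sketch only: `FakeServer` is a hypothetical stand-in for the model, and the 15-second interval is an assumed value rather than the configured one:

```ruby
ACTIVE_STATES = %i[running stopping paused].freeze

# Hypothetical stand-in for a server record; not the RocketJob model.
FakeServer = Struct.new(:state, :heartbeat_at) do
  # Same order as zombie?: inactive states first, then missing heartbeat,
  # then compare the heartbeat age against the allowed number of seconds.
  def zombie?(now:, heartbeat_seconds:, missed: 4)
    return false unless ACTIVE_STATES.include?(state)
    return true if heartbeat_at.nil?

    (now - heartbeat_at) >= heartbeat_seconds * missed
  end
end

now      = Time.utc(2024, 1, 1, 12, 0, 0)
fresh    = FakeServer.new(:running, now - 30)
stale    = FakeServer.new(:running, now - 60)
silent   = FakeServer.new(:paused, nil)
starting = FakeServer.new(:starting, nil)

puts fresh.zombie?(now: now, heartbeat_seconds: 15)      # false
puts stale.zombie?(now: now, heartbeat_seconds: 15)      # true
puts silent.zombie?(now: now, heartbeat_seconds: 15)     # true
puts starting.zombie?(now: now, heartbeat_seconds: 15)   # false
```

Passing `now` in explicitly keeps the check deterministic, which is convenient in tests; the real method reads `Time.now` itself.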

Private Instance Methods

requeue_jobs()

Requeues any jobs assigned to this server when it is destroyed.

# File lib/rocket_job/server/model.rb, line 122
def requeue_jobs
  RocketJob::Job.requeue_dead_server(name)
end