November 1, 2018

Periodic Job Pattern

Some work follows a recurring shape: periodically query data, then send it to a remote API or save it to a file. For example, every night, query yesterday’s purchase data and send it to a vendor’s report service.

Solution

My best practice for this kind of work:

  • A scheduler job runs periodically, e.g. every day at 1:15 am. It only enqueues the query data job with a data range, usually a date-time range, e.g. from yesterday to today.

  • The query data job accepts the data range, queries or calculates the unique identities of all matching data, and enqueues one job per identity. E.g. it queries all matching purchase ids and enqueues a job for each id.

  • Now you have one job per data identity. Each job queries or calculates the data for its identity and sends it to the remote API.
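
The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: the in-memory `queue`, the `PURCHASES` table, the `sent` list and all function names are hypothetical stand-ins for a real job queue, a real database and the vendor's API.

```python
from collections import deque
from datetime import date, timedelta

# Hypothetical in-memory stand-ins for a real job queue and purchase table.
queue = deque()
PURCHASES = {
    101: {"date": date(2018, 10, 31), "amount": 20},
    102: {"date": date(2018, 10, 31), "amount": 35},
    103: {"date": date(2018, 11, 1), "amount": 50},
}
sent = []  # records what would have been sent to the remote API


def schedule_job(today):
    """Step 1: runs on a timer and only computes the date range."""
    start = today - timedelta(days=1)
    queue.append(("query", start, today))


def query_job(start, end):
    """Step 2: finds matching purchase ids and enqueues one job per id."""
    for purchase_id, row in sorted(PURCHASES.items()):
        if start <= row["date"] < end:
            queue.append(("send", purchase_id))


def send_job(purchase_id):
    """Step 3: loads one purchase and sends it (here: appends to a list)."""
    row = PURCHASES[purchase_id]
    sent.append({"id": purchase_id, "amount": row["amount"]})


HANDLERS = {"query": query_job, "send": send_job}


def run_worker():
    """Drains the queue, dispatching each job to its handler."""
    while queue:
        name, *args = queue.popleft()
        HANDLERS[name](*args)


schedule_job(date(2018, 11, 1))
run_worker()
print(sent)
```

Each queue entry is small and carries only what its step needs (a range or a single id), so a real queue could retry any one entry without rerunning the others.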

Reason

  • Any job may throw an exception while executing. If one big job does everything (calculates the date-time range, queries purchases, sends to the remote API), a failure at any point breaks the whole run. When it fails you have to retry it, and you may send duplicate data to the remote API because you don’t know where it stopped. If you break the big job into pieces, you can retry each per-identity job, or the query data job, on its own.

  • It’s easy to identify duplicates. You can keep a record in Redis or a database of which purchase ids have already been processed.
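
A minimal sketch of the duplicate check, assuming a plain Python set stands in for the Redis or database record. The names `processed`, `delivered` and `send_once` are hypothetical, not part of any library.

```python
processed = set()  # stand-in for a Redis set or a database table
delivered = []     # stand-in for the remote API


def send_once(purchase_id, payload):
    """Send a purchase unless its id was already recorded as processed."""
    if purchase_id in processed:
        return False  # duplicate: skip the remote call
    delivered.append(payload)   # the remote call would go here
    processed.add(purchase_id)  # record only after the send succeeded
    return True


first = send_once(101, {"id": 101, "amount": 20})
retry = send_once(101, {"id": 101, "amount": 20})
print(first, retry, len(delivered))
```

Recording the id only after the send means a crash between the two lines could still cause one resend on retry. Whether that is acceptable depends on the remote API.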
