Introduction
Golem is a durable computing platform that makes it simple to build and deploy highly reliable distributed systems.
Golem is highly scalable: it partitions workers across many worker executor nodes, each of which runs a different subset of workers.
Although partitioning workers across many nodes provides the benefit of horizontal scalability, it makes it more difficult to know which node is executing a particular worker.
Even if you know which node is executing a worker, interacting with that node directly is inconvenient: the node could fail, and you would have to implement logic that detects the failure and waits until the worker is recovered on a new node before retrying the invocation.
Moreover, the native protocol for invoking workers is low-level and inflexible, and most developers will not want to expose the invocation API to the outside world or to front-end applications.
To address these issues, Golem provides a Worker Gateway service, which is effectively stateless and scales independently of worker executor nodes.
The primary functions of the Worker Gateway are as follows:
- Identify the node that is responsible for executing the worker being invoked, and route the invocation request to that node.
- Transparently handle executor node failures by detecting failure, awaiting recovery, and retrying the invocation.
- Support the execution of custom APIs that satisfy arbitrary business and technical requirements.
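
The first two functions in the list above, routing and transparent retry, can be illustrated with a minimal sketch. This is not Golem's implementation: the names `Router`, `node_for`, `invoke_with_retry`, and `NodeUnavailable` are hypothetical, and the hash-based shard assignment and exponential backoff are assumptions chosen only to make the idea concrete.

```python
import hashlib
import time


class NodeUnavailable(Exception):
    """Hypothetical error raised when an executor node cannot be reached."""


class Router:
    """Illustrative routing table: maps a worker ID to an executor node."""

    def __init__(self, nodes, num_shards=16):
        self.nodes = list(nodes)
        self.num_shards = num_shards
        self.failed = set()  # nodes currently believed to be down

    def shard_of(self, worker_id):
        # Assumption: a stable hash of the worker ID selects the shard.
        digest = hashlib.sha256(worker_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_shards

    def node_for(self, worker_id):
        # Only healthy nodes are candidates, so a worker whose node failed
        # is reassigned to a surviving node on the next lookup.
        healthy = [n for n in self.nodes if n not in self.failed]
        if not healthy:
            raise NodeUnavailable("no healthy executor nodes")
        return healthy[self.shard_of(worker_id) % len(healthy)]

    def mark_failed(self, node):
        self.failed.add(node)


def invoke_with_retry(router, send, worker_id, payload,
                      max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Route an invocation and retry it if the chosen node is unavailable.

    `send(node, worker_id, payload)` is a caller-supplied function that
    performs the actual call and raises NodeUnavailable on node failure.
    """
    last_error = None
    for attempt in range(max_attempts):
        node = router.node_for(worker_id)
        try:
            return send(node, worker_id, payload)
        except NodeUnavailable as err:
            last_error = err
            router.mark_failed(node)             # detect the failure
            sleep(base_delay * (2 ** attempt))   # wait for recovery
    raise last_error
```

A caller only supplies the worker ID and payload; which node serves the request, and whether a retry happened along the way, stays hidden behind `invoke_with_retry`. That is the property the Worker Gateway provides to its clients.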
To learn more about how the Worker Gateway supports custom APIs, you can read the high-level introduction to API Definitions, which links to further references.