gates is the add-on to refinery that enables it to process real-time data streams. With gates, you can use refinery to make predictions on data immediately and use the results to make operational decisions.


To deploy a model via gates, hit the 'Open config' button. This opens a modal showing the automations of your synced refinery project.

Here, you can select which heuristics should be used for inference. At runtime, they are aggregated into a weakly supervised label, while the single heuristic votes also remain available as indicators during inference.
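To illustrate how heuristic votes can be combined into one label, here is a minimal sketch using a simple majority vote. The function name, the heuristic names, and the labels are hypothetical; gates' actual weak-supervision aggregation may weight heuristics differently.

```python
from collections import Counter

def aggregate_votes(votes):
    """Combine individual heuristic votes into a single label.

    `votes` maps heuristic names to their predicted labels,
    or to None when a heuristic abstains.
    """
    cast = [label for label in votes.values() if label is not None]
    if not cast:
        return None  # every heuristic abstained
    label, _count = Counter(cast).most_common(1)[0]
    return label

# the single votes stay inspectable alongside the aggregated label
votes = {
    "regex_visa": "card_request",
    "lookup_cards": "card_request",
    "distant_supervision": None,  # abstained
}
print(aggregate_votes(votes))  # prints card_request
```

In practice you would call the deployed gates model rather than aggregate locally; this only shows the idea behind the aggregated label versus the single votes.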


If you now want to call the model, you can either test the inference in the embedded playground, or integrate the model by copy-pasting the displayed code snippet. For Python, this looks as follows:

import requests

url = "http://localhost:4455/commercial/predict/project/YOUR_PROJECT_ID"

headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "Authorization": "YOUR_PROJECT_ID YOUR_API_KEY"
}

# replace with your own example data
example_data = {
    "running_id": 1,
    "text": "what do i need to do to apply for a visa card"
}

response = requests.post(url, headers=headers, json=example_data)



Each project comes with a minimal monitoring dashboard, which shows you the essential metrics for operations. These are:

  • Requests per hour: how often is your model called?
  • Confidence: how confident is your model during inference?
  • Response time: how long does it take your model to do inference?
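The three dashboard metrics can be reproduced from a plain request log. The following sketch is illustrative only: the log format (timestamp in seconds, confidence, response time in milliseconds) and the one-hour window are assumptions, not the dashboard's actual internals.

```python
from statistics import mean

# hypothetical request log entries: (timestamp_s, confidence, response_time_ms)
log = [
    (0, 0.91, 40),
    (1200, 0.85, 55),
    (2400, 0.78, 62),
]

window_hours = 1  # observation window covered by the log

requests_per_hour = len(log) / window_hours
avg_confidence = mean(conf for _, conf, _ in log)
avg_response_ms = mean(rt for _, _, rt in log)

print(requests_per_hour, round(avg_confidence, 2), round(avg_response_ms, 1))
# prints 3.0 0.85 52.3
```

Low confidence or rising response times in these aggregates are typically the first operational signals to act on.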


Every gates model is deployed in a dedicated Docker container. If you want to run the model on your own premises instead of our hosted cloud, we can export the image and transfer it to you.
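An exported image can then be moved with the standard Docker tooling. The image name `gates-model:latest` and the port below are placeholders for illustration; the actual name, tag, and port of your exported image may differ.

```shell
# on the source machine: write the image to a tar archive
docker save gates-model:latest -o gates-model.tar

# on your own infrastructure: load the archive and run the container
docker load -i gates-model.tar
docker run -p 4455:4455 gates-model:latest
```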