
Walkthrough: Deploying a Hugging Face Model as a Worker Node

Overview

This guide shows how to deploy a Hugging Face model as an Allora Network worker. We'll use the Chronos time-series forecasting model to predict BTC prices on Topic 4.

All files are available in this repository.

About Chronos

Chronos (amazon/chronos-t5-tiny) is a pretrained time-series forecasting model based on language model architectures. It:

  • Transforms time series into token sequences via scaling and quantization
  • Trains on these tokens using cross-entropy loss
  • Generates probabilistic forecasts by sampling multiple future trajectories

Chronos is trained on public time-series data and synthetic data from Gaussian processes, enabling zero-shot forecasting on unseen datasets.
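
The scaling-and-quantization step above can be pictured with a short schematic. This is illustrative only; the bin count and mean scaling here are assumptions, not Chronos's exact scheme:

import torch

def tokenize_series(series: torch.Tensor, num_bins: int = 16) -> torch.Tensor:
    """Schematic of scaling + quantization; not Chronos's exact scheme."""
    scaled = series / series.abs().mean()                    # mean scaling
    lo, hi = scaled.min().item(), scaled.max().item()
    edges = torch.linspace(lo, hi, num_bins + 1)[1:-1]       # interior bin edges
    return torch.bucketize(scaled, edges)                    # discrete token ids

# Each price becomes a token a language model can be trained on
print(tokenize_series(torch.tensor([10.0, 12.0, 11.0, 15.0, 14.0])))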

This walkthrough uses zero-shot forecasting, meaning no additional training is required for new datasets.
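
A minimal zero-shot check, assuming the chronos-forecasting package and torch are installed (the sine-wave input is an arbitrary stand-in for real data):

import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-tiny",
    device_map="cpu",
    torch_dtype=torch.bfloat16,
)

# Any 1-D series works without retraining
context = torch.sin(torch.arange(64, dtype=torch.float32))

# forecast shape: [num_series, num_samples, prediction_length]
forecast = pipeline.predict(context, prediction_length=8)
print(forecast[0].mean(dim=0))  # mean across sampled future trajectories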

Prerequisites

  • Docker and Docker Compose installed
  • An Allora wallet created during wallet setup (the key name and mnemonic are needed in Step 2)
  • A CoinGecko API key for fetching historical price data

Setup

Step 1: Clone Repository

git clone https://github.com/allora-network/basic-coin-prediction-node
cd basic-coin-prediction-node

Step 2: Configure Network

  1. Copy config.example.json to config.json
  2. Update the following fields:

wallet

  • nodeRpc: RPC endpoint URL for the Allora network you are connecting to
  • addressKeyName: Wallet key name created during wallet setup
  • addressRestoreMnemonic: Mnemonic phrase used to restore the wallet
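
A sketch of the resulting wallet block; the key name, mnemonic, and RPC URL shown are placeholders, and config.example.json may contain additional fields:

"wallet": {
  "nodeRpc": "<your-network-rpc-url>",
  "addressKeyName": "my-allora-key",
  "addressRestoreMnemonic": "<your mnemonic phrase>"
}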

worker

Topic configuration array. For BTC 24h prediction on Topic 4:

"worker": [
  {
    "topicId": 4,
    "inferenceEntrypointName": "api-worker-reputer",
    "loopSeconds": 5,
    "parameters": {
      "InferenceEndpoint": "http://localhost:8000/inference/{Token}",
      "Token": "BTC"
    }
  }
]
⚠️

The worker array supports multiple topics. Duplicate the configuration for each additional topic, updating topicId, InferenceEndpoint, and Token accordingly.
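
For example, a two-topic sketch; the second entry's topicId and Token are hypothetical and should be replaced with a real topic:

"worker": [
  {
    "topicId": 4,
    "inferenceEntrypointName": "api-worker-reputer",
    "loopSeconds": 5,
    "parameters": {
      "InferenceEndpoint": "http://inference:8000/inference/{Token}",
      "Token": "BTC"
    }
  },
  {
    "topicId": 2,
    "inferenceEntrypointName": "api-worker-reputer",
    "loopSeconds": 5,
    "parameters": {
      "InferenceEndpoint": "http://inference:8000/inference/{Token}",
      "Token": "ETH"
    }
  }
]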

Implementation

Step 3: Create Inference Server

Create app.py with a Flask application serving Chronos model inferences:

from flask import Flask, Response
import requests
import json
import pandas as pd
import torch
from chronos import ChronosPipeline
 
app = Flask(__name__)
model_name = "amazon/chronos-t5-tiny"
 
def get_coingecko_url(token):
    base_url = "https://api.coingecko.com/api/v3/coins/"
    token_map = {
        'ETH': 'ethereum',
        'SOL': 'solana',
        'BTC': 'bitcoin',
        'BNB': 'binancecoin',
        'ARB': 'arbitrum'
    }
 
    token = token.upper()
    if token in token_map:
        return f"{base_url}{token_map[token]}/market_chart?vs_currency=usd&days=30&interval=daily"
    else:
        raise ValueError("Unsupported token")
 
@app.route("/inference/<string:token>")
def get_inference(token):
    """Generate inference for given token."""
    try:
        pipeline = ChronosPipeline.from_pretrained(
            model_name,
            device_map="auto",
            torch_dtype=torch.bfloat16,
        )
    except Exception as e:
        return Response(json.dumps({"pipeline error": str(e)}), status=500, mimetype='application/json')
 
    try:
        url = get_coingecko_url(token)
    except ValueError as e:
        return Response(json.dumps({"error": str(e)}), status=400, mimetype='application/json')
 
    headers = {
        "accept": "application/json",
        "x-cg-demo-api-key": "<Your Coingecko API key>"  # Replace with your API key
    }
 
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        data = response.json()
        df = pd.DataFrame(data["prices"])
        df.columns = ["date", "price"]
        df["date"] = pd.to_datetime(df["date"], unit='ms')
        df = df[:-1]  # Remove today's price
    else:
        return Response(json.dumps({"Failed to retrieve data": str(response.text)}),
                        status=response.status_code,
                        mimetype='application/json')
 
    context = torch.tensor(df["price"])
    prediction_length = 1
 
    try:
        forecast = pipeline.predict(context, prediction_length)
        return Response(str(forecast[0].mean().item()), status=200)
    except Exception as e:
        return Response(json.dumps({"error": str(e)}), status=500, mimetype='application/json')
 
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
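
As noted in the code, loading the model inside the request handler keeps the example self-contained but adds model-loading latency to every request. A common alternative, shown here as a sketch rather than part of the repository's app.py, is to load the pipeline once at module level and reuse it:

# Load once at startup; every request then reuses the same pipeline
pipeline = ChronosPipeline.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

@app.route("/inference/<string:token>")
def get_inference(token):
    ...  # fetch data as above, then call pipeline.predict(...)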

Step 4: Configure Requirements

Create requirements.txt:

Flask==3.0.3
gunicorn==22.0.0
transformers==4.40.1
pandas==2.2.2
torch==2.3.0
git+https://github.com/amazon-science/chronos-forecasting.git
requests==2.32.2

Step 5: Create Dockerfile

Create Dockerfile.inference:

FROM --platform=linux/amd64 python:3.11-slim
 
WORKDIR /app
 
RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
 
COPY requirements.txt .
 
RUN pip install --no-cache-dir -r requirements.txt
 
COPY app.py .
 
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]

Step 6: Update Docker Compose

Add the inference service to docker-compose.yaml:

inference:
  container_name: inference
  build:
    context: .
    dockerfile: Dockerfile.inference
  ports:
    - "8000:8000"
  networks:
    eth-model-local:
      aliases:
        - inference
      ipv4_address: 172.22.0.4

Ensure the network configuration matches, so the worker container can resolve the inference alias used in InferenceEndpoint:

networks:
  eth-model-local:
    driver: bridge
    ipam:
      config:
        - subnet: 172.22.0.0/24

Deployment

Step 7: Initialize Configuration

chmod +x init.config
./init.config

This exports environment variables from config.json.

💡

Rerun init.config whenever you modify config.json:

./init.config

Step 8: Get Testnet Tokens

Request tokens from the Allora Testnet Faucet using your Allora address.

Step 9: Deploy

docker compose up --build

Both the offchain node and inference service will start and communicate through internal Docker DNS.

Verification

Check Deployment

Look for successful nonce checking:

offchain_node | {"level":"debug","topicId":4,"time":1723043600,"message":"Checking for latest open worker nonce on topic"}

Successful inference submission shows:

{"level":"debug","msg":"Send Worker Data to chain","txHash":<tx-hash>,"time":<timestamp>,"message":"Success"}

Test Locally

Test your inference endpoint:

curl http://localhost:8000/inference/BTC

Verify the response contains a numeric prediction.

Customization

To use different Hugging Face models:

  1. Replace model_name in app.py with your chosen model (see the sketch below)
  2. Update the model loading and inference logic as needed for the new model's API
  3. Modify the input data processing to match the model's requirements
  4. Update requirements.txt with any new dependencies
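
For instance, swapping to a larger checkpoint in the same family needs only the model name, since the ChronosPipeline API is unchanged; models outside the Chronos family will also need their own loading and preprocessing code:

# Same ChronosPipeline API, larger published Chronos checkpoint
model_name = "amazon/chronos-t5-small"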

Next Steps

Explore other worker examples | Query worker data | View available topics