Model Backend
The model_backend.yaml file allows node operators to configure a specific model and point it to a different URL (backend).
Template
backend_provider: "custom" # "fxn", "custom", "vllm" or "ollama"
url: "http://your-backend:8082" # required for "custom"
fxn_id: "2" # can be be found here https://www.function.network/models
api_key: "your-api-key" # optional
bearer_token: "your-bearer-token" # optional
Parameters
backend_provider
Description: Specifies the type of backend provider.
Options: "fxn", "custom", "vllm", "ollama"
Type: string
While "custom"
is the primary supported provider for now, we are actively developing a native Function Network inference engine. See "The Future: Custom Inference Engine" below for more details.
url
Description: The URL of your custom backend. This is required when backend_provider is set to "custom".
Type: string
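For example, to proxy requests to a vLLM server running on the same machine, you can pair the "custom" provider with a local URL. This is only a sketch: the host and port below assume vLLM's usual default of 8000, so adjust them to match your deployment.
backend_provider: "custom" # proxy requests to your own endpoint
url: "http://localhost:8000" # assumed local vLLM server; change host/port as needed
fxn_id: "2" # model ID from https://www.function.network/models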
fxn_id
Description: The Function Network model ID to participate in.
Type: string
The fxn_id for the model you want to participate in can be found at https://www.function.network/models.
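As a sketch, switching to a different model only requires changing fxn_id to the ID listed on that page. The ID "7" below is purely illustrative; look up the real value before using it.
backend_provider: "custom"
url: "http://your-backend:8082"
fxn_id: "7" # hypothetical ID, for illustration only; see https://www.function.network/models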
api_key
Description: Your API key for the backend service (optional).
Type: string
bearer_token
Description: Your bearer token for authentication (optional).
Type: string
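If your backend requires authentication, add the matching credential to the same file. The sketch below assumes a hypothetical hosted endpoint that expects a bearer token; whether your backend needs api_key, bearer_token, both, or neither depends entirely on that backend.
backend_provider: "custom"
url: "https://inference.example.com" # hypothetical authenticated endpoint
fxn_id: "2"
bearer_token: "your-bearer-token" # optional; only needed if the backend requires it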
The Future: Custom Inference Engine
While the current "custom" backend provider allows the node to act as a proxy to any existing model endpoint, this is an interim solution. We are actively developing a native, high-performance inference engine that will become the standard for the Function Network.
This upcoming engine is being built from the ground up for true distributed inference. It will feature advanced optimizations such as:
Pipeline Parallelism: Processing different stages of inference simultaneously across multiple nodes.
Sharding: Splitting a model into smaller pieces (shards or layers) that are distributed across the network.
Custom Network Transport: A highly optimized network layer to reduce unnecessary data transfer between nodes, ensuring the most efficient communication for distributed inference.
Eventually, this means a single node will only need to hold and compute one shard of a model, rather than the entire model. This distributed approach will significantly lower the hardware barrier for node operators and enable the network to run much larger and more powerful models collectively.
This custom engine is a core part of our roadmap and will become the default inference backend once it is available. In the meantime, the "custom" provider offers the flexibility to connect to any inference engine you choose to run.