Introduction

The Interpretability API provides access to language model inference with layer-level activation caching and intervention capabilities.

Authentication

All requests require an API key passed via the x-api-key header.
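As a minimal sketch of attaching the key, the snippet below builds (but does not send) an authenticated request using only the Python standard library. The base URL, the environment-independent helper names build_headers and build_request, and the JSON Content-Type header are assumptions made for illustration; only the x-api-key header name comes from this documentation.

```python
import urllib.request

# Hypothetical base URL; substitute the host of your actual deployment.
API_BASE = "https://api.example.com"


def build_headers(api_key: str) -> dict:
    """Return request headers carrying the API key in x-api-key."""
    return {
        "x-api-key": api_key,
        # Assumption: request bodies are sent as JSON.
        "Content-Type": "application/json",
    }


def build_request(path: str, api_key: str, body: bytes) -> urllib.request.Request:
    """Build a POST request object without sending it."""
    return urllib.request.Request(
        API_BASE + path,
        data=body,
        headers=build_headers(api_key),
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access.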

POST /completions

Generate text completions with optional layer caching and interventions.

Request Body

Parameter      Type        Default                     Description
model          string      tiiuae/Falcon3-7B-Base      HuggingFace model identifier
input          string      The CN Tower is located in  Input text prompt
cache_layers   array[int]  [1, 2, 3]                   Layer indices to cache activations from
interventions  object      {}                          Layer-wise interventions to apply
max_tokens     integer     1                           Maximum new tokens to generate
temperature    float       1.0                         Sampling temperature
return_logits  boolean     true                        Whether to return logits
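The parameters above can be assembled into a request body along the following lines. This is a sketch, not an official client: the helper name build_completion_body is hypothetical, and the defaults are copied from the table only to make the example self-contained (the server presumably applies them itself when a field is omitted).

```python
import json

# Defaults copied from the Request Body table.
DEFAULTS = {
    "model": "tiiuae/Falcon3-7B-Base",
    "input": "The CN Tower is located in",
    "cache_layers": [1, 2, 3],
    "interventions": {},
    "max_tokens": 1,
    "temperature": 1.0,
    "return_logits": True,
}


def build_completion_body(**overrides) -> str:
    """Merge overrides into the documented defaults and serialize to JSON."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Reject names the table does not list, to catch typos early.
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return json.dumps({**DEFAULTS, **overrides})


body = build_completion_body(input="The Eiffel Tower is located in", max_tokens=5)
```

The resulting string is what would be sent as the body of POST /completions.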

Response

Field              Type    Description
output             string  Decoded output text
sampled_tokens     array   Token IDs sampled during generation
layer_activations  object  Cached activations from specified layers
logits             array   Log softmax probabilities (if requested)
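Because the logits field holds log-softmax values, exponentiating them recovers probabilities. The sketch below assumes that the values for a single generated position arrive as a flat list indexed by token ID; the documentation does not specify the exact nesting of the array, so the response may need to be indexed first. The function name top_tokens is hypothetical.

```python
import math


def top_tokens(log_probs: list[float], k: int = 3) -> list[tuple[int, float]]:
    """Return the k most likely (token_id, probability) pairs.

    Assumes log_probs is a flat list of log-softmax values for one
    position, where the list index is the token ID.
    """
    ranked = sorted(range(len(log_probs)), key=lambda i: log_probs[i], reverse=True)
    return [(i, math.exp(log_probs[i])) for i in ranked[:k]]
```

The token IDs returned this way can be compared against sampled_tokens to see how likely the sampled token was.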

Interventions

Intervention Structure

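Interventions modify layer activations during inference. The interventions object is keyed by layer index, written as a string; each entry specifies an operation and the value to apply at that layer. As the example shows, the value can be a scalar or a nested array.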

{
  "interventions": {
    "1": {
      "operation": "multiply",
      "value": 2
    },
    "2": {
      "operation": "add",
      "value": [[1]*3072]*6
    }
  }
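}

The expression [[1]*3072]*6 in this example is Python shorthand for a nested array of 6 rows, each containing 3072 ones. JSON does not evaluate expressions, so an actual request body must contain the array written out in full.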
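Since a nested array of this size is impractical to type by hand, the payload is easier to generate programmatically. The following sketch reproduces the example above; the helper names make_intervention and constant_matrix are hypothetical, and the set of accepted operations is limited to the two that appear in this documentation (the API may support others).

```python
import json

# Only the operations shown in the documentation; others may exist.
KNOWN_OPERATIONS = {"multiply", "add"}


def make_intervention(operation: str, value) -> dict:
    """Build one intervention entry for a single layer."""
    if operation not in KNOWN_OPERATIONS:
        raise ValueError(f"unrecognized operation: {operation!r}")
    return {"operation": operation, "value": value}


def constant_matrix(rows: int, cols: int, fill=1) -> list:
    """Return a rows x cols nested list; each row is an independent list."""
    return [[fill] * cols for _ in range(rows)]


# Keys are layer indices as strings, matching the JSON example.
interventions = {
    "1": make_intervention("multiply", 2),
    "2": make_intervention("add", constant_matrix(6, 3072)),
}
payload = json.dumps({"interventions": interventions})
```

The list comprehension in constant_matrix builds a separate list per row, unlike [[1]*3072]*6, which repeats one shared row object; the serialized JSON is identical either way, but independent rows are safer if individual entries are edited before sending.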