Serving
The main and baseline models are exposed behind a single FastAPI app (grnti_text_classifier.serving.main:app). Both are loaded lazily from artifacts/{main,baseline}/hf/ at first request and held in memory thereafter. The active model is selected via a query parameter on each call.
Run
# local dev (auto-reload)
uvicorn grnti_text_classifier.serving.main:app --host 0.0.0.0 --port 8000 --reload
# production (4 workers)
uvicorn grnti_text_classifier.serving.main:app --host 0.0.0.0 --port 8000 --workers 4
Endpoints
| Method | Path | Purpose |
|---|---|---|
GET |
/health |
Liveness probe — returns {"status": "ok", "version": <model_version>}. |
GET |
/labels |
Returns the full list of 28 GRNTI label codes and human-readable names loaded from label_encoder.json. |
POST |
/classify |
Classify a single Russian text. Query param model selects main (default) or baseline. |
/classify returns HTTP 422 if text is empty or whitespace-only, and HTTP 503 if the HF snapshot directory is missing from disk.
Request — TextPayload
Schema source: grnti_text_classifier.serving.schemas.TextPayload.
class TextPayload(BaseModel):
text: str # Russian scientific text to classify (required, non-empty)
Response schemas
LabelProb
class LabelProb(BaseModel):
label: str # GRNTI class code, e.g. "27" (Mathematics)
name: str # Human-readable section name
prob: float # Softmax probability for this class
LabelEntry
class LabelEntry(BaseModel):
code: str # GRNTI top-level code (2-digit string)
name: str # Section name in Russian
ClassificationResponse
class ClassificationResponse(BaseModel):
top1_label: str # GRNTI code of the most likely class
top1_name: str # Human-readable name of the top-1 class
top1_prob: float # Softmax probability of the top-1 class
top5: list[LabelProb] # Top-5 classes with probabilities
truncated: bool # True if input exceeded max_length and was truncated
input_length_tokens: int # Token count before any truncation
request_id: str # 12-char UUID prefix for tracing
model_name: str # "xlm-roberta-base" or "rubert-base-cased"
model_version: str # e.g. "v0.1.0"
Environment variables
| Variable | Default | Purpose |
|---|---|---|
GRNTI_MAIN_DIR |
artifacts/main/hf |
Path to XLM-RoBERTa save_pretrained snapshot. |
GRNTI_BASELINE_DIR |
artifacts/baseline/hf |
Path to ruBERT save_pretrained snapshot. |
GRNTI_LABEL_ENCODER |
data/processed/label_encoder.json |
Path to label encoder JSON. |
GRNTI_MODEL_VERSION |
v0.1.0 |
Reported in /health and response body. |
curl examples
GET /health
curl http://localhost:8000/health
{"status": "ok", "version": "v0.1.0"}
GET /labels
curl http://localhost:8000/labels
[
{"code": "01", "name": "Общенаучное и междисциплинарное знание"},
{"code": "03", "name": "История. Исторические науки"},
...
]
POST /classify — main model
curl -X POST http://localhost:8000/classify \
-H "Content-Type: application/json" \
-d '{"text":"Исследование квантовой электродинамики в кристаллах."}'
{
"top1_label": "29",
"top1_name": "Физика",
"top1_prob": 0.923,
"top5": [
{"label": "29", "name": "Физика", "prob": 0.923},
{"label": "30", "name": "Химия", "prob": 0.031},
{"label": "44", "name": "Энергетика", "prob": 0.018},
{"label": "27", "name": "Математика", "prob": 0.012},
{"label": "50", "name": "Автоматика", "prob": 0.007}
],
"truncated": false,
"input_length_tokens": 14,
"request_id": "a1b2c3d4e5f6",
"model_name": "xlm-roberta-base",
"model_version": "v0.1.0"
}
POST /classify — baseline model
curl -X POST "http://localhost:8000/classify?model=baseline" \
-H "Content-Type: application/json" \
-d '{"text":"Исследование квантовой электродинамики в кристаллах."}'
Response field notes
| Field | Notes |
|---|---|
truncated |
True when input_length_tokens > max_length (256). The model still produces a prediction but context beyond 256 tokens was dropped. |
input_length_tokens |
Raw token count before truncation, useful for monitoring distribution shift at inference time. |
request_id |
First 12 characters of a UUID4 generated per request. Log this for end-to-end tracing. |
model_name |
Reflects the actual HF model identifier, not the alias (main/baseline). |
model_version |
Read from the GRNTI_MODEL_VERSION environment variable; matches the git tag of the published checkpoint. |