Version 2.0 — Updated April 2026 — Kaiko AI Research
Architectures for Emotionally Intelligent Systems
A next-generation emotional intelligence engine designed to understand, interpret, and respond to human emotion with unprecedented accuracy and cultural sensitivity.
This whitepaper presents the Synapse EQ+ system. Building on the original EQ+ architecture (v1, June 2025), this version incorporates findings from a comprehensive system audit, an analysis of the open-source model landscape, affective computing research, and cross-cultural psychology. The key advances are: (1) a multi-model ensemble architecture replacing single-head emotion classification, (2) ML-predicted dimensional emotions (valence, arousal, intensity) replacing rule-derived approximations, (3) native empathy/distress prediction as a first-in-market EQ capability, (4) calibrated decision policies replacing hard-coded thresholds, and (5) a cost-optimized training and deployment pipeline.
All systems were evaluated on GoEmotions-28 (500 samples, 28 labels, 8 dimensions). Synapse EQ leads in 6 of the 8 dimensions.
| System | EQ Score | Emotion F1 | Empathy r | ECE | Latency p95 |
|---|---|---|---|---|---|
| Synapse EQ | 70.58 | 0.579 | 0.883 | 0.015 | 407ms |
| Kimi | 47.9 | 0.397 | — | 0.032 | 1,120ms |
| Gemini | 44.5 | 0.367 | — | 0.051 | 1,902ms |
| GPT-4o | 43.8 | 0.352 | — | 0.162 | 6,279ms |
| Grok | 43.8 | 0.339 | — | 0.067 | 1,705ms |
| Claude | 43.0 | 0.345 | — | 0.044 | 1,682ms |
| Mistral | 43.8 | 0.321 | — | 0.089 | 1,322ms |
Among the systems evaluated, empathy prediction is exclusive to Synapse EQ: no competing system offers native empathic concern or personal distress detection, which is why the Empathy r column is empty for the other rows.
Emotional intelligence in AI systems requires more than sentiment analysis. True EQ demands:
- Accurately identifying what someone is feeling from text, voice, and behavior
- Comprehending why they feel that way, including context, culture, and appraisal
- Generating responses that demonstrate understanding and appropriate emotional attunement
- Improving emotional understanding over time through interaction history and feedback
| Capability | V1 (2025) | V2 (2026) |
|---|---|---|
| Core emotions | 6 (single-head) | 27 multi-label (GoEmotions) |
| Dimensional emotions | Rule-derived | ML-predicted regression |
| Empathy/distress | Not modeled | Native ML prediction |
| Safety/hostility | Keyword-based | ML classification |
| Calibration | Fixed thresholds | Learned per-label + temperature scaling |
| Deployment | Single model | Multi-model ensemble |
| Context | Single utterance | Dialogue context window |
| Cross-cultural | Not evaluated | ISEAR (37 countries) |
The system adopts a two-layer emotion representation grounded in complementary psychological theories:
The dimensional layer is based on Russell's (1980) circumplex model and Mehrabian's (1996) PAD framework. Its axes are Valence (pleasure-displeasure), Arousal (activation-deactivation), and Dominance (control-submission). Dimensional representations are more stable cross-culturally than discrete categories.
The categorical layer is based on Demszky et al.'s (2020) GoEmotions taxonomy, which extends Ekman's (1992) basic emotions with fine-grained categories relevant to conversational AI. Categories provide interpretability for response policy and user-facing explanations.
| Branch | EI Capability | Synapse Implementation |
|---|---|---|
| Perceiving | Identifying emotions | Multi-model ensemble + VAD regression |
| Facilitating | Using emotion to enhance thinking | Conversation mode detection; personality mode |
| Understanding | Comprehending causes & transitions | Pattern detection; trajectory tracking; complexity |
| Managing | Regulating emotions | Growth tracking; de-escalation; empathic response |
V2 introduces empathy prediction as a first-class ML capability. It distinguishes empathic concern (Batson, 1991), an other-oriented emotional response that triggers supportive behavior, from personal distress, a self-oriented aversive reaction that may trigger safety routing when combined with low concern. This distinction drives response policy: high empathic concern should trigger warmth, while high personal distress without concern should trigger safety evaluation.
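The concern/distress policy described above can be sketched as a small routing function. This is a minimal illustration, not the production logic: the cut-off values, the `EmpathySignals` type, and the route names are all hypothetical, since the whitepaper does not publish them.

```python
from dataclasses import dataclass

# Hypothetical cut-offs for illustration; the whitepaper does not publish them.
HIGH_CONCERN = 0.6
HIGH_DISTRESS = 0.6


@dataclass(frozen=True)
class EmpathySignals:
    empathic_concern: float   # 0.0-1.0, other-oriented
    personal_distress: float  # 0.0-1.0, self-oriented


def route_response(signals: EmpathySignals) -> str:
    """Map empathy signals to a coarse response policy."""
    high_concern = signals.empathic_concern >= HIGH_CONCERN
    high_distress = signals.personal_distress >= HIGH_DISTRESS
    # High distress without concern is the case the text flags for safety review.
    if high_distress and not high_concern:
        return "safety_evaluation"
    # High concern calls for a warm, supportive reply.
    if high_concern:
        return "warm_support"
    return "default"
```

Checking the distress case first reflects the text's emphasis that distress only routes to safety when concern is low; when both signals are high, this sketch falls through to the warm response.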
User Input (text)
│
▼
┌─────────────────────────────────────┐
│ MULTI-MODEL ENSEMBLE │
│ SamLowe/roberta-base (28 labels) │
│ + DeBERTa-v3-base (blended 65/35) │
│ + Dimensional predictor (VAD) │
│ + Hostility classifier (safety) │
│ + Empathy heads (concern/distress) │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ CALIBRATED DECISION LAYER │
│ Temperature scaling (T=0.447) │
│ Per-label optimized thresholds │
│ ECE-gated production readiness │
│ Conversation mode selection │
│ Personality mode adaptation │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ STATE & MEMORY LAYER │
│ Emotional trajectory (7/30/90-day) │
│ Pattern detection (5 types) │
│ Growth tracking (5 dimensions) │
│ Hostility escalation tracking │
│ Emotional RAG (9-D vectors) │
│ Belief detection & tracking │
└─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ RESPONSE GENERATION │
│ LLM conditioned on emotional state │
│ + personality mode guidance │
│ + empathy/distress signals │
│ + conversation mode strategy │
└─────────────────────────────────────┘
The production system uses a multi-model ensemble with per-label optimized blending weights:
| Model | Role | Weight |
|---|---|---|
| SamLowe/roberta-base-go_emotions | Primary 28-label emotion classifier | 65% |
| DeBERTa-v3-base (fine-tuned) | Ensemble emotion member | 35% |
| Dimensional predictor | Valence, arousal, empathy regression | Standalone |
| Hostility classifier | Binary safety/toxicity detection | Standalone |
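As a rough sketch of how the two emotion classifiers might be combined, the snippet below takes a weighted average of per-label probabilities using the 65/35 split from the table. It assumes both models emit a probability for the same label set and applies one global weight pair; per-label weights would replace the two constants with a lookup. The `blend` function name is illustrative.

```python
from typing import Dict

ROBERTA_WEIGHT = 0.65  # primary classifier weight, from the table
DEBERTA_WEIGHT = 0.35  # ensemble member weight, from the table


def blend(roberta: Dict[str, float], deberta: Dict[str, float]) -> Dict[str, float]:
    """Weighted average of per-label probabilities from the two classifiers."""
    if roberta.keys() != deberta.keys():
        raise ValueError("both models must score the same label set")
    return {
        label: ROBERTA_WEIGHT * roberta[label] + DEBERTA_WEIGHT * deberta[label]
        for label in roberta
    }
```

For example, a label scored 0.8 by the primary model and 0.6 by the ensemble member blends to 0.65 × 0.8 + 0.35 × 0.6 = 0.73.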
Post-hoc temperature scaling (T=0.447) is applied to the blended ensemble output. Per-label thresholds are optimized via grid search on the full GoEmotions validation set (5,426 samples) to maximize macro-F1. Expected Calibration Error (ECE) must be ≤ 0.05 for production deployment. Current production ECE: 0.015.
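The calibration step can be illustrated with a short sketch. It makes two assumptions the text does not confirm: that temperature scaling is applied in logit space to each blended probability, and that ECE is computed with equal-width bins over flattened (sample, label) decisions. Function names are illustrative.

```python
import math
from typing import List, Sequence

TEMPERATURE = 0.447  # from the text
ECE_LIMIT = 0.05     # production gate from the text


def apply_temperature(prob: float, temperature: float = TEMPERATURE) -> float:
    """Rescale one probability in logit space: sigmoid(logit(p) / T)."""
    eps = 1e-7
    p = min(max(prob, eps), 1.0 - eps)  # clamp so the logit stays finite
    logit = math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-logit / temperature))


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width binned ECE: weighted mean of |confidence - positive rate|."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, p in enumerate(probs):
        bins[min(int(p * n_bins), n_bins - 1)].append(i)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(probs[i] for i in members) / len(members)
        positive_rate = sum(labels[i] for i in members) / len(members)
        ece += len(members) / len(probs) * abs(mean_conf - positive_rate)
    return ece


def production_ready(ece: float) -> bool:
    """Apply the ECE gate stated in the text."""
    return ece <= ECE_LIMIT
```

Because T is below 1, the rescaling pushes probabilities away from 0.5, which is the correction one would expect for an under-confident blend.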
| Dimension | Range | Source | Description |
|---|---|---|---|
| Intensity | 0.0–1.0 | ML regression | Overall emotional strength (5 levels) |
| Valence | -1.0 to +1.0 | ML regression | Pleasure-displeasure tone |
| Arousal | 0.0–1.0 | ML regression | Activation-deactivation energy |
| Complexity | 4 levels | ML ordinal | Emotional simultaneity |
| Empathic Concern | 0.0–1.0 | ML regression | Other-oriented compassion (r=0.883) |
| Personal Distress | 0.0–1.0 | ML regression | Self-oriented aversive reaction |
| Safety Score | 0.0–1.0 | ML classification | Hostility/toxicity detection (AUPRC 0.716) |
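One way to carry these dimensions through the pipeline is a validated record type. The sketch below is an assumption about representation, not the system's actual schema: field names are derived from the table, and encoding the four complexity levels as integers 0 to 3 is a choice made here for illustration.

```python
from dataclasses import dataclass

# Allowed range per field, taken from the table above.
# The 0-3 encoding for the four complexity levels is an assumption.
RANGES = {
    "intensity": (0.0, 1.0),
    "valence": (-1.0, 1.0),
    "arousal": (0.0, 1.0),
    "complexity": (0, 3),
    "empathic_concern": (0.0, 1.0),
    "personal_distress": (0.0, 1.0),
    "safety_score": (0.0, 1.0),
}


@dataclass(frozen=True)
class EmotionalState:
    intensity: float
    valence: float
    arousal: float
    complexity: int
    empathic_concern: float
    personal_distress: float
    safety_score: float

    def __post_init__(self) -> None:
        # Reject any value outside the documented range for its dimension.
        for name, (low, high) in RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")
```

Validating at construction time means downstream layers can rely on the ranges without re-checking them.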
| Dataset | Size | Use |
|---|---|---|
| GoEmotions | 58K | 27-label emotion classification |
| EmoBank | 10K | Valence & arousal regression |
| SemEval-2018 | 10K+ | Emotion intensity |
| ISEAR | 7.6K | Cross-cultural validation (37 countries) |
| Empathic Reactions | 1,860 | Empathy/distress prediction |
| HatEval | 20K+ | Hostility/safety classification |
Safety uses a three-layer defense: an ML safety head (score-based routing), keyword routing (immediate escalation for crisis patterns), and emotion thresholds (a check-in protocol for combined negative signals). Recall targets are 0.95 (ML), 0.99 (keywords), and 0.90 (thresholds).
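The three layers can be sketched as a single routing function. Everything concrete here is a placeholder: the crisis pattern, the cut-offs, the route names, and the reading that a higher safety score means more hostile text are assumptions, as is the choice to check keywords first because they escalate immediately.

```python
# All patterns and cut-offs below are placeholders, not values from the whitepaper.
CRISIS_PATTERNS = ("example crisis phrase",)
SAFETY_SCORE_CUTOFF = 0.5       # assumes higher score = more hostile
NEGATIVE_VALENCE_CUTOFF = -0.5
HIGH_INTENSITY_CUTOFF = 0.7


def safety_route(text: str, safety_score: float, valence: float, intensity: float) -> str:
    """Return the first safety action triggered by the three layers."""
    lowered = text.lower()
    # Keyword layer: immediate escalation for crisis patterns.
    if any(pattern in lowered for pattern in CRISIS_PATTERNS):
        return "escalate"
    # ML safety head: score-based routing.
    if safety_score >= SAFETY_SCORE_CUTOFF:
        return "safety_route"
    # Emotion thresholds: check-in when negative signals combine.
    if valence <= NEGATIVE_VALENCE_CUTOFF and intensity >= HIGH_INTENSITY_CUTOFF:
        return "check_in"
    return "continue"
```

Layering the checks this way means a message only needs to trip one layer to be handled, which is how independent layers can each be held to their own recall target.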
Fairness controls comprise cross-cultural evaluation with a maximum 5% F1 gap between locales, demographic subgroup calibration with a maximum 3% ECE gap, and regular memory bias audits to prevent stereotype reinforcement.
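A gap check of this kind might look like the sketch below. It reads the 5% and 3% limits as absolute differences between the best and worst group (the text does not say whether they are absolute or relative), and the group names in the usage example are hypothetical.

```python
from typing import Mapping

MAX_F1_GAP = 0.05   # "maximum 5% F1 gap", read here as absolute points
MAX_ECE_GAP = 0.03  # "maximum 3% ECE gap", read here as absolute points


def max_gap(metric_by_group: Mapping[str, float]) -> float:
    """Difference between the best and worst group for one metric."""
    values = list(metric_by_group.values())
    if not values:
        raise ValueError("at least one group is required")
    return max(values) - min(values)


def within_budget(metric_by_group: Mapping[str, float], budget: float) -> bool:
    """True when the spread across groups stays inside the allowed gap."""
    return max_gap(metric_by_group) <= budget
```

For instance, `within_budget({"locale_a": 0.58, "locale_b": 0.55}, MAX_F1_GAP)` passes, while a spread of 0.60 versus 0.50 would fail.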
Privacy measures include a PII removal pipeline for training data, GDPR/CCPA compliance for emotion data storage, encrypted storage with audit logs, and data retention aligned with memory graduation tiers.
GoEmotions 27-label mapping organized by valence and arousal:
| Category | Emotions | Valence | Arousal |
|---|---|---|---|
| Positive High-Arousal | admiration, amusement, excitement, joy, love, pride | + | High |
| Positive Low-Arousal | approval, caring, gratitude, optimism, relief | + | Low-Med |
| Negative High-Arousal | anger, annoyance, disgust, fear, nervousness | − | High |
| Negative Low-Arousal | disappointment, embarrassment, grief, remorse, sadness | − | Low |
| Ambiguous | confusion, curiosity, realization, surprise | Mixed | Med-High |
| Neutral | neutral | 0 | Low |
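The mapping above can be expressed as a lookup table, for example to pick a response strategy from a predicted label. This is a sketch: the category keys and the `category_of` helper are illustrative names, and only labels that appear in the table are included, so any other label raises an error instead of being guessed.

```python
# Emotion groups copied from the table above; category keys are illustrative names.
CATEGORY_EMOTIONS = {
    "positive_high_arousal": ["admiration", "amusement", "excitement", "joy", "love", "pride"],
    "positive_low_arousal": ["approval", "caring", "gratitude", "optimism", "relief"],
    "negative_high_arousal": ["anger", "annoyance", "disgust", "fear", "nervousness"],
    "negative_low_arousal": ["disappointment", "embarrassment", "grief", "remorse", "sadness"],
    "ambiguous": ["confusion", "curiosity", "realization", "surprise"],
    "neutral": ["neutral"],
}

# Inverted index: emotion label -> category.
EMOTION_TO_CATEGORY = {
    emotion: category
    for category, emotions in CATEGORY_EMOTIONS.items()
    for emotion in emotions
}


def category_of(emotion: str) -> str:
    """Return the valence/arousal category for a label listed in the table."""
    try:
        return EMOTION_TO_CATEGORY[emotion.lower()]
    except KeyError:
        raise ValueError(f"label not in mapping table: {emotion!r}") from None
```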
Questions about Synapse EQ? Contact us at research@kaikostudios.com