Synapse EQ+ Whitepaper

Version 2.0 — Updated April 2026 — Kaiko AI Research

Architectures for Emotionally Intelligent Systems: an emotional intelligence engine designed to understand, interpret, and respond to human emotion accurately and with cultural sensitivity.

Abstract

This whitepaper presents the Synapse EQ+ system, a next-generation emotional intelligence engine. Building on the original EQ+ architecture (v1, June 2025), this updated version incorporates findings from a comprehensive system audit, open-source model landscape analysis, affective computing research, and cross-cultural psychology. The key advances are: (1) a multi-model ensemble architecture replacing single-head emotion classification, (2) ML-predicted dimensional emotions (valence, arousal, intensity) replacing rule-derived approximations, (3) native empathy/distress prediction as a first-in-market EQ capability, (4) calibrated decision policies replacing hard-coded thresholds, and (5) a cost-optimized training and deployment pipeline.

Benchmark Results

Evaluated on GoEmotions-28 (500 samples, 28 labels, 8 dimensions). Synapse EQ leads in 6 of 8 dimensions.

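| System | EQ Score | Emotion F1 | Empathy r | ECE | Latency p95 |
|---|---|---|---|---|---|
| Synapse EQ | 70.58 | 0.579 | 0.883 | 0.015 | 407ms |
| Kimi | 47.9 | 0.397 | n/a | 0.032 | 1,120ms |
| Gemini | 44.5 | 0.367 | n/a | 0.051 | 1,902ms |
| GPT-4o | 43.8 | 0.352 | n/a | 0.162 | 6,279ms |
| Grok | 43.8 | 0.339 | n/a | 0.067 | 1,705ms |
| Claude | 43.0 | 0.345 | n/a | 0.044 | 1,682ms |
| Mistral | 43.8 | 0.321 | n/a | 0.089 | 1,322ms |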

Empathy prediction is exclusive to Synapse EQ — no competing system offers native empathic concern or personal distress detection.

1. The EQ Challenge

Emotional intelligence in AI systems requires more than sentiment analysis. True EQ demands:

  • Emotion Perception: accurately identifying what someone is feeling from text, voice, and behavior
  • Emotion Understanding: comprehending why they feel that way, including context, culture, and appraisal
  • Empathic Response: generating responses that demonstrate understanding and appropriate emotional attunement
  • Adaptive Learning: improving emotional understanding over time through interaction history and feedback

What's New in V2

| Capability | V1 (2025) | V2 (2026) |
|---|---|---|
| Core emotions | 6 (single-head) | 27 multi-label (GoEmotions) |
| Dimensional emotions | Rule-derived | ML-predicted regression |
| Empathy/distress | Not modeled | Native ML prediction |
| Safety/hostility | Keyword-based | ML classification |
| Calibration | Fixed thresholds | Learned per-label + temperature scaling |
| Deployment | Single model | Multi-model ensemble |
| Context | Single utterance | Dialogue context window |
| Cross-cultural | Not evaluated | ISEAR (37 countries) |

2. Theoretical Foundations

Emotion Representation: Two-Layer Target Space

The system adopts a two-layer emotion representation grounded in complementary psychological theories:

Layer 1 — Dimensional Core (VAD)

Based on Russell's (1980) circumplex model and Mehrabian's (1996) PAD framework. The three dimensions are Valence (pleasure-displeasure), Arousal (activation-deactivation), and Dominance (control-submission). Dimensional representations are more stable cross-culturally than discrete categories.

Layer 2 — Categorical Surface (27 GoEmotions)

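Based on Demszky et al.'s (2020) GoEmotions taxonomy, extending Ekman's (1992) basic emotions with fine-grained categories relevant to conversational AI. The taxonomy defines 27 emotion categories plus a neutral label, which is why the classifiers in this document are described as 28-label. Categories provide interpretability for response policy and user-facing explanations.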

Mayer-Salovey Four-Branch EQ Model

| Branch | EI Capability | Synapse Implementation |
|---|---|---|
| Perceiving | Identifying emotions | Multi-model ensemble + VAD regression |
| Facilitating | Using emotion to enhance thinking | Conversation mode detection; personality mode |
| Understanding | Comprehending causes & transitions | Pattern detection; trajectory tracking; complexity |
| Managing | Regulating emotions | Growth tracking; de-escalation; empathic response |

Empathy as a Core EQ Construct

V2 introduces empathy prediction as a first-class ML capability and distinguishes two constructs. Empathic concern (Batson, 1991) is an other-oriented emotional response that triggers supportive behavior. Personal distress is a self-oriented aversive reaction that may trigger safety routing when combined with low concern. The distinction matters for response policy: high empathic concern should trigger warmth, while high personal distress without concern should trigger safety evaluation.
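
The routing rule described above can be sketched as a small decision function. This is an illustration only: the cut-offs `HIGH` and `LOW`, the `EmpathySignal` type, and the returned policy names are hypothetical and do not come from the production system.

```python
from dataclasses import dataclass

# Hypothetical cut-offs for illustration; the whitepaper does not publish them.
HIGH = 0.6
LOW = 0.3


@dataclass(frozen=True)
class EmpathySignal:
    empathic_concern: float   # 0.0-1.0, other-oriented response
    personal_distress: float  # 0.0-1.0, self-oriented aversive reaction


def route(signal: EmpathySignal) -> str:
    """Map the two empathy scores to a response-policy label."""
    # High distress combined with low concern is checked first,
    # because it is the case that leads to safety evaluation.
    if signal.personal_distress >= HIGH and signal.empathic_concern < LOW:
        return "safety_evaluation"
    # High concern calls for a warm, supportive response.
    if signal.empathic_concern >= HIGH:
        return "warm_support"
    return "default"
```

Checking the distress case first means a message that is both distressed and low in concern can never fall through to the default policy.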

3. System Architecture

User Input (text)
      │
      ▼
┌─────────────────────────────────────┐
│  MULTI-MODEL ENSEMBLE               │
│  SamLowe/roberta-base (28 labels)   │
│  + DeBERTa-v3-base (blended 65/35)  │
│  + Dimensional predictor (VAD)      │
│  + Hostility classifier (safety)    │
│  + Empathy heads (concern/distress) │
└─────────────────────────────────────┘
      │
      ▼
┌─────────────────────────────────────┐
│  CALIBRATED DECISION LAYER          │
│  Temperature scaling (T=0.447)      │
│  Per-label optimized thresholds     │
│  ECE-gated production readiness     │
│  Conversation mode selection        │
│  Personality mode adaptation        │
└─────────────────────────────────────┘
      │
      ▼
┌─────────────────────────────────────┐
│  STATE & MEMORY LAYER               │
│  Emotional trajectory (7/30/90-day) │
│  Pattern detection (5 types)        │
│  Growth tracking (5 dimensions)     │
│  Hostility escalation tracking      │
│  Emotional RAG (9-D vectors)        │
│  Belief detection & tracking        │
└─────────────────────────────────────┘
      │
      ▼
┌─────────────────────────────────────┐
│  RESPONSE GENERATION                │
│  LLM conditioned on emotional state │
│  + personality mode guidance        │
│  + empathy/distress signals         │
│  + conversation mode strategy       │
└─────────────────────────────────────┘

Ensemble Architecture

The production system uses a multi-model ensemble with per-label optimized blending weights:

| Model | Role | Weight |
|---|---|---|
| SamLowe/roberta-base-go_emotions | Primary 28-label emotion classifier | 65% |
| DeBERTa-v3-base (fine-tuned) | Ensemble emotion member | 35% |
| Dimensional predictor | Valence, arousal, empathy regression | Standalone |
| Hostility classifier | Binary safety/toxicity detection | Standalone |
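
As a rough illustration of the 65/35 blend, the sketch below takes a weighted average of per-label probabilities from the two emotion classifiers. It assumes both models score the same labels in the same order and that one global weight pair is used. With per-label blending weights, each label would carry its own pair instead. The function name `blend` is hypothetical.

```python
from typing import Sequence

PRIMARY_WEIGHT = 0.65    # SamLowe/roberta-base-go_emotions
SECONDARY_WEIGHT = 0.35  # fine-tuned DeBERTa-v3-base


def blend(primary: Sequence[float], secondary: Sequence[float]) -> list[float]:
    """Weighted average of two per-label probability vectors."""
    if len(primary) != len(secondary):
        raise ValueError("both models must score the same label set")
    return [
        PRIMARY_WEIGHT * p + SECONDARY_WEIGHT * s
        for p, s in zip(primary, secondary)
    ]
```

Because the two weights sum to 1.0, the blended scores stay in the 0.0 to 1.0 range whenever the inputs do.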

Calibration

Post-hoc temperature scaling (T=0.447) is applied to the blended ensemble output. Per-label thresholds are optimized via grid search on the full GoEmotions validation set (5,426 samples) to maximize macro-F1. Expected Calibration Error (ECE) must be ≤ 0.05 for production deployment. Current production ECE: 0.015.
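
A minimal sketch of the two calibration steps follows. It makes several assumptions: the blended output is a per-label probability that is converted back to a logit before dividing by T, labels are scored independently (sigmoid rather than softmax), and the threshold grid runs from 0.05 to 0.95. All function names are hypothetical. Macro-F1 is the mean of per-label F1 scores, so searching each label's threshold independently maximizes it.

```python
import math
from typing import Sequence

T = 0.447  # temperature reported in the whitepaper


def calibrate(prob: float, temperature: float = T, eps: float = 1e-7) -> float:
    """Apply temperature scaling to one probability via its logit."""
    p = min(max(prob, eps), 1.0 - eps)      # keep the logit finite
    logit = math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-logit / temperature))


def f1(preds: Sequence[bool], golds: Sequence[bool]) -> float:
    """Binary F1 for one label; returns 0.0 when nothing is predicted or present."""
    tp = sum(1 for p, g in zip(preds, golds) if p and g)
    fp = sum(1 for p, g in zip(preds, golds) if p and not g)
    fn = sum(1 for p, g in zip(preds, golds) if not p and g)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores: Sequence[float], golds: Sequence[bool], steps: int = 19) -> float:
    """Grid-search the threshold that maximizes F1 for a single label."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]  # 0.05, 0.10, ..., 0.95
    return max(grid, key=lambda t: f1([s >= t for s in scores], golds))


def decide(probs: dict[str, float], thresholds: dict[str, float], default: float = 0.5) -> list[str]:
    """Return the labels whose calibrated score clears their own threshold."""
    return sorted(
        label for label, p in probs.items()
        if calibrate(p) >= thresholds.get(label, default)
    )
```

With T below 1, `calibrate` pushes probabilities away from 0.5, so thresholds should be searched on calibrated scores rather than raw ones.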

4. EQ Dimensions

| Dimension | Range | Source | Description |
|---|---|---|---|
| Intensity | 0.0–1.0 | ML regression | Overall emotional strength (5 levels) |
| Valence | -1.0 to +1.0 | ML regression | Pleasure-displeasure tone |
| Arousal | 0.0–1.0 | ML regression | Activation-deactivation energy |
| Complexity | 4 levels | ML ordinal | Emotional simultaneity |
| Empathic Concern | 0.0–1.0 | ML regression | Other-oriented compassion (r=0.883) |
| Personal Distress | 0.0–1.0 | ML regression | Self-oriented aversive reaction |
| Safety Score | 0.0–1.0 | ML classification | Hostility/toxicity detection (AUPRC 0.716) |
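
One way to carry these dimensions through a pipeline is a validated record type, sketched below. The ranges come from the table. The class name `EQDimensions`, the 0-3 numbering of complexity levels, and the equal-width bucketing of intensity into 5 levels are assumptions made for illustration.

```python
from dataclasses import dataclass

# Inclusive bounds from the table; complexity levels are numbered 0-3 here (an assumption).
RANGES = {
    "intensity": (0.0, 1.0),
    "valence": (-1.0, 1.0),
    "arousal": (0.0, 1.0),
    "complexity": (0, 3),
    "empathic_concern": (0.0, 1.0),
    "personal_distress": (0.0, 1.0),
    "safety_score": (0.0, 1.0),
}


@dataclass(frozen=True)
class EQDimensions:
    intensity: float
    valence: float
    arousal: float
    complexity: int
    empathic_concern: float
    personal_distress: float
    safety_score: float

    def __post_init__(self) -> None:
        # Reject any value outside the range listed for its dimension.
        for name, (low, high) in RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")

    def intensity_level(self, levels: int = 5) -> int:
        """Bucket intensity into equal-width levels (bucketing scheme is assumed)."""
        return min(int(self.intensity * levels), levels - 1)
```

Validating at construction time keeps out-of-range model outputs from reaching the decision layer unnoticed.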

5. Training & Evaluation

Training Data

DatasetSizeUse
GoEmotions58K27-label emotion classification
EmoBank10KValence & arousal regression
SemEval-201810K+Emotion intensity
ISEAR7.6KCross-cultural validation (37 countries)
Empathic Reactions1,860Empathy/distress prediction
HatEval20K+Hostility/safety classification

Evaluation Protocol

  • Classification: Macro-F1, micro-F1, AUROC per label (active-labels methodology excludes zero-support labels)
  • Calibration: ECE, Brier score, reliability diagrams per label
  • Robustness: Paraphrase invariance, negation stress tests
  • Cross-cultural: Per-locale F1 on ISEAR; max gap ≤ 5%
  • Dimensional: Pearson r for valence, arousal, empathy
  • Safety: AUPRC for hostility detection
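
The calibration metric in this protocol can be sketched as follows. The example computes a binned Expected Calibration Error for one label and applies the 0.05 production gate quoted elsewhere in this document. The choice of 10 equal-width bins and the function names are assumptions, and the evaluation code used for the reported numbers may bin differently.

```python
from typing import Sequence

ECE_GATE = 0.05  # production limit stated in this whitepaper


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Binned ECE for one label: weighted |mean confidence - observed frequency|."""
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and the same length")
    bins: list[list[tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        confidence = sum(p for p, _ in bucket) / len(bucket)
        frequency = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(confidence - frequency)
    return ece


def production_ready(ece: float, gate: float = ECE_GATE) -> bool:
    """True when the measured ECE is within the deployment gate."""
    return ece <= gate
```

A perfectly calibrated bin contributes zero, so ECE only grows where predicted confidence and observed frequency disagree.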

6. Safety & Ethics

Crisis Detection

Three-layer defense: ML safety head (score-based routing), keyword routing (immediate escalation for crisis patterns), and emotion thresholds (check-in protocol for combined negative signals). Recall targets: 0.95 (ML), 0.99 (keywords), 0.90 (thresholds).
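
The three layers can be pictured as an ordered cascade, as in the sketch below. Everything specific in it is a placeholder: the phrases in `CRISIS_PATTERNS`, the numeric cut-offs, the route names, and the ordering of the layers are assumptions for illustration and do not reflect the production rules.

```python
# Placeholder phrases only; the production keyword list is not published here.
CRISIS_PATTERNS = ("hurt myself", "end my life")

# Hypothetical cut-offs for illustration.
SAFETY_THRESHOLD = 0.8
NEGATIVE_VALENCE = -0.6
HIGH_INTENSITY = 0.7


def route_crisis(text: str, safety_score: float, valence: float, intensity: float) -> str:
    """Return the first route triggered by the three defensive layers."""
    lowered = text.lower()
    # Keyword layer: crisis phrasing escalates immediately.
    if any(pattern in lowered for pattern in CRISIS_PATTERNS):
        return "escalate"
    # ML safety head: score-based routing.
    if safety_score >= SAFETY_THRESHOLD:
        return "safety_route"
    # Emotion thresholds: combined negative signals trigger a check-in.
    if valence <= NEGATIVE_VALENCE and intensity >= HIGH_INTENSITY:
        return "check_in"
    return "continue"
```

Running the layers as a cascade means a later layer can still catch a message that an earlier one misses.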

Bias & Fairness

Cross-cultural evaluation with maximum 5% F1 gap between locales. Demographic subgroup calibration with maximum 3% ECE gap. Regular memory bias audits to prevent stereotype reinforcement.
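
The two fairness limits can be checked mechanically, as in this sketch. It assumes the 5% and 3% limits are absolute differences (0.05 and 0.03) between the best and worst group. The function names and the group keys in the usage example are hypothetical.

```python
F1_GAP_LIMIT = 0.05   # maximum F1 gap between locales
ECE_GAP_LIMIT = 0.03  # maximum ECE gap between demographic subgroups


def max_gap(metric_by_group: dict[str, float]) -> float:
    """Difference between the best and worst group on one metric."""
    if not metric_by_group:
        raise ValueError("at least one group is required")
    values = list(metric_by_group.values())
    return max(values) - min(values)


def fairness_ok(f1_by_locale: dict[str, float], ece_by_group: dict[str, float]) -> bool:
    """True when both gaps are within their limits."""
    return (
        max_gap(f1_by_locale) <= F1_GAP_LIMIT
        and max_gap(ece_by_group) <= ECE_GAP_LIMIT
    )
```

For example, `fairness_ok({"locale_a": 0.58, "locale_b": 0.50}, {"group_a": 0.02, "group_b": 0.03})` fails because the F1 gap of 0.08 exceeds the limit.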

Privacy

PII removal pipeline for training data. GDPR/CCPA compliance for emotion data storage. Encrypted storage with audit logs. Data retention aligned with memory graduation tiers.

7. Emotion Taxonomy

GoEmotions 27-label mapping organized by valence and arousal:

| Category | Emotions | Valence | Arousal |
|---|---|---|---|
| Positive High-Arousal | admiration, amusement, excitement, joy, love, pride | + | High |
| Positive Low-Arousal | approval, caring, gratitude, optimism, relief | + | Low-Med |
| Negative High-Arousal | anger, annoyance, disgust, fear, nervousness | - | High |
| Negative Low-Arousal | disappointment, embarrassment, grief, remorse, sadness | - | Low |
| Ambiguous | confusion, curiosity, realization, surprise | Mixed | Med-High |
| Neutral | neutral | 0 | Low |
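
For downstream code, the grouping above can be held as a lookup table. The sketch below transcribes only the rows shown in the table. The names `TAXONOMY` and `category_of` are hypothetical, and any label not listed in the table returns `None`.

```python
from typing import Optional

# Category -> emotions, transcribed from the table above.
TAXONOMY = {
    "Positive High-Arousal": ("admiration", "amusement", "excitement", "joy", "love", "pride"),
    "Positive Low-Arousal": ("approval", "caring", "gratitude", "optimism", "relief"),
    "Negative High-Arousal": ("anger", "annoyance", "disgust", "fear", "nervousness"),
    "Negative Low-Arousal": ("disappointment", "embarrassment", "grief", "remorse", "sadness"),
    "Ambiguous": ("confusion", "curiosity", "realization", "surprise"),
    "Neutral": ("neutral",),
}

# Inverted index for constant-time lookup by emotion label.
CATEGORY_BY_LABEL = {
    label: category
    for category, labels in TAXONOMY.items()
    for label in labels
}


def category_of(label: str) -> Optional[str]:
    """Return the valence/arousal category for a label, or None if unlisted."""
    return CATEGORY_BY_LABEL.get(label.lower())
```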

8. Roadmap

Now (Q1-Q2 2026)
  • ✓ 27-emotion ensemble (EQ 70.58)
  • ✓ ML-predicted dimensions
  • ✓ Native empathy prediction
  • ✓ Calibrated thresholds
  • ✓ Growth tracking & patterns
  • ✓ Crisis detection
Next (Q3-Q4 2026)
  • API tiering (Basic/Pro/Enterprise)
  • ONNX distillation (<100ms)
  • Audio emotion pilot
  • Multilingual expansion
  • Active learning pipeline
Future (2027+)
  • Physiological integration
  • Vision modality
  • Personalized models
  • Cross-attention multimodal
  • Real-time emotion coaching

References

  • Batson, C. D. (1991). The Altruism Question: Toward a Social-Psychological Answer.
  • Demszky, D., et al. (2020). GoEmotions: A dataset of fine-grained emotions. ACL 2020.
  • Ekman, P. (1992). An argument for basic emotions. Cognition & Emotion, 6(3-4), 169-200.
  • Mayer, J. D., & Salovey, P. (1997). What is emotional intelligence? In Emotional Development and Emotional Intelligence.
  • Mehrabian, A. (1996). Pleasure-arousal-dominance: A general framework for describing and measuring individual differences in temperament. Current Psychology, 14(4), 261-292.
  • Mohammad, S. M. (2018). Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. ACL 2018.
  • Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161-1178.

Questions about Synapse EQ? Contact us at research@kaikostudios.com