Biologically Grounded Embodied Cognition for Quadruped Locomotion Learning
A simulated quadruped learns to refine its locomotion through a 15-step closed-loop cognitive architecture that integrates spiking neural networks, a cerebellar forward model, a central pattern generator, embodied emotions, and reward-modulated spike-timing-dependent plasticity (R-STDP). The system is a hybrid by design: an innate CPG provides the base gait, and the SNN + cerebellum learn to refine it on top — closer to how a young animal's brainstem and spinal cord come pre-wired while the cerebellum and cortex calibrate them with experience.
This main branch is the Bittle-only public release (v0.8.1). The active hardware platform is the Petoi Bittle X. Earlier platforms (Unitree Go2 ablation, Freenove sim-to-real) are preserved as paper checkpoints; see the tags below.
📄 Paper Checkpoints
| Paper | Focus | Tag | Preprint |
|---|---|---|---|
| Paper 1: Ablation study | Go2 10-seed validation, B vs PPO | v0.4.1-paper1 | aiXiv 260301.000002 |
| Paper 2: Sim-to-Real | Freenove hardware transfer, Bridge v4.4, phototaxis | v0.4.3-paper2 | aiXiv 260409.000002 |

Use git checkout v0.4.1-paper1 or git checkout v0.4.3-paper2 to reproduce the paper results with their original Go2 / Freenove assets. A further snapshot of the last main that still carried both platforms is tagged v0.7-go2-freenove-final.
MH-FLOCKE is explicitly a hybrid. Being clear about which parts are programmed and which parts adapt is central to the project:
Hardwired (programmed, present "from birth"): the CPG gait oscillator, spinal reflexes (righting, cross-extension, terrain), vestibular/light reflexes, the run-and-tumble navigation state machine, the drive/behaviour planner, and the competence gate. The SNN does not generate the gait — the CPG does.
Learned (adapts through experience): the SNN weights via R-STDP, the cerebellar correction (Marr-Albus-Ito, climbing-fibre error → Purkinje learning), the CPG→actor handoff as the actor proves competence, Hebbian co-activation, world-model prediction, emotional state, and activity-dependent synaptogenesis.
The interesting behaviour lives in the interaction: reflexes provide the scaffold, learning refines it. This is a design principle, not a limitation.
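To make the "learned" side concrete, the following is a minimal sketch of reward-modulated STDP in its generic textbook form: spike pairings are stored in a decaying eligibility trace, and only a later reward signal turns that trace into a weight change. It is not taken from src/brain/snn_controller.py. The class name, the helper `run_pairing`, and all constants (time constants, learning rate, weight bounds) are assumptions chosen for illustration.

```python
import math


class RSTDPSynapse:
    """One synapse with a reward-gated eligibility trace (illustrative only)."""

    def __init__(self, weight=0.5, lr=0.01, tau_stdp=20.0, tau_elig=200.0,
                 a_plus=1.0, a_minus=1.0, w_min=0.0, w_max=1.0):
        self.weight = weight
        self.lr = lr
        self.tau_stdp = tau_stdp   # ms, decay of the pre/post spike traces
        self.tau_elig = tau_elig   # ms, decay of the eligibility trace
        self.a_plus = a_plus
        self.a_minus = a_minus
        self.w_min, self.w_max = w_min, w_max
        self.pre_trace = 0.0
        self.post_trace = 0.0
        self.eligibility = 0.0

    def step(self, pre_spike, post_spike, reward, dt=1.0):
        # 1) All traces decay exponentially.
        decay_stdp = math.exp(-dt / self.tau_stdp)
        self.pre_trace *= decay_stdp
        self.post_trace *= decay_stdp
        self.eligibility *= math.exp(-dt / self.tau_elig)

        # 2) STDP pairings feed the eligibility trace, not the weight.
        if post_spike:   # pre-before-post: candidate potentiation
            self.eligibility += self.a_plus * self.pre_trace
        if pre_spike:    # post-before-pre: candidate depression
            self.eligibility -= self.a_minus * self.post_trace

        # 3) Register the new spikes after the pairings were evaluated.
        if pre_spike:
            self.pre_trace += 1.0
        if post_spike:
            self.post_trace += 1.0

        # 4) Reward converts eligibility into an actual, clipped weight change.
        self.weight += self.lr * reward * self.eligibility
        self.weight = min(self.w_max, max(self.w_min, self.weight))
        return self.weight


def run_pairing(pre_first, reward, gap_steps=5):
    """Play one spike pair, deliver a delayed reward, return the weight change."""
    syn = RSTDPSynapse()
    start = syn.weight
    syn.step(pre_first, not pre_first, reward=0.0)
    for _ in range(gap_steps - 1):
        syn.step(False, False, reward=0.0)
    syn.step(not pre_first, pre_first, reward=0.0)
    syn.step(False, False, reward=reward)
    return syn.weight - start
```

In this sketch, a pre-before-post pairing followed by positive reward strengthens the synapse, the reversed order weakens it, and without reward the weight stays unchanged.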
A 15-step closed-loop processing cycle runs at every simulation timestep:
SENSE → BODY SCHEMA → WORLD MODEL → EMOTIONS → MEMORY →
DRIVES → GLOBAL WORKSPACE → METACOGNITION → CONSISTENCY →
COMBINED REWARD → R-STDP LEARNING → SYNAPTOGENESIS →
HEBBIAN → DREAM MODE → NEUROMODULATION
The architecture operates across nested timescales:
- Spinal reflexes (every step) — posture maintenance, stretch reflexes
- Central Pattern Generator — innate gait, competence-gated blending with the learned actor
- Cerebellar forward model — Marr-Albus-Ito framework, prediction-error-driven corrections
- SNN with R-STDP — Izhikevich neurons (≈535–756 depending on sensor configuration), reward-modulated STDP
- Cognitive layers — Global Workspace Theory, embodied emotions, episodic memory, drives
- Meta-learning loop — EpisodeAnalyzer, StrategyAdapter, CuriosityExplorer, HypothesisGenerator
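For readers unfamiliar with the neuron model named in the list above, here is a minimal single-neuron sketch of the standard Izhikevich equations, using the published "regular spiking" parameters. It only illustrates the model itself. The class name, the forward-Euler integration, the 0.5 ms step, and the helper `count_spikes` are assumptions and are not claimed to match the project's SNN code.

```python
from dataclasses import dataclass


@dataclass
class IzhikevichNeuron:
    a: float = 0.02    # time scale of the recovery variable u
    b: float = 0.2     # sensitivity of u to the membrane potential v
    c: float = -65.0   # reset value of v after a spike (mV)
    d: float = 8.0     # increment of u after a spike
    v: float = -65.0   # membrane potential (mV)
    u: float = -13.0   # recovery variable, initialised to b * v

    def step(self, current, dt=0.5):
        """Advance the neuron by dt milliseconds; return True on a spike."""
        dv = 0.04 * self.v ** 2 + 5.0 * self.v + 140.0 - self.u + current
        du = self.a * (self.b * self.v - self.u)
        self.v += dt * dv
        self.u += dt * du
        if self.v >= 30.0:      # spike threshold: reset v, bump u
            self.v = self.c
            self.u += self.d
            return True
        return False


def count_spikes(current, duration_ms=1000.0, dt=0.5):
    """Drive one neuron with a constant input current and count its spikes."""
    neuron = IzhikevichNeuron()
    steps = int(duration_ms / dt)
    return sum(neuron.step(current, dt) for _ in range(steps))
```

With zero input the neuron settles at rest and stays silent, while a constant input of 10 produces sustained firing.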
The CPG provides a locomotion prior from step 1. As the SNN actor learns, a competence gate transitions from ≈90% CPG toward ≈40% CPG / 60% actor — the CPG floor stays at 40% by design, so the SNN refines the gait rather than replacing it. The creature walks from the start and improves through learning.
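The gating described above can be sketched as a clamped interpolation between the CPG and actor commands. This is a minimal illustration under stated assumptions: a scalar competence value in [0, 1] and a linear schedule from 90% CPG down to the 40% floor. The function names and the linear form are hypothetical, so the actual gate in the code base may be shaped differently.

```python
def cpg_weight(competence, start=0.9, floor=0.4):
    """Share of the motor command taken from the CPG.

    competence is an assumed scalar in [0, 1]; values outside are clamped,
    so the CPG share never drops below the floor.
    """
    c = min(max(competence, 0.0), 1.0)
    return start + (floor - start) * c


def blend_command(cpg_cmd, actor_cmd, competence):
    """Mix per-joint CPG and actor commands with the gated weight."""
    w = cpg_weight(competence)
    return [w * c + (1.0 - w) * a for c, a in zip(cpg_cmd, actor_cmd)]
```

For example, `blend_command([1.0, 1.0], [0.0, 0.0], competence=0.0)` keeps 90% of the CPG command, and at full competence the CPG still contributes 40%.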
# Clone
git clone https://github.com/MarcHesse/mhflocke.git
cd mhflocke
# Install dependencies
pip install -r requirements.txt
# Train the Bittle on flat ground (OpenCat Trot gait + SNN refinement)
# --neural-cpg is REQUIRED for the Bittle: it loads the OpenCat Trot controller.
# Without it the SpinalCPG path is used and the robot falls immediately.
python scripts/train_baby.py --creature-name bittle --neural-cpg \
--scene flat --steps 25000 --hardware-sensors --fresh --snn-substeps 10
# Analyze training data
python flog_server.py
# Open http://localhost:5050 for the dashboard

- Python 3.11+
- MuJoCo (via the mujoco pip package)
- PyTorch
- NumPy, msgpack
MH-FLOCKE targets the Petoi Bittle X
(BiBoard V0, ESP32, MPU6050 IMU, 8 leg servos). The brain runs on a host PC and talks to the
robot over a WiFi WebSocket, using the same src/brain/ code as the simulator — one
codebase, two platforms. OpenCat's onboard balance is disabled so the motor commands pass
through unmodified.
The bridge needs websocket-client; the live --dashboard additionally needs websockets
(both are in requirements.txt). Each run writes per-step telemetry and the learned weights to
creatures/bittle/bridge_<timestamp>/.
# 1) Verify the WiFi / IMU channel — no motion, just reads the IMU.
# Replace <robot-ip> with the Bittle's address on your network.
python scripts/bridge_bittle_wifi.py --ip <robot-ip>
# 2) Live gait: the innate OpenCat Trot with a fresh SNN learning on top via R-STDP.
# No pre-trained brain needed — intrinsic drives (vestibular, curiosity) supply the reward.
python scripts/bridge_bittle_wifi.py --ip <robot-ip> --gait-loop --snn --duration 30
# 3) Same run, plus the live telemetry dashboard. Open the local file
# src/viz/bridge_live.html directly in a browser (double-click or a file:// URL);
# it connects to the bridge's WebSocket. Do NOT browse to localhost:5001 — that
# port is a raw WebSocket, not a web page.
python scripts/bridge_bittle_wifi.py --ip <robot-ip> --gait-loop --snn --dashboard --duration 30

To load a simulation-trained brain instead of learning fresh, pass --snn-brain <path/to/brain.pt>
(sim-to-real transfer is active research; see the note below). The --cerebellum flag adds the
cerebellar drift correction, and --yaw-pid closes an IMU yaw loop so the robot compensates for
mechanical drift, surface, and battery level. The Bittle does not self-right from a supine fall:
set it upright by hand, or pass --recover to drive the OpenCat stand posture on side/forward falls
(learned weights are kept).
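As an illustration of what "closing an IMU yaw loop" means, the following is a generic PID sketch with angle wrapping. It does not reproduce the bridge's implementation. The gains, the output limit, and the reading of the output as a normalised steering bias are all assumptions.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


class YawPID:
    """Generic PID on heading error (hypothetical gains, illustrative only)."""

    def __init__(self, kp=0.02, ki=0.0, kd=0.005, limit=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.limit = limit          # clamp for the steering bias
        self.integral = 0.0
        self.prev_error = None

    def update(self, target_yaw, measured_yaw, dt):
        """Return a clamped steering bias from target and IMU yaw (degrees)."""
        # Wrapping keeps the controller on the short way round the circle.
        error = wrap_deg(target_yaw - measured_yaw)
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0        # no history on the first sample
        else:
            derivative = wrap_deg(error - self.prev_error) / dt
        self.prev_error = error
        output = (self.kp * error
                  + self.ki * self.integral
                  + self.kd * derivative)
        return max(-self.limit, min(self.limit, output))
```

A robot that has drifted 10 degrees past its target heading receives a negative bias here, which a gait controller could translate into a slight left/right stride asymmetry.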
Sim-to-real note. Closing the simulation-to-hardware gap on the Bittle is an active research arc, not a solved benchmark. Distances and roll amplitudes differ between sim and hardware, and the public code reflects work in progress rather than a finished result.
The 10-seed ablation in Paper 1 was run on the Unitree Go2 (reproducible at tag
v0.4.1-paper1):
| Config | Distance (m) | Falls | Std. dev. (σ) |
|---|---|---|---|
| B — SNN + Cerebellum | 45.15 ± 0.67 | 0 | σ = 0.67 |
| A — CPG only | 40.73 ± 6.14 | 0.2 | σ = 6.14 |
| PPO baseline | 12.83 ± 7.78 | 0 | σ = 7.78 |
How to read this honestly. The B configurations train on an external, shaped reward —
R_ext(t) = 0.8·v_forward + 0.2·upright — applied via R-STDP on top of the innate CPG gait.
The large gap over PPO is mostly the CPG locomotion prior, not the SNN learning to walk by
itself: the SNN + cerebellum's own marginal contribution over CPG-only (A → B) is about +11%
distance, alongside a collapse in seed-to-seed variance (σ 6.14 → 0.67) and zero falls. The
phrases "from scratch", "no reward shaping", and "no end-to-end RL" do not apply to these
numbers.
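The reward formula and the "+11%" figure above can be checked with a few lines of Python. The function names are hypothetical, and the units and scaling of `v_forward` and `upright` are not specified here, so treat the reward function as a sketch of the weighting only.

```python
def shaped_reward(v_forward, upright, w_velocity=0.8, w_upright=0.2):
    """External shaped reward: R_ext(t) = 0.8 * v_forward + 0.2 * upright."""
    return w_velocity * v_forward + w_upright * upright


def relative_gain(baseline, improved):
    """Fractional improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline


# Mean distances from the ablation table: A (CPG only) vs B (SNN + Cerebellum).
gain_a_to_b = relative_gain(40.73, 45.15)
print(f"A -> B marginal gain: {gain_a_to_b:.1%}")
```

The computed gain is about 0.109, which matches the roughly +11% stated above for the step from CPG-only to SNN + cerebellum.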
A separate intrinsic-reward line (train_baby.py --reward-blend 0) learns from body signals
alone — vestibular comfort, curiosity, proprioceptive prediction error — with no external
reward. That configuration trades distance for autonomy and is not the source of the
benchmark numbers above.
Three configurations isolate component contributions:
- A (CPG only) — spinal reflexes + vestibular. The minimal baseline.
- B (SNN + Cerebellum) — adds R-STDP learning, cerebellar forward model, drives, behaviour planner.
- C (Full system) — all 15 cognitive steps including GWT, metacognition, dream mode, synaptogenesis.
The training logger writes binary FLOG files (msgpack-encoded frames at 10-step intervals). The standalone dashboard provides real-time analysis:
python flog_server.py

Features: distance/velocity charts, fall detection, CPG/actor weight tracking, cerebellar prediction error, behavioural state timeline.
Render training runs with the full dashboard overlay and data-driven sonification:
# Render a Bittle training video
python scripts/render_bittle.py creatures/bittle/<run>/training_log.bin
# Instagram-format reel
python scripts/render_insta_reel_bittle.py creatures/bittle/<run>/training_log.bin
# Add data-driven audio (SNN crackle, CPG heartbeat, cerebellum tones, DA melody)
python scripts/sonify_flog.py --flog creatures/bittle/<run>/training_log.bin --speed 2 --mux output.mp4

Requires ffmpeg on your PATH (external tool, not a pip package) for video encoding and audio muxing. Install it via apt install ffmpeg, brew install ffmpeg, or the Windows build from ffmpeg.org.
The Brain3D visualization in rendered videos shows actual SNN topology and spike activity from the training data.
v0.8.1: real overlay data. Before v0.8.1, a few readouts in the rendered dashboard overlay were still placeholders, shown for demonstration. As of v0.8.1 they are all wired through to the real values from the training log; a reading that isn't available in a given run is shown as "—" rather than a stand-in. You can verify that a log carries real values with python scripts/check_flog.py <run>/training_log.bin.
mhflocke/
├── scripts/
│ ├── train_baby.py # Baby-KI training loop (intrinsic + shaped reward)
│ ├── bridge_bittle_wifi.py # Bittle hardware bridge (WiFi/WebSocket, same src/brain/)
│ ├── render_bittle.py # Bittle training-video renderer (dashboard overlay)
│ ├── render_insta_reel_bittle.py # Instagram-format renderer
│ └── sonify_flog.py # Data-driven audio from FLOG
├── src/
│ ├── body/ # MuJoCo creature, terrain, OpenCat balance/controller
│ │ ├── bittle.py # Bittle body model
│ │ ├── hardware_drift.py # Mechanical-drift simulation (robot-agnostic, no-op without profile)
│ │ └── ...
│ ├── brain/ # SNN, cerebellum, CPG, cognitive brain
│ │ ├── snn_controller.py # Izhikevich SNN with R-STDP
│ │ ├── cerebellar_learning.py # Marr-Albus-Ito cerebellum
│ │ ├── spinal_cpg.py # Central pattern generator
│ │ ├── topology.py # Shared population sizing (no MuJoCo dep)
│ │ ├── spatial_map.py # Path integration + landmarks
│ │ ├── episode_analyzer.py # Meta-learning: episode comparison
│ │ ├── strategy_adapter.py # Meta-learning: parameter adaptation
│ │ ├── curiosity_hypothesis.py # Meta-learning: exploration + hypothesis generation
│ │ └── ...
│ ├── bridge/ # Task parsing, scene generation, curriculum
│ ├── viz/ # Brain3D, dashboard overlay
│ └── behavior/ # Drive-based behaviour planner
├── creatures/
│ └── bittle/ # Petoi Bittle X configuration
│ ├── bittle.xml # MJCF (measured inertia)
│ ├── scene_mhflocke.xml # Training scene
│ ├── profile.json # Robot profile + SNN topology
│ ├── cpg_config.json # Evolved CPG parameters
│ └── meshes_obj/ # Collision/visual meshes
├── docs/
│ └── FLOG_FORMAT.md
├── flog_server.py # FLOG analysis + dashboard
├── requirements.txt # Simulator dependencies
└── requirements-pi.txt # On-device dependencies (CPU-only)
Full documentation with architecture details, API references, mathematical formulations, and biological background:
Paper 1 — Ablation Study: MH-FLOCKE: Biologically Grounded Embodied Cognition Through a 15-Step Closed-Loop Architecture for Quadruped Locomotion Learning. Marc Hesse (2026). Preprint: aiXiv 260301.000002
Paper 2 — Sim-to-Real: MH-FLOCKE: Sim-to-Real Transfer of Biologically Grounded Spiking Neural Networks for Quadruped Locomotion. Marc Hesse (2026). Preprint: aiXiv 260409.000002
See CHANGELOG.md for full version history.
This project is licensed under the Apache License 2.0.
The Unitree Go2 MJCF model used in the Paper 1 ablation (available at tag v0.4.1-paper1) is
from the MuJoCo Menagerie project (Google
DeepMind), derived from Unitree Robotics URDF descriptions and
licensed under BSD-3-Clause.
MH-FLOCKE is named after the author's late dog Flocke. The current test pilot is Mogli.
@article{hesse2026mhflocke,
title={MH-FLOCKE: Biologically Grounded Embodied Cognition Through a 15-Step Closed-Loop Architecture for Quadruped Locomotion Learning},
author={Hesse, Marc},
year={2026},
note={Independent Researcher, Potsdam, Germany}
}

- Website: mhflocke.com
- Email: info@mhflocke.com
- Reddit: u/mhflocke