
ML Engineering & Research

STEP

STEP is a unified Spiking Transformer evaluation platform that reproduces and fairly compares representative spiking transformers under one training pipeline. It covers image classification, sequential image classification, and module-wise ablations.

Items: 8
Subjects: 8
Observed: 53%
Subject type: Model
License: unknown
Domains: general, ml_engineering
Modality: image

Response matrix

Every model, scored item by item.

Each row is an AI model and each column an item, ordered so the strongest models and easiest items gather toward one corner. 8 subjects × 8 items, 53% of cells evaluated.
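The ordering described above can be sketched as a sort of rows and columns by their mean observed score, alongside the coverage figure. This is a minimal sketch under stated assumptions: the page does not say which ordering method it uses, so sorting by mean is a guess, and the subject names, item names, and scores below are hypothetical toy data, with `None` standing for an unobserved cell.

```python
from statistics import mean

def observed_mean(values):
    """Mean of the non-None entries, or None if nothing was observed."""
    seen = [v for v in values if v is not None]
    return mean(seen) if seen else None

def order_matrix(matrix, subjects, items):
    """Sort rows and columns by descending mean observed score.

    Assumption: the dashboard's real ordering method is not documented;
    mean-based sorting is used here only for illustration.
    """
    row_means = [observed_mean(row) for row in matrix]
    col_means = [observed_mean([row[j] for row in matrix])
                 for j in range(len(items))]
    # Rows or columns with no observations sort last.
    key = lambda m: -1.0 if m is None else m
    row_order = sorted(range(len(subjects)),
                       key=lambda i: key(row_means[i]), reverse=True)
    col_order = sorted(range(len(items)),
                       key=lambda j: key(col_means[j]), reverse=True)
    ordered = [[matrix[i][j] for j in col_order] for i in row_order]
    return (ordered,
            [subjects[i] for i in row_order],
            [items[j] for j in col_order])

def coverage(matrix):
    """Fraction of cells that hold an observed score."""
    cells = [v for row in matrix for v in row]
    return sum(v is not None for v in cells) / len(cells)

# Hypothetical toy data, not taken from the STEP results.
subjects = ["model_a", "model_b", "model_c"]
items = ["item_x", "item_y", "item_z"]
matrix = [
    [0.70, None, 0.90],
    [0.95, 0.80, None],
    [None, 0.60, 0.85],
]

ordered, ordered_subjects, ordered_items = order_matrix(matrix, subjects, items)
print(ordered_subjects, ordered_items, f"{coverage(matrix):.0%} observed")
```

With a partially observed matrix, each mean is taken over a different set of cells, so the resulting order is only a rough guide.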


[Figure: STEP response matrix, AI models (rows) against items (columns). Color scale runs from low to high; unobserved cells are marked separately.]

Scale: 0 to 1 (per task): accuracy (classification) or mIoU / mAP (segmentation, detection) for spiking transformers; percentages ÷ 100.
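The scale rule above amounts to dividing percentage-valued metrics by 100 and leaving fractional ones alone. A minimal sketch follows, assuming the unit of each reported value is known in advance; the function name `to_unit_scale` and its `unit` parameter are hypothetical and not part of STEP.

```python
def to_unit_scale(value, unit="fraction"):
    """Map a reported metric onto the 0-to-1 scale.

    unit="percent" divides by 100; unit="fraction" leaves the value as is.
    Hypothetical helper for illustration only.
    """
    if unit == "percent":
        value = value / 100.0
    elif unit != "fraction":
        raise ValueError(f"unknown unit: {unit!r}")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"value {value} falls outside [0, 1]")
    return value

# An accuracy reported as 94.7% becomes 0.947 on the unit scale.
normalized = round(to_unit_scale(94.7, unit="percent"), 3)
print(normalized)
```

Passing the unit explicitly avoids guessing from magnitude, which would misread a percentage below 1 as a fraction.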

Sample items

What the questions look like — and how subjects answer.

A spread of items across the difficulty range. This benchmark does not publish per-answer traces, so each item shows which subjects succeeded.

Item 1: 86% solve rate

STEP classification dataset: cifar100

Subject outcomes

  • QKFormer: score 0.947
  • SGLFormer: score 0.944
  • Spikformer: score 0.774

Item 2: 91% solve rate

STEP encoding ablation scheme: rate

Subject outcomes

  • Spikformer: score 0.988
  • Spikf_SEMM: score 0.987

Item 3: 91% solve rate

STEP classification dataset: scifar

Subject outcomes

  • Spikformer: score 0.987
  • Spikf_SEMM: score 0.986
  • SDT: score 0.823

Item 4: 92% solve rate

STEP encoding ablation scheme: phase

Subject outcomes

  • SDT: score 0.991
  • Spikf_SEMM: score 0.991
  • Spikformer: score 0.828

Item 5: 99% solve rate

STEP classification dataset: psmnist

Subject outcomes

  • Spikf_SEMM: score 1
  • SDT: score 0.999
  • Spikformer: score 0.98

Item 6: 100% solve rate

STEP classification dataset: smnist

Subject outcomes

  • Spikf_SEMM: score 1
  • SDT: score 1
  • Spikformer: score 0.988

Subjects

The models, agents, and reward models evaluated.

8 subjects, ranked by mean response across this benchmark's items.

  1. Spikf_SEMM: 0.935
  2. SDT: 0.934
  3. Spikformer: 0.93
  4. SGLFormer: 0.927
  5. QKFormer: 0.926
  6. Spikingresformer: 0.922
  7. Spikingformer: 0.921
  8. SWFormer: 0.912
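
A ranking like the one above can be sketched as a mean over each subject's observed items, reported together with how many items that mean covers. This is a sketch under stated assumptions: the page does not state how unobserved cells enter the mean, so skipping them is a guess, and the subject names and scores below are hypothetical toy data.

```python
def rank_subjects(scores):
    """Rank subjects by mean score over their observed items.

    scores maps a subject name to a list of per-item scores, with None
    marking an unobserved item. Returns (rank, subject, mean, n_observed)
    tuples. Assumption: unobserved items are skipped, not counted as zero.
    """
    rows = []
    for subject, values in scores.items():
        seen = [v for v in values if v is not None]
        if seen:  # subjects with no observations are left unranked
            rows.append((subject, sum(seen) / len(seen), len(seen)))
    rows.sort(key=lambda r: r[1], reverse=True)
    return [(rank, name, round(avg, 3), n)
            for rank, (name, avg, n) in enumerate(rows, start=1)]

# Hypothetical toy data, not taken from the STEP results.
toy_scores = {
    "model_a": [0.9, None, 0.7],
    "model_b": [None, 0.95, None],
    "model_c": [0.5, 0.7, 0.9],
}

ranking = rank_subjects(toy_scores)
for rank, name, avg, n in ranking:
    print(f"{rank}. {name}: {avg} (over {n} observed items)")
```

Showing the observed-item count next to each mean matters when only part of the matrix is filled: here `model_b` ranks first on a single item, which is weaker evidence than a mean over all three.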