Manim Diagram Quality Testkit — Design

For Claude: REQUIRED SUB-SKILL: Use superpowers:writing-plans to create the implementation plan from this design.

Goal: Automatically detect rendering quality problems in manim diagrams (overlapping labels, arrows crossing text, elements outside viewport, visually indistinguishable overlapping lines) by validating the scene graph geometry before rendering.

Architecture: Extract exact coordinates from manim’s object model after construct() runs (no rendering needed). Run geometric checks on the extracted data. Integrate into pre-commit hooks and Dagger CI.

Tech Stack: manim (scene graph), shapely (geometric intersection), pytest (test harness)


Context

Manim renders PNGs — no SVG output. But manim mobjects expose exact bounding boxes, centers, and start/end points via get_corner(), get_start(), get_end(), width, height. We validate geometry at the scene-graph level rather than analyzing pixels.

This approach is inspired by the SVG quality testkit in diagram-claude-plugin (test_svg_quality.py), which parses Graphviz SVG DOM structure to detect arrows crossing text, labels overlapping, and text crossing borders. We mirror its patterns (parsed intermediate representation, check functions returning list[str], TestFixturesMustFail) but extract from manim’s object graph instead of XML.

Data Model

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
 
@dataclass
class BBox:
    x_min: float
    y_min: float
    x_max: float
    y_max: float
 
@dataclass
class Label:
    name: str         # text content, e.g. "∇f" or "G = ⅓(a+b+c)"
    bbox: BBox
    center: Point
 
@dataclass
class ArrowSegment:
    name: str         # positional, e.g. "arrow_0"
    start: Point
    end: Point
 
@dataclass
class LineSegment:
    name: str         # positional, e.g. "line_0" or "dashed_line_2"
    start: Point
    end: Point
 
@dataclass
class ParsedScene:
    scene_name: str
    labels: list[Label]
    arrows: list[ArrowSegment]
    lines: list[LineSegment]
    frame_width: float
    frame_height: float

Scene Extractor

extract_scene(scene_class: type[Scene]) -> ParsedScene

  1. Instantiate the Scene subclass.
  2. Call construct() — this populates scene.mobjects without rendering.
  3. Walk scene.mobjects recursively (mobjects can contain sub-mobjects).
  4. Classify by type (test Arrow before Line, since Arrow subclasses Line in manim):
    • Text / MathTex → Label (bbox from get_corner(UL), get_corner(DR), name from text content)
    • Arrow → ArrowSegment (from get_start(), get_end())
    • Line / DashedLine → LineSegment (from get_start(), get_end())
    • Polygon → extract edges as LineSegment pairs (consecutive vertices)
    • Dot / VMobject (curves) → skip for now
  5. Read config.frame_width and config.frame_height for viewport bounds.
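
The walk and classification steps above can be sketched without manim installed. The classes below are stand-ins written for illustration, not manim's own, and walk and classify are hypothetical helper names. The one detail carried over from manim is that Arrow subclasses Line, which is why the Arrow test has to come first.

```python
from dataclasses import dataclass, field


# Stand-in classes (assumptions) so the sketch runs without manim.
@dataclass
class Mobject:
    submobjects: list = field(default_factory=list)


class Line(Mobject):
    pass


class Arrow(Line):  # mirrors manim, where Arrow subclasses Line
    pass


class Text(Mobject):
    pass


def walk(mobjects):
    """Yield every mobject depth-first, parents before children."""
    for mob in mobjects:
        yield mob
        yield from walk(mob.submobjects)


def classify(mob):
    """Return the target record type name, or None to skip the mobject."""
    if isinstance(mob, Text):
        return "Label"
    if isinstance(mob, Arrow):  # must precede the Line test
        return "ArrowSegment"
    if isinstance(mob, Line):
        return "LineSegment"
    return None


group = Mobject(submobjects=[Text(), Arrow(), Mobject(submobjects=[Line()])])
kinds = [classify(m) for m in walk([group])]
```

A real extractor would probably stop descending once a node is classified, since manim text and arrows carry their glyphs and tips as sub-mobjects.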

Naming: labels use their text content. Arrows and lines use f"{type_name}_{index}" since manim mobjects don’t carry variable names at runtime.
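
As a rough sketch of the Polygon step and the positional naming scheme, the hypothetical helper polygon_edges below turns consecutive vertices into named LineSegment records. It assumes vertices arrive as (x, y, z) triples, the shape manim's Polygon.get_vertices() is expected to return, and that the caller supplies the running index.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


@dataclass
class LineSegment:
    name: str
    start: Point
    end: Point


def polygon_edges(vertices, start_index=0):
    """Turn consecutive polygon vertices into a closed loop of LineSegments.

    Each vertex is (x, y) or (x, y, z); any z component is ignored.
    """
    points = [Point(float(v[0]), float(v[1])) for v in vertices]
    edges = []
    for i, start in enumerate(points):
        end = points[(i + 1) % len(points)]  # wrap around to close the polygon
        edges.append(LineSegment(f"line_{start_index + i}", start, end))
    return edges


triangle = [(0, 0, 0), (4, 0, 0), (0, 3, 0)]
edges = polygon_edges(triangle)
```

Passing start_index lets polygon edges share one counter with standalone lines, so names stay unique within a scene.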

Check Functions

Each check: (ParsedScene, **thresholds) -> list[str]. Empty list = pass.

All thresholds in manim scene units (default frame ≈ 14.2 × 8 units).

1. check_label_overlaps(scene, *, padding=0.05)

For every pair of labels, inflate each bbox by padding, test axis-aligned intersection. Catches labels whose bounding boxes touch or overlap.
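
A minimal sketch of this check, assuming a trimmed-down Label record and taking the label list directly rather than a full ParsedScene. The helpers inflate and boxes_intersect are illustrative names, not part of the design.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class BBox:
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class Label:
    name: str
    bbox: BBox


def inflate(b, padding):
    """Grow a box by padding on every side."""
    return BBox(b.x_min - padding, b.y_min - padding,
                b.x_max + padding, b.y_max + padding)


def boxes_intersect(a, b):
    """True when two axis-aligned boxes overlap or touch."""
    return (a.x_min <= b.x_max and b.x_min <= a.x_max
            and a.y_min <= b.y_max and b.y_min <= a.y_max)


def check_label_overlaps(labels, *, padding=0.05):
    errors = []
    for first, second in combinations(labels, 2):
        if boxes_intersect(inflate(first.bbox, padding),
                           inflate(second.bbox, padding)):
            errors.append(f"labels overlap: {first.name!r} and {second.name!r}")
    return errors


labels = [
    Label("A", BBox(0.0, 0.0, 1.0, 1.0)),
    Label("B", BBox(1.05, 0.0, 2.0, 1.0)),  # 0.05 gap: closed by the padding
    Label("C", BBox(3.0, 0.0, 4.0, 1.0)),   # well clear of both
]
errors = check_label_overlaps(labels)
```

Because both boxes are inflated, two labels are flagged whenever the gap between them is at most twice the padding.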

2. check_arrow_crosses_label(scene, *, padding=0.08)

For each arrow, build a Shapely LineString(start, end). For each label, build padded bbox as Shapely polygon. Test intersection. Skip self-labels (if we later add arrow-label associations).
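
To show the geometry involved, here is a dependency-free sketch that swaps Shapely out for a hand-written slab-clipping test of a segment against an axis-aligned box. In the real check, Shapely's LineString and a box polygon would replace segment_hits_box. The tuple-based inputs and helper names are assumptions made to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class BBox:
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def segment_hits_box(start, end, box):
    """Slab clipping: does the segment start-end touch the axis-aligned box?"""
    t_enter, t_exit = 0.0, 1.0
    axes = (
        (start[0], end[0] - start[0], box.x_min, box.x_max),
        (start[1], end[1] - start[1], box.y_min, box.y_max),
    )
    for p0, delta, lo, hi in axes:
        if delta == 0.0:
            if p0 < lo or p0 > hi:
                return False  # parallel to this slab and outside it
            continue
        t_a, t_b = (lo - p0) / delta, (hi - p0) / delta
        t_enter = max(t_enter, min(t_a, t_b))
        t_exit = min(t_exit, max(t_a, t_b))
        if t_enter > t_exit:
            return False  # the per-axis parameter ranges do not overlap
    return True


def check_arrow_crosses_label(arrows, labels, *, padding=0.08):
    """arrows: (name, start, end) tuples; labels: (name, BBox) tuples."""
    errors = []
    for arrow_name, start, end in arrows:
        for label_name, b in labels:
            padded = BBox(b.x_min - padding, b.y_min - padding,
                          b.x_max + padding, b.y_max + padding)
            if segment_hits_box(start, end, padded):
                errors.append(f"{arrow_name} crosses label {label_name!r}")
    return errors


arrows = [
    ("arrow_0", (0.0, 0.0), (4.0, 0.0)),  # runs straight through the label
    ("arrow_1", (0.0, 2.0), (4.0, 2.0)),  # passes above it
]
labels = [("f(x)", BBox(1.0, -0.5, 2.0, 0.5))]
errors = check_arrow_crosses_label(arrows, labels)
```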

3. check_outside_frame(scene, *, margin=0.1)

Frame bounds: x ∈ [-frame_width/2, +frame_width/2], y ∈ [-frame_height/2, +frame_height/2]. Any label bbox corner or arrow endpoint that falls outside the frame shrunk by margin on each side is reported as an error.
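
One possible shape for this check, assuming the caller has already flattened label corners and arrow endpoints into named (x, y) pairs. The frame constants reflect the default 8-unit-high, 16:9 frame mentioned above.

```python
def check_outside_frame(points, frame_width, frame_height, *, margin=0.1):
    """points: (name, (x, y)) pairs for label bbox corners and arrow endpoints."""
    half_w = frame_width / 2 - margin   # usable half-width after the margin
    half_h = frame_height / 2 - margin  # usable half-height after the margin
    return [
        f"{name} at ({x:.2f}, {y:.2f}) is outside the frame"
        for name, (x, y) in points
        if abs(x) > half_w or abs(y) > half_h
    ]


FRAME_HEIGHT = 8.0
FRAME_WIDTH = FRAME_HEIGHT * 16 / 9  # about 14.2 units

points = [
    ("label 'far' corner", (20.0, 0.0)),  # far beyond the right edge
    ("arrow_0.end", (7.0, 0.0)),          # inside, just short of the margin
    ("arrow_1.end", (0.0, 3.95)),         # inside the frame but within the margin
]
errors = check_outside_frame(points, FRAME_WIDTH, FRAME_HEIGHT)
```

The frame is centred on the origin, so comparing absolute values against the shrunken half-extents covers all four edges at once.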

4. check_line_near_parallel(scene, *, distance=0.15, angle_degrees=10)

For each pair of (arrow, line) or (line, line):

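  • Compute the distance between the two midpoints, measured perpendicular to the first segment’s direction.
  • Compute the angle between the direction vectors, treating opposite directions as parallel.
  • If the distance is below distance AND the angle is below angle_degrees, flag the pair.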

Catches arrows running along polygon edges (the OC-along-AC problem).
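
The pairwise test might look like the sketch below. It makes two assumptions: the distance step is read as the midpoint offset measured perpendicular to the first segment, and the angle is folded into [0°, 90°] so that opposite directions count as parallel. The function name near_parallel and the tuple-based segments are illustrative.

```python
import math


def near_parallel(seg_a, seg_b, *, distance=0.15, angle_degrees=10):
    """Each segment is ((x0, y0), (x1, y1)); returns True when the pair is flagged."""
    (ax0, ay0), (ax1, ay1) = seg_a
    (bx0, by0), (bx1, by1) = seg_b
    dax, day = ax1 - ax0, ay1 - ay0
    dbx, dby = bx1 - bx0, by1 - by0
    len_a, len_b = math.hypot(dax, day), math.hypot(dbx, dby)
    if len_a == 0.0 or len_b == 0.0:
        return False  # a zero-length segment has no direction

    # Angle between the undirected lines, folded into [0, 90] degrees.
    cos_theta = abs(dax * dbx + day * dby) / (len_a * len_b)
    angle = math.degrees(math.acos(min(1.0, cos_theta)))

    # Offset between midpoints, projected onto the normal of seg_a.
    mx = (bx0 + bx1) / 2 - (ax0 + ax1) / 2
    my = (by0 + by1) / 2 - (ay0 + ay1) / 2
    perpendicular = abs(mx * day - my * dax) / len_a

    return perpendicular < distance and angle < angle_degrees


edge = ((0.0, 0.0), (4.0, 0.0))
hugging = ((0.0, 0.05), (4.0, 0.05))   # 0.05 units above, same direction
reversed_ = ((4.0, 0.05), (0.0, 0.05))  # same line, opposite direction
distant = ((0.0, 1.0), (4.0, 1.0))      # parallel but a full unit away
crossing = ((2.0, -2.0), (2.0, 2.0))    # perpendicular through the midpoint
```

With the default thresholds, hugging and reversed_ are flagged against edge, while distant and crossing are not.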

Aggregator

def run_all_checks(scene: ParsedScene) -> list[str]:
    errors = []
    errors.extend(check_label_overlaps(scene))
    errors.extend(check_arrow_crosses_label(scene))
    errors.extend(check_outside_frame(scene))
    errors.extend(check_line_near_parallel(scene))
    return errors

File Layout

docs/diagrams/
  multivariable-calculus-manim.py    # the diagram scenes
  manim_quality.py                   # extractor + check functions
  test_manim_quality.py              # pytest harness + must-fail fixtures

Test Harness

Real scenes — parametrized

SCENE_CLASSES = [
    CoordinateFreeCentroid,
    ArcLengthComparison,
    GradientPerpendicularToLevelCurves,
    CriticalPointTypes,
    LagrangeTangentLevelCurves,
]
 
@pytest.fixture(params=SCENE_CLASSES)
def parsed_scene(request):
    return extract_scene(request.param)
 
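class TestDiagramQuality:
    def test_no_label_overlaps(self, parsed_scene):
        assert check_label_overlaps(parsed_scene) == []
    def test_no_arrow_crosses_label(self, parsed_scene):
        assert check_arrow_crosses_label(parsed_scene) == []
    def test_nothing_outside_frame(self, parsed_scene):
        assert check_outside_frame(parsed_scene) == []
    def test_no_near_parallel_lines(self, parsed_scene):
        assert check_line_near_parallel(parsed_scene) == []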

Must-fail fixtures

Deliberately broken Scene subclasses defined in the test file:

  • OverlappingLabelsScene — two Text at the same position
  • ArrowCrossesLabelScene — an Arrow passing through a Text bbox
  • OutOfFrameScene — a Text placed at x=20 (well beyond frame)
  • ParallelLinesScene — two Line objects 0.05 units apart, same angle

The harness asserts that every fixture trips at least one check:

class TestFixturesMustFail:
    @pytest.fixture(params=[
        OverlappingLabelsScene,
        ArrowCrossesLabelScene,
        OutOfFrameScene,
        ParallelLinesScene,
    ])
    def bad_scene(self, request):
        return extract_scene(request.param)
 
    def test_fixture_fails_at_least_one_check(self, bad_scene):
        errors = run_all_checks(bad_scene)
        assert errors, f"{bad_scene.scene_name} passed ALL checks"

Integration

Dagger CI (dagger/main.go)

The validateDiagrams function runs quality tests first, then pixel-diff:

  1. pip install manim shapely pytest (container already has texlive)
  2. pytest docs/diagrams/test_manim_quality.py -v — fail fast if geometry is broken
  3. Render all scenes → compare PNGs against committed (existing pixel-diff logic)

GitLab CI

No new job. The existing validate-diagrams job calls cd dagger && go run . validate-diagrams, which now includes both steps.

Pre-commit hook

Local fast feedback — no rendering, no LaTeX needed:

pytest docs/diagrams/test_manim_quality.py -v

Triggers when any file in docs/diagrams/ is staged. This is where Claude gets immediate feedback during iteration — bad coordinates are caught before commit.

Dependencies

  • shapely — added to manim script’s inline uv dependencies
  • pytest — test runner only, not a runtime dependency
  • No pixel analysis libraries, no OCR, no image processing