Build Daily

Tinley Park · May 29, 2026
database · Neo4j, Inc. · applied

Neo4j

The canonical-state graph database — Behavior nodes, Topic nodes, MemoryEntry log, ~536 typed nodes and ~1,256 relationships from the GodfreyLabs scanner today. Free community edition; runs as a Docker container; Cypher query language. The memory layer every GL agent reads before answering anything load-bearing.

Updated May 24, 2026

Neo4j is the layer that lets a GL agent remember what's true now — not what got logged at a moment in time, but the canonical state any agent can read before acting. Every agent on the stack consults it. This page is the orient-and-anchor surface — official docs at neo4j.com/docs own the Cypher reference.

What it is

A native graph database — data is stored as nodes with labels (types) and properties, connected by directed relationships that also have types and properties. Queries are written in Cypher, a declarative pattern-matching language. Free community edition (GPLv3); enterprise edition is paid (skip).

Distribution: official Docker image, Homebrew formula, or direct download. Runs as a single process: Bolt protocol on :7687 for clients, HTTP and the Browser UI on :7474. The GL deployment runs three containers, each on its own port set: gl-graph-godmode (canonical), gl-graph-oracle (Sage readings), and gl-graph-truthers (truthers project).

The pitch: when the questions you ask are relational ("which projects share a person?"), temporal ("what did Neil say about X first?"), or canonical ("what's the current state of Y?"), a graph fits better than a vector store or a relational table. A vector store's native primitive is "find similar," which answers none of those directly. SQL can answer them, but only through "join until your eyes bleed." A graph models relationships, ordering, and current state as first-class data.

When to use it

Reach for it when:

  • The data is inherently graph-shaped — people, projects, ideas, protocols, goals, with rich cross-relationships.
  • You need canonical state with conflict resolution — "Topic node wins over append-only log" is a real pattern.
  • Retrieval is relational or temporal first, similarity-second. "Show me everything Neil decided about Pendle in the last month" is a graph query, not an embedding query.
  • The corpus has personalization layers — Behavior nodes that change agent decisions per user. Hard to do cleanly in pure-text RAG.
  • You want a single source of truth that multiple agents and processes can read and write consistently.

Skip it when:

  • The workload is pure document RAG — a flat list of passages and a query. LlamaIndex + a vector store is the right shape.
  • The team has no graph-modeling experience and the data shape is genuinely tabular. Forcing a graph on tabular data adds cost with no win.
  • You don't need canonical state — append-only logs and "always re-query the source" cover it.

At a glance

Core concepts

  • Node — an entity. Labeled with one or more types (:Person, :Project, :Behavior).
  • Relationship — a directed, typed edge between nodes ((:Person)-[:WORKS_ON]->(:Project)).
  • Property — key/value on a node or relationship.
  • Cypher — the query language. ASCII-art pattern matching: MATCH (n:Person {name:"Neil"})-[:USES]->(p:Protocol) RETURN p.
  • Constraint — uniqueness or existence guarantees on a label/property. agent_name_unique is a GL example — prevents the case-duplicate bug from hand-rolled MERGEs.
  • Index — on a label/property to make pattern lookups fast.
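The constraint and index concepts above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: only the constraint name agent_name_unique comes from this page; the :Agent label, the name property, the person_name index, and the normalize_name and merge_agent helpers are hypothetical. The REQUIRE syntax assumes Neo4j 4.4 or later. Because a uniqueness constraint compares strings case-sensitively, the sketch also normalizes the key before the MERGE.

```python
# Hypothetical schema: :Agent keyed by `name`, :Person looked up by `name`.
# Only the constraint name agent_name_unique appears in the source page.
SCHEMA_STATEMENTS = [
    "CREATE CONSTRAINT agent_name_unique IF NOT EXISTS "
    "FOR (a:Agent) REQUIRE a.name IS UNIQUE",
    "CREATE INDEX person_name IF NOT EXISTS "
    "FOR (p:Person) ON (p.name)",
]

MERGE_AGENT = (
    "MERGE (a:Agent {name: $name}) "
    "ON CREATE SET a.created_at = datetime() "
    "RETURN a.name AS name"
)


def normalize_name(raw: str) -> str:
    """Collapse whitespace and lowercase so 'Sage ' and 'SAGE' share one key."""
    return " ".join(raw.split()).lower()


def apply_schema(session) -> None:
    """Run each schema statement; `session` is a neo4j Session or a look-alike."""
    for statement in SCHEMA_STATEMENTS:
        session.run(statement)


def merge_agent(session, raw_name: str):
    """Upsert an agent by its normalized name (hypothetical helper)."""
    return session.run(MERGE_AGENT, name=normalize_name(raw_name))
```

Lowercasing the key is one possible normalization. If display casing matters, it could live in a separate property while the constrained key stays normalized.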

The GL deployment shape

Container          | Bolt port | HTTP port | Purpose
gl-graph-godmode   | 7687      | 7474      | Canonical: Behavior, Idea, Goal, Person, Project, Topic, Session, MemoryEntry. The default target.
gl-graph-oracle    | 7690      | 7477      | Sage readings. Separate plumbing keeps reading state isolated from canonical state.
gl-graph-truthers  | 7688      | 7475      | Truthers project. Paused but live.

All three containers share the neo4j / godmode2026 login for local development. Production hardening (real password, TLS, network restrictions) is per-environment.

Node types in active use

  • :Behavior — rules of engagement. "Don't recommend long veAERO locks." Loaded every session.
  • :Topic — canonical state for a coherent thread. "Active GL focus" wins over any conflicting log entry.
  • :MemoryEntry — append-only episodic log with timestamps. Searchable via FTS.
  • :Person, :Project, :Goal, :Protocol, :Idea, :Session, :Agent — domain entities.

How to integrate

Default integration for a new GL surface:

  1. Use the existing canonical container. gl-graph-godmode is the default target. Don't spin up a new database unless the surface needs isolated state.
  2. Wire the Python client. Run pip install neo4j, then from neo4j import GraphDatabase and point the driver at bolt://localhost:7687. For CLI work, cypher-shell ships with the Neo4j Docker image.
  3. Read before write. Topic nodes win over logs on conflict — load the Topic at session start; don't infer state from a log query.
  4. Write via the sanctioned helpers, not raw MERGEs. The agent_name_unique constraint exists because hand-rolled MERGEs with inconsistent casing caused a case-duplicate bug. Use the helper scripts where they exist.
  5. Log corrections to MemoryEntry; update Topic; link them. The pattern: episodic entry + canonical update + a relationship between them. Don't just append; don't just update.
  6. Cypher style. Pattern-first. Use MATCH ... RETURN for reads, MERGE for upsert, CREATE only when you know the node doesn't exist. EXPLAIN and PROFILE show the planner's choices when a query is slow.
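Steps 3 and 5 above (read the Topic first, then log a correction, update the Topic, and link the two) might look like the sketch below. It is illustrative only and is not the sanctioned helper scripts: the property names (name, summary, text, created_at, updated_at) and the :CORRECTS relationship type are assumptions, since this page does not specify them. The functions take any object with a neo4j-style run() method.

```python
LOAD_TOPIC = (
    "MATCH (t:Topic {name: $name}) "
    "RETURN t.summary AS summary, t.updated_at AS updated_at"
)

# One statement does all three parts of the pattern: episodic entry,
# canonical update, and the relationship between them.
RECORD_CORRECTION = (
    "MATCH (t:Topic {name: $topic}) "
    "CREATE (m:MemoryEntry {text: $text, created_at: datetime()})"
    "-[:CORRECTS]->(t) "
    "SET t.summary = $summary, t.updated_at = datetime() "
    "RETURN m.created_at AS created_at"
)


def load_topic(session, name: str):
    """Read canonical state at session start; None if the Topic is missing."""
    record = session.run(LOAD_TOPIC, name=name).single()
    return None if record is None else record["summary"]


def record_correction(session, topic: str, text: str, summary: str):
    """Append a MemoryEntry, update the Topic, and link them in one query."""
    record = session.run(
        RECORD_CORRECTION, topic=topic, text=text, summary=summary
    ).single()
    if record is None:
        # The MATCH found no Topic, so nothing was created or updated.
        raise LookupError(f"no Topic named {topic!r}")
    return record["created_at"]
```

Keeping the write in a single statement means a missing Topic produces no orphaned MemoryEntry, because the CREATE only runs for rows the MATCH returns.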

In the GL stack

builddaily.io

  • Behavior nodes loaded every session. Rules of engagement, voice constraints, scope boundaries — the canonical "don't do X" surface. Agent reads them before responding to anything load-bearing.
  • Topic node for active focus. "gl-active-focus-three-projects" is a Topic that resolves which projects are in scope; when it conflicts with the logs, the Topic wins.
  • MemoryEntry log for episodic recall. "When did Neil first mention X" is a Cypher full-text query.
  • Behavior nodes capture qualitative drafter corrections (slice 2 of the agent-stack post). Every "no, not that — write it like this" becomes a Behavior the next compile reads.
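The "when did Neil first mention X" lookup could be sketched with Neo4j's full-text index procedures. Assumptions: the index name memory_entry_text and the text and created_at properties are hypothetical, and filtering by speaker is left out because this page does not say how a MemoryEntry links to a Person. The search terms are passed through as a Lucene query string, so special characters would need escaping by the caller.

```python
CREATE_FTS_INDEX = (
    "CREATE FULLTEXT INDEX memory_entry_text IF NOT EXISTS "
    "FOR (m:MemoryEntry) ON EACH [m.text]"
)

# Earliest matching entry: order by timestamp ascending and keep one row.
FIRST_MENTION = (
    "CALL db.index.fulltext.queryNodes('memory_entry_text', $terms) "
    "YIELD node, score "
    "RETURN node.text AS text, node.created_at AS created_at, score "
    "ORDER BY created_at ASC LIMIT 1"
)


def ensure_fts_index(session) -> None:
    """Create the full-text index if it does not exist yet."""
    session.run(CREATE_FTS_INDEX)


def first_mention(session, terms: str):
    """Return the earliest MemoryEntry matching `terms`, or None."""
    record = session.run(FIRST_MENTION, terms=terms).single()
    if record is None:
        return None
    return {"text": record["text"], "created_at": record["created_at"]}
```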

paiddaily.io

  • Person + Protocol + Position graph. Wallet addresses link to people; people link to projects; projects link to positions in protocols. "Who has exposure to this Pendle market" is a single Cypher query.
  • Catalyst lineage. Pendle catalysts as nodes with :PRECEDES relationships when one catalyst sets up another. Graph beats SQL for "show me the chain that led to today's setup."
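The catalyst lineage idea maps onto a variable-length path match. In this sketch only the :PRECEDES relationship comes from the page; the :Catalyst label, the id and name properties, and the hop limit of 8 are assumptions chosen for illustration.

```python
# Walk :PRECEDES edges backwards from one catalyst and keep the longest chain.
# The upper bound (8 hops) keeps the variable-length match from running away.
CATALYST_CHAIN = (
    "MATCH path = (root:Catalyst)-[:PRECEDES*1..8]->"
    "(latest:Catalyst {id: $catalyst_id}) "
    "RETURN [n IN nodes(path) | n.name] AS chain "
    "ORDER BY length(path) DESC LIMIT 1"
)


def catalyst_chain(session, catalyst_id: str) -> list:
    """Return catalyst names from the earliest setup to the given catalyst."""
    record = session.run(CATALYST_CHAIN, catalyst_id=catalyst_id).single()
    return [] if record is None else list(record["chain"])
```

An empty list means the catalyst has no recorded predecessors within the hop limit.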

sagedaily.io

  • Per-user state. Standing intention, chart, cycle, prior readings — all anchored on a :Person node. Each reading's :OracleReading node links back to inputs and :Cards drawn.
  • Reading lineage. Sequential readings link via :FOLLOWED_BY; archetypes that recur thread through :RESONATES_WITH relationships. "What's coming up for you" is a literal pattern match.

Gotchas

  • Cypher is not SQL. Joining feels like "extending the pattern"; thinking SQL-first produces awkward queries that the planner can't optimize. Learn the patterns.
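  • MERGE without uniqueness constraints can silently create duplicates. Concurrent MERGEs on the same key can each create a node, and MERGE matches property values exactly. The agent_name_unique constraint exists because of a 2026-05-04 case-duplicate bug. Every entity type that's MERGE'd by a string key needs a uniqueness constraint, and the key's casing should be normalized before the write, because the constraint compares strings case-sensitively.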
  • Property explosion is real. Don't dump arbitrary JSON onto a node. Properties should be queryable; deeply-nested data belongs in linked nodes or a separate store.
  • Bolt protocol is binary, not HTTP. Don't try to hit :7687 with curl; use a Bolt driver. Browser UI on :7474 is for exploration.
  • Backup is not automatic. Community edition lacks online backup tooling. The neo4j-admin database dump command only works while the database is stopped, so a backup means a short outage. Schedule it.
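A scheduled offline dump for one container could be scripted roughly as below. This is a sketch under several assumptions: the neo4j/neo4j-admin:5 image tag (it should match the server version actually deployed), the default database name neo4j, the host backup directory, and the idea of reusing the container's volumes via --volumes-from are all illustrative and not taken from the GL deployment. The dump_commands and run_dump names are hypothetical.

```python
import subprocess


def dump_commands(container: str, database: str = "neo4j",
                  image: str = "neo4j/neo4j-admin:5",
                  backup_dir: str = "/var/backups/neo4j") -> list:
    """Build the stop, dump, start command sequence for an offline dump."""
    return [
        ["docker", "stop", container],
        # One-off admin container that mounts the stopped container's volumes
        # and writes <database>.dump into the host backup directory.
        ["docker", "run", "--rm", "--volumes-from", container,
         "-v", f"{backup_dir}:/backups", image,
         "neo4j-admin", "database", "dump", database, "--to-path=/backups"],
        ["docker", "start", container],
    ]


def run_dump(container: str, **kwargs) -> None:
    """Run the sequence; always restart the database, even if the dump fails."""
    stop, dump, start = dump_commands(container, **kwargs)
    subprocess.run(stop, check=True)
    try:
        subprocess.run(dump, check=True)
    finally:
        subprocess.run(start, check=True)
```

Calling run_dump("gl-graph-godmode") from a cron job or systemd timer would cover the scheduling. An existing dump file in the target directory may need to be rotated or removed first, since the tool does not overwrite by default.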

Risks

  • Single-vendor open-core. Neo4j Inc. drives the project. Community edition is GPLv3; the paid enterprise features (clustering, online backup, advanced security) are a real upgrade path that the GL stack doesn't need today.
  • Operational footprint. Containerized Neo4j wants ~1-2GB RAM and meaningful disk for a non-trivial graph. Budget the box accordingly.
  • Cypher learning curve. Steeper than SQL for some patterns. The investment pays back fast once the relational/temporal queries that were painful in SQL become one-liners.

Related

  • LlamaIndex — handles document retrieval. Neo4j handles canonical state. Two systems, two jobs; both consulted at agent answer time.
  • Langfuse — captures quantitative traces. Neo4j captures qualitative state (Behavior nodes, Topics). No duplication.