Memory Layers (chat.pinaivu.ai)

Chat memory is arguably the most sensitive data in the whole system, so it follows the same “don’t trust a private DB” principle as routing receipts — via two separate Walrus-backed layers that solve two different problems.

The two layers

|  | chat-relayer (cross-session) | node (intra-session) |
| --- | --- | --- |
| Scope | Across conversations, long-term | Within one conversation, short-term |
| What it remembers | User facts (“user prefers Rust”, “user is in Bengaluru”) | Verbatim conversation turns of this session |
| Recall trigger | Every turn of every session | Every turn of the current session |
| Selection | Embedding similarity over all memories | Sliding window from session start |
| Storage shape | pgvector (similarity index) + Walrus blobs (fact content) | Encrypted session blob on Walrus, keyed by `session_id` |

Intra-session continuity (node-side)

Within a single conversation, the node serving a turn needs the prior turns to keep the thread coherent — but a different node might serve the next turn, since the network doesn’t pin a client to one operator. Each node:

1. Fetches the session’s encrypted Walrus blob (keyed by session_id), decrypts it with a key only the client controls (session_key), and assembles the context window.
2. After the turn, re-encrypts and re-uploads, chaining the new blob to the previous one (prev_blob_id), so a session’s history is a tamper-evident chain on Walrus, not a row in a database that any single node operator can edit.

This means any node — not just whichever one served turn 1 — can pick up turn 2 with full continuity. Decentralized compute only works in practice because the state isn’t trapped on one machine either.
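
The chaining idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the node's actual code: the envelope layout, the `append_turn` and `walk_chain` helpers, and the use of SHA-256 as a stand-in for a content-derived blob ID are all hypothetical, the `store` dict stands in for Walrus, and encryption is left out (the ciphertext is treated as opaque bytes).

```python
import hashlib
import json

def blob_id(blob: bytes) -> str:
    # Assumption: a blob's ID is derived from its content. SHA-256 stands in here.
    return hashlib.sha256(blob).hexdigest()

def append_turn(store: dict, prev_blob_id, ciphertext: bytes) -> str:
    """Wrap an already-encrypted turn with a link to its predecessor and store it."""
    envelope = json.dumps(
        {"prev_blob_id": prev_blob_id, "ciphertext": ciphertext.hex()},
        sort_keys=True,
    ).encode()
    new_id = blob_id(envelope)
    store[new_id] = envelope
    return new_id

def walk_chain(store: dict, head_id: str) -> list:
    """Return ciphertexts oldest-first, checking every blob against its ID."""
    out = []
    current = head_id
    while current is not None:
        envelope = store[current]
        if blob_id(envelope) != current:
            raise ValueError(f"blob {current} was modified")
        record = json.loads(envelope)
        out.append(bytes.fromhex(record["ciphertext"]))
        current = record["prev_blob_id"]
    out.reverse()
    return out

# Two turns, possibly written by two different nodes.
store = {}
head = append_turn(store, None, b"turn-1-ciphertext")
head = append_turn(store, head, b"turn-2-ciphertext")
chain = walk_chain(store, head)
```

Because each envelope embeds the ID of the one before it, changing an earlier blob changes its ID and breaks the link that later blobs hold. A node that only knows the head ID can therefore detect edits anywhere in the history.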

Cross-session memory (chat-relayer)

Separately, chat-relayer gives chat.pinaivu.ai users long-term memory across different conversations — the equivalent of “remember I prefer concise answers” persisting weeks later. This layer:

- Stores the actual fact content as an encrypted blob on Walrus. This is the same principle as routing receipts: the durable copy lives outside any single company’s private infrastructure.
- Keeps only a pgvector embedding and the Walrus blob pointer in chat-relayer’s own Postgres. That is enough to do similarity search (“what’s relevant to this new message”) without the plaintext fact ever sitting in that database.
- Derives each user’s encryption key via HKDF from a master secret, scoped per owner_address, so one user’s memories aren’t decryptable using another user’s key, even though they’re stored in the same blob network.
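
A minimal sketch of the index-then-fetch shape described above, in Python. The row layout, the `recall` helper, and the choice of cosine similarity are illustrative assumptions (the text only says “similarity”). In the real service pgvector would do the ranking inside Postgres, and the returned pointers would then be fetched from Walrus and decrypted.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recall(index_rows, query_embedding, top_k=2):
    """Return the blob pointers of the top_k most similar memories.

    Each row holds only an embedding and a pointer, never the plaintext fact.
    """
    ranked = sorted(
        index_rows,
        key=lambda row: cosine_similarity(row["embedding"], query_embedding),
        reverse=True,
    )
    return [row["blob_id"] for row in ranked[:top_k]]

# Hypothetical index rows; real embeddings would have far more dimensions.
index_rows = [
    {"owner_address": "0xabc", "embedding": [1.0, 0.0, 0.0], "blob_id": "blob-rust"},
    {"owner_address": "0xabc", "embedding": [0.0, 1.0, 0.0], "blob_id": "blob-city"},
    {"owner_address": "0xabc", "embedding": [0.9, 0.1, 0.0], "blob_id": "blob-concise"},
]
hits = recall(index_rows, [1.0, 0.05, 0.0], top_k=2)
```

Here `hits` contains only blob pointers. Turning them into facts still requires a Walrus fetch and a decryption step, which is where the per-owner key comes in.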

One honest caveat. The per-user key derivation in chat-relayer (MemoryCrypto) is explicitly a placeholder today, not a finished trust-minimization story: the master secret it derives from is held by the relayer, not by the user’s own wallet. As it stands, the relayer operator could technically derive any user’s key. Real user-controlled access has not landed yet; the eventual plan is to wire in Seal-style SessionKey export, gated by the user’s own wallet signature. So the storage is already decentralized (Walrus, not a private DB) and tamper-evident, but the access control on top of it is still a centralized placeholder.
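
To make both the scoping and the caveat concrete, here is a small Python sketch of per-owner key derivation using HKDF as specified in RFC 5869. It is not MemoryCrypto's actual code: the choice of SHA-256, the empty salt, the `memory-key:` info prefix, and the `derive_user_key` helper are all assumptions made for illustration.

```python
import hashlib
import hmac

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested key is too long for HKDF-SHA256")
    if not salt:
        salt = b"\x00" * hash_len
    # Extract: condense the input keying material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: stretch the pseudorandom key, binding the output to `info`.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def derive_user_key(master_secret: bytes, owner_address: str) -> bytes:
    # Assumption: the owner address is bound in through HKDF's `info` field.
    return hkdf_sha256(master_secret, salt=b"", info=b"memory-key:" + owner_address.encode())

master_secret = b"relayer-held-master-secret"  # placeholder value for the example
alice_key = derive_user_key(master_secret, "0xalice")
bob_key = derive_user_key(master_secret, "0xbob")
rederived_alice_key = derive_user_key(master_secret, "0xalice")
```

The two users get different keys, which is the scoping property. The last line shows the caveat: anyone who holds `master_secret` can rederive any user's key from the address alone.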

How the two layers combine in a single turn

These aren’t redundant — they answer different questions, and a real chat.pinaivu.ai turn uses both, in sequence:

1. User sends a message to chat-relayer.

2. chat-relayer queries its OWN cross-session layer first:
      memory::recall(new_message) → pgvector similarity search
      → decrypt matching Walrus blobs
      → e.g. "user prefers concise answers", "user is building Pinaivu in Rust"
   These are long-term facts, true across every past conversation,
   not just this one.

3. chat-relayer dispatches to the coordinator/node, attaching those
   recalled facts as memwal_context alongside the new message.

4. The WINNING NODE — which may have never served this user before —
   fetches its OWN intra-session layer:
      fetch the session's Walrus blob chain (session_id)
      → decrypt with the session_key the client holds
      → recent verbatim turns of THIS conversation

5. The node assembles the final prompt:
      system_prompt + memwal_context (cross-session facts)
                     + recent_turns   (intra-session history)
                     + new_message
   then runs inference.

6. The node persists the new turn back to its Walrus session chain
   (so the NEXT node, whichever one wins next turn's auction, has
   full intra-session continuity too).

7. chat-relayer fires memory::analyze in the background — extracts
   any new durable facts from this turn and writes them into its
   own cross-session layer for future conversations.
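
Step 5 above can be sketched as a small Python function. The `assemble_prompt` helper, the “Known facts about the user” heading, the `role: text` line format, and the choice to keep only the most recent `max_turns` turns are hypothetical; the text specifies only the order of the four ingredients.

```python
def assemble_prompt(system_prompt, memwal_context, session_turns, new_message, max_turns=4):
    """Combine both memory layers with the new message, in the order from step 5."""
    # Assumption: the sliding window keeps the most recent turns of this session.
    recent_turns = session_turns[-max_turns:] if max_turns > 0 else []
    parts = [system_prompt]
    if memwal_context:
        # Cross-session facts, attached by chat-relayer at dispatch time.
        parts.append("Known facts about the user:")
        parts.extend(f"- {fact}" for fact in memwal_context)
    # Intra-session history, decrypted from the session's Walrus chain.
    parts.extend(f"{role}: {text}" for role, text in recent_turns)
    parts.append(f"user: {new_message}")
    return "\n".join(parts)

prompt = assemble_prompt(
    system_prompt="You are a helpful assistant.",
    memwal_context=["user prefers concise answers"],
    session_turns=[
        ("user", "hi"),
        ("assistant", "hello"),
        ("user", "what is Walrus?"),
        ("assistant", "A blob store."),
    ],
    new_message="and Seal?",
    max_turns=2,
)
```

The node never queries the cross-session layer itself. It only sees whatever facts arrive in `memwal_context`, which keeps its view limited to the current session plus the facts the relayer chose to attach.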

Why both are needed, not just one

The intra-session layer alone would lose everything the moment a conversation ends: there would be no “remember I’m a Rust engineer” carrying into next week’s chat. The cross-session layer alone would lose mid-conversation coherence. A node serving turn 2 of an active chat needs the verbatim turn-1 exchange, and that is too granular to re-derive from a fact-extraction pass: the node needs exact continuity, not a summarized fact.

So cross-session memory carries durable facts forward across conversations, and intra-session memory carries verbatim recent context forward across nodes within one conversation. A node assembling a prompt pulls from both. The cross-session layer arrives as memwal_context, injected by chat-relayer at dispatch time, because the node itself has no notion of “this user’s history with other sessions” at all. That boundary is intentional: it is what lets any node serve any session without needing broader access to a user’s full history.
