AI infrastructure / Agent memory / Operator systems

Trent Doney

AI infrastructure builder. Memory-first agent systems. 20+ years operator experience.

I build AI infrastructure with an operator's bias: systems need memory, metrics, handoffs, recovery paths, and clear business usefulness. That lens comes from 20+ years across growth, ecommerce, performance marketing, automation, product strategy, and founder/operator work.

Tempe, AZ / Remote
AI infrastructure consulting
Technical co-founder opportunities

01

20+ years operator experience

Growth, product strategy, marketing operations, automation, and entrepreneurship.

02

$40M+ annual Shopify portfolio

Executive operating context across an eight-store ecommerce portfolio.

03

5x+ ROAS at $2M+ spend

Performance marketing systems across seven-plus acquisition channels.

04

47,869 BrainCore facts

Evidence-backed operational memory with 13,465 tracked entities and 265,970 evidence segments.

Operator Profile

AI systems shaped by business judgment.

I build AI systems from the operator's side of the table: infrastructure that has to run, remember, recover, and produce useful work. My background spans growth, ecommerce, performance marketing, automation, product strategy, and entrepreneurship, but the current priority is the local AI operating system running across SynapseGrid Ops.

That system combines an application and automation host, a primary local AI engine, private runbooks, service memory, retrieval, orchestration, agent handoffs, and GPU-backed image and model workflows. It is not a demo stack; it is a working infrastructure layer for building, testing, routing, and improving AI-assisted operations.

Server 1

Primary App Host + Worker Node

CPU: AMD Ryzen 9 5900X
GPU: RTX 3090 FE, 24GB VRAM
Memory: 64GB
Storage: 3 TB SSD, 29 TB archive storage

Runs production apps, automations, worker jobs, media tooling, and always-on services.

Server 2

Primary Local AI Engine

CPU: AMD Ryzen Threadripper PRO 9975WX
GPU: RTX 6000 Blackwell, 96GB VRAM
Memory: 128GB
Storage: 13 TB SSD, 10 TB archive storage

Powers the local LLM stack, image generation, heavy GPU work, and AI production runtime.

Razer Blade Mobile Workstation

Portable AI development, browser automation, client review, Codex control, media ops, and on-the-go orchestration for the local SynapseGrid environment.

CPU: Intel Core i9-14900HX, 24 cores / 32 threads
GPU: RTX 4080, 16GB VRAM
Memory: 64GB RAM
Storage: 1 TB SSD

Public Work

The work is visible where the systems live.

Personal GitHub

github.com/trentdoney

Personal engineering surface for AI infrastructure, memory systems, orchestration tools, and production experiments.

SynapseGrid Labs

github.com/SynapseGrid-Labs

AI infrastructure lab for memory-first agents, orchestration, local AI workflows, and operational systems.

SynapseGrid Ops

Private operating layer

Operational backbone for runbooks, incident memory, service context, retrieval, and repeatable recovery workflows.

Selected Impact

Systems work with operational proof attached.

BrainCore corpus: 47,869 extracted facts

Plus 909 published memories, 1,267 episodes, and 265,970 evidence segments.

Autonomous pipeline: 19 nightly steps

Archive-first preservation, extraction, trust classes, and MCP retrieval.

Long-context serving: 1,310,720 tokens

Qwen3-Coder lane with fp8 weights/KV cache in a 96GB VRAM envelope.

Reliability repair: 6,376,250 failures eliminated

Restored execution throughput from 5-7 to 12-18 actions per cycle.

PAI upgrade: 44 skills installed

Migrated shared agent infrastructure and wired 60+ agent-environment symlinks.

OpsVault Search: 3.5x query speed

Improved Recall@10 by 28.7% and MRR by 36.6%, and cut storage by 50% and RSS memory by 56%.
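
For readers less familiar with the retrieval metrics quoted for OpsVault Search, here is a minimal sketch of how Recall@10 and MRR are conventionally computed. The function names and the toy document IDs are hypothetical; this is not the OpsVault evaluation harness, only the standard definitions.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked results, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)
```

Recall@10 rewards finding all relevant documents within the first ten results, while MRR rewards placing the first relevant document near the top, so the two numbers describe different aspects of search quality.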

Flagship Systems

Production AI infrastructure, not demo theater.

These are not isolated demos. They are connected systems: memory that preserves evidence, orchestration that keeps authority clear, research infrastructure that maps the field, product pipelines that turn signals into workflows, and a private local lab that makes the work repeatable.

Operational Memory

BrainCore

Public AI memory infrastructure

Evidence-grounded operational memory for AI agents. BrainCore turns incidents, coding sessions, chats, dashboards, and source changes into queryable memory by archiving artifacts first, extracting facts with provenance, tracking trust and validity, and exposing retrieval through a local MCP-ready layer.

Facts: 47,869
Entities: 13,465
Evidence: 265,970

PostgreSQL / pgvector / MCP / Temporal facts

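
The archive-first idea behind BrainCore can be illustrated with a small in-memory sketch: the raw artifact is stored before any fact is extracted, and every fact must point back to archived evidence. The class names, the SHA-256 evidence ID, and the "unverified" trust label are assumptions made for illustration; BrainCore's actual schema is not described here.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Fact:
    statement: str
    evidence_id: str  # content hash of the archived artifact (assumed scheme)
    trust: str        # hypothetical trust class label


class EvidenceStore:
    """In-memory stand-in for an artifact archive plus a fact table."""

    def __init__(self) -> None:
        self.artifacts: Dict[str, str] = {}
        self.facts: List[Fact] = []

    def archive(self, raw_text: str) -> str:
        """Preserve the raw artifact first and return its content hash."""
        evidence_id = hashlib.sha256(raw_text.encode("utf-8")).hexdigest()
        self.artifacts[evidence_id] = raw_text
        return evidence_id

    def add_fact(self, statement: str, evidence_id: str,
                 trust: str = "unverified") -> Fact:
        """Record a fact only if its evidence has already been archived."""
        if evidence_id not in self.artifacts:
            raise KeyError("fact has no archived evidence")
        fact = Fact(statement, evidence_id, trust)
        self.facts.append(fact)
        return fact

    def evidence_for(self, fact: Fact) -> str:
        """Return the original artifact a fact was extracted from."""
        return self.artifacts[fact.evidence_id]
```

Rejecting facts without archived evidence is the point of the ordering: a claim that cannot be traced to a preserved artifact never enters memory.
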
Agent Orchestration

AgentFanout

Private SynapseGrid Ops orchestration layer

Provider-agnostic routing for bounded multi-agent work. AgentFanout decides when to fan out work, which provider or role should handle it, and when validation is required.

The main session keeps authority over secrets, private tools, git state, destructive actions, and final synthesis. Workers receive bounded packets and return reviewable outputs.

Codex / Claude / MiniMax

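
The authority split described for AgentFanout can be sketched as a routing function: tasks that touch reserved capabilities stay with the main session, and everything else goes out as a bounded packet that may be flagged for validation. The capability names, the round-robin assignment, and the `Task` and `Packet` types are hypothetical; AgentFanout's real routing rules are private and are not reproduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical names for capabilities the main session never delegates.
RESERVED: FrozenSet[str] = frozenset(
    {"secrets", "private_tools", "git_write", "destructive"}
)


@dataclass(frozen=True)
class Task:
    name: str
    needs: FrozenSet[str]
    risky: bool = False


@dataclass(frozen=True)
class Packet:
    task: str
    provider: str
    requires_validation: bool


def route(tasks: List[Task], providers: List[str]) -> List[Packet]:
    """Keep reserved work in 'main'; spread the rest round-robin over providers."""
    if not providers:
        raise ValueError("at least one worker provider is required")
    packets: List[Packet] = []
    worker_index = 0
    for task in tasks:
        if task.needs & RESERVED:
            # Anything touching a reserved capability is not delegated.
            packets.append(Packet(task.name, "main", False))
            continue
        provider = providers[worker_index % len(providers)]
        worker_index += 1
        # Risky delegated work is marked for review before it is accepted.
        packets.append(Packet(task.name, provider, task.risky))
    return packets
```

A real router would pick providers by role or cost rather than round-robin; the sketch only shows where the authority boundary sits.
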
Research Corpus

Agent Memory Atlas

Public research corpus

A public map of the agent-memory field: papers, repositories, benchmarks, taxonomies, product docs, protocols, and implementation patterns organized into a structured corpus.

The goal is to help builders compare architectures, understand verification status, and avoid designing memory systems from scattered claims.

Research / Taxonomy / Verification

AI Productization

ShockFeed

AI product infrastructure

AI product infrastructure for market-signal workflows. ShockFeed runs SEC filing and market signals through ingest pipelines and scoring queues, then surfaces them as dashboards, alerts, and repeatable operator workflows.

The value is faster filtering, confidence context, and less noise around time-sensitive filing events.

Cloudflare / Supabase / Scoring

Local Runtime

Local AI Lab

Private AI runtime and operating vault

Local inference, image generation, LoRA training, OCR, captioning, and long-context coding run beside the private OpsVault layer.

OpsVault preserves infrastructure decisions, service context, incidents, remediations, and searchable project memory so agents recover context instead of rediscovering it.

vLLM / ComfyUI / OpsVault

Operating Evidence

The career bridge is the proof.

01

Marketing systems at real scale

20+ years across growth, product strategy, marketing operations, and entrepreneurship.

$40M+ annual Shopify portfolio
02

Performance discipline

$2M+ in annual ad spend across seven-plus channels at 5x+ ROAS, with operator focus on measurable output.

7+ channels / 5x+ ROAS
03

Builder with business outcomes

Founder/operator background including a DTC brand scaled to $5M+ revenue before a successful exit, plus marketing software and growth systems tied to $10M+ combined sales.

$5M+ exit / $10M+ systems
04

Image and video generation pipelines

ComfyUI production workflows for stills, video, upscaling, remastering, prompt packets, and repeatable asset review.

ComfyUI / LTX / SeedVR2
05

LoRA training and dataset operations

Dataset prep, caption strategy, workflow selection, quality gates, loader validation, and model-output review loops.

LoRA / captions / QA gates
06

Local LLM infrastructure

Long-context coding lanes, local model serving, GPU-aware runtime choices, inference routing, and recovery discipline.

vLLM / Qwen / GPT-OSS
07

Cloud product development

Cloudflare-backed sites, product dashboards, ingest pipelines, database-backed workflows, and deployable operator surfaces.

Cloudflare / Supabase / Postgres
08

Agent memory and orchestration

BrainCore, AgentFanout, MCP tools, Skills, role-bounded delegation, retrieval, provenance, and reviewable handoffs.

BrainCore / MCP / agent routing
09

Operational reliability

Runbooks, incident recovery, health checks, scheduled monitors, validation scripts, and post-fix evidence trails.

Gates / observability / recovery

Operator Track Record

Two decades of marketing and growth work before the AI infrastructure layer.

Before building agent memory and AI infrastructure, I spent 20+ years in growth, ecommerce, marketing operations, product strategy, lifecycle automation, paid media, and founder/operator roles. The through-line is systems that turn messy business motion into measurable execution.

The AI layer is stronger because it sits on that marketing base: attribution, conversion discipline, creative testing, funnel math, customer behavior, and the habit of tying tools to outcomes instead of novelty.

Executive ecommerce leadership: $40M+ annual portfolio

Eight-store Shopify operating context at Interactive Life Forms.

Performance marketing: $2M+ annual spend

Seven-plus acquisition channels at Scalpa while sustaining 5x+ ROAS.

DTC founder/operator: $5M+ revenue

Particular Paws scaled as a direct-to-consumer brand before a successful exit.

Marketing systems: $10M+ combined sales

Marketing software and operational growth systems built at Cash Is King Marketing.

Operating Surface

The stack spans agents, memory, model serving, and growth systems.

Agent Runtime

MCP, Skills, Codex CLI, Claude Code, Cline, n8n, guarded delegation, validation loops, and tool routing.

Memory Infrastructure

PostgreSQL 16, pgvector, HNSW indexes, temporal fact tables, provenance, evidence segments, and trust classes.

Local AI Systems

vLLM, Qwen3-Coder, GPT-OSS-120B, long-context coding lanes, OCR, captioning, and GPU-aware automation.

Image and Video Production

ComfyUI, LTX workflows, SeedVR2 upscaling, remaster passes, prompt packets, approval gates, and batch asset handling.

Model Adaptation

LoRA training prep, dataset curation, captioning, loader checks, workflow diffs, and output quality review.

Reliability Layer

Grafana, service checks, scheduled monitors, incident runbooks, recovery notes, regression gates, audit trails, and observability attribution.

Cloud Product Systems

Cloudflare, Supabase, Postgres, scoring pipelines, filing ingest, dashboards, queues, and deployable product surfaces.

Database and Retrieval

Postgres schemas, pgvector retrieval, hybrid search, evidence linking, data cleanup, migrations, and reportable metrics.

Growth Operations

Performance marketing, lifecycle automation, ecommerce operations, attribution, creative systems, CRO, and channel economics.
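
Hybrid search, listed under Database and Retrieval above, generally means merging a keyword ranking with a vector ranking. One common way to do that is reciprocal rank fusion (RRF). Whether these systems use RRF is not stated, so the sketch below is an illustration of the technique only; the function name and the constant `k=60` (a conventional default) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists; each list adds 1 / (k + rank) to an ID's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, fusing a keyword ranking `["a", "b", "c"]` with a vector ranking `["b", "d", "a"]` places `b` first, because it ranks near the top of both lists. RRF needs only ranks, not comparable scores, which is why it is a popular choice when the two retrievers score on different scales.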

Experience

A career built around systems that produce outcomes.

2026 - Present

BrainCore / Public AI Memory Infrastructure

Designed and shipped a TypeScript + Python memory system that turns operational artifacts into searchable facts, project timelines, patterns, and remediation playbooks.

2025 - Present

AI Infrastructure and Product Systems

Built and operated AI production services across model serving, ingestion, monitoring, output quality, incident recovery, Skills, MCP, and agent runtime integration.

2023 - 2025

Chief Marketing Officer / Interactive Life Forms

Managed eight Shopify stores generating $40M+ annual revenue and improved operating execution across multiple teams.

2020 - 2023

Chief Marketing Officer / Scalpa

Directed $2M+ annual ad spend across 7+ channels while sustaining 5x+ ROAS.

2016 - 2019

Founder and CEO / Particular Paws

Scaled a DTC brand to $5M+ revenue and exited successfully.

2011 - Present

Owner and CEO / Cash Is King Marketing

Built marketing software and operational growth systems tied to $10M+ combined sales.

Trust Layer

Built for systems that need to be trusted after the demo.

Evidence before memory

Operational artifacts are preserved first, then converted into facts, project timelines, patterns, and remediation playbooks.

Role boundaries for agents

Agent work is routed through clear scopes, provider selection, validation, human gates, and reviewable handoffs.

Recovery paths by design

Incidents, service catalogs, runbooks, and retrieval traces become reusable context instead of one-off troubleshooting.
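
Turning preserved artifacts into facts and project timelines implies that each fact carries a validity window, so an agent can ask what was true at a given moment. A minimal sketch of such an "as of" lookup follows, assuming half-open validity intervals; the `TemporalFact` type, its field names, and the example values in the usage note are hypothetical and not taken from the production schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class TemporalFact:
    subject: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still valid


def as_of(facts: List[TemporalFact], subject: str, when: date) -> Optional[TemporalFact]:
    """Return the fact about `subject` that was valid on `when`, if any.

    Validity is treated as the half-open interval [valid_from, valid_to).
    """
    matches = [
        f for f in facts
        if f.subject == subject
        and f.valid_from <= when
        and (f.valid_to is None or when < f.valid_to)
    ]
    # If windows overlap, prefer the most recently started fact.
    return max(matches, key=lambda f: f.valid_from, default=None)
```

With one fact valid from 2025-01-01 to 2025-06-01 and a successor valid from 2025-06-01 onward, a lookup for 2025-03-01 returns the first fact and a lookup for 2025-06-01 returns the second, which keeps superseded values available for incident review without treating them as current.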

Public / Private Boundary

Public where useful. Private where necessary.

Some systems are public proof surfaces: BrainCore, AgentFanout, and Agent Memory Atlas can be explained, inspected, and improved in public. Others are intentionally private: Local AI Lab and PAI/OpsVault contain operational context, device details, service catalogs, incidents, credentials-adjacent configuration, and client-sensitive workflows.

The public line is simple: publish architecture, principles, verified metrics, sanitized examples, and reusable patterns. Keep private infrastructure, secrets, raw operational history, unreleased strategy, and sensitive incident detail out of the public site.

Work With Me

Best fit: serious AI systems tied to real operations.

Agent memory and retrieval systems

Operational memory, provenance, temporal context, hybrid retrieval, MCP access, and source-bound claims.

Multi-agent routing and handoffs

Provider-agnostic orchestration, worker roles, validation loops, human-in-the-loop gates, and auditable execution.

AI product infrastructure

Scoring pipelines, signal intelligence, dashboards, queues, local generation workflows, and business-operating constraints.

Contact

Useful conversations start with the system, not the pitch.

Best fit: agent memory, AI infrastructure, orchestration, local AI workflows, technical co-founder work, and product systems where reliability and business judgment matter.