The Data Foundation

Last refresh · 55 days ago

The live product graph for enterprise software.

The PYRAMYD Product Graph is the most comprehensive, structured, and continuously refreshed knowledge base of enterprise software ever assembled.

252K+ Products · 2.6K Categories · 2.4M+ Reviews · 88 Node Types · 10 Field Groups · 163 Industries · 249 Countries · 4,329 Jobs

The enrichment pipeline

Every node, 10 field groups, one canonical shape.

Each taxonomy node is enriched through the same 10 field groups in canonical tab order. Watch a single category fill in · every cell carries its model, prompt hash, and citation set.

01 Overview
02 Demand
03 Market
04 Landscape
05 Trends
06 Operations
07 Compliance
08 Economics
09 Capabilities
10 Pulse

Every taxonomy node (category, industry, country, job, product) carries the same 10 field groups in this canonical order. Same shape across taxonomies = one mental model for analysts, one API for connectors, one audit row per cell.
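One way to picture the "same shape across taxonomies" idea is a single ordered enum shared by every node type. The sketch below is illustrative only: the class and function names (FieldGroup, TaxonomyNode, canonical_tabs) are hypothetical and not part of any PYRAMYD API; only the group names, their order, and the five taxonomies come from the text above.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class FieldGroup(IntEnum):
    """The 10 field groups, numbered in the canonical tab order listed above."""
    OVERVIEW = 1
    DEMAND = 2
    MARKET = 3
    LANDSCAPE = 4
    TRENDS = 5
    OPERATIONS = 6
    COMPLIANCE = 7
    ECONOMICS = 8
    CAPABILITIES = 9
    PULSE = 10


# The five taxonomies named in the text.
TAXONOMIES = ("category", "industry", "country", "job", "product")


@dataclass
class TaxonomyNode:
    """Hypothetical node shape: every taxonomy gets the same 10 empty groups."""
    taxonomy: str
    name: str
    groups: dict = field(default_factory=lambda: {g: {} for g in FieldGroup})


def canonical_tabs():
    """Render tab labels such as '01 Overview' in canonical order."""
    return [f"{g.value:02d} {g.name.title()}" for g in FieldGroup]
```

Because the enum is shared, a connector or an audit table can key on (taxonomy, row, group, field) without special cases per taxonomy.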

Enrichment on demand

Enrich any taxonomy, row, column, group, or cell.

The same runner powers every scope. You pick what to refresh; PYRAMYD pays for most of it because every enrichment compounds back into the shared graph.

Taxonomy

1 of 5

All rows · all 10 groups · all cells. Held under an advisory lock so the run is atomic.

Subsidized

Row

2 of 5

One node (e.g. one category, one country) refreshed across every enrichable field group.

Subsidized

Column

3 of 5

One field across many rows · ship a new dimension to your whole taxonomy at once.

Subsidized

Group

4 of 5

One field group (e.g. demand, landscape, compliance) refreshed across the rows you pick.

Subsidized

Cell

5 of 5

One field on one row · the cheapest unit. Costs cents and runs in seconds.

Subsidized
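The five scopes above can be read as five ways of selecting (row, field) cells for the same runner. The following is a minimal sketch under that assumption; expand_scope and its parameters are hypothetical names, and the real runner's interface is not described on this page.

```python
def expand_scope(scope, rows, groups, row=None, group=None, field=None, picked=None):
    """Return the (row, field) cells a scope would touch.

    rows   : list of row identifiers in the taxonomy
    groups : dict mapping field-group name -> list of field names
    picked : optional subset of rows for the "group" scope
    """
    all_fields = [f for fields in groups.values() for f in fields]
    if scope == "taxonomy":   # all rows, all groups, all cells
        return [(r, f) for r in rows for f in all_fields]
    if scope == "row":        # one node across every field group
        return [(row, f) for f in all_fields]
    if scope == "column":     # one field across many rows
        return [(r, field) for r in rows]
    if scope == "group":      # one field group across the rows you pick
        return [(r, f) for r in (picked or rows) for f in groups[group]]
    if scope == "cell":       # one field on one row
        return [(row, field)]
    raise ValueError(f"unknown scope: {scope}")


# Tiny illustrative taxonomy: 2 rows, 2 groups of 2 fields each (made-up names).
ROWS = ["crm", "help-desk"]
GROUPS = {"demand": ["search_volume", "buyer_intent"],
          "compliance": ["frameworks", "data_residency"]}
```

With this toy input, the taxonomy scope expands to 8 cells and the cell scope to exactly 1, which is why the cell is the cheapest unit.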

Why we subsidize

Every enrichment you run makes the graph richer for everyone.

When you enrich a row, PYRAMYD keeps a copy in the shared product graph. That means your $X of LLM + retrieval spend funds the next thousand customers asking the same question · so we charge you the marginal compute, not the sticker price.

· Pay only the marginal LLM + embedding cost, not the full retrieval pipeline.
· Every cell shipped is permanently citable · model, prompt hash, source URL, retrieval timestamp.
· Re-running a stale row falls back to the cache if the verification gate still passes.
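The cache fallback in the last bullet could look roughly like the sketch below. It assumes a dict-like cache and two injected callables; refresh_row, passes_gate, and run_enrichment are hypothetical names, and the actual verification gate is not specified here.

```python
def refresh_row(row_id, cache, passes_gate, run_enrichment):
    """Serve the cached row if it still passes verification, else re-enrich.

    passes_gate(row)       -> bool, stands in for the verification gate
    run_enrichment(row_id) -> freshly enriched row (the paid path)
    Returns (row, origin) where origin is "cache" or "enriched".
    """
    cached = cache.get(row_id)
    if cached is not None and passes_gate(cached):
        return cached, "cache"        # no new LLM or retrieval spend
    fresh = run_enrichment(row_id)    # marginal compute is incurred here
    cache[row_id] = fresh             # the result is kept for the next request
    return fresh, "enriched"
```

The design point is that the gate, not the age of the row alone, decides whether compute is spent.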

What you pay

Marginal compute, not retail LLM pricing.

Cell · 1 field, 1 row

~3-8 LLM calls + 1 embedding

$0.02-$0.08

Row · 1 node, 10 groups

~120-180 LLM calls + verification

$0.70-$1.40

Taxonomy · 4,329 jobs

Atomic run, advisory-locked

$3K-$6K

Ranges reflect frontier-model selection (Claude Opus, GPT-5) vs. budget-tier (Haiku, Mini) · same verification gate either way.
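The taxonomy figure is consistent with the per-row price multiplied by the 4,329 job rows. The sketch below makes that arithmetic explicit; it assumes a simple linear model (price per unit times unit count), which this page does not state, and the names are illustrative.

```python
# Per-unit price ranges quoted above, in USD (low, high).
CELL_PRICE_USD = (0.02, 0.08)
ROW_PRICE_USD = (0.70, 1.40)


def estimate(unit_price, units):
    """Scale a (low, high) unit price linearly by the number of units."""
    low, high = unit_price
    return (round(low * units, 2), round(high * units, 2))


# Full jobs taxonomy: 4,329 rows at the row price.
jobs_taxonomy = estimate(ROW_PRICE_USD, 4329)
```

Under this assumption the jobs taxonomy lands at roughly $3.0K to $6.1K, in line with the quoted range.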

Cell-level provenance, live in the workspace

Click any field. See where it came from.

A real Chrome capture from the production graph · the provenance popover shows freshness, quality score, confidence, sources, raw payload, and full revision history for the field the user clicked.

Live capture from the production graph · Battlecards module · Provenance popover open on a Strengths field.

Freshness

How recently the field was verified against its source. Within SLA in this example · refreshed 1 day ago.

Quality & Confidence

Quality is the underlying signal strength (65/100 here). Confidence is how sure the model is that the extraction is correct. Both are rendered as discrete scores, not heuristic stars.

Sources · Raw · History

Three tabs · Sources lists every URL fingerprinted into the field, Raw shows the model's captured payload byte-for-byte, History is the audit log of every previous value.

What's in the graph

Every vendor is a typed node with real fields.

A real Salesforce-shaped record from the production graph. Every field is typed, every FK traversable, every signal carries provenance.

Vendor node · production shape

{
  "id": "8a7c5f1e-...",
  "name": "Salesforce",
  "description": "Customer relationship...",
  "country": "United States",
  "industry": "Software · SaaS",
  "size": "Large Enterprise",
  "reviewCount": 18420,
  "productCount": 47,
  "categoryCount": 23,
  "dataQuality": {
    "verifiedAt": "2026-05-28T14:11:02Z",
    "confidence": 0.94,
    "sourcesCount": 12
  },
  "provenance": {
    "sourceUrl": "https://salesforce.com/about/",
    "retrievedAt": "2026-05-28T08:22:17Z",
    "refreshCadence": "weekly"
  }
}

Every field is queryable. Every foreign key resolves. Every signal links back to a source URL with a retrieval timestamp.
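As a small illustration of working with a record of this shape, the sketch below loads a trimmed copy of the sample, checks that the data-quality and provenance keys are present, and measures the gap between retrieval and verification. The REQUIRED rule and the helper names are assumptions made for the example, not a documented schema.

```python
import json
from datetime import datetime

# Trimmed copy of the vendor record shown above; elided values are kept as-is.
RECORD = json.loads("""
{
  "id": "8a7c5f1e-...",
  "name": "Salesforce",
  "reviewCount": 18420,
  "dataQuality": {"verifiedAt": "2026-05-28T14:11:02Z",
                  "confidence": 0.94, "sourcesCount": 12},
  "provenance": {"sourceUrl": "https://salesforce.com/about/",
                 "retrievedAt": "2026-05-28T08:22:17Z",
                 "refreshCadence": "weekly"}
}
""")

# Hypothetical rule: keys a record needs before it counts as fully cited.
REQUIRED = {
    "dataQuality": ("verifiedAt", "confidence", "sourcesCount"),
    "provenance": ("sourceUrl", "retrievedAt", "refreshCadence"),
}


def parse_ts(value):
    """Parse an ISO-8601 UTC timestamp with a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def missing_keys(record):
    """List 'section.key' entries absent from the record."""
    return [f"{section}.{key}"
            for section, keys in REQUIRED.items()
            for key in keys
            if key not in record.get(section, {})]


# Time between fetching the source and verifying the extracted value.
verify_lag = (parse_ts(RECORD["dataQuality"]["verifiedAt"])
              - parse_ts(RECORD["provenance"]["retrievedAt"]))
```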

MCP server response · getVendor(slug="salesforce")

{
  "vendor": { "name": "Salesforce", "id": "..." },
  "products": [
    { "name": "Sales Cloud", "score": 4.4,
      "reviews": 9821, "category": "CRM" },
    { "name": "Service Cloud", "score": 4.3,
      "reviews": 4102, "category": "Help Desk" }
  ],
  "topCategories": ["CRM", "Marketing Auto",
    "Sales Engagement", "Field Service"],
  "competitors": [
    { "name": "HubSpot", "score": 4.5 },
    { "name": "Microsoft Dynamics", "score": 4.2 }
  ],
  "citations": [
    { "field": "reviewCount",
      "source": "g2.com/.../salesforce",
      "retrievedAt": "2026-05-28T..." }
  ]
}

Any agent · Claude, ChatGPT, GitHub Copilot, your internal LLM · can query the graph through the MCP server. Responses come back typed, cited, and traversable.
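Once a response like the one above has been received, an agent can treat it as plain data. The sketch below does not call the MCP server; it only post-processes a trimmed copy of the sample response, and the helper names are hypothetical.

```python
# Trimmed copy of the sample response above; elided values are left elided.
RESPONSE = {
    "vendor": {"name": "Salesforce", "id": "..."},
    "products": [
        {"name": "Sales Cloud", "score": 4.4, "reviews": 9821, "category": "CRM"},
        {"name": "Service Cloud", "score": 4.3, "reviews": 4102, "category": "Help Desk"},
    ],
    "competitors": [
        {"name": "HubSpot", "score": 4.5},
        {"name": "Microsoft Dynamics", "score": 4.2},
    ],
    "citations": [
        {"field": "reviewCount", "source": "g2.com/.../salesforce",
         "retrievedAt": "2026-05-28T..."},
    ],
}


def citations_by_field(response):
    """Index citation sources by the field they support."""
    index = {}
    for citation in response.get("citations", []):
        index.setdefault(citation["field"], []).append(citation["source"])
    return index


def top_competitor(response):
    """Return the competitor entry with the highest score."""
    return max(response["competitors"], key=lambda c: c["score"])


def total_reviews(response):
    """Sum review counts across the products listed in the response."""
    return sum(p["reviews"] for p in response["products"])
```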

The Six Layers

Volume. Velocity. Variety. Governance. Network effects. AI specialization.

Each layer reinforces the others. Together they form a foundation that gets deeper every week · not narrower.

Volume

252K+ enterprise software products mapped across 2,606 live categories with 2.4M+ aggregated reviews · the broadest structured enterprise-software graph in production.

Velocity

1,000+ signal sources refreshed continuously on a per-source cadence. Multi-tier ETL learns when a vendor releases vs. when a category shifts and adjusts refresh rates accordingly.

Variety

88 universal node types · 200+ pre-transformed connectors · 183 bi-directional connectors. Every entity typed; every relationship traversable.

Governance

SOC 2 Type 2 · ISO 27001 / 42001 · GDPR · CCPA · EU AI Act Article 50 · all in progress. Every signal already carries provenance: source URL, retrieval timestamp, model + version, confidence.

Network Effects

Every customer's RFX responses, battle cards, and win/loss data privately enrich their tenant slice. Aggregated trends improve the public graph for everyone · without revealing any single customer's data.

AI Specialization

Inline embeddings on every node · multi-hop traversal · Graph RAG · 4× multi-hop accuracy vs. vector RAG. Vertical AI no horizontal copilot can match.
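To make "multi-hop traversal" concrete, here is a minimal breadth-first sketch over typed edges. It is a generic graph walk, not PYRAMYD's Graph RAG implementation; the edge labels are hypothetical, and the three edges are taken from the sample MCP response earlier on this page.

```python
from collections import deque

# Adjacency list of (relationship, target) pairs. Edge labels are illustrative.
EDGES = {
    "Salesforce": [("makes", "Sales Cloud"), ("competes_with", "HubSpot")],
    "Sales Cloud": [("in_category", "CRM")],
}


def within_hops(edges, start, max_hops):
    """Breadth-first walk returning {node: hop distance} up to max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relationship, target in edges.get(node, []):
            if target not in distance:
                distance[target] = distance[node] + 1
                queue.append(target)
    del distance[start]  # report only the nodes reached, not the origin
    return distance
```

A one-hop query from the vendor reaches its product and its competitor; the category only appears once a second hop is allowed, which is the kind of question a flat vector lookup cannot express as a path.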

Internal estimate: 12–24 months and $1–3M of focused engineering + data acquisition spend to replicate from scratch.

Depth × Breadth

Not just rows. Every row, enriched across 10 dimensions.

Every taxonomy carries the same 10 field-group schema · overview, demand, market, landscape, trends, operations, compliance, economics, capabilities, pulse. Multiply row count × column width across those 10 dimensions and the foundation is structured, not just big.

Node	Rows	Cols	Structured points	Refresh
Products	251,835	288	72.5M	Weekly · per-source cadence
Reviews	2,447,964	95	232.6M	Weekly · 200+ review sites
VPT features	2,870,530	190	545.4M	Weekly · vendor changelogs
Companies	251,835	273	68.8M	Monthly · funding + hiring
Categories	2,606	154	401,324	Weekly · LLM-enriched
Industries	742	106	78,652	Monthly · LLM-enriched
Countries	1,738	90	156,420	Quarterly · regulator-tracked
Jobs	4,329	76	329,004	Quarterly · LLM-enriched
Total	~5.8M	across 8 node types	~920M structured data points	Rolling enrichment
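The "Structured points" column is rows multiplied by columns for each node type. The sketch below recomputes it from the table so the per-node figures and the totals can be checked; the variable names are illustrative.

```python
# (rows, cols) per node type, copied from the table above.
TABLE = {
    "Products": (251_835, 288),
    "Reviews": (2_447_964, 95),
    "VPT features": (2_870_530, 190),
    "Companies": (251_835, 273),
    "Categories": (2_606, 154),
    "Industries": (742, 106),
    "Countries": (1_738, 90),
    "Jobs": (4_329, 76),
}

# Structured points per node type: rows x cols.
points = {name: rows * cols for name, (rows, cols) in TABLE.items()}

# Totals across all 8 node types.
total_rows = sum(rows for rows, _cols in TABLE.values())
total_points = sum(points.values())
```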

Enrichment coverage

100% of every taxonomy, fully enriched.

Every row in each taxonomy carries an LLM-grounded enrichment payload · google-search grounding, citations preserved, model + timestamp tracked per row. Pulled live from production.

Weekly refresh on every node. Real-time on demand.

The default cadence is weekly · categories, industries, countries, jobs, products, reviews all re-enriched on a 7-day rolling cycle. When a customer needs an update sooner, any node can be re-enriched in real time from the workspace; token consumption applies.

The compounding foundation

Weekly refresh on every node. Real-time on any node, on demand.

The default cadence is weekly · every taxonomy, every product, every connector-sourced signal re-enriches on a 7-day rolling cycle. Battle cards, dashboards, and APEX answers all carry the latest evidence without an analyst lifting a finger.

Weekly: Default platform-wide refresh
Real-time: User-triggered per node · token usage applies
24/7: Connector ingestion · 1K+ signal sources

Weekly

Default cadence · every node, every taxonomy

Every category, industry, country, job, product, and connector-sourced signal re-enriches on a 7-day rolling cycle.

Real-time

User-triggered · any node, on demand

Need fresher data right now? Hit refresh on any node from the workspace and the enrichment job runs immediately. Token consumption applies per re-enriched row, billed against your workspace budget.

Continuous

Always-on signal ingestion

Review feeds, competitor website diffs, funding announcements, hiring signals, and press wires stream into the graph as they happen. Surfaced into Competitor 360 and Alerts within hours of the source event.

Why this matters

Every node carries a verified-at timestamp. When a review changes, a category shifts, or a competitor ships a price change, the battle card refreshes automatically on the weekly cycle · and any analyst can force an instant refresh on the rows that matter most this morning, with provenance preserved on every cited claim.
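A rough sketch of how a verified-at timestamp, a 7-day cycle, and an analyst's forced refresh might combine when choosing which nodes to re-enrich. The function names and the node dictionary layout are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

WEEKLY = timedelta(days=7)  # default cadence described above


def is_due(verified_at, now, cadence=WEEKLY):
    """A node is due once a full cadence has elapsed since verification."""
    return now - verified_at >= cadence


def nodes_to_refresh(nodes, now, forced=()):
    """Select node ids that are due, plus any an analyst forced manually."""
    return [n["id"] for n in nodes
            if n["id"] in forced or is_due(n["verifiedAt"], now)]


# Illustrative data: one stale node, one fresh node.
NOW = datetime(2026, 6, 4, 9, 0, tzinfo=timezone.utc)
NODES = [
    {"id": "crm", "verifiedAt": NOW - timedelta(days=8)},
    {"id": "help-desk", "verifiedAt": NOW - timedelta(days=1)},
]
```

Here the weekly cycle alone selects only the stale node; naming the fresh node in `forced` models the on-demand refresh.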

Live Today

What's actually in the graph, right now.

These aren't roadmap numbers. Every figure here is queryable on the platform today.

252K+

Enterprise software products tracked across 2.6K categories

2.4M+

Aggregated software reviews from 200+ review sources

1K+

Live signal sources, refreshed on per-source cadence

88

Universal node types across the entire graph schema

200+

Pre-transformed connectors, 183 bi-directional

3.08×

Live query speedup vs. baseline (2,519ms → 818ms)

88 Universal Node Types

Every entity in enterprise software, typed and connected.

The graph schema is the foundation. Ten categories span people, entities, products, revenue, finance, operations, comms, content, data, and systems · and every edge between them is a query the graph can answer.

6 types

People

Contacts · roles · positions · interviews

8 types

Entities

Companies · teams · workspaces · segments · locations · countries · industries

6 types

Products

Products · categories · features · releases · reviews

8 types

Revenue

Deals · orders · pipelines · contracts · campaigns · cadences · battle cards

7 types

Finance

Transactions · postings · ledgers · periods · budgets · forecasts · filings

12 types

Operations

Ideas · requirements · issues · projects · roadmaps · cycles · objectives · capabilities · processes

4 types

Comms

Communications · chats · channels · events

11 types

Content

Documents · articles · snippets · sheets · slides · notebooks · canvases · forms · files · folders · transcripts

14 types

Data

Datasets · catalogs · connectors · transformations · prompts · agents · runs · models · experiments · metrics · signals · dashboards

12 types

Systems

Repositories · branches · commits · credentials · settings · activities · devices · alerts · applications · policies · services
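The per-category counts above add up to the 88 node types in the section title. A short check, with the counts copied from the ten tiles:

```python
# Node-type counts per schema category, as listed above.
NODE_TYPE_COUNTS = {
    "People": 6, "Entities": 8, "Products": 6, "Revenue": 8, "Finance": 7,
    "Operations": 12, "Comms": 4, "Content": 11, "Data": 14, "Systems": 12,
}

# Sum across the ten categories.
total_node_types = sum(NODE_TYPE_COUNTS.values())
```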

Every Signal Cited

Provenance is not optional.

Every node and every signal carries the metadata regulators want and the metadata sales leaders need: source, time, model, confidence, verification status.

Source URL

Every signal links back to the real source · a press release, a review, a vendor changelog, a regulatory filing.

Retrieval Timestamp

When the signal was captured. When it was last re-verified. When the source itself was updated.

Model + Version

Which model wrote the enrichment. Prompt version. Token count. Cost. Confidence score per field.

Verification Status

Verified · Needs Review · Disputed. A 2-gate audit (completeness + content) before any row reaches APEX.
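The 2-gate audit could be sketched as two predicates applied in order. Everything below is an assumption for illustration: the required keys, the 0.8 confidence threshold, and the choice to send any failed gate to "Needs Review" are not specified on this page, and "Disputed" is treated as a status set outside this function.

```python
from enum import Enum


class Status(Enum):
    VERIFIED = "Verified"
    NEEDS_REVIEW = "Needs Review"
    DISPUTED = "Disputed"  # assumed to be set by a human or a later process


# Hypothetical provenance keys a cell must carry to be complete.
REQUIRED_KEYS = ("value", "sourceUrl", "retrievedAt", "model", "confidence")


def completeness_gate(cell):
    """Gate 1: every required key is present and non-empty."""
    return all(cell.get(key) not in (None, "") for key in REQUIRED_KEYS)


def content_gate(cell, min_confidence=0.8):
    """Gate 2: stand-in content check using a hypothetical confidence floor."""
    return cell["confidence"] >= min_confidence


def audit(cell):
    """Run both gates in order; only cells passing both are Verified."""
    if completeness_gate(cell) and content_gate(cell):
        return Status.VERIFIED
    return Status.NEEDS_REVIEW


def ready_for_apex(cell):
    """Only Verified cells are released downstream."""
    return audit(cell) is Status.VERIFIED
```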

When a customer asks "where did this come from?" they have a defensible, regulator-ready answer.

See the graph live in your category.

In 30 minutes we'll pull live data for your top 5 competitors, walk the graph, and show APEX answer a multi-hop question with every citation traceable to source.
