Semantic Discovery
Enable AI agents to find products by intent and context, not just keywords.
Overview
Traditional e-commerce search relies on keyword matching—if a product doesn't contain the exact words a user types, it won't appear in results. Semantic Discovery changes this by understanding the meaning behind queries.
Hyperfold uses vector embeddings to represent both products and queries in a high-dimensional space where semantically similar items are close together.
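As a rough illustration of "close together", the sketch below ranks two products against a query using cosine similarity. The 3-dimensional vectors and product names are invented for the example, and cosine similarity is an assumption here: this page does not state which distance metric Hyperfold uses.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors, invented for illustration only.
# Real embeddings have far more dimensions.
query = [0.9, 0.1, 0.8]  # stands in for "shoes for running a marathon in the rain"
catalog = {
    "waterproof marathon shoe": [0.8, 0.2, 0.9],
    "leather office shoe": [0.1, 0.9, 0.1],
}

# Rank products from most to least similar to the query.
ranked = sorted(
    catalog,
    key=lambda name: cosine_similarity(query, catalog[name]),
    reverse=True,
)
print(ranked)
```

The marathon shoe ranks first because its vector points in nearly the same direction as the query vector, even though the two share no literal keywords.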
Vector Embeddings
Every product in your catalog is transformed into a dense vector representation that captures its semantic meaning:

    {
      "product_id": "prod_aero_x2",
      "name": "AeroRun X2 Marathon Shoe",
      "description": "Professional running shoe with Gore-Tex waterproofing",
      "semantics": {
        "category": "apparel/footwear/running",
        "usage_context": ["marathon", "trail", "wet_conditions"],
        "visual_tags": ["blue", "reflective", "mesh_upper", "chunky_sole"],
        "attributes": {
          "weight_g": 280,
          "heel_drop_mm": 8,
          "waterproof": true,
          "breathable": true
        },
        "vibe_tags": ["professional", "performance", "outdoor", "serious_runner"]
      },
      "embedding": [0.023, -0.156, 0.891, ...],  // 1536-dim vector
      "similar_products": ["prod_storm_gt", "prod_trail_king"],
      "frequently_bought_with": ["prod_socks_wp", "prod_insoles_pro"]
    }

How Embeddings Are Generated
When you import a product, Hyperfold uses Vertex AI's multimodal embedding model to:
- Analyze product images to extract visual features (color, style, shape)
- Process text descriptions to understand attributes and use cases
- Combine visual and textual features into a unified embedding
- Store the embedding in Vertex AI Vector Search for fast retrieval
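The "combine" step can be pictured with a small sketch. This page does not describe how the visual and textual features are actually merged, so the weighted average below is only one common approach, and `fuse_embeddings` and `image_weight` are hypothetical names introduced for the example.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so that magnitudes are comparable."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def fuse_embeddings(image_vec, text_vec, image_weight=0.5):
    """Hypothetical fusion: weighted average of two unit vectors, renormalized.

    This is an assumption for illustration, not Hyperfold's documented method.
    """
    if len(image_vec) != len(text_vec):
        raise ValueError("embeddings must have the same dimension")
    img = l2_normalize(image_vec)
    txt = l2_normalize(text_vec)
    mixed = [image_weight * i + (1 - image_weight) * t for i, t in zip(img, txt)]
    return l2_normalize(mixed)

# Toy 3-dimensional inputs, invented for the example.
unified = fuse_embeddings([3.0, 0.0, 4.0], [0.0, 5.0, 0.0])
print(unified)
```

Normalizing each input first keeps one modality from dominating simply because its raw vector has a larger magnitude.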
Semantic Search
Keyword search only returns products that literally contain the query terms, while semantic search matches on intent and context:

    # Traditional Keyword Search
    Query:   "blue running shoes waterproof"
    Results: Exact matches for "blue" AND "running" AND "shoes" AND "waterproof"
    Problem: Misses "navy marathon trainers with Gore-Tex"

    # Semantic Vector Search
    Query:   "shoes for running a marathon in the rain"
    Results: Products matching the *intent* and *context*
    Finds:   "AeroRun X2" (waterproof, marathon-optimized)
             "StormRunner GT" (wet-condition specialty)
             "TrailKing WP" (outdoor, water-resistant)

Testing Semantic Search
Use the CLI to test how your catalog responds to natural language queries:

    # Test semantic search via CLI
    $ hyperfold search "cozy jacket for a rainy wedding"

    > [Vector] Generating embedding for query...
    > [Search] Querying Vertex AI Vector Search...
    > [Results] 5 products found (semantic confidence: 0.91)

    RANK  PRODUCT                  CONFIDENCE  PRICE
    1     Elegant Rain Trench      0.94        $189
    2     Waterproof Blazer        0.91        $245
    3     All-Weather Sport Coat   0.88        $165
    4     Classic Raincoat         0.85        $129
    5     Water-Resistant Parka    0.79        $99

    > [Insight] Top results emphasize "formal" + "waterproof"
    > Query interpreted as: formal event, wet weather

Search Configuration
Tune search behavior for your use case:
| Parameter | Default | Description |
|---|---|---|
| min_confidence | 0.7 | Minimum similarity score to include in results |
| max_results | 10 | Maximum number of products to return |
| diversify | true | Reduce duplicate categories in results |
| boost_in_stock | true | Prioritize available inventory |
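To make the interaction between these parameters concrete, here is a minimal sketch of how they might shape a result list. The `Hit` type, the `apply_search_config` function, and the order in which the steps run are all assumptions for illustration; Hyperfold's actual ranking logic is not documented on this page. Product names and confidence scores are borrowed from the CLI example, while the categories and stock flags are invented.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    name: str
    category: str
    confidence: float
    in_stock: bool

def apply_search_config(hits, min_confidence=0.7, max_results=10,
                        diversify=True, boost_in_stock=True):
    """Hypothetical post-processing of scored hits using the four parameters."""
    # min_confidence: drop hits below the similarity threshold.
    kept = [h for h in hits if h.confidence >= min_confidence]

    # boost_in_stock: in-stock items first, then by descending confidence.
    if boost_in_stock:
        kept.sort(key=lambda h: (not h.in_stock, -h.confidence))
    else:
        kept.sort(key=lambda h: -h.confidence)

    # diversify: the first hit of each category goes ahead of repeat categories.
    if diversify:
        seen, first, rest = set(), [], []
        for h in kept:
            if h.category in seen:
                rest.append(h)
            else:
                seen.add(h.category)
                first.append(h)
        kept = first + rest

    # max_results: cap the list length.
    return kept[:max_results]

hits = [
    Hit("Elegant Rain Trench", "outerwear/trench", 0.94, True),
    Hit("Waterproof Blazer", "outerwear/blazer", 0.91, False),
    Hit("Classic Raincoat", "outerwear/trench", 0.85, True),
    Hit("Beach Poncho", "outerwear/poncho", 0.55, True),
]
results = apply_search_config(hits)
print([h.name for h in results])
```

In this sketch the poncho is filtered out by `min_confidence`, and the second trench-category item is moved behind the blazer by `diversify`, even though the blazer is out of stock.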
Catalog Enrichment
Marketing-heavy product descriptions are optimized for humans, not AI agents. The catalog optimize command rewrites descriptions to be fact-dense and machine-readable:

    # Optimize catalog for agent readability
    $ hyperfold catalog optimize --target="gpt-4o-buyer"

    > [Job] Started Batch Job: job_opt_221
    > [Progress] Processed 150/1500 SKUs...

    # Example transformation:
    BEFORE: "Our amazing jacket will keep you dry! Perfect for ANY occasion! Buy now! ⭐⭐⭐⭐⭐"

    AFTER: "Water-resistant blazer. Fabric: 65% wool, 35% polyester with DWR coating. Weight: 450g. Suitable for: formal events, light rain. Care: Dry clean only."

    > [Diff] SKU 102: Removed 12 marketing phrases
    > [Diff] SKU 102: Added explicit weight (450g) and fabric composition
    > [Diff] SKU 102: Added care instructions
    > [SUCCESS] Optimization complete. Agent readability +47%

Vibe Matching
Beyond explicit attributes, Hyperfold captures the intangible "vibe" of products: the aesthetic, mood, and lifestyle they represent.
Query: "minimalist scandinavian desk lamp"
Matches products with clean lines, neutral colors, and a modern aesthetic, even if they're not explicitly tagged "scandinavian."
Query: "cozy autumn vibes sweater"
Matches chunky knits in warm colors (rust, mustard, forest green), because the model picks up on seasonal aesthetic preferences.
Vibe tags are automatically generated during product import using multimodal analysis of images and descriptions.