[Feature] Flexible Vector Dimension Infrastructure for Future Experiments

Open hammertoe opened this issue 1 month ago • 0 comments

Flexible Vector Dimension Infrastructure for Future Experiments

Summary

Migrate from hardcoded 2048-dim vectors to a configurable infrastructure that supports:

Runtime configuration of vector dimensions
Multiple concurrent indexes for A/B testing
Gradual traffic routing between indexes
Future experiments (256-dim, 512-dim, etc.) without code changes

Motivation

Currently, vector dimension is hardcoded across the codebase:

src/yuhgettintru/embeddings.py:23 - DEFAULT_DIMENSION = 2048
src/yuhgettintru/search/__init__.py - hardcoded index name
scripts/backfill_embeddings.py - no dimension flag
scripts/create_vector_index.py - hardcoded dimension

This prevents:

Running experiments with different dimensions (256, 512, 1024)
Gradual cutovers with traffic splitting
Safe rollback if a dimension change degrades quality

Background

Voyage AI's voyage-3.5 model supports output dimensions of 2048, 1024, 512, and 256 via Matryoshka learning, with minimal quality loss at the smaller sizes.
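
Matryoshka-style embeddings are trained so that a leading prefix of the full vector is itself a usable embedding. The sketch below illustrates that idea under the assumption that a shorter vector can be derived by keeping the first components and re-normalizing. The function name `truncate_embedding` is hypothetical, and whether client-side truncation matches what the API returns for a smaller dimension should be checked against Voyage's documentation.

```python
import math


def truncate_embedding(vector: list[float], dimension: int) -> list[float]:
    """Keep the first `dimension` components and rescale to unit length."""
    if not 0 < dimension <= len(vector):
        raise ValueError(f"dimension must be in 1..{len(vector)}, got {dimension}")
    head = vector[:dimension]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # a zero vector cannot be normalized
    return [x / norm for x in head]
```

This applies to float vectors only; int8-quantized vectors would need separate handling.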

Current Architecture

Product Model → VoyageEmbeddingGenerator (2048-dim) → MongoDB `embedding_voyage` field
                                                              ↓
Search Service → $vectorSearch on "vector_index_voyage" (2048-dim)

Proposed Architecture

Config.yaml → VoyageEmbeddingGenerator (configurable dim) → MongoDB
                                                              ↓
                                            Search Service → $vectorSearch
                                            (configurable index + traffic routing)

Implementation Plan

Phase 1: Infrastructure (Flexible Foundation)

1.1 Add Vector Configuration to `config.yaml`

vector:
  dimension: 1024                      # Default embedding dimension
  index_name: vector_index_voyage_1024 # Current active index
  routing:
    enabled: false                     # Enable when running experiments
    weights:
      vector_index_voyage_2048: 1.0    # Legacy index
      vector_index_voyage_1024: 0.0    # New index
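
One possible way to consume this section is to validate it once at startup and pass a typed object around. This is a minimal sketch, not the project's actual config loader: `VectorConfig`, `RoutingConfig`, and `parse_vector_config` are hypothetical names, and the input is assumed to be the `vector` mapping after the YAML file has already been parsed into a dict.

```python
from dataclasses import dataclass, field

# Dimensions listed in the Background section for voyage-3.5.
SUPPORTED_DIMENSIONS = (2048, 1024, 512, 256)


@dataclass(frozen=True)
class RoutingConfig:
    enabled: bool = False
    weights: dict[str, float] = field(default_factory=dict)


@dataclass(frozen=True)
class VectorConfig:
    dimension: int
    index_name: str
    routing: RoutingConfig = field(default_factory=RoutingConfig)


def parse_vector_config(raw: dict) -> VectorConfig:
    """Validate the `vector` mapping after it has been loaded from YAML."""
    dimension = raw["dimension"]
    if dimension not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension}")
    routing_raw = raw.get("routing") or {}
    weights = {
        name: float(weight)
        for name, weight in (routing_raw.get("weights") or {}).items()
    }
    if any(weight < 0 for weight in weights.values()):
        raise ValueError("routing weights must be non-negative")
    enabled = bool(routing_raw.get("enabled", False))
    if enabled and sum(weights.values()) <= 0:
        raise ValueError("routing is enabled but no index has a positive weight")
    return VectorConfig(dimension, raw["index_name"], RoutingConfig(enabled, weights))
```

Failing fast on an unsupported dimension or an all-zero weight table keeps a bad config from reaching the search path.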

1.2 Update Embedding Generator to Read from Config

File: src/yuhgettintru/embeddings.py

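# Before
DEFAULT_DIMENSION = 2048

# After
class VoyageEmbeddingGenerator:
    def __init__(self, config, output_dimension: int | None = None):
        # An explicit argument overrides the configured default
        if output_dimension is None:
            output_dimension = config.vector.dimension
        self.output_dimension = output_dimension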

1.3 Update Search Service for Traffic Splitting

File: src/yuhgettintru/search/__init__.py

def build_vector_search_pipeline(self, query_embedding, config):
    indexes = []
    if config.vector.routing.enabled:
        # Query every index that has a positive weight
        indexes = [
            (idx_name, weight)
            for idx_name, weight in config.vector.routing.weights.items()
            if weight > 0
        ]
    if not indexes:
        # Routing disabled, or no index has a positive weight
        indexes = [(config.vector.index_name, 1.0)]

    # Merge results using weighted RRF
    return self._weighted_hybrid_search(query_embedding, indexes, config)
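
The body of `_weighted_hybrid_search` is not shown in this issue. As an illustration of the "weighted RRF" merge step only, the sketch below fuses per-index rankings with weighted reciprocal rank fusion. The function name `weighted_rrf`, the input shape, and the constant `k = 60` are assumptions.

```python
from collections import defaultdict


def weighted_rrf(
    rankings: dict[str, list[str]],
    weights: dict[str, float],
    k: int = 60,
) -> list[str]:
    """Fuse per-index rankings; each hit adds weight / (k + rank) to its score."""
    scores: dict[str, float] = defaultdict(float)
    for index_name, ranked_ids in rankings.items():
        weight = weights.get(index_name, 0.0)
        if weight <= 0:
            continue  # indexes without a positive weight do not contribute
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Sort by score (descending), then by ID so ties are deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Fusing results from several indexes on every request is a different mechanism from sending a share of requests to a single index, and each queried index needs a query vector of its own dimension. The config comments elsewhere in this issue describe weights as traffic shares, so it may be worth confirming which of the two behaviors is intended.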

1.4 Parameterize Backfill Script

File: scripts/backfill_embeddings.py

# New usage
python scripts/backfill_embeddings.py \
  --dimension 1024 \
  --index vector_index_voyage_1024 \
  --store costuless
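
The flags above could be declared with `argparse` roughly as follows. This is a sketch under assumptions: the help texts and the choice to restrict `--dimension` to the four sizes named in the Background section are illustrative, and `--limit` is taken from the experiment workflow later in this issue.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line flags for the backfill script."""
    parser = argparse.ArgumentParser(description="Backfill product embeddings.")
    parser.add_argument(
        "--dimension",
        type=int,
        choices=(2048, 1024, 512, 256),
        help="Embedding dimension (falls back to the config value when omitted).",
    )
    parser.add_argument("--index", help="Name of the target vector index.")
    parser.add_argument("--store", help="Restrict the backfill to one store.")
    parser.add_argument(
        "--limit", type=int, help="Maximum number of products to process."
    )
    return parser
```

Leaving every flag optional lets the script fall back to `config.yaml` when a flag is omitted.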

1.5 Parameterize Index Creation Script

File: scripts/create_vector_index.py

# New usage
python scripts/create_vector_index.py \
  --name vector_index_voyage_1024 \
  --dimension 1024 \
  --path embedding_voyage
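
Internally, the script would need to turn these flags into an index definition. A minimal sketch, assuming an Atlas Vector Search index with a single vector field. The helper name `vector_index_definition` and the `cosine` default are assumptions, since the issue does not state the similarity metric.

```python
def vector_index_definition(
    path: str, dimension: int, similarity: str = "cosine"
) -> dict:
    """Build the `definition` document for an Atlas Vector Search index."""
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dimension,
                "similarity": similarity,
            }
        ]
    }
```

With PyMongo, a definition like this can be wrapped in a `SearchIndexModel` (with `type="vectorSearch"` and the `--name` value) and passed to `collection.create_search_index`. Because the index enforces `numDimensions`, it is worth checking whether vectors of different lengths can share the same `--path`, or whether each dimension needs its own field.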

Phase 2: Backfill (Future Step)

Not included in this issue - to be scheduled separately.

Phase 3: Gradual Cutover (Future Step)

Not included in this issue - to be scheduled separately.

Files to Modify

File	Change	Priority
`.beads/config.yaml`	Add `vector` section	Required
`src/yuhgettintru/embeddings.py`	Read dimension from config, add config injection	Required
`src/yuhgettintru/search/__init__.py`	Config-driven index + weighted traffic splitting	Required
`src/yuhgettintru/cli.py`	Pass config to embedding generator	Required
`src/yuhgettintru/extract_service.py`	Use config for embedding generation	Required
`src/yuhgettintru/llm_enhancer.py`	Use config for embedding generation	Required
`src/yuhgettintru/services.py`	Inject config into services	Required
`scripts/backfill_embeddings.py`	Add `--dimension` and `--index` flags	Required
`scripts/create_vector_index.py`	Make dimension/index configurable	Required
`tests/conftest.py`	Provide mock config fixture	Optional
`tests/test_embeddings.py`	Update tests to use config	Optional
`tests/test_search_service.py`	Update tests for traffic splitting	Optional

Future Experiment Workflow

Once this infrastructure is in place, running a dimension experiment requires no code changes, only the following steps:

# 1. Create new index with experimental dimension
python scripts/create_vector_index.py \
  --name vector_index_voyage_256 \
  --dimension 256 \
  --path embedding_voyage

# 2. Backfill subset of products
python scripts/backfill_embeddings.py \
  --dimension 256 \
  --index vector_index_voyage_256 \
  --limit 10000

# 3. Enable routing in config.yaml
# vector:
#   routing:
#     enabled: true
#     weights:
#       vector_index_voyage_2048: 0.8
#       vector_index_voyage_1024: 0.15
#       vector_index_voyage_256: 0.05

# 4. Monitor metrics (click-through, latency, zero-results)

# 5. Adjust weights or rollback

Configuration Options

Basic (Single Index)

vector:
  dimension: 1024
  index_name: vector_index_voyage_1024

Experiment (Multiple Indexes)

vector:
  dimension: 1024                    # For new embeddings
  index_name: vector_index_voyage_1024
  routing:
    enabled: true
    weights:
      vector_index_voyage_2048: 0.8  # 80% traffic
      vector_index_voyage_1024: 0.2  # 20% traffic
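
If the weights are meant as traffic shares, as the comments above suggest, one way to implement the split is to map a stable key to a single index per request. The sketch below is an assumption about how that could work, not the planned implementation: `pick_index` and the idea of hashing a session or user ID are hypothetical.

```python
import hashlib


def pick_index(weights: dict[str, float], routing_key: str) -> str:
    """Map a stable key (e.g. a session ID) to one index, proportionally to weight."""
    active = sorted((name, w) for name, w in weights.items() if w > 0)
    if not active:
        raise ValueError("at least one index needs a positive weight")
    total = sum(w for _, w in active)
    # Hash the key to a point in [0, 1) so the same key always gets the same index.
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for name, weight in active:
        cumulative += weight / total
        if point < cumulative:
            return name
    return active[-1][0]  # guard against floating-point rounding
```

Hashing instead of drawing a random number keeps a given user on the same index across requests, which makes click-through comparisons between indexes easier to interpret.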

Acceptance Criteria

[ ] config.yaml includes vector section with dimension, index_name, and optional routing
[ ] VoyageEmbeddingGenerator reads dimension from config
[ ] ProductSearchService uses configured index name
[ ] Traffic splitting works (when enabled in config)
[ ] create_vector_index.py accepts --dimension and --name flags
[ ] backfill_embeddings.py accepts --dimension and --index flags
[ ] All existing tests pass after refactoring
[ ] Search quality remains unchanged when the config points at the existing 2048-dim index with routing disabled (same results, same order)

Out of Scope

Actually migrating from 2048-dim to 1024-dim (Phase 2)
Running A/B tests or gradual cutovers (Phase 3)
Changes to quantization strategy (currently int8, no change planned)

References

Voyage AI Flexible Dimensions
Voyage-3.5 Blog Post
Current embedding implementation: src/yuhgettintru/embeddings.py
Current search implementation: src/yuhgettintru/search/__init__.py

Metadata

Field	Value
Priority	Medium
Estimated Effort	2-3 days
Requires Database Migration	No (new index)
Requires Downtime	No
Risk	Low - no behavioral changes, just infrastructure

Dec 24 '25 09:12 hammertoe
