modern-data-warehouse-dataops
SPIKE: Dynamic Data Generator
Description
As a data engineer, I want a Gen AI-powered tool to create dynamic datasets tailored to specific schemas and test scenarios so that I can generate realistic and privacy-compliant test data efficiently.
Capabilities:
- Generate datasets from schema definitions and constraints.
- Create edge-case datasets for thorough testing.
- Ensure compliance with privacy and data usage standards.
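As a rough illustration of the first two capabilities, the sketch below generates rows from a small schema definition and derives edge-case rows from the same constraints. The schema format, the field names, and the helpers `generate_rows` and `edge_case_rows` are hypothetical and not part of this repository. A seeded random generator stands in for the Gen AI model so the example stays self-contained, and all values are synthetic, so no real personal data is involved.

```python
import random
import string

# Hypothetical schema format (an assumption): field name -> type plus constraints.
SCHEMA = {
    "customer_id": {"type": "int", "min": 1, "max": 99999},
    "email": {"type": "email"},
    "balance": {"type": "float", "min": 0.0, "max": 10000.0},
    "country": {"type": "choice", "values": ["US", "DE", "JP"]},
}


def _value(spec, rng):
    """Draw one random value that satisfies the field's constraints."""
    kind = spec["type"]
    if kind == "int":
        return rng.randint(spec["min"], spec["max"])
    if kind == "float":
        return round(rng.uniform(spec["min"], spec["max"]), 2)
    if kind == "choice":
        return rng.choice(spec["values"])
    if kind == "email":
        user = "".join(rng.choices(string.ascii_lowercase, k=8))
        return f"{user}@example.com"  # reserved example domain, not a real address
    raise ValueError(f"unsupported type: {kind}")


def generate_rows(schema, n, seed=0):
    """Generate n rows; the same seed always yields the same rows."""
    rng = random.Random(seed)
    return [{name: _value(spec, rng) for name, spec in schema.items()}
            for _ in range(n)]


def edge_case_rows(schema, seed=0):
    """Build one row per numeric boundary; other fields are filled randomly."""
    rng = random.Random(seed)
    rows = []
    for name, spec in schema.items():
        if spec["type"] in ("int", "float"):
            for bound in (spec["min"], spec["max"]):
                row = {f: _value(s, rng) for f, s in schema.items()}
                row[name] = bound  # pin this field to its min or max
                rows.append(row)
    return rows
```

Calling `generate_rows(SCHEMA, 5, seed=42)` twice returns identical rows, and `edge_case_rows(SCHEMA)` returns four rows here, one for each minimum and maximum of the two numeric fields. In the spike itself, the random draws would presumably be replaced or supplemented by model-generated values validated against the same constraints.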
Definition of Done:
- Findings documented, including:
  - Initial hypothesis/capability being tested.
  - Assumptions about dataset structure and requirements.
  - What was tested for reproducibility.
  - Outcomes/learnings.
- Associated artifacts: generated datasets, schema definitions, prompts, and testing scripts.
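One possible way to back the reproducibility finding with evidence is to record a fingerprint for each generated dataset alongside the other artifacts. The helper below is a minimal sketch, assuming datasets are lists of JSON-serializable dictionaries; the name `dataset_fingerprint` is hypothetical.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """Return a SHA-256 hex digest of the rows in a canonical JSON form.

    Keys are sorted so that field order inside a row does not matter.
    Row order does matter: reordered rows give a different digest.
    """
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two runs that produce the same rows should produce the same digest.
run_a = [{"id": 1, "country": "US"}, {"id": 2, "country": "DE"}]
run_b = [{"country": "US", "id": 1}, {"country": "DE", "id": 2}]
same = dataset_fingerprint(run_a) == dataset_fingerprint(run_b)
```

Storing the digest next to the prompt, schema definition, and seed used for a run gives a simple check that a rerun produced the same dataset.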