Add comprehensive YAML configuration validation system for extractor
Problem Solved
Currently, the APIops extractor tool processes configuration.extractor.yaml files without validation, leading to runtime errors that are difficult to diagnose when configuration files contain:
- Empty sections (e.g., empty API lists)
- Duplicate entries across different sections
- Invalid data types or malformed YAML structure
- Names with invalid characters or whitespace
Users often encounter cryptic error messages during extraction, making it challenging to identify and fix configuration issues quickly.
Solution Implemented
This PR introduces a YAML configuration validation system that checks the extractor configuration before extraction begins. Its features, rules, and integration points are:
Core Features:
- Pre-runtime validation - Catches configuration errors before extraction begins
- Detailed error reporting - Provides clear error messages with YAML line numbers
- Multiple validation modes - Automatic validation during runtime + standalone CLI tool
- Comprehensive rules - Validates structure, content, types, and naming conventions
Validation Rules:
- Empty Section Detection - Warns about empty APIs, Products, Groups, etc.
- Duplicate Entry Prevention - Ensures no duplicate names across all configuration sections
- Type Validation - Confirms all entries are strings (not objects/arrays)
- Naming Convention Enforcement - Prevents whitespace and special characters in resource names
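The four rules above can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's C# implementation: the function name `validate_config`, the allowed-name pattern, and the "Invalid section type" message are assumptions, and it works on an already-parsed mapping, so it cannot report YAML line numbers.

```python
import re

# Assumed allowed-name pattern; the PR does not spell out the exact character set.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def validate_config(config):
    """Return a list of error messages for a parsed extractor configuration.

    `config` maps section names (e.g. "apis") to lists of entry names.
    """
    errors = []
    seen = {}  # entry name -> section where it first appeared

    for section, entries in config.items():
        # Rule 1: empty section detection.
        if not entries:
            errors.append(f"Empty section detected: '{section}' section is empty")
            continue
        if not isinstance(entries, list):
            errors.append(f"Invalid section type: '{section}' must be a list")
            continue

        for entry in entries:
            # Rule 3: every entry must be a plain string.
            if not isinstance(entry, str):
                errors.append(
                    f"Invalid entry type: Expected string but found "
                    f"{type(entry).__name__} in '{section}' section"
                )
                continue
            # Rule 4: no whitespace or special characters in names.
            if not NAME_PATTERN.match(entry):
                errors.append(
                    f"Invalid naming convention: Entry '{entry}' contains "
                    f"whitespace or special characters"
                )
            # Rule 2: no duplicate names, within or across sections.
            if entry in seen:
                if seen[entry] == section:
                    errors.append(f"Duplicate entry '{entry}' found in {section} section")
                else:
                    errors.append(
                        f"Duplicate entry '{entry}' found in both "
                        f"{seen[entry]} and {section} sections"
                    )
            else:
                seen[entry] = section

    return errors
```

Collecting every error in one pass, rather than stopping at the first, lets a user fix the whole file in a single edit.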
Integration Points:
- Automatic validation during extractor startup (fails fast on invalid config)
- Standalone CLI tool: ./extractor validate-config <file> for manual validation
- Consistent error handling using existing LanguageExt Either<T,U> patterns
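As a rough illustration of the fail-fast startup check combined with an Either-style result, consider the Python sketch below. It is not the PR's C# code and does not use LanguageExt: the `Left` and `Right` classes, `find_empty_sections`, `validate`, and `start_extractor` are all hypothetical names, and the single empty-section rule stands in for the full rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Left:
    """Failure branch: carries the validation error messages."""
    errors: list


@dataclass(frozen=True)
class Right:
    """Success branch: carries the validated configuration."""
    config: dict


def find_empty_sections(config):
    # Stand-in rule for illustration; a real validator would apply every rule.
    return [
        f"Empty section detected: '{name}' section is empty"
        for name, entries in config.items()
        if not entries
    ]


def validate(config):
    """Return Left(errors) when validation fails, otherwise Right(config)."""
    errors = find_empty_sections(config)
    return Left(errors) if errors else Right(config)


def start_extractor(config, extract):
    """Validate first; run `extract` only when the configuration is valid."""
    result = validate(config)
    if isinstance(result, Left):
        # Fail fast: report every problem and stop before any extraction work.
        details = "\n".join(f"  - {error}" for error in result.errors)
        raise SystemExit(f"Configuration validation failed:\n{details}")
    return extract(result.config)
```

Returning a value that is either a list of errors or the valid configuration keeps validation free of side effects; only the startup code decides to abort.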
Files Added/Modified
New Files:
- tools/code/common/ConfigurationValidator.cs - Core validation logic
- tools/code/extractor/ConfigurationValidationCommand.cs - CLI validation tool
- tools/code/extractor-config/configuration.extractor.example.yaml - Valid example
- tools/code/extractor-config/configuration.extractor.invalid-example.yaml - Invalid example for testing
Modified Files:
- tools/code/common/Configuration.cs - Integrated validation into config loading
- tools/code/extractor/Configuration.cs - Added extractor-specific validation
- tools/code/extractor/App.cs - Added validation at application startup
- tools/code/extractor/Program.cs - Added validation command support
- docs/apiops/3-apimTools/apiops-2-1-tools-extractor.md - Updated documentation
- tools/README.md - Added validation guidelines
- configuration.extractor.yaml - Added validation comments
Usage Examples
Automatic Validation (during extraction):
./extractor extract --configuration-file invalid-config.yaml
# Output: Configuration validation failed: Duplicate entry 'my-api' found in both apis and products sections (line 15)
Manual Validation:
./extractor validate-config ./configuration.extractor.yaml
# Output: ✅ Configuration validation passed successfully!
./extractor validate-config ./invalid-config.yaml
# Output: ❌ Configuration validation failed:
# - Empty section detected: 'groups' section is empty (line 25)
# - Duplicate entry 'test-api' found in both apis and products sections (line 30)
Error Message Examples:
- Empty section detected: 'apis' section is empty (line 10)
- Duplicate entry 'my-product' found in both products and groups sections (line 25)
- Invalid entry type: Expected string but found object for entry 'test-api' (line 15)
- Invalid naming convention: Entry 'my api' contains whitespace or special characters (line 20)
Testing
- Manual testing with valid and invalid configuration files
- Integration testing with existing extractor workflow
- Error handling verification with various malformed YAML scenarios
- CLI tool testing for standalone validation scenarios
Documentation
- Updated extractor documentation with validation guidelines
- Added example configuration files with proper structure
- Included troubleshooting section for common validation errors
- Enhanced tools README with validation workflow
Backward Compatibility
- Fully backward compatible - No breaking changes to existing functionality
- Graceful degradation - Validation failures provide clear guidance for fixes
- Optional CLI tool - Manual validation is opt-in and doesn't affect existing workflows
Benefits for Users
- Faster debugging - Immediate feedback on configuration issues
- Better error messages - Clear, actionable error descriptions with line numbers
- Proactive validation - Catch issues before they cause extraction failures
- Developer experience - Standalone validation tool for configuration development
- Better documentation - Clear examples and validation guidelines
@MO2k4 - thanks for the PR. I unfortunately have to reject it as-is. We made a lot of configuration changes in our next version, which will conflict with your updates.
You can take a look at the v7 code to see where we're going. The CHANGELOG is incomplete but shows some configuration changes we made. Happy to get feedback.