Implement audio subsystem for userspace
⚠️ Dependency Status
BLOCKED BY #279 (Dynamic IRQ handler registration API)
This entire subsystem depends on interrupt-driven buffer management. Without #279, none of the audio components can function efficiently.
Dependency Chain
Prerequisites
─────────────
#279 (IRQ) ← CRITICAL PATH - Must complete first
    │
    ├──────────────────────────┐
    ↓                          ↓
#382 (Driver)              #383 (Core)
• DMA interrupts           • Interrupt notification
• Buffer completion        • Writer wakeup
    │                          │
    └──────────┬───────────────┘
               ↓
        #384 (Syscalls)
        • /dev/dsp
        • Blocking write
               ↓
        #385 (Userland API)
        • libc audio functions
               ↓
        #386 (Testing)
        • WAV player
        • Doom integration
Impact: Without #279, we cannot achieve acceptable audio latency and would require inefficient polling instead of interrupt-driven operation.
Goal
Create a complete audio output system for userspace applications like Doom, supporting PCM playback with hardware acceleration.
Context
Part of the Road to Doom milestone. Doom requires audio output for sound effects and music playback. This is a complex subsystem involving hardware drivers, kernel audio core, syscall interface, and userland API.
Scope Overview
This issue tracks the complete audio subsystem implementation, broken down into 5 major work packages:
- #382 - PCI audio driver (AC'97 or Intel HDA)
- #383 - Kernel audio core (PCM buffer manager and mixer)
- #384 - Audio syscalls and /dev/dsp device interface
- #385 - Userland audio API and libc support
- #386 - Audio validation and tooling (WAV player, tests)
Architecture
┌─────────────────────────────────────────────────────────────┐
│ Userland Applications │
│ (Doom, /bin/play, /bin/tone) │
└───────────────────────────┬─────────────────────────────────┘
│
│ libc audio API (#385)
│ audio_open/write/close
↓
┌─────────────────────────────────────────────────────────────┐
│ /dev/dsp Device (#384) │
│ open/write/ioctl/close syscalls │
│ OSS-compatible interface │
└───────────────────────────┬─────────────────────────────────┘
│
│ Stream management
↓
┌─────────────────────────────────────────────────────────────┐
│ Kernel Audio Core (#383) │
│ • PCM ring buffer manager │
│ • Software mixer (multi-stream) │
│ • Sample rate conversion │
│ • Buffer scheduling and wakeup [needs #279] │
└───────────────────────────┬─────────────────────────────────┘
│
│ Driver interface [needs #279]
↓
┌─────────────────────────────────────────────────────────────┐
│ PCI Audio Driver (#382) │
│ • AC'97 or Intel HDA │
│ • DMA ring buffer setup │
│ • Interrupt handling [REQUIRES #279] │
│ • Hardware abstraction │
└─────────────────────────────────────────────────────────────┘
Work Breakdown
Phase 1: Hardware Driver (#382)
Timeline: 2-3 weeks
Assignee: @pbalduino
Status: 🔴 Blocked by #279 (Phase 3)
- PCI device enumeration (AC'97 primary target)
- BAR mapping and register initialization
- DMA ring buffer setup (Buffer Descriptor List)
- Interrupt handler for buffer completion [REQUIRES #279]
- Buffer underrun detection and recovery
Key Deliverables:
- AC'97 driver operational in QEMU
- DMA writes reach hardware buffers
- Interrupts trigger on buffer completion
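As a rough illustration of the Buffer Descriptor List setup above, here is a minimal sketch of filling an AC'97 BDL. The 8-byte descriptor layout (32-bit physical address, 16-bit sample count, 16-bit control word with the interrupt-on-completion bit) follows the Intel ICH bus-master design; the names `ac97_bdl_entry` and `ac97_bdl_fill`, the one-period-per-entry policy, and the conservative 0xFFFE sample limit are assumptions for illustration, not the driver's actual API.

```c
#include <stddef.h>
#include <stdint.h>

#define BDL_ENTRIES  32        /* the bus master walks a ring of up to 32 descriptors */
#define BDL_FLAG_IOC 0x8000u   /* interrupt on completion of this buffer */

/* One descriptor: where the buffer lives, how long it is, what to do when done. */
struct ac97_bdl_entry {
    uint32_t phys_addr;    /* physical address of the PCM buffer */
    uint16_t num_samples;  /* length in 16-bit samples, not bytes */
    uint16_t flags;        /* control bits, e.g. BDL_FLAG_IOC */
};

_Static_assert(sizeof(struct ac97_bdl_entry) == 8, "descriptor must be 8 bytes");

/* Hypothetical helper: point every entry at one period of a contiguous DMA
 * area starting at dma_phys. Returns 0 on success, -1 on invalid arguments. */
static int ac97_bdl_fill(struct ac97_bdl_entry *bdl, uint32_t dma_phys,
                         uint32_t period_bytes)
{
    uint32_t samples = period_bytes / 2;   /* 16-bit samples */

    /* Assumption: keep the sample count at or below 0xFFFE to stay on the safe side. */
    if (bdl == NULL || period_bytes == 0 || (period_bytes & 1u) || samples > 0xFFFEu)
        return -1;

    for (size_t i = 0; i < BDL_ENTRIES; i++) {
        bdl[i].phys_addr   = dma_phys + (uint32_t)i * period_bytes;
        bdl[i].num_samples = (uint16_t)samples;
        bdl[i].flags       = BDL_FLAG_IOC;  /* one IRQ per completed period */
    }
    return 0;
}
```

Requesting an interrupt on every entry is the simplest policy for waking the core once per period; the IRQ handler itself still depends on #279.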
Phase 2: Kernel Audio Core (#383)
Timeline: 2-3 weeks
Assignee: @pbalduino
Status: 🔴 Blocked by #279 (Phase 3-4)
- PCM ring buffer allocation and management
- Stream lifecycle (create/setup/run/drain/stop)
- Software mixer for multiple streams
- Sample rate conversion (11.025/22.05/44.1 kHz → 48 kHz)
- Driver registration API
- Interrupt notification path [REQUIRES #279]
Key Deliverables:
- Multi-stream mixing working
- Buffer scheduling prevents underruns
- Integration with driver (#382)
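The PCM ring buffer at the heart of the core could look roughly like the sketch below: a single-producer, single-consumer byte ring with free-running counters. The names (`pcm_ring`, `pcm_ring_write`, `pcm_ring_read`) and the fixed 4096-byte size are hypothetical, and locking between the writer and the interrupt path is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCM_RING_SIZE 4096u   /* bytes; must be a power of two */

struct pcm_ring {
    uint8_t  data[PCM_RING_SIZE];
    uint32_t head;   /* total bytes ever written (producer side) */
    uint32_t tail;   /* total bytes ever read (consumer side) */
};

static uint32_t pcm_ring_used(const struct pcm_ring *r) { return r->head - r->tail; }
static uint32_t pcm_ring_free(const struct pcm_ring *r) { return PCM_RING_SIZE - pcm_ring_used(r); }

/* Copy up to len bytes in. Returns the number accepted, which is short
 * when the ring is nearly full (the caller would block or retry). */
static uint32_t pcm_ring_write(struct pcm_ring *r, const uint8_t *src, uint32_t len)
{
    uint32_t space = pcm_ring_free(r);
    uint32_t n     = len < space ? len : space;
    uint32_t off   = r->head & (PCM_RING_SIZE - 1);
    uint32_t first = n < PCM_RING_SIZE - off ? n : PCM_RING_SIZE - off;

    memcpy(r->data + off, src, first);        /* up to the end of the array */
    memcpy(r->data, src + first, n - first);  /* wrapped remainder, if any */
    r->head += n;
    return n;
}

/* Copy len bytes out. On underrun the tail of dst is padded with zeros,
 * which is silence for signed 16-bit PCM. Returns the real bytes delivered. */
static uint32_t pcm_ring_read(struct pcm_ring *r, uint8_t *dst, uint32_t len)
{
    uint32_t avail = pcm_ring_used(r);
    uint32_t n     = len < avail ? len : avail;
    uint32_t off   = r->tail & (PCM_RING_SIZE - 1);
    uint32_t first = n < PCM_RING_SIZE - off ? n : PCM_RING_SIZE - off;

    memcpy(dst, r->data + off, first);
    memcpy(dst + first, r->data, n - first);
    memset(dst + n, 0, len - n);
    r->tail += n;
    return n;
}
```

A short return from `pcm_ring_write` is the point where a blocking `write()` would sleep until the buffer-completion interrupt frees space; a short return from `pcm_ring_read` is an underrun.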
Phase 3: Syscalls and Device Interface (#384)
Timeline: 2 weeks
Assignee: @pbalduino
Status: 🔴 Blocked by #382 + #383
- /dev/dsp character device in devfs
- open/close/write/ioctl operations
- OSS-compatible ioctl commands (SETFMT, SPEED, CHANNELS)
- Blocking/non-blocking write semantics
- poll/select support
Key Deliverables:
- /dev/dsp accessible from userland
- ioctl configures format/rate/channels
- write() queues samples to kernel
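From the application side, the ioctl deliverable above would be exercised roughly as follows. This sketch uses the standard OSS requests from `<sys/soundcard.h>` (`SNDCTL_DSP_SETFMT`, `SNDCTL_DSP_CHANNELS`, `SNDCTL_DSP_SPEED`) in the order the OSS guide recommends: format, then channels, then rate. The helper name `dsp_configure` is hypothetical, and whether this kernel's /dev/dsp will adjust or reject unsupported values is an assumption.

```c
#include <sys/ioctl.h>
#include <sys/soundcard.h>

/* Hypothetical helper: configure an already-open /dev/dsp descriptor.
 * OSS ioctls write the value actually selected back through the pointer,
 * so the result is checked against what was requested.
 * Returns 0 on success, -1 on any failure or mismatch. */
static int dsp_configure(int fd, int format, int channels, int rate)
{
    int fmt = format;
    int ch  = channels;
    int hz  = rate;

    if (ioctl(fd, SNDCTL_DSP_SETFMT, &fmt) < 0 || fmt != format)
        return -1;
    if (ioctl(fd, SNDCTL_DSP_CHANNELS, &ch) < 0 || ch != channels)
        return -1;
    /* The device may pick a nearby rate; this sketch accepts it. */
    if (ioctl(fd, SNDCTL_DSP_SPEED, &hz) < 0)
        return -1;
    return 0;
}
```

For the MVP stream this would be called as `dsp_configure(fd, AFMT_S16_LE, 1, 22050)` on a descriptor opened with `open("/dev/dsp", O_WRONLY)`.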
Phase 4: Userland API (#385)
Timeline: 1.5-2 weeks
Assignee: @pbalduino
Status: 🔴 Blocked by #384
- libc audio API (audio_open, audio_write, etc.)
- OSS header compatibility (<sys/soundcard.h>)
- Doom integration shims (I_InitSound, I_SubmitSound)
- Host test harness (SDL2 for macOS)
Key Deliverables:
- libc provides high-level audio API
- Doom can initialize audio and play samples
- Host tests forward to SDL2/OSS
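One detail the libc layer will likely need is handling short writes, since `write()` on /dev/dsp may accept fewer bytes than requested. A minimal sketch, assuming a hypothetical helper named `audio_write_all` (the real `audio_write` signature is not defined in this issue):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until the whole buffer has been
 * queued. Retries on EINTR. Returns the number of bytes written, or -1 with
 * errno set on error. */
static ssize_t audio_write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal, try again */
            return -1;
        }
        if (n == 0)
            break;             /* no progress; avoid spinning forever */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

On a non-blocking descriptor this loop would return -1 with `EAGAIN` when the kernel buffer is full, so a Doom shim that must not stall would poll first or drop the chunk.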
Phase 5: Validation and Tooling (#386)
Timeline: 2 weeks
Assignee: @pbalduino
Status: 🔴 Blocked by all above
- Unit tests (mixer, buffer management)
- Integration tests (syscalls, ioctl)
- Tone generator (/bin/tone)
- WAV file player (/bin/play)
- Audio info utility (/bin/audioinfo)
- Sample WAV files
- Documentation (testing guide, API reference)
Key Deliverables:
- Comprehensive test coverage
- Tools for debugging audio issues
- Documentation for developers
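For /bin/play, the first step is reading the RIFF WAVE header. The sketch below parses only the canonical 44-byte layout (a 16-byte PCM `fmt ` chunk immediately followed by `data`); real files can carry extra chunks, which this deliberately rejects. The names `wav_info` and `wav_parse_header` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct wav_info {
    uint16_t channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_bytes;
};

/* WAV fields are little-endian; read them byte by byte to stay portable. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *out for a canonical PCM header, -1 otherwise. */
static int wav_parse_header(const uint8_t *h, size_t len, struct wav_info *out)
{
    if (len < 44)
        return -1;
    if (memcmp(h, "RIFF", 4) != 0 || memcmp(h + 8, "WAVE", 4) != 0)
        return -1;
    if (memcmp(h + 12, "fmt ", 4) != 0 || rd32(h + 16) != 16)
        return -1;
    if (rd16(h + 20) != 1)              /* format tag 1 = uncompressed PCM */
        return -1;
    if (memcmp(h + 36, "data", 4) != 0)
        return -1;

    out->channels        = rd16(h + 22);
    out->sample_rate     = rd32(h + 24);
    out->bits_per_sample = rd16(h + 34);
    out->data_bytes      = rd32(h + 40);
    return 0;
}
```

The parsed fields map directly onto the format, channel, and rate ioctls on /dev/dsp.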
Audio Requirements (Doom)
PCM Output
- Formats: 8-bit unsigned, 16-bit signed little-endian
- Sample Rates: 11025, 22050, 44100 Hz (hardware at 48 kHz)
- Channels: Mono and stereo
- Latency: < 100 ms for responsive sound effects
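Because the hardware runs at 48 kHz while Doom supplies 11025/22050/44100 Hz, every stream needs rate conversion. One simple option, shown here as a sketch rather than the chosen design, is linear interpolation with a 16.16 fixed-point read position. The function name `resample_linear_s16` is hypothetical and the code handles mono signed 16-bit only.

```c
#include <stddef.h>
#include <stdint.h>

/* Resample mono signed 16-bit PCM from in_rate to out_rate by linear
 * interpolation. Writes at most out_cap samples and returns the count. */
static size_t resample_linear_s16(const int16_t *in, size_t in_len, uint32_t in_rate,
                                  int16_t *out, size_t out_cap, uint32_t out_rate)
{
    if (in_len == 0 || in_rate == 0 || out_rate == 0)
        return 0;

    /* Input samples advanced per output sample, in 16.16 fixed point. */
    uint64_t step = ((uint64_t)in_rate << 16) / out_rate;
    uint64_t pos  = 0;
    size_t   n    = 0;

    while (n < out_cap) {
        size_t i = (size_t)(pos >> 16);
        if (i >= in_len)
            break;
        int64_t frac = (int64_t)(pos & 0xFFFFu);
        int64_t a    = in[i];
        int64_t b    = (i + 1 < in_len) ? in[i + 1] : a;  /* hold the last sample */
        out[n++] = (int16_t)(a + (b - a) * frac / 65536);
        pos += step;
    }
    return n;
}
```

The truncated fixed-point step drifts slightly on long streams, and linear interpolation does not filter aliasing; both are probably acceptable for Doom sound effects but worth revisiting for music.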
Software Mixing
- Mix multiple sound effects simultaneously
- Separate music and SFX streams
- Volume control per stream (optional for MVP)
- Clipping prevention (saturating arithmetic)
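The clipping-prevention requirement amounts to summing in a wider type and clamping back to 16 bits. A minimal sketch, with hypothetical names `sat16` and `mix_s16`:

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate value into the signed 16-bit sample range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Mix n samples of src into dst in place, saturating instead of wrapping. */
static void mix_s16(int16_t *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = sat16((int32_t)dst[i] + (int32_t)src[i]);
}
```

Without the clamp, two loud samples such as 30000 + 10000 would wrap to a large negative value and produce an audible click; with it, the sum pins at 32767.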
API Compatibility
- OSS /dev/dsp semantics
- Doom's I_InitSound() / I_SubmitSound() interface
- Blocking and non-blocking I/O
Dependencies
Infrastructure (Complete)
- ✅ #280 - PCI config space API (CLOSED)
- ✅ #335 - PCI host bridge driver (CLOSED)
- ✅ #136 - Device filesystem (devfs) (CLOSED)
- ✅ #96 - File descriptor management (CLOSED)
- ✅ #220 - ioctl syscall infrastructure (CLOSED)
- ✅ #40 - Condition variables (CLOSED)
- ✅ #193 - Minimal libc (CLOSED)
- ✅ #189 - FAT32 write support (CLOSED)
Critical Blockers
- 🔴 #279 - Dynamic IRQ handler registration (CRITICAL - BLOCKS EVERYTHING)
  - Needed for: Interrupt-driven DMA, buffer completion, writer wakeup
  - Estimated: 8-10 weeks part-time
  - Priority: HIGHEST for audio subsystem
- 🔴 #281 - I/O port resource management (for AC'97 mixer)
  - Priority: Medium (can work around initially)
Parallel Work
- #382 and #383 can begin their internal Phases 1-2 in parallel; their later phases (Phase 3 onward) are blocked by #279
- #384, #385, and #386 are blocked until #279 completes
Risks and Challenges
Technical Risks
- DMA constraints: Buffers must be physically contiguous and page-aligned
  - Mitigation: Use kmalloc with alignment, verify physical addresses
- Timing accuracy: Poor timer resolution causes audio glitches
  - Mitigation: Use high-resolution timers (#287), profile interrupt latency
- Format conversion overhead: Software mixing may be CPU-intensive
  - Mitigation: Optimize hot paths, profile early, consider SIMD (future)
- Buffer synchronization: Race conditions between write and mixer
  - Mitigation: Careful locking, thorough testing
- Hardware variations: AC'97 implementations differ across vendors
  - Mitigation: Test on QEMU, VirtualBox, real hardware (Intel ICH)
Scope Risks
- Underestimated complexity: Audio is notoriously tricky
  - Mitigation: Phased approach, MVP first (mono, single stream)
- Doom integration challenges: Unexpected API mismatches
  - Mitigation: Early prototyping with simple tools (tone generator)
Timeline
Overall Estimate
Total effort: 9.5-12 weeks after #279 completes (5-7 weeks focused, plus testing/integration)
Phased Rollout
- Weeks 1-3: Driver (#382) + Core (#383) foundation
- Weeks 4-6: Syscalls (#384) + API (#385)
- Weeks 7-9: Testing (#386) + Doom integration
- Weeks 10-12: Stabilization, real hardware validation
Critical Path (Updated)
#279 (IRQ Registration) ← MUST COMPLETE FIRST (8-10 weeks)
    │
    ├──────────────────────────┐
    ↓                          ↓
#382 (Driver)              #383 (Core)
Phases 1-2: 2 weeks        Phases 1-2: 2 weeks
Phase 3: needs #279        Phases 3-4: needs #279
    │                          │
    └──────────┬───────────────┘
               ↓
        #384 (Syscalls)
        2 weeks
               ↓
        #385 (API)
        2 weeks
               ↓
        #386 (Tests)
        2 weeks
               ↓
        Doom Integration
Total Timeline: ~8-10 weeks for #279, then 6-8 weeks for remaining phases
Definition of Done
Kernel
- [ ] AC'97 driver detects and initializes hardware
- [ ] DMA ring buffers allocated and operational
- [ ] Interrupt handler processes buffer completions (requires #279)
- [ ] Software mixer combines multiple streams
- [ ] Sample rate conversion working (11.025/22.05/44.1 kHz → 48 kHz)
- [ ] /dev/dsp device accessible from userland
- [ ] OSS-compatible ioctls implemented
Userland
- [ ] libc audio API functional
- [ ] Doom integration functions (I_*Sound) working
- [ ] Tone generator produces audible output
- [ ] WAV player plays sample files
- [ ] Host test harness forwards to SDL2/OSS
Testing
- [ ] Unit tests for mixer and buffers passing
- [ ] Integration tests for syscalls passing
- [ ] Stress tests (multi-stream, leak detection) passing
- [ ] Manual validation in QEMU successful
- [ ] Doom plays sound effects and music
Documentation
- [ ] API reference (docs/api/audio.md)
- [ ] Design document (docs/design/audio.md)
- [ ] Testing guide (docs/testing/audio.md)
- [ ] Code comments and inline docs
Success Criteria
Minimum Viable Product (MVP):
- Single audio stream at 22050 Hz, 16-bit, mono
- Basic /dev/dsp interface
- Doom pistol sound effect plays correctly
Full Success:
- Multi-stream mixing (8+ simultaneous sounds)
- Format/rate flexibility (8/16-bit, 11-48 kHz, mono/stereo)
- No buffer underruns during normal operation
- Doom gameplay with full audio (SFX + music)
- < 100 ms latency
- Comprehensive test coverage
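The < 100 ms latency target puts an upper bound on how much PCM may sit queued ahead of the hardware. The arithmetic can be sketched as below; the helper name `latency_to_bytes` is hypothetical, and the figure covers queued data only, not mixer or driver overhead.

```c
#include <stdint.h>

/* Bytes of PCM that correspond to latency_ms of audio at the given format. */
static uint32_t latency_to_bytes(uint32_t rate_hz, uint32_t channels,
                                 uint32_t bytes_per_sample, uint32_t latency_ms)
{
    /* Frames first (64-bit to avoid overflow), then scale by the frame size. */
    uint32_t frames = (uint32_t)((uint64_t)rate_hz * latency_ms / 1000u);
    return frames * channels * bytes_per_sample;
}
```

At the 48 kHz hardware rate in 16-bit stereo, 100 ms is 4800 frames or 19200 bytes, so the ring buffer plus all in-flight DMA periods together should stay below that. For the MVP stream (22050 Hz, 16-bit, mono) the same budget is 4410 bytes.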
References
Specifications
- AC'97 Specification: Intel Audio Codec '97 Component Specification
- Intel HDA Specification: High Definition Audio Specification
- OSS Programmer's Guide: http://www.opensound.com/pguide/
- RIFF WAVE format: http://soundfile.sapp.org/doc/WaveFormat/
Reference Implementations
- Linux ALSA: sound/pci/ac97/, sound/core/pcm*.c
- FreeBSD: sys/dev/sound/pci/, sys/dev/sound/pcm/
- OpenBSD: sys/dev/pci/ac97.c
- OSDev Wiki: https://wiki.osdev.org/AC97
Related Work
- docs/road/road_to_doom.md (Audio Requirements section)
- Doom source: i_sound.c, doomgeneric_menios.c
Sub-Issues (Work Breakdown)
- #382 - PCI audio driver (AC'97 or Intel HDA) [2-3 weeks] 🔴 Blocked by #279
- #383 - Kernel audio core (PCM buffer manager and mixer) [2-3 weeks] 🔴 Blocked by #279
- #384 - Audio syscalls and /dev/dsp device interface [2 weeks] 🔴 Blocked by #382+#383
- #385 - Userland audio API and libc support [1.5-2 weeks] 🔴 Blocked by #384
- #386 - Audio validation and tooling (WAV player, tests) [2 weeks] 🔴 Blocked by all
Total: 9.5-12 weeks (AFTER #279 completes)
Progress Tracking
- [ ] Prerequisite: #279 (IRQ Registration) [8-10 weeks] ← START HERE
- [ ] Phase 1: Hardware driver (#382)
- [ ] Phase 2: Kernel audio core (#383)
- [ ] Phase 3: Syscalls and device (#384)
- [ ] Phase 4: Userland API (#385)
- [ ] Phase 5: Validation and tooling (#386)
- [ ] Final: Doom integration and stabilization
Status: Open - BLOCKED by #279 (IRQ handler registration must land first)
Priority: Medium-High (Doom blocker, but deferred from v0.1.666)
Complexity: High (5-7 weeks focused effort AFTER #279)
Last Updated: 2025-10-31