Resource check

Open nikhilsinhaparseable opened this issue 3 months ago • 1 comments

take cpu and memory utilisation for 2 min rolling window before decide to reject the request

Summary by CodeRabbit

Release Notes

  • New Features

    • Automatic hourly memory release scheduler for optimized memory usage.
    • Rolling 2-minute resource history tracking for CPU and memory metrics.
  • Improvements

    • Resource monitoring now enabled by default for enhanced system stability.
    • Optimized memory management in query processing with batched operations.
    • Integrated jemalloc allocator for improved memory efficiency on non-Windows systems.

nikhilsinhaparseable, Oct 27 '25 02:10

Walkthrough

This PR integrates jemalloc as the global memory allocator and introduces a memory release scheduler that periodically purges jemalloc arenas. Query handlers and response serialization are optimized to reduce memory retention, and resource monitoring is enhanced with rolling averages to improve decision-making.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Memory allocator integration: Cargo.toml, src/main.rs | Added tikv-jemalloc dependencies (ctl, jemallocator, jemalloc-sys) and configured jemalloc as global allocator for non-MSVC targets |
| Memory scheduler module: src/memory.rs, src/lib.rs | New memory module providing force_memory_release() and init_memory_release_scheduler() for hourly jemalloc arena purging via scheduled tasks |
| Server initialization: src/handlers/http/modal/server.rs, src/handlers/http/modal/query_server.rs, src/handlers/http/modal/ingest_server.rs | Integrated memory release scheduler initialization into startup sequences for all three server types |
| Query response optimization: src/handlers/http/query.rs, src/response.rs, src/utils/arrow/mod.rs | Optimized JSON serialization and batch processing with explicit memory drops, pre-allocation, and chunked iteration to reduce memory retention |
| Resource monitoring enhancement: src/handlers/http/resource_check.rs | Migrated from instantaneous checks to rolling 2-minute averages using VecDeque-based history; updated thresholds and logging to reflect rolling average context; added unit tests |
| Minor refactoring: src/metastore/metastores/object_store_metastore.rs | Inlined await in delete_overview method |
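
The query response optimization entry above mentions pre-allocation, chunked iteration, and explicit drops. The following is a minimal sketch of that general pattern using only the standard library. The `Row` type, the `serialize_in_chunks` function, and the 32-byte-per-row capacity guess are hypothetical and are not taken from the PR; the output is JSON-like text produced with Rust's debug escaping, not a real JSON serializer.

```rust
use std::fmt::Write;

/// Hypothetical row type standing in for one decoded record (not from the PR).
pub struct Row {
    pub id: u64,
    pub message: String,
}

/// Serializes rows to newline-delimited, JSON-like text, `chunk_size` rows at a time.
/// Each chunk is dropped before the next one is pulled from the iterator, so the
/// heap data owned by already-serialized rows is released as the loop advances.
pub fn serialize_in_chunks(rows: Vec<Row>, chunk_size: usize) -> String {
    assert!(chunk_size > 0, "chunk_size must be non-zero");

    // Pre-allocate the output buffer from a rough per-row size guess (assumption)
    // to avoid repeated reallocation while appending.
    let mut out = String::with_capacity(rows.len() * 32);

    let mut iter = rows.into_iter();
    loop {
        // Take ownership of at most `chunk_size` rows.
        let chunk: Vec<Row> = iter.by_ref().take(chunk_size).collect();
        if chunk.is_empty() {
            break;
        }
        for row in &chunk {
            // `{:?}` applies Rust debug escaping, which is close to but not
            // identical to JSON string escaping.
            writeln!(out, "{{\"id\":{},\"message\":{:?}}}", row.id, row.message)
                .expect("writing to a String does not fail");
        }
        // The chunk would be freed at the end of this iteration anyway; the
        // explicit drop only makes the release point visible to readers.
        drop(chunk);
    }
    out
}
```

Whether this reduces resident memory in practice depends on the allocator returning freed pages to the operating system, which is the concern the jemalloc purge scheduler in this PR is described as addressing.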

Sequence Diagram

sequenceDiagram
    participant Server as Server Init
    participant Scheduler as Memory Scheduler
    participant Jemalloc as Jemalloc

    Server->>Scheduler: init_memory_release_scheduler()
    activate Scheduler
    Scheduler->>Scheduler: Create AsyncScheduler
    Scheduler->>Scheduler: Schedule hourly task
    Scheduler->>Scheduler: Spawn Tokio poller (60s interval)
    Scheduler-->>Server: Ok(())
    deactivate Scheduler

    loop Every 60 seconds
        Scheduler->>Scheduler: Poll scheduled tasks
        alt Task ready
            Scheduler->>Jemalloc: force_memory_release()
            Jemalloc->>Jemalloc: Advance epoch
            Jemalloc->>Jemalloc: Purge arenas
            Jemalloc-->>Scheduler: Success
        end
    end
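
The poll-then-run flow in the diagram above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in src/memory.rs: the `PeriodicTask` type and its methods are hypothetical, the clock is passed in explicitly so the logic can be tested without waiting, and the purge itself is represented by a caller-supplied closure rather than real jemalloc calls.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for one scheduled task (for example, an hourly purge).
pub struct PeriodicTask {
    interval: Duration,
    next_due: Instant,
}

impl PeriodicTask {
    /// Creates a task that first becomes due one `interval` after `now`.
    pub fn new(interval: Duration, now: Instant) -> Self {
        Self {
            interval,
            next_due: now + interval,
        }
    }

    /// Called by the poller on every tick. Runs `action` only when the task
    /// is due at `now`, then reschedules it. Returns whether `action` ran.
    pub fn poll<F: FnMut()>(&mut self, now: Instant, mut action: F) -> bool {
        if now < self.next_due {
            return false;
        }
        action();
        self.next_due = now + self.interval;
        true
    }
}
```

With a one-hour interval and a 60-second poll tick, the action runs on the 60th and 120th ticks. Polling more often than the task interval costs little, because a tick that finds nothing due returns immediately.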

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Memory scheduler logic in src/memory.rs: Verify jemalloc epoch advancement and arena purging correctness, error handling consistency
  • Server initialization ordering: Confirm all three server types (server.rs, query_server.rs, ingest_server.rs) initialize memory scheduler at appropriate points before spawning servers
  • Query optimization pathways: Review explicit drops and chunked processing in src/handlers/http/query.rs, src/response.rs, and src/utils/arrow/mod.rs for correctness and memory safety
  • Rolling average logic: Validate ResourceHistory window cleanup, sample accumulation, and average computation in src/handlers/http/resource_check.rs
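
For the rolling average item above, a minimal sketch of window cleanup, sample accumulation, and average computation over a VecDeque might look as follows. Only the name `ResourceHistory` and the use of a VecDeque come from the review text; the fields, method names, the strict greater-than threshold comparison, and the injected timestamps are assumptions made for illustration and may differ from src/handlers/http/resource_check.rs.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sketch of a rolling history of (timestamp, cpu %, memory %) samples.
pub struct ResourceHistory {
    window: Duration,
    samples: VecDeque<(Instant, f64, f64)>,
}

impl ResourceHistory {
    pub fn new(window: Duration) -> Self {
        Self {
            window,
            samples: VecDeque::new(),
        }
    }

    /// Appends a sample taken at `at`, then evicts samples older than the window.
    pub fn record(&mut self, at: Instant, cpu: f64, mem: f64) {
        self.samples.push_back((at, cpu, mem));
        while let Some(&(oldest, _, _)) = self.samples.front() {
            if at.duration_since(oldest) > self.window {
                self.samples.pop_front();
            } else {
                break;
            }
        }
    }

    /// Mean CPU and memory utilisation over the retained samples,
    /// or `None` when no samples have been recorded yet.
    pub fn averages(&self) -> Option<(f64, f64)> {
        if self.samples.is_empty() {
            return None;
        }
        let n = self.samples.len() as f64;
        let (cpu_sum, mem_sum) = self
            .samples
            .iter()
            .fold((0.0, 0.0), |(c, m), &(_, cpu, mem)| (c + cpu, m + mem));
        Some((cpu_sum / n, mem_sum / n))
    }

    /// Rejects only when a rolling average exceeds its limit, so a single
    /// short spike does not cause rejection by itself (assumed policy).
    pub fn should_reject(&self, cpu_limit: f64, mem_limit: f64) -> bool {
        match self.averages() {
            Some((cpu, mem)) => cpu > cpu_limit || mem > mem_limit,
            None => false,
        }
    }
}
```

In this sketch an empty history never rejects, which avoids turning away requests before the first sample arrives; the PR may handle that case differently.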

Possibly related PRs

  • parseablehq/parseable#1317: Modifies the same query handler functions (handle_non_streaming_query, create_batch_processor) in src/handlers/http/query.rs for memory optimization
  • parseablehq/parseable#1352: Introduces the resource-monitoring middleware infrastructure that this PR extends with rolling averages and integrates into server startup

Suggested labels

for next release

Suggested reviewers

  • parmesant

Poem

🐰 Hops through memory gardens with glee,
Jemalloc springs free, so shiny and clean,
Arenas purged hourly, no waste in between,
Query responses chirp with delight—
Memory optimized, running just right!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The PR description is largely incomplete compared to the repository's template. The provided description consists of a single sentence ("take cpu and memory utilisation for 2 min rolling window before decide to reject the request") and lacks the structured sections required by the template, including a detailed description of the goal, rationale for the chosen solution, key changes made, and the required testing and documentation checklists. While the sentence does convey the general intent of the PR, the minimal scope falls well short of the template's expectations for comprehensive PR documentation. | Expand the PR description to follow the template structure more closely. Include a section explaining the goal (why 2-minute rolling averages are needed), the rationale for this approach over alternatives, and a summary of key changes (jemalloc integration, resource history tracking, memory optimizations in handlers, etc.). Additionally, complete the checklist items by confirming testing of log ingestion and querying, verifying code comments explain the "why," and documenting the new behavior for resource-based request rejection. |
| Title Check | ❓ Inconclusive | The title "Resource check" is vague and generic. While it relates to a real component modified in this PR (the resource checking mechanism in src/handlers/http/resource_check.rs), it fails to capture the core improvement that distinguishes this change: the implementation of a 2-minute rolling window for CPU and memory utilization decisions. A developer scanning commit history would not understand from this title alone that the PR introduces rolling averages for resource decision-making rather than a general refactor or bug fix to resource checking. | The title would benefit from being more specific, such as "Implement 2-minute rolling window for resource checks" to clearly convey the main objective. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai[bot], Oct 27 '25 02:10