clp-s: Add the read path for single-file archive
Description
Validation performed
Walkthrough
The changes in this pull request focus on enhancing the functionality of the ArchiveWriter and related classes, particularly in handling single-file archives. Key updates include the introduction of new member variables and methods, modifications to existing method signatures, and improvements to command line argument parsing. The TimestampDictionaryWriter class has also been restructured to streamline its operations. Additionally, a new file defining structures for single-file archives has been added, along with updates to related classes and methods to support these enhancements.
Changes
| File | Change Summary |
|---|---|
| `components/core/src/clp_s/ArchiveWriter.cpp` | Added member variable `m_single_file_archive`, modified `close` method to differentiate between single and multi-file archives, added `write_timestamp_dict` method, updated `store_tables` return type to `std::pair<size_t, size_t>`. |
| `components/core/src/clp_s/ArchiveWriter.hpp` | Added `bool single_file_archive` to `ArchiveWriterOption`, updated `store_tables` return type, added methods for single-file archive handling. |
| `components/core/src/clp_s/CommandLineArguments.cpp` | Introduced `single-file-archive` option in command line argument parsing. |
| `components/core/src/clp_s/CommandLineArguments.hpp` | Added member variable `m_single_file_archive` and getter method `get_single_file_archive()`. |
| `components/core/src/clp_s/JsonParser.cpp` | Added `single_file_archive` to `m_archive_options` structure in constructor. |
| `components/core/src/clp_s/JsonParser.hpp` | Added `bool single_file_archive` to `JsonParserOption` struct. |
| `components/core/src/clp_s/SingleFileArchiveDefs.hpp` | Introduced definitions and structures for managing single-file archives, including `ArchiveHeader`, `ArchiveCompressionType`, and related structures. |
| `components/core/src/clp_s/TimestampDictionaryWriter.cpp` | Replaced `write_and_flush_to_disk` with `write`, added `clear` method, removed `open` and `close` methods. |
| `components/core/src/clp_s/TimestampDictionaryWriter.hpp` | Updated constructor and method signatures, removed file management methods, added `write` and `clear` methods. |
| `components/core/src/clp_s/archive_constants.hpp` | Added constant `cTmpPostfix` for temporary file postfix. |
| `components/core/src/clp_s/clp-s.cpp` | Modified `compress` function to include `single_file_archive` parameter. |
| `components/core/src/clp_s/TimestampEntry.hpp` | Renamed method `write_to_file` to `write_to_stream`, changing parameter type from `ZstdCompressor&` to `std::stringstream&`. |
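To illustrate the `write_to_stream` change summarized in the table, here is a minimal sketch of an entry that serializes into a `std::stringstream` instead of writing through a compressor. `RangeEntry`, its fields, and `serialize_entry` are hypothetical stand-ins invented for this example, and the byte layout is an assumption; none of it is taken from the actual `TimestampEntry` code.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for a timestamp range entry; not the real
// clp_s::TimestampEntry. Field names and layout are assumptions.
struct RangeEntry {
    std::string key_name;
    int64_t begin_timestamp;
    int64_t end_timestamp;

    // Serializes the entry into an in-memory stream. The caller decides where
    // the buffered bytes go afterwards (a separate file or a section of a
    // single-file archive).
    void write_to_stream(std::stringstream& stream) const {
        uint64_t const key_length = key_name.size();
        stream.write(reinterpret_cast<char const*>(&key_length), sizeof(key_length));
        stream.write(key_name.data(), static_cast<std::streamsize>(key_name.size()));
        stream.write(reinterpret_cast<char const*>(&begin_timestamp), sizeof(begin_timestamp));
        stream.write(reinterpret_cast<char const*>(&end_timestamp), sizeof(end_timestamp));
    }
};

// Hypothetical helper: returns the serialized bytes of a single entry.
std::string serialize_entry(RangeEntry const& entry) {
    std::stringstream stream;
    entry.write_to_stream(stream);
    return stream.str();
}
```

Buffering into a stream first means the serialized size is known before anything is written out, which may be useful when a container format needs to record section sizes or offsets.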
Possibly related PRs
- #466: The changes in `ArchiveWriter.cpp` and `ArchiveWriter.hpp` regarding the handling of archive formats and metadata are related to the overall archiving functionality, which may connect with the changes in `ArchiveReader` that also deal with metadata and schema reading.
- #600: The modifications to the `CommandLineArguments` class, specifically the renaming of `ordered_chunk_size` to `target_ordered_chunk_size`, directly relate to the changes in the main PR that involve chunk size handling in the `ArchiveWriter` class. This indicates a cohesive approach to managing chunk sizes across different components.
Suggested reviewers
- wraymo
📜 Recent review details
Configuration used: CodeRabbit UI Review profile: CHILL
📥 Commits
Reviewing files that changed from the base of the PR and between 4c7a50ffc69f054f8e4584ecffa0ce34be0ec88a and c6984876cfc63516470404945b35cab7526caf6d.
📒 Files selected for processing (1)
- `components/core/src/clp_s/Utils.hpp` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/src/clp_s/Utils.hpp
Nice work! This seems mostly good for a draft implementation.
The main things we should change quickly are putting the archive header + metadata section into the regular multi-file archive, formally picking a magic number, and changing the magic number to 4 bytes.
There are other things we need to clean up or think about before actually merging this, but the above changes should be enough to build off of for prototyping.
We also need to go through and fix all of the fields whose size differs from what the spec specifies.
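One way to keep field sizes in line with a spec is to pin the layout with fixed-width types and compile-time checks. The sketch below is only an illustration of that idea: `ExampleArchiveHeader`, its fields, their sizes, and the magic bytes are all hypothetical placeholders, not the actual `ArchiveHeader` from `SingleFileArchiveDefs.hpp` or the values the spec requires.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Placeholder 4-byte magic value; the real value still has to be chosen.
constexpr std::array<uint8_t, 4> cExampleMagicNumber{0xDE, 0xAD, 0xBE, 0xEF};

// Hypothetical header layout. Packing removes compiler-inserted padding so
// that the in-memory layout matches the byte layout written to disk.
#pragma pack(push, 1)
struct ExampleArchiveHeader {
    uint8_t magic_number[4];
    uint32_t version;
    uint64_t uncompressed_size;
    uint64_t compressed_size;
    uint16_t compression_type;
    uint16_t padding;
};
#pragma pack(pop)

// Compile-time checks that fail the build if a field changes size or moves.
static_assert(28 == sizeof(ExampleArchiveHeader), "Header size differs from the assumed layout");
static_assert(4 == offsetof(ExampleArchiveHeader, version), "Magic number must occupy 4 bytes");
static_assert(8 == offsetof(ExampleArchiveHeader, uncompressed_size), "Unexpected version size");

// Returns true if the header starts with the expected magic bytes.
bool has_valid_magic(ExampleArchiveHeader const& header) {
    return 0
           == std::memcmp(
                   header.magic_number,
                   cExampleMagicNumber.data(),
                   cExampleMagicNumber.size()
           );
}
```

With checks like these, a mismatch between the struct and the documented sizes shows up as a build error rather than as a corrupted archive at read time.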