[Feature][Customize] Enhance Customize Plugin: Support CSV Import for issue_changelogs, issue_worklogs, sprints, and Add sprints Field to issues.csv
Search before asking
- [x] I had searched in the issues and found no similar feature requirement.
Use case
Many organizations use diverse or custom-built issue tracking systems that are not natively supported by dedicated DevLake plugins. The "Customize" plugin currently allows importing issues and issue_repo_commits via CSV, which is a valuable first step. However, to gain comprehensive insights into the issue tracking domain (similar to what's possible with natively supported tools like Jira), users need to import related data such as issue history, worklogs, and sprint information.
By expanding the Customize plugin's CSV import capabilities, users could:
- Ingest data from any unsupported third-party or in-house issue tracking system.
- Perform a complete analysis of their issue lifecycle, including status transitions, resolution times, and bottlenecks (using issue_changelogs).
- Track and analyze effort spent on tasks and projects (using issue_worklogs).
- Monitor sprint progress, team velocity, and scope changes (using sprints and linking them to issues).
- Fully leverage DevLake's dashboards and analytical features for the issue tracking domain, regardless of the source system.
Description
The "Customize" plugin is a powerful tool for ingesting data into DevLake's domain layer. Currently, for the issue tracking domain, it supports CSV import for the issues and issue_repo_commits entities.
This feature request proposes to extend the capabilities of the Customize plugin to allow CSV import for the following additional issue tracking domain entities:
- `issue_changelogs`: To capture the historical changes of issues (e.g., status updates, assignee changes, priority modifications). This would require defining a CSV schema for `issue_changelogs`.
- `issue_worklogs`: To import time tracking data associated with issues (e.g., who logged time, how much time, and when). This would require defining a CSV schema for `issue_worklogs`.
- `sprints`: To import sprint information (e.g., sprint name, start date, end date, status). This would require defining a CSV schema for `sprints`.
Additionally, to effectively link issues to their respective sprints, we propose adding a new field to the existing issues.csv schema:
- `sprints` field in `issues.csv`: This field (e.g., storing a list of sprint IDs or names) would allow an issue to be associated with one or more sprints during the CSV import process.
Expected Outcome: With these enhancements, users will be able to:
- Prepare CSV files for `issues`, `issue_changelogs`, `issue_worklogs`, and `sprints` extracted from their unsupported issue tracking systems.
- Use the Customize plugin's API to import this data into the corresponding DevLake domain layer entities.
- Utilize DevLake's full suite of issue tracking dashboards and analytical capabilities (e.g., lead time, cycle time, sprint velocity, bug trends) for data sourced via CSV.
This will significantly increase the flexibility of DevLake and allow a broader range of users to consolidate and analyze their development data for engineering excellence.
Related issues
No response
Are you willing to submit a PR?
- [x] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
Based on the proposed plan, here's a detailed breakdown:
1. New CSV Template Structures:
We will implement the following new CSV template structures:
- `sprints.csv`:
  - `id`: varchar
  - `name`: varchar
  - `url`: varchar (iteration web link, optional)
  - `started_date`: datetime (planned start time of the iteration)
  - `ended_date`: datetime (planned end time, optional)
  - `completed_date`: datetime (actual completion time of the iteration)
  - `status`: enum (CLOSED | ACTIVE | FUTURE)
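
  As a rough illustration, a `sprints.csv` following this template might look like the hypothetical sample below (all IDs, names, and URLs are invented, and the exact datetime format would follow the plugin's existing conventions):

  ```csv
  id,name,url,started_date,ended_date,completed_date,status
  board-1/sprint-42,Sprint 42,https://tracker.example.com/sprints/42,2024-01-01T00:00:00Z,2024-01-14T00:00:00Z,2024-01-15T10:30:00Z,CLOSED
  board-1/sprint-43,Sprint 43,,2024-01-15T00:00:00Z,2024-01-28T00:00:00Z,,ACTIVE
  ```
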
- `issue_changelogs.csv`:
  - `id`: varchar(255)
  - `issue_id`: varchar(255)
  - `author_name`: varchar(255) (an `account` record will be generated from the author name)
  - `field_name`: enum (status | Sprint | assignee)
    - When `field_name` is `status`, `original_from_value` and `original_to_value` will be status values (e.g., Pending, In Progress, Done).
    - When `field_name` is `Sprint`, `original_from_value` and `original_to_value` will be sprint IDs, comma-separated (e.g., `sprint_id_1,sprint_id_2,sprint_id_3`). An empty value indicates no iteration is set.
    - When `field_name` is `assignee`, `original_from_value` and `original_to_value` will be assignee names, which will be converted to `account_id` during data import.
  - `original_from_value`: text (original value; its meaning depends on `field_name`)
  - `original_to_value`: text (changed value; interpreted the same way as `original_from_value`)
  - `created_date`: datetime (creation time)
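
  As a rough illustration, a hypothetical `issue_changelogs.csv` fragment covering the three supported `field_name` values might look like this (all IDs and names are invented):

  ```csv
  id,issue_id,author_name,field_name,original_from_value,original_to_value,created_date
  cl-1001,ISSUE-1,Alice,status,Pending,In Progress,2024-01-02T09:00:00Z
  cl-1002,ISSUE-1,Bob,Sprint,,board-1/sprint-42,2024-01-03T10:00:00Z
  cl-1003,ISSUE-1,Alice,assignee,Bob,Carol,2024-01-04T11:00:00Z
  ```
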
- `issue_worklogs.csv`:
  - `id`: varchar(255)
  - `author_name`: varchar(255) (author name; an `account` record will be created and the name converted to its `id`)
  - `comment`: text (worklog description, optional)
  - `time_spent_minutes`: int (work time, in minutes)
  - `logged_date`: datetime (log time)
  - `started_date`: datetime (start time)
  - `issue_id`: varchar(255)
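
  A hypothetical `issue_worklogs.csv` fragment with these columns might look like the following (values are invented for illustration):

  ```csv
  id,author_name,comment,time_spent_minutes,logged_date,started_date,issue_id
  wl-2001,Alice,Investigated the login bug,90,2024-01-05T17:00:00Z,2024-01-05T15:30:00Z,ISSUE-1
  wl-2002,Bob,,30,2024-01-06T12:00:00Z,2024-01-06T11:30:00Z,ISSUE-2
  ```
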

2. `issues` table new field and `sprint_issues` table data import:

- `issues` CSV: We will add a new field `sprint_id` (type: varchar) to the existing `issues` CSV. This field will represent the current sprint(s) an issue belongs to. It can be empty, and multiple sprint IDs will be separated by commas.
- `sprint_issues` table: We will implement the logic to import data into the `sprint_issues` table. This table will be populated based on the `sprint_id` field in `issues.csv`, establishing the many-to-many relationship between sprints and issues.
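
To make the intended mapping concrete, here is a hypothetical excerpt (only a few `issues.csv` columns are shown, and all IDs are invented). An issue row that lists two comma-separated sprint IDs, such as:

```csv
id,title,sprint_id
ISSUE-1,Fix login bug,"board-1/sprint-42,board-1/sprint-43"
```

would conceptually yield one `sprint_issues` row per listed sprint:

```csv
sprint_id,issue_id
board-1/sprint-42,ISSUE-1
board-1/sprint-43,ISSUE-1
```
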
This plan addresses the core requirements of importing comprehensive issue tracking data, including historical changes, worklogs, and sprint associations. We will ensure proper data parsing, transformation, and linkage to DevLake's domain layer entities.
All fields and their formats are designed with reference to the existing issue tracking domain layer.
Implemented by #8456