[feature-wip](parquet-reader) add parquet reader profile

Open AshinGau opened this issue 3 years ago • 0 comments

Proposed changes

Add a profile for the parquet reader. New counters:

  • ParquetFilteredGroups: The number of row groups filtered out by RowGroup min-max statistics
  • ParquetReadGroups: The number of row groups to read
  • ParquetFilteredRowsByGroup: The number of rows filtered out by RowGroup min-max statistics
  • ParquetFilteredRowsByPage: The number of rows filtered out by page min-max statistics
  • ParquetFilteredBytes: The number of bytes filtered out by RowGroup min-max statistics
  • ParquetReadBytes: The total bytes in ParquetReadGroups; this may be reduced further if a page is skipped as a whole
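To make the relationship between these counters concrete, here is a minimal sketch of how they could be accumulated during min-max pruning. It is not the Doris implementation: the `Page` and `RowGroup` classes, the `overlaps` helper, the `scan` function, and the closed-interval predicate `[pred_lo, pred_hi]` are all hypothetical names introduced for illustration. Only the counter names come from this PR.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Page:
    # Hypothetical page metadata: min/max of one column plus row count.
    min_value: int
    max_value: int
    num_rows: int


@dataclass
class RowGroup:
    # Hypothetical row-group metadata.
    min_value: int
    max_value: int
    num_bytes: int
    pages: List[Page]


def overlaps(lo: int, hi: int, pred_lo: int, pred_hi: int) -> bool:
    """True if [lo, hi] intersects the predicate range [pred_lo, pred_hi]."""
    return hi >= pred_lo and lo <= pred_hi


def scan(row_groups: List[RowGroup], pred_lo: int, pred_hi: int) -> Dict[str, int]:
    counters = {
        "ParquetFilteredGroups": 0,
        "ParquetReadGroups": 0,
        "ParquetFilteredRowsByGroup": 0,
        "ParquetFilteredRowsByPage": 0,
        "ParquetFilteredBytes": 0,
        "ParquetReadBytes": 0,
    }
    for rg in row_groups:
        rows = sum(p.num_rows for p in rg.pages)
        if not overlaps(rg.min_value, rg.max_value, pred_lo, pred_hi):
            # The whole row group is skipped using its min-max statistics.
            counters["ParquetFilteredGroups"] += 1
            counters["ParquetFilteredRowsByGroup"] += rows
            counters["ParquetFilteredBytes"] += rg.num_bytes
            continue
        # The row group is selected for reading; pages may still be skipped.
        counters["ParquetReadGroups"] += 1
        counters["ParquetReadBytes"] += rg.num_bytes
        for page in rg.pages:
            if not overlaps(page.min_value, page.max_value, pred_lo, pred_hi):
                counters["ParquetFilteredRowsByPage"] += page.num_rows
    return counters


example_groups = [
    RowGroup(0, 9, 1000, [Page(0, 4, 5), Page(5, 9, 5)]),
    RowGroup(50, 150, 2000, [Page(50, 99, 50), Page(100, 150, 51)]),
]
example_counters = scan(example_groups, pred_lo=100, pred_hi=200)
```

In this sketch the first row group is pruned entirely, while the second is read but has one of its two pages skipped, so `ParquetReadBytes` still counts the full 2000 bytes of the second group even though fewer bytes would actually be decoded.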

Result

┌──────────────────────────────────────────────────────┐
│[0: VFILE_SCAN_NODE]                                  │
│(Active: 1s29ms, non-child: 96.42)                    │
│  - Counters:                                         │
│      - BytesRead: 0.00                               │
│      - FileReadCalls: 1.826K (1826)                  │
│      - FileReadTime: 510.627ms                       │
│      - FileRemoteReadBytes: 65.23 MB                 │
│      - FileRemoteReadCalls: 1.146K (1146)            │
│      - FileRemoteReadRate: 128.29331970214844 MB/sec │
│      - FileRemoteReadTime: 508.469ms                 │
│      - NumDiskAccess: 0                              │
│      - NumScanners: 1                                │
│      - ParquetFilteredBytes: 0.00                    │
│      - ParquetFilteredGroups: 0                      │
│      - ParquetFilteredRowsByGroup: 0                 │
│      - ParquetFilteredRowsByPage: 6.600003M (6600003)│
│      - ParquetReadBytes: 2.13 GB                     │
│      - ParquetReadGroups: 20                         │
│      - PeakMemoryUsage: 0.00                         │
│      - PredicateFilteredRows: 3.399797M (3399797)    │
│      - PredicateFilteredTime: 133.302ms              │
│      - RowsRead: 3.399997M (3399997)                 │
│      - RowsReturned: 200                             │
│      - RowsReturnedRate: 194                         │
│      - TotalRawReadTime(*): 726.566ms                │
│      - TotalReadThroughput: 0.0 /sec                 │
│      - WaitScannerTime: 1s27ms                       │
└──────────────────────────────────────────────────────┘
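The row counters in this profile can be cross-checked with a little arithmetic. The sketch below copies the values from the output above; the interpretation (that rows skipped by page statistics plus rows read make up the rows in the selected row groups, and that rows read minus predicate-filtered rows are the rows returned) is an assumption inferred from the numbers, not something the PR states.

```python
# Values copied from the VFILE_SCAN_NODE profile above.
profile = {
    "ParquetFilteredRowsByGroup": 0,
    "ParquetFilteredRowsByPage": 6_600_003,
    "RowsRead": 3_399_997,
    "PredicateFilteredRows": 3_399_797,
    "RowsReturned": 200,
}

# Assumed: rows in the 20 read groups = rows skipped by page stats + rows read.
rows_in_read_groups = profile["ParquetFilteredRowsByPage"] + profile["RowsRead"]

# Assumed: rows surviving predicate evaluation = rows read - predicate-filtered rows.
rows_after_predicates = profile["RowsRead"] - profile["PredicateFilteredRows"]

# Fraction of rows in the read groups that page statistics let the reader skip.
page_skip_ratio = profile["ParquetFilteredRowsByPage"] / rows_in_read_groups

print(rows_in_read_groups)    # 10000000
print(rows_after_predicates)  # 200
```

Under these assumptions the numbers are self-consistent: the read groups hold 10,000,000 rows, about two thirds of which are skipped by page-level min-max statistics, and the remaining predicate evaluation leaves exactly the 200 rows reported as RowsReturned.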

Problem summary

Describe your changes.

Checklist(Required)

  1. Does it affect the original behavior:
    • [ ] Yes
    • [x] No
    • [ ] I don't know
  2. Have unit tests been added:
    • [ ] Yes
    • [x] No
    • [ ] No Need
  3. Has documentation been added or modified:
    • [ ] Yes
    • [x] No
    • [ ] No Need
  4. Does it need to update dependencies:
    • [ ] Yes
    • [x] No
  5. Are there any changes that cannot be rolled back:
    • [ ] Yes (If Yes, please explain WHY)
    • [x] No

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc.

AshinGau • Sep 21 '22 02:09