[CPP]. Pointer length problem when parsing data

Open t59688 opened this issue 1 year ago • 2 comments

void print_data_result(DataResult* result) {
    printf("result->column_num:%d\n",result->column_num);
    std::cout << std::left << std::setw(15) << "timestamp";
    for (int i = 0; i < result->column_num; i++) {
        std::cout << std::left << std::setw(15)
                  << result->column_schema[i]->name;
    }
    std::cout << std::endl;
    for (int i = 0; i < result->cur_num; i++) {
        std::cout << std::left << std::setw(15);
        std::cout << result->times[i];
        for (int j = 0; j < result->column_num; j++) {
            ColumnSchema* schema = result->column_schema[j];
            double dval;
            float fval;
            std::cout << std::left << std::setw(15);
            switch (get_datatype(schema->column_def)) {
                case TSDataType::BOOLEAN:
                    std::cout
                        << ((*((int64_t*)result->value[j] + i)) > 0 ? "true"
                                                                    : "false");
                    break;
                case TSDataType::INT32:
                    std::cout << *((int64_t*)result->value[j] + i);
                    break;
                case TSDataType::INT64:
                    std::cout << *((int64_t*)result->value[j] + i);
                    break;
                case TSDataType::FLOAT:
                    memcpy(&fval, (int64_t*)result->value[j] + i,
                           sizeof(float));
                    std::cout << fval;
                    break;
                case TSDataType::DOUBLE:
                    memcpy(&dval, (int64_t*)result->value[j] + i,
                           sizeof(double));
                    std::cout << dval;
                    break;
                default:
                    std::cout << "";
            }
        }
        std::cout << std::endl;
    }
}

In the code,

case TSDataType::FLOAT:
                    memcpy(&fval, (int64_t*)result->value[j] + i,
                           sizeof(float));
                    std::cout << fval;

adds the index i to an int64_t* pointer, so each step skips 8 bytes (the size of int64_t) instead of the 4 bytes a float occupies. The pointer needs to be cast to the element type before the index is added.
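The stride difference can be sketched as follows. This is only an illustration, assuming (as described above) that a FLOAT column is stored as tightly packed 4-byte values. The helper names read_float_packed and read_float_int64_stride are hypothetical and not part of the TsFile API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of the TsFile API.
// Reads the i-th float from a buffer of tightly packed 4-byte floats.
// The offset is computed in bytes, in units of sizeof(float).
float read_float_packed(const void* column, std::size_t i) {
    float out;
    const unsigned char* bytes = static_cast<const unsigned char*>(column);
    std::memcpy(&out, bytes + i * sizeof(float), sizeof(float));
    return out;
}

// Hypothetical helper that reproduces the stride of the reported code:
// the offset advances by sizeof(int64_t) == 8 bytes per index, so on a
// packed float buffer it lands on every second element.
float read_float_int64_stride(const void* column, std::size_t i) {
    float out;
    const unsigned char* bytes = static_cast<const unsigned char*>(column);
    std::memcpy(&out, bytes + i * sizeof(std::int64_t), sizeof(float));
    return out;
}
```

With a packed buffer holding 1.0f through 8.0f, read_float_packed(buf, 1) yields the second element, while read_float_int64_stride(buf, 1) yields the third, which matches the pattern of skipped rows in the output below. For indices past the halfway point, the 8-byte stride reads beyond the end of the buffer.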

The expected result is:

timestamp point1 point3
1 18695.70 13891.19
2 13539.03 10634.20
3 1849.30 21680.58
4 19746.76 1269.98
5 4144.69 5239.53
6 1594.83 12430.77
7 22197.75 19263.72
8 9240.63 13242.19
9 19448.98 8873.40
10 9484.03 22182.68
11 8303.50 4624.43
12 19118.72 15378.68
13 1663.34 22420.44
14 7029.99 13379.90

But the original code outputs:

timestamp point1 point3
1 18695.7 13891.2
2 1849.3 21680.6
3 4144.69 5239.53
4 22197.8 19263.7
5 19449 8873.4
6 8303.5 4624.43
7 1663.34 22420.4
8 10837.8 15687.3
9 0 0
10 0 0
11 0 0 
12 0 0 
13 0 0 
14 0 0

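Half of the data is missing: the output contains expected rows 1, 3, 5, 7, 9, 11, and 13, followed by one row of values that do not appear in the data and six rows of zeros.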

In addition, is the CPP/C version still at a very early stage? There are many TODOs in both the source code and the example code.

t59688, Sep 19 '24 14:09

The TsFile C++ version is being upgraded and improved continuously, and it needs the help and support of open-source developers. TsFile CPP already supports TsFile V3 but does not yet support TsFile V4, which is the version currently used by IoTDB and is already implemented in the Java version. The issue reported here concerns the C interface that the CPP library exposes to developers, and it will be fixed in the near future.

ColinLeeo, Sep 21 '24 05:09

I think this is a great project. Its position among time series databases is like that of SQLite among relational databases. A lightweight, in-process, self-sufficient, serverless, zero-configuration time series data file engine is very useful in many scenarios.

t59688, Sep 21 '24 15:09