pl752

Results: 15 comments by pl752

!unstale What about introducing configurations specific to DeepSeek LLMs and adding CMake flags to enable building them, so the default configuration build list doesn't get bloated, or adding flags for...

Okay, thank you for the reply. I will try to investigate further and give updates on this weird behaviour.

Here are some backtraces for some of the interesting threads, taken with debug symbols for the amdhsa and HIP runtimes and with a RelWithDebInfo build. rocgdb bt output ``` slot launch_slot_: id 0...

Also, libomp.so is missing.

Unfortunately, I haven't gotten around to running benchmarks yet; however, the changes resulted in a significant reduction in CPU time and allocations in application performance profiling runs. I will try to perform...

Also, I agree that the auth part is a case of over-optimization and can be omitted. I just applied the change pattern to everything that allocates temporary buffers, and I have got...

Upd: I ran the Perf thing I found in the solution (idk if it is at all representative). And yeah, the speed difference is pretty negligible; however, the reduction in allocations...

Upd2: Ran tests with Firebird 3 (not embedded), so it still needs further testing with other versions (especially embedded and batch operations in modern FB). There was an issue with...

Upd3: Performed tests with the embedded engine; all passed.

Upd4: TLDR: I wrote some benchmarks specific to my (unfortunately private) solution's queries. Changes in query execution timing are sometimes hard to register because the FB3 engine is the main bottleneck...