polkovnikov
@prusnak Just AVX + SSE3. Also, I don't even have F16C; I checked that in cpuinfo. `system_info: n_threads = 8 / 8 | AVX = 1 | AVX2 = 0...
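For anyone who wants to repeat the cpuinfo check, here is a minimal sketch that assumes an x86 Linux machine where `/proc/cpuinfo` has a `flags` line. The helper names `cpu_flags` and `simd_support` are made up for illustration and are not part of llama.cpp; the output format only loosely imitates the `system_info` line.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 Linux the feature list lives on the line whose key is "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(flags):
    """Map the SIMD features discussed here to True/False."""
    return {
        "AVX": "avx" in flags,
        "AVX2": "avx2" in flags,
        "F16C": "f16c" in flags,
        "FMA": "fma" in flags,
        # /proc/cpuinfo lists SSE3 under the name "pni".
        "SSE3": "pni" in flags,
    }


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable: report nothing found
    for name, present in simd_support(cpu_flags(text)).items():
        print(f"{name} = {int(present)}")
```

On a CPU like the one described above, this should report AVX and SSE3 as present and AVX2 and F16C as absent.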
@misutoneko I've tried on both of my laptops. Both were bought around 2009-2012. The other one, with 2 cores (2 hardware threads), also gives very slow results,...
@gjmulder @xiliuya I have the reported issue on my CPU. Apparently it has AVX, but no F16C (and no AVX2). I have a quite old, 10-15 year old Intel CPU...
@xiliuya You get two different answers for three reasons: 1) LLaMA uses a random seed value each time you run it, hence it may produce different results even with the same program. 2)...
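To illustrate the seed point only, here is a toy sketch and not LLaMA's actual sampler. The vocabulary, weights, and function name `sample_tokens` are invented for the example. The idea is that sampling with a fixed seed is repeatable, while an unseeded generator usually gives a different sequence on each run.

```python
import random

# Toy "vocabulary" and sampling weights, made up for this illustration.
VOCAB = ["yes", "no", "maybe", "unknown"]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]


def sample_tokens(seed=None, n=8):
    """Draw n tokens; seed=None lets Python pick a fresh seed on each call."""
    rng = random.Random(seed)
    return rng.choices(VOCAB, weights=WEIGHTS, k=n)


if __name__ == "__main__":
    # Same seed, same sequence: these two lines print identical lists.
    print(sample_tokens(seed=42))
    print(sample_tokens(seed=42))
    # No seed: the sequence will usually differ between runs.
    print(sample_tokens())
```

So when comparing two builds, fixing the seed on both runs removes this source of variation from the comparison.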
@xiliuya Turning on AVX changes only the speed of computation, not the quality of the answer. I don't know why the AVX version is 1.5 times slower in your case....
@xiliuya The problem with your last patch is that it TOTALLY removes the use of AVX or any other SIMD, because if you don't define the GGML_SIMD macro then only...
@beaclnd92 The problem with your solution is that it just enables the F16C feature of the CPU. But my old CPU has only AVX and no F16C. So your solution works for...
@wong2 Sure, `I have polynomial f(x) with integer only coefficients. And I want to know if it is a perfect square, meaning that exist another integer polynomial g(x) that is...
@indygreg The feature request above suggests incorporating support for isolated sub-interpreters. This is an experimental feature that must be enabled separately, via `./configure --with-experimental-isolated-subinterpreters` when compiling CPython. Or...
@indygreg Yes, if it is not too difficult, it would be great to add both kinds of options: 1) So that it is possible to do `python build.py --configure-flag=--with-experimental-isolated-subinterpreters=123`. 2)...