PR by Xuan-Son Nguyen for `llama.cpp`:

> This PR provides a big jump in speed for WASM by leveraging SIMD instructions for `qX_K_q8_K` and `qX_0_q8_0` dot product functions.
>
> …
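For readers who have not seen a quantized dot product before, here is a minimal scalar sketch in C of the kind of function the PR description refers to. It is not the code from `llama.cpp`: the struct `block_q8_sketch`, the function `dot_q8_sketch`, the block size of 32, and the plain `float` scale are all assumptions made for illustration, and the real formats differ in layout details.

```c
#include <stddef.h>
#include <stdint.h>

/* Values per quantized block (an assumption for this sketch). */
#define BLOCK_SIZE 32

/* Hypothetical simplified block: one scale plus BLOCK_SIZE signed 8-bit values. */
typedef struct {
    float  scale;
    int8_t qs[BLOCK_SIZE];
} block_q8_sketch;

/* Scalar reference: dot product of two rows, each made of n_blocks blocks. */
static float dot_q8_sketch(const block_q8_sketch *a,
                           const block_q8_sketch *b,
                           size_t n_blocks) {
    float sum = 0.0f;
    for (size_t i = 0; i < n_blocks; i++) {
        /* Integer multiply-accumulate over one block. */
        int32_t acc = 0;
        for (int j = 0; j < BLOCK_SIZE; j++) {
            acc += (int32_t)a[i].qs[j] * (int32_t)b[i].qs[j];
        }
        /* Rescale the integer sum back to floating point. */
        sum += a[i].scale * b[i].scale * (float)acc;
    }
    return sum;
}
```

The inner loop is where a SIMD version would presumably gain its speed: instead of one multiply-accumulate per iteration, vector instructions process several 8-bit lanes at once.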
I’m still impressed because it was able to look at the existing solution, recognize a bottleneck, and write the code to address it. Most code is very boring; you don’t need genius solutions in it. This could also be a huge help for developers: you could have it analyze code and suggest where improvements can be made. In some cases that could be faster than profiling.