For relative comparison sure but isn’t it starting to become irrelevant these days? No Vulkan nor Raytracing nor Compute.
That’s because they assume DLSS 3.5 usage, which isn’t too unfair but is a bit misleading.
Alright, thanks for the info & additional pointers.
Ah thank you for the trove of information. What would be the best general knowledge model according to you?
Don’t be sorry, you’re being so helpful, thank you a lot.
I finally replicated your config:
localhost/koboldcpp:v1.43 --port 80 --threads 4 --contextsize 8192 --useclblas 0 0 --smartcontext --ropeconfig 1.0 32000 --stream "/app/models/mythomax-l2-kimiko-v2-13b.Q5_K_M.gguf"
And had satisfying results! The performance of LLaMA2 really is nice to have here as well.
Thanks a lot for your input. It’s a lot to stomach but very descriptive which is what I need.
I run this Koboldcpp in a container.
What I ended up doing, which was semi-working, is:
--model "/app/models/mythomax-l2-13b.ggmlv3.q5_0.bin" --port 80 --stream --unbantokens --threads 8 --contextsize 4096 --useclblas 0 0
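For reference, a container launch along those lines would look something like this. The runtime (podman), image tag, and host model directory are my assumptions; only the KoboldCpp flags come from the line above:

```shell
# Hypothetical container launch: podman, the image tag, and the host model
# directory are assumptions, adjust to your setup. The KoboldCpp flags
# themselves match the semi-working config above: UI on port 80, the GGML
# model mounted read-only under /app/models.
podman run --rm -p 80:80 \
  -v "$PWD/models:/app/models:ro" \
  localhost/koboldcpp:v1.43 \
  --model "/app/models/mythomax-l2-13b.ggmlv3.q5_0.bin" \
  --port 80 --stream --unbantokens --threads 8 \
  --contextsize 4096 --useclblas 0 0
```

With the port published like this, the Kobold Lite UI should then be reachable at http://localhost:80 on the host.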
In the Koboldcpp UI, I set max response tokens to 512, switched to an Instruction/Response format, and kept prompting the MythoMax model with “continue the writing”.
But I’ll be re-checking your way of doing it, because the SuperCOT model seemed less streamlined but higher-quality in its story writing.
Can confirm it’s the same on Proton / Linux. This game continues to be a joke on the technical side.
My bad. I think I confused this with the previous popular Unigine benchmarks.