As others have already mentioned, try qwen2.5-coder. With 16 GB, you should be able to comfortably fit a quantised version of the 14b variant into VRAM. You can also try the 32b variant, but it will be much slower because not all of its layers can be offloaded to the GPU, so the rest have to run on the CPU.
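If you want to sanity-check the sizing yourself, here is a rough back-of-the-envelope sketch. The numbers in it are my assumptions, not measurements: about 4.5 bits per weight for a typical 4-bit quantisation, nominal parameter counts of 14 and 32 billion, and a guessed 2 GB of headroom for the KV cache and runtime buffers. The function names are made up for illustration.

```python
def weight_gb(params_billion, bits_per_weight=4.5):
    """Approximate size of the quantised weights in GB (10^9 bytes).

    bits_per_weight=4.5 is an assumed average for a 4-bit quant,
    not an exact figure for any particular file.
    """
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion, vram_gb=16.0, overhead_gb=2.0,
                 bits_per_weight=4.5):
    """True if the weights plus an assumed overhead fit in VRAM.

    overhead_gb is a guess covering KV cache and buffers; it grows
    with context length.
    """
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


for size in (14, 32):
    print(f"{size}b: ~{weight_gb(size):.1f} GB of weights, "
          f"fits in 16 GB: {fits_in_vram(size)}")
```

Under these assumptions the 14b weights come to roughly 8 GB and the 32b weights to roughly 18 GB, which is why the larger model ends up split between GPU and CPU on a 16 GB card.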