Releases: dougeeai/llama-cpp-python-wheels
llama-cpp-python 0.3.16 + CUDA 13.0 sm75 Turing - Python 3.13 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.13.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA GeForce RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060 12GB, RTX 2060, TITAN RTX, GeForce RTX 2080 Super Laptop, RTX 2080 Super Max-Q, RTX 2080 Laptop, RTX 2080 Max-Q, RTX 2070 Super Laptop, RTX 2070 Super Max-Q, RTX 2070 Laptop, RTX 2070 Max-Q, RTX 2060 Laptop, RTX 2060 Max-Q, GeForce GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, GTX 1660 Ti Laptop, GTX 1660 Ti Max-Q, GTX 1650 Ti Laptop, GTX 1650 Ti Max-Q, GTX 1650 Laptop, GTX 1650 Max-Q, GTX 1630 Laptop, Quadro RTX 8000, RTX 6000, RTX 5000, RTX 4000, RTX 3000, Quadro T2000, T1200, T1000, T600, T550, T500, T400, Tesla T40, T10, T4
- Architecture: Turing (sm_75)
- VRAM: 4GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm75.turing-cp313-cp313-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
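Because the interpreter tag in the wheel filename (`cp313`) must match your Python exactly, it can be worth checking before downloading. A minimal stdlib-only sketch (the filename below is this release's wheel; the helper name is illustrative):

```python
import sys

def wheel_python_tag(wheel_name: str) -> str:
    """Extract the Python tag (e.g. 'cp313') from a wheel filename.

    Wheel filenames follow: name-version-pythontag-abitag-platformtag.whl
    """
    stem = wheel_name[:-len(".whl")]
    return stem.split("-")[-3]

wheel = "llama_cpp_python-0.3.16+cuda13.0.sm75.turing-cp313-cp313-win_amd64.whl"
tag = wheel_python_tag(wheel)
expected = f"cp{sys.version_info.major}{sys.version_info.minor}"
print(tag, "matches" if tag == expected else "does not match", expected)
```

If the tags differ, pip will refuse the wheel with a "not a supported wheel on this platform" error rather than installing a broken package.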
Keywords
llama-cpp-python, CUDA 13.0, Python 3.13, Windows, RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060, TITAN RTX, GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, Quadro RTX 8000, Quadro RTX 6000, Quadro RTX 5000, Quadro RTX 4000, Tesla T4, Turing, sm75, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 13.0 sm75 Turing - Python 3.12 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.12.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA GeForce RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060 12GB, RTX 2060, TITAN RTX, GeForce RTX 2080 Super Laptop, RTX 2080 Super Max-Q, RTX 2080 Laptop, RTX 2080 Max-Q, RTX 2070 Super Laptop, RTX 2070 Super Max-Q, RTX 2070 Laptop, RTX 2070 Max-Q, RTX 2060 Laptop, RTX 2060 Max-Q, GeForce GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, GTX 1660 Ti Laptop, GTX 1660 Ti Max-Q, GTX 1650 Ti Laptop, GTX 1650 Ti Max-Q, GTX 1650 Laptop, GTX 1650 Max-Q, GTX 1630 Laptop, Quadro RTX 8000, RTX 6000, RTX 5000, RTX 4000, RTX 3000, Quadro T2000, T1200, T1000, T600, T550, T500, T400, Tesla T40, T10, T4
- Architecture: Turing (sm_75)
- VRAM: 4GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm75.turing-cp312-cp312-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Keywords
llama-cpp-python, CUDA 13.0, Python 3.12, Windows, RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060, TITAN RTX, GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, Quadro RTX 8000, Quadro RTX 6000, Quadro RTX 5000, Quadro RTX 4000, Tesla T4, Turing, sm75, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 13.0 sm75 Turing - Python 3.11 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.11.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA GeForce RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060 12GB, RTX 2060, TITAN RTX, GeForce RTX 2080 Super Laptop, RTX 2080 Super Max-Q, RTX 2080 Laptop, RTX 2080 Max-Q, RTX 2070 Super Laptop, RTX 2070 Super Max-Q, RTX 2070 Laptop, RTX 2070 Max-Q, RTX 2060 Laptop, RTX 2060 Max-Q, GeForce GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, GTX 1660 Ti Laptop, GTX 1660 Ti Max-Q, GTX 1650 Ti Laptop, GTX 1650 Ti Max-Q, GTX 1650 Laptop, GTX 1650 Max-Q, GTX 1630 Laptop, Quadro RTX 8000, RTX 6000, RTX 5000, RTX 4000, RTX 3000, Quadro T2000, T1200, T1000, T600, T550, T500, T400, Tesla T40, T10, T4
- Architecture: Turing (sm_75)
- VRAM: 4GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm75.turing-cp311-cp311-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Keywords
llama-cpp-python, CUDA 13.0, Python 3.11, Windows, RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060, TITAN RTX, GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, Quadro RTX 8000, Quadro RTX 6000, Quadro RTX 5000, Quadro RTX 4000, Tesla T4, Turing, sm75, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 13.0 sm75 Turing - Python 3.10 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.10.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA GeForce RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060 12GB, RTX 2060, TITAN RTX, GeForce RTX 2080 Super Laptop, RTX 2080 Super Max-Q, RTX 2080 Laptop, RTX 2080 Max-Q, RTX 2070 Super Laptop, RTX 2070 Super Max-Q, RTX 2070 Laptop, RTX 2070 Max-Q, RTX 2060 Laptop, RTX 2060 Max-Q, GeForce GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, GTX 1660 Ti Laptop, GTX 1660 Ti Max-Q, GTX 1650 Ti Laptop, GTX 1650 Ti Max-Q, GTX 1650 Laptop, GTX 1650 Max-Q, GTX 1630 Laptop, Quadro RTX 8000, RTX 6000, RTX 5000, RTX 4000, RTX 3000, Quadro T2000, T1200, T1000, T600, T550, T500, T400, Tesla T40, T10, T4
- Architecture: Turing (sm_75)
- VRAM: 4GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm75.turing-cp310-cp310-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Keywords
llama-cpp-python, CUDA 13.0, Python 3.10, Windows, RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060, TITAN RTX, GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, Quadro RTX 8000, Quadro RTX 6000, Quadro RTX 5000, Quadro RTX 4000, Tesla T4, Turing, sm75, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 13.0 sm100 Blackwell - Python 3.13 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.13.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA RTX 5090, 5080, 5070 Ti, 5070, 5060 Ti, 5060, 5050, RTX 5090 Laptop, RTX 5080 Laptop, RTX 5070 Ti Laptop, RTX 5070 Laptop, RTX 5060 Laptop, RTX 5050 Laptop, RTX PRO 6000 Blackwell Workstation Edition, RTX PRO 6000 Blackwell Max-Q, RTX PRO 6000 Blackwell Server Edition, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, RTX PRO 4000 SFF Blackwell, RTX PRO 2000 Blackwell, RTX PRO 5000 Blackwell Laptop, RTX PRO 4000 Blackwell Laptop, RTX PRO 3000 Blackwell Laptop, RTX PRO 2000 Blackwell Laptop, RTX PRO 1000 Blackwell Laptop, RTX PRO 500 Blackwell Laptop, B100, B200, B300 (Blackwell Ultra), GB200, GB300
- Architecture: Blackwell (sm_100)
- VRAM: 8GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm100.blackwell-cp313-cp313-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Keywords
llama-cpp-python, CUDA 13.0, Python 3.13, Windows, RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, RTX 5060, RTX 5050, RTX PRO 6000 Blackwell, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, B100, B200, B300, Blackwell Ultra, GB200, GB300, Blackwell, sm100, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 13.0 sm100 Blackwell - Python 3.12 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.12.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA RTX 5090, 5080, 5070 Ti, 5070, 5060 Ti, 5060, 5050, RTX 5090 Laptop, RTX 5080 Laptop, RTX 5070 Ti Laptop, RTX 5070 Laptop, RTX 5060 Laptop, RTX 5050 Laptop, RTX PRO 6000 Blackwell Workstation Edition, RTX PRO 6000 Blackwell Max-Q, RTX PRO 6000 Blackwell Server Edition, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, RTX PRO 4000 SFF Blackwell, RTX PRO 2000 Blackwell, RTX PRO 5000 Blackwell Laptop, RTX PRO 4000 Blackwell Laptop, RTX PRO 3000 Blackwell Laptop, RTX PRO 2000 Blackwell Laptop, RTX PRO 1000 Blackwell Laptop, RTX PRO 500 Blackwell Laptop, B100, B200, B300 (Blackwell Ultra), GB200, GB300
- Architecture: Blackwell (sm_100)
- VRAM: 8GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm100.blackwell-cp312-cp312-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Keywords
llama-cpp-python, CUDA 13.0, Python 3.12, Windows, RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, RTX 5060, RTX 5050, RTX PRO 6000 Blackwell, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, B100, B200, B300, Blackwell Ultra, GB200, GB300, Blackwell, sm100, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 13.0 sm100 Blackwell - Python 3.11 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.11.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA RTX 5090, 5080, 5070 Ti, 5070, 5060 Ti, 5060, 5050, RTX 5090 Laptop, RTX 5080 Laptop, RTX 5070 Ti Laptop, RTX 5070 Laptop, RTX 5060 Laptop, RTX 5050 Laptop, RTX PRO 6000 Blackwell Workstation Edition, RTX PRO 6000 Blackwell Max-Q, RTX PRO 6000 Blackwell Server Edition, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, RTX PRO 4000 SFF Blackwell, RTX PRO 2000 Blackwell, RTX PRO 5000 Blackwell Laptop, RTX PRO 4000 Blackwell Laptop, RTX PRO 3000 Blackwell Laptop, RTX PRO 2000 Blackwell Laptop, RTX PRO 1000 Blackwell Laptop, RTX PRO 500 Blackwell Laptop, B100, B200, B300 (Blackwell Ultra), GB200, GB300
- Architecture: Blackwell (sm_100)
- VRAM: 8GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm100.blackwell-cp311-cp311-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Keywords
llama-cpp-python, CUDA 13.0, Python 3.11, Windows, RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, RTX 5060, RTX 5050, RTX PRO 6000 Blackwell, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, B100, B200, B300, Blackwell Ultra, GB200, GB300, Blackwell, sm100, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 13.0 sm100 Blackwell - Python 3.10 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 13.0 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.10.x (exact minor version required)
- CUDA: 13.0 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 580 or higher
- GPU: NVIDIA RTX 5090, 5080, 5070 Ti, 5070, 5060 Ti, 5060, 5050, RTX 5090 Laptop, RTX 5080 Laptop, RTX 5070 Ti Laptop, RTX 5070 Laptop, RTX 5060 Laptop, RTX 5050 Laptop, RTX PRO 6000 Blackwell Workstation Edition, RTX PRO 6000 Blackwell Max-Q, RTX PRO 6000 Blackwell Server Edition, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, RTX PRO 4000 SFF Blackwell, RTX PRO 2000 Blackwell, RTX PRO 5000 Blackwell Laptop, RTX PRO 4000 Blackwell Laptop, RTX PRO 3000 Blackwell Laptop, RTX PRO 2000 Blackwell Laptop, RTX PRO 1000 Blackwell Laptop, RTX PRO 500 Blackwell Laptop, B100, B200, B300 (Blackwell Ultra), GB200, GB300
- Architecture: Blackwell (sm_100)
- VRAM: 8GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda13.0.sm100.blackwell-cp310-cp310-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Tested Configuration
- Built on: Windows 11
- Build date: November 9, 2025
- llama-cpp-python version: 0.3.16
- Build flags: CMAKE_CUDA_ARCHITECTURES=100
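For anyone who needs a variant not covered by these releases, the tested configuration above implies a source build along these lines. This is a hedged sketch, not the exact build script used here: it requires Visual Studio and the CUDA Toolkit installed locally, which is precisely what the prebuilt wheel lets you skip. `CMAKE_ARGS` with `-DGGML_CUDA=on` is llama-cpp-python's documented mechanism for enabling CUDA during a pip source build.

```shell
:: Windows cmd: build llama-cpp-python 0.3.16 from source for Blackwell (sm_100)
set CMAKE_ARGS=-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=100
pip install llama-cpp-python==0.3.16 --force-reinstall --no-cache-dir --no-binary llama-cpp-python
```

For other architectures, change `CMAKE_CUDA_ARCHITECTURES` to the matching compute capability (for example `75` for the Turing wheels in this repository).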
Keywords
llama-cpp-python, CUDA 13.0, Python 3.10, Windows, RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, RTX 5060, RTX 5050, RTX PRO 6000 Blackwell, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, B100, B200, B300, Blackwell Ultra, GB200, GB300, Blackwell, sm100, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 12.1 sm75 Turing - Python 3.13 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 12.1 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.13.x (exact minor version required)
- CUDA: 12.1 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 525.60.13 or higher
- GPU: NVIDIA GeForce RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060 12GB, RTX 2060, TITAN RTX, GeForce RTX 2080 Super Laptop, RTX 2080 Super Max-Q, RTX 2080 Laptop, RTX 2080 Max-Q, RTX 2070 Super Laptop, RTX 2070 Super Max-Q, RTX 2070 Laptop, RTX 2070 Max-Q, RTX 2060 Laptop, RTX 2060 Max-Q, GeForce GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, GTX 1660 Ti Laptop, GTX 1660 Ti Max-Q, GTX 1650 Ti Laptop, GTX 1650 Ti Max-Q, GTX 1650 Laptop, GTX 1650 Max-Q, GTX 1630 Laptop, Quadro RTX 8000, RTX 6000, RTX 5000, RTX 4000, RTX 3000, Quadro T2000, T1200, T1000, T600, T550, T500, T400, Tesla T40, T10, T4
- Architecture: Turing (sm_75)
- VRAM: 4GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda12.1.sm75.turing-cp313-cp313-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
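Note the different driver floors: the CUDA 12.1 wheels need driver 525.60.13 or newer, while the CUDA 13.0 wheels above need 580 or newer. Dotted driver versions must be compared numerically, not as strings. A minimal sketch (read your installed version from `nvidia-smi`; the function name is illustrative):

```python
def driver_at_least(installed: str, minimum: str) -> bool:
    """Compare dotted NVIDIA driver versions numerically,
    so that '528.2' correctly exceeds '525.60.13'."""
    def as_tuple(version: str):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(installed) >= as_tuple(minimum)

print(driver_at_least("581.12", "525.60.13"))  # True
print(driver_at_least("522.06", "525.60.13"))  # False
```

A plain string comparison would get this wrong (e.g. "9" > "525" lexicographically), which is why the components are converted to integers first.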
Keywords
llama-cpp-python, CUDA 12.1, Python 3.13, Windows, RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060, TITAN RTX, GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, Quadro RTX 8000, Quadro RTX 6000, Quadro RTX 5000, Quadro RTX 4000, Tesla T4, Turing, sm75, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10
llama-cpp-python 0.3.16 + CUDA 12.1 sm75 Turing - Python 3.12 - Windows x64
Pre-built llama-cpp-python wheel for Windows with CUDA 12.1 support
Skip the build process entirely. This wheel is compiled and ready to install.
Requirements
- OS: Windows 10/11 64-bit
- Python: 3.12.x (exact minor version required)
- CUDA: 12.1 (Toolkit not needed, just driver)
- Driver: NVIDIA Driver 525.60.13 or higher
- GPU: NVIDIA GeForce RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060 12GB, RTX 2060, TITAN RTX, GeForce RTX 2080 Super Laptop, RTX 2080 Super Max-Q, RTX 2080 Laptop, RTX 2080 Max-Q, RTX 2070 Super Laptop, RTX 2070 Super Max-Q, RTX 2070 Laptop, RTX 2070 Max-Q, RTX 2060 Laptop, RTX 2060 Max-Q, GeForce GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, GTX 1660 Ti Laptop, GTX 1660 Ti Max-Q, GTX 1650 Ti Laptop, GTX 1650 Ti Max-Q, GTX 1650 Laptop, GTX 1650 Max-Q, GTX 1630 Laptop, Quadro RTX 8000, RTX 6000, RTX 5000, RTX 4000, RTX 3000, Quadro T2000, T1200, T1000, T600, T550, T500, T400, Tesla T40, T10, T4
- Architecture: Turing (sm_75)
- VRAM: 4GB+ recommended
Installation
pip install llama_cpp_python-0.3.16+cuda12.1.sm75.turing-cp312-cp312-win_amd64.whl
What This Solves
- No Visual Studio required
- No CUDA Toolkit installation needed
- No compilation errors
- No "No CUDA toolset found" issues
- Works immediately with GGUF models
Keywords
llama-cpp-python, CUDA 12.1, Python 3.12, Windows, RTX 2080 Ti, RTX 2080 Super, RTX 2080, RTX 2070 Super, RTX 2070, RTX 2060 Super, RTX 2060, TITAN RTX, GTX 1660 Ti, GTX 1660 Super, GTX 1660, GTX 1650 Super, GTX 1650, GTX 1630, Quadro RTX 8000, Quadro RTX 6000, Quadro RTX 5000, Quadro RTX 4000, Tesla T4, Turing, sm75, GGUF, llama.cpp, no compilation, prebuilt wheel, Windows 11, Windows 10