Setting Up Text Generation WebUI (No AVX Required)

Step-by-step guide to installing Text Generation WebUI on systems without AVX2 support, with setup for non-AVX CPUs.

Setting Up Text Generation WebUI (No AVX Required)

Text Generation WebUI is a great alternative to LM Studio that offers non-AVX builds, making it compatible with older CPUs. If you’re not sure whether your CPU has AVX2, see my short explainer on what AVX is and why it matters for LLM runtimes. For a wider comparison with Ollama, LM Studio and LocalAI, see the local LLM ecosystem overview. There are several installation options available:

  1. Visit the official GitHub repository

  2. Download the installer that specifies “Non-AVX” support:

    • For Windows: oobabooga-windows-noavx.zip
    • For Linux: oobabooga-linux-noavx.zip
  3. Extract the zip file and run:

    • Windows: start_windows.bat
    • Linux: start_linux.sh

Option-2 Manual Installation

If you prefer manual installation:

 1# Clone the repository
 2git clone https://github.com/oobabooga/text-generation-webui
 3cd text-generation-webui
 4
 5# Create and activate virtual environment
 6python -m venv venv
 7source venv/bin/activate  # On Windows: venv\Scripts\activate
 8
 9# Install with the noavx flag
10pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu/torch_stable.html --prefer-binary --no-cache-dir
11
12# Start the UI
13python server.py --listen --no-download-loader

Running Gemma 2 2b

  1. Launch the web interface (it should open at http://127.0.0.1:7860)
  2. Go to the “Model” tab
  3. Select “Download model or LoRA”
  4. In the “Hugging Face Hub model” field, enter: google/gemma-2-2b-it
  5. Click “Download”
  6. After downloading, select the model from the dropdown and click “Load”

Configuration Options

For better performance on CPUs without AVX2:

  1. In the “Parameters” tab:

    • Set Context Length to a lower value (like 2048)
    • Enable “8-bit” or “4-bit” quantization
    • Use “cpu” as the Inference Device
  2. In the “Session” tab:

    • Set “Instructions template” to match Gemma 2

For a deeper look at what those settings actually do — context window, embedding size, temperature and quantization — see my hyperparameters guide.

Alternative Models

If Gemma 2 2b is still too demanding, try these smaller models that work better on older CPUs:

  • TinyLlama (1.1B parameters)
  • Phi-2 (2.7B parameters)
  • Mistral 7B in 4-bit quantization

Conclusion

In this guide, we covered the installation and configuration of Text Generation WebUI without requiring AVX support. By following the outlined steps, you should be able to set up the web interface and run various models on older CPUs. Remember to explore alternative models if you encounter performance issues with larger ones. Happy experimenting!