Learn how to run powerful uncensored language models completely offline on affordable hardware for enhanced privacy and unrestricted access to information.
Introduction
Welcome to the Global Science Network! I’m going to show you how to download and run a large language model that was trained on what would be equivalent to:
- Reading 127 million novels
- Reading through all of Wikipedia 2,500 times
The best part? This model can be downloaded and run on an external flash drive that costs around $12. The model only requires about 10GB of storage space.
Why Run Local LLMs
Running uncensored, offline LLMs offers two major advantages:
1. Unrestricted Access to Information
A local model lets you reach information that is otherwise hard to find. Some countries and tech companies limit and censor internet content, but an offline model is not subject to those filters.
Important note: The quality of output depends on what data the model was trained on. There could still be inherent biases from that training data, whichever company (Meta, OpenAI, Google, Anthropic, xAI, or DeepSeek) built the model.
While LLMs don’t provide absolute truth and can produce incorrect results, they are excellent tools for:
- Finding information quickly
- Summarizing information
- Converting thoughts into usable code
2. Enhanced Privacy
When a model runs offline, your prompts and its responses never leave your machine, so tech companies and governments cannot monitor what you ask. This provides genuine privacy while still allowing access to powerful AI capabilities.
Offline models are particularly valuable when working with:
- Proprietary information
- Classified data
- Personal information
These models can even be further trained based on your specific needs, making them uniquely suited to your requirements over time.
Understanding the Model Architecture
The Dolphin Llama 3 model comes in two versions:
- 8 billion parameter model (~5GB storage)
- 70 billion parameter model (~40GB storage)
Both build on Llama 3 base models that Meta trained on 15 trillion tokens (about 60 terabytes of raw text data).
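To get a feel for these numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in this article and assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes). The bits-per-parameter result is an inference from the quoted download sizes, not an official specification.

```python
# Back-of-the-envelope checks using only figures quoted in the article.
TRAINING_TOKENS = 15e12    # 15 trillion tokens
RAW_TEXT_BYTES = 60e12     # about 60 terabytes (decimal units assumed)
NOVELS = 127e6             # "127 million novels"
WIKIPEDIA_PASSES = 2_500   # "all of Wikipedia 2,500 times"

bytes_per_token = RAW_TEXT_BYTES / TRAINING_TOKENS
tokens_per_novel = TRAINING_TOKENS / NOVELS
tokens_per_wikipedia = TRAINING_TOKENS / WIKIPEDIA_PASSES


def bits_per_parameter(file_bytes, parameters):
    """Average number of bits stored on disk per model parameter."""
    return file_bytes * 8 / parameters


bits_8b = bits_per_parameter(5e9, 8e9)      # ~5GB file, 8 billion parameters
bits_70b = bits_per_parameter(40e9, 70e9)   # ~40GB file, 70 billion parameters

print(f"{bytes_per_token:.1f} bytes of raw text per token")
print(f"{tokens_per_novel:,.0f} tokens per novel")
print(f"{tokens_per_wikipedia:,.0f} tokens per pass through Wikipedia")
print(f"8B model: {bits_8b:.1f} bits per parameter")
print(f"70B model: {bits_70b:.1f} bits per parameter")
```

Both download sizes work out to roughly 4.5 to 5 bits per parameter, which suggests the files are stored in a compressed (quantized) form rather than as full 16-bit weights.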
The 8 billion parameter model consists of:
- 32 transformer layers with self-attention and feed-forward networks
- Self-attention components using 4096×4096 weight matrices
- Each layer containing 67.1 million parameters for attention (totaling 2.15 billion parameters across all 32 layers)
- Feed-forward networks with large weight matrices to expand and refine token representations
- Normalization layers (RMSNorm in Llama 3) to stabilize training
- Token embeddings and rotary positional encodings to help the model understand meaning and word order
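The attention figures in the list above can be reproduced with a few lines of arithmetic. This sketch assumes a model width of 4096 and four square projection matrices per layer (query, key, value, output), which is the simplification that yields the 67.1 million figure. The published Llama 3 configuration uses grouped-query attention, which makes the key and value matrices smaller, so treat this as an illustration of the article's numbers rather than an exact count.

```python
HIDDEN_SIZE = 4096   # assumed model width; reproduces the 67.1M figure
NUM_LAYERS = 32      # transformer layers in the 8B model
PROJECTIONS = 4      # query, key, value, and output matrices


def attention_params_per_layer(hidden_size, projections=PROJECTIONS):
    """Parameters in one layer's attention block, assuming square matrices."""
    return projections * hidden_size * hidden_size


per_layer = attention_params_per_layer(HIDDEN_SIZE)
total = per_layer * NUM_LAYERS

print(f"per layer:  {per_layer:,}")   # 67,108,864
print(f"all layers: {total:,}")       # 2,147,483,648
```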
Step-by-Step Installation Guide
Requirements
- Computer with sufficient RAM (8GB+ recommended)
- 128GB USB 3.0 flash drive (for portable usage)
- Internet connection (for initial download only)
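Before downloading anything, it can help to confirm the target drive has enough room. The following is a minimal sketch using Python's standard library. The 10GB default mirrors the storage figure mentioned in the introduction, and the function name is just an illustrative choice.

```python
import shutil


def has_room_for_model(path, required_gb=10.0):
    """Return True if the drive holding `path` has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1e9   # decimal gigabytes


# Check the current directory; pass your flash drive's path instead.
print(has_room_for_model(".", required_gb=10.0))
```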
Step 1: Download Ollama
- Go to Ollama.com
- Navigate to the Models tab
- Search for “dolphin”
- Select the Dolphin Llama 3 Model
- Click Download and run the executable
Step 2: Pull the Model
- Open two terminals (PowerShell terminals if on Windows)
- In the first terminal, type:
ollama serve
- In the second terminal, copy the run command from Ollama.com and paste it
- Wait for the model to download (may take a few minutes)
- When finished, press Ctrl+D in the second terminal to exit the chat, then Ctrl+C in the first terminal to stop the server
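If you would rather confirm the download from a script, a running Ollama server lists its installed models at the /api/tags endpoint on its default local address, http://localhost:11434. Here is a small sketch, assuming the server from the first terminal is still running when you call `list_local_models()`. The sample JSON is hand-written for illustration and is not real server output.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"   # Ollama's default local address


def parse_model_names(tags_json):
    """Extract model names from the JSON body returned by GET /api/tags."""
    payload = json.loads(tags_json)
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url=OLLAMA_URL, timeout=5.0):
    """Ask a running `ollama serve` which models are installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(resp.read().decode("utf-8"))


# Offline demonstration with a hand-written sample response.
sample = '{"models": [{"name": "dolphin-llama3:latest"}]}'
print(parse_model_names(sample))
```

With the server running, `list_local_models()` should include the Dolphin model once the pull has finished.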
Step 3: Test the Model
- Open two new terminals (don’t run as administrator, as this might activate censorship)
- In terminal one, enter:
ollama serve
- In terminal two, enter:
ollama run dolphin-llama3
- Test with a query that would typically be censored to ensure it’s working as expected
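The same test can be sent from code through Ollama's local REST API instead of the interactive prompt. This is a sketch under a few assumptions: the server is listening on the default address, and the model name passed in matches the one in the run command you copied from the model page. Setting `"stream": False` asks for the whole answer in a single JSON reply.

```python
import json
import urllib.request


def build_generate_request(model, prompt, base_url="http://localhost:11434"):
    """Build a POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, prompt, timeout=300.0):
    """Send one prompt to a running server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Building the request needs no server; calling ask(...) does.
req = build_generate_request("dolphin-llama3", "Say hello in one sentence.")
print(req.get_method(), req.full_url)
```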
Step 4: Transfer to External Drive
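- Format a 128GB USB flash drive using the NTFS file system (unlike FAT32, it allows files larger than 4GB)
- Locate the Ollama files on your system (on Windows, the models are typically in the .ollama folder inside your user profile, for example C:\Users\<username>\.ollama)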
- Verify one of the model files is around 4.5GB
- Copy the Ollama folder to your external drive
- Find and copy the base Ollama server program files as well
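The reason for NTFS is that FAT32 cannot store a single file of 4 GiB or more, and the main model file is larger than that. The sketch below scans a folder for files that would not fit on a FAT32 drive. The function name is illustrative, and the demo uses a temporary folder with a tiny file and an artificially low limit so it runs anywhere.

```python
import tempfile
from pathlib import Path

FAT32_MAX_FILE_BYTES = 4 * 1024**3 - 1   # largest single file FAT32 can hold


def files_over_limit(folder, limit=FAT32_MAX_FILE_BYTES):
    """Return (path, size) pairs for files larger than `limit`, biggest first."""
    oversized = []
    for path in Path(folder).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                oversized.append((path, size))
    return sorted(oversized, key=lambda item: item[1], reverse=True)


# Demo: a 10-byte file checked against a 5-byte limit.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "blob.bin").write_bytes(b"x" * 10)
    demo_hits = files_over_limit(demo_dir, limit=5)
    print([(p.name, size) for p, size in demo_hits])
```

Pointing `files_over_limit` at your Ollama folder with the default limit should flag the large model file, which is why the drive needs NTFS.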
Step 5: Run from External Drive
- Open two PowerShell terminals
- In the first terminal:
  - Change directory to the external drive
  - Set the OLLAMA_MODELS environment variable to the models folder on the drive
  - Start the server with the serve command
- In the second terminal:
  - Change directory to the external drive
  - Run the program with:
ollama.exe run dolphin-llama3
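The first-terminal steps can also be scripted. Ollama reads the OLLAMA_MODELS environment variable to decide where its models live, so the idea is to point that variable at the drive before starting the server. This sketch assumes a hypothetical layout where the drive is E:\ and both the program and the models sit under E:\ollama, so adjust the paths to match what you copied.

```python
import os
from pathlib import PureWindowsPath


def serve_launch_plan(drive_root, base_env=None):
    """Return (command, env) for starting `ollama serve` from an external drive."""
    root = PureWindowsPath(drive_root)
    # Copy the current environment unless a specific one is supplied.
    env = dict(os.environ if base_env is None else base_env)
    # Hypothetical folder layout on the drive; change to match your copy.
    env["OLLAMA_MODELS"] = str(root / "ollama" / "models")
    command = [str(root / "ollama" / "ollama.exe"), "serve"]
    return command, env


# Demo with an empty base environment so the output is easy to read.
command, env = serve_launch_plan("E:\\", base_env={})
print(command)
print(env["OLLAMA_MODELS"])

# To actually start the server on Windows, use the default environment:
#   import subprocess
#   command, env = serve_launch_plan("E:\\")
#   subprocess.Popen(command, env=env)
```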
Using a Better Interface: AnythingLLM
The terminal interface works, but AnythingLLM provides a much better experience:
- Run the Ollama server from PowerShell as before
- Download AnythingLLM from anythingllm.com
- During installation, set the path to your external drive
- Create an .env file in the AnythingLLM folder with the correct model path
- Start the program and select:
  - Ollama
  - “Run LLMs on your machine”
  - Model: dolphin-llama3:latest
The AnythingLLM interface allows you to:
- Upload documents
- Get more user-friendly responses
- Switch between different AI models
- Customize workspace settings
Additional Resources
You can also use other interfaces to run offline models:
- GPT4All
- LM Studio
- Open WebUI
Conclusion
You now have access to a powerful AI model trained on the equivalent of 127 million novels’ worth of data, running completely offline on affordable hardware. This provides both unrestricted access to information and enhanced privacy.
In future videos, I’ll explore building a low-cost companion robot with a mix of LLMs and hardware-based neural networks. Stay tuned for more content about hardware-based neural networks!
Need More Power?
Running large 70B models locally can be demanding on consumer hardware. If you need cloud GPU power, DigitalOcean GPU droplets are one option: you can get started with free credits and scale up when you need it.
If you have questions or suggestions about other AI models to run, please let me know in the comments.










