Welcome to the Guide. This page collects everything you need to work through all three projects covered on this site.
Phase I—The Core Foundation
Command Execution Primer
Before launching into the commands for Docker and WSL, a user needs to know where to type them and how to ensure they have the necessary permissions.
Where to Run Your Commands
All commands in this guide should be executed in a command-line interface (CLI) with elevated privileges (Run as Administrator).
| Operating System | Recommended Tool | Action |
| --- | --- | --- |
| Windows | PowerShell or Windows Terminal | Search for the tool, then right-click and select “Run as Administrator.” This is required for the `wsl --install` command. |
| Linux / macOS | Terminal / Bash | Standard Terminal access is fine, but commands that require elevated privileges, such as the `curl` install script for Ollama, will prompt for your `sudo` password. |
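On Linux and macOS, the rule of thumb is: run the command as-is first, and add `sudo` only when it fails with a permissions error. The sketch below illustrates the idea with a hypothetical helper, `maybe_sudo` (not a standard utility), that prepends `sudo` whenever the caller is not root:

```bash
# Sketch: prepend sudo when the caller is not root.
# maybe_sudo is a hypothetical helper name, not a standard utility.
maybe_sudo() {
  uid="$1"; shift            # pass the numeric user ID explicitly for clarity
  if [ "$uid" -eq 0 ]; then
    echo "$*"                # already root: run the command as-is
  else
    echo "sudo $*"           # otherwise, elevate with sudo
  fi
}

# Print the command that would be run for the current user:
maybe_sudo "$(id -u)" apt-get update
```

On Windows, there is no `sudo` equivalent in this workflow; elevation happens by launching the whole terminal "as Administrator" instead.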
1. Hardware Check: Your Local Lifeline
| Component | Recommendation | Minimum (for 7B models) |
| --- | --- | --- |
| RAM (System Memory) | 32 GB | 16 GB |
| Storage (SSD) | 1 TB NVMe SSD | 500 GB SSD |
| GPU (Graphics Card) | 8 GB VRAM or more | 6 GB VRAM |
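If you want to sanity-check a model against your hardware before downloading it, a common approximation is: VRAM in GB ≈ parameters (billions) × bits per weight ÷ 8, plus roughly 20% overhead for activations and context cache. The 20% figure and the `est_gb` helper name are my assumptions for illustration, not exact numbers:

```bash
# Rough VRAM estimate (an approximation, not an exact figure):
#   GB ≈ params_in_billions * bits_per_weight / 8 * 1.2
# est_gb is a hypothetical helper name.
est_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 * 1.2 }'
}

est_gb 7 4    # a 7B model quantized to 4 bits per weight
est_gb 7 16   # the same model at full 16-bit precision
```

This is why a 7B model at 4-bit quantization fits comfortably in 6 GB of VRAM, while the same model at full precision does not.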
2. The Golden Rule: Install Docker First
- Action: Download and install Docker Desktop. (Link to Docker Desktop download page will be placed here.)
Networking & Ports Analogy (Addition to Section D, after Docker)
The terms localhost and ports are crucial to the guide but can be confusing. Using a simple analogy helps demystify the process.
Understanding Ports: Your Computer’s Digital Apartment
When you run software using Docker, you are giving each application its own isolated “apartment” on your computer.
- Your Computer = The Apartment Building: Your local machine.
- `localhost` (or `127.0.0.1`) = The Street Address: This is the universal address that always refers to your own computer.
- The Port (`:3000`, `:8000`) = The Apartment Number: Since one computer can run many applications, the port number tells your web browser which specific application it should connect to. For instance, `http://localhost:3000` means: “Go to my own computer and find the application running in Apartment 3000 (Open WebUI).”
This brief explanation provides the necessary context for why the user is typing different port numbers (3000, 8000, 5000, 8188) throughout the guide.
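You can "knock on an apartment door" yourself to see whether anything is listening on a given port. The sketch below uses bash's built-in `/dev/tcp` feature; `port_status` is a hypothetical helper name, and the technique requires bash (not plain sh):

```bash
# Sketch: check whether anything is listening on a local port,
# using bash's built-in /dev/tcp. port_status is a hypothetical helper.
port_status() {
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    echo "open"      # something accepted the connection
  else
    echo "closed"    # nothing is listening on that port
  fi
}

port_status 3000   # "open" once Open WebUI is running, "closed" before
```

Running this before and after starting a container is a quick way to confirm the service actually came up on the port you expected.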
3. Critical Step for Windows Users: Enable WSL
- Action: Open PowerShell or Command Prompt as an administrator.
- Command:

```bash
wsl --install
```
4. Step 1: Install the AI Engine (Ollama)
- Action: Download the installer. (Link to Ollama download page will be placed here.)
- Linux Command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
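After installing, you can confirm the Ollama service is running by querying its REST API, which listens on port 11434; `/api/tags` lists your installed models. The JSON sample below is illustrative (not captured from a real install), and the `sed` one-liner is a crude stand-in for a proper JSON parser like `jq`:

```bash
# Illustrative sample of Ollama's /api/tags response (not real output):
sample='{"models":[{"name":"llama3:8b","size":4661224676}]}'

# In practice you would fetch it live:
#   curl -s http://localhost:11434/api/tags

# Crude extraction of the first model name (use jq for real parsing):
first_model=$(printf '%s' "$sample" | sed -n 's/.*"name":"\([^"]*\)".*/\1/p')
echo "$first_model"
```

If the live `curl` call returns JSON at all, Ollama is up and ready for the next step.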
5. Step 2: Install the User Interface (Open WebUI)
- Crucial Command (All OS):

```bash
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```
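If you prefer Docker Compose over a long one-line command, the same options map onto a compose file like the sketch below (same image, port mapping, and volume; the service name and file layout are my choices):

```yaml
# docker-compose.yml — a sketch equivalent to the docker run command above
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                    # host port 3000 -> container port 8080
    volumes:
      - open-webui:/app/backend/data   # persist chats and settings
    extra_hosts:
      - "host.docker.internal:host-gateway"  # lets the container reach Ollama on the host
    restart: always

volumes:
  open-webui:
```

Start it with `docker compose up -d` from the folder containing the file; the result is the same container, but the configuration is now version-controllable.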
Open WebUI—The Private Chat Hub
Open WebUI serves as a polished, multi-purpose chat environment for interacting with your local models and managing your system.
| Feature | Description & Key Action | Quick Find Location |
| --- | --- | --- |
| Model Selection | Easily switch between all the models you have downloaded with Ollama (e.g., Llama 3, Mistral, Code Llama). | Model Dropdown Menu (Top of the Chat Window) |
| System Prompts | Set the AI’s core role. Before starting a chat, define a prompt (like “You are a Python programmer,” or “Act as a helpful peer support specialist”) to guide its responses. | System Prompt box (top of the chat window) |
| RAG Management | Set up your long-term memory. This is where you upload private documents or link past conversations to be used as a knowledge base for the AI. | Settings panel (left sidebar) under RAG |
| User/Model Settings | Adjust default settings, change your avatar, and manage the access of other users if you choose to share your instance. | Settings panel (left sidebar) |
Command for CUDA (NVIDIA GPU/Linux/WSL):

```bash
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
```
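Not sure which image tag applies to you? A quick check for the NVIDIA driver tells you whether the `:cuda` variant is worth using. This sketch only prints a suggestion; it makes no changes:

```bash
# Sketch: suggest an image tag based on whether an NVIDIA driver is visible.
if command -v nvidia-smi >/dev/null 2>&1; then
  echo "GPU detected: use ghcr.io/open-webui/open-webui:cuda"
else
  echo "No NVIDIA driver found: use ghcr.io/open-webui/open-webui:main"
fi
```

On Windows, run this inside your WSL distribution; recent NVIDIA drivers expose `nvidia-smi` to WSL automatically.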
6. Step 3: Access and Download Your First Model
- Access: Open your web browser and go to `http://localhost:3000`.
- Download Command (run this in your terminal):

```bash
ollama run llama3:8b
```
Note: This command downloads the 8-billion-parameter version of Llama 3, which is instruction-tuned for great conversational quality.
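Once the model is downloaded, you can also talk to it over Ollama's REST API instead of the CLI, which is exactly what Open WebUI does behind the scenes. The sketch below builds the request body for the `/api/generate` endpoint; the prompt text is my example:

```bash
# Sketch: a request body for Ollama's /api/generate endpoint.
payload='{"model":"llama3:8b","prompt":"Say hello in one sentence.","stream":false}'

# Send it (requires Ollama to be running locally):
#   curl -s http://localhost:11434/api/generate -d "$payload"

echo "$payload"
```

Setting `"stream":false` returns one complete JSON object instead of a stream of partial tokens, which is easier to inspect by hand.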
Section E: Modular Progression—Project A: The Basic Assistant
- Goal Achieved: You have successfully built and deployed Project A: The Basic Assistant.
- Safety Checkpoint: Ensure you have completed the Foundational Security and Digital Hygiene steps (Section C).
💡 NEXT STEPS FOR PERSONALIZATION
Your basic assistant is private and functional, but it currently only has the context of your immediate chat history. If you want to give your AI long-term memory based on your private files or past conversations, you can add Retrieval-Augmented Generation (RAG) at any time.
- Where to find it: We cover the initial RAG setup within Open WebUI in Section G, Step 1 (Persistent Memory via RAG), and advanced tuning in Section H (Advanced Customization).
Section F: Modular Progression—Project B: The Interactive Storyteller
1. The New Interface: SillyTavern
- Action: Install SillyTavern via Docker:

```bash
docker run -d -p 8000:8000 --name sillytavern --restart always ghcr.io/sillytavern/sillytavern:latest
```

- Access: `http://localhost:8000`
SillyTavern — Your Interactive Story Engine
SillyTavern is much more than a simple chatbot interface; it is a dedicated environment built for deep roleplaying and character interaction.
| Feature | Description & Key Action | Quick Find Location |
| --- | --- | --- |
| Character Cards | Define your character. These files contain the core personality, appearance, internal monologue, and dialogue style. You can search online for thousands of pre-made characters. | Character Management panel (left sidebar icon) |
| Lorebooks | The dynamic memory. Allows you to define places, objects, or people. The AI will only use the information when the corresponding keyword is mentioned in the conversation. | Worldbook/Lorebook tab (upper right) |
| Generation Settings | Control the AI’s style. Adjust the temperature (for creativity/randomness), context size (how much chat history the AI remembers), and response length for custom outputs. | Generation Settings tab (upper right) |
| Extensions | Modular add-ons. This is where you connect external tools like TTS (AllTalk) for voice output, Image Generation (ComfyUI), and other utilities. | Extensions menu (right sidebar icon) |
2. Connecting SillyTavern to Ollama
- API Type: OpenAI Compatible
- API Endpoint: `http://localhost:11434/v1`
3. (Optional) Voice Immersion: Text-to-Speech (TTS)
❗ OPTIONAL PROGRAM: This step is not required, but it greatly enhances immersion.
- Action: Install AllTalk V2 via Docker (requires GPU):

```bash
docker run -d --gpus all -p 5000:5000 --name alltalk --restart always erew123/alltalk:latest
```

- Connect to SillyTavern: Use the TTS Extension and enter the AllTalk server address: `http://localhost:5000`
💡 NEXT STEPS FOR FULL COMPANION MODE
You now have a powerful, immersive roleplaying system. If you wish to give your companion:
- A visual 3D avatar that syncs with its voice.
- The ability to generate images based on your character’s stories.
- Secure access from your phone or tablet while away from home.
These features are covered in our final stage.
- Where to find it: We cover all these Companion features in Section G: Modular Progression—Project C: The Full Companion.
4. Optional Deep Dive: Finding and Fine-Tuning TTS Voices
| Topic | Key Action | Resource Suggestions |
| --- | --- | --- |
| Finding Voices | Use short, clean voice clips (8–10 seconds). | Look on Hugging Face for XTTS models or community forums. |
| Fine-Tuning/Cloning | Place a clean .wav file into the alltalk_tts/voices/ folder. | Use Audacity for editing and trimming samples. |
| Advanced Quality | Explore RVC (retrieval-based voice conversion) for superior naturalness. | Look for tutorials on RVC voice models. |
💻 Section G: Modular Progression—Project C: The Full Companion
1. Persistent Memory via RAG (Retrieval Augmented Generation)
- Action (Setting up Memory): Log into Open WebUI and navigate to the RAG (Retrieval Augmented Generation) section in the Settings. Enable RAG and toggle the “Memory/History” setting to include past conversations as a source.
- RAG Pro Tips:
- Chunk documents into 200–500 tokens for optimal retrieval.
- Embedding Models: Use all-MiniLM-L6-v2 for speed or text-embedding-ada-002 for quality.
- Context Window (`num_ctx`): To fix slowdowns, reduce this in Ollama (default is 2048). Increase it only if you need longer chat history, but beware of memory consumption.
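One way to change `num_ctx` persistently is through an Ollama Modelfile, which derives a new model variant from one you already have. The `FROM` and `PARAMETER` directives are standard Modelfile syntax; the variant name `llama3-longctx` is my choice for illustration:

```
# Modelfile — a sketch that raises num_ctx for longer chat history
FROM llama3:8b
PARAMETER num_ctx 4096
```

Build it with `ollama create llama3-longctx -f Modelfile`, then select the new variant in Open WebUI. Expect memory use to grow with the context size.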
2. Contextual Image Generation with ComfyUI
- Action: Install the ComfyUI server via Docker (requires GPU):

```bash
docker run -d --gpus all -p 8188:8188 --name comfyui --restart always -v C:/ComfyUI/models:/app/models -v C:/ComfyUI/output:/app/output ghcr.io/comfyui/comfyui:latest
```

- Integrate with SillyTavern: Enable the Image Generation Extension and set the API Endpoint to `http://localhost:8188`.
Volume Path Warning: The paths starting with C:/ in the command above are for users running Docker Desktop directly on Windows. If you are using WSL, replace C:/ComfyUI/models with a Linux path (e.g., /mnt/c/ComfyUI/models to access your Windows drive or a Linux path like ~/comfyui/models). Failure to correctly specify this path will result in the container not being able to find your models.
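Inside WSL, the bundled `wslpath` utility performs this conversion for you. For readers who want to see what the mapping actually does, here is a pure-string sketch; `win_to_wsl` is a hypothetical stand-in handling only simple drive-letter paths, not a replacement for `wslpath`:

```bash
# Sketch: convert a simple Windows path to its WSL /mnt equivalent.
# win_to_wsl is a hypothetical helper; inside WSL, use `wslpath -u` instead.
win_to_wsl() {
  printf '%s\n' "$1" \
    | sed -e 's|\\|/|g' \
          -e 's|^\([A-Za-z]\):|/mnt/\L\1|'   # C: -> /mnt/c (GNU sed \L lowercases)
}

win_to_wsl 'C:\ComfyUI\models'
```

So the `C:/ComfyUI/models` mount in the command above becomes `/mnt/c/ComfyUI/models` when run from WSL.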
3. Advanced Companion Features
| Feature | Description | Action |
| --- | --- | --- |
| Secure Remote Access | Tailscale: Creates a private, encrypted network for remote access. | Install the Tailscale client on your main AI computer. (This is where the Tailscale link will be placed). |
| VRM/3D Model Support | SillyTavern supports VRM files (created via VRoid Studio or Ready Player Me) for 3D character avatars using the VMC protocol. | Upload the VRM file to your character card. Tracking can be handled via VTube Studio. |
| Lorebooks / World Data | Knowledge databases in SillyTavern that auto-inject details only when relevant keywords are triggered. | Use the SillyTavern interface to set up new Lorebook entries for complex characters or worldbuilding. |
4. Critical Safety Checkpoint Before Remote Access
Before you enable remote access, you must confirm that you have completed:
- Foundational Security & Hygiene (Section C): All steps (DNS filtering, antivirus, password manager, and ad blockers) must be active.
- Tailscale Installation: You must use Tailscale for a secure, peer-to-peer connection. Never use standard, unsecured port forwarding.
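Once Tailscale is running, each device on your tailnet gets a stable private address (typically in the 100.x.y.z range). The sketch below builds the remote Open WebUI URL from that address; the IP shown is a placeholder, and in practice you would substitute the output of `tailscale ip -4`:

```bash
# Sketch: build the remote Open WebUI URL from your tailnet address.
# The IP below is a placeholder; in practice: ts_ip=$(tailscale ip -4)
ts_ip="100.101.102.103"
echo "http://${ts_ip}:3000"
```

Open that URL from your phone or tablet (with the Tailscale app signed in to the same tailnet) to reach your assistant securely from anywhere, with no port forwarding involved.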
