Welcome to the Guide. This page collects everything you need to work through all three projects covered on this site.
Phase I—The Core Foundation
Command Execution Primer
Before launching into the commands for Docker and WSL, a user needs to know where to type them and how to ensure they have the necessary permissions.
Where to Run Your Commands
All commands in this guide should be executed in a command-line interface (CLI) with elevated privileges (Run as Administrator).
| Operating System | Recommended Tool | Action |
| --- | --- | --- |
| Windows | PowerShell or Windows Terminal | Search for the tool, then right-click and select “Run as Administrator.” This is required for the `wsl --install` command. |
| Linux / macOS | Terminal / Bash | Standard Terminal access is fine, but commands that require elevated privileges, such as the `curl` install script for Ollama, will prompt for your `sudo` password. |
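On Linux and macOS, the rule of thumb is: run the command as-is first, and add `sudo` only when it fails with a permissions error. The sketch below illustrates the idea with a hypothetical helper, `maybe_sudo` (not a standard utility), that prepends `sudo` whenever the caller is not root:

```bash
# Sketch: prepend sudo when the caller is not root.
# maybe_sudo is a hypothetical helper name, not a standard utility.
maybe_sudo() {
  uid="$1"; shift            # pass the numeric user ID explicitly for clarity
  if [ "$uid" -eq 0 ]; then
    echo "$*"                # already root: run the command as-is
  else
    echo "sudo $*"           # otherwise, elevate with sudo
  fi
}

# Print the command that would be run for the current user:
maybe_sudo "$(id -u)" apt-get update
```

On Windows, there is no `sudo` equivalent in this workflow; elevation happens by launching the whole terminal "as Administrator" instead.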
1. Hardware Check: Your Local Lifeline
| Component | Recommendation | Minimum (for 7B models) |
| --- | --- | --- |
| RAM (System Memory) | 32 GB | 16 GB |
| Storage (SSD) | 1 TB NVMe SSD | 500 GB SSD |
| GPU (Graphics Card) | 8 GB VRAM or more | 6 GB VRAM |
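If you want to sanity-check a model against your hardware before downloading it, a common approximation is: VRAM in GB ≈ parameters (billions) × bits per weight ÷ 8, plus roughly 20% overhead for activations and context cache. The 20% figure and the `est_gb` helper name are my assumptions for illustration, not exact numbers:

```bash
# Rough VRAM estimate (an approximation, not an exact figure):
#   GB ≈ params_in_billions * bits_per_weight / 8 * 1.2
# est_gb is a hypothetical helper name.
est_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 * 1.2 }'
}

est_gb 7 4    # a 7B model quantized to 4 bits per weight
est_gb 7 16   # the same model at full 16-bit precision
```

This is why a 7B model at 4-bit quantization fits comfortably in 6 GB of VRAM, while the same model at full precision does not.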
2. The Golden Rule: Install Docker First
- Action: Download and install Docker Desktop. (Link to Docker Desktop download page will be placed here.)
Networking & Ports Analogy (Addition to Section D, after Docker)
The terms localhost and ports are crucial to the guide but can be confusing. Using a simple analogy helps demystify the process.
Understanding Ports: Your Computer’s Digital Apartment
When you run software using Docker, you are giving each application its own isolated “apartment” on your computer.
- Your Computer = The Apartment Building: Your local machine.
- `localhost` (or `127.0.0.1`) = The Street Address: This is the universal address that always refers to your own computer.
- The Port (`:3000`, `:8000`) = The Apartment Number: Since one computer can run many applications, the port number tells your web browser which specific application it should connect to. For instance, `http://localhost:3000` means: “Go to my own computer and find the application running in Apartment 3000 (Open WebUI).”
This brief explanation provides the necessary context for why the user is typing different port numbers (3000, 8000, 5000, 8188) throughout the guide.
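You can "knock on an apartment door" yourself to see whether anything is listening on a given port. The sketch below uses bash's built-in `/dev/tcp` feature; `port_status` is a hypothetical helper name, and the technique requires bash (not plain sh):

```bash
# Sketch: check whether anything is listening on a local port,
# using bash's built-in /dev/tcp. port_status is a hypothetical helper.
port_status() {
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    echo "open"      # something accepted the connection
  else
    echo "closed"    # nothing is listening on that port
  fi
}

port_status 3000   # "open" once Open WebUI is running, "closed" before
```

Running this before and after starting a container is a quick way to confirm the service actually came up on the port you expected.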
3. Critical Step for Windows Users: Enable WSL
- Action: Open PowerShell or Command Prompt as an administrator.
- Command:

```bash
wsl --install
```
4. Step 1: Install the AI Engine (Ollama)
- Action: Download the installer. (Link to Ollama download page will be placed here.)
- Linux Command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
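After installing, you can confirm the Ollama service is running by querying its REST API, which listens on port 11434; `/api/tags` lists your installed models. The JSON sample below is illustrative (not captured from a real install), and the `sed` one-liner is a crude stand-in for a proper JSON parser like `jq`:

```bash
# Illustrative sample of Ollama's /api/tags response (not real output):
sample='{"models":[{"name":"llama3:8b","size":4661224676}]}'

# In practice you would fetch it live:
#   curl -s http://localhost:11434/api/tags

# Crude extraction of the first model name (use jq for real parsing):
first_model=$(printf '%s' "$sample" | sed -n 's/.*"name":"\([^"]*\)".*/\1/p')
echo "$first_model"
```

If the live `curl` call returns JSON at all, Ollama is up and ready for the next step.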
5. Step 2: Install the User Interface (Open WebUI)
- Crucial Command (All OS):

```bash
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```
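If you prefer Docker Compose over a long one-line command, the same options map onto a compose file like the sketch below (same image, port mapping, and volume; the service name and file layout are my choices):

```yaml
# docker-compose.yml — a sketch equivalent to the docker run command above
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                    # host port 3000 -> container port 8080
    volumes:
      - open-webui:/app/backend/data   # persist chats and settings
    extra_hosts:
      - "host.docker.internal:host-gateway"  # lets the container reach Ollama on the host
    restart: always

volumes:
  open-webui:
```

Start it with `docker compose up -d` from the folder containing the file; the result is the same container, but the configuration is now version-controllable.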
Open WebUI—The Private Chat Hub
Open WebUI serves as a polished, multi-purpose chat environment for interacting with your local models and managing your system.
| Feature | Description & Key Action | Quick Find Location |
| --- | --- | --- |
| Model Selection | Easily switch between all the models you have downloaded with Ollama (e.g., Llama 3, Mistral, Code Llama). | Model Dropdown Menu (Top of the Chat Window) |
| System Prompts | Set the AI’s core role. Before starting a chat, define a prompt (like “You are a Python programmer,” or “Act as a helpful peer support specialist”) to guide its responses. | System Prompt box (top of the chat window) |
| RAG Management | Set up your long-term memory. This is where you upload private documents or link past conversations to be used as a knowledge base for the AI. | Settings panel (left sidebar) under RAG |
| User/Model Settings | Adjust default settings, change your avatar, and manage the access of other users if you choose to share your instance. | Settings panel (left sidebar) |
Command for CUDA (NVIDIA GPU/Linux/WSL):

```bash
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
```
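Not sure which image tag applies to you? A quick check for the NVIDIA driver tells you whether the `:cuda` variant is worth using. This sketch only prints a suggestion; it makes no changes:

```bash
# Sketch: suggest an image tag based on whether an NVIDIA driver is visible.
if command -v nvidia-smi >/dev/null 2>&1; then
  echo "GPU detected: use ghcr.io/open-webui/open-webui:cuda"
else
  echo "No NVIDIA driver found: use ghcr.io/open-webui/open-webui:main"
fi
```

On Windows, run this inside your WSL distribution; recent NVIDIA drivers expose `nvidia-smi` to WSL automatically.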
6. Step 3: Access and Download Your First Model
- Access: Open your web browser and go to `http://localhost:3000`.
- Download Command (run this in your terminal):

```bash
ollama run llama3:8b
```
Note: This command downloads the 8-billion-parameter version of Llama 3, which is instruction-tuned for great conversational quality.
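Once the model is downloaded, you can also talk to it over Ollama's REST API instead of the CLI, which is exactly what Open WebUI does behind the scenes. The sketch below builds the request body for the `/api/generate` endpoint; the prompt text is my example:

```bash
# Sketch: a request body for Ollama's /api/generate endpoint.
payload='{"model":"llama3:8b","prompt":"Say hello in one sentence.","stream":false}'

# Send it (requires Ollama to be running locally):
#   curl -s http://localhost:11434/api/generate -d "$payload"

echo "$payload"
```

Setting `"stream":false` returns one complete JSON object instead of a stream of partial tokens, which is easier to inspect by hand.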
Section E: Modular Progression—Project A: The Basic Assistant
- Goal Achieved: You have successfully built and deployed Project A: The Basic Assistant.
- Safety Checkpoint: Ensure you have completed the Foundational Security and Digital Hygiene steps (Section C).
💡 NEXT STEPS FOR PERSONALIZATION
Your basic assistant is private and functional, but it currently only has the context of your immediate chat history. If you want to give your AI long-term memory based on your private files or past conversations, you can add Retrieval-Augmented Generation (RAG) at any time.
- Where to find it: We cover the initial RAG setup within Open WebUI in Section G, Step 1 (Persistent Memory via RAG), and advanced tuning in Section H (Advanced Customization).
Section F: Modular Progression—Project B: The Interactive Storyteller
1. The New Interface: SillyTavern
- Action: Install SillyTavern via Docker:

```bash
docker run -d -p 8000:8000 --name sillytavern --restart always ghcr.io/sillytavern/sillytavern:latest
```

- Access: `http://localhost:8000`
SillyTavern — Your Interactive Story Engine
SillyTavern is much more than a simple chatbot interface; it is a dedicated environment built for deep roleplaying and character interaction.
| Feature | Description & Key Action | Quick Find Location |
| --- | --- | --- |
| Character Cards | Define your character. These files contain the core personality, appearance, internal monologue, and dialogue style. You can search online for thousands of pre-made characters. | Character Management panel (left sidebar icon) |
| Lorebooks | The dynamic memory. Allows you to define places, objects, or people. The AI will only use the information when the corresponding keyword is mentioned in the conversation. | Worldbook/Lorebook tab (upper right) |
| Generation Settings | Control the AI’s style. Adjust the temperature (for creativity/randomness), context size (how much chat history the AI remembers), and response length for custom outputs. | Generation Settings tab (upper right) |
| Extensions | Modular add-ons. This is where you connect external tools like TTS (AllTalk) for voice output, Image Generation (ComfyUI), and other utilities. | Extensions menu (right sidebar icon) |
2. Connecting SillyTavern to Ollama
- API Type: OpenAI Compatible
- API Endpoint: `http://localhost:11434/v1`
3. (Optional) Voice Immersion: Text-to-Speech (TTS)
❗ OPTIONAL PROGRAM: This step is not required, but it greatly enhances immersion.
- Action: Install AllTalk V2 via Docker (requires GPU):

```bash
docker run -d --gpus all -p 5000:5000 --name alltalk --restart always erew123/alltalk:latest
```

- Connect to SillyTavern: Use the TTS Extension and enter the AllTalk server address: `http://localhost:5000`
💡 NEXT STEPS FOR FULL COMPANION MODE
You now have a powerful, immersive roleplaying system. If you wish to give your companion:
- A visual 3D avatar that syncs with its voice.
- The ability to generate images based on your character’s stories.
- Secure access from your phone or tablet while away from home.
These features are covered in our final stage.
- Where to find it: We cover all these Companion features in Section G: Modular Progression—Project C: The Full Companion.
4. Optional Deep Dive: Finding and Fine-Tuning TTS Voices
| Topic | Key Action | Resource Suggestions |
| --- | --- | --- |
| Finding Voices | Use short, clean voice clips (8–10 seconds). | Look on Hugging Face for XTTS models or community forums. |
| Fine-Tuning/Cloning | Place a clean .wav file into the alltalk_tts/voices/ folder. | Use Audacity for editing and trimming samples. |
| Advanced Quality | Explore RVC (retrieval-based voice conversion) for superior naturalness. | Look for tutorials on RVC voice models. |
💻 Section G: Modular Progression—Project C: The Full Companion
1. Persistent Memory via RAG (Retrieval Augmented Generation)
- Action (Setting up Memory): Log into Open WebUI and navigate to the RAG (Retrieval Augmented Generation) section in the Settings. Enable RAG and toggle the “Memory/History” setting to include past conversations as a source.
- RAG Pro Tips:
- Chunk documents into 200–500 tokens for optimal retrieval.
- Embedding Models: Use all-MiniLM-L6-v2 for speed or text-embedding-ada-002 for quality.
- Context Window (`num_ctx`): To fix slowdowns, reduce this in Ollama (default is 2048). Increase it only if you need longer chat history, but beware of memory consumption.
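One way to change `num_ctx` persistently is through an Ollama Modelfile, which derives a new model variant from one you already have. The `FROM` and `PARAMETER` directives are standard Modelfile syntax; the variant name `llama3-longctx` is my choice for illustration:

```
# Modelfile — a sketch that raises num_ctx for longer chat history
FROM llama3:8b
PARAMETER num_ctx 4096
```

Build it with `ollama create llama3-longctx -f Modelfile`, then select the new variant in Open WebUI. Expect memory use to grow with the context size.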
2. Contextual Image Generation with ComfyUI
- Action: Install the ComfyUI server via Docker (requires GPU):

```bash
docker run -d --gpus all -p 8188:8188 --name comfyui --restart always -v C:/ComfyUI/models:/app/models -v C:/ComfyUI/output:/app/output ghcr.io/comfyui/comfyui:latest
```

- Integrate with SillyTavern: Enable the Image Generation Extension and set the API Endpoint to `http://localhost:8188`.
Volume Path Warning: The paths starting with C:/ in the command above are for users running Docker Desktop directly on Windows. If you are using WSL, replace C:/ComfyUI/models with a Linux path (e.g., /mnt/c/ComfyUI/models to access your Windows drive or a Linux path like ~/comfyui/models). Failure to correctly specify this path will result in the container not being able to find your models.
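Inside WSL, the bundled `wslpath` utility performs this conversion for you. For readers who want to see what the mapping actually does, here is a pure-string sketch; `win_to_wsl` is a hypothetical stand-in handling only simple drive-letter paths, not a replacement for `wslpath`:

```bash
# Sketch: convert a simple Windows path to its WSL /mnt equivalent.
# win_to_wsl is a hypothetical helper; inside WSL, use `wslpath -u` instead.
win_to_wsl() {
  printf '%s\n' "$1" \
    | sed -e 's|\\|/|g' \
          -e 's|^\([A-Za-z]\):|/mnt/\L\1|'   # C: -> /mnt/c (GNU sed \L lowercases)
}

win_to_wsl 'C:\ComfyUI\models'
```

So the `C:/ComfyUI/models` mount in the command above becomes `/mnt/c/ComfyUI/models` when run from WSL.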
3. Advanced Companion Features
| Feature | Description | Action |
| --- | --- | --- |
| Secure Remote Access | Tailscale: Creates a private, encrypted network for remote access. | Install the Tailscale client on your main AI computer. (This is where the Tailscale link will be placed). |
| VRM/3D Model Support | SillyTavern supports VRM files (created via VRoid Studio or Ready Player Me) for 3D character avatars using the VMC protocol. | Upload the VRM file to your character card. Tracking can be handled via VTube Studio. |
| Lorebooks / World Data | Knowledge databases in SillyTavern that auto-inject details only when relevant keywords are triggered. | Use the SillyTavern interface to set up new Lorebook entries for complex characters or worldbuilding. |
4. Critical Safety Checkpoint Before Remote Access
Before you enable remote access, you must confirm that you have completed:
- Foundational Security & Hygiene (Section C): All steps (DNS filtering, antivirus, password manager, and ad blockers) must be active.
- Tailscale Installation: You must use Tailscale for a secure, peer-to-peer connection. Never use standard, unsecured port forwarding.
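Once Tailscale is running, each device on your tailnet gets a stable private address (typically in the 100.x.y.z range). The sketch below builds the remote Open WebUI URL from that address; the IP shown is a placeholder, and in practice you would substitute the output of `tailscale ip -4`:

```bash
# Sketch: build the remote Open WebUI URL from your tailnet address.
# The IP below is a placeholder; in practice: ts_ip=$(tailscale ip -4)
ts_ip="100.101.102.103"
echo "http://${ts_ip}:3000"
```

Open that URL from your phone or tablet (with the Tailscale app signed in to the same tailnet) to reach your assistant securely from anywhere, with no port forwarding involved.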
