From 5ee840747321069b056988e0c8d7de42f301f086 Mon Sep 17 00:00:00 2001
From: just n
Date: Sun, 28 Dec 2025 16:29:00 +0000
Subject: [PATCH] Update README.md

---
 README.md | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 109 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 6cbd3c1..756b7bb 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,110 @@
-# xmpp-jarvis
+# XMPP LLM Bot
+
+A Python bot that bridges XMPP group chats and direct messages with Large Language Models (LLMs).
+
+## OMEMO support and some other features are based on [Arne/xmpp-ai-bot](https://codeberg.org/Arne), with the addition of experimental new features including multi-modal support, file hosting, and media generation.
+## Inspired by [jjj333-p/gemini-xmpp](https://github.com/jjj333-p/gemini-xmpp) for group chat features.
+
+# TODO
+
+- Add video understanding with Gemini
+
+## Features
+
+### LLM Capabilities
+* **Multi-Provider Support:** Native integration for **Google Gemini** (via `google-genai`) and **OpenAI** (via the `openai` library), plus generic support for OpenAI-compatible endpoints (LocalAI, Groq, Ollama). Also supports Groq Compound built-in tools.
+* **System Prompts:** Configurable global system instructions with **per-room override** capabilities.
+* **Persistent Memory:** Remembers conversations with configurable context windows. Supports saving memory to disk (`json`) to survive restarts.
+* **Thinking Skipping:** Option to strip `<think>` blocks emitted by reasoning models (like DeepSeek), which keeps rooms cleaner. Only necessary when you use a custom endpoint (99% of the time you should not).
+
+### Security & Privacy
+* **OMEMO Encryption (XEP-0384):** Support for E2EE in Direct Messages. (I do NOT guarantee it will work 100% of the time.)
+* **AES-GCM Support:** Can encrypt and decrypt `aesgcm://` links (commonly used by clients like *Conversations*, *Cheogram* and *monocles chat*) to read or send encrypted media.
+* **Access Control:** Whitelist/Blacklist modes for Direct Messages and a "Privileged Users" list.
+
+### Multi-Modal (Vision & Audio) (only tested with Gemini)
+* **Image Recognition:** Users can send images (http/https or encrypted aesgcm) for the bot to analyze.
+* **Audio Transcription:** Users can send voice notes; the bot processes them using the LLM's audio capabilities.
+* **URL Context:** The bot can fetch URLs mentioned in chat, strip the HTML, and read the content to provide context-aware answers. (Turn it off if you don't want people to post IP-grabber links to the bot.)
+
+### Generative Tools
+* **Image Generation (`!imagen`):** Generates images using **Cloudflare Workers AI Flux** (Black Forest Labs) and automatically uploads them to a file host.
+* **Text-to-Speech (`!tts`):** Converts text to high-quality speech using **Gemini TTS**.
+    * Supports automatic audio format conversion (PCM -> OGG/WAV) using `pydub`.
+    * **Auto-TTS Mode:** Can optionally reply to *every* message with a voice note.
+
+### File Handling
+* **Multi-Host Uploading:** Automatically uploads generated media (Images/TTS) to public hosts or to your own XMPP server so XMPP clients can render them inline.
+* **Supported Hosts:** Catbox, Litterbox, 0x0.st, Imgur, ImgBB, Envs.sh, and Uguu.se. You can use your XMPP server too.
+
+### Core XMPP Features
+* **MUC Support:** Specialized handling for Group Chats (nicknames, mentions, quoting).
+* **Connection Configuration:** Join retry logic and rate limiting to prevent spam bans.
+* **Native Formatting:** Supports XMPP quoting (`>`) and Out-of-Band (`OOB`) data for media.
+
+## Prerequisites
+
+* **Python 3.9+** (only tested with Python 3.13)
+* **FFmpeg:** Required for audio conversion (TTS/Voice notes).
+    * Ubuntu/Debian: `sudo apt install ffmpeg`
+    * Arch Linux: `sudo pacman -S ffmpeg`
+    * Fedora: `sudo dnf install ffmpeg` (I am not sure; check the package name yourself)
+    * Windows: Download and add to system PATH.
+    * macOS: `brew install ffmpeg` (not sure; figure it out yourself)
+    * Other OS: Figure it out yourself or just disable text-to-speech.
+* **System Libraries:** `libolm` or `libgcrypt` may be required for OMEMO support depending on your OS.
+
+## Installation
+
+1. **Clone the repository:**
+    ```bash
+    git clone https://github.com/yourusername/xmpp-llm-bot.git
+    cd xmpp-llm-bot
+    ```
+
+2. **Set up a Virtual Environment:**
+    ```bash
+    python3 -m venv venv
+    source venv/bin/activate  # On Windows: venv\Scripts\activate
+    ```
+
+3. **Install Dependencies:**
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+## Configuration
+
+1. **Copy the example configuration file:**
+    ```bash
+    cp config.example.ini config.ini
+    ```
+
+2. **Edit config.ini with your credentials:**
+
+    XMPP Section: JID, Password, Rooms.
+
+    Bot Section: API Keys, Models, Triggers.
+
+## Usage
+
+**Start the bot:**
+
+   ```bash
+   python3 bot.py
+   ```
+
+## Interaction Guide
+
+- Mentions: Mention the bot's nickname (@BotName / BotName, / Botname:) or reply to its message to trigger a response.
+
+- Triggers: Start a message with the configured trigger (default !aibot) to force a response without mentioning the bot.
+
+- Commands:
+
+    !imagen - Generate an image using Flux.
+
+    !tts - Speak the provided text.
+
+- Direct Messages: Send a DM to the bot for a private conversation (OMEMO supported).
 
-Mirror of codeberg/xmppkeyboardguy/xmpp-llm-bot
\ No newline at end of file