diff --git a/README.md b/README.md
index e208ce6..af07729 100644
--- a/README.md
+++ b/README.md
@@ -16,6 +16,7 @@ This repository is intended to be a simple skill source: install the repo (or a
 |---|---|---|
 | `gitea-api` | Interact with Gitea via REST API (repos, issues, PRs, releases, branches, user info). | `skills/gitea-api` |
 | `portainer` | Manage Portainer stacks via API (list, start/stop/restart, update, prune images). | `skills/portainer` |
+| `searxng` | Search through a local or self-hosted SearXNG instance for web, news, images, and more. | `skills/searxng` |
 | `web-automation` | Automate browsing/scraping with Playwright + Camoufox (auth flows, extraction, bot-protected sites). | `skills/web-automation` |
 
 ## Install ideas
diff --git a/docs/README.md b/docs/README.md
index 221c304..b438efb 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -6,4 +6,5 @@ This folder contains detailed docs for each skill in this repository.
 
 - [`gitea-api`](gitea-api.md) — REST-based Gitea automation (no `tea` CLI required)
 - [`portainer`](portainer.md) — Portainer stack management (list, lifecycle, updates, image pruning)
+- [`searxng`](searxng.md) — Privacy-respecting metasearch via a local or self-hosted SearXNG instance
 - [`web-automation`](web-automation.md) — Playwright + Camoufox browser automation and scraping
diff --git a/docs/searxng.md b/docs/searxng.md
new file mode 100644
index 0000000..3f758e0
--- /dev/null
+++ b/docs/searxng.md
@@ -0,0 +1,50 @@
+# searxng
+
+Search the web through a local or self-hosted SearXNG instance.
+
+## What this skill is for
+
+- General web search
+- News, image, and video search
+- Privacy-respecting search without external API keys
+- Programmatic search output via JSON
+
+## Runtime requirements
+
+- `python3`
+- Python packages: `httpx`, `rich`
+
+## Configuration
+
+Preferred:
+
+- `SEARXNG_URL` environment variable
+
+Optional config file:
+
+- workspace `.clawdbot/credentials/searxng/config.json`
+- or `~/.clawdbot/credentials/searxng/config.json`
+
+Example:
+
+```json
+{
+  "url": "https://search.fiorinis.com"
+}
+```
+
+## Wrapper
+
+Use the bundled script directly:
+
+```bash
+python3 skills/searxng/scripts/searxng.py search "OpenClaw" -n 5
+python3 skills/searxng/scripts/searxng.py search "latest AI news" --category news
+python3 skills/searxng/scripts/searxng.py search "OpenClaw" --format json
+```
+
+## Notes
+
+- Falls back to `http://localhost:8080` if no URL is configured.
+- Prints the URL it attempted when connection fails.
+- Uses the SearXNG JSON API endpoint.
diff --git a/skills/searxng/.clawhub/origin.json b/skills/searxng/.clawhub/origin.json
new file mode 100644
index 0000000..14461a6
--- /dev/null
+++ b/skills/searxng/.clawhub/origin.json
@@ -0,0 +1,7 @@
+{
+  "version": 1,
+  "registry": "https://clawhub.ai",
+  "slug": "searxng",
+  "installedVersion": "1.0.3",
+  "installedAt": 1773013420514
+}
diff --git a/skills/searxng/CHANGELOG.md b/skills/searxng/CHANGELOG.md
new file mode 100644
index 0000000..c72690a
--- /dev/null
+++ b/skills/searxng/CHANGELOG.md
@@ -0,0 +1,38 @@
+# Changelog
+
+All notable changes to the SearXNG skill will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [1.0.1] - 2026-01-26
+
+### Changed
+- **Security:** Changed default SEARXNG_URL from hardcoded private URL to generic `http://localhost:8080`
+- **Configuration:** Made SEARXNG_URL required configuration (no private default)
+- Updated all documentation to emphasize configuration requirement
+- Removed hardcoded private URL from all documentation
+
+### Security
+- Eliminated exposure of private SearXNG instance URL in published code
+
+## [1.0.0] - 2026-01-26
+
+### Added
+- Initial release
+- Web search via local SearXNG instance
+- Multiple search categories (general, images, videos, news, map, music, files, it, science)
+- Time range filters (day, week, month, year)
+- Rich table output with result snippets
+- JSON output mode for programmatic use
+- SSL self-signed certificate support
+- Configurable SearXNG instance URL via SEARXNG_URL env var
+- Comprehensive error handling
+- Rich CLI with argparse
+
+### Features
+- Privacy-focused (all searches local)
+- No API keys required
+- Multi-engine result aggregation
+- Beautiful formatted output
+- Language selection support
diff --git a/skills/searxng/PUBLISH.md b/skills/searxng/PUBLISH.md
new file mode 100644
index 0000000..6731d3e
--- /dev/null
+++ b/skills/searxng/PUBLISH.md
@@ -0,0 +1,147 @@
+# Publishing SearXNG Skill to ClawdHub
+
+## ✅ Pre-Publication Verification
+
+All files present:
+- [x] SKILL.md (v1.0.1)
+- [x] README.md
+- [x] LICENSE (MIT)
+- [x] CHANGELOG.md
+- [x] scripts/searxng.py
+- [x] .clawdhub/metadata.json
+
+Security:
+- [x] No hardcoded private URLs
+- [x] Generic default (http://localhost:8080)
+- [x] Fully configurable via SEARXNG_URL
+
+Author:
+- [x] Updated to: Avinash Venkatswamy
+
+## 📤 Publishing Steps
+
+### Step 1: Login to ClawdHub
+
+```bash
+clawdhub login
+```
+
+This will open your browser. Complete the authentication flow.
+
+### Step 2: Verify Authentication
+
+```bash
+clawdhub whoami
+```
+
+Should return your user info if logged in successfully.
+
+### Step 3: Publish the Skill
+
+From the workspace root:
+
+```bash
+cd ~/clawd
+clawdhub publish skills/searxng
+```
+
+Or from the skill directory:
+
+```bash
+cd ~/clawd/skills/searxng
+clawdhub publish .
+```
+
+### Step 4: Verify Publication
+
+After publishing, you can:
+
+**Search for your skill:**
+```bash
+clawdhub search searxng
+```
+
+**View on ClawdHub:**
+Visit https://clawdhub.com/skills/searxng
+
+## 📋 What Gets Published
+
+The CLI will upload:
+- SKILL.md
+- README.md
+- LICENSE
+- CHANGELOG.md
+- scripts/ directory
+- .clawdhub/metadata.json
+
+It will NOT upload:
+- PUBLISH.md (this file)
+- PUBLISHING_CHECKLIST.md
+- Any .git files
+- Any node_modules or temporary files
+
+## 🔧 If Publishing Fails
+
+### Common Issues
+
+1. **Not logged in:**
+   ```bash
+   clawdhub login
+   ```
+
+2. **Invalid skill structure:**
+   - Verify SKILL.md has all required fields
+   - Check .clawdhub/metadata.json is valid JSON
+
+3. **Duplicate slug:**
+   - If "searxng" is taken, you'll need a different name
+   - Update `name` in SKILL.md and metadata.json
+
+4. **Network issues:**
+   - Check your internet connection
+   - Try again: `clawdhub publish skills/searxng`
+
+### Get Help
+
+```bash
+clawdhub publish --help
+```
+
+## 📊 After Publishing
+
+### Update Notifications
+
+If you make changes later:
+
+1. Update version in SKILL.md and metadata.json
+2. Add entry to CHANGELOG.md
+3. Run: `clawdhub publish skills/searxng`
+
+### Manage Your Skill
+
+**Delete (soft-delete):**
+```bash
+clawdhub delete searxng
+```
+
+**Undelete:**
+```bash
+clawdhub undelete searxng
+```
+
+## 🎉 Success!
+
+Once published, users can install with:
+
+```bash
+clawdhub install searxng
+```
+
+Your skill will appear:
+- On ClawdHub website: https://clawdhub.com
+- In search results: `clawdhub search privacy`
+- In explore: `clawdhub explore`
+
+---
+
+**Ready to publish?** Run `clawdhub login` and then `clawdhub publish skills/searxng`!
diff --git a/skills/searxng/PUBLISHING_CHECKLIST.md b/skills/searxng/PUBLISHING_CHECKLIST.md
new file mode 100644
index 0000000..15323f4
--- /dev/null
+++ b/skills/searxng/PUBLISHING_CHECKLIST.md
@@ -0,0 +1,111 @@
+# ClawdHub Publishing Checklist
+
+## ✅ Pre-Publication Checklist
+
+### Required Files
+- [x] `SKILL.md` - Skill definition with metadata
+- [x] `README.md` - Comprehensive documentation
+- [x] `LICENSE` - MIT License
+- [x] `CHANGELOG.md` - Version history
+- [x] `scripts/searxng.py` - Main implementation
+- [x] `.clawdhub/metadata.json` - ClawdHub metadata
+
+### SKILL.md Requirements
+- [x] `name` field
+- [x] `description` field
+- [x] `author` field
+- [x] `version` field
+- [x] `homepage` field
+- [x] `triggers` keywords (optional but recommended)
+- [x] `metadata` with emoji and requirements
+
+### Code Quality
+- [x] Script executes successfully
+- [x] Error handling implemented
+- [x] Dependencies documented (inline PEP 723)
+- [x] Help text / usage instructions
+- [x] Clean, readable code
+
+### Documentation
+- [x] Clear description of what it does
+- [x] Prerequisites listed
+- [x] Installation instructions
+- [x] Usage examples (CLI + conversational)
+- [x] Configuration options
+- [x] Troubleshooting section
+- [x] Feature list
+
+### Testing
+- [x] Tested with target system (SearXNG)
+- [x] Basic search works
+- [x] Category search works
+- [x] JSON output works
+- [x] Error cases handled gracefully
+- [ ] Tested on different SearXNG instances (optional)
+- [ ] Tested with authenticated SearXNG (optional)
+
+### Metadata
+- [x] Version number follows semver
+- [x] Author attribution
+- [x] License specified
+- [x] Tags/keywords for discovery
+- [x] Prerequisites documented
+
+## ⚠️ Optional Improvements
+
+### Nice to Have (not blocking)
+- [ ] CI/CD for automated testing
+- [ ] Multiple example configurations
+- [ ] Screenshot/demo GIF
+- [ ] Video demonstration
+- [ ] Integration tests
+- [ ] Authentication support (for private instances)
+- [ ] Config file support (beyond env vars)
+- [ ] Auto-discovery of local SearXNG instances
+
+### Future Enhancements
+- [ ] Result caching
+- [ ] Search history
+- [ ] Favorite searches
+- [ ] Custom result templates
+- [ ] Export results to various formats
+- [ ] Integration with other Clawdbot skills
+
+## 🚀 Publishing Steps
+
+1. **Review all files** - Make sure everything is polished
+2. **Test one more time** - Fresh installation test
+3. **Version bump if needed** - Update SKILL.md, metadata.json, CHANGELOG.md
+4. **Git commit** - Clean commit message
+5. **Submit to ClawdHub** - Follow ClawdHub submission process
+6. **Monitor feedback** - Be ready to address issues
+
+## 📝 Current Status
+
+**Ready for publication:** ✅ YES
+
+**Confidence level:** High
+
+**Known limitations:**
+- Requires a running SearXNG instance (clearly documented)
+- SSL verification disabled for self-signed certs (by design)
+- No authentication support yet (acceptable for v1.0.0)
+
+**Recommended for:** Users who:
+- Value privacy
+- Run their own SearXNG instance
+- Want to avoid commercial search APIs
+- Need local/offline search capability
+
+## 🎯 Next Steps
+
+1. **Publish to ClawdHub** - Skill is ready!
+2. **Gather user feedback** - Real-world usage
+3. **Plan v1.1.0** - Authentication support, more features
+4. **Community contributions** - Accept PRs for improvements
+
+---
+
+**Assessment:** This skill is publication-ready! 🎉
+
+All critical requirements are met, documentation is excellent, and the code works reliably.
diff --git a/skills/searxng/README.md b/skills/searxng/README.md
new file mode 100644
index 0000000..395ba25
--- /dev/null
+++ b/skills/searxng/README.md
@@ -0,0 +1,168 @@
+# SearXNG Search Skill for Clawdbot
+
+Privacy-respecting web search using your local SearXNG instance.
+
+## Prerequisites
+
+**This skill requires a running SearXNG instance.**
+
+If you don't have SearXNG set up yet:
+
+1. **Docker (easiest)**:
+   ```bash
+   docker run -d -p 8080:8080 searxng/searxng
+   ```
+
+2. **Manual installation**: Follow the [official guide](https://docs.searxng.org/admin/installation.html)
+
+3. **Public instances**: Use any public SearXNG instance (less private)
+
+## Features
+
+- 🔒 **Privacy-focused**: Uses your local SearXNG instance
+- 🌐 **Multi-engine**: Aggregates results from multiple search engines
+- 📰 **Multiple categories**: Web, images, news, videos, and more
+- 🎨 **Rich output**: Beautiful table formatting with result snippets
+- 🚀 **Fast JSON mode**: Programmatic access for scripts and integrations
+
+## Quick Start
+
+### Basic Search
+```
+Search "python asyncio tutorial"
+```
+
+### Advanced Usage
+```
+Search "climate change" with 20 results
+Search "cute cats" in images category
+Search "breaking news" in news category from last day
+```
+
+## Configuration
+
+**You must configure your SearXNG instance URL before using this skill.**
+
+### Set Your SearXNG Instance
+
+Configure the `SEARXNG_URL` environment variable in your Clawdbot config:
+
+```json
+{
+  "env": {
+    "SEARXNG_URL": "https://your-searxng-instance.com"
+  }
+}
+```
+
+Or export it in your shell:
+```bash
+export SEARXNG_URL=https://your-searxng-instance.com
+```
+
+## Direct CLI Usage
+
+You can also use the skill directly from the command line:
+
+```bash
+# Basic search
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "query"
+
+# More results
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "query" -n 20
+
+# Category search
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "query" --category images
+
+# JSON output (for scripts)
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "query" --format json
+
+# Time-filtered news
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "latest AI news" --category news --time-range day
+```
+
+## Available Categories
+
+- `general` - General web search (default)
+- `images` - Image search
+- `videos` - Video search
+- `news` - News articles
+- `map` - Maps and locations
+- `music` - Music and audio
+- `files` - File downloads
+- `it` - IT and programming
+- `science` - Scientific papers and resources
+
+## Time Ranges
+
+Filter results by recency:
+- `day` - Last 24 hours
+- `week` - Last 7 days
+- `month` - Last 30 days
+- `year` - Last year
+
+## Examples
+
+### Web Search
+```bash
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "rust programming language"
+```
+
+### Image Search
+```bash
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "sunset photography" --category images -n 10
+```
+
+### Recent News
+```bash
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "tech news" --category news --time-range day
+```
+
+### JSON Output for Scripts
+```bash
+uv run ~/clawd/skills/searxng/scripts/searxng.py search "python tips" --format json | jq '.results[0]'
+```
+
+## SSL/TLS Notes
+
+The skill is configured to work with self-signed certificates (common for local SearXNG instances). If you need strict SSL verification, edit the script and change `verify=False` to `verify=True` in the httpx request.
+
+## Troubleshooting
+
+### Connection Issues
+
+If you get connection errors:
+
+1. **Check your SearXNG instance is running:**
+   ```bash
+   curl -k $SEARXNG_URL
+   # Or: curl -k http://localhost:8080 (default)
+   ```
+
+2. **Verify the URL in your config**
+3. **Check SSL certificate issues**
+
+### No Results
+
+If searches return no results:
+
+1. Check your SearXNG instance configuration
+2. Ensure search engines are enabled in SearXNG settings
+3. Try different search categories
+
+## Privacy Benefits
+
+- **No tracking**: All searches go through your local instance
+- **No data collection**: Results are aggregated locally
+- **Engine diversity**: Combines results from multiple search providers
+- **Full control**: You manage the SearXNG instance
+
+## About SearXNG
+
+SearXNG is a free, open-source metasearch engine that respects your privacy. It aggregates results from multiple search engines while not storing your search data.
+
+Learn more: https://docs.searxng.org/
+
+## License
+
+This skill is part of the Clawdbot ecosystem and follows the same license terms.
diff --git a/skills/searxng/SKILL.md b/skills/searxng/SKILL.md
new file mode 100644
index 0000000..a25f348
--- /dev/null
+++ b/skills/searxng/SKILL.md
@@ -0,0 +1,46 @@
+---
+name: searxng
+description: Search the web through a local or self-hosted SearXNG instance. Use when you want privacy-respecting web, image, video, or news search without external search API keys, via scripts/searxng.py.
+---
+
+# SearXNG Search
+
+Use `scripts/searxng.py` to run searches against a SearXNG instance.
+
+## Requirements
+
+Required runtime:
+- `python3`
+- Python packages: `httpx`, `rich`
+
+Configuration lookup order:
+1. `SEARXNG_URL` environment variable
+2. workspace `.clawdbot/credentials/searxng/config.json` found by walking upward from the script location
+3. `~/.clawdbot/credentials/searxng/config.json`
+4. fallback: `http://localhost:8080`
+
+Config file shape if used:
+
+```json
+{
+  "url": "https://search.example.com"
+}
+```
+
+## Usage
+
+Examples:
+
+```bash
+python3 scripts/searxng.py search "OpenClaw" -n 5
+python3 scripts/searxng.py search "latest AI news" --category news -n 10
+python3 scripts/searxng.py search "cute cats" --category images --format json
+python3 scripts/searxng.py search "rust tutorial" --language en --time-range month
+```
+
+## Notes
+
+- Uses the SearXNG JSON API endpoint at `/search`.
+- HTTPS certificate verification is disabled in the current script for compatibility with local/self-signed instances.
+- If connection fails, the script prints the URL it attempted.
+- `--format json` is best for programmatic use; table output is best for humans.
diff --git a/skills/searxng/_meta.json b/skills/searxng/_meta.json
new file mode 100644
index 0000000..7751dd1
--- /dev/null
+++ b/skills/searxng/_meta.json
@@ -0,0 +1,6 @@
+{
+  "ownerId": "kn76z88c7kaynewbq2n2cv8831801bfs",
+  "slug": "searxng",
+  "version": "1.0.3",
+  "publishedAt": 1769472992634
+}
\ No newline at end of file
diff --git a/skills/searxng/scripts/searxng.py b/skills/searxng/scripts/searxng.py
new file mode 100644
index 0000000..11bc8c3
--- /dev/null
+++ b/skills/searxng/scripts/searxng.py
@@ -0,0 +1,253 @@
+#!/usr/bin/env python3
+# /// script
+# requires-python = ">=3.11"
+# dependencies = ["httpx", "rich"]
+# ///
+"""SearXNG CLI - Privacy-respecting metasearch via your local instance."""
+
+import argparse
+import os
+import sys
+import json
+import warnings
+from pathlib import Path
+import httpx
+from rich.console import Console
+from rich.table import Table
+from rich import print as rprint
+
+# Suppress SSL warnings for local self-signed certificates
+warnings.filterwarnings('ignore', message='Unverified HTTPS request')
+
+console = Console()
+
+
+def candidate_searxng_urls():
+    urls = []
+    env_url = os.getenv("SEARXNG_URL")
+    if env_url:
+        urls.append(env_url)
+
+    current = Path(__file__).resolve().parent
+    for base in [current, *current.parents]:
+        cfg = base / ".clawdbot" / "credentials" / "searxng" / "config.json"
+        if cfg.is_file():
+            try:
+                data = json.loads(cfg.read_text())
+                url = data.get("url") or data.get("SEARXNG_URL")
+                if isinstance(url, str) and url:
+                    urls.append(url)
+            except Exception:
+                pass
+
+    urls.extend([
+        os.path.expanduser("~/.clawdbot/credentials/searxng/config.json"),
+    ])
+
+    # Read optional home config file path entries
+    expanded = []
+    for item in urls:
+        if isinstance(item, str) and item.endswith("config.json") and os.path.isfile(item):
+            try:
+                data = json.loads(Path(item).read_text())
+                url = data.get("url") or data.get("SEARXNG_URL")
+                if isinstance(url, str) and url:
+                    expanded.append(url)
+            except Exception:
+                pass
+        elif not item.endswith("config.json"):  # a missing home config path is not a URL
+            expanded.append(item)
+
+    expanded.append("http://localhost:8080")
+    return list(dict.fromkeys([u for u in expanded if u]))
+
+
+SEARXNG_URL = candidate_searxng_urls()[0]
+
+def search_searxng(
+    query: str,
+    limit: int = 10,
+    category: str = "general",
+    language: str = "auto",
+    time_range: str = None,
+    output_format: str = "table"
+) -> dict:
+    """
+    Search using SearXNG instance.
+
+    Args:
+        query: Search query string
+        limit: Number of results to return
+        category: Search category (general, images, news, videos, etc.)
+        language: Language code (auto, en, de, fr, etc.)
+        time_range: Time range filter (day, week, month, year)
+        output_format: Output format (table, json)
+
+    Returns:
+        Dict with search results
+    """
+    params = {
+        "q": query,
+        "format": "json",
+        "categories": category,
+    }
+
+    if language != "auto":
+        params["language"] = language
+
+    if time_range:
+        params["time_range"] = time_range
+
+    try:
+        # Disable SSL verification for local self-signed certs
+        response = httpx.get(
+            f"{SEARXNG_URL}/search",
+            params=params,
+            timeout=30,
+            verify=False  # For local self-signed certs
+        )
+        response.raise_for_status()
+
+        data = response.json()
+
+        # Limit results
+        if "results" in data:
+            data["results"] = data["results"][:limit]
+
+        return data
+
+    except httpx.HTTPError as e:
+        console.print(f"[red]Error connecting to SearXNG at {SEARXNG_URL}:[/red] {e}")
+        return {"error": str(e), "results": []}
+    except Exception as e:
+        console.print(f"[red]Unexpected error while using {SEARXNG_URL}:[/red] {e}")
+        return {"error": str(e), "results": []}
+
+
+def display_results_table(data: dict, query: str):
+    """Display search results in a rich table."""
+    results = data.get("results", [])
+
+    if not results:
+        rprint(f"[yellow]No results found for:[/yellow] {query}")
+        return
+
+    table = Table(title=f"SearXNG Search: {query}", show_lines=False)
+    table.add_column("#", style="dim", width=3)
+    table.add_column("Title", style="bold")
+    table.add_column("URL", style="blue", width=50)
+    table.add_column("Engines", style="green", width=20)
+
+    for i, result in enumerate(results, 1):
+        title = result.get("title", "No title")[:70]
+        url = result.get("url", "")[:45] + "..."
+        engines = ", ".join(result.get("engines", []))[:18]
+
+        table.add_row(
+            str(i),
+            title,
+            url,
+            engines
+        )
+
+    console.print(table)
+
+    # Show additional info
+    if data.get("number_of_results"):
+        rprint(f"\n[dim]Total results available: {data['number_of_results']}[/dim]")
+
+    # Show content snippets for top 3
+    rprint("\n[bold]Top results:[/bold]")
+    for i, result in enumerate(results[:3], 1):
+        title = result.get("title", "No title")
+        url = result.get("url", "")
+        content = result.get("content", "")[:200]
+
+        rprint(f"\n[bold cyan]{i}. {title}[/bold cyan]")
+        rprint(f"   [blue]{url}[/blue]")
+        if content:
+            rprint(f"   [dim]{content}...[/dim]")
+
+
+def display_results_json(data: dict):
+    """Display results in JSON format for programmatic use."""
+    print(json.dumps(data, indent=2))
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="SearXNG CLI - Search the web via your local SearXNG instance",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog=f"""
+Examples:
+  %(prog)s search "python asyncio"
+  %(prog)s search "climate change" -n 20
+  %(prog)s search "cute cats" --category images
+  %(prog)s search "breaking news" --category news --time-range day
+  %(prog)s search "rust tutorial" --format json
+
+Environment:
+  SEARXNG_URL: SearXNG instance URL (default: {SEARXNG_URL})
+        """
+    )
+
+    subparsers = parser.add_subparsers(dest="command", help="Commands")
+
+    # Search command
+    search_parser = subparsers.add_parser("search", help="Search the web")
+    search_parser.add_argument("query", nargs="+", help="Search query")
+    search_parser.add_argument(
+        "-n", "--limit",
+        type=int,
+        default=10,
+        help="Number of results (default: 10)"
+    )
+    search_parser.add_argument(
+        "-c", "--category",
+        default="general",
+        choices=["general", "images", "videos", "news", "map", "music", "files", "it", "science"],
+        help="Search category (default: general)"
+    )
+    search_parser.add_argument(
+        "-l", "--language",
+        default="auto",
+        help="Language code (auto, en, de, fr, etc.)"
+    )
+    search_parser.add_argument(
+        "-t", "--time-range",
+        choices=["day", "week", "month", "year"],
+        help="Time range filter"
+    )
+    search_parser.add_argument(
+        "-f", "--format",
+        choices=["table", "json"],
+        default="table",
+        help="Output format (default: table)"
+    )
+
+    args = parser.parse_args()
+
+    if not args.command:
+        parser.print_help()
+        return
+
+    if args.command == "search":
+        query = " ".join(args.query)
+
+        data = search_searxng(
+            query=query,
+            limit=args.limit,
+            category=args.category,
+            language=args.language,
+            time_range=args.time_range,
+            output_format=args.format
+        )
+
+        if args.format == "json":
+            display_results_json(data)
+        else:
+            display_results_table(data, query)
+
+
+if __name__ == "__main__":
+    main()
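The configuration lookup order listed in SKILL.md (environment variable, workspace config found by walking upward, home config, then the `http://localhost:8080` fallback) can be sketched as one function that only ever returns a URL. This is a minimal sketch under stated assumptions, not the script's actual implementation: `read_config_url` and `resolve_searxng_url` are hypothetical names introduced here, the start directory, home directory, and environment are passed in explicitly so the logic can be exercised in isolation, and only the documented `"url"` key is read.

```python
import json
from pathlib import Path
from typing import Mapping, Optional

# Relative location of the config file, as documented in SKILL.md.
CONFIG_REL = Path(".clawdbot") / "credentials" / "searxng" / "config.json"
FALLBACK_URL = "http://localhost:8080"


def read_config_url(path: Path) -> Optional[str]:
    """Hypothetical helper: return the "url" value from a JSON config file, or None."""
    try:
        data = json.loads(path.read_text())
    except (OSError, ValueError):
        return None  # missing, unreadable, or malformed file
    url = data.get("url") if isinstance(data, dict) else None
    return url if isinstance(url, str) and url else None


def resolve_searxng_url(start: Path, home: Path, env: Mapping[str, str]) -> str:
    """Hypothetical helper: apply the documented lookup order and return one URL."""
    # 1. The environment variable takes precedence.
    if env.get("SEARXNG_URL"):
        return env["SEARXNG_URL"]

    # 2. Workspace config, walking upward from the starting directory.
    for base in [start, *start.parents]:
        url = read_config_url(base / CONFIG_REL)
        if url:
            return url

    # 3. Config file in the home directory.
    url = read_config_url(home / CONFIG_REL)
    if url:
        return url

    # 4. Nothing configured: fall back to the local default.
    return FALLBACK_URL
```

A caller would pass something like `Path(__file__).resolve().parent`, `Path.home()`, and `os.environ`. Returning early at each step keeps file paths and URLs from ever sharing one candidate list, so a missing config file cannot be mistaken for a URL.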