
LinkedIn Job Bot
A production automation tool that scrapes 8+ job boards (including LinkedIn, Wuzzuf, Bayt, GulfTalent, Remotive, and Himalayas) and broadcasts the results to a Telegram channel with 1,400+ subscribers. It uses asyncio.gather for concurrent scraping and SQLite for deduplication, and runs 24/7 in Docker on Railway, with no browser automation needed.
Timeline
2024 – Present
Role
Lead Developer
Status
In-progress
Technology Stack
Key Challenges
- Each job board has a different HTML structure and anti-scraping measures, requiring custom parsing logic and rotating headers.
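- Rate limiting Telegram API calls while posting batch results. Telegram's Bot FAQ cites roughly 30 messages/second for bulk sends spread across different chats, and a much lower rate (about 20 messages/minute) inside a single group, so posts to one channel have to be paced well below the 30 messages/second figure.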
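The rate limiting challenge above could be handled with a small throttle that spaces out sends. This is a minimal sketch, not the bot's actual code: `MinIntervalLimiter`, `post_batch`, and the `send` callable are hypothetical names, and the interval should be tuned to whichever Telegram limit applies.

```python
import asyncio
import time


class MinIntervalLimiter:
    """Hypothetical helper: allow at most one call per `interval` seconds."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self._next_slot = 0.0          # earliest monotonic time for the next send
        self._lock = asyncio.Lock()    # serializes concurrent callers

    async def wait(self) -> None:
        async with self._lock:
            now = time.monotonic()
            delay = self._next_slot - now
            if delay > 0:
                await asyncio.sleep(delay)
            # Reserve the following slot, measured from the slot just used.
            self._next_slot = max(now, self._next_slot) + self.interval


async def post_batch(messages, send, limiter):
    """Send messages in order, waiting for a free slot before each one.

    `send` is any async callable taking the message text (an assumption;
    in practice it would wrap a Telegram Bot API request).
    """
    for text in messages:
        await limiter.wait()
        await send(text)
```

Create the limiter inside the running event loop (for example at the top of the bot's main coroutine) and share it between all code paths that post to the same chat.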
Key Learnings
- asyncio.gather pays off for I/O-bound tasks: because the scrapers spend most of their time waiting on the network, scraping 8 boards concurrently takes roughly as long as the slowest single board rather than the sum of all of them.
- SQLite is underrated for single-process applications — it provides ACID guarantees without any server setup.
- Docker on Railway is a zero-ops deployment strategy that works surprisingly well for always-on automation bots.
The Problem
Job seekers in the Middle East and in the remote frontend market manually check 8+ job boards daily, including LinkedIn, Wuzzuf, Bayt, GulfTalent, Remotive, and Himalayas. This manual process wastes hours and causes qualified candidates to miss time-sensitive postings.
Technical Decisions
Concurrent Scraping: Python's asyncio.gather runs all scrapers concurrently, reducing total scrape time from minutes to seconds. Each scraper is implemented as an independent async function with its own retry logic and error handling.
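The concurrent pattern described above can be sketched as follows. This is an illustration under assumptions, not the bot's actual code: `with_retries` and `scrape_all` are hypothetical names, the retry policy (three attempts with exponential backoff) is invented for the example, and a shared retry wrapper stands in for the per-scraper retry logic the project uses.

```python
import asyncio


async def with_retries(fetch, attempts=3, base_delay=0.5):
    """Call the async function `fetch` until it succeeds or attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return await fetch()
        except Exception:
            if attempt == attempts:
                raise  # give up and let the caller record the failure
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def scrape_all(scrapers, base_delay=0.5):
    """Run every scraper concurrently.

    `scrapers` maps a board name to an async function returning a list of jobs.
    return_exceptions=True keeps one failing board from discarding the others.
    """
    names = list(scrapers)
    results = await asyncio.gather(
        *(with_retries(scrapers[name], base_delay=base_delay) for name in names),
        return_exceptions=True,
    )
    jobs, failed = [], []
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failed.append(name)
        else:
            jobs.extend(result)
    return jobs, failed
```

Because the scrapers overlap their network waits, the total run time is bounded by the slowest board (plus its retries) rather than by the sum of all boards.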
Deduplication: SQLite stores job fingerprints (hash of title + company + URL) to prevent duplicate postings. The database runs as a single file alongside the bot, requiring zero infrastructure setup.
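A minimal sketch of the fingerprint approach, using only the standard library. The table name `seen_jobs`, the SHA-256 hash, the normalization step, and the function names are assumptions made for illustration; the source only states that the fingerprint is a hash of title, company, and URL.

```python
import hashlib
import sqlite3


def fingerprint(title, company, url):
    """Hash title + company + URL into a stable hex digest."""
    # Normalizing case and whitespace is an assumption, so that trivially
    # different copies of the same posting collapse to one fingerprint.
    parts = (title.strip().lower(), company.strip().lower(), url.strip())
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def open_db(path="jobs.db"):
    """Open (or create) the single-file database next to the bot."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_jobs (fingerprint TEXT PRIMARY KEY)"
    )
    return conn


def is_new(conn, job):
    """Record the job and return True only the first time it is seen."""
    fp = fingerprint(job["title"], job["company"], job["url"])
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen_jobs (fingerprint) VALUES (?)", (fp,)
        )
    # rowcount is 1 when the row was inserted, 0 when the key already existed.
    return cur.rowcount == 1
```

Letting the PRIMARY KEY constraint reject duplicates keeps the check and the insert in one statement, so there is no window between "look up" and "store".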
Telegram Bot API: Results are formatted as rich Telegram messages with job title, company, location, and direct apply links. The channel has grown to 1,400+ subscribers organically.
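One way the message formatting could look, assuming the Bot API's HTML parse mode. The layout, the job dictionary keys, and the function names `format_job` and `build_payload` are hypothetical; only `chat_id`, `text`, and `parse_mode` are actual `sendMessage` parameters.

```python
from html import escape


def format_job(job):
    """Render one job as Telegram-flavoured HTML.

    Scraped text is escaped so characters like & or < cannot break the markup.
    """
    lines = [
        f"<b>{escape(job['title'])}</b>",
        f"{escape(job['company'])} | {escape(job['location'])}",
        f'<a href="{escape(job["url"])}">Apply</a>',
    ]
    return "\n".join(lines)


def build_payload(chat_id, job):
    """Build the JSON body for a sendMessage request."""
    return {
        "chat_id": chat_id,
        "text": format_job(job),
        "parse_mode": "HTML",
    }
```

The payload would then be posted to the bot's `sendMessage` endpoint with an HTTP client such as httpx.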
Docker + Railway: The bot runs 24/7 as a Docker container on Railway. Health checks and automatic restarts ensure continuous operation without manual intervention.
The Outcome
A production automation tool serving 1,400+ subscribers with daily job updates from 8+ boards. The entire system runs autonomously — no browser automation, no Selenium, no headless Chrome — just pure HTTP requests with httpx.