Managing My Open Source Repos with Autonomous AI Agents

Just found a way to squeeze even more out of my Claude Code subscription, and this trick also works with other coding subscriptions. The idea: make your AI coding agent work while you sleep.

The Problem

I maintain about 10 open source projects, mostly Cloudflare Workers tools like workers-qb, R2-Explorer, and workers-research. Keeping up with each of them takes time, and that’s just basic maintenance; forget about adding new features. There are always issues piling up, dependencies to update, tests to add, and documentation to improve.

Yes, I can open Claude and guide it through the process of fixing something, then reviewing it, then committing. But honestly, that’s the last thing I want to do after already spending a full day talking to Claude at work. A coding agent shouldn’t need me to be present for routine maintenance.

Here’s the gap that bothered me most: Claude is great at finding review issues, but for some reason I need to explicitly tell it “hey, review this PR you just made.” Only then will it catch problems. That manual loop (implement, then remind it to review, then fix, then remind it again) is what I wanted to eliminate.

The Approach

Instead of waiting for some perfect platform to solve this, I built my own lightweight automation. The key insight: you don’t need a complex platform, just a task queue, a scheduler, and an AI coding agent that can use git and GitHub CLI.

The whole thing works very simply: a kanban board and a schedule with a prompt. That’s it. I built a small tool called prodboard to glue it together. It’s a CLI-first issue tracker with a cron scheduler backed by SQLite. But you could do the same with GitHub Actions, a shell script, or anything that can run commands on a schedule. Every schedule spins up a sub-agent, a new Claude Code instance in a tmux terminal.
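That glue layer is small enough to sketch. Here is a minimal, illustrative version of the queue half in Python (the tasks table and column names are my own invention, not prodboard’s actual schema):

```python
import sqlite3

def init_board(db):
    # Illustrative schema: one row per kanban card.
    db.execute("""CREATE TABLE IF NOT EXISTS tasks (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'todo')""")

def claim_next(db, status="todo", new_status="in-progress"):
    # Select and re-status the oldest matching task inside one transaction,
    # so two concurrently scheduled agents don't grab the same task.
    with db:
        row = db.execute(
            "SELECT id, title FROM tasks WHERE status = ? ORDER BY id LIMIT 1",
            (status,)).fetchone()
        if row is None:
            return None
        db.execute("UPDATE tasks SET status = ? WHERE id = ?",
                   (new_status, row[0]))
        return row

db = sqlite3.connect(":memory:")
init_board(db)
db.execute("INSERT INTO tasks (title) VALUES ('Fix flaky test in workers-qb')")
print(claim_next(db))  # -> (1, 'Fix flaky test in workers-qb')
```

Each scheduled run then just calls `claim_next`, feeds the task title into the agent’s prompt, and updates the status when the agent finishes.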

I set up the entire pipeline in a single Claude Code conversation. All 5 agents, the issue templates, even a fix to the scheduler’s systemd service. From zero to fully autonomous in one sitting.

Cost: nothing extra. The whole thing runs on my existing $100/month Claude Max subscription. If you need Claude during the day for your actual work, you can even configure the cron jobs to run mostly at night.

The Architecture

The system runs as a systemd user service on a Linux machine. Five agents on cron schedules:

GitHub Open Source Contributor (hourly)

Picks a random repo with 20+ stars from my GitHub profile, checks for open issues or scans for TODOs and FIXMEs in the code, and implements a focused fix. It verifies no one else already has an open PR for the same work. Every PR must include tests. No tests, no PR. After pushing, it creates a review ticket on the board.
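The TODO-scanning half of that workflow is easy to reproduce outside the agent. A rough sketch of how candidates could be collected from a checkout (the file glob and marker list are assumptions):

```python
import re
from pathlib import Path

TODO_RE = re.compile(r"\b(TODO|FIXME)\b[:\s]*(.*)")

def scan_todos(root, glob="*.ts"):  # "*.ts" assumes a TypeScript repo
    """Collect (file, line number, note) for every TODO/FIXME under root."""
    hits = []
    for path in Path(root).rglob(glob):
        for lineno, line in enumerate(
                path.read_text(errors="ignore").splitlines(), 1):
            m = TODO_RE.search(line)
            if m:
                hits.append((str(path), lineno, m.group(2).strip()))
    return hits
```

The agent gets the hit list in its prompt, picks one focused item, and turns it into a PR.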

PR Code Review Agent (every 15 min)

Picks the next PR waiting for review and gates on CI. It won’t review if checks are still failing or pending. The review runs 5 independent perspectives: Correctness, Security, Performance, Code Quality, and Testing. At least 4 out of 5 must approve with no major or medium issues for the PR to pass. When it requests changes, it leaves detailed line-level comments.
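The 4-of-5 gate is plain vote aggregation. A sketch of how that decision could be encoded (the verdict and severity names are mine, not the actual prompt’s):

```python
PERSPECTIVES = ["correctness", "security", "performance", "quality", "testing"]
BLOCKING = {"major", "medium"}  # severities that disqualify an approval

def pr_passes(reviews, required=4):
    """reviews: {perspective: (approved: bool, worst_severity: str | None)}"""
    clean_approvals = sum(
        1 for approved, severity in reviews.values()
        if approved and severity not in BLOCKING
    )
    return clean_approvals >= required

reviews = {
    "correctness": (True, None),
    "security": (True, "minor"),   # minor issues don't block
    "performance": (True, None),
    "quality": (False, "major"),   # one failed perspective is tolerated
    "testing": (True, None),
}
print(pr_passes(reviews))  # 4 clean approvals out of 5 -> True
```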

Issue Worker (every 10 min)

Picks the next todo issue from the queue and validates it has enough information: a repo URL, a problem statement, affected code references. If anything is missing, it flags the issue for human attention with a specific comment about what’s needed. If the info is sufficient, it clones the repo, reads the codebase, implements the fix, and opens a PR.
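That validation gate is worth copying: check required fields before any expensive cloning or reading happens. A sketch with illustrative field names:

```python
REQUIRED_FIELDS = {
    "repo_url": "a repo URL",
    "problem": "a problem statement",
    "code_refs": "affected code references",
}

def validate_issue(issue):
    """Return [] if the issue is actionable, else the missing pieces."""
    return [label for field, label in REQUIRED_FIELDS.items()
            if not issue.get(field)]

issue = {"repo_url": "https://github.com/G4brym/workers-qb", "problem": "..."}
missing = validate_issue(issue)
if missing:
    # Flag for human attention with a specific comment, per the agent's rules.
    print(f"Needs human input: missing {', '.join(missing)}")
```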

Daily Summary (daily at 9 AM)

Generates a report covering issues solved, PRs opened, cost, and token usage. It creates this as a “done” issue so it doesn’t get picked up by other agents.

House Cleaning (hourly)

Checks all issues in human-approval status: merged PRs get moved to done, conflicting PRs get sent back to todo for a rebase. Keeps the board clean automatically.

The Issue Lifecycle

todo -> agent implements -> review -> code review agent checks
  ^                                        |
  +-------- sent back (needs fixes) <-----+
                                           |
                                           v (approved)
done <- house cleaning <- human merges <- human-approval
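One way to keep agents honest about this lifecycle is an explicit transition table; anything not listed is rejected. A sketch using the board’s own status names:

```python
# Allowed lifecycle transitions, and which agent triggers each one.
TRANSITIONS = {
    ("todo", "review"): "worker/contributor agent opens a PR",
    ("review", "todo"): "code review agent sends it back for fixes",
    ("review", "human-approval"): "code review agent approves",
    ("human-approval", "done"): "house cleaning sees the PR was merged",
    ("human-approval", "todo"): "house cleaning sends a conflicting PR back",
}

def move(status, new_status):
    if (status, new_status) not in TRANSITIONS:
        raise ValueError(f"illegal transition: {status} -> {new_status}")
    return new_status

print(move("review", "human-approval"))  # -> human-approval
```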

The human is always the final gatekeeper. Nothing gets merged without me clicking the merge button. The human-approval status also covers sensitive changes, or cases where Claude genuinely needs me to enter a credential somewhere. The agents propose, review, and clean up, but I make the final call.

Real Results

In the first 24 hours, the agents found real bugs across my repos.

Beyond bug fixes, the agents set up automated versioning with changesets and npm trusted publishing across about 10 repos in one batch. They opened 30+ PRs, each reviewed by the 5-perspective code review agent. Issues needing human judgment (missing context, ambiguous requirements, architectural decisions) were correctly flagged for my review.

That said, I’m not going to oversell this. Not every PR is perfect. Sometimes the agent produces a fix that’s technically correct but doesn’t match the design intent of the project. Sometimes it creates an issue that’s too vague to act on. But the hit rate is surprisingly high for routine maintenance work.

What Makes It Work

Here’s a simplified version of what a schedule prompt looks like:

You are an autonomous code reviewer. Pick the next PR in "review" status.
1. Check if CI passed. If not, request changes
2. Review from 5 perspectives: correctness, security, performance, quality, testing
3. If 4/5 approve with no major issues -> approve the PR
4. If issues found -> leave comments and send back for fixes

The actual prompts are longer, but the pattern is the same: structured steps, validation gates, and fallback behaviors. The structure is what makes agents reliable. Without clear guardrails, they wander.

Tips and Future Ideas

Night mode is the most practical tip I can share. Schedule your agents to run at night so your subscription capacity is free during work hours. You wake up to PRs ready for review instead of a blank board.
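In cron terms, night mode is just offset schedules. A hypothetical crontab sketch (the `claude -p` headless invocation and the prompt-file paths are my assumptions, not prodboard’s actual setup):

```
# Heavy agents run overnight; daytime capacity stays free for work.
0 23 * * *     claude -p "$(cat ~/prompts/contributor.md)"
*/15 0-6 * * * claude -p "$(cat ~/prompts/pr-review.md)"
0 9 * * *      claude -p "$(cat ~/prompts/daily-summary.md)"
```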

The same pattern works for reviewing external PRs from community contributors. Faster feedback means happier contributors. And cross-repo consistency is powerful: I batched changesets configuration across 10 repos in one issue. Dependency updates, release management, documentation: these are all natural extensions.

For popular repos, an agent could triage incoming issues: label, categorize, and attempt to reproduce bugs before a human ever looks at them.

Build Your Own

I’m not expecting anyone to use prodboard. Anyone looking at this should build their own version. That’s the whole point. The pattern matters more than any specific tool.

You need 4 things:

  1. A task queue: SQLite, GitHub Projects, Notion, even a text file
  2. A scheduler: cron, systemd timers, GitHub Actions
  3. An AI coding agent: Claude Code, Codex, Cursor CLI, whatever you prefer
  4. GitHub CLI for programmatic interaction with repos

The secret sauce is prompt engineering: clear steps, validation gates, fallback behaviors. Start with one cron job that picks a TODO and opens a PR. See how it goes. Add review. Add cleanup. Build up incrementally.

The best part? You can set up the whole thing by asking your AI coding agent to do it. That’s how I did it, in a single conversation.