Asynchronous CLI Agents in GitHub Actions

The CLI strikes back — your guide for delegating coding work to CLI agents running asynchronously in CI

Aug 22, 2025

This guide explores an alternative to cloud-hosted products for asynchronous AI-assisted coding: running command-line interface (CLI) agents within a Continuous Integration (CI) workflow. Specifically, we'll examine how tools like Claude Code, Gemini CLI, and opencode can be deployed as GitHub Actions to function as autonomous agents.

Through a practical journey of refactoring a repository, this guide details the setup, user experience, and outcomes of using these three agents. The conclusion is that while this approach requires more initial configuration than SaaS products, it offers significant advantages in flexibility, control, and model choice, making it a powerful option for developers and engineering teams.

See our previous article on using cloud-hosted asynchronous AI coding agents

CLI Coding Agents in GitHub Actions

While cloud-hosted platforms like GitHub Copilot Agent offer a streamlined experience, integrating a CLI coding agent into your CI pipeline presents a compelling alternative. This method involves configuring a tool to run non-interactively as part of a workflow, typically triggered by an event like a comment on a GitHub issue.
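A minimal sketch of that pattern is shown below. The trigger phrase `@agent`, the workflow file name, and the script `./scripts/run-agent.sh` are hypothetical placeholders standing in for whichever CLI agent you install; they do not come from any specific tool.

```yaml
# .github/workflows/agent.yml (hypothetical file name)
name: CLI agent

on:
  issue_comment:
    types: [created]   # fires for new comments on issues and on pull requests

jobs:
  agent:
    # Only run when the comment mentions the (hypothetical) trigger phrase.
    if: contains(github.event.comment.body, '@agent')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run the agent non-interactively
        # Pass the comment through an environment variable instead of
        # interpolating it into the script, to avoid shell injection.
        env:
          REQUEST: ${{ github.event.comment.body }}
        run: ./scripts/run-agent.sh "$REQUEST"   # placeholder for the real CLI call
```

The job-level `if` keeps the workflow from spending runner time and API credits on comments that are not addressed to the agent.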

Advantages

Choice and Flexibility: You can select from a wide range of open-source and commercial CLI agents. Open-source tools like opencode further expand your options by allowing you to use virtually any model from any provider, from open models like Qwen to proprietary ones like Claude or GPT-5.
Control and Security: The agent operates within your own CI environment — whether it's a GitHub-hosted runner or your own self-hosted infrastructure. This gives you complete control over the execution environment and can address specific security or compliance requirements.
Configurability: The agent's behavior is defined in a workflow .yml file, giving you fine-grained control over its prompts, triggers, permissions, and the specific commands it executes.

Trade-Offs

Setup Complexity: Installation is more involved than simply enabling a cloud service. It typically requires adding API keys as repository secrets, configuring fine-grained permissions and access control for the execution environment, and committing workflow files to your repository.
User Experience: The user interface is limited to the output of the GitHub Action run and comments posted to issues or pull requests. This is generally less polished than the dedicated UIs of cloud-hosted products. However, as these agents operate asynchronously, an elaborate UI is often unnecessary once the system is functioning correctly.
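In practice, the secrets and permissions mentioned above usually come down to a few lines in the job definition. The fragment below is an illustrative sketch only: the secret name `AGENT_API_KEY`, the script path, and the particular permission scopes are assumptions and should be adjusted to what the chosen agent actually needs.

```yaml
jobs:
  agent:
    runs-on: ubuntu-latest      # or the label of a self-hosted runner
    # Scope the job's GITHUB_TOKEN to only what the agent needs.
    permissions:
      contents: write           # push commits to a branch
      issues: write             # post comments on issues
      pull-requests: write      # post comments on pull requests
    steps:
      - uses: actions/checkout@v4
      - name: Run the agent
        env:
          # Hypothetical secret, added beforehand to the repository's Actions secrets.
          AGENT_API_KEY: ${{ secrets.AGENT_API_KEY }}
        run: ./scripts/run-agent.sh   # hypothetical wrapper around the CLI
```

Declaring `permissions` explicitly means the token gets only the listed scopes, which is the safer default for a job that executes model-generated commands.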

A Refactoring Journey with CLI Agents

To evaluate this approach, I used several CLI agents to perform a series of well-defined refactoring tasks on a project repository. The setup for each involved adding the necessary API key(s) as secrets to the GitHub repository.

Claude Code

Claude Code was the first agent to popularize this CI-based workflow.

Setup

The setup process was remarkably smooth.

I ran the /install-github-app command locally in Claude Code.
This command opened a browser window, guiding me through the process of installing the Claude GitHub App on my repository.
Once authorized, Claude automatically created a pull request to add the necessary GitHub Actions workflow files to my repository.
After reviewing and merging the PR, the setup was complete.

Execution and Experience

To trigger the agent, I commented @claude on a GitHub issue, followed by my request.
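For orientation, a workflow that reacts to such a mention might look roughly like the sketch below. It is written from memory of the public `anthropics/claude-code-action` README, not copied from the PR the installer generated. The version tag, the secret name `ANTHROPIC_API_KEY`, and the exact filter expression are assumptions, so treat the generated files as authoritative.

```yaml
name: Claude Code

on:
  issue_comment:
    types: [created]

jobs:
  claude:
    # Skip comments that do not mention the agent.
    if: contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Claude Code
        uses: anthropics/claude-code-action@v1   # version tag is an assumption
        with:
          # Assumed secret name; the key is stored as a repository secret.
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
```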

The user experience was highly polished. Claude promptly replied with a comment confirming it was working, including a link to the live job run. It then posted another comment with a detailed, itemized to-do list, which it updated in real time as it progressed. This provided excellent visibility into its process. The commits and comments were made by a dedicated "claude" user, making its contributions easy to identify.

While the raw output in the GitHub Actions log was a verbose stream of JSON, the well-formatted updates in the issue itself were more than sufficient for monitoring its work.

Outcome

After about 15 minutes, Claude commented that it had finished and provided a link to create a pull request. The initial PR had failing CI tests. However, when I commented on the PR and asked Claude to fix the errors, it identified the problem, created a new plan, and pushed a fix. The final PR was correct and ready to merge. The only significant limitation is that this tool works exclusively with Anthropic's Claude models.