An attempt to run Chromium automation with bare CDP commands.
NOTE: Breaking changes may occur until the API is finalized.
gopilot is my attempt to provide a simple, minimalistic API for automating Chromium browsers. It's not meant to be another Puppeteer. Instead, it's focused on the essential features most users need for straightforward browser tasks—no fluff, just what you need.
Under the hood gopilot uses github.com/mafredri/cdp for chrome communication, inspired by gRPC provides a really nice and easy API.
I wanted to simplify browser automation by sticking to the core functionalities that most of us use:
- Navigation to web pages
- Clicking on elements
- Typing text
- Taking screenshots
- Extracting HTML content
I’ve also added some features for intercepting requests, which is handy if you want to cancel or grab AJAX info. Overall, gopilot aims to be a lightweight tool that doesn’t bog you down with unnecessary complexity.
- Headfull mode support: Designed to run as headful and compatible with Docker using Xvfb for display.
- Headless mode: Easily switch to headless operation when needed.
- Navigate to a specified URL
- Query Selector to find elements on the page
- Click on elements
- Extract HTML content from the page
- Intercept (Needs rework in order to allow modifying the request) network requests for those who want to dig deeper
- Set, get, and clear cookies
Here's a very basic example of how to use gopilot to open a URL:
package main
import (
"context"
"log/slog"
"os"
"os/signal"
"time"
"github.com/falmar/gopilot/pkg/gopilot"
)
func main() {
ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt, os.Kill)
defer cancel()
logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: slog.LevelDebug,
}))
cfg := gopilot.NewBrowserConfig()
b := gopilot.NewBrowser(cfg, logger)
err := b.Open(ctx, &gopilot.BrowserOpenInput{})
if err != nil {
logger.Error("unable to open browser", "error", err)
return
}
defer b.Close(ctx)
pOut, err := b.NewPage(ctx, &gopilot.BrowserNewPageInput{})
if err != nil {
logger.Error("unable to open page", "error", err)
return
}
page := pOut.Page
defer page.Close(ctx)
_, err = page.Navigate(ctx, &gopilot.PageNavigateInput{
URL: "https://www.google.com",
WaitDomContentLoad: true,
})
if err != nil {
logger.Error("unable to navigate", "error", err)
return
}
time.Sleep(2 * time.Second)
// do some magic ...
}
For more practical examples of how to use gopilot, check out the examples provided:
By default, gopilot runs in headful mode, which may require a display server when running in a Docker container. To
switch to headless mode, simply call the EnableHeadless
method on the BrowserConfig
object. You can start the
browser in headless mode as follows:
package main
import "github.com/falmar/gopilot/pkg/gopilot"
func main() {
// EnableHeadless will make the browser start as headless
cfg := gopilot.NewBrowserConfig()
cfg.EnableHeadless()
// ...
}
// which is basically:
func (c *BrowserConfig) EnableHeadless() {
c.AddArgument("--headless=new")
}
- Allow users to input an external browser endpoint
- Taking screenshots of web pages and elements (yes, just element bounding box)
- Setting, getting, and clearing local storage
- Typing text into input fields
Contributions are welcome! If you've got a feature request or an idea to share, reach out.