The Challenge #
I have a python library with a CLI, built using the awesome typer library. The dependencies are managed with poetry and the users are mostly gophers with a robust dislike of python.
Getting your own python code working is usually pretty straightforward. Getting someone else’s python code working is usually a bit of a nightmare. Getting the right python version installed, configuring venvs, using their package / dependency manager, etc. etc.
“Wouldn’t it be great if there was a homebrew package?!”
Yes, it surely would. But how do we get from here (python code in a private github repo) to there (packaged binaries in a ruby-based packaging system)?
TL;DR: Example code in my github repo
Assemble the pieces #
Homebrew docs have a page on building python formulae but it is very quiet on the subject of private github repos, doesn’t mention poetry at all, and refers to homebrew-pypi-poet which was last released 6 years ago when python 3.6 was still considered new and cool. Well, “new” anyway.
This blog post got me to an initial build of a Cask (pre-built zipped “binary” built using pyinstaller).
It was a start, and did at least provide a way for folks to download a packaged build that would run using its own embedded python binary. No need to manage a python install on the machine.
Note: throughout this project I rely on github release assets, so these steps only work after a new github release has been created. They work fine if it is a draft release, but it still needs to be a release rather than arbitrary files in the repo.
Version 1.0 - a Cask from a private repo #
Let’s say you have a github org called “my_org_name”, with a python repo called “pybinary_repo” and we’re going to create an executable called “pybinary”.
This needs the gh formula installed, and credentials for your private repo(s):
gh auth login --hostname github.com -p https -w
Now you can set up a Tap (a github repo with a homebrew- prefix), e.g. homebrew-tools, and add it to homebrew with brew tap my_org_name/tools
Then in homebrew-tools, create Cask/pybinary.rb:
ORG = "my_org_name"
REPO = "pybinary_repo"
cask "pybinary" do
  # Update version and sha256 to release a new Cask
  version "1.0"
  sha256 "cbeafe76301d2f814487ee6631bc0cbf0708d90034c8a3ab3b8be7a0840aa029"
  depends_on formula: "gh"
  # Look up the release by tag, then download the zip asset through the GitHub API
  url do
    assets = GitHub.get_release(ORG, REPO, "v#{version}").fetch("assets")
    zip_url = assets.find { |a| a["name"] == "pybinary-#{version}.zip" }.fetch("url")
    [zip_url, header: [
      "Accept: application/octet-stream",
      "Authorization: bearer #{GitHub::API.credentials}"
    ]]
  end
  name "pybinary"
  desc "Epic CLI tool"
  homepage ""
  # Documentation: https://docs.brew.sh/Brew-Livecheck
  livecheck do
    url "https://github.com/#{ORG}/#{REPO}/releases"
  end
  binary "pybinary-#{version}/pybinary"
  caveats do
    "Please run 'xattr -r -d com.apple.quarantine #{staged_path}' to remove the quarantine flag"
  end
  postflight do
    ohai "Removing quarantine flag"
    system_command "/usr/bin/xattr", args: ["-r", "-d", "com.apple.quarantine", staged_path]
    ohai "Unpacking the PyBinary CLI tool"
    # Actually try running it, to spot any problems with packaging asap:
    system_command "#{staged_path}/pybinary-#{version}/pybinary", args: ["--version"]
  end
  # Documentation: https://docs.brew.sh/Cask-Cookbook#stanza-zap
  zap trash: ""
end
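The url block in the Cask is the least obvious part: it finds the release asset by name and hands homebrew the asset's API url plus the headers needed to fetch it from a private repo. Here is a rough Python sketch of that lookup, assuming a release payload shaped like the one the GitHub releases API returns (an "assets" list whose entries have "name" and "url"). The function names and the example payload are made up for illustration; they are not part of homebrew or gh.

```python
from typing import Any


def find_asset_url(release: dict[str, Any], asset_name: str) -> str:
    """Return the API url of the release asset called asset_name."""
    for asset in release["assets"]:
        if asset["name"] == asset_name:
            return asset["url"]
    raise LookupError(f"No release asset named {asset_name!r}")


def download_headers(token: str) -> list[str]:
    """Headers that ask the API for the binary itself rather than JSON metadata."""
    return [
        "Accept: application/octet-stream",
        f"Authorization: bearer {token}",
    ]


# Made-up payload for illustration only; a real one has many more fields.
example_release = {
    "tag_name": "v1.0",
    "assets": [
        {
            "name": "pybinary-1.0.zip",
            "url": "https://api.github.com/repos/my_org_name/pybinary_repo/releases/assets/1",
        },
    ],
}

zip_url = find_asset_url(example_release, "pybinary-1.0.zip")
```

If the asset name in the release does not match the name the Cask asks for, the lookup fails, so the zip file name and the version in the Cask have to stay in step.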
Creating the zip file that is going to become the Cask is fairly straightforward, assuming you already have a working venv and poetry package defined for your code.
- Install pyinstaller (if you don’t already have it):
  poetry add pyinstaller
- Bundle the script into a versioned directory (assuming your code is in src/main.py, otherwise use the actual path to the main script):
  poetry run pyinstaller src/main.py -n pybinary-1.0
- Change into the “dist” directory (output from pyinstaller):
  cd dist
- Rename the main executable back to its base name:
  mv pybinary-1.0/pybinary-1.0 pybinary-1.0/pybinary
- Zip up the code:
  zip -r pybinary-1.0.zip pybinary-1.0
You could add this to a Makefile to simplify the process:
POETRY_VERSION=1.8
POETRY=$(shell PATH="$$HOME/bin:$$PATH" command -v poetry${POETRY_VERSION})
VER=$(shell PATH="$$HOME/bin:$$PATH" poetry${POETRY_VERSION} version -s)
CLI_VER=pybinary-${VER}
# Recipe lines must start with a tab
dist/${CLI_VER}.zip:
	${POETRY} run pyinstaller src/main.py -n ${CLI_VER} \
	&& cd dist \
	&& ${CLI_VER}/${CLI_VER} --version \
	&& mv ${CLI_VER}/${CLI_VER} ${CLI_VER}/pybinary \
	&& zip -r ${CLI_VER}.zip ${CLI_VER}
cli-zip: dist/${CLI_VER}.zip
I have several poetry versions installed because I work on repos owned by other people who haven’t upgraded yet. Poetry is generally stable across patch releases, but almost always has incompatibilities across minor versions, which means it works fine to have poetry 1.8.3 installed as poetry1.8 in my PATH.
I make life easier by explicitly setting this repo’s required poetry version in the Makefile and checking it exists before doing anything else.
With this setup, I can run make cli-zip and it will build a zip with the current version number from the poetry configuration.
When building a release, I run make cli-zip and add the file to the assets in github.
Then update the version tag and sha256 in the pybinary.rb cask file, save, commit, and push the homebrew-tools repo.
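The sha256 in the cask file is the digest of the zip you uploaded. On macOS shasum -a 256 will print it; if you would rather stay in python, a minimal sketch looks like this (the function name and the example path in the comment are placeholders, not something from my repo):

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so a large zip does not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example (assumes the zip from the steps above exists):
#   print(sha256_of(Path("dist/pybinary-1.0.zip")))
```

The value it prints is what goes on the sha256 line of the Cask.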
Users of the code need to initially install the tap and can then install the cask:
brew install gh
gh auth login --hostname github.com -p https -w
brew tap my_org_name/tools
brew install pybinary
When the release is updated, a brew update; brew upgrade will install the new version.
This works, but it feels kinda ugly. The zipfile can get big, and the workaround to remove the quarantine flags is a bit of a hack.
Can we do it with a Formula instead of a Cask, and build a “proper” homebrew bottle? Yes.
I have now removed the Cask definition from my own repo, but it may be good enough for other folks so I’m happy to share what I learned.
Building a python Formula #
This is where our packaging process gets about 80% more complex in order to provide a 20% nicer experience for the end user.
In order to build from source, we cannot use the same url hack to download the release assets from github. Or at least, I couldn’t make that work.
Which means we need our own downloader implementation, based on the existing Curl download strategy. This is informed by, but simpler than, other folks’ solutions.
I call it repo.rb:
# Custom downloader for private repo.
class GitHubPrivateRepositoryReleaseDownloadStrategy < CurlDownloadStrategy
  def initialize(url, name, version, **meta)
    parse_url_pattern(url)
    super
  end
  # Split the bottle url into owner, repo, release tag and asset filename
  def parse_url_pattern(url)
    url_pattern = %r{https://github.com/([^/]+)/([^/]+)/releases/download/([^-]+)-([0-9.]+)(\.arm\S+)}
    unless url =~ url_pattern
      raise CurlDownloadStrategyError, "Invalid url pattern for GitHub Release."
    end
    _, @owner, @repo, pkg, version, filename = *url.match(url_pattern)
    @tag = "v#{version}"
    # The uploaded bottle file has a double hyphen between name and version
    @filename = "#{pkg}--#{version}#{filename}"
  end
  def download_url
    "https://api.github.com/repos/#{@owner}/#{@repo}/releases/assets/#{asset_id}"
  end
  private
  def _fetch(url:, resolved_url:, timeout:)
    # HTTP request header `Accept: application/octet-stream` is required.
    # Without this, the GitHub API will respond with metadata, not binary.
    curl_download download_url,
                  "--header", "Accept: application/octet-stream",
                  "--header", "Authorization: Bearer #{GitHub::API.credentials}",
                  to: temporary_path, timeout: timeout
  end
  def asset_id
    @asset_id ||= resolve_asset_id
  end
  def resolve_asset_id
    release_assets = fetch_release_assets
    assets = release_assets.select { |a| a["name"] == @filename }
    raise CurlDownloadStrategyError, "Asset file not found." if assets.empty?
    assets.first["id"]
  end
  def fetch_release_assets
    GitHub.get_release(@owner, @repo, @tag).fetch("assets")
  end
end
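The regex in parse_url_pattern is doing more work than it looks like. It pulls the owner, repo, package name and version out of the bottle url, derives the release tag, and rebuilds the asset filename with the double hyphen that brew bottle puts in the file it creates. Here is the same pattern in Python so you can poke at it outside of homebrew. The example url is my guess at what homebrew requests, judging by what the pattern expects; the function name is made up.

```python
import re

# Same pattern as repo.rb: owner / repo / <pkg>-<version><suffix starting ".arm">
URL_PATTERN = re.compile(
    r"https://github.com/([^/]+)/([^/]+)/releases/download/([^-]+)-([0-9.]+)(\.arm\S+)"
)


def parse_bottle_url(url: str) -> dict[str, str]:
    """Split a bottle url into the pieces the downloader needs."""
    match = URL_PATTERN.match(url)
    if match is None:
        raise ValueError("Invalid url pattern for GitHub Release.")
    owner, repo, pkg, version, suffix = match.groups()
    return {
        "owner": owner,
        "repo": repo,
        "tag": f"v{version}",  # release tags are assumed to look like v1.0
        "filename": f"{pkg}--{version}{suffix}",  # double hyphen, as uploaded
    }


parts = parse_bottle_url(
    "https://github.com/my_org_name/pybinary_repo/releases/download/"
    "pybinary-1.0.arm64_sonoma.bottle.tar.gz"
)
```

Two limits fall out of the pattern: the package name cannot contain a hyphen, and only bottles whose suffix starts with .arm will match. That is fine for my arm64 macs, but you would need to loosen it for other platforms.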
Now we can build a Formula/pybinary.rb:
# Homebrew Formula for pybinary
# We need a custom downloader to use github release assets from our private repos
require_relative 'repo.rb'
class Pybinary < Formula
  include Language::Python::Virtualenv
  desc "Epic CLI tool"
  homepage ""
  url "https://github.com/my_org_name/pybinary_repo.git",
    branch: "main",
    tag: "v1.0"
  license ""
  depends_on "python@3.12"
  depends_on "rust" => :build
  depends_on "python-setuptools" => :build
  bottle do
    root_url "https://github.com/my_org_name/pybinary_repo/releases/download",
      using: GitHubPrivateRepositoryReleaseDownloadStrategy
    sha256 cellar: :any, arm64_sonoma: "4a922d718e7e616ab4f59eb4615ec78d2b70c96d4b737d4c2a1d8e5df716d675"
  end
  # Pull in the generated resource stanzas from resources.rb, next to this file
  eval(IO.read(File.join(File.expand_path(File.dirname(__FILE__)), 'resources.rb')))
  def install
    # Handle changes to clang / xcode paths in most recent xcode update.
    # Without this, grpcio fails to build due to missing cstddef include.
    ENV.append_to_cflags "-I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1"
    virtualenv_install_with_resources
  end
  test do
    # Not needed
    system "false"
  end
end
Oh wait, what’s that resources.rb?
Oh yeah, this is where it gets fun.
Resources? Dependencies? Requirements? “DLL Hell is back, baby!” #
The homebrew python docs offer this helpful advice:
You can use brew update-python-resources to help you write resource stanzas.
What does it do? It queries the names of installed python modules, then goes and grabs the latest version of each, whether or not they are compatible with each other.
Great, except for things like the protobuf library, which must be a matching version for any other modules relying on protobufs (e.g. any google client code).
It’s pretty disappointing to watch a build run for 10-20 minutes, succeed, and then the resulting code crashes on startup.
One frustrating part of this is that we already have poetry to do all the dependency resolution, and it already has a list of specific versions of every module which will satisfy all the constraints. But the homebrew code has no idea about poetry, nor any way to input the same constraints that we have in pyproject.toml.
What’s the workaround?
Export the poetry.lock file to requirements.txt format (which has very simple module_name==version constraints and a list of matched sha256 digests for the downloadable packages of the module):
poetry export -o requirements.txt
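It helps to see what one exported entry looks like before trying to parse it. Each requirement is one logical line spread over several physical lines with backslash continuations: the pinned module, an optional environment marker after a semicolon, then one --hash per downloadable file. A minimal sketch of reading that format follows; the package name and hashes are made up, and the helper names are mine.

```python
import re

# A made-up entry in the shape poetry export produces (hashes are fake).
EXAMPLE = (
    'example-pkg[extra]==1.2.3 ; python_version >= "3.9" \\\n'
    f"    --hash=sha256:{'a' * 64} \\\n"
    f"    --hash=sha256:{'b' * 64}\n"
)


def logical_lines(text: str) -> list[str]:
    """Join backslash continuations and drop blank lines and comments."""
    joined = text.replace("\\\n", " ")
    return [
        line.strip()
        for line in joined.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def parse_requirement(line: str) -> tuple[str, str, list[str]]:
    """Return (name, version, sha256 digests) for one logical line."""
    spec = line.split()[0].split(";")[0]  # "name[extras]==version"
    name, version = spec.split("==")
    name = name.split("[")[0]  # ignore package extras
    hashes = re.findall(r"--hash=sha256:([0-9a-f]{64})", line)
    return name, version, hashes


parsed = [parse_requirement(line) for line in logical_lines(EXAMPLE)]
```

The digests matter later: they are how we pick the exact file on PyPI that poetry already resolved, rather than whatever happens to be newest.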
Then use something like make_resources.py:
import subprocess
from pathlib import Path
from typing import Optional
import requests
import typer
from rich.console import Console
app = typer.Typer()
console = Console()
errs = Console(stderr=True)
# Assumption: this runs from the root of the tap repo. Point it at your own Formula directory.
FORMULA_DIR = Path("Formula")
@app.command()
def brew_resources(file: Optional[Path] = None, output: Optional[Path] = None) -> None:
    """Outputs brew resource stanzas from requirements.txt."""
    if file is None:
        subprocess.run(["make", "requirements.txt"], check=True)
        file = Path("requirements.txt")
    if output is None:
        output = FORMULA_DIR / "resources.rb"
    pypi_base_url = "https://pypi.org/pypi"
    seen = set()
    resource_blocks = []
    with open(file, "r", encoding="utf-8") as infh:
        for line in infh:
            line = line.strip()
            if line.startswith("-e") or line.startswith("#") or line == "":
                continue
            # Join continuation lines so the --hash entries end up on the same line
            while line.endswith("\\"):
                line = line[:-1].rstrip() + " " + next(infh).strip()
            if " ; " in line:
                pkg_spec, constraints = line.split(" ; ", 1)
            else:
                # No environment marker: everything after the pin is hashes
                pkg_spec, _, constraints = line.partition(" ")
            try:
                pkg_name, pkg_version = pkg_spec.split("==")
            except ValueError:
                errs.print(f"Skipping invalid line: {line}")
                continue
            # Ignore package extras
            if "[" in pkg_name:
                pkg_name, *_ = pkg_name.split("[")
            if pkg_name in seen:
                continue
            seen.add(pkg_name)
            pkg_info_url = f"{pypi_base_url}/{pkg_name}/{pkg_version}/json"
            response = requests.get(pkg_info_url, timeout=10)
            if response.status_code != 200:
                errs.print(f"Failed to fetch package info for {pkg_name}=={pkg_version} using {pkg_info_url}")
                continue
            pkg_info = response.json()
            selected_url = None
            # I prefer the pre-built wheels, especially for google client code.
            # The sdist tar.gz files can be huge, and I'm trying for a quick, simple, repeatable build.
            for suffix in ["-none-any.whl", ".tar.gz"]:
                selected_url = next(
                    (
                        url_info
                        for url_info in pkg_info["urls"]
                        if url_info["url"].endswith(suffix) and url_info["digests"]["sha256"] in constraints
                    ),
                    None,
                )
                if selected_url:
                    break
            if not selected_url:
                errs.print(f"No distribution found for {pkg_name}=={pkg_version}")
                continue
            download_url = selected_url["url"]
            sha256 = selected_url["digests"]["sha256"]
            resource_block = (
                f'resource "{pkg_name}" do\n' f'  url "{download_url}"\n' f'  sha256 "{sha256}"\n' "end\n\n"
            )
            resource_blocks.append(resource_block)
    with open(output, "w", encoding="utf-8") as outfh:
        outfh.writelines(resource_blocks)
    console.print(f"{len(resource_blocks)} resources written to {output}")
if __name__ == "__main__":
    app()
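The part of that script worth pulling out is how it picks which file to download for each module: prefer a pure-python wheel, fall back to the sdist, and only accept a file whose sha256 is one that poetry locked. Here is that selection step on its own, as a sketch against the "urls" list from PyPI's JSON API. The entries below are fake, and I check the digest against a set rather than searching the constraints string as the script does, which is a slightly stricter variation.

```python
from typing import Any, Optional

# Order matters: pure-python wheels first, source tarballs as a fallback.
PREFERRED_SUFFIXES = ("-none-any.whl", ".tar.gz")


def select_distribution(
    urls: list[dict[str, Any]], allowed_hashes: set[str]
) -> Optional[dict[str, Any]]:
    """Pick the first file with a preferred suffix and a locked sha256."""
    for suffix in PREFERRED_SUFFIXES:
        for info in urls:
            if info["url"].endswith(suffix) and info["digests"]["sha256"] in allowed_hashes:
                return info
    return None


# Fake entries in the shape of the "urls" list (real ones carry more fields).
example_urls = [
    {"url": "https://files.example/example_pkg-1.2.3.tar.gz", "digests": {"sha256": "c" * 64}},
    {"url": "https://files.example/example_pkg-1.2.3-cp312-cp312-macosx_11_0_arm64.whl", "digests": {"sha256": "d" * 64}},
    {"url": "https://files.example/example_pkg-1.2.3-py3-none-any.whl", "digests": {"sha256": "e" * 64}},
]

chosen = select_distribution(example_urls, {"c" * 64, "d" * 64, "e" * 64})
```

Note that a platform-specific wheel is never chosen. A module that only ships those gets built from its sdist, which is part of why the formula has rust as a build dependency.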
During a package release, you can now build a bottle of the homebrew formula:
- Update the resources by exporting from poetry to requirements.txt then building resources.rb
- Update the version in Formula/pybinary.rb to match the github release tag.
HOMEBREW_NO_INSTALL_FROM_API=1 brew install --build-bottle my_org_name/tools/pybinary
HOMEBREW_NO_INSTALL_FROM_API=1 brew bottle my_org_name/tools/pybinary
- Add the newly created pybinary--*.bottle.tar.gz to the github release assets.
  - I could not seem to prevent the creation of a file with a double hyphen.
  - But that is what homebrew expects, and it works.
- Update Formula/pybinary.rb to update the sha256 cellar digest to match the new bottle. This enables users to quickly install a pre-packaged build and save themselves 10+ minutes of building from source.
- Save, commit, and push the tap repo.
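Editing the digest by hand is the step I am most likely to fumble, so it is a candidate for scripting. A minimal sketch, assuming the bottle line looks like the one in the formula above (sha256 cellar: ..., <os tag>: "<digest>"); the function name and arguments are hypothetical:

```python
import re


def set_bottle_sha256(formula_text: str, os_tag: str, new_digest: str) -> str:
    """Replace the bottle digest for os_tag and return the new formula text."""
    if not re.fullmatch(r"[0-9a-f]{64}", new_digest):
        raise ValueError("new_digest must be 64 lowercase hex characters")
    pattern = re.compile(
        rf'^(\s*sha256 cellar: [^\n]*\b{re.escape(os_tag)}: )"[0-9a-f]{{64}}"',
        re.MULTILINE,
    )
    updated, count = pattern.subn(rf'\g<1>"{new_digest}"', formula_text)
    if count != 1:
        raise ValueError(f"Expected exactly one bottle line for {os_tag}, found {count}")
    return updated


example_formula = (
    "  bottle do\n"
    '    sha256 cellar: :any, arm64_sonoma: "4a922d718e7e616ab4f59eb4615ec78d2b70c96d4b737d4c2a1d8e5df716d675"\n'
    "  end\n"
)
updated_formula = set_bottle_sha256(example_formula, "arm64_sonoma", "0" * 64)
```

It refuses to change anything unless it finds exactly one matching line, so a typo in the os tag fails loudly instead of silently leaving the old digest in place.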
Again, users just set up their gh credentials, add the my_org_name/tools tap, and brew install pybinary.
But this time, it’s using the brew-managed python binary and standard pre-built wheels (python module packages) for as many dependencies as possible.