Building Golang Projects With Bazel

Introduction

Bazel is an excellent open-source build system from Google. Its positioning, in the official words, is:

a fast, scalable, multi-language and extensible build system

To build Golang projects with Bazel, you need to understand not only Bazel itself but also rules_go, the rule set that adds Go support. You can also use Gazelle to automate much of the boilerplate.

Author: Muniao’s Notes (https://www.qtmuniao.com). Please credit the source when reposting.

Bazel Feature Overview

Let’s go through Bazel’s four advertised features one by one.

Fast

Bazel’s build process is very fast, incorporating common acceleration techniques from previous build systems. These include:

  1. Incremental compilation. Bazel rebuilds only what is necessary: by analyzing dependencies, it recompiles only the modified files and the targets that depend on them.
  2. Parallel compilation. Independent parts of the build run in parallel. You can set the number of parallel jobs with --jobs, usually to the number of CPUs on your machine. On large projects, Bazel at full throttle can max out every core.
  3. Distributed/local caching. Bazel treats a build as a pure function: given the same inputs, the output is deterministic and does not change with the build environment (though this requires some constraints). The outputs of individual modules can therefore be cached and shared across machines, which gives a significant speedup on very large projects.
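
The caching idea in item 3 can be sketched in a few lines of Python. This is a toy model, not Bazel's implementation: ActionCache, run_action, and fake_compile are hypothetical names, and a real build system also folds the toolchain, flags, and environment into the cache key.

```python
import hashlib


class ActionCache:
    """Toy content-addressed cache: the key is a digest of an action's inputs."""

    def __init__(self):
        self._store = {}
        self.executions = 0  # how many times an action actually ran

    @staticmethod
    def _key(command, inputs):
        # Hash the command plus every input name and content, in a stable order.
        h = hashlib.sha256()
        h.update(command.encode())
        for name in sorted(inputs):
            h.update(b"\0" + name.encode() + b"\0" + inputs[name])
        return h.hexdigest()

    def run_action(self, command, inputs, action):
        key = self._key(command, inputs)
        if key not in self._store:  # cache miss: execute and remember the output
            self._store[key] = action(inputs)
            self.executions += 1
        return self._store[key]


def fake_compile(inputs):
    # Stand-in for a compiler: a deterministic function of its inputs.
    return b"".join(inputs[name] for name in sorted(inputs)).upper()


cache = ActionCache()
srcs = {"main.go": b"package main"}
first = cache.run_action("compile", srcs, fake_compile)
second = cache.run_action("compile", srcs, fake_compile)  # same inputs: cache hit
srcs["main.go"] = b"package main // edited"
third = cache.run_action("compile", srcs, fake_compile)   # input changed: re-run
```

Because the key depends only on the inputs, the same store could in principle live on a shared server, which is the intuition behind a distributed cache.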

Scalable

Bazel claims to handle projects of any scale, whether ultra-large monorepos or multi-repo projects with numerous libraries. Bazel can also be easily integrated into CI/CD pipelines and leverage distributed environments for cloud builds.

Bazel compiles inside a sandbox that isolates all build dependencies. When building Golang projects, for example, it does not rely on your local GOPATH, so the same source code produces the same output in different environments. This property is known as build determinism.

Multi-language

If different modules of a project use different languages, Bazel lets you manage external and internal dependencies in a consistent style. A typical example is Ray. The project implements its core scheduling components in C++ and exposes multi-language APIs in Python and Java, with all of these modules living in a single repo. A layout like this is hard to integrate with most build tools, but Bazel handles it with ease. The Ray repo is worth exploring for the details.

Extensible

Bazel’s build language is Starlark, a dialect of Python, and it is highly expressive. At a small scale, users can define custom rules (similar to functions in general-purpose languages) to reuse build logic. At a large scale, third-party developers can write rule sets that adapt Bazel to new languages or platforms, such as rules_go. Bazel does not natively support building Golang projects, but once rules_go is loaded, you can manage Golang projects in the same consistent style.

Bazel Main Files

Projects managed by Bazel typically contain the following Bazel-related files: WORKSPACE, BUILD(.bazel), *.bzl, and .bazelrc. WORKSPACE and .bazelrc live in the project root; a BUILD.bazel file is placed in every folder that should be a package (including the root); *.bzl files can go wherever you prefer, usually in a dedicated folder under the project root (for example, a build folder).

WORKSPACE

  1. Defines the project root directory and project name.
  2. Loads Bazel tools and rule sets.
  3. Manages external dependency libraries for the project.

BUILD(.bazel)

This file declares the build targets of the package in its folder and names their dependencies by label. For Go, the targets are typically go_binary, go_test, or go_library.

Earlier versions of Bazel used the filename BUILD, but on case-insensitive file systems it was easily confused with a build directory. The explicit name BUILD.bazel was therefore introduced. If both exist in a directory, Bazel prefers BUILD.bazel, which is the recommended name for all new projects. There is some discussion of this on GitHub.

To reference a dependency, Bazel uses labels, which uniquely identify every target. A label has the following format:

@workspace_name//path/of/package:target

For example, the label for logrus, a commonly used logging library in Go, is:

@com_github_sirupsen_logrus//:go_default_library

If it’s a package path within this project, you can omit the workspace name before //.
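
To make the label format concrete, here is a minimal Python sketch that splits a label into its three parts. Label and parse_label are hypothetical helpers written for illustration, not part of Bazel, and the sketch handles only absolute labels (those containing //).

```python
from typing import NamedTuple


class Label(NamedTuple):
    repo: str     # external workspace name, "" for the current project
    package: str  # path from the workspace root to the package
    target: str   # target name inside the package


def parse_label(label: str) -> Label:
    """Parse an absolute label such as @repo//path/of/package:target."""
    repo = ""
    if label.startswith("@"):
        repo, sep, rest = label[1:].partition("//")
        if not sep:
            raise ValueError("expected // after the workspace name: " + label)
    elif label.startswith("//"):
        rest = label[2:]
    else:
        raise ValueError("only absolute labels are handled: " + label)

    package, sep, target = rest.partition(":")
    if not sep:
        # Shorthand: //path/of/package means //path/of/package:package
        target = package.rsplit("/", 1)[-1]
    if not target:
        raise ValueError("missing target name: " + label)
    return Label(repo, package, target)


logrus = parse_label("@com_github_sirupsen_logrus//:go_default_library")
local = parse_label("//path/of/package:target")
short = parse_label("//path/of/package")
```

Here logrus.repo is the workspace name, local.repo is empty because the label points into the current project, and short.target falls back to the last path component.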

Custom Rules (*.bzl)

If your project has some complex build logic, or some reusable build logic, you can save this logic as functions in .bzl files for WORKSPACE or BUILD files to call. Its syntax is similar to Python:

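load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def third_party_http_deps():
    # Each http_archive call declares one external dependency.
    http_archive(
        name = "xxxx",
        ...
    )

    http_archive(
        name = "yyyy",
        ...
    )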

Configuration .bazelrc

The rc suffix is a long-standing Unix naming tradition, usually explained as short for “run commands”; a StackOverflow answer covers the history if you’re interested. In short, such a file holds settings that are applied when the corresponding command runs. Common examples include .vimrc and .bashrc.

For Bazel, if certain build actions always require a specific parameter, you can write it in this configuration file to avoid retyping it every time you run a command. Here’s a Go example: due to network conditions in China, GOPROXY may be needed during build, test, and run phases. You can configure it as follows:

# set GOPROXY
test --action_env=GOPROXY=https://goproxy.io
build --action_env=GOPROXY=https://goproxy.io
run --action_env=GOPROXY=https://goproxy.io
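
How such a file maps options to commands can be modeled with a short Python sketch. It is a simplified illustration rather than Bazel's parser: parse_rc, effective_options, and the INHERITS table are hypothetical, and the sketch ignores features such as --config, import lines, and startup options. It assumes the inheritance Bazel's documentation describes, where test and run pick up the options listed for build.

```python
# Simplified model: which rc sections each command reads before its own.
INHERITS = {
    "build": ["common"],
    "test": ["common", "build"],
    "run": ["common", "build"],
}


def parse_rc(text):
    """Map each command name to the options listed for it in the rc text."""
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        command, *flags = line.split()
        options.setdefault(command, []).extend(flags)
    return options


def effective_options(options, command):
    """Options a command sees: inherited sections first, then its own."""
    result = []
    for section in INHERITS.get(command, ["common"]) + [command]:
        result.extend(options.get(section, []))
    return result


rc = parse_rc("""
# set GOPROXY
build --action_env=GOPROXY=https://goproxy.io
""")
test_flags = effective_options(rc, "test")
run_flags = effective_options(rc, "run")
```

Under this model a single build line already reaches test and run, so listing all three commands explicitly is harmless but redundant.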

Building Golang Projects with Bazel

With the Bazel basics above, building Golang projects also requires understanding two concepts: rules_go and bazel gazelle.

rules_go

rules_go is a Bazel extension package that enables Bazel to compile Go. It consists of a series of rules, including go_library, go_binary, and go_test, supports vendoring and cross-compilation, and integrates easily with tools such as protobuf, cgo, gogo, and nogo.

It compiles within Bazel’s sandbox, independent of the local GOROOT/GOPATH, and automatically downloads the corresponding Go version, enabling consistent compilation across different platforms.

bazel gazelle

Gazelle is a tool that automatically generates Bazel build files, including adding external dependencies to WORKSPACE and scanning source file dependencies to automatically generate BUILD.bazel files. Gazelle natively supports Go and protobuf, and can be extended to support other languages and rules via plugins. Gazelle can be run using the bazel command with the gazelle rule, or used as a standalone command-line tool.

  • Automatically add external dependencies

Use bazel run //:gazelle update-repos repo-uri to import the given dependency. To import every dependency listed in go.mod instead, run bazel run //:gazelle -- update-repos -from_file=go.mod.

For example, to add the segmentio Kafka Go client package to the project, run the following command in the project root directory: bazel run //:gazelle update-repos github.com/segmentio/kafka-go

Gazelle will automatically add a dependency to the WORKSPACE file:

go_repository(
    name = "com_github_segmentio_kafka_go",
    importpath = "github.com/segmentio/kafka-go",
    sum = "h1:Mv9AcnCgU14/cU6Vd0wuRdG1FBO0HzXQLnjBduDLy70=",
    version = "v0.3.4",
)
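
The repository name in this rule follows a visible pattern: the host name is reversed, the path is appended, and separators become underscores. The Python sketch below reproduces that pattern as inferred from the names in this article. repo_name_from_import_path is a hypothetical helper, and Gazelle's own implementation may handle cases this sketch does not.

```python
def repo_name_from_import_path(import_path: str) -> str:
    """Derive a repository name in the style of the go_repository rule above."""
    host, *rest = import_path.lower().split("/")
    # Reverse the host labels (github.com -> com, github), then append the path.
    parts = list(reversed(host.split("."))) + rest
    name = "_".join(parts)
    # Dashes and dots are replaced by underscores.
    return name.replace("-", "_").replace(".", "_")


kafka = repo_name_from_import_path("github.com/segmentio/kafka-go")
logrus = repo_name_from_import_path("github.com/sirupsen/logrus")
```

Both results match the names that appear in this article: com_github_segmentio_kafka_go and com_github_sirupsen_logrus.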

  • Automatically generate build files

Gazelle can automatically generate BUILD.bazel files for each directory in just two simple steps:

  1. In the BUILD.bazel file at the project root, load and configure Gazelle:

    load("@bazel_gazelle//:def.bzl", "gazelle")

    # gazelle:prefix your/project/url
    gazelle(
        name = "gazelle",
    )

    Note that the line starting with # is an ordinary comment to Bazel but a directive to Gazelle, which reads it at run time; the prefix should be your project’s Go import path. Besides running through the Bazel rule, Gazelle can also be executed directly from the command line.

  2. Run bazel run //:gazelle in the root directory.
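
To give a feel for what the generation step does, the Python sketch below groups the Go files of one directory by the rule kind that would own them. It is a deliberately simplified assumption about the process: plan_go_rules is a hypothetical helper, and the real Gazelle also reads package clauses, imports, and build constraints, and emits go_binary for main packages.

```python
def plan_go_rules(filenames):
    """Group a directory's Go files by the rule kind that would own them."""
    plan = {"go_library": [], "go_test": []}
    for name in sorted(filenames):
        if not name.endswith(".go"):
            continue  # non-Go files are ignored in this sketch
        kind = "go_test" if name.endswith("_test.go") else "go_library"
        plan[kind].append(name)
    # Drop rule kinds that ended up with no sources.
    return {kind: srcs for kind, srcs in plan.items() if srcs}


plan = plan_go_rules(["b.go", "a.go", "a_test.go", "README.md"])
```

In this example the two regular sources land under go_library and the test file under go_test, mirroring the split between library and test targets in a generated BUILD.bazel file.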

Best Practices

Bazel has many established best practices, such as pinning external dependencies to a specific version with http_archive, injecting build information through stamp variables, and packaging releases. It’s worth studying some open-source projects with good Bazel build practices:

  • github.com/kubernetes/test-infra (Best practices for Go projects, relatively comprehensive deep usage of Bazel)

  • github.com/kubernetes/repo-infra (Best practices for Go projects, very elegant configurations and scripts)

