In engineering practice, there are small tricks for structuring code whose underlying ideas also show up in everyday life. This series collects such odd associations between life and engineering. This is the first installment: multi-pass decomposition. Many things we are used to doing in one go can become much simpler and more efficient when broken down into multiple passes.
When I do code reviews, I often see novice developers trying to do too many things in a single for loop. This often leads to deeply nested code or gigantic for loop bodies. At this point, if performance is not significantly impacted, I usually suggest breaking down the tasks into multiple steps, with one for loop per step. You can even make each step a separate function.
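To make the suggestion concrete, here is a minimal sketch (the data and function names are invented for illustration, not from any real codebase) contrasting one loop that does everything with the same logic split into one pass per concern:

```python
def summarize_one_pass(orders):
    """Filtering, transformation, and aggregation crammed into one loop."""
    total = 0
    for o in orders:
        if o["status"] == "paid":
            amount = o["price"] * o["qty"]
            if amount > 0:
                total += amount
    return total


def summarize_multi_pass(orders):
    """Same result, but one concern per pass; each step can be read alone."""
    paid = [o for o in orders if o["status"] == "paid"]       # pass 1: filter
    amounts = [o["price"] * o["qty"] for o in paid]           # pass 2: transform
    return sum(a for a in amounts if a > 0)                   # pass 3: aggregate
```

Each pass in the second version could also be pulled out into its own named function, which is exactly the refactoring suggested above.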
Of course, all of this is from a maintainability perspective. Humans can't keep too many things in mind at once; doing things step by step, rather than mashing them together, makes each step's logic much clearer. I often call the latter the "pancake-spreading" style of code: it feels natural to write but is painful to maintain, because mixing details together always makes complexity explode. The concept of a minimum viable prototype in software engineering follows a similar philosophy.
Author: Muniao’s Notes https://www.qtmuniao.com/2023/08/21/life-engineering-many-passes Please indicate the source when reprinting
This philosophy is also everywhere in “functional” programming, where when operating on a dataset, we apply a series of transformation functions in a chain, making the data flow clearly visible. In big data processing, this paradigm is even more common. For example, as mentioned in the Spark paper:
```scala
errors.filter(_.contains("HDFS"))
```
SQL query engines use a similar mechanism when executing queries: a query statement is converted into a series of operators, each applied as a pass over a two-dimensional dataset composed of rows and columns, as shown in the figure below.

Image source: CMU 15-445, Query Execution Lecture Notes.
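The operator-chain idea can be sketched in a few lines. This is a toy example with invented log data, not any real engine's API; each pass consumes the whole intermediate result and produces the next, just like the chained Spark transformations above:

```python
rows = [
    {"level": "ERROR", "msg": "HDFS write failed"},
    {"level": "INFO",  "msg": "checkpoint ok"},
    {"level": "ERROR", "msg": "timeout"},
]

# Pass 1: a filter operator selects the matching rows.
errors = [r for r in rows if r["level"] == "ERROR"]
# Pass 2: a second filter refines the result.
hdfs = [r for r in errors if "HDFS" in r["msg"]]
# Pass 3: a projection operator keeps only one column.
msgs = [r["msg"] for r in hdfs]
```

Because each stage is a separate whole-dataset pass, the data flow reads top to bottom, and any stage can be inspected or swapped out independently.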
I learned a little sketching in high school. Although I never got far with it, its multi-pass technique left a deep impression on me: first sketch the outline, then refine layer by layer. Even when hatching, you work over the whole piece layer by layer, rather than finishing one area before moving to the next.

These days I often translate articles. At first, I always aimed to get the translation perfect in one pass, but progress was so slow that I easily gave up. Later, I switched to a multi-pass, layer-by-layer polishing method: first have ChatGPT produce a rough translation, then check it against the original to correct the semantics, and finally do a pass to adjust word order and smooth out the sentences. As the saying goes, good writing comes from rewriting; it must be the same principle.
Professor Srinivasan Keshav of the University of Waterloo, in his “How to Read a Paper”, expounds the classic “three-pass approach” to reading papers, which follows a similar idea:
- The first pass: a bird’s-eye skimming, focusing on the abstract, section headings, conclusions, and other key points.
- The second pass: a bit more detailed, but don’t get bogged down in details.
- The third pass: read carefully to achieve complete understanding.
You can stop after any pass, since this may turn out not to be a paper you need. I used to fall into the opposite pitfall when reading papers, which I like to call the "carpet-bombing" reading method: going through every detail word by word. I made the same mistake when I first started doing code reviews.
Doing everything at once, in sequence, is most people's instinct, but this instinct is often inefficient, and we have to overcome it through deliberate practice. Speaking of which, ordering at a restaurant is often a two-pass affair as well: in the first pass, add everything you want to eat; in the second, apply various constraints (how much you crave each dish, price, whether you've had it before, and so on) to narrow the dishes down to a reasonable range.
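The two-pass ordering method is itself a generate-then-filter pattern. A toy sketch, with an invented menu and a single budget constraint standing in for the real-life ones:

```python
menu = [
    {"name": "dumplings", "price": 12, "appealing": True},
    {"name": "hotpot",    "price": 48, "appealing": True},
    {"name": "salad",     "price": 8,  "appealing": False},
]

# Pass 1: add everything that looks appealing, ignoring constraints.
candidates = [d for d in menu if d["appealing"]]

# Pass 2: apply constraints (here just a budget) to narrow the list down.
BUDGET = 20
order = [d for d in candidates if d["price"] <= BUDGET]
```

Trying to weigh appeal and every constraint simultaneously for each dish is exactly the "one giant loop" instinct; splitting it into two passes makes each decision trivial.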
I think the underlying reasons are:
- Human attention is limited, so we are only good at focusing on one thing at a time.
- Human cognition is also a process from shallow to deep, and layer-by-layer refinement takes advantage of this characteristic.
This article is from my paid Xiaobot column Daily Record of Systems, focusing on distributed systems, storage, and databases. It includes series on graph databases, code deep dives, translations of high-quality English podcasts, database learning, paper interpretations, and more. Friends who like my articles are welcome to subscribe 👉 Column to support me. Your support is very important for me to continue creating high-quality articles. Below is the current list of articles:
Graph Database Series
- Graph Database Resources
- Translation: Factorization & Great Ideas from Database Theory
- Memgraph Series (Part 2): Serialization Implementation
- Memgraph Series (Part 1): Multi-Version Data Management
- Graph Database Series (Part 4): “Fate” and “Conflict” with the Relational Model
- Graph Database Series (Part 3): Graph Representation and Storage
- Graph Database Series (Part 2): A First Look at Cypher
- Graph Database Series (Part 1): What Is the Property Graph Model and Its Shortcomings 🔥
Databases
- Translation: Database Research Trends Over Fifty Years
- Translation: Code Generation in Databases (Codegen in Databas…
- Facebook Velox Runtime Mechanism Analysis
- Distributed System Architecture (Part 2) — Replica Placement
- Recommended Reading: Pipeline Construction in DuckDB
- Translation: How Much Do You Know About the Currently Popular Vector Databases?
- The Great Unification of Data Processing — From Shell Scripts to SQL Engines
- Firebolt: How to Assemble a Commercial Database in Eighteen Months
- Paper Review: NUMA-Aware Query Evaluation Framework
- High-Quality Information Sources: Distributed Systems, Storage, Databases 🔥
- Vector Database Milvus Architecture Analysis (Part 1)
- The Modeling Philosophy Behind the ER Model
- What Is a Cloud-Native Database?
Storage
- Storage Engine Overview and Resources 🔥
- Translation: How RocksDB Works
- RocksDB Optimization Notes (Part 2): Prefix Seek Optimization
- RocksDB Optimization Notes (Part 3): Async IO
- Experiences Using RocksDB in Large-Scale Systems
Code & Programming
- Three “Codes” That Influence How I Write Code 🔥
- Folly Asynchronous Programming: Futures
- On Interfaces and Implementations
- C++ Private Function Override
- ErrorCode or Exception?
- Infra Interview Data Structures (Part 1): Blocking Queue
- Data Structures and Algorithms (Part 4): Recursion and Iteration
Daily Database Learning Series
- Daily Database Learning Lecture #06: Memory Management
- Daily Database Learning Lecture #05: Data Compression
- Daily Database Learning Lecture #05: Workload Types and Storage Models
- Daily Database Learning Lecture #04: Data Encoding
- Daily Database Learning Lecture #04: Log-Structured Storage
- Daily Database Learning Lecture #03: Data Layout
- Daily Database Learning Lecture #03: Database and OS
- Daily Database Learning Lecture #03: Storage Hierarchy
- Daily Database Learning Lecture #01: Relational Algebra
- Daily Database Learning Lecture #01: Relational Model
- Daily Database Learning Lecture #01: Data Models
Miscellaneous
- Common Misconceptions in Database Interviews 🔥
- Life Engineering (I): Multi-Pass Decomposition🔥
- Some Interesting Conceptual Pairs in Systems
- Simplicity and Completeness in System Design
- The Cycle of Engineering Experience
- On Borrowing “Names”
- Cache and Buffer Are Both Caches — What’s the Difference?
