Many computer science programs at Chinese universities lean heavily on drilling fundamentals and theory into students (this is based on my own experience back then; things may be better now). There are some course projects to build coding skills, but they are often insufficient. As a result, many students lack confidence in their coding abilities when they enter the workforce. Below are some suggestions based on my own coding journey.
Author: Muniao Notes https://www.qtmuniao.com/2023/03/25/how-to-read-and-write-code. Please credit when reposting.
Some Experiences
I entered Beijing University of Posts and Telecommunications (BUPT) in 2010, more or less stumbling into the computer science major. Naturally, I had no real study plan at the start of college. I simply carried over my high-school study habits and went through the motions: attending lectures, self-studying, and doing homework after class. The result was low learning efficiency: I barely understood the lectures and couldn’t fully grasp the exercises. But after finishing the introductory computer science course, I managed to work through the programming assignments on my own. Though it was a bumpy ride, I gradually began to feel the joy of programming.
Most of the courses with substantial projects back then (C++ Programming, Algorithms and Data Structures, Operating Systems, Computer Networks, Microcomputer Principles, and so on) feel like toys in retrospect. It was only later, when I worked through labs from open courses at well-known foreign universities, that I realized how difficult it is to design a good course project:
- First, the material must match the students’ level, with detailed lab instructions.
- Second, a solid code framework must be prepared, leaving “blanks” in the right places for students to fill in.
- Finally, a robust automated testing and grading platform is needed.
Building all of this from scratch involves complexity and effort no less than publishing a top-tier conference paper. From a professor’s perspective, given the time available, why not write a paper instead? After all, in domestic universities, research always comes first and teaching last.
Therefore, I didn’t build a solid coding foundation during my undergraduate years. It was only through constant exploration during graduate school and work that my skills gradually improved. Looking back, the biggest influences on my coding ability can be summarized as three “Codes”: LeetCode, the Writing/Review Code Loop, and Clean Code.
LeetCode
Before talking about LeetCode, I’d like to mention a remarkable type of person I met after starting work: those who had competed in algorithm contests (commonly called ACM, referring to ICPC and CCPC). Their most striking trait, put in the simplest terms, is that they ship fast. Years of competition experience let them turn a requirement (a problem) into code in the shortest possible time once they have it clear in their heads.
Being too naive back then, I naturally never competed. By the time I realized the many benefits of competitions, it was already the second semester of my junior year. The school team wouldn’t recruit someone so “old” by then, and even if they did, the bar was extremely high. It remains one of my bigger regrets from college.
Later, in graduate school, about a year before job hunting, LeetCode had already become quite popular. I teamed up with classmates and we motivated each other to grind through problems. At the time, the problem bank wasn’t that large; by the summer of my second year of grad school when I started looking for internships, I had roughly gone through the first 200+ problems twice. At first, I kept thinking about what each problem meant and which algorithm to use. Sometimes I couldn’t figure it out for half a day, so I would look at the top-voted solutions. Many of those solutions were truly elegant and concise—probably one of the most appealing aspects of LeetCode back then. Gradually, as I developed a feel for different problem types, I started practicing for speed and pass rate. That is, after understanding the problem, being able to quickly turn it into bug-free code, as mentioned above.
So, although I never competed, the training through LeetCode did give me similar benefits. Naturally, I was far behind those “battle-hardened” competitors in depth, breadth, and speed. Still, it benefited me immensely:
- Mastery of common data structures and algorithms. For example, I can now list the six sorting algorithms, their characteristics, use cases, and underlying principles with ease. When it comes to recursive and non-recursive tree traversals, I can mentally simulate the recursive execution process effortlessly. Similarly, I can simulate and code linked lists, queues, graphs, and so on in my head.
- Learning many elegant code “building blocks”. For example, how to binary search, how to iterate, how to handle head and tail nodes in linked lists, how to design interfaces for basic data structures, and so on. These relatively “atomic” building blocks became the flesh and blood of my code at work later on.
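To make the idea of “building blocks” concrete, here is a minimal sketch of two of them in Python: a half-open binary search and a dummy head node that removes the special case for the first node of a linked list. The names `lower_bound`, `Node`, `remove_value`, and `to_list` are my own illustrative choices, not taken from any particular LeetCode solution.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


def lower_bound(items: Sequence[int], target: int) -> int:
    """Return the first index i with items[i] >= target, or len(items) if none."""
    lo, hi = 0, len(items)  # search the half-open range [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1  # the answer lies strictly to the right of mid
        else:
            hi = mid  # mid itself may still be the answer
    return lo


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def remove_value(head: Optional[Node], value: int) -> Optional[Node]:
    """Remove every node equal to value, using a dummy head node."""
    dummy = Node(0, head)  # sentinel: the real head is handled like any other node
    prev = dummy
    while prev.next is not None:
        if prev.next.value == value:
            prev.next = prev.next.next  # unlink and stay on prev
        else:
            prev = prev.next
    return dummy.next


def to_list(head: Optional[Node]) -> List[int]:
    """Collect node values into a Python list, for easy inspection."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


print(lower_bound([1, 3, 3, 5], 3))  # 1
print(to_list(remove_value(Node(1, Node(1, Node(2, Node(1)))), 1)))  # [2]
```

Both snippets rely on one small convention (a half-open range, a sentinel node) that makes the edge cases disappear, which is the kind of habit that carries over into everyday work.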
But these alone are far from enough. Once you get into large projects, the code you write easily ends up with “good lines but no good chapters”: each piece is fine on its own, yet the whole lacks a coherent structure.
Writing/Review Code Loop
The predicament described above usually stems from a lack of experience with medium-to-large projects. In terms of space, you don’t know how to organize tens of thousands of lines of code, how to divide functional modules, or how to build hierarchical systems. In terms of time, you haven’t gone through the build-rot-refactor cycle of “building a tower, hosting a feast, and watching it collapse.”
There is a tension in engineering between understanding code and organizing it:
- Understandability. As a maintainer, when we learn code, we prefer to follow the data flow and control flow—tracing from a starting point all the way down. This is vertical.
- Maintainability. But as an architect, when we organize code for easy maintenance, we usually group it by modules—clustering closely related code together. This is horizontal.
So when we first approach a large codebase, reading it cover to cover will surely make us drowsy and yield half the results for double the effort. Unfortunately, years of reading books from front to back had ingrained this habit in me, and the problem stuck with me for a long time. The right approach is like dealing with a bundle of tangled threads: find the “loose ends” and slowly pull them out. In a project, these loose ends are the main function of a service, the entry points of unit tests, and so on.
But when building a large project ourselves, we do the opposite: start with a tangled-together main flow, then iterate gradually. It’s like Pangu separating heaven and earth—evolving over time, letting the sky slowly rise and the earth slowly sink, transforming the whole into the four poles, mountains and rivers, the sun and the moon. Through such iteration, a chaotic flow gradually becomes modular: common utility modules (utils), business-related base modules (common), control modules (controller, manager), RPC/HTTP callback handler modules (processor), and so on.
Of course, if you already have experience building a certain type of system, you don’t need to go through this long process in the early stages; you can divide modules directly based on experience. Going a step further, you may have built up your own code library—things like clocks, networking, multithreading, flow control, and so on—that you can use directly.
The remaining issue is fine-tuning details: when layering, should a boundary function move up or down? Should a relatively complete structure be flattened into the using class or extracted separately? There are no fixed rules for these varied decisions; they depend more on the scenario’s needs, deadlines, and other practical factors. The intuition behind these decisions is built up slowly through long-term study of medium-to-large projects, reviewing others’ changes, and hands-on experience in scaffolding and patching.
Just as the stock market has cycles, engineering code has its cycles too. Without experiencing a full bull-bear market cycle, we dare not speak lightly of long/short positions; without experiencing a project’s build-maturity-rot cycle, we dare not speak lightly of trade-offs. That is, we cannot foresee the most common usage patterns early in a project’s construction, and therefore cannot design for the primary scenarios while sacrificing convenience for secondary ones.
The importance of unit tests cannot be overstated. On one hand, whether you can write unit tests indicates whether your code’s module boundaries are clear. On the other hand, by designing proper inputs and outputs, tests guarantee a certain “invariant”—no matter how you tweak or refactor later, as long as you pass the existing tests, you can be half assured. The other half depends on continuous review by yourself and others, and iterative testing in staging and production clusters.
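As a rough sketch of that “invariant” idea, consider a hypothetical helper `dedupe_keep_order` (the function and the test names are assumptions made up for this illustration). The tests pin down inputs and outputs only, so the body can be rewritten freely as long as they keep passing.

```python
def dedupe_keep_order(items):
    """Drop repeated items, keeping the first occurrence of each."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# pytest-style tests: plain functions with bare asserts.
def test_keeps_first_occurrence_order():
    assert dedupe_keep_order([3, 1, 3, 2, 1]) == [3, 1, 2]


def test_is_idempotent():
    once = dedupe_keep_order(["a", "b", "a"])
    assert dedupe_keep_order(once) == once


def test_empty_input():
    assert dedupe_keep_order([]) == []
```

That the function can be tested with nothing but its arguments and return value is itself a sign that its boundary is clear; a function that needed half the system set up first would be telling you the opposite.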
So, this process is an endless loop—continuous grinding, followed by continuous improvement.
Clean Code
Finally, let’s talk about taste in code. The section title is Clean Code because my taste in code was initially shaped by the book Clean Code: A Handbook of Agile Software Craftsmanship. Its second chapter’s discussion of naming—the “hardest” thing in engineering—left a deep impression on me.
A few examples:
- Single responsibility. If you can’t clearly name your class or function, it means your class or function is doing too much.
- Names over comments. For example, don’t use literal constants directly; give them a name. Similarly, avoid anonymous functions when possible; give them meaningful names.
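A small sketch of “names over comments”, with invented data: the constant `ADULT_AGE`, the function `is_active_adult`, and the `users` records are all hypothetical, chosen only to show a magic number and an anonymous function each being given a name.

```python
# Before: the intent hides in a literal and an anonymous function.
#   adults = list(filter(lambda u: u["active"] and u["age"] >= 18, users))

ADULT_AGE = 18  # hypothetical threshold, named instead of a bare 18


def is_active_adult(user: dict) -> bool:
    """The name states the rule, so no comment is needed at the call site."""
    return user["active"] and user["age"] >= ADULT_AGE


users = [
    {"name": "ann", "age": 30, "active": True},
    {"name": "bob", "age": 16, "active": True},
    {"name": "cy", "age": 45, "active": False},
]

active_adults = [user["name"] for user in users if is_active_adult(user)]
print(active_adults)  # ['ann']
```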
At work, we often say so-and-so has a “cleanliness obsession” with code. I have a bit of that too, but I don’t see it as an obsession; rather, it’s an appreciation and pursuit of beauty. Where does the beauty of code manifest? I’ll toss out a few thoughts here (I have also written a lengthy article on naming before):
- Consistency. Entities with the same meaning should use the same names; entities that need differentiation should be distinguished through naming conventions or prefixes. This minimizes the cognitive burden on readers.
- Systematicity. When designing a set of related interfaces, consider their systematic nature. For example, CRUD operations, produce-consume patterns, pre-processing/processing/post-processing, read-write, and so on. Systematicity includes symmetry and logicality, allowing someone to understand at minimal cost how a set of interfaces relate to and differ from one another.
- No excess fat. Don’t be verbose when writing code; don’t be verbose, don’t be verbose. If you accidentally are, it probably means you haven’t fully grasped the essence of the problem you’re solving. Complex appearances often have a very simple key underneath, once you strip away the impurities. Grasp these keys, then attach flesh and bone to them, while clarifying one-to-one, one-to-many, and many-to-many dependency relationships. This often allows you to simplify the complex.
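Consistency and systematicity can be sketched with a tiny CRUD family. The class `UserStore` and its method names are hypothetical, invented for this illustration: every method uses the same noun (`user`), the same key parameter (`user_id`), and the four verbs form a complete, symmetric set.

```python
class UserStore:
    """Hypothetical in-memory store whose four methods form one CRUD family."""

    def __init__(self):
        self._users = {}  # maps user_id to name

    def create_user(self, user_id, name):
        if user_id in self._users:
            raise KeyError(f"user {user_id!r} already exists")
        self._users[user_id] = name

    def read_user(self, user_id):
        return self._users[user_id]  # raises KeyError if missing

    def update_user(self, user_id, name):
        if user_id not in self._users:
            raise KeyError(f"user {user_id!r} does not exist")
        self._users[user_id] = name

    def delete_user(self, user_id):
        del self._users[user_id]  # raises KeyError if missing


store = UserStore()
store.create_user(1, "ann")
store.update_user(1, "anna")
print(store.read_user(1))  # anna
store.delete_user(1)
```

A reader who has seen `create_user` can guess the other three signatures without opening the file, which is the low cognitive cost the points above are after.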
The relationships between different concepts (corresponding to classes in code) are crucial when understanding code organization; it’s best to reflect them in names—one-to-one, one-to-many, or many-to-many. The connotation of each concept, as well as the containment and connection relationships among multiple concepts, are among the most important things to consider when designing modules.
Beyond aesthetics, let’s also talk about modeling (which is, to some extent, related to metaphor). After all, when we speak of construction, we are already borrowing a metaphor from architecture. Similar metaphors abound in software engineering.
When our brains comprehend new things, we mostly extrapolate from models we already know. Therefore, when designing a module, if you can find a suitable abstraction in the library of classic models, you can often greatly lower the barrier to understanding. Examples include the classic producer-consumer model, tree organization model, router model, thread scheduling model, memory model, and so on. In addition, naming projects with common imagery or metaphors can also yield surprising benefits in understandability. For example, a monitoring system might be called “Eagle Eye”; a pipeline control system might be called “Foxconn” (said with a wink); and for the more common task of data collection, we all know the name: “crawler.”
The Last Thing
Things in this world are often mutually corroborating and complementary—if you want to write good code, you can’t just keep your head down and code. You need to read history, study art, write prose, and see the world. Build your own aesthetic preferences, then transpose those ideals into your code, and only then can you write code that is intuitive and beautiful.
This article is from my Xiaobot column. The initial plan includes the following series:
- Graph Database 101
- Learn a Bit of Databases Every Day
- Recommended System Reading
- Reading Notes
- Data-Intensive Paper Reading Guide
Subscription details are in the column introduction. If you enjoy my articles, you are welcome to subscribe; your support motivates me to produce more high-quality content.
