I’m curious which software design principles you find most valuable in real projects.
Two concise summaries I’ve found:
Not clean code - uncle Bob is a hack.
KISS YAGNI DRY in that order.
Think about coupling and cohesion. Don’t tie things together by making them share code that coincidentally is similar but isn’t made for the same purpose.
Don’t abstract things until you have at least 2 (preferably 3) examples of what you’re trying to abstract. If you try to guess at the requirements of the 2nd or 3rd thing you’ll probably be wrong and have to undo or live with mistakes.
When you do abstract and break things down, optimize for reading. This includes maximizing how much of the code you can load into your head. Things that make that hard are unnecessary indirections (like uncle Bob tells you to do) and shared state (like uncle Bob tells you to do).
Pure functions (meaning they take inputs and return outputs without any side effects such as setting shared state) are the platonic ideal. Anything not written as a pure function should have a reason (there are tons of valid reasons, but it’s a good mental anchor).
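To make the pure vs. impure distinction concrete, here is a minimal Python sketch. The names (`apply_discount`, `apply_discount_impure`, `_cart_total`) are made up for illustration and don't come from any real codebase.

```python
# Hypothetical example: all names here are illustrative.

_cart_total = 100.0  # shared module-level state


def apply_discount_impure(percent: float) -> None:
    """Impure: reads and mutates shared state, so the result depends on call history."""
    global _cart_total
    _cart_total -= _cart_total * percent / 100


def apply_discount(total: float, percent: float) -> float:
    """Pure: the output depends only on the inputs, and nothing else is modified."""
    return total - total * percent / 100
```

Calling `apply_discount(100.0, 10)` gives the same answer every time, while the impure version gives a different total on each call, which is the part you have to keep in your head when reading it.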
I should really read the Ousterhout book. It would be great if I could just point people at something, and it sounded decent from that discussion between him and Bob I saw the other day
Edit: I don’t agree with everything in here but it’s pretty great https://grugbrain.dev/
Your principles and order are good; before DRY I’d insert “a little copying is better than a little dependency.” (Rob Pike)
I’m a fan of KISS YAGNI DRY, in that order, or as I’ve started calling it KYDITO, thus triggering the next generation of acronyming.
- Low coupling, high cohesion
- Sometimes it’s better to use a less optimized solution for clarity or simplicity
- A simple solution is usually better than a “clever” one
- Allot time for refactoring during development, don’t assume it will be done later (spoiler: it won’t)
I write my code for future maintainers. I optimize for clarity, testability, and readability.
I’ve become a huge fan of dependency injection. That does not mean I like DI frameworks (Guice). I tend to do it manually with regular code.
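A rough sketch of what manual DI with regular code can look like, in Python. `ReportService`, `fetch_rows`, `now`, and `build_fake_service` are hypothetical names chosen for the example; the only point is that collaborators come in through the constructor and the wiring is ordinary code.

```python
from typing import Callable


class ReportService:
    """Receives its collaborators through the constructor instead of creating them."""

    def __init__(self, fetch_rows: Callable[[], list[int]], now: Callable[[], str]):
        self._fetch_rows = fetch_rows
        self._now = now

    def summary(self) -> str:
        rows = self._fetch_rows()
        return f"{self._now()}: {len(rows)} rows, total {sum(rows)}"


def build_fake_service() -> ReportService:
    # Wiring is plain code; production would pass a real query and a real clock,
    # while a test passes fixed fakes like these.
    return ReportService(fetch_rows=lambda: [1, 2, 3], now=lambda: "2024-01-01")
```

No injector is involved: swapping the database or the clock is just a different argument at the call site.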
When I maintain code and I sit there wondering what it actually does, I write a unit test for it right then and there
And so on
DI without a tool/injector is just composition. just saying
A good reminder that composition is a useful concept.
it’s my fav and it’s easy. allows containing details of a lower level gizmo in a higher level thingamabob and basically free strategy pattern, especially if you use DI… and allows mock/spy testing!
And that future maintainer happens to be yourself most of the time.
I’d say “Separation of Responsibilities” is probably my #1. Others here have mentioned that you shouldn’t code for future contingencies, and that’s true, but a solid baseline of Separation of Responsibilities means you’re setting yourself up for future refactors without having to anticipate and plan for them all now. I.E. if your application already has clear barriers between different small components, it’s a lot easier to modify just one or two of them in the future. For me, those barriers mean horizontal layers (I.E. data-storage, data-access, business logic, user-interfacing) and vertical slicing (I.E. features and/or business domains).
Next, I’ll say “Self-Documenting Code”. That is, you should be able to intuit what most code does by looking at how it’s named and organized (ties into separation of responsibilities from above). That’s not to say that you should follow Clean Code. That takes the idea WAY too far: a method or class that has only one call site is a method or class that you should roll into that call site, unless it’s a separation of responsibility thing. That’s also not to say that you should never document or comment, just that those things should provide context that the code doesn’t, for things like design intent or non-obvious pitfalls, or context about how different pieces are supposed to fit together. They should not describe structure or basic function, those are things that the code itself should do.
I’ll also drop in “Human Readability”. It’s a classic piece of wisdom that code is easier to write than it is to read. Even if you’re only coding for yourself, if you want ANY amount of maintainability in your code, you have to write it with the intent that a human is gonna need to read and understand it, someday. Of course, that’s arguably what I already said with both of the above points, but for this one, what I really mean is formatting. There’s a REASON most languages ignore most or all whitespace: it’s not that it’s not important, it’s BECAUSE it’s important to humans that languages allow for it, even when machines don’t need it. Don’t optimize it away, and don’t give control over when and where to use it to a machine. Machines don’t read, humans do. I.E. don’t use linters. It’s a fool’s errand to try and describe what’s best for human readability, in all scenarios, within a set of machine-enforceable rules.
“Implement now, Optimize later” is a good one, as well. And in particular, optimize when you have data that proves you need it. I’m not saying you should intentionally choose inefficient implementations just because they’re simpler, but if they’re DRASTICALLY simpler… like, is it really worth writing extra code to dump an array into a hashtable in order to do repeated lookups from it, if you’re never gonna have more than 20 items in that array at a time? Even if you think you can predict where your hot paths are gonna be, you’re still better off just implementing them with the KISS principle, until after you have a minimum viable product, cause by then you’ll probably have tests to support you doing optimizations without breaking anything.
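For the array-vs-hashtable case, the simple version might look like this in Python. `find_user` and the record shape are assumptions made up for the sketch.

```python
from typing import Optional


def find_user(users: list[dict], user_id: int) -> Optional[dict]:
    """Linear scan: O(n) per lookup, which is fine for a couple of dozen items."""
    for user in users:
        if user["id"] == user_id:
            return user
    return None
```

If profiling later shows this lookup is hot and the list has grown, swapping in a dict keyed by id is a small, local change behind the same function signature.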
I’ll also go with “Don’t be afraid to write code”, or alternatively “Nobody likes magic”. If I’m working on a chunk of code, I should be able to trace exactly how it gets called, all the way up to the program’s entry point. Conversely, if I have an interface into a program that I know is getting called (like, say, an API endpoint) I should be able to track down the code it corresponds to by starting at the entry point and working my way down. None of this “Well, this framework we’re using automatically looks up every function in the application that matches a certain naming pattern and figures out the path to map it to during startup.” If you’re able to write 30 lines of code to implement this endpoint, you can write one more line of code that explicitly registers it to the framework and defines its path. Being able to definitively search for every reference to a piece of code is CRITICAL to refactoring. Magic that introduces runtime-only references is a disaster waiting to happen.
As an honorable mention: it’s not really software design, but it’s something I’ve had to hammer into co-workers and tutorees, many many times, when it comes to debugging: “Don’t work around a problem. Work the problem.” It boggles my mind how many times I’ve been able to fix other people’s issues by being the first one to read the error logs, or look at a stack trace, or (my favorite) read the error message from the compiler.
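Here is roughly what that "one more line" looks like, sketched in Python with a toy router. `Router`, `register`, `dispatch`, and `get_service_order` are hypothetical and not the API of any real framework.

```python
from typing import Callable

Handler = Callable[[dict], dict]


class Router:
    """Toy router: every route exists only because some line of code registered it."""

    def __init__(self) -> None:
        self._routes: dict[str, Handler] = {}

    def register(self, path: str, handler: Handler) -> None:
        if path in self._routes:
            raise ValueError(f"duplicate route: {path}")
        self._routes[path] = handler

    def dispatch(self, path: str, request: dict) -> dict:
        return self._routes[path](request)


def get_service_order(request: dict) -> dict:
    return {"order_id": request["order_id"], "status": "open"}


def build_router() -> Router:
    router = Router()
    # The one extra line: a plain reference you can find with a text search,
    # instead of a naming-convention scan at startup.
    router.register("/service-orders", get_service_order)
    return router
```

Searching for `get_service_order` now finds both the handler and the place it gets wired to a path, so a rename or removal can't silently break a runtime-only reference.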
“Hey, I’m getting an error ‘Object reference not set to an instance of an object’. I’ve tried making sure the user is logged in and has a valid session.”
“Well, that’s probably because you have an object reference that’s not set to an instance of an object. Is the object reference that’s not set related to the user session?”
“No, it’s a ServiceOrder object that I’m trying to call .Save() on.”
“Why are you looking at the user session then? Is the service order loaded from there?”
“No, it’s coming from a database query.”
“Is the database query returning the correct data?”
“I don’t know, I haven’t run it.”
I’ve seen people dance around an issue for hours, by just guessing about things that may or may not be related, instead of just taking a few minutes to TRACE the problem from its effect backwards to its cause. Or because they never actually IDENTIFIED the problem, so they spent hours tracing and troubleshooting, but for the wrong thing.
Idempotence / self-healing: the system should be built in such a way that it tries to reach the correct end state, even if the current state is wrong. For instance, every time our system gets an update, it will re-evaluate the calculation from first principles, instead of doing a diff based on what was there before. This prevents bad data from snowballing and becoming a catastrophe.
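A small Python sketch of the "recompute from first principles" idea, under the assumption that the state is a list of transactions plus a derived balance. `reprocess` and `apply_update` are illustrative names, not anything from the system described above.

```python
def reprocess(state: dict) -> dict:
    """Rebuild derived fields from the source data, ignoring whatever was stored."""
    transactions = list(state["transactions"])
    return {"transactions": transactions, "balance": sum(transactions)}


def apply_update(state: dict, amount: int) -> dict:
    """Append the new input, then recompute from scratch instead of patching the old balance."""
    return reprocess({"transactions": state["transactions"] + [amount]})
```

If a stored balance has somehow become 999 for transactions `[10, 20]`, a diff-style update (`balance + amount`) would carry the bad value forward, while `apply_update` throws it away and arrives at the correct total. Running `reprocess` twice gives the same result as running it once.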
Giving yourself knobs to twiddle in production: at work we have ways of triggering functionality in the system on request. Basically calling a method directly on the running process. This is so, so useful in prod issues, especially when combined with the above. We can basically tell the system “reprocess this action/command/message” at any time and it will do it again from first principles.
Debugging: I always first try and find a way to replicate it quickly. Then, I try and simplify it one tiny step at a time until it’s small enough I can understand in one go. I never combine multiple steps per re-run and always verify whether the bug is there or not at every single stage. This can be quite a slow approach but it also means I am always making progress towards finding the answer, instead of coming up with theories which are often wrong, and getting lost in the process.
Would you be willing to give an example of the second? I feel like my boss would throw a shitfit if I told him I wrote anything that even remotely alters prod
Certainly! The line we don’t cross is that we don’t directly edit data. Every record in our database must be generated by the system itself. But, we can re-trigger behaviour, or select different flows, or tweak properties around the edges as much as we want.
For example:
- Reflows - for every message that enters or leaves our system, we store it in a table. We can then reflow the message either into our system or to our downstreams. This means if there was a transient error or a code change since we received the message, we can replay it again without having to involve anyone else.
- Triggers - i.e. ask the system to regenerate its output based on its inputs again. This is useful if there’s a bug that’s only hit in certain situations.
- Migration - we have lots of different flows and some are triggered only on some accounts. We have some scripts that let us turn on/off migration and then automatically reflow all the different messages.
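The reflow idea above could be sketched like this in Python. This is a guess at the general shape, not the actual system: `System`, `message_log`, `receive`, and `reflow` are invented names, and a list stands in for the messages table.

```python
class System:
    def __init__(self) -> None:
        self.message_log: list[str] = []  # stand-in for the stored-messages table
        self.output: dict[str, int] = {}  # always generated by processing, never hand-edited

    def receive(self, message: str) -> None:
        """Normal path: store the message first, then process it."""
        self.message_log.append(message)
        self._process(message)

    def _process(self, message: str) -> None:
        # Assumed message format for the sketch: "key=value".
        key, value = message.split("=")
        self.output[key] = int(value)

    def reflow(self, index: int) -> None:
        """Replay a stored message through the same processing path as the original."""
        self._process(self.message_log[index])
```

Because processing sets the output rather than incrementing it, replaying a message any number of times ends in the same state, which is what makes this safe to trigger in production.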
Destroy abstractions
The reality is, if you have an abstraction layer and one implementation of it, you don’t need that abstraction layer
People will complain, “oh but think about the refactor if we have to change vendors/etc” but I have yet to switch vendors/APIs/etc and not have to completely rethink the abstraction layer
Just get rid of it, it will be easier, less code, more precise, and in the long run you’ll cargo cult less
Just write the code for the things you have, and if things change, yup then things will change. Anticipating future changes and doing the work upfront for the unknown, only to have to make more changes once the real changes arrive and don’t match your old predictions, is just more work and more confusion
Zen of python (PEP 20):
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you’re Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it’s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea – let’s do more of those!
Sparse is better than dense?
Right, so expanding our code for better readability vs trying to make a super dense one liner.
Using human readable vars and not excessive short hand.
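A quick Python illustration of dense vs. sparse; both functions are made up for the example and compute the same thing (the average total of paid orders over 20).

```python
# Dense: one line, short names, the filter written twice.
def avg_dense(o):
    return sum(x["total"] for x in o if x["paid"] and x["total"] > 20) / max(1, len([x for x in o if x["paid"] and x["total"] > 20]))


# Sparse: same result, spread out with readable names.
def average_large_paid_order(orders):
    large_paid_totals = [
        order["total"]
        for order in orders
        if order["paid"] and order["total"] > 20
    ]
    if not large_paid_totals:
        return 0.0
    return sum(large_paid_totals) / len(large_paid_totals)
```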
Ah got it. I was thinking about dense vs sparse arrays or containers
Single responsibility. I deplore my backend developers who think that just because you’re mauling a single (Java) stream for an extended operation, it’s ok to write a single wall-of-text, 5 lines long, 160 characters wide. Use fucking line breaks, for fuck’s sake!
On 4K monitors, 160 character lines can be quite nice, though you may not want a wall of that, yeah
Encapsulation.
Any time i even think I need inheritance, I immediately change it for encapsulation. I’ve never regretted this.
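Assuming "encapsulation" here means wrapping an object instead of subclassing it, a minimal Python sketch could look like this. `Stack` is just an illustrative example.

```python
class Stack:
    """Wraps a list instead of inheriting from it, so only stack operations are exposed."""

    def __init__(self) -> None:
        self._items: list = []

    def push(self, item) -> None:
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self) -> int:
        return len(self._items)
```

Had `Stack` inherited from `list`, callers could also `insert`, `sort`, or index into the middle, and all of that would become part of the class's public surface whether you wanted it or not.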
Cut the problem into tiny pieces, then group it back together with nice clean connections
Code in nice straight lines. Like good cable management - behaviors should flow from cause to effect, and as much as possible should flow through the main channels
Decide how you organize things, and stick to it. When you see code you don’t remember writing, you should be able to say “if I were me, how would I do this?” and immediately know the correct answer
When I explore or consider alternatives, I don’t think of or ask myself about design principles, but consider and weigh what could and would make sense where I am.
More than principles, the guiding goal is Maintainability - Readability, Graspability, Consistency, Correctness, Robustness. Weighted against constraints.
I guess separation of concerns is a big one I use implicitly. Like many others.
Don’t design for having a nice codebase today, design for having a clean codebase after 3 months of Devs copy pasting one bit of code then tweaking it to do what they need or adding more fields to existing concepts.
This generally means it’s best to have one pattern for a given thing, rather than having several patterns you pick based on context. The latter runs into problems:
- Someone copy/pasted pattern A for a pattern B context
- Enough stuff changes in a pattern A implementation that it would now be better as a pattern B thing.
A second consideration for this is that if there is a group of classes/files/whatever that regularly needs to be copied, they should live together. If there are different sections of the code that need to be edited when creating a new resource, they should be kept in one place and kept small-ish.
Most of this comes from accepting the way people tend to work and from the perspective that software is a living evolving process and only regarding a snapshot of it misses vital information.
- Talk to your colleagues: clarify requirements, question assumptions, get feedback, talk about best practices and why you do stuff the way you do it
- Single responsibility principle
Common:
- Procedural, preferably Functional. If you need a procedure or function use a procedure or function.
- Object Oriented. If you need an object use an object.
- Modular
- Package/Collection of Modules
- Do not optimize unless you need to.
- Readable is more important than compact.
- Someone said minimal code coupling. Yes! Try to have code complexity grow closer to N than N factorial, where N is code size.
Frankly everything else is specialized though not unuseful.
One principle I try to apply (when possible) comes from when I learned Haskell. Try to keep the low-level logical computations of your program pure, stateless functions. If their inputs are the same, they should always yield the same result. Then pass the results up to the higher level and perform your stateful transformations there.
An example would be: do I/O at the high level (file, network, database I/O), and only do very simple data transformations at these levels (avoid it altogether if possible). Then do the majority of the computational logic in lower level, modular components that have no external side effects. Also, pass all the data around using read-only records (example: Python dataclasses with frozen=True) so you know that nothing is being mutated between these modules.
This boundary generally makes it easier to test computational logic separately from stateful logic. It doesn’t work all the time, but it’s very helpful in making it easier to understand programs when you can structure programs this way.
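A minimal Python sketch of that structure, with pure functions underneath and I/O only in the top-level function. `Reading`, `parse_line`, `hottest`, and `main` are hypothetical names, and the "sensor,celsius" line format is assumed for the example.

```python
import io
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Reading:
    """Read-only record passed between modules; attribute assignment raises."""
    sensor: str
    celsius: float


def parse_line(line: str) -> Reading:
    """Pure: text in, record out."""
    sensor, value = line.strip().split(",")
    return Reading(sensor=sensor, celsius=float(value))


def hottest(readings: list[Reading]) -> Reading:
    """Pure: same readings always give the same answer."""
    return max(readings, key=lambda r: r.celsius)


def main(source) -> str:
    """Shell: iterates a file-like object and delegates all logic to the functions above."""
    readings = [parse_line(line) for line in source if line.strip()]
    return f"hottest: {hottest(readings).sensor}"
```

`parse_line` and `hottest` can be tested with plain values, and `main` can be exercised with an `io.StringIO` in place of a real file, so nothing in the logic needs a disk or network to run.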