Core Principles of Infrastructure Engineering
This is a synopsis of some of my ideas about how to provide reliable computing services cost-effectively. It is based on material I used to present these ideas to people with traditional engineering (EE, CE, software engineering) backgrounds who wanted to learn how to run SaaS or IaaS and improve the business value of a lacklustre IT organisation. Because it was written for a presentation, much of the explanation was given orally; these are just the notes.
I draw heavily on traditional Unix philosophy, so I must give credit to various folks from Bell Labs such as Doug McIlroy, Dennis Ritchie, Ken Thompson and those who stood on their shoulders. I also acknowledge The Art of Unix Programming for some phrasing. I suggest that this philosophy applies much more broadly than just to Unix programming.
These principles are guiding approaches that tend to make technology projects more successful. They don't necessarily make project work more expensive to complete, nor do they necessarily make it cheaper. However, applying them correctly will result in more cost-effective IT service provisioning and better matching of expectations.
I am not pretending that they are either comprehensive or universal. Indeed, there is another rule at play, the Rule of Diversity, which advocates a healthy distrust of any idea that claims to be the One True Way and emphasises the inherent strengths of diverse systems, just as the natural world does. Everything here will certainly have some exceptions. Real-life situations require common sense to balance these principles against current business needs and constraints.
- Reliable
  - Build services that work
- Sustainable
  - Build services that are easy & cheap to maintain & to grow
- Open & Synergistic
  - "Build systems to do one thing and do it well. Build systems to work together." - Doug McIlroy, Bell Labs
Reliability
- Rule of Simplicity
  - Design for simplicity; add complexity only when you must
- Rule of Generation
  - Write code/config to generate services. Cheap reproducibility is better than trying to bulletproof.
- Rule of Transparency
  - Design for visibility to make inspection and troubleshooting easier; "If it's not monitored, it doesn't exist."
- Minimise Single Points of Failure
  - At all levels: application, OS/VM, hardware, network, human
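As a rough illustration of the Rule of Transparency and of hunting for single points of failure, the sketch below audits a small service inventory. Everything in it is an assumption made for the example: the `Instance` record, its field names, the host names and the sample data are hypothetical, and a real inventory would come from whatever source of truth the organisation already keeps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One running copy of a service (all field names are illustrative)."""
    service: str      # logical service name, e.g. "dns"
    host: str         # machine or VM the instance runs on
    monitored: bool   # True if some check covers this instance


def unmonitored(instances):
    """Return the instances that no check covers, in input order."""
    return [i for i in instances if not i.monitored]


def single_points_of_failure(instances):
    """Return sorted names of services that run on fewer than two distinct hosts."""
    hosts_per_service = {}
    for i in instances:
        hosts_per_service.setdefault(i.service, set()).add(i.host)
    return sorted(s for s, hosts in hosts_per_service.items() if len(hosts) < 2)


# Hypothetical sample data, invented for this sketch.
INVENTORY = [
    Instance("dns", "host-a", True),
    Instance("dns", "host-b", True),
    Instance("mail", "host-a", False),
    Instance("web", "host-c", True),
    Instance("web", "host-c", True),  # two instances, but on the same host
]

if __name__ == "__main__":
    for i in unmonitored(INVENTORY):
        print(f"unmonitored: {i.service} on {i.host}")
    for s in single_points_of_failure(INVENTORY):
        print(f"single point of failure: {s}")
```

This only covers the hardware/VM level. The same counting idea could be applied to networks, applications or the people who know how to run a service, given an inventory that records them.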
Sustainability
- Make it easy over the anticipated life of the service
- Make it easy for others to support/troubleshoot/redeploy
- Start small and make incremental improvements
- Rule of Generation: Write code/config to generate services
- Design for multiple instances: they will be required for at least one of reliability, sustainability or synergy
- Role Based Access Control eases the administrative burden of security
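The Rule of Generation and the advice to design for multiple instances can be sketched together: one small declarative spec is expanded into a configuration per instance, so adding or rebuilding an instance means re-running the generator rather than hand-editing. This is a minimal sketch under stated assumptions. The spec keys (`service`, `instances`, `base_port`, `settings`) and the naming scheme are hypothetical, not taken from any particular tool.

```python
import json


def generate_instances(spec):
    """Expand one service spec into a list of per-instance config dicts."""
    configs = []
    for n in range(1, spec["instances"] + 1):
        configs.append({
            "name": f"{spec['service']}-{n:02d}",
            "port": spec["base_port"] + n - 1,
            "settings": dict(spec["settings"]),  # copy so instances stay independent
        })
    return configs


def render(configs):
    """Serialise deterministically so identical specs give identical output."""
    return json.dumps(configs, indent=2, sort_keys=True)


# Hypothetical spec, invented for this sketch.
SPEC = {
    "service": "web",
    "instances": 3,
    "base_port": 8080,
    "settings": {"log_level": "info"},
}

if __name__ == "__main__":
    print(render(generate_instances(SPEC)))
```

Because the output is deterministic, it can be kept under version control and diffed. Changing `instances` from 3 to 4 should show up as exactly one added block.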
Open & Synergistic
- Rule of Composition: Design to be connected to other, as yet unknown, systems.
- Build services that are open to many usage scenarios, to get more value out of the effort
- Make services broadly useful to as many people, departments and automated agents as will strengthen the value proposition
- Extensible
- Simple, well-defined interfaces
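One way to picture the Rule of Composition is a component whose whole interface is "lines of JSON in, lines of JSON out", in the spirit of a Unix filter. The sketch below is an assumption-laden example rather than a recommended design: the function name `select`, the record fields and the sample events are all hypothetical.

```python
import json


def select(lines, field, value):
    """Yield re-serialised JSON records whose record[field] equals value."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get(field) == value:
            yield json.dumps(record, sort_keys=True)


# Hypothetical events, invented for this sketch.
EVENTS = [
    '{"host": "a", "status": "down"}',
    "",
    '{"host": "b", "status": "up"}',
    '{"status": "down", "host": "c"}',
]

if __name__ == "__main__":
    for out in select(EVENTS, "status", "down"):
        print(out)
```

Because `select` accepts any iterable of lines, passing it `sys.stdin` would turn it into a shell pipeline stage, and its output can feed another stage that was not known when it was written.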