AllGoodBits.org

Fundamental Goals to Guide Technology Infrastructure Design

I was recently asked to produce a strategic technology plan for a medical device company that is unusual in that it does everything in-house: design, manufacturing, sales, distribution, tech support, customer (patient) service and billing (insurance reimbursement). The audience was high-level business folk (CxO), not geeks, so I had to concentrate my presentations on high-level business value rather than on technical elegance or completeness, and avoid getting lost in the details.

I was asked to present a small number of fundamental goals, and then make a nod in the direction of explaining what each meant, why it was important and what it might take to make it happen. Here is a brief synopsis with the company-specific material generalized or removed. This is not so much an article as a distillation of my notes, leaving out many complexities, caveats, examples and other details that might or might not have been mentioned in the surrounding discussion; in particular, there is nothing about implementation details.

  1. Automated service deployment
  2. Metrics
  3. Machine Readable Input and Output formats
  4. Monitoring

Each of these is potentially a broad area with a large number of sub-projects; each of these "goals" is a philosophical approach, a vision which doesn't need to be considered as "achieved" or "not achieved/failed", but a guideline to direct how we design technology projects. Clearly there are other sensible possibilities for what could be emphasized, but I chose these after weighing how well the company was already doing in various areas, achievability (you don't start violin lessons by playing Paganini, or even Bach), importance and various other subjective factors. Yes, this field is an Art.

Automated service deployment

We should be able to quickly and easily (re-)deploy new or existing services with an automated/non-interactive process.

Why

  • increasing the number of systems/services doesn't then require increased headcount; you only have to increase headcount when you increase the complexity, change the architecture or significantly change the scale
  • the need for traditional backup processes is reduced because service recovery becomes just the same as "let's build a new one"
  • automation promotes homogeneity, which correlates directly with cheaper administration
  • we know that we want to perform some activities that we don't yet know about or perhaps haven't even thought of yet; it should be a core goal to make that as easy as possible

Major Sub-Projects

  • Automated OS installation
  • Version controlled configuration management
  • Automated network provisioning
  • Automated storage provisioning
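
To make the "recovery is the same as building a new one" point concrete, here is a minimal sketch of the idea behind version-controlled configuration management: compare a desired state with what a machine actually has and derive the actions non-interactively. The names `plan` and `DESIRED`, and the package versions, are hypothetical and chosen only for illustration; real tools do far more than this.

```python
def plan(desired, actual):
    """Return the ordered actions that turn `actual` into `desired`.

    Both arguments map a package or service name to a version string.
    """
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("install", name, desired[name]))
        elif actual[name] != desired[name]:
            actions.append(("change", name, desired[name]))
    # Anything present but not described in the desired state is removed.
    for name in sorted(set(actual) - set(desired)):
        actions.append(("remove", name, actual[name]))
    return actions


# Hypothetical desired state, as it might be kept in version control.
DESIRED = {"nginx": "1.24", "postgresql": "15"}

if __name__ == "__main__":
    print(plan(DESIRED, {}))                                     # a brand-new machine
    print(plan(DESIRED, {"nginx": "1.22", "telnetd": "0.17"}))   # a drifted machine
    print(plan(DESIRED, dict(DESIRED)))                          # nothing to do
```

The same function handles a blank machine and a drifted one, which is why rebuilding and repairing can share one process.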

Metrics

Why

The only way to determine whether some effort or service is effective and/or worthwhile is to know what we have, how much it's used and what its benefits are.

Major Sub-Projects

  • Determine metrics for each service, inheriting from similar services
  • Define some "we need more performance" semantics, for various kinds of service, by criticality and by service type (http, db, mail, etc.)
  • Information visualization tools
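
As one possible reading of the "we need more performance" semantics, the sketch below gives each service type a default metric and limit, then tightens or relaxes the limit by criticality. Every metric name, limit and multiplier here is an assumption made up for the example, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str   # name of the metric to watch
    limit: float  # value above which the service is considered too slow


# Hypothetical defaults per service type.
DEFAULTS = {
    "http": Threshold("p95_latency_ms", 500.0),
    "db": Threshold("p95_query_ms", 200.0),
    "mail": Threshold("queue_depth", 1000.0),
}

# Hypothetical multipliers: more critical services get tighter limits.
CRITICALITY_FACTOR = {"high": 0.5, "medium": 1.0, "low": 2.0}


def needs_more_performance(service_type, criticality, observed):
    """Return True when the observed metric exceeds the adjusted limit."""
    base = DEFAULTS[service_type]
    limit = base.limit * CRITICALITY_FACTOR[criticality]
    return observed[base.metric] > limit


if __name__ == "__main__":
    sample = {"p95_latency_ms": 300.0}
    print(needs_more_performance("http", "high", sample))
    print(needs_more_performance("http", "low", sample))
```

A new service inherits the defaults of its type, so only the exceptions need to be written down.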

Machine Readable Input and Output formats

Every service should be able to receive and generate data in a machine readable format.

Why

Machine readable IO formats are important and valuable because it's much more expensive (in both dev time and in stability/reliability) to produce machine readable data from human readable output than it is to produce human readable output from machine readable data.
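
The asymmetry can be illustrated with a small sketch. JSON is used here only as an example of a machine readable format, and the record fields and function names are hypothetical. Going from the record to a sentence is one format string; going back needs a pattern that breaks as soon as the wording changes.

```python
import json
import re


def render_human(record):
    """Machine readable to human readable: a single format string."""
    return (f"{record['service']} on {record['host']}: "
            f"{record['status']} ({record['latency_ms']} ms)")


# Human readable to machine readable: a pattern tied to the exact wording.
LINE_PATTERN = re.compile(r"^(\S+) on (\S+): (\w+) \((\d+) ms\)$")


def parse_human(line):
    """Return the record as a dict, or None if the wording is not recognized."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    service, host, status, latency = match.groups()
    return {"service": service, "host": host,
            "status": status, "latency_ms": int(latency)}


if __name__ == "__main__":
    raw = '{"service": "billing-api", "host": "app01", "status": "ok", "latency_ms": 42}'
    record = json.loads(raw)
    print(render_human(record))
    print(parse_human("billing-api @ app01 is ok"))  # reworded line is not recognized
```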

Monitoring

Continuously verifying that each service is performing according to specified parameters, logging or even alerting if it's not.

Why

We need to be confident that each service is available, so that we avoid escalating the cost of problem diagnosis.

Major Sub-Projects

  • determine what we already have
  • automatically generate monitoring configuration based on system/service deployment
  • choose/create a battery of service availability/correctness checks
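
Generating monitoring configuration from deployment data might look something like the sketch below. The inventory layout, the check templates and the port numbers are all assumptions for illustration; the output is a plain list of dictionaries and is not the configuration format of any particular monitoring tool.

```python
import json

# Hypothetical check templates, one per service type.
CHECK_TEMPLATES = {
    "http": {"type": "tcp", "port": 80},
    "db": {"type": "tcp", "port": 5432},
    "mail": {"type": "tcp", "port": 25},
}

# Hypothetical inventory, as the deployment process might record it.
INVENTORY = [
    {"host": "web01", "services": ["http"]},
    {"host": "db01", "services": ["db"]},
]


def generate_checks(inventory):
    """Produce one check definition per deployed (host, service) pair."""
    checks = []
    for entry in inventory:
        for service in entry["services"]:
            check = {"name": f"{entry['host']}-{service}", "host": entry["host"]}
            check.update(CHECK_TEMPLATES[service])
            checks.append(check)
    return checks


if __name__ == "__main__":
    print(json.dumps(generate_checks(INVENTORY), indent=2))
```

Because the checks are derived from the same data that drives deployment, a newly deployed service is monitored without anyone having to remember to add it.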