
IT infrastructure best practices

Yikes! The masses do get lost and confused in the buzzwords and buzz tech of the day. Let's consider:

  • configuration as code
  • idempotent
  • micro services
  • CI/CD
  • full stack automated deploy
  • fail early fail fast
  • agile - scrum
  • six sigma
And then there's what Kubernetes got right, which we do not really understand, and how it might be wrong after all.

Taken separately, each item in the list seems like a good idea, and together they feel like powerful tools. In practice, however, there is no magic brush that lets you paint the Mona Lisa in a single stroke. Even in the most ideal HA cluster scenario there are many dependencies to address, and at some point you have to make decisions about the many single points of failure and risk.

For example, building a Docker swarm from scratch from a batch file is pretty simple. Just configure and license one or more VMware or bare-metal Docker servers and deploy to your heart's content. As long as the containers you install are trusted, you are still OK. Somewhere you need a git server, some shared volumes, and some sort of CI/CD to deploy your apps and services on the swarm/cluster. That's going to work for a while, until you add a second customer or a second cluster, or until a package needs to be upgraded or a report deployed.

Configuration as code and full stack deployment mean that if one service changes then the entire stack needs to be relaunched. This way DevOps knows, verifies, and trusts that they can recover from a failure. But when your services are a combination of OLTP and OLAP services, a full redeploy may have other unwanted side effects. Also, depending on the size of the system, a redeploy could take hours.
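To make the trade-off concrete, here is a minimal Python sketch. The service names and the dependency map are invented for illustration and do not describe any real stack. It contrasts relaunching everything with relaunching only the changed service and whatever depends on it.

```python
from collections import deque

# Hypothetical stack: each service maps to the services it depends on.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["oltp-db", "cache"],
    "reports": ["olap-db"],
    "olap-db": [],
    "oltp-db": [],
    "cache": [],
}


def full_redeploy(depends_on):
    """Full stack deploy: every service is relaunched, whatever changed."""
    return set(depends_on)


def selective_redeploy(depends_on, changed):
    """Relaunch only the changed service and everything that depends on it."""
    # Invert the graph: service -> services that depend on it.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk up the dependents from the changed service.
    to_restart = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in to_restart:
                to_restart.add(parent)
                queue.append(parent)
    return to_restart


if __name__ == "__main__":
    print(sorted(full_redeploy(DEPENDS_ON)))
    print(sorted(selective_redeploy(DEPENDS_ON, "cache")))
```

In this made-up graph a change to the cache touches three services under the selective approach and all six under the full one, including the OLAP side that had nothing to do with the change. The selective path is cheaper but gives up the "we know we can rebuild everything" guarantee.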

But if you have to deploy manually it's also a challenge to keep the docs and scripts in order.

Six sigma is a waste of a good tree because you can still be in compliance if you schedule the downtime. So what's the point of that?
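A rough bit of arithmetic shows the loophole. Six Sigma is usually quoted as 3.4 defects per million opportunities. Treating each minute of the year as an "opportunity" is my own simplification, and the downtime figures below are invented, so take this as a sketch and not as how any particular SLA is written.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600
SIX_SIGMA_DPMO = 3.4               # defects per million opportunities


def allowed_defect_minutes(total_minutes):
    """Downtime budget if every minute counts as one opportunity (assumption)."""
    return total_minutes * SIX_SIGMA_DPMO / 1_000_000


def reported_availability(total, unplanned, scheduled):
    """Availability when scheduled downtime is excluded from the measurement."""
    return 1 - unplanned / (total - scheduled)


def actual_availability(total, unplanned, scheduled):
    """Availability as the user sees it: all downtime counts."""
    return 1 - (unplanned + scheduled) / total


if __name__ == "__main__":
    unplanned = 1              # hypothetical: one minute of unplanned outage
    scheduled = 4 * 60 * 12    # hypothetical: four hours of maintenance a month

    print(f"budget (minutes/year): {allowed_defect_minutes(MINUTES_PER_YEAR):.2f}")
    print(f"reported: {reported_availability(MINUTES_PER_YEAR, unplanned, scheduled):.6%}")
    print(f"actual:   {actual_availability(MINUTES_PER_YEAR, unplanned, scheduled):.6%}")
```

With these numbers the one unplanned minute stays inside the budget of under two minutes a year, so the system is "in compliance", while the users actually lost 48 hours.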

The other problem with an all-in-one approach is that we NEVER seem to deliver on the promise of moving everything at once. Management always seems to change its mind and move things one piece at a time, which actually creates its own set of problems.

