
Google Cloud Computing Outage

If you're complaining about yesterday's outage then you're an id10t. I just read the byline on TechCrunch, and the author and editors clearly do not understand the options or why the occasional cloud infrastructure outage is simply not that big of a deal.

Simply put, you get what you pay for, and diversity is king.

First of all, nothing is capable of 100% availability. Just look at all of the loopholes in the Six Sigma specification. Second, the more reliable you want to be, or the more clicks you want to capture closest to an "event", the more it will cost to reconcile those last events; just ask your credit card provider. And more of that cost will have to be passed on to the customer.
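To put numbers on "nothing is capable of 100% availability", here is a minimal Python sketch that converts an availability target into the downtime it still allows per year. The function name and the list of targets are my own choices for illustration, not anything a provider publishes.

```python
# Minimal sketch: how much downtime per year an availability target still allows.
# The targets below are illustrative assumptions, not any provider's actual SLA.

HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Return the hours per year a service may be down at this availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999, 0.99999):
        minutes = downtime_hours_per_year(target) * 60
        print(f"{target:.5f} -> {minutes:.1f} minutes of downtime per year")
```

Every extra nine cuts the allowed downtime by a factor of ten, and that last factor of ten is where the cost piles up.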

Sure, it sucks when a system goes down, and even worse when it's a system you or a great number of people rely on. But your transaction is not any more or less important than anyone else's... And this is not a call to move your services in-house. If these systems were in-house [a] would you be able to prevent this from happening? [b] could you detect it any faster? [c] would you be able to resolve it any faster? [d] what do you tell your customers?

[a] would you be able to prevent this from happening? -- It just depends on how much you want to spend waiting for the 1000-year event, or is the 500-year event good enough? And the solution may be weaker than letting it ride.
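For a rough feel of the 500-year versus 1000-year trade-off, here is a small Python sketch. It assumes each year is independent and that an "N-year event" has a 1/N chance of happening in any given year; that is a simplification of mine, and real outages are not that tidy. The function name is hypothetical.

```python
# Minimal sketch: chance of seeing at least one "N-year event" within a planning horizon.
# Assumes independent years, each with probability 1/N of the event (a simplification).


def chance_of_event(return_period_years: float, horizon_years: int) -> float:
    """Probability of at least one event during the horizon."""
    if return_period_years < 1 or horizon_years < 0:
        raise ValueError("return period must be >= 1 and horizon must be >= 0")
    annual_probability = 1.0 / return_period_years
    # Complement of "no event in any of the years".
    return 1.0 - (1.0 - annual_probability) ** horizon_years


if __name__ == "__main__":
    for period in (500, 1000):
        p = chance_of_event(period, horizon_years=10)
        print(f"{period}-year event within 10 years: {p:.2%}")
```

Under these assumptions the gap between the two over a ten-year horizon is about one percentage point, which is the number to weigh against whatever the extra protection costs.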

[b] could you detect it any faster? -- Unlikely. Cloud providers continue to instrument their systems and they are looking for the 1000-year events. Big outages have big costs in terms of reputation, the ability to raise prices, and the work of restoring customer faith.

[c] would you be able to resolve it any faster? -- Not unless you start hiring world-class SREs, and they do not come cheap.

[d] what do you tell your customers? -- If it's your hardware you have to dance, and if it's the cloud provider (provided it's reputable) then you get to blame someone else and you can justify it based on cost to the customer.

Interestingly... in the credit card business there is an agreement between Mastercard and Visa that if one or the other has a systemwide outage they can rely on the other to carry their transactions.
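That kind of arrangement is the "diversity is king" point in practice. As a back-of-the-envelope Python sketch, here is the combined availability when any one of several providers being up is enough. It assumes the providers fail independently and that failover is instant and perfect, neither of which is guaranteed in real life; the function name and figures are hypothetical.

```python
# Minimal sketch: availability of a service that stays up if ANY provider is up.
# Assumes independent failures and perfect, instant failover (both are assumptions).


def combined_availability(*availabilities: float) -> float:
    """Return 1 minus the probability that every provider is down at once."""
    all_down = 1.0
    for availability in availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("each availability must be between 0 and 1")
        all_down *= 1.0 - availability
    return 1.0 - all_down


if __name__ == "__main__":
    single = 0.99
    print(f"one provider:  {combined_availability(single):.4%}")
    print(f"two providers: {combined_availability(single, single):.4%}")
```

Two mediocre but independent providers can beat one expensive one, which is why spreading out tends to be cheaper than chasing the last nine in-house.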

Also, consider that after years of the Microsoft Blue Screen Of Death and so many viruses, people still buy, install and upgrade Microsoft products.

