
CoreOS - discovery.etcd.io

The first step in deploying my next cluster is building a bootstrap server. This bootstrap server needs to host a number of small static services that are used by the nodes in the cluster(s) and in the worker pool. Examples include traditional *nix services such as NTP, DNS, and PXE/TFTP, plus the discovery service that etcd needs in order to discover cluster membership.

coreos:
  etcd2:
    # generate a new token for each unique cluster from
    discovery: <discovery_token>
    # multi-region deployments, multi-cloud deployments, and Droplets without
    # private networking need to use $public_ipv4:
    advertise-client-urls: http://$private_ipv4:2379,http://$private_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    # listen on the official ports 2379, 2380 and one legacy port 4001:
    listen-client-urls:,
    listen-peer-urls: http://$private_ipv4:2380
** Sorry, clearly Blogger did not paste the code properly.

The CoreOS team developed and deployed a public version of the discovery tool and then made the code available. Unfortunately the tool itself needs to be deployed on a cluster of etcd servers. That leaves two questions: (a) whether or not to use the public instance, and (b) whether to perform the discovery manually.

(a) The TTL means that the record and UUID should not live long enough for someone to trick your cluster by convincing one of your nodes (i) that it does not belong, and (ii) that the bad guy can replace it. I'm not an expert, but I imagine that one could validate the cluster peers against a list of IP addresses from a third-party source; in my case the "retrieve droplets" API at Digital Ocean.
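To make the peer-validation idea concrete, here is a minimal sketch in Python. It assumes you already have a list of trusted IP addresses (for example, pulled from your provider's inventory API) and a list of peer URLs reported by the cluster. The function name and the sample addresses are hypothetical, and fetching either list is left out.

```python
from urllib.parse import urlsplit


def unknown_peers(peer_urls, allowed_ips):
    """Return the peer URLs whose host is not in the trusted IP list."""
    allowed = set(allowed_ips)
    return [url for url in peer_urls if urlsplit(url).hostname not in allowed]


# Hypothetical data: allowed_ips would come from the provider's inventory,
# peer_urls from whatever the cluster reports as its members.
allowed_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
peer_urls = [
    "http://10.0.0.1:2380",
    "http://10.0.0.2:2380",
    "http://10.9.9.9:2380",
]

print(unknown_peers(peer_urls, allowed_ips))
```

Any URL this returns is a peer you did not provision yourself and is worth investigating before trusting the cluster.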

(b) Before implementing a pared-down version of the discovery service (I explain below why I think a lite version is required), read this doc; it describes the discovery protocol and hints at how easy it might be.

The challenge is the design...

Create a private discovery service inside your firewall... but it requires an etcd cluster. And that cluster depends on tokens... so either you have to hand stitch the token or use the public discovery service... which depends on an etcd cluster that was already clustered... and follow the tokens and services recursively until an ops person installed the tokens manually.

Because the TTL is fairly low, only some temporary persistence is needed, and the service does not have to stay alive 24x7, it would make sense for the discovery service to be detached from the etcd service.
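As a rough illustration of what a detached, lite discovery service might keep track of, here is an in-memory sketch in Python. It is not the CoreOS implementation and does not speak the real discovery HTTP protocol; the class name, method names, and the default TTL are all assumptions made for the example. It only shows the core bookkeeping: one token per cluster, a fixed cluster size, and records that expire.

```python
import time
import uuid


class LiteDiscovery:
    """Hypothetical in-memory registry: one token per cluster, expiring after a TTL."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self.ttl = ttl_seconds      # assumed value, not taken from the real service
        self.clock = clock          # injectable so expiry can be tested
        self.clusters = {}          # token -> {"size", "expires", "members"}

    def new_token(self, size):
        """Create a token for a cluster that expects `size` members."""
        token = uuid.uuid4().hex
        self.clusters[token] = {
            "size": size,
            "expires": self.clock() + self.ttl,
            "members": {},
        }
        return token

    def register(self, token, name, peer_url):
        """Record a member if there is room, then return the members known so far."""
        cluster = self._live(token)
        members = cluster["members"]
        if name not in members and len(members) < cluster["size"]:
            members[name] = peer_url
        return dict(members)

    def _live(self, token):
        """Return the cluster record, dropping it if the TTL has passed."""
        cluster = self.clusters.get(token)
        if cluster is None or self.clock() >= cluster["expires"]:
            self.clusters.pop(token, None)
            raise KeyError("unknown or expired token")
        return cluster
```

Nothing here needs to survive a restart for longer than the TTL, which is the reason a plain dictionary is enough for the sketch.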

UPDATE: I received a response to a G+ question from Brandon @ CoreOS. "The service only lasts for the purpose to construct the cluster. Nothing else." So the de facto discovery service is probably safe. ON THE OTHER HAND, CoreOS clearly documents:

Running Your Own Discovery Service
The public discovery service is just an etcd cluster made available to the public internet. Since the discovery service conducts and stores the result of the first leader election, it needs to be consistent. You wouldn't want two machines in the same cluster to think they were both the leader.

Since etcd is designed to this type of leader election, it was an obvious choice to use it for everyone's initial leader election. This means that it's easy to run your own etcd cluster for this purpose.

If you're interested in how discovery API works behind the scenes in etcd, read about etcd clustering.

There is something of a miscommunication here. Furthermore, creating a discovery service requires an etcd cluster, which in turn needs a discovery service in order to form. And while there is some documentation on bootstrapping an etcd cluster, it seems involved and should have been scripted.
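One way out of the recursion is etcd's static bootstrapping, where every member is handed the full member list up front through the initial-cluster setting instead of a discovery token. The fiddly part is building that name=peer-URL list consistently for every node, which is easy to script. The sketch below is in Python; the node names and addresses are hypothetical, and it assumes plain HTTP on the peer port 2380 used elsewhere in this post.

```python
def initial_cluster(members, peer_port=2380):
    """Build the comma-separated name=peer-URL list for static bootstrapping."""
    return ",".join(
        f"{name}=http://{ip}:{peer_port}" for name, ip in sorted(members.items())
    )


# Hypothetical node names and private addresses.
members = {"node1": "10.0.0.1", "node2": "10.0.0.2", "node3": "10.0.0.3"}

print(initial_cluster(members))
```

The resulting string would be the same on every node; only the per-node name and advertised peer URL differ.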

