
Docker Gotchas

I've watched the memes gravitate to Docker like bees to nectar, or flies to something worse; I suppose that's a glass-half-full, glass-half-empty thing. As I've said many times before, you have to know your stack. And the more I use Docker, and the more the project advances, the more I realize that it's just too early for adoption... unless you have a person or three on the inside.

For example, my first CoreOS+Docker server is continuously running out of disk space, in part because /var/log/btmp is filling up (which, from what I've read, is probably the residue of a brute-force login attack). My second CoreOS+Docker server shows an explosion of files in the Docker overlay folder. I'm not certain why I need so many folders, or whether any of them are zombies.
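If btmp is the culprit, it's easy to confirm and to reclaim the space. A minimal sketch; on the real host the file is /var/log/btmp, but the commands below run against a scratch copy so they are safe to try anywhere:

```shell
# On the host, substitute /var/log/btmp (failed-login records; a
# brute-force attack can grow it to gigabytes).
BTMP=./btmp.demo
head -c 4096 /dev/zero > "$BTMP"  # stand-in for accumulated records
du -sh "$BTMP"                    # how big has it grown?
truncate -s 0 "$BTMP"             # empty it in place, keeping ownership/permissions
```

Truncating rather than deleting matters: the login accounting tools expect the file to exist with its original permissions.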

Another article mentioned filesystem inodes. My filesystem is at 5% used with inodes at 15%, which is not an unhealthy ratio in my opinion, but since this is a single Docker container running on a 200GB HDD, it suggests that getting any sort of density is going to be unpredictable and painful.
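The inode-to-space ratio is easy to watch. A quick sketch; point these at /var/lib/docker on a Docker host (shown here against / so it runs anywhere):

```shell
# Watch blocks and inodes side by side; overlay layers duplicate whole
# directory trees, so inodes can run out long before bytes do.
df -h /      # byte usage
df -i /      # inode usage: the IUse% column is the one to watch
```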

I may need to investigate NixOS containers.

UPDATE: I found this article echoing my problem.

UPDATE: I made some discoveries that are not actually documented anywhere. The first truth I realized is that proper VMs are probably still a better proposition than containers of any kind. CoreOS toolbox and its analogs make sense, but not for actually running a production service. This is yet another reason why the Phusion containers are a bad choice.

Next, you simply do not need intermediate layers. The first reason is that the builder often falsely reuses an old cached layer. For example, I have a RUN go get some_go_package step and docker build does not rebuild it even when the package has changed. So I have to put my go get commands after my ADD commands in order to ensure they are executed: when an ADD step's source files change, the cache is invalidated for that step and every step after it.
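A sketch of that ordering, with hypothetical names (some_go_package and the base image are placeholders, not from my actual Dockerfile):

```shell
# Hypothetical Dockerfile showing ADD before RUN, written via heredoc
# so the layer sequence is explicit. A change to the ADDed sources
# invalidates the cache for the go get step that follows.
cat > Dockerfile.sketch <<'EOF'
FROM golang:latest
ADD . /go/src/app
RUN go get some_go_package
EOF
# When caching still misbehaves, --no-cache forces a full rebuild:
# docker build --no-cache -t devbox .
```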

Now that I'm deleting all of my images and intermediate layers, I'm getting ready to rebuild my containers with the --rm and --force-rm build options. I do not want any intermediate files. The issue is that docker build consumes so much disk space that any hope of reasonable density is impossible.

delete all of your containers
docker rm $(docker ps -a -q)
delete all of your images
docker rmi $(docker images -q)
delete everything, including intermediate images
docker rmi $(docker images -a -q)
** this last one takes a VERY long time. My 331 overlays have been deleting for nearly 30 minutes, and while it was only 2.5GB it made up 15% of the inodes, meaning there were a lot of files.
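One hedge worth adding to the commands above: when there is nothing to delete, $(docker ps -a -q) expands to nothing and docker rm exits with an error. Piping through xargs -r (GNU xargs) skips the command entirely on empty input; a sketch:

```shell
# Safer equivalents of the bulk deletes; -r means "don't run at all
# when stdin is empty", so a clean system doesn't produce an error:
# docker ps -a -q | xargs -r docker rm
# docker images -q | xargs -r docker rmi
# The -r behavior itself is plain shell and easy to see:
printf '' | xargs -r echo deleted > xargs_demo.txt
wc -c < xargs_demo.txt   # 0 bytes: echo never ran
```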

UPDATE: One more thing to keep in mind as my system is still deleting those intermediate layers... GCE (Google Compute Engine) limits the IOPS for my host VM, so bulk deletes like this are going to take a while regardless of whether it's an SSD or HDD.

UPDATE: In hindsight, this sort of cleanup (a) would have been slower than simply rebuilding the host, and (b) if there had been more than one container on this system I would never have been able to clean it up, because there are no tags to tell the images apart. This also means that the build and deploy systems should probably be different machines.

UPDATE: compressing a container [link] (devbox and devboxrun are my image and container names):
ID=$(docker run -d devbox /bin/bash)
docker export $ID | docker import - devboxrun
UPDATE: backup compressed snapshots
ID=$(docker run -d image-name /bin/bash)
(docker export $ID | gzip -c > image.tgz)
gzip -dc image.tgz | docker import - flat-image-name 
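Before trusting image.tgz as a backup, it's worth verifying the archive is intact; gzip -t tests integrity via its exit status without writing anything to disk. A sketch, with a stand-in payload in place of the real docker export stream:

```shell
# Stand-in for `docker export $ID | gzip -c`: any byte stream works
# for demonstrating the integrity check.
printf 'dummy payload' | gzip -c > image.tgz
# -t decompresses in memory and fails loudly on corruption.
gzip -t image.tgz && echo OK
```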
UPDATE: I'm testing my last clean build configuration (bash functions):
function cleanbox {
        docker rm $(docker ps -q -a --filter=ancestor=builddevbox:latest)
        docker rmi builddevbox
        docker rmi golang:cross
}

function buildbox {
        docker build --force-rm --rm -t builddevbox .
        ID=$(docker run -d builddevbox /bin/bash)
        docker export $ID | docker import - devbox
        docker rm $ID  # remove the scratch container left behind by the export
}
