Response to "Seven Databases in Seven Weeks"

For this "7 in 7" book I only glanced at the author's motives for selecting the DBs that were chosen. What caught my attention was the TOC. The title suggests that this is going to be a reference to modern databases and the NoSQL movement, yet it includes Postgres. That is curious for two reasons: a) Postgres is not a new database, and it's not a NoSQL database either; b) while the current implementation is modern, none of its modern features are mentioned.

And then there is a huge gap where BDB (BerkeleyDB) should be. BDB is sometimes considered a NoSQL database, but it is not built around the CAP trade-offs that are consistently attached to NoSQL DBs. What makes BDB interesting, and what would seem to be the subliminal rationale for the many query dialects of the NoSQL DBs, is an essay by Mike Olson in which he justified BDB's APIs and the absence of a formal query language: [programmers know their data better than any query optimizer], and then there was [the extra steps to compile and optimize are time consuming and better done at compile time instead of runtime].
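A rough sketch of that argument, using Python's standard library rather than BDB itself: `dbm.dumb` stands in for a key/value API where the programmer picks the access path, and `sqlite3` stands in for an engine that parses and plans a query at runtime. The `user:<id>` key layout, the `users` table, and both function names are made up for illustration.

```python
import dbm.dumb
import os
import sqlite3
import tempfile


def kv_get_name(path, user_id):
    """Key/value style: the caller builds the key and fetches it directly."""
    key = f"user:{user_id}"                  # hypothetical key layout
    with dbm.dumb.open(path, "c") as db:
        db[key] = b"alice"                   # hypothetical record
        return db[key].decode()              # values come back as bytes


def sql_get_name(user_id):
    """Query-language style: the engine parses the SQL and picks a plan."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users VALUES (?, ?)", (user_id, "alice"))
        query = "SELECT name FROM users WHERE id = ?"
        name = conn.execute(query, (user_id,)).fetchone()[0]
        # Ask the engine which access path it chose; the wording of the
        # plan rows varies between SQLite versions.
        plan = conn.execute("EXPLAIN QUERY PLAN " + query, (user_id,)).fetchall()
        return name, plan
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(kv_get_name(os.path.join(tmp, "users"), 42))
    name, plan = sql_get_name(42)
    print(name)
    print(plan)
```

In the first function the access path is fixed when the code is written; in the second it is decided by the planner each time the statement is prepared, which is the extra runtime step the essay objects to.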

CAP is the anti-pattern to ACID. Essentially CAP comes down to a principle of economics: [pick two of the following three attributes], namely consistency, availability, and partition tolerance. A lot of time has been devoted to CAP and the many follow-up research papers. I'm not qualified to rebut the thesis, but I always wonder if there is a spoiler out there. VoltDB has a novel approach that suggests that you can, in fact, have your cake and eat it too. (It's also absent from the book.)

The real challenge for this publication is that the NoSQL projects are shipping code as fast as they can. By the time this article is posted, something new and interesting will have been deployed.

Missing from consideration:

  • memcached

  • LevelDB

  • Bigtable

  • S3

  • BDB (mentioned)

  • OrientDB

  • UnSQL (a completely different movement)

  • SQLite


Finally, the one thing that is missing for me is a comprehensive, or at least a beginner's, list of use cases, the DBs that best satisfy them, and why. For example, Riak seems to be a special-purpose DB whereas MongoDB seems to be more of a general-purpose DB. There are still some edge cases... but when you're talking about the volume of data that many of the NoSQL people talk about, you had better have a good plan, especially if you think you might be moving the data from one storage engine to another.
