Skip to main content

N-way file merge Perl, Python, Go and Lua [Java, Ruby]- Compared

[Update 2012-06-22] Here is the java version of this assignment. It is/was awful. Once you stray from OO, in java, the code inflates like a sea monkey and clearly OO is over the top overkill.

[Update 2012-06-22] Here is the Ruby version of this assignment. I like it's compactness although that came at a steep price as accessing hash elements meant clunky dereferencing and string comparisons were just awful. [That was an error on my part; works as you'd expect]

Not to beat a dead horse but I now have 4 example implementations in Perl, Python, Go, and Lua.

I did my complaining about Lua in a previous article, however, in summary here... this example in Lua is verbose and lacks consistency. I'm not expecting to reduce this to a single LOC (line of code) but I would have liked some additional APIs that would have implemented more efficient algorithms based on internals knowledge. Or at least well documented idioms.

The Go example was fun because the version 1.x of the toolset was simple to use. I would regularly execute "go run merge_tick_data_hash.go file1.csv file2.csv" and it would run like a champ. The only challenge is/was that simple errors that most dynamic languages permit until the code would actually execute would cause the compiler to barf. And initially I had no idea they were compiler errors; but it was easy enough to get used too. The compiled version of the code was lightening fast to startup and execute even though it was 1.4M bytes in size.

The Perl version took some doing. I was able to reduce the LOC and optimize the code quite a bit. I think there is still some room for improvement based on the Python implementation which was just a few lines smaller because it had the benefit of being written last. In this case I sacrificed adding the filename to the %ticks hash and that reduced a few LOC but added some de-referencing which "might" be optimized by a good JIT; cannot say for certain.

I'd like to compare these implementations to a Ruby version but I'm just not a fan of RVM this week. As for the remaining candidates. I have to admit that I really liked the Go version, however, I do have a complaint that while "Go" seemed like a good name for the project when it started. (prefix for google) right now it's hard to do google searches. GO is such a small and common word that there is no way to optimize searches. One strong advantage is the static linking once the project is compiled.


Popular posts from this blog

Entry level cost for CoreOS+Tectonic

CoreOS and Tectonic start their pricing at 10 servers. Managed CoreOS starts at $1000 per month for those first 10 servers and Tectonic is $5000 for the same 10 servers. Annualized that is $85K or at least one employee depending on your market. As a single employee company I'd rather hire the employee. Specially since I only have 3 servers.

The pricing is biased toward the largest servers with the largest capacities; my dual core 32GB i5 IntelNuc can never be mistaken for a 96-CPU dual or quad core DELL

If CoreOS does not figure out a different barrier of entry they are going to follow the Borland path to obscurity.

UPDATE 2017-10-30: With gratitude the CoreOS team has provided updated information on their pricing, however, I stand by my conclusion that the effective cost is lower when you deploy monster machines. The cost per node of my 1 CPU Intel NUC is the same as a 96 CPU server when you get beyond 10 nodes. I'll also reiterate that while my pricing notes are not currently…

eGalax touch on default Ubuntu 14.04.2 LTS

I have not had success with the touch drivers as yet.  The touch works and evtest also seems to report events, however, I have noticed that the button click is not working and no matter what I do xinput refuses to configure the buttons correctly.  When I downgraded to ubuntu 10.04 LTS everything sort of worked... there must have been something in the kermel as 10.04 was in the 2.6 kernel and 4.04 is in the 3.x branch.

One thing ... all of the documentation pointed to the wrong website or one in Taiwanese. I was finally able to locate the drivers again: (it would have been nice if they provided the install instructions in text rather than PDF)
Please open the document "EETI_eGTouch_Programming_Guide" under the Guide directory, and follow the Guidline to install driver.
download the appropriate versionunzip the fileread the programming manual And from that I'm distilling to the following: execute the answer all of the questio…

Prometheus vs Bosun

In conclusion... while Bosun(B) is still not the ideal monitoring system neither is Prometheus(P).


I am running Bosun in a Docker container hosted on CoreOS. Fleet service/unit files keep it running. However in once case I have experienced at least one severe crash as a result of a disk full condition. That it is implemented as part golang, java and python is an annoyance. The MIT license is about the only good thing.

I am trying to integrate Prometheus into my pipeline but losing steam fast. The Prometheus design seems to desire that you integrate your own cache inside your application and then allow the server to scrape the data, however, if the interval between scrapes is shorter than the longest transient session of your application then you need a gateway. A place to shuttle your data that will be a little more persistent.

(1) storing the data in my application might get me started more quickly
(2) getting the server to pull the data might be more secure
(3) using a push g…