Saturday, December 31, 2011

A new locale

I cannot say that I completely understand how this works; I cannot even claim to understand the basics. In this case, however, I managed to let my intuition guide me and I seem to have reached the finish line.

The challenge ahead of me is that I'm getting a new warning message from mercurial when I attempt to pull the latest changesets from my repository:
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)

I have three machines: my desktop, a development system, and a mercurial repository. Let's call them desk, dev, and repo, respectively.

My .profile on my desktop had a few interesting statements:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

And those values were carried forward to my dev machine, presumably because ssh forwards LANG and LC_* between sessions when the client and server are configured to send and accept them. So when I issued mercurial commands on the dev machine, I saw the warning mentioned above.

I logged into all three machines and executed the command locale. The results on desk and dev were the same (en_US.UTF-8); on the repo server it was POSIX. I'll repeat that I still do not completely understand why, but the following worked.
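Since mercurial is itself a Python program, a quick way to test whether a host can actually honor the requested locale is from Python. This is a minimal check, assuming Python is installed on each box; setlocale fails when the environment names a locale the host has not generated:

import locale

try:
    locale.setlocale(locale.LC_ALL, '')   # honor LANG/LC_ALL from the environment
    print("locale OK: %s" % (locale.getlocale(),))
except locale.Error as err:
    print("locale is broken: %s" % err)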

Option #1: remove the two export lines from my desktop config. This was a snore and sort of expected: since nothing was being pushed forward through the terminal sessions, each machine fell back to its default locale.

Option #2: this is the option that I executed, for no particular reason over #1; it was all intuition. On both my dev and repo systems I executed the following:
locale
sudo dpkg-reconfigure locales
sudo locale-gen en_US.UTF-8

And then I tested the hg command again. It worked. I'm not sure what the side effects are going to be, but for the time being all is well. (I remember a time when I was cutting and pasting T-SQL between Microsoft's SQL Server and whatever my code editor was at the time; I was in unicode hell.)

Friday, December 30, 2011

Mongrel2 = mongrel + python

Mongrel was originally written for the ruby folk by a programmer named Zed Shaw. I cannot speak to how useful it is or was, but I remember reading the name a few times and that was about that.

Recently, while reading up on ZeroMQ, I ran into a reference to a framework called Mongrel2. Mongrel2 itself is implemented in C on top of ZeroMQ, and the handlers can be written in just about any language; Python is a natural fit. It was also written by Zed.

It uses a single-process, single-thread, event-driven model to implement the client-facing web server. Requests are accepted inside an IO loop and those messages are forwarded to and through ZeroMQ as quickly as possible. The programmer need only implement a route table and the "handler" (a ZeroMQ worker).

What makes it interesting is that it implements a 1:many client-to-worker ratio, and it does so in a brokerless structure. In a brokered structure a single incoming transaction can yield 4x transactions on the same box, and between 6x and 8x when shifting the transaction off of the current cluster. They also did not implement a traditional REQ/RSP (request/response) workflow. In fact, on the inbound transaction the web server PUSHes to the ZeroMQ queue and the worker PULLs from the queue. When the worker has a response, it uses publish/subscribe to deliver the response to the awaiting web server instance.
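Here is a minimal sketch of that shape on the worker side using pyzmq. The endpoints and payloads are invented for illustration, and Mongrel2's real handler protocol adds its own framing (sender UUID, connection ids) on top of this:

# A worker in the Mongrel2 style, minus Mongrel2's framing:
# PULL the requests the web server PUSHed, then PUBlish responses
# that the web server SUBscribes to.
import zmq

context = zmq.Context()

requests = context.socket(zmq.PULL)
requests.connect("tcp://127.0.0.1:9997")    # hypothetical: server pushes here

responses = context.socket(zmq.PUB)
responses.connect("tcp://127.0.0.1:9996")   # hypothetical: server subscribes here

while True:
    msg = requests.recv()                    # blocks until a request arrives
    responses.send("echo: " + msg)           # any subscriber receives the reply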

It would be nice to have a picture to go along with that description but I do not have one.

And now the shortcomings.

  1. While the source code is available, the last thing I want to do is instrument all of that code.

  2. ZeroMQ does not have any queue-management APIs, so there is no way to know the TTL for a particular message... every message must be processed, meaning you can DoS yourself.


There are some APIs in the IOLoop class that could be used:

  • set up all the callback handlers

  • send() the message

  • receive the "sent" trigger

  • wait for the response...

  • if there is a timeout then try to reverse the request and release the client


In one other use-case:


  • set up all the callback handlers

  • send() the message

  • wait for the response...

  • if there is a timeout and the "sent" trigger was never received, then the message is still queued; abort the request.
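A rough approximation of that second workflow in plain pyzmq is to poll with a deadline instead of blocking forever; the endpoint and payload here are made up:

import zmq

ctx = zmq.Context()
req = ctx.socket(zmq.REQ)
req.connect("tcp://127.0.0.1:5555")   # hypothetical worker endpoint
req.send("do-work")

poller = zmq.Poller()
poller.register(req, zmq.POLLIN)
if poller.poll(2000):                 # timeout in milliseconds
    print(req.recv())
else:
    # timed out: the message may still be sitting in the queue;
    # abort the request and release the client (the REQ socket
    # must be closed and recreated after an abandoned request).
    req.close()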



In conclusion: while I like many of the design decisions in Mongrel2, and it clearly benefits from being a V2, it is still incomplete in many ways; mostly due to ZeroMQ rather than Mongrel2 itself.

My New Python Project Setup

[update 2012-01-18] postgres has been updated to 9.1.2; the latest version as of today.

[update 2012-01-17] feel free to ignore my comments about Lua. While Lua might sit in an interesting place between Python and Java for embedding and scripting, lunatic-python does not compile, and lupa depends on LuaJIT2, which is compatible with Lua 5.1 while the current Lua 5.2 was only recently released... and the comment from the LuaJIT team about adoption was a little snarky. I've got to think about something else.

[update 2011-12-29] I forgot to add twitter's bootstrap CSS/JS. I'll cover that in a future post when I also discuss modern-package-template

It's pretty simple to set things up. There are some prerequisites and some basic install packages that need root access, but the intent is to get the configuration into userspace as soon as possible. This article covers VM slices at Rackspace using Ubuntu 11.10.

First: Install and update:

  • allocate the OS

  • select the OS and wait for it to complete.  You're going to receive an email with the root password

  • login, change the root password

  • create an "admin" privileged user (usually my name or "builder")

  • add this user to the sudo group

  • change its password

  • edit /etc/ssh/sshd_config and disable root login

  • update the package definitions (apt-get update)

  • upgrade the packages (apt-get upgrade)

  • reboot


Second: Install the required root packages

  • postfix - when prompted select the default values

    • apt-get -y install postfix



  • apt-get -y install python-setuptools daemontools daemontools-run python-dev mailutils mutt build-essential uuid-dev python-nose vim htop sysstat dstat ifstat screen locate apache2-utils unzip siege python-virtualenv bwm-ng libcairo2-dev libglib2.0-dev libpango1.0-dev libxml2-dev fail2ban openssl libssl-doc openvpn libssl-dev libgcrypt11-dev lighttpd lighttpd-dev libevent-dev libcurl4-openssl-dev  libreadline6-dev beanstalkd tree

  • apt-get install postgresql-9.1 postgresql-client-9.1 postgresql-doc-9.1 postgresql-plperl-9.1 postgresql-plpython-9.1 postgresql-server-dev-9.1

  • easy_install pip

  • easy_install mercurial

  • easy_install pycurl

  • pip install virtualenvwrapper


That's it, essentially, for the second layer, however, here's an explanation of the modules from a macro perspective:

  • python-setuptools - make the installer, easy_install, available

  • daemontools daemontools-run - there are many ways to implement a daemon; these tools make daemon deployment simple

  • python-dev python-nose python-virtualenv - basic prereqs for python development. virtualenv is needed so that packages can be installed in userspace

  • mailutils mutt - generate emails

  • build-essential uuid-dev  - basic developer tools

  • vim screen - editor and console tool

  • htop sysstat dstat ifstat locate unzip bwm-ng - debug/monitoring tools

  • libcairo2-dev libglib2.0-dev libpango1.0-dev libxml2-dev - libs used when rendering usage graphics

  • fail2ban - detect login attempts and put the IP in time-out

  • openssl libssl-doc openvpn libssl-dev libgcrypt11-dev libcurl4-openssl-dev - crypto

  • lighttpd lighttpd-dev - web server that should be in front of the framework

  • libevent-dev - event notification lib (epoll/kqueue)

  • apache2-utils siege - performance simulation tools

  • beanstalkd - message queue (a quick check follows this list)
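Once beanstalkd is running (and the beanstalkc client from the third layer below is installed), a quick round trip looks like this; the default port 11300 is assumed:

import beanstalkc

queue = beanstalkc.Connection(host='localhost', port=11300)
queue.put('hello')        # producer: enqueue a job
job = queue.reserve()     # worker: blocks until a job is available
print(job.body)           # -> hello
job.delete()              # acknowledge so the job is not re-queued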


And finally the third layer, the userspace framework layer. But before you start installing packages you need to create the virtual environment:

  • cd ${HOME}

  • mkdir -p src

  • cd ${HOME}/src

  • virtualenv currentenv

  • . ./currentenv/bin/activate


Now install the third layer.


  • pip install tornado

  • pip install flask

  • pip install flask-rest

  • easy_install pip

  • pip install pycurl

  • pip install simplejson

  • pip install Fabric

  • pip install PasteDeploy

  • pip install PasteScript

  • pip install modern-package-template

  • pip install requests

  • pip install gevent

  • pip install pystache

  • pip install nose

  • pip install redis (the project is known as redis-py, but the PyPI package is named redis)

  • pip install pymongo

  • pip install hoover

  • pip install pyzmq

  • pip install pyyaml

  • pip install beanstalkc

  • pip install django

  • pip install django-redis-cache

  • pip install clint

  • pip install djangorestframework

  • pip install pyparsing

  • pip install flup


I'm hoping that there is a practical use-case for embedding Lua in Python. There are a few interesting projects, like lunatic-python and lupa. Normally I would not consider Lua for anything beyond "hello world"; however, the redis team is embedding Lua, it seems like a very lightweight codebase, and it can be embedded in just about any language (do a google search).

  • cd ${HOME}

  • mkdir -p tmp

  • cd tmp

  • wget http://www.lua.org/ftp/lua-5.2.0.tar.gz

  • tar zxvf lua-5.2.0.tar.gz

  • cd lua-5.2.0

  • make linux

  • sudo make install


NOTE: the lunatic project does not compile under Lua 5.2. So this thread is postponed for now.

  • pip install lunatic-python


Alternatively I tried lupa, but that requires LuaJIT 2.0, which is currently in beta (beta 9):

  • cd ${HOME}

  • mkdir -p tmp

  • cd tmp

  • wget http://luajit.org/download/LuaJIT-2.0.0-beta9.tar.gz

  • tar zxvf LuaJIT-2.0.0-beta9.tar.gz

  • cd LuaJIT-2.0.0-beta9

  • make

  • sudo make install

  • sudo ldconfig


Then install lupa.


  • pip install lupa



NOTE: hoover is a client library for loggly.com.  You'll need an account if you want to use this service.

In closing, I would like to include a few more libraries, but the versions available through apt-get are too old, so I prefer installing them from source. They are necessary packages, so for the time being I'm just going to list them. They should be installed along with the first layer, by the root user (or via sudo):

  • ZeroMQ - trivial to build and deploy if you follow the instructions

    • cd /tmp

    • wget http://download.zeromq.org/zeromq-2.1.11.tar.gz

    • tar zxvf zeromq-2.1.11.tar.gz

    • ./configure

    • make

    • make install

    • ldconfig



  • MongoDB - (can actually be installed in userspace)

    • cd /tmp

    • wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.0.2.tgz

    • mkdir -p ${HOME}/bin

    • cd ${HOME}/bin

    • tar zxvf /tmp/mongodb-linux-x86_64-2.0.2.tgz

    • sudo mkdir -p /data/db

    • sudo chown `id -u` /data/db

    • ./mongodb-linux-x86_64-2.0.2/bin/mongod

    • ... or ...

    • cd ${HOME}/bin

    • find ./mongodb-linux-x86_64-2.0.2/bin/ -type f -exec ln -s {} \;

    • ./mongod



  • Redis - trivial to build and deploy if you follow the instructions

    • cd /tmp

    • wget http://redis.googlecode.com/files/redis-2.4.6.tar.gz

    • tar zxvf redis-2.4.6.tar.gz

    • cd redis-2.4.6

    • make

    • make install



  • SQLite - a simple SQL DB

    • cd /tmp

    • wget http://www.sqlite.org/sqlite-autoconf-3070900.tar.gz

    • tar zxvf sqlite-autoconf-3070900.tar.gz

    • cd sqlite-autoconf-3070900

    • ./configure

    • make

    • make install

    • ldconfig



  • ISO8583 - ISO8583 lib

    • # will use virtualenv if it is configured

    • cd /tmp

    • wget 'http://pypi.python.org/packages/source/I/ISO8583%20Module/ISO8583%20Module-1.2.zip#md5=259c5eb63bb36e3376f8a430a5b4a092'

    • unzip ISO8583\ Module-1.2.zip

    • cd ISO8583\ Module-1.2

    • python ./setup.py install
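Before wrapping up, here is a quick smoke test for a couple of the services built above, using the redis and pymongo clients from the third layer. It assumes redis-server and mongod are running on their default ports, and the key/collection names are made up:

import redis
from pymongo import Connection   # pymongo 2.x API

r = redis.Redis(host='localhost', port=6379, db=0)
r.set('smoke', 'ok')
print(r.get('smoke'))            # -> ok

db = Connection('localhost', 27017)['scratch']
db.things.insert({'smoke': 'ok'})
print(db.things.find_one({'smoke': 'ok'}))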




Good luck!

PS: You should consider scripting this installation so that the deploy can be automated, especially via Fabric, chef, or puppet. A minimal sketch follows.
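For example, here is what the skeleton of such a script might look like with Fabric. This is a minimal sketch, not a complete deploy script; the host name and package list are placeholders:

# fabfile.py - a minimal Fabric 1.x sketch of automating the layers above
from fabric.api import env, run, sudo

env.hosts = ['builder@my-new-slice']   # hypothetical admin user and host

def first_layer():
    sudo('apt-get update')
    sudo('apt-get -y upgrade')
    sudo('apt-get -y install python-setuptools python-dev build-essential')
    sudo('easy_install pip')

def third_layer():
    run('virtualenv ~/src/currentenv')
    run('~/src/currentenv/bin/pip install tornado flask requests')

# usage: fab first_layer third_layer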

Monday, December 26, 2011

Dude where's my computer?

I received a call early this morning. A former manager of mine was calling because his computer was stolen. This was not going to be that big of a deal because he regularly backed everything up and had an administrator password. Well, almost. If this was just a crime of opportunity and the hardware was to be sold right away, then everything is OK. If not, then he could be in some trouble, and he'll have to spend some time monitoring his TRW (credit report) and the like.

Whether or not you currently have valuable data on your PC... at some point prior to a theft or loss you might find yourself regretting your choices. So here are some preventative measures.

  • completely encrypt your harddrive with a password

  • backup your data offsite and with encryption

  • use a screen saver with a non-trivial password

  • install a service like lojack for laptops

  • ... and a little fun


Here is my explanation:

By encrypting your harddrive you are essentially guaranteeing that if the thief has to power on the computer, he/she will need a password. This password cannot be spoofed or practically guessed... and depending on the tool it might also erase the harddrive after just a few failed attempts.

Dell offers a hardware encryption solution on some of their laptops. I had the service enabled on a Dell that was running Ubuntu. It was painful at times because I had not memorized the password but at least I was NEVER in fear of losing my client's data or source code.

There are other tools like TrueCrypt that will encrypt an entire Windows harddrive. TrueCrypt is not hardware locked so the drive could be pulled and some sort of automated attack could be performed... but that person would have to be committed to the endeavor.

The best solution for backing up your computer has got to be CrashPlan. a) it's cross platform; b) there is a family pack and a business version; c) it supports multiple simultaneous encrypted destinations; d) the cost is wallet friendly; e) the encryption prevents even their customer service from hacking your data.

Even if you encrypt your harddrive... if a thief steals your "sleeping" laptop, they will have access to your data. All they need to do is wake your computer and there everything is. So set a screensaver password and use it. Make certain it is never disabled.

Someone recommended a lojack service too. I suppose that would be a good idea, however, I do not know if I would ever use a computer that I might recover. Bad guys are just that. You have no idea if they installed a keylogger or some other piece of nefarious code (aka trojan). Be grateful you have it, drive a nail through it and toss it in the trash.

Better than actually putting lojack on the computer would be to tweak your screensaver so that it pops up a login screen with a serious "you are being tracked" message and some official-looking graphics, so that the thief might abandon the laptop... or actually help you out by destroying it themselves.

Losing something like a laptop can rock your confidence. Not only was your space violated but it's possible that your privacy will be too.

PS: While you are at it... you better put passwords on your iPhone and iPad... and other smart devices that you carry around with you like jewelry.

Sunday, December 25, 2011

Modern Application Development

There was a time when I was very proud that I could go a year without rebooting my Linux or BSD machines; especially when one was my main development machine. Over the last 6 years of my OSX tenure I have had very similar success. The main difference is that I have been using VMware or remote virtual servers a lot more for development.

Last night I was forced to reboot my OSX machine. It was an unpleasant experience because it felt like I was being coerced into the reboot. I know the feeling all too well from my Windows days when I might have to reboot my Win machine several times a day.

Last night's experience was triggered by a large swap file, caused by several applications that had each allocated between 500M and 1.5G of memory:

  • Chrome on OSX

  • Safari (latest)

  • PyCharm (3 directories open but it's java)

  • Little Snitch (what? There shouldn't be that much data)

  • CrashPlan (also Java; since uninstalled)

  • Kernel Task (expected)

  • VMWare (expected)


I brought up htop in the VMware slice where I had a PyCharm instance running. There must have been 30 java processes. Who knows what that's all about? I remember reading that most programmers that do threaded programming are not any good at it and it should be left to the experts. I wonder if jetbrains are experts.

There is a software version of Parkinson's law, something like [software is going to expand to fill all available memory and CPU capacity]. So here is my take on this:

a) software is way too complicated, and that complication shows up as more and more code, with more and more friction for the user and more and more bugs.

b) being the one application that takes 90% of RAM is not a good thing. A little care and you can reduce your footprint and with luck improve performance. And if you use something like SQLite or some other embedded DB you might accomplish both.
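For what it's worth, the embedded-DB option costs almost nothing in Python: sqlite3 ships with the standard library and needs no server process. The file and table names here are invented:

import sqlite3

conn = sqlite3.connect('app_cache.db')   # a single file, no daemon
conn.execute('CREATE TABLE IF NOT EXISTS cache (k TEXT PRIMARY KEY, v TEXT)')
conn.execute('INSERT OR REPLACE INTO cache VALUES (?, ?)', ('greeting', 'hello'))
conn.commit()
print(conn.execute('SELECT v FROM cache WHERE k = ?', ('greeting',)).fetchone())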

My new year's resolution will be to take my own advice and reduce friction wherever I can.

Tuesday, December 20, 2011

Response to "Seven Databases in Seven Weeks"

For this "7 in 7" book I just glanced at the motives for selecting the DBs that the author did. What caught my attention was the TOC. While the title of the book suggests that this is going to be a reference for modern databases and the NoSQL movement, it includes Postgres. What's curious here is that a) Postgres is neither a new database nor a NoSQL database; and b) while it is a modern implementation, none of its modern features are mentioned.

And then there is a huge gap where BDB (BerkeleyDB) should be. While BDB is sometimes considered a NoSQL database, it is not designed around the CAP trade-offs that are consistently attached to NoSQL DBs. What makes BDB interesting, and what would seem to be the subliminal rationale for the many query dialects of the NoSQL DBs, is an essay Mike Olson wrote justifying BDB's APIs and the absence of a formal query language: [programmers know their data better than any query optimizer] and [the extra steps to compile and optimize are time consuming and better done at compile time instead of runtime].

CAP is the counterpoint to ACID. Essentially CAP comes down to a principle of economics: pick two of the three attributes (consistency, availability, partition tolerance). A lot of time has been devoted to the original paper and the many follow-up research papers. I'm not qualified to rebut the thesis, but I always wonder if there is a spoiler out there. VoltDB has a novel approach that suggests you can, in fact, have your cake and eat it too. (It's also absent.)

The real challenge with the NoSQL movement and this publication is that they are implementing code as fast as they can. By the time this article is posted something new and interesting will have been deployed.

Missing from consideration:

  • memcache

  • leveldb

  • big table

  • S3

  • BDB (mentioned)

  • Orient

  • UnQL (a completely different movement)

  • SQLite


Finally, the one thing that is missing for me is a comprehensive, or at least a beginner, list of use-cases and the DBs that best satisfy those use-cases and why. For example, Riak seems to be a special purpose DB where MongoDB seems to be more of a general purpose DB. There are still some edge cases... but when you're talking about the volume of data that many of the NoSQL people talk about, you'd better have a good plan, especially if you think you might be moving the data from one storage engine to another.

Monday, December 19, 2011

Response to "Seven Languages in Seven Weeks"

Bruce Tate is a good writer and recently he published a book titled: "Seven Languages in Seven Weeks". I do a lot of career development so I completely agree with the premise, however, the first place I get lost is the selection of languages:

  • Ruby

  • Io

  • Prolog

  • Scala

  • Erlang

  • Clojure

  • Haskell


Initially there is nothing wrong with this selection. Tate tells the reader that the choices were made by asking his readers. At first glance this might make sense (blame the reader); however, it's more dubious than that.

I recently rebuked a blog for its survey results pertaining to Agile because the sample group was drawn from the author's own social circles; I believe the same can be said here. About the only thing in Tate's favor, however, is that the words "practical" and "pragmatic" were omitted. Had they been present, I believe the language selection might have echoed github's language survey.

In hindsight I should have read the TOC before I purchased the book. I already had a cursory knowledge of Ruby, I've been coding erlang professionally for 3 years, prolog was effectively deprecated when Borland's Turbo Prolog was decommissioned, and I've reviewed Haskell and it's of no general interest to me... I think erlang got it right. And as the saying goes, "put lipstick on a pig and it's still a pig" when it comes to Scala and Clojure.

If it were my book I think the list would have been a little different:

  • go - the language solves some concurrency and messaging issues found in many other languages; it's also statically linked.

  • erlang - lightweight processes, fast, has momentum

  • python 3 & perl 6 and PHP (Facebook) - These updates have been in development a long time. It's critical to understand whether the new versions are worth the mind share or if they should be deprecated.

  • modern C or C++ -

  • groovy - It's java lite and while I do not have any practical experience with it, since I do a lot of development in python, perl and ruby this makes sense in the JVM environment.

  • serverside javascript (NodeJS, MongoDB, etc) - another up and comer. This is probably more like a 1/2 week experience, however, just because you know browser javascript does not mean that you'll be successful on the server side.

  • R - The google-ites and Facebook-ies are going crazy with analytics, and now that the "social" aspect has entered just about every website, tools that render information about the business are becoming critical. R has a great many tools to help out. Hopefully one does not need a PhD in math to be successful.


What sets my list apart from Tate's is that it looks to the future: "Where are we going?" not "What's slipped between the cushions?"

As a sidebar, I have another list that I think might be interesting: "Seven Frameworks in Seven Days". You're not going to become an expert in seven days but you might know enough to make a choice for your next project based on that experience:

  • TornadoWeb or Cyclone (python) - very capable frameworks, but they are event driven.

  • Mojolicious (perl) - another event driven framework.

  • Sinatra (Ruby) - something to attract ruby-ists. It's as capable as those above.

  • Limonade (PHP) - PHP is powering back up thanks to Facebook's compiler.

  • Orbit (Lua) - Lua was conceived in the vacuum of Brazil and has an adopted home in World of Warcraft. At some point those programmers are going to want to break out of the game into the real world.

  • Snap (Haskell) - It's fast.

  • Nitrogen (erlang) - interesting GUI, comet, baked right in.


One reason for the entries in this list is that the language portion of the exercise is trivial. Micro frameworks are not capable of running an enterprise, but their low cost of entry gets things started so that your burn rate stays smaller.

While Snap and Nitrogen are interesting in their own right that's about it. They will not likely be here in 2 or 3 years but the ideas are great.

Thanks for reading.

webapp application stack

I'm trying not to be a "bitter betty" over the time spent on Mojolicious so I've been refreshing my stack in order to prepare for future endeavors. In the meantime the application I've been building for a client required that the GUI be split into an API and a GUI rather than just an integrated GUI. At this point I have completed the API development and it's as extensible and scalable as I could hope for. Now it's time for the GUI stack.

There are several missions here. a) python; b) tornadoweb; c) keep in mind the person making changes to this app will not likely be a pythonista (thank goodness). That's it. Now I need to pick the rest of the tools. If there is some interest on your part then do some googling. They are easy to find.

  • PyCharm IDE - not all of the future developers need or want to understand everything from the command line.

  • virtualenv - handle different versions of python.

  • easy_install and pip - needed for installing the different dependencies, however, distutils might be best for deploying your app

  • pycurl - dep

  • simplejson - dep (it was incorporated directly into python as the json module in 2.6)

  • tornadoweb - web framework

  • fabric - repetitive CLI tool

  • modern-package-template - python project folder structure

  • libevent - the event library that gevent builds on

  • requests - simple http client APIs

  • gevent - needed by requests

  • pystache (from mustache) - templating without logic of any kind - I like it because it's supported by so many platforms, including JavaScript on the client side and perl too. The best part is that any mustache file can be reused (a tiny example follows this list).

  • nose - unit testing.

  • daemontools - daemonize any userspace app.

  • redis - but I'm not actually accessing redis directly, although there is some room for caching.
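Here is the tiny pystache example promised above; the template and values are made up:

import pystache

template = "Hello, {{name}}! You have {{count}} new messages."
print(pystache.render(template, {'name': 'world', 'count': 3}))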


and my GUI elements:


  • bootstrap - this compiled Less/CSS is great. The best part is the baked-in best practices, but the files are easy enough to tweak if necessary.

  • and its dependencies



It's a pretty simple setup from here and creating your first project is easy too. After some experimentation you'll figure out the order that works best for you.

Sunday, December 11, 2011

"Mojolicious" has lost it's Mojo

I've recently encountered my second issue with Mojolicious; since the Mojo team had recently added some core developers and reinstated their ticket system via GitHub I decided to open a ticket.

The first issue I presented to the team was their use of "daemon" as a command line param to launch your webapp, however, it unexpectedly ran the app in the foreground. My understanding of "daemon" and the behavior of any number of apps is that "-d" or daemon-mode meant background execution.

The issue I am currently working on is the new 2.37 version. It's the first build from the new core team and that was a good thing, however, it would not build on my system (OpenBSD 5.0). So I reported it. The response that I received directed me to the CPAN team who verified that it worked.

Less obvious was the notion that I needed to check my dependencies. So I suggested exactly that... to which I received a reply that a) we were off topic and b) that there are no dependencies.

Well, ok... we were a little off topic... but there is such a thing as customer service. And while the cpan install file may not actually set the dependencies... they are listed on the CPAN page. So why not add that to the install?

So I'm up to my eyeballs in this. I've spent the last few months (calendar time) trying to wrap my head around this thing called Mojolicious. I happen to like perl and the CPAN. But I'm also frustrated with my two encounters with the Mojolicious team. They are neither smart enough nor pretty enough to warrant this sort of snarkiness.

In the last 24 hours I've spent some time looking at flask, Werkzeug, Cyclone, and of course I already use Django and TornadoWeb. I also like python. The dependencies are easier here too.

Thursday, December 8, 2011

"Rails Is Not For Beginners"

[UPDATE] I don't know anything about this book, and the fact that it's Ruby is not interesting, but the quotes and snippets are: here

Oscar posted "Rails Is Not For Beginners". His general assertion is that Rails is complex and has a lot of code making it difficult for beginners... and that Sinatra is a better tool because it's only a fraction of the code. (about 100:1)

I responded to the original post with this comment:
"not for beginners" is mostly true but meaningless. The noobs have adopted it instead of VB. However, I think that it's a little more complicated than that; it's as much a psychological mystery as trying to understand the stock market.

There is something to be said about "full stack awareness" and sinatra's LOC makes that easier. Sinatra also allows you to get some real work done. My intuition tells me that sinatra does not do as much fancy META magic under the covers as Rails does; which is more of a property of the language than the framework.

My issue with noobs, rails and ruby... is that while the "UNIX way" suggests building projects on top of each other like layers of a cake, many of the useful GEMS have such deep dependencies that FSA is extremely unlikely... If the MBAs that run our companies fully appreciated the complexity and risk, then Rails would still be in the drawing room.

While I agree with the assertion I think it's probably a little more complicated than that. The fact of the matter is that "most" and not "all" beginners suck at programming. It's not enough to know language syntax or to have built a few hosted and dynamic websites. This career/profession requires a lot more attention to detail, situational awareness ... and with all due respect to Dorsey ... a finely tuned intuition; in addition to the syntax and idioms of at least 3 or 4 mainstream languages. Not to mention plenty of business awareness.

Wednesday, December 7, 2011

RestMQ routing table for PerlMQ

[Update 2011-12-11] After several months of preparation to build this app I'm regretfully abandoning the effort. a) the mojo guys; b) since I've decided to concentrate on Python... RestMQ will have to do because there is no sense in rewriting the code for TornadoWeb. Good luck to us all.

As the project continues, this article is going to convert the routing table from the RestMQ app to the PerlMQ version. The RestMQ snippet for the routing table associates each URL with a particular event handler in list form. The Mojolicious version does not: in Mojo, the URL and the handler's body are declared together.

Here is the RestMQ routing table:

(r"/", IndexHandler),
(r"/q/(.*)", RestQueueHandler),
(r"/c/(.*)", CometQueueHandler),
(r"/p/(.*)", PolicyQueueHandler),
(r"/j/(.*)", JobQueueInfoHandler),
(r"/stats/(.*)", StatusHandler),
(r"/queue", QueueHandler),
(r"/control/(.*)", QueueControlHandler),
(r"/ws/(.*)", WebSocketQueueHandler),


Converting this to the perl version is going to be pretty easy. Most of the handlers support GET/POST and some support DELETE. These are the REST verbs, and the actual routing to the individual handler methods is handled by the frameworks. Whether the list version is better than the inline version is up for debate.


In my opinion the list version is better because it's concise; the code is self-documenting right here. However, Mojolicious has an interesting feature: it can print the routing table from the command line interface. This allows the user to verify the routes without re-reviewing the code from scratch.

The Mojolicious analog of the route table above is as follows:
# ----------------> (r"/", IndexHandler),
# http://localhost/?[queue=...]&[callback=...]
get '/' => sub { ... }
# http://localhost/
# queue=
# [msg=...]
# [value=...]
post '/' => sub { ... }

# ----------------> (r"/q/(.*)", RestQueueHandler),
# http://localhost/?[callback=...]
get '/q' => sub { ... }
# http://localhost/<queue>?[callback=...]
# [msg=...]
# [value=...]
post '/q/:queue' => sub { ... }

# http://localhost/<queue>?[callback=...]
delete '/q/:queue' => sub { ... }

# ----------------> (r"/queue", QueueHandler),
# http://localhost/queue
get '/queue' => sub { ... }

# http://localhost/queue
# [msg=...]
# [body=...]
post '/queue' => sub { ... }

# ----------------> (r"/c/(.*)", CometQueueHandler),
get '/c/:queue' => sub { ... }

# ----------------> (r"/p/(.*)", PolicyQueueHandler),
# http://localhost/p/<queue>
get '/p/:queue' => sub { ... }

# http://localhost/p/<queue>
# [policy=...]
# [callback=...]
post '/p/:queue' => sub { ... }

# ----------------> (r"/j/(.*)", JobQueueInfoHandler),
# http://localhost/j/<queue>
get '/j/:queue' => sub { ... }

# ----------------> (r"/stats/(.*)", StatusHandler),
# http://localhost/stats/<queue>
get '/stats/:queue' => sub { ... }

# ----------------> (r"/control/(.*)", QueueControlHandler),
# http://localhost/control/[queue]
get '/control/:queue' => sub { ... }

# http://localhost/control/<queue>
# [status=(start|stop) ]
post '/control/:queue' => sub { ... }

# ----------------> (r"/ws/(.*)", WebSocketQueueHandler),
# I'm not sure how this is going to translate yet
#get '' => sub { ... }
#post '' => sub { ... }
#delete '' => sub { ... }

PS: One thing that is missing from this project so far... is the actual layout of the folders and any sort of installation program. For the moment I have deferred that step because I'm still trying to decide whether or not to integrate this into the Mojo project (if they will take me).

Tuesday, December 6, 2011

Reverse Engineering RestMQ

RestMQ is no big secret. The source is available online and there is some doc too. The project I'm referring to was written in python using the Cyclone framework which depends on Twisted. I like both of those tools but they feel wrong for a project like this. It seems to me that the dependencies are too deep.

That's when I requested and was granted an account from the CPAN team. My intention is to build a version of RestMQ in perl using the Mojolicious framework. I originally thought to call it 'RestMQ-pl' and then I thought of 'PerlMQ' and I'm also lingering on 'MojoMQ'. The last one makes sense if I create a separate core API than can be integrated into the Mojo base code. (when I learn it)

The RestMQ code and doc is on par with just about every other open source project out there but there are some limits. One thing that is missing for me is enough code/doc that I can use as a requirements doc in order to implement the code in any language or any platform. So I am forced to install it and reverse engineer the behavior and the data.

The other item that I'm struggling with is that this article is getting longer and longer. It's hard to edit, get feedback and move on to the next idea... and quite possibly abandon the article if there is no interest. So for the purpose of this article I'm going to address installing RestMQ, Cyclone and Twisted on an OpenBSD 5.0 system which I have been using for general purpose development for the last few months.

I tried to install RestMQ and its dependencies on an OpenBSD 5.0 system. My first attempt was a failure because the easy_install script reported that it was missing epoll.h, which is a Linux kernel header and therefore is not going to be available on OpenBSD.

My next stop is the Twisted website to see if they have a proper build script. That too failed because while Twisted was available in source form there were still too many dependencies that I did not want to chase after... and while I really do want some "full stack awareness" at this moment it's off target for the mission. Get 'er installed.

I started my second attempt using an OpenBSD package:
# OpenBSD 5.0
pkg_add http://ftp.openbsd.org/pub/OpenBSD/5.0/packages/i386/py-twisted-web-11.0.0.tgz
easy_install cyclone
git clone https://github.com/gleicon/restmq.git
cd restmq
python ./setup.py install

This was not going to be enough. A couple of the scripts expect the bash shell to be in a different place. There are a couple of ways to solve this, and for an OpenBSD system the following is a severe cheat. One should really correct the code, possibly changing the static shebang to use "env". But in the meantime that too would be off mission, so this symlink will have to do for the moment.
# need bash in the right place
ln -s /usr/local/bin/bash /bin/bash

Let's see what happens next. From this point you should be able to follow the RestMQ instructions for starting and testing RestMQ... and it worked. Now for some reverse engineering.
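As a first probe, something like this exercises the queue over HTTP using only the standard library. The port and the 'value' parameter follow the RestMQ README as I read it, so treat both as assumptions:

import urllib
import urllib2

base = 'http://localhost:8888'   # assumed default RestMQ port

# enqueue a message, then fetch it back
urllib2.urlopen(base + '/q/test', urllib.urlencode({'value': 'hello'}))
print(urllib2.urlopen(base + '/q/test').read())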
(The next article will roughly address the route table.)

Languages and more languages

[Update 2011-12-06] Actually there is still a lot missing from this list. The tier-1/2 dependency list is pretty high. Things like Spread, ZeroMQ, OpenSSL, JDBC, jPOS, Mnesia, Mojolicious, Twisted, Celery, ActiveRecord, Hibernate, iBatis, Riak, Redis, MongoDB, Cassandra ... and it goes deeper than that too.

As a followup to "projects and more projects", this article is going to cover all of the programming languages, tools and frameworks I've used over the years as they relate to the projects I've worked on. This is going to be the hardest document to write because in some cases the dependencies were too deep for me to know for sure (thank you, WebLogic), but I'll give it a shot; maybe it'll be interesting in the end.



I'm not sure if this is everything but it's pretty darn close.

Monday, December 5, 2011

Chaos in Framework development

I think I am observing....
the more general purpose a framework or language becomes the more likely that the things implemented on top of it also become more general purpose ... recursively.

Which has me asking the question:
So at what point do we end up back at the original source?

Think about this. Sun's, now Oracle's, JVM executes on just about everything out there. The JIT is producing runtimes that are nearly as good as C. So given all of the extra heat that the JVM generates why not just write good C code in the first place?

And that has me thinking about something else...
Forget whether or not Moore's law will hold forever; assuming that it does, at what point do we stop caring?

For about the first 4 to 5 years of the PC, Moore's law was really important. We were aching for CPUs that would allow us to do exponentially more work. We wanted more memory, etc... MS-DOS applications and MS Windows were around the corner. And we just waited for the hardware to catch up.

Now the hardware is here. I know it's here because for the last few years we have watched NASCAR show realtime car telemetry on the screen rendering the driver's box and car data with an arrow to the car. Or the colorized hockey puck during professional hockey games (abandoned but it worked).

We can do just about anything with our PC/Mac(s), and our phones and tablets are getting there. So what does it all mean? Where are we going next?

Friday, December 2, 2011

I want a private twitter clone for my business

While I'm not sure I know the difference between short emails and tweets, or their effect on productivity, I am wondering about the movement to ban email from the workplace.

It's not that I mind that my conversations are in the cloud; it's that I want to choose which conversations are in the cloud. Also, a recent article reported that a CEO or two have banned email from their companies.

If I had to list a number of requirements for a private twitter:

  • mobile client

  • mobile web

  • desktop app

  • browser app

  • backchannel for internal employees or inner circle

  • server storage of conversations

  • temporary accounts for external resources with filtered output

  • up/down voting

  • tagging (both twitter and external tagging)

  • tag a selection of messages

  • live filtering like TweetDeck in one column and my conversation(s) in another

  • export to PDF or some other format

  • search

  • attachments

  • slightly larger than 140 characters


But then what about GMail or some other Mail client? If I could filter my messages by size so that the smaller emails were given a higher priority and people emailing me knew that... (maybe an auto responder that told them that the email was going to be #100 in line so that they could adjust it and resend; eliminating potential gaming of the system like putting the real email in an attachment). Allow for chaining of messages into a conversation and being able to alter the subject line in the email. (stuffing something in the header).

When you think about it, there are only a few differences between this and emailing using only the subject line; you lose access to the chain of events, but it is pretty close.

The only reason why email loses this battle is because the email clients have momentum... inertia. There is no way that all these developers would, could or should accept my email agenda. Granted someone like google could implement this feature and it might actually stick. The others might follow.

I don't know where this leaves me. I don't think I have the time to implement either. No matter how much better it makes me feel. It also sounds a lot like circles in google+. But then there is that privacy issue again.

*sigh* I might actually have something here. I'll have to give this some thought.

Detecting Mobile Browsers

I have a webapp that I'm about to normalize into separate model and view apps. It's not an uncommon approach when you need to render differently based on the browser type, in particular mobile. So the first step in the information gathering is browser detection. Like everyone before me, I did the google search and found the many tools out there. Clearly this is going to be a moving target over time.

I ran the generator code and ended up with this mess:
# Ported by Matt Sullivan http://sullerton.com/2011/03/django-mobile-browser-detection-middleware/
import re
from django.http import HttpResponseRedirect

# Matched against the full user-agent string.
reg_b = re.compile(
    r"android.+mobile|avantgo|bada\/|blackberry|blazer|compal|elaine|fennec|"
    r"hiptop|iemobile|ip(hone|od)|iris|kindle|lge |maemo|midp|mmp|"
    r"opera m(ob|in)i|palm( os)?|phone|p(ixi|re)\/|plucker|pocket|psp|"
    r"symbian|treo|up\.(browser|link)|vodafone|wap|windows (ce|phone)|"
    r"xda|xiino", re.I | re.M)
# Matched against only the first four characters of the user-agent string.
reg_v = re.compile(
    r"1207|6310|6590|3gso|4thp|50[1-6]i|770s|802s|a wa|abac|ac(er|oo|s\-)|"
    r"ai(ko|rn)|al(av|ca|co)|amoi|an(ex|ny|yw)|aptu|ar(ch|go)|as(te|us)|attw|"
    r"au(di|\-m|r |s )|avan|be(ck|ll|nq)|bi(lb|rd)|bl(ac|az)|br(e|v)w|bumb|"
    r"bw\-(n|u)|c55\/|capi|ccwa|cdm\-|cell|chtm|cldc|cmd\-|co(mp|nd)|craw|"
    r"da(it|ll|ng)|dbte|dc\-s|devi|dica|dmob|do(c|p)o|ds(12|\-d)|el(49|ai)|"
    r"em(l2|ul)|er(ic|k0)|esl8|ez([4-7]0|os|wa|ze)|fetc|fly(\-|_)|g1 u|g560|"
    r"gene|gf\-5|g\-mo|go(\.w|od)|gr(ad|un)|haie|hcit|hd\-(m|p|t)|hei\-|"
    r"hi(pt|ta)|hp( i|ip)|hs\-c|ht(c(\-| |_|a|g|p|s|t)|tp)|hu(aw|tc)|"
    r"i\-(20|go|ma)|i230|iac( |\-|\/)|ibro|idea|ig01|ikom|im1k|inno|ipaq|"
    r"iris|ja(t|v)a|jbro|jemu|jigs|kddi|keji|kgt( |\/)|klon|kpt |kwc\-|"
    r"kyo(c|k)|le(no|xi)|lg( g|\/(k|l|u)|50|54|e\-|e\/|\-[a-w])|libw|lynx|"
    r"m1\-w|m3ga|m50\/|ma(te|ui|xo)|mc(01|21|ca)|m\-cr|me(di|rc|ri)|"
    r"mi(o8|oa|ts)|mmef|mo(01|02|bi|de|do|t(\-| |o|v)|zz)|mt(50|p1|v )|mwbp|"
    r"mywa|n10[0-2]|n20[2-3]|n30(0|2)|n50(0|2|5)|n7(0(0|1)|10)|"
    r"ne((c|m)\-|on|tf|wf|wg|wt)|nok(6|i)|nzph|o2im|op(ti|wv)|oran|owg1|p800|"
    r"pan(a|d|t)|pdxg|pg(13|\-([1-8]|c))|phil|pire|pl(ay|uc)|pn\-2|"
    r"po(ck|rt|se)|prox|psio|pt\-g|qa\-a|qc(07|12|21|32|60|\-[2-7]|i\-)|qtek|"
    r"r380|r600|raks|rim9|ro(ve|zo)|s55\/|sa(ge|ma|mm|ms|ny|va)|"
    r"sc(01|h\-|oo|p\-)|sdk\/|se(c(\-|0|1)|47|mc|nd|ri)|sgh\-|shar|sie(\-|m)|"
    r"sk\-0|sl(45|id)|sm(al|ar|b3|it|t5)|so(ft|ny)|sp(01|h\-|v\-|v )|"
    r"sy(01|mb)|t2(18|50)|t6(00|10|18)|ta(gt|lk)|tcl\-|tdg\-|tel(i|m)|tim\-|"
    r"t\-mo|to(pl|sh)|ts(70|m\-|m3|m5)|tx\-9|up(\.b|g1|si)|utst|v400|v750|"
    r"veri|vi(rg|te)|vk(40|5[0-3]|\-v)|vm40|voda|vulc|"
    r"vx(52|53|60|61|70|80|81|83|85|98)|w3c(\-| )|webc|whit|wi(g |nc|nw)|"
    r"wmlb|wonu|x700|xda(\-|2|g)|yas\-|your|zeto|zte\-", re.I | re.M)

class DetectMobileBrowser(object):
    def process_request(self, request):
        request.mobile = False
        if request.META.has_key('HTTP_USER_AGENT'):
            user_agent = request.META['HTTP_USER_AGENT']
            b = reg_b.search(user_agent)
            v = reg_v.search(user_agent[0:4])
            if b or v:
                return HttpResponseRedirect("http://detectmobilebrowser.com/mobile")

At first, when I looked at the block of regex code, I immediately thought of javascript obfuscation, but settled on the fact that it is just a complex set of regexes. HOLY CRAP what a mess! I'm sincerely hoping this code is machine-generated and auto-tested too, because if it is hand-coded and hand-tested only bad things will happen. The worst of it is that the code will need updates whenever there are changes on the mobile side. (That's a different set of problems.)

PART 2 of the challenge is that the actual render app(s) need to detect the different mobiles and browsers and render something that you'd be happy to call your own. The good news is that jQuery and Dojo both have mobile versions/modules and they seem fairly complete and functional. The bad news is that, given the number and variety of browsers (see the code above), there must be huge numbers of exceptions and exception processing in the code that actually drives the UI.

So the mission starts, first, with the redirect and then the facelift.

Thursday, December 1, 2011

Response to "2011 IT Project Success Rates Survey Results"

In a recent post, 2011 IT Project Success Rates Survey Results, the author posted some results from a survey; the contents had been discussed in a Dr. Dobb's article, which lent some credibility. However, the results and the conclusion almost had me fooled.

In summary, when comparing iterative, agile, lean, ad-hoc and traditional project teams... Agile scored equal to or better than the other project team types.

I expected this result because I was directed to the article by someone I trust and highly recommend as an Agile professional. But when I got to the bottom of the article I realized that it was not an authentic research report.

From the information given I conclude that the survey and its conclusion were biased: a) the respondents were contacted via channels close to the author:
This survey was performed during the last two weeks of October and there was 178 respondents.  The survey was announced in my October 2011 DDJ column, my DDJ blog, on the Ambysoft announcements list, my Twitter feed, and several LinkedIn discussion forums (IASA, TDWI, Enterprise Architecture Network, Greater IBM connection, and Considerate Enterprise Architecture Group).

b) I'm not a survey expert, but the contents of the survey assume that all of the respondents have experience with all of the project team types (which I find unlikely), and the pro-Agile questions scream "I am agile".

c) more bad news about the survey: the questions were subjective and left the respondent to guess at the various success rates. I have never worked for, or heard of, a company keeping statistics on the success or failure of projects broken down by management style.

Who is this article good for? Someone with a vested interest in Agile. Someone who wants to bring Agile to their company because it's new and cool. Think of it as if drug companies did their own FDA trials. Not a good idea! The author's company provides agile services.

So if you want to prove that Agile is a good thing then you need unbiased and random respondents and not your circle of friends. Which is to say that the article is not a total FAIL but it demonstrates the need to capture real metrics on the different project team types and their successes. But it has to be performed in a clinical way and not ad-hoc by people with a vested interest in one method over the other.

another bad day for open source

One of the hallmarks of a good open source project is just how complicated it is to install, configure and maintain. Happily gitlab and the ...