Saturday, March 31, 2012

Startups, Incubators, and Hackathons are Evil!

I want to build my own startup!

I remember a conversation I had with my father 30+ years ago. He had recently retired for the second time and we were talking about potential businesses. There was a time when he was the president of his own company and I always thought that when he retired I would take over. Anyway, that was the message in the media in the 70s. Now it was time for a new future. As we discussed businesses he mentioned that over 90% of all new businesses fail.

Shit... what was I going to do now? (if I had to do it all over again I would have minored in computer science instead of majored)

I don't know what the numbers are today but I cannot imagine that they are any better. And let's be specific... the percentage of tech startups that succeed has got to be lower than that.

I want to be at a hackathon!

Until I decided to include hackathons in this article I had no real idea what one was. Some weeks ago I wrote about the social aspects of hackathons and the hiring process, and I don't think there is anything new or interesting here. Frankly, why would I want to sit around a table with a number of programmers working on some idea that someone else may, or will, cash in on? This is akin to the programming challenge, and if you've read anything I've written then you know how I feel about that.

Hackathons are a way for employers or projects to get some concentrated and potentially high-powered thinkers in a room to solve difficult problems. What happens to the results is anyone's guess. They are also a way for companies to audition potential employees, and for junior programmers to leapfrog the apprentice process.

I want to be accepted by the next incubator class!

I have a great idea. If I can get some VC money it will totally disrupt a $100B market in the US. The problems here are serious. First of all, it's $100B. Once I start to pitch VCs it's going to be a race to market. Patents or not, it's going to be about who delivers. Everyone else is going to be in court trying to figure it out for the next 100 years, and in the meantime I'm standing in breadlines.

And if the incubator takes me on the first pass... then there is the huge chunk of equity that I have to forgo as compensation. Not to mention that's when the Silicon Valley types start to insist that I have ping pong and foosball tables... and ask interview questions built around programming challenges.

The real question?

How does one become one of the 10 or 15% of startups that succeed, without tricking or social-engineering one's way into prominence, without giving up too much equity... and, famously, doing no evil... in the process?

PS

I really do have a $100B-USD disruptive idea.

Why is Mozilla investing in Rust?

A very long time ago I was having a conversation with peers that spilled into a blog post. At the time I was noticing that all of the big boys like Google, Yahoo and others were gobbling up language gurus like Guido.

Now, with that in mind, Mozilla is creating Rust. I do not pretend to know what their real motives are but I do find it interesting to observe. Mozilla's history is all over the map. It was commercial, then it was open source and non-profit, then it was commercial again under AOL, and then it was open source and semi-nonprofit as the Mozilla Foundation... or something like that.

It just seems curious to me that they would go this route. They have 3 or 4 successful projects. They have uber-cool tools that are functionally cross-platform. I don't think they have done any pure or applied language research up to this point. Why Rust?

Google's Go fills a need and they are clearly going to direct the future of the language. Unlike the days of the IBM and Microsoft OS/2-versus-Windows wars, or the days of Lotus versus Excel... there is no API war to be won. Rust could be a fork of Go and it would not matter in the least, the way it once would have.

It seems to me that the DSL (domain-specific language) is actually being replaced with the BSL (brand-specific language) and everyone wants to get into the act.

Friday, March 30, 2012

Programming Challenges are "de-motivational"

I think the title of this article gives away the ending. Sorry... I suppose you can stop reading here.

Thank you for reading on... I recently applied for a programming position. The initial email from the hiring manager or HR included something like:
... and then there is the programming challenge ... and it should take 2 days.

Really? Are you kidding? You want me to give up 2 days for what? That was my unfiltered subconscious speaking. But really, that's a lot to ask. Especially when someone is going to take my 2 days of work, skim it for some quasi-critical check boxes, and make a summary evaluation.

So I said no.
As for the programming challenge: with over 25 years of practical experience and interviewing on both sides of the fence... I do not do programming challenges if I can help it. They are (a) subjective, (b) generally insulting, (c) trumpeted by junior Silicon Valley programmers, (d) perpetuated by myth, and (e) a sign of lazy managers.

I've written about this sort of thing before and I think this summarizes the many articles. But this morning I had one of those inspirational moments.

I recently started reading The Developer's Code. I'm only partway through the book but I have started to connect with the author. Reading an essay at a time as my toddlers crawled around the living room playing with their toys... I made it into the "Motivation" section.

Having pets in the office was a necessity when you work 10 to 16 hours a day; however, I have to admit that I never liked a ping pong, darts or pool table in a place of business. And this is where I fork into two equally important thoughts.

(a) These so-called "perks" require more than one person to be stimulating. In one workplace I visited all of the workspace walls are glass, so if you are playing ping pong instead of doing your job (1) everyone sees and (2) it requires a second player. Both have a demotivating effect on team members trying to get work done while others are at play.

(b) the "perk" of our career path is supposed to be the work. I happen to like transactions and databases. Luckily for me there is plenty of work in this space as most applications today are built this way. But the work is it's own reward.

So when you ask me to take 2 days of my life to work on something that is not going to yield any appreciable results... there is simply no motivation to do the work. It also speaks to the nature of the organization.

As an aside, some programmers like to talk about code as art or science. The field might actually be divided on this note. I do not think it's either, but that's fodder for another article.

Thursday, March 29, 2012

tmux is better than webex

Many of my clients use webex, go-to-meeting or one of those desktop presentation programs in order to share a desktop and get some work done. They are really nice when you have display slides, GUIs, and sometimes videos. About the only problem with the scenario is that they take plenty of bandwidth for what is often just a terminal session... along with the audio.

I'm new to tmux, however, I've used GNU Screen for years. Screen has the ability to share a terminal session but it's not very friendly or easy when compared to tmux. The PragProg book has some nice recipes for tmux and is worth checking out.

Using tmux for pair programming (not a favorite of mine) is easy. And tmux is especially nice in that it lets one or both users work in the session, while an observer can attach in read-only mode. tmux is installed by default on any OpenBSD installation... I think it's a tool you can trust. Also, iTerm2 has some plug-ins for tmux but that requires a custom tmux build.
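Sharing a session comes down to a shared socket. A minimal sketch, assuming both logins can reach the socket path (the path and session name here are examples):
tmux -S /tmp/pair new -s shared        # first user creates the session on a named socket
chmod 770 /tmp/pair                    # make the socket reachable by the second user
tmux -S /tmp/pair attach -t shared     # second user attaches read/write
tmux -S /tmp/pair attach -t shared -r  # ...or read-only with -r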

Tuesday, March 27, 2012

"Unfortunately, we'll reject most software developer job applications."

Andrew Stuart of Supercoders in Australia made the bold statement that is the title of this response. While I understand the point he is trying to make, he has selected an example that demonstrates that he does not get "it".

He insists, in his article, that potential employees know the [parochial] definitions of private, protected, public, abstract class and interface.

There was a time when I knew the definitions word for word. The GoF patterns were also a part of my indelible memory... but as time progressed and as open source took a foothold, idioms like protected and private meant less and less.

Back in the day when library vendors were distributing binary-only files private and protected meant something. It meant that the vendor could hide the implementation details. This was usually necessary when there was some intellectual property in play. It's simply not the case any more.

Going back to Andrew's comments: I don't know many A-list developers, programmers, or architects that rely on rote memory. Good luck to you Andrew.

Sunday, March 25, 2012

Little Snitch - Network Monitoring

One of the really neat features of Little Snitch (LS) is that it has a small dashboard that indicates the network I/O for a particular destination. There is a console version of that feature for the Linux set called iftop (there is a version for OSX too). But as I sit here considering LS I'm thinking that it was good in its time, and that is not the case right now.

LS has a "rules" engine where you specify the application and what remote systems it can connect to. But as I look at the rules they are all enabled. So what benefit is it?

In order to install the OSX version it's best to have already installed MacPorts. Then install iftop.
sudo port install iftop

(clearly you need root/admin access to your OSX system).

Once you have iftop installed you can launch it with:
sudo iftop

You need root access because intercepting the network packets requires root access.

Saturday, March 24, 2012

GO's missing feature fulfilled

I really like the GO language, however, I've had a couple of complaints that prevent me from using it in production. (a) the language has been a moving target and it seems that the 3rd party library developers have lost whatever momentum they had. (b) the absence of a version manager like RVM for Ruby or VirtualEnv for Python.

Well, we cannot do anything about (a) because it's as much political and emotional as it is technical and cost-driven. But having (b) in our back pocket makes development so much easier (GVM, the Go version manager). I think once it makes it into the mainstream of Go development more real progress is going to be made.

Go offers a nice balance between perl, python, ruby, java, erlang, lisp and a few others. With its memory management, sockets, IPC, datatypes, compilation, formatting, syntax... it's just a nice balance.

Friday, March 23, 2012

Where are my machines? (nmap)

I do odd jobs from my home office. It's not much but it supports my hobbies. Recently I was handed a POS and asked to configure it/integrate a new PinPad from a vendor that was not already supported.

The first task was to login. It was a Linux-based system with an Ethernet port and an SSH server running. When the machine boots up it calls for an IP address from the local DHCP server. In my case DHCP is served by my firewall. Depending on the software running on the DHCP server, it may list the IP address leases. This one did not.

nmap is a tool used to locate hosts and open ports. It has a number of good and bad uses.

The first step is to install nmap. This is simple on most Linux distros. Since the system I'm on is OSX I had an extra step, assuming MacPorts is already installed.
sudo port install nmap

Once the installation was completed I needed to execute the search.
nmap -v -p22 10.0.1.2-200

"-v" is the option for verbose

"-p22" is the setting to search for port 22 on each IP address. (port 22 is the ssh server port in most configurations)

"10.0.1.2-200" tells nmap to search the 199 IP address from 10.0.1.2 thru 10.0.1.200... for example: 10.0.1.2, 10.0.1.3, 10.0.1.4 ... and so on.

I got my list of devices and since it was a short list it was easy enough to try each.
Initiating Connect Scan at 19:10
Scanning 10 hosts [1 port/host]
Discovered open port 22/tcp on 10.0.1.27
Discovered open port 22/tcp on 10.0.1.21
Completed Connect Scan at 19:10, 0.20s elapsed (10 total ports)
Nmap scan report for 10.0.1.5
Host is up (0.19s latency).
PORT STATE SERVICE
22/tcp closed ssh

Nmap scan report for 10.0.1.20
Host is up (0.023s latency).
PORT STATE SERVICE
22/tcp closed ssh

Nmap scan report for 10.0.1.21
Host is up (0.0028s latency).
PORT STATE SERVICE
22/tcp open ssh

Nmap scan report for 10.0.1.22
Host is up (0.0061s latency).
PORT STATE SERVICE
22/tcp closed ssh

Nmap scan report for 10.0.1.23
Host is up (0.00027s latency).
PORT STATE SERVICE
22/tcp closed ssh

Nmap scan report for 10.0.1.24
Host is up (0.0028s latency).
PORT STATE SERVICE
22/tcp closed ssh

Nmap scan report for 10.0.1.25
Host is up (0.20s latency).
PORT STATE SERVICE
22/tcp closed ssh

Nmap scan report for 10.0.1.27
Host is up (0.0025s latency).
PORT STATE SERVICE
22/tcp open ssh

Nmap scan report for 10.0.1.28
Host is up (0.0063s latency).
PORT STATE SERVICE
22/tcp filtered ssh

Nmap scan report for 10.0.1.29
Host is up (0.0062s latency).
PORT STATE SERVICE
22/tcp filtered ssh

Read data files from: /opt/local/share/nmap
Nmap done: 199 IP addresses (10 hosts up) scanned in 20.18 seconds

Thursday, March 22, 2012

What is so interesting about the Flask microframework?

I get that Flask has a lot of the same design patterns that Ruby's Sinatra has. I suppose if one used a metadata approach to application construction/deployment, you might be able to basically interchange them.

I did a search hoping to find out the differences between Flask and Tornado. I was rewarded with a page from the Flask development docs. The contributor was suggesting that one might link or cascade Flask with either Tornado, Gevent, Gunicorn or some other proxy setup.

While mentioning Tornado the contributor says...
Tornado is an open source version of the scalable, non-blocking web server and tools that power FriendFeed. Because it is non-blocking and uses epoll, it can handle thousands of simultaneous standing connections, which means it is ideal for real-time web services. Integrating this service with Flask is a trivial task:
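The glue for that is only a few lines. A minimal sketch in the spirit of the Flask deployment docs, where yourapplication is a placeholder for any Flask app:
from tornado.wsgi import WSGIContainer
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
from yourapplication import app  # placeholder: your Flask app

http_server = HTTPServer(WSGIContainer(app))  # wrap the WSGI app for Tornado
http_server.listen(5000)
IOLoop.instance().start()  # run Tornado's event loop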

In the deployment section of the doc, Flask makes it clear that the built-in webserver is strictly for development. The reasons are probably very similar to Rails' WEBrick, but in the case of Flask no explanation is given. Nor is there a recommendation, just a list of servers.

I recently deployed a Tornado-ZeroMQ bridge in order to increase the transaction throughput. Sitting in front of the Tornado instance is a traditional webserver like Apache, lighttpd or nginx. These webservers serve the static content, because that is what they do best, and the dynamic requests are passed through. But why would I deploy lighttpd->tornadoweb->flask? There is plenty of room for improvement here, but someone transitioning to or from Sinatra and Flask could be rewarded.
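As a sketch of that front-end split (nginx shown; the paths and port are assumptions, not a recommendation):
server {
    listen 80;
    # static files served directly by the webserver
    location /static/ { root /var/www/myapp; }
    # everything else passed through to the Tornado/Flask instance
    location / { proxy_pass http://127.0.0.1:5000; }
}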

hello world from their respective websites:
(Flask)

from flask import Flask
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run()

(Sinatra)
require 'sinatra'
get '/hi' do
  "Hello World!"
end

Wednesday, March 21, 2012

Programmer DNA in your code - "In Plain Sight"

In a recent "In Plain Sight" episode the writers tried to suggest that the bad guys, given enough CPU cycles, could identify and geographically locate an individual programmer based on his code's DNA.

Seriously?

As an undergrad I implemented some algorithms that could compare two documents and determine the likelihood of plagiarism. But that involved (a) the English language and (b) comparing two known documents.

I would like to think that I'm not the only one who realizes that matching two snippets, or even complete source trees, is about as likely as proving program "correctness" (when I was an undergrad that proof could not be completed).

First of all, the number of syntax permutations is infinite, and the problems they solve are equally large. Variable-name substitution does not count, and with applications like gofmt and IDEs that reformat and inject comments... one is more likely to identify the IDE than the programmer.

Tower App Missed the Runway - BitBucket

I like Tower. It's a strong GUI in front of a DVCS that can be hard to manage if you're not completely versed in its execution. Even with quality books from PragProg it's still a challenge. So Tower is welcome.

This week they announced support for BitBucket. BitBucket is a DVCS hosting service based on Mercurial instead of Git. (BitBucket does have a Git option when creating repositories.) So when the announcement was made I was pretty excited.

I use both GitHub and BitBucket. GH is perfect for my public projects and BB is perfect for my private projects.  It's all about the cost.

Naturally I was excited when they mentioned the BB support. I thought... one tool, two repos.

But that is not the case. There is little or no documentation on this feature. The "manage repositories" dialog is very unclear and the behavior seems to be adaptive; to what, I do not know. The one thing that is missing is a statement like:
While we have support for BitBucket, we are still limited to Git repositories.

That would make a lot more sense. I'm not certain that I would convert my repos to Git, because I happen to like Mercurial better... for no particular reason other than the fact that as a tool it is implemented in a single language, unlike Git which uses several. That has the potential to fail to build when versioning goes bad, and it has in the past.

Oh well, maybe later.

Saturday, March 17, 2012

Sparrow (Mail app) falls short

SparrowMailApp is actually a functional mail application, both on the OSX desktop and on the iPhone. The catch, however, is that it is standalone and it simply does not integrate well.

On the desktop, unlike MailplaneApp, it downloads and caches all sorts of data even though it's using IMAP. If you have one of those new MacBook Air laptops then you probably do not have enough disk space, and the sync process when you use multiple computers is just silliness. Also, if you use GMail heavily, as I do, then you probably also use Google Voice; unless you have unlimited minutes on your phone you'll need Google Chat in order to use your computer as the endpoint.

The iPhone app is another nice try if you use it as a standalone. (a) It does not use notifications, (b) it does not check your email in the background, and (c) it does not integrate into other applications like Instapaper in order to email links or texts. There are some workarounds like Prowl and BoxCar, but after using it for a day with Prowl it felt clunky and I restored the default MailApp. Part of the problem here is going to be Apple's security and sandbox policy. I hope that this is not simply a developer problem.

I'm waiting for my updates.

Crossroads has forked ZeroMQ

I understand the mission but Crossroads has a long way to go. They should not be at version one until most or all of the client libraries fall in line.

I'm all for a more stable source, but their major complaint might be better resolved on GitHub rather than with a fork.

Good luck either way. I'll be watching.

Friday, March 16, 2012

What is the current employment situation?

I always follow several job boards regardless of my work status. The quantity and quality of the postings act as an indicator as to the type and availability of potentially interesting and lucrative opportunities. Sometimes you can sense a migration from contract work to fulltime and vice versa as companies adapt to the economics.

Obviously the best opportunities come from personal connections and not cold calls.

According to news and government reports there are lots of available jobs out there, suggesting that employees have the high ground. So it was interesting when I spoke to a professional recruiter the other day and he said that employers currently hold the upper hand: employers are reluctant to hire, candidates have to be a perfect match, and if you want that next job your resume needs to read exactly like the requirements.

Keep in mind that if a recruiter has a perfect position for you, and they know it when they see it, you could present your resume using animal hide, rice paper or smoke signals and you'd still be perfect for the job.

Recently I changed my resume format from the standard worked-here-and-did-that type resume to more of a narrative approach. My goal is to present a successful generalist instead of a specialist, even though I specialize in several verticals. Now I'm working on a hybrid version. It's taking me beyond my goal of a single page (no staples or lost pages) and it may dilute the narrative a little. I'm hoping, either way, that it initiates a conversation.

Saturday, March 10, 2012

JSON and more JSON

A few articles ago I wrote about processing CDRs from an Asterisk server. One of the design decisions I made was using very simple JSON messages as the message payload from the publisher to the subscriber. As most are familiar, JSON is a key/value container represented as UTF-8/ASCII text. The format of the message is well documented and widely implemented.

There are a number of alternatives to JSON, like BSON, s-expr, msgpack and a few others. I settled on JSON because it was trivial to hand-code a JSON payload without having to make any decisions, and because it maps to/from hash datatypes in Python and Ruby.
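That mapping is the whole appeal. A minimal sketch in Python, where the fields are a hypothetical trimmed-down CDR:
import json

cdr = {"callid": "abc123", "duration": 42, "disposition": "ANSWERED"}
payload = json.dumps(cdr)           # dict -> JSON text for the wire
assert json.loads(payload) == cdr   # JSON text -> dict on the other side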

In the article I wrote the publisher in C and the subscriber in Ruby.

The publisher was not an easy decision, but by the time I committed to C I had a plan, and it's just one page of code for JSON, Redis and a bit of parsing. (This code had to be fast and small for the reasons discussed in the article.)

The subscriber was a different story. It did not have to be all that fast... In hindsight the Ruby code worked as a proof of concept and I wanted to reduce the memory footprint of the subscriber; however, as I look at the output of "top" I see that it has consumed only 54K of memory. Not bad. I'll have to watch it over time to see if there are any leaks.

All that aside, I was thinking that I would implement the subscriber in C. I knew that one of the complications of JSON+C was that C does not have a native datatype that the JSON could be parsed into, so the code would have to work more like strtok(). I suppose that's not much of a problem but it's going to add code on my side to implement. Big deal: 20 or 30 lines of code.

But what was also interesting was the wide variety of implementations in the different libraries that the JSON project links to. Some of the projects were small and some were outrageously huge. I suppose there is more than one way to skin a cat. At least in this case, while the intellectual pursuit is interesting, that's about it. The Ruby solution is memory-friendly for the moment.

Where is TopView now?

Back in the simple DOS days, and prior even to Windows, there was a movement to get multiple applications to run on the desktop at once. There were several solutions. Some were text only. Others supported text and/or graphics. Almost all of it was 8-bit color. Of course the first Apple Macintosh was released in 1984, but it was not doing any protected-mode work and the more advanced processors had only started to roll off the Intel line.

One of my favorite operating environments was TopView. By today's standards it's not that impressive, but then again, given the amount of memory, disk, and CPU we had in those days, the team did an amazing job. I met the team of developers once. They worked in a set of ten-ish small windowless offices that could only be described as a basement (as basements go in Florida). It truly felt like a dungeon.

Some time later Quarterdeck released a product called DESQview. I used and upgraded that puppy for years. My operating environment is not much different today. One or two applications are in distraction-free mode and the others are in overlapping-everything mode, with at least one terminal window open to run my programs in.

In terms of productivity I do not think that anything has changed much. When we wrote code back in the day we watched our memory and CPU diet carefully. We wrote less, faster code. It required less testing and fewer people. Today we write more code faster and it takes more people to test it. And maybe we finish in about the same time.
Ah... back in the day...

Friday, March 9, 2012

Reliable Asterisk CDRs

This is going to be a long and technical article pertaining to the capture of CDRs (call detail records) using the custom extension file and an AGI script.

I designed, built, deployed, and maintain a number of Asterisk servers which are used as part of a VOIP arbitrage system.

The first-generation system, which I inherited, was a single box that housed the Asterisk server, the database server for CDR and other billing and reporting data, and a PHP webapp that reported on the data. The system worked (a) when the volume was low and (b) when the overall amount of data was low. Needless to say I was brought in when the "system" (1) started hanging, (2) started losing call detail records, and (3) the webapp could not return results before the browser timed out. It was a mess.

The new system uses multiple machines (n+1). There can be a number of Asterisk servers connected to a dashboard. The dashboard is where all the data is stored, where the ETL is performed, and where the reporting is initiated. The following design supports about 5000 channels on fairly moderate hardware; will auto-restart/recover if there is a crash; and will operate independently of the dashboard.

Here's how it works. In the VOIP reporting business we live and die by the accuracy of the CDR. In VOIP there is no association equivalent of Visa or MasterCard that sets the rules or arbitrates the discrepancies. It's always going to be "on you". Therefore it's always important to get the data from the switch. RDBMS people like to call the transaction ACID. Something very similar applies here.

In the basic Asterisk installation there are a number of ways to get the CDRs from the system. You can export them directly into flat files or directly into one of several brands of SQL databases like SQLite3. The problem with this approach is that the database is expensive in terms of resources and the flat file is inefficient because it's one big file. This is additionally cumbersome when you're trying to report and monitor in realtime.

My strategy is twofold. (1) Export the CDRs to a small flat file and change the flat file once a minute. (2) Then send the flat file to the dashboard server for processing. This is surprisingly efficient and it allows the system to continue to process calls if the dashboard is rebooting or in maintenance mode.

While this approach has been wildly successful there is still some room for improvement. The first improvement went live today.

Today's challenge: When Asterisk receives an incoming call it authenticates the source and then tries to locate a route for the call (or the destination). The routing of the call takes place in a file called extensions_custom.conf. In this file you'll see some "code" that is more of a macro or script than an actual programming language. This macro tells Asterisk what to do with the incoming call and, at the end of the call, to hang up. There are some other more complex functions like interactive voice prompts and voicemail, but we're just interested in routing. When the call completes we have to initiate a "hangup" and then we need to record the CDR.

So based on the approach above when the call was terminated (hangup) control would be passed to a 3rd layer script (through the AGI interface). This script could be written in any language and it would collect all of the data from the call and append the CDR to the flat file.

So let's review:

  1. call initiated

  2. authenticate the call source

  3. check the extensions config for an appropriate route

  4. when the call is complete hangup

  5. send the CDR to a PHP script through the AGI interface


Step #5 looks like this:
exten => _X.,n,Hangup
exten => h,1,Set(CDR(userfield)=Hangupcause:${HANGUPCAUSE} Qos:${RTPAUDIOQOS})
exten => h,n,AGI(cdr_new.php,${SIPCALLID},${CDR(dcontext)},${SUPPLIER},${CDR(start)},${CDR(duration)},${CDR(billsec)},${CDR(disposition)},${HANGUPCAUSE},${V_NETWORK},${CDR(lastapp)},${DEST})

The module that was replaced was "cdr_new.php". The new module took the same parameters and was called "cdr_pub".

The problem with the original PHP code was that it processed the incoming data, created the target filename... and then opened the target in order to append a record. It has been working great, but we are at a point where we might be losing some CDRs (this is not definitive, just intuition). With 5000 channels running there can be as many as 5000 instances of the routing process. That means when 5000 calls terminate at once there is a rush to append their CDRs. It's simply not efficient for PHP to block while appending to the file. Not to mention the overhead of loading the PHP interpreter with each call completion.

The performance issues:

  • the latency to load php with each call completion

  • the possible deadlocks when more than one process tries to append to the same file at the same time. Blocking and resolution are not guaranteed.


The new plan: I rewrote the PHP script in C. Even with the few libraries I needed it's not more than 20 or 30K. Since it's native C it loads very fast. This program gets all of its data from the AGI in the form of command line parameters and data on STDIN. Then, instead of rushing to append the data to a file, the small program "publishes" the CDR to a redis pub/sub queue. There is a single external application that subscribes to the redis queue, and when a message event arrives that external app writes the CDR to the flat file. Since there is only one app appending to the flat file it cannot have the same contention problems.
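The production publisher is C and the subscriber is Ruby; as a sketch of the pattern itself, here are both halves in Python with redis-py (the channel name, CDR fields and file layout are assumptions), run as two separate processes:
import json
import time
import redis  # redis-py

r = redis.Redis(host="localhost", port=6379)

# publisher process: one short-lived instance per call completion
cdr = {"callid": "abc123", "duration": 42}  # hypothetical CDR fields
r.publish("cdrs", json.dumps(cdr))

# subscriber process: the single long-running writer
pubsub = r.pubsub()
pubsub.subscribe("cdrs")
for message in pubsub.listen():
    if message["type"] != "message":
        continue  # skip the subscribe confirmation
    # one writer, one file per minute: no append contention
    fname = time.strftime("cdr-%Y%m%d-%H%M.log")
    with open(fname, "a") as f:
        f.write(message["data"].decode() + "\n")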

One side note. If the publisher fails then the message event is posted to the syslog. And if the subscriber fails to append to the flat file then it also posts an event onto syslog. If something goes horribly wrong (with the exception of disk space) then we should have a chance to replay the calls in the dashboard by scrubbing the syslog file.

PS: one side note. This configuration also limits the number of simultaneous channels. Therefore, if the CDR recording process blocks for any reason, that will prevent the system from accepting the next call when the system is running at capacity for that source.

PS: the subscriber app was written in Ruby. Installing Ruby on my production Asterisk server was not my first choice but it was worth it. The Ruby code was compact and it handled exceptions nicely. There were some idioms that I liked a little more than Python's. And because some of the development took place in Ruby 1.9.3 while the default version on the server was 1.8.7, I did have some challenges getting it to run and needed to install some additional packages... which, as a side note, confirms all of my previous beliefs about full stack awareness.

PS: One last note. When deciding on the publisher implementation, after abandoning C over its lack of a JSON library that made sense, I tried Go and then considered Java, other JVM-based languages and several dynamic languages... In the end C was the only choice because of its size, load latency and runtime.

Thursday, March 8, 2012

Lua is "good enough" development at it's worst

I'm trying to rewrite my Asterisk CDR capture program from PHP to just about anything else. I inherited the PHP version of the program from a programmer who did not know anything about concurrency or performance. Inside the Asterisk configuration there is a file "extensions_custom.conf".  This file contains all of the scripts for the individual PBX extensions. And when the call is completed the last thing the script does is capture the CDR.

The first thing that you might say is that I should use the built-in CDR recording. I would, except for two important reasons. First, my extension file contains many customizations, so I need the CDR to reflect that information accurately. Second, the built-in capture does not really address concurrency from the reporting/admin side of the equation. Regardless of the database or storage, that info needs to be taken off-box so that it can be reported on... think map/reduce.

Back to the PHP code. I have the following challenges: (a) it's PHP, and the interpreter needs to be loaded with each call completion; (b) the output is a text file in a log directory, a standard Unix file with no concurrency mechanism or file-lock detection and avoidance. Needless to say the PHP version must go.

My first replacement was going to be built with golang. I wrote a separate article on that experience. In summary 3rd party libs FAIL.

My second choice was Lua, partly because the Redis team was using it for scripting and so I wanted to learn it. The work was not hard, but the documentation was not written like any language manual I had read previously. The best and worst part seems to be that the built-in functions are strictly limited, the idea being that 3rd parties would deliver the missing pieces.

At one time or another that was probably true, however, while trying to get the missing bits to work I found that the LuaForge project is practically dead. The individual projects are starting to scatter to the 4 corners.

As a side note, it seems that a great many developers are still using Lua 5.1 even though Lua 5.2 has been available for about a month. The common response has been that not much had changed to warrant the upgrade, although there were some compatibility issues.

In the end I simply do not understand the Lua ecosystem, and it's not that critical because I can code this project in C directly. I wanted to give Lua a fair chance after all the previous comments I've made. Basically Lua is still crap.

golang 3rd party libs let me down

[UPDATE 2012-03-28] Google released version 1.

I'm working on a presentation project and one of the demo apps is a pub/sub that I need for a VOIP customer.

Currently the app is written in PHP. It's executed when each phone call completes and it writes directly to a log file. Both attributes are bad, which I was hoping my demo app would correct. And while I've considered C, I'd rather not for the moment.

golang has a fast and functional API and language. The target executable is statically linked and a reasonable size. My concern is that the beanstalkd and redis libs are old. So before I pushed my first message I abandoned golang for Lua.

Wednesday, March 7, 2012

commit often?

[Update 2012.03.07] - I forgot to mention that fork'ing, rebase'ing, and some other basic features make it easier to be a write-only developer. One of the critical paths is when you update/merge your local repo with the remote base while you're actively developing. Yes, there are some procedures to minimize the risk like commit then pull and merge, however, this does not isolate your changes nicely. A subject for a completely different conversation is "one code repo" like Google or "one code repo per project"?
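For the record, that procedure looks something like this with git (a sketch; Mercurial is analogous):
git commit -am "wip: checkpoint local work"   # commit first so nothing is lost
git pull origin master                        # then merge in the remote changes
git pull --rebase origin master               # ...or rebase to keep local commits on top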

I have participated in a number of strategies for commit'ing code to a repository. Each strategy has side effects, unintended consequences, and unimaginable consequences. The common strategies are:

  • commit often

  • commit after a complete thought

  • commit daily regardless of state

  • commit after a complete feature


When beginning a new project I like to start with a fresh directory/folder; lay out the first set of project files, which could be created by a generator like Rails, Django or an IDE; and then I like to take a commit snapshot. This gives me a solid "initial commit/import" so that any false commits from this point can always be rolled back to right here.
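With git that initial snapshot is nothing more than (hg init/add/commit is the same dance):
git init
git add .                        # stage the generated project files
git commit -m "initial import"   # the rollback point for false commits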

Commit'ing often protects the project from hardware and human error, and when you are in a team setting where the scope is narrow or there is a high level of mutual dependency, commit'ing often seems like a good idea. The problem here is that rolling back commits can be a very painful procedure, especially when commits are things like "updated a comment". The history is just not on par.

Commit'ing after a complete thought seems like another good idea. For me, however, a complete thought can take days to commit. Typically there is a POC (proof of concept) and then a few cycles of refinement. I hate to commit the POC only because peer review can be brutal when not taken in context.

Commit'ing daily regardless of state is probably the worst idea ever. This is guaranteed to break the build. In the book Debugging The Development Process the author talks about not commit'ing unless it builds locally without errors and passes the regression testcases. PragProg has a book, long since forgotten, where they first introduced me to CI (continuous integration). Someone hooked up multicolor lava lamps to the CI machine and any time the build broke the red light would be on. And the person who commit'ed the broken code became the librarian.

Finally, commit'ing after a complete feature. This is not unlike commit'ing after each thought; however, in this case you have a better connection between the feature request, aka story (I hate that usage), and the actual code. I don't really like the idea of a project request having n number of pull requests. I remember a teammate/librarian; she was constantly pulling requests and merging back into trunk. It was simply painful to watch.

As an aside, commit'ing for the sake of code reviews or metrics is a mistake. Programmers will eventually find a way to game the system, whatever system you set up. LOC (lines of code) has long since been debunked.

So what is the best practice? It's probably somewhere in between. It depends on the team and their capabilities, the scope of the project or projects. The important thing, however, is to be flexible.

Tuesday, March 6, 2012

IntelliJ+Grails is not that bad sort of...

Essentially, JDK 7 is OpenJDK, and there still are known issues with running IDEA under OpenJDK. Oracle JDK 6 works best in our experience

I have been writing Java code since its first release. At the time it did not do much so I did not actually deploy anything. Around the time of version 1.0.2 it was stable and robust enough to build production applications that would run under both Sun and Microsoft operating systems.

Happily I moved away from the JDK and toward perl, python, ruby and erlang only to be dragged back into the fold. Java has changed a lot over the years. (a) there are a lot more APIs, (b) tools and IDEs are everywhere. Game changers include many of the Apache projects and a good number of 3rd party libraries.

Now I'm getting back into the swing of things. IntelliJ, from JetBrains, is as I remember it. JB is undoubtedly pushing the capabilities of the environment, the language and the different SDKs. But on many levels it feels unnatural.

Take the quote at the top of the article. I do not know the guy who wrote the quote or his affiliation but I do know that versioning and deep dependencies drive me nuts. Also the notion of private and protected functions and classes is a joke and flies in the face of full stack awareness.

The thing that gets me where I live is that the first part of the JDK expectation is that it runs everywhere. The second is that JDK versions are not supposed to matter, when clearly they do. Or at least they matter a lot more than I want them to.

Twitter, Facebook, when is it time to drop all your social networks?

Just a few minutes ago I saw a LinkedIn post from a former coworker. It read "excellent read" and then some shortened URL. I used to follow this guy because he was a CTO that I respected while I worked with him, but recently, since I started following him on Twitter, I realized that he did not have a lot to say.

One would think that a CTO for a technology company would promote the company, promote the technology or even the people. No. This guy tweets about pizza, wine, and his last great kill in the wild. So I unfollowed him. But when "excellent read" popped up on LinkedIn I found myself asking... just for a second... could he be publishing the next piece of disruptive advice that would propel me into financial and intellectual nirvana?

Meh! I doubt it and I'll never know.

In the meantime I find myself asking some of the same questions. What is all this social stuff good for? The Zuckerbergs of the world would have you believe that we all need it like we need air. The fact of the matter is that all social networks are time suckers. In order to keep up with your email from work, Facebook, messages and so on... you'd have to dedicate your whole day. And all of the micro-interruptions that time consultants used to tell us about are coming back around.

There is nothing like a clean desktop policy and yet I have two screens and a second computer all directing interruptions at me.

Monday, March 5, 2012

Python is Better than Perl - The Killer App.

Back on July 5th, 2011, I wrote a post about how perl was better than python. Since then I have a new opinion. Partly because of the cruft I received from the Mojolicious community. It's sad because perl is a great language and the CPAN community is also very strong. That a few curmudgeons can ruin it for the rest is sad. As an editorial comment they are only hurting the language they are protecting so much.

One of the biggest challenges with languages like Perl, Python and Ruby is that they are installed as part of the base OS, especially in most Linux distros. (Perl has been there from as early as I can remember; recently Python and Ruby have been added to the base install.) This is not a bad thing, but these languages are under constant development and constantly evolving, making current applications obsolete or non-functional.

This is also a HUGE challenge in the development cycle, when dependencies are constantly moving and yet developers need to make progress. So recently I started spending time with virtualenv. I'm also using workon, from the virtualenvwrapper package.

In the Perl article I was all gaga over CPAN. The Python folks have PyPI, which does a great job. The best part is that submitting your own packages can be pretty simple if you use the modern-package-template and some of the built-in tools. My sense is that submitting to CPAN can be brutal.

The best part of virtualenv is that you can specify the version of Python, install specific versions of packages, and basically encapsulate the entire application environment away from the base OS version. The next step is going to be deploying with a virtualenv install in production, but for the moment the current challenges have been met.
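A minimal sketch (the interpreter version, path and package pin are examples):
virtualenv --python=python2.7 ~/envs/myapp   # pick the interpreter
source ~/envs/myapp/bin/activate             # enter the environment
pip install Flask==0.8                       # pin an exact package version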

Sunday, March 4, 2012

Dell XPS13 or MacBook Air 13?

I'm not going to make any statements about the Dell other than to say that it's probably a nice machine and it probably runs Microsoft Windows really well. It might even run Linux after you've already paid the MS tax.

But looking at the feature functions and comparing them to the MBA... Ya gotta go with the MBA.

  • OSX is a better OS than Windows, and if you have to pay the tax then consider the cost of an OSX upgrade and the fact that there are only two versions where MS has 8 or 10 versions. Not to mention that the Server version is only slightly more expensive.

  • The MBA certainly allows dual boot and supports Windows and Linux.

  • Neither system permits memory upgrades (Apple has a 2GB model; most are 6GB)


I'm abandoning the comparison here. The hardware is pretty similar. The cost difference is marginal. Which is the better company and where is your application investment?

Saturday, March 3, 2012

Contractor or Consultant?

[Update 2012-03-04] Wikipedia has a great definition of "consultant". (Everyone is an expert)

Scott Berkun wrote an interesting essay; or so I thought, because it struck a chord. The article was written in 2007, and before I realized that there was a followup to it in 2008 I had already replied.
BRILLIANT! Most of these are easy to understand although they are an unpleasant reality.

ADD is cause[d] by an under-qualified individual in power and in fear of competition for their job [or intellectual cachet], and so this is the only way to be the hero.

CDD happens when an ADD hires someone only to find out that the other person is smarter, faster, and better read; however, ADD-man is still ADD-man.

CYAE is also an ADD. In this case ADD's boss usually makes a hiring decision that ADD does not approve of. Then ADD goes into CYA mode and will sometimes go into get-the-other-guy-fired mode.

DBD is also attributed to ADD-man. See ADD-man has read a few articles and magazines but he has no real experience. So he uses the words but does not really know what it’s supposed to do.

GMPM, yikes, another ADD-man. This guy just wants to be the hero. He takes 2 months to fill a position that has been vacant for 6 months. The candidate is usually a slug and will eventually be fired or asked to clean the bathrooms.

ADD-man is everywhere.

[Update] a great book to read is Herding Cats.

Chronologically I read the third article out of order. In this article Scott takes offense at the behavior of a single consultant and seems to generalize over the entire profession. I replied to that article too.
I call bullshit! Since I’m writing [without] the benefit of having read the other comments I’m going to head for the heart of this.

There is a difference between a consultant and a contractor. A contractor is there to do your bidding [dig ditches]. In many circles they call this workforce supplementation.

Consultants, on the other hand, are there to do a different job. They are supposed to be subject matter experts (SME). They are supposed to cost more and they are supposed to get the job done. They might be there to train the staff or they might be there to implement some new function without distracting the rest of the team, especially when the existing team is already engaged. The best part of a consultant is they are highly expendable and they know it. (They tend to work on critical yet very much proof-of-concept projects.)

[The best part of a consultant is they are highly expendable and they know it. Their assignments are typically measured in weeks and months where contractors can be measured in months and years.]

Personally I’ve seen bigger assholes in the established teams because they think they are the shit. Think ADD.

Yep, I take a little offense. Contractors and consultants are not the same service providers or service. And using a consultant when you need a contractor is as much the consumer's responsibility. Consultants are people too. They want to do a good job. They want to get paid for a good job and in many cases they want to add value so that you might invite them back. Clearly the individual dynamics of the team are special and in many cases hiring a consultant that has not been screened by the team can offend some. But that's a different problem. One that needs immediate management attention and not a rant from an ADD.

Finally... In this case did Scott just become the ADD?

Hire Me

I just started an open/public project meant to capture some of the really subjective material when it comes to system design and implementation. I'm trying to mirror the projects on both GitHub and BitBucket. So far I've managed to keep the README files intact; however, I think it's going to be a challenge to code and deploy multiple programming languages etc... and still get the mission across.

On the other side of the coin, integration is one of my specialities.

In the meantime I'm currently in the market for a contract or a full-time opportunity. I'd prefer a well funded startup or a new venture within an established organization. Either way let's talk.

The README.md:
Welcome
This project has been created in response to the many recent articles that have been written and the number of job search services which feed on public source repositories. Part of me wants to find a novel way to game the system but I have a higher purpose and that's the message I want to get out.

If you want to know more about my writing you can always head over to my blog. (http://richardbucker.com). Depending on when inspiration hits I tend to write a new article about once a week and I cover a wide variety of topics.

As a general rule, however, while I have my opinions I have a good sense of what is opinion or intuition and what is fact. My training in and application of root cause analysis has been an eye-opener in knowing the difference.

Goals
My goal is to land the perfect job in the perfect environment for the perfect company. I'm not particular about startup or established, fulltime or contract, or the vertical market, programming language, project management style... although I have my favorites. My mission is to provide a high rate of ROI.

Next
Over the next few days I'm going to populate this project with code that I've written specifically for this purpose. Some of it will run and some will be gists.

Warning
The amount of time I put into this project is going to be completely inversely proportional to what happens in real-life project development. The application of comments, the selection of variable names, even sample code complexity; such nuances are more subjective than ever.

Thank You
Thank you for taking the time to read through this project. If you're a hiring manager or just a passer-by; if you have some comments I hope you'll forward them to me (richard@bucker.net).

Copyright
The content of this website is Copyright (c) 2012, Richard Bucker. I'd like to share, and I believe the public nature/license of this repo means it should be fork-able. So have at it; however, please keep my copyright intact.

Thursday, March 1, 2012

iterm2, tmux and the ever-present security

Being a freelance consultant I worry a lot. I worry that I might lose or misplace my laptop, or worse, that it falls into the hands of someone with less than honorable intentions. Of course you might also install a trojan or be attacked by a virus through any number of vectors.

As a result my clients' secret sauce falls into the wrong hands; or maybe my family's private information is leaked like credit cards or SSN.

This and far worse is possible. Unfortunately there are no absolutes; not even if you built your OS and applications from scratch. First of all there is not enough time to code review everything you'd need. You are probably not a programmer, and if you are, there is only a slim chance that you can code everything from a video device driver to a web server and a word processor. (There are only a few people on the planet who could, and I'm certainly not one of them.)

So the best way to protect yourself is a layered approach.

  • Pay for your hardware from somewhere reputable; HP, Dell, Apple.

  • Pay for your operating system or at least get it from a source with a profit motive. Red Hat, Fedora, Ubuntu, CentOS, Microsoft or Apple.

  • When you are installing free software, look for the profit motive. If you find one then it might be safe. If not, avoid it and look for something to pay for. OpenOffice is a good choice because it was once part of Sun, but before that it might have been questionable.

  • The same can be said for websites, RSS feeds, torrents and so on.

  • And have some checks and balances. For example, I use Little Snitch and Apple's firewall software to make sure that applications running on my computer do not have random access to the internet.


The profit motive is a strong magnet. It's what drives the thieves and it's also what will protect you.

So as I sit here playing with iTerm2, which I have been using for a long while, and tmux, I'm starting to get a case of butterflies. I'm confident that these programmers are good and lawful, but I don't know them personally. The fact that one of them could put in a key logger and then stream that data to their servers makes me sick. (Hopefully Little Snitch would catch it, but it's not foolproof.)

Anyway, practice safe computing.

another bad day for open source

One of the hallmarks of a good open source project is just how complicated it is to install, configure and maintain. Happily gitlab and the ...