Posts

Showing posts from June, 2011

Pyzmq - not a lot of best practices

ZeroMQ is a Message Queue (MQ) framework that plainly works. Two of the most interesting elements are 1) ZMQ supports a number of client languages; 2) the broker (generally an application that exists to route traffic from a producer to a consumer) is left to the programmer to implement.

If you've read about or used any of the other MQs, or if you've done some interprocess communication (IPC) before, you probably have a good general idea of how this is supposed to work. RabbitMQ does a really good job of naming the different patterns and keeping the list to something manageable. The ZMQ doc is long and detailed, but it lacks examples for each of the client languages, and the examples it does have are simple and often buggy or old; on the other hand, it covers many more patterns than RMQ.

One idea that keeps getting trapped in my head is: how do I send a request to the broker and wait for a response? And if the response does not arrive, then what? Basically I'm looking for a best practice here.
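One way to think about the "then what?" part is a bounded retry: send, wait up to a timeout, resend if nothing comes back, and give up after a few attempts. The sketch below is not pyzmq code. It uses plain Python queues as a stand-in transport, and `request_with_retry`, `send`, `recv` and the toy broker are all names I made up for illustration. With pyzmq the waiting step is usually done with `zmq.Poller` and a poll timeout, and a REQ socket that timed out has to be closed and recreated before resending.

```python
import queue
import threading


def request_with_retry(send, recv, payload, timeout=0.5, retries=3):
    """Send payload and wait for a reply; resend on timeout.

    send and recv are hypothetical callables standing in for the transport.
    recv(timeout) must raise queue.Empty when nothing arrives in time.
    Returns the reply, or None once every attempt has timed out.
    """
    for _ in range(retries):
        send(payload)
        try:
            return recv(timeout)
        except queue.Empty:
            # No answer in time: loop around and send again.
            continue
    return None


def demo():
    requests, replies = queue.Queue(), queue.Queue()

    def flaky_broker():
        # Toy broker: silently drops the first request, answers the second.
        requests.get()
        msg = requests.get()
        replies.put(b"ack:" + msg)

    threading.Thread(target=flaky_broker, daemon=True).start()
    return request_with_retry(
        requests.put,
        lambda t: replies.get(timeout=t),
        b"hello",
        timeout=0.2,
    )
```

Calling `demo()` loses the first request, times out, and succeeds on the resend. What to do when `None` comes back (queue the work, alert, fail the transaction) is still an application decision.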

In my application des…

NoSQL for your next project?

I keep going back and forth on the whole idea of NoSQL and it bothers me to no end. On the one side the idea of sharding the data at the server level is appealing. Then there are the Key/Value databases and then the document versions. There are graph databases, object databases and a few in between. But then, as always, there is this reality check.

As wonderful as NoSQL would appear to be, there is no single use-case that would seem to make it/them the obvious choice. And this is never more obvious than when I stumble around a new project I'm considering... an open source merchant gateway and an open source issuing system. I would really like to have one and only one system that I could use for everything but that does not seem possible.

For example, in order to deal with the protocol impedance I need a FIFO queue of some kind. I like Redis for this as it also has a TTL so that old data is simply removed from the queue. It also has the notion of fields in the value so that a single record …
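To make the "FIFO with a TTL" idea concrete, here is a minimal in-memory sketch. It is not Redis code and does not use any Redis API (Redis expiry is set per key, so a real design would need to map queue entries onto keys somehow). `ExpiringFifo` and its methods are hypothetical names used only to show the behaviour I'm after: stale entries quietly disappear when the queue is read.

```python
import time
from collections import deque


class ExpiringFifo:
    """In-memory FIFO whose entries are dropped once older than ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock      # injectable so the behaviour can be tested
        self._items = deque()   # (expires_at, value), oldest on the left

    def push(self, value):
        self._items.append((self.clock() + self.ttl, value))

    def pop(self):
        """Return the oldest live value, or None if nothing is left."""
        now = self.clock()
        while self._items:
            expires_at, value = self._items.popleft()
            if expires_at > now:
                return value
            # Entry outlived its TTL: discard it and keep looking.
        return None
```

Expiry is checked lazily on `pop`, which keeps the sketch short; a long-lived queue would also want a periodic sweep so that unread entries do not pile up.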

A new approach : HamsterDB

Revisiting my favorite subject again, credit card processing, the hamsterDB's description on the NoSQL website triggered an alarm.
hamsterDB - (embedded solution) ACID Compliance, Lock Free Architecture (transactions fail on conflict rather than block), Transaction logging & fail recovery (redo logs), In Memory support – can be used as a non-persisted cache, B+ Trees – supported
The key words being "lock free". In any typical CC issuing system you can expect to see transaction times from 50 to 500ms depending on the amount of work the authorization system has to perform, DB latency and locking.

Typical transaction workflow looks like some code that just tries to get some data from the DB, do some work, get some more data from the DB and do some more work. And while performing I/O with the DB you always have to be ready for a failure. Typical failures are deadlocks, consistency because another process updated a record and so on. And when you think about the breadcrumbs a…
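The "fail on conflict rather than block" idea can be sketched with a version number on each record: read the record and its version, do the work, and write back only if the version has not moved. This is a toy illustration, not hamsterDB's API; `VersionedStore`, `ConflictError` and `authorize` are names I invented for the example.

```python
class ConflictError(Exception):
    """Raised when a record changed between read and write."""


class VersionedStore:
    """Toy key/value store that tracks a version number per record."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            # Someone else updated the record: fail instead of blocking.
            raise ConflictError(key)
        self._data[key] = (current_version + 1, value)


def authorize(store, account, amount, max_attempts=3):
    """Debit amount from account, retrying when a conflicting update wins."""
    for _ in range(max_attempts):
        version, balance = store.read(account)
        if balance is None or balance < amount:
            return False                  # declined
        try:
            store.write(account, version, balance - amount)
            return True                   # approved
        except ConflictError:
            continue                      # re-read and try again
    return False
```

The caller never waits on a lock; the cost of contention shows up as retries instead, which is easier to bound and to measure.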

Current News and Updates

Google released version 1.5.1 of App Engine. They added some significant APIs and features, however, in my mind it's missing a GO update.

Tornado has been in version 1.2.1 for a long time... and the developers just released version 2.0. (download here) Looking at the release notes there are 3 major updates and several minor. Many of the minor updates are prerequisites. The most impressive will undoubtedly be support for Python 3.2. However there may be some minor backward compatibility issues.

Should Go replace my use of Python?

Here is an interesting post that posited the question in my title: Experience porting 4k lines of C code to go http://bit.ly/jm0Qws

There are a lot of reasons to use Go. I like that it's from Google but I don't like that there is a release-often approach. I need something that is a little more stable than that. Granted, this offers some justification for using goinstall to deploy and update packages as new releases of Go are made available. There is also something to be said about the monolithic codebase; however, that flies in the face of this deploy approach.

But I like the compiled performance, channels and the wealth of packages (though it needs more, like a performant web framework, templates and production-ready database adapters).

Go, while cool, is still a little half baked, whereas Python and Perl are still up to the challenge.

"The Network Is the Machine" : Erlang is not all that

I like Erlang, and I like it most because it solves a number of problems; however, the problems that I think it solves in general application development are not the kinds of problems that most Erlang programmers want to solve. For the life of me I cannot understand why Erlang programmers would implement a database like Riak. It's a complicated undertaking, and frankly, considering how deep the callstack has to be at times, it does not seem practical without a real debugger.

As I consider the amount of work that it takes to implement a single credit card transaction I realize that the entire callstack is going to consist of a few thousand instructions regardless of the language. The hardest part of a credit card transaction is the DB record versioning and not the actual in-memory workflow.

So then we start talking about the threading and IPC. MEH. I no longer care about that stuff. Not even for a second. With libraries like ZMQ "we" should reconsider how we allow pr…

The Really Big O

Get your mind(s) out of the gutter. I'm actually thinking about benchmarks not bed-marks.

Back in the olden days we used to refer to a program or algorithm's performance profile in terms of O-notation. I'm pretty sure that most computer scientists still follow this mantra and for the most part it's probably still true.

So if it is the case that O-notation is still a real form of measure, then why do most languages have different performance profiles while they perform the same work? For example, 100K compares is the same when it all boils down to the assembly instruction that makes the decision:
CMP AX, BX
JE #SAME_VALUE
When it comes down to it every language makes decisions the same way. So why such different profiles? I say again.
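Part of the answer is that O-notation counts how the number of operations grows with the input, and deliberately ignores what each operation costs, which is exactly where languages and libraries differ. As a small sketch (the function names are mine, and counting one "compare" per loop iteration is a simplification), here are two searches that report how many compares they made:

```python
def linear_search(items, target):
    """Return (index, comparisons), scanning left to right."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1
        if item == target:
            return i, comparisons
    return -1, comparisons


def binary_search(items, target):
    """Return (index, comparisons) on a sorted list.

    Each loop iteration is counted as a single three-way compare.
    """
    comparisons = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if items[mid] == target:
            return mid, comparisons
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons
```

On a sorted list of 100K integers the linear scan needs up to 100,000 compares to find the last element while the binary search needs at most 17. Those counts are the same in any language; the time spent per compare is not.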

First of all I think that the trouble lies in the libraries. I'm not convinced that the same care is put into every library so that the minimal number of instructions is executed. The reality is... how much code needs to execute on eit…

BerkeleyDB in Payments

BerkeleyDB is awesome... but I liked it better when it was a part of SleepyCat and not Oracle. I hope that Oracle does not bury the product and that it gets the attention it richly deserves.
A number of years ago, while I was working for WildCard Systems, I was designing an authorization system that had a few constraints. The first was that everything was deployed on Windows and second that the DB was going to be MS SQL Server and as a side effect of moving requirements from the sales team I was forced to implement the business rules as stored procedures. At the time SQL Server did not have a replication system and we were still running on souped up PCs pretending to be servers.
So I built my own replication engine. That failed. And then I tweaked it... it was OK for a while... until the transactions started to show up. Over the years we tried several, including Microsoft's version too. They all failed one way or another. But I digress.
At some point everything was moved to enterpris…

Redis In Payments

There are a number of hurdles for the merchant checkout/shopping-cart to overcome when accepting credit card transactions. There are a number of obvious and outwardly facing challenges like:

PCI-DSS
Acquirer contracts
Shopping Cart
Banking requirements - payments, recurring payments

Once you make it past these not-so-technical speed bumps, there are a number of implementation details that follow. On the one hand there is the user experience and how that is implemented by the website; and then there are the many acquirers and the different protocols and payload formats. This is usually referred to as transaction impedance.

What does this mean? How is that implemented in Redis?
Let's start with the user experience. At some point the user will want to complete the purchase by providing some credit card information that you are going to use to send to an acquirer for processing. Given the number of ways this can be accomplished the best way will be an internal implementation using an iFrame. This …

membase + couchDB = couchbase : Why?

Just a short note because I have some reading to do. In the back of my mind there are echoes of membase being a RAM-only cache-like db, and of forks and the like that implemented the same APIs and had persistence. This guy does a good job breaking things down; however, he gives membase 200K/sec and does not say anything about Redis' performance. So while many of the elements that I would use for comparison are there, they are not equally presented. Furthermore there is still a question in my mind about membase and the persistence that he suggests.
So many questions.
The biggest doubt I have is the title of this article. From the link in the article membase appears to have been absorbed into the couchbase project and so it does not exist on its own. Not to mention that membase.com redirects to couchbase.com. The same can be said for membase.org and couchbase.org.
The primary use-case for membase is caching of data from any traditional DB, so where is the benefit o…

Stay Tuned

I've been working on some new articles that I hope you'll find interesting.

How Much Code Have you Written?
Do you pad your resume?
Perception is everything!
FaceTime - Killer App?

You may or may not be able to tell what's coming from the titles. Either way it should be somewhat interesting.

So Long Replay, Hello Tivo

[Image: ReplayTV shutting down operations]

I was going to watch some show I had previously recorded on my ReplayTV and this message popped up on my screen. It seems that ReplayTV is going to halt all operations on July 31, 2011. This really sucks because I like my ReplayTV. They were the first to offer remote room playback (record a show in one room and playback in another using a second ReplayTV and a wired ethernet network.)

This is sort of interesting because at least one of the ReplayTV devices has been misbehaving. Every once in a while it just hangs. So I was thinking that I would need to replace it in a few weeks or months at the most.

In the meantime I was looking at the latest Tivo. They seem really cool. Some of the interesting features include iPad/iPhone/web integration so that you can change the recording options while you are away from home but connected to the inte…

Loading CDRs into MongoDB

Sweet. This was as slick as you'd expect.

The task was to load 235529 records from 100+ CDR files into MongoDB using the mongoimport tool. I used a Rackspace server with 512M RAM and 20GB disk... but it's all virtual anyway.

Here are the numbers (not scientific at all):

1m 10s - with verbose turned on
34s - with verbose turned off

I'm certain that some portion of the latency with verbose on is that the console was remote and so there was some lag in the i/o across the internet.
The import:

$ . ./bin/cdrmongoimport.sh
connected to: 127.0.0.1
dropping: data.cadb
30700 10233/second
57500 9583/second
85000 9444/second
113600 9466/second
144000 9600/second
170700 9483/second
197200 9390/second
223600 9316/second
imported 235529 objects
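For what it's worth, the two wall-clock timings above can be turned into average rates with a bit of arithmetic. This is just a sketch; `records_per_second` is a helper name I made up, and the post does not say which of the two runs the log above came from.

```python
def records_per_second(records, seconds):
    """Average throughput over a whole run."""
    return records / seconds


RECORDS = 235529

verbose_on = records_per_second(RECORDS, 70)    # 1m 10s
verbose_off = records_per_second(RECORDS, 34)   # 34s

print(f"verbose on:  {verbose_on:.0f} records/second")
print(f"verbose off: {verbose_off:.0f} records/second")
```

That works out to roughly 3,365 and 6,927 records per second averaged over the whole run, both lower than the per-interval figures the tool prints while it is working.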

Just to be sure I checked that all of the data was loaded... some people have been complaining that data has been lost.
$ wc -l /tmp/20110515/*
. . .(snip). . .
  235529 total
And then I checked the count on mongo.
$ ./mongo/mongo
MongoDB shell ver…

Reported my First Bug to MongoDB

I have a client that generates several million Asterisk CDRs (call detail records). These CDRs are not perfect. In fact they are formatted as TSV and not CSV, and they have a leading TAB character. Since the CDRs are generated in 5 minute intervals and the files contain a few thousand CDRs, it does not make sense to load the DB a record at a time. It actually makes more sense to bulk load so that the data is processed at as low a level in the DB engine as possible.

My first attempt to load data into MongoDB failed. The data was all askew. The problem is/was that there was a leading tab in the TSV file. And during the normal processing of the input file the import utility was stripping all leading whitespace regardless of the filetype. Since the whitespace includes the TAB character and since the first column of my data was mostly empty... the file had a leading TAB character.

And this character was considered a whitespace and so it was deleted before the record was processed.
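The failure mode is easy to reproduce in a few lines. This is only an illustration in Python, not the import utility's actual code, and the sample record is made up: stripping all surrounding whitespace before splitting eats the leading TAB and shifts every column one place to the left.

```python
def parse_tsv_line(line):
    """Split one TSV record, preserving empty leading columns."""
    # Remove only the line terminator; a leading TAB marks an empty first field.
    return line.rstrip("\r\n").split("\t")


def parse_tsv_line_buggy(line):
    """Mimics the failure mode: strip() treats the leading TAB as whitespace."""
    return line.strip().split("\t")


# Hypothetical record with an empty first column.
record = "\t2011-05-15 10:00:00\t5551234\tANSWERED\n"

good = parse_tsv_line(record)        # 4 fields, the first one empty
bad = parse_tsv_line_buggy(record)   # 3 fields, everything shifted left
```

The fix in a parser is to trim only the record terminator for tab-delimited input and leave the field delimiters alone.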

So I did what …

Is Social Networking Just a Fad?

I read the headline: Is Social Networking Just a Fad? And instead of reading the article I aggregated a number of other headlines that I read, including One Simple Rule: Why Teens are Fleeing Facebook. There was also some grumbling about the upcoming IPO. Now that social network software is mainstream it's just a matter of time for individual businesses to find a way to capitalize. If you've played any Cyberpunk RPG and you understand the backstory... it seems inevitable.

The rule of thumb in this fiction is always:
"The corporation" wants to spider away as much information as it can from any source that they can. Information from public sources is to be copied and then destroyed. The information from the competition is of utmost importance. And so on...
Sure there is going to be a FB but it will be different. It will be the public forum; it will be the anonymous platform; it will be the place of activism; games; and time wasting. Nothing is going to be real. It will be lik…

Knuth has gotta go

I graduated HS in 1983 and unlike many of the other students who used the Apple computer lab in school I used my family's Radio Shack TRS-80. And since my father was mostly a businessman I wrote text based software. And when we bought our first IBM PC it was monochrome. So it was no surprise that I ended up taking programming courses in college.

It took me 3 years to get out of community college and a 1 year hiatus to find myself and then another 3 or 4 years to catch back up. Since graduating HS I was always working in my field. At some point, while I was in college, I decided to buy the Knuth books. I mean this guy defined our/my existence and there has to be some knowledge in there that I need.
I would like to note that Vol 1 was initially published in 1973. Volume 4a is due out this year (2011) and Volume 5 is due by 2020.
So these books have been following me around for almost 20 years and I'm sad to say that I have never opened them other than to see the publishing date in …

A brand new annoyance: Adobe Air

I installed TweetDeck a few days ago and I don't have anything good to say about that experience. This morning my wife installed the latest Shutterfly express app and I'm not happy about that either. You see, both programs required that I install Adobe Air and in today's environment where nothing is free, not even the free stuff... it's just one more commercial entity with free access to my computer.

If I paid for the software I suppose I would feel more comfortable with the fact that the application was not going to spider my disk drive and give away my family or professional jewels. As it is we permit way too many software snippets access to our virtual homes. These intruders must be sandboxed and in such a way that does not create additional friction to the user.

It's for that reason that I really like the iOS development environment. Every application is sandboxed and that's just the way it is.

So I managed to get a little sidetracked... I don't want to kno…

Review: Google Music Beta

I've been granted a beta account on Google's new music service and my initial impression is that I'm disappointed (and I hope Apple is taking notes for iCloud)

I received my invitation yesterday and I was pretty happy about it. I requested an invitation recently and given the number of uber power users at Google I was not expecting anything.

I installed the desktop app, which is not really a desktop app at all. It's an application that quietly runs in the background aka daemon (or TSR for you DOS throwbacks). The actual user experience takes place in the browser. So on to my checklist of complaints:

The browser app seems sluggish (could be because the upload is running).
The daemon is uploading my entire library (8K songs) rather than just the signature.
They deployed an Android version of the player but no iPhone.
If you elect to receive the free songs you cannot tell them from yours unless you really know your library.
It only plays on my computer and I like the AirPlay in …

MarsEdit is from Pluto

My issues with MarsEdit are not that serious; so let's run the numbers.

Cost - $39.99. Yikes. I had to check the appstore to refresh my memory. I cannot believe I paid this much. The description suggests that it is the #1 blog editor for the mac, so either there are no other blog editors on the mac or this might actually be the best.

Features - Here I have to admit that ME supports a wide number of blogging platforms and maybe that's why the editor seems so clunky. It feels like using the old edlin(DOS) compared to vi circa 1983. I'm just grateful that I did not have to dust off my WordStar cheat sheet.

Append, not update - This could be a bug or it might be PEBKAC; I was doing some simple edit/publish cycles. I published the same document several times... thinking nothing of it. I decided to check my work when I noticed duplicate posts. So it seemed that with every publish I had an extra copy of my document.

Feels like a webapp - There is some RTF in the editor but it feels c…

VoIP on iPhone

I know a little about VoIP because I part-time manage 6 Asterisk servers and 3 CDR aggregators. These systems are involved in telephony arbitrage which I understand from the outside but not as an insider. Let's just say that from the outside looking in it's meant to be obtuse.
So it seemed natural to me to want to reduce my phone bill and still have all the comfort features and functions that I get from the local telco. I do not really want to manage my own asterisk server inside the house. I might want to use my analog phones. And I definitely need to maintain the same QOS as I get with the other guy.
That's when I found a reddit post that caught my attention and eventually got me looking at PlugPBX. This is a great idea once you get past some of the networking issues and the need for an analog connection to the local phone; however, if you're willing to install IP phones around the house or SIP capable wireless phones, this will be a game changer.
With the advent of wireless …

Cassandra; a game changer?

I'm not certain that Cassandra is a NoSQL contender. It may be part of a solution but not a solution unto itself. Upon first reading, the Apache group tells you all these wonderful things that Cassandra does, but it feels like it is not enough. The glaring omission is a MapReduce function, and the closest you're going to get is using Cassandra as the storage engine for a Hadoop NoSQL framework.
Hadoop is a beast of a different color. It seems to support different storage engines... HBase is their traditional storage engine and it also supports Cassandra. There may be others but Hadoop is not the focus here. The last word on Hadoop is that it seems that Google has given the Hadoop license a waiver from its patent on parallel queries.
There is a product named Brisk from DataStax. This is Hadoop+Cassandra+some DataStax sugar. I'm just not buying this setup. The website suggests to me that it's more about the DataStax (commercial) dashboard than it is about the integration of Hadoop …

seven in seven?

I learn a new language at least once a year. It's just something that I have tried to do since I started taking my profession seriously (1988-ish). Recently I started to get the itch to learn a new language and it did not take long to select one.
I had been working with ZMQ (ZeroMQ) for a while and luckily for me they have example code in a number of client languages. Since ZMQ is implemented in C, they have plenty of C examples but curiously enough all of those examples have Lua versions too. The remainder of the examples vary from language to language.
I do not select languages because of the geek factor or the cool factor but for their ability to shorten the development cycle, the tools they provide, community support, the development pool, community activity and viability in business. And using these criteria I had initially dismissed Lua.
For example, there have not been any releases or patches in over 5 years even though Lua is the scripting language used in WOW(world of warcr…

The New Desktop

INTUITION: The new desktop is going to be an iPad or something based on iOS.
We have yet to see a virus, trojan or malware attack against an iOS device. Of course Apple has been singing the praises of OSX for years on the basis that it has not been attacked or penetrated (true or not). So for a moment just assume that an iOS device is as impenetrable as Steve Jobs would have you believe.
(I'm in my happy place)
So there are a few adjectives that I would use to describe the iOS devices:

secure
app-liscious
cloudable
mobile
inexpensive
self-destructable
accessible

The device is secure. It needs to be connected to a PC that has iTunes installed. The iTunes application requires an Apple ID. And somewhere in there is a chain of custody that links the device to the user... by everything short of a DNA scan. And as an application designer I know that each application is sandboxed, meaning that no application can access the data of another app.
Speaking of apps. There are plenty of them. The number of …

Hello world!

This is just one of the many first posts I've created today. From several twitter accounts to ongoing facebook customizations and trying to get the OTJA going with some strong alliances. Hass is  the manager's manager and myself for my technical intuition. I'm also looking for some additional contributors that complement the team.
This is not necessarily going to be a happy-happy house but one that produces some quality discussions... as I've discovered recently;

not much out there is actually fact.. in fact it's mostly subjective opinion.

So here's to hoping that there is room out there for my intuition and that I/we can convert our thoughts into something useful.
/r