Tuesday, July 31, 2012

More Fermi Problems

I am not a fan of Fermi Problems, or at least of the category of interview question that interviewers have come to ask in the absence of truly relevant ones. They are bad for two reasons: (i) the interviewer rarely has the question written down, so reciting it from memory is a guess at best; (ii) only the most elite interviewers know how to quantify what is a very subjective measure.

Here are some examples:

(1) You are at one end of a long hallway with two light switches. At the other end of the hallway is a closed door. Behind the closed door is a lightbulb, currently off. Which light switch controls the lightbulb?

(2) You have 3 egg cartons labelled S, M, and L. Inside the cartons the eggs are marked S, M, L. Each carton has 12 eggs of the same type. The cartons are all mislabeled. What is the minimum number of cartons that you need to open in order to correct the labels?

(3) You have 3 jars of uniformly colored jelly beans. The beans are flavored Cherry, Grape, and Mixed. The jars are all mislabeled Cherry, Grape, and Mixed. What is the minimum number of jars/beans you need to sample in order to correct the labels?

(4) Your friend has 5 fair coins and you have 6 fair coins. You win if you flip at least one more head than your friend. What are your chances to win? (or something like that)

(5) You are 1 of 50 prisoners. The warden is going to play a sadistic game. He places a black or white hat on each prisoner's head and lines the prisoners front to back such that each prisoner can only see the other prisoners in front. Each prisoner must guess what color their hat is. If the prisoner guesses wrong he is executed. If he guesses correctly then he is freed. Before the game starts the prisoners can decide on a strategy. What is the best possible outcome and what is the strategy?

(6) You are one of three prisoners from the same prison. The deputy warden is also sadistic and has a new game. He gives each prisoner a black or white hat. Everyone can see each other's hat, except their own. This time, however, the prisoners must write down what color they think their own hat is or nothing at all; their guesses are revealed at the same time. If none of the prisoners write anything down they are all executed. If even one guess is wrong they are all executed. What is the best possible outcome and what is the strategy?
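For what it's worth, question (4) is the one item on this list that can be settled by plain arithmetic. Here is a minimal sketch in Ruby, assuming "at least one more head" means strictly more heads than your friend; the method names `choose` and `win_probability` are mine and are only for illustration.

```ruby
# Binomial coefficient C(n, k), computed with integer arithmetic.
def choose(n, k)
  return 0 if k < 0 || k > n
  (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
end

# Exact probability that a player flipping `mine` fair coins gets strictly
# more heads than a player flipping `theirs` fair coins.
def win_probability(mine, theirs)
  favourable = 0
  (0..mine).each do |a|
    (0..theirs).each do |b|
      # Count the outcomes where I flip a heads and my friend flips b heads.
      favourable += choose(mine, a) * choose(theirs, b) if a > b
    end
  end
  Rational(favourable, 2**(mine + theirs))
end

puts win_probability(6, 5)   # prints 1/2
```

Under that reading the answer is exactly one half, which says more about the symmetry of the setup than about the candidate.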

Someone needs to explain to me what the value of these questions is and justify it.

Sunday, July 29, 2012

Apple iTunes and iBooks are on my shit list

Admittedly I've been an Apple fanboy for a number of years. Thankfully the first step to correcting the problem is admitting that there is a problem. One of those 12-step programs.

If you buy a book using the iPhone App Store then you can only read the book using the dedicated reader. It's not like you can reload the book in iBooks.

If you buy the book using iTunes on your Mac or on your iPhone then you get an ePub file that can only be read on your iPhone or iPad using iBooks. There is no way to read the file on your desktop.

In the case of the O'Reilly books there is an additional premium that you have to pay in order to get the DRM removed and the PDF version. This is the double whammy.

What a waste!

Saturday, July 28, 2012

It's all about version 1.0.2

When Sun's JDK version 1.0.2 was released I was finally able to create the first Java network application that made sense. This was the first version of the JDK that lived up to its two promises: (a) that the network was the computer and (b) write once and run anywhere.

Now that Google's Go language compiler has reached its version 1.0.2 I have been longing for the days when the JDK was useful and manageable. Go feels like the JDK of old. The tools are more complete and mature than the JDK ever was. The APIs are mostly easy to understand and use. So many of the architecture features have been well thought out instead of just being workarounds; for example assertions, exceptions and error handling in general.

I've said it before. Java has become our generation's COBOL and Go has become the new Java.

What can I say? Version 1.0.2 seems to be a lucky number.

OSX Mountain Lion - wake from screenlock

It has been 24 hours since I installed Mountain Lion and I have not seen anything "new and interesting". There has been some "new and uninteresting" and then the FAILs.

I keep my laptops powered up and running 24x7. The screensavers are running on a short leash in order to preserve the screen and save some energy, however, since I'm a professional I never know when I'm going to be asked to do something and I'm much too impatient to wait. Also, I run backups 24x7 and I need to make certain that the backups are performed at night when I'm offline in order to preserve bandwidth.

My screensaver engages at the 15 minute mark and the screen is turned off at the 20min mark. What is currently bothering me is that when the screen is sleeping (powered down), and I click a key or move the mouse, the laptop display flashes white for just a second and then restores the previous image... and it's annoying.

I have not debugged the video driver and I certainly do not know what the design details are. It might be the time needed to start up the backlight, or a difference between LED and fluorescent backlights. It's just plain annoying.

Google Chrome for iPhone FAIL

I was just playing with my iPhone trying to get access to my Google Tasks. This included looking for a dedicated iPhone app from the appstore as well as some solid bookmarks.

When I launched my Chrome browser I was presented with a popup that recommended that I "add to home screen". I had performed this function some months ago using the phone's built-in Safari browser but I wanted to see if there were any plusses for using Chrome for the iPhone.

Well, the popup arrived, but it was cut off at the bottom and it was not hovering above the buttons as it does in Safari. Worse still, I could not actually save anything to the home screen.

So, (a) Google's Chrome should detect the browser and only pop up when it makes sense, and (b) Google should provide some useful features for their webapps, either built-in or with some kind of plugin that enhances the experience without creating a security problem.

Friday, July 27, 2012

Google Drive stopped working on my OSX Mountain Lion install

It's a long title but that did not change anything. Apple has sandboxed the OS. "We" expected that to happen but my memory says there was not going to be a workaround. Luckily: http://osxdaily.com/tag/mac-os-x-10-8/

I did not read the entire article; just the parts that interested me and my problem. Which was that OSX popped up a dialog saying that Google Drive was not downloaded from the app store or not provided by an identified developer. I suppose that last part will be corrected by Google eventually. In the meantime use the link above to install Google Drive.

I had to change the switch to allow "anything".  But once Google Drive was installed I was able to move the selection back to the middle position and it still functioned.

One more thing. The current version of Google Drive is 1.3.3209.2688. This is good to know because Google does not offer any other way to ID the different Google Drive versions.

What? Zynga? Really?

Am I the only person on Earth that is disturbed by the happenings over at Zynga? Clearly the SEC needs to take a look into things very closely. I cannot believe that the trades were allowed to be processed, however, it's possible that I might be overreacting because there are rules for these types of transactions. To the point where the SEC probably has the authority to roll the transactions back if they are found to be improper.

OSX Mountain Lion - Notification = FAIL

This completely defies logic. Growl does a good job of alerting the user. I just cannot understand why Growl was not acquired. Sure they added a notification panel that probably required access to the kernel code in order to implement... but in the end it is uneventful. The worst part is that I have not determined which to use, although it's probably Growl... but then there is the individual bias that the Apple apps are going to have instead of playing nice in the sandbox. I also hate that something in my system is trying to email me using Mail.app. I want to specify the mail app and that's that.

OSX Mountain Lion - iCal = FAIL

iCal is a disappointment.  I know that it's pretty and a little refined from the last version... but it sucks! Or more succinctly it fails to integrate with my Google apps calendar. I'm not even sure how to describe the failure.  All I know for sure is that it could not sync my calendar completely. It was missing so very many of my events. It was missing so many that I'm not interested in even trying to make it work. Google simply works and that's all I need.

Monolithic Code Tree Jumps the Shark

I have been considering using a monolithic code tree and I'm starting to have second thoughts. Initially I was thinking that I was going to have one local copy for all of my development and stash my changes as I moved from one HEAD to another... editing more than one subtree at a time. If it sounds complicated then you know what I am talking about.

So I started to consider heading back the other way when another idea struck.

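What if I actually had several local copies, one per subtree?
# each clone gets its own target directory; otherwise the second clone
# fails because the_big_tree already exists (directory names are examples)
cd ${HOME}
git clone the_big_tree the_big_tree-proj-1
cd the_big_tree-proj-1/proj-1

then
cd ${HOME}
git clone the_big_tree the_big_tree-proj-2
cd the_big_tree-proj-2/proj-2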

Now as I modify the different subtrees I can make changes and commit separately. Of course this is going to take some discipline but at least the changes should be localized.

Saturday, July 21, 2012

I call for a do over - index organized tables

I was talking to a DBA recently and he was trying to impress me with his intimate knowledge of everything Oracle. I never claimed to be a DBA and I certainly never claimed to be an Oracle DBA but he was throwing out this particular feature: Index Organized Tables. It's an interesting feature but one that I would never use.

First there is the FUD. Index organized tables are re-packaged in order to recover space and facilitate performance. The fact that the data needs to be rewritten means that there is always a chance that data can and will be destroyed. Whereas in the normal table format the data is essentially static. No moving parts, no data loss, or at least there is a better chance to recover the data.

But forget the FUD, it's not even important. There are a few hard and fast rules you want from your PK (primary key) for optimum performance and flexibility in an RDBMS and this applies to Oracle, MS SQL Server, Informix, Postgres and MySQL.

Your page free space should be minimal, the extent size as big as you can afford, the PK should be a sequence number of some kind and the index type should be a hash. This accomplishes several important things for the physical structure on disk: (a) the pages will be full, so there is no dead space and reshuffling will be kept to a minimum; (b) extending the file with plenty of extents will allow for efficient high volume inserts; (c) if you do not do any deleting then the PKs will be in insert order; (d) hashes are the fastest way to locate a record, on the order of O(1). Everything else should be left to secondary indexes.

Secondary indexes are a completely different animal. Sometimes you need to use covering indexes in order to reduce the number of reads. But for the most part you want to tread lightly. Rebuilding indexes, depending on the amount of data, can take hours and in many cases this can take you out of service. So you want to move as little data as possible... and use hashes wherever you can, especially in OLTP systems.

You need to know your tools to be an effective programmer or database developer. This feature might just be TMI (too much information).

Linux Desktop Proliferation - Attack of the Memes.

Oracle recently announced its own version of Linux. I'm not sure if the intent is to capture some desktops or if it is truly meant to operate in the datacenter. Oracle has a history of certifying its database to run on particular operating systems. I think this is partly because they want to make certain that their software licensing system remains intact after installation and partly because they want to keep customer service operating costs to a minimum; if the OS is understood then the costs should be predictable.

Another Linux distro, Scientific Linux, piqued my interest this month. It was functional, quick, handled multiple displays nicely... generally modern. What bothered me about it was its construction. It was effectively RedHat with a modest increase in default packages (which I'm certain could have been installed manually or with Puppet or Chef).

So before I head off on some tangent I'll cut to the conclusion. Neither of these groups/companies is worth the visit. Use the Oracle distro if you're using the Oracle stack but not for general computing. I do not think they are adding any real value other than that they understand their stack the best.

As for SL, they are not adding any real value to the experience. It would be easier for them to get/use institutional licenses or even a custom distro directly from RedHat. There is nothing that SL needs or wants that is not already available out there. Since SL is a funded project... once the Higgs boson is discovered and confirmed I could imagine that the various colliders will either power down or, rather, all the money is going to head toward the next great adventure. Something like SL would likely see itself defunded in the first round; but that's just my opinion.

The next programming language you learn should be Go

I have been asked several times this past week about technologies I would choose to build my next application. There was a time when I adopted Java and most of the world was still looking at Microsoft's Visual Studio line of languages.
There was a time when people used to say "no one ever got fired for buying IBM" and more recently this rule was applied to Microsoft.

Java has now displaced COBOL in a lot of mainframe and other big iron installations. While it is stable in many environments it is still encumbered by licensing, deep dependencies, lack of a quality rating system, and is still not available on every platform.

Eventually Go may end up in the same place. However, the current state of the art tools for building and packaging Go applications appear to be giving it a leg up. Also, since the level of coding is somewhere between C and C++/Java, one needs to take an algorithmic approach to software development. With any luck this means more performant code and scaling systems.
Google has deprecated several projects this year and terminated others. This is not always a good thing but clearly they are looking at their ROI as they should. It would be nice if Google would let us know what their commitment for LTS was going to be.

Unlike Java, which was closed source for many years after its 1.0 release, Go has been open source since its beta days. I personally think they are lacking an IDE and an AppEngine toolkit similar to the python version. But for the moment it's my goto after python.

Enough is enough ... stop with the elitism ... now!

For my 300th posting I wanted to do something meaningful and I think I hit on it yesterday. I was talking to the CEO of a NYC startup. The product they are building is interesting and in many ways it may be version 2 (see The Mythical Man-Month) and in some ways it may actually be a different product altogether. But what concerns me about this project is that they are moving from NodeJS, which was a complete and utter failure for version 1, to Scala for version 2. The decision to use NodeJS was born when a frontend programmer thought that he could write server code. Whether or not NodeJS was up to the task is not important. This person thought he had the chops. Now version 2 is around the corner and a different group has been taking lessons learned and implementing the new version in Scala. Scala, as a language, is not evil but it is unjustifiably elitist.

Business people, hiring managers, HR departments, recruiters, team leads and architects... take heed. Stop implementing core systems in elitist languages and frameworks. You are not getting any extra points for implementing anything in erlang or scala. Bad programmers are bad programmers and they are not automatically better because they used a different language or framework. Using elite tools simply means that you are paying more with little in return.
NoSQL based technologies are elitist. Stay away from them. When NoSQL has as many tools as the current state of the art RDBMS does for reporting, management, monitoring, etc... then you have something. NoSQL is hugely underrepresented in the toolset.

The exact same thing can be said for the project management discipline. Whether you use Agile Project Management, Agile Manifesto, Scrum, Scrumban, KanBan, waterfall, RUP, etc... Whatever you use should represent past, present, and future adoption of some project management process and tools. To hold one methodology or process above all others means that you do not know them as well as you think or you do not know your team's skills or needs in the project context. Adopt what works and don't get hung up on the glossary. Nothing is sacred except GSD (get shit done). You would be better off identifying when your process stops working and preparing to slide into some old or new process that might be a better fit.
Sometimes you have to fire the rockstar for the sake of the project or the team. -- Debugging The Development Process

Final words of advice.  As someone who worked for a payment processor when it was a startup and before it was acquired... it was a startup first and a technology company second... contrary to the executive whitepaper that had everyone believing it was a tech company. The only thing that we did that was cutting edge was implement some tools in Java and even that was block and tackle.  Everything else was VB, SQL, ColdFusion, and shrink wrapped Microsoft tools.
If you want to refer to yourself as a technology company then you are going to be on the leading edge of the upcoming tech ... and you gotta be prepared to fail in dramatic fashion.

But if you want to be a leader in your market then describe yourself as such. I would have described us as a "highly customizable boutique payment processor". Finally, when you brand your company as "high tech" you are going to pay more for everything, especially programmer salaries. So pick technologies that are going to get you to market with few bugs and some basic ability to scale. Plan to make scale a hardware and devops problem and not dev.
Even a billion or trillion monkeys banging away on keyboards might have a remote possibility of producing the ultimate scalable web property but I doubt it will happen in my lifetime. So find other ways than writing code to scale your business.

[Updated 2012-07-21] I decided to add one final thought. Many successful businesses that scale tend to use a variety of languages. In no particular order they use everything from Ruby, Rails, Python, Django, Perl, Java, C, C++, .NET... but you never hear them crying about scale or using the latest language of the day... with one exception, when Facebook and Amazon declared that they separately implemented a chat server in erlang. To me that was just to generate press and possibly attract some rockstars. (If you really want to scale you need this guy or the message but your rank-n-file web programmer is not going to function at this level)

How do you want to be perceived by employers?

I put together this poll because I'm curious as to the best way to present myself and what is going to generate the best possible response for the best possible jobs and opportunities.

In the poll below I provide two links. One of them is to my link at about.me and the second is to my blog. (in no particular order)

[polldaddy poll=6406653]

Modern IDEs need better themes

I do not really consider TextEdit, Sublime Text 2 or BBEdit to be IDEs but they offer a robust set of developer tools within their tool chest. Others, from JetBrains (PyCharm, IntelliJ, RubyMine) and from ActiveState (Komodo), are more hybrid IDEs. And then there are NetBeans and Eclipse. One thing that irks me is that their theme managers are weak.

All in all the themes themselves are nice looking but they miss the mark on so many levels. Some support named themes. Some support a common/shared format or bundles. Some are hosted with previews. Some are on github or similar. HOWEVER:

(a) installation is anything but smooth
(b) the base editors have mixed syntax support meaning that the number of different colors is sketchy
(c) the colors are typically limited to the editing window and not the menus and surrounding UI widgets

Interestingly the best integration and themes seem to be implemented in vi and its variants and emacs/x. Gotta love that console!

Thursday, July 19, 2012

Editing in the cloud - web based IDEs etc

The challenge: Last month I tried an experiment. Could I be productive writing code exclusively on my MacBook Air 11"? While the results were positive I wanted to get back to my 24" monitor. It has not changed my need to mouse or cmd+tab around the active apps but I can see more, especially now that I have a second monitor for things like IM and Skype. What I do not like about this configuration is that I'm using BBEdit and editing files on my remote dev server via SFTP. I know how unstable an internet connection can be, not to mention that BBEdit has no collaborative features, so if I edited the file on a second system there is a chance I could jank the whole thing up. Of course I could use GitHub or BitBucket as a proxy, however, there are plenty of use-cases where that is not practical and that means keeping the dirty laundry around longer than I want.

Ideal Solution: I'd like to see a chrome or safari plugin that uses their sync capability to keep my credentials secure and then an offline editor plugin with collaborative functionality similar to subethaedit. And while I'd really like to edit the files on my servers via SFTP or FTP/S I'd also like access to my DropBox instance.

(the coding monkeys have not done anything in over a year. subethaedit is clearly sub-par as an editor but it would not take much to make it a leader... maybe before macromates?)

Less than ideal: At first shiftedit looked like a potential alternative. It was not a perfect fit but it offered some features that I liked. But when I started to read the bottom line 3pt font... they keep my dropbox uid/password on their servers. Are you kidding me? Codeanywhere was another alternative. Their website touts functions very similar to shiftedit's. One interesting feature is that they will let you resume a current editing session and sync the edit sessions across the different tools like Chrome, iPad etc... nice. But again, the bad news is that they require a user account. I have a support request in to them in order to get a sense of what they keep or proxy on my behalf.

I just don't get it: Many of these cloud services companies are simply mashups of different cloud services. That's easy. What is hard is keeping all of these mashups from becoming a severe risk. Think about this... the likes of GlobalPayments and similar businesses are wrapped in a veil of PCI requirements, implementation and audits, including obfuscation of card numbers and account numbers with very robust encryption including DUKPT. So what makes you think the cloud mashup of the day is (a) protecting your data adequately and (b), even scarier, not simply a trojan of sorts? Once these mashups have access to your account they can read it all, not just the files you designate.

So pick and choose VERY carefully.

The secret sauce for effective scaling

Many people think that scaling is a matter of selecting one programming language over another. I do not think that anyone would dispute that there are plenty of advantages in most proper functional languages like erlang over dynamic languages like python or ruby. What most developers fail to recognize is that all software that needs to scale is actually a business with a profit motive. And most executives underemphasize "business" in their technology business.

Going back to the lessons of Henry Ford: scaling production came from 2 key areas, (a) interchangeable parts and (b) the assembly line. While it was not particularly motivating for workers to be on the assembly line, in the same role, day after day, it paid the bills.

So which is cheaper? Cloning a machine or rewriting your application for some fractional gain? Even if a rewrite were to get a 2-3 times performance increase, it's simply not sustainable over time and through the next rewrite. I wish there were a study on the cost per LOC (lines of code) for the different languages... my personal experience has been that demanding erlang coding will pay 2x the most demanding ruby. For any project there is a 1:1 ratio of LOC between erlang and ruby.

So the secret sauce:

  • know what your costs are and when you need your ROI to be

  • decide on a strategy and architecture that will scale to thousands of nodes; you do not have to implement this day one but you have to know where you are going.

  • Do not get wrapped up in the GIL, instead concentrate on IPC between processes on the same LAN and virtual WAN.

  • code to the roadmap and plan not to rewrite anything

  • segment responsibility into teams (tools, core, UI, etc)

  • always think about LTS from your providers. While you might not like Windows they are the kings of LTS these days. And if not them, at least there is Ubuntu and Red Hat

  • build tools tools tools

  • Don't repeat yourself
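To make the "IPC between processes" bullet concrete, here is a minimal sketch in Ruby that splits work across child processes and collects partial results over pipes. It assumes a Unix-like system where `fork` is available, and the method name `parallel_sum` is hypothetical; the same shape applies when the pipe is replaced by a socket to another node.

```ruby
# Sum a list of numbers by handing one slice to each worker process.
def parallel_sum(numbers, workers)
  slice_size = [(numbers.length / workers.to_f).ceil, 1].max
  pipes = numbers.each_slice(slice_size).map do |slice|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.puts slice.sum   # the child reports its partial result
      writer.close
      exit!(0)                # leave the child without running at_exit hooks
    end
    writer.close              # the parent keeps only the read end
    [pid, reader]
  end
  # Collect one partial result per child and reap each process.
  pipes.sum do |pid, reader|
    partial = reader.gets.to_i
    reader.close
    Process.wait(pid)
    partial
  end
end

puts parallel_sum((1..100).to_a, 4)   # prints 5050
```

Each worker is a separate process with its own interpreter, so the GIL never enters the picture; the cost moves to how the processes talk to each other.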


Simply put, writing new code in order to scale to infinite levels suffers from the law of diminishing and delayed returns. Cloning hardware is a fixed and capitalizable expense with immediate returns.

Wednesday, July 18, 2012

Ready, Set, Go.....

I was recently asked what I thought the next big thing in languages was going to be. After thinking about it long and hard I strongly believe it's going to be Google's Go. When I read and code in Go I have flashbacks to when I first started writing Java 1.0.2. The same percentage of people were using this language and that (C, C++, VB)... yes, the API wars were in full swing too. OS/2 was almost completely gone, HP had acquired the Dec Alpha and that was on the decline.

When Java 1.0.2 was released it was a practical system programming language. It ran on Sun and Intel hardware (maybe a few others). It had a clean and robust API set that also worked across platforms. It removed many of the tedious activities like memory management and it implemented OO in a clean way.

Now 15 years later Java's SDK has tripled or quadrupled in size, maybe more. The number of APIs is growing daily. Frankly Java and its offspring have become the COBOL of our time.

Go, in my opinion, is in the same place that Java was. The difference now, however, is that "we" are trying to solve slightly different problems for which Go has some solutions. But first, Go looks like Java once did. It's lean, cross platform, and has enough APIs to get reasonable system work done. Now they have added quality concurrency, which Java did not have in the beginning and arguably still doesn't, and they added IPC in the way of channels, which Java still lacks. Add native compiled output, monolithic/static code linking, and integrated package management; and you have a very nice platform. Then there is my intuition.

I recently wrote a Go version of a program meant to merge multiple presorted files. The Go version was a little longer and a little more verbose. But not that much. Anyway, it's certainly a language worth watching.
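The Go program itself is not shown here. For readers who have not seen the problem, this is a minimal sketch of the merge idea in Ruby (chosen only to match the other code on this blog), assuming each input is already sorted; the method name `merge_sorted` is mine, and arrays stand in for the lines of each file.

```ruby
# Merge several already-sorted arrays into one sorted array.
# Each step takes the smallest head element among the inputs.
def merge_sorted(sources)
  # Track a read position per source instead of mutating the inputs.
  positions = Array.new(sources.length, 0)
  merged = []
  loop do
    best = nil
    sources.each_with_index do |src, i|
      next if positions[i] >= src.length   # this source is exhausted
      if best.nil? || src[positions[i]] < sources[best][positions[best]]
        best = i
      end
    end
    break if best.nil?                     # every source is exhausted
    merged << sources[best][positions[best]]
    positions[best] += 1
  end
  merged
end

puts merge_sorted([[1, 4, 7], [2, 5], [3, 6, 8]]).inspect
```

With only a handful of inputs the linear scan for the smallest head is fine; with many files a heap keyed on the head elements would be the usual replacement.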

Monday, July 16, 2012

Would the real 'Agile' please stand up!

I'm not a fan of rap but I do like Slim Shady. I decided to see what wikipedia had to say about Agile and when I got there I was very surprised. First of all I was expecting to see something that mirrored the Agile Manifesto. What I mean is that when the wikipedia page rendered on my phone the first thing I saw was a drawing/poster of the Agile process and the vocabulary did not look like it came from (a) the Agile Manifesto but from (b) waterfall. In fact, while I have been saying this for some time... it is now fully realized.
The "Agile Software Development" process or whatever you want to call it is actually a modified "Waterfall". (It even looks a lot like RUP too.)

Please do me a favor. Stop calling it Agile because it is not. Agile and the Agile Software Development are two completely different things.

Finally, just because your development team practices the Agile [new name goes here] Software Development process does not mean that your entire management structure needs to be flattened. Neither Agile nor ASD recommends anything beyond "the team" of developers. Of course you might treat your managers as a team in sort of a breadth first search sort of way. But to suggest that self organizing teams should be left completely to themselves is just irresponsible. Simply put, mob wisdom does not always work, especially when it's my money on the line.

Tuesday, July 10, 2012

The good and bad of the about.me SDK

The folks at AOL have done something interesting. They have created an SDK for their about.me property. I like my about.me page as an anchor to everything I'm doing or that I need and that I'm willing to make very public. But now that AOL has created an API/SDK some new mashups are bound to happen.
I question what AOL's endgame is here. about.me offers a metoo email client of sorts and "offers" like coupons that you can never make go away. And yes you are giving them plenty of information and access to even more.

As for the good news. about.me's API/SDK seems to be fairly new. That means that other than the stock services (read apps) there has not been much action. But now that anyone can add an app/service to about.me even the most humble app might get some traction.

As for the bad news. If there is a landrush I'm not certain what AOL will be able to do about it. Clearly if there is a vetting process then they can control the rate at which new services are added to the site. If the velocity or count gets too high they may have to make some drastic changes to the interface.

As I look at my about.me and the way that I use my browser and the internet I'm just not certain how or what I would add to this page. Of course a bitly link might be nice. So would a delicious look alike. But the fact of the matter is that about.me seems to be a metoo app itself. For example, I have an idea for an app that I would like to connect to about.me but I do not want to require that my users create about.me accounts in order to use my app. And I'm certain that I will not be able to attract other about.me users to install my app once the list grows.

Just plain confusing.

Sunday, July 8, 2012

TornadoWeb - nice but incomplete

Quickie: TornadoWeb is a very nice framework. I used it to build a cool RESTful payment gateway. But after reading Davis' blog about asynchronous mongoDB+TornadoWeb I realized that TW is good for very shallow transactions. Any sort of decision making or series of callbacks is non-trivial and about as much fun as NodeJS' callbacks.

NOT!

Ruby - ARGF interesting but destructive

[Update 2012-07-08] I reviewed the ARGF source code and it explicitly closes the file. This is silly nonsense because (a) #close is supposed to perform a File::close and then advance to the next file in the list. (b) #close and #skip are not documented as aliases (c) the doc for #skip does not say anything about side effects.

Thanks to the guys at pragprog I have been rereading pickaxe. And in the process I found the ARGF class. It seemed pretty interesting because it was supposed to process ARGV as if all of the arguments represented files (state of the files notwithstanding).

The problem, however, is that it just feels improperly implemented. The first time you do anything with ARGF it has already opened the first file. And when you are on the last file there is no way to know other than checking closed? or fileno against the previous iteration. For example I would expect something like this:

[sourcecode language="ruby"]
# Hypothetical API: ARGF has no more? method
while ARGF.more?
  ARGF.skip
  # and then do some stuff to the file
end
[/sourcecode]

Instead you have to do this:

[sourcecode language="ruby"]
until ARGF.file.closed?
  # do some stuff to the file
  ARGF.skip
end
[/sourcecode]

or the while loop could be:

[sourcecode language="ruby"]
# 0 is truthy in Ruby, so test for emptiness explicitly
until ARGF.argv.empty?
[/sourcecode]

I suppose there is not much difference between the first example and the second from a structure perspective; however, there are side effects. Once you start to use ARGF the first file is opened. That might be a good thing and it might be bad. But it may also mean handling exceptions in more than one place. It certainly feels more verbose, especially if you want to count the number of params by reusing ARGV.

The big floppy issue, however, is that ARGF is destructive.  ARGF connects to ARGV via a reference, meaning that external code can alter ARGV and the effects will be reflected in ARGF. The opposite is true too.  When skip'ing over parameters ARGF appears to be removing the file from ARGV, meaning that you cannot do anything with argv/ARGV after the fact as it has been consumed. (This side effect is never mentioned; in fact the documentation for ARGF never even hints at this.)

The third piece of code makes processing easier but unexpected.
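
A minimal sketch of the consumption described above. To avoid touching the real ARGV it builds a private stream with ARGF.class.new, which is an assumption on my part about how best to demonstrate this; the method name drain_with_argf is made up for illustration.

[sourcecode language="ruby"]
# Read every line through a private ARGF-style stream and record what its
# argv looks like before and after. Illustrative helper, not part of Ruby.
def drain_with_argf(paths)
  stream = ARGF.class.new(*paths)       # private instance; real ARGV untouched
  before = stream.argv.dup              # nothing has been opened yet
  lines  = stream.each_line.map(&:chomp)
  { lines: lines, argv_before: before, argv_after: stream.argv.dup }
end
[/sourcecode]

Each file name is shifted off argv as the file is opened, so argv_after comes back empty. If you need the list afterwards, take a copy (ARGV.dup) before the first read.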

Saturday, July 7, 2012

Celery 3.0 - congrats, so what you're still fat.

Celery is HUGE. The dependencies include RabbitMQ which requires Erlang. And while Erlang offers some killer features it's one big-ass VM. Given how Celery is meant to be used I'm not convinced that RabbitMQ is a good tool. There are so many other MQs out there. Make sure you check the features you really want and need. I happen to like speed, persistent queueing, multiple workers, flexible topology. Much of this is implemented in ZMQ but even that can be heavyweight. Some folks are using Redis, others MongoDB for this exact function.

I guess this is why Celery offers some alternatives to the RabbitMQ broker.

Trust in the wake of Stuxnet?

I watched a short report on the Stuxnet bot, virus, trojan, worm, thing. All the super-spy stuff scared the crap out of me. I do not care how sophisticated it is/was. Whether it has really been detected. Or whether or not it's actually real.

What bothers me about it is (a) it is said to have been running silent and (b) it seemed to know exactly what it was looking for. So I find myself asking a number of questions:

(1) have my computers been infected with something I need to be worried about?

(2) has the infrastructure that I depend on daily been infected or compromised?

(3) what happens if/when Stuxnet-lite or #2 completes the Stuxnet mission?

Two days ago I wanted to FAX a 30 page document. I took my document to the local USPS store. They scanned it and sent it. I also asked them to email me a copy of the same docs. The funny thing is... I have a stack of thumb drives on my desk. I could have easily used one to transport the scanned image home. But I started to think about Stuxnet and its attack vector: thumb drives. Thus the USPS agent emailed my document to me.

Sandboxing as described by Apple is going to resolve a number of security issues but it is not going to solve them all. It's not going to help if OSX has been compromised with a backdoor from the source. It's not going to help if there are some bugs in the hardware (think SQL injection to a website). And so on.

The thing that Stuxnet was supposed to do was provide some plausible deniability. Consider the forest of BSODs that Windows receives in a year. How many are real and how many are something else?

The hardest aspect of software development!

The hardest aspect of software development is keeping your systems configured properly. Installing tools like rvm, govm, virtualenv and so on is pretty simple but once you've installed the base packages there are any number of dependencies that you have to deal with, and getting a consistent install is no less difficult or tedious than TDD (test driven development).

I had a strategy once that included using a VMWare instance that I would use to build a base-line configuration for the target project. Then share that baseline with others on the team. This strategy came to me after years of dependency documents, repository sub-projects, storing dependencies in the project tree, and even dedicated target build/test machines which is the closest to the VM strategy.

I think it would be ideal if there were a script or program (and it might already exist) that would reverse engineer your current dependency list (installed gems or pips etc.) and generate an install script that would work cooperatively with your rvm or virtualenv so that you or a team member could re-build the dev/staging/production environment from scratch as easily as possible.
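
As a rough sketch of the gem half of that idea, the snippet below turns the text printed by `gem list --local` into `gem install` commands. The method name install_script is made up for illustration, and it assumes the usual "name (version, version)" output lines; it does nothing about rvm gemsets or system packages.

[sourcecode language="ruby"]
# Convert `gem list --local` output into one `gem install` command per
# installed version. Lines that do not look like "name (versions)" are skipped.
def install_script(gem_list_output)
  gem_list_output.each_line.flat_map do |line|
    match = line.match(/\A(\S+) \((.+)\)\s*\z/)
    next [] unless match

    name = match[1]
    match[2].split(", ").map do |version|
      # Drop a "default: " prefix and any trailing platform tag
      version = version.sub(/\Adefault: /, "").split.first
      "gem install #{name} -v #{version}"
    end
  end
end
[/sourcecode]

Something like puts install_script(`gem list --local`) would print the commands, which could then be saved as a shell script and replayed inside a fresh rvm gemset.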

 

Thursday, July 5, 2012

The monolithic code tree is a data cancer

I've been a fan of Google's approach of a single code tree for quite some time. I thought it was a pretty good idea because there could be some unique benefits: (a) programmers might stray into folders that they might not otherwise have access to and learn or contribute something unexpected; (b) it reduces the potential that code would get duplicated; and (c) the code librarian's maintenance cycle is more manageable.
However there is one serious side effect to having this sort of intra-company openness.

For my company I decided that a single source tree was a good idea, mostly because of (c).  Since I was the librarian I did not want to be managing tree after tree after tree. (Even though bitbucket.org offered unlimited private repositories).

The issue is actually much bigger than this. Google turns over its codebase several times a year. Also, they share a lot of code between projects. And while this works for Google it's not the way most businesses run. With the exception of Google's spiders Google provides a read-only service to its users. Whereas most businesses are read/write and the write must be consistent and reproducible. Which is the antithesis of the "Google Way".

So as I look at my FlaFreeIT repository I realize that there are projects that I will not ever update or repair. Clients that have long since departed. It might simply be better to have separate trees, if only because of local storage, clone latency and performance, and the risk of leaking or losing code, among any number of other justifications.

One last thought. When Googlers make their presentations it's likely that these are on Google Apps. And that makes sense. But do they really have separate systems for presentations and development? And if so, do they really keep all that source code on each of their laptops?  That's got to be painful!

Tuesday, July 3, 2012

I want to play with sinatrarb but I cannot bring myself to install RVM

I've been given a small project to quote. At first I thought that the team was already using rails but after a code inspection of the proof of concept (POC) I realized it was PHP. Well, I'm not a PHP person but I can do it if I have to and that's when I started to think that maybe I could implement this in Rails. But Rails was overkill so why not sinatra? It's actually a half decent platform so why not.

I'll tell you why not.  Because for this one simple, teeny, tiny project I would have to start installing all sorts of cruft and that made me very unhappy. As I have mentioned previously RVM is a mess and it puts all sorts of demands on your system. Sure the ruby gurus have plenty of time and patience for this sort of thing. But for certain, I do not.

*sigh* but I might do it anyway.

Monday, July 2, 2012

In the beginning there was the BBS then AOL

There are a number of revisionist and non-revisionist histories for the dawn of the internet age. I suppose it depends on what or where the historian's vantage point was at the time. For me, I went from CB radio, to BBS, to AOL. And I was grateful for it. The dial-tone that is.

Now over the last 10 or 15 years things started to shift. They went from just a handful of domains, to an ICANN land rush, million dollar domain name sales, domain name lawsuits, and now to TLD expansion and more of the same.

At the same time AOL expands and contracts. For one really short moment AOL was the internet-lite for the rest of us. From a technology perspective they nailed it. Email, integrated address books, graphics, search, keywords (pre Google Adwords), over the air compression (pre Opera Browser), gateway to CompuServe, national and international service. And the most important fact, one that we should not and cannot ignore: AOL was the cloud before there was a cloud.

Now history is repeating itself. The monolith that was AOL is now being repeated in Google, Amazon, and many other internet service providers. AOL's user base is falling as they struggle to reinvent themselves. AOL recently announced the end of life for AIM which was the de facto instant messenger for a huge population of tech savvy and neophyte users. AOL is not my favorite platform; however, between "AOL scale" and its user base there has to be an acquisition opportunity for a tech company like Google instead of one failed media company after another.

Sunday, July 1, 2012

I want three features in my GIT GUI

I just read an article where the author suggested that GIT GUIs were bad and that the command line was the best way to go. Then in response to the first comment that "tower was good" the author agreed.

Distributed version control (DVCS) is non-trivial for the programmers, librarians and builders. It seems to be better than the alternatives like subversion or CVS and so much better than RCS. That said I have to argue that there is so much rich information in the application that the console is just not capable of displaying everything in a nice way.

So, I want three features in my GIT GUI:
(1) GUI-like visualization of the output from any command
(2) time series-like reporting
(3) a command line with good auto-complete

Adobe LR4 (LightRoom4) and Lua

I have been hacking on Lua and Lightroom4 for the last week or so and I have to admit that while I was skeptical of Lua and its potential I'm actually starting to like it in this space. It appears to be fast and efficient, although some of the syntax sugar seems less than modern and probably suffers from not-invented-here syndrome, as the developers were inventing the language with a strong sense of national pride.
For example, many languages (Python, for one) treat 0 (zero), '' (empty string), and None (same as nil or null) as false. In Lua only nil and false are false, so you have to explicitly test the values and then perform the logical operations.

I have one good thing to say about LR4 and Lua. The sandbox seems to work really well during the development cycle. There is a flag that can be set where LR4 will reload your code if it has changed. It makes development a lot easier. (but there may actually be a memory leak).

And that's about where the good news ends. The GUI controls are amazingly limited. Given today's toolbox you'd think Adobe would give me access to more controls. Like a date-picker or a proper list control. But more importantly if you're going to provide documentation for a feature like auto_completion then it needs to work exactly like the documentation says it should, and it should work exactly the same on all platforms.
auto_completion/completion does not work the same on Windows and OSX.

And while the particular bug I've been chasing was reported 4 years ago (2008) the response was that it was documentation that was leaked into the public and that it was going to be fixed ASAP. In the meantime the feature is now available on OSX but has not been ported to Windows.

Finally, I wish I knew what the LR4/Lua community or ecosystem looked like. After spending $100 on Lua [LR4] I can see that since Lua plays such a minimal role there is no real need for consulting or freelancing etc in this space. It's strictly a support role.

another bad day for open source

One of the hallmarks of a good open source project is just how complicated it is to install, configure and maintain. Happily gitlab and the ...