Thursday, April 16, 2015

R annoyances

R seems to be, hmm, quirky, from what I've seen of it so far... For instance, ddply has different behavior using the American vs. British spellings of "summarize". Seriously?!
> g <- ddply(m, c("QGroup","Income.Group"), summarize, All = length(CountryCode))
Error: argument "by" is missing, with no default

> g <- ddply(m, c("QGroup","Income.Group"), summarise, All = length(CountryCode))
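Yes, that's right - "summarise" works fine but "summarize" apparently takes different arguments. (Most likely another loaded package is masking plyr's summarize; Hmisc, for example, exports its own summarize() that requires a "by" argument.)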

Trouble installing jpeg R package on OSX?

I'm taking the Coursera "Getting and Cleaning Data" class in R and one of the quizzes requires you to use the jpeg package. The R package install was failing for some reason though:
install.packages("jpeg")
In file included from read.c:1:
./rjcommon.h:11:10: fatal error: 'jpeglib.h' file not found
#include <jpeglib.h>
         ^
I use Homebrew, so I tried the usual...
brew install jpeg
Warning: jpeg-8d already installed

brew unlink jpeg && brew link jpeg
But that didn't fix the issue. The solution was to install the Xcode command line tools with:
xcode-select --install
(As a side note, this is my development machine so I already had the Xcode command line tools, but for whatever reason I had to install them again. Maybe an Xcode upgrade removed them?)

Tuesday, April 7, 2015

Sometimes the best code is that which is never written

Someone wrote some moderately over-complicated code to provide functionality in a service. That developer has moved on. A little time passes and suddenly there are a bunch of stack traces popping up in the logs indicating that a status message isn't getting published.
First reaction - dig into the error, figure out exactly what's wrong and fix it, ASAP!

Here's the error:  Invalid parameter: Message too long (Service: AmazonSNS; Status Code: 400; Error Code: InvalidParameter; Request ID: <redacted>)

Seems pretty straightforward; two seconds of Googling turns up that SNS messages have a limit of 256KB. Go into the appropriate class and add something to detect how many bytes are in the message. Hmmm... maybe these messages are really large, and our app has a ton of threads - don't want to use up too much memory. A quick check on StackOverflow suggests using a CountingInputStream (there's one in Apache Commons IO, but it's also pretty trivial to write). Add it. Write a couple of tests to make sure that it counts bytes correctly for UTF-8, UTF-16, etc... Looking good, and I've only spent 20 minutes so far fixing this problem.

Now that we can reliably detect the message size, we need to deal with ones that are too large for SNS... Uh-oh, the message body is actually something getting serialized into a JSON payload, we can't simply truncate or the message won't be valid JSON any more. This will require some thought, but I'm totally up for the challenge!

STOP! Take a step back from the technical details and think about what problem you're trying to solve.


Basically, the point of this code was to "make sure a notification goes out that something failed to process". There's nothing in there to suggest that the entire response needs to be serialized and sent on the wire (although that's what it was doing originally). Think to yourself, "what's the simplest thing that could work?" In this case, all that really needs to be sent is an ID and a status of success or failure:

{ "id": 1234, "status": "failure" }

There's NO way that message will ever approach 256KB, so do I need to worry about truncating this message? No. Looking back on it, did I need a CountingInputStream even with the truncating solution? Probably not: it was a JSON payload in UTF-8, and UTF-8 encodes the ASCII range in 1 byte per character, so simply getting the length of the String should have been "close enough".
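To make the "close enough" point concrete, here is a minimal sketch of a size check that needs no counting stream. The class and method names are made up for illustration, and the limit is assumed to mean 256 * 1024 bytes; this is not the code from the service.

```java
import java.nio.charset.StandardCharsets;

public class SnsSizeCheck {
    // Assumption: the 256KB SNS limit is taken to mean 256 * 1024 bytes.
    public static final int MAX_SNS_BYTES = 256 * 1024;

    // Exact size of the message once it is encoded as UTF-8.
    public static int utf8Length(String message) {
        return message.getBytes(StandardCharsets.UTF_8).length;
    }

    // True when the encoded message is within the assumed limit.
    public static boolean fitsInSns(String message) {
        return utf8Length(message) <= MAX_SNS_BYTES;
    }

    public static void main(String[] args) {
        String ascii = "{\"id\": 1234, \"status\": \"failure\"}";
        String accented = "caf\u00e9"; // one non-ASCII character

        // For pure ASCII the two numbers match; for the accented string they differ.
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " bytes");
        System.out.println(accented.length() + " chars, " + utf8Length(accented) + " bytes");
        System.out.println("fits: " + fitsInSns(ascii));
    }
}
```

Note that getBytes allocates a full copy of the message, which is exactly the memory cost the counting stream was meant to avoid. For a payload of a few dozen bytes that cost doesn't matter.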

In the end, I had to write a negligible amount of code to create that new JSON payload, and I removed a bunch of complexity from the original codebase along the way.
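For a sense of how little code that is, here is a rough sketch of building the payload. The StatusPayload class and toJson method are hypothetical names, and it assumes a numeric ID and only two possible statuses; the real service code may well look different.

```java
public class StatusPayload {
    // Hypothetical helper: builds the minimal notification body as a JSON string.
    // Assumes the id is numeric, so nothing in the output needs escaping.
    public static String toJson(long id, boolean success) {
        String status = success ? "success" : "failure";
        return "{\"id\": " + id + ", \"status\": \"" + status + "\"}";
    }

    public static void main(String[] args) {
        // Prints the failure notification for id 1234.
        System.out.println(toJson(1234, false));
    }
}
```

If the ID were a free-form string, a proper JSON library would be the safer choice, since hand-built JSON breaks as soon as a value needs escaping.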

The take-away from this exercise? Step back and understand the problem-space from the start and resist the impulse to dive in and fix the errors immediately.

What's important to me, career-wise

(Continuing on from "What's important to you, career-wise?", here's my personal list of top 5 needs from a job, ranked from most important to least)

1 - Interesting people

It's important to me to work with interesting people who are more experienced/skilled/intelligent than I am. As Chad Fowler said in The Passionate Programmer: "Always be the worst guy in every band you’re in" - so you can learn. The people around you affect your performance. Choose your crowd wisely. It is NOT a good idea to surround yourself with people who think identically or have had the same life experiences. Just because everyone around you hates node.js doesn't mean there aren't good ideas in it, but if there's no one championing it then it's easy to get mired in your (mis-)conceptions. Different viewpoints can challenge your perceptions and make you grow. Learn from their experiences and challenge dogma wherever you go.
In the almost 6 years I've been here, there's only been one person I've disliked working with; that's almost unheard of. It lends credence to the idea that hiring decisions based on culture can result in a better working environment. Unfortunately, niceness doesn't help you grow. There are people here I would love to work on projects with, but the reality is that there are no projects big or complex enough to justify us working together. Those people are a scarce resource that gets spread out to make the most use of their time. - 1/5

2 - Challenge

You grow from experience and from solving problems, not just putting in time. I think you can find something to challenge you in most projects/situations, whether that's solving scaling, latency, algorithm selection or just plain figuring out how to meet the business needs. But there's a corollary to that: it has to be something meaningful. Sure, "reduced latency from 1,023ms to 45ms" looks great on a resume, but if it's an operation that gets called once a day from a cron job, did you achieve anything? If you spent the last 6 months "configuring" a system owned by your parent company or fighting with byzantine build tools, but don't have any means to improve that process, did you grow?
I can't remember the last project that involved an actual technical challenge. A true craftsman doesn't blame their tools, but trying to build a skyscraper with a plastic hammer isn't a challenge, it's foolish. Most of the "challenges" here are self-inflicted. - 0/5

3 - Growth

I'm huge on growth and learning. This is third on my list but there are aspects of it in pretty much everything I choose to do (or not do). I attend a lot of conferences; I find it's a great way to keep up with the industry, see what's hot and upcoming and get new ideas or inspiration. I read a ton of books and articles. I also take advantage of MOOCs (Coursera/Udacity/etc) - I'm taking some courses on R right now. Elasticsearch? Docker? Kafka? TDD? Continuous delivery? Sign me up!! I'm happy learning new languages and being a polyglot (java/ruby/go/perl/whatever) programmer. It helps me to select the best language for the job at hand and team situation. That being said, there's a vast difference between HelloWorld and the invaluable experience of running something in production. If you don't have anywhere to apply all these ideas, concepts and languages to something real, you won't grow nearly as much.
Zappos funds one conference a year and lets you expense work-related books. 'Pursue growth and learning' is actually a core value. There's a quarterly hackathon to explore and try to solve any problem you want. Ultimately though, it boils down to "Do I have somewhere to apply what I'm learning?" and the answer has been almost uniformly no. All the time and effort I invest in outside learning barely offsets the complete lack of mental stimulation offered from the projects available. - 2/5

4 - Balance

Work hard, play hard. I love thinking about tough problems, and like the fable of Archimedes, having that Eureka moment. But those moments don't necessarily occur at work. They occur at the gym or lounging by the pool. They happen at a bar with friends or coworkers or on a trip to an exotic beach with crystal clear water. What's important is that I have the challenging problem to puzzle over in the first place. I don't enjoy purely remote work; I like being in contact with other human beings and coworkers. Some of the best times in my life have been during grueling projects with a small group of awesome people, doing something that mattered.
You definitely have the freedom to play hard here, and the freedom to joust at windmills if you choose. But work hard? Not doing something that matters. - 2/5

5 - Money

Last on the list is salary, for a reason. This is more of an enabler than an end goal. As Henry David Thoreau said, "Wealth is the ability to fully experience life." Once I have enough to live the way I want, and to do the things I want, then it stops being a motivating factor.
I feel more than fairly compensated for this market, which is all I ask. - 5/5

I heard the Arctic Monkeys playing 'R U Mine?' on the way to work this morning:

And I go crazy cause here isn't where I wanna be
And satisfaction feels like a distant memory