October 28, 2010

Parsing Census Data For Fun and Profit - Thursday

The numbers are in. According to my calculations using the 2000 census educational attainment data and the 2008 county election results:

Obama won 53.88% of the college educated vote
Obama won 53.68% of the vote overall

Not much of a correlation: the two figures differ by only 0.2 percentage points. And while we can't make assumptions about college education correlating with intelligence, I think it's safe to say that people aren't morons just for voting for the Republicans or the Democrats.
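A minimal sketch of how a figure like the one above could be computed from county-level data. The post does not describe the actual method, so this assumes one plausible approach: weight each county's result by its estimated number of college-educated voters. The record layout, the function names and the sample numbers are all hypothetical.

```python
# Hypothetical county records: (total_votes, obama_votes, college_share).
# The numbers are invented for illustration, not real census or election data.
counties = [
    (100_000, 60_000, 0.70),
    (50_000, 20_000, 0.40),
    (20_000, 8_000, 0.30),
]


def overall_share(rows):
    """Candidate's share of all votes cast."""
    total = sum(votes for votes, _, _ in rows)
    won = sum(obama for _, obama, _ in rows)
    return won / total


def college_weighted_share(rows):
    """Candidate's share when each county is weighted by its estimated
    number of college-educated voters (votes * college_share).

    Assumption: college-educated voters in a county split the same way
    as the county as a whole, which county-level data cannot confirm.
    """
    weighted_total = sum(votes * college for votes, _, college in rows)
    weighted_won = sum(obama * college for _, obama, college in rows)
    return weighted_won / weighted_total


print(f"overall: {overall_share(counties):.2%}")
print(f"college-weighted: {college_weighted_share(counties):.2%}")
```

Because the data is aggregated by county, a calculation like this can only estimate how college-educated individuals voted.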

See my previous post for an explanation of the colors in the maps below.


2000 Census Educational Attainment
2008 Presidential Election
Election Education Composite

October 27, 2010

Parsing Census Data For Fun and Profit - Tuesday

Some early results from working with census and election data. The election graph represents the 2008 presidential election, where blue represents Obama and red represents McCain. The education graph represents people over the age of 25 who have had at least some college education.

The third graph is a composite of the two. Light blues and orange represent a high percentage of education and a bias towards one candidate or the other, while dark blues and reds represent relatively low levels of education along with a bias towards one candidate or the other. I'll provide some numbers showing any actual correlation between educational attainment and voting later in the week.
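A rough sketch of how a composite color like this could be derived, assuming the hue comes from the county's winner and the lightness from its education level. The function name, the lightness range and the use of plain red rather than the orange mentioned above are simplifications for illustration, not the palette used in the maps.

```python
import colorsys


def composite_color(obama_share, college_share):
    """Return an '#rrggbb' color: hue from the winner, lightness from education.

    obama_share and college_share are fractions between 0 and 1.
    """
    hue = 2 / 3 if obama_share >= 0.5 else 0.0  # blue vs red on the HLS wheel
    lightness = 0.25 + 0.5 * college_share      # 0.25 (dark) up to 0.75 (light)
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


# A well-educated Obama county gets a light blue,
# a less-educated McCain county gets a dark red.
print(composite_color(0.62, 0.80))
print(composite_color(0.35, 0.10))
```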

2000 Census Educational Attainment
2008 Presidential Election
Election Education Composite

October 25, 2010

Parsing Census Data For Fun and Profit - Monday

Something that I've been missing since leaving my last job has been the opportunity to parse large data sets and pull out meaningful information. At Wetpaint I was able to write scripts and Hadoop jobs to parse gigabytes of access logs. My efforts were directly responsible for identifying and removing abuse from the system as well as highlighting bottlenecks that weren't apparent at first glance. That's why this week I'll be focusing on parsing and presenting large quantities of data. The first step will be parsing US Census data and producing detailed maps correlating things like education, race and income with the 2008 presidential election. Afterwards I'll move on to building out some infographics for easy digestion and consumption.

Project Summary: Develop Infographics using US Census Data

Features:
  • Parsed US Census Data
  • Queryable Database
  • Graphical Output in Vector Format
  • Pretty Infographics

October 24, 2010

PuzzleHire - Sunday Week 2

Signed, sealed and not quite delivered. The site is finished and working to my satisfaction but I'll be shelving it for now in favor of waiting for a point where I can devote more marketing resources to it. It has potential but requires a critical mass of puzzles, developers and recruiters to be viable. I'll keep you folks updated.

In the meantime, watch out for tomorrow's post. All signs point to large-scale data processing with Hadoop.

October 18, 2010

PuzzleHire - Monday Week 2

The project continues. I finished up the templating of the basic app last night. All major functionality and the backend are complete. This week we'll see further refinement into a fully realized product. Stay tuned for screenshots, demos and more information on what this site is all about.

October 11, 2010

PuzzleHire - Monday

One of the more embarrassing experiences I've had during my career was interviewing a candidate who was unable to implement a loop to print out the numbers 1 to 100. I had read this person's resume, conducted a phone screen and recommended they be brought in for an interview loop. I also had to ask him to leave after the first round of the interview, as this person clearly wasn't going to be able to perform the required job function. I vowed there and then that I was never going to let a candidate dupe me like that again. I had to stop taking candidates with "2 years of shell and Perl scripting experience" at face value and instead verify it or distribute homework problems.

But there has to be a better way, and that's where this week's project comes into play. What if you could verify coding ability, see code samples and contact an already existing set of job seekers? Or what if you are a job seeker looking for work? Wouldn't it be great to complete a few coding puzzles, mark yourself as available for hire, and then get contacted by recruiters or companies looking for someone with your skills?

This particular project may be split later in the week and extended into a two-part project as the scope is fairly large at this point. In the meantime I'm also going to concentrate more on the core application and less on monetization right now.

Project Summary: Build a site where developers can complete puzzles similar to Google Code Jam or Ruby Quiz and companies can contact interested candidates and view code samples.

Planned Features:
  • Large corpus of programming puzzles
  • Input/Output data sets similar to Google Code Jam
  • Messaging system to protect user privacy from recruiters
  • GitHub and Stack Overflow reputation integration
  • Plagiarism Detection
  • User Submittable Puzzles
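
A minimal sketch of how the input/output data sets in the list above could be used to grade a submission, in the style of Google Code Jam: feed the puzzle's input to the solution and compare what comes back with the expected output. The function names and the sample puzzle are hypothetical, and a real site would need to run untrusted code in a sandbox rather than calling it in-process as shown here.

```python
def outputs_match(expected, actual):
    """Compare outputs line by line, ignoring leading/trailing blank
    space around the whole text and trailing whitespace on each line."""
    def normalize(text):
        return [line.rstrip() for line in text.strip().splitlines()]
    return normalize(expected) == normalize(actual)


def judge(solution, input_text, expected_output):
    """Run a solution (a function from input text to output text) and grade it."""
    try:
        actual = solution(input_text)
    except Exception:
        return "error"
    return "accepted" if outputs_match(expected_output, actual) else "wrong answer"


def add_pairs(input_text):
    """Sample solution: first line is the case count, each case is two integers."""
    lines = input_text.splitlines()
    count = int(lines[0])
    results = []
    for i, line in enumerate(lines[1:count + 1], start=1):
        a, b = map(int, line.split())
        results.append(f"Case #{i}: {a + b}")
    return "\n".join(results)


print(judge(add_pairs, "2\n1 2\n10 20\n", "Case #1: 3\nCase #2: 30\n"))
```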

October 10, 2010

The Big Game - Sunday

Video games in my teenage years were a passion for me. I spent a week living out what one might describe as my adolescent dream. And I even succeeded in developing my first game from start to finish. It's text based, it's unconventional and unfortunately it's incredibly boring. But at the very least I've finished building all of the essential game mechanics. I just failed to make it terribly interesting or worth playing. With more work it may be salvageable. But this project definitely falls into the failure category despite work being completed.

On a positive note, it was fun to build out an interactive system and see a world built procedurally from a seeded random number generator. I have a newfound respect for game developers, especially the smaller indie developers trying to make it in an arena dominated by behemoths like EA.
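
A small sketch of the idea of building a world procedurally from a seeded random number generator: the same seed always produces the same map, so a world can be stored or shared as a single number. The terrain names, grid size and function name are made up for illustration and are not taken from the game itself.

```python
import random

TERRAIN = ["forest", "plains", "mountain", "water"]


def generate_world(seed, width=8, height=4):
    """Return a height x width grid of terrain names derived only from the seed."""
    rng = random.Random(seed)  # private generator, global random state is untouched
    return [[rng.choice(TERRAIN) for _ in range(width)] for _ in range(height)]


world = generate_world(seed=1234)
for row in world:
    print(" ".join(tile[0] for tile in row))  # first letter of each terrain type

# Regenerating with the same seed gives an identical world.
print(world == generate_world(seed=1234))
```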

Tomorrow I'll outline the next project. For now we'll be going back to web apps and I'm considering jumping in with both feet into yet another framework, Django. We'll see how that turns out.

October 04, 2010

The Big Game - Monday

Perhaps you were like me as a kid: good with computers, chock-full of a false sense of confidence, and playing too many video games. This naturally left you wanting to build a video game of your own. However, you had no prior programming experience and only a vague sense that you might need a compiler for something at some point. Regardless, you got to work with your graph paper and started designing levels and features with your friends. The harsh realization that building a game was a lot of hard work and required some fundamentals that you might not have developed yet showed up later.

I'll let you folks in on a secret: I've never actually built a game. Not even a Space Invaders clone. I've started several over the years, but I've never actually gotten close to completing one. I've been kicking around an idea for a simple game. The focus will be on play rather than graphics, and it may even be text based. It likely won't make money, it may not even be fun, but at the very least it will be fun to try my hand at game development.

More details on exactly what the game will entail will come later in the week.

Project Summary: Build a fun, playable game. Fun optional.

Features:
  • Rudimentary Graphics
  • Text Based Interface
  • Replayability
  • Keep in mind portability to mobile and web platforms.

October 03, 2010

API/SaaS Wrapper - Sunday

Released! After a quick review Saturday with some changes and refactoring to whip my mega-methods into more manageable short chunks (thanks Jana!) it was ready to go. At this time I don't want to open-source the code and I don't have a cohesive plan to market it. However, if you're interested in testing it out or want to implement it in your business, please let me know. I'd be happy to provide a free trial in exchange for some feedback.

A few things that went better than expected

  • By Tuesday I had a working proxy of Flickr's API. After that it was as simple as taking on a new set of features each day and polishing the others into something more consistent. For a project that seemed monumental at the start it came together surprisingly well.
  • I decided to open source a small project I put together Saturday morning to help with testing. It echoes back some information on the request and allows you to specify what content type, status code and content you want it to echo back to you. Check it out on GitHub: http://github.com/pcorliss/Echo-Server

Challenges

  • There was a slight issue with the initial configuration and setup. I ended up opting to spend my time coding instead of configuring, so I fired up Eclipse's Google App Engine (GAE) plugin and got started quickly. GAE apps are largely portable except for the memcache service, which will need a small change to run on a different platform. This actually makes it a selling point for smaller startups because it means they could get up and running quickly without having to purchase servers or do much setup besides configuring a CNAME.
  • The project's market was murky at best at the beginning of the week. And I'm still not sure who I'll sell it to or if it would be better off being open sourced. I think it's definitely ready to be put into production, or at least into heavy load-testing. But I don't think setting up a server and running on a freemium model is the right plan at the moment.

Features Missed

  • After some thought I realized that output conversion isn't really something I would want if I or anyone else was building a public API. So it seemed reasonable to skip.
  • After a lot of reading it turns out I really didn't understand OAuth all that well. Acting as an OAuth provider or as a consumer for an API that sits in the middle doesn't really make sense. Perhaps in the provisioning portion, but that isn't practical since you'd have to interface with endpoints that aren't defined. By Thursday I had dropped the feature and moved on.

October 01, 2010

API/SaaS Wrapper - Friday

And it all comes together. I'll be spending some free hours Saturday and Sunday polishing the project up and adding some features from my "Nice to have" list. But otherwise it's good to go. While it isn't yet close to a direct competitor to Mashery's platform, I feel like it's going to be a pretty strong contender for the lazy developer in us all who doesn't want to have to tack on access keys, signing and rate limiting to their existing product in order to make their API publicly accessible. I'll post a final wrap-up on Sunday.

Exciting Features, Developed and Working
  • Proxy multiple APIs via CNAME or path with a single instance
  • Developer provisioning with access keys, secrets and signing options
  • Configurable default and per-developer rate limits in minute, hour and day increments
  • Standalone Jetty service, Google App Engine compatible
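
A rough sketch of two of the features listed above: rate limits in minute, hour and day increments with per-developer overrides, and request signing with a shared secret. This is not the project's code. The class and function names are hypothetical, the limiter uses simple fixed windows, and signing is assumed to be HMAC-SHA256 over a message string.

```python
import hashlib
import hmac
import time
from collections import defaultdict

WINDOWS = {"minute": 60, "hour": 3600, "day": 86400}  # window sizes in seconds


class RateLimiter:
    """Fixed-window limiter with default limits and per-key overrides.

    Counters are never expired here; a real service would evict old windows.
    """

    def __init__(self, default_limits, per_key_limits=None):
        self.default_limits = default_limits        # e.g. {"minute": 60, "day": 5000}
        self.per_key_limits = per_key_limits or {}  # access_key -> limits dict
        self.counts = defaultdict(int)

    def allow(self, access_key, now=None):
        """Return True and count the request if every window still has room."""
        now = time.time() if now is None else now
        limits = self.per_key_limits.get(access_key, self.default_limits)
        buckets = [(access_key, name, int(now // WINDOWS[name])) for name in limits]
        if any(self.counts[bucket] >= limits[bucket[1]] for bucket in buckets):
            return False
        for bucket in buckets:
            self.counts[bucket] += 1
        return True


def sign(secret, message):
    """Hex HMAC-SHA256 signature of the message using the developer's secret."""
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()


def verify(secret, message, signature):
    """Constant-time check that the signature matches the message."""
    return hmac.compare_digest(sign(secret, message), signature)


limiter = RateLimiter({"minute": 2}, per_key_limits={"vip-key": {"minute": 100}})
print([limiter.allow("dev-key", now=0) for _ in range(3)])  # third call is refused
print(verify("s3cret", "GET /photos?key=dev-key", sign("s3cret", "GET /photos?key=dev-key")))
```

Fixed windows are the simplest scheme to reason about, at the cost of allowing short bursts across a window boundary.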