Validated Learning, using Instagram as a lab

Instagram finally released an Android version, so I’m happily snapping away. My feed is at http://www.gramfeed.com/dparmenter.

I’ve spent much of this spring pondering how to put the precepts of ‘The Lean Startup’ into practice. Ideally, I’ll be able to do this at work and help my projects be awesome! But it’s kind of tricky, and it’s not clear how to get started.

This week, I used Instagram as a laboratory. At the beginning of the trial, I had 27 followers. The metric I tried to maximize was the conversion rate to ‘follower’. I created five separate cohorts of Instagram users, as follows.

  1. Cohort #1 – people whose photos I simply ‘liked’ (4 photos each)
  2. Cohort #2 – people I blindly followed
  3. Cohort #3 – people who liked my photos, whom I followed back (and liked 4 of their photos)
  4. Cohort #4 – followers’ followers – I just followed them
  5. Cohort #5 – followers’ followers – when they accepted, I immediately unfollowed

All test subjects were found through my feed, the feeds of people I follow, etc., so there is probably a clustering effect. I should say, I feel kinda guilty about following people insincerely, but I am just doing an experiment. Hopefully no karmic damage.

In the week I did this experiment, I went from 27 followers

And the results:

  1. Cohort #1: 3 followers
  2. Cohort #2: 1 follower
  3. Cohort #3: 3 followers
  4. Cohort #4: 5 followers
  5. Cohort #5: 6 followers

Additionally, I picked up 18 other followers: some of them ‘legitimately’, and others because I could see that #4 worked so well that I used it some more.

I think I can categorically state that your best bet for getting followers is to follow your followers’ followers.

In terms of process, keeping your cohorts straight is crucial. Once you’ve done that, you can study them for a long time if you like.
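One way to keep the cohorts straight is to put the tallies in a small table and compute from there. This is only a sketch: it numbers the cohorts 1 to 5 in the order listed, uses the follower counts reported above, and the `conversion_rate` helper is a hypothetical name of mine. The post does not record how many people were in each cohort, so the rate needs that number passed in.

```python
# Followers gained per cohort, as reported in the results list.
gained = {1: 3, 2: 1, 3: 3, 4: 5, 5: 6}

def conversion_rate(followers_gained, people_approached):
    """Fraction of approached people who became followers.

    people_approached is not reported in the post; the caller must supply it.
    """
    if people_approached <= 0:
        raise ValueError("people_approached must be positive")
    return followers_gained / people_approached

total_from_cohorts = sum(gained.values())
best_cohort = max(gained, key=gained.get)

print("followers from cohorts:", total_from_cohorts)
print("best cohort by raw count:", best_cohort)
```

Raw counts only tell you which cohort brought in the most followers; comparing cohorts fairly needs the rate, which in turn needs the cohort sizes.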

As the business owner (of my feed), I think the next step for me is to try to get to the point where I have roughly the same # of followers as followees.

Balsamiq and the iPad

I’ve been a loyal Balsamiq customer for about a year. The software is insanely great: it does just what it is designed to do, and it is very pleasing to look at and to use. And perhaps more importantly, the team that makes it seems rather awesome! Don’t believe me? Check out their manifesto.

So there’s just this one thing I don’t get:

Why no native iPad support?

I decided to be vendor-centric (just invented this term!) and get in touch with Peldi to find out the answer.

He wrote back almost immediately, so here’s the tutorial:

Hi David, thanks for the kind words. iPad elements are here for now:
http://mockupstogo.mybalsamiq.com/projects/ios - instructions on how to use them are here:
https://mockupstogo.mybalsamiq.com/projects/aboutmtg/Finding+and+Using+Libraries

We’ll make these easier to find from within the app soon!
Peldi

I tried it and it works! You need to follow their best practices and put all of your assets in a directory named assets, wherever your BMML files live, and then tutti va bene!

I still would like for them to add this support into the product natively, but I’m unblocked.

January / February hiatus bearing fruit

I find it devilishly difficult to simplify my own ideas. Other people’s ideas, I whittle them down to their essence in no time. I guess this is what Lennon/McCartney and Rodgers/Hart were all about, but I just have me!

As I regroup from the dead end I created for myself, I am asking:

  • what is super popular in the iPhone-o-sphere that I can cooperate with?
  • how do I eliminate just about all of the processing steps in my design?
  • how can I get this thing designed and coded in the 4 hours / week I seem to have available?

I think I’ve got a plan, but I’m staying in stealth mode for a little bit more. Stay tuned!

In the meantime, I’ll probably start blogging about work a bit, since there are some pretty cool things happening that I can share.

Give blood – Learn Parallelism

A couple weeks ago, my pal Bob Treitman got the bloodmobile from Mass General Hospital to come to Adobe at Waltham.  It was a great experience, thanks Bob!  Lots of people showed up, and even people from neighboring offices came down.

I was amazed by how well organized it was, and how capable the staff was.

The bloodmobile also presents itself as a lovely example of how to build a robust, parallel processing system.

  1. They schedule visitors in 15-minute increments, but if more or fewer show up, this is not a problem because…
  2. Visitors can wait inside, and overflow outside
  3. They give you privacy while they get your data, check your vitals, and ask you personal questions
  4. If need be, you go back to the waiting area, although in my case they put me on the couch right away
  5. My blood sugar was a little low, so they sent me to the …
  6. Fridge and then back again
  7. When I was done, I got a card and felt great about the whole thing!
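The same shape can be sketched in code: a shared waiting area absorbs uneven arrivals, and a handful of stations pull from it in parallel. This is just my analogy, not anything the bloodmobile staff described; `run_blood_drive`, the number of stations, and the donor names are all made up for illustration.

```python
import queue
import threading

def run_blood_drive(donors, stations=3):
    """Process donors through parallel stations fed by one waiting area."""
    waiting_area = queue.Queue()   # buffers early, late, or extra arrivals
    finished = []                  # donors who made it all the way through
    lock = threading.Lock()

    def station():
        while True:
            donor = waiting_area.get()
            if donor is None:      # sentinel: the drive is over
                break
            with lock:
                finished.append(donor)

    workers = [threading.Thread(target=station) for _ in range(stations)]
    for worker in workers:
        worker.start()
    for donor in donors:
        waiting_area.put(donor)
    for _ in workers:              # one sentinel per station
        waiting_area.put(None)
    for worker in workers:
        worker.join()
    return finished

done = run_blood_drive(["ann", "bob", "cat", "dan", "eve"])
print(sorted(done))
```

The queue is what makes the schedule forgiving: stations never care how many people showed up or when, they just take the next person.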

I believe that if you can get more than 30 people to show, they will probably come to you.  Visit their site for more information!

 

Back to the design phase

“It’s a dangerous business going out your front door.”
– Bilbo Baggins

VintagePost.app is out of control.  It *could* do so many things, and yet right now, it can’t actually do any of them.  Time to step back and stop coding, and reengage with the XD part of my brain.

I’ll probably make an inventory of all the pieces of technology I have “working” at some point, but first I have to channel my inner agilist.

Nearest neighbor for geo data

Given a latitude and longitude, I need to pull the nearest neighbors out of the database and return them in proximity order.  To get this to work, I use geohashes, boto, and SimpleDB.

To do the query, we need to compute a bounding box:

swhash = Geohash.encode(latitude-delta, longitude-delta)
nehash = Geohash.encode(latitude+delta, longitude+delta)

This produces two hash codes, one to the southwest of the origin and one to the northeast.  With a delta of 0.5 degrees, the two corners are about 138 km apart, in North America at least.  Geohash codes are great because they can be sorted in lexical order, so my boto query now looks like this:
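To see why the lexical ordering works, here is a hand-rolled geohash encoder. It is a sketch of the standard algorithm, not the Geohash library called above; `geohash_encode` is my own name. Each bit halves the longitude or latitude range in turn, so a point to the northeast of another never sorts before it.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet, in ASCII order

def geohash_encode(lat, lon, precision=12):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits, bit_count, even = 0, 0, True
    chars = []
    while len(chars) < precision:
        # Even bits refine longitude, odd bits refine latitude.
        rng, val = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid
        else:
            bits <<= 1
            rng[1] = mid
        even = not even
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

delta = 0.5
lat, lon = 25.728738, -80.236244
print(geohash_encode(lat - delta, lon - delta))
print(geohash_encode(lat, lon))
print(geohash_encode(lat + delta, lon + delta))
```

One caveat: the range between the two corner hashes can also contain points outside the box, so the range query is a coarse filter and the distance sort still matters.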

import boto

# Connect to SimpleDB and select the range between the two corner hashes.
sdb = boto.connect_sdb(AWS_ACCESS_KEY, AWS_SECRET_KEY)
domain = sdb.get_domain(PLACES_DOMAIN)
query = ("select * from `Places` where "
         "`geohash` >= '%s' and `geohash` <= '%s' LIMIT 10"
         % (swhash, nehash))
places = domain.select(query, max_items=10)

The places object is actually an iterator (many thanks to Chris Moyer for documenting this), but we want a list.  Note that the query is performed lazily, when you pull the first item from the iterator.

results = list(places)

The distance function is very simple:

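import math

def distance(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    R = 6371  # radius of Earth in km
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    # Spherical law of cosines; clamp so rounding error cannot break acos.
    c = (math.sin(lat1) * math.sin(lat2) +
         math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return math.acos(max(-1.0, min(1.0, c))) * R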

Finally, Python sorts the result:

results = sorted(results, key=lambda place: distance(orig,
    (float(place['latitude']), float(place['longitude']))))

This actually worked on the second try.

Notes: there are better distance functions, but this one works well and is quite fast.  I have not put in the error handling, or the back off code if you get too many or too few results.
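For the record, one of those better distance functions is the haversine formula, which behaves better for points that are very close together. A minimal sketch, assuming the same (latitude, longitude) tuples in degrees; `haversine_km` is my name for it, and it is not what the app currently uses.

```python
import math

def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2 +
         math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # min() guards asin against rounding error for near-antipodal points.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(h)))

# One degree of longitude along the equator.
print(haversine_km((0.0, 0.0), (0.0, 1.0)))
```

It can be dropped into the sort as the key function in place of distance.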

YAML > JSON > XML

My app gets images from various photo sharing sites.  After checking for copyright correctness, I download images onto my own machine for review and Photoshopping. Then I upload them into my S3/SimpleDB cloud storage.

The metadata I get back from the photo sharing sites is generally in JSON format, but I frequently want to edit this metadata and even though JSON is the bee’s knees, it’s not that great to edit, plus grepping for metadata kinda sucks too.

Enter YAML:

city: ''
country: ''
filtered: 0
geohash: dhwfq5vbb02h
id: '3077164973'
kind: scene
latitude: 25.728738
license: '4'
longitude: -80.236244
owner: 11211909@N00
ownername: corsi photo
state: ''
tags: miami floridawaterbeachskylineshorelineoceanbuildings
title: Miami shoreline
url_z: http://farm4.staticflickr.com/3149/3077164973_a31b4c6a6c_z.jpg
vp_id: flickr.scene.11211909@N00-3077164973

Exit JSON:

{'tags': 'miami floridawaterbeachskylineshorelineoceanbuildings', 'vp_id': 'flickr.scene.11211909@N00-3077164973', 'owner': '11211909@N00', 'id': '3077164973', 'city': '', 'kind': 'scene', 'url_z': 'http://farm4.staticflickr.com/3149/3077164973_a31b4c6a6c_z.jpg', 'license': '4', 'title': 'Miami shoreline', 'country': '', 'longitude': -80.236244, 'state': '', 'geohash': 'dhwfq5vbb02h', 'ownername': 'corsi photo', 'latitude': 25.728738, 'filtered': 0}

PyYAML even has type inferencing, so when I reload the data, it seems to do just what I want.
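Here is roughly what the round trip looks like. This is a sketch with a trimmed-down copy of the record above, assuming PyYAML is installed; my actual script may differ in the details.

```python
import yaml  # PyYAML, a third-party package

# A subset of the photo metadata shown above.
record = {
    "id": "3077164973",
    "kind": "scene",
    "latitude": 25.728738,
    "longitude": -80.236244,
    "filtered": 0,
    "city": "",
    "title": "Miami shoreline",
}

# Block style gives one key per line, which is easy to edit and grep.
text = yaml.safe_dump(record, default_flow_style=False)
print(text)

# Reloading recovers the original types: strings, floats, and ints.
reloaded = yaml.safe_load(text)
print(type(reloaded["id"]), type(reloaded["latitude"]), type(reloaded["filtered"]))
```

The id stays a string because PyYAML quotes it on the way out, which is why it shows up in quotes in the YAML above.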

Sandboxitude

Homebrew and Virtualenv go together like cognac and chocolate (YMMMV: your metaphorical mileage may vary!).

Because I was using MacPorts and was unaware of virtualenv, my upstairs Mac had become an absolute mess.  Finally, I removed all traces of MacPorts from the system.  I think I did, anyway :-)

Now, I am installing and uninstalling recipes, never once sudo’ing.

Even better is virtualenv.  Putting Python code into production is a breeze, since you’re already running in a sandbox, you’ve documented your dependencies, and Python is portable.  What a great world!