Can I code?

In short, yes. But I don’t often admit it.

My current “style” of coding, if style is really there at all, would have to be from the school of 1940s home and gadget design, which is to say: It’s blocky but functional, it’s thoughtful and never excessive, and its designs are heavily patterned and borrowed from the work of others.

So though I use code to get a lot done, I don’t write a lot of it from scratch, and don’t feel as though I ‘own’ it.

So I have a hard time showing off “my code.” But, a number of people over the past few months have asked to see samples of what I can do, so here goes:

Python

batch_youtube_dl: Python wrapper to read in lines of YouTube links from a text file, and call another script to download those videos.
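For anyone who doesn't want to click through, the general shape of such a wrapper can be sketched as follows. This is a minimal illustration and not the actual script: the function names are made up for this example, and it assumes the downloader is the `youtube-dl` command-line tool, called with one link per run.

```python
import subprocess


def read_links(path):
    """Return the non-empty, non-comment lines of a text file as a list of links."""
    links = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            link = line.strip()
            # Skip blank lines and lines commented out with '#'.
            if link and not link.startswith("#"):
                links.append(link)
    return links


def build_commands(links, downloader=("youtube-dl",)):
    """Build one command (as an argument list) per link.

    The default downloader is an assumption; pass a different tuple
    to call another script instead.
    """
    return [list(downloader) + [link] for link in links]


def download_all(path, downloader=("youtube-dl",)):
    """Run the downloader once per link, continuing past failures."""
    for command in build_commands(read_links(path), downloader):
        subprocess.run(command, check=False)
```

Calling `download_all("links.txt")` would then work through the file one link at a time. Keeping the command-building step separate makes it easy to check what will be run before anything is downloaded.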

The Japanese Eggplant Problem: A silly math problem I couldn’t figure out on paper, involving keying in the wrong grocery store produce codes. I finally solved it (I think) with this script.

Asteroids, sorta. (Hit the “Play” icon to execute the code.) This code is still live on the CodeSkulptor platform. This is my implementation of the final coding exercise for an introductory course on interactive Python programming. I’ve done a bunch of Actionscript interactive coding before, but this is the first time I’ve made games with Python.

R & R Markdown

Not much to show. I have tons of snippets of incomplete data analyses on my laptop, but they’re all notes and drafts, not anything presentable.

However, here are some steps from a data analysis/visualization assignment I did last month, on this html page, which was generated with the R-markdown knitter (It’s not prettified, but it’s there).

Octave

I did one of Andrew Ng’s offerings of “Introduction to Machine Learning” back in 2011. All the Octave implementations I came up with are sitting on a media server in another country. I’ll think about uploading them when I get home.

Ruby and Rails

I completed a movie-rating sample Rails app from a Coursera offering. I don’t really want to publish these files publicly, as I suspect the course will run again at some point, but I also need to show I’ve built upon a Ruby-Rails skeleton. I had this deployed to Heroku until recently. If you really want to see it for some reason, I’ll work something out.

Actionscript, inside Flash

Several (four) games/apps for the “Journey to a New Land” site, way back in 2004. I still like what I implemented here: XML-specified text, audio and images loaded at runtime (which makes things bilingual-compatible, fast to load, and easy to change with just edits to the XML file). You can’t see the code, though, unless you decompile the Flash files.
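Since the Flash source isn’t visible, here is a rough sketch of the same idea in Python rather than Actionscript. The XML layout, the element and attribute names, and the two languages are all hypothetical, invented for this example; the original files may look nothing like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical content file: each item carries text in two languages,
# plus optional image and audio file names to load at runtime.
SAMPLE_XML = """
<content>
  <item id="title">
    <text lang="en">Welcome</text>
    <text lang="fr">Bienvenue</text>
  </item>
  <item id="start_button" image="start.png" audio="start.mp3">
    <text lang="en">Start</text>
    <text lang="fr">Commencer</text>
  </item>
</content>
"""


def load_content(xml_text, lang):
    """Map each item id to its text in the requested language and its media files."""
    root = ET.fromstring(xml_text.strip())
    content = {}
    for item in root.findall("item"):
        label = None
        for text in item.findall("text"):
            if text.get("lang") == lang:
                label = text.text
                break
        content[item.get("id")] = {
            "text": label,
            "image": item.get("image"),  # None when the attribute is absent
            "audio": item.get("audio"),
        }
    return content
```

Switching languages is then a matter of calling `load_content(SAMPLE_XML, "fr")` instead of `"en"`, and changing a label or swapping an image only means editing the XML.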

Other

I mash scripts and tools together a lot. Here’s a slideshow of weekly crime maps that I converted from PDFs to jpgs using ImageMagick, and assembled with Soundslides. I can also do image manipulation in Photoshop, as I did with these satellite images of Beijing, to make them match up for a jQuery before-after slider.

This is only a start. I have other recent code work samples that I will try to post when I remember them and/or find them.

I think everything from pre-2004 has bit-rotted away somewhere, or at least the platforms that code would run on are now obsolete.

‘Crazy good’ resources and ‘crazy bad’ air

Last month, The New York Times wrote about how the U.S. Embassy in Beijing was measuring, recording, and tweeting air quality data, including some ‘crazy bad’ Beyond Index readings for PM2.5 particulates and ozone levels.

Today, a team of geo-media ninjas from Google released some satellite photography taken over Beijing around the time of those “Beyond Index” readings, on or about Jan. 11 – Jan. 12, depending on your time zone.

The Google ninjas linked to this tweet, for a time reference:

I massaged the images slightly, and put them together with “before” photos from March 28 (first image set) and September 14 (other two image sets) that the Google team also provided:

Now, I can’t say whether I’m seeing cloud cover or pollution, but if you trust the tweets put out by the U.S. Embassy that day, it doesn’t look good.

Not that things are fine and dandy either at home here, in Vancouver, B.C.

Continue reading

Playing with Python – the ‘Japanese Eggplant’ problem

I thought I would program in Ruby (because its guides are cuter) but after trying to implement some cryptography functions for an online course, I decided that Python had better muscles for the hex and numeric functions I wanted to do.

I didn’t really learn Python, however, until I recently began the Coursera course “An Introduction to Interactive Programming in Python.” The algorithms and logic/function design are fairly simple (I’ve coded before), but this is the first time I’ve implemented a Pong game, which seems hard for me to believe, and this is the first time in a long time (since 1998?) that I’ve felt comfortable doing work with a scripting language (as opposed to something that needs to be compiled).

Anyhow, now that Python is literally at my fingertips, I decided to solve a real-world problem that has been bothering me for a few months.

The Japanese Eggplant Problem, as I’ve named it, is probably just like one of those math problems on the “so you think you’re smart” high school math challenge sets. (I was never very good at those.) The problem is: How can I generate a series of codes for grocery produce, where a cashier’s typos are more likely to result in an invalid code, rather than resulting in the code for a different product?

Motivation (true story): the code 4601 is for Japanese eggplant, but I am purchasing lettuce, whose code is 4061. The cashier accidentally types it in as 4601, and suddenly I’m paying $8.27 for a head of lettuce that ought to cost $1.79.

Whoops.

I know that typos that swap two adjacent characters are fairly common (for example, writing “hte” or “teh” instead of “the”), so the Japanese eggplant question, reworded, is: how many 4-digit produce codes can there be such that no code turns into another when two neighbouring digits are swapped, and how can I generate one such list?
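To make the question concrete, here is a minimal greedy sketch. It is not necessarily the script from the full post, and the function names are invented for this example. It walks through every 4-digit code in order and keeps a code only if none of its neighbouring-digit swaps has already been kept. Being greedy, it produces one valid list, with no claim that it is the longest possible one.

```python
def adjacent_swaps(code):
    """Return the distinct codes made by swapping one pair of neighbouring digits."""
    swaps = set()
    for i in range(len(code) - 1):
        # Swapping two identical digits changes nothing, so skip those.
        if code[i] != code[i + 1]:
            swaps.add(code[:i] + code[i + 1] + code[i] + code[i + 2:])
    return swaps


def typo_safe_codes(digits=4):
    """Greedily collect codes so that no kept code is a neighbour-swap of another."""
    chosen = set()
    for n in range(10 ** digits):
        code = str(n).zfill(digits)
        # Keep the code only if none of its swap variants is already taken.
        if chosen.isdisjoint(adjacent_swaps(code)):
            chosen.add(code)
    return sorted(chosen)
```

With a list built this way, the lettuce and eggplant codes from the story (4061 and 4601) could never both be in use, so that particular slip of the fingers would land on an invalid code rather than on a different vegetable.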

Continue reading

March Madness – Part 1: Examining the B.C. government’s purchasing cards spending

I became aware of a few new DataBC data sets a few weeks ago, consisting of various spreadsheets of government purchasing card expenditures stretching from the fiscal year ending in 2007 to the fiscal year ending in 2010. (The fiscal years end March 31, hence the mad rush to spend, spend, spend…)

But it was only a week ago that I decided to take a closer look.

I was reading Stephen Quinn’s Globe and Mail column — the entry about the non-renaming of BC Place — and one of the comments mentioned a local political blog, which I checked out and which led to a link to the most recent update (FYE11) in the data set of purchasing cards spending.

It was a PDF.

Now, nothing riles an open-data / data-journalist type more* than tabulated information being presented as a PDF, so I attacked it like the proverbial starving Chihuahua on a pork chop. Continue reading

#NICAR – A yearly dose of journo/nerding motivation

I went to the Investigative Reporters and Editors (IRE) annual Computer-Assisted Reporting (CAR) conference again. This is my third. I can’t believe I was almost considering not going. Besides seeing a few familiar faces again and meeting people who I’d only been following on Twitter since last year’s conference, I managed to leave with that inner “oomf” feeling a little restored.

I want to program. I want to do analysis. And I want to report… what the data says to me.

I have the skills to do it all, and I am also realizing that it’s not the end of the world if I don’t have a physical newsroom team right now to do it in. This extended network of journo-nerds is like a virtual newsroom team.

For posterity, and so I could better explain the conference to my spouse, I used Storify and collected images people created during #NICAR12. Enjoy it below. Continue reading

The Bastards Book and @ScanBC tweets

I decided this year to get accustomed to both Python and Ruby and spent quite some time getting my 10.5.8 Leopard Mac working with Xcode and pip and all sorts of Terminal-coding-environment things that were quite buggy, as it turns out. (Whoever recommended that I use MacPorts last year: Shame on you.)

Now that I have nice virtualenv Python instances and tidy little rbenv Ruby silos, it’s all fun and games. I install libraries into my selected builds, and it works. So I’m finally able to run sample code without it throwing a whole bunch of errors about how outdated my environment is. (So embarrassing!)

I’ve just started on Dan Nguyen’s The Bastards Book of Ruby, where he actually encourages cut-and-run in chapter 2, on ‘Tweet Fetching’, perhaps just to get you excited about coding. (It runs! It does stuff! I’m hacking Twitter!)

But yeah, I mean, it is fun.

When the exercises turned to downloading and running some simple stats on sets of tweets, I decided to use something that is much more interesting than my own feed: @ScanBC.

ScanBC is an online community in British Columbia of people who like to monitor emergency radio communications* — and in some cases, hook up and stream live feeds over the Internet.

Occasionally, someone from within that community tweets out what is overheard… Continue reading

10 promising technologies for journalists at SIGGRAPH 2011

I know that the SIGGRAPH computer graphics conference is cool beans in the computing world, which I used to live in.

If you’ve seen any animated feature this year, or any CG-heavy action feature, or played a video game that was made in North America, chances are that key figures from the animation production crew were present, or presenting, at this video game and film industry gathering last week in Vancouver.

Above: The floor. Below: VanArts clearly wins the booth bunny contest.

Some journalists were there covering the people, companies, and demos themselves.

I, without a media pass, went on meta-recon to see what goodies might be handy in the actual process of news gathering or presentation.

Though I only went with a basic pass (no papers, no in-depth talks, no animation festival), I came up with this list of 10 promising, or inspiring, technologies for present and future journalists:

  1. 3D animated explainers
  2. 3D models from 2D images
  3. Mapping the indoors
  4. Open 3D web standards
  5. On-demand 3D printing
  6. Workstations in the cloud
  7. 3D mice
  8. Cameras that see around corners
  9. Clustering algorithms for images
  10. Augmented reality (AR)

Read below if you want to know more about each one, as I saw it, at SIGGRAPH 2011 in Vancouver. Continue reading

MoJo final project: Be anywhere, be everywhere with ‘Proof’

Download this document as a PDF.

Software product proposal: Proof

Be anywhere, be everywhere.

Proof is a peer-to-peer network of world-wide webcams.

For Proof, anything can be considered a webcam, as long as it is able to take and transmit still or video footage to a server.

Proof is a platform that enables newsrooms to quickly source images from breaking news when it seems unlikely that regular staff or freelancers will be able to make it to the scene in time.

When a user wishes to commission some visuals, he or she logs in to Proof and submits the assignment’s details and pay. The Proof back-end then grinds away in search of registered users in the assignment’s geographic area (either via GPS or cell tower proximity). Proof spits out assignment notifications to the nearest parties. When one of those ‘Proofers’ decides to take on the task, the assigning user gets one last opportunity to vet the freelancer. As with systems such as eBay, Proof collects user feedback and assigns user ratings.

Once an assignment ‘handshake’ has occurred, Proof deducts the listed pay plus a small transaction fee from the user and holds it until the assignment is either completed or cancelled.
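The dispatch and escrow steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Proofer` record, the search radius, the notification limit and the fee rate are all hypothetical values chosen for the example, and proximity is approximated with a great-circle (haversine) distance between GPS coordinates.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass
class Proofer:
    """A registered user with a last-known position (hypothetical record)."""
    name: str
    lat: float
    lon: float
    rating: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_proofers(proofers, lat, lon, radius_km=5.0, limit=3):
    """Return up to `limit` Proofers within `radius_km`, closest first."""
    in_range = []
    for proofer in proofers:
        distance = haversine_km(lat, lon, proofer.lat, proofer.lon)
        if distance <= radius_km:
            in_range.append((distance, proofer))
    in_range.sort(key=lambda pair: pair[0])
    return [proofer for _, proofer in in_range[:limit]]


def escrow_hold(pay_cents, fee_basis_points=500):
    """Amount held after the handshake: listed pay plus the transaction fee.

    The 5% default (500 basis points) is an assumed figure for illustration.
    """
    fee_cents = pay_cents * fee_basis_points // 10000
    return pay_cents + fee_cents
```

Working in whole cents and basis points keeps the escrow arithmetic in integers, which avoids rounding surprises when the held amount is later released or refunded.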

Figure 1, below, is a rendering of a possible network configuration for Proof.

Figure 1: Proof, consisting of file transfer servers, transaction servers, and geolocation servers, tied together by a ‘dispatch’ application server and accessed by users through a web server presentation layer. Continue reading

MoJo week 3 assignment: Thoughts on collaboration

This week, my reflection on the Knight-Mozilla learning lab is meant to answer a specific question:

Keeping in mind the objectives and challenges identified in this week’s presentations by Shazna Nessa and Mohamed Nanabhay, how does your project take into account the need to facilitate collaboration in the newsroom (whether real or virtual), while acknowledging that team members will have varying technological skill sets?

And I can’t answer it.

Why? Well, unfortunately, the assumption behind that question is a problem. In three points, this is why:

  1. My project entails contract webcams: a vast army of unrelated, autonomous contract webcams.*
  2. This would be really helpful in the job I do right now, in a newsroom, finding and posting news stories online.
  3. However, I do not intend for the “contract webcams” project to center around a newsroom.

If all of what we call “reporters” and “editors” and “producers” suddenly became billionaires and quit and all remaining infrastructure in what we call “newsrooms” spontaneously combusted, I would hope that my application would barely register the blip.

Why am I so anti-reporter and anti-newsroom? Continue reading

MoJo week 3: The Programmer-Journalist Polarization

In their presentations to the Knight-Mozilla Journalism Learning Lab, Shazna Nessa and Mohamed Nanabhay described hiccups in their drive to bring better tech to their organizations (AP and Al Jazeera, respectively).

I keep hearing this common refrain: Cultural change within journalism institutions is the biggest barrier to adopting new technology and adapting to new and changing audiences.

And that stasis really has been the bane of my existence as a wannabe data-journalist*; I need to be in a newsroom to be taken seriously by sources (data-holders), and yet within the newsroom I don’t have the time, resources, permission, buy-in or access to do much, if anything, with the data.

Continue reading