GE flight quest — flight routes

6 12 2012

I figured a prediction model would need to be tied to a specific route, so route grouping it is… and mapping, why not. This is the reference data for Nov 20, 2012. Not all flights are included; otherwise the map would become too cluttered.

GE flight quest — airports

2 12 2012

First exploration of the data: I saved all airport locations (departure and arrival combined, at least for those that have an ICAO code, with coordinates from the airport data) in a KML file (get the link URL and open it in Google Maps). It contains 511 airports spread over the whole US mainland.
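For the curious, here is a minimal sketch of how such a KML file could be written with nothing but the Python standard library. The function name `airports_to_kml` and the two sample airports (with approximate coordinates) are my own placeholders for illustration, not the actual script or data behind the map.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def airports_to_kml(airports):
    """Build a KML document string from (icao, lat, lon) tuples.

    Hypothetical helper for illustration only.
    """
    # Register the KML namespace as the default so tags are written unprefixed.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for icao, lat, lon in airports:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = icao
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Two sample airports with approximate coordinates (illustrative only).
sample = [("KSFO", 37.619, -122.375), ("KJFK", 40.640, -73.779)]
kml_text = airports_to_kml(sample)
```

Writing `kml_text` to a file ending in `.kml` should give something a map viewer can import; the longitude-before-latitude order is the part that is easiest to get wrong.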

GE Flight Quest challenge page here.

Sarah Jessica Parker

14 11 2012

OK, this post actually has very little to do with the actress. It’s just that there’s this person on YouTube who replied to my comment on a Doctor Who clip, which was meant as a reference to the Sarah Jessica Parker horse joke. He/she said that “Sarah” or “Jessica” was not a common name in Wild West times. Well, I don’t want to start a fight with him/her; it’s just that, interestingly, I’d been playing around with historical American baby name data, so I’m really tempted to figure out how popular those names were.

First, as background: the Doctor Who episode in question is called “A Town Called Mercy”, which aired in September this year. The episode features the Doctor going back to the Wild West, with cowboys and stuff. In one scene the Doctor, claiming that he speaks horse, contradicts a preacher, saying that the horse he’s about to ride is called Susan, not Joshua as the preacher had claimed.

There was no mention of the year in which the story takes place, so I can only infer it from a dialogue between the Doctor and Rory:

The Doctor: That’s not right.

Rory: It’s a street lamp.

The Doctor: An electric street lamp about ten years too early.

Rory: That’s only a few years out.

The Doctor: That’s what you said when you left your phone charger in Henry VIII’s own suite.

Given that Thomas Edison invented the electric light bulb in 1879, the story must take place around the 1860s or 1870s. My dataset starts in 1880, so it’s actually not so far off.

So I went ahead and processed the data with pandas to get the percentage of people named “Sarah” over time, relative to the whole American population:

Hmm, in 1880 the name “Sarah” accounted for about 1.3 percent of the overall population. As we will see later, this is actually quite high. Maybe that’s not so surprising, because Sarah is sort of a biblical name (well, I guess; at least it has some religious flavour in Islam, and I suppose it’s pretty much the same in the Bible). Extrapolating back in time from the graph, it could well have been even higher before 1880. So that kid’s comment is invalid! Sarah is a popular name…

By the way, I don’t differentiate whether the name is a boy’s or a girl’s (the dataset actually does). I just sum up the counts for both, as the combined number is the only one that matters for my analysis.
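The calculation described above can be sketched roughly like this. I’m assuming a table with one row per (year, name, sex) and a count column; the column names, the helper `name_share` and the toy numbers are all made up for illustration and are not the real statistics. The percentage here is a share of the recorded counts for each year, which I treat as a stand-in for the population.

```python
import pandas as pd

# Toy rows in the assumed shape: one row per (year, name, sex) with a count.
# These counts are invented for illustration; they are NOT real statistics.
births = pd.DataFrame(
    {
        "year": [1880, 1880, 1880, 1880, 1881, 1881, 1881],
        "name": ["Sarah", "Sarah", "Mary", "John", "Sarah", "Mary", "John"],
        "sex": ["F", "M", "F", "M", "F", "F", "M"],
        "births": [30, 10, 100, 60, 20, 90, 90],
    }
)


def name_share(df, name):
    """Percentage of each year's total count carrying `name`, both sexes summed."""
    total = df.groupby("year")["births"].sum()
    named = df[df["name"] == name].groupby("year")["births"].sum()
    # Years in which the name does not appear at all get a share of 0.
    named = named.reindex(total.index, fill_value=0)
    return (100 * named / total).rename(name)


sarah = name_share(births, "Sarah")
```

With matplotlib installed, `sarah.plot()` should draw the kind of curve shown in this post. Grouping by year only, without the sex column, is what merges the boy and girl counts.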

Then another plot for “Jessica”:

All right, “Jessica” seems to be a modern-world phenomenon. It gained some popularity in the 1960s, peaked in the late 1980s, and has been losing popularity ever since. Now, “Parker”:

Hmmm… the name “Parker” is an even more recent phenomenon. What I find really interesting is the glitch slightly after the year 2000 and the continuing popularity of this name. Go ahead, let your imagination run free and relate this to the release year of Spider-Man the movie… (Peter Parker, that is. You’re welcome.)

Now, about the relative proportion of the name “Sarah”; the following plot is a segment of the first plot, between 1880 and 1890, overlaid on the average proportion of all names in each year:

Here’s what it means: any (boy or girl) name between 1880 and 1890 accounts on average for only about 0.09 percent of the population. With 1.3 percent, “Sarah” is actually quite popular…
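The “average proportion of all names” baseline can be sketched the same way: compute every name’s share within its year, then average those shares per year. As before, the column names, the helper `average_name_share` and the toy counts are assumptions for illustration, not the real data.

```python
import pandas as pd

# Toy rows, same assumed shape as the dataset (counts are invented).
births = pd.DataFrame(
    {
        "year": [1880, 1880, 1880, 1880, 1881, 1881, 1881, 1881],
        "name": ["Sarah", "Sarah", "Mary", "John", "Sarah", "Mary", "John", "Anna"],
        "sex": ["F", "M", "F", "M", "F", "F", "M", "F"],
        "births": [30, 10, 100, 60, 20, 80, 60, 40],
    }
)


def average_name_share(df):
    """Mean percentage share per distinct name, for each year (sexes summed)."""
    # Collapse boy and girl rows into one count per (year, name).
    per_name = df.groupby(["year", "name"], as_index=False)["births"].sum()
    # Each name's share of its own year's total, in percent.
    year_total = per_name.groupby("year")["births"].transform("sum")
    per_name["share"] = 100 * per_name["births"] / year_total
    return per_name.groupby("year")["share"].mean()


baseline = average_name_share(births)
```

Since the shares within a year add up to 100, this average is simply 100 divided by the number of distinct names in that year, so it mostly tells you how many different names were in use.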

What can we learn from this? If a horse in 1870 claims that her name is Sarah, we really should believe it. If a horse today claims that her name is Sarah Jessica Parker, I think that’s quite possible as well.

UPDATE. The name “Susan” was actually less popular than “Sarah” in 1880.