Weekend bookmark

12 11 2012

…full of data analysis and machine learning stuff…

  • Google Refine, a power tool for working with messy data. Nice, but kinda slow when I loaded the US presidential candidate donation data from Wes McKinney’s pandas tutorial (ca. 500,000 lines; I use an MBA with 4 GB of RAM, and ca. 1 GB was available when loading the data into Refine…)
  • scikit-learn, machine learning in Python. A MUST TRY.
  • mrjob, Yelp’s open-source MapReduce package for Python.
  • dumbo, another Python MapReduce package. Not sure why Yelp created a new library (mrjob, that is) for the same purpose…
  • Nominatim, a kinda nice tool for getting (latitude, longitude) coordinates from addresses, or vice versa.
  • Seven Python libraries you should know about…
  • and, what seems to be the most exciting so far: Ramp, rapid machine learning prototyping, essentially a pandas wrapper around Python’s various machine learning and statistics libraries (scikit-learn, rpy2, etc.). 

A time full of excitement awaits us, folks…