I was trying to use Hadoop's streaming pattern to use python code on a largish data set. However, uploading my data to the cluster actually takes approximately forever (I've not yet actually succeeded ...