[R-sig-hpc] What to Experiment With?

ivo welch ivowel at gmail.com
Sat Apr 21 00:16:47 CEST 2012


Dear R HPC experts:

I have about $5,000 to spend on building fast computer hardware to run our problems. If it works well, I may be able to scrounge up another $10k/year to scale it up. I do not have the resources to program very complex algorithms, administer a full cluster, etc. (The effective programmer's rate here is about $50/hour and up, and I have severe restrictions against hiring outsiders.) The programs basically have to work with minimal special tweaking.

There are no real-time needs. Typically, I operate on historical CRSP and Compustat data, which are about 1-5GB (depending on subset). Most of what I am doing involves linear regressions. I often need to calculate Newey-West/Hansen-Hodrick/White adjusted standard errors, and I often need to sort and rank and to calculate means and covariances. These are not highly sophisticated statistics, but there are a lot of them to compute. Most of what I do is embarrassingly parallel.
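To make the workload concrete, here is a minimal sketch of one such task: an OLS regression with Newey-West (Bartlett-kernel) standard errors. It is written in Python with NumPy purely for illustration and is not my actual code; the function name `ols_newey_west`, the synthetic data, and the choice of 4 lags are all assumptions, and no small-sample correction is applied. With `lags=0` it reduces to White's heteroskedasticity-consistent errors.

```python
import numpy as np

def ols_newey_west(y, X, lags):
    """OLS coefficients with Newey-West HAC standard errors.

    y    : (n,) response
    X    : (n, k) regressors (include a column of ones for an intercept)
    lags : truncation lag L >= 0; L = 0 gives White (HC0) standard errors
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    Xu = X * resid[:, None]              # row t is x_t * u_t
    S = Xu.T @ Xu                        # lag-0 (White) term
    for lag in range(1, lags + 1):
        w = 1.0 - lag / (lags + 1.0)     # Bartlett weight
        G = Xu[lag:].T @ Xu[:-lag]       # sum over t of (x_t u_t)(x_{t-lag} u_{t-lag})'
        S += w * (G + G.T)

    XtX_inv = np.linalg.inv(X.T @ X)
    cov = XtX_inv @ S @ XtX_inv          # sandwich estimator
    return beta, np.sqrt(np.diag(cov))

if __name__ == "__main__":
    # Synthetic data only; nothing here comes from CRSP or Compustat.
    rng = np.random.default_rng(0)
    x = rng.normal(size=500)
    X = np.column_stack([np.ones_like(x), x])
    y = 1.0 + 2.0 * x + rng.normal(size=500)
    beta, se = ols_newey_west(y, X, lags=4)
    print("coefficients:", beta)
    print("Newey-West standard errors:", se)
```

Each regression of this kind depends only on its own slice of the data (one firm, one month, one portfolio), so many of them can run side by side with no communication between them. That is what I mean by embarrassingly parallel.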

Now, I think in the $5k price range, I have a couple of options. Roughly, the landscape seems to be:

I would presume that an internal PCI bus is a lot faster than an Ethernet network, and a GPU could be faster than a CPU, but a GPU is also less flexible. Sigh... not sure. What should I try?
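As a rough check on the bus-versus-network presumption, here is a back-of-envelope sketch in Python. It assumes gigabit Ethernet (1 Gbit/s) and a PCI Express 2.0 x16 slot (500 MB/s per lane); neither piece of hardware is specified above, and both figures are nominal peak rates, so sustained throughput would be lower in practice.

```python
# Nominal peak bandwidths (assumed hardware, not measurements).
GIGABIT_ETHERNET_BYTES_PER_S = 1e9 / 8      # 1 Gbit/s = 125 MB/s
PCIE2_X16_BYTES_PER_S = 16 * 500e6          # 16 lanes * 500 MB/s = 8 GB/s

def transfer_seconds(n_bytes, bytes_per_s):
    """Idealized time to move n_bytes at a constant rate, ignoring latency."""
    return n_bytes / bytes_per_s

data_bytes = 5e9  # upper end of the 1-5GB data sets mentioned above

eth_s = transfer_seconds(data_bytes, GIGABIT_ETHERNET_BYTES_PER_S)
pcie_s = transfer_seconds(data_bytes, PCIE2_X16_BYTES_PER_S)

print(f"gigabit Ethernet: {eth_s:.3f} s")
print(f"PCIe 2.0 x16:     {pcie_s:.3f} s")
print(f"ratio:            {eth_s / pcie_s:.0f}x")
```

Under these assumptions the gap is large on paper, but it only matters if the data has to be moved repeatedly rather than loaded once per node.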

/iaw


Ivo Welch (ivo.welch at gmail.com)


