[Numpy-discussion] distance matrix speed

Michael Sorich michael.sorich at gmail.com
Fri Jun 16 02:26:37 EDT 2006


Hi Sebastian,

I am not sure whether there is a function already defined in numpy, but something like this may be what you are after:

    from numpy import sqrt, sum, newaxis

    def distance(a1, a2):
        return sqrt(sum((a1[:,newaxis,:] - a2[newaxis,:,:])**2, axis=2))

The general idea is to avoid loops if you want the code to execute fast. I hope this helps.
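
To make the idea concrete, here is a minimal sketch of how the broadcasting works, checked against an explicit double loop. It assumes a current NumPy imported as np; the name distance_loop and the use of default_rng are my own choices for illustration, not anything from the original code.

```python
import numpy as np

def distance(a1, a2):
    # a1 has shape (m, k) and a2 has shape (n, k).
    # Inserting new axes gives shapes (m, 1, k) and (1, n, k),
    # which broadcast to a (m, n, k) array of coordinate differences.
    diff = a1[:, np.newaxis, :] - a2[np.newaxis, :, :]
    # Summing the squares over the last axis leaves an (m, n) matrix.
    return np.sqrt(np.sum(diff**2, axis=2))

def distance_loop(a1, a2):
    # Hypothetical reference version: one pair of rows at a time.
    d = np.zeros((a1.shape[0], a2.shape[0]))
    for i in range(a1.shape[0]):
        for j in range(a2.shape[0]):
            d[i, j] = np.sqrt(np.sum((a1[i] - a2[j])**2))
    return d

rng = np.random.default_rng(0)
A = rng.random((4, 2))
B = rng.random((1000, 2))

D = distance(A, B)
print(D.shape)                               # (4, 1000)
print(np.allclose(D, distance_loop(A, B)))   # True
```

Note that the broadcast version builds a temporary array of shape (m, n, k), so it trades memory for speed; for very large inputs that intermediate may become the limiting factor.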

Mike

On 6/16/06, Sebastian Beca <sebastian.beca at gmail.com> wrote:

Hi, I'm working with NumPy/SciPy on some algorithms and I've run into some significant speed differences with respect to Matlab 7. I've narrowed the main speed problem down to finding the Euclidean distance between two matrices that share one dimension (dist in Matlab):

Python:

    from numpy import zeros, sqrt, sum
    from numpy.random import random

    def dtest():
        A = random( [4,2])
        B = random( [1000,2])

        d = zeros([4, 1000], dtype='f')
        for i in range(4):
            for j in range(1000):
                d[i, j] = sqrt( sum( (A[i] - B[j])**2 ) )
        return d

Matlab:

    A = rand( [4,2])
    B = rand( [1000,2])
    d = dist(A, B')

Running both of these 100 times, I've found the Python version to run between 10 and 20 times slower. My question is whether there is a faster way to do this. Perhaps I'm not using the correct functions/structures? Or is this as good as it gets?

Thanks in advance,

Sebastian Beca
Department of Computer Science Engineering
University of Chile

PS: I'm using NumPy 0.9.8, SciPy 0.4.8. I also understand I have ATLAS, BLAS and LAPACK all installed, but I haven't confirmed that.


Numpy-discussion mailing list
Numpy-discussion at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion


