[Numpy-discussion] Distance Matrix speed

Sebastian Beca sebastian.beca at gmail.com
Sun Jun 18 18:49:27 EDT 2006


I checked the Matlab version's code and it does the same as discussed here. The only thing to check is that you loop over the shorter dimension of the output array. Speed-wise, the Matlab code still runs about twice as fast for large data sets (timed by hand). Nevertheless, the improvement over calculating each value individually, as in d1, is significant (10-300 times) and enough for my needs. Thanks to all.

Sebastian Beca

PS: I also tried the d5 version Alex sent, but its results are not the same, so I couldn't compare.

My final version was:

from numpy import *  # assumed: zeros, sqrt, sum and random are used unqualified

K = 10
C = 3
N = 2500  # One could switch around C and N now.
A = random.random([N, K])
B = random.random([C, K])

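def dist():
    d = zeros([N, C], dtype=float)
    if N < C:
        # Fewer rows than columns: fill the output one row at a time.
        for i in range(N):
            xy = A[i] - B
            d[i,:] = sqrt(sum(xy**2, axis=1))
        return d
    else:
        # Fewer columns than rows: fill the output one column at a time.
        for j in range(C):
            xy = A - B[j]
            d[:,j] = sqrt(sum(xy**2, axis=1))
        return d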
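
For readers who want to run the final version on its own, here is a minimal sketch of the same idea: loop over whichever dimension of the output is shorter. The function name `dist_matrix`, the explicit `import numpy as np`, the seeded random generator, and passing A and B as arguments instead of globals are all assumptions made for illustration; they are not from the thread. The loop-free broadcast at the end is only a cross-check, not something proposed above.

```python
import numpy as np

def dist_matrix(A, B):
    """Euclidean distances between rows of A (N x K) and rows of B (C x K)."""
    N, C = A.shape[0], B.shape[0]
    d = np.zeros((N, C), dtype=float)
    if N < C:
        # Fewer rows than columns: one vectorized pass per row of the output.
        for i in range(N):
            xy = A[i] - B
            d[i, :] = np.sqrt(np.sum(xy ** 2, axis=1))
    else:
        # Fewer columns than rows: one vectorized pass per column.
        for j in range(C):
            xy = A - B[j]
            d[:, j] = np.sqrt(np.sum(xy ** 2, axis=1))
    return d

# Same sizes as in the message: K = 10, C = 3, N = 2500.
rng = np.random.default_rng(0)
A = rng.random((2500, 10))
B = rng.random((3, 10))
d = dist_matrix(A, B)

# Cross-check with a single broadcast expression. This builds an
# N x C x K temporary, so it trades memory for the absence of a loop.
d_ref = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))
print(d.shape, np.allclose(d, d_ref))
```

Swapping the arguments, `dist_matrix(B, A)`, takes the other branch and should give the transpose of the same matrix.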

On 6/17/06, Johannes Loehnert <a.u.r.e.l.i.a.n at gmx.net> wrote:

Hi,

> def d4():
>     d = zeros([4, 1000], dtype=float)
>     for i in range(4):
>         xy = A[i] - B
>         d[i] = sqrt( sum(xy**2, axis=1) )
>     return d
>
> Maybe there's another alternative to d4?
> Thanks again,

I think this is the fastest you can get. Maybe it would be nicer to use the .sum() method instead of sum function, but that is just my personal opinion.

I am curious how this compares to the matlab version. :)

Johannes
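
The `.sum()` suggestion can be sketched as follows. The array sizes follow the `zeros([4, 1000])` in the quoted d4; the value K = 10, the seeded generator, and the names `d4_function` and `d4_method` are assumptions made for illustration. One practical difference: the method form does not depend on which `sum` is in scope, whereas the bare `sum(xy**2, axis=1)` only works when `sum` is NumPy's (for example after `from numpy import *`), because Python's built-in `sum` takes no `axis` argument.

```python
import numpy as np

K = 10  # assumed number of columns, not stated in the quoted snippet
rng = np.random.default_rng(0)
A = rng.random((4, K))
B = rng.random((1000, K))

def d4_function():
    """d4 using the sum function, as in the quoted code."""
    d = np.zeros((4, 1000), dtype=float)
    for i in range(4):
        xy = A[i] - B
        d[i] = np.sqrt(np.sum(xy ** 2, axis=1))
    return d

def d4_method():
    """d4 using the array's .sum() method instead."""
    d = np.zeros((4, 1000), dtype=float)
    for i in range(4):
        xy = A[i] - B
        d[i] = np.sqrt((xy ** 2).sum(axis=1))
    return d

print(np.allclose(d4_function(), d4_method()))
```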


Numpy-discussion mailing list
Numpy-discussion at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion


