python - Parallelize distance calculation method with multiprocessing


This question is related to another one I posted a few days ago. I have read this question about the issue of multiprocessing pickling instance methods. The problem is that I did not understand how to apply the solution provided to my case:

    import copy_reg
    import types

    import numpy as n
    import scipy.spatial.distance as dist


    def _pickle_method(method):
        # author: Steven Bethard
        # http://bytes.com/topic/python/answers/552476-why-cant-you-pickle-instancemethods
        func_name = method.im_func.__name__
        obj = method.im_self
        cls = method.im_class
        cls_name = ''
        # Private methods are name-mangled and stored as _ClassName__method.
        if func_name.startswith('__') and not func_name.endswith('__'):
            cls_name = cls.__name__.lstrip('_')
        if cls_name:
            func_name = '_' + cls_name + func_name
        return _unpickle_method, (func_name, obj, cls)


    def _unpickle_method(func_name, obj, cls):
        # author: Steven Bethard
        # http://bytes.com/topic/python/answers/552476-why-cant-you-pickle-instancemethods
        # Walk the MRO to find the class that actually defines the method.
        for cls in cls.mro():
            try:
                func = cls.__dict__[func_name]
            except KeyError:
                pass
            else:
                break
        return func.__get__(obj, cls)


    copy_reg.pickle(types.MethodType, _pickle_method, _unpickle_method)


    class circle(feature):
        # stuff...
        def __points_distance(self, points):
            xa = n.array([self.xc, self.yc]).reshape((1, 2))
            d = n.abs(dist.cdist(points, xa) - self.radius)
            return d

        def points_distance(self, points, pool=None):
            if pool:
                return pool.map(self.__points_distance, points)
            else:
                return self.__points_distance(points)

This gives a ValueError: XA must be a 2-dimensional array error when running this:

    import tra.features as fts
    import numpy as np
    import multiprocessing as mp

    points = np.random.random(size=(1000, 2))
    circle_points = np.random.random(size=(3, 2))

    feature = fts.circle(circle_points)

    pool = mp.Pool()
    ds = feature.points_distance(points, pool=pool)

But it (obviously) works when doing:

    pool = None
    ds = feature.points_distance(points, pool=pool)

Any clues?

This is different from this (I checked this implementation) because the method is used inside another class that instantiates the circle class and calls its points_distance method. In any case, the difference is that the points_distance method uses scipy.spatial.distance.cdist, which expects an (n,2)-shaped numpy.ndarray. It works when using the serial version but raises the exception mentioned above when used in parallel. I suppose there is some caveat in how the arguments are passed through cPickle.

The points array you pass to pool.map has a shape of (1000, 2). When pool.map splits it up to pass as the points argument to __points_distance, each piece is a single row with a shape of (2,).
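The shape change can be checked without any worker processes. The sketch below is only an illustration: it uses the built-in map as a stand-in for pool.map, since both hand the function one element of the iterable at a time, and shape_of is a hypothetical helper that is not part of the question's code.

```python
import numpy as np

points = np.random.random(size=(1000, 2))


def shape_of(item):
    # Hypothetical helper: report the shape of whatever the map call hands over.
    return item.shape


# Iterating over a 2-D array yields its rows, so every item is 1-D.
shapes = set(map(shape_of, points))

print(points.shape)  # shape of the full array
print(shapes)        # distinct shapes seen by the mapped function
```

Every item the mapped function receives is one row, which is why cdist complains that its first argument is not 2-dimensional.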

Try adding points.shape = (1, 2) to the body of __points_distance before the call to cdist.
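A minimal sketch of that fix, under some assumptions: it uses a plain module-level function instead of the circle method, CENTER and RADIUS are made-up values, and the built-in map again stands in for pool.map. It also swaps the in-place points.shape = (1, 2) for reshape((-1, 2)), which is a variation on the suggestion above, so that the same function accepts both a single row and the full (n, 2) array used by the serial path.

```python
import numpy as np
from scipy.spatial import distance as dist

# Hypothetical circle parameters, chosen only for illustration.
CENTER = np.array([0.5, 0.5]).reshape((1, 2))
RADIUS = 0.25


def points_distance(points):
    # Force an explicit 2-D shape: a (2,) row becomes (1, 2),
    # while an (n, 2) array keeps its shape.
    points = np.asarray(points).reshape((-1, 2))
    return np.abs(dist.cdist(points, CENTER) - RADIUS)


points = np.random.random(size=(1000, 2))

# Serial path: one call on the whole array, result has shape (1000, 1).
serial = points_distance(points)

# Per-row path: one call per row, as a pool's map would do.
per_row = list(map(points_distance, points))

# Each per-row result has shape (1, 1); stack them back into (1000, 1).
stacked = np.vstack(per_row)
```

Note that the mapped version returns a list with one small array per point, so the results need to be stacked before they are comparable with the serial output.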

