Vectorized shapely operations using Cython

Example on single objects

Let's create a Point and a Polygon object:

In [1]:
from shapely.geometry import Point, Polygon
In [2]:
point = Point(50, 50)
point
Out[2]:
In [3]:
poly = Polygon([(10,10), (10,100), (100,100), (100, 10)])
poly
Out[3]:

A typical spatial predicate operation is to check whether the point is contained in the polygon:

In [4]:
poly.contains(point)
Out[4]:
True

Working with many objects

However, often we do not have some single objects, but an array (or a GeoSeries) full of geometries.

For the test example, create an array of 10,000 Points:

In [5]:
a = np.array([Point(i, i) for i in range(10000)], dtype=object)
In [6]:
a
Out[6]:
array([<shapely.geometry.point.Point object at 0x7f309c7d76a0>,
       <shapely.geometry.point.Point object at 0x7f309c7d7518>,
       <shapely.geometry.point.Point object at 0x7f309c7d7588>, ...,
       <shapely.geometry.point.Point object at 0x7f309c42eeb8>,
       <shapely.geometry.point.Point object at 0x7f309c42eef0>,
       <shapely.geometry.point.Point object at 0x7f309c42ef28>], dtype=object)

Python function for vectorized contains

If we now want to do the same 'contains' operation but with the full array of points, we can write the following function looping over the shapely objects:

In [7]:
def contains_py(array, poly):
    return np.array([poly.contains(p) for p in array])
In [8]:
res1 = contains_py(a, poly)
res1
Out[8]:
array([False, False, False, ..., False, False, False], dtype=bool)
In [9]:
res1.sum()
Out[9]:
89

And let's time this function:

In [10]:
%timeit contains_py(a, poly)
49 ms ± 1.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Cython function for "vectorized contains": using cython to do the for loop in C

The main performance problem with this vectorized 'contains' function is that the for loop happens in python, making this slow (and this is also how geopandas does it under the hood).
But, the actual 'contains' operation happens in C (by calling the GEOS library, wrapped by shapely). So let's try to make use of this fact by doing the for loop in Cython and calling the C GEOS function directly from there, without the overhead of switching to Python.

In [11]:
!geos-config --version
3.5.1
In [12]:
!geos-config --cflags
-I/home/joris/miniconda3/include
In [13]:
!geos-config --clibs
-L/home/joris/miniconda3/lib -lgeos_c
In [14]:
%load_ext cython
In [19]:
%%cython -l geos_c -L /home/joris/miniconda3/lib -I /home/joris/miniconda3/include -a

import cython
cimport cpython.array

import numpy as np
cimport numpy as np

import shapely.prepared

include "_geos.pxi"


@cython.boundscheck(False)
@cython.wraparound(False)
def contains_cy(array, geometry):
    
    cdef Py_ssize_t idx
    cdef unsigned int n = array.size
    cdef np.ndarray[np.uint8_t, ndim=1, cast=True] result = np.empty(n, dtype=np.uint8)

    cdef GEOSContextHandle_t geos_handle
    cdef GEOSPreparedGeometry *geom1
    cdef GEOSGeometry *geom2
    cdef uintptr_t geos_geom
    
    # Prepare the geometry if it hasn't already been prepared.
    if not isinstance(geometry, shapely.prepared.PreparedGeometry):
        geometry = shapely.prepared.prep(geometry)
        
    geos_h = get_geos_context_handle()
    geom1 = geos_from_prepared(geometry)
    
    for idx in xrange(n):
        # Construct a coordinate sequence with our x, y values.
        geos_geom = array[idx]._geom
        geom2 = <GEOSGeometry *>geos_geom
        
        # Put the result of whether the point is "contained" by the
        # prepared geometry into the result array. 
        result[idx] = <np.uint8_t> GEOSPreparedContains_r(geos_h, geom1, geom2)
        #GEOSGeom_destroy_r(geos_h, geom2)

    return result.view(dtype=np.bool)
Out[19]:
Cython: _cython_magic_82b59155c5d602a807243f9ffdef2d25.pyx

Generated by Cython 0.24.1

Yellow lines hint at Python interaction.
Click on a line that starts with a "+" to see the C code that Cython generated for it.

 01: 
+02: import cython
  __pyx_t_1 = PyDict_New(); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 2, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_test, __pyx_t_1) < 0) __PYX_ERR(0, 2, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
 03: cimport cpython.array
 04: 
+05: import numpy as np
  __pyx_t_1 = __Pyx_Import(__pyx_n_s_numpy, 0, -1); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 5, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_np, __pyx_t_1) < 0) __PYX_ERR(0, 5, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
 06: cimport numpy as np
 07: 
+08: import shapely.prepared
  __pyx_t_1 = __Pyx_Import(__pyx_n_s_shapely_prepared, 0, -1); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 8, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_shapely, __pyx_t_1) < 0) __PYX_ERR(0, 8, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
 09: 
 10: include "_geos.pxi"
 11: 
 12: 
 13: @cython.boundscheck(False)
 14: @cython.wraparound(False)
+15: def contains_cy(array, geometry):
/* Python wrapper */
static PyObject *__pyx_pw_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_1contains_cy(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/
static PyMethodDef __pyx_mdef_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_1contains_cy = {"contains_cy", (PyCFunction)__pyx_pw_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_1contains_cy, METH_VARARGS|METH_KEYWORDS, 0};
static PyObject *__pyx_pw_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_1contains_cy(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {
  PyObject *__pyx_v_array = 0;
  PyObject *__pyx_v_geometry = 0;
  PyObject *__pyx_r = 0;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("contains_cy (wrapper)", 0);
  {
    static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_array,&__pyx_n_s_geometry,0};
    PyObject* values[2] = {0,0};
    if (unlikely(__pyx_kwds)) {
      Py_ssize_t kw_args;
      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);
      switch (pos_args) {
        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
        case  0: break;
        default: goto __pyx_L5_argtuple_error;
      }
      kw_args = PyDict_Size(__pyx_kwds);
      switch (pos_args) {
        case  0:
        if (likely((values[0] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_array)) != 0)) kw_args--;
        else goto __pyx_L5_argtuple_error;
        case  1:
        if (likely((values[1] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_geometry)) != 0)) kw_args--;
        else {
          __Pyx_RaiseArgtupleInvalid("contains_cy", 1, 2, 2, 1); __PYX_ERR(0, 15, __pyx_L3_error)
        }
      }
      if (unlikely(kw_args > 0)) {
        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, "contains_cy") < 0)) __PYX_ERR(0, 15, __pyx_L3_error)
      }
    } else if (PyTuple_GET_SIZE(__pyx_args) != 2) {
      goto __pyx_L5_argtuple_error;
    } else {
      values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
      values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
    }
    __pyx_v_array = values[0];
    __pyx_v_geometry = values[1];
  }
  goto __pyx_L4_argument_unpacking_done;
  __pyx_L5_argtuple_error:;
  __Pyx_RaiseArgtupleInvalid("contains_cy", 1, 2, 2, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(0, 15, __pyx_L3_error)
  __pyx_L3_error:;
  __Pyx_AddTraceback("_cython_magic_82b59155c5d602a807243f9ffdef2d25.contains_cy", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __Pyx_RefNannyFinishContext();
  return NULL;
  __pyx_L4_argument_unpacking_done:;
  __pyx_r = __pyx_pf_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_contains_cy(__pyx_self, __pyx_v_array, __pyx_v_geometry);

  /* function exit code */
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

static PyObject *__pyx_pf_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_contains_cy(CYTHON_UNUSED PyObject *__pyx_self, PyObject *__pyx_v_array, PyObject *__pyx_v_geometry) {
  Py_ssize_t __pyx_v_idx;
  unsigned int __pyx_v_n;
  PyArrayObject *__pyx_v_result = 0;
  GEOSPreparedGeometry *__pyx_v_geom1;
  GEOSGeometry *__pyx_v_geom2;
  uintptr_t __pyx_v_geos_geom;
  GEOSContextHandle_t __pyx_v_geos_h;
  __Pyx_LocalBuf_ND __pyx_pybuffernd_result;
  __Pyx_Buffer __pyx_pybuffer_result;
  PyObject *__pyx_r = NULL;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("contains_cy", 0);
  __Pyx_INCREF(__pyx_v_geometry);
  __pyx_pybuffer_result.pybuffer.buf = NULL;
  __pyx_pybuffer_result.refcount = 0;
  __pyx_pybuffernd_result.data = NULL;
  __pyx_pybuffernd_result.rcbuffer = &__pyx_pybuffer_result;
/* … */
  /* function exit code */
  __pyx_L1_error:;
  __Pyx_XDECREF(__pyx_t_1);
  __Pyx_XDECREF(__pyx_t_3);
  __Pyx_XDECREF(__pyx_t_4);
  __Pyx_XDECREF(__pyx_t_5);
  __Pyx_XDECREF(__pyx_t_6);
  { PyObject *__pyx_type, *__pyx_value, *__pyx_tb;
    __Pyx_PyThreadState_declare
    __Pyx_PyThreadState_assign
    __Pyx_ErrFetch(&__pyx_type, &__pyx_value, &__pyx_tb);
    __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_result.rcbuffer->pybuffer);
  __Pyx_ErrRestore(__pyx_type, __pyx_value, __pyx_tb);}
  __Pyx_AddTraceback("_cython_magic_82b59155c5d602a807243f9ffdef2d25.contains_cy", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __pyx_r = NULL;
  goto __pyx_L2;
  __pyx_L0:;
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_result.rcbuffer->pybuffer);
  __pyx_L2:;
  __Pyx_XDECREF((PyObject *)__pyx_v_result);
  __Pyx_XDECREF(__pyx_v_geometry);
  __Pyx_XGIVEREF(__pyx_r);
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
/* … */
  __pyx_tuple__7 = PyTuple_Pack(10, __pyx_n_s_array, __pyx_n_s_geometry, __pyx_n_s_idx, __pyx_n_s_n, __pyx_n_s_result, __pyx_n_s_geos_handle, __pyx_n_s_geom1, __pyx_n_s_geom2, __pyx_n_s_geos_geom, __pyx_n_s_geos_h); if (unlikely(!__pyx_tuple__7)) __PYX_ERR(0, 15, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_tuple__7);
  __Pyx_GIVEREF(__pyx_tuple__7);
/* … */
  __pyx_t_1 = PyCFunction_NewEx(&__pyx_mdef_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_1contains_cy, NULL, __pyx_n_s_cython_magic_82b59155c5d602a807); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 15, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_contains_cy, __pyx_t_1) < 0) __PYX_ERR(0, 15, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
 16: 
 17:     cdef Py_ssize_t idx
+18:     cdef unsigned int n = array.size
  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_v_array, __pyx_n_s_size); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 18, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __pyx_t_2 = __Pyx_PyInt_As_unsigned_int(__pyx_t_1); if (unlikely((__pyx_t_2 == (unsigned int)-1) && PyErr_Occurred())) __PYX_ERR(0, 18, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_v_n = __pyx_t_2;
+19:     cdef np.ndarray[np.uint8_t, ndim=1, cast=True] result = np.empty(n, dtype=np.uint8)
  __pyx_t_1 = __Pyx_GetModuleGlobalName(__pyx_n_s_np); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __pyx_t_3 = __Pyx_PyObject_GetAttrStr(__pyx_t_1, __pyx_n_s_empty); if (unlikely(!__pyx_t_3)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_3);
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_t_1 = __Pyx_PyInt_From_unsigned_int(__pyx_v_n); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __pyx_t_4 = PyTuple_New(1); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_4);
  __Pyx_GIVEREF(__pyx_t_1);
  PyTuple_SET_ITEM(__pyx_t_4, 0, __pyx_t_1);
  __pyx_t_1 = 0;
  __pyx_t_1 = PyDict_New(); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __pyx_t_5 = __Pyx_GetModuleGlobalName(__pyx_n_s_np); if (unlikely(!__pyx_t_5)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_5);
  __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_t_5, __pyx_n_s_uint8); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;
  if (PyDict_SetItem(__pyx_t_1, __pyx_n_s_dtype, __pyx_t_6) < 0) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __pyx_t_6 = __Pyx_PyObject_Call(__pyx_t_3, __pyx_t_4, __pyx_t_1); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 19, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  if (!(likely(((__pyx_t_6) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_6, __pyx_ptype_5numpy_ndarray))))) __PYX_ERR(0, 19, __pyx_L1_error)
  __pyx_t_7 = ((PyArrayObject *)__pyx_t_6);
  {
    __Pyx_BufFmt_StackElem __pyx_stack[1];
    if (unlikely(__Pyx_GetBufferAndValidate(&__pyx_pybuffernd_result.rcbuffer->pybuffer, (PyObject*)__pyx_t_7, &__Pyx_TypeInfo_nn___pyx_t_5numpy_uint8_t, PyBUF_FORMAT| PyBUF_STRIDES| PyBUF_WRITABLE, 1, 1, __pyx_stack) == -1)) {
      __pyx_v_result = ((PyArrayObject *)Py_None); __Pyx_INCREF(Py_None); __pyx_pybuffernd_result.rcbuffer->pybuffer.buf = NULL;
      __PYX_ERR(0, 19, __pyx_L1_error)
    } else {__pyx_pybuffernd_result.diminfo[0].strides = __pyx_pybuffernd_result.rcbuffer->pybuffer.strides[0]; __pyx_pybuffernd_result.diminfo[0].shape = __pyx_pybuffernd_result.rcbuffer->pybuffer.shape[0];
    }
  }
  __pyx_t_7 = 0;
  __pyx_v_result = ((PyArrayObject *)__pyx_t_6);
  __pyx_t_6 = 0;
 20: 
 21:     cdef GEOSContextHandle_t geos_handle
 22:     cdef GEOSPreparedGeometry *geom1
 23:     cdef GEOSGeometry *geom2
 24:     cdef uintptr_t geos_geom
 25: 
 26:     # Prepare the geometry if it hasn't already been prepared.
+27:     if not isinstance(geometry, shapely.prepared.PreparedGeometry):
  __pyx_t_6 = __Pyx_GetModuleGlobalName(__pyx_n_s_shapely); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 27, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_t_6, __pyx_n_s_prepared); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 27, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_t_1, __pyx_n_s_PreparedGeometry); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 27, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_t_8 = PyObject_IsInstance(__pyx_v_geometry, __pyx_t_6); if (unlikely(__pyx_t_8 == -1)) __PYX_ERR(0, 27, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __pyx_t_9 = ((!(__pyx_t_8 != 0)) != 0);
  if (__pyx_t_9) {
/* … */
  }
+28:         geometry = shapely.prepared.prep(geometry)
    __pyx_t_1 = __Pyx_GetModuleGlobalName(__pyx_n_s_shapely); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 28, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_1);
    __pyx_t_4 = __Pyx_PyObject_GetAttrStr(__pyx_t_1, __pyx_n_s_prepared); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 28, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_4);
    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
    __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_t_4, __pyx_n_s_prep); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 28, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_1);
    __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;
    __pyx_t_4 = NULL;
    if (CYTHON_COMPILING_IN_CPYTHON && likely(PyMethod_Check(__pyx_t_1))) {
      __pyx_t_4 = PyMethod_GET_SELF(__pyx_t_1);
      if (likely(__pyx_t_4)) {
        PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_1);
        __Pyx_INCREF(__pyx_t_4);
        __Pyx_INCREF(function);
        __Pyx_DECREF_SET(__pyx_t_1, function);
      }
    }
    if (!__pyx_t_4) {
      __pyx_t_6 = __Pyx_PyObject_CallOneArg(__pyx_t_1, __pyx_v_geometry); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 28, __pyx_L1_error)
      __Pyx_GOTREF(__pyx_t_6);
    } else {
      __pyx_t_3 = PyTuple_New(1+1); if (unlikely(!__pyx_t_3)) __PYX_ERR(0, 28, __pyx_L1_error)
      __Pyx_GOTREF(__pyx_t_3);
      __Pyx_GIVEREF(__pyx_t_4); PyTuple_SET_ITEM(__pyx_t_3, 0, __pyx_t_4); __pyx_t_4 = NULL;
      __Pyx_INCREF(__pyx_v_geometry);
      __Pyx_GIVEREF(__pyx_v_geometry);
      PyTuple_SET_ITEM(__pyx_t_3, 0+1, __pyx_v_geometry);
      __pyx_t_6 = __Pyx_PyObject_Call(__pyx_t_1, __pyx_t_3, NULL); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 28, __pyx_L1_error)
      __Pyx_GOTREF(__pyx_t_6);
      __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
    }
    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
    __Pyx_DECREF_SET(__pyx_v_geometry, __pyx_t_6);
    __pyx_t_6 = 0;
 29: 
+30:     geos_h = get_geos_context_handle()
  __pyx_v_geos_h = __pyx_f_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_get_geos_context_handle();
+31:     geom1 = geos_from_prepared(geometry)
  __pyx_t_10 = __pyx_f_46_cython_magic_82b59155c5d602a807243f9ffdef2d25_geos_from_prepared(__pyx_v_geometry); if (unlikely(PyErr_Occurred())) __PYX_ERR(0, 31, __pyx_L1_error)
  __pyx_v_geom1 = __pyx_t_10;
 32: 
+33:     for idx in xrange(n):
  __pyx_t_2 = __pyx_v_n;
  for (__pyx_t_11 = 0; __pyx_t_11 < __pyx_t_2; __pyx_t_11+=1) {
    __pyx_v_idx = __pyx_t_11;
 34:         # Construct a coordinate sequence with our x, y values.
+35:         geos_geom = array[idx]._geom
    __pyx_t_6 = __Pyx_GetItemInt(__pyx_v_array, __pyx_v_idx, Py_ssize_t, 1, PyInt_FromSsize_t, 0, 0, 0); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 35, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_6);
    __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_t_6, __pyx_n_s_geom); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 35, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_1);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
    __pyx_t_12 = __Pyx_PyInt_As_size_t(__pyx_t_1); if (unlikely((__pyx_t_12 == (uintptr_t)-1) && PyErr_Occurred())) __PYX_ERR(0, 35, __pyx_L1_error)
    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
    __pyx_v_geos_geom = __pyx_t_12;
+36:         geom2 = <GEOSGeometry *>geos_geom
    __pyx_v_geom2 = ((GEOSGeometry *)__pyx_v_geos_geom);
 37: 
 38:         # Put the result of whether the point is "contained" by the
 39:         # prepared geometry into the result array. 
+40:         result[idx] = <np.uint8_t> GEOSPreparedContains_r(geos_h, geom1, geom2)
    __pyx_t_13 = __pyx_v_idx;
    *__Pyx_BufPtrStrided1d(__pyx_t_5numpy_uint8_t *, __pyx_pybuffernd_result.rcbuffer->pybuffer.buf, __pyx_t_13, __pyx_pybuffernd_result.diminfo[0].strides) = ((__pyx_t_5numpy_uint8_t)GEOSPreparedContains_r(__pyx_v_geos_h, __pyx_v_geom1, __pyx_v_geom2));
  }
 41:         #GEOSGeom_destroy_r(geos_h, geom2)
 42: 
+43:     return result.view(dtype=np.bool)
  __Pyx_XDECREF(__pyx_r);
  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_result), __pyx_n_s_view); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 43, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __pyx_t_6 = PyDict_New(); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 43, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __pyx_t_3 = __Pyx_GetModuleGlobalName(__pyx_n_s_np); if (unlikely(!__pyx_t_3)) __PYX_ERR(0, 43, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_3);
  __pyx_t_4 = __Pyx_PyObject_GetAttrStr(__pyx_t_3, __pyx_n_s_bool); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 43, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_4);
  __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
  if (PyDict_SetItem(__pyx_t_6, __pyx_n_s_dtype, __pyx_t_4) < 0) __PYX_ERR(0, 43, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;
  __pyx_t_4 = __Pyx_PyObject_Call(__pyx_t_1, __pyx_empty_tuple, __pyx_t_6); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 43, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_4);
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __pyx_r = __pyx_t_4;
  __pyx_t_4 = 0;
  goto __pyx_L0;
In [20]:
res2 = contains_cy(a, p)
res2
Out[20]:
array([False, False, False, ..., False, False, False], dtype=bool)
In [21]:
res2.sum()
Out[21]:
89
In [22]:
np.testing.assert_array_equal(res1, res2)
In [26]:
%timeit contains_cy(a, p)
2.07 ms ± 124 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Already much faster !!

From about 50 ms to 2 ms -> 25x speed-up

Passing _geom pointers instead of Geometry objects

As can be seen in the annotated cython code above, one of the bottlenecks in the for loop part is geos_geom = array[idx]._geom (yellow colored line), where I access the geometry python object (a is an object dtyped array) and get the _geom attribute. This has to switch to python to get the attribute, and therefore gives a slowdown.

Let's try to make an array of those integer _geom pointers in advance, and pass this to an adapted version of the cython function:

In [27]:
%%cython -l geos_c -L /home/joris/miniconda3/lib -I /home/joris/miniconda3/include -a

import cython
cimport cpython.array

import numpy as np
cimport numpy as np

cdef extern from "geos_c.h":
    ctypedef void *GEOSContextHandle_t
    ctypedef struct GEOSGeometry
    char GEOSContains_r(GEOSContextHandle_t, const GEOSGeometry*, const GEOSGeometry*) nogil
    
cdef GEOSContextHandle_t get_geos_context_handle():
    # Note: This requires that lgeos is defined, so needs to be imported as:
    from shapely.geos import lgeos
    cdef np.uintp_t handle = lgeos.geos_handle
    return <GEOSContextHandle_t>handle


@cython.boundscheck(False)
@cython.wraparound(False)
def contains_cy2(np.int64_t[:] array, geometry):
    
    cdef Py_ssize_t idx
    cdef unsigned int n = array.size
    cdef np.ndarray[np.uint8_t, ndim=1, cast=True] result = np.empty(n, dtype=np.uint8)

    cdef GEOSContextHandle_t geos_handle
    cdef GEOSGeometry *geom1
    cdef GEOSGeometry *geom2
    cdef np.uintp_t geos_geom
    cdef np.uintp_t geos_geom1

    geos_h = get_geos_context_handle()
    geos_geom1 = geometry.__geom__ 
    geom1 = <GEOSGeometry *> geos_geom1
    
    for idx in xrange(n):
        geos_geom = array[idx]
        geom2 = <GEOSGeometry *>geos_geom

        # Put the result of whether the point is "contained" by the
        # prepared geometry into the result array. 
        result[idx] = <np.uint8_t> GEOSContains_r(geos_h, geom1, geom2)
        #GEOSGeom_destroy_r(geos_h, geom2)

    return result.view(dtype=np.bool)
Out[27]:
Cython: _cython_magic_69b3f517e97837cffcede7e1a60806aa.pyx

Generated by Cython 0.26

Yellow lines hint at Python interaction.
Click on a line that starts with a "+" to see the C code that Cython generated for it.

 01: 
+02: import cython
  __pyx_t_1 = PyDict_New(); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 2, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_test, __pyx_t_1) < 0) __PYX_ERR(0, 2, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
 03: cimport cpython.array
 04: 
+05: import numpy as np
  __pyx_t_1 = __Pyx_Import(__pyx_n_s_numpy, 0, 0); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 5, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_np, __pyx_t_1) < 0) __PYX_ERR(0, 5, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
 06: cimport numpy as np
 07: 
 08: cdef extern from "geos_c.h":
 09:     ctypedef void *GEOSContextHandle_t
 10:     ctypedef struct GEOSGeometry
 11:     char GEOSContains_r(GEOSContextHandle_t, const GEOSGeometry*, const GEOSGeometry*) nogil
 12: 
+13: cdef GEOSContextHandle_t get_geos_context_handle():
static GEOSContextHandle_t __pyx_f_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_get_geos_context_handle(void) {
  PyObject *__pyx_v_lgeos = NULL;
  __pyx_t_5numpy_uintp_t __pyx_v_handle;
  GEOSContextHandle_t __pyx_r;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("get_geos_context_handle", 0);
/* … */
  /* function exit code */
  __pyx_L1_error:;
  __Pyx_XDECREF(__pyx_t_1);
  __Pyx_XDECREF(__pyx_t_2);
  __Pyx_WriteUnraisable("_cython_magic_69b3f517e97837cffcede7e1a60806aa.get_geos_context_handle", __pyx_clineno, __pyx_lineno, __pyx_filename, 1, 0);
  __pyx_r = 0;
  __pyx_L0:;
  __Pyx_XDECREF(__pyx_v_lgeos);
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
 14:     # Note: This requires that lgeos is defined, so needs to be imported as:
+15:     from shapely.geos import lgeos
  __pyx_t_1 = PyList_New(1); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 15, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __Pyx_INCREF(__pyx_n_s_lgeos);
  __Pyx_GIVEREF(__pyx_n_s_lgeos);
  PyList_SET_ITEM(__pyx_t_1, 0, __pyx_n_s_lgeos);
  __pyx_t_2 = __Pyx_Import(__pyx_n_s_shapely_geos, __pyx_t_1, 0); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 15, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_2);
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_t_1 = __Pyx_ImportFrom(__pyx_t_2, __pyx_n_s_lgeos); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 15, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __Pyx_INCREF(__pyx_t_1);
  __pyx_v_lgeos = __pyx_t_1;
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
+16:     cdef np.uintp_t handle = lgeos.geos_handle
  __pyx_t_2 = __Pyx_PyObject_GetAttrStr(__pyx_v_lgeos, __pyx_n_s_geos_handle); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 16, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_2);
  __pyx_t_3 = __Pyx_PyInt_As_size_t(__pyx_t_2); if (unlikely((__pyx_t_3 == ((npy_uintp)-1)) && PyErr_Occurred())) __PYX_ERR(0, 16, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
  __pyx_v_handle = __pyx_t_3;
+17:     return <GEOSContextHandle_t>handle
  __pyx_r = ((GEOSContextHandle_t)__pyx_v_handle);
  goto __pyx_L0;
 18: 
 19: 
 20: @cython.boundscheck(False)
 21: @cython.wraparound(False)
+22: def contains_cy2(np.int64_t[:] array, geometry):
/* Python wrapper */
static PyObject *__pyx_pw_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_1contains_cy2(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/
static PyMethodDef __pyx_mdef_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_1contains_cy2 = {"contains_cy2", (PyCFunction)__pyx_pw_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_1contains_cy2, METH_VARARGS|METH_KEYWORDS, 0};
static PyObject *__pyx_pw_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_1contains_cy2(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {
  __Pyx_memviewslice __pyx_v_array = { 0, 0, { 0 }, { 0 }, { 0 } };
  PyObject *__pyx_v_geometry = 0;
  PyObject *__pyx_r = 0;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("contains_cy2 (wrapper)", 0);
  {
    static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_array,&__pyx_n_s_geometry,0};
    PyObject* values[2] = {0,0};
    if (unlikely(__pyx_kwds)) {
      Py_ssize_t kw_args;
      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);
      switch (pos_args) {
        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
        CYTHON_FALLTHROUGH;
        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
        CYTHON_FALLTHROUGH;
        case  0: break;
        default: goto __pyx_L5_argtuple_error;
      }
      kw_args = PyDict_Size(__pyx_kwds);
      switch (pos_args) {
        case  0:
        if (likely((values[0] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_array)) != 0)) kw_args--;
        else goto __pyx_L5_argtuple_error;
        CYTHON_FALLTHROUGH;
        case  1:
        if (likely((values[1] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_geometry)) != 0)) kw_args--;
        else {
          __Pyx_RaiseArgtupleInvalid("contains_cy2", 1, 2, 2, 1); __PYX_ERR(0, 22, __pyx_L3_error)
        }
      }
      if (unlikely(kw_args > 0)) {
        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, "contains_cy2") < 0)) __PYX_ERR(0, 22, __pyx_L3_error)
      }
    } else if (PyTuple_GET_SIZE(__pyx_args) != 2) {
      goto __pyx_L5_argtuple_error;
    } else {
      values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
      values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
    }
    __pyx_v_array = __Pyx_PyObject_to_MemoryviewSlice_ds_nn___pyx_t_5numpy_int64_t(values[0]); if (unlikely(!__pyx_v_array.memview)) __PYX_ERR(0, 22, __pyx_L3_error)
    __pyx_v_geometry = values[1];
  }
  goto __pyx_L4_argument_unpacking_done;
  __pyx_L5_argtuple_error:;
  __Pyx_RaiseArgtupleInvalid("contains_cy2", 1, 2, 2, PyTuple_GET_SIZE(__pyx_args)); __PYX_ERR(0, 22, __pyx_L3_error)
  __pyx_L3_error:;
  __Pyx_AddTraceback("_cython_magic_69b3f517e97837cffcede7e1a60806aa.contains_cy2", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __Pyx_RefNannyFinishContext();
  return NULL;
  __pyx_L4_argument_unpacking_done:;
  __pyx_r = __pyx_pf_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_contains_cy2(__pyx_self, __pyx_v_array, __pyx_v_geometry);

  /* function exit code */
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

static PyObject *__pyx_pf_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_contains_cy2(CYTHON_UNUSED PyObject *__pyx_self, __Pyx_memviewslice __pyx_v_array, PyObject *__pyx_v_geometry) {
  Py_ssize_t __pyx_v_idx;
  unsigned int __pyx_v_n;
  PyArrayObject *__pyx_v_result = 0;
  GEOSGeometry *__pyx_v_geom1;
  GEOSGeometry *__pyx_v_geom2;
  __pyx_t_5numpy_uintp_t __pyx_v_geos_geom;
  __pyx_t_5numpy_uintp_t __pyx_v_geos_geom1;
  GEOSContextHandle_t __pyx_v_geos_h;
  __Pyx_LocalBuf_ND __pyx_pybuffernd_result;
  __Pyx_Buffer __pyx_pybuffer_result;
  PyObject *__pyx_r = NULL;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("contains_cy2", 0);
  __pyx_pybuffer_result.pybuffer.buf = NULL;
  __pyx_pybuffer_result.refcount = 0;
  __pyx_pybuffernd_result.data = NULL;
  __pyx_pybuffernd_result.rcbuffer = &__pyx_pybuffer_result;
/* … */
  /* function exit code */
  __pyx_L1_error:;
  __Pyx_XDECREF(__pyx_t_1);
  __Pyx_XDECREF(__pyx_t_2);
  __Pyx_XDECREF(__pyx_t_4);
  __Pyx_XDECREF(__pyx_t_5);
  __Pyx_XDECREF(__pyx_t_6);
  { PyObject *__pyx_type, *__pyx_value, *__pyx_tb;
    __Pyx_PyThreadState_declare
    __Pyx_PyThreadState_assign
    __Pyx_ErrFetch(&__pyx_type, &__pyx_value, &__pyx_tb);
    __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_result.rcbuffer->pybuffer);
  __Pyx_ErrRestore(__pyx_type, __pyx_value, __pyx_tb);}
  __Pyx_AddTraceback("_cython_magic_69b3f517e97837cffcede7e1a60806aa.contains_cy2", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __pyx_r = NULL;
  goto __pyx_L2;
  __pyx_L0:;
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_result.rcbuffer->pybuffer);
  __pyx_L2:;
  __Pyx_XDECREF((PyObject *)__pyx_v_result);
  __PYX_XDEC_MEMVIEW(&__pyx_v_array, 1);
  __Pyx_XGIVEREF(__pyx_r);
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
/* … */
  __pyx_tuple__29 = PyTuple_Pack(11, __pyx_n_s_array, __pyx_n_s_geometry, __pyx_n_s_idx, __pyx_n_s_n, __pyx_n_s_result, __pyx_n_s_geos_handle, __pyx_n_s_geom1, __pyx_n_s_geom2, __pyx_n_s_geos_geom, __pyx_n_s_geos_geom1, __pyx_n_s_geos_h); if (unlikely(!__pyx_tuple__29)) __PYX_ERR(0, 22, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_tuple__29);
  __Pyx_GIVEREF(__pyx_tuple__29);
/* … */
  __pyx_t_1 = PyCFunction_NewEx(&__pyx_mdef_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_1contains_cy2, NULL, __pyx_n_s_cython_magic_69b3f517e97837cffc); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 22, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_contains_cy2, __pyx_t_1) < 0) __PYX_ERR(0, 22, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_codeobj__30 = (PyObject*)__Pyx_PyCode_New(2, 0, 11, 0, 0, __pyx_empty_bytes, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_tuple__29, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_kp_s_home_joris_cache_ipython_cython, __pyx_n_s_contains_cy2, 22, __pyx_empty_bytes); if (unlikely(!__pyx_codeobj__30)) __PYX_ERR(0, 22, __pyx_L1_error)
 23: 
 24:     cdef Py_ssize_t idx
+25:     cdef unsigned int n = array.size
  __pyx_t_1 = __pyx_memoryview_fromslice(__pyx_v_array, 1, (PyObject *(*)(char *)) __pyx_memview_get_nn___pyx_t_5numpy_int64_t, (int (*)(char *, PyObject *)) __pyx_memview_set_nn___pyx_t_5numpy_int64_t, 0);; if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 25, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __pyx_t_2 = __Pyx_PyObject_GetAttrStr(__pyx_t_1, __pyx_n_s_size); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 25, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_2);
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_t_3 = __Pyx_PyInt_As_unsigned_int(__pyx_t_2); if (unlikely((__pyx_t_3 == (unsigned int)-1) && PyErr_Occurred())) __PYX_ERR(0, 25, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
  __pyx_v_n = __pyx_t_3;
+26:     cdef np.ndarray[np.uint8_t, ndim=1, cast=True] result = np.empty(n, dtype=np.uint8)
  __pyx_t_2 = __Pyx_GetModuleGlobalName(__pyx_n_s_np); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_2);
  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_t_2, __pyx_n_s_empty); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
  __pyx_t_2 = __Pyx_PyInt_From_unsigned_int(__pyx_v_n); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_2);
  __pyx_t_4 = PyTuple_New(1); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_4);
  __Pyx_GIVEREF(__pyx_t_2);
  PyTuple_SET_ITEM(__pyx_t_4, 0, __pyx_t_2);
  __pyx_t_2 = 0;
  __pyx_t_2 = PyDict_New(); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_2);
  __pyx_t_5 = __Pyx_GetModuleGlobalName(__pyx_n_s_np); if (unlikely(!__pyx_t_5)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_5);
  __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_t_5, __pyx_n_s_uint8); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;
  if (PyDict_SetItem(__pyx_t_2, __pyx_n_s_dtype, __pyx_t_6) < 0) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __pyx_t_6 = __Pyx_PyObject_Call(__pyx_t_1, __pyx_t_4, __pyx_t_2); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 26, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
  if (!(likely(((__pyx_t_6) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_6, __pyx_ptype_5numpy_ndarray))))) __PYX_ERR(0, 26, __pyx_L1_error)
  __pyx_t_7 = ((PyArrayObject *)__pyx_t_6);
  {
    __Pyx_BufFmt_StackElem __pyx_stack[1];
    if (unlikely(__Pyx_GetBufferAndValidate(&__pyx_pybuffernd_result.rcbuffer->pybuffer, (PyObject*)__pyx_t_7, &__Pyx_TypeInfo_nn___pyx_t_5numpy_uint8_t, PyBUF_FORMAT| PyBUF_STRIDES| PyBUF_WRITABLE, 1, 1, __pyx_stack) == -1)) {
      __pyx_v_result = ((PyArrayObject *)Py_None); __Pyx_INCREF(Py_None); __pyx_pybuffernd_result.rcbuffer->pybuffer.buf = NULL;
      __PYX_ERR(0, 26, __pyx_L1_error)
    } else {__pyx_pybuffernd_result.diminfo[0].strides = __pyx_pybuffernd_result.rcbuffer->pybuffer.strides[0]; __pyx_pybuffernd_result.diminfo[0].shape = __pyx_pybuffernd_result.rcbuffer->pybuffer.shape[0];
    }
  }
  __pyx_t_7 = 0;
  __pyx_v_result = ((PyArrayObject *)__pyx_t_6);
  __pyx_t_6 = 0;
 27: 
 28:     cdef GEOSContextHandle_t geos_handle
 29:     cdef GEOSGeometry *geom1
 30:     cdef GEOSGeometry *geom2
 31:     cdef np.uintp_t geos_geom
 32:     cdef np.uintp_t geos_geom1
 33: 
+34:     geos_h = get_geos_context_handle()
  __pyx_v_geos_h = __pyx_f_46_cython_magic_69b3f517e97837cffcede7e1a60806aa_get_geos_context_handle();
+35:     geos_geom1 = geometry.__geom__
  __pyx_t_6 = __Pyx_PyObject_GetAttrStr(__pyx_v_geometry, __pyx_n_s_geom); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 35, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __pyx_t_8 = __Pyx_PyInt_As_size_t(__pyx_t_6); if (unlikely((__pyx_t_8 == ((npy_uintp)-1)) && PyErr_Occurred())) __PYX_ERR(0, 35, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __pyx_v_geos_geom1 = __pyx_t_8;
+36:     geom1 = <GEOSGeometry *> geos_geom1
  __pyx_v_geom1 = ((GEOSGeometry *)__pyx_v_geos_geom1);
 37: 
+38:     for idx in xrange(n):
  __pyx_t_3 = __pyx_v_n;
  for (__pyx_t_9 = 0; __pyx_t_9 < __pyx_t_3; __pyx_t_9+=1) {
    __pyx_v_idx = __pyx_t_9;
+39:         geos_geom = array[idx]
    __pyx_t_10 = __pyx_v_idx;
    __pyx_v_geos_geom = (*((__pyx_t_5numpy_int64_t *) ( /* dim=0 */ (__pyx_v_array.data + __pyx_t_10 * __pyx_v_array.strides[0]) )));
+40:         geom2 = <GEOSGeometry *>geos_geom
    __pyx_v_geom2 = ((GEOSGeometry *)__pyx_v_geos_geom);
 41: 
 42:         # Put the result of whether the point is "contained" by the
 43:         # prepared geometry into the result array. 
+44:         result[idx] = <np.uint8_t> GEOSContains_r(geos_h, geom1, geom2)
    __pyx_t_11 = __pyx_v_idx;
    *__Pyx_BufPtrStrided1d(__pyx_t_5numpy_uint8_t *, __pyx_pybuffernd_result.rcbuffer->pybuffer.buf, __pyx_t_11, __pyx_pybuffernd_result.diminfo[0].strides) = ((__pyx_t_5numpy_uint8_t)GEOSContains_r(__pyx_v_geos_h, __pyx_v_geom1, __pyx_v_geom2));
  }
 45:         #GEOSGeom_destroy_r(geos_h, geom2)
 46: 
+47:     return result.view(dtype=np.bool)
  __Pyx_XDECREF(__pyx_r);
  __pyx_t_6 = __Pyx_PyObject_GetAttrStr(((PyObject *)__pyx_v_result), __pyx_n_s_view); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 47, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_6);
  __pyx_t_2 = PyDict_New(); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 47, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_2);
  __pyx_t_4 = __Pyx_GetModuleGlobalName(__pyx_n_s_np); if (unlikely(!__pyx_t_4)) __PYX_ERR(0, 47, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_4);
  __pyx_t_1 = __Pyx_PyObject_GetAttrStr(__pyx_t_4, __pyx_n_s_bool); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 47, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;
  if (PyDict_SetItem(__pyx_t_2, __pyx_n_s_dtype, __pyx_t_1) < 0) __PYX_ERR(0, 47, __pyx_L1_error)
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_t_1 = __Pyx_PyObject_Call(__pyx_t_6, __pyx_empty_tuple, __pyx_t_2); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 47, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
  __pyx_r = __pyx_t_1;
  __pyx_t_1 = 0;
  goto __pyx_L0;
In [28]:
a2 = np.array([x._geom for x in a])
In [29]:
a2
Out[29]:
array([40688656, 35212704, 35200416, ..., 43157344, 43157552, 43157760])
In [30]:
a2.dtype
Out[30]:
dtype('int64')
In [31]:
res3 = contains_cy2(a2, poly)
res3
Out[31]:
array([False, False, False, ..., False, False, False], dtype=bool)
In [32]:
res3.sum()
Out[32]:
89
In [33]:
np.testing.assert_array_equal(res1, res3)
In [35]:
%timeit contains_cy2(a2, poly)
209 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Further massive speed-up !!

From 50 ms to 2 ms and now to 0.2 ms -> 250x speed-up !!

We have to note that the creation of the array of geom pointers also is costly, but this could be cached (if multiple operations are done), and can probably be optimized as well using cython/

In [36]:
%timeit np.array([x._geom for x in a])
3.14 ms ± 106 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Comparison with the shapely.vectorized.contains function

shapely already has an optimized vectorized contains function in cython, but this accepts arrays of x and y coordinates instead of an array of Point geometries:

In [37]:
a_x = np.array([x.x for x in a])
a_y = np.array([x.x for x in a])
In [39]:
import shapely.prepared
In [41]:
p2 = shapely.prepared.prep(poly)
In [42]:
import shapely.vectorized
In [43]:
res4 = shapely.vectorized.contains(poly, a_x, a_y)
res4
Out[43]:
array([False, False, False, ..., False, False, False], dtype=bool)
In [44]:
res4.sum()
Out[44]:
89
In [45]:
np.testing.assert_array_equal(res1, res4)
In [47]:
%timeit shapely.vectorized.contains(poly, a_x, a_y)
3.96 ms ± 66.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So it seems that the creation of the Geometry object from the x, y coordinates gives some overhead, as this is a bit slower than our contains_cy function.

Comments