API Reference

PySIFT

class PySIFT(n_octaves=4, n_scales=3, contrast_thresh=0.04, edge_thresh=10.0, double_image=True, rootsift=True, dsp=False, orientation='histogram', descriptor='sift', fp16_pyramid=True)

GPU-resident SIFT feature detector and descriptor.

Parameters:
  • n_octaves (int) – Number of octaves in the Gaussian pyramid (default: 4).

  • n_scales (int) – Scale levels per octave (default: 3).

  • contrast_thresh (float) – Contrast threshold for keypoint filtering (default: 0.04). Lower values detect more keypoints on low-contrast regions.

  • edge_thresh (float) – Edge response threshold (default: 10.0). Higher values keep more edge-like keypoints.

  • double_image (bool) – Upsample the image 2x before building the pyramid (default: True). Automatically disabled for inputs larger than 4 MP on 4 GB GPUs.

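  • rootsift (bool) – Apply RootSIFT normalization: L1-normalize each descriptor, then take the element-wise square root (default: True). Euclidean (L2) distance between the resulting descriptors is equivalent to comparing the original SIFT histograms with the Hellinger kernel, which typically improves matching.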

  • dsp (bool) – Enable DSP-SIFT multi-scale descriptor pooling (default: False). Recommended for all matching tasks.

  • orientation (str) – Orientation assignment method. 'histogram' (default) or 'orinet' (learned, requires kornia).

  • descriptor (str) – Descriptor computation method. 'sift' (default), or the learned descriptors 'hardnet' and 'hynet' (both require kornia).

  • fp16_pyramid (bool) – Store Gaussian pyramid in half-precision (default: True). Halves VRAM usage with negligible quality loss.
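
The rootsift option above refers to a simple per-descriptor transform. A minimal NumPy sketch follows, assuming non-negative SIFT descriptors stored as rows. The function name rootsift and the eps guard are illustrative and not part of the PySIFT API, and the library's GPU implementation may differ.

```python
import numpy as np

def rootsift(desc, eps=1e-7):
    """L1-normalize each row, then take the element-wise square root."""
    desc = np.asarray(desc, dtype=np.float32)
    desc = desc / (desc.sum(axis=1, keepdims=True) + eps)  # rows assumed non-negative
    return np.sqrt(desc)

# Two synthetic non-negative "descriptors" (random data, for illustration only).
rng = np.random.default_rng(0)
a, b = rng.random((2, 128), dtype=np.float32)

ra, rb = rootsift(np.stack([a, b]))

# Squared L2 distance between the RootSIFT vectors ...
d2 = float(np.sum((ra - rb) ** 2))

# ... equals 2 - 2 * (Hellinger kernel of the L1-normalized originals).
hellinger_kernel = float(np.sum(np.sqrt((a / a.sum()) * (b / b.sum()))))
assert np.isclose(d2, 2.0 - 2.0 * hellinger_kernel, atol=1e-4)
```

Because each RootSIFT vector has unit L2 norm, a standard nearest-neighbour search on L2 distance ranks candidates by the Hellinger kernel without any change to the matcher.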

detectAndCompute(image, mask=None, gpu_output=False)

Detect keypoints and compute descriptors.

Parameters:
  • image (numpy.ndarray) – Grayscale input image (uint8, HxW).

  • mask (numpy.ndarray) – Optional binary mask with the same size as image. Only keypoints inside the mask are returned.

  • gpu_output (bool) – If True, return CuPy arrays that stay in VRAM instead of a list of cv2.KeyPoint and a NumPy array.

Returns:

(keypoints, descriptors)

  • Default (gpu_output=False): keypoints is a list of cv2.KeyPoint, descriptors is numpy.ndarray (N, 128).

  • GPU mode (gpu_output=True): keypoints is cupy.ndarray (N, 4) [x, y, size, angle], descriptors is cupy.ndarray (N, 128).

Return type:

tuple
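
To make the GPU-mode layout and the mask semantics concrete, here is a small sketch that filters an (N, 4) keypoint array of [x, y, size, angle] rows against a mask. It uses NumPy as a CPU stand-in for CuPy and assumes the OpenCV convention that non-zero mask pixels mean "keep". The helper filter_by_mask is hypothetical and is not how PySIFT necessarily applies the mask internally.

```python
import numpy as np

def filter_by_mask(keypoints, descriptors, mask):
    """Keep keypoints whose rounded (x, y) position falls on a non-zero mask pixel.

    keypoints:   (N, 4) array of [x, y, size, angle]
    descriptors: (N, 128) array, row-aligned with keypoints
    mask:        (H, W) array, non-zero = keep (assumed convention)
    """
    h, w = mask.shape
    cols = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, w - 1)  # x -> column
    rows = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, h - 1)  # y -> row
    keep = mask[rows, cols] != 0
    return keypoints[keep], descriptors[keep]

# Synthetic data: the mask keeps only the right half of a 4x6 image.
mask = np.zeros((4, 6), dtype=np.uint8)
mask[:, 3:] = 1
keypoints = np.array([[1.0, 1.0, 2.0, 0.0],
                      [4.2, 2.1, 3.0, 90.0],
                      [5.0, 0.0, 1.5, 45.0]], dtype=np.float32)
descriptors = np.arange(3 * 128, dtype=np.float32).reshape(3, 128)

kept_kp, kept_desc = filter_by_mask(keypoints, descriptors, mask)
print(kept_kp.shape, kept_desc.shape)  # (2, 4) (2, 128)
```

The same indexing works on cupy.ndarray inputs, since CuPy mirrors the NumPy functions used here.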

GPUPyStitch

class GPUPyStitch(config=None, **kwargs)

Full GPU-resident panoramic stitching pipeline: feature extraction, matching, RANSAC, warping, and blending.

Parameters:
  • config (str) – Path to YAML config file (optional).

  • kwargs – Override any config parameter (e.g., descriptor='hardnet', matcher='lightglue').
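
The override rule above (keyword arguments take precedence over values from the config file) can be illustrated with plain dictionaries. This is only a sketch of the merge semantics: resolve_config is a hypothetical helper, the dictionary stands in for a parsed YAML file, and the library's actual loader is not shown in this reference.

```python
def resolve_config(file_config=None, **kwargs):
    """Merge settings so that keyword arguments win over file values."""
    merged = dict(file_config or {})  # copy, so the caller's dict is untouched
    merged.update(kwargs)             # kwargs override matching keys
    return merged

# Stand-in for values parsed from a YAML config file.
file_config = {"descriptor": "sift", "matcher": "lightglue"}

settings = resolve_config(file_config, descriptor="hardnet")
print(settings)  # {'descriptor': 'hardnet', 'matcher': 'lightglue'}
```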

stitch(img1, img2, ...)

Stitch two or more BGR images into a panorama.

Parameters:
  • img1 (numpy.ndarray) – First BGR image.

  • img2 (numpy.ndarray) – Second BGR image.

Returns:

Stitched panorama as BGR numpy.ndarray.

Return type:

numpy.ndarray

Zero-Copy DLPack Interop

With gpu_output=True, PySIFT descriptors are returned as CuPy arrays and can be consumed by any DLPack-compatible framework without copying data:

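# Assumes `sift` is a PySIFT instance and `img` is a grayscale uint8 image (HxW).

# CuPy -> PyTorch (zero-copy)
import torch
kp, desc = sift.detectAndCompute(img, mask=None, gpu_output=True)
torch_desc = torch.from_dlpack(desc)  # shares the CuPy buffer; nothing is copied

# CuPy -> JAX (zero-copy)
import jax.dlpack
# Recent JAX versions take the array directly via its __dlpack__ method;
# older versions expected a capsule: jax.dlpack.from_dlpack(desc.toDlpack())
jax_desc = jax.dlpack.from_dlpack(desc)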

The DLPack exchange passes only a 64-byte metadata struct (data pointer, shape, dtype, strides, device ID) between frameworks. No descriptor data is copied, so the cost is O(1) regardless of keypoint count, and both frameworks refer to the same GPU buffer.
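
The shared-buffer behaviour can be checked without a GPU. The sketch below uses NumPy as both producer and consumer of the DLPack protocol (np.from_dlpack, available in NumPy 1.22 and later). The array is a stand-in for a descriptor matrix, not actual PySIFT output.

```python
import numpy as np

# Stand-in for a descriptor matrix; on the GPU this would be a cupy.ndarray.
desc = np.zeros((1000, 128), dtype=np.float32)

# Import through the DLPack protocol.
view = np.from_dlpack(desc)

# Both arrays refer to the same buffer: no descriptor bytes were copied.
assert np.shares_memory(desc, view)

# A write through the producer is visible through the consumer.
desc[0, 0] = 42.0
assert view[0, 0] == 42.0
```

The flip side of zero-copy sharing is aliasing: an in-place modification made in one framework changes the data seen by the other, so copy explicitly when independent buffers are needed.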