API Reference
PySIFT
- class PySIFT(n_octaves=4, n_scales=3, contrast_thresh=0.04, edge_thresh=10.0, double_image=True, rootsift=True, dsp=False, orientation='histogram', descriptor='sift', fp16_pyramid=True)
GPU-resident SIFT feature detector and descriptor.
- Parameters:
n_octaves (int) – Number of octaves in the Gaussian pyramid (default: 4).
n_scales (int) – Scale levels per octave (default: 3).
contrast_thresh (float) – Contrast threshold for keypoint filtering (default: 0.04). Lower values detect more keypoints on low-contrast regions.
edge_thresh (float) – Edge response threshold (default: 10.0). Higher values keep more edge-like keypoints.
double_image (bool) – Upsample image 2x before building pyramid (default: True). Auto-suppressed for inputs >4 MP on 4 GB GPUs.
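rootsift (bool) – Apply RootSIFT normalization: L1-normalize each descriptor, then take the element-wise square root (default: True). Improves matching because Euclidean (L2) distance between RootSIFT descriptors corresponds to the Hellinger kernel on the original descriptors.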
dsp (bool) – Enable DSP-SIFT multi-scale descriptor pooling (default: False). Recommended for all matching tasks.
orientation (str) – Orientation assignment method: 'histogram' (default) or 'orinet' (learned, requires kornia).
descriptor (str) – Descriptor computation method: 'sift' (default), 'hardnet', or 'hynet' (learned; both require kornia).
fp16_pyramid (bool) – Store Gaussian pyramid in half-precision (default: True). Halves VRAM usage with negligible quality loss.
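The RootSIFT step described for the rootsift parameter can be sketched in plain NumPy. This is an illustration of the general technique, not PySIFT's GPU implementation; the function name `rootsift` and the `eps` guard are assumptions made for the example.

```python
import numpy as np

def rootsift(desc, eps=1e-7):
    """L1-normalize each row, then take the element-wise square root."""
    desc = np.asarray(desc, dtype=np.float32)
    l1 = np.abs(desc).sum(axis=1, keepdims=True)
    return np.sqrt(desc / (l1 + eps))

# Synthetic non-negative descriptors standing in for raw SIFT output
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 128)).astype(np.float32)
root = rootsift(raw)

# After RootSIFT, every row has (approximately) unit L2 norm
norms = np.linalg.norm(root, axis=1)

# Squared Euclidean distance between two RootSIFT rows equals
# 2 - 2 * (Hellinger kernel of the L1-normalized originals)
p = raw / raw.sum(axis=1, keepdims=True)
hellinger_kernel = float(np.sum(np.sqrt(p[0] * p[1])))
sq_dist = float(np.sum((root[0] - root[1]) ** 2))
```

Because the rows end up unit-length, an ordinary L2 nearest-neighbour matcher can be used unchanged on the transformed descriptors.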
- detectAndCompute(image, mask=None, gpu_output=False)
Detect keypoints and compute descriptors.
- Parameters:
image (numpy.ndarray) – Grayscale input image (uint8, HxW).
mask (numpy.ndarray) – Optional binary mask (same size as image). Only keypoints inside mask are returned.
gpu_output (bool) – If True, return CuPy arrays in VRAM instead of OpenCV KeyPoints + NumPy arrays.
- Returns:
(keypoints, descriptors)
Default (gpu_output=False): keypoints is a list of cv2.KeyPoint, descriptors is numpy.ndarray (N, 128).
GPU mode (gpu_output=True): keypoints is cupy.ndarray (N, 4) [x, y, size, angle], descriptors is cupy.ndarray (N, 128).
- Return type:
tuple
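To make the (N, 4) keypoint layout and the mask semantics concrete, here is a minimal sketch that filters [x, y, size, angle] rows against a binary mask. It uses NumPy as a stand-in for CuPy, and `filter_by_mask` is a hypothetical helper; how PySIFT applies the mask internally is not documented here, so treat this as one plausible reading.

```python
import numpy as np

def filter_by_mask(kp, desc, mask):
    """Keep rows whose (x, y) location falls on a non-zero mask pixel."""
    h, w = mask.shape
    # Round sub-pixel coordinates to the nearest pixel and clamp to the image
    xs = np.clip(np.rint(kp[:, 0]).astype(np.intp), 0, w - 1)
    ys = np.clip(np.rint(kp[:, 1]).astype(np.intp), 0, h - 1)
    keep = mask[ys, xs] != 0
    return kp[keep], desc[keep]

# Three synthetic keypoints as [x, y, size, angle] rows
kp = np.array([[10.2, 5.1, 3.0, 45.0],
               [50.0, 40.0, 6.0, 90.0],
               [12.7, 8.4, 2.5, 180.0]], dtype=np.float32)
desc = np.zeros((3, 128), dtype=np.float32)

# Allow keypoints only in the top-left 20x20 region
mask = np.zeros((64, 64), dtype=np.uint8)
mask[0:20, 0:20] = 255

kp_in, desc_in = filter_by_mask(kp, desc, mask)
```

The first and third keypoints lie inside the allowed region and survive; the second is dropped along with its descriptor row.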
GPUPyStitch
- class GPUPyStitch(config=None, **kwargs)
Full GPU-resident panoramic stitching pipeline: feature extraction, matching, RANSAC, warping, and blending.
- Parameters:
config (str) – Path to YAML config file (optional).
kwargs – Override any config parameter (e.g., descriptor='hardnet', matcher='lightglue').
- stitch(img1, img2, ...)
Stitch two or more BGR images into a panorama.
- Parameters:
img1 (numpy.ndarray) – First BGR image.
img2 (numpy.ndarray) – Second BGR image.
- Returns:
Stitched panorama as BGR numpy.ndarray.
- Return type:
numpy.ndarray
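The precedence implied by the GPUPyStitch constructor (keyword arguments override values from the config file) can be pictured with a small dictionary merge. Everything here is hypothetical: `resolve_config`, the `DEFAULTS` table, and its values are invented for illustration, and the real class may resolve its settings differently.

```python
# Hypothetical built-in defaults; the real defaults are not documented here
DEFAULTS = {"descriptor": "sift", "matcher": "nn"}

def resolve_config(file_cfg=None, **kwargs):
    """Merge settings so that kwargs win over file values, which win over defaults."""
    cfg = dict(DEFAULTS)
    cfg.update(file_cfg or {})   # values loaded from a YAML file, if any
    cfg.update(kwargs)           # explicit keyword overrides come last
    return cfg

# A file sets the matcher; a keyword argument overrides the descriptor
cfg = resolve_config({"matcher": "lightglue"}, descriptor="hardnet")
```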
Zero-Copy DLPack Interop
PySIFT descriptors can be consumed by any DLPack-compatible framework without copying data:
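# `sift` is a PySIFT instance and `img` a grayscale uint8 image (see detectAndCompute)
# CuPy -> PyTorch (zero-copy)
import torch
kp, desc = sift.detectAndCompute(img, None, gpu_output=True)
torch_desc = torch.from_dlpack(desc)
# CuPy -> JAX (zero-copy); recent JAX releases take the array itself
# (via its __dlpack__ method) rather than a capsule from desc.toDlpack()
import jax.dlpack
jax_desc = jax.dlpack.from_dlpack(desc)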
The DLPack exchange passes only a 64-byte metadata struct (shape, dtype, strides, device ID); no descriptor data is copied, so the handoff is O(1) regardless of keypoint count.
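The zero-copy behaviour can be checked without a GPU by using NumPy on both sides of the exchange. This is a CPU-only sketch that assumes NumPy 1.22 or newer (which provides `np.from_dlpack`); the array below is a synthetic stand-in for a descriptor matrix, not PySIFT output.

```python
import numpy as np

# Stand-in for an (N, 128) descriptor matrix
desc = np.arange(5 * 128, dtype=np.float32).reshape(5, 128)

# Import through the DLPack protocol; only metadata is exchanged
view = np.from_dlpack(desc)

# Both arrays refer to the same underlying buffer
shares = np.shares_memory(desc, view)

# A write through the producer is visible through the consumer
desc[0, 0] = 42.0
seen = float(view[0, 0])
```

The same check applies to the CuPy, PyTorch, and JAX cases: if the exchange were copying, a write on one side would not show up on the other.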