perf: Numba JIT kernel for score_by_shift (2.1x speedup)
- New module pm2d/_jit_kernels.py with _jit_score_by_shift: Numba njit with parallel, fastmath, and boundscheck=False
- Parallelized per output row (no race condition on acc)
- Automatic fallback to NumPy if numba is not installed
- Automatic warmup at module import (avoids JIT lag on the first match)

Benchmark on clip.png (13 instances):
  before (numpy + threads): 1.55s
  after (numba + threads):  0.72s
  speedup:                  2.1x

Full pipeline total (refine+subpix): 0.80s

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
+8
-1
@@ -1,7 +1,14 @@
 from pm2d.matcher import EdgeShapeMatcher, Match, Template
 from pm2d.line_matcher import LineShapeMatcher, Match as LineMatch
+from pm2d._jit_kernels import HAS_NUMBA, _warmup as _warmup_jit
+
+# Precompile the JIT kernels in the background on first import (avoids lag on the first match)
+try:
+    _warmup_jit()
+except Exception:
+    pass
 
 __all__ = [
     "EdgeShapeMatcher", "Match", "Template",
-    "LineShapeMatcher", "LineMatch",
+    "LineShapeMatcher", "LineMatch", "HAS_NUMBA",
 ]