NumPy Best Practices
Expert guidelines for NumPy development, focusing on array programming, numerical computing, and performance optimization.
Code Style and Structure
- •Write concise, technical Python code with accurate NumPy examples
- •Prefer vectorized operations over explicit loops for performance
- •Use descriptive variable names reflecting data content (e.g.,
weights,gradients,input_array) - •Follow PEP 8 style guidelines for Python code
- •Use functional programming patterns when appropriate
Array Creation and Manipulation
- •Use appropriate array creation functions:
np.array(),np.zeros(),np.ones(),np.empty(),np.arange(),np.linspace() - •Prefer
np.zeros()ornp.empty()for pre-allocation when array size is known - •Use
np.concatenate(),np.vstack(),np.hstack()for combining arrays - •Leverage broadcasting for operations on arrays with different shapes
Indexing and Slicing
- •Use advanced indexing with boolean arrays for conditional selection
- •Prefer views over copies when possible to save memory
- •Use
np.where()for conditional element selection - •Understand the difference between fancy indexing (creates copy) and basic slicing (creates view)
Data Types
- •Specify appropriate data types explicitly using
dtypeparameter - •Use
np.float32for memory-efficient computations when full precision is not needed - •Be aware of integer overflow with fixed-size integer types
- •Use
np.asarray()for type conversion without unnecessary copies
Performance Optimization
Vectorization
- •Always prefer vectorized operations over Python loops
- •Use NumPy universal functions (ufuncs) for element-wise operations
- •Leverage
np.einsum()for complex tensor operations - •Use
np.dot()or@operator for matrix multiplication
Memory Management
- •Use
np.ndarray.flagsto check memory layout (C-contiguous vs Fortran-contiguous) - •Prefer in-place operations with
outparameter when possible - •Use memory-mapped arrays (
np.memmap) for large datasets - •Be mindful of array copies vs views
Computation Efficiency
- •Use
np.sum(),np.mean(),np.std()withaxisparameter for aggregations - •Leverage
np.cumsum(),np.cumprod()for cumulative operations - •Use
np.searchsorted()for efficient sorted array operations
Error Handling and Validation
- •Validate input shapes and data types before computations
- •Use assertions for dimension checking with informative messages
- •Handle NaN and Inf values appropriately with
np.isnan(),np.isinf() - •Use
np.errstate()context manager for controlling floating-point error handling
Random Number Generation
- •Use
np.random.default_rng()for modern random number generation - •Set seeds for reproducibility:
rng = np.random.default_rng(seed=42) - •Prefer the new Generator API over legacy
np.randomfunctions - •Use appropriate distributions:
rng.normal(),rng.uniform(),rng.choice()
Linear Algebra
- •Use
np.linalgfor linear algebra operations - •Leverage
np.linalg.solve()instead of computing inverse for linear systems - •Use
np.linalg.eig(),np.linalg.svd()for decompositions - •Check matrix condition with
np.linalg.cond()before inversion
Testing and Documentation
- •Write unit tests using
pytestwithnp.testingassertions - •Use
np.testing.assert_array_equal()for exact comparisons - •Use
np.testing.assert_array_almost_equal()for floating-point comparisons - •Include comprehensive docstrings following NumPy docstring format
Key Conventions
- •Import as
import numpy as np - •Use
snake_casefor variables and functions - •Document array shapes in docstrings
- •Profile code with
%timeitto identify bottlenecks