SAP HANA ML Python Client (hana-ml)
Package Version: 2.22.241011
Last Verified: 2025-11-27
Table of Contents
- •Installation & Setup
- •Quick Start
- •Core Libraries
- •Common Patterns
- •Best Practices
- •Bundled Resources
Installation & Setup
bash
pip install hana-ml
Requirements: Python 3.8+, SAP HANA 2.0 SPS03+ or SAP HANA Cloud
Quick Start
Connection & DataFrame
python
from hana_ml import ConnectionContext
# Connect
conn = ConnectionContext(
    address='<hostname>',
    port=443,
    user='<username>',
    password='<password>',
    encrypt=True
)
# Create DataFrame
df = conn.table('MY_TABLE', schema='MY_SCHEMA')
print(f"Shape: {df.shape}")
df.head(10).collect()
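Hard-coded credentials are fine for a first test but awkward in shared notebooks or scripts. The following is a minimal sketch that gathers the same keyword arguments from environment variables. The variable names (`HANA_ADDRESS`, `HANA_PORT`, `HANA_USER`, `HANA_PASSWORD`) and the helper `connection_kwargs_from_env` are assumptions made for illustration and are not part of hana-ml.

```python
import os


def connection_kwargs_from_env(env=None):
    """Collect ConnectionContext keyword arguments from environment variables.

    The HANA_* variable names are an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "443")),  # 443 mirrors the Quick Start example
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": True,
    }


# Demo with an explicit mapping instead of the real environment
demo_env = {
    "HANA_ADDRESS": "<hostname>",
    "HANA_USER": "<username>",
    "HANA_PASSWORD": "<password>",
}
kwargs = connection_kwargs_from_env(demo_env)
print(sorted(kwargs))
```

With the variables set, the result can be unpacked directly: `conn = ConnectionContext(**connection_kwargs_from_env())`.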
PAL Classification
python
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Train model
clf = UnifiedClassification(func='RandomDecisionTree')
clf.fit(train_df, features=['F1', 'F2', 'F3'], label='TARGET')

# Predict & evaluate
predictions = clf.predict(test_df, features=['F1', 'F2', 'F3'])
score = clf.score(test_df, features=['F1', 'F2', 'F3'], label='TARGET')
APL AutoML
python
from hana_ml.algorithms.apl.classification import AutoClassifier

# Automated classification
auto_clf = AutoClassifier()
auto_clf.fit(train_df, label='TARGET')
predictions = auto_clf.predict(test_df)
Model Persistence
python
from hana_ml.model_storage import ModelStorage

ms = ModelStorage(conn)
clf.name = 'MY_CLASSIFIER'
ms.save_model(model=clf, if_exists='replace')
Core Libraries
PAL (Predictive Analysis Library)
- •100+ algorithms executed in-database
- •Categories: Classification, Regression, Clustering, Time Series, Preprocessing
- •Key classes: UnifiedClassification, UnifiedRegression, KMeans, ARIMA
- •See: references/PAL_ALGORITHMS.md for complete list
APL (Automated Predictive Library)
- •AutoML capabilities with automatic feature engineering
- •Key classes: AutoClassifier, AutoRegressor, GradientBoostingClassifier
- •See: references/APL_ALGORITHMS.md for details
DataFrames
- •Lazy evaluation - builds SQL until collect() is called
- •In-database processing for optimal performance
- •See: references/DATAFRAME_REFERENCE.md for complete API
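To make the lazy-evaluation idea concrete, here is a toy model in plain Python. It is not the hana-ml implementation: the `LazyQuery` class, the `fake_executor` function, and the `T` subquery alias are all hypothetical. It only illustrates how chained operations can compose SQL text while nothing runs until `collect()`.

```python
class LazyQuery:
    """Toy stand-in for a lazily evaluated DataFrame (not the hana-ml class)."""

    def __init__(self, sql, executor):
        self.sql = sql              # SQL text built so far
        self._executor = executor   # callable that actually runs SQL

    def filter(self, condition):
        # Wrap the current statement in a subquery; nothing is executed here
        wrapped = f"SELECT * FROM ({self.sql}) AS T WHERE {condition}"
        return LazyQuery(wrapped, self._executor)

    def select(self, *columns):
        cols = ", ".join(f'"{c}"' for c in columns)
        return LazyQuery(f"SELECT {cols} FROM ({self.sql}) AS T", self._executor)

    def collect(self):
        # Only this call hands SQL to the executor
        return self._executor(self.sql)


executed = []  # records every statement that reaches the fake "database"


def fake_executor(sql):
    executed.append(sql)
    return [("row",)]


base = LazyQuery('SELECT * FROM "MY_SCHEMA"."MY_TABLE"', fake_executor)
query = base.filter('"F1" > 0').select("F1", "TARGET")
statements_before_collect = len(executed)  # nothing has run yet
rows = query.collect()
print(query.sql)
```

On a real hana-ml DataFrame, the statement built so far can be inspected through its `select_statement` attribute before calling `collect()`.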
Visualizers
- •EDA plots, model explanations, metrics
- •SHAP integration for model interpretability
- •See: references/VISUALIZERS.md for 14 visualization modules
Common Patterns
Train-Test Split
python
from hana_ml.algorithms.pal.partition import train_test_val_split
train, test, val = train_test_val_split(
    data=df,
    training_percentage=0.7,
    testing_percentage=0.2,
    validation_percentage=0.1
)
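The three fractions in the call above are presumably meant to cover the whole table. As a small sketch under that assumption, a hypothetical helper `split_percentages` (not part of hana-ml) can derive the validation share from the other two and reject inconsistent input before any SQL is generated.

```python
import math


def split_percentages(training, testing):
    """Return (training, testing, validation) fractions that sum to 1.0.

    Hypothetical helper: validation takes whatever share is left over.
    """
    if not (0 < training < 1) or not (0 <= testing < 1):
        raise ValueError("fractions must lie between 0 and 1")
    validation = 1.0 - training - testing
    if validation < -1e-9:
        raise ValueError("training + testing must not exceed 1.0")
    # Round away floating-point noise such as 0.10000000000000003
    validation = max(round(validation, 10), 0.0)
    return training, testing, validation


train_pct, test_pct, val_pct = split_percentages(0.7, 0.2)
print(train_pct, test_pct, val_pct)
```

The resulting values can be passed as `training_percentage`, `testing_percentage`, and `validation_percentage` to `train_test_val_split`.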
Feature Importance
python
# APL models
importance = auto_clf.get_feature_importances()

# PAL models
from hana_ml.algorithms.pal.preprocessing import FeatureSelection
fs = FeatureSelection()
fs.fit(train_df, features=features, label='TARGET')
Pipeline
python
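from hana_ml.algorithms.pal.pipeline import Pipeline
from hana_ml.algorithms.pal.preprocessing import Imputer, FeatureNormalizer
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Each step is a (name, estimator) pair; steps run in the order listed
pipeline = Pipeline([
    ('imputer', Imputer(strategy='mean')),
    ('normalizer', FeatureNormalizer()),
    ('classifier', UnifiedClassification(func='RandomDecisionTree'))
])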
Best Practices
- •Use lazy evaluation - Operations build SQL without execution until collect()
- •Leverage in-database processing - Keep data in HANA for performance
- •Use Unified interfaces - Consistent APIs across algorithms
- •Save models - Use ModelStorage for persistence
- •Explain predictions - Use SHAP explainers for interpretability
- •Monitor AutoML - Use PipelineProgressStatusMonitor for long-running jobs
Bundled Resources
Reference Files
- •references/DATAFRAME_REFERENCE.md (479 lines)
  - •ConnectionContext API, DataFrame operations, SQL generation
- •references/PAL_ALGORITHMS.md (869 lines)
  - •Complete PAL algorithm reference (100+ algorithms)
  - •Classification, Regression, Clustering, Time Series, Preprocessing
- •references/APL_ALGORITHMS.md (534 lines)
  - •AutoML capabilities, automated feature engineering
  - •AutoClassifier, AutoRegressor, GradientBoosting classes
- •references/VISUALIZERS.md (704 lines)
  - •14 visualization modules (EDA, SHAP, metrics, time series)
  - •Plot types, configuration, export options
- •references/SUPPORTING_MODULES.md (626 lines)
  - •Model storage, spatial analytics, graph algorithms
  - •Text mining, statistics, error handling
Error Handling
python
from hana_ml.ml_exceptions import Error
try:
    clf.fit(train_df, features=features, label='TARGET')
except Error as e:
    print(f"HANA ML Error: {e}")
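When several steps (fit, predict, score) need the same handling, the try/except can be factored into one wrapper. The sketch below is an assumption-laden illustration: `run_step` and `FakeHanaError` are hypothetical names, and `FakeHanaError` only stands in for `hana_ml.ml_exceptions.Error` so that the example runs without a HANA connection.

```python
import logging

logger = logging.getLogger("hana_ml_example")


def run_step(name, step, handled=(Exception,)):
    """Run one step; return (result, None) on success or (None, message) on a handled error."""
    try:
        return step(), None
    except handled as exc:
        logger.error("%s failed: %s", name, exc)
        return None, f"{name} failed: {exc}"


class FakeHanaError(Exception):
    """Stand-in for hana_ml.ml_exceptions.Error so the sketch runs without HANA."""


def failing_fit():
    # Simulates a fit call that raises a library error
    raise FakeHanaError("table not found")


ok_result, ok_error = run_step("score", lambda: 0.93, handled=(FakeHanaError,))
bad_result, bad_error = run_step("fit", failing_fit, handled=(FakeHanaError,))
print(bad_error)
```

In real use, pass `handled=(Error,)` with `Error` imported from `hana_ml.ml_exceptions`, and wrap the call as `lambda: clf.fit(train_df, features=features, label='TARGET')`. Exceptions outside `handled` still propagate, which keeps programming errors visible.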
Documentation
- •Official Docs: https://help.sap.com/doc/1d0ebfe5e8dd44d09606814d83308d4b/2.0.07/en-US/hana_ml.html
- •PyPI Package: https://pypi.org/project/hana-ml/