Model Format and Conversion¶
This document describes the AIMNet2 model format, metadata structure, and conversion between legacy and new formats.
It reflects the current implementation in aimnet.models.base, aimnet.models.utils, and aimnet.calculators.calculator.
Model Formats¶
AIMNet2 supports two model formats:
| Format | Extension | Version | Description |
|---|---|---|---|
| Legacy | .jpt | 1 | TorchScript JIT-compiled model with embedded LR modules |
| New | .pt | 2 | State dict with embedded YAML config and metadata |
Format Detection¶
When loading a model via load_model(), the format is automatically detected:
- New format: Dictionary containing a "model_yaml" key
- Legacy format: torch.jit.ScriptModule instance
Metadata Structure¶
Model metadata is returned by load_model() as a ModelMetadata TypedDict.
For early v2 bundles that predate format_version, load_model() defaults to format_version=2.
Core Fields¶
| Field | Type | Description |
|---|---|---|
| format_version | int | 1 = legacy JIT, 2 = new format (default for early v2 bundles) |
| cutoff | float | Model short-range cutoff radius (Å) |
| implemented_species | list[int] | Supported atomic numbers |
Coulomb Configuration¶
| Field | Type | Description |
|---|---|---|
| needs_coulomb | bool | If True, calculator should add external Coulomb |
| coulomb_mode | str | What's embedded: "sr_embedded", "full_embedded", or "none" |
| coulomb_sr_rc | float \| None | SR Coulomb cutoff (only if coulomb_mode="sr_embedded") |
| coulomb_sr_envelope | str \| None | Envelope function: "exp" (mollifier) or "cosine" |
Dispersion Configuration¶
| Field | Type | Description |
|---|---|---|
| needs_dispersion | bool | If True, calculator should add external DFTD3 |
| d3_params | dict \| None | D3 parameters: {s6, s8, a1, a2} |
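The fields in the three tables above can be gathered into one typed dictionary. The following is a minimal sketch that mirrors those tables. It is not copied from aimnet.models.base, and the class name `ModelMetadataSketch` and the example values are illustrative assumptions.

```python
from typing import Optional, TypedDict


class ModelMetadataSketch(TypedDict):
    """Illustrative mirror of the documented fields (not the library class)."""

    # Core fields
    format_version: int              # 1 = legacy JIT, 2 = new format
    cutoff: float                    # short-range cutoff radius in Å
    implemented_species: list[int]   # supported atomic numbers
    # Coulomb configuration
    needs_coulomb: bool
    coulomb_mode: str                # "sr_embedded", "full_embedded", or "none"
    coulomb_sr_rc: Optional[float]
    coulomb_sr_envelope: Optional[str]
    # Dispersion configuration
    needs_dispersion: bool
    d3_params: Optional[dict[str, float]]


# Hypothetical metadata for a new-format model with SR Coulomb embedded
# and no external dispersion.
example: ModelMetadataSketch = {
    "format_version": 2,
    "cutoff": 5.0,
    "implemented_species": [1, 6, 7, 8],
    "needs_coulomb": True,
    "coulomb_mode": "sr_embedded",
    "coulomb_sr_rc": 4.6,
    "coulomb_sr_envelope": "exp",
    "needs_dispersion": False,
    "d3_params": None,
}
```

Because a TypedDict is an ordinary dict at runtime, fields are read with normal subscripting, for example `example["coulomb_mode"]`.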
Which Format Should I Use?¶
Decision Matrix¶
| Scenario | Format | Model Type | Notes |
|---|---|---|---|
| Training new model | v2 (.pt) | Export after training | Flexible, modern |
| Need runtime Coulomb control | v2 (.pt) | Convert from v1 if needed | Switch simple/DSF/Ewald |
| Production inference | v2 (.pt) | Preferred | Smaller, more flexible |
| Legacy deployment | v1 (.jpt) | Keep as-is | If compatibility required |
| Experimenting with methods | v2 (.pt) | Required | Runtime reconfiguration |
| Fixed pipeline | Either | Use what works | No strong preference |
Quick Selection Guide¶
Use v2 (.pt) if:
- Training new models
- Need to try different Coulomb methods
- Want runtime flexibility
- Prefer modern PyTorch features
Keep v1 (.jpt) if:
- Existing deployment works
- Don't need to change methods
- Legacy compatibility required
- No issues with current setup
Coulomb Modes¶
The coulomb_mode field describes what Coulomb treatment is embedded in the model.
Coulomb Mode Comparison¶
┌─────────────────────────────────────────────────────────────────┐
│ sr_embedded (v2 format - RECOMMENDED) │
├─────────────────────────────────────────────────────────────────┤
│ Model: E_NN - E_SR (SR Coulomb subtracted) │
│ Calculator: + E_full (adds full Coulomb externally) │
│ Total: E_NN + E_LR (SR cancels out) │
│ │
│ Runtime control: ✓ Can switch simple/DSF/Ewald │
│ File size: Smaller (no LR modules embedded) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ full_embedded (v1 legacy format) │
├─────────────────────────────────────────────────────────────────┤
│ Model: E_NN + E_Coulomb (full Coulomb embedded in JIT) │
│ Calculator: (nothing) │
│ Total: E_NN + E_Coulomb │
│ │
│ Runtime control: ✗ Fixed method, warning only │
│ File size: Larger (modules in JIT) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ none (no Coulomb) │
├─────────────────────────────────────────────────────────────────┤
│ Model: E_NN only │
│ Calculator: (nothing) │
│ Total: E_NN │
│ │
│ Runtime control: N/A │
│ Use case: Models without electrostatics │
└─────────────────────────────────────────────────────────────────┘
"sr_embedded"¶
- Model has SRCoulomb (short-range) embedded
- Model outputs: E_NN - E_SR
- Calculator adds full Coulomb externally: E_total = (E_NN - E_SR) + E_full = E_NN + E_LR
- Uses coulomb_sr_rc and coulomb_sr_envelope from metadata
- User can switch Coulomb method (simple/DSF/Ewald) at runtime via set_lrcoulomb_method()
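The cancellation can be written out step by step. This assumes that the full Coulomb energy splits as E_full = E_SR + E_LR, with E_SR evaluated using the same cutoff and envelope as the embedded SRCoulomb module, which is what "SR cancels out" implies:

```latex
\begin{aligned}
E_{\text{total}} &= (E_{\text{NN}} - E_{\text{SR}}) + E_{\text{full}} \\
                 &= (E_{\text{NN}} - E_{\text{SR}}) + (E_{\text{SR}} + E_{\text{LR}}) \\
                 &= E_{\text{NN}} + E_{\text{LR}}
\end{aligned}
```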
SR Coulomb Cutoff (coulomb_sr_rc)¶
The short-range Coulomb cutoff defines the distance within which SR Coulomb interactions are computed by the embedded SRCoulomb module.
Constraint: coulomb_sr_rc <= model cutoff
The SR cutoff must be less than or equal to the model's short-range cutoff (cutoff) because:
- SRCoulomb uses the same neighbor list as the neural network
- Atom pairs beyond the model cutoff are not visible to SRCoulomb
Typical value: 4.6 Å (with a model cutoff of 5.0 Å).
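The constraint can be checked directly on the metadata dictionary. This is a minimal sketch: the helper name `check_sr_cutoff` is hypothetical and not part of aimnet, and it assumes the metadata keys documented above.

```python
def check_sr_cutoff(metadata: dict) -> None:
    """Raise ValueError if coulomb_sr_rc exceeds the model cutoff."""
    if metadata.get("coulomb_mode") != "sr_embedded":
        return  # coulomb_sr_rc only applies to sr_embedded models
    sr_rc = metadata.get("coulomb_sr_rc")
    if sr_rc is None:
        raise ValueError("sr_embedded model is missing coulomb_sr_rc")
    if sr_rc > metadata["cutoff"]:
        raise ValueError(
            f"coulomb_sr_rc={sr_rc} exceeds model cutoff={metadata['cutoff']}"
        )


# Typical values from the text: 4.6 Å SR cutoff with a 5.0 Å model cutoff.
check_sr_cutoff({"coulomb_mode": "sr_embedded", "coulomb_sr_rc": 4.6, "cutoff": 5.0})
```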
SR Envelope (coulomb_sr_envelope)¶
The envelope function defines how the SR interaction decays at the cutoff:
- "exp": Smooth mollifier-based decay (default)
- "cosine": Cosine-based decay
"full_embedded"¶
- Legacy JIT model with full Coulomb embedded
- Model outputs: E_NN + E_Coulomb directly
- No external Coulomb needed (needs_coulomb=False)
- Coulomb method cannot be changed at runtime
"none"¶
- No Coulomb treatment in model
- needs_coulomb=False
- Model outputs: E_NN only
Dispersion Modes¶
External DFTD3 (needs_dispersion=True)¶
- DFTD3/D3BJ module removed from model during export
- D3 parameters (s6, s8, a1, a2) stored in d3_params metadata
- Calculator creates external DFTD3 module
- Cutoff can be configured via set_dftd3_cutoff() for external DFTD3 only
Note: DFTD3 cutoff/smoothing values are not currently stored in metadata. External DFTD3 defaults to 15.0 Å cutoff and 0.8 smoothing fraction unless overridden at runtime.
Embedded D3TS¶
- D3TS (learned parameters) remains embedded in model
- needs_dispersion=False for D3TS models
- Cannot be modified at runtime
No Dispersion¶
- needs_dispersion=False and no D3TS
- Model outputs energy without dispersion correction
File Structure¶
New Format (.pt)¶
{
"format_version": 2, # May be omitted in early v2 bundles (loader defaults to 2)
"model_yaml": str, # Core model YAML config (no LR modules)
"cutoff": float,
"needs_coulomb": bool,
"needs_dispersion": bool,
"coulomb_mode": str,
"coulomb_sr_rc": float | None,
"coulomb_sr_envelope": str | None,
"d3_params": dict | None, # DFTD3 params for external use
"implemented_species": list[int],
"state_dict": dict, # Model weights (SAE baked in)
}
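A quick sanity check on a loaded bundle is to compare its keys with the structure above. This is a sketch only: `inspect_bundle` and `DOCUMENTED_KEYS` are hypothetical names, the function takes an already-loaded dictionary (for example the result of `torch.load`), and whether every listed key is strictly required by the loader is not confirmed here.

```python
# Keys listed in the file structure above; format_version is handled
# separately because early v2 bundles may omit it.
DOCUMENTED_KEYS = {
    "model_yaml", "cutoff", "needs_coulomb", "needs_dispersion",
    "coulomb_mode", "coulomb_sr_rc", "coulomb_sr_envelope",
    "d3_params", "implemented_species", "state_dict",
}


def inspect_bundle(bundle: dict) -> tuple[int, set[str]]:
    """Return (format_version, documented keys absent from the bundle)."""
    if "model_yaml" not in bundle:
        raise ValueError("not a new-format bundle: no 'model_yaml' key")
    version = bundle.get("format_version", 2)  # default for early v2 bundles
    missing = DOCUMENTED_KEYS - bundle.keys()
    return version, missing


# Hypothetical early bundle that predates format_version.
version, missing = inspect_bundle({"model_yaml": "...", "cutoff": 5.0})
print(version, sorted(missing))
```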
Legacy Format (.jpt)¶
TorchScript module with attributes:
- cutoff: Model cutoff
- cutoff_lr: Long-range cutoff (if applicable)
- LRCoulomb and DFTD3/D3BJ modules embedded
Exporting Models¶
Use the aimnet export CLI command:
aimnet export weights.pt model_v2.pt --model config.yaml --sae sae.yaml
Export Process¶
- Load model YAML config, SAE (self-atomic energies), and weights
- Strip LRCoulomb/DFTD3 modules from config
- Add SRCoulomb if LRCoulomb was present (requires determinable rc)
- Build core model from modified config
- Load weights (with strict=False for module changes)
- Bake SAE into atomic_shift.shifts.weight as float64
- Mask unimplemented species (set NaN in afv.weight)
- Save with metadata
Export Options¶
# --needs-coulomb: override, force external Coulomb
# --needs-dispersion: override, force external DFTD3
aimnet export weights.pt model.pt \
    --model config.yaml \
    --sae sae.yaml \
    --needs-coulomb \
    --needs-dispersion
Explicit flags override auto-detection from config.
Converting Legacy Models¶
Use the aimnet convert CLI command:
aimnet convert model.jpt config.yaml model_v2.pt
Conversion Process¶
- Load legacy JIT model and YAML config
- Extract cutoff from model attribute
- Extract implemented_species from afv.weight (non-NaN entries)
- Strip LR modules from config, add SRCoulomb if needed (requires determinable rc)
- Build core model from modified config
- Load weights from JIT state dict
- Validate keys (filter expected missing/unexpected)
- Convert atomic_shift to float64
- Save with metadata
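The species-detection step can be illustrated without PyTorch. The sketch below treats afv.weight as a plain nested list and assumes that the row index equals the atomic number. That indexing, the function name `detect_implemented_species`, and the toy table are assumptions made for illustration, not the converter's actual code.

```python
import math


def detect_implemented_species(afv_weight: list[list[float]]) -> list[int]:
    """Return the row indices whose entries contain no NaN."""
    return [
        z
        for z, row in enumerate(afv_weight)
        if not any(math.isnan(x) for x in row)
    ]


# Toy table with rows 0..8: every row is NaN-masked except H, C, N, O.
nan = float("nan")
table = [[nan, nan] for _ in range(9)]
for z in (1, 6, 7, 8):
    table[z] = [0.1 * z, -0.1 * z]

print(detect_implemented_species(table))  # [1, 6, 7, 8]
```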
Key Changes During Conversion¶
| Legacy | New |
|---|---|
| outputs.lrcoulomb.* | Removed |
| outputs.dftd3.* | Removed |
| outputs.d3bj.* | Removed |
| (none) | outputs.srcoulomb.* added |
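The removals in this table amount to filtering state-dict keys by prefix. The following is a minimal sketch of that filter: the function name and the example keys are hypothetical, and it does not add the outputs.srcoulomb.* entries, which come from the rebuilt model.

```python
# Key prefixes listed as "Removed" in the table above.
REMOVED_PREFIXES = ("outputs.lrcoulomb.", "outputs.dftd3.", "outputs.d3bj.")


def strip_lr_keys(state_dict: dict) -> dict:
    """Drop entries that belong to the stripped long-range modules."""
    return {
        key: value
        for key, value in state_dict.items()
        if not key.startswith(REMOVED_PREFIXES)
    }


# Hypothetical key names; values stand in for tensors.
legacy = {
    "afv.weight": 0,
    "outputs.lrcoulomb.example_buffer": 1,
    "outputs.dftd3.example_param": 2,
}
print(sorted(strip_lr_keys(legacy)))  # ['afv.weight']
```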
Loading Models¶
from aimnet.models.base import load_model
model, metadata = load_model("model.pt", device="cuda")
# Access metadata
print(metadata["cutoff"])
print(metadata["needs_coulomb"])
print(metadata["coulomb_mode"])
Loading Behavior¶
- New format: Parses YAML, builds model, loads state dict
- Legacy format: Returns JIT model directly
- Metadata always returned as ModelMetadata dict
- atomic_shift converted to float64 after loading (precision)
Metadata Behavior Summary¶
| Model Type | needs_coulomb | coulomb_mode | Calculator Behavior |
|---|---|---|---|
| New with Coulomb | True | "sr_embedded" | Adds external LRCoulomb |
| New without Coulomb | False | "none" | No external Coulomb |
| Legacy JIT | False | "full_embedded" | Coulomb embedded in JIT |
| Model Type | needs_dispersion | Calculator Behavior |
|---|---|---|
| New with DFTD3/D3BJ | True | Adds external DFTD3 |
| New with D3TS | False | D3TS embedded |
| New without dispersion | False | No dispersion |
| Legacy with DFTD3 | False | Embedded in JIT (d3_params extracted for diagnostics) |
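Both summary tables reduce to two boolean checks on the metadata. The sketch below shows that decision in isolation. `external_terms` is a hypothetical helper, not the calculator's real code path, and the returned strings are labels only.

```python
def external_terms(metadata: dict) -> list[str]:
    """List the external modules a calculator would add for this model."""
    terms = []
    if metadata.get("needs_coulomb", False):
        terms.append("LRCoulomb")
    if metadata.get("needs_dispersion", False):
        terms.append("DFTD3")
    return terms


# New-format model with SR Coulomb embedded and DFTD3 stripped at export.
print(external_terms({"needs_coulomb": True, "needs_dispersion": True}))
# Legacy JIT model: everything is embedded, so nothing is added.
print(external_terms({"needs_coulomb": False, "needs_dispersion": False}))
```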
API Reference¶
load_model(path, device="cpu")¶
Load model from file with automatic format detection.
Parameters:
- path (str): Path to model file (.pt or .jpt)
- device (str): Device to load model on
Returns:
- model (nn.Module): Loaded model
- metadata (ModelMetadata): Metadata dictionary
ModelMetadata (TypedDict)¶
See Metadata Structure for field definitions.
Migration Guide¶
Why Migrate to v2 Format?¶
Benefits of v2 (.pt) over v1 (.jpt):
- Runtime flexibility: Change Coulomb method (simple/DSF/Ewald) without retraining
- Smaller files: Separate external modules reduce file size
- Better debugging: Access model structure and weights directly
- Modern workflow: Compatible with latest PyTorch features
- Metadata: Rich metadata for validation and documentation
When to convert:
- You have legacy .jpt models and want runtime Coulomb control
- You're training new models (use v2 from the start)
- You need to modify model architecture post-training
When to keep v1:
- Legacy compatibility required
- Model works fine and no new features needed
- Deployment pipeline expects JIT models
Step-by-Step Migration¶
1. Prepare Required Files¶
You'll need:
- model.jpt - Your legacy JIT model
- config.yaml - Original model configuration
# If you don't have config.yaml, you may need to reconstruct it
# from training logs or model inspection
2. Run Conversion¶
aimnet convert model.jpt config.yaml model_v2.pt
What happens during conversion:
- Extracts model weights from JIT state dict
- Strips embedded LRCoulomb/DFTD3 modules
- Adds SRCoulomb if LRCoulomb was present
- Preserves atomic shifts (SAE) as float64
- Detects implemented species from weights
- Generates metadata dictionary
3. Validate Conversion¶
from aimnet.calculators import AIMNet2Calculator
import torch
# Load both models
calc_v1 = AIMNet2Calculator("model.jpt")
calc_v2 = AIMNet2Calculator("model_v2.pt")
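# Test data: random geometry. Species are drawn from H, C, N, O only, because
# randint(1, 9) could yield elements missing from the model's implemented_species.
elements = torch.tensor([1, 6, 7, 8])
data = {
    "coord": torch.randn(10, 3),
    "numbers": elements[torch.randint(0, len(elements), (10,))],
    "charge": 0.0,
}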
# Compare energies (should match within tolerance)
result_v1 = calc_v1(data, forces=True)
result_v2 = calc_v2(data, forces=True)
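# Reduce to Python floats so the format specs and comparisons below work
# regardless of the shape of the returned tensors
energy_diff = (result_v1["energy"] - result_v2["energy"]).abs().max().item()
force_diff = (result_v1["forces"] - result_v2["forces"]).abs().max().item()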
print(f"Energy difference: {energy_diff:.2e} eV")
print(f"Max force difference: {force_diff:.2e} eV/Å")
assert energy_diff < 1e-5, "Energy mismatch!"
assert force_diff < 1e-4, "Force mismatch!"
Expected differences:
- Energies: < 1e-5 eV (numerical precision)
- Forces: < 1e-4 eV/Å (gradient precision)
4. Test Runtime Flexibility¶
# v2 models support runtime method changes
calc_v2.set_lrcoulomb_method("dsf", cutoff=15.0)
result_dsf = calc_v2(data)
calc_v2.set_lrcoulomb_method("ewald", ewald_accuracy=1e-8)
result_ewald = calc_v2(data)
# v1 models show warning but don't change
calc_v1.set_lrcoulomb_method("dsf", cutoff=15.0)
# Warning: Cannot change method for legacy models
Common Conversion Issues¶
Issue: Missing config.yaml¶
Problem: You have a .jpt model but no configuration file.
Solution: Inspect the model to reconstruct config:
import torch
model = torch.jit.load("model.jpt")
# Inspect attributes
print(f"Cutoff: {model.cutoff}")
# cutoff_lr is only present if applicable, so fall back to None
print(f"Cutoff LR: {getattr(model, 'cutoff_lr', None)}")
# May need to manually create config based on model structure
Issue: Weight Mismatch¶
Problem: Conversion completes but validation shows large differences.
Solution: Check for module name mismatches:
# Use verbose mode to see what's happening
aimnet convert model.jpt config.yaml model_v2.pt --verbose
# Check for unexpected missing keys
# Some modules may have been renamed
Issue: Implemented Species Mismatch¶
Problem: Converted model has wrong implemented_species.
Solution: Species are auto-detected from non-NaN entries in afv.weight. Verify:
from aimnet.models.base import load_model
model, metadata = load_model("model_v2.pt")
print(metadata["implemented_species"])
# If wrong, may need to fix config before conversion
Exporting New Models¶
For newly trained models, export directly to v2:
aimnet export weights.pt model_v2.pt \
--model config.yaml \
--sae sae.yaml
Optional flags:
# Override auto-detection
--needs-coulomb # Force external Coulomb
--needs-dispersion # Force external DFTD3
CLI Commands¶
# Export trained model
aimnet export weights.pt output.pt --model config.yaml --sae sae.yaml
# Convert legacy JIT model
aimnet convert model.jpt config.yaml output.pt
# Calculate SAE from dataset
aimnet calc_sae dataset.h5 sae.yaml