Multi-Level Disorder
This guide demonstrates how to enumerate structures with disorder on multiple independent subsets of sites, such as simultaneous cation and anion disorder.
The Problem
Sometimes you need to explore disorder on more than one subset of sites. For example, we might be interested in mixed cation/anion disorder in an oxyfluoride such as (Ti,Zr)OF2:
Cation disorder: Ti/Zr substitution on metal sites
Anion disorder: O/F substitution on anion sites
These disorders are independent—they occur on different sets of sites—but they interact through symmetry breaking.
The Hierarchical Approach
When you disorder one subset of sites, the resulting structure typically has lower symmetry than the parent. This reduced symmetry should be used when enumerating disorder on the second subset.
Algorithm
Level 1: Enumerate disorder on the first subset using the parent structure’s symmetry
Level 2: For each Level 1 configuration:
The configuration has its own (typically reduced) symmetry
Enumerate disorder on the second subset using this reduced symmetry
Collect all Level 2 structures
This hierarchical approach ensures:
No duplicate structures
Correct symmetry analysis at each level
Computational efficiency through symmetry reduction at each level
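The algorithm above can be sketched on a toy system small enough to check by hand. The example below is an illustration only and does not use bsym: it assumes a square with four "cation" corner sites and four "anion" edge sites, represents each symmetry operation as a site permutation, and uses hypothetical helper names (`all_substitutions`, `unique_under`, `unique_substitutions`). Level 2 is enumerated with only those parent operations that leave the Level 1 configuration unchanged, and the result is cross-checked against a brute-force enumeration using the full parent symmetry.

```python
from itertools import permutations

# Toy system (an assumption for illustration): a square with "cation" sites
# 0-3 on the corners and "anion" sites 4-7 on the edge midpoints, where
# edge 4+j joins corners j and j+1.

def rotation(k):
    """Rotate the square by k quarter turns; perm[i] is the image of site i."""
    return tuple([(i + k) % 4 for i in range(4)] +
                 [4 + (j + k) % 4 for j in range(4)])

def reflection(k):
    """Reflect through the axis through corner 0, then rotate by k quarter turns."""
    return tuple([(k - i) % 4 for i in range(4)] +
                 [4 + (k - j - 1) % 4 for j in range(4)])

PARENT_GROUP = [rotation(k) for k in range(4)] + [reflection(k) for k in range(4)]

def apply(perm, config):
    """Move the label on site i to site perm[i]."""
    out = [None] * len(config)
    for i, label in enumerate(config):
        out[perm[i]] = label
    return tuple(out)

def all_substitutions(config, target, distribution):
    """Every distinct way of replacing the `target` labels, ignoring symmetry."""
    sites = [i for i, label in enumerate(config) if label == target]
    labels = [sp for sp, n in distribution.items() for _ in range(n)]
    results = []
    for arrangement in sorted(set(permutations(labels))):
        new = list(config)
        for site, label in zip(sites, arrangement):
            new[site] = label
        results.append(tuple(new))
    return results

def unique_under(configs, group):
    """Keep one representative per orbit of `group`."""
    seen, unique = set(), []
    for config in configs:
        if config not in seen:
            seen.update(apply(g, config) for g in group)
            unique.append(config)
    return unique

def unique_substitutions(config, target, distribution):
    """Symmetry-inequivalent substitutions, using the symmetry of `config` itself."""
    stabiliser = [g for g in PARENT_GROUP if apply(g, config) == config]
    return unique_under(all_substitutions(config, target, distribution), stabiliser)

parent = ('A', 'A', 'A', 'A', 'X', 'X', 'X', 'X')
cations = {'A': 2, 'B': 2}
anions = {'O': 2, 'F': 2}

# Level 1, then Level 2 for each Level 1 configuration
level1 = unique_substitutions(parent, 'A', cations)
hierarchical = [c2 for c1 in level1 for c2 in unique_substitutions(c1, 'X', anions)]

# Cross-check: enumerate every joint arrangement and reduce with the full parent group
joint = [c2 for c1 in all_substitutions(parent, 'A', cations)
         for c2 in all_substitutions(c1, 'X', anions)]
direct = unique_under(joint, PARENT_GROUP)

print(len(level1), len(hierarchical), len(direct))
```

For this toy system the two Level 1 configurations (adjacent and diagonal B sites) give 4 and 3 Level 2 structures respectively, and the hierarchical total of 7 matches the brute-force count, with no duplicates between branches.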
Example: Ti/Zr and O/F Disorder in TiOF2
Let us work through a realistic example: a 2×2×2 supercell of TiOF2 with disorder on both cation and anion sublattices.
Setting Up the Parent Structure
import numpy as np
from pymatgen.core import Structure, Lattice
from bsym.interface.pymatgen import unique_structure_substitutions
# Create 2×2×2 TiOF2 supercell
a = 3.798 # lattice parameter in Ångströms
coords = np.array([[0.0, 0.0, 0.0],
                   [0.5, 0.0, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.0, 0.0, 0.5]])
atom_list = ['Ti', 'X', 'X', 'X']
lattice = Lattice.from_parameters(a=a, b=a, c=a, alpha=90, beta=90, gamma=90)
unit_cell = Structure(lattice, atom_list, coords)
# Create a 2×2×2 supercell
parent_structure = unit_cell * [2, 2, 2]
print(f"Created supercell with {len(parent_structure)} atoms")
print(f" - {len([s for s in parent_structure if s.species_string == 'Ti'])} Ti sites")
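# Compare element symbols: the species string of the placeholder 'X' may carry an oxidation-state suffix
print(f" - {len([s for s in parent_structure if s.specie.symbol == 'X'])} X sites (to be O/F)")
Created supercell with 32 atoms
 - 8 Ti sites
 - 24 X sites (to be O/F)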
Hierarchical Enumeration
We will enumerate Ti/Zr disorder first (smaller combinatorial space), then O/F disorder for each Ti/Zr arrangement.
# Level 1: Ti/Zr disorder on cation sites
print("Level 1: Enumerating Ti/Zr arrangements...")
level1_structures = unique_structure_substitutions(
    parent_structure,
    'Ti',                # Substitute Ti sites
    {'Ti': 6, 'Zr': 2}   # 6 Ti, 2 Zr
)
print(f"Found {len(level1_structures)} unique Ti/Zr arrangements\n")
Level 1: Enumerating Ti/Zr arrangements...
Found 3 unique Ti/Zr arrangements
# Level 2: O/F disorder for each Ti/Zr arrangement
print("Level 2: Enumerating O/F arrangements for each Ti/Zr configuration...")
all_structures = []
for i, structure in enumerate(level1_structures):
    level2_structures = unique_structure_substitutions(
        structure,  # Uses the reduced symmetry of this Ti/Zr arrangement
        'X',
        {'O': 8, 'F': 16}
    )
    print(f" Ti/Zr config {i+1}: {len(level2_structures)} O/F arrangements")
    all_structures.extend(level2_structures)
print(f"\nTotal unique structures: {len(all_structures)}")
Level 2: Enumerating O/F arrangements for each Ti/Zr configuration...
Ti/Zr config 1: 29371 O/F arrangements
Ti/Zr config 2: 28955 O/F arrangements
Ti/Zr config 3: 9782 O/F arrangements
Total unique structures: 68108
Understanding the Output
In this example:
Level 1 produces three unique Ti/Zr cation arrangements
The number of Level 2 structures varies with the Ti/Zr configuration:
Ti/Zr arrangements that preserve more of the parent symmetry → fewer symmetry-inequivalent O/F arrangements (config 3 here)
Ti/Zr arrangements that break more symmetry → more symmetry-inequivalent O/F arrangements (configs 1 and 2)
The total combines all possibilities across both disorder types
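To put these numbers in context, it can help to compare them with the number of arrangements that would exist if symmetry were ignored entirely. The short sketch below uses only the standard library; the site counts and the unique total are taken from the example above, and the variable names are illustrative.

```python
from math import comb

# Site counts from the 2×2×2 supercell used in the example
n_cation_sites, n_zr = 8, 2
n_anion_sites, n_o = 24, 8

# Raw (symmetry-unaware) numbers of arrangements on each sublattice
raw_cation = comb(n_cation_sites, n_zr)
raw_anion = comb(n_anion_sites, n_o)
raw_total = raw_cation * raw_anion

# Total reported by the hierarchical enumeration above
unique_total = 68108

print(f"Raw Ti/Zr arrangements: {raw_cation}")
print(f"Raw O/F arrangements:   {raw_anion}")
print(f"Raw joint arrangements: {raw_total}")
print(f"Reduction from symmetry: about {raw_total / unique_total:.0f}x")
```

The 28 raw Ti/Zr arrangements collapse to the 3 found at Level 1, and the roughly 20.6 million raw joint arrangements collapse to 68108 symmetry-inequivalent structures.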
Implementation Pattern
The general pattern for multi-level disorder is:
# Level 1: First disorder type
level1_configs = unique_structure_substitutions(
    parent_structure,
    species_to_substitute_1,
    composition_1
)
# Level 2: Second disorder type
all_configs = []
for config in level1_configs:
    level2_configs = unique_structure_substitutions(
        config,  # Each has its own symmetry
        species_to_substitute_2,
        composition_2
    )
    all_configs.extend(level2_configs)
This pattern extends naturally to three or more levels by adding additional nested loops.
Iterative Approach for Multiple Levels
For cases with three or more disorder types, nested loops become unwieldy. An iterative approach can be used instead that generalises to any number of levels:
def enumerate_multilevel_disorder(parent_structure, disorder_specs):
    """
    Enumerate structures with multiple levels of disorder.

    Args:
        parent_structure: Initial pymatgen Structure
        disorder_specs: List of dicts, each containing:
            - 'to_substitute': species label to replace
            - 'site_distribution': dict of {species: count}

    Returns:
        List of Structure objects with all disorder levels applied
    """
    structures = [parent_structure]
    for level, spec in enumerate(disorder_specs, 1):
        print(f"Level {level}: Enumerating {spec['to_substitute']} → {spec['site_distribution']}")
        new_structures = []
        for i, structure in enumerate(structures):
            if i % 100 == 0 and len(structures) > 100:
                print(f" Processing structure {i+1}/{len(structures)}...")
            level_structures = unique_structure_substitutions(
                structure,
                spec['to_substitute'],
                spec['site_distribution']
            )
            new_structures.extend(level_structures)
        print(f" Generated {len(new_structures)} structures\n")
        structures = new_structures
    return structures
disorder_specs = [
    {
        'to_substitute': 'Ti',
        'site_distribution': {'Ti': 6, 'Zr': 2}
    },
    {
        'to_substitute': 'X',
        'site_distribution': {'O': 8, 'F': 16}
    }
]
# Run the multi-level enumeration
all_structures = enumerate_multilevel_disorder(parent_structure, disorder_specs)
print(f"Total unique structures: {len(all_structures)}")
Level 1: Enumerating Ti → {'Ti': 6, 'Zr': 2}
Generated 3 structures
Level 2: Enumerating X → {'O': 8, 'F': 16}
Generated 68108 structures
Total unique structures: 68108
This iterative approach:
Generalises easily to 3+ disorder levels
Avoids deeply nested loops
Makes it easy to modify or reorder disorder specifications
Provides progress tracking for long enumerations
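Because the specifications are plain data, they can be checked before a long enumeration starts. The sketch below is a hypothetical helper, not part of bsym, and is independent of whatever checks the library itself performs. It assumes the parent is described by a plain list of species labels and verifies, level by level, that each `site_distribution` accounts for exactly the sites being substituted.

```python
from collections import Counter

def validate_disorder_specs(site_labels, disorder_specs):
    """Check each level's counts against the sites available at that level.

    `site_labels` is a plain list of species labels for the parent structure
    (an assumed input format for this sketch). Returns the final composition.
    """
    counts = Counter(site_labels)
    for level, spec in enumerate(disorder_specs, 1):
        target = spec['to_substitute']
        distribution = spec['site_distribution']
        n_sites = counts.get(target, 0)
        n_requested = sum(distribution.values())
        if n_sites == 0:
            raise ValueError(f"Level {level}: no '{target}' sites to substitute")
        if n_requested != n_sites:
            raise ValueError(
                f"Level {level}: {n_requested} species requested for "
                f"{n_sites} '{target}' sites"
            )
        # Replace the substituted label with the new species counts
        del counts[target]
        counts.update(distribution)
    return counts

# Labels for the 2×2×2 supercell in this guide: 8 Ti sites and 24 X sites
labels = ['Ti'] * 8 + ['X'] * 24
specs = [
    {'to_substitute': 'Ti', 'site_distribution': {'Ti': 6, 'Zr': 2}},
    {'to_substitute': 'X', 'site_distribution': {'O': 8, 'F': 16}},
]
final_counts = validate_disorder_specs(labels, specs)
print(dict(final_counts))
```

A specification whose counts do not sum to the number of target sites raises a `ValueError` naming the offending level, which is cheaper to discover here than part-way through Level 2.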
Notes
Current Implementation
This hierarchical enumeration can be implemented in two ways:
Manual chaining (shown in the first example): Explicitly nest the unique_structure_substitutions calls. Best for 2 levels or when you need fine control over the process.
Iterative approach (shown above): Use the enumerate_multilevel_disorder function to handle an arbitrary number of levels. Best for 3+ levels or when you want cleaner, more maintainable code.
Choosing the Level Order
You can enumerate the subsets in any order—the final set of structures will be the same. However:
Starting with the smaller combinatorial space (fewer permutations) means fewer level-1 structures to loop over
Starting with disorder that breaks symmetry more may lead to faster level-2 enumerations
In practice, performance is often similar regardless of ordering
When using the iterative approach, simply reorder the entries in disorder_specs to change the enumeration order.
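One simple heuristic for the first point is to sort the specifications by their raw combinatorial size, so that the smallest space is enumerated first. The sketch below is an assumption about how one might do this, not a bsym feature: `raw_arrangements` is a hypothetical helper that evaluates the multinomial coefficient for a `site_distribution`, ignoring symmetry.

```python
from math import factorial, prod

def raw_arrangements(site_distribution):
    """Number of labelled arrangements for a distribution, ignoring symmetry."""
    n_sites = sum(site_distribution.values())
    return factorial(n_sites) // prod(factorial(k) for k in site_distribution.values())

# Deliberately listed with the larger space first
disorder_specs = [
    {'to_substitute': 'X', 'site_distribution': {'O': 8, 'F': 16}},
    {'to_substitute': 'Ti', 'site_distribution': {'Ti': 6, 'Zr': 2}},
]

# Smallest raw combinatorial space first
ordered_specs = sorted(
    disorder_specs,
    key=lambda spec: raw_arrangements(spec['site_distribution']),
)

for spec in ordered_specs:
    print(spec['to_substitute'], raw_arrangements(spec['site_distribution']))
```

Here the Ti/Zr level (28 raw arrangements) is placed before the O/F level (735471). The raw count is only a proxy: the number of unique structures at each level also depends on symmetry, so timing both orders on a small cell remains the more reliable test.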
Memory Considerations
For very large structure sets:
Process structures in batches rather than storing all in memory
In the iterative approach, you can modify the function to write structures to disk after each level rather than keeping them all in the structures list
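One possible way to batch the work is a depth-first generator that yields finished structures one at a time instead of building the full list for each level. The sketch below is a hypothetical variant and not part of bsym: the enumeration routine is passed in as the `substitute` argument (in practice this could be `unique_structure_substitutions`), and a stand-in function operating on strings is used here so the example runs without any dependencies.

```python
from itertools import islice

def iter_multilevel(structure, disorder_specs, substitute):
    """Yield fully substituted structures one at a time, depth first."""
    if not disorder_specs:
        yield structure
        return
    spec, remaining = disorder_specs[0], disorder_specs[1:]
    children = substitute(structure, spec['to_substitute'], spec['site_distribution'])
    for child in children:
        yield from iter_multilevel(child, remaining, substitute)

def batched(iterable, size):
    """Group an iterable into lists of at most `size` items."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, size)):
        yield batch

def fake_substitute(structure, to_substitute, site_distribution):
    """Stand-in for a real enumeration call: returns two labelled 'structures'."""
    return [f"{structure}/{to_substitute}#{i}" for i in range(2)]

specs = [
    {'to_substitute': 'Ti', 'site_distribution': {'Ti': 6, 'Zr': 2}},
    {'to_substitute': 'X', 'site_distribution': {'O': 8, 'F': 16}},
]

batches = []
for batch in batched(iter_multilevel('parent', specs, fake_substitute), size=3):
    # A real workflow would write each batch to disk here and then discard it
    batches.append(batch)
    print(len(batch), batch[0])
```

With this layout only one list of children per level is held in memory at a time, along with the current batch. The trade-off is that the per-level structure counts printed by the iterative function are no longer available until the generator has been exhausted.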
Summary
Multi-level disorder requires enumerating permutations on multiple independent subsets of sites
The hierarchical approach enumerates one level at a time, using the appropriate symmetry at each step
Two implementation patterns are available:
Manual chaining for 2 levels or fine-grained control
Iterative approach for 3+ levels or cleaner code
Each level uses the symmetry of configurations from the previous level, ensuring correct and efficient enumeration