Multi-Level Disorder

This guide demonstrates how to enumerate structures with disorder on multiple independent subsets of sites, such as simultaneous cation and anion disorder.

The Problem

Sometimes you need to explore disorder on more than one subset of sites. For example, we might be interested in mixed cation/anion disorder in Zr-substituted TiOF2:

  • Cation disorder: Ti/Zr substitution on metal sites

  • Anion disorder: O/F substitution on anion sites

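These disorders are independent in the sense that they occur on different sets of sites, but they interact through symmetry breaking: an arrangement on one subset lowers the symmetry available when enumerating the other.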

The Hierarchical Approach

When you disorder one subset of sites, the resulting structure typically has lower symmetry than the parent. This reduced symmetry should be used when enumerating disorder on the second subset.

Algorithm

  1. Level 1: Enumerate disorder on the first subset using the parent structure’s symmetry

  2. Level 2: For each Level 1 configuration:

    • The configuration has its own (typically reduced) symmetry

    • Enumerate disorder on the second subset using this reduced symmetry

  3. Collect all Level 2 structures

This hierarchical approach ensures:

  • No duplicate structures

  • Correct symmetry analysis at each level

  • Computational efficiency through symmetry reduction at each level
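The algorithm above can be checked on a toy system that needs neither pymatgen nor bsym. The sketch below is illustrative only: the 8-site ring, its symmetry operations, and the helper names (`ring_symmetries`, `unique_substitutions`, `stabiliser`, `canonical`) are assumptions made for this example and are not part of bsym. It enumerates cations first, then anions under each configuration's reduced symmetry, and compares the result with a brute-force enumeration of both subsets at once.

```python
from itertools import permutations

N = 4  # toy ring: cations on sites 0..3, anions on sites 4..7

def ring_symmetries():
    """Return the 8 symmetry operations of the ring as site permutations."""
    ops = []
    for k in range(N):
        # Rotation by k steps.
        ops.append(tuple([(i + k) % N for i in range(N)] +
                         [N + (i + k) % N for i in range(N)]))
        # Reflection followed by a rotation by k steps.
        ops.append(tuple([(-i + k) % N for i in range(N)] +
                         [N + (-i - 1 + k) % N for i in range(N)]))
    return ops

def apply(op, config):
    """Move the label on site i to site op[i]."""
    out = [None] * len(config)
    for i, label in enumerate(config):
        out[op[i]] = label
    return tuple(out)

def unique_substitutions(config, ops, target, labels):
    """Symmetry-inequivalent ways to place `labels` on the `target` sites."""
    sites = [i for i, label in enumerate(config) if label == target]
    seen, unique = set(), []
    for perm in sorted(set(permutations(labels))):
        trial = list(config)
        for site, label in zip(sites, perm):
            trial[site] = label
        trial = tuple(trial)
        if trial in seen:
            continue
        # Mark every symmetry-equivalent copy of this arrangement as seen.
        seen.update(apply(op, trial) for op in ops)
        unique.append(trial)
    return unique

def stabiliser(config, ops):
    """Operations that leave `config` unchanged: its reduced symmetry."""
    return [op for op in ops if apply(op, config) == config]

def canonical(config, ops):
    """Smallest symmetry-equivalent copy, used to compare configurations."""
    return min(apply(op, config) for op in ops)

parent = ('M',) * N + ('X',) * N
ops = ring_symmetries()
cations = ['Ti', 'Ti', 'Zr', 'Zr']
anions = ['O', 'O', 'F', 'F']

# Level 1: cation disorder using the full parent symmetry.
level1 = unique_substitutions(parent, ops, 'M', cations)

# Level 2: anion disorder using each Level 1 configuration's own symmetry.
hierarchical = []
for config in level1:
    reduced_ops = stabiliser(config, ops)
    hierarchical.extend(unique_substitutions(config, reduced_ops, 'X', anions))

# Brute force: decorate both subsets at once and reduce by the full symmetry.
brute_force = {canonical(c + a, ops)
               for c in set(permutations(cations))
               for a in set(permutations(anions))}

print(len(level1), len(hierarchical), len(brute_force))
```

For this toy ring the hierarchical route and the brute-force route find the same set of inequivalent configurations, which is the "no duplicates, nothing missed" property claimed above.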

Example: Ti/Zr and O/F Disorder in TiOF2

Let us work through a realistic example: a 2×2×2 supercell of TiOF2 with disorder on both cation and anion sublattices.

Setting Up the Parent Structure

import numpy as np
from pymatgen.core import Structure, Lattice
from bsym.interface.pymatgen import unique_structure_substitutions

# Build the cubic unit cell: one Ti site and three anion sites labelled 'X'
a = 3.798  # lattice parameter in Ångströms

coords = np.array([[0.0, 0.0, 0.0],
                   [0.5, 0.0, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.0, 0.0, 0.5]])
atom_list = ['Ti', 'X', 'X', 'X']
lattice = Lattice.from_parameters(a=a, b=a, c=a, alpha=90, beta=90, gamma=90)
unit_cell = Structure(lattice, atom_list, coords)

# Create a 2×2×2 supercell
parent_structure = unit_cell * [2, 2, 2]
print(f"Created supercell with {len(parent_structure)} atoms")
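# Compare element symbols rather than species_string: 'X' is stored as a dummy
# species, whose species_string may carry an oxidation-state suffix
print(f"  - {sum(site.specie.symbol == 'Ti' for site in parent_structure)} Ti sites")
print(f"  - {sum(site.specie.symbol == 'X' for site in parent_structure)} X sites (to be O/F)")
Created supercell with 32 atoms
  - 8 Ti sites
  - 24 X sites (to be O/F)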

Hierarchical Enumeration

We will enumerate Ti/Zr disorder first (smaller combinatorial space), then O/F disorder for each Ti/Zr arrangement.

# Level 1: Ti/Zr disorder on cation sites
print("Level 1: Enumerating Ti/Zr arrangements...")
level1_structures = unique_structure_substitutions(
    parent_structure,
    'Ti',  # Substitute Ti sites
    {'Ti': 6, 'Zr': 2}  # 6 Ti, 2 Zr
)
print(f"Found {len(level1_structures)} unique Ti/Zr arrangements\n")
Level 1: Enumerating Ti/Zr arrangements...
Found 3 unique Ti/Zr arrangements
# Level 2: O/F disorder for each Ti/Zr arrangement
print("Level 2: Enumerating O/F arrangements for each Ti/Zr configuration...")
all_structures = []

for i, structure in enumerate(level1_structures):
    level2_structures = unique_structure_substitutions(
        structure,  # Uses the reduced symmetry of this Ti/Zr arrangement
        'X',
        {'O': 8, 'F': 16}
    )
    print(f"  Ti/Zr config {i+1}: {len(level2_structures)} O/F arrangements")
    all_structures.extend(level2_structures)

print(f"\nTotal unique structures: {len(all_structures)}")
Level 2: Enumerating O/F arrangements for each Ti/Zr configuration...
  Ti/Zr config 1: 29371 O/F arrangements
  Ti/Zr config 2: 28955 O/F arrangements
  Ti/Zr config 3: 9782 O/F arrangements

Total unique structures: 68108

Understanding the Output

In this example:

  • Level 1 produces 3 symmetry-inequivalent Ti/Zr cation arrangements

  • The number of Level 2 O/F arrangements differs for each Ti/Zr configuration (from 9782 to 29371 here):

    • Some Ti/Zr arrangements preserve more symmetry → fewer O/F arrangements needed

    • Other Ti/Zr arrangements break more symmetry → more O/F arrangements needed

  • The total (68108) is the sum of the Level 2 counts over all Level 1 configurations
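To put these numbers in context, the short calculation below compares the number of labelled arrangements (ignoring symmetry) with the number of unique structures reported above. It uses only the site counts and compositions from this example; the variable names are illustrative.

```python
from math import comb

# Labelled arrangements, before any symmetry is applied.
cation_arrangements = comb(8, 2)   # choose 2 of the 8 cation sites for Zr
anion_arrangements = comb(24, 8)   # choose 8 of the 24 anion sites for O
total_labelled = cation_arrangements * anion_arrangements

# Symmetry-inequivalent counts reported by the Level 2 enumeration above.
level2_counts = [29371, 28955, 9782]
total_unique = sum(level2_counts)

print(f"Labelled arrangements: {total_labelled}")
print(f"Unique structures:     {total_unique}")
print(f"Reduction factor:      {total_labelled / total_unique:.1f}")
```

The 28 labelled cation arrangements collapse to 3 at Level 1, and roughly 20.6 million labelled cation/anion combinations collapse to 68108 unique structures overall.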

Implementation Pattern

The general pattern for multi-level disorder is:

# Level 1: First disorder type
level1_configs = unique_structure_substitutions(
    parent_structure, 
    species_to_substitute_1, 
    composition_1
)

# Level 2: Second disorder type
all_configs = []
for config in level1_configs:
    level2_configs = unique_structure_substitutions(
        config,  # Each has its own symmetry
        species_to_substitute_2, 
        composition_2
    )
    all_configs.extend(level2_configs)

This pattern extends naturally to three or more levels by adding additional nested loops.

Iterative Approach for Multiple Levels

For three or more disorder types, nested loops become unwieldy. An iterative approach that generalises to any number of levels can be used instead:

def enumerate_multilevel_disorder(parent_structure, disorder_specs):
    """
    Enumerate structures with multiple levels of disorder.
    
    Args:
        parent_structure: Initial pymatgen Structure
        disorder_specs: List of dicts, each containing:
            - 'to_substitute': species label to replace
            - 'site_distribution': dict of {species: count}
    
    Returns:
        List of Structure objects with all disorder levels applied
    """
    structures = [parent_structure]
    
    for level, spec in enumerate(disorder_specs, 1):
        print(f"Level {level}: Enumerating {spec['to_substitute']} → {spec['site_distribution']}")
        new_structures = []
        
        for i, structure in enumerate(structures):
            if i % 100 == 0 and len(structures) > 100:
                print(f"  Processing structure {i+1}/{len(structures)}...")
            
            level_structures = unique_structure_substitutions(
                structure,
                spec['to_substitute'],
                spec['site_distribution']
            )
            new_structures.extend(level_structures)
        
        print(f"  Generated {len(new_structures)} structures\n")
        structures = new_structures
    
    return structures

# Define one spec per disorder level, in the order the levels should be applied
disorder_specs = [
    {
        'to_substitute': 'Ti',
        'site_distribution': {'Ti': 6, 'Zr': 2}
    },
    {
        'to_substitute': 'X',
        'site_distribution': {'O': 8, 'F': 16}
    }
]

# Run the multi-level enumeration
all_structures = enumerate_multilevel_disorder(parent_structure, disorder_specs)

print(f"Total unique structures: {len(all_structures)}")
Level 1: Enumerating Ti → {'Ti': 6, 'Zr': 2}
  Generated 3 structures

Level 2: Enumerating X → {'O': 8, 'F': 16}
  Generated 68108 structures

Total unique structures: 68108

This iterative approach:

  • Generalises easily to 3+ disorder levels

  • Avoids deeply nested loops

  • Makes it easy to modify or reorder disorder specifications

  • Provides progress tracking for long enumerations

Notes

Current Implementation

This hierarchical enumeration can be implemented in two ways:

  1. Manual chaining (shown in the first example): Explicitly nest the unique_structure_substitutions calls. Best for 2 levels or when you need fine control over the process.

  2. Iterative approach (shown above): Use the enumerate_multilevel_disorder function to handle an arbitrary number of levels. Best for 3+ levels or when you want cleaner, more maintainable code.

Choosing the Level Order

You can enumerate the subsets in any order; the final set of symmetry-inequivalent structures will be the same. However:

  • Starting with the smaller combinatorial space (fewer permutations) means fewer level-1 structures to loop over

  • Starting with disorder that breaks symmetry more may lead to faster level-2 enumerations

  • In practice, performance is often similar regardless of ordering

When using the iterative approach, simply reorder the entries in disorder_specs to change the enumeration order.
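One way to apply the "smaller combinatorial space first" heuristic is to sort the specifications by their number of labelled arrangements. The sketch below is an assumption-laden illustration: `count_arrangements` and `ordered_specs` are hypothetical names introduced here, and the labelled count is only a proxy for the number of symmetry-inequivalent configurations at each level.

```python
from math import factorial

def count_arrangements(site_distribution):
    """Number of labelled arrangements (multinomial coefficient) for one level."""
    total = factorial(sum(site_distribution.values()))
    for count in site_distribution.values():
        total //= factorial(count)
    return total

disorder_specs = [
    {'to_substitute': 'X', 'site_distribution': {'O': 8, 'F': 16}},
    {'to_substitute': 'Ti', 'site_distribution': {'Ti': 6, 'Zr': 2}},
]

# Put the level with the fewest labelled arrangements first.
ordered_specs = sorted(
    disorder_specs,
    key=lambda spec: count_arrangements(spec['site_distribution']),
)
print([spec['to_substitute'] for spec in ordered_specs])
```

Here the Ti/Zr level (28 labelled arrangements) is placed ahead of the O/F level, matching the order used in the example.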

Memory Considerations

For very large structure sets:

  • Process structures in batches rather than storing all in memory

  • In the iterative approach, you can modify the function to write structures to disk after each level rather than keeping them all in the structures list
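A possible way to avoid holding every structure in memory is to turn the enumeration into a generator that walks the levels depth first. The sketch below is not part of bsym: `iter_multilevel_disorder` is a hypothetical helper, and the substitution routine is passed in as the `substitute` argument so that the example runs with a simple stand-in (`fake_substitute`) instead of real structures.

```python
def iter_multilevel_disorder(structure, disorder_specs, substitute):
    """Yield fully substituted structures one at a time (depth first).

    `substitute` is any callable of the form
    substitute(structure, to_substitute, site_distribution) -> list,
    which is the calling pattern used for unique_structure_substitutions
    elsewhere in this guide.
    """
    if not disorder_specs:
        yield structure
        return
    spec, remaining = disorder_specs[0], disorder_specs[1:]
    children = substitute(structure, spec['to_substitute'], spec['site_distribution'])
    for child in children:
        yield from iter_multilevel_disorder(child, remaining, substitute)

# Stand-in for the real substitution routine, so the sketch runs on its own:
# it returns two labelled "children" per input, regardless of composition.
def fake_substitute(structure, to_substitute, site_distribution):
    return [f"{structure}|{to_substitute}:{i}" for i in range(2)]

specs = [
    {'to_substitute': 'Ti', 'site_distribution': {'Ti': 6, 'Zr': 2}},
    {'to_substitute': 'X', 'site_distribution': {'O': 8, 'F': 16}},
]

results = []
for final_structure in iter_multilevel_disorder("parent", specs, fake_substitute):
    # In real use, write each structure to disk here instead of collecting it.
    results.append(final_structure)

print(len(results))
```

With bsym installed, passing `unique_structure_substitutions` as `substitute` should give the same structures as the list-based function, but only one list of children per level is held in memory at any time.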

Summary

  • Multi-level disorder requires enumerating permutations on multiple independent subsets of sites

  • The hierarchical approach enumerates one level at a time, using the appropriate symmetry at each step

  • Two implementation patterns are available:

    • Manual chaining for 2 levels or fine-grained control

    • Iterative approach for 3+ levels or cleaner code

  • Each level uses the symmetry of configurations from the previous level, ensuring correct and efficient enumeration