Varying Composition Substitutions

This guide demonstrates how to generate symmetry-inequivalent structures across multiple compositions in a single operation using unique_structure_substitutions_by_composition.

Overview

The unique_structure_substitutions_by_composition function extends the basic substitution functionality by systematically exploring all possible compositions of your specified species.

Key differences from unique_structure_substitutions:

  • Input: Provide a list of species instead of a fixed site_distribution dict

  • Output: Returns a dictionary mapping composition tuples to lists of structures

  • Scope: Explores all possible compositions (or a constrained range) rather than a single one

This is particularly useful when you want to:

  • Survey structures across a composition range

  • Build phase diagrams

  • Screen materials with variable stoichiometry

Basic Example: Binary Substitution

Let’s start with a simple example: a 4-site system in which every Li site is replaced by one of two species, A or B.

import numpy as np
from pymatgen.core import Structure, Lattice
from bsym.interface.pymatgen import unique_structure_substitutions_by_composition

# Create a simple 2×2 square lattice
coords = np.array([[0.0, 0.0, 0.0]])
atom_list = ['Li']
lattice = Lattice.from_parameters(a=1.0, b=1.0, c=1.0, alpha=90, beta=90, gamma=90)
parent_structure = Structure(lattice, atom_list, coords) * [2, 2, 1]

print(f"Created structure with {len(parent_structure)} sites")
Created structure with 4 sites

Now we’ll generate all symmetry-inequivalent structures obtained by placing species A and B on the Li sites:

results = unique_structure_substitutions_by_composition(
    parent_structure,
    'Li',
    ['A', 'B']  # List of species (order matters for composition tuples)
)

print(f"Generated structures for {len(results)} different compositions:")
for composition, structures in results.items():
    print(f"  Composition {composition}: {len(structures)} unique structure(s)")
Generated structures for 5 different compositions:
  Composition (0, 4): 1 unique structure(s)
  Composition (4, 0): 1 unique structure(s)
  Composition (1, 3): 1 unique structure(s)
  Composition (3, 1): 1 unique structure(s)
  Composition (2, 2): 2 unique structure(s)

Understanding the Output Format

The function returns a dictionary where:

  • Keys are composition tuples: (n_A, n_B) representing the count of each species

  • Values are lists of Structure objects

The order of species in the tuple matches the order in your species list.
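
Because the tuple order follows the species list, pairing the two with zip recovers a per-species count. The helper below is a minimal sketch of that idea; composition_label is a hypothetical name used only for illustration and is not part of bsym.

```python
def composition_label(species, composition):
    """Build a label such as 'Li6Na3' from a species list and a composition tuple.

    Hypothetical helper for illustration only; not part of bsym.
    """
    if len(species) != len(composition):
        raise ValueError("species and composition must have the same length")
    return "".join(f"{name}{count}" for name, count in zip(species, composition))


species = ["A", "B"]
# Pair each species with its count; the positions line up by construction.
counts = dict(zip(species, (1, 3)))
label = composition_label(species, (1, 3))
print(counts, label)  # {'A': 1, 'B': 3} A1B3
```

The same helper works unchanged for three or more species, which avoids hard-coding one f-string per system.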

Let’s examine a specific composition:

# Access structures with 2 A atoms and 2 B atoms
composition_2_2 = results[(2, 2)]

print(f"Composition (2A, 2B) has {len(composition_2_2)} unique structure(s)")
print("\nStructure 0:")
print(composition_2_2[0])
print(f"\nDegeneracy: {composition_2_2[0].number_of_equivalent_configurations}")
Composition (2A, 2B) has 2 unique structure(s)

Structure 0:
Full Formula (A2 B2)
Reduced Formula: AB
abc   :   2.000000   2.000000   1.000000
angles:  90.000000  90.000000  90.000000
pbc   :       True       True       True
Sites (4)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  A0+   0    0      0
  1  A0+   0    0.5    0
  2  B     0.5  0      0
  3  B     0.5  0.5    0

Degeneracy: 4
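
To see where these counts and degeneracies come from, the brute-force sketch below enumerates every A/B arrangement on a periodic n×n square lattice and groups arrangements that are related by a lattice translation or by one of the eight square point-group operations. This is an illustration under stated assumptions, not how bsym is implemented: the function names are hypothetical, and the symmetry operations are written out by hand rather than obtained from the structure.

```python
from itertools import combinations, product


def site_permutations(n):
    """Site permutations of a periodic n x n square lattice.

    Assumption: the symmetry group is generated by lattice translations and
    the eight point-group operations of the square, written out by hand.
    """
    sites = [(i, j) for i in range(n) for j in range(n)]
    index = {site: k for k, site in enumerate(sites)}
    point_ops = [
        ((1, 0), (0, 1)), ((-1, 0), (0, -1)),   # identity, 180 degree rotation
        ((0, -1), (1, 0)), ((0, 1), (-1, 0)),   # 90 and 270 degree rotations
        ((1, 0), (0, -1)), ((-1, 0), (0, 1)),   # axis mirrors
        ((0, 1), (1, 0)), ((0, -1), (-1, 0)),   # diagonal mirrors
    ]
    perms = set()
    for ((a, b), (c, d)), (tx, ty) in product(point_ops, product(range(n), repeat=2)):
        perms.add(tuple(
            index[((a * i + b * j + tx) % n, (c * i + d * j + ty) % n)]
            for i, j in sites
        ))
    return perms


def unique_configurations(n, n_a):
    """Return the degeneracy of each inequivalent way to place n_a A atoms."""
    perms = site_permutations(n)
    seen, degeneracies = set(), []
    for config in map(frozenset, combinations(range(n * n), n_a)):
        if config in seen:
            continue
        # The orbit is every arrangement reachable by a symmetry operation.
        orbit = {frozenset(p[k] for k in config) for p in perms}
        seen |= orbit
        degeneracies.append(len(orbit))
    return degeneracies


for n_a in range(5):
    print(n_a, unique_configurations(2, n_a))
# For n_a = 2 this prints [4, 2]: a striped arrangement (4 equivalent
# configurations) and a checkerboard arrangement (2 equivalent configurations).
```

The striped arrangement corresponds to Structure 0 above, whose degeneracy of 4 counts the two row and two column stripes.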

Example: Li-Na Binary System

Let’s look at a more realistic example with actual elements on a larger lattice:

# Create a 3×3 lattice
coords = np.array([[0.0, 0.0, 0.0]])
atom_list = ['X']  # Placeholder to substitute
lattice = Lattice.from_parameters(a=1.0, b=1.0, c=1.0, alpha=90, beta=90, gamma=90)
parent_structure = Structure(lattice, atom_list, coords) * [3, 3, 1]

print(f"Created structure with {len(parent_structure)} sites")
Created structure with 9 sites

# Generate Li-Na structures across all compositions
li_na_results = unique_structure_substitutions_by_composition(
    parent_structure,
    'X',
    ['Li', 'Na']
)

print(f"Number of compositions explored: {len(li_na_results)}")
print("\nComposition summary:")
for composition, structures in sorted(li_na_results.items()):
    n_li, n_na = composition
    total_configs = sum(s.number_of_equivalent_configurations for s in structures)
    print(f"  Li{n_li}Na{n_na}: {len(structures)} unique, {total_configs} total configurations")
Number of compositions explored: 10

Composition summary:
  Li0Na9: 1 unique, 1 total configurations
  Li1Na8: 1 unique, 9 total configurations
  Li2Na7: 2 unique, 36 total configurations
  Li3Na6: 4 unique, 84 total configurations
  Li4Na5: 5 unique, 126 total configurations
  Li5Na4: 5 unique, 126 total configurations
  Li6Na3: 4 unique, 84 total configurations
  Li7Na2: 2 unique, 36 total configurations
  Li8Na1: 1 unique, 9 total configurations
  Li9Na0: 1 unique, 1 total configurations
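
The total-configuration column can be checked independently of bsym: summing the degeneracies at one composition must give the number of ways to choose which of the 9 sites hold Li, which is the binomial coefficient C(9, n_Li). A short sketch of that check, with the totals copied by hand from the summary above:

```python
from math import comb

n_sites = 9
# Totals copied from the composition summary, keyed by the number of Li atoms.
reported_totals = {0: 1, 1: 9, 2: 36, 3: 84, 4: 126,
                   5: 126, 6: 84, 7: 36, 8: 9, 9: 1}

for n_li, total in reported_totals.items():
    # Number of ways to choose which of the 9 sites are occupied by Li.
    assert total == comb(n_sites, n_li), (n_li, total)

# Every site is either Li or Na, so all compositions together cover 2**9 cases.
grand_total = sum(reported_totals.values())
print(grand_total)  # 512
```

A mismatch in this kind of check would indicate that some configurations were missed or double-counted.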

Constraining Compositions with Bounds

Often you don’t want to explore all possible compositions. The bounds parameter lets you constrain the range:

Format: {'species_name': (min_count, max_count)}, where both limits are inclusive

For example, to explore only lithium-rich compositions:

# Only compositions with at least 6 Li atoms
li_rich_results = unique_structure_substitutions_by_composition(
    parent_structure,
    'X',
    ['Li', 'Na'],
    bounds={'Li': (6, 9)}  # 6-9 Li atoms
)

print("Li-rich compositions (6+ Li):")
for composition in sorted(li_rich_results.keys()):
    n_li, n_na = composition
    print(f"  Li{n_li}Na{n_na}: {len(li_rich_results[composition])} unique structures")
Li-rich compositions (6+ Li):
  Li6Na3: 4 unique structures
  Li7Na2: 2 unique structures
  Li8Na1: 1 unique structures
  Li9Na0: 1 unique structures

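You can constrain multiple species at once. The counts must still sum to the number of substituted sites (9 here), so only compositions that satisfy every bound simultaneously are returned. With 3-6 Li and 1-4 Na, that leaves just Li5Na4 and Li6Na3: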

# Compositions with 3-6 Li and 1-4 Na
constrained_results = unique_structure_substitutions_by_composition(
    parent_structure,
    'X',
    ['Li', 'Na'],
    bounds={'Li': (3, 6), 'Na': (1, 4)}
)

print("Constrained compositions:")
for composition in sorted(constrained_results.keys()):
    n_li, n_na = composition
    print(f"  Li{n_li}Na{n_na}: {len(constrained_results[composition])} unique structures")
Constrained compositions:
  Li5Na4: 5 unique structures
  Li6Na3: 4 unique structures
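
The way bounds interact with the fixed site count can be sketched in plain Python. The helper below is hypothetical (allowed_compositions is not a bsym function, and the library's own logic may differ); it assumes inclusive bounds and lets unbounded species range from 0 to the number of sites.

```python
from itertools import product


def allowed_compositions(n_sites, species, bounds=None):
    """List composition tuples that sum to n_sites and respect inclusive bounds.

    Hypothetical helper for illustration; bsym's implementation may differ.
    """
    bounds = bounds or {}
    ranges = []
    for name in species:
        # Species without an explicit bound may take any count from 0 to n_sites.
        low, high = bounds.get(name, (0, n_sites))
        ranges.append(range(low, high + 1))
    # Keep only the count combinations that fill every site exactly once.
    return [c for c in product(*ranges) if sum(c) == n_sites]


print(allowed_compositions(9, ["Li", "Na"], {"Li": (6, 9)}))
# [(6, 3), (7, 2), (8, 1), (9, 0)]
print(allowed_compositions(9, ["Li", "Na"], {"Li": (3, 6), "Na": (1, 4)}))
# [(5, 4), (6, 3)]
```

Running a check like this first is a cheap way to confirm that a set of bounds selects the compositions you expect before starting the symmetry analysis.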

Three-Species Example

The function works with any number of species. Here’s an example with three:

# Create a smaller structure for 3-species exploration
small_structure = Structure(lattice, ['X'], [[0.0, 0.0, 0.0]]) * [2, 2, 1]

# Li-Na-K ternary system
ternary_results = unique_structure_substitutions_by_composition(
    small_structure,
    'X',
    ['Li', 'Na', 'K'],
    bounds={'Li': (0, 4), 'Na': (0, 4), 'K': (0, 2)}  # Limit K content
)

print(f"Generated {len(ternary_results)} compositions")
print("\nExample compositions:")
for composition, structures in sorted(ternary_results.items())[:5]:
    n_li, n_na, n_k = composition
    print(f"  Li{n_li}Na{n_na}K{n_k}: {len(structures)} unique structures")
Generated 12 compositions

Example compositions:
  Li0Na2K2: 2 unique structures
  Li0Na3K1: 1 unique structures
  Li0Na4K0: 1 unique structures
  Li1Na1K2: 2 unique structures
  Li1Na2K1: 2 unique structures

Using Progress Bars

When running from a terminal or Python script, you can enable progress bars to monitor the computation:

# Create a 5×5 lattice
coords = np.array([[0.0, 0.0, 0.0]])
atom_list = ['X']  # Placeholder to substitute
lattice = Lattice.from_parameters(a=1.0, b=1.0, c=1.0, alpha=90, beta=90, gamma=90)
parent_structure = Structure(lattice, atom_list, coords) * [5, 5, 1]

# Enable progress bars
results_with_progress = unique_structure_substitutions_by_composition(
    parent_structure,
    'X',
    ['Li', 'Na'],
    show_progress=True,
    verbose=True
)

This produces terminal output showing progress for each composition:

100%|████████████████████████| 1/1 [00:00<00:00, 1971.93 permutations/s, found=1]
100%|█████████████████████| 25/25 [00:00<00:00, 50291.41 permutations/s, found=1]
100%|██████████████████| 300/300 [00:00<00:00, 115196.48 permutations/s, found=5]
100%|███████████████| 2300/2300 [00:00<00:00, 197986.64 permutations/s, found=19]
100%|█████████████| 12650/12650 [00:00<00:00, 218314.01 permutations/s, found=88]
100%|████████████| 53130/53130 [00:00<00:00, 240692.40 permutations/s, found=309]
100%|██████████| 177100/177100 [00:00<00:00, 246178.45 permutations/s, found=975]
...

Each progress bar corresponds to one composition being processed, showing the number of permutations evaluated and unique configurations found.

Note: For Jupyter notebooks, use show_progress='notebook' to display interactive progress widgets instead of ASCII bars.
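
The permutation totals in the progress bars above (1, 25, 300, 2300, ...) are the binomial coefficients C(25, k), so the cost of a binary run can be estimated before starting it. A small sketch using only the standard library:

```python
from math import comb

n_sites = 25  # 5x5 lattice

# Number of arrangements to evaluate when k sites hold one of the two species.
totals = [comb(n_sites, k) for k in range(n_sites + 1)]

print(totals[:7])   # [1, 25, 300, 2300, 12650, 53130, 177100]
print(max(totals))  # 5200300, reached at k = 12 and k = 13
print(sum(totals))  # 33554432, i.e. 2**25
```

The mid-range compositions dominate the run time, which is one reason to restrict the range with bounds on larger lattices.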

Exporting Structures

You might want to write structures to files, organised by composition:

import os

# Example: save structures for each composition
# (Commented out to avoid creating files in documentation)

# output_dir = 'li_na_structures'
# os.makedirs(output_dir, exist_ok=True)

# for composition, structures in li_na_results.items():
#     n_li, n_na = composition
#     comp_dir = os.path.join(output_dir, f'Li{n_li}Na{n_na}')
#     os.makedirs(comp_dir, exist_ok=True)
#     
#     for i, structure in enumerate(structures):
#         filename = os.path.join(comp_dir, f'structure_{i}.cif')
#         structure.to(filename=filename, fmt='cif')

print("Structures can be exported using structure.to(filename='...', fmt='cif')")
Structures can be exported using structure.to(filename='...', fmt='cif')

Key Points

  • unique_structure_substitutions_by_composition explores multiple compositions in one call

  • Returns a dictionary with composition tuples as keys

  • Composition tuple order matches the species list order

  • Use bounds to constrain composition ranges

  • Each structure retains its number_of_equivalent_configurations attribute

  • Efficient for surveying composition space and building phase diagrams

When to Use This vs Fixed Composition

Use unique_structure_substitutions_by_composition when:

  • You want to explore multiple compositions

  • You are building a phase diagram or composition-property map

  • You don’t know the optimal composition in advance

Use unique_structure_substitutions when:

  • You know the exact composition you need

  • You are performing sequential substitutions with intermediate analysis

  • You need to track full_configuration_degeneracy through multiple steps

Next Steps