Composition Enumeration

The Composition Enumeration Problem

Single vs Multiple Compositions

The unique_configurations() method finds all symmetry-inequivalent configurations for a fixed composition - a specific distribution of object types. For example, {0: 2, 1: 2} specifies exactly 2 objects of type 0 and 2 of type 1.

However, many problems require exploring multiple compositions:

  • Building phase diagrams (e.g., LixNa1-xCoO2 across all x)

  • Finding energetically favorable compositions

  • Surveying disorder across different stoichiometries

The Challenge

For an n-site system with k species, there are many possible compositions. A naive approach would call unique_configurations() separately for each composition, but this is inefficient because many compositions are related by symmetry.

Example: Binary System

Consider a 4-site system with 2 species (labels 0 and 1):

Possible compositions:

  • (4, 0) - all sites are species 0

  • (3, 1) - 3 sites species 0, 1 site species 1

  • (2, 2) - 2 sites of each species

  • (1, 3) - 1 site species 0, 3 sites species 1

  • (0, 4) - all sites are species 1

If we are studying Li/Na substitution on 4 sites, (3, 1) might represent Li3Na, while (1, 3) represents LiNa3. These are different chemically, but from a configuration enumeration perspective they are equivalent — the same counting problem with labels swapped.

Species Exchange Symmetry

What is Species Exchange Symmetry?

Two compositions are related by species exchange symmetry if one can be obtained from the other by permuting the species labels.

For a composition represented as a tuple (n₀, n₁, n₂, ...) where ni is the count of species i:

  • (3, 1, 0) and (1, 3, 0) are related by swapping species 0 ↔ species 1

  • (2, 1, 1) and (1, 2, 1) are related by swapping species 0 ↔ species 1

  • (2, 1, 1), (1, 2, 1), (1, 1, 2) are all related by various species permutations

Why This Matters

Compositions related by species exchange produce identical sets of symmetry-inequivalent configurations (up to relabeling).

If we find the unique configurations for composition (3, 1), we can generate the configurations for (1, 3) simply by swapping the species labels in each configuration. No expensive symmetry analysis is needed for the second composition.

Partitions and Canonical Compositions

Integer Partitions

An integer partition of n into k parts is a way of writing n as a sum of k non-negative integers, in non-increasing order.

For n=4 sites and k=2 species:

  • (4, 0) - partition: 4 + 0 = 4

  • (3, 1) - partition: 3 + 1 = 4

  • (2, 2) - partition: 2 + 2 = 4

Note: (3, 1) and (1, 3) represent the same partition. When written in non-increasing order, both become (3, 1).

Partitions Group Compositions

Each partition corresponds to all compositions related by species exchange:

Partition (3, 1) corresponds to:

  • Composition (3, 1) - 3 of species 0, 1 of species 1

  • Composition (1, 3) - 1 of species 0, 3 of species 1

Partition (2, 2) corresponds to:

  • Composition (2, 2) - 2 of each species (only one arrangement - symmetric)

Partition (4, 0) corresponds to:

  • Composition (4, 0) - all sites species 0

  • Composition (0, 4) - all sites species 1

Canonical Composition

The canonical composition for a partition is the partition tuple itself.

  • For partition (3, 1): canonical composition is (3, 1)

  • For partition (2, 2): canonical composition is (2, 2)

All other permutations of the partition are non-canonical compositions.

The Optimization Strategy

Instead of performing symmetry analysis on every composition:

  1. Generate all partitions of n sites into k species

  2. For each partition, perform symmetry analysis only on the canonical composition

  3. For non-canonical compositions, generate configurations by relabeling species in the canonical results

This reduces computational cost from O(compositions) to O(partitions).

Example: 4 Sites, 2 Species

All compositions: (4,0), (3,1), (2,2), (1,3), (0,4) - 5 total

All partitions: (4,0), (3,1), (2,2) - 3 total

Without optimization (naive approach):

  • 5 separate symmetry analyses (one per composition)

With optimization:

  • Analyze partition (4,0) → generate composition (0,4) by relabeling

  • Analyze partition (3,1) → generate composition (1,3) by relabeling

  • Analyze partition (2,2) (symmetric - no additional compositions)

  • 3 symmetry analyses

Reduction: 40% (5 analyses → 3 analyses)

Species Mapping and Relabeling

The Mapping Vector

To generate configurations for a non-canonical composition from canonical results, we compute a species mapping vector that describes how to relabel species.

For partition (3, 1) with compositions (3, 1)(1, 3):

  • Species 0 in canonical (count 3) appears at position 1 in (1, 3) → mapping[0] = 1

  • Species 1 in canonical (count 1) appears at position 0 in (1, 3) → mapping[1] = 0

Mapping vector: [1, 0] (swap species 0↔1)

Applying the Mapping

Apply the mapping vector to each configuration by relabeling species:

Canonical configuration for (3, 1):

[0, 0, 0, 1]  # 3 sites are species 0, 1 site is species 1

After applying mapping [1, 0]:

[1, 1, 1, 0]  # 3 sites are species 1, 1 site is species 0

This produces a valid configuration for composition (1, 3).

The degeneracy (count) is preserved during relabeling - if the canonical configuration represents 4 equivalent arrangements, the relabeled configuration also represents 4 equivalent arrangements.

The Algorithm

Overview

The composition enumeration algorithm:

  1. Generate partitions: Find all partitions of n sites into k species

  2. Generate compositions: For each partition, create all permutations to get all compositions

  3. Filter by bounds (if specified): Keep only compositions satisfying occupancy constraints

  4. Analyze canonical compositions: For each partition, perform symmetry analysis on the canonical composition only

  5. Generate non-canonical results: For other compositions from the same partition, relabel the canonical configurations using species mapping vectors

Step-by-Step Process

Given: n-site configuration space, k species, optional occupancy bounds

Step 1: Generate Partitions

Generate all integer partitions of n into k parts:

# For n=4, k=2:
partitions = [(4, 0), (3, 1), (2, 2)]

Step 2: Generate Compositions from Each Partition

For each partition, generate all distinct permutations:

# For partition (3, 1):
permutations = [(3, 1), (1, 3)]

Step 3: Filter by Bounds (Optional)

If occupancy bounds are specified, check each composition:

# bounds = {0: (1, 3), 1: (1, 3)}
# Composition (4, 0) violates bounds (species 1 has 0, needs at least 1)
# Composition (3, 1) satisfies bounds ✓

Step 4: Analyze Canonical Composition

For each partition, perform expensive symmetry analysis on the canonical composition:

# For partition (3, 1):
canonical = (3, 1)
site_distribution = {0: 3, 1: 1}
canonical_configs = config_space.unique_configurations(site_distribution)

Step 5: Generate Non-Canonical Configurations

For each non-canonical composition from the same partition:

# For composition (1, 3) from partition (3, 1):
mapping = compute_mapping_vector((3, 1), (1, 3))  # Returns [1, 0]
relabeled_configs = [apply_species_mapping(config, mapping) 
                     for config in canonical_configs]
results[(1, 3)] = relabeled_configs

Implementation in bsym

The algorithm is implemented in ConfigurationSpace.unique_configurations_by_composition():

from bsym import ConfigurationSpace

config_space = ConfigurationSpace(
    objects=[1, 2, 3, 4],
    symmetry_group=my_symmetry_group
)

# Enumerate all compositions for 2 species
results = config_space.unique_configurations_by_composition(n_species=2)

# Results is a dictionary: {composition_tuple: [Configuration, ...]}
# e.g., {(4, 0): [...], (3, 1): [...], (2, 2): [...], (1, 3): [...], (0, 4): [...]}

With optional bounds:

# Only compositions with 1-3 of each species
bounds = {0: (1, 3), 1: (1, 3)}
results = config_space.unique_configurations_by_composition(
    n_species=2,
    bounds=bounds
)

# Results: {(3, 1): [...], (2, 2): [...], (1, 3): [...]}
# (4, 0) and (0, 4) excluded by bounds