# Composition Enumeration ## The Composition Enumeration Problem ### Single vs Multiple Compositions The `unique_configurations()` method finds all symmetry-inequivalent configurations for a **fixed composition** - a specific distribution of object types. For example, `{0: 2, 1: 2}` specifies exactly 2 objects of type 0 and 2 of type 1. However, many problems require exploring **multiple compositions**: - Building phase diagrams (e.g., LixNa1-xCoO2 across all x) - Finding energetically favorable compositions - Surveying disorder across different stoichiometries ### The Challenge For an n-site system with k species, there are many possible compositions. A naive approach would call `unique_configurations()` separately for each composition, but this is inefficient because many compositions are related by symmetry. ### Example: Binary System Consider a 4-site system with 2 species (labels 0 and 1): Possible compositions: - `(4, 0)` - all sites are species 0 - `(3, 1)` - 3 sites species 0, 1 site species 1 - `(2, 2)` - 2 sites of each species - `(1, 3)` - 1 site species 0, 3 sites species 1 - `(0, 4)` - all sites are species 1 If we are studying Li/Na substitution on 4 sites, `(3, 1)` might represent Li3Na, while `(1, 3)` represents LiNa3. These are different chemically, but from a **configuration enumeration perspective** they are equivalent — the same counting problem with labels swapped. ## Species Exchange Symmetry ### What is Species Exchange Symmetry? Two compositions are related by **species exchange symmetry** if one can be obtained from the other by permuting the species labels. For a composition represented as a tuple `(n₀, n₁, n₂, ...)` where ni is the count of species i: - `(3, 1, 0)` and `(1, 3, 0)` are related by swapping species 0 ↔ species 1 - `(2, 1, 1)` and `(1, 2, 1)` are related by swapping species 0 ↔ species 1 - `(2, 1, 1)`, `(1, 2, 1)`, `(1, 1, 2)` are all related by various species permutations ### Why This Matters Compositions related by species exchange produce **identical sets** of symmetry-inequivalent configurations (up to relabeling). If we find the unique configurations for composition `(3, 1)`, we can generate the configurations for `(1, 3)` simply by swapping the species labels in each configuration. No expensive symmetry analysis is needed for the second composition. ## Partitions and Canonical Compositions ### Integer Partitions An **integer partition** of n into k parts is a way of writing n as a sum of k non-negative integers, in non-increasing order. For n=4 sites and k=2 species: - `(4, 0)` - partition: 4 + 0 = 4 - `(3, 1)` - partition: 3 + 1 = 4 - `(2, 2)` - partition: 2 + 2 = 4 Note: `(3, 1)` and `(1, 3)` represent the **same partition**. When written in non-increasing order, both become `(3, 1)`. ### Partitions Group Compositions Each partition corresponds to all compositions related by species exchange: **Partition (3, 1)** corresponds to: - Composition `(3, 1)` - 3 of species 0, 1 of species 1 - Composition `(1, 3)` - 1 of species 0, 3 of species 1 **Partition (2, 2)** corresponds to: - Composition `(2, 2)` - 2 of each species (only one arrangement - symmetric) **Partition (4, 0)** corresponds to: - Composition `(4, 0)` - all sites species 0 - Composition `(0, 4)` - all sites species 1 ### Canonical Composition The **canonical composition** for a partition is the partition tuple itself. - For partition `(3, 1)`: canonical composition is `(3, 1)` - For partition `(2, 2)`: canonical composition is `(2, 2)` All other permutations of the partition are **non-canonical compositions**. ### The Optimization Strategy Instead of performing symmetry analysis on every composition: 1. Generate all partitions of n sites into k species 2. For each partition, perform symmetry analysis **only on the canonical composition** 3. For non-canonical compositions, generate configurations by **relabeling species** in the canonical results This reduces computational cost from O(compositions) to O(partitions). ### Example: 4 Sites, 2 Species All compositions: `(4,0)`, `(3,1)`, `(2,2)`, `(1,3)`, `(0,4)` - **5 total** All partitions: `(4,0)`, `(3,1)`, `(2,2)` - **3 total** **Without optimization** (naive approach): - 5 separate symmetry analyses (one per composition) **With optimization**: - Analyze partition `(4,0)` → generate composition `(0,4)` by relabeling - Analyze partition `(3,1)` → generate composition `(1,3)` by relabeling - Analyze partition `(2,2)` (symmetric - no additional compositions) - **3 symmetry analyses** **Reduction: 40%** (5 analyses → 3 analyses) ## Species Mapping and Relabeling ### The Mapping Vector To generate configurations for a non-canonical composition from canonical results, we compute a **species mapping vector** that describes how to relabel species. For partition `(3, 1)` with compositions `(3, 1)` → `(1, 3)`: - Species 0 in canonical (count 3) appears at position 1 in `(1, 3)` → mapping[0] = 1 - Species 1 in canonical (count 1) appears at position 0 in `(1, 3)` → mapping[1] = 0 Mapping vector: `[1, 0]` (swap species 0↔1) ### Applying the Mapping Apply the mapping vector to each configuration by relabeling species: **Canonical configuration for (3, 1)**: ``` [0, 0, 0, 1] # 3 sites are species 0, 1 site is species 1 ``` **After applying mapping [1, 0]**: ``` [1, 1, 1, 0] # 3 sites are species 1, 1 site is species 0 ``` This produces a valid configuration for composition `(1, 3)`. The degeneracy (count) is preserved during relabeling - if the canonical configuration represents 4 equivalent arrangements, the relabeled configuration also represents 4 equivalent arrangements. ## The Algorithm ### Overview The composition enumeration algorithm: 1. **Generate partitions**: Find all partitions of n sites into k species 2. **Generate compositions**: For each partition, create all permutations to get all compositions 3. **Filter by bounds** (if specified): Keep only compositions satisfying occupancy constraints 4. **Analyze canonical compositions**: For each partition, perform symmetry analysis on the canonical composition only 5. **Generate non-canonical results**: For other compositions from the same partition, relabel the canonical configurations using species mapping vectors ### Step-by-Step Process **Given**: n-site configuration space, k species, optional occupancy bounds **Step 1: Generate Partitions** Generate all integer partitions of n into k parts: ```python # For n=4, k=2: partitions = [(4, 0), (3, 1), (2, 2)] ``` **Step 2: Generate Compositions from Each Partition** For each partition, generate all distinct permutations: ```python # For partition (3, 1): permutations = [(3, 1), (1, 3)] ``` **Step 3: Filter by Bounds (Optional)** If occupancy bounds are specified, check each composition: ```python # bounds = {0: (1, 3), 1: (1, 3)} # Composition (4, 0) violates bounds (species 1 has 0, needs at least 1) # Composition (3, 1) satisfies bounds ✓ ``` **Step 4: Analyze Canonical Composition** For each partition, perform expensive symmetry analysis on the canonical composition: ```python # For partition (3, 1): canonical = (3, 1) site_distribution = {0: 3, 1: 1} canonical_configs = config_space.unique_configurations(site_distribution) ``` **Step 5: Generate Non-Canonical Configurations** For each non-canonical composition from the same partition: ```python # For composition (1, 3) from partition (3, 1): mapping = compute_mapping_vector((3, 1), (1, 3)) # Returns [1, 0] relabeled_configs = [apply_species_mapping(config, mapping) for config in canonical_configs] results[(1, 3)] = relabeled_configs ``` ### Implementation in bsym The algorithm is implemented in `ConfigurationSpace.unique_configurations_by_composition()`: ```python from bsym import ConfigurationSpace config_space = ConfigurationSpace( objects=[1, 2, 3, 4], symmetry_group=my_symmetry_group ) # Enumerate all compositions for 2 species results = config_space.unique_configurations_by_composition(n_species=2) # Results is a dictionary: {composition_tuple: [Configuration, ...]} # e.g., {(4, 0): [...], (3, 1): [...], (2, 2): [...], (1, 3): [...], (0, 4): [...]} ``` With optional bounds: ```python # Only compositions with 1-3 of each species bounds = {0: (1, 3), 1: (1, 3)} results = config_space.unique_configurations_by_composition( n_species=2, bounds=bounds ) # Results: {(3, 1): [...], (2, 2): [...], (1, 3): [...]} # (4, 0) and (0, 4) excluded by bounds ```