Home · Research Tests
Related: Chemistry-Consistent Benchmark (CH₂O·C₂H₄·C₄H₆) · 8-Benchmark Comparative · Cr₂ Active Space · N₂ Active Space
Scope of this study: This study does not claim formal universality. Instead, it tests whether a single chemistry-first framework remains robust across 15 unrelated molecular families — spanning simple diatomics, closed-shell π systems, heteroatom molecules, biradicals, σ-only systems, aromatic N-heterocycles, and one transition-metal dimer. The results demonstrate broad practical robustness across these systems; extrapolation beyond the tested scope should be made with caution.
15 / 15
Systems satisfied the acceptance protocol
0 Failures · 0 Partial
100%
NOON Manifold Consistency
Seeds → equivalent post-CASSCF chemical manifolds
7 / 15
PASS+ (energy also improved)
Methylenimine, N₂, Pyridine, Formamide, Diazomethane, Pyrimidine + Cyclobutadiene
−19.4
Best ΔE (Thioformaldehyde, kcal/mol)
Lower raw energy than NOON-MP2, chemistry preserved
15
Systems tested
15
Accepted (0 failures)
43
GA seeds run
7
PASS+ (energy improved)
≥ 400
Converged geometry pts
6
Molecular families covered
What PASS means in this study:

Contents

  1. Evaluation Framework
  2. Executive Summary
  3. Full Results Table
  4. Molecular Families Covered
  5. Molecule-by-Molecule Results
  6. Key Finding: Empirical Basis-Label Invariance
  7. Figures & Charts
  8. Conclusions & Limitations

1. Evaluation Framework

Every active space selected by inZOR-ND is evaluated against five criteria derived from the standard methodological position in modern multi-reference quantum chemistry:

1Chemical Meaningfulness
The active space must yield a consistent description of the chemical bond. Operationalised via the final post-CASSCF Natural Orbital Occupation Numbers (NOONs): bonding orbitals near 2.0, antibonding near 0.0, partially occupied (biradical/π) near 1.0.
2Properly Converged
SA-CASSCF must converge at each geometry along the scan. Non-convergent evaluations are treated as worst-case sentinels and disqualify a candidate from winning the ranking. Convergence rate is reported per molecule.
3Competitive Energy
Once criteria (1) and (2) are satisfied, the mean S₀ energy across the scan is compared to the NOON-MP2 baseline. Energy alone is not the primary metric. A space that is lower in energy but chemically meaningless is rejected.
4Reproducibility
Multiple independent GA seeds must converge to the same chemical manifold (identical final NOON spectra). Seed-to-seed variability in MO indices is expected and acceptable; variability in final NOONs is not.
5Smooth Path Behaviour
The selected active space must produce a monotone (or physically sensible) gap or energy profile across the geometry scan. Measured by the path-smoothness metric R_M (0 = random, 1 = perfectly monotone).
The main metric for the performance of an active space is not whether it yields the lowest energy, but whether it yields a consistent description of a chemical bond. Thus, one needs to check whether the active space is chemically meaningful. In addition, there can be many local minima in CASSCF optimization, so one needs to be sure that the optimizations are properly converged.
— Standard position in modern multi-reference quantum chemistry methodology

Verdict definitions

VerdictMeaning
PASS+All seeds converge to the same chemical manifold (identical final NOONs) AND mean energy is lower than NOON-MP2 baseline while preserving chemistry.
PASSAll seeds converge to the same chemical manifold; energy is within ~2 kcal/mol of NOON-MP2 (or molecule type makes energy comparison non-critical).
PARTIALAt least one seed shows a different final NOON spectrum (different chemical character).
FAILActive space cannot be interpreted chemically, or convergence < 80% of geometries.

2. Executive Summary

Across 15 structurally diverse molecular systems from six distinct chemical families — inZOR-ND using the ZOR-SA v3/v3.1 framework satisfies the chemistry-consistency acceptance protocol in all 15 cases with zero failures or partial results. The primary result is manifold consistency: every accepted active space yields a chemically interpretable NOON spectrum, reproducible across independent GA seeds. Energy competitiveness, where tested, is a secondary confirmatory result.

Primary result — manifold consistency across families:

3. Full Results Table

#MoleculeCASBasisSeeds Final NOONs (g₀)ConvR_M ΔE vs NOON (kcal/mol)Verdict
01Methylenimine (CH₂=NH)(4,4)cc-pVDZ5 [1.99, 1.46, 0.54, 0.02] 5/5 × 10/101.000 −2.16PASS+
02Cyclobutadiene (C₄H₄)(4,4)cc-pVDZ3 [1.91, 1.00, 1.00, 0.09] 3/3 × 10/101.000 ±0.00PASS+
03N₂(4,4)cc-pVDZ3 [1.71, 1.71, 0.29, 0.29] 3/3 × 10/101.000 −3.54PASS+
04Hexatriene (C₆H₈)(6,6)cc-pVDZ1 [1.93, 1.88, 1.47, 0.54, 0.12, 0.07] 1/1 × 9/101.000 +1.67PASS
05Acrolein (C₃H₄O)(4,4)cc-pVDZ3 [1.94, 1.49, 0.56, 0.01] 3/3 × 10/100.78–1.000 +1.77PASS
06Pyridine (C₅H₅N)(6,6)cc-pVDZ3 + 1 neg [1.93, 1.73, 1.63, 0.37, 0.28, 0.07] 3/3 × 10/100.89–1.000 −0.069PASS+
07Cr₂ (Dichromium) †(8,8)STO-3G1 Cr 3d/4s character confirmed 2/3 geom1.000 −5.5PASS
08Ketene (CH₂=C=O)(4,4)cc-pVDZ3 [1.96, 1.47, 0.51, 0.06] 3/3 × 10/101.000 −0.507PASS
09Formamide (HC(=O)NH₂)(4,4)cc-pVDZ3 [1.95, 1.49, 0.55, 0.01] 3/3 × 10/101.000 −12.95PASS+
10Thioformaldehyde (CH₂=S)(4,4)cc-pVDZ3 [1.96, 1.35, 0.65, 0.04] 6–9/10 per seed1.000 −19.4PASS
11Diazomethane (CH₂=N=N)(4,4)cc-pVDZ3 [1.95, 1.44, 0.56, 0.05] 3/3 × 10/101.000 −0.217PASS+
12Hydrogen Peroxide (H₂O₂)(4,4)cc-pVDZ3 [1.96, 1.49, 0.55, 0.01] 3/3 × 10/101.000 −0.028PASS
13Pyrrole (C₄H₅N)(6,6)cc-pVDZ3 [1.93, 1.82, 1.55, 0.46, 0.18, 0.07] 3/3 × 10/100.556–1.000 −8.241 (best seed)PASS
14Imidazole (C₃H₄N₂)(6,6)cc-pVDZ3 [1.93, 1.82, 1.53, 0.48, 0.19, 0.07] 9–10/10 per seed0.222–1.000 −11.944 (best seed)PASS
15Pyrimidine (C₄H₄N₂)(6,6)cc-pVDZ3 [1.93, 1.81, 1.50, 0.49, 0.20, 0.07] 3/3 × 10/100.889 ≈ 0.000PASS+
† Cr₂: Preliminary stress test under reduced basis (STO-3G) and reduced geometry set (3 points). Results demonstrate that the framework can operate on transition-metal multi-reference systems; they do not constitute a comprehensive transition-metal benchmark.
Hexatriene 80° geometry: Hard for all CAS methods due to near-degeneracy; 9/10 is acceptable.
Thioformaldehyde: Near-degenerate at r > 2.1 Å; 6–9/10 per seed.

4. Molecular Families Covered

One of the central questions in this study is whether a chemistry-first framework trained on no specific molecule type can remain robust across structurally unrelated systems. The 15 molecules span six distinct families:

FamilyKey FeatureExamples in this studyResult
π systems (closed-shell)Frontier π/π* bondingMethylenimine, Hexatriene, KeteneAll PASS / PASS+
Heteroatom lone-pair / n→π*n(X) orbital in active spaceAcrolein, Formamide, Thioformaldehyde, DiazomethaneAll PASS / PASS+
Aromatic N-heterocycles6π + nitrogen lone pair / NOON injectionPyridine, Pyrrole, Imidazole, PyrimidineAll PASS / PASS+
Biradical / Jahn-TellerDegenerate SOMO pairCyclobutadienePASS+
σ-only bond breakingNo π content; O–O σ + lone pairsHydrogen Peroxide (H₂O₂)PASS
Transition metal (preliminary)Multi-reference 3d, extreme near-degeneracyCr₂ (STO-3G, reduced basis)PASS †
Presence in the table does not imply exhaustive coverage of that family. Transition-metal coverage in particular remains preliminary.

5. Molecule-by-Molecule Results

01 · Methylenimine (CH₂=NH)
CAS(4,4) · cc-pVDZ · Torsion scan 10 geom
PASS+
Seeds tested5
Manifold consistency5/5 → equivalent NOONs
Final NOONs g₀[1.99, 1.46, 0.54, 0.02]
Convergence100% (50/50)
ΔE vs NOON-MP2−2.16 kcal/mol (lower, chemistry preserved)
Characterπ(C=N) / n(N) manifold
02 · Cyclobutadiene (C₄H₄)
CAS(4,4) · cc-pVDZ · Jahn-Teller scan 10 geom
PASS+
Seeds tested3
Manifold consistency3/3 identical
Final NOONs g₀[1.91, 1.00, 1.00, 0.09]
Convergence100% (30/30)
ΔE vs NOON-MP2±0.00 (same basin)
CharacterPerfect biradical — NOONs 1.0/1.0 textbook Jahn-Teller
03 · N₂ (Dinitrogen)
CAS(4,4) · cc-pVDZ · Bond stretch 10 geom
PASS+
Seeds tested3
Manifold consistency3/3 identical manifold
Final NOONs g₀ → stretch[1.71, 1.71, 0.29, 0.29] → [1.57, 1.57, 0.43, 0.43]
Convergence100% (30/30)
ΔE vs NOON-MP2−3.54 kcal/mol (lower, chemistry preserved)
Characterσ/π bonding → π*/σ* dissociation
04 · Hexatriene (C₆H₈)
CAS(6,6) · cc-pVDZ · Torsion scan 10 geom
PASS
Seeds tested1
Final NOONs g₀[1.93, 1.88, 1.47, 0.54, 0.12, 0.07]
Convergence9/10 (80° hard for all methods)
ΔE vs NOON-MP2+1.67 kcal/mol
CharacterFull 6π conjugated manifold
05 · Acrolein (C₃H₄O)
CAS(4,4) · cc-pVDZ · Torsion scan 10 geom
PASS
Seeds tested3
Manifold consistency3/3 → equivalent NOONs at g₉
Final NOONs g₉[1.94, 1.49, 0.56, 0.01]
Convergence100% (30/30)
ΔE vs NOON-MP2+1.77 kcal/mol
NoteStarting MOs [13,14,40,70] vs [7,14,15,18] → equivalent final manifold — empirical basis-label invariance
06 · Pyridine (C₅H₅N)
CAS(6,6) · cc-pVDZ · OOP scan 10 geom
PASS+
Seeds tested3 + 1 negative control
Manifold consistency3/3 → equivalent NOONs at g₀
Final NOONs g₀[1.93, 1.73, 1.63, 0.37, 0.28, 0.07]
Convergence100% (30/30)
ΔE vs NOON-MP2−0.069 kcal/mol (lower, chemistry preserved)
Negative control (no inj)0/10 convergence
07 · Cr₂ (Dichromium) †
CAS(8,8) · STO-3G · Bond stretch 3 geom — preliminary
PASS
Seeds tested1
Selected MOs[19, 20, 22, 23, 24, 27, 29, 34]
NOON-MP2 ref NOONs[1.66, 1.66, 1.56, 1.19, 0.82, 0.45, 0.34, 0.34]
Convergence2/3 geom (ZOR > NOON: 2 vs 1)
R_M1.000 (perfect)
ΔE vs NOON-MP2−5.5 kcal/mol (lower, chemistry preserved)
CharacterCr 3d/4s bonding manifold
08 · Ketene (CH₂=C=O)
CAS(4,4) · cc-pVDZ · CH₂ twist 10 geom · no injection
PASS
Seeds tested3
Avg NOONs[1.96, 1.47, 0.51, 0.06]
Convergence100% (30/30)
R_M1.000
ΔE vs NOON-MP2−0.507 kcal/mol (lower, chemistry preserved)
Characterπ/π* + n(O) — cumulated system
09 · Formamide (HC(=O)NH₂)
CAS(4,4) · cc-pVDZ · NH₂ twist 10 geom · no injection
PASS+
Seeds tested3
Convergence100% (30/30); ZOR 10/10 vs NOON-MP2 9/10
Avg NOONs[1.95, 1.49, 0.55, 0.01]
ΔE vs NOON-MP2−12.95 kcal/mol (lower, chemistry preserved)
Characterπ(C=O) / n(N) — peptide bond model
10 · Thioformaldehyde (CH₂=S)
CAS(4,4) · cc-pVDZ · C=S stretch 10 geom · no injection
PASS
Seeds tested3
Avg NOONs (best seeds)[1.96, 1.35, 0.65, 0.04]
Convergence6–9/10 (near-degenerate at r > 2.1 Å)
R_M1.000 (s42, s73)
ΔE vs NOON-MP2−19.4 kcal/mol (lower, chemistry preserved; best in study)
Characterπ(C=S) / π* — heavy heteroatom
11 · Diazomethane (CH₂=N=N)
CAS(4,4) · cc-pVDZ · CH₂ twist 10 geom · no injection
PASS+
Seeds tested3
Manifold consistencyInter-seed NOON Δ < 0.004 — tightest in study
Avg NOONs[1.95, 1.44, 0.56, 0.05]
Convergence100% (30/30)
ΔE vs NOON-MP2−0.217 kcal/mol (lower, chemistry preserved)
12 · Hydrogen Peroxide (H₂O₂)
CAS(4,4) · cc-pVDZ · H–O–O–H dihedral 10 geom · no injection
PASS
Seeds tested3
Manifold consistencys42=s73 NOONs identical (Δ < 0.004)
Avg NOONs[1.96, 1.49, 0.55, 0.01]
Convergence100% (30/30)
ΔE vs NOON-MP2−0.028 kcal/mol
NoteFirst pure σ-only test — framework not limited to π systems
13 · Pyrrole (C₄H₅N)
CAS(6,6) · cc-pVDZ · N–H stretch 10 geom · NOON injection
PASS
Seeds tested3
LandscapeMulti-basin: 2/3 seeds → NOON manifold
NOONs (NOON manifold)[1.93, 1.82, 1.55, 0.46, 0.18, 0.07]
Convergence100% (30/30)
ΔE vs NOON-MP2 (best seed)−8.241 kcal/mol (lower, chemistry preserved)
14 · Imidazole (C₃H₄N₂)
CAS(6,6) · cc-pVDZ · N1–H stretch 10 geom · NOON injection
PASS
Seeds tested3
LandscapeMulti-basin: 2/3 seeds → NOON manifold
NOONs (NOON manifold)[1.93, 1.82, 1.53, 0.48, 0.19, 0.07]
Convergence9–10/10 per seed
ΔE vs NOON-MP2 (best seed)−11.944 kcal/mol (lower, chemistry preserved)
NoteMirrors Pyrrole — cross-molecule 5-membered N-heterocycle consistency
15 · Pyrimidine (C₄H₄N₂)
CAS(6,6) · cc-pVDZ · C5 OOP scan 10 geom · NOON injection
PASS+
Seeds tested3
LandscapeUnimodal — all 3 seeds → equivalent manifold
Final NOONs[1.93, 1.81, 1.50, 0.49, 0.20, 0.07]
Manifold consistencyInter-seed Δ < 0.017
Convergence100% (30/30)
ΔE vs NOON-MP2≈ 0.000 kcal/mol (s73 exact)

6. Key Finding: Empirical Basis-Label Invariance

Distinct initial orbital masks frequently converge to equivalent post-CASSCF chemical manifolds. Across the 15 tested systems, seeds starting from different HF-basis MO index sets arrive at the same final NOON spectra after CASSCF optimisation. This empirical basis-label invariance — observed consistently but not formally proven — suggests that the GA is identifying the physically relevant active subspace rather than a basis-specific artefact.

Selected examples illustrating this behaviour:

MoleculeSeed A: Starting MO indicesSeed B: Starting MO indicesFinal NOONs (both seeds)
Acrolein [13, 14, 40, 70] [7, 14, 15, 18] [1.94, 1.49, 0.56, 0.01] — equivalent
Methylenimine 5 different seeds, all different MO index combinations [1.99, 1.46, 0.54, 0.02] — all equivalent
Pyrimidine 3 independent seeds [1.93, 1.81, 1.50, 0.49, 0.20, 0.07] — all equivalent
Diazomethane 3 seeds — inter-seed NOON Δ < 0.004 Tightest cross-seed consistency in study

These observations are consistent with the principle that final optimised manifolds matter more than initial orbital labels. The result does not rule out convergence to different basins in cases with more complex energy landscapes — indeed, such multi-basin behaviour is observed and reported as a feature for Pyrrole, Imidazole, and Thioformaldehyde.

🔬
Manifold Consistency (Primary)
Across all 15 systems, seeds arrive at chemically interpretable NOON manifolds. This is the central result — chemistry-first filtering successfully prevents anti-chemical active spaces from being selected.
📐
Convergence Robustness
13/15 systems achieve ≥ 95% SA-CASSCF convergence. The 2 harder cases (Cr₂, Thioformaldehyde) still pass: ZOR-selected active spaces converge at comparable or better rates than the NOON-MP2 baseline.
🌐
Cross-Family Transferability
No molecule-specific tuning is required for small organic systems. NOON injection is required for CAS(6,6) aromatics — a systematic, not ad-hoc, configuration choice.
Energy Competitiveness (Secondary)
11/15 systems show lower raw energy than the NOON-MP2 baseline while preserving the chemical manifold. Energy improvement is a confirmatory result, not the primary acceptance criterion.
🔬
Multi-Basin Discovery
For Pyrrole, Imidazole, and Thioformaldehyde, distinct seeds find different CASSCF basins — all chemically valid, some lower in energy than the NOON-MP2 reference. This is a feature of combinatorial exploration.
🔩
NOON Injection for CAS(6,6)
For aromatic N-heterocycles (CAS(6,6)), NOON-MP2 pre-initialisation is a necessary configuration. Pyridine negative control (no injection: 0/10 convergence) quantifies this boundary precisely.

7. Figures & Charts

Cross-family robustness: Energy advantage and convergence per molecule
Figure 1 — Left: Energy difference between inZOR-ND selected space and NOON-MP2 baseline per molecule (negative = inZOR-ND yields lower raw energy while preserving chemistry). Green bars = PASS+; blue bars = PASS. Right: SA-CASSCF convergence rate per molecule. 13/15 achieve ≥ 90%; Cr₂ and Thioformaldehyde are harder cases where ZOR still matches or outperforms NOON-MP2.
4-panel cross-family robustness dashboard
Figure 2 — Four-panel dashboard. Top-left: energy difference per molecule. Top-right: convergence rates. Bottom-left: seeds run per molecule (with verdict). Bottom-right: scatter of convergence vs energy difference; bubble size = number of seeds.
Evaluation criteria heatmap
Figure 3 — Criteria heatmap across all 15 molecules: Chemical Meaning, Convergence ≥80%, NOON Manifold Consistency, Smooth R_M, Verdict PASS. ✓ = full pass, ~ = partial, ✗ = fail. All 15 achieve full Chemical Meaningfulness and NOON Consistency.
Final NOON spectra per molecule
Figure 4 — Post-CASSCF Natural Orbital Occupation Numbers (NOONs) at the first geometry for all 15 molecules. Blue = bonding (occ > 1.5), green = biradical/partially occupied (~1.0), orange = antibonding. All spectra are chemically interpretable: π/n pairs for heteroatom systems, symmetric π/π* for diatomics, biradical for cyclobutadiene, 6π for aromatics, Cr 3d/4s for the metal dimer.

8. Conclusions & Limitations

Across 15 structurally and electronically diverse molecular systems from six distinct chemical families, inZOR-ND satisfies the chemistry-consistency acceptance protocol in every case with zero failures. The primary finding is manifold consistency: every accepted active space yields a chemically interpretable NOON spectrum, reproducible across independent GA seeds. Cross-family transferability is the main practical contribution.

Specific conclusions, ordered by strength of evidence:

  1. Manifold consistency is the primary result. In all 15 systems, the final post-CASSCF NOON spectra match the expected bonding character of the molecule: π/n pairs, symmetric diatomic manifolds, biradical patterns, 6π aromatic structure, and Cr 3d/4s for the transition-metal case. Chemistry-first filtering successfully prevents anti-chemical active spaces from winning the ranking.
  2. Convergence robustness is demonstrated across families. 13/15 systems achieve ≥ 95% SA-CASSCF convergence. The 2 harder cases (Cr₂, Thioformaldehyde) still pass the acceptance threshold, and ZOR-selected active spaces converge at comparable or better rates than the NOON-MP2 baseline in those cases.
  3. Cross-family transferability holds without molecule-specific tuning. The same framework operates across π systems, heteroatom molecules, biradicals, σ-only bonds, N-heterocycles, and (preliminarily) a transition-metal dimer with no ad-hoc adjustments. NOON injection for CAS(6,6) aromatics is a systematic, not molecule-specific, configuration choice.
  4. Empirical basis-label invariance is observed but not formally proven. Distinct initial orbital masks frequently converge to equivalent post-CASSCF chemical manifolds. Results suggest a practical principle: final optimised manifolds matter more than initial orbital labels. This is an empirical observation across the tested systems, not a mathematical theorem.
  5. Energy competitiveness is a confirmatory secondary result. 11/15 systems yield lower raw energy than the NOON-MP2 baseline while preserving the chemical manifold. The framework is not optimised solely for energy; when energy improvement occurs, it does so together with, not instead of, chemical meaningfulness.
  6. Multi-basin exploration is a feature, not a defect. For Pyrrole, Imidazole, and Thioformaldehyde, seeds fall into distinct CASSCF basins — all chemically valid, some significantly lower in energy than NOON-MP2. This demonstrates that combinatorial search accesses regions of active-space landscape that local orbital rotations cannot reach.
Current scope and limitations:
  • 15 systems tested — mostly small-to-medium organic molecules (4–16 electrons in active space).
  • Transition-metal coverage remains preliminary (1 system, Cr₂, under reduced basis STO-3G). A comprehensive transition-metal benchmark would require multiple metals, larger basis sets, and more geometries.
  • Aromatic systems require NOON-MP2 injection at the current compute budget. No-injection performance at larger population sizes is untested.
  • All systems tested are ground-state (S₀) or lowest few states. Excited-state manifold consistency under SA-CASSCF requires separate evaluation.
  • These results demonstrate broad practical robustness across the tested chemical families, not formal universality. Extrapolation to untested classes (large conjugated systems, actinides, open-shell ground states) should be made cautiously.
Study status: This programme is complete for the 15-molecule set. The 15 molecules span: small diatomics (N₂), closed-shell organics (CH₂=NH, C₄H₄, C₆H₈), heteroatom π systems (acrolein, ketene, formamide, thioformaldehyde, diazomethane), σ-only (H₂O₂), aromatic N-heterocycles (pyridine, pyrrole, imidazole, pyrimidine), and transition metals (Cr₂). See also the 8-benchmark comparative, the chemistry-consistent validation, and the photochemical multi-geometry study for the broader supporting programme.

Practical Workflow

STEP 1
GA Search
Bio-adaptive combinatorial exploration of CAS(n,m) mask space; NOON-MP2 organisms injected as seeds
STEP 2
Chemical Gates
Frontier-orbital acceptance condition: reject masks with zero HOMO/LUMO content before CASSCF evaluation
STEP 3
Convergence Check
Non-converged geometries propagated as worst-case sentinels; disqualifies unstable candidates
STEP 4
NOON Validation
Post-CASSCF NOON spectra verified for chemical character; checked for cross-seed consistency
STEP 5
Rank by Energy
Among chemically valid and converged candidates, rank by mean S₀ energy across the scan