# Manticore-Local Posterior Association Catalogue

Catalogue of massive halo associations in the nearby Universe, constructed from
the **Manticore-Local** posterior ensemble of 80 constrained cosmological
resimulations. Each association is a set of dark-matter haloes drawn from
different posterior realizations that represent the same inferred galaxy cluster.

**Reference:**
McAlpine (2026), *The Manticore-Local Cluster Catalogue: A Posterior Map of
Massive Structures in the Nearby Universe*, Open Journal of Astrophysics
(submitted). Code: <https://github.com/stuartmcalpine/manticore_posterior_clusters>

**Public data portal:** <https://cosmictwin.org>

---

## Catalogue summary

| Property | Value |
|---|---|
| Total associations | 1210 |
| Fiducial sample | 401 |
| Strict sample | 162 |
| Posterior realizations | 80 |
| Mass range (mean M200) | 8.1 x 10^13 -- 2.7 x 10^15 Msol |
| Existence probability range | 0.013 -- 1.0 |
| Volume | R < 400 Mpc from observer at (500, 500, 500) Mpc in a 1 Gpc box |
| Halo finder | HBT-HERONS |
| Property pipeline | SOAP |
| Clustering algorithm | DBSCAN (eps = 2.75 Mpc, Nmin = 10) |
| Cosmology | DES Y3 (h = 0.681, Om = 0.306, Ob = 0.0486) |

### Sample definitions

| Sample | Cuts |
|---|---|
| **Raw** | mean M200 >= 10^14 Msol |
| **Fiducial** | Raw + ambiguity rate < 5% + n_members >= 20 |
| **Strict** | Fiducial + sigma_R < 2.5 Mpc + sigma(log10 M200) < 0.25 dex |
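
The cuts in this table can be reproduced from association-level arrays. The following is a minimal sketch, not the catalogue's own code: `sample_masks` is a hypothetical helper, the input values are made up, and the definition of sigma_R as the quadrature sum of the per-axis scatter is an assumption. The stored `in_fiducial` and `in_strict` flags remain the reference.

```python
import numpy as np

def sample_masks(mean_m200, ambiguity_rate, n_members, position_std, log10_m200_std):
    """Return boolean masks (raw, fiducial, strict), one entry per association."""
    # Assumption: sigma_R is the quadrature sum of the per-axis positional scatter.
    sigma_r = np.sqrt(np.sum(np.asarray(position_std, dtype=float) ** 2, axis=1))
    raw = np.asarray(mean_m200) >= 1e14
    fiducial = raw & (np.asarray(ambiguity_rate) < 0.05) & (np.asarray(n_members) >= 20)
    strict = fiducial & (sigma_r < 2.5) & (np.asarray(log10_m200_std) < 0.25)
    return raw, fiducial, strict

# Three made-up associations, for illustration only.
raw, fid, strict = sample_masks(
    mean_m200=[2e14, 5e14, 9e13],
    ambiguity_rate=[0.00, 0.10, 0.00],
    n_members=[80, 60, 15],
    position_std=[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [3.0, 3.0, 3.0]],
    log10_m200_std=[0.10, 0.10, 0.30],
)
print(raw, fid, strict)
```

Because each tier is built from the previous mask, every Strict association is also Fiducial and Raw.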

### Status encoding

Each association has a `status` derived from its existence probability
(fraction of the 80 realizations in which it appears):

| Status | Criterion | Integer code |
|---|---|---|
| stable | existence_prob >= 0.5 | 2 |
| tentative | 0.2 <= existence_prob < 0.5 | 1 |
| rare | existence_prob < 0.2 | 0 |
| unknown | — | -1 |
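
The thresholds above map an existence probability to a status and integer code. A minimal sketch follows; `classify_status` and `STATUS_CODES` are illustrative names rather than part of the catalogue code, and treating a missing or non-finite probability as `unknown` is an assumption.

```python
import math

# Integer codes from the status table above.
STATUS_CODES = {"stable": 2, "tentative": 1, "rare": 0, "unknown": -1}

def classify_status(existence_prob, stable=0.5, tentative=0.2):
    """Return the status string for a single existence probability."""
    # Assumption: a missing or non-finite probability maps to "unknown".
    if existence_prob is None or not math.isfinite(existence_prob):
        return "unknown"
    if existence_prob >= stable:
        return "stable"
    if existence_prob >= tentative:
        return "tentative"
    return "rare"

for p in (1.0, 0.5, 0.35, 0.013):
    status = classify_status(p)
    print(p, status, STATUS_CODES[status])
```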

---

## File structure

The catalogue is stored in
`manticore_posterior_association_catalog.h5` with two top-level HDF5 groups and
file-level attributes.

### File-level attributes

These record the parameters used to build the catalogue:

| Attribute | Description |
|---|---|
| `n_clusters` | Total number of associations (1210) |
| `n_fiducial` | Number passing fiducial cuts (401) |
| `n_strict` | Number passing strict cuts (162) |
| `original_basedir` | Path to the Manticore-Local resimulation suite |
| `original_m200_mass_cut` | Input halo mass threshold: 7 x 10^13 Msol |
| `original_radius_cut` | Radial selection cut: 400 Mpc |
| `original_observer_coords` | Observer position in the simulation box: [500, 500, 500] Mpc |
| `original_alpha` | HDBSCAN alpha parameter (1.0) |
| `original_cluster_selection_method` | HDBSCAN selection method (`eom`) |
| `original_existence_prob_stable` | Threshold for "stable" status (0.5) |
| `original_existence_prob_tentative` | Threshold for "tentative" status (0.2) |
| `selection_*` | Selection criteria for each sample tier (value ranges as `[min, max]`, `None` = unbounded) |
| `status_encoding` | Maps status strings to integer codes |
| `total_properties` | Number of halo properties stored per member (34) |

---

### `posterior/` group — full member data

Contains one subgroup per association: `posterior/association_0/` through
`posterior/association_1209/`.

#### Per-association attributes

| Attribute | Type | Description |
|---|---|---|
| `cluster_id` | int | Association ID (0-indexed, sorted by descending existence probability) |
| `existence_prob` | float | Fraction of realizations containing this association (0–1) |
| `n_members` | int | Number of member haloes (max 80) |
| `mean_m200_mass` | float | Mean M200,crit mass across members [Msol] |
| `m200_mass_std` | float | Standard deviation of M200 [Msol] |
| `log10_m200_mass_std` | float | Std dev of log10(M200) [dex] |
| `center_xyz` | float[3] | Median position of members [Mpc, comoving] |
| `position_std` | float[3] | Positional scatter per axis [Mpc] |
| `ambiguity_rate` | float | Fraction of realizations where >1 halo was initially assigned |
| `axis_ratio_ba` | float | Minor-to-major axis ratio of member positions (1 = spherical) |
| `status` | string | `stable`, `tentative`, or `rare` |
| `in_fiducial` | bool | Passes fiducial selection |
| `in_strict` | bool | Passes strict selection |

#### Per-association datasets

Each dataset has shape `(n_members, ...)`: row `i` holds the properties of the
`i`-th member halo, and each member comes from a different realization. In the
table below, N = `n_members`.

| Dataset | Shape | Units | Description |
|---|---|---|---|
| `positions` | (N, 3) | Mpc | Comoving Cartesian position (x, y, z) |
| `masses` | (N,) | Msol | M200,crit total mass |
| `realization_ids` | (N,) | — | MCMC posterior sample index (0–79) |
| `mcmc_id` | (N,) | — | Same as realization_ids |
| `membership_probs` | (N,) | — | Clustering membership probability (0–1) |
| `cluster_id` | (N,) | — | Association ID for each member |
| `halo_original_index` | (N,) | — | Original halo index within its realization |
| `ra` | (N,) | deg | Right ascension (observer-centric equatorial) |
| `dec` | (N,) | deg | Declination |
| `dist` | (N,) | Mpc | Observer distance |
| `gal_l` | (N,) | deg | Galactic longitude |
| `gal_b` | (N,) | deg | Galactic latitude |
| `vr` | (N,) | km/s | Radial (line-of-sight) velocity |
| **SO/200_crit properties** | | | |
| `SO_200_crit_TotalMass` | (N,) | Msol | Spherical overdensity M200,crit |
| `SO_200_crit_SORadius` | (N,) | Mpc | R200,crit |
| `SO_200_crit_CentreOfMass` | (N, 3) | Mpc | Centre of mass within R200 |
| `SO_200_crit_CentreOfMassVelocity` | (N, 3) | km/s | CoM velocity within R200 |
| `SO_200_crit_Concentration` | (N,) | — | NFW concentration c200 |
| `SO_200_crit_MassFractionSatellites` | (N,) | — | Satellite mass fraction within R200 |
| `SO_200_crit_MassFractionExternal` | (N,) | — | External mass fraction within R200 |
| **SO/500_crit properties** | | | |
| `SO_500_crit_TotalMass` | (N,) | Msol | Spherical overdensity M500,crit |
| `SO_500_crit_SORadius` | (N,) | Mpc | R500,crit |
| `SO_500_crit_CentreOfMass` | (N, 3) | Mpc | Centre of mass within R500 |
| `SO_500_crit_CentreOfMassVelocity` | (N, 3) | km/s | CoM velocity within R500 |
| `SO_500_crit_Concentration` | (N,) | — | NFW concentration c500 |
| `SO_500_crit_MassFractionSatellites` | (N,) | — | Satellite mass fraction within R500 |
| `SO_500_crit_MassFractionExternal` | (N,) | — | External mass fraction within R500 |
| **Bound subhalo properties** | | | |
| `BoundSubhalo_TotalMass` | (N,) | Msol | Bound subhalo mass |
| `BoundSubhalo_CentreOfMass` | (N, 3) | Mpc | Bound subhalo centre of mass |
| `BoundSubhalo_CentreOfMassVelocity` | (N, 3) | km/s | Bound subhalo CoM velocity |
| `BoundSubhalo_EncloseRadius` | (N,) | Mpc | Subhalo enclosing radius |
| `BoundSubhalo_MaximumCircularVelocity` | (N,) | km/s | Vmax of the bound subhalo |
| **SOAP metadata** | | | |
| `SOAP_ProgenitorIndex` | (N,) | — | Progenitor index from HBT-HERONS |
| `SOAP_SubhaloRankByBoundMass` | (N,) | — | Subhalo rank (0 = central) |
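
The per-association attributes can be cross-checked against the member datasets. The sketch below recomputes a few of them from `positions` and `masses` arrays. `summarise_members` is a hypothetical helper, the input values are made up, and the use of the population standard deviation (`ddof=0`) is an assumption that may differ from the convention used to build the catalogue.

```python
import numpy as np

def summarise_members(positions, masses):
    """Recompute association-level statistics from member arrays."""
    positions = np.asarray(positions, dtype=float)   # (N, 3) Mpc
    masses = np.asarray(masses, dtype=float)         # (N,) Msol
    return {
        "n_members": len(masses),
        "mean_m200_mass": masses.mean(),
        # Assumption: population standard deviation (ddof=0).
        "m200_mass_std": masses.std(),
        "log10_m200_mass_std": np.log10(masses).std(),
        "center_xyz": np.median(positions, axis=0),   # median position per axis
        "position_std": positions.std(axis=0),        # scatter per axis
    }

# Made-up members, for illustration only.
positions = [[510.0, 480.0, 495.0], [512.0, 482.0, 497.0], [511.0, 481.0, 496.0]]
masses = [1e14, 2e14, 3e14]
stats = summarise_members(positions, masses)
for name, value in stats.items():
    print(name, value)
```

With a real file, the inputs would be `assoc["positions"][:]` and `assoc["masses"][:]` for an opened association group `assoc`.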

---

### `summary/` group — association-level summary table

Pre-computed summary statistics for all 1210 associations in flat arrays, useful
for quick filtering and selection without iterating over individual groups.

**Scalar datasets** — shape `(1210,)`:

| Dataset | Description |
|---|---|
| `association_id` | Association ID |
| `existence_prob` | Existence probability |
| `n_members` | Number of member haloes |
| `mean_m200_mass` | Mean M200 mass [Msol] |
| `m200_mass_std` | Std dev of M200 [Msol] |
| `log10_m200_mass_std` | Std dev of log10(M200) [dex] |
| `ambiguity_rate` | Ambiguity rate |
| `axis_ratio_ba` | Axis ratio b/a |
| `in_fiducial` | Bool: passes fiducial selection |
| `in_strict` | Bool: passes strict selection |
| `status` | Integer status code (see encoding above) |

**Vector datasets** — shape `(1210, 3)`:

| Dataset | Description |
|---|---|
| `center_xyz` | Median position [Mpc] |
| `position_std` | Per-axis positional scatter [Mpc] |

**Percentile summary datasets** — shape `(1210, 3)`:

For each halo property, the three columns are
**[median, 25th percentile, 75th percentile]**, computed across the member haloes
of each association. Dataset names match the per-member datasets listed above
(e.g. `SO_200_crit_TotalMass`, `dist`, `ra`, `dec`).
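
The column layout can be illustrated with NumPy. This is a sketch under stated assumptions: `percentile_summary` and `interquartile_range` are hypothetical helpers, the input values are made up, and NumPy's default linear interpolation may not match the method used to build the catalogue.

```python
import numpy as np

def percentile_summary(values):
    """Return [median, 25th percentile, 75th percentile] of a 1D array."""
    values = np.asarray(values, dtype=float)
    # Assumption: NumPy's default (linear) percentile interpolation.
    return np.percentile(values, [50, 25, 75])

def interquartile_range(summary_rows):
    """Interquartile range per association from an (n_assoc, 3) summary array."""
    summary_rows = np.asarray(summary_rows, dtype=float)
    return summary_rows[:, 2] - summary_rows[:, 1]

# Made-up member distances [Mpc] for a single association.
member_dist = [100.0, 102.0, 104.0, 106.0, 108.0]
row = percentile_summary(member_dist)
iqr = interquartile_range(row[np.newaxis, :])
print(row, iqr)
```

Applied to a dataset such as `summary["dist"][:]`, `interquartile_range` returns one width per association, which gives a quick measure of posterior scatter.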

---

## Quick-start example

```python
import h5py
import numpy as np

# The context manager closes the file even if an error occurs.
with h5py.File("manticore_posterior_association_catalog.h5", "r") as f:
    # --- Use the summary table for fast filtering ---
    summary = f["summary"]
    fiducial = summary["in_fiducial"][:].astype(bool)      # shape (1210,)
    mass_median = summary["SO_200_crit_TotalMass"][:, 0]   # median M200 per association
    dist_median = summary["dist"][:, 0]                    # median observer distance

    # Select fiducial associations within 150 Mpc
    sel = fiducial & (dist_median < 150)
    print(f"Fiducial associations within 150 Mpc: {sel.sum()}")
    print(f"Mass range: {mass_median[sel].min():.2e} - {mass_median[sel].max():.2e} Msol")

    # --- Access full posterior data for a specific association ---
    assoc = f["posterior/association_0"]
    print("\nAssociation 0:")
    print(f"  Existence prob: {assoc.attrs['existence_prob']}")
    print(f"  Mean M200: {assoc.attrs['mean_m200_mass']:.2e} Msol")
    print(f"  Center: {assoc.attrs['center_xyz']} Mpc")

    # Member haloes: at most one per realization, so N = n_members <= 80
    pos = assoc["positions"][:]                 # (N, 3) Mpc
    m200 = assoc["SO_200_crit_TotalMass"][:]    # (N,) Msol
    ra = assoc["ra"][:]                         # (N,) deg
    dec = assoc["dec"][:]                       # (N,) deg
    real_ids = assoc["realization_ids"][:]      # which MCMC sample each comes from
```

A demonstration Jupyter notebook with more detailed examples is available at
<https://github.com/stuartmcalpine/manticore_public_scripts>.

---

## How the catalogue was built

1. **Input data.** Central haloes at z = 0 with M200 >= 7 x 10^13 Msol are
   extracted from each of the 80 Manticore-Local posterior resimulations
   (SWIFT N-body, 1024^3 particles, L = 1000 Mpc box) within 400 Mpc of
   the observer.

2. **Clustering.** DBSCAN is applied in 3D comoving Cartesian coordinates
   (eps = 2.75 Mpc, Nmin = 10) to group haloes from different realizations
   that represent the same physical structure.

3. **One-per-realization constraint.** If DBSCAN links multiple haloes from
   the same realization to one association, only the halo closest to the
   association centroid is kept. The fraction of realizations where this
   occurs defines the ambiguity rate.

4. **Quality cuts.** Associations are classified into Raw, Fiducial, and
   Strict tiers based on mass, membership count, ambiguity rate, and
   positional/mass scatter (see the Sample definitions table).

5. **Summary statistics.** For each association, posterior summaries (median,
   25th, 75th percentiles) are computed for all halo properties.
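
Step 3 can be sketched as follows. This is an illustration under stated assumptions, not the construction code: `enforce_one_per_realization` is a hypothetical helper, the input values are made up, the centroid is taken to be the median position of all candidate haloes, and the ambiguity rate is normalised by the number of realizations that contribute at least one candidate.

```python
import numpy as np

def enforce_one_per_realization(positions, realization_ids):
    """Keep one halo per realization; return (kept indices, ambiguity rate)."""
    positions = np.asarray(positions, dtype=float)
    realization_ids = np.asarray(realization_ids)
    # Assumption: the centroid is the median position of all candidates.
    centroid = np.median(positions, axis=0)
    dist = np.linalg.norm(positions - centroid, axis=1)

    keep, n_ambiguous = [], 0
    unique_ids = np.unique(realization_ids)
    for rid in unique_ids:
        idx = np.flatnonzero(realization_ids == rid)
        if len(idx) > 1:
            n_ambiguous += 1          # more than one candidate from this realization
        keep.append(idx[np.argmin(dist[idx])])   # keep the candidate closest to the centroid
    # Assumption: the rate is normalised by the number of contributing realizations.
    return np.array(keep), n_ambiguous / len(unique_ids)

# Made-up candidates: realization 1 contributes two haloes.
positions = [[0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [3.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
realization_ids = [0, 1, 1, 2]
keep, rate = enforce_one_per_realization(positions, realization_ids)
print(keep, rate)
```

In this toy case the distant second halo from realization 1 is dropped, and one of the three contributing realizations counts as ambiguous.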

The construction code is available at
<https://github.com/stuartmcalpine/manticore_posterior_clusters> (Mode 1).

---

## Key definitions

- **Association:** A group of haloes from distinct posterior realizations
  that map to the same physical cluster. An association is not a group of
  satellite galaxies: each member is a central halo from a different MCMC
  sample.

- **Existence probability:** The fraction of the 80 realizations that
  contribute a member to the association. An existence probability of 1.0
  means the structure appears in every realization.

- **Ambiguity rate:** The fraction of realizations where more than one halo
  was initially assigned to the association before enforcing the
  one-per-realization constraint. Low values indicate clean assignments.

- **Member halo:** A single z = 0 central halo from one posterior
  realization that belongs to an association. Each realization contributes
  at most one member per association.
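
Because each realization contributes at most one member, the existence probability follows directly from the realization indices of the members. A minimal sketch, where `existence_probability` is a hypothetical helper:

```python
def existence_probability(realization_ids, n_realizations=80):
    """Fraction of realizations that contribute a member to the association."""
    contributing = {int(r) for r in realization_ids}   # distinct realization indices
    return len(contributing) / n_realizations

print(existence_probability(range(80)))          # every realization contributes
print(existence_probability([3, 17, 42, 61]))    # four of 80 realizations contribute
```

For a stored association, the result should agree with `n_members / 80` and with the `existence_prob` attribute.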

---

## Units and conventions

- **Masses:** Solar masses (Msol), no h factors
- **Positions:** Comoving Mpc, no h factors, in a periodic box of L = 1000 Mpc
- **Velocities:** km/s (peculiar)
- **Sky coordinates:** Equatorial (ra, dec) and Galactic (gal_l, gal_b) in degrees
- **Observer:** Located at (500, 500, 500) Mpc in the simulation box
- **Spherical overdensity:** Defined relative to the critical density (200_crit and 500_crit)
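
These conventions can be applied as in the sketch below, which computes observer distances from box coordinates and converts to h-scaled units using h = 0.681 from the catalogue cosmology. `observer_distance` and `to_h_units` are hypothetical helpers and the input values are made up. No periodic wrapping is applied, since the 400 Mpc selection sphere centred on (500, 500, 500) Mpc lies entirely inside the 1000 Mpc box.

```python
import numpy as np

OBSERVER = np.array([500.0, 500.0, 500.0])  # observer position in the box [Mpc]
H = 0.681                                   # dimensionless Hubble parameter

def observer_distance(positions, observer=OBSERVER):
    """Comoving distance [Mpc] from the observer to each box position."""
    positions = np.asarray(positions, dtype=float)
    return np.linalg.norm(positions - observer, axis=-1)

def to_h_units(mass_msol, length_mpc, h=H):
    """Convert Msol -> Msol/h and Mpc -> Mpc/h."""
    return np.asarray(mass_msol, dtype=float) * h, np.asarray(length_mpc, dtype=float) * h

# Made-up box positions [Mpc], for illustration only.
positions = [[500.0, 500.0, 600.0], [800.0, 900.0, 500.0]]
d = observer_distance(positions)
inside = d < 400.0                      # radial selection used for the catalogue
m_h, d_h = to_h_units([1e14, 1e14], d)
print(d, inside, m_h, d_h)
```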

---

## Citation

If you use this catalogue, please cite:

```
McAlpine (2026), The Manticore-Local Cluster Catalogue: A Posterior Map of
Massive Structures in the Nearby Universe, Open Journal of Astrophysics (submitted).
```

and the parent simulation paper:

```
McAlpine (2025), The Manticore simulations.
```
