CorrHOD_cubic#

For the utility of the class, see its desctiption in the API section below.

Before using the class#

As we will be using the AbacusSummit Dark Matter simulations, AbacusHOD requires to prepare these simulations.

Warning

Without this step, the simulation creation will fail.

To do so, you need to follow the instructions in the AbacusHOD documentation.

The BGS filter#

The prepare_sim script is scaled for LRG galaxies. For computational reasons, a filter [1] is applied to the halo catalog to remove the light halos that will not be populated. However, the Bright Galaxy Survey (BGS) galaxies has lower mass conditions than LRGs. Therefore, the filter needs to be changed.

prepare_sim halos distribution

The Halo mass distribution for different filters (blue for the halo catalog, yellow for BGS and green for LRGs)#

The prepare_sim.sh file in the scripts folder is the script used to prepare the BGS simulations.

Warning

The prepare_sim.sh script and prepare_sim_bgs.py script are not included in the package, as they are supposed to be included in the AbacusHOD package. However, they are available in the scripts folder of the GitHub repository.

The script needs as an imput a config file (the same as AbacusHOD), with the path to the dark matter simulation in sim_dir, the name of the simulation we want to prepare in sim_name and the path in which the prepared simulation will be saved in subsample_dir. (see config\config.yaml for an example, and AbacusHOD for more details).

Note

The only difference between the LRG and BGS scripts are the subsample_halos and submask_particles functions.

How to use the class#

Initialization of the class#

First, you need to create a config file for AbacusHOD. See the AbacusHOD documentation for more details.

Then, you need to create a HOD dictionnary that contains the following parameters (provided here with a random set for example):

{'Acent': 0,
'Asat': 0,
'Bcent': 0.30609746972148444,
'Bsat': -0.0737193257210993,
'alpha': 0.8689115800548024,
'alpha_c': 0.7700801491165564,
'alpha_s': 1.036122317356142,
'ic': 1,
'kappa': 0.3005439816787442,
'logM1': 13.481589622029889,
'logM_cut': 13.274157859189234,
's': 0,
's_p': 0,
's_r': 0,
's_v': 0,
'sigma': 0.00011413582336912553}

The CorrHOD_cubic class can then be initialized

from CorrHOD import CorrHOD_cubic

Object = CorrHOD_cubic(config_file, HOD_dict)

Object.initialize_halos() # Load the halos catalog

Object.populate_halos() # Populate the halos with galaxies

From that point, you can use the different methods of the class to perform different analysis.

Tip

The analysis can be performed on a pre-existing catalog. To do so, you need to either use the set_cubic method, or pass the catalog to the cubic_dict variable after initialization.

Note

In that case, no HOD_dict is needed in the initialization, as no simulation is computed. However, due to the construction of the class, the config file for AbacusHOD still needs to be passed and valid.

Note

The class initialization can also take a line of sight, a boxsize and a cosmology as arguments. See the API for more details.

Getting the positions#

The CorrHOD code is designed to work for BGS only, but should in theory work for all AbacusHOD tracers. Only galaxies of this tracer will be populated in the simulation, and only one tracer can be used at a time. The tracer used can be changed in the tracer variable

Object.tracer = 'LRG'

Tip

The tracer parameter is set to ‘LRG’ because the HOD model has the same functions for BGS and LRG, so only ‘LRG’ is coded in AbacusHOD.

The get_tracer_positions() method can then be used to get the positions of the galaxies in the catalog. This will also apply the Redshift Space Distortion (RSD) to the positions.

Object.get_tracer_positions()

Warning

This step is important, as the positionnal array returned in the class by this function will be used for all the analysis.

DensitySplit [2]#

The compute_densitysplit() method applies the densitysplit package to the catalog, and separates the galaxies in the catalog in quantiles of local density:

quantiles, density = Object.compute_densitysplit(return_density=True) # If return_density is False, only the quantiles are returned

Note

The quantiles also can be accessed with Object.quantiles.

Downsampling the catalog#

For computational reasons, the number of galaxies in the catalog can be too big, causing the Correlations functions to take too much time to compute.

While CorrHOD supports multiprocessing and MPI (see the Parallelism section), downsampling the catalog can be a good way to reduce the computation time, without loosing too much information.

The downsample_data() method can be used to uniformly downsample the catalog

new_nbar = 1e-3 # New number density of galaxies
Object.downsample_data(new_n=new_n)

Warning

Any downsampling will cause a loss of information, and should be used with caution. For example, the DensitySplit theory is based on the local density of galaxies, that will be affected by the downsampling.

Computing the correlation functions#

Three methods are available to compute the correlation functions: compute_2pcf(), compute_auto_corr() and compute_cross_corr(). Their use and parameters are detailled in the API section.

Their results are stored in two dictionnaires inside the class :

  • Object.xi stores the pycorr.TwoPointCorrelationFunction objects

  • Object.CF stores the multipoles of the correlation functions, and assumes that all

    the correlation functions are computed with the same separation bins.

Warning

This last point is important, as the separation bins are stored only once in this dictionnary.

The two dictionnaires have the same structure :

los = ‘z’ # Line of sight on which the RSD have been applied Object.CF[los][‘2PCF’] # gets the poles of the 2PCF Object.CF[‘Cross’][‘DS0’] # For quantile 0 …

Tip

The line of sight key is here so that several lines of sight can be computed in the same CorrHOD object. This allows for the Averaging the correlation functions part.

Averaging the correlation functions#

If several lines of sight have been computed in the same CorrHOD object, the average_CF() method can be used to average the poles of the correlation functions.

Tip

The method has an average_on parameter that can be used to average only on a subset of the lines of sight.

This allows for a reduction of the noise in the correlation functions, and a better estimation of the covariance matrix.

Once this method called, a new key is added to the Object.CF dictionnary, with the key 'average'.

Warning

For the method to work, all the correlation functions must have been computed (and with the same separation bins). This means the 2PCF, and the auto and cross correlation functions for each quantile !

Parallelism#

The CorrHOD_cubic class supports multiprocessing and MPI, and will transfer the nthreads, communicator and rank to the functions that support it, if they are provided. See the API section for more details.

Saving the results#

The save() method can be used to save the selected results in numpy files. The arguments of the method can be found in the API section.

The saved results are saved in the following structure:

Folder/
├── ds/
│   ├── density/        (Density)
│   ├── quantiles/      (Quantiles)
│   ├── gaussian/       (Auto and cross correlation functions)
├── hod/                (HOD parameters)
├── tpcf/               (2PCF)
├── xi/                 (Correlation functions)

The run_all method#

This method is a shortcut to run all the analysis in one go. It is not recommended to use it, as it will not allow for a good control of the analysis, and requires a good understanding of the class to be used properly.

However, if you know what you are doing, it makes for a very powerful tool to run the analysis, as the entire code only takes a few lines:

from CorrHOD import CorrHOD_cubic

Object = CorrHOD_cubic(config_file, HOD_dict)

Object.run_all() # Several arguments can be passed to the method, see the API for more details

Tip

The run_all() method times every step of the execution. These times are displayed in the logs, but are also stored in a times_dict object, that can be accessed. It has the following keys :

  • ‘initialize_halo’ : Time to initialize the halos catalog

  • ‘run_all’ : Time to run the entire analysis

  • ‘x’, ‘y’ or ‘z’ : The chosen line of sight

    • ‘run_los’ : Time to run the entire line of sight

    • ‘compute_DensitySplit’ : Time to compute the density split

    • ‘compute_2pcf’ : Time to compute the 2PCF

    • ‘compute_auto_corr’ : Time to compute the auto correlation functions

      • ‘DS0’, ‘DS1’, … : The quantiles

    • ‘compute_cross_corr’ : Time to compute the cross correlation functions

      • ‘DS0’, ‘DS1’, … : The quantiles

API#

class CorrHOD.cubic.CorrHOD_cubic(path2config: str, HOD_params: Optional[dict] = None, los: str = 'z', boxsize: float = 2000, cosmo: int = 0)[source][source]#

Bases: object

This class is used to compute the 2PCF and the autocorrelation and cross-correlation of the quantiles of the DensitySplit. It takes HOD parameters and a cosmology as input and uses AbacusHOD to generate a cubic mock. It then uses DensitySplit to compute the density field and the quantiles. Finally, it uses pycorr to compute the 2PCF and the autocorrelation and cross-correlation of the quantiles.

Initialises the CorrHOD object. Note that the ‘tracer’ parameter is fixed to ‘LRG’ by default. It can be changed later by manually reassigning the parameter. This tracer is chosen to study LRG and BGS. It can be changed but the code is not guaranteed to work.

Parameters:
  • path2config (str) – Path to the config file of AbacusHOD. It is used to load the simulation. See the documentation for more details.

  • HOD_params (dict, optional) – Dictionary containing the HOD parameters. The keys must be the same as the ones used in AbacusHOD. If not provided, the default parameters of AbacusHOD are used. Defaults to None.

  • los (str, optional) – Line-of-sight direction in which to apply the RSD. Defaults to ‘z’.

  • boxsize (float, optional) – Size of the simulation box in Mpc/h. Defaults to 2000.

  • cosmo (int, optional) – Index of the cosmology to use. This index is used to load the cosmology from the AbacusSummit object. Defaults to 0. See the documentation of AbacusSummit for more details.

Raises:

ValueError – If the simulation is not precomputed. See the documentation of AbacusHOD for more details.

initialize_halo(nthread: int = 16)[source][source]#

Initializes the AbacusHOD object and loads the simulation.

Parameters:

nthread (int, optional) – Number of threads to use. Defaults to 16.

Returns:

Ball – AbacusHOD object containing the simulation.

Return type:

AbacusHOD object

populate_halos(nthread: int = 16, remove_Ball: bool = False)[source][source]#

Populates the halos with galaxies using the HOD parameters.

Parameters:
  • nthread (int, optional) – Number of threads to use. Defaults to 16.

  • remove_Ball (bool, optional) – If True, the Ball object (Dark matter simulation loaded from AbacusHOD) is removed after populating the halos. This is useful to save memory. Defaults to False.

Returns:

cubic_dict – Dictionary containing the HOD catalogue. It contains for each tracer : * ‘x’, ‘y’, ‘z’ : the positions of the galaxies, * ‘vx’, ‘vy’, ‘vz’ : their velocities, * ‘id’ and ‘mass’ : the ID and mass of the halo they belong to, * ‘Ncent’ : the number of central galaxies in the simulation. The first Ncent galaxies are the centrals.

Return type:

dict

set_cubic(positions: ndarray, velocities: ndarray, masses: ndarray, halo_id: ndarray, Ncent: int, tracer: str = 'LRG')[source][source]#

A method to set a tracer data in the cubic dictionary. This will overwrite the tracer data if it already exists. Otherwise, it will create it.

Parameters:
  • positions (np.ndarray) – The positions of the galaxies. It can be a (N,3) array or a dictionary with keys ‘x’, ‘y’, ‘z’.

  • velocities (np.ndarray) – The velocities of the galaxies. It can be a (N,3) array or a dictionary with keys ‘vx’, ‘vy’, ‘vz’.

  • masses (np.ndarray) – The masses of the galaxies. It must be a (N,) array.

  • halo_id (np.ndarray) – The ID of the halo each galaxy belongs to. It must be a (N,) array.

  • Ncent (int) – The number of central galaxies in the simulation. The first Ncent galaxies are the centrals.

  • tracer (str, optional) – The tracer type. Defaults to ‘LRG’.

Raises:

ValueError – If an array is not of the right shape or if a dictionary does not have the right keys.

get_tracer_positions()[source][source]#

Get the positions of the galaxies and apply RSD.

Returns:

data_positions – The positions of the galaxies after RSD. It is a (N,3) array.

Return type:

np.ndarray

compute_DensitySplit(smooth_radius: float = 10, cellsize: float = 5, nquantiles: int = 10, sampling: str = 'randoms', randoms=None, filter_shape: str = 'Gaussian', return_density: bool = True, nthread=16)[source][source]#

Compute the density field and the quantiles using DensitySplit. (see epaillas/densitysplit for more details)

Parameters:
  • smooth_radius (float, optional) – The radius of the Gaussian smoothing in Mpc/h. Defaults to 10.

  • cellsize (float, optional) – The size of the cells in the mesh. Defaults to 10.

  • nquantiles (int, optional) – The number of quantiles to compute. Defaults to 10.

  • sampling (str, optional) – The sampling to use. Can be ‘randoms’ or ‘data’. Defaults to ‘randoms’.

  • randoms (np.ndarray, optional) – The positions of the randoms. It is a (N,3) array of points in the range [0, boxsize] for each coordinate. If set to none, a random array will be created of size nquantiles * len(data_positions). Defaults to None.

  • filter_shape (str, optional) – The shape of the filter to use. Can be ‘Gaussian’ or ‘TopHat’. Defaults to ‘Gaussian’.

  • return_density (bool, optional) – If True, the density field is returned. Defaults to True.

  • nthread (int, optional) – The number of threads to use. Defaults to 16.

Returns:

  • quantiles (np.ndarray) – The quantiles. It is a (nquantiles,N,3) array.

  • density (np.ndarray, optional) – The density field. It is a (N,3) array.

downsample_data(frac: Optional[float] = None, new_n: Optional[float] = None, npoints: Optional[int] = None)[source][source]#

Downsample the data to a new number of galaxies. The quantiles are also downsampled to the same number of galaxies. (They are generated to have the same number of points as the data in each quantile)

Parameters:

new_n (float) – The new number density in the box.

Returns:

  • data_positions (np.ndarray) – The positions of the galaxies after downsampling. It is a (N,3) array.

  • quantiles (np.ndarray, optional) – The quantiles after downsampling. It is a (nquantiles,N,3) array. Returns None if the quantiles have not been computed yet.

compute_auto_corr(quantile: int, mode: str = 'smu', edges: list = [np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)], mpicomm=None, mpiroot=None, nthread: int = 16)[source][source]#

Compute the auto-correlation of a quantile. The result will also be saved in the class, as self.CF[‘Auto’][f’DS{quantile}’]

Parameters:
  • quantile (int) – Index of the quantile to compute. It must be between 0 and nquantiles-1.

  • mode (str, optional) – The mode to use for the 2PCF. Defaults to ‘smu’. See pycorr for more details .

  • edges (list, optional) – The edges to use for the 2PCF. Defaults to [np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)]. See pycorr for more details.

  • mpicomm (_type_, optional) – The MPI communicator. Defaults to None.

  • mpiroot (_type_, optional) – The MPI root. Defaults to None.

  • nthread (int, optional) – The number of threads to use. Defaults to 16.

Returns:

xi_quantile – The auto-correlation of the quantile.

Return type:

pycorr.TwoPointCorrelationFunction

compute_cross_corr(quantile: int, mode: str = 'smu', edges: list = [np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)], mpicomm=None, mpiroot=None, nthread: int = 16)[source][source]#

Compute the cross-correlation of a quantile with the galaxies. The result will also be saved in the class, as self.CF[‘Cross’][f’DS{quantile}’]

Parameters:
  • quantile (int) – Index of the quantile to compute. It must be between 0 and nquantiles-1.

  • mode (str, optional) – The mode to use for the 2PCF. Defaults to ‘smu’. See pycorr for more details .

  • edges (list, optional) – The edges to use for the 2PCF. Defaults to [np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)]. See pycorr for more details.

  • mpicomm (_type_, optional) – The MPI communicator. Defaults to None.

  • mpiroot (_type_, optional) – The MPI root. Defaults to None.

  • nthread (int, optional) – The number of threads to use. Defaults to 16.

Returns:

xi_quantile – The cross-correlation of the quantile.

Return type:

pycorr.TwoPointCorrelationFunction

compute_2pcf(mode: str = 'smu', edges: list = [np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)], mpicomm=None, mpiroot=None, nthread: int = 16)[source][source]#

Compute the 2PCF of the galaxies. The result will also be saved in the class, as self.CF[‘2PCF’]

Parameters:
  • mode (str, optional) – The mode to use for the 2PCF. Defaults to ‘smu’. See pycorr for more details .

  • edges (list, optional) – The edges to use for the 2PCF. Defaults to [np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)]. See pycorr for more details.

  • mpicomm (_type_, optional) – The MPI communicator. Defaults to None.

  • mpiroot (_type_, optional) – The MPI root. Defaults to None.

  • nthread (int, optional) – The number of threads to use. Defaults to 16.

Returns:

xi – The 2PCF of the galaxies.

Return type:

pycorr.TwoPointCorrelationFunction

average_CF(average_on: list = ['x', 'y', 'z'])[source][source]#

Averages the 2PCF, autocorrelation and cross-correlation of the quantiles on the three lines of sight. available (x, y and z)

Parameters:

average_on (list, optional) – The lines of sight to average on. Defaults to [‘x’, ‘y’, ‘z’]. The CFs must have been computed on these lines of sight before calling this function.

Returns:

CF_average – Dictionary containing the averaged 2PCF, autocorrelation and cross-correlation of the quantiles on the lines of sight. It contains : * s : the separation bins (identical for all the lines of sight, as long as the same edges are used for the 2PCF), * 2PCF : the averaged poles of the 2PCF, * Auto : a dictionary containing the averaged autocorrelation of the quantiles. It contains for each quantile DS{quantile} as the poles of the autocorrelation, * Cross : a dictionary containing the averaged cross-correlation of the quantiles. It contains for each quantile DS{quantile} as the poles of the cross-correlation.

Return type:

dict

save(hod_indice: int = 0, path: Optional[str] = None, save_HOD: bool = True, save_pos: bool = True, save_density: bool = True, save_quantiles: bool = True, save_CF: bool = True, save_xi: bool = True, los: str = 'average', save_all: bool = False)[source][source]#

Save the results of the CorrHOD object. Some of the results are saved as dictionaries in numpy files. They can be accessed using the np.load function, and the .item() method of the loaded object.

Parameters:
  • hod_indice (int, optional) – The indice of the set of HOD parameters provided to the CorrHOD Object. Defaults to 0. Be careful to set the right indice if you want to save the HOD parameters !

  • path (str, optional) – The path where to save the results. If None, the path is set to the output_dir of config file. Defaults to None.

  • save_HOD (bool, optional) – If True, the HOD parameters are saved. File saved as hod{hod_indice}_c{cosmo}_p{phase}.npy. Defaults to True.

  • save_pos (bool, optional) – If True, the cubic mock dictionary is saved. File saved as pos_hod{hod_indice}_c{cosmo}_p{phase}.npy. Defaults to True.

  • save_density (bool, optional) – If True, the density PDF is saved. File saved as density_hod{hod_indice}_c{cosmo}_p{phase}.npy. Defaults to True.

  • save_quantiles (bool, optional) – If True, the quantiles of the densitysplit are saved. File saved as quantiles_hod{hod_indice}_c{cosmo}_p{phase}.npy. Defaults to True.

  • save_CF (bool, optional) – If True, the 2PCF, the autocorrelation and cross-correlation of the quantiles are saved. The 2PCF is saved as tpcf_hod{hod_indice}_c{cosmo}_p{phase}.npy. It contains the separation bins in the s key and the poles in the 2PCF key. The Auto and Corr dictionaries are saved as ds_auto_hod{hod_indice}_c{cosmo}_p{phase}.npy and ds_cross_hod{hod_indice}_c{cosmo}_p{phase}.npy. Each dictionnary contains DS{quantile} keys with the CF of the quantile. The s key contains the separation bins. Defaults to True.

  • save_xi (bool, optional) – If True, the pycorr objects are saved in an unique dictionary. The 2PCF can be accessed using the xi[los][‘2PCF’] key. The auto and cross-correlations of the quantiles can be accessed using the xi[los][‘Auto’][‘DS{i}’] and xi[los][‘Cross’][‘DS{i}’] keys. Defaults to True.

  • los (str, optional) – The line of sight along which to save the 2PCF, the autocorrelation and cross-correlation of the quantiles. Can be ‘x’, ‘y’, ‘z’ or ‘average’. Defaults to ‘average’.

  • save_all (bool, optional) – If True, all the results are saved. This overrides the other options. Defaults to False.

run_all(is_populated: bool = False, exit_on_underpopulated: bool = False, los_to_compute='average', downsample_to: Optional[float] = None, remove_Ball: bool = False, smooth_radius: float = 10, cellsize: float = 5, nquantiles: int = 10, sampling: str = 'randoms', filter_shape: str = 'Gaussian', edges=[np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)], mpicomm=None, mpiroot=None, nthread: int = 16, hod_indice: int = 0, path: Optional[str] = None, **kwargs)[source][source]#

Run all the steps of the CorrHOD object. See tests/test_corrHOD.py for an example of how to use it and its contents.

Note that the **kwargs are used to set the parameters of the save function. If nothing is provided, nothing will be saved. See the documentation of the save function for more details.

Times will be saved in the times_dict attribute of the CorrHOD object and displayed in the logs.

Parameters:
  • los_to_compute (str, optional) – The line of sight along which to compute the 2PCF, the autocorrelation and cross-correlation of the quantiles. If set to ‘average’, the 2PCF, the autocorrelation and cross-correlation of the quantiles will be averaged on the three lines of sight. Defaults to ‘average’.

  • is_populated (bool, optional) – If True, the halos have already been populated before running this method. Defaults to False.

  • exit_on_underpopulated (bool, optional) – If True, the function will exit if the halos are underpopulated compared to the requested number density. Defaults to False.

  • downsample_to (float, optional) – If provided, the galaxies will be randomly downsampled to the provided number density in h^3/Mpc^3 before computing the CFs. This can be useful to reduce the computation time. Defaults to None.

  • remove_Ball (bool, optional) – If True, the Ball object (Dark matter simulation loaded from AbacusHOD) is removed after populating the halos. This is useful to save memory. Defaults to False.

  • smooth_radius (float, optional) – The radius of the Gaussian smoothing in Mpc/h used in the densitysplit. See epaillas/densitysplit for more details. Defaults to 10.

  • cellsize (float, optional) – The size of the cells in the mesh used in the densitysplit. See epaillas/densitysplit for more details. Defaults to 10.

  • nquantiles (int, optional) – The number of quantiles to define in the densitysplit. see epaillas/densitysplit for more details. Defaults to 10.

  • sampling (str, optional) – The type of sampling to use in the densitysplit. See epaillas/densitysplit for more details. Defaults to ‘randoms’.

  • filter_shape (str, optional) – The shape of the smoothing filter to use in the densitysplit. see epaillas/densitysplit for more details. Defaults to ‘Gaussian’.

  • edges (list, optional) – The edges of the s, mu bins to use for the 2PCF, autocorrelation and cross-correlations. Defaults to [np.linspace(0.1, 200, 50), np.linspace(-1, 1, 60)].

  • mpicomm (_type_, optional) – The MPI communicator used in the 2PCF, autocorrelation and cross-correlations. Defaults to None.

  • mpiroot (_type_, optional) – The MPI root used in the 2PCF, autocorrelation and cross-correlations. Defaults to None.

  • nthread (int, optional) – The number of threads to use in the 2PCF, autocorrelation and cross-correlations. Defaults to 16.

  • hod_indice (int, optional) – The indice of the set of HOD parameters provided to the CorrHOD Object. Defaults to 0. Be careful to set the right indice if you want to save the HOD parameters !

  • path (str, optional) – The path where to save the results. If None, the path is set to the output_dir of config file. Defaults to None.

  • **kwargs (dict, optional) – The parameters to pass to the save function. See the documentation of the save function for more details. If nothing is provided, nothing will be saved.

Footnotes