Skip to content

Conversation

rrjbca
Copy link
Contributor

@rrjbca rrjbca commented Mar 11, 2021

Description

Implements colossus_mf_redshift and colossus_mf allowing users to (jointly) sample the redshifts (and masses) of dark matter halos for any of the mass functions in COLOSSUS. The structure of this new code mirrors the Schechter luminosity/mass functions in the galaxies module.

Checklist

  • Follow the Contributor Guidelines
  • Write unit tests
  • Write documentation strings
  • Assign someone from your working team to review this pull request
  • Assign someone from the infrastructure team to review this pull request

@rrjbca rrjbca added module: halos new feature New feature, such as a new model labels Mar 11, 2021
@rrjbca rrjbca requested review from a team March 11, 2021 16:08
Copy link
Member

@Lucia-Fonseca Lucia-Fonseca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Some typos in doc-strings,
  • warnings, issue with Planck18
  • and some questions

Copy link
Member

@ntessore ntessore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • As we discussed with the galaxies module, the splitting into redshift and <quantity> functions is a relic of when we could not multiassign columns. There should only be a single function that returns redshifts and properties.
  • The output shape is (2, nhalos). As we discussed around the multiassignment, we want functions to return (nrows, ncolumns).

@rrjbca
Copy link
Contributor Author

rrjbca commented Mar 12, 2021

  • As we discussed with the galaxies module, the splitting into redshift and <quantity> functions is a relic of when we could not multiassign columns. There should only be a single function that returns redshifts and properties.

In that case (where) should colossus_mf_redshift appear in the documentation?

  • The output shape is (2, nhalos). As we discussed around the multiassignment, we want functions to return (nrows, ncolumns).

I'm not sure yet if that is the best way to address multiassignment; for anybody not using the Pipeline class it is natural for a function to return a tuple of different properties. Regardless I would rather merge as is than have this PR held up waiting for multiassignment to be updated on the main branch.

@ntessore
Copy link
Member

In that case (where) should colossus_mf_redshift appear in the documentation?

There should only be one function. It goes through everything sequentially anyway.

I'm not sure yet if that is the best way to address multiassignment; for anybody not using the Pipeline class it is natural for a function to return a tuple of different properties. Regardless I would rather merge as is than have this PR held up waiting for multiassignment to be updated on the main branch.

I don't want to have another API in the codebase that we already know needs changing. And I think you once pointed out to me that multiassignment assumes (nrows, ncolumns) and it's by hack/coincidence that the other way round works, too.

I was very much in the "columns like tuples" camp, but I am now very much in the "columns as columns" camp, because you cannot easily stack the former.

@rrjbca
Copy link
Contributor Author

rrjbca commented Mar 12, 2021

I am strongly against working in this way in general. Pull requests can only realistically be reviewed in the context of the code on the base branch (and/or main branch). What you're suggesting leaves me stuck waiting for the multiassignment "fix" before this PR can be merged and I can carry on with downstream work. I imagine this would be particularly frustrating for an external contributor being made to wait on a separate issue through no fault of their own.

@ntessore
Copy link
Member

the multiassignment "fix"

There's nothing that needs to be fixed? Only code in this PR needs to be changed in order to align with what we decided we wanted the galaxies code to look like. That is i) unroll the colossus_mf function, and ii) return an array of shape (nhalos, 2).

@rrjbca
Copy link
Contributor Author

rrjbca commented Mar 12, 2021

Sorry, I thought (ii) in you suggestion also required a change to Pipeline hence my pushback.

@rrjbca
Copy link
Contributor Author

rrjbca commented Mar 12, 2021

  • While I agree that the extra submodules are probably unnecessary, is it not still useful to keep the separate redshift and mass samplers? E.g. I could imagine wanting to compare halo mass samples at fixed redshift with a simulation snapshot.
  • If the main advantage is ease of use in a Pipeline (e.g. stacking) doesn't it make more sense that we fix how Pipeline handles multiassignment? Rather than bending all of our functions to work around it; returning a 2d array of heterogeneous data seems unnecessarily obtuse for anybody not using Pipeline.

Copy link
Member

@ntessore ntessore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some local discussion, we agreed to merge this as is and then try to sort out the questions across the codebase separately.

@rrjbca rrjbca requested a review from Lucia-Fonseca May 18, 2021 12:47
@rrjbca rrjbca merged commit bc6ec88 into skypyproject:module/halos May 18, 2021
@rrjbca rrjbca deleted the colossus_mf branch May 18, 2021 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: halos new feature New feature, such as a new model
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants