Skip to content

Automatically generate docstring for each existing and newly written processes. #67

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jan 10, 2020

Conversation

rlange2
Copy link
Contributor

@rlange2 rlange2 commented Dec 2, 2019

Will update the doc attribute of each process in the framework. autodoc=True does not need to be passed to the xs.process decorator and instead is set globally.

I tried to follow the numpy docstring guide as closely as possible. Basically, variable name, type (if supplied), intent and description will be taken into account and represented in the format:

name: type (intent)
    description

For example, the user defined process:

@xs.process()
class DippingDyke:
    """Mimics the effect on erosion rates of a dyke dipping at
    a given angle, that is buried beneath the landscape and that is
    progressively exhumed by erosion.
    """
    
    x_position = xs.variable(validator = attr.validators.instance_of(int), description='initial x-position of exposed dyke')
    width = xs.variable(description='dyke fixed width')
    angle = xs.variable(description='dyke dipping angle in degrees')
    
    grid_shape = xs.foreign(UniformRectilinearGrid2D, 'shape')
    x = xs.foreign(UniformRectilinearGrid2D, 'x')
    
    etot = xs.foreign(TotalErosion, 'cumulative_height')
    
    k_coef = xs.foreign(StreamPowerChannel, 'k_coef', intent='out')
    diffusivity = xs.foreign(LinearDiffusion, 'diffusivity', intent='out')
    
    def run_step(self):
        cotg = 1. / np.tan(np.radians(self.angle))
        
        dyke_pos = self.x - self.x_position - self.etot * cotg
        
        in_dyke = (dyke_pos - self.width) * (dyke_pos + self.width) <= 0
        
        self.k_coef = np.where(in_dyke, 1e-5, 2e-5)
        self.diffusivity = np.where(in_dyke, 0.05, 0.1)

Will return:

Mimics the effect on erosion rates of a dyke dipping at
    a given angle, that is buried beneath the landscape and that is
    progressively exhumed by erosion.
    

Parameters
----------
x_position : int (in)
    initial x-position of exposed dyke
width : (in)
    dyke fixed width
angle : (in)
    dyke dipping angle in degrees
grid_shape : (in)
    Reference to variable 'shape' defined in class
    'UniformRectilinearGrid2D'
x : (in)
    Reference to variable 'x' defined in class 'UniformRectilinearGrid2D'
etot : (in)
    Reference to variable 'cumulative_height' defined in class
    'TotalErosion'
k_coef : (out)
    Reference to variable 'k_coef' defined in class 'StreamPowerChannel'
diffusivity : (out)
    Reference to variable 'diffusivity' defined in class 'LinearDiffusion'

@rlange2
Copy link
Contributor Author

rlange2 commented Dec 3, 2019

pytest xsimlab --verbose returns a KeyError for 'intent' and 'description' since I try to access these keys in metadata, see e.g.
intent = str(value.metadata['intent']).split('.')[1].lower()
I'm not 100% sure why this is happening since these should be valid keys of metadata as defined in variable.py.

Wrapping these expressions in try and except seems to solve the problem and leaves me with one remaining failed test, which is:

=================================== FAILURES ===================================
____________________________ test_process_decorator ____________________________

    def test_process_decorator():
        with pytest.raises(NotImplementedError):
>           @xs.process(autodoc=True)
            class Dummy:
E           Failed: DID NOT RAISE <class 'NotImplementedError'>

xsimlab/tests/test_process.py:210: Failed

I guess that makes sense and shouldn't be an issue, right?

However, using try and except probably isn't the best way to deal with the failed pytest since these are pre-defined keys. I really cannot say if this happens because I hardcoded the keys or it has something to do with the internal logic of pytest or something completely different.

I'd be thankful for ideas.

@rlange2
Copy link
Contributor Author

rlange2 commented Dec 4, 2019

Okay, in the spirit of encapsulation and OOP, I should start to rely on setters and getters.
That said, I changed the assignment for intent and description to:

intent = value.metadata.get('intent').value
description = value.metadata.get('description')

Running pytest again for now solves the KeyError. However, it returns:

xsimlab/process.py:458: in render_docstrings
    intent = value.metadata.get('intent').value
E   AttributeError: 'NoneType' object has no attribute 'value'

I'm curious now, how intent (and equally description) can turn out to be of type None since it is not a valid VarIntent (and description being None would raise other errors) and also, how other methods which make use of get don't run into these problems. Additionally, retrieving the variable type via data_type = str(value.validator).split("'")[1] does not raise any concerns.

Copy link
Member

@benbovy benbovy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks @rlange2 for submitting this!

I have some inline comments.

I think we should also allow more control on where to include the generated parameter section in the docstrings if it already exists, e.g.,

@xsimlab.process(autodoc=True)
class A:
    """Process A
    
    {{parameters}}

    Notes
    ------
    Some notes about this process.
    """
    ...

I also think we should include other information such as the dimensions, perhaps under the variable description as a bullet list (and maybe move the intent there), e.g.,

a_var: object
   Variable description

    - dimensions: scalar or 'x' or ('x', 'y')
    - intent: 'in'
another_var: object
    ...

See here for an example.

On a more general note, I try to more or less strictly follow the PEP8 conventions for formatting the code. There are tools like flake8 (linters) that can help you enforcing the code into this format. I will eventually use black so that we won't need to worry about code style.

@benbovy
Copy link
Member

benbovy commented Dec 4, 2019

I'm curious now, how intent (and equally description) can turn out to be of type None since it is not a valid VarIntent (and description being None would raise other errors)

value.metadata.get('intent') returns either an Enum member or None. Like the error says, None in Python has no .value attribute.

@rlange2
Copy link
Contributor Author

rlange2 commented Dec 4, 2019

Dear @benbovy, many thanks for your input. Most of it seems pretty clear to me. On some others, I'm afraid, I'll need some clarification.

But let's go through that step by step:

I think we should also allow more control on where to include the generated parameter section in the docstrings if it already exists, e.g.,

@xsimlab.process(autodoc=True)
class A:
    """Process A
    
    {{parameters}}

    Notes
    ------
    Some notes about this process.
    """
    ...

Yes, having a keyword that later on gets replaced is a good idea.

I also think we should include other information such as the dimensions, perhaps under the variable description as a bullet list (and maybe move the intent there), e.g.,

a_var: object
   Variable description

    - dimensions: scalar or 'x' or ('x', 'y')
    - intent: 'in'
another_var: object
    ...

That looks better from an organisational perspective. I was wondering what other information are useful to add to the bullet list. The metadata sure offer a lot of them and it's probably a good idea to check beforehand, which items are actually set and others we don't include if they are None. Although, there is less information available, for instance, if it's a foreign variable. Should we look that up then in the reference class? Then, it might be more sensical, to write little helper functions (in utils.py?) that, for instance, look up what attributes are given in metadata or retrieve information when there is a foreign variable. Just so, that not everything is done by render_docstring.

On a more general note, I try to more or less strictly follow the PEP8 conventions for formatting the code. There are tools like flake8 (linters) that can help you enforcing the code into this format. I will eventually use black so that we won't need to worry about code style.

That is a fair point. I just started using flake8 and incorporated their feedback. Will soon take a deeper look into black

Instead of attr.field_dict, you should use variables_dict defined in utils.py, since we don't want to include attrs attributes that are not xarray-simlab variables here. This is why you get the KeyError: 'intent' in the tests.

Indeed that was the case. Maybe at some point, we can talk about why that happens.

value.metadata.get('intent') returns either an Enum member or None. Like the error says, None in Python has no .value attribute.

Okay, I understand. Then it was probably a logic error from my side. From my understanding value.metadata.get('intent') will (in this framework) never return None and therefore, will always have a .value attribute. Moreover, this also got solved by variables_dict from utils.py and I'm not sure what's the reason.

Do not directly overwrite the __doc__ attribute of the base class here. Instead do:

self._p_cls_dict['__doc__'] = ...   # concatenate self._base_cls.__doc__ + the formatted parameter section

Here, I run into problems and did so in the past. When I try:
self._p_cls_dict['__doc__'] = ...

The updated docstring doesn't show up in process.__dict__['__doc__'] and can only be seen right at the end when I call help(process).

When using instead:
self._p_cls_dict.__doc__ = ...

I get:

AttributeError: 'dict' object attribute '__doc__' is read-only

That's why I started overwriting the base class. I only overwrite, when self._base_cls_.__doc__ is NoneType, i.e. there is no docstring given (from my understanding, and that is only the case for BorderBoundary in fastscape/processes/boundary.py plus any user created process without a docstring). Otherwise, I concatenate the docstring to self._base_cls_.__doc__.
This way, the information shows up in __doc__ as intended.

Thanks for the help.

@benbovy
Copy link
Member

benbovy commented Dec 4, 2019

I was wondering what other information are useful to add to the bullet list. [...] Although, there is less information available, for instance, if it's a foreign variable. Should we look that up then in the reference class? [...] Then, it might be more sensical, to write little helper functions (in utils.py?)

Good point. I think it's enough to show the class and variable to which refers a foreign variable. You could look at formatting.var_details(), which is used to format the docstring of the generated class properties for the variables. Actually, maybe you could just reuse it here?

value.metadata.get('intent') will (in this framework) never return None

It will likely return None if value is not an attr.Attribute created with the framework wrappers (i.e., xs.variable, xs.foreign...). The framework should be as less intrusive as possible, i.e., users should be able to do the following:

@xsimlab.process
class A:
    not_a_xsimlab_var = attr.ib()
    xsimlab_var = xsimlab.variable()

Hence https://github.com/benbovy/xarray-simlab/pull/67#discussion_r353609467

The updated docstring doesn't show up in process.dict['doc'] and can only be seen right at the end when I call help(process).

Oh yes I see, actually the docstrings should be updated for both the base class (stand-alone) and the subclass (the class returned by build_class).

rlange2 pushed a commit to rlange2/xarray-simlab that referenced this pull request Dec 5, 2019
@rlange2
Copy link
Contributor Author

rlange2 commented Dec 5, 2019

I was wondering what other information are useful to add to the bullet list. [...] Although, there is less information available, for instance, if it's a foreign variable. Should we look that up then in the reference class? [...] Then, it might be more sensical, to write little helper functions (in utils.py?)

Good point. I think it's enough to show the class and variable to which refers a foreign variable. You could look at formatting.var_details(), which is used to format the docstring of the generated class properties for the variables. Actually, maybe you could just reuse it here?

Reusing formatting.var_details() would keep everything consistent as well. However, the function returns the output in a very defined format. In order to account for additional indents and newline characters, I came up with two different approaches:

  1. Don't change var_details()itself but rather its output:

Here, a nested loop is introduced to format the output on the fly:

def render_docstrings(self):

    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved
    docstring = "\nAttributes\n----------\n"
    indent = "    "

    for key, value in variables_dict(self._base_cls).items():
        temp_string = ''
        var_attributes = var_details(value).split("\n")
            
        for line in range(len(var_attributes)):
            if line == len(var_attributes) - 1:
                temp_string += indent + var_attributes[line]
            else:
                temp_string += indent + var_attributes[line] + "\n"

        var_attributes = temp_string
        docstring += f"{key}: {data_type}\n{var_attributes}\n"

    if self._base_cls.__doc__ is not None:
        if attributes_keyword in self._base_cls.__doc__:
            self._base_cls.__doc__ = self._base_cls.__doc__.replace(attributes_keyword,
                                                                    docstring)
        else:
            self._base_cls.__doc__ += f"\n{docstring}"
    else:
        self._base_cls.__doc__ = docstring
  1. Add a flag (default False) to var_details() to return expected output:
def var_details(var, docstring=False):
    max_line_length = 70

    var_metadata = var.metadata.copy()

    description = textwrap.fill(var_metadata.pop('description').capitalize(),
                                width=max_line_length, subsequent_indent='    ')

    detail_items = [('type', var_metadata.pop('var_type').value),
                    ('intent', var_metadata.pop('intent').value)]
    detail_items += list(var_metadata.items())

    if docstring==True:
        details = "\n".join(["    - {} : {}".format(k, v) for k, v in detail_items])
    else:
        details = "\n".join(["- {} : {}".format(k, v) for k, v in detail_items])

    return description + "\n\n" + details + '\n'

This change let's us get rid of the second loop in render_docstrings():

def render_docstrings(self):

    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved
    docstring = "\nAttributes\n----------\n"
    indent = "    "

    for key, value in variables_dict(self._base_cls).items():
        var_attributes = var_details(value, docstring=True)
        docstring += f"{key}: {data_type}\n{indent}{var_attributes}\n"

    if self._base_cls.__doc__ is not None:
        if attributes_keyword in self._base_cls.__doc__:
            self._base_cls.__doc__ = self._base_cls.__doc__.replace(attributes_keyword,
                                                                    docstring)
        else:
            self._base_cls.__doc__ += docstring
    else:
        self._base_cls.__doc__ = docstring

There's no need anymore to import textwrap.
Both approaches generate the same result, for example, by using the process mentioned in #67 (comment):

Mimics the effect on erosion rates of a dyke dipping at
    a given angle, that is buried beneath the landscape and that is
    progressively exhumed by erosion.
    
    
Attributes
----------
x_position: object
    Initial x-position of exposed dyke

    - type : variable
    - intent : in
    - dims : ((),)
    - group : None
    - attrs : {}

width: object
    Dyke fixed width

    - type : variable
    - intent : in
    - dims : ((),)
    - group : None
    - attrs : {}

angle: object
    Dyke dipping angle in degrees

    - type : variable
    - intent : in
    - dims : ((),)
    - group : None
    - attrs : {}

grid_shape: object
    Reference to variable 'shape' defined in class
    'uniformrectilineargrid2d'

    - type : foreign
    - intent : in
    - other_process_cls : <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>
    - var_name : shape

x: object
    Reference to variable 'x' defined in class 'uniformrectilineargrid2d'

    - type : foreign
    - intent : in
    - other_process_cls : <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>
    - var_name : x

etot: object
    Reference to variable 'cumulative_height' defined in class
    'totalerosion'

    - type : foreign
    - intent : in
    - other_process_cls : <class 'fastscape.processes.erosion.TotalErosion'>
    - var_name : cumulative_height

k_coef: object
    Reference to variable 'k_coef' defined in class 'streampowerchannel'

    - type : foreign
    - intent : out
    - other_process_cls : <class 'fastscape.processes.channel.StreamPowerChannel'>
    - var_name : k_coef

diffusivity: object
    Reference to variable 'diffusivity' defined in class 'lineardiffusion'

    - type : foreign
    - intent : out
    - other_process_cls : <class 'fastscape.processes.hillslope.LinearDiffusion'>
    - var_name : diffusivity

A few things I noticed:
Due to capitalize() in var_details() any character other than the first in the string becomes lowercase, e.g. 'UniformRectilinearGrid2D' turns into 'uniformrectilineargrid2d'. It's not the biggest issue right now but it lowers readability quite a bit in my opinion.
Additionally, some bullet points end up containing redundant information (e.g. other_process_cls for foreign variables), others are even less helpful if not assigned (e.g. group, dims, attrs). Of course, that doesn't mean, we should leave these out.
When no description is given, there will be an additional blank line instead.

Finally, when using a non xsimlab-variable via attr.ib() it likely won't be considered by variables_dict() since they are not xsimlab-specific.

@benbovy
Copy link
Member

benbovy commented Dec 5, 2019

I prefer the 1st approach. It's better customize things at a higher level and not propagate complexity at lower levels (e.g., var_details in this case), IMO.

I think you could get rid of the nested loop by using textwrap.indent, and actually get rid of all nested blocks, e.g., with something like:

def render_docstrings(self):

    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved

    fmt_vars = []

    for vname, var in variables_dict(self._base_cls).items():
        var_header = f"{vname} : {data_type}"
        var_content = textwrap.indent(var_details(var), " " * 4)

        fmt_vars.append(f"{var_header}\n{var_content}")

    fmt_section = textwrap.indent("Attributes\n"
                                  "----------\n"
                                  "\n".join(fmt_vars),
                                  " " * 4)

    current_doc = self._base_cls.__doc__ or ""

    if attributes_keyword in current_doc:
        new_doc = current_doc.replace(attributes_keyword,
                                      fmt_section[4:])
    else:
        new_doc = f"\n\n{fmt_section}\n"

    self._base_cls.__doc__ = new_doc

The code here above should also properly handle all line returns and indentation.

@benbovy
Copy link
Member

benbovy commented Dec 5, 2019

Due to capitalize() in var_details() any character other than the first in the string becomes lowercase, e.g. 'UniformRectilinearGrid2D' turns into 'uniformrectilineargrid2d'.

You can ignore this for now. Actually, I think that it will be better if the description of a foreign variable corresponds to the description of the variable it refers to (we can do this in another PR).

@rlange2
Copy link
Contributor Author

rlange2 commented Dec 9, 2019

That is a really clean and thorough approach!

I slightly changed your code to the following:

def render_docstrings(self):
    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved

    fmt_vars = []

    for vname, var in variables_dict(self._base_cls).items():
        var_header = f"{vname} : {data_type}"
        var_content = textwrap.indent(var_details(var), " " * 4)

        fmt_vars.append(f"{var_header}\n{var_content}")

    fmt_section = textwrap.indent("Attributes\n"
                                  "----------\n"
                                  + "\n".join(fmt_vars),
                                  " " * 4)

    current_doc = self._base_cls.__doc__ or ""

    if attributes_keyword in current_doc:
        new_doc = current_doc.replace(attributes_keyword,
                                      fmt_section[4:])
    else:
        new_doc = f"{current_doc}\n{fmt_section}\n"

    self._base_cls.__doc__ = new_doc
  1. + was added to your second textwrap.indent() otherwise the attributes headline is wrapped around each variable instead of being placed once at the beginning of fmt_section. Line break was introduced before the binary operator for readability.
  2. Concatenation in else since a previously existing docstring (without using the attributes_keyword) would have been overwritten.
  3. Added f to your f-string.

I would like to know what you think about setting autodoc=True in process.process() by default. Also, there is still a blank line in case no variable description is given. I will think about what's a good way to handle that.

You can ignore this for now. Actually, I think that it will be better if the description of a foreign variable corresponds to the description of the variable it refers to (we can do this in another PR).

True. I'm not sure how to link a variable in a process to its original process. Will think about it.

Edit: Maybe var.metadata.get('other_process_cls') is a good start.

@benbovy
Copy link
Member

benbovy commented Dec 9, 2019

For better overall readability, I'm wondering if we shouldn't move the the formatting logic currently in render_docstrings into a new function in formatting.py, e.g., format_attribute_section(process) and then just call it in render_docstrings:

from .formatting import format_attribute_section


def render_docstrings(self):
    new_doc = format_attribute_section(self._base_cls)

    self._base_cls.__doc__ = new_doc
    self._p_cls_dict['__doc__'] = new_doc

The last line is needed (I think) to be able to get the docstrings from process instances attached to a model (i.e., mymodel.myprocess).

@benbovy
Copy link
Member

benbovy commented Dec 9, 2019

We should also take the indentation into account when wrapping the variable description as a text block in the section. This would need exposing the max line length as a parameter, i.e., var_details(var, max_line_length=70) and then use 62 instead of 70 for var_content in the code above.

@benbovy
Copy link
Member

benbovy commented Dec 9, 2019

Nit: add_attribute_section(process, placeholder="{{attributes}}") is slightly better than format_attribute_section(process) IMO.

@rlange2
Copy link
Contributor Author

rlange2 commented Dec 9, 2019

add_attribute_section(process, placeholder="{{attributes}}") has been added to formatting.py and the code has been adjusted as you suggested.

I pushed the update since I felt that it's good to have a more recent version to talk about. Not sure why the checks have failed since pytest didn't have any issues.

I will look into the self._p_cls_dict['__doc__'] = new_doc assignment and max line length.

@benbovy
Copy link
Member

benbovy commented Dec 9, 2019

The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

That said, it would be good to add a test for this "autodoc" feature.

I would like to know what you think about setting autodoc=True in process.process() by default.

It's a good default IMO. More descriptive docstrings is good. autodoc=False is for edge cases or in case something goes wrong.

Also, there is still a blank line in case no variable description is given. I will think about what's a good way to handle that.

I like adding (no description given) as you proposed it in an earlier version. You could do that in var_details().

@@ -3,11 +3,12 @@
import inspect
import sys
import warnings
import textwrap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this unused import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thank you. Might have jumped the gun here.

@benbovy
Copy link
Member

benbovy commented Dec 10, 2019

The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

This is now fixed in the master branch. You can merge it here.

Raphael Lange and others added 3 commits December 10, 2019 13:00
…en process, if autodoc=True in the process decorator.
The formatting of the docstring has been outsourced to a separate
function in formatting.add_attribute_section(). This way, the
render_docstrings() only needs to call that function and pass its
return to the __doc__ attribute of the base class (and subclasses).

There are more suggestions in the PR but I had the impression that it's
beneficial to update the recent code.
@rlange2
Copy link
Contributor Author

rlange2 commented Dec 10, 2019

    self._p_cls_dict['__doc__'] = new_doc

The last line is needed (I think) to be able to get the docstrings from process instances attached to a model (i.e., mymodel.myprocess).

I'm not sure about this, e.g.

from fastscape.processes import basic_model
basic_model.terrain

It returns the information with the previous formatting, i.e.

<TerrainDerivatives 'terrain' (xsimlab process)>
Variables:
    shape         [in] <--- grid.shape
    spacing       [in] <--- grid.spacing
    elevation     [in] <--- topography.elevation
    slope        [out] ('y', 'x') terrain local slope
    curvature    [out] ('y', 'x') terrain local curvature
Simulation stages:
    *no stage implemented*

The same is the case for a custom process, see DippingDyke example from the first post:

from fastscape.models import basic_model
basic_model = basic_model.update_processes({'Dyke':DippingDyke})
basic_mode.Dyke

returns

<DippingDyke 'Dyke' (xsimlab process)>
Variables:
    x_position      [in] initial x-position of exposed dyke
    width           [in] dyke fixed width
    angle           [in] dyke dipping angle in degrees
    grid_shape      [in] <--- grid.shape
    x               [in] <--- grid.x
    etot            [in] <--- erosion.cumulative_height
    k_coef         [out] ---> spl.k_coef
    diffusivity    [out] ---> diffusion.diffusivity
Simulation stages:
    run_step

That is the case, whether I include self._p_cls_dict['__doc__'] = new_doc or not.
It makes sense. formatting._summarize_var() is called in formatting.repr_process() which sets up the layout. That in turn gets passed to the process._ProcessBuilder class as well as to process.process_info().
Maybe I misunderstood your intention but this is what I found regarding mymodell.myprocess.


We should also take the indentation into account when wrapping the variable description as a text block in the section. This would need exposing the max line length as a parameter, i.e., var_details(var, max_line_length=70) and then use 62 instead of 70 for var_content in the code above.

For the current format, it seems like there are mostly two places where text length might be a concern. That would be the i) variable description and ii) the bullet point other_process_cls (appears for foreign variables) will usually contain a long sequence, e.g. <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>.
With the way var_details() is written now, a line break is only introduced in the description if max_line_length is exceeded, since this is where textwrap.fill is applied. So this won't apply for the bulleted list.
There are a couple of possibilities I can think of:

  1. Wrap another textwrap method around around var_content or fmt_section.
  2. Change var_details() once again and include a textwrap (or custom formatting) around details.
  3. Write an entirely new function that is kin to var_details() but only used for this specific case of docstring-formatting.
  4. Leave everything as is.

Feel free to share your thoughts and add to the list.
My thoughts: Regarding 1., textwrap can mess with a text block quite substantially. However, I will try to make that work. I can see, how 2. might not be the best option. var_details() is used in a few other places and we probably should keep this function as original as it is meant to be. 3. introduces another function to the project. I think, that is definitely an option. 4. sounds a bit dissatisfying at first but there are two things, we should keep in mind: a) We can leave line breaks for the description up to the user, and b) we have yet to think about a way how to deal with the variable information of foreign variables. I'm sure you want to include what process the variable is referencing (as a bullet point in this list), but an output like
- other_process_cls : <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>
is much too cryptic in my opinion anyways. Of course, we can wrap the description with max_line_length=62 and replace other_process_cls with a wrapped expression as well.


The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

That said, it would be good to add a test for this "autodoc" feature.

That sounds good. I'm not yet sure how to properly customise your own build but I hope, we can discuss it at one point.


It's a good default IMO. More descriptive docstrings is good. autodoc=False is for edge cases or in case something goes wrong.

Great!


Also, there is still a blank line in case no variable description is given. I will think about what's a good way to handle that.

I like adding (no description given) as you proposed it in an earlier version. You could do that in var_details().

The formatting.var_details() now contains:

description = textwrap.fill(var_metadata.pop('description').capitalize(),
                                width=max_line_length) or (
        "(no description given)")

Your suggestion inspired me to use or in string assignment. I hope, it doesn't affect readability too much compared to a comprehensive if/else (PEP 308 -- Conditional Expressions).
Seems to do the job for now.


I tried to introduce horizontal rulers to structure my posts a bit more. If I should make separate posts, please don't hesitate to mention it.


The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

This is now fixed in the master branch. You can merge it here.

I hope, I did the right thing :) The pytest output is a bit different now but no complains so far.

@benbovy
Copy link
Member

benbovy commented Dec 10, 2019

It returns the information with the previous formatting

basic_model.terrain gives what the class __repr__ returns (that's what xs.process_info(basic_model.terrain) also returns), which is not the same than the docstrings. Try help(basic_model.terrain) and you should see the formatted attributes section among many other things.

For the current format, it seems like there are mostly two places where text length might be a concern. That would be the i) variable description and ii) the bullet point other_process_cls (appears for foreign variables) will usually contain a long sequence.

Yeah I think it's enough for now to wrap only the description (with 70 - 8 char width in this case) and let the bullet list untouched.

it would be good to add a test for this "autodoc" feature [...] I'm not yet sure how to properly customise your own build

The test would be very similar to those already implemented in https://github.com/benbovy/xarray-simlab/blob/master/xsimlab/tests/test_formatting.py.

Your suggestion inspired me to use or in string assignment.

Actually, I prefer a if in this case.

I tried to introduce horizontal rulers to structure my posts a bit more. If I should make separate posts, please don't hesitate to mention it.

Sometimes it's good to push your last changes/commits first and then add yourself inline comments, so that we can discuss specific details just next to the relevant lines of code.

I hope, I did the right thing :) The pytest output is a bit different now but no complains so far.

Same advise, don't hesitate to push your commits early. This way I can directly see the output in the CI logs.

For readability, variable description in var_details() now assigned
with a separate conditional expression.

To account for indents in the docstring, variable description is now
wrapped after 62 characters.
Copy link
Contributor Author

@rlange2 rlange2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same advise, don't hesitate to push your commits early. This way I can directly see the output in the CI logs.

Ok, the script now only fails the non-raised NotImplementedError.

____________________________ test_process_decorator ____________________________
809 
810     def test_process_decorator():
811         with pytest.raises(NotImplementedError):
812 >           @xs.process(autodoc=True)
813              class Dummy:
814 E           Failed: DID NOT RAISE <class 'NotImplementedError'>
815 
816 xsimlab/tests/test_process.py:210: Failed

it would be good to add a test for this "autodoc" feature [...] I'm not yet sure how to properly customise your own build

The test would be very similar to those already implemented in https://github.com/benbovy/xarray-simlab/blob/master/xsimlab/tests/test_formatting.py.

Thank you. This is what I will look into next.

@@ -453,7 +452,6 @@ def render_docstrings(self):
new_doc = add_attribute_section(self._base_cls)

self._base_cls.__doc__ = new_doc
self._p_cls_dict['__doc__'] = new_doc
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._base_cls.__doc__ = new_doc is enough to create docstring in help(mymodel.myprocess). In fact, it's pasted in there twice, at the beginning and the end.

Raphael Lange added 2 commits December 11, 2019 10:00
With this commit, a test for the autodoc-feature will be implemented.
For now, however, pytest returns an AssertionError because
`add_attribute_section()` in `test_add_attribute_section()` returns
the docstring twice. Reason unknown.

Also, `@pytest.mark.xfail` was introduced to circumvent failures in
`test_process_decorator()` and `test_add_attribute_section()`. This
will be only temporary.
Copy link
Contributor Author

@rlange2 rlange2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With test_add_attribute_section() I tried to compare the output of add_attribute_section() with a template. However, the output gets generated twice and therefore fails to pass the test.

In tests.fixture_process I found the following:

@pytest.fixture(scope='session')
def in_var_details():
    return dedent("""\
    Input variable

    - type : variable
    - intent : in
    - dims : (('x',), ('x', 'y'))
    - group : None
    - attrs : {}
    """)

Is that related to test_var_details()?


""")

assert add_attribute_section(Dummy) == expected
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leaves me with an AssertionError. Apparently, the docstring gets returned twice and therefore fails to be equal to expected. I tried this with a Dummy class as well as with tests.fixture_process.SomeProcess. I'm not sure, though, why that happens.

@@ -37,6 +39,34 @@ def test_var_details(example_process_obj):
assert "- dims : (('x',),)" in var_details_str


@pytest.mark.xfail
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest.mark.xfail was introduced to allow for this failure (also in tests.test_process.test_process_decorator(). This will be only temporary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but then I cannot see in the CI logs why the test fails.

Copy link
Member

@benbovy benbovy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! We're almost there.

Some minor issues remaining.

Also, the docstrings of the process decorator should be updated (here: https://github.com/benbovy/xarray-simlab/blob/f595fd6af3150f4f293306985c644d2949299852/xsimlab/process.py#L492), with default: False -> default: True and maybe with a bit more details.

And don't forget to update the release notes in doc/whats_new.rst.

@@ -37,6 +39,34 @@ def test_var_details(example_process_obj):
assert "- dims : (('x',),)" in var_details_str


@pytest.mark.xfail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove xfail here. We expect that this test always succeeds.

"""This is a Dummy class
to test `add_attribute_section()`
"""
var = xs.variable(dims='x', description='a variable')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add another variable with no description to test the case where (no description given) is added to the docstrings.

It might be good to also test the case where the {{attributes}} placeholder is present in the original docstrings.

@@ -205,6 +205,7 @@ def run_step(self, a, b):
assert "Process runtime methods" in str(excinfo.value)


@pytest.mark.xfail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, here this should be removed.

You should instead replace the NotImplementedError test below by some logic testing if the docstrings are updated if autodoc=True and if it stays unchanged if autodoc=False. No need to duplicate the test that you have written for the attribute section content, assert "Attributes" in cls.__doc__ might be enough here.

var_metadata = var.metadata.copy()

description = textwrap.fill(var_metadata.pop('description').capitalize(),
width=max_line_length)
if description == "":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if not description: is slightly more idiomatic.

@benbovy
Copy link
Member

benbovy commented Dec 18, 2019

@rlange2 I started using black in #84. As a consequence, there are some conflicts between this branch and the master branch. Let me know if you have some work that you haven't commit/pushed yet. Otherwise, I'll resolve the conflicts through this interface and you'll just need to pull the changes.

From now on, you can stop worrying about code formatting and use black: see here.

Raphael Lange added 2 commits January 6, 2020 12:03
Fork is now synched with latest commit 1d995b4 of original repo.
whats_new.rst has been updated. Placeholder and if-clause if no
description has been given for individual variables. Docstring for
`process.process.py` has been updated. Lastly, tests included for
`add_attribute_section()` and `process_decorator()` in `test_formatting`
and `test_process`, respectively.
@rlange2
Copy link
Contributor Author

rlange2 commented Jan 6, 2020

Latest changes include:

@rlange2 I started using black in #84. As a consequence, there are some conflicts between this branch and the master branch. Let me know if you have some work that you haven't commit/pushed yet. Otherwise, I'll resolve the conflicts through this interface and you'll just need to pull the changes.

I created a new branch and used it to overwrite the existing autodoc-branch. I hope that wasn't the worst idea. Should I highlight the changes I made?

Copy link
Member

@benbovy benbovy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Just two minor comments left.

@@ -47,6 +47,27 @@ Enhancements
to :func:`xarray.Dataset.xsimlab.run`.
- More consistent dictionary format for output variables in the xarray
extension (:issue:`85`).
- Existing and newly written processes will be updated automatically
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep the release notes concise, e.g., with something like:

The ``autodoc`` parameter of the :func:`~xsimlab.process` decorator
now allows to automatically add an attributes section in the docstrings
of the class to which the decorator is applied, using the metadata of 
each variable declared in the class.

{{attributes}} can be used as a placeholder for the updated
metadata information.

Docstring template:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again here, not sure it's worth showing the whole attributes section template.

@benbovy benbovy merged commit d94c629 into xarray-contrib:master Jan 10, 2020
@benbovy
Copy link
Member

benbovy commented Jan 10, 2020

Great! Thanks @rlange2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants