Automatically generate docstring for each existing and newly written processes. #67

rlange2 · 2019-12-02T14:55:29Z

Will update the doc attribute of each process in the framework. autodoc=True does not need to be passed to the xs.process decorator and instead is set globally.

I tried to follow the numpy docstring guide as closely as possible. Basically, variable name, type (if supplied), intent and description will be taken into account and represented in the format:

name: type (intent)
    description

For example, the user defined process:

@xs.process()
class DippingDyke:
    """Mimics the effect on erosion rates of a dyke dipping at
    a given angle, that is buried beneath the landscape and that is
    progressively exhumed by erosion.
    """
    
    x_position = xs.variable(validator = attr.validators.instance_of(int), description='initial x-position of exposed dyke')
    width = xs.variable(description='dyke fixed width')
    angle = xs.variable(description='dyke dipping angle in degrees')
    
    grid_shape = xs.foreign(UniformRectilinearGrid2D, 'shape')
    x = xs.foreign(UniformRectilinearGrid2D, 'x')
    
    etot = xs.foreign(TotalErosion, 'cumulative_height')
    
    k_coef = xs.foreign(StreamPowerChannel, 'k_coef', intent='out')
    diffusivity = xs.foreign(LinearDiffusion, 'diffusivity', intent='out')
    
    def run_step(self):
        cotg = 1. / np.tan(np.radians(self.angle))
        
        dyke_pos = self.x - self.x_position - self.etot * cotg
        
        in_dyke = (dyke_pos - self.width) * (dyke_pos + self.width) <= 0
        
        self.k_coef = np.where(in_dyke, 1e-5, 2e-5)
        self.diffusivity = np.where(in_dyke, 0.05, 0.1)

Will return:

Mimics the effect on erosion rates of a dyke dipping at
    a given angle, that is buried beneath the landscape and that is
    progressively exhumed by erosion.
    

Parameters
----------
x_position : int (in)
    initial x-position of exposed dyke
width : (in)
    dyke fixed width
angle : (in)
    dyke dipping angle in degrees
grid_shape : (in)
    Reference to variable 'shape' defined in class
    'UniformRectilinearGrid2D'
x : (in)
    Reference to variable 'x' defined in class 'UniformRectilinearGrid2D'
etot : (in)
    Reference to variable 'cumulative_height' defined in class
    'TotalErosion'
k_coef : (out)
    Reference to variable 'k_coef' defined in class 'StreamPowerChannel'
diffusivity : (out)
    Reference to variable 'diffusivity' defined in class 'LinearDiffusion'

rlange2 · 2019-12-03T09:45:21Z

pytest xsimlab --verbose returns a KeyError for 'intent' and 'description' since I try to access these keys in metadata, see e.g.
intent = str(value.metadata['intent']).split('.')[1].lower()
I'm not 100% sure why this is happening since these should be valid keys of metadata as defined in variable.py.

Wrapping these expressions in try and except seems to solve the problem and leaves me with one remaining failed test, which is:

=================================== FAILURES ===================================
____________________________ test_process_decorator ____________________________

    def test_process_decorator():
        with pytest.raises(NotImplementedError):
>           @xs.process(autodoc=True)
            class Dummy:
E           Failed: DID NOT RAISE <class 'NotImplementedError'>

xsimlab/tests/test_process.py:210: Failed

I guess that makes sense and shouldn't be an issue, right?

However, using try and except probably isn't the best way to deal with the failed pytest since these are pre-defined keys. I really cannot say if this happens because I hardcoded the keys or it has something to do with the internal logic of pytest or something completely different.

I'd be thankful for ideas.

rlange2 · 2019-12-04T08:59:38Z

Okay, in the spirit of encapsulation and OOP, I should start to rely on setters and getters.
That said, I changed the assignment for intent and description to:

intent = value.metadata.get('intent').value
description = value.metadata.get('description')

Running pytest again for now solves the KeyError. However, it returns:

xsimlab/process.py:458: in render_docstrings
    intent = value.metadata.get('intent').value
E   AttributeError: 'NoneType' object has no attribute 'value'

I'm curious now, how intent (and equally description) can turn out to be of type None since it is not a valid VarIntent (and description being None would raise other errors) and also, how other methods which make use of get don't run into these problems. Additionally, retrieving the variable type via data_type = str(value.validator).split("'")[1] does not raise any concerns.

benbovy

Great! Thanks @rlange2 for submitting this!

I have some inline comments.

I think we should also allow more control on where to include the generated parameter section in the docstrings if it already exists, e.g.,

@xsimlab.process(autodoc=True)
class A:
    """Process A
    
    {{parameters}}

    Notes
    ------
    Some notes about this process.
    """
    ...

I also think we should include other information such as the dimensions, perhaps under the variable description as a bullet list (and maybe move the intent there), e.g.,

a_var: object
   Variable description

    - dimensions: scalar or 'x' or ('x', 'y')
    - intent: 'in'
another_var: object
    ...

See here for an example.

On a more general note, I try to more or less strictly follow the PEP8 conventions for formatting the code. There are tools like flake8 (linters) that can help you enforcing the code into this format. I will eventually use black so that we won't need to worry about code style.

xsimlab/process.py

benbovy · 2019-12-04T09:48:58Z

I'm curious now, how intent (and equally description) can turn out to be of type None since it is not a valid VarIntent (and description being None would raise other errors)

value.metadata.get('intent') returns either an Enum member or None. Like the error says, None in Python has no .value attribute.

rlange2 · 2019-12-04T14:40:25Z

Dear @benbovy, many thanks for your input. Most of it seems pretty clear to me. On some others, I'm afraid, I'll need some clarification.

But let's go through that step by step:

I think we should also allow more control on where to include the generated parameter section in the docstrings if it already exists, e.g.,
@xsimlab.process(autodoc=True)
class A:
    """Process A
    
    {{parameters}}

    Notes
    ------
    Some notes about this process.
    """
    ...

Yes, having a keyword that later on gets replaced is a good idea.

I also think we should include other information such as the dimensions, perhaps under the variable description as a bullet list (and maybe move the intent there), e.g.,
a_var: object
   Variable description

    - dimensions: scalar or 'x' or ('x', 'y')
    - intent: 'in'
another_var: object
    ...

That looks better from an organisational perspective. I was wondering what other information are useful to add to the bullet list. The metadata sure offer a lot of them and it's probably a good idea to check beforehand, which items are actually set and others we don't include if they are None. Although, there is less information available, for instance, if it's a foreign variable. Should we look that up then in the reference class? Then, it might be more sensical, to write little helper functions (in utils.py?) that, for instance, look up what attributes are given in metadata or retrieve information when there is a foreign variable. Just so, that not everything is done by render_docstring.

On a more general note, I try to more or less strictly follow the PEP8 conventions for formatting the code. There are tools like flake8 (linters) that can help you enforcing the code into this format. I will eventually use black so that we won't need to worry about code style.

That is a fair point. I just started using flake8 and incorporated their feedback. Will soon take a deeper look into black

Instead of attr.field_dict, you should use variables_dict defined in utils.py, since we don't want to include attrs attributes that are not xarray-simlab variables here. This is why you get the KeyError: 'intent' in the tests.

Indeed that was the case. Maybe at some point, we can talk about why that happens.

value.metadata.get('intent') returns either an Enum member or None. Like the error says, None in Python has no .value attribute.

Okay, I understand. Then it was probably a logic error from my side. From my understanding value.metadata.get('intent') will (in this framework) never return None and therefore, will always have a .value attribute. Moreover, this also got solved by variables_dict from utils.py and I'm not sure what's the reason.

Do not directly overwrite the __doc__ attribute of the base class here. Instead do:
self._p_cls_dict['__doc__'] = ...   # concatenate self._base_cls.__doc__ + the formatted parameter section

Here, I run into problems and did so in the past. When I try:
self._p_cls_dict['__doc__'] = ...

The updated docstring doesn't show up in process.__dict__['__doc__'] and can only be seen right at the end when I call help(process).

When using instead:
self._p_cls_dict.__doc__ = ...

I get:

AttributeError: 'dict' object attribute '__doc__' is read-only

That's why I started overwriting the base class. I only overwrite, when self._base_cls_.__doc__ is NoneType, i.e. there is no docstring given (from my understanding, and that is only the case for BorderBoundary in fastscape/processes/boundary.py plus any user created process without a docstring). Otherwise, I concatenate the docstring to self._base_cls_.__doc__.
This way, the information shows up in __doc__ as intended.

Thanks for the help.

benbovy · 2019-12-04T16:47:47Z

I was wondering what other information are useful to add to the bullet list. [...] Although, there is less information available, for instance, if it's a foreign variable. Should we look that up then in the reference class? [...] Then, it might be more sensical, to write little helper functions (in utils.py?)

Good point. I think it's enough to show the class and variable to which refers a foreign variable. You could look at formatting.var_details(), which is used to format the docstring of the generated class properties for the variables. Actually, maybe you could just reuse it here?

value.metadata.get('intent') will (in this framework) never return None

It will likely return None if value is not an attr.Attribute created with the framework wrappers (i.e., xs.variable, xs.foreign...). The framework should be as less intrusive as possible, i.e., users should be able to do the following:

@xsimlab.process
class A:
    not_a_xsimlab_var = attr.ib()
    xsimlab_var = xsimlab.variable()

Hence https://github.com/benbovy/xarray-simlab/pull/67#discussion_r353609467

The updated docstring doesn't show up in process.dict['doc'] and can only be seen right at the end when I call help(process).

Oh yes I see, actually the docstrings should be updated for both the base class (stand-alone) and the subclass (the class returned by build_class).

rlange2 · 2019-12-05T14:40:24Z

I was wondering what other information are useful to add to the bullet list. [...] Although, there is less information available, for instance, if it's a foreign variable. Should we look that up then in the reference class? [...] Then, it might be more sensical, to write little helper functions (in utils.py?)

Good point. I think it's enough to show the class and variable to which refers a foreign variable. You could look at formatting.var_details(), which is used to format the docstring of the generated class properties for the variables. Actually, maybe you could just reuse it here?

Reusing formatting.var_details() would keep everything consistent as well. However, the function returns the output in a very defined format. In order to account for additional indents and newline characters, I came up with two different approaches:

Don't change var_details()itself but rather its output:

Here, a nested loop is introduced to format the output on the fly:

def render_docstrings(self):

    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved
    docstring = "\nAttributes\n----------\n"
    indent = "    "

    for key, value in variables_dict(self._base_cls).items():
        temp_string = ''
        var_attributes = var_details(value).split("\n")
            
        for line in range(len(var_attributes)):
            if line == len(var_attributes) - 1:
                temp_string += indent + var_attributes[line]
            else:
                temp_string += indent + var_attributes[line] + "\n"

        var_attributes = temp_string
        docstring += f"{key}: {data_type}\n{var_attributes}\n"

    if self._base_cls.__doc__ is not None:
        if attributes_keyword in self._base_cls.__doc__:
            self._base_cls.__doc__ = self._base_cls.__doc__.replace(attributes_keyword,
                                                                    docstring)
        else:
            self._base_cls.__doc__ += f"\n{docstring}"
    else:
        self._base_cls.__doc__ = docstring

Add a flag (default False) to var_details() to return expected output:

def var_details(var, docstring=False):
    max_line_length = 70

    var_metadata = var.metadata.copy()

    description = textwrap.fill(var_metadata.pop('description').capitalize(),
                                width=max_line_length, subsequent_indent='    ')

    detail_items = [('type', var_metadata.pop('var_type').value),
                    ('intent', var_metadata.pop('intent').value)]
    detail_items += list(var_metadata.items())

    if docstring==True:
        details = "\n".join(["    - {} : {}".format(k, v) for k, v in detail_items])
    else:
        details = "\n".join(["- {} : {}".format(k, v) for k, v in detail_items])

    return description + "\n\n" + details + '\n'

This change let's us get rid of the second loop in render_docstrings():

def render_docstrings(self):

    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved
    docstring = "\nAttributes\n----------\n"
    indent = "    "

    for key, value in variables_dict(self._base_cls).items():
        var_attributes = var_details(value, docstring=True)
        docstring += f"{key}: {data_type}\n{indent}{var_attributes}\n"

    if self._base_cls.__doc__ is not None:
        if attributes_keyword in self._base_cls.__doc__:
            self._base_cls.__doc__ = self._base_cls.__doc__.replace(attributes_keyword,
                                                                    docstring)
        else:
            self._base_cls.__doc__ += docstring
    else:
        self._base_cls.__doc__ = docstring

There's no need anymore to import textwrap.
Both approaches generate the same result, for example, by using the process mentioned in #67 (comment):

Mimics the effect on erosion rates of a dyke dipping at
    a given angle, that is buried beneath the landscape and that is
    progressively exhumed by erosion.
    
    
Attributes
----------
x_position: object
    Initial x-position of exposed dyke

    - type : variable
    - intent : in
    - dims : ((),)
    - group : None
    - attrs : {}

width: object
    Dyke fixed width

    - type : variable
    - intent : in
    - dims : ((),)
    - group : None
    - attrs : {}

angle: object
    Dyke dipping angle in degrees

    - type : variable
    - intent : in
    - dims : ((),)
    - group : None
    - attrs : {}

grid_shape: object
    Reference to variable 'shape' defined in class
    'uniformrectilineargrid2d'

    - type : foreign
    - intent : in
    - other_process_cls : <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>
    - var_name : shape

x: object
    Reference to variable 'x' defined in class 'uniformrectilineargrid2d'

    - type : foreign
    - intent : in
    - other_process_cls : <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>
    - var_name : x

etot: object
    Reference to variable 'cumulative_height' defined in class
    'totalerosion'

    - type : foreign
    - intent : in
    - other_process_cls : <class 'fastscape.processes.erosion.TotalErosion'>
    - var_name : cumulative_height

k_coef: object
    Reference to variable 'k_coef' defined in class 'streampowerchannel'

    - type : foreign
    - intent : out
    - other_process_cls : <class 'fastscape.processes.channel.StreamPowerChannel'>
    - var_name : k_coef

diffusivity: object
    Reference to variable 'diffusivity' defined in class 'lineardiffusion'

    - type : foreign
    - intent : out
    - other_process_cls : <class 'fastscape.processes.hillslope.LinearDiffusion'>
    - var_name : diffusivity

A few things I noticed:
Due to capitalize() in var_details() any character other than the first in the string becomes lowercase, e.g. 'UniformRectilinearGrid2D' turns into 'uniformrectilineargrid2d'. It's not the biggest issue right now but it lowers readability quite a bit in my opinion.
Additionally, some bullet points end up containing redundant information (e.g. other_process_cls for foreign variables), others are even less helpful if not assigned (e.g. group, dims, attrs). Of course, that doesn't mean, we should leave these out.
When no description is given, there will be an additional blank line instead.

Finally, when using a non xsimlab-variable via attr.ib() it likely won't be considered by variables_dict() since they are not xsimlab-specific.

benbovy · 2019-12-05T16:26:58Z

I prefer the 1st approach. It's better customize things at a higher level and not propagate complexity at lower levels (e.g., var_details in this case), IMO.

I think you could get rid of the nested loop by using textwrap.indent, and actually get rid of all nested blocks, e.g., with something like:

def render_docstrings(self):

    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved

    fmt_vars = []

    for vname, var in variables_dict(self._base_cls).items():
        var_header = f"{vname} : {data_type}"
        var_content = textwrap.indent(var_details(var), " " * 4)

        fmt_vars.append(f"{var_header}\n{var_content}")

    fmt_section = textwrap.indent("Attributes\n"
                                  "----------\n"
                                  "\n".join(fmt_vars),
                                  " " * 4)

    current_doc = self._base_cls.__doc__ or ""

    if attributes_keyword in current_doc:
        new_doc = current_doc.replace(attributes_keyword,
                                      fmt_section[4:])
    else:
        new_doc = f"\n\n{fmt_section}\n"

    self._base_cls.__doc__ = new_doc

The code here above should also properly handle all line returns and indentation.

benbovy · 2019-12-05T16:32:55Z

Due to capitalize() in var_details() any character other than the first in the string becomes lowercase, e.g. 'UniformRectilinearGrid2D' turns into 'uniformrectilineargrid2d'.

You can ignore this for now. Actually, I think that it will be better if the description of a foreign variable corresponds to the description of the variable it refers to (we can do this in another PR).

rlange2 · 2019-12-09T13:55:28Z

That is a really clean and thorough approach!

I slightly changed your code to the following:

def render_docstrings(self):
    attributes_keyword = "{{attributes}}"
    data_type = "object"  # placeholder until issue #34 is solved

    fmt_vars = []

    for vname, var in variables_dict(self._base_cls).items():
        var_header = f"{vname} : {data_type}"
        var_content = textwrap.indent(var_details(var), " " * 4)

        fmt_vars.append(f"{var_header}\n{var_content}")

    fmt_section = textwrap.indent("Attributes\n"
                                  "----------\n"
                                  + "\n".join(fmt_vars),
                                  " " * 4)

    current_doc = self._base_cls.__doc__ or ""

    if attributes_keyword in current_doc:
        new_doc = current_doc.replace(attributes_keyword,
                                      fmt_section[4:])
    else:
        new_doc = f"{current_doc}\n{fmt_section}\n"

    self._base_cls.__doc__ = new_doc

+ was added to your second textwrap.indent() otherwise the attributes headline is wrapped around each variable instead of being placed once at the beginning of fmt_section. Line break was introduced before the binary operator for readability.
Concatenation in else since a previously existing docstring (without using the attributes_keyword) would have been overwritten.
Added f to your f-string.

I would like to know what you think about setting autodoc=True in process.process() by default. Also, there is still a blank line in case no variable description is given. I will think about what's a good way to handle that.

You can ignore this for now. Actually, I think that it will be better if the description of a foreign variable corresponds to the description of the variable it refers to (we can do this in another PR).

True. I'm not sure how to link a variable in a process to its original process. Will think about it.

Edit: Maybe var.metadata.get('other_process_cls') is a good start.

benbovy · 2019-12-09T14:20:46Z

For better overall readability, I'm wondering if we shouldn't move the the formatting logic currently in render_docstrings into a new function in formatting.py, e.g., format_attribute_section(process) and then just call it in render_docstrings:

from .formatting import format_attribute_section


def render_docstrings(self):
    new_doc = format_attribute_section(self._base_cls)

    self._base_cls.__doc__ = new_doc
    self._p_cls_dict['__doc__'] = new_doc

The last line is needed (I think) to be able to get the docstrings from process instances attached to a model (i.e., mymodel.myprocess).

benbovy · 2019-12-09T14:27:19Z

We should also take the indentation into account when wrapping the variable description as a text block in the section. This would need exposing the max line length as a parameter, i.e., var_details(var, max_line_length=70) and then use 62 instead of 70 for var_content in the code above.

benbovy · 2019-12-09T14:37:16Z

Nit: add_attribute_section(process, placeholder="{{attributes}}") is slightly better than format_attribute_section(process) IMO.

rlange2 · 2019-12-09T15:52:35Z

add_attribute_section(process, placeholder="{{attributes}}") has been added to formatting.py and the code has been adjusted as you suggested.

I pushed the update since I felt that it's good to have a more recent version to talk about. Not sure why the checks have failed since pytest didn't have any issues.

I will look into the self._p_cls_dict['__doc__'] = new_doc assignment and max line length.

benbovy · 2019-12-09T16:04:42Z

The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

That said, it would be good to add a test for this "autodoc" feature.

I would like to know what you think about setting autodoc=True in process.process() by default.

It's a good default IMO. More descriptive docstrings is good. autodoc=False is for edge cases or in case something goes wrong.

Also, there is still a blank line in case no variable description is given. I will think about what's a good way to handle that.

I like adding (no description given) as you proposed it in an earlier version. You could do that in var_details().

benbovy · 2019-12-09T16:16:17Z

xsimlab/process.py

@@ -3,11 +3,12 @@
 import inspect
 import sys
 import warnings
+import textwrap


You can remove this unused import.

Right, thank you. Might have jumped the gun here.

benbovy · 2019-12-10T10:33:06Z

The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

This is now fixed in the master branch. You can merge it here.

…en process, if autodoc=True in the process decorator.

The formatting of the docstring has been outsourced to a separate function in formatting.add_attribute_section(). This way, the render_docstrings() only needs to call that function and pass its return to the __doc__ attribute of the base class (and subclasses). There are more suggestions in the PR but I had the impression that it's beneficial to update the recent code.

rlange2 · 2019-12-10T13:54:56Z

    self._p_cls_dict['__doc__'] = new_doc
The last line is needed (I think) to be able to get the docstrings from process instances attached to a model (i.e., mymodel.myprocess).

I'm not sure about this, e.g.

from fastscape.processes import basic_model
basic_model.terrain

It returns the information with the previous formatting, i.e.

<TerrainDerivatives 'terrain' (xsimlab process)>
Variables:
    shape         [in] <--- grid.shape
    spacing       [in] <--- grid.spacing
    elevation     [in] <--- topography.elevation
    slope        [out] ('y', 'x') terrain local slope
    curvature    [out] ('y', 'x') terrain local curvature
Simulation stages:
    *no stage implemented*

The same is the case for a custom process, see DippingDyke example from the first post:

from fastscape.models import basic_model
basic_model = basic_model.update_processes({'Dyke':DippingDyke})
basic_mode.Dyke

returns

<DippingDyke 'Dyke' (xsimlab process)>
Variables:
    x_position      [in] initial x-position of exposed dyke
    width           [in] dyke fixed width
    angle           [in] dyke dipping angle in degrees
    grid_shape      [in] <--- grid.shape
    x               [in] <--- grid.x
    etot            [in] <--- erosion.cumulative_height
    k_coef         [out] ---> spl.k_coef
    diffusivity    [out] ---> diffusion.diffusivity
Simulation stages:
    run_step

That is the case, whether I include self._p_cls_dict['__doc__'] = new_doc or not.
It makes sense. formatting._summarize_var() is called in formatting.repr_process() which sets up the layout. That in turn gets passed to the process._ProcessBuilder class as well as to process.process_info().
Maybe I misunderstood your intention but this is what I found regarding mymodell.myprocess.

We should also take the indentation into account when wrapping the variable description as a text block in the section. This would need exposing the max line length as a parameter, i.e., var_details(var, max_line_length=70) and then use 62 instead of 70 for var_content in the code above.

For the current format, it seems like there are mostly two places where text length might be a concern. That would be the i) variable description and ii) the bullet point other_process_cls (appears for foreign variables) will usually contain a long sequence, e.g. <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>.
With the way var_details() is written now, a line break is only introduced in the description if max_line_length is exceeded, since this is where textwrap.fill is applied. So this won't apply for the bulleted list.
There are a couple of possibilities I can think of:

Wrap another textwrap method around around var_content or fmt_section.
Change var_details() once again and include a textwrap (or custom formatting) around details.
Write an entirely new function that is kin to var_details() but only used for this specific case of docstring-formatting.
Leave everything as is.

Feel free to share your thoughts and add to the list.
My thoughts: Regarding 1., textwrap can mess with a text block quite substantially. However, I will try to make that work. I can see, how 2. might not be the best option. var_details() is used in a few other places and we probably should keep this function as original as it is meant to be. 3. introduces another function to the project. I think, that is definitely an option. 4. sounds a bit dissatisfying at first but there are two things, we should keep in mind: a) We can leave line breaks for the description up to the user, and b) we have yet to think about a way how to deal with the variable information of foreign variables. I'm sure you want to include what process the variable is referencing (as a bullet point in this list), but an output like
- other_process_cls : <class 'fastscape.processes.grid.UniformRectilinearGrid2D'>
is much too cryptic in my opinion anyways. Of course, we can wrap the description with max_line_length=62 and replace other_process_cls with a wrapped expression as well.

The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

That said, it would be good to add a test for this "autodoc" feature.

That sounds good. I'm not yet sure how to properly customise your own build but I hope, we can discuss it at one point.

It's a good default IMO. More descriptive docstrings is good. autodoc=False is for edge cases or in case something goes wrong.

Great!

Also, there is still a blank line in case no variable description is given. I will think about what's a good way to handle that.

I like adding (no description given) as you proposed it in an earlier version. You could do that in var_details().

The formatting.var_details() now contains:

description = textwrap.fill(var_metadata.pop('description').capitalize(),
                                width=max_line_length) or (
        "(no description given)")

Your suggestion inspired me to use or in string assignment. I hope, it doesn't affect readability too much compared to a comprehensive if/else (PEP 308 -- Conditional Expressions).
Seems to do the job for now.

I tried to introduce horizontal rulers to structure my posts a bit more. If I should make separate posts, please don't hesitate to mention it.

The errors on travis are unrelated. I'll remove python 3.5 from the test matrix and fix the issue with the latest xarray release in another PR.

This is now fixed in the master branch. You can merge it here.

I hope, I did the right thing :) The pytest output is a bit different now but no complains so far.

benbovy · 2019-12-10T14:27:07Z

It returns the information with the previous formatting

basic_model.terrain gives what the class __repr__ returns (that's what xs.process_info(basic_model.terrain) also returns), which is not the same than the docstrings. Try help(basic_model.terrain) and you should see the formatted attributes section among many other things.

For the current format, it seems like there are mostly two places where text length might be a concern. That would be the i) variable description and ii) the bullet point other_process_cls (appears for foreign variables) will usually contain a long sequence.

Yeah I think it's enough for now to wrap only the description (with 70 - 8 char width in this case) and let the bullet list untouched.

it would be good to add a test for this "autodoc" feature [...] I'm not yet sure how to properly customise your own build

The test would be very similar to those already implemented in https://github.com/benbovy/xarray-simlab/blob/master/xsimlab/tests/test_formatting.py.

Your suggestion inspired me to use or in string assignment.

Actually, I prefer a if in this case.

I tried to introduce horizontal rulers to structure my posts a bit more. If I should make separate posts, please don't hesitate to mention it.

Sometimes it's good to push your last changes/commits first and then add yourself inline comments, so that we can discuss specific details just next to the relevant lines of code.

I hope, I did the right thing :) The pytest output is a bit different now but no complains so far.

Same advise, don't hesitate to push your commits early. This way I can directly see the output in the CI logs.

For readability, variable description in var_details() now assigned with a separate conditional expression. To account for indents in the docstring, variable description is now wrapped after 62 characters.

rlange2

Same advise, don't hesitate to push your commits early. This way I can directly see the output in the CI logs.

Ok, the script now only fails the non-raised NotImplementedError.

____________________________ test_process_decorator ____________________________
809 
810     def test_process_decorator():
811         with pytest.raises(NotImplementedError):
812 >           @xs.process(autodoc=True)
813              class Dummy:
814 E           Failed: DID NOT RAISE <class 'NotImplementedError'>
815 
816 xsimlab/tests/test_process.py:210: Failed

it would be good to add a test for this "autodoc" feature [...] I'm not yet sure how to properly customise your own build

The test would be very similar to those already implemented in https://github.com/benbovy/xarray-simlab/blob/master/xsimlab/tests/test_formatting.py.

Thank you. This is what I will look into next.

rlange2 · 2019-12-10T16:39:56Z

xsimlab/process.py

@@ -453,7 +452,6 @@ def render_docstrings(self):
        new_doc = add_attribute_section(self._base_cls)

        self._base_cls.__doc__ = new_doc
-        self._p_cls_dict['__doc__'] = new_doc


self._base_cls.__doc__ = new_doc is enough to create docstring in help(mymodel.myprocess). In fact, it's pasted in there twice, at the beginning and the end.

With this commit, a test for the autodoc-feature will be implemented. For now, however, pytest returns an AssertionError because `add_attribute_section()` in `test_add_attribute_section()` returns the docstring twice. Reason unknown. Also, `@pytest.mark.xfail` was introduced to circumvent failures in `test_process_decorator()` and `test_add_attribute_section()`. This will be only temporary.

rlange2

With test_add_attribute_section() I tried to compare the output of add_attribute_section() with a template. However, the output gets generated twice and therefore fails to pass the test.

In tests.fixture_process I found the following:

@pytest.fixture(scope='session')
def in_var_details():
    return dedent("""\
    Input variable

    - type : variable
    - intent : in
    - dims : (('x',), ('x', 'y'))
    - group : None
    - attrs : {}
    """)

Is that related to test_var_details()?

rlange2 · 2019-12-11T15:34:04Z

xsimlab/tests/test_formatting.py

+
+    """)
+
+    assert add_attribute_section(Dummy) == expected


This leaves me with an AssertionError. Apparently, the docstring gets returned twice and therefore fails to be equal to expected. I tried this with a Dummy class as well as with tests.fixture_process.SomeProcess. I'm not sure, though, why that happens.

rlange2 · 2019-12-11T15:35:23Z

xsimlab/tests/test_formatting.py

@@ -37,6 +39,34 @@ def test_var_details(example_process_obj):
    assert "- dims : (('x',),)" in var_details_str


+@pytest.mark.xfail


pytest.mark.xfail was introduced to allow for this failure (also in tests.test_process.test_process_decorator(). This will be only temporary.

Yeah but then I cannot see in the CI logs why the test fails.

benbovy

Nice! We're almost there.

Some minor issues remaining.

Also, the docstrings of the process decorator should be updated (here: https://github.com/benbovy/xarray-simlab/blob/f595fd6af3150f4f293306985c644d2949299852/xsimlab/process.py#L492), with default: False -> default: True and maybe with a bit more details.

And don't forget to update the release notes in doc/whats_new.rst.

benbovy · 2019-12-11T15:33:03Z

xsimlab/tests/test_formatting.py

@@ -37,6 +39,34 @@ def test_var_details(example_process_obj):
    assert "- dims : (('x',),)" in var_details_str


+@pytest.mark.xfail


Remove xfail here. We expect that this test always succeeds.

benbovy · 2019-12-11T15:35:24Z

xsimlab/tests/test_formatting.py

+        """This is a Dummy class
+    to test `add_attribute_section()`
+        """
+        var = xs.variable(dims='x', description='a variable')


Maybe add another variable with no description to test the case where (no description given) is added to the docstrings.

It might be good to also test the case where the {{attributes}} placeholder is present in the original docstrings.

benbovy · 2019-12-11T15:40:59Z

xsimlab/tests/test_process.py

@@ -205,6 +205,7 @@ def run_step(self, a, b):
        assert "Process runtime methods" in str(excinfo.value)


+@pytest.mark.xfail


Again, here this should be removed.

You should instead replace the NotImplementedError test below by some logic testing if the docstrings are updated if autodoc=True and if it stays unchanged if autodoc=False. No need to duplicate the test that you have written for the attribute section content, assert "Attributes" in cls.__doc__ might be enough here.

benbovy · 2019-12-11T15:53:30Z

xsimlab/formatting.py

    var_metadata = var.metadata.copy()

    description = textwrap.fill(var_metadata.pop('description').capitalize(),
                                width=max_line_length)
+    if description == "":


nit: if not description: is slightly more idiomatic.

benbovy · 2019-12-18T15:12:38Z

@rlange2 I started using black in #84. As a consequence, there are some conflicts between this branch and the master branch. Let me know if you have some work that you haven't commit/pushed yet. Otherwise, I'll resolve the conflicts through this interface and you'll just need to pull the changes.

From now on, you can stop worrying about code formatting and use black: see here.

Fork is now synched with latest commit 1d995b4 of original repo. whats_new.rst has been updated. Placeholder and if-clause if no description has been given for individual variables. Docstring for `process.process.py` has been updated. Lastly, tests included for `add_attribute_section()` and `process_decorator()` in `test_formatting` and `test_process`, respectively.

rlange2 · 2020-01-06T11:35:28Z

Latest changes include:

Closes autogenerate docstrings for Process subclasses (process interface) #3
Tests added, test_formatting.test.add_attribute_section() and test_process.test_process_decorator()
Passes black . && flake8
Fully documented, including whats-new.rst for all changes

@rlange2 I started using black in #84. As a consequence, there are some conflicts between this branch and the master branch. Let me know if you have some work that you haven't commit/pushed yet. Otherwise, I'll resolve the conflicts through this interface and you'll just need to pull the changes.

I created a new branch and used it to overwrite the existing autodoc-branch. I hope that wasn't the worst idea. Should I highlight the changes I made?

benbovy

This looks good to me! Just two minor comments left.

benbovy · 2020-01-07T14:03:11Z

doc/whats_new.rst

@@ -47,6 +47,27 @@ Enhancements
  to :func:`xarray.Dataset.xsimlab.run`.
 - More consistent dictionary format for output variables in the xarray
  extension (:issue:`85`).
+- Existing and newly written processes will be updated automatically


Let's keep the release notes concise, e.g., with something like:

The ``autodoc`` parameter of the :func:`~xsimlab.process` decorator now allows to automatically add an attributes section in the docstrings of the class to which the decorator is applied, using the metadata of each variable declared in the class.

benbovy · 2020-01-07T14:05:59Z

xsimlab/process.py

+        {{attributes}} can be used as a placeholder for the updated
+        metadata information.
+
+        Docstring template:


Again here, not sure it's worth showing the whole attributes section template.

benbovy · 2020-01-10T15:46:34Z

Great! Thanks @rlange2

benbovy reviewed Dec 4, 2019

View reviewed changes

rlange2 pushed a commit to rlange2/xarray-simlab that referenced this pull request Dec 5, 2019

Incorporate first changes (see github PR xarray-contrib#67)

616827f

rlange2 force-pushed the autodoc branch from 1fea042 to 1b20698 Compare December 9, 2019 15:35

benbovy reviewed Dec 9, 2019

View reviewed changes

Raphael Lange and others added 3 commits December 10, 2019 13:00

Automatically generates a docstring for each existing and newly writt…

ce32dcf

…en process, if autodoc=True in the process decorator.

Update autodoc to deal with pytest

1218da3

Set conditional string assignment, wrap variable description

f4caf83

For readability, variable description in var_details() now assigned with a separate conditional expression. To account for indents in the docstring, variable description is now wrapped after 62 characters.

rlange2 force-pushed the autodoc branch from 1b20698 to f4caf83 Compare December 10, 2019 16:32

rlange2 commented Dec 10, 2019

View reviewed changes

Raphael Lange added 2 commits December 11, 2019 10:00

explicit if-clause

d59dc93

rlange2 commented Dec 11, 2019

View reviewed changes

Merge remote-tracking branch 'upstream/master' into autodoc

dc4c74a

benbovy reviewed Dec 11, 2019

View reviewed changes

Raphael Lange added 2 commits January 6, 2020 12:03

merge old and new autodoc

0c4079b

rlange2 mentioned this pull request Jan 6, 2020

Update docstring with original description of foreign variables #88

Closed

4 tasks

Added parameter note for test_add_attribute_section()

0eff384

benbovy reviewed Jan 7, 2020

View reviewed changes

rlange2 mentioned this pull request Jan 9, 2020

Propagate original description to foreign variables #91

Merged

3 tasks

Raphael Lange and others added 3 commits January 10, 2020 10:39

Rewrite whats_new.rst and xsimlab.process docstring

46c7fbc

Merge branch 'master' into autodoc

4712c9d

Rephrase documentation

e45cff4

benbovy merged commit d94c629 into xarray-contrib:master Jan 10, 2020

benbovy mentioned this pull request Jan 10, 2020

autogenerate docstrings for Process subclasses (process interface) #3

Closed

rlange2 deleted the autodoc branch January 13, 2020 09:03

		@@ -37,6 +39,34 @@ def test_var_details(example_process_obj):
		assert "- dims : (('x',),)" in var_details_str


		@pytest.mark.xfail

		@@ -205,6 +205,7 @@ def run_step(self, a, b):
		assert "Process runtime methods" in str(excinfo.value)


		@pytest.mark.xfail

Automatically generate docstring for each existing and newly written processes. #67

Automatically generate docstring for each existing and newly written processes. #67

Uh oh!

Conversation

rlange2 commented Dec 2, 2019

Uh oh!

rlange2 commented Dec 3, 2019

Uh oh!

rlange2 commented Dec 4, 2019

Uh oh!

benbovy left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

benbovy commented Dec 4, 2019

Uh oh!

rlange2 commented Dec 4, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benbovy commented Dec 4, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rlange2 commented Dec 5, 2019

Uh oh!

benbovy commented Dec 5, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benbovy commented Dec 5, 2019

Uh oh!

rlange2 commented Dec 9, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benbovy commented Dec 9, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benbovy commented Dec 9, 2019

Uh oh!

benbovy commented Dec 9, 2019

Uh oh!

rlange2 commented Dec 9, 2019

Uh oh!

benbovy commented Dec 9, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

benbovy commented Dec 10, 2019

Uh oh!

rlange2 commented Dec 10, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benbovy commented Dec 10, 2019

Uh oh!

rlange2 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rlange2 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

benbovy left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rlange2 commented Dec 4, 2019 •

edited

Loading

benbovy commented Dec 4, 2019 •

edited

Loading

benbovy commented Dec 5, 2019 •

edited

Loading

rlange2 commented Dec 9, 2019 •

edited

Loading

benbovy commented Dec 9, 2019 •

edited

Loading

rlange2 commented Dec 10, 2019 •

edited

Loading

rlange2 commented Jan 6, 2020 •

edited

Loading