Skip to content

Convert documentation style for doxygen compatibility #869

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jun 6, 2017

Conversation

reuk
Copy link
Contributor

@reuk reuk commented Apr 25, 2017

Attempts to format existing documentation blocks according to the LLVM doxygen guidelines.

Format is approximately:

/// Purpose, which should start with a single sentence giving a concise overview.
/// Further information is given below, and formatted in complete sentences, using
/// proper grammar.
/// \param parameter_one: A short description of the parameter.
/// \param parameter_two: Information about the second param. If I want to give
///   more information, the following lines should be indented an extra two spaces,
///   so that the \param markers stand out.
/// \return Some information about the return value.
Foo get_foo(Bar parameter_one, Baz parameter_two);

/// This is a simple utility function which doesn't need parameter or return docs.
void Toggle::flip();

Note that the function name is not given in the documentation, and there is no blank line between the docs and the function. It's obvious which function is being documented, without having to name it explicitly. Also note that, if the purpose/parameters/returns sections are not applicable to a given function, then they should be omitted.

In some cases, the existing documentation was not formatted to use lists of input parameters. In these cases, the conversion has added a parameters: prefix. Ideally, these cases should use the \param format instead, but this would require manual editing.

I think this is about as far as we can go in terms of automation, but there are some further changes that should be made:

  • Grammar should be fixed so that all sentences begin with capital letters and end with full-stops.
  • Documentation should be moved to interfaces, rather than placed in the source file.
  • AFAIK Doxygen will only generate documentation for granular items if the superitem is documented - so static functions will only get documented if the file has a doc-block, and member functions will only get documented if the class has a doc-block. We should check this, and add top-level documentation wherever necessary.

Finally, it would be really nice if merges into master automatically rebuilt the documentation and published it online.

@reuk
Copy link
Contributor Author

reuk commented Apr 25, 2017

@thk123 suggested just now that a good way of tackling this review might be to slightly modify the conversion script, so that it produces some kind of condensed diagnostic message whenever it makes a non-trivial formatting change.

@reuk
Copy link
Contributor Author

reuk commented Apr 25, 2017

At present the conversion removes the 'author' annotation, as I believe this information is better managed by git blame. Is this likely to be problematic in terms of copyright ownership? Is there another reason for keeping the author field?

@owen-mc-diffblue
Copy link
Contributor

owen-mc-diffblue commented Apr 25, 2017

By the way, I've so far found these three formats supported a little by qtcreator:

///
/// \brief get_value
/// \param type
/// \param ns
/// \param offset
/// \param use_non_det
/// \return 
///
/*!
 * \brief get_value
 * \param type
 * \param ns
 * \param offset
 * \param use_non_det
 * \param newparam
 * \return 
 */
/**
 * @brief get_value
 * @param type
 * @param ns
 * @param offset
 * @param use_non_det
 * @param newparam
 * @return 
 */

By "supported a little" I mean that when you type the first line and press return it auto-completes the rest of it, including all the parameter names.

We should definitely consider whether it would make sense to use a format that is very easy to create with qtcreator.

@reuk
Copy link
Contributor Author

reuk commented Apr 25, 2017

To be clear, does QTCreator require the leading and trailing triple slashes on an otherwise empty line?

@owen-mc-diffblue
Copy link
Contributor

It seems like QTCreator has something which kicks in whenever you press return on a line which ends with three slashes in a row in it, and you're above a function. It adds in the rest of the text above. So to get rid of the first and last lines would require removing them manually afterwards, which is a little annoying.

@tautschnig
Copy link
Collaborator

  1. Could we have a preparatory PR that will adjust the linter to accept both the current and the new copyright headers, such that the linter does not fail on this PR?
  2. Is there signoff on this by @peterschrammel ? AFAIK he had an automated conversion in place already.
  3. Files need copyright headers. (Actually the current ones are IMHO too brief as they are lacking license information.)

Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at a few files, as mentioned it would be good to have a list of functions which aren't completely erased by this change to hand review them.

I'm also not sure about not including the function name. I understand your argument, but worry that comments can sometimes go for walkies and it acts as a checksum that the comment is really about the function you are looking at.

I'll review the scripts tomorrow.


/// Uses the options to pick an SMT 1.2 solver
///
/// parameters: None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose in this case we don't really want this line, perhaps we can special case Parameters: None (or even Parameters: followed by a single word) to mean no generated parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - this should probably be a special case in the conversion script, yes.

@thk123
Copy link
Contributor

thk123 commented Apr 25, 2017

@tautschnig

Could we have a preparatory PR that will adjust the linter to accept both the current and the new copyright headers, such that the linter does not fail on this PR?

I think it would be better/simpler to update in this PR the linter to validate doxygen?

@reuk
Copy link
Contributor Author

reuk commented Apr 25, 2017

@owen-jones-diffblue OK, then I think that's a pretty strong argument for having empty leading and trailing lines. I should also change \returns to \return by the looks of things. There's an option in QT creator to disable generating a \brief, which I think people should probably use, as the first sentence is treated as a brief description automatically. I think it's more readable to have empty lines between the description, inputs, and outputs - is it reasonable to add those manually?

@tautschnig I was asked to do this work by @forejtv and @peterschrammel a few weeks ago. Re. copyright headers, who should I talk to about the correct wording? (Also, is it legally permissible to retroactively add licence text to source code?)

@peterschrammel peterschrammel self-requested a review April 25, 2017 16:46
@tautschnig
Copy link
Collaborator

Thanks for all the answers; I agree with all that's been said/suggested. For the legalese, I'd suggest to get @kroening + @forejtv + @peterschrammel to agree one format. (Yes, the copyright has legal implications, not just "documentation").

Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look at the convert script - looks good to me apart from one regex.

convert.py Outdated
""" Extract the named fields of an old-style comment block. """

field_re = re.compile(
r'(?:\n *(Purpose):(.*))|(?:\n *([a-zA-Z0-9]+?):\n?(.*?)?^$)',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come only "Purpose" is allowed to be on one line. If you are worried about false positives, there aren't so many possible labels: Purpose, Input, Output, Module, Author, Name are all I think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that most fields extend until the next empty line, but in a few files there are very long 'Purpose' fields with multiple paragraphs. Therefore, we want 'Purpose' fields to consume everything until the end of the block, because we know that 'Purpose' will always be the final field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say I understand why this regex does this, but I am convinced it does based on debugex.

convert.py Outdated
with open(file) as f:
contents = f.read()

doc_width = 75
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise the current docs aren't quite the full file width, but it is there a reason for this or should the doc width be 80 just like the regular lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a misleading name, I think - this is the width used for wrapped text, but then the leading /// is added to the beginning of each line, meaning that the actual width of the block is 79 columns. This could be 76, if we want to go to 80 columns.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well dunno if worth it since doubt we'll use the script again, but maybe doc_width = 80 - strlen('/// ')?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to constrain the width of documentation by default to 70 or 72 characters, just like Python coding guidelines do (even then those same coding guidelines otherwise use 80 characters). The reason is the added flexibility: if breaking words would make things look awkward then you can safely go a few characters beyond 70/72 and would still remain within the linters 80-characters hard block.

@reuk
Copy link
Contributor Author

reuk commented Apr 26, 2017

Thanks for the review @thk123.

I've tried out CLion's documentation completion. This is the auto-generated comment for the given function declaration. This is with no configuration, after typing three slashes and hitting return:

///
/// \param os
/// \param foo
/// \return
std::ostream &operator<<(std::ostream &os, const Foo &foo);

Looks like we should definitely use return rather than returns, but this weakens the case for leading and trailing ///.

As both styles don't add empty lines separating parameters and returns, I wonder whether we should do the same.

@owen-mc-diffblue
Copy link
Contributor

@reuk I found that option (it's in Tools > Options... > Text Editor > Completion > Documentation Comments). Unticking "Generate brief description" means I get this output:

///
///
/// \param type
/// \param ns
/// \param offset
/// \param use_non_det
/// \return 
///

It still has the same number of lines, but the second line is blank instead of having "\brief ".

I don't think we should get too caught up with copying exactly what QT Creator does. I had a brief look and CLion doesn't produce the blank line at the top and bottom, so I'm fine with not having that. It doesn't have a blank line between different sections, so that seems to be standard and we should probably stick to it.

@reuk
Copy link
Contributor Author

reuk commented Apr 27, 2017

Updated to include all changes mentioned, other than the linter patch and new legalese.

Re. the text wrapping, I think it's probably best to wrap at 80 columns, like in the code. Most editors support automatic line-breaking at a certain column, but most editors don't support line-breaking on different columns depending on context.

@peterschrammel
Copy link
Member

doxygen.cfg needs updating. It currently only includes a few directories.

@peterschrammel peterschrammel changed the title Convert documentation style for doxygen compatability Convert documentation style for doxygen compatibility Apr 29, 2017
@peterschrammel
Copy link
Member

A few files contain legacy \defgroup definitions, which are down to function level. These will require further refinement as we improve the documentation.
A question is whether it makes sense to generate groups by default now, e.g. on a file level to group them per source directory so that each file appears in at least one group (shown as "module" in doxygen).

@peterschrammel
Copy link
Member

It also seems that the "files" section in doxygen is only populated if the file contains a \brief; without that a file does not show up there. A TO_BE_DOCUMENTED default value might also help there.

@reuk reuk force-pushed the doxy-squashed branch 2 times, most recently from 67c7365 to 9b53eea Compare May 1, 2017 19:24
@reuk
Copy link
Contributor Author

reuk commented May 1, 2017

I had a look at the doxygen config. By enabling the EXTRACT_ALL option, we can get Doxygen to create overviews for everything (even files which aren't documented). This would keep the code cleaner than adding TO_BE_DOCUMENTED labels. Of course, we could revert the option in the future, once everything that needs documenting has been documented.

@owen-mc-diffblue
Copy link
Contributor

Does this PR include an update to the coding standard, to specify what function header comments will have to look like?

@reuk
Copy link
Contributor Author

reuk commented May 10, 2017

No, but it probably should. Waiting on @peterschrammel, @forejtv and @kroening to decide on header legalese so that I can update the linter to check for the proper header wording, and so I can update the coding standard in one go (to include info about function blocks, header blocks, class blocks etc.)

@forejtv
Copy link
Contributor

forejtv commented May 10, 2017

@reuk Can you please put the following as the header in each file, appropriately wrapped?

Copyright 2001-2017, Daniel Kroening (Computer Science Department, University of Oxford and Diffblue Ltd), Edmund Clarke (Computer Science Department, Carnegie Mellon University), DiffBlue Ltd.

@reuk reuk force-pushed the doxy-squashed branch 2 times, most recently from 6843eab to 05669a5 Compare May 12, 2017 21:33
@reuk
Copy link
Contributor Author

reuk commented May 21, 2017

I added a driver script that ignores certain files and folders, and re-ran the conversion using this script.

@tautschnig tautschnig dismissed their stale review May 22, 2017 07:31

Requested changes completed

@reuk reuk force-pushed the doxy-squashed branch from d747b52 to 6ed8fe9 Compare May 30, 2017 11:04
@reuk reuk force-pushed the doxy-squashed branch from 6604d00 to c13374b Compare June 6, 2017 14:06
@reuk reuk force-pushed the doxy-squashed branch from c13374b to e9349f8 Compare June 6, 2017 14:13
@kroening kroening merged commit e2afe93 into diffblue:master Jun 6, 2017
@reuk reuk deleted the doxy-squashed branch June 7, 2017 10:16
@reuk reuk mentioned this pull request Jun 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants