Skip to content

include mass in description #77

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

ABaldwinHunter
Copy link
Contributor

  • Increase parity with Code Climate classic duplication reporting
  • Make mass more salient

The mass is already presented in the issue content readup, but
it's lost in a sea of non-custom text.

Classic:

screen shot 2016-01-15 at 6 40 58 pm

@codeclimate/review

ABaldwinHunter added 2 commits January 15, 2016 18:34
- Increase parity with Code Climate classic duplication reporting
- Make mass more salient

The mass is already presented in the issue content readup, but
it's lost in a sea of non-custom text.
@pbrisbin
Copy link
Contributor

Sounds good to me. I'm a little confused why the screenshot shows mass*2 = 468 though.

@wfleming
Copy link
Contributor

Glad to see this. Should we remove it from the readup as well? Seems redundant to put it in both places.

@ABaldwinHunter
Copy link
Contributor Author

@pbrisbin yeah, I shirked commenting on that, apologies.

The number two comes from the number of occurrences of the code. That multiplication only occurs for issues that have identical code. (As opposed to similar)

After comparing a number of files analyzed in both classic and platform, I believe the same calculations in terms of mass are happening under the hood. @wfleming any other comment?

Sent from my iPhone

On Jan 15, 2016, at 6:43 PM, pat brisbin [email protected] wrote:

Sounds good to me. I'm a little confused why the screenshot shows mass*2 = 468 though.


Reply to this email directly or view it on GitHub.

@noahd1
Copy link
Contributor

noahd1 commented Jan 16, 2016

Caveat Emptor: I was in this code recently and was surprised to find that issue.mass was, I think, the total mass of all the duplicated code, as opposed to a single instance of duplication. I know, sounds weird -- maybe I'm wrong, but it's worth double checking.

@noahd1
Copy link
Contributor

noahd1 commented Jan 16, 2016

Did a bit more looking at the https://github.com/seattlerb/flay and found some interesting things.

This was a surprise to me, maybe not others, but issue.mass is really a score, not really, say a count of AST nodes in a section of duplicated code. The logic says:

  1. If it's a fuzzy match, the score is nodes.size * nodes.first.mass
  2. If it's an exact match, the score is 'nodes.size * nodes.size * nodes.first.mass`

The extra nodes.size for exact matches is referred to as a bonus in the flay code. The bonus is displayed by flay as, for example *2, but note that it only applies to exact matches. There is no bonus for similar/fuzzy matching code.

My hope with showing mass wasn't necessarily to demonstrate the severity of the identified code, but more to help guide people when setting a mass_threshold in their .codeclimate.yml. If the see the masses of certain sections of code, they can better tune their configs.

In the very least we should update our documentation to reflect how remediation points are calculated -- since they are impacted by the scoring algorithm above.

@ABaldwinHunter
Copy link
Contributor Author

@noahd1 Thanks for the great feedback and research at the source on flay.

I can take a closer look at the internals of our engine to make sure it matches and update the docs.

@wfleming
Copy link
Contributor

Quick note here while I'm thinking about: the point of displaying this mass in the readup/description was to educate the user about where they might want to set their threshold at.

#78 has work to change the calculated value of mass but, importantly, the determination of cutoff is still internal to flay, I'm fairly sure. That calculation (again, I'm fairly sure, because I dug into it a while ago, but it's certainly worth checking again) is the total mass as it is currently represented in master, not the new calculation happening in #78. So: in order for this display to make sense after #78 moves forward, I think we'd need to reverse-engineer the total mass again (or just track it as a separate value in Violation).

Actually, this is making me think we should change the name of #mass in Violation maybe is the thing to do. We should distinguish between the per-instance value used to calculate remediation points (what #78 seems to be going after), and the value Flay assigns to the totality of violations (which Flay calls mass internally, I think, and which we also use for the language around the threshold settings).

I was utterly wrong, and just spent some time diving into Flay's code to confirm it. Ignore me.

@ABaldwinHunter
Copy link
Contributor Author

closing in favor of #87

@ABaldwinHunter ABaldwinHunter deleted the abh-include-mass-in-description branch January 27, 2016 22:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants