
Languages Statusbar #2165

@TimerWolf

Are there any plans for Gitea to get a status bar that shows how much of each language is used in the repository?

And when you click on it, you see the different languages that are used and what percentage of the code is written in each one?

It's a really neat feature that I really miss!
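
To make the request concrete, here is a minimal sketch of the arithmetic behind such a bar, assuming the byte count per language is already known. The `LanguageShare` type and `Shares` function are hypothetical names for illustration, not anything that exists in Gitea.

```go
package main

import (
	"fmt"
	"sort"
)

// LanguageShare is one segment of the bar: a language and its share of the code.
// (Hypothetical type, for illustration only.)
type LanguageShare struct {
	Language string
	Percent  float64
}

// Shares converts per-language byte counts into percentages,
// sorted from the largest share to the smallest (ties broken by name).
func Shares(bytesByLanguage map[string]int64) []LanguageShare {
	var total int64
	for _, n := range bytesByLanguage {
		total += n
	}
	if total == 0 {
		return nil // nothing to show for an empty repository
	}
	shares := make([]LanguageShare, 0, len(bytesByLanguage))
	for lang, n := range bytesByLanguage {
		shares = append(shares, LanguageShare{
			Language: lang,
			Percent:  float64(n) * 100 / float64(total),
		})
	}
	sort.Slice(shares, func(i, j int) bool {
		if shares[i].Percent != shares[j].Percent {
			return shares[i].Percent > shares[j].Percent
		}
		return shares[i].Language < shares[j].Language
	})
	return shares
}

func main() {
	counts := map[string]int64{"Go": 7500, "JavaScript": 2000, "CSS": 500}
	for _, s := range Shares(counts) {
		fmt.Printf("%-12s %5.1f%%\n", s.Language, s.Percent)
	}
}
```

Each entry would map to one colored segment of the bar, with the width taken from `Percent`.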

Activity

added the following labels on Jul 14, 2017:
type/proposal: The new feature has not been accepted yet but needs to be discussed first.
type/feature: Completely new functionality. Can only be merged if feature freeze is not active.
added this to the 2.x.x milestone on Jul 14, 2017
tonivj5 (Contributor) commented on Jul 19, 2017

Here is the PR to gogs implementing that feature (it was not merged): gogs/gogs#2135. It could be reused to add it in gitea 😉

lunny (Member) commented on Jul 20, 2017

@xxxTonixxx maybe someone could send it to Gitea.

tonivj5 (Contributor) commented on Jul 21, 2017

If @generaltso wants, he could do it! If not, I think I could attempt it 😅

dayvonjersen commented on Jul 21, 2017

gogs/gogs#2135 is certainly out of date by now as the gogs codebase has probably changed and the API for linguist has definitely changed.

Of course anyone is more than welcome to use my library but implementing this feature isn't going to be a copy/paste job.

That said, I copied and pasted the CSS I whipped up for those screenshots into a codepen: https://codepen.io/anon/pen/PjMdBy

-tso

lafriks (Member) commented on Jul 21, 2017

And I don't think it can be accepted in that form: stats should be generated and cached only once, when the repository's default branch changes, not on every page load.

dayvonjersen commented on Jul 21, 2017

@lafriks yes, exactly. probably best to have like a post-receive hook that runs in the background and stores the result in the db.
and have a setting to disable it entirely for those concerned about server resource usage
it would also be cool if the classifier could then be retrained on real-world code samples but I'm probably jumping the gun here >_>

-tso
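
A rough sketch of the caching idea from the two comments above, under stated assumptions: an in-memory map stands in for the database table, `OnDefaultBranchUpdate` is a hypothetical entry point that a post-receive hook would call, and `Enabled` is a hypothetical setting for turning the feature off. None of these names come from Gitea itself.

```go
package main

import (
	"fmt"
	"sync"
)

// Stats maps a language name to the number of bytes written in it.
type Stats map[string]int64

// cacheEntry remembers which commit a result was computed for.
type cacheEntry struct {
	commit string
	stats  Stats
}

// StatsCache keeps one result per repository. The map is a stand-in
// for a database table; all names here are hypothetical.
type StatsCache struct {
	Enabled bool // hypothetical setting: false disables language stats entirely

	mu      sync.Mutex
	entries map[string]cacheEntry
	compute func(repo, commit string) Stats // the expensive analysis, supplied by the caller
}

// NewStatsCache builds an empty cache around the given analysis function.
func NewStatsCache(enabled bool, compute func(repo, commit string) Stats) *StatsCache {
	return &StatsCache{
		Enabled: enabled,
		entries: make(map[string]cacheEntry),
		compute: compute,
	}
}

// OnDefaultBranchUpdate is what a post-receive hook would call after a push
// to the default branch. It recomputes only when the head commit changed.
// For simplicity the lock is held while computing; a real hook would
// probably run this in the background.
func (c *StatsCache) OnDefaultBranchUpdate(repo, commit string) {
	if !c.Enabled {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[repo]; ok && e.commit == commit {
		return // already up to date
	}
	c.entries[repo] = cacheEntry{commit: commit, stats: c.compute(repo, commit)}
}

// Get is what a page load would call: a lookup only, never an analysis.
func (c *StatsCache) Get(repo string) (Stats, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[repo]
	return e.stats, ok
}

func main() {
	runs := 0
	cache := NewStatsCache(true, func(repo, commit string) Stats {
		runs++ // count how often the expensive step actually runs
		return Stats{"Go": 1024}
	})
	cache.OnDefaultBranchUpdate("demo", "abc123")
	cache.OnDefaultBranchUpdate("demo", "abc123") // same commit: no recompute
	stats, _ := cache.Get("demo")
	fmt.Println(stats, "analysis runs:", runs)
}
```

The point of the split is that page loads only ever touch `Get`, so the cost of analysis is paid once per change to the default branch.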

OmarAssadi (Contributor) commented on Oct 11, 2017

I put up a small ($5) bounty on this one. Miss this feature!

EDIT: Here is the current pledge amount. If anyone else feels like contributing, feel free!
current amount

dayvonjersen commented on Oct 19, 2017

Hm, now my interest is piqued ;)

I could take another stab at it maybe tomorrow evening (I'm in EST). But be forewarned, my preliminary attempt will probably be an awful hack job. I will need to rely on the rest of the community's advice to do it right.

-tso

OmarAssadi (Contributor) commented on Oct 19, 2017

Sounds great! In addition to caching, the final version should probably also be limited by file size. Maybe an adjustable setting?

dayvonjersen commented on Oct 19, 2017

@54 Hm so you mean don't try to classify individual files that are larger than, e.g. 1MB? Most of the time what linguist does is it goes by file extension but hm, yes I see what you mean just thinking aloud... Good idea :)

OmarAssadi (Contributor) commented on Oct 19, 2017

@generaltso Yeah, I just figure it'd kinda suck if someone uploaded some monstrous set of files that the server had to analyze. But, I haven't looked at your linguist library. Does it ever actually do some content analysis or is it pretty much entirely based on extension?

If it is purely based on the extension, then I don't think it's necessary to add that particular limitation.
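
For illustration, a minimal sketch of the adjustable size limit discussed above. The `File` type, `DefaultMaxFileSize`, `WithinSizeLimit`, and `Partition` are all hypothetical names; the 1 MiB default only echoes the "e.g. 1MB" figure from the thread. Such a check would only matter for files that need content analysis, since detection by extension never reads the file.

```go
package main

import "fmt"

// File is a minimal stand-in for a blob in the repository.
type File struct {
	Name string
	Size int64 // size in bytes
}

// DefaultMaxFileSize is an illustrative default of 1 MiB.
const DefaultMaxFileSize int64 = 1 << 20

// WithinSizeLimit reports whether a file is small enough to be handed to a
// content-based classifier. A maxSize of zero or less means "no limit".
func WithinSizeLimit(f File, maxSize int64) bool {
	return maxSize <= 0 || f.Size <= maxSize
}

// Partition splits files into those eligible for content analysis and those
// skipped because they exceed the limit.
func Partition(files []File, maxSize int64) (analyse, skipped []File) {
	for _, f := range files {
		if WithinSizeLimit(f, maxSize) {
			analyse = append(analyse, f)
		} else {
			skipped = append(skipped, f)
		}
	}
	return analyse, skipped
}

func main() {
	files := []File{
		{Name: "main.go", Size: 4 << 10},   // 4 KiB
		{Name: "dump.sql", Size: 50 << 20}, // 50 MiB
	}
	analyse, skipped := Partition(files, DefaultMaxFileSize)
	fmt.Println("analyse:", analyse)
	fmt.Println("skipped:", skipped)
}
```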

dayvonjersen commented on Oct 19, 2017

well it can do either.

in the reference implementation, after being filtered by linguist.ShouldIgnoreFilename() the file extension is passed to linguist.LanguageHints().

if there is more than one possible language for an extension (e.g. .php could be either PHP or Facebook's "Hack" language) then it first checks if the file is a binary blob with linguist.ShouldIgnoreContents() and then uses a bayesian classifier which has been trained on the same dataset as github/linguist to analyse the text (using a tokenizer which could use some improvement) and determine the language (the function is called linguist.Analyse())

a pretty straightforward process imo but I'm a tiny bit biased since I wrote it :p

it might be more convenient to encapsulate all the nuance into a single package-level function instead of requiring all of those steps for the typical use-case, I welcome any input in improving the library for users as well if you or anyone else have any suggestions :)

-tso
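
The flow described above could be wrapped in a single function along these lines. This is only a sketch: the `Detector` struct and its function fields are hypothetical stand-ins for the four steps and do not reproduce the linguist library's actual signatures, and the toy rules in `toyDetector` replace the real ignore lists and Bayesian classifier.

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
	"strings"
)

// Detector bundles the four steps as pluggable functions.
// These fields are hypothetical stand-ins, not the linguist API.
type Detector struct {
	IgnoreFilename func(name string) bool                           // step 1: filter by filename
	Hints          func(ext string) []string                        // step 2: candidate languages for an extension
	IgnoreContents func(contents []byte) bool                       // step 3: skip binary blobs
	Classify       func(contents []byte, candidates []string) string // step 4: content-based classification
}

// Detect returns the detected language, or "" when the file is skipped.
// Treating "no candidates" as unknown is an assumption of this sketch.
func (d Detector) Detect(name string, contents []byte) string {
	if d.IgnoreFilename(name) {
		return ""
	}
	candidates := d.Hints(filepath.Ext(name))
	switch len(candidates) {
	case 0:
		return ""
	case 1:
		return candidates[0] // unambiguous extension: no need to read the contents
	}
	if d.IgnoreContents(contents) {
		return ""
	}
	return d.Classify(contents, candidates)
}

// toyDetector wires in deliberately simple rules so the flow can be run.
func toyDetector() Detector {
	return Detector{
		IgnoreFilename: func(name string) bool {
			return strings.HasPrefix(name, "vendor/")
		},
		Hints: func(ext string) []string {
			switch ext {
			case ".go":
				return []string{"Go"}
			case ".php":
				return []string{"PHP", "Hack"}
			}
			return nil
		},
		IgnoreContents: func(contents []byte) bool {
			// Toy heuristic: a NUL byte marks the file as binary.
			return bytes.IndexByte(contents, 0) >= 0
		},
		Classify: func(contents []byte, candidates []string) string {
			// Toy rule in place of a trained classifier: Hack files open with "<?hh".
			if bytes.HasPrefix(contents, []byte("<?hh")) {
				return "Hack"
			}
			return candidates[0]
		},
	}
}

func main() {
	d := toyDetector()
	fmt.Println(d.Detect("main.go", nil))
	fmt.Println(d.Detect("index.php", []byte("<?hh echo 1;")))
	fmt.Println(d.Detect("index.php", []byte("<?php echo 1;")))
}
```

Keeping the steps as fields means the content-based step is only reached for ambiguous extensions, which matches the order described in the comment.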

17 remaining items

removed this from the 2.x.x milestone on Nov 19, 2018
added this to the 1.10.0 milestone on Jul 25, 2019
modified the milestones: 1.10.0, 1.11.0 on Sep 18, 2019
modified the milestones: 1.11.0, 1.x.x on Dec 12, 2019
removed the following label on Feb 3, 2020:
type/proposal: The new feature has not been accepted yet but needs to be discussed first.
modified the milestones: 1.x.x, 1.12.0 on Feb 3, 2020
locked and limited conversation to collaborators on Nov 24, 2020

Participants: @genedna, @lunny, @techknowlogick, @lafriks, @alexanderadam

Languages Statusbar · Issue #2165 · go-gitea/gitea