Recently came across this project that syntax-highlights file paths according to LS_COLORS. I already use zsh-syntax-highlighting; it would be nice if this feature could be included so I could use my LS_COLORS everywhere. Thanks. #feature-request
For details on the format of these files, run ‘dircolors --print-database’.
I won't copy/paste that here. The author of that zsh filetypes syntax highlighting project I linked to has a repository with a large predefined LS_COLORS file. In that file he mentions that the extended color codes it uses are defined in ECMA-48. Otherwise, it looks like the output from dircolors --print-database is the closest thing to an actual specification there is.
I found this article documenting it as well. Otherwise, it looks like it doesn't actually take that much code to process (though unfortunately that code appears to be entirely without a license 🤦♂️, while his LS_COLORS repo has the Artistic license).
However, it appears to be a well-understood format. That askubuntu thread seems to be the best resource on it and that answer includes a simple script to dump all file types and their associated colors.
First of all, thanks for the well-researched and well-presented answer.
As you probably know, there are copyright issues here. z-sy-h is BSD-licensed, so we can't use anything that's derived from askubuntu posts or from dircolors, as they are CC BY-SA- and GPL-licensed respectively. The output of dircolors --print-database, however, is excepted from the license of dircolors proper, through an explicit copyright notice it includes:
Copyright (C) 1996-2016 Free Software Foundation, Inc.
Copying and distribution of this file, with or without modification,
are permitted provided the copyright notice and this notice are preserved.
I'm not sure how to interpret that; it gives permission to "distribute the file [...] with modification" but that's not quite the same thing as "permission to make derivative works". Switching back to the software design hat for a moment, do we even care about the database? Shouldn't we only care about parsing the LS_COLORS environment variable? And switching back to the copyright hat, may we reverse engineer the LS_COLORS envvar's syntax and semantics, or would that still count as a derivative work?
At this point, due to legal issues alone I wonder if we shouldn't just use FreeBSD ls's CLICOLOR as our base? Then we'll at last be able to proceed to assessing the feature request from a technical point of view…
(Sorry for the legal detour. I too will be happy when it's over…)
The rest is recognizing if the current file path matches either the pre-defined key (block device=bd, etc.) or file glob and then outputting the associated color codes.
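As a rough illustration of how little parsing is involved, here is a minimal Python sketch (not the snippet originally posted in the thread). The function name `parse_ls_colors` is hypothetical, and the sketch assumes entries are simply `key=value` pairs separated by colons; it does not attempt to handle any escape sequences inside values.

```python
def parse_ls_colors(value):
    """Split an LS_COLORS string into (type_codes, patterns).

    Assumption: entries are 'key=value' pairs joined by ':'.
    Keys starting with '*' are treated as file name patterns,
    everything else as a file-type code such as 'di' or 'bd'.
    """
    type_codes = {}  # e.g. {"bd": "40;33;01"}
    patterns = {}    # e.g. {"*.tar": "01;31"}
    for entry in value.split(":"):
        key, sep, style = entry.partition("=")
        if not sep or not key:
            continue  # skip empty or malformed entries
        if key.startswith("*"):
            patterns[key] = style
        else:
            type_codes[key] = style
    return type_codes, patterns


types, globs = parse_ls_colors("di=01;34:ln=01;36:bd=40;33;01:*.tar=01;31:")
print(types["bd"])
print(globs["*.tar"])
```

Keeping type codes and patterns in separate dictionaries leaves the harder question, which entry wins when several apply, to the caller.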
At this point, due to legal issues alone I wonder if we shouldn't just use FreeBSD ls's CLICOLOR as our base?
FWIW that's much more limited and much less widely used. All modern CLI tools (like fd, exa, (GNU) ls) use LS_COLORS. Only Mac (and FreeBSD...) users might even have CLICOLOR set. Btw, here's the Rust implementation of LS_COLORS (MIT licensed) that many Rust CLI tools have been sharing.
Parsing LS_COLORS isn't the hard part; local -A var=(${(@s,=,)${(@s,:,)LS_COLORS}}) should do. The difficult part is knowing what each of the keys means and in which order they apply. As there isn't any official documentation, currently the only way to get it right is to read the GNU ls source code, which is of course GPL-licensed.
If someone were to document what the keys mean and their order of precedence by reverse engineering it (i.e. running ls over and over with different LS_COLORS and directory contents), I think that would clear us of any licensing concerns.
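A black-box probe of that kind could look roughly like the Python sketch below. It assumes a GNU ls on PATH (for the `--color=always` flag); the helper names `probe` and `sgr_codes` are hypothetical, and the sample string is hand-written for illustration, not captured output.

```python
import os
import re
import subprocess

# Matches SGR escape sequences such as ESC[01;33m and captures the parameters.
SGR = re.compile(r"\x1b\[([0-9;]*)m")


def sgr_codes(text):
    """Return the non-reset SGR parameter strings found in text, in order."""
    return [code for code in SGR.findall(text) if code not in ("", "0")]


def probe(directory, ls_colors):
    """Run ls on directory with the given LS_COLORS; return the raw output.

    Assumption: 'ls' is GNU ls and accepts --color=always.
    """
    env = dict(os.environ, LS_COLORS=ls_colors)
    result = subprocess.run(
        ["ls", "--color=always", directory],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hand-written sample, so the parsing half runs without calling ls.
sample = "\x1b[0m\x1b[01;33mfoobar\x1b[0m\n"
print(sgr_codes(sample))
```

Running `probe` repeatedly over a scratch directory with varying LS_COLORS values, and recording which codes come back for which files, would give observations of behaviour without reading any source code.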
@phy1729 What you're proposing is https://en.wikipedia.org/wiki/Clean_room_design. The million dollar question in this case is at what point the documentation ceases to be a derivative work of the original GPL code; I doubt there's a hard-and-fast rule for that, as it's a legal question, not a technical one.
We might be able to use the MIT-licensed Rust implementation Keith cited, but even that is questionable. Do we need to check for ourselves that it's not a derivative work of GNU ls? Or can we rely on that repository's maintainers' word?
Again, I'm sorry we're stuck in the sand like this, but there's really nothing we can do until there's something we can legally work off of. Once that's done, the remainder should be pretty simple: just a few pattern matches and a `zstat` call at the point where the `path` style is added.
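To make the "stat call plus a few pattern matches" idea concrete, here is a hedged Python sketch of the stat half (the plugin itself would use zsh and `zstat`). The function name `type_key` is hypothetical. The two-letter keys are the commonly used ones, but the order of the checks is an assumption, and keys such as `su`, `sg`, `tw`, `ow`, `st`, `or` and `mi` are deliberately left out.

```python
import os
import stat


def type_key(path):
    """Return a two-letter LS_COLORS key for path, based on lstat().

    Assumption: this check order is illustrative only and has not been
    verified against the precedence rules of any ls implementation.
    """
    mode = os.lstat(path).st_mode
    if stat.S_ISLNK(mode):
        return "ln"  # symbolic link
    if stat.S_ISDIR(mode):
        return "di"  # directory
    if stat.S_ISFIFO(mode):
        return "pi"  # named pipe
    if stat.S_ISSOCK(mode):
        return "so"  # socket
    if stat.S_ISBLK(mode):
        return "bd"  # block device
    if stat.S_ISCHR(mode):
        return "cd"  # character device
    if mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return "ex"  # regular file with an execute bit set
    return "fi"      # any other regular file


print(type_key("."))  # the current directory, so the key is "di"
```

A highlighter would look the returned key up in the parsed LS_COLORS table and fall back to file name patterns for plain files.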
The best documentation that lists all the codes seems to still be that blog post I linked earlier but you made a good point about how precedence applies to them.
Hey @sharkdp, sorry to bother you, but since you implemented projects like vivid and lscolors, could you give any guidance here?
@sharkdp The question is: can a BSD-licensed project implement an LS_COLORS parser?
I'm not an expert on open source licenses. I can only give my personal interpretation of the matter, but that's not really helpful if what you really need is legal advice.
The way I see it is that LS_COLORS is an open format that is used by many different tools (ls, dircolors, tree, bfs, fd, exa, …). The format was originally invented for ls and I would therefore consider the ls implementation to be the specification (see sharkdp/lscolors#6 (comment) for a similar discussion).
Both my lscolors library/tool as well as bfs by @tavianator (which features a much more solid implementation of the LS_COLORS format) are implementations of that specification (bfs is BSD-licensed, by the way).
We might be able to use the MIT-licensed Rust implementation Keith cited, but even that is questionable. Do we need to check for ourselves that it's not a derivative work of GNU ls? Or can we rely on that repository's maintainers' word?
If somebody can explain to me precisely what "derivative work" means, I might be able to answer that question. My implementation is definitely not based on the ls source code, but I have read the ls source code in order to understand the LS_COLORS format. If that is a problem in and of itself, my interpretation is probably incorrect.
The difficult part is knowing what each of the keys means and in which order they apply.
Absolutely. There are a lot of subtleties, see for example sharkdp/lscolors#10. I have found some edge cases by automatically comparing my implementation against ls on huge sets of files (see unit tests in lscolors).
By the way (offtopic?): I think the LS_COLORS format is pretty horrible and severely limited. I would love for there to be a modern, well-documented standard for colorizing file system paths.
Limitations of the LS_COLORS format:
If colorizing by file name, there is really only one option: to match on a certain suffix. There is no proper way to only highlight files named bar, because we can only add a *bar pattern that will also match hello.bar or foobar.
It is not readable. There are dozens of abbreviations that need to be looked up and everything is condensed into a single long line. This is my current LS_COLORS:
There is no way to colorize based on file attributes such as size, modification date, owner, etc.
There is no way to extend the format to things like git modification status, for example.
LS_COLORS assigns ANSI styles to certain patterns. There is no separation of "content" and "style" (in the HTML / CSS sense). This is why there is a need for additional tools like dircolors or vivid.
I think it could be worth working on a new standard format. For compatibility reasons, there could be an LS_COLORS generator.
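The suffix-only limitation is easy to see in a small Python model. This sketch treats a `*bar` entry as a plain suffix test, which is an assumption based on the behaviour described in this thread; the function name `matches` is hypothetical.

```python
def matches(pattern, name):
    """Model a '*suffix' LS_COLORS entry as a plain suffix test."""
    return pattern.startswith("*") and name.endswith(pattern[1:])


for name in ["bar", "hello.bar", "foobar", "bar.txt"]:
    print(name, matches("*bar", name))
```

Under this model the first three names all match `*bar`, so there is no way to single out a file named exactly bar.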
If somebody can explain to me precisely what "derivative work" means, I might be able to answer that question. My implementation is definitely not based on the ls source code, but I have read the ls source code in order to understand the LS_COLORS format. If that is a problem in and of itself, my interpretation is probably incorrect.
GNU ls is licensed under GPLv3, which requires:
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also […] c) license the entire work, as a whole, under this License to anyone who comes into possession of a copy. […]
The phrase "based on" is defined in §0:
To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a “modified version” of the earlier work or a work “based on” the earlier work.
The part about "requiring permission" is because, under copyright law, the copyright owner's permission is normally required to distribute a copyrighted work or works derived from it (that's why J. K. Rowling receives royalties from sales of Harry Potter translations, even though the translations are separately copyrighted).
In English, all that adds up to: if an LS_COLORS parser in zsh-syntax-highlighting would be considered a derivative work of GNU ls of a sort that requires the original copyright owner's permission to distribute, then the aforequoted GPLv3 §5 would apply, and zsh-syntax-highlighting would be required to be licensed under GPL, rather than BSD.
To be clear, I agree that zsh-syntax-highlighting would not be a derivative work of GNU ls even if we shipped an LS_COLORS parser. My concern is that the parser would be (able to be argued to be) a derivative work and under GPLv3 §5 the "entire work" (= all of z-sy-h) would be required to be GPL'd.
I'm not sure how to proceed. Perhaps we should give the GNU ls copyright holders a bell and ask their opinion/permission?
If colorizing by file name, there is really only one option: to match on
a certain suffix. There is no proper way to only highlight files
named bar, because we can only add a *bar pattern that will also
match hello.bar or foobar.
$ LS_COLORS="${LS_COLORS}:bar=01;33:" ls
ls: unparsable value for LS_COLORS environment variable
...
$ LS_COLORS="${LS_COLORS}:bar*=01;33:" ls
ls: unparsable value for LS_COLORS environment variable
...
I'm not much for licensing, but as the author of zsh-syntax-highlighting-filetypes (and LS_COLORS and File::LsColor ) I hereby grant you permission to do what you want with it. Is that a license enough? :)
I'm afraid I'm too short on time to look into legal questions presently, but I do wish to say, Thanks very much for this contribution, @trapd00r.
Activity
danielshahaf commented on Mar 12, 2019
Could you link to a specification of LS_COLORS that we can work off of?
kbd commented on Mar 12, 2019
Good question! GNU ls manpage says:
dircolors docs say:
For details on the format of these files, run ‘dircolors --print-database’.
I won't copy/paste that here. The author of that zsh filetypes syntax highlighting project I linked to has a repository with a large predefined LS_COLORS file. In that file he mentions that the extended color codes it uses are defined in ECMA-48. Otherwise, it looks like the output from dircolors --print-database is the closest thing to an actual specification there is.
I found this article documenting it as well. Otherwise, it looks like it doesn't actually take that much code to process (though unfortunately that code appears to be entirely without a license 🤦♂️, while his LS_COLORS repo has the Artistic license).
However, it appears to be a well-understood format. That askubuntu thread seems to be the best resource on it and that answer includes a simple script to dump all file types and their associated colors.
danielshahaf commented on Mar 13, 2019
First of all, thanks for the well-researched and well-presented answer.
As you probably know, there are copyright issues here. z-sy-h is BSD-licensed, so we can't use anything that's derived from askubuntu posts or from dircolors, as they are CC BY-SA- and GPL-licensed respectively. The output of dircolors --print-database, however, is excepted from the license of dircolors proper, through an explicit copyright notice it includes:
Copyright (C) 1996-2016 Free Software Foundation, Inc.
Copying and distribution of this file, with or without modification,
are permitted provided the copyright notice and this notice are preserved.
I'm not sure how to interpret that; it gives permission to "distribute the file [...] with modification" but that's not quite the same thing as "permission to make derivative works". Switching back to the software design hat for a moment, do we even care about the database? Shouldn't we only care about parsing the LS_COLORS environment variable? And switching back to the copyright hat, may we reverse engineer the LS_COLORS envvar's syntax and semantics, or would that still count as a derivative work?
At this point, due to legal issues alone I wonder if we shouldn't just use FreeBSD ls's CLICOLOR as our base? Then we'll at last be able to proceed to assessing the feature request from a technical point of view…
(Sorry for the legal detour. I too will be happy when it's over…)
kbd commented on Mar 14, 2019
Correct, you only need to care about parsing LS_COLORS, the dircolors --print-database output is just some documentation.
I really appreciate your concern for legality, but LS_COLORS is really a very simple and widely-implemented format. Here's some Python to parse it:
The rest is recognizing if the current file path matches either the pre-defined key (block device=bd, etc.) or file glob and then outputting the associated color codes.
FWIW that's much more limited and much less widely used. All modern CLI tools (like fd, exa, (GNU) ls) use LS_COLORS. Only Mac (and FreeBSD...) users might even have CLICOLOR set. Btw, here's the Rust implementation of LS_COLORS (MIT licensed) that many Rust CLI tools have been sharing.
phy1729 commented on Mar 14, 2019
Parsing LS_COLORS isn't the hard part; local -A var=(${(@s,=,)${(@s,:,)LS_COLORS}}) should do. The difficult part is knowing what each of the keys means and in which order they apply. As there isn't any official documentation, currently the only way to get it right is to read the GNU ls source code, which is of course GPL-licensed.
If someone were to document what the keys mean and their order of precedence by reverse engineering it (i.e. running ls over and over with different LS_COLORS and directory contents), I think that would clear us of any licensing concerns.
danielshahaf commented on Mar 14, 2019
kbd commented on Mar 14, 2019
Dang that's some shell-fu right there.
The best documentation that lists all the codes seems to still be that blog post I linked earlier but you made a good point about how precedence applies to them.
Hey @sharkdp, sorry to bother you, but since you implemented projects like vivid and lscolors, could you give any guidance here?
danielshahaf commented on Mar 15, 2019
@sharkdp The question is: can a BSD-licensed project implement an LS_COLORS parser?
sharkdp commented on Mar 15, 2019
I'm not an expert on open source licenses. I can only give my personal interpretation of the matter, but that's not really helpful if what you really need is legal advice.
The way I see it is that LS_COLORS is an open format that is used by many different tools (ls, dircolors, tree, bfs, fd, exa, …). The format was originally invented for ls and I would therefore consider the ls implementation to be the specification (see sharkdp/lscolors#6 (comment) for a similar discussion).
Both my lscolors library/tool as well as bfs by @tavianator (which features a much more solid implementation of the LS_COLORS format) are implementations of that specification (bfs is BSD-licensed, by the way).
If somebody can explain to me precisely what "derivative work" means, I might be able to answer that question. My implementation is definitely not based on the ls source code, but I have read the ls source code in order to understand the LS_COLORS format. If that is a problem in and of itself, my interpretation is probably incorrect.
Absolutely. There are a lot of subtleties, see for example sharkdp/lscolors#10. I have found some edge cases by automatically comparing my implementation against ls on huge sets of files (see unit tests in lscolors).
sharkdp commented on Mar 15, 2019
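By the way (offtopic?): I think the LS_COLORS format is pretty horrible and severely limited. I would love for there to be a modern, well-documented standard for colorizing file system paths.
Limitations of the LS_COLORS format:
If colorizing by file name, there is really only one option: to match on a certain suffix. There is no proper way to only highlight files named bar, because we can only add a *bar pattern that will also match hello.bar or foobar.
It is not readable. There are dozens of abbreviations that need to be looked up and everything is condensed into a single long line. This is my current LS_COLORS:
There is no way to colorize based on file attributes such as size, modification date, owner, etc.
There is no way to extend the format to things like git modification status, for example.
LS_COLORS assigns ANSI styles to certain patterns. There is no separation of "content" and "style" (in the HTML / CSS sense). This is why there is a need for additional tools like dircolors or vivid.
I think it could be worth working on a new standard format. For compatibility reasons, there could be an LS_COLORS generator.
danielshahaf commented on Mar 16, 2019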
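GNU ls is licensed under GPLv3, which requires:
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also […] c) license the entire work, as a whole, under this License to anyone who comes into possession of a copy. […]
The phrase "based on" is defined in §0:
To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a “modified version” of the earlier work or a work “based on” the earlier work.
The part about "requiring permission" is because, under copyright law, the copyright owner's permission is normally required to distribute a copyrighted work or works derived from it (that's why J. K. Rowling receives royalties from sales of Harry Potter translations, even though the translations are separately copyrighted).
In English, all that adds up to: if an LS_COLORS parser in zsh-syntax-highlighting would be considered a derivative work of GNU ls of a sort that requires the original copyright owner's permission to distribute, then the aforequoted GPLv3 §5 would apply, and zsh-syntax-highlighting would be required to be licensed under GPL, rather than BSD.
danielshahaf commented on Mar 16, 2019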
Forgot to say: thanks for joining this thread, @sharkdp!
tavianator commented on Mar 16, 2019
@sharkdp If you want to collaborate on some better replacement for LS_COLORS, I'd support it in bfs.
Also in my confident but non-lawyer opinion, just implementing an LS_COLORS parser does not make your code a derivative work of GNU ls.
danielshahaf commented on Mar 17, 2019
To be clear, I agree that zsh-syntax-highlighting would not be a derivative work of GNU ls even if we shipped an LS_COLORS parser. My concern is that the parser would be (able to be argued to be) a derivative work and under GPLv3 §5 the "entire work" (= all of z-sy-h) would be required to be GPL'd.
I'm not sure how to proceed. Perhaps we should give the GNU ls copyright holders a bell and ask their opinion/permission?
phy1729 commented on Mar 23, 2019
It seems zsh itself has an LS_COLORS parser/user, so perhaps we can use that code as a base?
https://github.com/zsh-users/zsh/blob/master/Src/Zle/complist.c#L151
trapd00r commented on Apr 30, 2019
Hey guys,
I'm not much for licensing, but as the author of zsh-syntax-highlighting-filetypes (and LS_COLORS and File::LsColor) I hereby grant you permission to do what you want with it. Is that a license enough? :)
False. :)
LS_COLORS="${LS_COLORS}:*bar=38;5;220:*foobar=38;5;197:*.bar=38;5;196"
tavianator commented on Apr 30, 2019
@trapd00r You'll notice that ls will color a file notbar the same as bar with that LS_COLORS. That's the annoying part.
trapd00r commented on Apr 30, 2019
Right. My point was that the asterisk can go anywhere or nowhere at all (to match an exact filename).
tavianator commented on Apr 30, 2019
@trapd00r It has to come at the beginning:
danielshahaf commented on May 4, 2019
I'm afraid I'm too short on time to look into legal questions presently, but I do wish to say, Thanks very much for this contribution, @trapd00r.