An idea of using Cirru syntax as an alternative of text format #617
You have some competition.... from me. I made LES (LESv2), a language inspired by LISP but syntactically based on Javascript and C#. Its relationship to JSON is somewhat different - LES is a superset of JSON, rather than being convertible to JSON (one could define such a reverse mapping, of course, I just haven't done so, and if JSON code is treated as LES code and then converted with any reasonable mapping back to JSON, it would probably have to expand in size). Edit: LES is probably a more sophisticated language than Wasm needs, so I was going to propose a subset of LES to the Wasm CG, I just haven't got around to defining it yet. And I was considering other refinements. Hmm... given that, in Cirru, an |
I'm looking into the "Cirru problems" first, For For
($) :: (a -> b) -> (a -> b)
f $ x = f x
Yeah, it's internally different in Cirru, since the Cirru parser does a syntax tree transformation at the end of parsing. I want to say the meaning was learnt from Haskell. Back to LES... personally I don't like writing brackets, but I have to say LES is far more mature than Cirru is, and Cirru is hardly a competitor as an idea at this moment (well, I suppose Scheme syntax is by now the best). But I'm still glad Scheme is not our only option for writing WebAssembly :D By the way, even if people don't like Cirru, I can still compile Cirru to WebAssembly in some way.
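The "syntax tree transformation" mentioned above can be pictured with a small sketch. This is not the Cirru parser: it is a hypothetical helper, `fold_dollar`, that assumes a line has already been split into flat tokens and that a standalone `$` token means "everything after this point becomes one nested child", similar in spirit to Haskell's `f $ x = f x`.

```python
def fold_dollar(tokens):
    """Nest everything after a standalone "$" token into a child list.

    Hypothetical illustration only: a real parser would also have to
    handle parentheses, indentation and string literals.
    """
    if "$" not in tokens:
        return list(tokens)
    i = tokens.index("$")  # first standalone "$"; "$i" is a different token
    return list(tokens[:i]) + [fold_dollar(tokens[i + 1:])]


line = "set_local $i $ get_local $num".split()
print(fold_dollar(line))
# ['set_local', '$i', ['get_local', '$num']]
```

Names such as `$i` are left alone because only a token that is exactly `$` is treated as the operator.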
How old is Cirru, by the way? It seems to me that if the CG decided they just wanted an s-expression parser, they could go with Sweet Expressions, which appear to be
Uhh, did you really mean to say that? Cirru is older than About the Edit added corrections after you replied |
Cirru was started in mid-2012, but only as a tree editor; it got a text syntax by the end of that year, according to my git commits and my posts on Weibo. It was only a toy until I could actually compile it to JavaScript in 2015. I saw Sweet before, but I don't know how to use it or where to find its interpreter. So, I did it before Sweet. The way I came to Cirru: I was making an AST editor after learning from http://norvig.com/lispy2.html but failed to make it handy back then with raw JavaScript. There was no React at that time; I built the first version of my tree editor in canvas, and it was hardly an editor I could use. So I realised I needed a text syntax that looks very similar to Cirru Editor. Clojure is OK for me: Polish notation, so prefix syntax, feels good to me. Ever since I built my own "lispy" interpreter I have found it very nice, while JavaScript syntax bothered me a lot. I mean, I think Lisp is still readable compared to C-style syntax, since the latter has many more syntax rules. Yes, I think LES is more mature in the part of WebAssembly. Well, you've got a spec. Cirru does not have a spec; it only contains some special functions to help transform code to JavaScript or Clojure, and also some WebAssembly. So it's more like a code generator at this moment. However, I would say Cirru is more advanced as "an AST tree exploring project". :)
Edit: corrected my earlier post. LES has a spec, but it's not complete, and if the CG considers LES as the Wasm text format, I will pretty much change it according to the whims of the "important" members like @lukewagner and @jfbastien. Heck, I could even stop calling it "my" idea - if that's the price :) The most important thing is that Wasm should choose a parser that isn't solely dedicated to Wasm itself - like s-expressions, it should be a syntax that lends itself to many different use cases. Why? Look, none of the "friendly" languages for representing syntax trees - sweet expressions, LES, Cirru - are popular. People don't use them because they aren't popular, and they aren't popular because people don't use them. Chicken and egg. I just talked to a prominent member of the C# community; he said he wouldn't use the Enhanced C# language I created, not because of anything wrong with it, but simply because "hardly anyone else knows" it. That means WebAssembly CG has a huge opportunity that they shouldn't squander. Whatever they pick automatically becomes popular. I hope that they will choose a variation of LES, because JavaScript/C syntax is by far the most popular syntactic style, and as far as I know, LES is the only language that parses a language resembling JavaScript et al into a simple data structure. Fun fact: originally LES had a Python-style mode where it was indentation-sensitive. I took that out in favor of just C/JS-style, because I think people will warm up to that more than the Python style. I don't really care one way or the other - I quite like Python's indentation-as-structure, but the important thing is to bring those LISP ideas of "syntax independent of semantics" and "code is data" over to the mainstream.
I have difficulty understanding this perspective. I mean, I've been coding for over 20 years, and in that time I have spent far, far more time learning about library APIs than syntax. More time learning semantics than syntax. Etc. C/C++ syntax is definitely confusing in some ways, but Javascript? It's a lot easier. Also, infix operators are present in natural language ("five plus four"), the order of operations is taught before high school ("1 + 2 × 3 < 9" vs "< (+ 1 (× 2 3)) 9"), and some other parts of the syntax are popular in mathematics as well as programming languages (e.g. |
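To make the infix versus prefix comparison above concrete, here is a minimal sketch that evaluates the prefix form as a nested list and checks it against the infix form. The `evaluate` function and the `OPS` table are illustrative names, not part of any proposal in this thread, and `*` stands in for `×`.

```python
import operator

# Hypothetical operator table covering only what the example needs.
OPS = {"+": operator.add, "*": operator.mul, "<": operator.lt}


def evaluate(node):
    """Evaluate a prefix expression written as nested lists."""
    if not isinstance(node, list):
        return node  # a plain number
    op, *args = node
    left, right = (evaluate(arg) for arg in args)
    return OPS[op](left, right)


# "< (+ 1 (* 2 3)) 9" written as data
prefix = ["<", ["+", 1, ["*", 2, 3]], 9]
assert evaluate(prefix) == (1 + 2 * 3 < 9)
```

The prefix form needs no precedence rules because the nesting already says which operation happens first. The infix form relies on the reader (and the parser) knowing that multiplication binds tighter than addition, and addition tighter than comparison.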
That's far more than I had thought of. Popularity matters a lot in choosing a language, I think. WebAssembly is still new as a compilation target, so choosing a popular syntax may let it benefit from existing toolchains. S-expressions are nice since they are simple to build from scratch and people have already explored them enough to bypass the potential problems. And from my point of view, Cirru is just a simpler version of S-expressions that is better designed for the future, which is enabling graphical tools in programming. So in short, Cirru is not the best choice at this moment. But if people want 1) a cleaner version of S-expressions (by using indentation), or 2) to build graphical tools for WebAssembly, then Cirru might be a good start for those future plans. I've only been learning programming for about 4 years, maybe 5. Since the day I began to learn, CoffeeScript has been becoming popular, and lots of ideas have been aimed at finding an efficient way of writing code, like making fewer mistakes and causing less confusion. That's why I'm exploring a new one, trying to get more by using a graphical way. JavaScript syntax, in my view, is tedious to write and modify. I can bear using it in large projects that emphasize stability and teamwork, but not in all of my projects where I want to code fast.
For anyone wanting quick context on this discussion: WebAssembly will be a binary format but there are some cases in which a Text Format is also desired. An official text format hasn't been decided, but S-expressions are currently used for the sake of simplicity in the spec. Hence, the discussion here on what the official text format should be by looking at alternative syntaxes explicitly designed to represent ASTs.
A way to evaluate
I want to offer that choosing a syntax may be an optimization problem of the following things:
* I think familiarity is set by community defaults, for which WebAssembly has carte blanche, as @qwertie said.
Evaluating: S-expressions
Evaluating: Cirru and LES
I like what I've seen from both, but I can't fully evaluate these for lack of experience, so I will leave it to their authors...
My thoughts
I would place less weight on familiarity and more weight on simplicity and |
@qwertie wasm does not have many of the operators of LES and LES seems to be described as having a fixed set of operators which seems necessary for the code-is-data use case. e.g. Wasm does not have |
@JSStats LES actually has an unlimited number of operators (sequences of punctuation marks), with precedence chosen by a set of rules based on operators from several languages, especially C/C++/Java/C#/Javascript. There is also a
could be directly transliterated to LES in prefix notation...
or with "superexpressions"
or with a
or even (worst of all) a backquote fest:
These are all equivalent, due to the equivalence of operators and "calls()". They are also ugly and not recommended; I'm just showing the different ways LES allows you to express the same thing. (Note that the
Quite right. However, I do think it would make a lot of sense to include operators like
(Incidentally, one extra It could look something like this in LES:
That's if we decide users can access locals unqualified (or we could use
The fixed set of rules is designed to reflect the existing consensus of many different programming languages. I think the existing rule set would serve Wasm quite well - but it's not a finalized spec, and if the CG wants changes, then changes there will be. Even better would be if eyeballs outside the Wasm community deliberated on it, but I'm not sure how to make that happen. |
@qwertie None of those seem 'familiar' or an improvement over s-exp to me, but it's a subjective matter. Avoiding explicit |
@JSStats The 'add' example doesn't seem familiar? It looks like Rust and Swift. Since the text format isn't used directly on the web, I was assuming a one-pass compiler wouldn't be necessary for it (and certainly one could do multiple passes over individual functions), so some conveniences can be added, so |
@qwertie Sorry if I missed it, but nothing in your comment immediately above looked like familiar JS style? Do you have a wasm printer that emits wasm using the format you suggest, to give an example? 'Rust and Swift' were not something I would consider 'familiar' (subjective). I think it would help if people could develop some alternatives to add to the discussion. If removing the type from operators such as It's an interesting idea of LES to have a fixed set of familiar operators with familiar infix and precedence rules for them, and with all other operators handled by generic paths so as to support the code-is-data use case. It might create a compelling reason to 'overload' some operators so that they match familiar patterns. This same property would probably help text formatters in general, and it affects the binary encoding too unless formatters are expected to translate. Personally I hope the browsers can support an extension model for viewing the source so that there can be some personalization in this area and it seems possible for the text format to evolve somewhat independent of the binary encoding.
On the LES page, you'll find LES code that looks like Javascript, but since Wasm is statically typed it couldn't/shouldn't look quite like JS, hence the Rust/Swift style. I haven't made a LES-to-Wasm (or vice versa) converter yet, and I've been puzzling over which language I should write a parser in... JS? C++? OCaml? Currently it's just C# which presumably doesn't interest folks here.
I have just published a WASM playground that can also render the code in different text formats. This way it is easy to compare the pros and cons of different formats: Next, I'm planning to add rendering to a Sweet-expression-like format. Cirru would be interesting, too.
@data-ux Looks nice. I found your example on GitHub, and here's one of them in Cirru:
module
  func $factorial (param $num f64)
    result f64
    local $i f64
    local $result f64
    set_local $i $ get_local $num
    set_local $result $ f64.const 1
    loop $done $loop
      if
        f64.eq (get_local $i)
          f64.const 0
        br $done
        block
          set_local $result $ f64.mul (get_local $i)
            get_local $result
          set_local $i $ f64.sub (get_local $i)
            f64.const 1
      br $loop
    get_local $result
  export :factorial $factorial
I produced this using the existing Cirru tools: the parser, the writer, and the JSON toolkit (which might be old). The difference from your "indentation" view:
Some thoughts on several different ideas here (from different people): It isn't necessary to adopt Cirru, LES, or Sweet expressions, in order to obtain a simple data structure representing WebAssembly. WebAssembly is defined as a simple data structure. The "browser view source" text syntax will just be one way of describing that language, rather than being the basis for the language definition.
Although it currently uses s-expressions, WebAssembly is not in the LISP family of programming languages. If one likes Clojure but wishes it had a different syntax, WebAssembly will not be a satisfying replacement, with any syntax, as it lacks the majority of features that make Clojure great. For example, it doesn't even have closures, in the Clojure sense.
Inertia is surely significant, but another possible way of saying this is that these tools aren't offering enough of a practical advantage to overcome the inertia.
Graphical programming is intriguing. However, building and popularizing graphical programming on top of WebAssembly seems like it would be strictly harder than doing so with an actual LISP. Graphical programming is all about the human factors, so building it around a language that lacks the facilities that human programmers typically rely on would put it at a significant disadvantage. In contrast, actual LISP would seem to be an ideal base: many actual LISPs are very much designed to be written by humans, they have elegant and powerful abstraction mechanisms, and they already have a syntax which is an extremely simple representation of a tree.
This is very much true. The discussion here is about WebAssembly's "browser view source" text format, which has some unusual constraints. However, no matter what WebAssembly picks for its own use, Cirru seems like it wants to be more than this anyway. |
@sunfishcode I want to talk more about "graphical programming" part. Cirru is still far from building a graphical programming environment. I believe it's better than Lisp. Cirru is simpler than S-Expressions meanwhile keeps the power of S-Expressions. People have been using S-Expressions for years but only got techniques like Emacs and Parinfer that regard code more like text buffers rather than trees. However, in Cirru the code is represented in a form like I have an example here: inspired by data structures in Clojure I realised a tree of S-Expressions is always equivalent to a binary tree, so my editor in Cirru can be a fractal tree rather than a boring text layout. Then I figured out a way to render Cirru code in a fractal tree that I can drag to reshape it(https://twitter.com/cirrulang/status/650921962944884736 , hope the video is available) while it's still hard to make it editable and even useful: I would say it's much more difficult to build a graphical thing if the whole language is designed based on text syntax. Since Cirru is using only arrays(or vectors in Clojure) and strings for representation, playing with its text format is as simple as |
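The claim above that a tree of S-expressions is equivalent to a binary tree can be sketched with the classic cons-cell encoding, where each list becomes a pair of (first item, rest of the list). Whether this is the exact encoding used for the fractal tree editor is not stated in the thread; `to_binary` and `from_binary` are hypothetical names, and atoms are assumed to be strings.

```python
def to_binary(expr):
    """Encode nested lists of strings as (first, rest) pairs; None ends a list."""
    if isinstance(expr, str):
        return expr
    if not expr:
        return None
    head, *rest = expr
    return (to_binary(head), to_binary(rest))


def from_binary(node):
    """Decode the pair encoding back into nested lists."""
    if node is None:
        return []
    if isinstance(node, str):
        return node
    first, rest = node
    return [from_binary(first)] + from_binary(rest)


tree = [["export", ":even", "$even"], "x"]
encoded = to_binary(tree)
assert from_binary(encoded) == tree  # the round trip loses nothing
```

Every node in the encoded form has at most two children, which is what makes a binary layout possible for an arbitrarily wide expression.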
Oh? I don't think so. I just think that traditionally, language designers have (primarily out of habit, but not always) "overcome the inertia" in a different way than proposed by LES, Cirru and Sweet Exprs. It's obvious that there are practical reasons not to use s-expressions - not a single one of the top 20 languages on TIOBE is based on s-expressions. There's a reason for that: many people don't like them, and don't want to read or write software in them. The syntax is a cognitive burden. It's just that historically, every new programming language that rejected s-expressions has assumed that the alternative is to define a custom syntax from scratch. My research with LES is meant to demonstrate a "third way" - that you can get a lot of mileage out of a simple, general purpose parser and a simple data structure. Given that the text format is not the primary format, and most developers will not read or write it as often as "real" source code, I think WebAssembly is the perfect opportunity to introduce this "third way" into mainstream consciousness. According to TextFormat.md,
My argument is (i) LES is perfectly in tune with this plan, (ii) there isn't a strong reason to use a custom-designed parser that is useless for tasks other than parsing Wasm code, and (iii) although adopting a general-purpose parser won't directly help the WebAssembly project, it's good for the future of the software industry. @jiyinyiyong |
@jiyinyiyong Cirru looks like a fun and interesting project, but it looks like it's a better fit for LISP-family languages than for WebAssembly. |
@qwertie no, its graphical part is not for WebAssembly; that can be discussed in the future. @sunfishcode yeah, Lisp-like languages are a better fit. Other languages are also OK; I tried JavaScript and it worked.
Some more words on the parser and the code generator. The parser is quite short, the most stable one in CoffeeScript, it's The code generator is also short, but might be a little tricky. In my view the future of Cirru is text format in JSON/EDN with powerful GUI tree editors. So the code generator is not a big deal. It's okay that the generator is currently working well with the parser. And it's not hard to turn 300+ sloc of JavaScript into Java or C if needed. And if someone likes the idea behind Cirru but dislikes its text syntax, there's one more way to use Cirru. For wasm code like this:
(export "even" $even)
in Cirru's solution, it's equivalent to a piece of JSON:
[["export", ":even", "$even"]]
and equivalent to EDN in Clojure:
[["export" ":even" "$even"]]
or maybe to others, like YAML:
---
-
  - "export"
  - ":even"
  - "$even"
which can be turned into a Cirru file:
export :even $even
The recursive data structure can be a bridge between wasm and Cirru, by reusing the parsers and code generators of many languages:
data CirruValue = CirruList [CirruValue] | CirruString String
That means if someone does not like Cirru, it's okay to pick an existing config language, as long as that language supports a nested data structure of Vector (or Array, or List) and String. Well, I just think Cirru is the shortest one among these languages.
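As a rough sketch of the "bridge" described above, the following Python reads the JSON form and prints an s-expression. The `CirruValue` alias mirrors the Haskell type in the comment. The `to_sexpr` function is a hypothetical helper, and the rule that a leading `:` marks a string literal is an assumption taken only from the `:even` example here, not from a Cirru specification.

```python
import json
from typing import List, Union

# Mirrors: data CirruValue = CirruList [CirruValue] | CirruString String
CirruValue = Union[str, List["CirruValue"]]


def to_sexpr(value: CirruValue) -> str:
    """Render nested lists of strings as a parenthesized s-expression."""
    if isinstance(value, str):
        if value.startswith(":"):
            # Assumption for this example: ":even" stands for the string "even".
            return json.dumps(value[1:])
        return value
    return "(" + " ".join(to_sexpr(item) for item in value) + ")"


tree = json.loads('[["export", ":even", "$even"]]')
for expr in tree:
    print(to_sexpr(expr))
# (export "even" $even)
```

Because the intermediate value is only lists and strings, the same `tree` could equally come from an EDN or YAML reader, which is the point made in the comment.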
@sunfishcode I'm a little cross that you didn't mention you'd done this (dated Apr 14). |
@qwertie That's an experiment, and we're still in the process of experimenting with it. We may eventually propose something related to this, or we may end up proposing something entirely different, depending on how our experiments go. |
@sunfishcode Okay, fair enough ... although the document itself (used to) boldly headlines "Official Text Format". So, it's been a month ... how's the experiment going? Is there a thread where people are discussing it? If not, are there any pain points or different directions you're thinking about? |
Comparing to the text syntax, I'm concerned more about the question that given a wasm file in binary or text, can I get a JavaScript Object representation with one function call, as simple as |
@qwertie Sorry for the confusion; that text came from the original document it's based on. As it's an experiment, I've not yet polished it up. |
Edit: @sunfishcode's strawman isn't actually based on LES...I misread the outer context. I do feel that Cirru syntax looks rather natural for the text format. Although WebAssembly definitely isn't a Lisp, it's definitely a very expression-heavy and tree-heavy language syntactically, which Cirru was mostly made for (although I do feel that @sunfishcode's strawman could use a little simplification). Here's a few side-by-side comparisons from @sunfishcode's strawman: From fac.wast:
-- Cirru
func $fac-opt ($a i64) i64
  local $x i64
  set $x 1
  block $block
    br_if $block (<s $a 2)
    loop $loop
      set $x (* $x $a)
      set $a (+ $a -1)
      br_if $loop (<s $a 1)
  , $x
Fast square root from Quake:
/* C source */
float Q_rsqrt(float number)
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;
    x2 = number * 0.5F;
    y = number;
    i = *(long *) &y;
    i = 0x5f3759df - (i >> 1);
    y = *(float *) &i;
    y = y * (threehalfs - (x2 * y * y));
    y = y * (threehalfs - (x2 * y * y));
    return y;
}
-- Cirru
func $Q_rsqrt ($0 f32) f32
  local $1 f32
  set $1 $ f32.reinterpret/i32 $ - 1597463007 (>> (i32.reinterpret/f32 $0) 1)
  set $0 (* $0 0x1p-1)
  set $1 $ * $1 (- 0x1.8p0 (* $1 $0 $1))
  * $1 (- 0x1.8p0 (* $1 $0 $1))
From labels.wast:
-- Cirru
func $loop3 i32
  local $i i32
  set $i 0
  block $exit $ loop $cont
    set $i (+ $i 1)
    if (= $i 5)
      br $exit $i
In the case of nested blocks, though, it might need some fine tuning:
-- Cirru
block $default
  block $green
    block $yellow
      block $orange
        block $red
          br_table $default [$red $orange $yellow $green] $index
        -- ...
      -- ...
    -- ...
  -- ...
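For readers comparing the C source of `Q_rsqrt` with its Cirru translation above, the constants line up as follows. This is only a sanity-check sketch: the `q_rsqrt` function below is a Python re-expression of the C code using `struct` for the bit reinterpretation. It assumes a positive input, and it does the Newton steps in double precision, so results can differ slightly from a real 32-bit float build.

```python
import struct

# The decimal and hexadecimal-float literals in the translation match the C constants.
assert 0x5F3759DF == 1597463007
assert float.fromhex("0x1p-1") == 0.5    # number * 0.5F
assert float.fromhex("0x1.8p0") == 1.5   # threehalfs


def q_rsqrt(number: float) -> float:
    """Approximate 1/sqrt(number) for positive inputs, following the C code above."""
    x2 = number * 0.5
    # Reinterpret the 32-bit float bits as a signed 32-bit integer.
    i = struct.unpack("<i", struct.pack("<f", number))[0]
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits back as a 32-bit float.
    y = struct.unpack("<f", struct.pack("<i", i))[0]
    y = y * (1.5 - x2 * y * y)  # first Newton iteration
    y = y * (1.5 - x2 * y * y)  # second Newton iteration
    return y


print(q_rsqrt(4.0))  # close to 0.5
```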
@isiahmeadows thx. There's a video that explains Cirru text syntax better, and I think it will help: https://youtu.be/cXTJDj8ad_U [0:38] It explains why you may need an extra comma in the first demo:
-- Cirru
func $fac-opt ($a i64) i64
  local $x i64
  set $x 1
  block $block
    br_if $block (<s $a 2)
    loop $loop
      set $x (* $x $a)
      set $a (+ $a -1)
      br_if $loop (<s $a 1)
  , $x
And some updates on Cirru itself. Actually, in the past months I haven't been writing Cirru text syntax anymore; instead I edit code in a graphical editor and store the code in EDN or JSON. For example, this is the text file:
[
[ "ns" "respo.component.text" [ ":require" [ "[]" "respo.alias" ":refer" [ "[]" "create-comp" "span" ] ] ] ]
[ "defn" "render" [ "content" "style" ] [ "fn" [ "state" "mutate!" ] [ "span" [ "{}" [ ":attrs" [ "{}" [ ":inner-text" "content" ] ] ] [ ":style" "style" ] ] ] ] ]
[ "def" "comp-text" [ "create-comp" ":text" "render" ] ]
]
and this is how I edit code: Why am I staying away from the syntax I designed? Here's an issue: I want to generate text results with a program, but it's hard to generate pretty and correct indentation syntax if the first item in an expression is a complicated expression, for example in this AST:
cond
  (and (> a 1) true) (+ a 1)
  (or (< a 1) false) (- a 1)
  :else a
An obvious solution is to put that first item on a single line. But I don't think that's what I wanted. So I chose JSON or EDN, which are very stable. And as a conclusion: I'd like to contribute the experience of editing code by AST that I explored in the Cirru project; WebAssembly is the future and is full of possibilities. Cirru is targeting a graphical AST; if you want a text one, Cirru is nice, but far from perfect.
@jiyinyiyong To be fair, I didn't actually verify the code samples. And I know Cirru is far from perfect (it can be pretty awkward IMHO). I just wanted to put out a more representative sample. 😄 (Oh, and I fixed the error, BTW.) |
@isiahmeadows can you remove or change |
@yurydelendik I was referring to the raw expression syntax itself used by the two proposals to differentiate them (LES-based and Cirru-based syntaxes), not the actual syntax of the text format (which there isn't actually any ATM beyond the s-expressions used by the WASM reference interpreter and most current experimental implementations). So if I were to be technically correct, I'd make the comments To be fair, someone could theoretically invent their own syntax for it for another proposal completely (which I've considered doing myself, but just haven't had the time to sketch it out). The ideas are pretty much limitless right now.
@sunfishcode's strawman proposal is not LES-based (at least not yet), which is why I don't really understand why LES is mentioned in the comments of the examples.
@yurydelendik Well...you're correct in a sense:
At this point I feel we're arguing semantics. I feel the intent was sufficient enough (although I'll add a note above explaining the technicality). |
@isiahmeadows, my LES proposal is derived from @sunfishcode's and not the other way around (although LES itself was developed before WebAssembly, just as s-expressions were.) |
@qwertie Where did I imply that? If I was, I'm sorry, and I'll fix it. (I tried to speak of theirs as purely independent, and I didn't think I referenced yours.) |
@isiahmeadows your comment 3 days ago, which discusses "sunfishcode's LES-based proposal". (There's only one LES proposal - mine. Unless you count this second one which is also mine) |
@qwertie That was part of the associated edit with this comment stemming from this discussion. I'll fix it. |
@yurydelendik I'm sorry...I didn't fully differentiate the PR from the actual strawman. I apologize. |
After some communication between implementers, we've decided to focus on having all browsers be able to display linear opcodes for the MVP timeframe. Moving to Discussion. |
Text format proposal. Closing this in favor of the proposal. |
...I'm just thinking from the syntax side, as I'm not experienced in compilers, so I will only discuss the syntax side.
I made a video to demonstrate the main ideas behind Cirru:
https://www.youtube.com/watch?v=1ShYO_A8g-0 (14mins)
Short version:
For Cirru code like this:
it is equivalent to a piece of data in JSON:
once you get JSON you are free to make an DOM editor or simply render to HTML/CSS:

and also it can be turned back to Cirru code as you want.
Check these links if you need details.
http://cirru.org/
https://github.com/Cirru/
https://twitter.com/cirrulang
https://github.com/Cirru/cirru-wasm-ast (mostly based on @indutny 's work)
https://github.com/Cirru/cirru-wasm-cli
From my side I got some experience using Cirru:
Pros
Cons
Even if we don't want the graphical part of Cirru, we can still try its text syntax. It's basically Lisp with indentation (with a string syntax that is a bit strange):
https://gist.github.com/jiyinyiyong/0fc58e2ed7c641973d9b
I'm a big fan of Lisp (mainly Clojure), but I dislike the parentheses in Lisp. So this is my idea to fix that, and to do the same in WebAssembly.