An idea of using Cirru syntax as an alternative of text format #617

Closed
tiye opened this issue Mar 19, 2016 · 41 comments

@tiye

tiye commented Mar 19, 2016

...I'm thinking only from the syntax side, since I'm not experienced with compilers, so I will discuss just the syntax.

I made a video to demonstrate the main ideas behind Cirru:
https://www.youtube.com/watch?v=1ShYO_A8g-0 (14mins)

Short version:

  • Cirru is trying to build a syntax tree that we can edit directly
  • that syntax tree has a text syntax, with parser and formatter
  • that text syntax is based on indentations
  • I've been thinking about it for several years and have been using it for months

For Cirru code like this:

defn add2 (x y) (+ x y)

it is equivalent to a piece of data in JSON:

["defn", "add2", ["x", "y"], ["+", "x", "y"]]

Once you have the JSON, you are free to build a DOM editor or simply render it to HTML/CSS:
[image]
It can also be turned back into Cirru code whenever you want.
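
A minimal Python sketch of the Cirru-to-JSON equivalence above, assuming a single line of whitespace-separated tokens with inline parentheses. The function name `parse_inline` is hypothetical and this is not the actual Cirru parser: it ignores indentation, string syntax, and `$`.

```python
import json


def parse_inline(text):
    """Parse one line of parenthesized tokens into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack[-1] is the list currently being filled
    for token in tokens:
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


tree = parse_inline("defn add2 (x y) (+ x y)")
print(json.dumps(tree))  # ["defn", "add2", ["x", "y"], ["+", "x", "y"]]
```

Every leaf stays a plain string, which is why the result can go straight through `json.dumps`.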

Check these links if you need details.
http://cirru.org/
https://github.com/Cirru/
https://twitter.com/cirrulang
https://github.com/Cirru/cirru-wasm-ast (mostly based on @indutny 's work)
https://github.com/Cirru/cirru-wasm-cli

From my side I got some experience using Cirru:

Pros

  • Previously we programmed by manipulating text files. We all say it's about the AST, which is a tree, but we need the help of a text syntax to work with it. Some languages pushed syntax even further, and it became hard for newcomers to pick up. Cirru offers more possibilities: we can now manipulate the syntax tree directly.
  • Cirru syntax is simple. It's based on indentation and greatly reduces the use of parentheses.
  • Cirru syntax is flexible. When recovered from binary files, it can be turned into several styles of text form, such as "smart folding that looks like a human wrote it", "expressions unfolded for the purpose of adding breakpoints", "tree view for a better user interface", etc.
  • I've been generating my JavaScript and ClojureScript code from Cirru for many months, and the syntax is quite handy now.

Cons

  • People hate to learn one more syntax, and what's worse, it's different from C, Java or Python.
  • String syntax in Cirru can be confusing, in my experience of introducing Cirru to others, because Cirru was originally designed as a tree editor and people do not type quote marks in a tree editor.
  • After Cirru code is parsed into a tree, we need another step to turn that tree into an AST. I'm not quite clear about how compilers would handle this, so I'm not sure whether it's good or bad.
  • I think we don't have very much experience programming with a tree rather than a text file. Our toolchains are mostly based on text files.

Even if we don't want the graphical part of Cirru, we can still try its text syntax. It's basically Lisp in indentation (with a slightly strange string syntax):
https://gist.github.com/jiyinyiyong/0fc58e2ed7c641973d9b

module
  export :even $even
  export "odd" $odd

  func $even (param $n i32) (result i32)
    if (i32.eq (get_local $n) (i32.const 0))
      i32.const 1
      call $odd (i32.sub (get_local $n) (i32.const 1))

  func $odd (param $n i32) (result i32)
    store_global $scratch (get_local $n)
    if (i32.eq (get_local $n) (i32.const 0))
      i32.const 0
      call $even (i32.sub (get_local $n) (i32.const 1))

  global $scratch i32

I'm a big fan of Lisp(mainly Clojure), but dislike the parentheses in Lisp. So this is my idea to fix it and that in WebAssembly.
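
To illustrate how the indentation in a snippet like the module example above can map to a tree, here is a rough Python sketch. It assumes two-space indentation, treats every non-blank line as one expression, and appends deeper lines as children of the nearest shallower line. The names `parse_line` and `parse_indented` are hypothetical, and the real Cirru parser has more rules (string syntax, `$`, and so on).

```python
def parse_line(text):
    """Split one line into tokens, nesting on inline parentheses.

    Assumes the parentheses on the line are balanced.
    """
    stack = [[]]
    for token in text.replace("(", " ( ").replace(")", " ) ").split():
        if token == "(":
            stack.append([])
        elif token == ")":
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(token)
    return stack[0]


def parse_indented(source, indent=2):
    """Turn indentation-structured text into nested lists."""
    root = []
    open_exprs = []  # open_exprs[d] is the latest expression at depth d
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines carry no structure
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        if depth > len(open_exprs):
            raise ValueError(f"indentation jumps too far: {raw!r}")
        expr = parse_line(raw)
        parent = root if depth == 0 else open_exprs[depth - 1]
        parent.append(expr)
        del open_exprs[depth:]  # close expressions at this depth or deeper
        open_exprs.append(expr)
    return root


source = """\
module
  export :even $even
  func $even (param $n i32) (result i32)
    if (i32.eq (get_local $n) (i32.const 0))
      i32.const 1
"""
print(parse_indented(source))
```

The result is a list of top-level expressions, here a single `module` list whose children are the `export` and `func` expressions.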

@qwertie

qwertie commented Mar 19, 2016

You have some competition.... from me.

I made LES (LESv2), a language inspired by LISP but syntactically based on Javascript and C#. Its relationship to JSON is somewhat different - LES is a superset of JSON, rather than being convertible to JSON (one could define such a reverse mapping, of course, I just haven't done so, and if JSON code is treated as LES code and then converted with any reasonable mapping back to JSON, it would probably have to expand in size). Edit: LES is probably a more sophisticated language than Wasm needs, so I was going to propose a subset of LES to the Wasm CG, I just haven't got around to defining it yet. And I was considering other refinements.

Hmm... given that, in Cirru, an ident becomes "ident" in JSON, what about a string like "ident" in Cirru? Are strings and identifiers exactly equivalent? And what about the dollar sign in the above code? The documentation says $ as a function to fold code, see Haskell, but I suppose that's not what's happening here.

@tiye
Author

tiye commented Mar 19, 2016

I'm looking into the "Cirru problems" first,

For ident, my video covered it. Yes, I realised this problem as soon as I began to use it. There are already examples of solving this: it's like 'something-quoted in Scheme, and :a-keyword in Clojure. My solution is to prefix the string with one character: for JavaScript, "str" is written as :str, and in Clojure I use |str.

For $: in Haskell it's an infix function-application operator, so everything to its right is grouped as a single argument before f is applied to it:

($) :: (a -> b) -> (a -> b)
f $ x = f x

Yes, it works differently inside Cirru, since the Cirru parser applies a syntax tree transformation at the end of parsing. What I mean is that the meaning is borrowed from Haskell: f $ a b is treated like f (a b).
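
The `$` folding described above can be sketched in Python as a pass over an already-parsed tree. This is only an illustration under the assumption that `$` appears as a standalone token; `fold_dollar` is a hypothetical name, not the Cirru parser's own code.

```python
def fold_dollar(expr):
    """Rewrite `f $ a b` as `f (a b)` inside a nested-list expression."""
    if isinstance(expr, str):
        return expr
    folded = []
    for index, item in enumerate(expr):
        if item == "$":
            # Everything after `$` becomes one nested expression.
            folded.append(fold_dollar(expr[index + 1:]))
            break
        folded.append(fold_dollar(item))
    return folded


print(fold_dollar(["f", "$", "a", "b"]))
# ['f', ['a', 'b']]
print(fold_dollar(["set_local", "$i", "$", "get_local", "$num"]))
# ['set_local', '$i', ['get_local', '$num']]
```

Names such as `$i` are left alone because only a token that is exactly `$` triggers the fold.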

Back to LES... personally I don't like writing brackets, but I have to say LES is far more mature than Cirru is and Cirru is hardly a competitor as an idea at this moment(well, I suppose Scheme syntax is by now the best). But I'm still glad Scheme is not our only option to write WebAssembly :D

By the way even people don't like Cirru, I can still compile Cirru to WebAssembly by some ways..

  • CirruScript -> JavaScript -> wasm
  • Cirru -> ClojureScript -> wasm
  • Cirru -> wasm AST -> wasm

@qwertie

qwertie commented Mar 19, 2016

How old is Cirru, by the way? It seems to me that if the CG decided they just wanted an s-expression parser, they could go with Sweet Expressions, which appear to be 3.5 years old. I heard about sweet-expressions (a.k.a. Readable Lisp S-expressions) a couple of years ago, and I don't even use LISP. So.... I wonder, did you know about sweet expressions already when you started Cirru? Personally I think infix operators are a key to readability; sweet exprs have them but I don't see them in Cirru. Heck, if it weren't for the fact that everybody still writes LISP in s-exprs instead of sweet-exprs, I'm sure I would use a language in the LISP family by now (the real problem is that I can't learn LISP with sweet exprs: no one is teaching it that way.)

> LES is far more mature than Cirru

Uhh, did you really mean to say that? Cirru is older than LESv2 (LESv1 is 2.7 years old), it has its own domain name, and it has parsers written in a whole bunch of languages. LES is great if we want a Javascript-style syntax, but Cirru (and sweet expressions) appear to be mature.

About the $, I was referring to the code you wrote earlier like (param $n i32) - this means (param (n i32))?

Edit: added corrections after you replied

@tiye
Author

tiye commented Mar 20, 2016

Cirru was started in the middle of 2012, but only as a tree editor, and got a text syntax by the end of that year, according to my git commits and tweets at Weibo. It was only a toy until I could actually compile it to JavaScript in 2015. I had seen Sweet before, but I didn't know how to use it or where to find its interpreter. So, I did it before Sweet.

The way I came to Cirru: I was making an AST editor after learning from http://norvig.com/lispy2.html, but failed to make it handy back then with raw JavaScript. There was no React at that time; I built the first version of my tree editor in canvas, and it was hardly an editor I could use. So I realised I needed a text syntax that looks very similar to Cirru Editor.

Clojure is OK for me, Polish notation, so prefix syntax, it feels good to me. Ever since I built my own "lispy" interpreter I found it very nice, which JavaScript syntax bothered me a lot. I mean I think Lisp is still readable comparing to C-style syntax since the later one has much more syntax rules.

Yes, I think LES is more mature in the part of WebAssembly. Well, you got a spec. Cirru does not have a spec, it only contains some special functions to help transforming code to JavaScript or Clojure, also some WebAssembly. So it's more like a code generator at this moment. However, I would say Cirru is more advanced as "an AST tree exploring project". :)

@qwertie

qwertie commented Mar 20, 2016

Edit: corrected my earlier post.

LES has a spec, but it's not complete, and if the CG considers LES as the Wasm text format, I will pretty much change it according to the whims of the "important" members like @lukewagner and @jfbastien. Heck, I could even stop calling it "my" idea - if that's the price :)

The most important thing is that Wasm should choose a parser that isn't solely dedicated to Wasm itself - like s-expressions, it should be a syntax that lends itself to many different use cases. Why? Look, none of the "friendly" languages for representing syntax trees - sweet expressions, LES, Cirru - are popular. People don't use them because they aren't popular, and they aren't popular because people don't use them. Chicken and egg. I just talked to a prominent member of the C# community; he said he wouldn't use the Enhanced C# language I created, not because of anything wrong with it, but simply because "hardly anyone else knows" it.

That means WebAssembly CG has a huge opportunity that they shouldn't squander. Whatever they pick automatically becomes popular. I hope that they will choose a variation of LES, because JavaScript/C syntax is by far the most popular syntactic style, and as far as I know, LES is the only language that parses a language resembling JavaScript et al into a simple data structure.

Fun fact: originally LES had a Python-style mode where it was indentation-sensitive. I took that out in favor of just C/JS-style, because I think people will warm up to that more than the Python style. I don't really care one way or the other - I quite like Python's indentation-as-structure, but the important thing is to bring those LISP ideas of "syntax independent of semantics" and "code is data" over to the mainstream.

> JavaScript syntax bothered me a lot. I mean I think Lisp is still readable comparing to C-style syntax since the later one has much more syntax rules.

I have difficulty understanding this perspective. I mean, I've been coding for over 20 years, and in that time I have spent far, far more time learning about library APIs than syntax. More time learning semantics than syntax. Etc. C/C++ syntax is definitely confusing in some ways, but Javascript? It's a lot easier. Also, infix operators are present in natural language ("five plus four"), the order of operations is taught before high school ("1 + 2 × 3 < 9" vs "< (+ 1 (× 2 3)) 9"), and some other parts of the syntax are popular in mathematics as well as programming languages (e.g. (tuple, of, things) and [list, of, things]).
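
To make the infix versus prefix comparison concrete, here is a small Python sketch that converts a flat infix token list into the nested prefix form, using precedence climbing. The precedence table and the name `to_prefix` are assumptions chosen for this illustration; parentheses and unary operators are not handled.

```python
# Higher number binds tighter. This table is an assumption for the sketch.
PRECEDENCE = {"<": 1, "+": 2, "-": 2, "×": 3, "*": 3}


def to_prefix(tokens):
    """Convert infix tokens into a nested prefix tree (left-associative)."""
    position = 0

    def parse(min_precedence):
        nonlocal position
        left = tokens[position]  # an operand
        position += 1
        while position < len(tokens):
            operator = tokens[position]
            precedence = PRECEDENCE[operator]
            if precedence < min_precedence:
                break
            position += 1
            # Require strictly higher precedence on the right: left-assoc.
            right = parse(precedence + 1)
            left = [operator, left, right]
        return left

    return parse(1)


print(to_prefix("1 + 2 × 3 < 9".split()))
# ['<', ['+', '1', ['×', '2', '3']], '9']
```

The output has the same shape as the prefix expression quoted above, with the order of operations made explicit by nesting.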

@tiye
Author

tiye commented Mar 20, 2016

That's far more than I had thought about. Popularity matters a lot in choosing a language, I think. WebAssembly is still new as a compile target, so choosing a popular syntax may let it benefit from existing toolchains. S-expressions are nice since they are simple to build from scratch and people have already explored them enough to avoid the potential problems. In my view, Cirru is just a simpler version of S-expressions that is better designed for the future, by which I mean enabling graphical tools in programming.

So in short, Cirru is not the best choice at this moment. But if people want 1) a cleaner version of S-expression(by using indentations), 2) build graphical tools for WebAssembly, then Cirru might be a good start to the future plans.

I've been learning programming for only about 4 years, maybe 5. Since the day I began to learn, CoffeeScript has been becoming popular, and many ideas have appeared for writing code more efficiently, such as making fewer mistakes and causing less confusion. That's why I'm exploring a new approach, trying to get more by using a graphical way. JavaScript syntax, in my view, is tedious to write and modify. I can bear using it in large projects where stability and teamwork matter, but not in all the projects where I want to code fast.

@shaunlebron

For anyone wanting quick context on this discussion: WebAssembly will be a binary format but there are some cases in which a Text Format is also desired. An official text format hasn't been decided, but S-expressions are currently used for the sake of simplicity in the spec. Hence, the discussion here on what the official text format should be by looking at alternative syntaxes explicitly designed to represent ASTs.

A way to evaluate

I want to offer that choosing a syntax may be an optimization problem of the following things:

  • simplicity - number of syntax rules
  • flexibility - ability and ease of adding rules
  • familiarity* - popularity of syntax
  • readability - a subjective function of simplicity, familiarity, and preference
  • writability - how simple it is for a human to produce it w/ or w/o tools

* I think familiarity is set by community defaults, for which WebAssembly has carte blanche, as @qwertie said.

Evaluating: S-expressions

  • simplicity - obviously good
  • flexibility - obviously good
  • familiarity - obviously bad
  • readability
  • writability
    • balancing parens is difficult without plugins
    • we are working on Parinfer to allow indentation to influence structure and vice versa

Evaluating: Cirru and LES

I like what I've seen from both, but I can't fully evaluate these for lack of experience, so I will leave it to their authors...

My thoughts

I would place less weight on familiarity and more weight on simplicity and
flexibility because that will contribute the most to future-proofing how we
express the AST of the web, which will probably be around for a while! ;)

@ghost

ghost commented Mar 21, 2016

@qwertie wasm does not have many of the operators of LES and LES seems to be described as having a fixed set of operators which seems necessary for the code-is-data use case. e.g. Wasm does not have +, rather it has i32.add i64.add etc. Languages and data do not in general map to a fixed set of operators so a fixed set of precedence and parsing rules seems to have limited utility. How could this be addressed?

@qwertie

qwertie commented Mar 21, 2016

@JSStats LES actually has an unlimited number of operators (sequences of punctuation marks), with precedence chosen by a set of rules based on operators from several languages, especially C/C++/Java/C#/Javascript. There is also a backquote pseudo-operator, which lets you treat any identifier as an operator. For instance, the s-expression

(i32.add (get_local N) (i32.const 1))

could be directly transliterated to LES in prefix notation...

@i32.add(get_local(N), @i32.const(1));

or with "superexpressions"

@i32.add(get_local N, @i32.const 1);

or with a backquoted i32.add operator and a superexpression...

(get_local N) `i32.add` (@i32.const 1);

or even (worst of all) a backquote fest:

`get_local` N `i32.add` `i32.const` 1;

These are all equivalent, due to the equivalence of operators and "calls()". They are also ugly and not recommended; I'm just showing the different ways LES allows you to express the same thing. (Note that the @ in @i32.add is needed because the identifier i32.add contains a dot character and we probably don't want . to be considered a binary operator.)

> Wasm does not have +, rather it has i32.add i64.add

Quite right. However, I do think it would make a lot of sense to include operators like + in the text format, requiring the "assembler" to infer whether it's i32.add or i64.add. There are other things we can readily do to make the text format more friendly. Let's consider this adder:

(func $add (param $x i32) (param $y i32) (result i32) 
    (i32.add (get_local $x) (get_local $y)))

(Incidentally, one extra @ sign makes this into valid LES code.)

It could look something like this in LES:

func add(x: i32, y: i32) -> i32 { x + y };

That's if we decide users can access locals unqualified (or we could use #x or $x or whatever); similarly we could use simply 123 for i32.const 123 and 123L for i64.const 123. The structure of this expression (expressed as an LES-compatible s-expression) is

(func (@-> (add (@: x i32) (@: y i32)) (i32)) (@`{}` (@+ #x #y)));

> Languages and data do not in general map to a fixed set of operators so a fixed set of precedence and parsing rules seems to have limited utility.

The fixed set of rules is designed to reflect the existing consensus of many different programming languages. I think the existing rule set would serve Wasm quite well - but it's not a finalized spec, and if the CG wants changes, then changes there will be. Even better would be if eyeballs outside the Wasm community deliberated on it, but I'm not sure how to make that happen.

@ghost

ghost commented Mar 21, 2016

@qwertie None of those seem 'familiar' or an improvement over s-exp to me, but it's a subjective matter. Avoiding explicit get_local operators seems an obvious thing to improve the readability. I suggest waiting to see the latest thinking on a post-order encoding because with a one pass validator the type derivation will be bottom up and I can't see any advantage to having i32.add and i64.add etc rather than just add and determining the type from the argument types? This seems to work fine for CIL. If the types could be removed from many of the operators then they would map to familiar operators.
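
A rough Python sketch of the bottom-up idea mentioned above: an untyped `add` is resolved to `i32.add` or `i64.add` from the types of its arguments. Everything here is hypothetical (the `infer` function, the `["const", type, value]` shape, and treating bare names as locals), and it only covers operators whose result type equals their operand type, so comparisons are out of scope.

```python
def infer(expr, local_types):
    """Return (typed_expr, type) for a nested-list expression, bottom up."""
    if isinstance(expr, str):
        # Bare names are treated as locals (an assumption of this sketch).
        return ["get_local", expr], local_types[expr]
    operator, *operands = expr
    if operator == "const":
        value_type, value = operands
        return [f"{value_type}.const", value], value_type
    typed_operands, operand_types = [], []
    for operand in operands:
        typed, operand_type = infer(operand, local_types)
        typed_operands.append(typed)
        operand_types.append(operand_type)
    if len(set(operand_types)) != 1:
        raise TypeError(f"operand types differ for {operator}: {operand_types}")
    result_type = operand_types[0]
    return [f"{result_type}.{operator}", *typed_operands], result_type


typed, result = infer(["add", "$x", ["const", "i32", "1"]], {"$x": "i32"})
print(typed)   # ['i32.add', ['get_local', '$x'], ['i32.const', '1']]
print(result)  # i32
```

The same pass also makes `get_local` implicit, which is the other readability point raised in this comment.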

@qwertie

qwertie commented Mar 21, 2016

@JSStats The 'add' example doesn't seem familiar? It looks like Rust and Swift. Since the text format isn't used directly on the web, I was assuming a one-pass compiler wouldn't be necessary for it (and certainly one could do multiple passes over individual functions), so some conveniences can be added, so + could be used for adding even if there is no single opcode for it. Heck, even if one really wanted to optimize the parser, that feature is still possible (given forward declarations for all functions).

@ghost

ghost commented Mar 22, 2016

@qwertie Sorry if I missed it but nothing in your comment immediately above looked like familiar JS style? Do you have a wasm printer that emits wasm using the format you suggest to give an example? 'Rust and Swift' were not something I would consider 'familiar' (subjective).

I think it would help if people could develop some alternatives to add to the discussion. If removing the type from operators such as add allows them to map to familiar infix and precedence syntax then make the case for this change and I may well support this as no reason against doing this comes to mind. Show people the substantial difference it could make to the text format.

It's an interesting idea of LES to have a fixed set of familiar operators with familiar infix and precedence rules for them, and with all other operators handled by generic paths so as to support the code-is-data use case. It might create a compelling reason to 'overload' some operators so that they match familiar patterns. This same property would probably help text formatters in general, and it affects the binary encoding too unless formatters are expected to translate.

Personally I hope the browsers can support an extension model for viewing the source so that there can be some personalization in this area and it seems possible for the text format to evolve somewhat independent of the binary encoding.

@qwertie

qwertie commented Mar 22, 2016

On the LES page, you'll see LES code that looks like Javascript, but since Wasm is statically typed it couldn't/shouldn't look quite like JS, hence the Rust/Swift style. I haven't made a LES-to-Wasm (or vice versa) converter yet, and I've been puzzling over which language I should write a parser in... JS? C++? OCaml? Currently it's just C# which presumably doesn't interest folks here.

@data-ux

data-ux commented Apr 8, 2016

I have just published a WASM playground that can also render the code in different text formats. This way it is easy to compare the pros and cons of different formats:
http://ast.run/

Next I'm planning to add rendering to a Sweet-expression-like format. Cirru would be interesting, too.

@tiye
Author

tiye commented Apr 9, 2016

@data-ux Looks nice. I found your examples on GitHub, and here's one of them in Cirru:

module
  func $factorial (param $num f64)
    result f64
    local $i f64
    local $result f64
    set_local $i $ get_local $num
    set_local $result $ f64.const 1
    loop $done $loop
      if
        f64.eq (get_local $i)
          f64.const 0
        br $done
        block
          set_local $result $ f64.mul (get_local $i)
            get_local $result
          set_local $i $ f64.sub (get_local $i)
            f64.const 1

      br $loop

    get_local $result

  export :factorial $factorial

This was produced with existing Cirru tools: the parser, the writer, and the JSON toolkit (which might be old).

The difference from your "indentation" view:

  • using $ to reduce nestings
  • :factorial instead of "factorial", since the quote mark is special in Cirru

@sunfishcode
Member

Some thoughts on several different ideas here (from different people):

It isn't necessary to adopt Cirru, LES, or Sweet expressions, in order to obtain a simple data structure representing WebAssembly. WebAssembly is defined as a simple data structure. The "browser view source" text syntax will just be one way of describing that language, rather than being the basis for the language definition.

> I'm a big fan of Lisp(mainly Clojure), but dislike the parentheses in Lisp. So this is my idea to fix it and that in WebAssembly.

Although it currently uses s-expressions, WebAssembly is not in the LISP family of programming languages. If one likes Clojure but wishes it had a different syntax, WebAssembly will not be a satisfying replacement, with any syntax, as it lacks the majority of features that make Clojure great. For example, it doesn't even have closures, in the Clojure sense.

> none of the "friendly" languages for representing syntax trees - sweet expressions, LES, Cirru - are popular.

Inertia is surely significant, but another possible way of saying this is that these tools aren't offering enough of a practical advantage to overcome the inertia.

> build graphical tools for WebAssembly

Graphical programming is intriguing. However, building and popularizing graphical programming on top of WebAssembly seems like it would be strictly harder than doing so with an actual LISP. Graphical programming is all about the human factors, so building it around a language that lacks the facilities that human programmers typically rely on would put it at a significant disadvantage. In contrast, actual LISP would seem to be an ideal base: many actual LISPs are very much designed to be written by humans, they have elegant and powerful abstraction mechanisms, and they already have a syntax which is an extremely simple representation of a tree.

> By the way even people don't like Cirru, I can still compile Cirru to WebAssembly by some ways..

This is very much true. The discussion here is about WebAssembly's "browser view source" text format, which has some unusual constraints. However, no matter what WebAssembly picks for its own use, Cirru seems like it wants to be more than this anyway.

@tiye
Author

tiye commented Apr 14, 2016

@sunfishcode I want to talk more about the "graphical programming" part. Cirru is still far from building a graphical programming environment. I believe it's better than Lisp. Cirru is simpler than S-Expressions while keeping the power of S-Expressions. People have been using S-Expressions for years but only got techniques like Emacs and Parinfer that regard code more like text buffers than trees. However, in Cirru the code is represented in a form like [["a", ["b", "c"], "d"]], so everyone can build graphical tools to play with it and thus bypass lexing and parsing.

I have an example here: inspired by data structures in Clojure I realised a tree of S-Expressions is always equivalent to a binary tree, so my editor in Cirru can be a fractal tree rather than a boring text layout.
[image]
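
One standard way to see the "equivalent to a binary tree" claim is the cons-cell encoding familiar from Lisp, sketched below in Python. This is an assumption about what is meant, not necessarily the encoding used in the editor shown; `to_binary` and `from_binary` are hypothetical names.

```python
def to_binary(node):
    """Encode a nested list as (first, rest) pairs, with None as the empty list."""
    if isinstance(node, str):
        return node
    pair = None
    for item in reversed(node):
        pair = (to_binary(item), pair)
    return pair


def from_binary(pair):
    """Decode (first, rest) pairs back into a nested list."""
    if isinstance(pair, str):
        return pair
    items = []
    while pair is not None:
        head, pair = pair
        items.append(from_binary(head))
    return items


tree = ["a", ["b", "c"], "d"]
binary = to_binary(tree)
print(binary)
# ('a', (('b', ('c', None)), ('d', None)))
print(from_binary(binary) == tree)  # True
```

Each pair has exactly two children, so any tree of lists and strings round-trips through a binary tree.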

Then I figured out a way to render Cirru code in a fractal tree that I can drag to reshape it(https://twitter.com/cirrulang/status/650921962944884736 , hope the video is available) while it's still hard to make it editable and even useful:

[image]

I would say it's much more difficult to build a graphical thing if the whole language is designed based on text syntax. Since Cirru is using only arrays (or vectors in Clojure) and strings for representation, playing with its text format is as simple as JSON.stringify and JSON.parse (or read-string and pr-str in Clojure), and most people would find it easy to pick up.

@qwertie

qwertie commented Apr 15, 2016

@sunfishcode

> Inertia is surely significant, but another possible way of saying this is that these tools aren't offering enough of a practical advantage to overcome the inertia.

Oh? I don't think so. I just think that traditionally, language designers have (primarily out of habit, but not always) "overcome the inertia" in a different way than proposed by LES, Cirru and Sweet Exprs. It's obvious that there are practical reasons not to use s-expressions - not a single one of the top 20 languages on TIOBE is based on s-expressions. There's a reason for that: many people don't like them, and don't want to read or write software in them. The syntax is a cognitive burden.

It's just that historically, every new programming language that rejected s-expressions has assumed that the alternative is to define a custom syntax from scratch. My research with LES is meant to demonstrate a "third way" - that you can get a lot of mileage out of a simple, general purpose parser and a simple data structure.

Given that the text format is not the primary format, and most developers will not read or write it as often as "real" source code, I think WebAssembly is the perfect opportunity to introduce this "third way" into mainstream consciousness. According to TextFormat.md,

> There is no requirement to use JavaScript syntax .... There may also be substantive reasons to use notation that is different than JavaScript (for example, WebAssembly has a 32-bit integer type....). On the other hand, when there are no substantive reasons and the options are basically bikeshedding, then it does make sense for the text format to match existing conventions on the Web (for example, curly braces, as in JavaScript and CSS).

My argument is (i) LES is perfectly in tune with this plan, (ii) there isn't a strong reason to use a custom-designed parser that is useless for tasks other than parsing Wasm code, and (iii) although adopting a general-purpose parser won't directly help the WebAssembly project, it's good for the future of the software industry.

@jiyinyiyong
There are various people around the world working on graphical programming - Jonathan Edwards and the Wolfram Language come to mind, and no doubt many others I've never heard of. But I don't think the WebAssembly folks would want to get involved in that area.

@sunfishcode
Member

@jiyinyiyong Cirru looks like a fun and interesting project, but it looks like it's a better fit for LISP-family languages than for WebAssembly.

@tiye
Author

tiye commented Apr 16, 2016

@qwertie no, its graphical side is not for WebAssembly; that can be discussed in the future.

@sunfishcode yeah, Lisp-like languages are better. Other languages are also ok, I tried JavaScript and it worked.

@tiye
Author

tiye commented Apr 20, 2016

Some more words on the parser and the code generator.

The parser is quite short. The most stable one is in CoffeeScript: parser.coffee is 232 sloc, with tree.coffee (transforming layout) at 41 sloc. I've ported Cirru Parser to several languages (sadly, no C, no Java, my bad). It could be quite fast if I can write it in C in the future, while the bad part is the lack of syntax error messages.

The code generator is also short, but might be a little tricky. In my view the future of Cirru is text format in JSON/EDN with powerful GUI tree editors. So the code generator is not a big deal. It's okay the generator is currently working well with the parser. And it's not hard to turn 300+ sloc of JavaScript into Java or C if needed.

And if someone likes the idea behind Cirru but dislikes its text syntax, there's one more way to use Cirru. For wasm code like this:

(export "even" $even)

in Cirru's solution, it's equivalent to a piece of JSON:

[["export", ":even", "$even"]]

and equivalent to EDN in Clojure:

[["export" ":even" "$even"]]

or maybe to others, like YAML:

---
  - 
    - "export"
    - ":even"
    - "$even"

which can be turned into a Cirru file:

export :even $even

The recursive data structure can be a bridge between wasm and Cirru, by reusing the parsers and code generators of many languages:

data CirruValue = CirruList [CirruValue] | CirruString String

That means if someone does not like Cirru, it's okay to pick an existing config language, as long as that language supports nesting data structure of Vector(or Array, or List) and String. Well, I just think Cirru is the shortest one among these languages.
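
As a sketch of the "bridge" idea above, the following Python reads the tree from JSON and writes it back out either as S-expressions or as one-line Cirru. The function names are hypothetical, and the Cirru writer only handles the simplest layout (one top-level expression per line, no indentation folding or `$`).

```python
import json


def is_cirru_value(value):
    """True for a string or a (possibly nested) list of such values."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(is_cirru_value(v) for v in value)


def write_sexpr(value):
    """Render a tree as a parenthesized S-expression."""
    if isinstance(value, str):
        return value
    return "(" + " ".join(write_sexpr(item) for item in value) + ")"


def write_cirru_lines(program):
    """Render each top-level list on one line, without outer parentheses."""
    lines = []
    for expr in program:
        if not isinstance(expr, list):
            raise ValueError("top-level expressions must be lists")
        lines.append(" ".join(write_sexpr(item) for item in expr))
    return "\n".join(lines)


program = json.loads('[["export", ":even", "$even"]]')
print(is_cirru_value(program))     # True
print(write_sexpr(program[0]))     # (export :even $even)
print(write_cirru_lines(program))  # export :even $even
```

Any format that can carry nested lists of strings (JSON, EDN, YAML) could feed the same two writers.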

@tiye tiye closed this as completed Apr 20, 2016
@tiye tiye reopened this Apr 20, 2016
@qwertie

qwertie commented May 22, 2016

@sunfishcode I'm a little cross that you didn't mention you'd done this (dated Apr 14).

@sunfishcode
Member

@qwertie That's an experiment, and we're still in the process of experimenting with it. We may eventually propose something related to this, or we may end up proposing something entirely different, depending on how our experiments go.

@qwertie

qwertie commented May 23, 2016

@sunfishcode Okay, fair enough ... although the document itself (used to) boldly headlines "Official Text Format". So, it's been a month ... how's the experiment going? Is there a thread where people are discussing it? If not, are there any pain points or different directions you're thinking about?

@tiye
Author

tiye commented May 23, 2016

Compared to the text syntax, I'm more concerned about whether, given a wasm file in binary or text, I can get a JavaScript Object representation with one function call, as simple as JSON.parse(content). It was a pain in the old days: I had to find a proper parser for JavaScript syntax (even worse, the syntax changes over time) before I could start playing with it.

@ghost

ghost commented May 23, 2016

@sunfishcode
Member

@qwertie Sorry for the confusion; that text came from the original document it's based on. As it's an experiment, I've not yet polished it up.

@sunfishcode sunfishcode modified the milestones: Meta, MVP Jul 8, 2016
@dead-claudia

dead-claudia commented Jul 23, 2016

Edit: @sunfishcode's strawman isn't actually based on LES...I misread the outer context.


I do feel that Cirru syntax looks rather natural for the text format. Although WebAssembly definitely isn't a Lisp, it's definitely a very expression-heavy and tree-heavy language syntactically, which Cirru was mostly made for (although I do feel that @sunfishcode's strawman could use a little simplification).

Here's a few side-by-side comparisons from @sunfishcode's strawman:

From fac.wast:

// @sunfishcode's strawman
function $fac-opt ($a:i64) : (i64) {
  var $x:i64;
  $x = 1;
  br_if ($a <s 2) $end;
  loop $loop {
    $x = $x * $a;
    $a = $a + -1;
    br_if ($a >s 1) $loop;
  }
$end:
  $x
}
-- Cirru
func $fac-opt ($a i64) i64
  local $x i64
  set $x 1
  block $block
    br_if $block (<s $a 2)
    loop $loop
      set $x (* $x $a)
      set $a (+ $a -1)
      br_if $loop (>s $a 1)
  , $x

Fast inverse square root from Quake:

/* C source */
float Q_rsqrt(float number)
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = *(long *) &y;
    i  = 0x5f3759df - (i >> 1);
    y  = *(float *) &i;
    y  = y * (threehalfs - (x2 * y * y));
    y  = y * (threehalfs - (x2 * y * y));

    return y;
}
;; LLVM output + binaryen + few tweaks
(func $Q_rsqrt (param $0 f32) (result f32)
  (local $1 f32)
  (set_local $1
    (f32.reinterpret/i32
      (i32.sub
        (i32.const 1597463007)
        (i32.shr_s
          (i32.reinterpret/f32
            (get_local $0))
          (i32.const 1)))))
  (set_local $1
    (f32.mul
      (get_local $1)
      (f32.sub
        (f32.const 0x1.8p+0)
        (f32.mul
          (get_local $1)
          (f32.mul
            (get_local $1)
            (set_local $0
              (f32.mul
                (get_local $0)
                (f32.const 0x1p-1))))))))
  (f32.mul
    (get_local $1)
    (f32.sub
      (f32.const 0x1.8p+0)
      (f32.mul
        (get_local $1)
        (f32.mul
          (get_local $0)
          (get_local $1)))))
)
// @sunfishcode's strawman
function $Q_rsqrt ($0:f32) : (f32) {
  var $1:f32;
  $1 = f32.reinterpret/i32(1597463007 - ((i32.reinterpret/f32($0)) >> 1));
  $1 = $1 * (0x1.8p0 - $1 * ($0 = $0 * 0x1p-1) * $1);
  $1 * (0x1.8p0 - $1 * $0 * $1)
}
-- Cirru
func $Q_rsqrt ($0 f32) f32
  local $1 f32
  set $1 $ f32.reinterpret/i32 $ - 1597463007 (>> (i32.reinterpret/f32 $0) 1)
  set $0 (* $0 0x1p-1)
  set $1 $ * $1 (- 0x1.8p0 (* $1 $0 $1))
  * $1 (- 0x1.8p0 (* $1 $0 $1))
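
As a cross-check on the listings above, here is a small Python sketch that mirrors the C source step by step and confirms that the literals in the WebAssembly versions denote the same constants. The name `q_rsqrt` is just a transliteration of the C function; the sketch assumes a positive input and does the Newton steps in double precision rather than f32, so the low digits can differ from a real f32 implementation.

```python
import struct


def q_rsqrt(number):
    """Approximate 1/sqrt(number), following the C source (two Newton steps)."""
    x2 = number * 0.5
    # i = *(long *) &y : reinterpret the float32 bits as a signed 32-bit integer
    (i,) = struct.unpack("<i", struct.pack("<f", number))
    i = 0x5F3759DF - (i >> 1)
    # y = *(float *) &i : reinterpret the integer bits back as a float32
    (y,) = struct.unpack("<f", struct.pack("<i", i))
    y = y * (1.5 - x2 * y * y)
    y = y * (1.5 - x2 * y * y)
    return y


# The decimal and hex-float literals in the listings match the C constants.
assert 0x5F3759DF == 1597463007
assert float.fromhex("0x1.8p+0") == 1.5
assert float.fromhex("0x1p-1") == 0.5

print(q_rsqrt(4.0))  # close to 0.5
```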

From labels.wast:

;; S-expressions
(func $loop3 (result i32)
  (local $i i32)
  (set_local $i (i32.const 0))
  (loop $exit $cont
    (set_local $i (i32.add (get_local $i) (i32.const 1)))
    (if (i32.eq (get_local $i) (i32.const 5))
      (br $exit (get_local $i))
    )
    (get_local $i)
  )
)
// @sunfishcode's strawman
function $loop3 () : (i32) {
  var $i:i32;
  $i = 0;
  loop $cont {
    $i = $i + 1;
    if ($i == 5) {
      br ($i) $exit;
    }
  $exit:
  }
}
-- Cirru
func $loop3 i32
  local $i i32
  set $i 0
  block $exit $ loop $cont
    set $i (+ $i 1)
    if (= $i 5)
      br $exit $i

In the case of nested blocks, though, it might need some fine tuning:

;; S-expressions
(block $default
  (block $green
    (block $yellow
      (block $orange
        (block $red
          (br_table (get_local $index) [$red $orange $yellow $green] $default)
        )
        ;; ...
      )
      ;; ...
    )
    ;; ...
  )
  ;; ...
)
// @sunfishcode's strawman
{
  br_table ($index) [$red, $orange, $yellow, $green], $default;
$red:
    // ...
$orange:
    // ...
$yellow:
    // ...
$green:
    // ...
$default:
}
-- Cirru
block $default
  block $green
    block $yellow
      block $orange
        block $red
          br_table $default [$red $orange $yellow $green] $index
        -- ...
      -- ...
    -- ...
  -- ...

@tiye
Author

tiye commented Jul 23, 2016

@isiahmeadows thx. There's a video that explains the Cirru text syntax better, and I think it will help: https://youtu.be/cXTJDj8ad_U [0:38]. It explains why you may need an extra comma in the first demo:

-- Cirru
func $fac-opt ($a i64) i64
  local $x i64
  set $x 1
  block $block
    br_if $block (<s $a 2)
    loop $loop
      set $x (* $x $a)
      set $a (+ $a -1)
      br_if $loop (>s $a 1)
  , $x

And some updates on Cirru itself. Actually, in the past months I haven't written Cirru text syntax anymore; instead I edit code in a graphical editor and store it in EDN or JSON. For example, this is the text file:

[
[ "ns" "respo.component.text" [ ":require" [ "[]" "respo.alias" ":refer" [ "[]" "create-comp" "span" ] ] ] ]
[ "defn" "render" [ "content" "style" ] [ "fn" [ "state" "mutate!" ] [ "span" [ "{}" [ ":attrs" [ "{}" [ ":inner-text" "content" ] ] ] [ ":style" "style" ] ] ] ] ]
[ "def" "comp-text" [ "create-comp" ":text" "render" ] ]
]

and this is how I edit code:

image

Why am I staying away from the syntax I designed? Here's an issue: I want to generate text output with a program, but it's hard to generate pretty and correct indentation syntax when the first item in an expression is itself a complicated expression. For example, in this AST the first item (and (> a 1) true) is complicated, so it can be tricky to generate the correct indentation:

cond
  (and (> a 1) true) (+ a 1)
  (or (< a 1) false) (- a 1)
  :else a

An obvious solution is to put that first item on a single line. But I don't think that's what I wanted. So I chose JSON or EDN, which are very stable.
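
The trade-off can be sketched with a toy writer in Python. This is not Cirru's actual formatter; `inline`, `one_line`, and `write_indented` are hypothetical helpers, and the layout rule is an assumption chosen only to show where the special case appears. The JSON round trip needs no layout decisions, while the indentation writer has to fall back to a single line as soon as an expression starts with a list.

```python
import json


def inline(node):
    """Render a node on one line, parenthesizing nested lists."""
    if isinstance(node, str):
        return node
    return "(" + " ".join(inline(child) for child in node) + ")"


def one_line(expr):
    """Render an expression on a single line without outer parentheses."""
    return " ".join(inline(child) for child in expr)


def write_indented(expr, depth=0):
    """Toy layout: leading atoms on the first line, list children indented below."""
    pad = "  " * depth
    # Index of the first child that is a list (len(expr) if there is none).
    split = next((i for i, c in enumerate(expr) if isinstance(c, list)), len(expr))
    tail = expr[split:]
    # Special cases: a list in head position, or an atom after a list child.
    # The toy writer gives up on layout and keeps the expression on one line.
    if split == 0 or any(isinstance(c, str) for c in tail):
        return [pad + one_line(expr)]
    lines = [pad + " ".join(expr[:split])]
    for child in tail:
        lines.extend(write_indented(child, depth + 1))
    return lines


tree = ["cond",
        [["and", [">", "a", "1"], "true"], ["+", "a", "1"]],
        [["or", ["<", "a", "1"], "false"], ["-", "a", "1"]],
        [":else", "a"]]

print("\n".join(write_indented(tree)))
# JSON needs no such layout rules: dumping and loading gives back the same tree.
assert json.loads(json.dumps(tree)) == tree
```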

In conclusion: I'd like to contribute the experience of editing code as an AST that I gained in the Cirru project. WebAssembly is the future and is full of possibilities. Cirru is targeting a graphical AST; if you want a text one, Cirru is nice, but far from perfect.

@dead-claudia

dead-claudia commented Jul 23, 2016

@jiyinyiyong To be fair, I didn't actually verify the code samples. And I know Cirru is far from perfect (it can be pretty awkward IMHO). I just wanted to put out a more representative sample. 😄

(Oh, and I fixed the error, BTW.)

@yurydelendik

From fac.wast:

// LES
function $fac-opt ($a:i64) : (i64) {

@isiahmeadows can you remove or change // LES in your examples above? The experimental text format has no such abbreviation. I think you are confusing the name with the counterproposal's.

@dead-claudia

@yurydelendik I was referring to the raw expression syntax itself used by the two proposals to differentiate them (LES-based and Cirru-based syntaxes), not the actual syntax of the text format (which there isn't actually any ATM beyond the s-expressions used by the WASM reference interpreter and most current experimental implementations). So if I were to be technically correct, I'd make the comments LES-based and Cirru-based instead, but I thought that LES and Cirru would be sufficient given the context.

To be fair, someone could theoretically invent their own syntax for it for another proposal completely (which I've considered doing myself, but just haven't had the time to sketch it out). The ideas are pretty much limitless right now.

@yurydelendik

@sunfishcode's strawman proposal is not LES-based (at least not yet), which is why I don't really understand why LES is mentioned in the example comments.

@dead-claudia

@yurydelendik Well...you're correct in a sense:

[From the strawman pull: sunfishcode#3]

I propose that the text format be compatible with LES - as the PR text explains, not LES as it exists today, but as it will be when the MVP is launched.

At this point I feel we're arguing semantics. I feel the intent was clear enough (although I'll add a note above explaining the technicality).

@qwertie

qwertie commented Jul 26, 2016

@isiahmeadows, my LES proposal is derived from @sunfishcode's and not the other way around (although LES itself was developed before WebAssembly, just as s-expressions were.)

@dead-claudia

dead-claudia commented Jul 26, 2016

@qwertie Where did I imply that? If I did, I'm sorry, and I'll fix it. (I tried to speak of theirs as purely independent, and I didn't think I referenced yours.)

@qwertie

qwertie commented Jul 26, 2016

@isiahmeadows your comment 3 days ago, which discusses "sunfishcode's LES-based proposal". (There's only one LES proposal - mine. Unless you count this second one which is also mine)

@dead-claudia

dead-claudia commented Jul 26, 2016

@qwertie That was part of the edit associated with this comment, stemming from this discussion. I'll fix it.

@dead-claudia

@yurydelendik I'm sorry...I didn't fully differentiate the PR from the actual strawman. I apologize.

@flagxor
Member

flagxor commented Jul 30, 2016

After some communication between implementers, we've decided to focus on having all browsers be able to display linear opcodes for the MVP timeframe. Moving to Discussion.

@flagxor flagxor modified the milestones: Discussion, MVP Jul 30, 2016
@jfbastien
Member

Text format proposal. Closing this in favor of the proposal.

9 participants