[SUGGESTION] Array Literals with the support of both Multidimensional and Jagged Arrays

### Preface

The idea of this suggestion is gathered from discussion in [this issue](https://github.com/hsutter/cppfront/issues/193).

I have to mention that `(...)` is already used for calling constructors, grouping expressions, and initializing lists.

Now, consider the following ambiguities:

- `(1)` as an expression: is it a parenthesized value, or an array of one item?
- `()` is not an empty array; in a variable declaration it calls the default constructor.
- `(1, 2)` in a declaration: is it the arguments of a constructor, or an array of two items?

```cpp
x0: std::vector<int> = (1, 2);
```

Yes, I know that `std::vector` has a bad API design, but I ask myself: why would Cpp2 (like Cpp1) allow libraries to have this ambiguity in the first place?
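For reference, here is a minimal Cpp1 sketch of the same ambiguity as it exists today. The function names are only illustrative; the `std::vector` constructors are the standard ones.

```cpp
#include <cassert>
#include <vector>

// Parentheses select the (count, value) constructor: one element equal to 2.
std::vector<int> make_with_parens() {
    std::vector<int> v(1, 2);
    return v;
}

// Braces select the std::initializer_list constructor: the elements 1 and 2.
std::vector<int> make_with_braces() {
    std::vector<int> v{1, 2};
    return v;
}

// Illustrative check: the same two arguments give two different vectors.
void demo_vector_ambiguity() {
    assert(make_with_parens().size() == 1);
    assert(make_with_braces().size() == 2);
}
```

The two spellings differ only in the kind of bracket, yet they produce vectors of different sizes and contents.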

Having array literals with a different syntax will solve those three ambiguities. I suggest using `[...]` for array literals:

```cpp
x0: = [1, 2, 3];
```

Also, nested `(...)`s (or `;` separators, etc.) would create multidimensional arrays, because they don't create a new array: they only group, in the same way that they group expressions and change the precedence of operators. On the other hand, nested `[...]`s would create jagged arrays, because each one creates a new array:

```cpp
// Multidimensional Array
x0: = [(1, 2, 3), (4, 5, 6)];
// Or alternatively one of the following syntaxes:
// x0: = [1, 2, 3; 4, 5, 6];
// x0: = [(1, 2, 3); (4, 5, 6);];
r0: = x0[0, 1] == 2; // true

// Jagged Array
x1: = [[1, 2, 3], [4, 5, 6]];
r1: = x1[0][1] == 2; // true
```
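To make the distinction concrete, here is a rough Cpp1 sketch of the two layouts. `Grid2x3` and its `at` member are hypothetical helpers made up for this illustration; they are not part of Cpp2 or the standard library, and this is not a proposed lowering.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Multidimensional: one contiguous block with a fixed row length.
// Grid2x3 is a hypothetical helper used only for this illustration.
struct Grid2x3 {
    std::array<int, 6> cells;
    // Maps (row, col) onto the single flat block, like `x0[0, 1]`.
    int at(std::size_t row, std::size_t col) const { return cells[row * 3 + col]; }
};

Grid2x3 make_grid() { return Grid2x3{{1, 2, 3, 4, 5, 6}}; }

// Jagged: an array of independent arrays. Each row owns its own storage,
// so rows may have different lengths.
std::vector<std::vector<int>> make_jagged() { return {{1, 2, 3}, {4, 5}}; }

// Illustrative check of both access styles.
void demo_layouts() {
    assert(make_grid().at(0, 1) == 2);
    assert(make_jagged()[0][1] == 2);
}
```

In the jagged case the second row has only two items, which a multidimensional array cannot express.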

***I currently do not suggest supporting multidimensional arrays, but it's a possibility to consider in the future.***

### Suggestion Detail

Three options are available instead of `()` for array literals:

- `<...>` is already used for template parameters/arguments. It's not a good choice, because:
    - It doesn't have any known relation with arrays.
    - It looks like the less-than and greater-than operators. Because of this similarity, it's not a good choice for arrays of mathematical expressions, such as a vector of boolean values, e.g. `<a < b, c > 2>`.
- `[...]` is already used for accessing items of an array. It seems to be a good choice.
- `{...}` is already used for function/statement blocks and type definitions. It can be considered a good choice.

Now it's time to compare `[...]` and `{...}` for array literals:

```cpp
x0: /*...*/ = [1];
x1: /*...*/ = {1};
```

OK. Both of them look good. So what if we want to write an empty array?

```cpp
x0: /*...*/ = [];
x1: /*...*/ = {};
```

`[]` is clearly an empty array, but `{}` can be either an empty function/statement block or an empty array, depending on the declaration. For example:

```cpp
x0         :       std::vector<int> = {};  // empty array
x1         : () -> void             = {}   // empty statement block
x2         : () -> std::vector<int> = {}   // ERROR: It doesn't work, although visually it's the same as above.
x3: = call(: () -> std::vector<int> = {}); // SURPRISE! It works, although visually it's the same as above.
```

`{}` is visually surprising and inconsistent for `x1`, `x2` and `x3`, although they look the same:

- In the `x0` declaration, `{}` is an empty array.
- In the `x1` declaration, `{}` is an empty statement block.
- But in the `x2` declaration, `{}` is an error, because it must end with `;`.
- But in the `x3` declaration, `{}` is an empty array!

So `[...]` is more expressive than `{...}` for array literals.
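Cpp1 already gives `{}` the same double role, which may help to see the point. The sketch below is plain Cpp1 with illustrative function names; it does not show what cppfront generates.

```cpp
#include <cassert>
#include <vector>

// Here `{}` is an empty initializer: the vector has no elements.
std::vector<int> empty_vector() {
    std::vector<int> v = {};
    return v;
}

// Here the outer `{ ... }` of the lambda is a function body, while the
// inner `{}` after `return` is again an empty initializer.
std::vector<int> empty_from_lambda() {
    auto make = []() -> std::vector<int> { return {}; };
    return make();
}

// Here `{}` is an empty statement block: the function does nothing.
void do_nothing() {}

// Illustrative check: both vector-producing functions yield empty vectors.
void demo_empty_braces() {
    do_nothing();
    assert(empty_vector().empty());
    assert(empty_from_lambda().empty());
}
```

The reader has to look at the surrounding declaration to know which meaning applies, which is the problem a dedicated `[]` would avoid.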

Now let's consider this situation in the following example:

```cpp
x0: = [1, 2, 3][1]; // It's equal to 2
```

The first `[...]` creates an array, and the second `[...]` accesses an item from it. A sequence of `[...]`s is not ambiguous, because its behaviour is similar to that of parentheses:

```cpp
x0: = (call() + something)(1);
```

The first `(...)` groups the operands of `operator+`, and the second `(...)` calls `operator()` on the result.
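Both patterns can be approximated in Cpp1 today. This is only a sketch, assuming C++17 for class template argument deduction on `std::array`; `Offset` is a hypothetical type invented to stand in for `call() + something`.

```cpp
#include <array>
#include <cassert>

// Builds a temporary array and then indexes it, mirroring `[1, 2, 3][1]`.
int second_of_literal() {
    return std::array{1, 2, 3}[1];
}

// Hypothetical type with both operator+ and operator().
struct Offset {
    int value;
    Offset operator+(Offset other) const { return Offset{value + other.value}; }
    int operator()(int x) const { return x + value; }
};

// The first (...) groups the operands of operator+, the second (...) calls
// operator() on the result, mirroring `(call() + something)(1)`.
int call_grouped() {
    return (Offset{2} + Offset{3})(1);
}

// Illustrative check of both chained forms.
void demo_chaining() {
    assert(second_of_literal() == 2);
    assert(call_grouped() == 6);
}
```

In both cases the leading bracket pair produces a value and the trailing pair operates on that value, so the sequence reads left to right without ambiguity.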

### Your Questions

**Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?**

Yes. If a bad API design can suddenly change the meaning of code, it's going to be a security vulnerability. This suggestion is a way to prevent it by separating arrays from constructors and expressions.

**Will your feature suggestion _automate or eliminate_ X% of current C++ guidance literature?**

Yes. There is no need to learn whether user-defined constructors are ambiguous with initializer lists, because the ambiguous situation is prevented completely. It also allows more API choices.

### Considered Alternatives

An alternative solution was to make it a syntax error when a type has a constructor that is ambiguous with an initializer list. However, this approach wouldn't fix the bad API design of `std::vector`.

Another alternative solution was a little complicated. The idea was to favor constructors over initializer lists, and to treat a parenthesized comma-separated list as an initializer list:

```cpp
x0: = (1, 2); // x0 is an initializer list
```

With the help of [unnamed variable declaration](https://github.com/hsutter/cppfront/issues/391) and indirect initialization, it could be used like this:

```cpp
// `: = (1, 2)` is an initializer list
x0: std::vector<int> = : = (1, 2);

// This calls the constructor to create a vector of one element with value 2.
x1: std::vector<int> = (1, 2);
```
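The closest Cpp1 analogue of this indirect initialization that I know of is to name the initializer list first and then construct from it. This is a sketch with illustrative names, not what cppfront would generate.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Step 1 deduces a std::initializer_list<int>; step 2 constructs from it,
// so the list constructor is selected without any bracket ambiguity.
std::vector<int> from_named_list() {
    auto items = {1, 2};
    std::vector<int> v = items;
    return v;
}

// Direct parenthesized construction still means (count, value).
std::vector<int> from_constructor_args() {
    std::vector<int> v(1, 2);
    return v;
}

// Illustrative check: the indirect form gives two items, the direct form one.
void demo_indirect_initialization() {
    assert(from_named_list().size() == 2);
    assert(from_constructor_args().size() == 1);
}
```

The extra intermediate object is what removes the ambiguity, and it is also the extra ceremony this alternative would have encouraged.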

But I gave up on this idea, because it would encourage unnamed variable declarations more than necessary.

Finally, I considered using the [literal templates](https://github.com/hsutter/cppfront/issues/316) syntax:

```cpp
// `(1, 2)<int>` is an initializer list
x0: std::vector<int> = (1, 2)<int>;

// `list` is a user-defined literal suffix which creates an initializer list
x1: std::vector<int> = (1, 2)list;

// This calls the constructor to create a vector of one element with value 2.
x2: std::vector<int> = (1, 2);
```

But I gave up on this idea too, because `(1, 2)<int>` would require always specifying the type, and `(1, 2)list` would make user-defined literal suffixes look like constructors.

### Edits

- I've added one more alternative solution which I had considered.
