Skip to content

sql: values of type BLOB are encoded as strings #4023

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Korablev77 opened this issue Mar 1, 2019 · 0 comments
Closed

sql: values of type BLOB are encoded as strings #4023

Korablev77 opened this issue Mar 1, 2019 · 0 comments
Assignees
Labels
bug Something isn't working sql static types sql
Milestone

Comments

@Korablev77
Copy link
Contributor

Korablev77 commented Mar 1, 2019

CREATE TABLE t (key BLOB PRIMARY KEY);
INSERT INTO t VALUES ('abc');

In a nutshell, to execute last query following actions are performed:

1. OP_String8 r, 'abc' //Save constant string 'abc' to memory cell r
2. OP_ApplyType r, BLOB //Cast value in r to SCALAR type
3. OP_MakeRecord r, r1 //Encode value in r to msgpack and store to r1
4. OP_Insert r1, space(t) //Insert value of r1 to space(t)

At step 2 to convert value we are calling mem_apply_type() routine. If we want to convert string to scalar, next procedure is called:

 366         case FIELD_TYPE_SCALAR:                                                 
 367                 if (record->flags & (MEM_Str | MEM_Blob))                       
 368                         record->flags |= MEM_Blob;                              
 369                 return 0;     

So, in fact after type SCALAR is applied to STRING, value features both flags: MEM_Blob and MEM_Str.
And during msgpack encoding at stage 3, we set type of tuple format as MP_STR:

1714 void                                                                            
1715 mpstream_encode_vdbe_mem(struct mpstream *stream, struct Mem *var)              
1716 {        
...
1740         } else if (var->flags & MEM_Str) {                                      
1741                 mpstream_encode_strn(stream, var->z, var->n);    

Which is obviously wrong action, since after fetching value from tuple it will be treated as STRING (BLOB is detected as MP_BIN in tuple format). To resolve this problem, we should allow only one type-flag at once. Fix in this case is obvious, but it affects other type-related issues.

@Korablev77 Korablev77 self-assigned this Mar 1, 2019
@Korablev77 Korablev77 added the bug Something isn't working label Mar 1, 2019
@kyukhin kyukhin added this to the 2.1.2 milestone Mar 4, 2019
Korablev77 added a commit that referenced this issue Mar 7, 2019
BLOB column type is represented by SCALAR field type in terms of NoSQL.
We attempted at emulating BLOB behaviour, but such efforts turn out to
be not decent enough. For this reason, we've decided to abandon these
attempts and fairly replace it with SCALAR column type.
SCALAR column type acts in the same way as it does in NoSQL: it is
aggregator-type for INTEGER, NUMBER and STRING types. So, column
declared with this type can contain values of these three (available in
SQL) types. It is worth mentioning that CAST operator in this case does
nothing.

Still, we consider BLOB values as entries encoded in msgpack
with MP_BIN format. To make this happen, values to be operated should be
represented in BLOB form x'...' (e.g. x'000000'). What is more, there
are two built-in functions returning BLOBs: randomblob() and zeroblob().
On the other hand, columns with STRING NoSQL type don't accept BLOB
values.

@TarantoolBot document
Title: SQL types changes
There are couple of recently introduced changes connected with SQL
types.

Firstly, we've removed support of DATE/TIME types from parser due to
confusing behaviour of these types: they were mapped to NUMBER NoSQL
type and have nothing in common with generally accepted DATE/TIME types
(like in other DBs). In addition, all built-in functions related to
these types (julianday(), date(), time(), datetime(), current_time(),
current_date() etc) are disabled until we reimplement TIME-like types as
a native NoSQL ones (see #3694 issue).

Secondly, we've removed CHAR type (i.e. alias to VARCHAR and TEXT). The
reason is that according to ANSI SQL CHAR(len) must accept only strings
featuring length exactly equal to given in type definition. Obviously,
now we don't provide such checks. Types VARCHAR and TEXT are still
legal.

For the same reason, we've removed NUMERIC and DECIMAL types, which were
aliases to NUMBER NoSQL type. REAL, FLOAT and DOUBLE are still exist as
aliases.

Finally, we've renamed BLOB column type to SCALAR. We've decided that
all our attempts to emulate BLOB behaviour using SCALAR NoSQL type don't
seem decent enough, i.e. without native NoSQL type BLOB there always
will be inconsistency, especially taking into account possible NoSQL-SQL
interactions. In SQL SCALAR type works exactly in the same way as in
NoSQL: it can store values of INTEGER, FLOAT and TEXT SQL types at the
same time. Also, with this change behaviour of CAST operator has been
slightly corrected: now cast to SCALAR doesn't affect type of value at
all. Couple of examples:

CREATE TABLE t1 (a SCALAR PRIMARY KEY);
INSERT INTO t1 VALUES ('1');
SELECT * FROM t1 WHERE a = 1;
[] -- Result is empty set since column "a" contains string literal value
'1', not integer value 1.

CAST(123 AS SCALAR); -- Returns 123 (integer)
CAST('abc' AS SCALAR); -- Returns 'abc' (string)

Note that in NoSQL values of BLOB type defined as ones decoded in
msgpack with MP_BIN format. In SQL there are still a few ways to force
this format: declaring literal in "BLOB" format (x'...') or using one of
two built-in functions (randomblob() and zeroblob()). TEXT and VARCHAR
SQL types don't accept BLOB values:

CREATE TABLE t (a TEXT PRIMARAY KEY);
INSERT INTO t VALUES (randomblob(5));
---
- error: 'Tuple field 1 type does not match one required by operation: expected string'
...

BLOB itself is going to be reimplemented in scope of #3650.

Closes #4019
Closes #4023
Korablev77 added a commit that referenced this issue Mar 7, 2019
BLOB column type is represented by SCALAR field type in terms of NoSQL.
We attempted at emulating BLOB behaviour, but such efforts turn out to
be not decent enough. For this reason, we've decided to abandon these
attempts and fairly replace it with SCALAR column type.  SCALAR column
type acts in the same way as it does in NoSQL: it is aggregator-type for
INTEGER, NUMBER and STRING types. So, column declared with this type can
contain values of these three (available in SQL) types. It is worth
mentioning that CAST operator in this case does nothing.

Still, we consider BLOB values as entries encoded in msgpack with MP_BIN
format. To make this happen, values to be operated should be represented
in BLOB form x'...' (e.g. x'000000'). What is more, there are two
built-in functions returning BLOBs: randomblob() and zeroblob().  On the
other hand, columns with STRING NoSQL type don't accept BLOB values.

Closes #4019
Closes #4023

@TarantoolBot document
Title: SQL types changes
There are couple of recently introduced changes connected with SQL
types.

Firstly, we've removed support of DATE/TIME types from parser due to
confusing behaviour of these types: they were mapped to NUMBER NoSQL
type and have nothing in common with generally accepted DATE/TIME types
(like in other DBs). In addition, all built-in functions related to
these types (julianday(), date(), time(), datetime(), current_time(),
current_date() etc) are disabled until we reimplement TIME-like types as
a native NoSQL ones (see #3694 issue).

Secondly, we've removed CHAR type (i.e. alias to VARCHAR and TEXT). The
reason is that according to ANSI SQL CHAR(len) must accept only strings
featuring length exactly equal to given in type definition. Obviously,
now we don't provide such checks. Types VARCHAR and TEXT are still
legal.

For the same reason, we've removed NUMERIC and DECIMAL types, which were
aliases to NUMBER NoSQL type. REAL, FLOAT and DOUBLE are still exist as
aliases.

Finally, we've renamed BLOB column type to SCALAR. We've decided that
all our attempts to emulate BLOB behaviour using SCALAR NoSQL type don't
seem decent enough, i.e. without native NoSQL type BLOB there always
will be inconsistency, especially taking into account possible NoSQL-SQL
interactions. In SQL SCALAR type works exactly in the same way as in
NoSQL: it can store values of INTEGER, FLOAT and TEXT SQL types at the
same time. Also, with this change behaviour of CAST operator has been
slightly corrected: now cast to SCALAR doesn't affect type of value at
all. Couple of examples:

CREATE TABLE t1 (a SCALAR PRIMARY KEY); INSERT INTO t1 VALUES ('1');
SELECT * FROM t1 WHERE a = 1; [] -- Result is empty set since column "a"
contains string literal value '1', not integer value 1.

CAST(123 AS SCALAR); -- Returns 123 (integer) CAST('abc' AS SCALAR); --
Returns 'abc' (string)

Note that in NoSQL values of BLOB type defined as ones decoded in
msgpack with MP_BIN format. In SQL there are still a few ways to force
this format: declaring literal in "BLOB" format (x'...') or using one of
two built-in functions (randomblob() and zeroblob()). TEXT and VARCHAR
SQL types don't accept BLOB values:

CREATE TABLE t (a TEXT PRIMARAY KEY);
INSERT INTO t VALUES (randomblob(5));
---
- error: 'Tuple field 1 type does not match one required: expected string'
...

BLOB itself is going to be reimplemented in scope of #3650.
Korablev77 added a commit that referenced this issue Mar 7, 2019
BLOB column type is represented by SCALAR field type in terms of NoSQL.
We attempted at emulating BLOB behaviour, but such efforts turn out to
be not decent enough. For this reason, we've decided to abandon these
attempts and fairly replace it with SCALAR column type.  SCALAR column
type acts in the same way as it does in NoSQL: it is aggregator-type for
INTEGER, NUMBER and STRING types. So, column declared with this type can
contain values of these three (available in SQL) types. It is worth
mentioning that CAST operator in this case does nothing.

Still, we consider BLOB values as entries encoded in msgpack with MP_BIN
format. To make this happen, values to be operated should be represented
in BLOB form x'...' (e.g. x'000000'). What is more, there are two
built-in functions returning BLOBs: randomblob() and zeroblob().  On the
other hand, columns with STRING NoSQL type don't accept BLOB values.

Closes #4019
Closes #4023

@TarantoolBot document
Title: SQL types changes
There are couple of recently introduced changes connected with SQL
types.

Firstly, we've removed support of DATE/TIME types from parser due to
confusing behaviour of these types: they were mapped to NUMBER NoSQL
type and have nothing in common with generally accepted DATE/TIME types
(like in other DBs). In addition, all built-in functions related to
these types (julianday(), date(), time(), datetime(), current_time(),
current_date() etc) are disabled until we reimplement TIME-like types as
a native NoSQL ones (see #3694 issue).

Secondly, we've removed CHAR type (i.e. alias to VARCHAR and TEXT). The
reason is that according to ANSI SQL CHAR(len) must accept only strings
featuring length exactly equal to given in type definition. Obviously,
now we don't provide such checks. Types VARCHAR and TEXT are still
legal.

For the same reason, we've removed NUMERIC and DECIMAL types, which were
aliases to NUMBER NoSQL type. REAL, FLOAT and DOUBLE are still exist as
aliases.

Finally, we've renamed BLOB column type to SCALAR. We've decided that
all our attempts to emulate BLOB behaviour using SCALAR NoSQL type don't
seem decent enough, i.e. without native NoSQL type BLOB there always
will be inconsistency, especially taking into account possible NoSQL-SQL
interactions. In SQL SCALAR type works exactly in the same way as in
NoSQL: it can store values of INTEGER, FLOAT and TEXT SQL types at the
same time. Also, with this change behaviour of CAST operator has been
slightly corrected: now cast to SCALAR doesn't affect type of value at
all. Couple of examples:

CREATE TABLE t1 (a SCALAR PRIMARY KEY);
INSERT INTO t1 VALUES ('1');
SELECT * FROM t1 WHERE a = 1;
-- []
Result is empty set since column "a" contains string literal value '1',
not integer value 1.

CAST(123 AS SCALAR); -- Returns 123 (integer)
CAST('abc' AS SCALAR); -- Returns 'abc' (string)

Note that in NoSQL values of BLOB type defined as ones decoded in
msgpack with MP_BIN format. In SQL there are still a few ways to force
this format: declaring literal in "BLOB" format (x'...') or using one of
two built-in functions (randomblob() and zeroblob()). TEXT and VARCHAR
SQL types don't accept BLOB values:

CREATE TABLE t (a TEXT PRIMARAY KEY);
INSERT INTO t VALUES (randomblob(5));
---
- error: 'Tuple field 1 type does not match one required: expected string'
...

BLOB itself is going to be reimplemented in scope of #3650.
Korablev77 added a commit that referenced this issue Mar 7, 2019
BLOB column type is represented by SCALAR field type in terms of NoSQL.
We attempted at emulating BLOB behaviour, but such efforts turn out to
be not decent enough. For this reason, we've decided to abandon these
attempts and fairly replace it with SCALAR column type.  SCALAR column
type acts in the same way as it does in NoSQL: it is aggregator-type for
INTEGER, NUMBER and STRING types. So, column declared with this type can
contain values of these three (available in SQL) types. It is worth
mentioning that CAST operator in this case does nothing.

Still, we consider BLOB values as entries encoded in msgpack with MP_BIN
format. To make this happen, values to be operated should be represented
in BLOB form x'...' (e.g. x'000000'). What is more, there are two
built-in functions returning BLOBs: randomblob() and zeroblob().  On the
other hand, columns with STRING NoSQL type don't accept BLOB values.

Closes #4019
Closes #4023

@TarantoolBot document
Title: SQL types changes
There are couple of recently introduced changes connected with SQL
types.

Firstly, we've removed support of DATE/TIME types from parser due to
confusing behaviour of these types: they were mapped to NUMBER NoSQL
type and have nothing in common with generally accepted DATE/TIME types
(like in other DBs). In addition, all built-in functions related to
these types (julianday(), date(), time(), datetime(), current_time(),
current_date() etc) are disabled until we reimplement TIME-like types as
a native NoSQL ones (see #3694 issue).

Secondly, we've removed CHAR type (i.e. alias to VARCHAR and TEXT). The
reason is that according to ANSI SQL CHAR(len) must accept only strings
featuring length exactly equal to given in type definition. Obviously,
now we don't provide such checks. Types VARCHAR and TEXT are still
legal.

For the same reason, we've removed NUMERIC and DECIMAL types, which were
aliases to NUMBER NoSQL type. REAL, FLOAT and DOUBLE are still exist as
aliases.

Finally, we've renamed BLOB column type to SCALAR. We've decided that
all our attempts to emulate BLOB behaviour using SCALAR NoSQL type don't
seem decent enough, i.e. without native NoSQL type BLOB there always
will be inconsistency, especially taking into account possible NoSQL-SQL
interactions. In SQL SCALAR type works exactly in the same way as in
NoSQL: it can store values of INTEGER, FLOAT and TEXT SQL types at the
same time. Also, with this change behaviour of CAST operator has been
slightly corrected: now cast to SCALAR doesn't affect type of value at
all. Couple of examples:

CREATE TABLE t1 (a SCALAR PRIMARY KEY);
INSERT INTO t1 VALUES ('1');
SELECT * FROM t1 WHERE a = 1;
-- []
Result is empty set since column "a" contains string literal value '1',
not integer value 1.

CAST(123 AS SCALAR); -- Returns 123 (integer)
CAST('abc' AS SCALAR); -- Returns 'abc' (string)

Note that in NoSQL values of BLOB type defined as ones decoded in
msgpack with MP_BIN format. In SQL there are still a few ways to force
this format: declaring literal in "BLOB" format (x'...') or using one of
two built-in functions (randomblob() and zeroblob()). TEXT and VARCHAR
SQL types don't accept BLOB values:

CREATE TABLE t (a TEXT PRIMARAY KEY);
INSERT INTO t VALUES (randomblob(5));
---
- error: 'Tuple field 1 type does not match one required: expected string'
...

BLOB itself is going to be reimplemented in scope of #3650.
@kyukhin kyukhin closed this as completed in 0d0f53a Mar 7, 2019
@kyukhin kyukhin added the tmp label Mar 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working sql static types sql
Projects
None yet
Development

No branches or pull requests

2 participants