Add SimpleTable construct based on Named Tuples #81
@lihaoyi pushed a queryable for arbitrary named tuples, and now only nested case classes need to extend SimpleTable.Source
I've been experimenting with augmenting However this is not possible as covariant so I think it will have to remain as is
latest commit renames Lift to MapOver, and removes some duplication
moved the named tuple queryable stuff to a dedicated file.
the queryables function should only cache factories that still require the mappers to be provided.
now I put everything available in one import
@lihaoyi I have updated the PR description with an explanation of choices made
Thanks, will take a look
@lihaoyi I have updated the PR description with changes to the build, documentation, and testing
here is a benchmark to compare against good to know anyway, perhaps there are tweaks that can be made before anyone immediately jumps to macros
Not surprising there's some performance penalty, but it's probably fine. ScalaSql isn't meant to be hyper-optimized and generally the bottleneck for database queries is on the actual database anyway, rather than in application code
Abstract
In this PR, introduce a new `com.lihaoyi::scalasql-namedtuples` module, supporting only Scala 3.7+, with three main contributions:
- `scalasql.namedtuples.SimpleTable` class as an alternative to `Table`. The pitch is that you can use a "basic" case class with no higher-kinded type parameters to represent your models.
- `scalasql.namedtuples.NamedTupleQueryable` - provides implicit `Queryable` so you can return a named tuple from a query.
- `scalasql.simple` package object, that re-exports the `scalasql` package, plus `SimpleTable` and `NamedTupleQueryable`
Example
Defining a table
return named tuples from queries
Design
This PR manages to introduce `SimpleTable` entirely without needing to change the core library. It is designed around leveraging the new Named Tuples and programmatic structural typing facilities introduced in Scala 3.7. No macros are needed.
It was also designed such that any query should be pretty much source compatible if you change from `Table` to `SimpleTable` (with the exception of dropping `[Sc]` type arguments).
Within a query, e.g. `City.select.map(c => ...)`, we still need `c` to be an object that has all the fields of `City`, but they need to be wrapped by either `scalasql.Expr[T]` or `scalasql.Column[T]`.
With `Table` this is done by the case class having an explicit type parameter (e.g. `City[T[_]](name: T[String]...)`), so you would just substitute the parameter. Of course with `SimpleTable` the main idea is that you do not declare this `T[_]` type parameter, but the `scalasql.query` package expects it to be there.
The solution in this PR is to represent the table row within queries by `Record[City, Expr]` (rather than `City[Expr]`). `Record[C, T[_]]` is a new class, and essentially a structurally typed tuple that extends `scala.Selectable` with a named tuple `Fields` type member, derived by mapping `T` over `NamedTuple.From[C]`. `Record` (and `SimpleTable`) still support using a nested case class field to share common columns (with a caveat*).
When you return a `Record[C, T]` from a query, you still need to get back a `C`, so `SimpleTable` provides an implicit `Queryable.Row[Record[C, Expr], C]`, which is generated by compile-time derivation (via `inline` methods).
Implementation
To make a simpler diff, `SimpleTable` is entirely defined in terms of `Table`, i.e. here is the signature:
The `metadata0` argument is expected to be generated automatically from an inline given in `SimpleTableMacros.scala` (I suggest renaming it to `SimpleTableDerivation.scala`).
`Table[V[_[_]]]`, being a higher kinded type, normally expects some `case class Foo[T[_]]`, and fills in `V[Expr]` or `V[Column]` in various places in queries, and `V[Sc]` for results. However for `SimpleTable`, when `T[_]` is `scalasql.Sc` we want to return `C`, and otherwise return this `Record[C, T]`, so `MapOver` needs to be a match type:
(`Tombstone` is used here to try and introduce a unique type that would never be used for any other purpose, i.e. be disjoint in the eyes of the match type resolver, and also so we can convince ourselves that if `T` returns `Tombstone` it is probably the identity and not some accident.)
See #83 for another approach that removes the `V[_[_]]` from `Table`, `Insert` and various other places.
Design of Record
`Record[C, T[_]]` is implemented as a structural type that tries to wrap the fields of `C` in `T`. It has a few design constraints:
- When `C` has a field of type `X` that is a nested Table, the corresponding field in `Record[C, T]` must also be `Record[X, T]`.
- […] (`Expr` or `Column`) to wrap fields in from the outer level.
First decision: `Record` uses a `Fields` type member for structural selection, rather than the traditional type refinements.
Why:
- […] `Array` rather than a hash map,
- `Fields` derived via `NamedTuple.From[C]` is treated as part of the class implementation; this means you never get a huge refinement type showing up whenever you hover in the IDE.
Second decision: how to decide which fields are "scalar" data and which are nested records.
Constraints:
- […] `Table`, the only evidence that a field of type `X` is a nested table is implicit evidence of type `Table.ImplicitMetadata[X]`
- […] (`scala.compiletime` intrinsic) that can tell you if there exists an implicit of type `X`.
Choices:
- […] `foo: Ref[Foo]`, unclear how much this would be intrusive at each use-site
- […] (`SimpleTable.Nested`) that the nested case class should extend - this does however prevent using "third party" classes as a nested table
The implicit derivation of Metadata also enforces that whenever an implicit metadata is discovered for use as a field, the class must extend `SimpleTable.Nested`.
Alternatives
Why is there `Record[C, T]` and not `ExprRecord[C]` or `ColumnRecord[C]` classes?
This was explored in #83, which requires a large change to the `scalasql.query` package, i.e. a new type hierarchy for `Table` (but makes the boundary between read-only queries, column updates, and results more explicit in types). It's also unclear if it relies upon "hacks" to work.
Why use `Record[C, T]` and not named tuples in queries?
What is needed to get rid of `SimpleTable.Nested`?
Let's remind ourselves of the current definition of `SimpleTable`:
First thing - we determined that the transitive closure of available implicit `SimpleTable.GivenMetadata[Foo]` needs to be added as an argument to `Record`.
In #82 we explored this by just precomputing all the field types ahead of time in a macro, so the types would look a bit like `Record[City, Expr, (id: Expr[Long], name: Expr[String], nested: (fooId: Expr[Long], ...))]`, which was very verbose.
An alternative could be to pass as a type parameter the classes which have a metadata defined. Something like `Record[City, Expr, Foo | Bar]` or `Record[Foo, Expr, Empty.type]`, and modify the `Record` class as such:
This could be a sweet spot between verbosity and extensibility to "uncontrolled" third party classes - but it is uncertain who in reality would be blocked by needing to extend `SimpleTable.Nested`. It is also still to be determined what the impact on compilation times would be, and where best to compute this type without causing explosions of implicit searches.
You can see a prototype here: bishabosha/scalasql#table-named-tuples-infer-nested-tables
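For readers who want to see the two type-level pieces discussed above in isolation, here is a minimal standalone sketch, assuming Scala 3.7+. It is not the code in this PR: `Sc` and `Expr` are hypothetical stand-ins for the scalasql types, and `Record` here stores values in two plain arrays purely for illustration.

```scala
import scala.NamedTuple

// Hypothetical stand-ins for scalasql.Sc and scalasql.Expr, so the sketch compiles on its own.
type Sc[T] = T
final case class Expr[T](sql: String)

// A type used for nothing else, so the match type below can recognise the identity wrapper.
final class Tombstone

// Structurally typed row: `Fields` tells the compiler the field names and types,
// while the values live in a plain array and are looked up by name at runtime.
final class Record[C, T[_]](names: Array[String], values: Array[Any]) extends Selectable:
  type Fields = NamedTuple.Map[NamedTuple.From[C], T]
  def selectDynamic(name: String): Any = values(names.indexOf(name))

// `T[Tombstone]` reduces to `Tombstone` only when `T` is the identity (Sc);
// any other wrapper falls through to the Record case.
type MapOver[C, T[_]] = T[Tombstone] match
  case Tombstone => C
  case _         => Record[C, T]

case class City(id: Long, name: String)

val cityRow = Record[City, Expr](
  Array("id", "name"),
  Array[Any](Expr[Long]("city0.id"), Expr[String]("city0.name"))
)

// Field selection is typed through `Fields`, then dispatched to `selectDynamic`.
val nameExpr: Expr[String] = cityRow.name
// Inside a query the row is a Record; as a result it is the plain case class.
val inQuery: MapOver[City, Expr] = cityRow
val asResult: MapOver[City, Sc] = City(1L, "Zurich")
```

The point of the sketch is only that `cityRow.name` is typed as `Expr[String]` without `City` declaring a `T[_]` parameter, and that `MapOver` collapses to `City` when the wrapper is `Sc`.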
Build changes
- introduce top level `scalasql-namedtuples` module
  - `com.lihaoyi:scalasql-namedtuples_3`
  - `3.7.0`
  - `scalasql/namedtuples`
  - `scalasql("3.6.2")` - so that it can re-export all of scalasql from the `scalasql.simple` package object
- Also declare `scalasql-namedtuples.test` module
  - `scalasql/namedtuples/test`
  - `scalasql("3.6.2").test`, so the custom test framework can be used to capture test results.
Testing changes
The main approach to testing was to copy test sources that already exist, and convert them to use SimpleTable with otherwise no other changes.
Assumptions made when copying:
- […] `scalasql` tests are testing the query translation to SQL, rather than specifically the implementation of Table.Metadata generated by macros.
- […] `SimpleTable` and `Table` are the signatures of implicits available, and the implementation of `Table.Metadata`, the test coverage for `SimpleTable` should focus on type checking, and that the fundamentals of TableMetadata are implemented correctly in a "round trip".
- […] `scalasql/test/src/ExampleTests.scala`, `scalasql/test/src/datatypes/DataTypesTests.scala` and `scalasql/test/src/datatypes/OptionalTests.scala`, renaming the traits and switching from `Table` to `SimpleTable`, otherwise unchanged.
- […] `scalasql/test/src/ConcreteTestSuites.scala` to `scalasql/namedtuples/test/src/SimpleTableConcreteTestSuites.scala`, commenting out most objects except `OptionalTests` and `DataTypesTests`, which now extend the duplicated and renamed suites. I also renamed the package to `scalasql.namedtuples`
- […] `scalasql/test/src/WorldSqlTests.scala` (to `scalasql/namedtuples/test/src/example/WorldSqlTestsNamedTuple.scala`) to ensure that every example in `tutorial.md` compiles after switching to `SimpleTable`, and also to provide snippets I will include in the `tutorial.md`.
- […] `OptionalTests.scala` and `DataTypesTests.scala` so that they would generate unique names that can be included in `reference.md`.
New tests:
- […] `SimpleTableH2Example` tests
- […] `scalasql/namedtuples/test/src/datatypes/LargeObjectTest.scala` to stress test the compiler for large sized classes.
- […] `scalasql/namedtuples/test/src/example/foo.scala` for quick testing of compilation, typechecking etc.
- […] `copy` method with `Record#updates` in `SimpleTableOptionalTests.scala`
Documentation changes
`tutorial.md` and `reference.md` are generated from scala source files and test results in `docs/generateDocs.mill`.
I decided that rather than duplicate both `tutorial.md` and `reference.md` for `SimpleTable`, it would be better to avoid duplication, or potential drift, by reusing the original documents, but include specific notes when use of `SimpleTable` or `NamedTupleQueryable` adds new functionality or requires different code.
tutorial.md
To update `tutorial.md` I wrote the new text as usual in `WorldSqlTests.scala`. These texts exclusively talk about differences between the two approaches, such as declaring case classes, returning named tuples, or using the `updates` method on record. To support the new texts, I needed to include code snippets. But like in `WorldSqlTests.scala` I would prefer the snippets to be verified in a test suite. So the plan was to copy `WorldSqlTests.scala` to a new file, update the examples to use `SimpleTable` and include snippets from there.
To support including snippets from another file I updated the `generateTutorial` task in `docs/generateDocs.mill`. The change was that if the scanner sees a line `// +INCLUDE SNIPPET [FOO] somefile` in `WorldSqlTests.scala`, then it switches to reading the lines from `somefile`, looking for the first line containing `// +SNIPPET [FOO]`, then splices all lines of `somefile` until it reaches a line containing `// -SNIPPET [FOO]`, then it switches back to reading the lines in `WorldSqlTests.scala`.
The main idea is that snippets within `somefile` should be declared in the same order that they are included from `WorldSqlTests.scala`, meaning that the scanner traverses both files from top to bottom once (beginning from the previous position whenever switching back).
So to declare the snippets as mentioned above I copied `WorldSqlTests.scala` to `scalasql/namedtuples/test/src/example/WorldSqlTestsNamedTuple.scala`, replaced `Table` by `SimpleTable` and declared in there the snippets I wanted (and included them from `WorldSqlTests.scala`).
Any other changes (e.g. newlines, indentation etc) are likely due to updating scalafmt.
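The snippet-inclusion scan described above can be sketched roughly as follows. This is an illustration rather than the actual `generateTutorial` code: `spliceSnippets` and `readLines` are hypothetical names, file access is abstracted as a function, and the sketch assumes the marker lines themselves are not copied into the output.

```scala
import scala.collection.mutable

// Hypothetical helper: `readLines` stands in for reading a file from disk.
def spliceSnippets(mainLines: Seq[String], readLines: String => Seq[String]): Seq[String] =
  val include = """.*// \+INCLUDE SNIPPET \[(\w+)\] (\S+).*""".r
  // Per included file, remember where the previous snippet ended, so that each
  // file is traversed top to bottom only once.
  val cursors = mutable.Map.empty[String, Int].withDefaultValue(0)
  mainLines.flatMap {
    case include(tag, file) =>
      val lines = readLines(file)
      val start = lines.indexWhere(_.contains(s"// +SNIPPET [$tag]"), cursors(file))
      val end = lines.indexWhere(_.contains(s"// -SNIPPET [$tag]"), start + 1)
      require(start >= 0 && end > start, s"snippet [$tag] not found in $file")
      cursors(file) = end + 1
      // Splice only the lines between the two markers (assumption: markers excluded).
      lines.slice(start + 1, end)
    case line => Seq(line)
  }
```

Because the cursor only moves forward, a snippet declared earlier in the included file than one already consumed would not be found, which matches the requirement that snippets are declared in the order they are included.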
reference.md
This file is generated by the `generateReference` task in `docs/generateDocs.mill`. It works by formatting the data from `out/recordedTests.json` (captured by running tests with a custom framework) and grouping tests by the suite they occur in.
Like with `tutorial.md` I thought it best to only add extra snippets that highlight the differences between the two kinds of table.
So first, to capture the output of simple table tests, in the build I set the `SCALASQL_RECORDED_TESTS_NAME` and `SCALASQL_RECORDED_SUITE_DESCRIPTIONS_NAME` environment variables in the `scalasql-namedtuples.test` module: in this case `recordedTestsNT.json` and `out/recordedSuiteDescriptionsNT.json`.
Next I updated the `generateReference` task so that it also includes the recorded outputs from `recordedTestsNT.json`. This task handles grouping of tests and removing duplicates (e.g. the `mysql`, `h2` variants). I made it so that for each suite, e.g. `DataTypes`, it finds the equivalent suite in the simple table results, and then only includes the test names it hadn't seen, at the end of that suite.
So to include any test result from `SimpleTableDataTypesTests.scala` or `SimpleTableOptionalTests.scala`, it is only necessary to rename an individual test, and it will be appended to the bottom of the relevant group in `reference.md`. For this PR I did this by adding a `- with SimpleTable` suffix to relevant tests (i.e. the demonstration of nested classes, and the usage of the `Record#updates` method).
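The "append only unseen test names" rule can be illustrated with a small sketch. It is not the actual `generateReference` code: recorded tests are modelled here as plain `(suite, testName)` pairs, and `baseSuiteOf` is a hypothetical function that maps a simple table suite to its original suite (how the real task finds the equivalent suite is not shown in this description).

```scala
// Hypothetical model: each recorded test is a (suite, testName) pair.
def mergeTestNames(
    base: Seq[(String, String)],
    simple: Seq[(String, String)],
    baseSuiteOf: String => String // assumed mapping from a SimpleTable suite to its original
): Seq[(String, Seq[String])] =
  // Group the simple table test names under the original suite they correspond to.
  val simpleBySuite = simple.groupMap(t => baseSuiteOf(t._1))(_._2)
  base.map(_._1).distinct.map { suite =>
    // Names already recorded for this suite, de-duplicated (e.g. across database variants).
    val own = base.collect { case (`suite`, name) => name }.distinct
    // Only names not seen in the original suite are appended, at the end.
    val extra = simpleBySuite.getOrElse(suite, Nil).distinct.filterNot(own.contains)
    suite -> (own ++ extra)
  }
```

With this rule, a test copied unchanged from the original suite contributes nothing new, while a test renamed with the `- with SimpleTable` suffix shows up at the bottom of its group.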