Q1 2024 Release
This is our quarterly roll-up release that contains many bug fixes, features, and performance improvements. Reminder that interfaces are not guaranteed to settle until 1.0.
This release contains a breaking change for setting up an in-memory database server, e.g. in testing. The in-memory database now requires its own session implementation, and the server must be supplied with a compatible session builder to prevent panics. See the _examples
directory for an example.
Merged PRs
go-mysql-server
- 2350: [planbuilder] Limit allows unsigned ints
Parameterizing limits in stored procedures exposed a bug where we weren't accepting unsigned ints in LIMIT/OFFSET. This is only possible in prepared or stored procedure contexts afaik. - 2348: Optional integrator schema validation
Move table schema validation so that we validate the accumulated result rather than sequential alters. - 2347: implement
json_depth()
MySQL Docs:
https://dev.mysql.com/doc/refman/8.0/en/json-attribute-functions.html#function_json-depth - 2346: fix null and empty paths for json functions
- 2345: Return correct MySQL error code when inserting into nonexistent columns.
Fixes #2344
This PR also adds a test framework for testing MySQL error codes, since this doesn't appear to currently be tested. This should make adding tests for other error codes easy.
Some of the MySQL error codes that I expected us to use (such as1050: ERTableExists
) have zero usages in GMS or Dolt. We're probably returning1105: ERUnknownError
for these. - 2343: fix json bool comparison
JSON Boolean values are special; they do not behave like normal when converted/compared against other types.
fixes dolthub/dolt#7528 - 2342: Add row length guards
Tests and basic engine-side guards similar to mysql's behavior.
re: dolthub/dolt#7524 - 2339: only use
crosshashjoin
on subqueryaliases
This fixes 2 sqllogictests. - 2336: fix rangetree removal of largest node without left child
When removing a right child, a missing nil guard was causing us to improperly propagate the new max range.
This resulted in overlapping ranges.
This fixes 1 sqllogictest. - 2335: Schema pruning considers database name
Fix some inconsistencies with project and join hints. New test added for new project pruning changes. - 2334: fix char to bool conversion
This PR adds missingnil
guards in a few conversion methods.
fixes dolthub/dolt#7515 - 2333: [planbuilder] Skip post-aggregate projection when there are no aliases in target scope
re: dolthub/dolt#6982
This takes the last big chunk of CPU pressure off of TPC-C execution, which is mostly disk limited now:
- 2332: [time] shortcut for no timezone conversion
This function reads files from disk to try to convert a timezone to itself. There are other optimizations we could do to maybe try checking for the integer offset before doing syscalls.
re: dolthub/dolt#6982, ~10% TPC-C boost - 2331: fix
NOT
expression in conjunction withWHERE EXISTS(<subquery>)
The analyzer ruleunnest_exists_subqueries
was accidentally droppingNOT
expressions when hoisting subqueries fromWHERE EXISTS...
clauses.
This should fix 8 sqllogictests.
Correctness: dolthub/dolt#7510 - 2330: Fix
DISTINCT
overDECIMALS
There was another place where we were usinghashstructure
package, which does not hashdecimal.Decimal
types correctly.
Switched toxxhash
package, which is what we use everywhere else.
Reusing solution from: #2279
This will fix 1 sqllogictest. - 2329:
expression.Div
Micro Benchmarks
There are going to be changes to our division behavior that impact both its accuracy and speed.
This PR adds benchmarks to track the runtime improvements/degradations - 2327: memo/planbuilder: Off by one error, and issue importing statistics for testing
The original bug was a one-off error whereconjCollector
was checking for the wrong equivalence column. That spiraled into:- statistics were not properly being used for some tests because PRIMARY case sensitivity in
ANALYZE ... UPDATE HISTOGRAM
statements - one integration test had a weird edge case where OSX and linux compute slightly differently float values, leading to different plans locally vs CI. One option was to truncate the specificity of all cost values with a multiply/round to avoid numerical weirdness...picking the first lowest plan also seems to work.
- variety of other small bugs in costing
- update join bug where the projections after a rearrangement lost expression id/nullability info. This was exposed by the changes above
- statistics were not properly being used for some tests because PRIMARY case sensitivity in
- 2326: Various Decimal Type Fixes
This PR has a variety of fixes to have arithmetic operations (especially those involving decimals) behave more like MySQL.
The logic for theType()
method forArthmetic
andDiv
is simpler, and better tested.
When comparing Decimal results from division operations, MySQL has an internal Scale that is different than the Scale used when returning Decimal results for display.
Here's a matrix displaying the resulting scale:
(Ex:1 / 3 = 0.333333333
; scale 0 div scale 0 should return scale 9)
Additionally, this PR adds test for arithmetic operations over Datetime and Year types. There are still a some problems dealing with precision and parsing there...
Note: I believe the division optimization where float division is preferred over decimal division for internal calculations may be causing problems. More testing is needed to see if it's possible to enable this without causing inaccuracies/precision loss.
There are microbenchmarks measuring the performance of div expression, and it turns out these changes actually greatly improve the runtime.
Correctness: dolthub/dolt#7484
Fixes - 2323: SQLEXCEPTION proc handler
We originally supported one type of procedure handler,NOT FOUND
, which explicitly checked for an error when fetching from a cursor io.EOFs. The implementation for that handler would walk the entire BEGIN/END scope stack inside the Fetch call looking for a handler, execute the handler body, and then embed the scope height into a special return error. The error walked back up the callstack looking for the BEGIN/END block embedded in the error message. - 2319: Add net.Listener as server config option
Presently go-mysql-server is served by anet.Listener
backed by a real port or socket. In some environments (like testing) we want to avoid the host's networking stack entirely. This changeset adds support for the caller to provide thenet.Listener
, which gives them full control over where the sever serves.
This opens the door for some cool workflows. For example, you can run the server using a buffer-basednet.Listener
implementation (which is what happens in the test that I added).
I landed on this solution while thinking through #2314 - 2317: support
SELECT ... INTO OUTFILE/DUMPFILE ...
This adds support for MySQL'sSELECT ... INTO OUTFILE/DUMPFILE ...
feature.
It is the complement toLOAD DATA
. There is noLOCAL
option, so files created using this feature are on the server.
This PR adds a custom TestSuite for testing these files, as it needs to write, read, and delete files.
syntax: dolthub/vitess#311
fixes dolthub/dolt#7453 - 2316: Allow primary keys with
auto_increment
columns to be dropped when an appropriate index exists
Related to dolthub/dolt#7456 - 2315: Stored procedures can use params as LIMIT,OFFSET
fixes: dolthub/dolt#7458
Fixes procedure param types in the process.
Testing against mysql it doesn't seem like subqueries or regular column types are valid as LIMIT values in any case other than procedure parameters. I still need to test whether trigger columns can be used in LIMIT,OFFSET, but that seems like a separate problem. - 2312: Added the ExtendedType interface
This adds theExtendedType
interface, which is used within DoltgreSQL to implement PostgreSQL types, as well as in Dolt to properly handle the new type and value serialization.
Related PRs: - 2311: fix
group by ... having ...
aliasing once more
This PR simplifies theGROUP BY
andHAVING
aliasing rules to much more closely match MySQL's weirdness.
TODO: Go through various group by aliasing issues and link fixed ones here
correctness: dolthub/dolt#7455 - 2310: Implement
COM_RESET_CONNECTION
Implements theCOM_RESET_CONNECTION
command to allow a sql-server to clear session state on a connection so that it can be safely reused, for example, in connection pools. - 2308: fix decimal precision for division
divScale
tracks the number of division operators on the leftSubtree. It can be used to determine if the current expression is the outermost Div expression, and doubles as a way to determine how many decimal places of precision we need to add.
opScale
tracks the total number of Arithmetic operators in the expression tree in total. It is used to identify the outermost Arithmetic expression.
There were a few places where we were mixing updivScale
andopScale
, leading to improper rounding; this PR fixes that.
As a result, we are able to properly round decimal results when the outermost operator is not aDiv
.
Additionally, this PR also contains a fix for functions that returnfloat64
being incorrectly converted todecimal
.
correctness: dolthub/dolt#7442
fixes dolthub/dolt#4931 - 2307: Persist table comment from
create table
We parse table comments, but they were not persisted anywhere. This change adds support for persisting table comments so that they can be round-tripped betweencreate table
andshow create table
.
There are still several limitations with table comments that we should eventually follow up on:- Table comments containing any single quotes are not supported – Vitess parses the table options, but then strips any quotes and replaces them with single quotes, then pushes all table options back into a single string that GMS parses again, using regexes to pull out charset, collation, and comment. If a table comment is double quoted and contains single quotes in the string, Vitess doesn't currently preserve enough information for us to reparse that correctly in GMS. The right fix for this is to change Vitess to return table options in a structure, instead of as a single string.
ALTER TABLE
does not support altering a table's comment.
Related to dolthub/dolt#7416
- 2306: Bug fix: honor database qualifier on view name for
create view
statements
Related to dolthub/dolt#7438 - 2305: [fixidx] Table function expression ids
Simplify and fixplan.TableAlias
indexing. Some table alias children have their own expression ids, butsql.TableFunction
implementations don't necessarily extend theplan.TableIdNode
interface and rely on the analyzer to populate column ids. There are a couple ways to simplify this in the future, like adding an intermediate prunable sql.Projector node for table functions, or having pruning clean up after itself by updating node and parent table alias columns.
TODO: this case is kind of hard to test, but trying to come up with something. - 2304: implementing
json_keys()
function
This PR adds support forJSON_KEYS()
.
MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/json-search-functions.html#function_json-keys - 2303:
float64
is larger type than decimal forcoalesce()
function
This causes 937 failures in sqllogictests - 2302: adding
@@server_uuid
system variable
This PR adds support for the@@server_uuid
system variable.
However, this just generates a random UUID on server start, and does not persist it to a file.
MySQL Docs:
https://dev.mysql.com/doc/refman/8.0/en/replication-options.html#sysvar_server_uuid
Fixes dolthub/dolt#7431 - 2301: fix coalesce
type()
Although only the first argument is returned, coalesce still examines the types of all the arguments passed in, and uses the "largest" type. - 2300: Fix External Index Creation
Fixes regression introduced by 9b9301f - 2298: fix outermost division placement
MySQL round appends 4 places of precision to their division operators, but only on the very last division operation.
All inner divisions need to preserve as much precision as possible. To track this, we have adivScale
variable in.
We also have anopScale
to track if an operation is the outermost operator, for other rounding/precision reasons.
We were getting these values mixed up, leading to results with incorrect precision. - 2295: allow tables with self referential foreign keys to be dropped
Our logic for preventingDROP
s on a table if that table was referenced in aFOREIGN KEY
did not account for when the table was self referencing. This PR fixes that and closes the gap in our testing for this area.
fixes dolthub/dolt#7418 - 2294: match range key with range type
Creating lookups when comparing columns of different types were causing problems.
Especially, if one of the types is a string.
This is not a perfect fix as we don't do truncation yet, but it will stop panics.
Reused solution from this PR:
#2177
It fixes the test case in this issue, but a skipped test is added for missing truncation functionality.
dolthub/dolt#7371 - 2292: Various aggregation/indexing fixes
re: #2271
Use expression ids to fix expression indexes most of the time. This makes it easier to resolve definitions once in planbuilder, rather than having to redo all that work a second time duringassignExecIndexes
. This should be more reliable for most queries and make it easier to make alias resolving refactors.
Many expressions now implementsql.IdExpression
and absorb unique expression ids. Trigger variables and stored procedure params still lack ids and fallback to string matching behavior. - 2291: backquote column names in
default
andgenerated
column expressions
To prevent parsing errors for special column names, we need to wrap column identifiers with backquotes.
MySQL prints column identifiers indefault
andgenerated
expressions when doingshow create table ...
with backquotes anyways.
fixes dolthub/dolt#7388 - 2288: Add more join statistic tests, some bug fixes
- 2286: Earlier LOAD DATA and insert validation
re: #2283
fixes: dolthub/dolt#7313 - 2285: Avoid generating SemiJoins when the subquery contains an OFFSET expression, as doing say may generate incorrect results.
- 2281: groupby and having getfield index fixes
Implementing more groupby and having strangeness from MySQL, and fixing some getfield indexing errors.
Aliases in the projection scope were always added as new columns in the having scope; this led to getfields with indexes that were invalid in aggregation functions. Now, we don't add the alias to the scope if the alias overlaps with a column that is used in an aggregation.
There is also a small fix to rewriting planbuilder tests to have proper tabbing, write skipped queries, and removed some random new lines.
There are still some issues, left as skipped tests:- We incorrectly disallow aliases from the projection scope in aggregations
- Columns that are aliased and used in group by should not be visible to having clauses; so, we should throw an error.
Correctness: dolthub/dolt#7382
- 2279: fix count distinct with decimals
Swapped out the library used in the CountDistinct aggregation function, as it would hash decimals to the same value.
Correctness: dolthub/dolt#7374 - 2278: RangeHeapJoin should consistently sort NULL values before non-NULL values while managing its heap.
Fixes dolthub/dolt#7260
This was ultimately caused by #1903. I didn't think it was possible for that issue to cause user-facing problems, but I was wrong. Because of that issue, RangeHeapJoins considered all NULL values in its children iterators to come after all non-NULL values. However, if the child node was an index, then the child iterator would order its rows with the NULL values first. This causes the RangeHeapIterator to mismanage the heap and skip rows that should have been in the results.
I updated the range heap code to manually check for NULL values when manipulating the heap. I also updated the plan tests to include NULL values in the test tables, which should now catch this issue. - 2274: Fixup index selection when prefix not complete
Consider a querySELECT * from t where b = 1 and c = 1
and two indexes,(a,b,c)
and(b,c)
. We want to use the(b,c)
index as a lookup, because(a,b,c)
will be disjoint on a(b,c)
key. This PR fixes index costing to record and prefer non-zero prefix matches. We only differentiate zero and non-zero cases here because it is easier and I think pretty reliable. - 2273: Add support for the
ALTER USER
statement to change account passwords
Adds basic support for theALTER USER
statement, so that users can change passwords. All users are allowed to change their own password; changing another user's password requires theCREATE USER
privilege or theUPDATE
privilege on themysql
database.
MySQL reference docs forALTER USER
Related to: dolthub/dolt#7348 - 2267: Fix
HASH IN <tuple>
when comparing against collated string types
This PR changes theHASH IN
to always use a collation to hash the values in the case that we compare a numerical type against a collated string type.
fixes dolthub/dolt#7338 - 2266: Field response packets MySQL consistency improvements
While debugging a customer issue, I noticed that Dolt/GMS has several details in the field metadata packets that differ from MySQL's behavior. This PR fixes the following differences:- flags are now populated for nullability, unsigned types, blobs
- column names for projected system variables now only include the variable scope when it was explicitly specified in the query
@@max_allowed_packet
should be an unsigned integer type
Vitess PR: dolthub/vitess#302
- 2264: implement
octet_length()
function
This PR implementsOCTET_LENGTH()
, which is just a synonym forLENGTH()
MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_octet-length - 2263: Insert VALUES statements with subqueries index correctly
fixes: dolthub/dolt#7322 - 2262: implement
elt()
andfield()
function
This PR implements theELT()
function, which just returns the nth argument as a string.
This PR also implements theFIELD()
function, which returns the first case-insensitive match to the first argument.
MySQL Docs: - 2261: implement
exp()
function
This PR adds support for theEXP()
function in MySQL.
MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_exp - 2260: implementing
bit_count()
function
This PR implements theBIT_COUNT()
function, which counts the number of 1s in the binary representation of the integer argument.
MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/bit-functions.html#function_bit-count - 2259: implement
atan2()
function and support second argument foratan
This PR adds support forATAN()
andATAN2()
, which just computes atan
MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_atan2 - 2258: implementing
quarter()
function
This PR adds support for theQUARTER()
function, which tells the quarter from the provided date.
MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_quarter - 2257: preserve original col name for subqueries
There are likely other places where we don't copy overoriginalCol
name when creating scope columns, but I can't tell which should be left as empty and which should be filled out.
fixes dolthub/dolt#7316 - 2256: implementing
ORD()
function
This PR adds support for theORD()
function in MySQL, which converts a unicode character into its numerical representation. - 2255: implementing
CHAR()
function
This PR implements most ofCHAR()
functionality. - 2254: implementing
pi()
function
Implements the PI() function, which just returns the value of PI
MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_pi - 2253: implement
SPACE()
function
This function outputs a string containing the number of space characters specified by its argument.
MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_space - 2251: fix order by optimization
- 2250: Add support for
SUBDATE()
Implements theSUBDATE()
built-in function and cleans up a few other spots.
Related to: dolthub/dolt#7277 - 2249: fix decimal out of bounds in case statement
Fixes dolthub/dolt#7079 - 2248: Use
DESCRIBE ANALYZE
/EXPLAIN ANALYZE
to display stats data about joins. - 2247: Merge joins populate join stats
- 2246: round IndexAccess for numerical strings over int columns
fixes dolthub/dolt#7261 - 2244: Feature:
SYSDATE()
function
Adds support for theSYSDATE()
function.
Related to: dolthub/dolt#7270 - 2243: Fix context warn panics
fixes: dolthub/dolt#7262 - 2239: Support for doltgres prepared statements
- 2238: [memo] RequiredIndex interface
Some table nodes need to be executed as IndexScans with mandatory filters. The new interface makes this transparent to join planning.
re: dolthub/dolt#7256 - 2236: trim floats when hashing
fmt.Sprintf("%v", x)
writes floats with ax.0
which causes it to never equal hashed strings.
Initially, I wanted to always convert the LHS to the type in the RHS, but this is difficult when there are multiple types in the RHS
fixes dolthub/dolt#7246 - 2235: Add source table and database to field metadata in wire responses
The field metadata messages we were sending back for a result set did not include the source table and source database for fields. This caused a behavior difference from MySQL where the table-qualified column name (e.g.table1.ID
) would not work with Dolt when using the MySQL Java Connector library. See dolthub/dolt#7247 for more details.
Updates to unit tests for theschemaToFields
code are in this PR, and I'll follow up with a PR in the Dolt repo that updates our MySQL Connector library integration tests to add coverage for the table-qualified column name. - 2234: use cached decimal type for casts
Fixes a variety (hopefully 49) of SQLLogicTests involvingCASTS(... AS DECIMAL)
- 2233: Add logging for
COM_RESET_CONNECTION
command
Adding debug logging to see when clients are sending this command, even though we don't implement it yet. For example, ORMs may send this command when returning a connection to a connection pool, so it may be helpful to implement this command and clear out session state in that case. - 2231: fix precision for
utc_timestamp
TheUTC_TIMESTAMP()
function should take in an argument and round the milliseconds. For now, we stick to always returning the full precision (6 places) - 2230: guard ctx and session with nil
fixes dolthub/dolt#7235 - 2228: fix type conversion in Between expressions
Replace the logic inBetween.Eval()
with a logically equivalentAND
statement to reuse the type conversion logic incomparison.go
fixes dolthub/dolt#7229 - 2227: Add
JsonIter
class for iterating over the key-value pairs of a JSON object.
This is the GMS side of automating JSON merging in Dolt: just some type aliases and a simple iterator for getting the keys in a JSON object in a deterministic order.
It's worth pointing out that currently Dolt stores JSON in a normalized form by sorting keys by length, but the iterator here uses a simple lexicographic order instead. This difference doesn't really matter at the moment because we unmarshall the entire object into a go map no matter what. But Dolt needs to be aware of the ordering used in order to correctly compute three-way diffs. - 2226: Error on NOW() eval with nil context
- 2224: empty string is 0.0 decimal
fix dolthub/dolt#7222 - 2223: Fix multi db views with overlapping name
fixes dolthub/dolt#7223 - 2221: fix decimal max precision check
- 2220: guard ctx functions with nil check
fixes dolthub/dolt#7216 - 2219: Implement fmt.Stringer for JsonDocument.
GMS side of an upcoming Dolt PR.
I'm improving error handling in schema merge tests by having the tests print the de-serialized values in incorrect tables instead of just their bytes. Since table values are passed around as aninterface{}
, it seems reasonable to have them implement Stringer. - 2218: implement
NOW()
siblings
This PR has our behavior surroundingNOW()
functions more closely match MySQL.
Changes:- Added
NOW()
synonyms to registry - Have
CURRENT_TIMESTAMP()
,LOCALTIME()
,LOCALTIMESTAMP()
all just callNOW()
- Support parsing synonyms in
DEFAULT
andON UPDATE
expressions - Fixed
SHOW CREATE TABLE
to printCURRENT_TIMESTAMP
forNOW()
and synonyms
- Added
- 2217: Fix decimal parsing bug
Decimals with capitalized exponential 'E' were incorrectly bound to literals, losing precision in some cases.select 5.0E-5; // -> 0.0001
- 2216: Use indexes and lookups for cardinality estimates
Cardinality estimation happens once after join reordering, indexing, and lookup join planning. The indexes and lookups help with result count estimates. - 2214: Fix wrongly written 'aribtrary' -> 'arbitrary'
Fix wrongly written word 'aribtrary' -> 'arbitrary' - 2212: Join estimation algos
Code and tests related to estimating output distributions for joins. Limited to numeric types. - 2210: sql/planbuilder: Update join defaults are not prefixed, need individual table schemas to resolve
Updates need to resolve default expressions. The input definitions for update joins include the total join output columns. If two of those columns have the same name, the target table's unqualified default expression will throw an "ambiguous column" error. We partition the update join schemas/column definitions to sidestep the error. - 2208: fix type evaluating for
mult
anddiv
arithmetic operations- For
Mult
expression, the.Type()
method is fixed that it returns correct type for decimal type results, which uses the sum of precision and scale values of its right and left values. - For
Div
expression, the.Type()
method is fixed that it returns the correct precision and scale for the final result.
- For
- 2207: Fix partial left join update when columns have default values or join table is in where clause
- 2206: guard
ctx.Span
again
#2203 again, because it was overwritten
fixes dolthub/dolt#7182 again - 2205: use entire lateral scope when indexing RangeHeapJoin
RangeHeapJoin
s looked at the lateral join scope when assigning indexes.
However, we never tested nested joins for this case, leading to dolthub/dolt#7177
What made the error more apparent was the string in the left scope that would result in empty results when doing lookups into the right tables. The fix was to look at the whole lateral scope when indexingRangeHeapJoin
s
fixes dolthub/dolt#7177 - 2204: fix foreign key panic on update
When resolving ForeignKeys,ForeignKeyReferenceHandler.CheckTable()
callsPartition()
and iterates over all rows in the referenced table to ensure that the key is valid (doesn't contain any values that are not in the referenced table).
WriteableIndexedDoltTable
prevents us from reading every value throughPartition()
as it is only supposed to return rows that match the index range. Additionally, there are some differences between the way MySQL and dolt handle certain ddl and dml operations during delayed resolution of Foreign Keys.
To fix this, we selectively perform table scans (only when we are creating/altering the table to resolve foreign keys).
Fixes dolthub/dolt#6963
Companion PR: dolthub/dolt#7203 - 2203: guard
ctx.Span
with nil check
We have places where we callexpression.Eval(nil, nil)
.
The null context causes a panic when we call Span on it.
This PR just adds a nil check inside the*context.Span()
receiver
fixes dolthub/dolt#7154 - 2200: Replace carder with stats
- 2199: remove delimiter before caching the prepared stmt
When using prepared statement, some create statements were being stored in the cache with delimiter at the end of it, which does not trimmed when executing the prepared statement. This causes consistency issue where in cases of storing a create statement, it will have the delimiter whereas running the same query without it being prepared statement trims before storing. - 2198: fix decimal/float conversion for
HashInTuple
expressions
fixes dolthub/dolt#7147 - 2197: prevent RangeJoinHeap plans over Filter nodes
Currently, we don't have proper handling of nodes that aren'tRangeHeap
s in theRangeJoinIter
.
This PR makes it so that we don't create plans that would have these nodes.
Additionally, we error instead of panicking when we see an invalid node in this area.
fixes dolthub/dolt#7139 - 2196: Prepend subquery scope to sql.TableFunction nodes
Most but not all table functions implement sql.Table. Table functions that do not implement sql.Table still need to return prepended rows to maintain indexing rules. - 2195: fix type promotion for in expressions
TODO: check type promotion for int -> float/decimal for all expressions
fixes dolthub/dolt#7120 - 2194: fix decimal result scale difference on server
This PR fixes:- converting the decimal type values to given decimal column type before inserting.
- getting precision and scale of final decimal type correctly.
- 2193: Set the
original_name
field in response metadata in addition to thename
field
A customer reported that the MySQL C++ Connector library was unable to retrieve column name information from a Dolt sql-server. After looking at the two wire captures between MySQL and Dolt, this is because the MySQL C++ Connector library pulls the column name from theoriginal_name
field, not from thename
field.
I've updated the unit tests that assert the expected response metadata fields are populated, and I'll follow up next with some changes in the Dolt repo to our C++ Connector library acceptance tests so that they use response metadata and assert that it is filled in.
After that, it would be good to proactively look at any other response metadata fields that we aren't setting. For example, theFlags
field seems important to fill in correctly for tooling to use. - 2191: Move index costing into join planning phase
Put index costing inside join planning, so that in the future join planning will have better cardinalities (statistics) for join ordering. Most of the changes will look like refactoring the way we expression index lookups in the memo. I attempted to do this in a way that makes as few changes as possible to join planning; the goal here is to set me up for rewriting cardinality checks with stats objects. It didn't go as cleanly as I wanted, I ended up shifting a lot of join plans back to lookup plans because HASH_JOIN was beating LOOKUP_JOIN in several key places.
One downside of the current PR is that it converts a sysbench MERGE_JOIN into a LOOKUP_JOIN. I would prefer fixing this in the next PR when I do a bigger costing overhaul.
Variety of fixes for join hinting, correctness, etc.
At some point we appeared to fix this:
#1893 - 2189: Upgraded xxhash to v2
- 2188: support
on update current_timestamp
This PR adds support for handling and executingON UPDATE
column expressions, which are triggered when an update is applied to any column in the row.
TODO:- we support
now()
for default, but not the syntax foron update
fixes dolthub/dolt#6030
- we support
- 2187: fix
round()
handling of scale, precision, and nulls
This PR hasROUND()
behavior match MySQL more closely specifically when handling NULLs.
Additionally, it refactors the function to no longer use custom logic, and rely ondecimal.Decimal
library for conversions. - 2186: Feature: Support
BLOB/TEXT
columns in unique indexes, without requiring a prefix length
AllowsTEXT
andBLOB
columns to be used in unique keys, without requiring that a prefix length be specified. This causes the secondary index to store a hash of the content, instead of the content itself, and then that hash is used to enforce the uniqueness constraint. This is useful to enforce uniqueness over very long fields without having to specify a threshold with a prefix length.
This feature is supported by MariaDB and PostgreSQL, but not by MySQL. A new SQL system variablestrict_mysql_compatibility
is also introduced in case customers want to opt-out of extensions like this and stick to the exact behavior of MySQL. The default value ofstrict_mysql_compatibility
isfalse
.
Unique secondary indexes using content-hashed fields have several restrictions, such as not being eligible for use in range scans or in any scans that require a specific order.
The GMS in-memory secondary index implementation takes a simple approach – it doesn't actually hash encode the content-hashed fields, and instead includes the full column value. This is consistent with how the GMS in-memory index implementation handles other features, such as prefix lengths, which are also a no-op and the full content is stored in the secondary index.
Dolt integration: dolthub/dolt#7104
Related to: dolthub/dolt#7040 - 2185: fix panic of concurrent map writes, when using in memory mode
Replaces #2179 - 2182: fix IN_SUBQUERY projection bugs
Correctness regression fix. With a bit more work this could probably be a smaller query:The CREATE panicked because the top-level projections get pushed into the source node, and my recent refactors failed to map projections onto the reported table output column sets.CREATE VIEW view_2_tab1_157 AS SELECT pk, col0 FROM tab1 WHERE NOT ((col0 IN (SELECT col3 FROM tab1 WHERE ((col0 IS NULL) OR col3 > 5 OR col3 <= 50 OR col1 < 83.11))) OR col0 > 75);
- 2181: Improve IN_SUBQUERY table disambiguation
When unnesting and IN_SUBQUERY into a parent scope with a table name clash, rename the child table and update its references to the new name. Prevent EXISTS subqueries from unnesting if it doesn't full decorrelate the child scope. - 2178: Improve handling of
charset
andcollate
in column options.
dolthub/vitess#293 should be merged before this. - 2177: Properly round IndexAccess Bounds for float/decimal type filters over integer columns
When generating indexes ranges, we don't convert the range bounds type to the index bound type until later on.
Additionally, when we do convert (specifically floats to ints) we round the floats based on the halfway point, leading to indexes skipping over rows depending on how it rounds.
This PR changes that to convert the types earlier (keeping the key type and index type consistent), and rounding floating point bounds correctly to not leave out any rows.
fixes dolthub/dolt#7072 - 2175: Fix existant typo
- 2174: Having aggregate alias bug
fixes: dolthub/dolt#7082 - 2171: Bug fixes for type handling in IF and COALESCE functions
- 2170: have
flush binary logs
be noop
fixes dolthub/dolt#7055 - 2169: Unique table and column ids
The motivation for this PR is making the costing/exploration phase consistent with the data structures for index costing. That means switching memoScalarExpr
back tosql.Expression
. Moving the previously join-specific ids inScalarExpr
to tables and column refs lets us preserve most memo data structures and join transformation logic. The alternative to combining these changes would be to rewrite all of the memo data structures and transformation logic to not use ids, which would be a fairly long and difficult process. - 2168: fix panic in math funcs
fixes dolthub/dolt#7060
Additionally, fixesPOW()
to not have the same panic and returns warnings instead of errors for certain inputs toLOG()
. - 2166: prevent panic on
nil
cast for time functions
fixes dolthub/dolt#7056 - 2165: fix
update <table> set <column> = default
This PR fixes a bug where attempting to update a column to its default would throw an unresolved error. - 2161: SHOW PLUGINS
fixes: dolthub/dolt#7049
depends on: dolthub/vitess#291 - 2160: fix
Inf
andNaN
for trigonometric functions
fixes dolthub/dolt#7046 - 2158: have
DayName()
returnNULL
on badDATETIME
conversions
fixes dolthub/dolt#7039 - 2157: don't round floats when converting to bool
fixes dolthub/dolt#7038 - 2156: Decorrelate IN_SUBQUERY refactor
- 2155: Allow
BLOB
/JSON
/TEXT
columns to have literal default values (MariaDB compatibility)
Technically, MySQL does NOT allowBLOB
/JSON
/TEXT
columns to have a literal default value, and requires them to be specified as an expression (i.e. wrapped in parens). We diverge from this behavior and allow it, for compatibility with MariaDB.
While testing with a binary literal, I noticed thatSQLVal
was converting that value to "BLOB
" instead of the actual binary content, so I fixed that one, too.
Related to: dolthub/dolt#7033
Dolt CI Checks: dolthub/dolt#7036 - 2154: Use max prec and scale for decimal oob
- 2153: null in-tuple bugs
fixes: dolthub/dolt#7025 - 2152: Update RangeTree MaxUpperBound when removing nodes
Verification code from in the last RangeTree related PR caught some bad ranges in the sqllogictests.
The cause was not properly updating the MaxUpperBound in nodes for the rangetree when performing a remove operation, which led to missing connections when pruning ranges.
This also prints MaxUpperBound in the String/DebugString methods for easier debugging. - 2151: add decimal type to convert functions
fixes dolthub/dolt#7018 - 2149: updating
RangeTree
nodeMaxUpperBound
again
A previous fix involved updating the MaxUpperBound in the RangeTree when traversing the right node, turns out we need to do that when creating a new node as well.
To better catch overlapping range expressions, we now verify that the resulting ranges do not overlap (an operation which isn't too expensive). This fixes some plans from an index refactor.
Additionally, this also fixes a skipped test where the ranges were not overlapping but different than the brute force approach. - 2148: fix for foreign key that references dropped table in
information_schema.referential_constraint
table - 2147: Fix mod bool conversion
fixes: dolthub/dolt#7006 - 2146: Update
MaxUpperBound
when inserting intoRangeTree
Recent changes to index costing exposed a bug inRangeTree
. This bug is responsible for a small regression in the sqllogictests, involving a complicated filter. - 2145: mysql server handler intercept support
Split from the PR /~https://github.com/dolthub/go-mysql-server/pull/2036.
Add mysql server handler intercept support. - 2144: Push filters insensitive to table name
Filter pushing bug that is specific to 1) table names with capital letters, and 2) filters that need to move through joins. The problem is not indexing specifically, but checking for an index is the easiest way to test this.
dolt bump: dolthub/dolt#7001 - 2142: Idx histogram manipulation
Add simple histogram mutators for filter types. Use histogram costs for index selection when available. Addedstats
docs. - 2141: Fixing field metadata for
JSON
andgeometry
types
JSON
andgeometry
types should always report a binary collation in MySQL's field metadata. While debugging dolthub/dolt#6970, I noticed that MySQL was sending a binary collation for these types, but GMS was sending back the default collation. - 2140: Respect
character_set_results
when emitting field metadata
For non-binary types, we need to respect the value for thecharacter_set_results
session var (when notNULL
) and use that for the field metadata returned in the MySQL wire protocol.
The unexpected charset/collation metadata is causing DataGrip to be unable to work with some types in the table editor ( see * 2138: Fixed SET working for invalid charsets/collations
Fices dolthub/dolt#6972 - 2137: fix nested subquery filter in
exists
We do not explore the children ofsubquery
when attempting to decorrelate filters forexists
queries; we now do this through the use ofsubquery.Correlated()
.
We should also avoid using uncacheable subqueries as keys forIndexLookups
.
fixes dolthub/dolt#6898 - 2136: Fix nil range correctness bug
- 2135: Resolve indexes of columns in CREATE TABLE statements early
- 2134: Costed index scan framework
This is the accumulation of the following PRs:
#2093
#2104
#2112
#2124 - 2133: Add query plans for index/join sysbench queries
- 2132: ReferenceChecker interface
re: dolthub/dolt#6957 It is expensive and unnecessary to deserialize blobs during FK reference check lookups. - 2131: Fixed Full-Text defaults
Fixes dolthub/dolt#6941 - 2130: More TPC-C tests, fix the slow HASH_JOIN
The randIO parameter for LOOKUP_JOIN costing was perhaps too strict, since that cost is already stacked on top of the sequential cost. This isn't a replacement for better costing, but boosts TPC-C perf a bit and isn't less correct than the previous version. - 2126: fix inner join filter pushdown, and alias matching
This PR fixes an issue we have with InnerJoins where the filter condition uses a function and references columns from both tables. We need to properly search the expressions in aSubqueryExpression
.
Additionally, addresses an getIdx bug involving unqualifying aliases.
partially fixes dolthub/dolt#6898 - 2125: Minimal index searchable interface
re: #2036 - 2121: fix panic when calling
ST_POINTFROMWKB()
with no arguments - 2120: Prevent virtual columns from being used in primary keys
Also tests for stored generated columns in primary keys - 2116: Grant Options privs need the AdminOnly treatment too
This addresses a gap discovered while writing dolt tests - Grant Option on procedures is not currently validated correctly, resulting in only super users being able to set grants on procedures. This should address that. - 2115: Fix ExistsSubquery with functions
OurhoistSelectExists
optimization incorrectly generatesSemiJoins
when there are OuterScope column references in projections in subqueries. In the future, a possible optimization could be to haveSemiLateralJoins
that properly grant this visibility.
Also contains small refactoring and extra debug information forcoalesce
function.
Fixes one of the queries here: dolthub/dolt#6898 - 2113: Allow ScriptTestAssertion to specify when an assertion needs a new session
As part of testing Dolt's reflog feature, I need to calldolt_gc()
and then check the reflog behavior. Becausedolt_gc()
invalidates the session, I needed to add this hook that signals the test framework to create a new session for a ScriptTestAssertion. - 2110: Procedure privledges
Adds support for respecting procedure and function permissions. Also added AdminOnly flag for external procedures to indicate that they should not have their privileges evaluated in the standard MySQL hierarchical way. This will allow us to tighten dolt procedures access.
The ability to grant access is still blocked behind an environment variable. That will remain until dolt changes have been released. - 2109: adding join and subquery tests
Convert many of the sqllogictests into enginetests for visibility - 2108: Push not filters
De Morgan's laws and leaf filter inversions to get NOT expressions as low in filter trees as possible. This will make index costing NOT filters easier. - 2102: view aliasing bug
TheView.ViewExpr
AST object seems to drop aliasing information that we depend on for query schema presentation. I want to circle back to have the view AST expression be consistent with the view definition, but this forces a re-parsing to get the correct view schema. - 2100: Refactor or the NewPrivilegedOperation method
Adding support for routines is coming, and and a bunch or string arguments is cumbersome.
Note to reviewer: start at sql/privileges.go! - 2099: Add the mysql.procs_privs table
In order to support procedure and function permissions, the mysql.procs_priv needs to be supported. This change add support, but gates the ability to create these grants because they are not being used for actual permission checks yet. That will come next.
I have engine tests on another branch due to the gate. I'll add them when the feature is complete. There are also bats tests in flight in the dolt repo which can be seen here:
dolthub/dolt@4d8fe2a...macneale4/privs_tests_wip - 2098: Logging improvements for prepared statements
Updating from mysql.Handler changes in 7da194ad69efa0e4cc3992f860ecb92550482d62 – adding debug logging forComPrepare
and including params count, other minor logging cleanup.
Related Vitess PR: dolthub/vitess#286 - 2095: give child SQA isLateral if parent SQA isLateral
If a parent SubqueryAlias is marked IsLateral, then mark its child SubqueryAliases with IsLateral as well.
This essentially gives the child SubqueryAlias visibility to the left subtree all the time, which fixes the linked issue.
We should be able to differentiate between different scopes of lateral joins and only grant visibility to those columns, which could be a subset of the parent/left columns. However, I don't think we are close to getting that working.
fixes dolthub/dolt#6843 - 2094: Add support for view column clause
re: dolthub/vitess#285
closes: dolthub/dolt#6859 - 2091: Non-ambiguous ORDER BY col should not error
- 2090: Allow queries with duplicate table names if those tables come from different databases and all references are unambiguously resolvable.
- 2088: fix off by one for
found_rows
whenlimit > count(*)
We had a bug where we could increment the limit counter before receiving the EOF error.
fixes dolthub/dolt#6829
companion pr: dolthub/vitess#283 - 2086: Server handling parsed statements
Initially this was going to be a bit more involved, as I was planning on having Dolt expose a new interface, and we'd directly pass in GMS ASTs rather than Vitess ASTs. The Dolt interface approach turned out to be a lot more involved than first anticipated, and the construction of GMS ASTs needs state that we will not have at higher layers, and exposing such state is also a lot more involved. Therefore, I've made a compromise by accepting Vitess ASTs instead, which makes this vastly simpler. It's not going to be quite as powerful, but I think it can still serve our purposes for the foreseeable future.
This basically works by hijacking that fact that we'll sometimes process Vitess ASTs via the prepared cache. If we receive a Vitess AST, then we skip the cache, otherwise we access the cache like the normal workflow. - 2085: small fixes for new test harness using server engine
This PR adds small fixes to server_engine that is used to test existing engine tests over running server. - 2084: Remove Dead Grant/Revoke code
- 2083: Virtual column index support
Implements index support for virtual columns, and fixes several bugs related to generated columns for certain statements.
Changes the memory table index implementation to actually store secondary indexes separately, rather than fake it via the primary index. The indexes are stored via sorted slices for now. I'll do another pass and replace them, and primary storage, with btrees for speed now that I have this proof of concept working.
Also introduces a new interface for rebuilding a single index, rather than doing a complete table rewrite every time an index is created. - 2081: Refactor JSON interfaces, stats interfaces
- 2080: drop sort when secondary index is available
- 2079: reverse filters for reverse lookups
fixes dolthub/dolt#6824 - 2078: fix panic when order by column is out of range
We used to blindly index ORDER BY values, which resulted in panics.
Now, we throw the appropriate error.
Additionally, this matches MySQL behavior when performing ORDER BY with indexes< 1
.ORDER BY 0
= errorORDER BY -1
= noop
- 2077: sql.StatsTable->RowCount returns whether the estimate is exact
Add return argument for whether the a RowCount can be a substitute for count(*). - 2076: pick float type if one side is non-number type
If one side or comparison is non-number type and the other side is number type, then convert to float to compare and the non-number type value can be float type. - 2075: Fix information_schema row count regression
If a table does not implement RowCount() we used to use 1000 as a default value for row costing. A recent refactor changed that to 0. This fixes information schema tables to report the 1000 value again, which is usually accurate for small databases because of default tables and columns. I also fixed some issues with database reporting for info schema tables.
This regression probably still exists for some dolt tables and table functions. I will do a pass and see if I can add some more accurate values on the Dolt side. - 2074: optimize
min(pk)
andmax(pk)
This PR adds an optimization to queries that have aMIN
orMAX
aggregation over aPRIMARY KEY
column.
Since indexes are already sorted (and we support a reverse iterator) we can look at the first/last row to answer queries in this form.
The new analyzer rule,replaceAgg
, converts queries of the formatselect max(pk) ... from ...
to the equivalentselect pk ... from ... order by pk limit 1
. Then, we depend on anreplacePkSort
to applyIndexedTableAccess
Additionally, this PR hasreplacePkSort
optimization apply to queries that have filters (specifically those that were pushed down to IndexedTableAccess)
There is also a some refactoring and tidying up.
Fixes dolthub/dolt#6793 - 2073: Bug fixes for explicit
DEFAULT
values in INSERT statements
Previously this was broken when DEFAULT values referred to other columns.
Fixes dolthub/dolt#6430
Needs more tests, but this fixes the immediate buggy behavior.
Also fixes a related bug in MySQL: https://bugs.mysql.com/bug.php?id=112708 - 2072: support
json_valid()
function
Add support forJSON_VALID(val)
function - 2071: Updates for Dolt stats
json_value
andjson_length
addedjson_table
edited to support json document inputs.- our custom json marshaller supports types that implement the
json.Marshaller
interface - increased recursive iter limit to 10,000 to more easily generate 3-level prolly trees for statistics testing
Note: thejson_value
notation is different than mysql's. I accept the type as a third parameter, rather than expecting a RETURNING clause.
- 2070: Add routines to PrivilegeSets
Added some Unit tests as well. This is in preparation for supporting routine grants. - 2068: Virtual column proof of concept
This needs a lot more tests, but this demonstrates the approach within can work well enough - 2067: adding sqllogictests
This PR adds some utility scripts to convert CRDB testing files into SQLLogicTest format.
Additionally, this adds tests focusing on join and subqueries.
Some notable tests are added as skipped enginetests. - 2065: Remove redundant information_schema creation in the driver example
In the driver's example,information_schema
is duplicated.
It is created in two places.- factory.Resolve()
- Catalog in Analyzer in Driver.OpenConnector()
There is no need to create it in thefactory.Resolve()
.
I checked databases using this:
diff --git a/driver/_example/main.go b/driver/_example/main.go index 34e0580ed..adbc1a249 100644 --- a/driver/_example/main.go +++ b/driver/_example/main.go @@ -35,6 +35,14 @@ func main() { rows, err := db.Query("SELECT * FROM mytable") must(err) dump(rows) + + rows, err = db.Query("SHOW DATABASES") + must(err) + for rows.Next() { + var db string + must(rows.Scan(&db)) + fmt.Println("db:", db) + } } func must(err error) {
- 2064: Implement fast join ordering for large joins that can be implemented as a series of lookups.
- 2063: Improve functional dependency analysis for joins.
This is a prerequisite for fixing dolthub/dolt#6713.
Basically, this PR does two things:- Adds additional bookkeeping to FDS in order to track "partial FDS keys", that is, column sets that determine some (but not all) of the other columns in a relation. Before this PR, we attempted to only track keys that determined the entire relation. This would cause us to lose some information and prohibit some optimizations. (There were also a couple of cases in nested joins where we accidentally added partial keys to FDS anyway (by adding a key from a child table to the FDS of the parent.) If we ever used these keys it could have caused correctness issues, but it doesn't look like we ever did.
- Improves the
simplifyCols
method in FDS to use partial keys in order to improve analysis.
Overall, these changes allow us to compute much better FDS keys for complicated joins, by allowing us to remember and reuse keys derived from child tables to improve the computed key for tables later in the join.
- 2062: "CREATE TABLE" fails with a "database not found" error when using the driver
- 2061: More histogram support, redo stats table
Reshape statistics interfaces to better support custom Dolt implementation.- Add
ANALYZE TABLE <table> [UPDATE/DROP] HISTOGRAM ON <column,...> USING <json blob>
support and a few tests - Replace use of
update information schema
cardinality updates with histogram updates - Default catalog has an in-memory histogram
- New memory histogram impl
- Delete old statistics table/histogram code
The only prod difference is that the new update path can overwrite the default table count, while it was a testing-only thing before.
companion: dolthub/vitess#279
Dolt bump seems OK
- Add
- 2060: Fixed various bugs in last_insert_id
Fixes dolthub/dolt#6776
Also adds better test coverage forLAST_INSERT_ID()
and theINSERT_ID
field in the ok response for updates. - 2059: fix db qualified column names in order by
We did not handle the case where database qualified column names would be in the order by clause.
This PR fixes that issue and adds a variety of tests with database qualified column names and database qualified table names.
There is still a case where joining two tables with the same name from two different databases results in an error; there's a skipped test for this, and a workaround is to alias the tables.
fixes dolthub/dolt#6773 - 2054: prevent projections to be pushed past lateral joins
pruneTables
will pushdown projections as low as possible to avoid passing around columns that won't make it to the final result.
In the case of lateral joins, a column that isn't referenced in the topmost projection may still be referenced by subqueries, which causes problems. This PR prevents the projection pushdown optimization from applying to the children of lateral joins.
A better solution might be to determine which columns are referenced, and trim out all others. However, that seems hard.
fixes dolthub/dolt#6741 - 2051: Trigger view errors don't prevent writes on database with preexisting trigger views
PR #2034 made it so that we can not DDL triggers on views. But past versions of Dolt permitted creating those triggers. After this change, databases with a trigger view will be have all writes blocked with the trigger view error. We should probably not try to parse and bind all triggers for every write, but as long as we do this adds a warning rather than an error for a non-DDL trigger parse. - 2050: Fix binding of time.Time via driver
When using the driver in v0.17.0, an error occurs when binding a variable of type time.Time as a query parameter.
The error message looks like this:This issue did not occur in v0.16.0.type time.Time not supported as bind var: 2023-10-01 17:37:49.382855116 +0900 JST m=+0.017340108
While it's possible to resolve this by modifying sqltypes.BuildBindVariable() in vitess,
this PR resolves it within the driver itself by converting the variable into a timestamp literal string. - 2049: adding gosql prepared tests
We are lacking in tests that use the gosql driver to connect and run queries against an in-memory GMS server. This PR creates a new test suite so it's easier to write these tests. Additionally, there are some tests targeting a bug involving unsigned integers being read as signed. - 2048: Prevent identifiers longer than 64 characters
Also fixed a bug where we allowed multiple column names with the same case-insensitive name
Fixes dolthub/dolt#6611 - 2047: Don't reorder joins that are too large to efficiently analyze.
The current implementation of the join order builder scales poorly if there are too many joins. It's likely possible to improve it, but in the meantime, I'm disabling join reordering on joins that have too many tables (currently defined to be more than 20.)
In these situations, the analyzer takes longer to run the reordering than it does to actually execute any of our test cases, so running the analysis in this case can only slow us down.
I expect this is unlikely to adversely affect users because joins this large are rare, and when they do occur they are often written in a way that the explicit order is good enough.
For example, this test from sqllogictests:takes 30 minutes to reorder, and 15 seconds to run when reordering is disabled.SELECT x63,x53,x62,x52,x11,x5,x40,x64,x27,x28,x21,x41,x22,x30,x16,x14,x56,x32,x46,x50,x1,x34 FROM t46,t34,t1,t32,t53,t21,t63,t11,t30,t62,t27,t50,t16,t64,t40,t56,t22,t28,t52,t5,t41,t14 WHERE a21=b5 AND b30=a52 AND a62=b46 AND a14=3 AND b52=a28 AND b53=a14 AND a63=b28 AND b40=a56 AND a11=b64 AND a53=b22 AND b1=a34 AND b32=a41 AND a50=b63 AND a64=b62 AND b11=a30 AND b27=a40 AND a22=b56 AND b21=a46 AND a1=b50 AND b34=a16 AND a27=b16 AND a5=b41;
MySQL runs the query in under a second, demonstrating that reordering can still massively improve performance if we can make the algorithm more efficient. But this is a good stopgap measure. - 2044: use session builder from harness in the server engine
Small fixes for memory harness for enginetest:- use sessionBuilder from the harness instead of DefaultSessionBuilder
- convert row result for
SHOW
queries
- 2041: not panic on
Star.IsNullable()
This reverts #2039 because the fix was not correct choice for the issue dolthub/dolt#6659. - 2039: AliasedExpr.InputExression should be compared case insensitive
- 2038: error msg for invalid reference to non-existent table or column in existing view
It catches invalid reference to non-existent table or column in existing view. This includesSELECT
queries on a view that references table or column that was removed or renamed.
Note: For now, It does not catch references to invalid functions or users without appropriate privilege cases and queries other thanSELECT
queries.
Fixes: dolthub/dolt#6691 - 2035: Tests for errors during insert not fouling a session
- 2034: prevent creating trigger on view
fixes dolthub/dolt#6432 - 2033: Add VERBOSE_ANALYZER environment variable.
This environment variable will make the analyzer output the current optimization plan after each analyzer rule. - 2032: fix
order by
on unioned schemas
When unioning twoSELECT
statements that have different column types, we would get -1 duringassignExecIndexes
, resulting in a panic.
This PR fixes the issue by matching on unqualified column names when we don't have an exact match.
We don't find these matches because the second table has an unqualified alias over the column name because it is wrapping it in a convert node. - 2031: UNIX_TIMESTAMP() respects session time zone
- For
UNIX_TIMESTAMP()
function, it converts the time value to be in the current session TZ instead of UTC TZ before returning the final value because the initial value is parsed as in UTC TZ, which is incorrect. - The default value of
system_time_zone
global variable will be set to the system TZ instead ofUTC
.
- For
- 2030: unskipping fixed tests
We have many tests that are marked skip/broken, but they are working now.
This PR unskips and cleans up some of these skipped tests. - 2029: Return innodb_lock_wait_timeout=1 always
See: dolthub/dolt#4190 - 2026: Do not error for SHOW view indexes
Until we support view indexes, return nil for show view keys/indexes.
fixes dolthub/dolt#6705 - 2025: Failure to push filter causes dropped filter
We make a hard assumption during join planning that there are no errant filters in the join tree. Every filter is either a join edge, or sitting on its relation. When this is not true, the memo can generate a transitive edge between two relations that loses track of the original filter. The process for triggering this bug is 1) filter in an ON condition gets moved to the middle of the tree, 2) the filter fails to get pushed to its join edge/relation, 3) we generate a transitive join edge that loses track of that filter, and then 4) we choose the transitive join edge in costing.
You'll see the filter restored in the integration query plans in the PR. I added a minimal repro with the appropriate ON conditions and forced a transitive edge that drops the filter if pushdown regresses in the future. - 2022: TPC-X query plan tests
Added schemas, stats, query plans for:- TPC-H
- TPC-DS
- IMDB join planning benchmark
Addedplangen
to auto-update the tests.
We cannot parse all of the TPC-DS query plans yet. I saw some ROLLUP and aggregation validation errors.
Excluding data ops benchmark because the plans are not interesting.
- 2018: Avoid corrupting the privileges file
Currently, the last of these commands results in a panic due to the revoke inserting a database with an empty string for a name. This is fairly awkward to test in GMS, so I'm going to take the easy route and create a bats test in dolt. - 2017: Retaining display width for
TINYINT(1)
MySQL allows integer fields to specify a display width (e.g.TINYINT(1)
).go-mysql-server
currently parses that information, but doesn't retain it anywhere. This PR changes that behavior to match MySQL and retain the display width setting so that it can be passed back to callers.
As of MySQL 8.1.0, the display width setting is ONLY retained for signedTINYINT
fields and ONLY when the display width is set to 1.
Fixes: dolthub/dolt#6682
Corresponding Dolt PR: dolthub/dolt#6688 - 2016: Bug fix: Preserve column name case for views
Fixes dolthub/dolt#6683
Dolt CI Checks: dolthub/dolt#6684 - 2015: GROUP BY identifiers should prefer binding to table columns over projections.
This also means an expression is allowed to project from the GROUP BY column multiple times.
Fixes dolthub/dolt#6676 - 2014: Improve message from json_extract() when json path doesn't start with a '$'
- 2013: Change Project nodes so they can't return negative zero.
In some cases we differ from MySQL in what types we use for intermediate results. This doesn't usually affect the final output, and can be more performant.
But if the final result of an expression is a float when MySQL deduces it to be a decimal, and we don't do anything else to value that would cause it to be coerced (such as inserting it into a table with a schema), then we could end up displaying a result of the wrong type to the user. Usually this doesn't matter, unless that result is the float value-0
when the user expects a decimal.
Ideally we'd prefer to detect the expected type and do a cast, but this is an acceptable stopgap measure. - 2012: Insert on dup col ordinal bug
A certain set of conditions causes an error for indexingon duplicate update
expressions:- The source is a SELECT statement (not a VALUES row)
- All columns are specified by the INSERT (not sure why, but partial columns seems to get rearranged correctly. I think we must insert a compensating projection to handle column defaults)
- The source columns are not the same order as the destination table schema
- On duplicate update expression references a column from the new row
For the query below, we were indexing the on duplicate expression in the wrong order, causing the output row to be two zero types:
The way we resolve inserts is still a bit weird. We resolve the source, and then afterwards add a projection to rearrange columns to match the target schema. I ran into a lot of problems trying to rearrange that ordering (first add projection, then analyze), mostly due to our inability to fix indexes on the source node's projection (VALUE nodes don't have a schema, and it isn't obvious when walking a tree that a given projection is going to be special). When we add the projection afterwards, however, it avoids the indexing rule so we can inline the values safely.create table xy (x int primary key, y datetime); insert into xy (y,x) select * from (select cast('2019-12-31T12:00:00Z' as date), 0) dt(a,b) on duplicate key update x=dt.b+1, y=dt.a;
My current fix is to mimic the projection mapping inside indexing. Index the duplicate expression values based on the ordinal of the destination schema. LOAD DATA for some reason needs its insert columns to not be specified, which will probably the source of different issues at some point.
fixes: dolthub/dolt#6675 - 2009: Re-enable query logging by default for DEBUG log level
- 2008: Rewrote memory table editors to enable table rewrites
This also fixes a number of bugs that were discovered during this process, notably not properly closing table editors in all instances. - 2007: Fixed character set IDs
Character set IDs should correlate with their default collation's ID. Previously, they were arbitrarily assigned by sorting their names alphabetically. This should not be a breaking change for anyone, as the comment on theCharacterSetID
mentions that the ID may change, and should not be persisted. Dolt, the largest integrator, abides by this rule. - 2006: Semi join and FDs bug
Returning no projections from a table causes "column not found errors" when we try to reference those expressions higher in the tree. This fixes the semi join transform to creating empty projections.
This fixes two bugs. The first is that we were too conservative checking whether index keys were strict FDs for a join relation. When a relation has a constant applied to a primary key, we can assume all of the columns returned by that join will be constant. Fixing that made it easer to test certain semi -> right lookup join transforms which were buggy. For the same case, when we are doing a lookup into table where a constant filter satisfies an index key, we need to still return a projection set that covers non-pruneable columns used in higher-level nodes. - 2005: implement
INTERSECT
andEXCEPT
clause
This PR implements two MySQL functions,INTERSECT
andEXCEPT
.
They are similar toUNION
in that they combine to two sets of rows.
INTERSECT
is the set of rows that are in both left and right.
EXCEPT
is the set of rows that are in the left that are not in the right.
Additionally, there is a bug with field indexes with ORDER BY over unions.
companion pr: dolthub/vitess#271
fixes dolthub/dolt#6643 - 2004: Enable use of slices in tuples for HashLookups
Currently FusionAuth crashes Dolt with the following error:All FusionAuth IDs are binary(16), and join on those values in a HashLookup was resulting in using twopanic: runtime error: hash of unhashable type []uint8
[]uint8
being used as a key to a hashtable. Nested arrays in tuples were tripping on an optimization made for short arrays. We've verified that optimization doesn't actually made a difference, so this change simplifies the code and makes it more generic. - 2003: Delete fixidx package
remove remnants of old field indexing code - 2000: adding new ways to say varchar
This PR makes to so the engine recognizes more ways to specify that a column is of typeVARCHAR
Companion PR: dolthub/vitess#270
Fixes dolthub/dolt#6650 - 1996: Move join indexing after all
This should put almost all indexing logic into one rule that runs once at the end of analysis for a given query. It should require one walk of the tree, be much more correct for nested join and subquery indexing, and allow us to add nodes with special indexing logic much more easily.
Summary:- new rule
fixupIndexes
replaces all of the other default indexing code FixFieldIndexes
still exists for insert source projection wrapping and LOAD DATA, but both of these can be easily rewritten to removeCheckConstraintTable
interface forChecks()
andWithChecks()
helpers
The way the new fixup works is to traverse the tree in 3 phases. Each phase has a default mode but can be handled case-by-case for special node logic:
(1) Descend child nodes of the current node. Collect children and child "scopes" that contain schema info
(2) Index the current node's expressions.
(3) Re-build the current node, and fix its "schema" to pass upwards into a parent
The walk and scopes used for indexing should mirror exactly what we do on the execution side. As a result, some of the nutty logic for nodes could be standardized by changing what we do at execution time.
note: The initial implementation is not particularly memory-efficient. Currently trying to filter for Dolt-side uses of old indexing functions that would need to be refactored.
- new rule
- 1995: Added option to change protocol listener
This allows plugging in other protocol listeners besides the default MySQL one. - 1994: Fixes for subquery indexing/correlation tracking
I missed a place where we were using getField indexes to do a correlation check. This adds the changes necessary to replace that check with one that uses a subquery's tracked correlation column set. This also adds an expression id column to GetField expression that preserves the id tracking information between iterative passes of the analyzer. It would probably be preferable to avoid all cases where we unnecessarily re-run rules, but this is more near at hand. - 1993: JSON Array Mutation
The json_array_insert() and json_array_append() methods are in the family of other mutation functions ((set,insert,replace,remove). - 1992: Delete fixidx from pushdown
Summary:- delete
fixidx
usages inpushdownFilters
andgenerateIndexScans
- use a mapping on subquery aliases when pushing filters through subqueries
- better max1Row memo now
- misc changes to re-indexing to compensate for work previously done during pushdown
- delete
- 1991: Fix show processlist panic
close dolthub/dolt#6625 - 1990: Cast limit/offset to int type for prepared path
fixes: dolthub/dolt#6610 - 1989: Improve the efficiency of
newIndexAnalyzerForNode
by avoiding visiting Filter nodes.
This prevents super-linear runtime ingenerateIndexScans
- 1988: JSON Mutation functions implemented as stored procedures
Earlier PR (#1983) enabled us to modify json document objects, but none of that functionality was exposed as actual JSON_* functions. This change ties it together. The following functions will behave identically to MySQL (to the best of my knowledge). - 1987: Move JSON Functions into their own sub directory
Purely mechanical refactor done by the IDE. This is in preparation for adding support for several more JSON functions. - 1986: When pushing down filters, ensure removal of original filter.
In cases where the filter expression changed during push down because the column IDs changed, we were accidentally checking for expressions with the new column IDs, not the old ones, so the old filter expressions weren't being removed. - 1985: Fix panic in merge join when using custom Indexes that don't allow range lookups.
For instance, dolt has Commit indexes for tables that use commit hash as an index, but ranges don't make sense for those.
There's no equivalent in GMS, so I created "point_lookup_table" table function for use in tests. - 1984: Partially reorder indexing rules; use unique ids rather than execution ids
Name binding stores caching information upfront. Rule interdependencies pushed me into fixing a bunch of other rules before tests would pass. All together I think most of the changes are simplifications that I was planning on doing related to thefixidx
refactor. I was hoping to make it more piecemeal. Hopefully this gets us ~50% of the way towards removing those dependencies.
fixidx
is mostly contained toreorderJoins
andfixAuxiliaryExpressions
now, both near the end of analysis. If we move the indexing inreorderJoins
intofixAuxiliaryExpressions
, all indexing will happen at the end of analysis. That would let us index complicated joins with subqueries correctly and all queries more reliably.
summary:- rewrite cacheability to use correlated column references
- volatile functions now prevent caching
- rewrite
moveFiltersOutOfJoinConditions
to put filters below join when appropriate - subquery decorrelation uses (and updates) correlated column references
- alias subquery strings simplified to use the query string, not the plan string
- fix jsonTable and lateral join analysis
fixAuxiliaryExpresssions
at end of analysis- recursive analyzer rules (insert, trigger, procedure) are all at end of analysis now
- 1983: JSON Mutation
Add JSON Value mutation operations and tests. These changes do not alter the operation of dolt in anyway yet - that will come in a second PR which updates the JSON_SET procedure, and adds support for the JSON_REPLACE, JSON_INSERT, and JSON_REMOVE procedures at the same time. This is laying the foundation for that work. - 1982: Inline flatten aliases
This will skip a tree walk for most queries, inlining the rule in the places where nested table aliases can occur during binding. - 1981: Make the IntSequence test function consistently use int64 for its generated values
The previous implementation had an issue where it assumed the type used in the received IndexLookup, but this type can actually depend on exactly how the lookup was generated (and whether the bounds value was parsed from the query or generated internally.) This caused a panic if it was used in Lookup joins.
This makes no such assumptions and adds extra tests. - 1980: FreeBSD sockstate workaround
Close #2138 - 1979: Refactor event resolving
InlineloadEvents
and move event resolving out of execution time. - 1978: Fix table lookup bug for nil asOf
Edit: Show keys for info schema table should work the same way as MySQL now.
Prev: This error message is not correct, MySQL finds the table and does not error. But this at least reverts the behavior to what Dolt did before the name resolution refactors. - 1976: Refactor prepareds to cache AST
- 1975: Union NULL types cast to same type as neighbor
3+-way unions are subject to a bug where a NULL projection will be cast to a char type, and then have a type clash with the subsequent table relation. Rearranging the query to have the NULL projection last appears to fix the bug because theconvert
cast is moved to a position where it cannot have a type clash with a table relation to the right. - 1974: Re-implement generated stored columns with new analyzer
Testing revealed a bug in explicit DEFAULT values in insert statements, added a skipped test for same.
Also rewrote column default tests as scripts. - 1973: Fixed behavior for CONVERT(x USING y)
TheCONVERT(x USING y)
expression allows us to convert between character sets. When a string does not have a representing rune in the target character set, it should be replaced by a question mark. This is consistent with MySQL. Our previous behavior logged an error, which is valid in many scenarios, but not for this particular expression. - 1970: sql: ExternalStoredProcedureDetails: Add the ability to mark stored procedures read-only.
- 1969: Added a QueryEngine abstraction to engine tests to allow us to run engine tests on a running server
Implementation is partially working. Lots of errors when used in tests, some of which are real and some of which are issues with test definitions. Will keep working on this over time, but want to get the necessary refactoring in before continuing. - 1968: Missed cases where replicaBinlogController is needed during building
- 1967: Join aggregation projections subject to dropped refs
Aggregation intermediate projections need to output all inputs for aliases in the target projection. The way we were tracking column source column references did not take into account that two join tables can have the same column name. Now we include the table name for alias dependency tracking.
close dolthub/dolt#6578 - 1964: Skip subquery pushdown when no subquery
This will have a small perf impact. - 1962: Name res cleanup, docs and inline rules
Move bespoke rules into planbuilder. Functions are only ran for their necessary nodes. Also adds copyright headers.
Requires some Dolt changes: dolthub/dolt#6567 - 1945: server: Add a warning message at startup if the server is run without secure_file_priv being set to a non-empty value.
- 1934: New Merge Join planner
This should fix dolthub/dolt#6020 once finished, and then some.
The killer new feature in this new join planner is "Multi-Column Merge Joins", that is, merges where the comparison used for merges incorporates multiple filter conditions. This allows us to, in some cases, choose a much more selective index for merge joins. This improves both memory usage and performance because there will be fewer cases where the join iterator needs to keep multiple secondary rows in memory and cross-join them with multiple primary rows. - 1888: Remove filters from LookupJoins when they're provably not required.
During joins, we still evaluate every filter for every candidate row. But based on the join implementation, some of those filters must necessarily be true, so we don't need to evaluate them.
In most joins the performance cost of this isn't that bad, but this problem is most noticeable in LookupJoins where a point lookup is constructed from several columns, at which point the filter evaluation can dominate the runtime.
One potential drawback of this change is that this might make reading query plans more obtuse, because some filters are no longer explicitly listed in the plan, but are a consequence of Join node's children. One option to prevent this would be to add metadata to joins listing filters that they assume to hold, or marking filters in the execution plan as "skipped." However, I think that described execution plans should match the actual behavior of the execution as closely as possible, so this may not actually be a concern.
Some notes about the individual commits in this PR:
3b54dbe: This commit fixes an existing bug which caused lookups to return extra rows if the filter was a null-safe-equals check, and the key is non-null. This bug previously caused no issues because these extra rows would not match the filter and would get dropped. Now that we're skipping filters we know to be extraneous, this bug would manifest if not fixed.
48e8777: This commit changes the costing function of lookup joins. In the event that the lookup expressions can't be proved to uniquely identify a row, we attempt to estimate what percentage of the rows will be returned by the lookup. It's a very rough estimate, and serves mostly to guide which index to use if the table has multiple indexes: the more filters the index is able to make redundant, the better. The previous costing implementation had a special case for indexes where every column in the index was filtered on, and assigned that index a score somewhere between a 3-key lookup and a 4-key lookup. I imagine that the thought process when this was implemented was a lookup that used every column in an index would tend to result in very few rows compared to a lookup using an index prefix. Of course this is data dependent and I'm not convinced is generally true.
2b384a3: This commit updates tests that check for specific query plans. Most of these updates are just eliminating filters. A small number of updates are a result of changing the lookup costing, resulting in different join types. I haven't looked at these closely yet. This may be okay, it may not be. - 1786: support event execution
This PR adds event execution logic implementing EventScheduler interface in the engine.
Notes:- Event Scheduler status cannot be updated at run-time.
- Event
DISABLE ON SLAVE
status is not supported. It will be set toDISABLE
by default.
Corresponding Dolt changes: dolthub/dolt#6108
vitess
- 313: supports
FROM s.account
, in whichaccount
is non-reserved keyword - 312: Allow InjectedExpr to handle name resolution
Adds the capability for additional expressions that need to be resolved toInjectedExpr
.
Related PRs: - 311: add syntax support for
SELECT ... INTO OUTFILE ...
options
We had parser support forSELECT ... INTO OUTFILE
, but were lacking parser support for the variety ofOUTFILE
options.
This PR adds support for all the options and certain combinations of them.
However, we are missing support for duplicate options and certain orderings of these options; tests for these are added as error tests with a TODO.
MySQL Docs:
https://dev.mysql.com/doc/refman/8.0/en/select-into.html
MariaDB Docs (this has the actual grammar):
https://mariadb.com/kb/en/select-into-outfile/ - 310: Limit/offset proc param
re: dolthub/dolt#7458 - 309: Added a way for Doltgres to inject expressions
This primarily adds a new AST node that DoltgreSQL takes advantage of. All currently-existing nodes are specifically designed for MySQL transformations. In many cases, Doltgres will have different transformations that need to take place, and this provides a way for Doltgres to implement those transformations without having to modify other packages whenever a new transformation is added or edited.
Related PRs: - 308: Change
Handler.ComResetConnection
to return an error
Related to dolthub/dolt#3921 - 307: allow hexnums for
stats_auto_recalc
table option
This fixes 1 sqllogictest - 306: allow function keywords
We have some extra keywords (they aren't listed as reserved or non-reserved in MySQL), to handle the special syntax.
These keywords are allowed as table and column names without backquotes. - 305: Allow
UNIQUE
andPRIMARY KEY
to be specified on the same column
Related to: dolthub/dolt#7395 - 303: Add parser support for the
ALTER USER
statement
Adds support for parsingALTER USER
statements (MySQL reference). First step towards allowing users to change passwords.
Related to: dolthub/dolt#7348 - 302: Detect if a system variable scope was explicitly specified or not
Changes theVarScope
function so that it returns whether a scope was explicitly specified, or if one has been inferred.
This is needed because some tooling (e.g. the official MySQL .NET Connector library) will query system variables (e.g.SELECT @@max_allowed_packet
) and then will look up the returned value in the result set using the expected column name (@@max_allowed_packet
). Currently, from the way we parse the system variable, this was always returned with the scope present, but to match MySQL's behavior, the column name needs to match the requested name.
Related GMS PR: #2266 - 301: support special syntax for
position()
This PR adds support forPOSITION(<expr1> in <expr2>)
, which is a synoynm forLOCATE(<expr1>, <expr2>);
MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_position - 300: custom
char()
syntax handling
In order to support the charset argument forCHAR(... using <charset>)
, there has to be special syntax support.
Companion PR: #2255 - 299: Define an ExtendedHandler interface to handle postgres specific interactions
These are pretty narrowly tailored for postgres prepared statements and long term should probably be placed in either doltgresql or another package. Defining them here is just expedient for experimenting with the new interfaces while they change. - 298: parse empty begin end block in trigger
14 SQL correctness tests are failing because we throw syntax errors on queries like this:CREATE TRIGGER t1r1 AFTER UPDATE ON t1 FOR EACH ROW BEGIN END;
- 297: Fixing the
version
keyword to not require identifier quotes
Theversion
keyword still required identifier quoting in some usages, such asSELECT * FROM base.version;
. See dolthub/dolt#7237 for more details.
This change moves theversion
keyword into the main list of non-reserved keywords. There was one conflict from use of theversion
keyword in thefunction_call_keyword
rule, but it turns out that use ofversion
there is not required. We have an existing test for using theversion()
function, so I didn't add a new one. - 296: refactoring
default
andon update
expressions
This PR changes the grammar to more closely match MySQL's behavior, specifically around theNOW()
function and its synonyms.
Changes:- Throw syntax errors for
ON UPDATE
expressions against functinos that aren'tNOW()
or a synonym. - Only accept integer for argument to
NOW()
and synonyms; syntax error for anything else - Simplified grammar rules
- Removed
CurTimeFuncExpr
from AST in favor of plainFuncExpr
Companion PR: #2218
- Throw syntax errors for
- 295: Allow inline column check constraint definitions to appear in any order
Previously, an inline column check constraint could only appear as the very last option for a column definition. This change allows it to appear in other positions in the column definition. For example, this query now works:Resolves: dolthub/dolt#7195create table t123 (c1 varchar(5) check (c1 in ('v1', 'v2')) NOT NULL);
- 294: Allow SqlType to parse "CHARACTER" and add tests for every other possible type that could be passed in.
SqlType is a function in Vitess for normalizing every type name. It was missing an entry for the "CHARACTER" keyword.
I added tests that should verify every single valid type keyword in the grammar, so this shouldn't happen again. - 293: Add additional types to sqlparser.SQLType()
This function is used when the parser needs to map type names to underlying types in order to judge the validity of certain queries. Some types are aliases for others, (like REAL is an alias for float64) but they weren't included inSQLType()
, so certain expressions that used these types could panic. - 292: Parse COLLATE BINARY on individual columns.
We should be able to parse statements like:
create table test (pk varchar(255) collate binary)
This particular example will eventually get rewritten ascreate table test (pk varbinary(255))
, but that doesn't happen during parsing, so the added vitess tests still expectvarchar
. - 291: round trip SHOW PLUGINS
- 289: add partial support for ':=' assignment operator
This PR adds support for set expressions and assignment expressions.
Does not include support for select expressions, as it's deprecated on MySQL - 288: allow unquoted non reserved keywords for drop and rename column ddl
fixes dolthub/dolt#6950 - 287: support
int1
,int2
,int3
,int4
,int8
aliases
fixes dolthub/dolt#6900 - 286: Logging improvements
Logging a couple more error handling spots. Changing theHandler
interface to passmysql.PrepareData
into theComPrepare
function so that GMS can log the params count at a debug level.
Related GMS PR: #2098 - 285: Add create view with columns
ex:create view v_today(today) as select CURRENT_DATE()
- 284: Include the With clause in walked subtrees for Select statements
We weren't walking theWith
clause for select statements, which caused us to not find any bind vars in use there.
Related to: dolthub/dolt#6852 - 283: allow query options to appear in any order any number of times
Allow statements like this to parse:Fixes dolthub/dolt#6829select distinct sql_calc_found_rows distinct * from t;
Companion PR: #2088 - 282: Server handling parsed statements
See #2086 - 281: Made generated column expressions parse to ParenExpr to match Default
- 280: Update the default server version to
8.0.33
The connection handshake was advertising a server version of5.7.9-Vitess
and some clients were using that info and trying and speak MySQL-5.7 to Dolt (example issue)
This change updates the default advertised server version to8.0.33-Dolt
.
Dolt CI tests are running at: dolthub/dolt#6798 - 279: Add ANALYZE HISTOGRAM support
- 278: fix unsigned flag for
COM_STMT_EXECUTE
whennew_params_bind_flag
is set
In the previous implementation, we assumed that the way the MySQL Protocol specifies Column Definitions is the same as how it specifies parameter types forCOM_STMT_EXECUTE
. The difference lies specifically in the flags that come after the field type.
When reading/writing a field type (for a Column Definition), MySQL expects/writes a 1 byte wideenum_field_type
followed by a 2 byte wide Column Definition Flag.
However, when reading aCOM_STMT_EXECUTE
payload (that specifies parameters throughnew_params_bind_flag
), MySQL indicatesparameter_types
with the same 1 byte wideenum_field_type
followed by a 1 byte wide flag that indicates signedness.
So basically, read0x80
for COM_STMT_EXECUTE parameters, but read/write0x20
forfield_types
/column definitions.
I'm assuming MySQL does it this way because the majority of the Column Definition Flags are nonsensical/meaningless when paired up with parameters to prepared statements. Regardless, this was a subtle bug, and we should have tests for parsingCOM_STMT_EXECUTE
withnew_params_bind_flag
.
Fixes dolthub/dolt#6728 - 277: Allow parsing of
CREATE TABLE t AS (...) UNION (...)
This allows parsing ofCREATE TABLE AS
statements when the expression being used to create the table is a set operation like UNION, INTERSECT, or EXCEPT.
The "AS" keyword is typically optional. But this change only allows set ops to be used with CREATE ... AS when the AS is explicit. This is to avoid an ambiguity in the current grammar when attempting to parseCREATE TABLE t (
, where what follows could be a set op or a table definition. Fully matching MySQL's spec here would require rewriting our grammar to avoid this ambiguity, which is outside the scope of the PR. However, this PR makes us strictly more correct than we were before. - 276: Allow parsing of
SECONDARY_ENGINE = NULL
This is a simple change to allow parsing a NULL value for the SECONDARY_ENGINE attribute for CREATE TABLE and ALTER TABLE statements. - 275: Adding parser support for
VISIBLE
andINVISIBLE
modifiers for indexes
Fixes: dolthub/dolt#6690 - 274: Allow
CREATE TABLE
andALTER TABLE
to accept hexnum and float values when integers are expected.
Fixes dolthub/dolt#6644 and dolthub/dolt#6648
MySQL is permissive in what it expects for DDL statement parameters: in many places where ints are expected, the parser will also accept a hex number (and convert) or a float number (and truncate.)
None of these values are currently used by GMS, so we don't need to add any additional processing logic. But the parser needs to accept them when they appear. - 273: Parse for update skip locked as a no-op
While testing the FusionAuth application, there were regular panics which result from a query which does aselect for update skip locked
. This change allows us to parse these queries, but they don't impact query execution because row level locks are not supported by Dolt currently.
https://dev.mysql.com/blog-archive/mysql-8-0-1-using-skip-locked-and-nowait-to-handle-hot-rows/ - 272: Set character set IDs to current MySQL version
The character set values were set for MySQL 5.0, so they've been updated to the correct values for 8.0. - 271: parsing
intersect
andexcept
This PR adds support for parsing the keywordsINTERSECT
andEXCEPT
.
These work similar toUNION
and work withDISTINCT
andALL
keywords.
Additionally, there are new precedence tests;INTERSECT
has a higher precedence thanUNION
andEXCEPT
.
The rest are parsed left to right.
syntax for dolthub/dolt#6643 - 270: support alternate ways to say
VARCHAR
This PR adds more variations toVARCHAR
, specifically including the keywordVARYING
.
Parses correctly, but needs GMS changes to actually work as a whole
Companion PR: #2000
Fixes dolthub/dolt#6650 - 269: Fix
UNION
table option and parseLAST
keyword
This PR fixes support for parsing theUNION
table option and parsingLAST
.
Additionally, it adds support for these table options in alter table statements.
Fixes dolthub/dolt#6653 - 268: tests and adding keywords for alter too
This PR add support for parsing the keywordTABLE_CHECKSUM
as an alias forCHECKSUM
as a valid table option.
Additionally,CHECKSUM
andTABLE_CHECKSUM
are added as valid keywords in alter statements as well.
Fixes dolthub/dolt#6645 - 267: Parse
TABLE <table_name>
statements
This PR adds parser support forTABLE <table_name>
syntax.
It just converts this to be aSELECT * FROM <table_name>
statement.
Fixes dolthub/dolt#6642 - 266: parsing secondary_engine
This PR parses the syntax forSECONDARY_ENGINE
, but does not store it with the table.
For the most part we seem to parse, but not store any table options.
MySQL docs aren't up-to-date yet;SECONDARY_ENGINE
is not listed as a valid table option, but it works.
https://dev.mysql.com/doc/refman/8.0/en/create-table.html
These are also apparently valid options for create index statements, so I just added the syntax there too
https://dev.mysql.com/doc/refman/8.0/en/create-index.html
Fix for: dolthub/dolt#6632 - 265: upgraded YAML library
Closed Issues
- 2349: Foreign key constraints break auto-incrementing ids in memory mode
- 2344: Server error message compatibility when doing INSERT
- 2314: Can't turn on multiStatements when using memory driver
- 2296: add create table comment on table clause to table spec
- 2225: go mod tidy error
- 2184: GolangCI does not work
- 2159: VSCode debug Build Error
- 1782: Error 1105: -128 out of range for BIGINT UNSIGNED