Skip to content

Q1 2024 Release

Compare
Choose a tag to compare
@zachmu zachmu released this 28 Feb 22:44
· 858 commits to main since this release
8122f68

This is our quarterly roll-up release that contains many bug fixes, features, and performance improvements. Reminder that interfaces are not guaranteed to settle until 1.0.

This release contains a breaking change for setting up an in-memory database server, e.g. in testing. The in-memory database now requires its own session implementation, and the server must be supplied with a compatible session builder to prevent panics. See the _examples directory for an example.

Merged PRs

go-mysql-server

  • 2350: [planbuilder] Limit allows unsigned ints
    Parameterizing limits in stored procedures exposed a bug where we weren't accepting unsigned ints in LIMIT/OFFSET. This is only possible in prepared or stored procedure contexts afaik.
  • 2348: Optional integrator schema validation
    Move table schema validation so that we validate the accumulated result rather than sequential alters.
  • 2347: implement json_depth()
    MySQL Docs:
    https://dev.mysql.com/doc/refman/8.0/en/json-attribute-functions.html#function_json-depth
  • 2346: fix null and empty paths for json functions
  • 2345: Return correct MySQL error code when inserting into nonexistent columns.
    Fixes #2344
    This PR also adds a test framework for testing MySQL error codes, since this doesn't appear to currently be tested. This should make adding tests for other error codes easy.
    Some of the MySQL error codes that I expected us to use (such as 1050: ERTableExists) have zero usages in GMS or Dolt. We're probably returning 1105: ERUnknownError for these.
  • 2343: fix json bool comparison
    JSON Boolean values are special; they do not behave like normal when converted/compared against other types.
    fixes dolthub/dolt#7528
  • 2342: Add row length guards
    Tests and basic engine-side guards similar to mysql's behavior.
    re: dolthub/dolt#7524
  • 2339: only use crosshashjoin on subqueryaliases
    This fixes 2 sqllogictests.
  • 2336: fix rangetree removal of largest node without left child
    When removing a right child, a missing nil guard was causing us to improperly propagate the new max range.
    This resulted in overlapping ranges.
    This fixes 1 sqllogictest.
  • 2335: Schema pruning considers database name
    Fix some inconsistencies with project and join hints. New test added for new project pruning changes.
  • 2334: fix char to bool conversion
    This PR adds missing nil guards in a few conversion methods.
    fixes dolthub/dolt#7515
  • 2333: [planbuilder] Skip post-aggregate projection when there are no aliases in target scope
    re: dolthub/dolt#6982
    This takes the last big chunk of CPU pressure off of TPC-C execution, which is mostly disk limited now:
    image
  • 2332: [time] shortcut for no timezone conversion
    This function reads files from disk to try to convert a timezone to itself. There are other optimizations we could do to maybe try checking for the integer offset before doing syscalls.
    re: dolthub/dolt#6982, ~10% TPC-C boost
  • 2331: fix NOT expression in conjunction with WHERE EXISTS(<subquery>)
    The analyzer rule unnest_exists_subqueries was accidentally dropping NOT expressions when hoisting subqueries from WHERE EXISTS... clauses.
    This should fix 8 sqllogictests.
    Correctness: dolthub/dolt#7510
  • 2330: Fix DISTINCT over DECIMALS
    There was another place where we were using hashstructure package, which does not hash decimal.Decimal types correctly.
    Switched to xxhash package, which is what we use everywhere else.
    Reusing solution from: #2279
    This will fix 1 sqllogictest.
  • 2329: expression.Div Micro Benchmarks
    There are going to be changes to our division behavior that impact both its accuracy and speed.
    This PR adds benchmarks to track the runtime improvements/degradations
  • 2327: memo/planbuilder: Off by one error, and issue importing statistics for testing
    The original bug was a one-off error where conjCollector was checking for the wrong equivalence column. That spiraled into:
    • statistics were not properly being used for some tests because PRIMARY case sensitivity in ANALYZE ... UPDATE HISTOGRAM statements
    • one integration test had a weird edge case where OSX and linux compute slightly differently float values, leading to different plans locally vs CI. One option was to truncate the specificity of all cost values with a multiply/round to avoid numerical weirdness...picking the first lowest plan also seems to work.
    • variety of other small bugs in costing
    • update join bug where the projections after a rearrangement lost expression id/nullability info. This was exposed by the changes above
  • 2326: Various Decimal Type Fixes
    This PR has a variety of fixes to have arithmetic operations (especially those involving decimals) behave more like MySQL.
    The logic for the Type() method for Arthmetic and Div is simpler, and better tested.
    When comparing Decimal results from division operations, MySQL has an internal Scale that is different than the Scale used when returning Decimal results for display.
    Here's a matrix displaying the resulting scale:
    (Ex: 1 / 3 = 0.333333333; scale 0 div scale 0 should return scale 9)
    image
    Additionally, this PR adds test for arithmetic operations over Datetime and Year types. There are still a some problems dealing with precision and parsing there...
    Note: I believe the division optimization where float division is preferred over decimal division for internal calculations may be causing problems. More testing is needed to see if it's possible to enable this without causing inaccuracies/precision loss.
    There are microbenchmarks measuring the performance of div expression, and it turns out these changes actually greatly improve the runtime.
    Correctness: dolthub/dolt#7484
    Fixes
  • 2323: SQLEXCEPTION proc handler
    We originally supported one type of procedure handler, NOT FOUND, which explicitly checked for an error when fetching from a cursor io.EOFs. The implementation for that handler would walk the entire BEGIN/END scope stack inside the Fetch call looking for a handler, execute the handler body, and then embed the scope height into a special return error. The error walked back up the callstack looking for the BEGIN/END block embedded in the error message.
  • 2319: Add net.Listener as server config option
    Presently go-mysql-server is served by a net.Listener backed by a real port or socket. In some environments (like testing) we want to avoid the host's networking stack entirely. This changeset adds support for the caller to provide the net.Listener, which gives them full control over where the sever serves.
    This opens the door for some cool workflows. For example, you can run the server using a buffer-based net.Listener implementation (which is what happens in the test that I added).
    I landed on this solution while thinking through #2314
  • 2317: support SELECT ... INTO OUTFILE/DUMPFILE ...
    This adds support for MySQL's SELECT ... INTO OUTFILE/DUMPFILE ... feature.
    It is the complement to LOAD DATA. There is no LOCAL option, so files created using this feature are on the server.
    This PR adds a custom TestSuite for testing these files, as it needs to write, read, and delete files.
    syntax: dolthub/vitess#311
    fixes dolthub/dolt#7453
  • 2316: Allow primary keys with auto_increment columns to be dropped when an appropriate index exists
    Related to dolthub/dolt#7456
  • 2315: Stored procedures can use params as LIMIT,OFFSET
    fixes: dolthub/dolt#7458
    Fixes procedure param types in the process.
    Testing against mysql it doesn't seem like subqueries or regular column types are valid as LIMIT values in any case other than procedure parameters. I still need to test whether trigger columns can be used in LIMIT,OFFSET, but that seems like a separate problem.
  • 2312: Added the ExtendedType interface
    This adds the ExtendedType interface, which is used within DoltgreSQL to implement PostgreSQL types, as well as in Dolt to properly handle the new type and value serialization.
    Related PRs:
  • 2311: fix group by ... having ... aliasing once more
    This PR simplifies the GROUP BY and HAVING aliasing rules to much more closely match MySQL's weirdness.
    TODO: Go through various group by aliasing issues and link fixed ones here
    correctness: dolthub/dolt#7455
  • 2310: Implement COM_RESET_CONNECTION
    Implements the COM_RESET_CONNECTION command to allow a sql-server to clear session state on a connection so that it can be safely reused, for example, in connection pools.
  • 2308: fix decimal precision for division
    divScale tracks the number of division operators on the leftSubtree. It can be used to determine if the current expression is the outermost Div expression, and doubles as a way to determine how many decimal places of precision we need to add.
    opScale tracks the total number of Arithmetic operators in the expression tree in total. It is used to identify the outermost Arithmetic expression.
    There were a few places where we were mixing up divScale and opScale, leading to improper rounding; this PR fixes that.
    As a result, we are able to properly round decimal results when the outermost operator is not a Div.
    Additionally, this PR also contains a fix for functions that return float64 being incorrectly converted to decimal.
    correctness: dolthub/dolt#7442
    fixes dolthub/dolt#4931
  • 2307: Persist table comment from create table
    We parse table comments, but they were not persisted anywhere. This change adds support for persisting table comments so that they can be round-tripped between create table and show create table.
    There are still several limitations with table comments that we should eventually follow up on:
    • Table comments containing any single quotes are not supported – Vitess parses the table options, but then strips any quotes and replaces them with single quotes, then pushes all table options back into a single string that GMS parses again, using regexes to pull out charset, collation, and comment. If a table comment is double quoted and contains single quotes in the string, Vitess doesn't currently preserve enough information for us to reparse that correctly in GMS. The right fix for this is to change Vitess to return table options in a structure, instead of as a single string.
    • ALTER TABLE does not support altering a table's comment.
      Related to dolthub/dolt#7416
  • 2306: Bug fix: honor database qualifier on view name for create view statements
    Related to dolthub/dolt#7438
  • 2305: [fixidx] Table function expression ids
    Simplify and fix plan.TableAlias indexing. Some table alias children have their own expression ids, but sql.TableFunction implementations don't necessarily extend the plan.TableIdNode interface and rely on the analyzer to populate column ids. There are a couple ways to simplify this in the future, like adding an intermediate prunable sql.Projector node for table functions, or having pruning clean up after itself by updating node and parent table alias columns.
    TODO: this case is kind of hard to test, but trying to come up with something.
  • 2304: implementing json_keys() function
    This PR adds support for JSON_KEYS().
    MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/json-search-functions.html#function_json-keys
  • 2303: float64 is larger type than decimal for coalesce() function
    This causes 937 failures in sqllogictests
  • 2302: adding @@server_uuid system variable
    This PR adds support for the @@server_uuid system variable.
    However, this just generates a random UUID on server start, and does not persist it to a file.
    MySQL Docs:
    https://dev.mysql.com/doc/refman/8.0/en/replication-options.html#sysvar_server_uuid
    Fixes dolthub/dolt#7431
  • 2301: fix coalesce type()
    Although only the first argument is returned, coalesce still examines the types of all the arguments passed in, and uses the "largest" type.
  • 2300: Fix External Index Creation
    Fixes regression introduced by 9b9301f
  • 2298: fix outermost division placement
    MySQL round appends 4 places of precision to their division operators, but only on the very last division operation.
    All inner divisions need to preserve as much precision as possible. To track this, we have a divScale variable in.
    We also have an opScale to track if an operation is the outermost operator, for other rounding/precision reasons.
    We were getting these values mixed up, leading to results with incorrect precision.
  • 2295: allow tables with self referential foreign keys to be dropped
    Our logic for preventing DROPs on a table if that table was referenced in a FOREIGN KEY did not account for when the table was self referencing. This PR fixes that and closes the gap in our testing for this area.
    fixes dolthub/dolt#7418
  • 2294: match range key with range type
    Creating lookups when comparing columns of different types were causing problems.
    Especially, if one of the types is a string.
    This is not a perfect fix as we don't do truncation yet, but it will stop panics.
    Reused solution from this PR:
    #2177
    It fixes the test case in this issue, but a skipped test is added for missing truncation functionality.
    dolthub/dolt#7371
  • 2292: Various aggregation/indexing fixes
    re: #2271
    Use expression ids to fix expression indexes most of the time. This makes it easier to resolve definitions once in planbuilder, rather than having to redo all that work a second time during assignExecIndexes. This should be more reliable for most queries and make it easier to make alias resolving refactors.
    Many expressions now implement sql.IdExpression and absorb unique expression ids. Trigger variables and stored procedure params still lack ids and fallback to string matching behavior.
  • 2291: backquote column names in default and generated column expressions
    To prevent parsing errors for special column names, we need to wrap column identifiers with backquotes.
    MySQL prints column identifiers in default and generated expressions when doing show create table ... with backquotes anyways.
    fixes dolthub/dolt#7388
  • 2288: Add more join statistic tests, some bug fixes
  • 2286: Earlier LOAD DATA and insert validation
    re: #2283
    fixes: dolthub/dolt#7313
  • 2285: Avoid generating SemiJoins when the subquery contains an OFFSET expression, as doing say may generate incorrect results.
  • 2281: groupby and having getfield index fixes
    Implementing more groupby and having strangeness from MySQL, and fixing some getfield indexing errors.
    Aliases in the projection scope were always added as new columns in the having scope; this led to getfields with indexes that were invalid in aggregation functions. Now, we don't add the alias to the scope if the alias overlaps with a column that is used in an aggregation.
    There is also a small fix to rewriting planbuilder tests to have proper tabbing, write skipped queries, and removed some random new lines.
    There are still some issues, left as skipped tests:
    • We incorrectly disallow aliases from the projection scope in aggregations
    • Columns that are aliased and used in group by should not be visible to having clauses; so, we should throw an error.
      Correctness: dolthub/dolt#7382
  • 2279: fix count distinct with decimals
    Swapped out the library used in the CountDistinct aggregation function, as it would hash decimals to the same value.
    Correctness: dolthub/dolt#7374
  • 2278: RangeHeapJoin should consistently sort NULL values before non-NULL values while managing its heap.
    Fixes dolthub/dolt#7260
    This was ultimately caused by #1903. I didn't think it was possible for that issue to cause user-facing problems, but I was wrong. Because of that issue, RangeHeapJoins considered all NULL values in its children iterators to come after all non-NULL values. However, if the child node was an index, then the child iterator would order its rows with the NULL values first. This causes the RangeHeapIterator to mismanage the heap and skip rows that should have been in the results.
    I updated the range heap code to manually check for NULL values when manipulating the heap. I also updated the plan tests to include NULL values in the test tables, which should now catch this issue.
  • 2274: Fixup index selection when prefix not complete
    Consider a query SELECT * from t where b = 1 and c = 1 and two indexes, (a,b,c) and (b,c). We want to use the (b,c) index as a lookup, because (a,b,c) will be disjoint on a (b,c) key. This PR fixes index costing to record and prefer non-zero prefix matches. We only differentiate zero and non-zero cases here because it is easier and I think pretty reliable.
  • 2273: Add support for the ALTER USER statement to change account passwords
    Adds basic support for the ALTER USER statement, so that users can change passwords. All users are allowed to change their own password; changing another user's password requires the CREATE USER privilege or the UPDATE privilege on the mysql database.
    MySQL reference docs for ALTER USER
    Related to: dolthub/dolt#7348
  • 2267: Fix HASH IN <tuple> when comparing against collated string types
    This PR changes the HASH IN to always use a collation to hash the values in the case that we compare a numerical type against a collated string type.
    fixes dolthub/dolt#7338
  • 2266: Field response packets MySQL consistency improvements
    While debugging a customer issue, I noticed that Dolt/GMS has several details in the field metadata packets that differ from MySQL's behavior. This PR fixes the following differences:
    • flags are now populated for nullability, unsigned types, blobs
    • column names for projected system variables now only include the variable scope when it was explicitly specified in the query
    • @@max_allowed_packet should be an unsigned integer type
      Vitess PR: dolthub/vitess#302
  • 2264: implement octet_length() function
    This PR implements OCTET_LENGTH(), which is just a synonym for LENGTH()
    MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_octet-length
  • 2263: Insert VALUES statements with subqueries index correctly
    fixes: dolthub/dolt#7322
  • 2262: implement elt() and field() function
    This PR implements the ELT() function, which just returns the nth argument as a string.
    This PR also implements the FIELD() function, which returns the first case-insensitive match to the first argument.
    MySQL Docs:
  • 2261: implement exp() function
    This PR adds support for the EXP() function in MySQL.
    MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_exp
  • 2260: implementing bit_count() function
    This PR implements the BIT_COUNT() function, which counts the number of 1s in the binary representation of the integer argument.
    MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/bit-functions.html#function_bit-count
  • 2259: implement atan2() function and support second argument for atan
    This PR adds support for ATAN() and ATAN2(), which just computes atan
    MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_atan2
  • 2258: implementing quarter() function
    This PR adds support for the QUARTER() function, which tells the quarter from the provided date.
    MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_quarter
  • 2257: preserve original col name for subqueries
    There are likely other places where we don't copy over originalCol name when creating scope columns, but I can't tell which should be left as empty and which should be filled out.
    fixes dolthub/dolt#7316
  • 2256: implementing ORD() function
    This PR adds support for the ORD() function in MySQL, which converts a unicode character into its numerical representation.
  • 2255: implementing CHAR() function
    This PR implements most of CHAR() functionality.
  • 2254: implementing pi() function
    Implements the PI() function, which just returns the value of PI
    MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_pi
  • 2253: implement SPACE() function
    This function outputs a string containing the number of space characters specified by its argument.
    MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_space
  • 2251: fix order by optimization
  • 2250: Add support for SUBDATE()
    Implements the SUBDATE() built-in function and cleans up a few other spots.
    Related to: dolthub/dolt#7277
  • 2249: fix decimal out of bounds in case statement
    Fixes dolthub/dolt#7079
  • 2248: Use DESCRIBE ANALYZE / EXPLAIN ANALYZE to display stats data about joins.
  • 2247: Merge joins populate join stats
  • 2246: round IndexAccess for numerical strings over int columns
    fixes dolthub/dolt#7261
  • 2244: Feature: SYSDATE() function
    Adds support for the SYSDATE() function.
    Related to: dolthub/dolt#7270
  • 2243: Fix context warn panics
    fixes: dolthub/dolt#7262
  • 2239: Support for doltgres prepared statements
  • 2238: [memo] RequiredIndex interface
    Some table nodes need to be executed as IndexScans with mandatory filters. The new interface makes this transparent to join planning.
    re: dolthub/dolt#7256
  • 2236: trim floats when hashing
    fmt.Sprintf("%v", x) writes floats with a x.0 which causes it to never equal hashed strings.
    Initially, I wanted to always convert the LHS to the type in the RHS, but this is difficult when there are multiple types in the RHS
    fixes dolthub/dolt#7246
  • 2235: Add source table and database to field metadata in wire responses
    The field metadata messages we were sending back for a result set did not include the source table and source database for fields. This caused a behavior difference from MySQL where the table-qualified column name (e.g. table1.ID) would not work with Dolt when using the MySQL Java Connector library. See dolthub/dolt#7247 for more details.
    Updates to unit tests for the schemaToFields code are in this PR, and I'll follow up with a PR in the Dolt repo that updates our MySQL Connector library integration tests to add coverage for the table-qualified column name.
  • 2234: use cached decimal type for casts
    Fixes a variety (hopefully 49) of SQLLogicTests involving CASTS(... AS DECIMAL)
  • 2233: Add logging for COM_RESET_CONNECTION command
    Adding debug logging to see when clients are sending this command, even though we don't implement it yet. For example, ORMs may send this command when returning a connection to a connection pool, so it may be helpful to implement this command and clear out session state in that case.
  • 2231: fix precision for utc_timestamp
    The UTC_TIMESTAMP() function should take in an argument and round the milliseconds. For now, we stick to always returning the full precision (6 places)
  • 2230: guard ctx and session with nil
    fixes dolthub/dolt#7235
  • 2228: fix type conversion in Between expressions
    Replace the logic in Between.Eval() with a logically equivalent AND statement to reuse the type conversion logic in comparison.go
    fixes dolthub/dolt#7229
  • 2227: Add JsonIter class for iterating over the key-value pairs of a JSON object.
    This is the GMS side of automating JSON merging in Dolt: just some type aliases and a simple iterator for getting the keys in a JSON object in a deterministic order.
    It's worth pointing out that currently Dolt stores JSON in a normalized form by sorting keys by length, but the iterator here uses a simple lexicographic order instead. This difference doesn't really matter at the moment because we unmarshall the entire object into a go map no matter what. But Dolt needs to be aware of the ordering used in order to correctly compute three-way diffs.
  • 2226: Error on NOW() eval with nil context
  • 2224: empty string is 0.0 decimal
    fix dolthub/dolt#7222
  • 2223: Fix multi db views with overlapping name
    fixes dolthub/dolt#7223
  • 2221: fix decimal max precision check
  • 2220: guard ctx functions with nil check
    fixes dolthub/dolt#7216
  • 2219: Implement fmt.Stringer for JsonDocument.
    GMS side of an upcoming Dolt PR.
    I'm improving error handling in schema merge tests by having the tests print the de-serialized values in incorrect tables instead of just their bytes. Since table values are passed around as an interface{}, it seems reasonable to have them implement Stringer.
  • 2218: implement NOW() siblings
    This PR has our behavior surrounding NOW() functions more closely match MySQL.
    Changes:
    • Added NOW() synonyms to registry
    • Have CURRENT_TIMESTAMP(), LOCALTIME(), LOCALTIMESTAMP() all just call NOW()
    • Support parsing synonyms in DEFAULT and ON UPDATE expressions
    • Fixed SHOW CREATE TABLE to print CURRENT_TIMESTAMP for NOW() and synonyms
  • 2217: Fix decimal parsing bug
    Decimals with capitalized exponential 'E' were incorrectly bound to literals, losing precision in some cases.
    select 5.0E-5; // -> 0.0001
  • 2216: Use indexes and lookups for cardinality estimates
    Cardinality estimation happens once after join reordering, indexing, and lookup join planning. The indexes and lookups help with result count estimates.
  • 2214: Fix wrongly written 'aribtrary' -> 'arbitrary'
    Fix wrongly written word 'aribtrary' -> 'arbitrary'
  • 2212: Join estimation algos
    Code and tests related to estimating output distributions for joins. Limited to numeric types.
  • 2210: sql/planbuilder: Update join defaults are not prefixed, need individual table schemas to resolve
    Updates need to resolve default expressions. The input definitions for update joins include the total join output columns. If two of those columns have the same name, the target table's unqualified default expression will throw an "ambiguous column" error. We partition the update join schemas/column definitions to sidestep the error.
  • 2208: fix type evaluating for mult and div arithmetic operations
    • For Mult expression, the .Type() method is fixed that it returns correct type for decimal type results, which uses the sum of precision and scale values of its right and left values.
    • For Div expression, the .Type() method is fixed that it returns the correct precision and scale for the final result.
  • 2207: Fix partial left join update when columns have default values or join table is in where clause
  • 2206: guard ctx.Span again
    #2203 again, because it was overwritten
    fixes dolthub/dolt#7182 again
  • 2205: use entire lateral scope when indexing RangeHeapJoin
    RangeHeapJoins looked at the lateral join scope when assigning indexes.
    However, we never tested nested joins for this case, leading to dolthub/dolt#7177
    What made the error more apparent was the string in the left scope that would result in empty results when doing lookups into the right tables. The fix was to look at the whole lateral scope when indexing RangeHeapJoins
    fixes dolthub/dolt#7177
  • 2204: fix foreign key panic on update
    When resolving ForeignKeys, ForeignKeyReferenceHandler.CheckTable() calls Partition() and iterates over all rows in the referenced table to ensure that the key is valid (doesn't contain any values that are not in the referenced table).
    WriteableIndexedDoltTable prevents us from reading every value through Partition() as it is only supposed to return rows that match the index range. Additionally, there are some differences between the way MySQL and dolt handle certain ddl and dml operations during delayed resolution of Foreign Keys.
    To fix this, we selectively perform table scans (only when we are creating/altering the table to resolve foreign keys).
    Fixes dolthub/dolt#6963
    Companion PR: dolthub/dolt#7203
  • 2203: guard ctx.Span with nil check
    We have places where we call expression.Eval(nil, nil).
    The null context causes a panic when we call Span on it.
    This PR just adds a nil check inside the *context.Span() receiver
    fixes dolthub/dolt#7154
  • 2200: Replace carder with stats
  • 2199: remove delimiter before caching the prepared stmt
    When using prepared statement, some create statements were being stored in the cache with delimiter at the end of it, which does not trimmed when executing the prepared statement. This causes consistency issue where in cases of storing a create statement, it will have the delimiter whereas running the same query without it being prepared statement trims before storing.
  • 2198: fix decimal/float conversion for HashInTuple expressions
    fixes dolthub/dolt#7147
  • 2197: prevent RangeJoinHeap plans over Filter nodes
    Currently, we don't have proper handling of nodes that aren't RangeHeaps in the RangeJoinIter.
    This PR makes it so that we don't create plans that would have these nodes.
    Additionally, we error instead of panicking when we see an invalid node in this area.
    fixes dolthub/dolt#7139
  • 2196: Prepend subquery scope to sql.TableFunction nodes
    Most but not all table functions implement sql.Table. Table functions that do not implement sql.Table still need to return prepended rows to maintain indexing rules.
  • 2195: fix type promotion for in expressions
    TODO: check type promotion for int -> float/decimal for all expressions
    fixes dolthub/dolt#7120
  • 2194: fix decimal result scale difference on server
    This PR fixes:
    • converting the decimal type values to given decimal column type before inserting.
    • getting precision and scale of final decimal type correctly.
  • 2193: Set the original_name field in response metadata in addition to the name field
    A customer reported that the MySQL C++ Connector library was unable to retrieve column name information from a Dolt sql-server. After looking at the two wire captures between MySQL and Dolt, this is because the MySQL C++ Connector library pulls the column name from the original_name field, not from the name field.
    I've updated the unit tests that assert the expected response metadata fields are populated, and I'll follow up next with some changes in the Dolt repo to our C++ Connector library acceptance tests so that they use response metadata and assert that it is filled in.
    After that, it would be good to proactively look at any other response metadata fields that we aren't setting. For example, the Flags field seems important to fill in correctly for tooling to use.
  • 2191: Move index costing into join planning phase
    Put index costing inside join planning, so that in the future join planning will have better cardinalities (statistics) for join ordering. Most of the changes will look like refactoring the way we expression index lookups in the memo. I attempted to do this in a way that makes as few changes as possible to join planning; the goal here is to set me up for rewriting cardinality checks with stats objects. It didn't go as cleanly as I wanted, I ended up shifting a lot of join plans back to lookup plans because HASH_JOIN was beating LOOKUP_JOIN in several key places.
    One downside of the current PR is that it converts a sysbench MERGE_JOIN into a LOOKUP_JOIN. I would prefer fixing this in the next PR when I do a bigger costing overhaul.
    Variety of fixes for join hinting, correctness, etc.
    At some point we appeared to fix this:
    #1893
  • 2189: Upgraded xxhash to v2
  • 2188: support on update current_timestamp
    This PR adds support for handling and executing ON UPDATE column expressions, which are triggered when an update is applied to any column in the row.
    TODO:
    • we support now() for default, but not the syntax for on update
      fixes dolthub/dolt#6030
  • 2187: fix round() handling of scale, precision, and nulls
    This PR has ROUND() behavior match MySQL more closely specifically when handling NULLs.
    Additionally, it refactors the function to no longer use custom logic, and rely on decimal.Decimal library for conversions.
  • 2186: Feature: Support BLOB/TEXT columns in unique indexes, without requiring a prefix length
    Allows TEXT and BLOB columns to be used in unique keys, without requiring that a prefix length be specified. This causes the secondary index to store a hash of the content, instead of the content itself, and then that hash is used to enforce the uniqueness constraint. This is useful to enforce uniqueness over very long fields without having to specify a threshold with a prefix length.
    This feature is supported by MariaDB and PostgreSQL, but not by MySQL. A new SQL system variable strict_mysql_compatibility is also introduced in case customers want to opt-out of extensions like this and stick to the exact behavior of MySQL. The default value of strict_mysql_compatibility is false.
    Unique secondary indexes using content-hashed fields have several restrictions, such as not being eligible for use in range scans or in any scans that require a specific order.
    The GMS in-memory secondary index implementation takes a simple approach – it doesn't actually hash encode the content-hashed fields, and instead includes the full column value. This is consistent with how the GMS in-memory index implementation handles other features, such as prefix lengths, which are also a no-op and the full content is stored in the secondary index.
    Dolt integration: dolthub/dolt#7104
    Related to: dolthub/dolt#7040
  • 2185: fix panic of concurrent map writes, when using in memory mode
    Replaces #2179
  • 2182: fix IN_SUBQUERY projection bugs
    Correctness regression fix. With a bit more work this could probably be a smaller query:
    CREATE VIEW view_2_tab1_157 AS SELECT pk, col0 FROM tab1 WHERE NOT ((col0 IN (SELECT col3 FROM tab1 WHERE ((col0 IS NULL) OR col3 > 5 OR col3 <= 50 OR col1 < 83.11))) OR col0 > 75);
    The CREATE panicked because the top-level projections get pushed into the source node, and my recent refactors failed to map projections onto the reported table output column sets.
  • 2181: Improve IN_SUBQUERY table disambiguation
    When unnesting and IN_SUBQUERY into a parent scope with a table name clash, rename the child table and update its references to the new name. Prevent EXISTS subqueries from unnesting if it doesn't full decorrelate the child scope.
  • 2178: Improve handling of charset and collate in column options.
    dolthub/vitess#293 should be merged before this.
  • 2177: Properly round IndexAccess Bounds for float/decimal type filters over integer columns
    When generating indexes ranges, we don't convert the range bounds type to the index bound type until later on.
    Additionally, when we do convert (specifically floats to ints) we round the floats based on the halfway point, leading to indexes skipping over rows depending on how it rounds.
    This PR changes that to convert the types earlier (keeping the key type and index type consistent), and rounding floating point bounds correctly to not leave out any rows.
    fixes dolthub/dolt#7072
  • 2175: Fix existant typo
  • 2174: Having aggregate alias bug
    fixes: dolthub/dolt#7082
  • 2171: Bug fixes for type handling in IF and COALESCE functions
  • 2170: have flush binary logs be noop
    fixes dolthub/dolt#7055
  • 2169: Unique table and column ids
    The motivation for this PR is making the costing/exploration phase consistent with the data structures for index costing. That means switching memo ScalarExpr back to sql.Expression. Moving the previously join-specific ids in ScalarExpr to tables and column refs lets us preserve most memo data structures and join transformation logic. The alternative to combining these changes would be to rewrite all of the memo data structures and transformation logic to not use ids, which would be a fairly long and difficult process.
  • 2168: fix panic in math funcs
    fixes dolthub/dolt#7060
    Additionally, fixes POW() to not have the same panic and returns warnings instead of errors for certain inputs to LOG().
  • 2166: prevent panic on nil cast for time functions
    fixes dolthub/dolt#7056
  • 2165: fix update <table> set <column> = default
    This PR fixes a bug where attempting to update a column to its default would throw an unresolved error.
  • 2161: SHOW PLUGINS
    fixes: dolthub/dolt#7049
    depends on: dolthub/vitess#291
  • 2160: fix Inf and NaN for trigonometric functions
    fixes dolthub/dolt#7046
  • 2158: have DayName() return NULL on bad DATETIME conversions
    fixes dolthub/dolt#7039
  • 2157: don't round floats when converting to bool
    fixes dolthub/dolt#7038
  • 2156: Decorrelate IN_SUBQUERY refactor
  • 2155: Allow BLOB/JSON/TEXT columns to have literal default values (MariaDB compatibility)
    Technically, MySQL does NOT allow BLOB/JSON/TEXT columns to have a literal default value, and requires them to be specified as an expression (i.e. wrapped in parens). We diverge from this behavior and allow it, for compatibility with MariaDB.
    While testing with a binary literal, I noticed that SQLVal was converting that value to "BLOB" instead of the actual binary content, so I fixed that one, too.
    Related to: dolthub/dolt#7033
    Dolt CI Checks: dolthub/dolt#7036
  • 2154: Use max prec and scale for decimal oob
  • 2153: null in-tuple bugs
    fixes: dolthub/dolt#7025
  • 2152: Update RangeTree MaxUpperBound when removing nodes
    Verification code from in the last RangeTree related PR caught some bad ranges in the sqllogictests.
    The cause was not properly updating the MaxUpperBound in nodes for the rangetree when performing a remove operation, which led to missing connections when pruning ranges.
    This also prints MaxUpperBound in the String/DebugString methods for easier debugging.
  • 2151: add decimal type to convert functions
    fixes dolthub/dolt#7018
  • 2149: updating RangeTree node MaxUpperBound again
    A previous fix involved updating the MaxUpperBound in the RangeTree when traversing the right node, turns out we need to do that when creating a new node as well.
    To better catch overlapping range expressions, we now verify that the resulting ranges do not overlap (an operation which isn't too expensive). This fixes some plans from an index refactor.
    Additionally, this also fixes a skipped test where the ranges were not overlapping but different than the brute force approach.
  • 2148: fix for foreign key that references dropped table in information_schema.referential_constraint table
  • 2147: Fix mod bool conversion
    fixes: dolthub/dolt#7006
  • 2146: Update MaxUpperBound when inserting into RangeTree
    Recent changes to index costing exposed a bug in RangeTree. This bug is responsible for a small regression in the sqllogictests, involving a complicated filter.
  • 2145: mysql server handler intercept support
    Split from the PR /~https://github.com/dolthub/go-mysql-server/pull/2036.
    Add mysql server handler intercept support.
  • 2144: Push filters insensitive to table name
    Filter pushing bug that is specific to 1) table names with capital letters, and 2) filters that need to move through joins. The problem is not indexing specifically, but checking for an index is the easiest way to test this.
    dolt bump: dolthub/dolt#7001
  • 2142: Idx histogram manipulation
    Add simple histogram mutators for filter types. Use histogram costs for index selection when available. Added stats docs.
  • 2141: Fixing field metadata for JSON and geometry types
    JSON and geometry types should always report a binary collation in MySQL's field metadata. While debugging dolthub/dolt#6970, I noticed that MySQL was sending a binary collation for these types, but GMS was sending back the default collation.
  • 2140: Respect character_set_results when emitting field metadata
    For non-binary types, we need to respect the value for the character_set_results session var (when not NULL) and use that for the field metadata returned in the MySQL wire protocol.
    The unexpected charset/collation metadata is causing DataGrip to be unable to work with some types in the table editor ( see * 2138: Fixed SET working for invalid charsets/collations
    Fices dolthub/dolt#6972
  • 2137: fix nested subquery filter in exists
    We do not explore the children of subquery when attempting to decorrelate filters for exists queries; we now do this through the use of subquery.Correlated().
    We should also avoid using uncacheable subqueries as keys for IndexLookups.
    fixes dolthub/dolt#6898
  • 2136: Fix nil range correctness bug
  • 2135: Resolve indexes of columns in CREATE TABLE statements early
  • 2134: Costed index scan framework
    This is the accumulation of the following PRs:
    #2093
    #2104
    #2112
    #2124
  • 2133: Add query plans for index/join sysbench queries
  • 2132: ReferenceChecker interface
    re: dolthub/dolt#6957 It is expensive and unnecessary to deserialize blobs during FK reference check lookups.
  • 2131: Fixed Full-Text defaults
    Fixes dolthub/dolt#6941
  • 2130: More TPC-C tests, fix the slow HASH_JOIN
    The randIO parameter for LOOKUP_JOIN costing was perhaps too strict, since that cost is already stacked on top of the sequential cost. This isn't a replacement for better costing, but boosts TPC-C perf a bit and isn't less correct than the previous version.
  • 2126: fix inner join filter pushdown, and alias matching
    This PR fixes an issue we have with InnerJoins where the filter condition uses a function and references columns from both tables. We need to properly search the expressions in a SubqueryExpression.
    Additionally, addresses an getIdx bug involving unqualifying aliases.
    partially fixes dolthub/dolt#6898
  • 2125: Minimal index searchable interface
    re: #2036
  • 2121: fix panic when calling ST_POINTFROMWKB() with no arguments
  • 2120: Prevent virtual columns from being used in primary keys
    Also tests for stored generated columns in primary keys
  • 2116: Grant Options privs need the AdminOnly treatment too
    This addresses a gap discovered while writing dolt tests - Grant Option on procedures is not currently validated correctly, resulting in only super users being able to set grants on procedures. This should address that.
  • 2115: Fix ExistsSubquery with functions
    Our hoistSelectExists optimization incorrectly generates SemiJoins when there are OuterScope column references in projections in subqueries. In the future, a possible optimization could be to have SemiLateralJoins that properly grant this visibility.
    Also contains small refactoring and extra debug information for coalesce function.
    Fixes one of the queries here: dolthub/dolt#6898
  • 2113: Allow ScriptTestAssertion to specify when an assertion needs a new session
    As part of testing Dolt's reflog feature, I need to call dolt_gc() and then check the reflog behavior. Because dolt_gc() invalidates the session, I needed to add this hook that signals the test framework to create a new session for a ScriptTestAssertion.
  • 2110: Procedure privledges
    Adds support for respecting procedure and function permissions. Also added AdminOnly flag for external procedures to indicate that they should not have their privileges evaluated in the standard MySQL hierarchical way. This will allow us to tighten dolt procedures access.
    The ability to grant access is still blocked behind an environment variable. That will remain until dolt changes have been released.
  • 2109: adding join and subquery tests
    Convert many of the sqllogictests into enginetests for visibility
  • 2108: Push not filters
    De Morgan's laws and leaf filter inversions to get NOT expressions as low in filter trees as possible. This will make index costing NOT filters easier.
  • 2102: view aliasing bug
    The View.ViewExpr AST object seems to drop aliasing information that we depend on for query schema presentation. I want to circle back to have the view AST expression be consistent with the view definition, but this forces a re-parsing to get the correct view schema.
  • 2100: Refactor or the NewPrivilegedOperation method
    Adding support for routines is coming, and and a bunch or string arguments is cumbersome.
    Note to reviewer: start at sql/privileges.go!
  • 2099: Add the mysql.procs_privs table
    In order to support procedure and function permissions, the mysql.procs_priv needs to be supported. This change add support, but gates the ability to create these grants because they are not being used for actual permission checks yet. That will come next.
    I have engine tests on another branch due to the gate. I'll add them when the feature is complete. There are also bats tests in flight in the dolt repo which can be seen here:
    dolthub/dolt@4d8fe2a...macneale4/privs_tests_wip
  • 2098: Logging improvements for prepared statements
    Updating from mysql.Handler changes in 7da194ad69efa0e4cc3992f860ecb92550482d62 – adding debug logging for ComPrepare and including params count, other minor logging cleanup.
    Related Vitess PR: dolthub/vitess#286
  • 2095: give child SQA isLateral if parent SQA isLateral
    If a parent SubqueryAlias is marked IsLateral, then mark its child SubqueryAliases with IsLateral as well.
    This essentially gives the child SubqueryAlias visibility to the left subtree all the time, which fixes the linked issue.
    We should be able to differentiate between different scopes of lateral joins and only grant visibility to those columns, which could be a subset of the parent/left columns. However, I don't think we are close to getting that working.
    fixes dolthub/dolt#6843
  • 2094: Add support for view column clause
    re: dolthub/vitess#285
    closes: dolthub/dolt#6859
  • 2091: Non-ambiguous ORDER BY col should not error
  • 2090: Allow queries with duplicate table names if those tables come from different databases and all references are unambiguously resolvable.
  • 2088: fix off by one for found_rows when limit > count(*)
    We had a bug where we could increment the limit counter before receiving the EOF error.
    fixes dolthub/dolt#6829
    companion pr: dolthub/vitess#283
  • 2086: Server handling parsed statements
    Initially this was going to be a bit more involved, as I was planning on having Dolt expose a new interface, and we'd directly pass in GMS ASTs rather than Vitess ASTs. The Dolt interface approach turned out to be a lot more involved than first anticipated, and the construction of GMS ASTs needs state that we will not have at higher layers, and exposing such state is also a lot more involved. Therefore, I've made a compromise by accepting Vitess ASTs instead, which makes this vastly simpler. It's not going to be quite as powerful, but I think it can still serve our purposes for the foreseeable future.
    This basically works by hijacking that fact that we'll sometimes process Vitess ASTs via the prepared cache. If we receive a Vitess AST, then we skip the cache, otherwise we access the cache like the normal workflow.
  • 2085: small fixes for new test harness using server engine
    This PR adds small fixes to server_engine that is used to test existing engine tests over running server.
  • 2084: Remove Dead Grant/Revoke code
  • 2083: Virtual column index support
    Implements index support for virtual columns, and fixes several bugs related to generated columns for certain statements.
    Changes the memory table index implementation to actually store secondary indexes separately, rather than fake it via the primary index. The indexes are stored via sorted slices for now. I'll do another pass and replace them, and primary storage, with btrees for speed now that I have this proof of concept working.
    Also introduces a new interface for rebuilding a single index, rather than doing a complete table rewrite every time an index is created.
  • 2081: Refactor JSON interfaces, stats interfaces
  • 2080: drop sort when secondary index is available
  • 2079: reverse filters for reverse lookups
    fixes dolthub/dolt#6824
  • 2078: fix panic when order by column is out of range
    We used to blindly index ORDER BY values, which resulted in panics.
    Now, we throw the appropriate error.
    Additionally, this matches MySQL behavior when performing ORDER BY with indexes < 1.
    • ORDER BY 0 = error
    • ORDER BY -1 = noop
  • 2077: sql.StatsTable->RowCount returns whether the estimate is exact
    Add return argument for whether the a RowCount can be a substitute for count(*).
  • 2076: pick float type if one side is non-number type
    If one side or comparison is non-number type and the other side is number type, then convert to float to compare and the non-number type value can be float type.
  • 2075: Fix information_schema row count regression
    If a table does not implement RowCount() we used to use 1000 as a default value for row costing. A recent refactor changed that to 0. This fixes information schema tables to report the 1000 value again, which is usually accurate for small databases because of default tables and columns. I also fixed some issues with database reporting for info schema tables.
    This regression probably still exists for some dolt tables and table functions. I will do a pass and see if I can add some more accurate values on the Dolt side.
  • 2074: optimize min(pk) and max(pk)
    This PR adds an optimization to queries that have a MIN or MAX aggregation over a PRIMARY KEY column.
    Since indexes are already sorted (and we support a reverse iterator) we can look at the first/last row to answer queries in this form.
    The new analyzer rule, replaceAgg, converts queries of the format select max(pk) ... from ... to the equivalent select pk ... from ... order by pk limit 1. Then, we depend on an replacePkSort to apply IndexedTableAccess
    Additionally, this PR has replacePkSort optimization apply to queries that have filters (specifically those that were pushed down to IndexedTableAccess)
    There is also a some refactoring and tidying up.
    Fixes dolthub/dolt#6793
  • 2073: Bug fixes for explicit DEFAULT values in INSERT statements
    Previously this was broken when DEFAULT values referred to other columns.
    Fixes dolthub/dolt#6430
    Needs more tests, but this fixes the immediate buggy behavior.
    Also fixes a related bug in MySQL: https://bugs.mysql.com/bug.php?id=112708
  • 2072: support json_valid() function
    Add support for JSON_VALID(val) function
  • 2071: Updates for Dolt stats
    • json_value and json_length added
    • json_table edited to support json document inputs.
    • our custom json marshaller supports types that implement the json.Marshaller interface
    • increased recursive iter limit to 10,000 to more easily generate 3-level prolly trees for statistics testing
      Note: the json_value notation is different than mysql's. I accept the type as a third parameter, rather than expecting a RETURNING clause.
  • 2070: Add routines to PrivilegeSets
    Added some Unit tests as well. This is in preparation for supporting routine grants.
  • 2068: Virtual column proof of concept
    This needs a lot more tests, but this demonstrates the approach within can work well enough
  • 2067: adding sqllogictests
    This PR adds some utility scripts to convert CRDB testing files into SQLLogicTest format.
    Additionally, this adds tests focusing on join and subqueries.
    Some notable tests are added as skipped enginetests.
  • 2065: Remove redundant information_schema creation in the driver example
    In the driver's example, information_schema is duplicated.
    It is created in two places.
    1. factory.Resolve()
    2. Catalog in Analyzer in Driver.OpenConnector()
      There is no need to create it in the factory.Resolve().
      I checked databases using this:
    diff --git a/driver/_example/main.go b/driver/_example/main.go
    index 34e0580ed..adbc1a249 100644
    --- a/driver/_example/main.go
    +++ b/driver/_example/main.go
    @@ -35,6 +35,14 @@ func main() {
    rows, err := db.Query("SELECT * FROM mytable")
    must(err)
    dump(rows)
    +
    +       rows, err = db.Query("SHOW DATABASES")
    +       must(err)
    +       for rows.Next() {
    +               var db string
    +               must(rows.Scan(&db))
    +               fmt.Println("db:", db)
    +       }
    }
    func must(err error) {
  • 2064: Implement fast join ordering for large joins that can be implemented as a series of lookups.
  • 2063: Improve functional dependency analysis for joins.
    This is a prerequisite for fixing dolthub/dolt#6713.
    Basically, this PR does two things:
    • Adds additional bookkeeping to FDS in order to track "partial FDS keys", that is, column sets that determine some (but not all) of the other columns in a relation. Before this PR, we attempted to only track keys that determined the entire relation. This would cause us to lose some information and prohibit some optimizations. (There were also a couple of cases in nested joins where we accidentally added partial keys to FDS anyway (by adding a key from a child table to the FDS of the parent.) If we ever used these keys it could have caused correctness issues, but it doesn't look like we ever did.
    • Improves the simplifyCols method in FDS to use partial keys in order to improve analysis.
      Overall, these changes allow us to compute much better FDS keys for complicated joins, by allowing us to remember and reuse keys derived from child tables to improve the computed key for tables later in the join.
  • 2062: "CREATE TABLE" fails with a "database not found" error when using the driver
  • 2061: More histogram support, redo stats table
    Reshape statistics interfaces to better support custom Dolt implementation.
    • Add ANALYZE TABLE <table> [UPDATE/DROP] HISTOGRAM ON <column,...> USING <json blob> support and a few tests
    • Replace use of update information schema cardinality updates with histogram updates
    • Default catalog has an in-memory histogram
    • New memory histogram impl
    • Delete old statistics table/histogram code
      The only prod difference is that the new update path can overwrite the default table count, while it was a testing-only thing before.
      companion: dolthub/vitess#279
      Dolt bump seems OK
  • 2060: Fixed various bugs in last_insert_id
    Fixes dolthub/dolt#6776
    Also adds better test coverage for LAST_INSERT_ID() and the INSERT_ID field in the ok response for updates.
  • 2059: fix db qualified column names in order by
    We did not handle the case where database qualified column names would be in the order by clause.
    This PR fixes that issue and adds a variety of tests with database qualified column names and database qualified table names.
    There is still a case where joining two tables with the same name from two different databases results in an error; there's a skipped test for this, and a workaround is to alias the tables.
    fixes dolthub/dolt#6773
  • 2054: prevent projections to be pushed past lateral joins
    pruneTables will pushdown projections as low as possible to avoid passing around columns that won't make it to the final result.
    In the case of lateral joins, a column that isn't referenced in the topmost projection may still be referenced by subqueries, which causes problems. This PR prevents the projection pushdown optimization from applying to the children of lateral joins.
    A better solution might be to determine which columns are referenced, and trim out all others. However, that seems hard.
    fixes dolthub/dolt#6741
  • 2051: Trigger view errors don't prevent writes on database with preexisting trigger views
    PR #2034 made it so that we can not DDL triggers on views. But past versions of Dolt permitted creating those triggers. After this change, databases with a trigger view will be have all writes blocked with the trigger view error. We should probably not try to parse and bind all triggers for every write, but as long as we do this adds a warning rather than an error for a non-DDL trigger parse.
  • 2050: Fix binding of time.Time via driver
    When using the driver in v0.17.0, an error occurs when binding a variable of type time.Time as a query parameter.
    The error message looks like this:
    type time.Time not supported as bind var: 2023-10-01 17:37:49.382855116 +0900 JST m=+0.017340108
    
    This issue did not occur in v0.16.0.
    While it's possible to resolve this by modifying sqltypes.BuildBindVariable() in vitess,
    this PR resolves it within the driver itself by converting the variable into a timestamp literal string.
  • 2049: adding gosql prepared tests
    We are lacking in tests that use the gosql driver to connect and run queries against an in-memory GMS server. This PR creates a new test suite so it's easier to write these tests. Additionally, there are some tests targeting a bug involving unsigned integers being read as signed.
  • 2048: Prevent identifiers longer than 64 characters
    Also fixed a bug where we allowed multiple column names with the same case-insensitive name
    Fixes dolthub/dolt#6611
  • 2047: Don't reorder joins that are too large to efficiently analyze.
    The current implementation of the join order builder scales poorly if there are too many joins. It's likely possible to improve it, but in the meantime, I'm disabling join reordering on joins that have too many tables (currently defined to be more than 20.)
    In these situations, the analyzer takes longer to run the reordering than it does to actually execute any of our test cases, so running the analysis in this case can only slow us down.
    I expect this is unlikely to adversely affect users because joins this large are rare, and when they do occur they are often written in a way that the explicit order is good enough.
    For example, this test from sqllogictests:
    SELECT x63,x53,x62,x52,x11,x5,x40,x64,x27,x28,x21,x41,x22,x30,x16,x14,x56,x32,x46,x50,x1,x34   FROM t46,t34,t1,t32,t53,t21,t63,t11,t30,t62,t27,t50,t16,t64,t40,t56,t22,t28,t52,t5,t41,t14  WHERE a21=b5    AND b30=a52    AND a62=b46    AND a14=3    AND b52=a28    AND b53=a14    AND a63=b28    AND b40=a56    AND a11=b64    AND a53=b22    AND b1=a34    AND b32=a41    AND a50=b63    AND a64=b62    AND b11=a30    AND b27=a40    AND a22=b56    AND b21=a46    AND a1=b50    AND b34=a16    AND a27=b16  AND a5=b41;
    
    takes 30 minutes to reorder, and 15 seconds to run when reordering is disabled.
    MySQL runs the query in under a second, demonstrating that reordering can still massively improve performance if we can make the algorithm more efficient. But this is a good stopgap measure.
  • 2044: use session builder from harness in the server engine
    Small fixes for memory harness for enginetest:
    • use sessionBuilder from the harness instead of DefaultSessionBuilder
    • convert row result for SHOW queries
  • 2041: not panic on Star.IsNullable()
    This reverts #2039 because the fix was not correct choice for the issue dolthub/dolt#6659.
  • 2039: AliasedExpr.InputExression should be compared case insensitive
  • 2038: error msg for invalid reference to non-existent table or column in existing view
    It catches invalid reference to non-existent table or column in existing view. This includes SELECT queries on a view that references table or column that was removed or renamed.
    Note: For now, It does not catch references to invalid functions or users without appropriate privilege cases and queries other than SELECT queries.
    Fixes: dolthub/dolt#6691
  • 2035: Tests for errors during insert not fouling a session
  • 2034: prevent creating trigger on view
    fixes dolthub/dolt#6432
  • 2033: Add VERBOSE_ANALYZER environment variable.
    This environment variable will make the analyzer output the current optimization plan after each analyzer rule.
  • 2032: fix order by on unioned schemas
    When unioning two SELECT statements that have different column types, we would get -1 during assignExecIndexes, resulting in a panic.
    This PR fixes the issue by matching on unqualified column names when we don't have an exact match.
    We don't find these matches because the second table has an unqualified alias over the column name because it is wrapping it in a convert node.
  • 2031: UNIX_TIMESTAMP() respects session time zone
    • For UNIX_TIMESTAMP() function, it converts the time value to be in the current session TZ instead of UTC TZ before returning the final value because the initial value is parsed as in UTC TZ, which is incorrect.
    • The default value of system_time_zone global variable will be set to the system TZ instead of UTC.
  • 2030: unskipping fixed tests
    We have many tests that are marked skip/broken, but they are working now.
    This PR unskips and cleans up some of these skipped tests.
  • 2029: Return innodb_lock_wait_timeout=1 always
    See: dolthub/dolt#4190
  • 2026: Do not error for SHOW view indexes
    Until we support view indexes, return nil for show view keys/indexes.
    fixes dolthub/dolt#6705
  • 2025: Failure to push filter causes dropped filter
    We make a hard assumption during join planning that there are no errant filters in the join tree. Every filter is either a join edge, or sitting on its relation. When this is not true, the memo can generate a transitive edge between two relations that loses track of the original filter. The process for triggering this bug is 1) filter in an ON condition gets moved to the middle of the tree, 2) the filter fails to get pushed to its join edge/relation, 3) we generate a transitive join edge that loses track of that filter, and then 4) we choose the transitive join edge in costing.
    You'll see the filter restored in the integration query plans in the PR. I added a minimal repro with the appropriate ON conditions and forced a transitive edge that drops the filter if pushdown regresses in the future.
  • 2022: TPC-X query plan tests
    Added schemas, stats, query plans for:
    • TPC-H
    • TPC-DS
    • IMDB join planning benchmark
      Added plangen to auto-update the tests.
      We cannot parse all of the TPC-DS query plans yet. I saw some ROLLUP and aggregation validation errors.
      Excluding data ops benchmark because the plans are not interesting.
  • 2018: Avoid corrupting the privileges file
    Currently, the last of these commands results in a panic due to the revoke inserting a database with an empty string for a name. This is fairly awkward to test in GMS, so I'm going to take the easy route and create a bats test in dolt.
  • 2017: Retaining display width for TINYINT(1)
    MySQL allows integer fields to specify a display width (e.g. TINYINT(1)). go-mysql-server currently parses that information, but doesn't retain it anywhere. This PR changes that behavior to match MySQL and retain the display width setting so that it can be passed back to callers.
    As of MySQL 8.1.0, the display width setting is ONLY retained for signed TINYINT fields and ONLY when the display width is set to 1.
    Fixes: dolthub/dolt#6682
    Corresponding Dolt PR: dolthub/dolt#6688
  • 2016: Bug fix: Preserve column name case for views
    Fixes dolthub/dolt#6683
    Dolt CI Checks: dolthub/dolt#6684
  • 2015: GROUP BY identifiers should prefer binding to table columns over projections.
    This also means an expression is allowed to project from the GROUP BY column multiple times.
    Fixes dolthub/dolt#6676
  • 2014: Improve message from json_extract() when json path doesn't start with a '$'
  • 2013: Change Project nodes so they can't return negative zero.
    In some cases we differ from MySQL in what types we use for intermediate results. This doesn't usually affect the final output, and can be more performant.
    But if the final result of an expression is a float when MySQL deduces it to be a decimal, and we don't do anything else to value that would cause it to be coerced (such as inserting it into a table with a schema), then we could end up displaying a result of the wrong type to the user. Usually this doesn't matter, unless that result is the float value -0 when the user expects a decimal.
    Ideally we'd prefer to detect the expected type and do a cast, but this is an acceptable stopgap measure.
  • 2012: Insert on dup col ordinal bug
    A certain set of conditions causes an error for indexing on duplicate update expressions:
    • The source is a SELECT statement (not a VALUES row)
    • All columns are specified by the INSERT (not sure why, but partial columns seems to get rearranged correctly. I think we must insert a compensating projection to handle column defaults)
    • The source columns are not the same order as the destination table schema
    • On duplicate update expression references a column from the new row
      For the query below, we were indexing the on duplicate expression in the wrong order, causing the output row to be two zero types:
    create table xy (x int primary key, y datetime);
    insert into xy (y,x)
    select * from (select cast('2019-12-31T12:00:00Z' as date), 0) dt(a,b)
    on duplicate key update x=dt.b+1, y=dt.a;
    The way we resolve inserts is still a bit weird. We resolve the source, and then afterwards add a projection to rearrange columns to match the target schema. I ran into a lot of problems trying to rearrange that ordering (first add projection, then analyze), mostly due to our inability to fix indexes on the source node's projection (VALUE nodes don't have a schema, and it isn't obvious when walking a tree that a given projection is going to be special). When we add the projection afterwards, however, it avoids the indexing rule so we can inline the values safely.
    My current fix is to mimic the projection mapping inside indexing. Index the duplicate expression values based on the ordinal of the destination schema. LOAD DATA for some reason needs its insert columns to not be specified, which will probably the source of different issues at some point.
    fixes: dolthub/dolt#6675
  • 2009: Re-enable query logging by default for DEBUG log level
  • 2008: Rewrote memory table editors to enable table rewrites
    This also fixes a number of bugs that were discovered during this process, notably not properly closing table editors in all instances.
  • 2007: Fixed character set IDs
    Character set IDs should correlate with their default collation's ID. Previously, they were arbitrarily assigned by sorting their names alphabetically. This should not be a breaking change for anyone, as the comment on the CharacterSetID mentions that the ID may change, and should not be persisted. Dolt, the largest integrator, abides by this rule.
  • 2006: Semi join and FDs bug
    Returning no projections from a table causes "column not found errors" when we try to reference those expressions higher in the tree. This fixes the semi join transform to creating empty projections.
    This fixes two bugs. The first is that we were too conservative checking whether index keys were strict FDs for a join relation. When a relation has a constant applied to a primary key, we can assume all of the columns returned by that join will be constant. Fixing that made it easer to test certain semi -> right lookup join transforms which were buggy. For the same case, when we are doing a lookup into table where a constant filter satisfies an index key, we need to still return a projection set that covers non-pruneable columns used in higher-level nodes.
  • 2005: implement INTERSECT and EXCEPT clause
    This PR implements two MySQL functions, INTERSECT and EXCEPT.
    They are similar to UNION in that they combine to two sets of rows.
    INTERSECT is the set of rows that are in both left and right.
    EXCEPT is the set of rows that are in the left that are not in the right.
    Additionally, there is a bug with field indexes with ORDER BY over unions.
    companion pr: dolthub/vitess#271
    fixes dolthub/dolt#6643
  • 2004: Enable use of slices in tuples for HashLookups
    Currently FusionAuth crashes Dolt with the following error:
    panic: runtime error: hash of unhashable type []uint8
    
    All FusionAuth IDs are binary(16), and join on those values in a HashLookup was resulting in using two []uint8 being used as a key to a hashtable. Nested arrays in tuples were tripping on an optimization made for short arrays. We've verified that optimization doesn't actually made a difference, so this change simplifies the code and makes it more generic.
  • 2003: Delete fixidx package
    remove remnants of old field indexing code
  • 2000: adding new ways to say varchar
    This PR makes to so the engine recognizes more ways to specify that a column is of type VARCHAR
    Companion PR: dolthub/vitess#270
    Fixes dolthub/dolt#6650
  • 1996: Move join indexing after all
    This should put almost all indexing logic into one rule that runs once at the end of analysis for a given query. It should require one walk of the tree, be much more correct for nested join and subquery indexing, and allow us to add nodes with special indexing logic much more easily.
    Summary:
    • new rule fixupIndexes replaces all of the other default indexing code
    • FixFieldIndexes still exists for insert source projection wrapping and LOAD DATA, but both of these can be easily rewritten to remove
    • CheckConstraintTable interface for Checks() and WithChecks() helpers
      The way the new fixup works is to traverse the tree in 3 phases. Each phase has a default mode but can be handled case-by-case for special node logic:
      (1) Descend child nodes of the current node. Collect children and child "scopes" that contain schema info
      (2) Index the current node's expressions.
      (3) Re-build the current node, and fix its "schema" to pass upwards into a parent
      The walk and scopes used for indexing should mirror exactly what we do on the execution side. As a result, some of the nutty logic for nodes could be standardized by changing what we do at execution time.
      note: The initial implementation is not particularly memory-efficient. Currently trying to filter for Dolt-side uses of old indexing functions that would need to be refactored.
  • 1995: Added option to change protocol listener
    This allows plugging in other protocol listeners besides the default MySQL one.
  • 1994: Fixes for subquery indexing/correlation tracking
    I missed a place where we were using getField indexes to do a correlation check. This adds the changes necessary to replace that check with one that uses a subquery's tracked correlation column set. This also adds an expression id column to GetField expression that preserves the id tracking information between iterative passes of the analyzer. It would probably be preferable to avoid all cases where we unnecessarily re-run rules, but this is more near at hand.
  • 1993: JSON Array Mutation
    The json_array_insert() and json_array_append() methods are in the family of other mutation functions ((set,insert,replace,remove).
  • 1992: Delete fixidx from pushdown
    Summary:
    • delete fixidx usages in pushdownFilters and generateIndexScans
    • use a mapping on subquery aliases when pushing filters through subqueries
    • better max1Row memo now
    • misc changes to re-indexing to compensate for work previously done during pushdown
  • 1991: Fix show processlist panic
    close dolthub/dolt#6625
  • 1990: Cast limit/offset to int type for prepared path
    fixes: dolthub/dolt#6610
  • 1989: Improve the efficiency of newIndexAnalyzerForNode by avoiding visiting Filter nodes.
    This prevents super-linear runtime in generateIndexScans
  • 1988: JSON Mutation functions implemented as stored procedures
    Earlier PR (#1983) enabled us to modify json document objects, but none of that functionality was exposed as actual JSON_* functions. This change ties it together. The following functions will behave identically to MySQL (to the best of my knowledge).
  • 1987: Move JSON Functions into their own sub directory
    Purely mechanical refactor done by the IDE. This is in preparation for adding support for several more JSON functions.
  • 1986: When pushing down filters, ensure removal of original filter.
    In cases where the filter expression changed during push down because the column IDs changed, we were accidentally checking for expressions with the new column IDs, not the old ones, so the old filter expressions weren't being removed.
  • 1985: Fix panic in merge join when using custom Indexes that don't allow range lookups.
    For instance, dolt has Commit indexes for tables that use commit hash as an index, but ranges don't make sense for those.
    There's no equivalent in GMS, so I created "point_lookup_table" table function for use in tests.
  • 1984: Partially reorder indexing rules; use unique ids rather than execution ids
    Name binding stores caching information upfront. Rule interdependencies pushed me into fixing a bunch of other rules before tests would pass. All together I think most of the changes are simplifications that I was planning on doing related to the fixidx refactor. I was hoping to make it more piecemeal. Hopefully this gets us ~50% of the way towards removing those dependencies.
    fixidx is mostly contained to reorderJoins and fixAuxiliaryExpressions now, both near the end of analysis. If we move the indexing in reorderJoins into fixAuxiliaryExpressions, all indexing will happen at the end of analysis. That would let us index complicated joins with subqueries correctly and all queries more reliably.
    summary:
    • rewrite cacheability to use correlated column references
    • volatile functions now prevent caching
    • rewrite moveFiltersOutOfJoinConditions to put filters below join when appropriate
    • subquery decorrelation uses (and updates) correlated column references
    • alias subquery strings simplified to use the query string, not the plan string
    • fix jsonTable and lateral join analysis
    • fixAuxiliaryExpresssions at end of analysis
    • recursive analyzer rules (insert, trigger, procedure) are all at end of analysis now
  • 1983: JSON Mutation
    Add JSON Value mutation operations and tests. These changes do not alter the operation of dolt in anyway yet - that will come in a second PR which updates the JSON_SET procedure, and adds support for the JSON_REPLACE, JSON_INSERT, and JSON_REMOVE procedures at the same time. This is laying the foundation for that work.
  • 1982: Inline flatten aliases
    This will skip a tree walk for most queries, inlining the rule in the places where nested table aliases can occur during binding.
  • 1981: Make the IntSequence test function consistently use int64 for its generated values
    The previous implementation had an issue where it assumed the type used in the received IndexLookup, but this type can actually depend on exactly how the lookup was generated (and whether the bounds value was parsed from the query or generated internally.) This caused a panic if it was used in Lookup joins.
    This makes no such assumptions and adds extra tests.
  • 1980: FreeBSD sockstate workaround
    Close #2138
  • 1979: Refactor event resolving
    Inline loadEvents and move event resolving out of execution time.
  • 1978: Fix table lookup bug for nil asOf
    Edit: Show keys for info schema table should work the same way as MySQL now.
    Prev: This error message is not correct, MySQL finds the table and does not error. But this at least reverts the behavior to what Dolt did before the name resolution refactors.
  • 1976: Refactor prepareds to cache AST
  • 1975: Union NULL types cast to same type as neighbor
    3+-way unions are subject to a bug where a NULL projection will be cast to a char type, and then have a type clash with the subsequent table relation. Rearranging the query to have the NULL projection last appears to fix the bug because the convert cast is moved to a position where it cannot have a type clash with a table relation to the right.
  • 1974: Re-implement generated stored columns with new analyzer
    Testing revealed a bug in explicit DEFAULT values in insert statements, added a skipped test for same.
    Also rewrote column default tests as scripts.
  • 1973: Fixed behavior for CONVERT(x USING y)
    The CONVERT(x USING y) expression allows us to convert between character sets. When a string does not have a representing rune in the target character set, it should be replaced by a question mark. This is consistent with MySQL. Our previous behavior logged an error, which is valid in many scenarios, but not for this particular expression.
  • 1970: sql: ExternalStoredProcedureDetails: Add the ability to mark stored procedures read-only.
  • 1969: Added a QueryEngine abstraction to engine tests to allow us to run engine tests on a running server
    Implementation is partially working. Lots of errors when used in tests, some of which are real and some of which are issues with test definitions. Will keep working on this over time, but want to get the necessary refactoring in before continuing.
  • 1968: Missed cases where replicaBinlogController is needed during building
  • 1967: Join aggregation projections subject to dropped refs
    Aggregation intermediate projections need to output all inputs for aliases in the target projection. The way we were tracking column source column references did not take into account that two join tables can have the same column name. Now we include the table name for alias dependency tracking.
    close dolthub/dolt#6578
  • 1964: Skip subquery pushdown when no subquery
    This will have a small perf impact.
  • 1962: Name res cleanup, docs and inline rules
    Move bespoke rules into planbuilder. Functions are only ran for their necessary nodes. Also adds copyright headers.
    Requires some Dolt changes: dolthub/dolt#6567
  • 1945: server: Add a warning message at startup if the server is run without secure_file_priv being set to a non-empty value.
  • 1934: New Merge Join planner
    This should fix dolthub/dolt#6020 once finished, and then some.
    The killer new feature in this new join planner is "Multi-Column Merge Joins", that is, merges where the comparison used for merges incorporates multiple filter conditions. This allows us to, in some cases, choose a much more selective index for merge joins. This improves both memory usage and performance because there will be fewer cases where the join iterator needs to keep multiple secondary rows in memory and cross-join them with multiple primary rows.
  • 1888: Remove filters from LookupJoins when they're provably not required.
    During joins, we still evaluate every filter for every candidate row. But based on the join implementation, some of those filters must necessarily be true, so we don't need to evaluate them.
    In most joins the performance cost of this isn't that bad, but this problem is most noticeable in LookupJoins where a point lookup is constructed from several columns, at which point the filter evaluation can dominate the runtime.
    One potential drawback of this change is that this might make reading query plans more obtuse, because some filters are no longer explicitly listed in the plan, but are a consequence of Join node's children. One option to prevent this would be to add metadata to joins listing filters that they assume to hold, or marking filters in the execution plan as "skipped." However, I think that described execution plans should match the actual behavior of the execution as closely as possible, so this may not actually be a concern.
    Some notes about the individual commits in this PR:
    3b54dbe: This commit fixes an existing bug which caused lookups to return extra rows if the filter was a null-safe-equals check, and the key is non-null. This bug previously caused no issues because these extra rows would not match the filter and would get dropped. Now that we're skipping filters we know to be extraneous, this bug would manifest if not fixed.
    48e8777: This commit changes the costing function of lookup joins. In the event that the lookup expressions can't be proved to uniquely identify a row, we attempt to estimate what percentage of the rows will be returned by the lookup. It's a very rough estimate, and serves mostly to guide which index to use if the table has multiple indexes: the more filters the index is able to make redundant, the better. The previous costing implementation had a special case for indexes where every column in the index was filtered on, and assigned that index a score somewhere between a 3-key lookup and a 4-key lookup. I imagine that the thought process when this was implemented was a lookup that used every column in an index would tend to result in very few rows compared to a lookup using an index prefix. Of course this is data dependent and I'm not convinced is generally true.
    2b384a3: This commit updates tests that check for specific query plans. Most of these updates are just eliminating filters. A small number of updates are a result of changing the lookup costing, resulting in different join types. I haven't looked at these closely yet. This may be okay, it may not be.
  • 1786: support event execution
    This PR adds event execution logic implementing EventScheduler interface in the engine.
    Notes:
    • Event Scheduler status cannot be updated at run-time.
    • Event DISABLE ON SLAVE status is not supported. It will be set to DISABLE by default.
      Corresponding Dolt changes: dolthub/dolt#6108

vitess

  • 313: supports FROM s.account, in which account is non-reserved keyword
  • 312: Allow InjectedExpr to handle name resolution
    Adds the capability for additional expressions that need to be resolved to InjectedExpr.
    Related PRs:
  • 311: add syntax support for SELECT ... INTO OUTFILE ... options
    We had parser support for SELECT ... INTO OUTFILE, but were lacking parser support for the variety of OUTFILE options.
    This PR adds support for all the options and certain combinations of them.
    However, we are missing support for duplicate options and certain orderings of these options; tests for these are added as error tests with a TODO.
    MySQL Docs:
    https://dev.mysql.com/doc/refman/8.0/en/select-into.html
    MariaDB Docs (this has the actual grammar):
    https://mariadb.com/kb/en/select-into-outfile/
  • 310: Limit/offset proc param
    re: dolthub/dolt#7458
  • 309: Added a way for Doltgres to inject expressions
    This primarily adds a new AST node that DoltgreSQL takes advantage of. All currently-existing nodes are specifically designed for MySQL transformations. In many cases, Doltgres will have different transformations that need to take place, and this provides a way for Doltgres to implement those transformations without having to modify other packages whenever a new transformation is added or edited.
    Related PRs:
  • 308: Change Handler.ComResetConnection to return an error
    Related to dolthub/dolt#3921
  • 307: allow hexnums for stats_auto_recalc table option
    This fixes 1 sqllogictest
  • 306: allow function keywords
    We have some extra keywords (they aren't listed as reserved or non-reserved in MySQL), to handle the special syntax.
    These keywords are allowed as table and column names without backquotes.
  • 305: Allow UNIQUE and PRIMARY KEY to be specified on the same column
    Related to: dolthub/dolt#7395
  • 303: Add parser support for the ALTER USER statement
    Adds support for parsing ALTER USER statements (MySQL reference). First step towards allowing users to change passwords.
    Related to: dolthub/dolt#7348
  • 302: Detect if a system variable scope was explicitly specified or not
    Changes the VarScope function so that it returns whether a scope was explicitly specified, or if one has been inferred.
    This is needed because some tooling (e.g. the official MySQL .NET Connector library) will query system variables (e.g. SELECT @@max_allowed_packet) and then will look up the returned value in the result set using the expected column name (@@max_allowed_packet). Currently, from the way we parse the system variable, this was always returned with the scope present, but to match MySQL's behavior, the column name needs to match the requested name.
    Related GMS PR: #2266
  • 301: support special syntax for position()
    This PR adds support for POSITION(<expr1> in <expr2>), which is a synoynm for LOCATE(<expr1>, <expr2>);
    MySQL Docs: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_position
  • 300: custom char() syntax handling
    In order to support the charset argument for CHAR(... using <charset>), there has to be special syntax support.
    Companion PR: #2255
  • 299: Define an ExtendedHandler interface to handle postgres specific interactions
    These are pretty narrowly tailored for postgres prepared statements and long term should probably be placed in either doltgresql or another package. Defining them here is just expedient for experimenting with the new interfaces while they change.
  • 298: parse empty begin end block in trigger
    14 SQL correctness tests are failing because we throw syntax errors on queries like this:
    CREATE TRIGGER t1r1 AFTER UPDATE ON t1 FOR EACH ROW BEGIN END;
  • 297: Fixing the version keyword to not require identifier quotes
    The version keyword still required identifier quoting in some usages, such as SELECT * FROM base.version;. See dolthub/dolt#7237 for more details.
    This change moves the version keyword into the main list of non-reserved keywords. There was one conflict from use of the version keyword in the function_call_keyword rule, but it turns out that use of version there is not required. We have an existing test for using the version() function, so I didn't add a new one.
  • 296: refactoring default and on update expressions
    This PR changes the grammar to more closely match MySQL's behavior, specifically around the NOW() function and its synonyms.
    Changes:
    • Throw syntax errors for ON UPDATE expressions against functinos that aren't NOW() or a synonym.
    • Only accept integer for argument to NOW() and synonyms; syntax error for anything else
    • Simplified grammar rules
    • Removed CurTimeFuncExpr from AST in favor of plain FuncExpr
      Companion PR: #2218
  • 295: Allow inline column check constraint definitions to appear in any order
    Previously, an inline column check constraint could only appear as the very last option for a column definition. This change allows it to appear in other positions in the column definition. For example, this query now works:
    create table t123 (c1 varchar(5) check (c1 in ('v1', 'v2')) NOT NULL);
    Resolves: dolthub/dolt#7195
  • 294: Allow SqlType to parse "CHARACTER" and add tests for every other possible type that could be passed in.
    SqlType is a function in Vitess for normalizing every type name. It was missing an entry for the "CHARACTER" keyword.
    I added tests that should verify every single valid type keyword in the grammar, so this shouldn't happen again.
  • 293: Add additional types to sqlparser.SQLType()
    This function is used when the parser needs to map type names to underlying types in order to judge the validity of certain queries. Some types are aliases for others, (like REAL is an alias for float64) but they weren't included in SQLType(), so certain expressions that used these types could panic.
  • 292: Parse COLLATE BINARY on individual columns.
    We should be able to parse statements like:
    create table test (pk varchar(255) collate binary)
    This particular example will eventually get rewritten as create table test (pk varbinary(255)), but that doesn't happen during parsing, so the added vitess tests still expect varchar.
  • 291: round trip SHOW PLUGINS
  • 289: add partial support for ':=' assignment operator
    This PR adds support for set expressions and assignment expressions.
    Does not include support for select expressions, as it's deprecated on MySQL
  • 288: allow unquoted non reserved keywords for drop and rename column ddl
    fixes dolthub/dolt#6950
  • 287: support int1,int2,int3,int4,int8 aliases
    fixes dolthub/dolt#6900
  • 286: Logging improvements
    Logging a couple more error handling spots. Changing the Handler interface to pass mysql.PrepareData into the ComPrepare function so that GMS can log the params count at a debug level.
    Related GMS PR: #2098
  • 285: Add create view with columns
    ex:
    create view v_today(today) as select CURRENT_DATE()
  • 284: Include the With clause in walked subtrees for Select statements
    We weren't walking the With clause for select statements, which caused us to not find any bind vars in use there.
    Related to: dolthub/dolt#6852
  • 283: allow query options to appear in any order any number of times
    Allow statements like this to parse:
    select distinct sql_calc_found_rows distinct * from t;
    Fixes dolthub/dolt#6829
    Companion PR: #2088
  • 282: Server handling parsed statements
    See #2086
  • 281: Made generated column expressions parse to ParenExpr to match Default
  • 280: Update the default server version to 8.0.33
    The connection handshake was advertising a server version of 5.7.9-Vitess and some clients were using that info and trying and speak MySQL-5.7 to Dolt (example issue)
    This change updates the default advertised server version to 8.0.33-Dolt.
    Dolt CI tests are running at: dolthub/dolt#6798
  • 279: Add ANALYZE HISTOGRAM support
  • 278: fix unsigned flag for COM_STMT_EXECUTE when new_params_bind_flag is set
    In the previous implementation, we assumed that the way the MySQL Protocol specifies Column Definitions is the same as how it specifies parameter types for COM_STMT_EXECUTE. The difference lies specifically in the flags that come after the field type.
    When reading/writing a field type (for a Column Definition), MySQL expects/writes a 1 byte wide enum_field_type followed by a 2 byte wide Column Definition Flag.
    However, when reading a COM_STMT_EXECUTE payload (that specifies parameters through new_params_bind_flag), MySQL indicates parameter_types with the same 1 byte wide enum_field_type followed by a 1 byte wide flag that indicates signedness.
    So basically, read 0x80 for COM_STMT_EXECUTE parameters, but read/write 0x20 for field_types/column definitions.
    I'm assuming MySQL does it this way because the majority of the Column Definition Flags are nonsensical/meaningless when paired up with parameters to prepared statements. Regardless, this was a subtle bug, and we should have tests for parsing COM_STMT_EXECUTE with new_params_bind_flag.
    Fixes dolthub/dolt#6728
  • 277: Allow parsing of CREATE TABLE t AS (...) UNION (...)
    This allows parsing of CREATE TABLE AS statements when the expression being used to create the table is a set operation like UNION, INTERSECT, or EXCEPT.
    The "AS" keyword is typically optional. But this change only allows set ops to be used with CREATE ... AS when the AS is explicit. This is to avoid an ambiguity in the current grammar when attempting to parse CREATE TABLE t (, where what follows could be a set op or a table definition. Fully matching MySQL's spec here would require rewriting our grammar to avoid this ambiguity, which is outside the scope of the PR. However, this PR makes us strictly more correct than we were before.
  • 276: Allow parsing of SECONDARY_ENGINE = NULL
    This is a simple change to allow parsing a NULL value for the SECONDARY_ENGINE attribute for CREATE TABLE and ALTER TABLE statements.
  • 275: Adding parser support for VISIBLE and INVISIBLE modifiers for indexes
    Fixes: dolthub/dolt#6690
  • 274: Allow CREATE TABLE and ALTER TABLE to accept hexnum and float values when integers are expected.
    Fixes dolthub/dolt#6644 and dolthub/dolt#6648
    MySQL is permissive in what it expects for DDL statement parameters: in many places where ints are expected, the parser will also accept a hex number (and convert) or a float number (and truncate.)
    None of these values are currently used by GMS, so we don't need to add any additional processing logic. But the parser needs to accept them when they appear.
  • 273: Parse for update skip locked as a no-op
    While testing the FusionAuth application, there were regular panics which result from a query which does a select for update skip locked. This change allows us to parse these queries, but they don't impact query execution because row level locks are not supported by Dolt currently.
    https://dev.mysql.com/blog-archive/mysql-8-0-1-using-skip-locked-and-nowait-to-handle-hot-rows/
  • 272: Set character set IDs to current MySQL version
    The character set values were set for MySQL 5.0, so they've been updated to the correct values for 8.0.
  • 271: parsing intersect and except
    This PR adds support for parsing the keywords INTERSECT and EXCEPT.
    These work similar to UNION and work with DISTINCT and ALL keywords.
    Additionally, there are new precedence tests; INTERSECT has a higher precedence than UNION and EXCEPT.
    The rest are parsed left to right.
    syntax for dolthub/dolt#6643
  • 270: support alternate ways to say VARCHAR
    This PR adds more variations to VARCHAR, specifically including the keyword VARYING.
    Parses correctly, but needs GMS changes to actually work as a whole
    Companion PR: #2000
    Fixes dolthub/dolt#6650
  • 269: Fix UNION table option and parse LAST keyword
    This PR fixes support for parsing the UNION table option and parsing LAST.
    Additionally, it adds support for these table options in alter table statements.
    Fixes dolthub/dolt#6653
  • 268: tests and adding keywords for alter too
    This PR add support for parsing the keyword TABLE_CHECKSUM as an alias for CHECKSUM as a valid table option.
    Additionally, CHECKSUM and TABLE_CHECKSUM are added as valid keywords in alter statements as well.
    Fixes dolthub/dolt#6645
  • 267: Parse TABLE <table_name> statements
    This PR adds parser support for TABLE <table_name> syntax.
    It just converts this to be a SELECT * FROM <table_name> statement.
    Fixes dolthub/dolt#6642
  • 266: parsing secondary_engine
    This PR parses the syntax for SECONDARY_ENGINE, but does not store it with the table.
    For the most part we seem to parse, but not store any table options.
    MySQL docs aren't up-to-date yet; SECONDARY_ENGINE is not listed as a valid table option, but it works.
    https://dev.mysql.com/doc/refman/8.0/en/create-table.html
    These are also apparently valid options for create index statements, so I just added the syntax there too
    https://dev.mysql.com/doc/refman/8.0/en/create-index.html
    Fix for: dolthub/dolt#6632
  • 265: upgraded YAML library

Closed Issues

  • 2349: Foreign key constraints break auto-incrementing ids in memory mode
  • 2344: Server error message compatibility when doing INSERT
  • 2314: Can't turn on multiStatements when using memory driver
  • 2296: add create table comment on table clause to table spec
  • 2225: go mod tidy error
  • 2184: GolangCI does not work
  • 2159: VSCode debug Build Error
  • 1782: Error 1105: -128 out of range for BIGINT UNSIGNED