-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for table partitioning #1035
Comments
PostgreSQL docs: https://www.postgresql.org/docs/current/ddl-partitioning.html |
@mrjrt this is definitely a good feature to have in the provider. If you want to give this a stab I can provide some guidance and pointers. |
@roji - We're looking to use Partitions for efficient deletes using EF Core. Do you have any guidance on either
|
@jonmill using partitions today could be done by editing the generated migrations files for table creation, and replacing the appropriate lines with raw SQL that would create the partitions as desired (according to https://www.postgresql.org/docs/current/ddl-partitioning.html). I'm not familiar enough with PostgreSQL partitions to foresee particular problems or details, but in principle partitions should only ever affect table (and index) creation, and not querying or inserting. That means it's purely a migrations thing. Contributing to this would mean designing model building extensions on EntityType, which would allow to model partitioning as supported by PostgreSQL. Users would be able to call C# extension methods when configuring their model, to define partitions for each entity etc. The Npgsql provider already exposes quite a few such PostgreSQL-specific extensions, although partitioning does seem to be slightly richer than the average case. |
@roji - Hmm manually editing the migrations file seems sketchy. I did some research and it doesn't look like you can alter a table to generate partitioning after it's created, so it looks like updating the EF extensions might be the way to go. Any guidance on where to start looking in the source for ways to begin adding support for partitions? |
In general, there's no problem at all with editing migration files and adding raw SQL (or replacing existing directives) - the EF Core team explicitly encourages this where it is necessary. A good example is database triggers; EF Core doesn't provide any way to model triggers and produce migrations for them, and there wouldn't be much value in it. So if you need them you can always edit migrations and add them there. Adding support for partitions isn't going to be trivial. But if you want to go into it, take a look at the existing entity builder extensions, which are PostgreSQL model configuration extensions. These add metadata to entities (in the form of annotations), which are later picked up by the migrations SQL generator when it generates the CREATE TABLE DDL, for example. As I wrote above, the description of how partitions are specified/configured isn't trivial, so we'd have to think about how to represent everything in metadata etc. |
Interesting, I didn't know that...thanks for clarifying that. So if we were to go this route since the addition of partitions to the code base sounds like it would be non-trivial and time-consuming, especially for someone not intimately familiar with the code base, would this involve completely replacing the CREATE TABLE call for the partitioned table? Or is there a way we can update the existing CREATE TABLE call to add our partitioning scheme? |
@roji - I found a potentially better way that's simpler; I use table inheritance to create the master table using EF Core, then create inherited tables as necessary using raw sql with a Rule to redirect inserts into the inherited table. The one problem I've found is inserting into the master table; EF Core throws a concurrency error since rows weren't actually inserted into the master table. I'm guessing this is due to how postgres rules change the actual query...for instance
Inserting data into Master causes a concurrency exception since EF tries to get values back for tracking but the insert rule necessary for inherited tables doesn't return one. Do you know of any way around this outside of writing manual SQL to insert? |
Hmm... If you examine the INSERTs sent by EF Core, you should see a RETURNING clause which instructs PostgreSQL to send back the generated columns in the new row. As long as you replicate the same clause in your INSTEAD rule, things should be transparent to EF Core and everything should work. Hope I've understood your issue, note that there's an issue open on having EF Core support PostgreSQL table inheritance, although it's unlikely we'll get around to doing anything soon. |
I don't think a |
@jonmill no, EF Core really does require those keys in order to work correctly (in at least some scenarios). For example, if you add a new entity and then update it, EF Core needs to have the (generated) key from the earlier INSERT statement to properly generate the later UPDATE statement. Note this paragraph in the PG docs which seems to indicate that RETURNING is supported in INSERT rules, but apparently on unconditional ones (e.g. no WHERE clause)? I'm guessing this is the problem you're running into?
|
@roji - gotcha; yes, that is the issue I'm running into. I got around it by just doing a manual INSERT statement instead of using EF. The items I'm inserting won't be tracked or (immediately) used after insertion, so the RETURNING clause is extra work that's unnecessary for me. It'd be great if this was possible without writing a manual INSERT clause, but I couldn't find a way |
Unfortunately EF Core doesn't know that you don't need tracking (and so don't need the RETURNING) - what you're asking for is dotnet/efcore#9118. Right now I don't think there's a solution for what you're trying to do. |
Gotcha; thanks for all your help! |
Hi all!
Hope it will help! |
Hi @roji, we got all the tables migration happening using EF Core. For enabling the partition by range, I'm not sure how I can change the Migration File? ( to achieve something like below ) CREATE TABLE people ( reg child partition table, I can add by migrationBuilder.Sql(sql);. but not sure how I can create the master table with the partition definition using the EF Core Migration. |
@mathavanmani EF Core (and the PostgreSQL provider) doesn't currently support table partitioning. However, it's perfectly fine to generate migrations, and then edit the generated files and introduce raw SQL. So you can use EF Core migrations to generate a plain vanilla CreateTable operation, then edit the migration and replace that with a raw SQL operation that contains your |
"returning xmin" does not work in partitioned tables in Postgres 12 (works ok in 11), so we can't use npgsql EF Core with partitioned tables... |
Hello, @roji I used an interceptor to rewrite the SQL, but the SaveChanges data was successfully saved to the table, but an exception in the npgsql driver was triggered, and an error was still reported at the code level. It seems that npgsql defaults to a bigint after receiving txid_current() as xmin, and an exception occurs when converting to the rowversion uint type of the ef model. |
Yes, we did such interceptor as I remember :)) Now we're on spring boot projects - other technology other problems :)) |
@KrzysztofBranicki you make a good point about libraries vs. applications; however, it simply isn't realistic to expect EF to cover the entire range of DDL possibilities across all supported databases (after all, the point you make has nothing to do with PostgreSQL table partitioning specifically). There's a huge space of options there, and there are bound to be users out there who want each one of them; modeling them in EF is non-trivial work and requires careful planning, and generally the return on that effort isn't great, given that users can do whatever they want by editing their migrations. Having said that, there definitely does seem to be lots of interest in table partitioning specifically. I can't commit on this getting done for 9.0 - but give the number of votes this is definitely one of the highest-value features, and I promise to do my best. |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
@pantonis The primary key can use the Snowflake ID algorithm to generate a long type ID. The Snowflake ID algorithm has a built-in concept of time. |
https://dba.stackexchange.com/questions/123145/how-to-view-tuples-changed-in-a-postgresql-transaction/123183#123183 /// <see>https://www.postgresql.org/message-id/flat/141051591267657%40mail.yandex.ru</see>
/// <see>https://dba.stackexchange.com/questions/123145/how-to-view-tuples-changed-in-a-postgresql-transaction/123183#123183</see>
/// <see>https://stackoverflow.com/questions/49214219/what-is-the-meaning-of-epoch-in-txid-current-in-postgresql</see>
/// <see>/~https://github.com/npgsql/efcore.pg/issues/1035#issuecomment-2118584744</see>
protected class UseCurrentXactIdAsConcurrencyTokenCommandInterceptor : DbCommandInterceptor
{
public static UseCurrentXactIdAsConcurrencyTokenCommandInterceptor Instance => new();
public override InterceptionResult<DbDataReader> ReaderExecuting(
DbCommand command,
CommandEventData eventData,
InterceptionResult<DbDataReader> result)
{
ManipulateCommand(command);
return result;
}
public override ValueTask<InterceptionResult<DbDataReader>> ReaderExecutingAsync(
DbCommand command,
CommandEventData eventData,
InterceptionResult<DbDataReader> result,
CancellationToken cancellationToken = default)
{
ManipulateCommand(command);
return new(result);
}
private static void ManipulateCommand(DbCommand command) =>
command.CommandText = command.CommandText.Replace(
"RETURNING xmin",
"RETURNING pg_current_xact_id()::xid");
} protected override void OnConfiguring(DbContextOptionsBuilder options)
{
+ options.AddInterceptors(UseCurrentXactIdAsConcurrencyTokenCommandInterceptor.Instance) Also it's recommend to disable |
… `cannot retrieve a system column in this context` with partitioned table: npgsql/efcore.pg#1035 (comment) @ TbmDbContext.cs $ mv Db/Entit{ies,y}WithImageId.cs @ c#/shared
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I would also like to declare my interest in supporting partitioned takes. Great work on the library. |
Did you achieve it? Can you tell me how to do it? |
Not yet |
its possible to replace command in migration engine this way
the problem is that the string "PARTITION BY .... (YourColumn);" must be in the same command as CREATE TABLE (is it correct?). The solution is not beautiful, but it is there. We write our own NpgsqlMigrationsSqlGenerator, intercept the CreateTableOperation command and modify it. This solution is the best i could come up with, please forgive me if i did something against the rules. i'm just starting out with Ent Framework Core 5 days ago. But why cant create partition table when i have primary key... really sad :( |
Not sure if you broke the rule but that interesting indeed.
That may simply be because you're not including your partition key in your primary key and this is mandatory. See : https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS |
@otarnus-dennemeyer We all know that postgres has a limitation of being part of the key, but i dont fully understand why. I understand that being the primary key allows during insert to quickly check if the key is not already taken, because it does not have to search all subtables. However, this current limitation can perhaps be solved by using an additional common index (unless i am missing something and it does not solve the problem). During insertion, only one common index would be searched. Maybe in the postgres implementation it is not easy to implement, I do not understand that. |
I'm a member of a project that is going to be using code-first EFC with Postgres and need to utilise table partitioning. It would be great if the data connector supported this natively so we don't have to resort to hacking SQL into the migrations manually or switching to schema-first.
I've taken a quick gander at your codebase and have some ideas, but I suspect quite strongly I'll not get anywhere so thought I'd ask politely here as well as with the improvements in PG12 (i.e. foreign key support) more people may decide to utilise partitioning.
Thanks!
The text was updated successfully, but these errors were encountered: