Releases: redpanda-data/redpanda

v24.3.3

09 Jan 23:38
b731170

Bug Fixes

  • Fixes a bug in Redpanda's Iceberg manifest list Avro definition that previously resulted in an end-of-file (EOF) error when reading manifest list Avro files written by other engines. This could previously crash Redpanda or block Redpanda from appending Iceberg data, and could also prevent certain query engines from successfully reading Iceberg data written by Redpanda. by @andrwng in #24650
  • Fixes a bug which may lead to archival_metadata_stm inconsistencies when reconfiguring clusters with recovered compacted topics. by @mmaslankaprv in #24678
  • #24684 Fixes an issue that blocked the compaction of consumer offsets with group transactions. by @bharathv in #24688
  • Fixes a rare bug leading to offset translation inconsistency in recovered topics. by @mmaslankaprv in #24628

Improvements

  • Added metrics for pandaproxy resource usage. by @IoannisRP in #24603
  • Adds logging to mention data removed by compaction. by @andrwng in #24736
  • Move failed authorization log statements from the kafka logger to a new kafka/authz logger, allowing for fine grained control over log statements for failed authorization. by @rockwotj in #24718
  • rpk now supports well-known protobuf types when encoding/decoding records using Schema Registry. by @r-vasquez in #24699
  • PR #24591 [v24.3.x] pandaproxy: add missing internal metrics by @IoannisRP
  • PR #24608 [v24.3.x] storage: add tombstones_removed metric to probe by @WillemKauf
  • PR #24619 [v24.3.x] Offset translator consistency validation by @mmaslankaprv
  • PR #24627 [v24.3.x] rpk remote debug bundle: job-id help text change by @r-vasquez
  • PR #24705 [v24.3.x] kafka/client: replace std::vector with chunked vector by @IoannisRP
  • PR #24729 [v24.3.x] rpk bundle: Fix race condition in SASL credential redaction by @r-vasquez

Full Changelog: v24.3.2...v24.3.3

v24.2.15

09 Jan 19:06
9424d94

Bug Fixes

  • Fixes a bug where failing to audit an authentication event could lead to a broker crash. by @pgellert in #24739
  • Fixes a bug which may lead to archival_metadata_stm inconsistencies when reconfiguring clusters with recovered compacted topics. by @mmaslankaprv in #24680
  • #24685 Fixes an issue that blocked the compaction of consumer offsets with group transactions. by @bharathv in #24689
  • Fixes a rare bug leading to offset translation inconsistency in recovered topics. by @mmaslankaprv in #24629

Full Changelog: v24.2.14...v24.2.15

v24.2.14

20 Dec 17:26
cd11afe

Bug Fixes

  • Fixes a bug in which a segment being rolled and closed could race, leading to a triggered vassert. by @WillemKauf in #24559

Improvements

  • Added metrics for pandaproxy resource usage. by @IoannisRP in #24604
  • Show leader id in /v1/cluster/partitions response. by @ztlpn in #24584

Full Changelog: v24.2.13...v24.2.14

v24.3.2

18 Dec 21:36
32a9dce

Features

  • Improve the user messages when the topic_partitions_reserve_shard0 cluster config is used and a user tries to create a topic with more partitions than the core-based partition limit. by @pgellert in #24461

Bug Fixes

  • Ensures the redpanda_cloud_storage_cloud_log_size metric is consistent across all replicas. It was previously updated only rarely, and only from the leader replica, which led to inconsistent or stale values. by @nvartolomei in #24364
  • Fixed a bug in which sliding window compaction may become stuck on failing to build an index map for a single segment. by @WillemKauf in #24424
  • Fixes a bug in which a segment being rolled and closed could race, leading to a triggered vassert. by @WillemKauf in #24560
  • Fixes a bug in which segments which may have tombstones in them were not considered eligible for self-compaction. by @WillemKauf in #24500
  • Fixes a bug that could prevent topic recovery on ABS object storage when there are objects in a bucket from multiple clusters (e.g. following a whole cluster restore). by @andrwng in #24455
  • Fixes a bug where rpk wasn't parsing --help when used alongside --redpanda-id in rpk cloud <provider> byoc apply by @r-vasquez in #24396
  • Fixes a bug where serializing manifests for Iceberg topics with decimal fields could cause Redpanda to crash or upload invalid manifests by @oleiman in #24467
  • Fixes a crash resulting from incorrect cleanup of log readers used for iceberg translation. by @bharathv in #24576
  • Fixes a race that could prevent Iceberg translation from happening following a leadership change. by @andrwng in #24562
  • Fixes accounting of the Iceberg commit lag metric, which could remain erroneously high in some cases even though translation is fully caught up. Additionally, the change ensures that only partition leaders emit lag metrics, while followers emit 0 lag. by @bharathv in #24575
  • When a separate disk is used for the cloud storage cache, Redpanda previously rejected writes if that disk (the cache disk) was full (in degraded state). This was incorrect, since the cache disk isn't in the way of writes. Redpanda now rejects writes only if the data disk is full (in degraded state). by @nvartolomei in #24486
  • #24428 Schema Registry: fixes a bug in the Avro compatibility check reader_field_missing_default_value where it was too lenient for missing default values of nullable types. by @pgellert in #24430
  • #24587 Redpanda will now permit topics to be created with redpanda.remote.[read|write] set to true when a license is expired or missing provided that the cluster config cloud_storage_enabled is set to false. by @michael-redpanda in #24588
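
The cache-disk fix above (#24486) can be pictured with a minimal sketch. This is not Redpanda's implementation: DiskState and both function names are hypothetical, and the "old" rule is an assumption inferred from the note's wording.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskState:
    # Hypothetical model: True when the disk is full (in degraded state).
    degraded: bool


def should_reject_writes_old(data_disk: DiskState,
                             cache_disk: Optional[DiskState]) -> bool:
    # Assumed previous behavior: a full cache disk also blocked writes.
    cache_full = cache_disk is not None and cache_disk.degraded
    return data_disk.degraded or cache_full


def should_reject_writes_new(data_disk: DiskState,
                             cache_disk: Optional[DiskState]) -> bool:
    # Behavior described in the note: only the data disk matters.
    return data_disk.degraded


data_ok = DiskState(degraded=False)
cache_full = DiskState(degraded=True)
print(should_reject_writes_old(data_ok, cache_full))  # True
print(should_reject_writes_new(data_ok, cache_full))  # False
```

With a healthy data disk and a full cache disk, the old rule rejects writes while the new rule accepts them.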

Improvements

  • Adds additional debug log messages in the datalake coordinator regarding files to be committed to Iceberg. by @andrwng in #24563
  • Beta version of Iceberg support was incorrectly classified as "enterprise only". by @oleiman in #24443
  • Leader balancer: don't treat each core as independent and balance total number of leaders on each node as well. by @ztlpn in #24440
  • Show leader id in /v1/cluster/partitions response. by @ztlpn in #24585
  • #24539 Disable datalake services in recovery mode by @ztlpn in #24549
  • rpk topic describe now supports the --format flag to display the output in either JSON or YAML. by @r-vasquez in #24438

Full Changelog: v24.3.1...v24.3.2

v24.2.13

11 Dec 13:01
ee0c765

Features

  • Improve the user messages when the topic_partitions_reserve_shard0 cluster config is used and a user tries to create a topic with more partitions than the core-based partition limit. by @pgellert in #24462

Bug Fixes

  • Ensures the redpanda_cloud_storage_cloud_log_size metric is consistent across all replicas. It was previously updated only rarely, and only from the leader replica, which led to inconsistent or stale values. by @nvartolomei in #24365
  • Fixes a bug that could prevent topic recovery on ABS object storage when there are objects in a bucket from multiple clusters (e.g. following a whole cluster restore). by @andrwng in #24454
  • Fixes a bug where rpk wasn't parsing --help when used alongside --redpanda-id in rpk cloud <provider> byoc apply by @r-vasquez in #24397
  • When a separate disk is used for the cloud storage cache, Redpanda previously rejected writes if that disk (the cache disk) was full (in degraded state). This was incorrect, since the cache disk isn't in the way of writes. Redpanda now rejects writes only if the data disk is full (in degraded state). by @nvartolomei in #24484
  • #24431 Schema Registry: fixes a bug in the Avro compatibility check reader_field_missing_default_value where it was too lenient for missing default values of nullable types. by @pgellert in #24432
  • PR #24200 [v24.2.x] cst/cache: fix use-after-move caused by calling get_exception twice by @nvartolomei
  • PR #24329 [v24.2.x] Fixed race condition between appends and prefix truncation by @mmaslankaprv
  • PR #24335 rm_stm: remove always true assert on transaction_ga feature by @bharathv
  • PR #24349 [v24.2.x] c/balancer_planner: check if topic exists in node count map by @mmaslankaprv
  • PR #24372 [v24.2.x] c/controller_backend: allow shutdown_partition to fail on app shutdown by @bashtanov
  • PR #24459 [v24.2.x] raft/c: fix an indefinite hang in transfer leadership by @bharathv

Full Changelog: v24.2.12...v24.2.13

v24.3.1

03 Dec 15:00
afe1a3f

Features

  • Added support for Iceberg Topics (various improvements below)
  • New REST API for mounting/unmounting topics by @mmaslankaprv in #23167
  • Adds rpk cluster storage topic mount, unmount, list-mount, status-mount, cancel-mount by @gene-redpanda in #23575
  • Add leadership pinning: ability to set preferred racks for topic partition leaders. To configure, set redpanda.leaders.preference topic config property or default_leaders_preference cluster config property. by @ztlpn in #23691
  • Enable node_local_core_assignment feature by default by @ztlpn in #23453
  • Adds Schema Registry support for the JavaScript Data Transforms SDK by @oleiman in #21491
  • Adds list-mountable to allow listing mountable topics by @gene-redpanda in #23924
  • Adds the topic property delete.retention.ms, as well as the cluster property tombstone_retention_ms. Configuring these allows for the removal of tombstone records in compacted topics with tiered storage disabled in Redpanda. by @WillemKauf in #23662
  • Schema Registry: Support normalize=true by @BenPope in #22519
  • Schema Registry: added support for the "verbose" query parameter on the schema compatibility checker endpoint by @pgellert in #22877
  • Schema Registry: verbose compatibility error reporting is now supported for JSON as well by @pgellert in #23208
  • #17984 Adds a new broker configuration transaction_max_timeout_ms. The configuration controls the maximum allowed user-set timeout for transactions. If a client-requested transaction timeout exceeds this configuration, the broker will return an error during transactional producer initialization. This guardrail prevents hanging transactions from blocking consumer progress. The default value is 15 minutes. by @bharathv in #21504
  • rpk: Add rpk registry mode to manage the schema registry mode. by @r-vasquez in #22675
  • rpk: supports triggering on-demand partition balancer by @daisukebe in #22855
  • Added support for using PKCS#12 files for TLS services by @michael-redpanda in #21313
  • Adds admin API endpoint for enterprise feature info GET /v1/features/enterprise by @oleiman in #23314
  • A new metric (cluster_features_enterprise_license_expiry_sec) is added for easier monitoring of the enterprise license's expiry time. by @pgellert in #23367
  • After the cluster is first formed, a trial license is automatically loaded to provide an evaluation period of enterprise features. by @pgellert in #23893
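
The transaction_max_timeout_ms guardrail above (#21504) might be modeled roughly as follows. This is an illustrative sketch only: the function and exception names are hypothetical and are not broker or client APIs. Only the 15 minute default and the "exceeds" comparison come from the note.

```python
# Default stated in the release note: 15 minutes, expressed in milliseconds.
TRANSACTION_MAX_TIMEOUT_MS = 15 * 60 * 1000


class InvalidTransactionTimeout(Exception):
    """Hypothetical error standing in for the broker's rejection."""


def init_transactional_producer(requested_timeout_ms: int,
                                max_timeout_ms: int = TRANSACTION_MAX_TIMEOUT_MS) -> int:
    # Reject only when the requested timeout exceeds the configured maximum.
    if requested_timeout_ms > max_timeout_ms:
        raise InvalidTransactionTimeout(
            f"requested {requested_timeout_ms} ms exceeds max {max_timeout_ms} ms"
        )
    return requested_timeout_ms


print(init_transactional_producer(60_000))  # 60000
try:
    init_transactional_producer(20 * 60 * 1000)
except InvalidTransactionTimeout as err:
    print("rejected:", err)
```

A timeout equal to the maximum is accepted in this sketch, since the note says the error is returned only when the request exceeds the configured value.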

Improvements

  • --regex flag in rpk topic describe now supports internal topics. by @r-vasquez in #23487
  • A number of optimizations to local storage compaction. by @WillemKauf in #23380
  • Add an LRU caching layer to Rust transform SDK Schema Registry client by @oleiman in #19859
  • Add support for differentiating tombstone records from empty-string value records in rpk produce and rpk consume. by @WillemKauf in #23264
  • Added support for Metadata API v8 by @michael-redpanda in #22669
  • Added the vectorized_kafka_rpc_connections_rejected_rate_limit metric, which counts incoming Kafka connections rejected due to the connection rate limit (if set), analogously to the existing vectorized_kafka_rpc_connections_rejected metric, which counts connections rejected due to hitting the open connection limit. by @travisdowns in #22803
  • Adds a shard label to some consumer group metrics. by @ballard26 in #23339
  • Adds support for setting schema registry connection parameters in the rpk stanza of redpanda.yaml. by @andrewstucki in #24017
  • Adds the cloud_storage_backend::oracle value, and helps the s3_client properly configure for OCI storage. by @WillemKauf in #22902
  • Adds the ability to configure Node UUID and ID overrides at broker startup. by @oleiman in #22972
  • Allow rpk cluster self-test start to run, even in a cluster with mixed versions of redpanda (before and after cloudcheck addition in 24.2.x). by @WillemKauf in #21370
  • Allows DeleteRecords requests from Kafka clients or rpk topic trim-prefix to be called with truncation_offset <= start_offset without returning an error. The request is instead treated as a no-op. by @WillemKauf in #22905
  • Allows the self-test to be completely compatible with a mixed version cluster, in the case of a rolling upgrade. by @WillemKauf in #22831
  • Deprecate leader_balancer_mode cluster config property. by @ztlpn in #23780
  • Implements @redpanda-data/transform-sdk-sr.SchemaFormat for the WASM Transforms JS module by @oleiman in #23164
  • Improve handling of boolean property values during a CreateTopics request by making parsing case-insensitive. by @WillemKauf in #23682
  • Improve handling of boolean values during a CreateTopics request by no longer silently ignoring an invalid value, instead throwing a configuration error. by @WillemKauf in #23682
  • Improve handling of certain invalid topic configuration parameters that would lead to a timeout failure instead of a graceful error code during a CreateTopics request. by @WillemKauf in #23682
  • Improve property configuration descriptions. by @Deflaimun in #23347
  • Minimizes data loss in recovery scenarios by @mmaslankaprv in #24071
  • Reduce the memory overhead of many small segments. by @rockwotj in #22962
  • Return core assignments from health report in /v1/cluster/partitions admin API output. by @ztlpn in #22695
  • Schema Registry: 5 new compatibility checks are added for protobuf (ONEOF_FIELD_REMOVED, MULTIPLE_FIELDS_MOVED_TO_ONEOF, REQUIRED_FIELD_{ADDED,REMOVED}, FIELD_NAMED_TYPE_CHANGED, MESSAGE_REMOVED) by @pgellert in #22798
  • Schema Registry: Improve AVRO Normalization by @BenPope in #22519
  • Schema Registry: now reports more specific error messages for Avro and Protobuf schemas when they are incompatible with earlier schemas. by @pgellert in #22958
  • Set the default value of topic_partitions_reserve_shard0 to zero. This means that we no longer weight shard 0 as if it has 2 more partitions than it actually has, leading to more even partition distribution in cases where the total number of partitions is close to the vCPU count. by @travisdowns in #22841
  • The command line is now printed to the log at startup by the Redpanda process. by @travisdowns in #22826
  • Upgrade data transforms tinygo compiler to version 0.34.0 by @rockwotj in #23969
  • #17682 Schema Registry: Remove spurious log entry: No syntax specified for the proto file by @BenPope in #22633
  • #21536 rpk topic describe-storage can be used now with internal topics. by @r-vasquez in #22338
  • #22333 rpk debug bundle: include the result of uname -a by @JFlath in #22334
  • #22666 Allows users to query the value of a cluster property with rpk cluster config get using either the original property name or any of its aliases. Previously, rpk cluster config get with a property's aliased name returned a Property {} not found result. by @WillemKauf in #22674
  • [#23038](/~https://github.com/redpanda-dat...
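
The effect of the topic_partitions_reserve_shard0 default change listed above (#22841) can be illustrated with a simplified model. It assumes a greedy "least weighted shard" placement, which is not Redpanda's actual allocator. The function name is hypothetical, and only the previous weight of 2 and the new default of 0 come from the note.

```python
def place_partitions(num_partitions: int, num_shards: int,
                     reserve_shard0: int) -> list[int]:
    """Greedily place partitions on the shard with the lowest weighted count.

    Shard 0 is weighted as if it already held `reserve_shard0` partitions.
    Returns the real partition count per shard.
    """
    counts = [0] * num_shards

    def weight(shard: int) -> int:
        return counts[shard] + (reserve_shard0 if shard == 0 else 0)

    for _ in range(num_partitions):
        # min() returns the first shard among ties.
        target = min(range(num_shards), key=weight)
        counts[target] += 1
    return counts


# 4 partitions on 4 shards (partition count close to the vCPU count).
print(place_partitions(4, 4, reserve_shard0=2))  # [0, 2, 1, 1]
print(place_partitions(4, 4, reserve_shard0=0))  # [1, 1, 1, 1]
```

In this toy model, a reserve of 2 leaves shard 0 empty and doubles up another shard, while a reserve of 0 spreads the partitions evenly.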

v24.2.12

27 Nov 01:11
e9dc86e

Bug Fixes

  • Fixed an issue where creating a topic with a huge number of partitions could lead to a crash. by @IoannisRP in #24232

Improvements

  • Schema Registry: Add some metrics for resource usage taken by in-memory schemas by @BenPope in #24270

Full Changelog: v24.2.11...v24.2.12

v24.2.11

21 Nov 21:18
29b8a8e

Bug Fixes

  • Construct audit metrics probe during service initialization to prevent null pointer access. by @michael-redpanda in #24127
  • Fixed an issue where creating a topic with a huge number of partitions could lead to a crash. by @IoannisRP in #24232
  • Fixes a bug in which upload candidates made from segments with missing batches would trigger metadata related errors in the ntp_archiver_service, due to assigned start offsets being lower than they should be. by @WillemKauf in #24106
  • #24076 Fixes a rare bug during remote partition manifest downloads where broken pipe exceptions weren't retried in an edge case. by @pgellert in #24080
  • #24144 This fixes a bug in the audit client where if the cluster config value kafka_batch_max_bytes was greater than audit_client_max_buffer_size, the audit client ends up not producing any messages and becomes stuck filling up the audit log buffers. by @pgellert in #24148
  • #24207 Redpanda neglected to include ECDSA based ciphers in the cipher strings used for TLSv1.2 and below. This caused TLS connections that used ECDSA based certificates to fail cipher negotiation when using TLSv1.2 and below. ECDSA ciphers are now in the list of supported ciphers. by @michael-redpanda in #24209

Full Changelog: v24.2.10...v24.2.11

v24.1.18

17 Nov 16:26
907c5f9

Features

  • #23454 A new metric (cluster_features_enterprise_license_expiry_sec) is added for easier monitoring of the enterprise license's expiry time. by @pgellert in #23467
  • #23760 Adds admin API endpoint for enterprise feature info GET /v1/features/enterprise by @oleiman in #23761

Bug Fixes

  • Construct audit metrics probe during service initialization to prevent null pointer access. by @michael-redpanda in #24128
  • Fixes a bug in which upload candidates made from segments with missing batches would trigger metadata related errors in the ntp_archiver_service, due to assigned start offsets being lower than they should be. by @WillemKauf in #24105
  • Fixes a bug where only a group static member's protocols would be updated on rejoin, even if more properties had been passed to the rejoin command by @IoannisRP in #23733
  • #23863 Fixes a bug where audit log manager would retry a bad request forever, causing buffers to fill up, blocking audit log appends and preventing authZ. by @oleiman in #23868
  • #23930 Ignore heartbeat requests/replies to/from unexpected node ids. by @ztlpn in #23934
  • #24056 Cleanup tiered storage temporary cache file if exceptions are thrown during download. by @nvartolomei in #24064
  • #24077 Fixes a rare bug during remote partition manifest downloads where broken pipe exceptions weren't retried in an edge case. by @pgellert in #24079
  • #24143 This fixes a bug in the audit client where if the cluster config value kafka_batch_max_bytes was greater than audit_client_max_buffer_size, the audit client ends up not producing any messages and becomes stuck filling up the audit log buffers. by @pgellert in #24149

Improvements

  • --regex flag in rpk topic describe now supports internal topics. by @r-vasquez in #23605
  • Adds a shard label to some consumer group metrics. by @ballard26 in #23626
  • #23404 Adds the ability to configure Node UUID and ID overrides at broker startup. by @oleiman in #23412
  • Fixed a large allocation in the Raft implementation. by @mmaslankaprv in #24009
  • rpk: redpanda admin brokers list additionally exposes Host/Port/Rack/UUID by @daisukebe in #23688
  • PR #23414 [v24.1.x] archival: use log_level_for_error() for failed reupload candidates by @WillemKauf
  • PR #23450 [v24.1.x] storage: catch ss::gate_closed_exception in log_manager (manual backport) by @WillemKauf
  • PR #23501 [v24.1.x] tests/failure_injector: undo the failures on exit by @bashtanov
  • PR #23506 [v24.1.x] cluster_recovery_backend_test: only reset relevant config by @andrwng
  • PR #23515 [v24.1.x] rptest: produce more data in FullDiskReclaimTest to trigger gc conditions by @nvartolomei
  • PR #23527 [v24.1.x] CORE-7689 dt/rp_installer: ensure cache directory exists by @pgellert
  • PR #23537 [v24.1.x] rptest: do not expect cached segment readers at the end of the test by @nvartolomei
  • PR #23555 [v24.1.x] ssx: exit early from sleep_abortable if already aborted by @nvartolomei
  • PR #23608 [v24.1.x] tests: fix rpk generate test by @r-vasquez
  • PR #23646 [24.1.x] tests: bump ducktape to latest of 0.11.x by @ivotron
  • PR #23674 [v24.1.x] tests: test legacy dashboard in rpk generate by @r-vasquez
  • PR #23708 [v24.1.x] gha: rm use of rp_storage_tool_uploader by @andrewhsu
  • PR #23746 [v24.1.x] Keep producer inflight requests queue bounded by @mmaslankaprv
  • PR #23766 [v24.1.x] gha: fix pip install on python actions by @ivotron
  • PR #23818 [v24.1.x] rpk: debug bundle collecting broker UUIDs by @daisukebe
  • PR #23831 [v24.1.x] rpk: introduce license warnings messages by @r-vasquez
  • PR #23849 [v24.1.x] [CORE-7957] tests: wait for license information comparisons by @r-vasquez
  • PR #23861 [v24.1.x] kafka: oversized alloc in list_offsets_topic by @IoannisRP
  • PR #23866 [v24.1.x] [CORE-7719] Add has_valid_license & has_enterprise_features to phone-home metrics by @oleiman
  • PR #23923 [v24.1.x] rpk: fill schema registry information in cloud profiles by @r-vasquez
  • PR #23965 [v24.1.x] [DEVEX-36] rpk: change expiry check for free_trial by @r-vasquez
  • PR #23988 [v24.1.x] rpk: fix printing new lines by @r-vasquez
  • PR #24015 [v24.1.x] metrics: Add list of enterprise features to call-home POST by @oleiman
  • PR #24025 [v24.1.x] transform-sdk/go/tests: remove -quiet flag by @rockwotj
  • PR #24044 [v24.1.x] [CORE-8141] Add host information to metrics report by @michael-redpanda
  • PR #24051 [v24.1.x] storage: housekeeping metrics by @nvartolomei
  • PR #24062 [v24.1.x] [CORE-1478] rptest: fix retention value in archive_retention_test by @WillemKauf
  • PR #24070 [v24.1.x] rptest: reduce cache eviction throttling for space leak test by @nvartolomei
  • PR #24089 [v24.1.x] storage: remove assertion on is_cloud_retention_active by @ballard26

Full Changelog: v24.1.17...v24.1.18

v24.2.10

08 Nov 17:14
74404e7

Bug Fixes

  • #24057 Cleanup tiered storage temporary cache file if exceptions are thrown during download. by @nvartolomei in #24063

Full Changelog: v24.2.9...v24.2.10