Release 2.0.0 available

Indiana-Dunes-haiku

Apache Ozone 2.0.0 adds 1708 new features, improvements and bug fixes on top of Ozone 1.4.

The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.

[2.0.0] - 2025-04-04

Added

  • New pipeline choosing policy: CapacityPipelineChoosePolicy. This policy randomly chooses pipelines with relatively lower utilization. To use, configure hdds.scm.pipeline.choose.policy.impl to org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy. (HDDS-9345)
  • APIs to fetch single datanode specific information, reducing data transfer from server to client. (HDDS-9648)
  • Support for symmetric keys for delegation tokens. (HDDS-8829)
  • A Storage Container Manager (SCM) can now be decommissioned from a set of SCM nodes. (HDDS-7852)
  • Option to close all pipelines via CLI (ozone admin pipeline close --all). (HDDS-10742)
  • Metrics to monitor bucket state including usage, quota, and available space. (HDDS-10476)
  • Unit tests and documentation for creating keys/files with EC replication config using ofs/o3fs. (HDDS-10553)
  • Support for passing Kerberos credentials in GrpcOmTransport. (HDDS-11041)
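
The CapacityPipelineChoosePolicy entry above describes a random choice biased toward less utilized pipelines. One common way to get that behaviour is to sample two candidates and keep the less utilized one. The sketch below illustrates that idea in Python under stated assumptions: the `Pipeline` type, the `utilization` field, and `choose_pipeline` are hypothetical names, and Ozone's actual Java implementation may select pipelines differently.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipeline:
    # Hypothetical stand-in for an SCM pipeline; not an Ozone class.
    name: str
    utilization: float  # fraction of capacity in use, 0.0 to 1.0


def choose_pipeline(pipelines, rng=random):
    """Sample two candidates at random and return the less utilized one.

    Assumption: this "two random choices" scheme is only one way to favour
    lower utilization; it is not taken from the Ozone source code.
    """
    if not pipelines:
        raise ValueError("no pipelines available")
    first = rng.choice(pipelines)
    second = rng.choice(pipelines)
    return first if first.utilization <= second.utilization else second
```

With this scheme a heavily used pipeline is still eligible, but it is returned only when both random samples land on it, so writes drift toward pipelines with more free capacity.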

Changed

  • S3 Gateway endpoints for static content and admin purposes (/prom, /logs, etc.) are now served on a separate port (default: 19878). Config keys are under ozone.s3g.webadmin. (HDDS-7307)
  • Changed the log level for the container-not-found message in CloseContainerCommandHandler to INFO. (HDDS-9958)
  • Container scanner (hdds.datanode.container.scrub.enabled) is now enabled by default. (HDDS-10485)
  • Upgraded jgrapht to 1.4.0. (HDDS-10503)
  • Bumped follow-redirects to 1.15.6 in Ozone Recon. (HDDS-10526)
  • Bumped axios to 0.28.0 in Ozone Recon. (HDDS-10669)
  • Bumped es5-ext to 0.10.64 in Ozone Recon. (HDDS-10673)
  • Bumped ip to 1.1.9 in Ozone Recon. (HDDS-10674)
  • Bumped browserify-sign to 4.2.3 in Ozone Recon. (HDDS-10676)
  • Bumped plotly.js to 2.25.2 in Ozone Recon. (HDDS-10677)
  • Replaced ConcurrentHashMap with HashMap protected by ReadWriteLock in NodeStateMap for potential performance improvement. (HDDS-10830)
  • Replaced ConcurrentHashMap with HashMap in PipelineStateMap as access is already protected by locks. (HDDS-10971)
  • Bumped express to 4.21.0 in Ozone Recon. (HDDS-11460)
  • Bumped vite to 4.5.5 in Ozone Recon. (HDDS-11467)
  • Improved array handling efficiency, avoiding legacy conversions and double conversions. (HDDS-11544)
  • Extracted common Kubernetes definitions for HttpFS and Recon from getting-started example. (HDDS-11845)
  • Reverted workaround added by HDDS-8715 for thread renaming, as the underlying Hadoop issue HDFS-13566 is fixed in the current Hadoop version. (HDDS-12470)
  • Migrated Ozone Recon UI build process from react-scripts/Jest to Vite/vitest. (HDDS-11017)
  • Added wrapper methods for getting/setting port details (Standalone, Ratis, Rest) in DatanodeDetails, replacing direct usage. (HDDS-117)
  • Refactored OMRequest building in TrashOzoneFileSystem to reduce code duplication. (HDDS-6796)
  • Switched chunk file reading in Datanode to use Netty’s ChunkedNioFile for potential performance improvement. (HDDS-7188)
  • Improved multipart upload part ETag generation to use MD5 hash of content for consistency. (HDDS-9680)
  • Pipeline failure now triggers an immediate heartbeat to SCM to minimize client impact. (HDDS-9823)
  • Improved performance of processing IncrementalContainerReport requests from DN in Recon by batching SCM lookups and reducing client timeouts. (HDDS-9883)
  • Changed Recon datanode ‘Last Heartbeat’ display to show relative time values (e.g., “2s ago”) instead of absolute timestamps. (HDDS-9933)
  • SCM UI now shows cluster storage usage percentage in addition to absolute values. (HDDS-9988)
  • Added functionality to freon OmMetadataGenerator (ommg) Test. (HDDS-10025)
  • Improved logs for SCMDeletedBlockTransactionStatusManager. (HDDS-10029)
  • OzoneManagerRatisServer.getServer() now returns the specific Ratis Division for the group. (HDDS-10036)
  • Reduced buffer copying in OMRatisHelper by using ByteBuffer. (HDDS-10037)
  • Consolidated and added tests for the Ratis write path for prefix ACL operations. (HDDS-10066)
  • Refined SCM start-up logs for clarity and reduced noise (removed duplicate balancer config, reduced cert info verbosity). (HDDS-10271)
  • Removed unnecessary sorting when excluding Datanodes during Ratis Pipeline Creation based on pipeline limits. (HDDS-10345)
  • CopyObjectResponse ETag is now based on the content hash of the copied key, consistent with PutObject. (HDDS-10403)
  • Avoided unnecessary creation of ChunkInfo objects in container-service code by directly accessing proto fields. (HDDS-10410)
  • Prefix ACL checks now correctly resolve bucket links. (HDDS-10412)
  • Refined audit logging for bucket property update operations to include quota and replication details. (HDDS-10460)
  • Implemented logic to fail Datanode decommission early if the cluster doesn’t have enough nodes to maintain replication requirements. (HDDS-10462)
  • Refined audit logging for bucket creation to include quota, owner, and replication details. (HDDS-10475)
  • Standardized byte array to String conversion for RocksDB LiveFileMetaData using UTF-8 and StringUtils.bytes2String, removing BouncyCastle dependency. (HDDS-10744)
  • Added the ozone admin find-ec-missing-padding-blocks tool to detect keys affected by missing EC padding blocks (see HDDS-10681). (HDDS-10751)
  • Improved logging for signature verification failures in OzoneDelegationTokenSecretManager to aid debugging. (HDDS-10802)
  • Implemented getHomeDirectory in OzoneFileSystem implementations to correctly return /user/<ugi user> in secure clusters, respecting impersonation. (HDDS-10905)
  • Reduced client watch requests by using CommitInfoProto from NotReplicatedException (requires Ratis 3.1.0+ and config tuning). (HDDS-10932)
  • Added Netty off-heap memory usage metrics to OM and SCM for better monitoring. (HDDS-11100)
  • Enhanced ozone admin containerbalancer status output with richer information including start time, parameters, progress details, and involved datanodes using -v or --verbose. (HDDS-11120)
  • Improved SCM WebUI display: formatted JVM properties, added DN version/UUID to list, formatted SCM HA info as a list. (HDDS-11196)
  • Added statistical indicators (min, max, median, stdev) for DataNode storage usage to SCM UI/metrics. (HDDS-11206)
  • Added statistics for Capacity, ScmUsed, Remaining, NonScmUsed storage space indicators. (HDDS-11252)
  • Improved CLI display for OM/SCM roles with a --table option. (HDDS-11268)
  • Added statistics for node status counts (Healthy, Dead, Decommissioning, EnteringMaintenance). (HDDS-11272)
  • Allowed disabling OM version-specific features via internal config (e.g., atomic rewrite key). (HDDS-11378)
  • Introduced schema versioning for Recon DB to handle upgrades and distinguish schema changes. (HDDS-11465)
  • Added statistics for Pipeline and Container counts/states to SCM UI/metrics. (HDDS-11469)
  • Improved --duration option handling in freon tests (ombg, ommg) for consistency with -n limit. (HDDS-11494)
  • Made SCMDBDefinition a singleton to reflect its immutability. (HDDS-11555)
  • Simplified DBColumnFamilyDefinition by removing redundant keyType/valueType fields (relying on Codec). (HDDS-11557)
  • Made ReconSCMDBDefinition a singleton. (HDDS-11589)
  • Clarified OM Ratis configuration change log message to avoid confusion about peer roles. (HDDS-11623)
  • Optimized OmUtils.normalizeKey to check isDebugEnabled before performing string comparison. (HDDS-11669)
  • Enhanced Recon metrics for background task status (lastRunStatus, currentTaskStatus) and queue monitoring. (HDDS-11680)
  • Implemented OM-side filtering for ranged GET requests for specific MPU parts to reduce network overhead. (HDDS-11699)
  • Refactored S3 request unmarshalling logic to reduce code duplication. (HDDS-11739)
  • Improved efficiency of BufferUtils.writeFully for ByteBuffer[] using GatheringByteChannel. (HDDS-11860)
  • The ozonefs-hadoop3-client jar may optionally be relocated to a different classpath prefix by specifying the Maven property proto.shaded.prefix. (HDDS-12116)
  • Changed default Replication Manager command deadline to 12 minutes (SCM) and Datanode offset to 6 minutes. (HDDS-12135)
  • Improved error messages in Ozone CLI for FileSystemExceptions (e.g., NoSuchFileException, AccessDeniedException) when not in verbose mode. (HDDS-12241)
  • Returned explicit QUOTA_EXCEEDED S3 error code instead of a generic 500 internal error. (HDDS-12329)
  • Optimized listMultipartUploads by removing duplicate key scanning in OmMetadataManagerImpl. (HDDS-12371)
  • Changed ContainerID to be a value-based class, enforcing factory methods and improving efficiency with cached proto/hash. (HDDS-12541)
  • Combined containerMap and replicaMap in SCM’s ContainerStateMap into a single map for simplicity and efficiency. (HDDS-12555)
  • Moved StorageTypeProto enum from OM/SCM specific proto files to the common hdds.proto. This is a Java API incompatible change for internal protocols but wire compatible. (HDDS-12750)
  • Added configuration (ozone.client.ratis.watch.type) to tune the replication level (ALL_COMMITTED or MAJORITY_COMMITTED) for client watch requests. (HDDS-2887)
  • SCM StateMachine now uses Ratis notifyLeaderReady API instead of relying solely on notifyTermIndexUpdated. (HDDS-10690)
  • Refactored OM request validateAndUpdateCache methods to pass ExecutionContext instead of just TermIndex. (HDDS-11975)
  • Reduced unnecessary object creation (RunningDatanodeState, EndpointTasks) during Datanode heartbeat processing when state is RUNNING. (HDDS-11083)
  • Improved replication metrics consistency across Datanode commands handled by ReplicationSupervisor and those handled directly. (HDDS-11376)
  • Improved logging in Container Balancer’s AbstractFindTargetGreedy to detail why potential targets are excluded. (HDDS-10198)
  • Refined ozone admin containerbalancer status output for better readability and detail, including time consumption and data units (MB/GB). (HDDS-11367)
  • Added Pipeline count to ozone admin datanode usageinfo output. (HDDS-11357)
  • Removed redundant CommandHandler thread pool size methods (already covered by ReplicationSupervisor metrics). (HDDS-11304)
  • Replaced clusterId parameter in KeyValueHandler with initialization via setClusterId to prevent potential NPE during concurrent container creation under high load. (HDDS-11396)
  • Added ozone.om.ratis.leader.election.minimum.timeout.duration.key config to OM RaftProperties for leader election timeout. (HDDS-10761)
  • Added configuration (ozone.om.rocksdb.max_open_files) to set RocksDB max_open_files option for OM DB. (HDDS-11191)
  • Standardized Datanode command metrics tracking across ReplicationSupervisor and direct command handlers. (HDDS-11444)
  • Optimized Recon List Keys API by reusing calculated path prefix for consecutive keys with the same parent ID. (HDDS-11668)
  • Optimized Recon List Keys API response generation by reducing object creation (avoiding OmKeyInfo) and memory buffering. (HDDS-11660)
  • Optimized Recon List Keys API filtering logic by replacing predicate lambdas with simple IF statements for performance. (HDDS-11649)
  • Added foundational schema upgrade action (InitialConstraintUpgradeAction) for Recon to handle constraints on existing tables (e.g., Unhealthy Containers) upon first upgrade to schema versioning. (HDDS-11615)
  • Added Ozone wrapper configurations (ozone.scm.ipc.server.read.threadpool.size, ozone.hdds.datanode.ipc.server.read.threadpool.size) to increase ipc.server.read.threadpool.size for SCM and Datanode RPC servers (default 10). (HDDS-11302)
  • Refactored ContainerStateMap to restrict ContainerAttribute generic type T to Enum, removing unused ownerMap/repConfigMap. (HDDS-12532)
  • Refactored ContainerStateManager interface to remove redundant ContainerID parameters when ContainerReplica (which contains the ID) is already passed. (HDDS-12572)
  • Refactored DB/Table classes to use the DB name as the thread name prefix implicitly, removing the explicit parameter. (HDDS-12590)
  • Included ContainerInfo within ContainerAttribute to avoid extra map lookups in ContainerStateManager methods. (HDDS-12591)
  • Enabled custom ValueCodec for TypedTable to allow performance optimizations like partial deserialization (e.g., OmKeyInfo without ACLs/locations). (HDDS-12582)
  • Made ozone admin scm safemode --verbose show rule status even when SCM is not in safe mode. (HDDS-12548)
  • Addressed thread safety issue in BlockOutputStream#failedServers by using a concurrent collection. (HDDS-12331)
  • Added DatanodeID validation for incoming ContainerCommandRequests and on Ratis group joins to prevent operations on incorrect nodes. (HDDS-11667)
  • Persisted the list of container IDs created on a Datanode to prevent recreation after volume failures, ensuring consistency for both Ratis and EC containers. (HDDS-11650)
  • Added check for rocks_tools native library in ozone checknative CLI command output. (HDDS-11347)
  • Added Ozone cluster growth rate metric (based on scm_node_manager_total_used rate) to Grafana dashboard using PromQL. (HDDS-12168)
  • Added robust error handling for Recon OM background tasks (e.g., NSSummary) to prevent data inconsistencies if Recon crashes during partial event processing. (HDDS-12062)
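
The storage usage indicators listed above for the SCM UI/metrics (min, max, median, stdev, HDDS-11206) can be illustrated with a short Python sketch. The function name and input shape are hypothetical, and the use of the population standard deviation is an assumption, since the entry does not say which variant Ozone reports.

```python
import statistics


def usage_summary(used_fractions):
    """Summarize per-datanode storage usage, given as fractions of capacity.

    Assumption: population standard deviation (pstdev) is used here; the
    release notes do not specify sample versus population stdev.
    """
    if not used_fractions:
        raise ValueError("at least one datanode is required")
    return {
        "min": min(used_fractions),
        "max": max(used_fractions),
        "median": statistics.median(used_fractions),
        "stdev": statistics.pstdev(used_fractions),
    }
```

A wide gap between min and max, or a large stdev, suggests that usage is unevenly spread across datanodes.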

Deprecated

  • LegacyReplicationManager (hdds.scm.replication.enable.legacy=true) is removed and no longer supported. (HDDS-11759)
  • FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. New containers cannot be created with this layout. Support will be removed in a future release. (HDDS-11753)

Removed

  • Removed LegacyReplicationManager implementation and the hdds.scm.replication.enable.legacy config property. (HDDS-11759)
  • Removed unused resultCache and getMatchingContainerIDs method from ContainerStateMap. (HDDS-12445)

Fixed

  • Fixed admin-only API handling for TriggerDBSyncEndpoint in Recon. (HDDS-11436)
  • Fixed potential NullPointerException in OzoneManagerProtocolClientSideTranslatorPB.listStatusLight when startKey is null (e.g., via s3a). (HDDS-10367)
  • Addressed memory leak caused by ThreadLocal usage in OMClientRequest (OMLockDetails). (HDDS-10385)
  • Fixed Container Balancer incorrectly selecting containers with 0 or negative size for moving. (HDDS-10483)
  • Fixed inability to write files when Datanode chunk data validation (hdds.datanode.chunk.data.validation.check) is enabled due to buffer position issue. (HDDS-10547)
  • Fixed Recon startup failure (“used space cannot be negative”) by handling Datanode reports with negative used space gracefully. (HDDS-10614)
  • Fixed IOException: ParentKeyInfo ... is null in Recon Namespace Summary task by handling cases where parent info might be missing. (HDDS-10855)
  • Fixed EC Reconstruction failure (IllegalArgumentException: The chunk list has X entries, but the checksum chunks has Y entries) potentially caused by out-of-order EC stripe writes leading to inaccurate chunk lists. (HDDS-10985)
  • Fixed OM crash (SnapshotChainManager: Failure while loading snapshot chain) caused by SstFilteringService directly updating snapshot info DB entries, potentially corrupting the chain if OM restarts before DoubleBuffer flush. (HDDS-11068)
  • Resolved ClassCastException (RepeatedOmKeyInfo to OmKeyInfo) in Recon’s FileSizeCountTask due to improper event handling in OMDBUpdatesHandler for conflicting keys across tables (e.g., file and directory with the same name). (HDDS-11187)
  • Fixed ContainerSizeCountTask in Recon logging ERROR for negative-sized containers; reduced log level as these are ignored functionally. (HDDS-12227)
  • Fixed duplicate key violation in Recon’s FileSizeCountTask by correctly handling the isDbTruncated flag to allow updates instead of only inserts. (HDDS-12228)
  • Made OzoneClientException extend IOException. (HDDS-64)
  • Fixed various S3 gateway issues including multipart upload and other improvements. (HDDS-1186)
  • Fixed SCM Decommissioning issue causing InvalidStateTransitionException after recommissioning the same SCM node. (HDDS-9608)
  • Fixed Recon Disk Usage page UI issues with large numbers of keys/buckets/volumes (pie chart usability, axis ticks, path overflow). (HDDS-9626)
  • Fixed Ozone admin namespace CLI du command printing incorrect validation error messages for root ("/") or volume paths. (HDDS-9644)
  • Fixed Recon incorrectly including out-of-service (decommissioned, maintenance) nodes when checking container health status (over/under/mis-replication). (HDDS-9645)
  • Fixed potential NullPointerException in ContainerStateMap.ContainerAttribute due to race condition between update and get operations. (HDDS-9527)
  • Fixed Recon potentially showing duplicate DEAD datanodes after decommission/reformat/recommission cycles (HDDS-10409); node removal is now allowed only for DEAD nodes. (HDDS-11032)
  • Fixed potential memory overflow in Recon’s Container Health Task due to unbounded list growth. (HDDS-9819)
  • Reduced Ozone client heap memory utilization during writes by using pooled direct buffers for chunks. (HDDS-9843)
  • Fixed Pipeline.nodesInOrder using ThreadLocal, making it inaccessible to other threads after being set. (HDDS-9848)
  • Switched KeyValueContainerCheck.verifyChecksum to use direct/mapped buffers instead of heap buffers. (HDDS-9941)
  • Fixed TokenRenewer implementations (O3FS, OFS) not closing the created OzoneClient. Removed duplicate implementation. (HDDS-9943)
  • Fixed NSSummaryAdmin CLI commands not closing created OzoneClient instances and creating multiple instances unnecessarily. (HDDS-9944)
  • Fixed incorrect synchronization in RatisSnapshotInfo, potentially leading to inconsistent term/index values. Class removed as redundant to TransactionInfo. (HDDS-9984)
  • Fixed Options and ReadOptions instances not being closed properly in rocksdb-checkpoint-differ. (HDDS-10001)
  • Renamed ManagedSstFileReader in rocksdb-checkpoint-differ to SstFileSetReader to avoid name collision with the class in hdds-managed-rocksdb. (HDDS-10007)
  • Fixed potential NullPointerException in VolumeInfoMetrics.getCommitted() if HddsVolume.committedBytes is null. (HDDS-10027)
  • Refined SCM RPC handler counts to be configurable per protocol (Client, Block, Datanode) instead of a single global count. (HDDS-10088)
  • Removed static dbNameToCfHandleMap from RocksDatabase, using non-static columnFamilies map instead. (HDDS-10107)
  • Fixed potential NullPointerException in OMDBCheckpointServlet lock acquisition when SstFilteringService is accessed before initialization. (HDDS-10138)
  • Enabled Zero-Copy reads during container replication for improved performance. (HDDS-10144)
  • Corrected the misspelled metric names createOmResoonseLatencyNs and validateAndUpdateCacneLatencyNs in OMPerformanceMetrics. (HDDS-10162)
  • Fixed OmMetadataManagerImpl creating a new S3Batcher instance for each S3 secret operation instead of reusing one. (HDDS-10202)
  • Ensured atomic updates in StateContext#updateCommandStatus using computeIfPresent to prevent race conditions. (HDDS-10210)
  • Fixed Grafana dashboards: removed UID/hostnames, included secure/unsecure ports, corrected datastore count. (HDDS-10229)
  • Prevented V3 Schema DatanodeStore from creating container DBs in incorrect locations under certain initialization paths. (HDDS-10230)
  • Fixed ContainerStateManager finalizing OPEN containers without a healthy pipeline on follower SCMs; moved logic to leader-only path via Ratis. (HDDS-10231)
  • Improved JSON response for Deleted Directories and Open Keys Insight Endpoints in Recon for better clarity (using actual names instead of Object IDs). (HDDS-10241)
  • Fixed ContainerReport admin command showing incorrect values immediately after SCM restart before Replication Manager runs. (HDDS-10272)
  • Fixed pagination on the OM DB Insights page in Recon. (HDDS-10282)
  • Added support for direct ByteBuffers in Checksum calculations, using reflection for Java 9+ API while maintaining Java 8 compatibility. (HDDS-10288)
  • Fixed ECReconstructionCoordinator ignoring ozone-site.xml client configurations and using default OzoneClientConfig. (HDDS-10294)
  • Fixed potential orphan blocks during key overwrite operations, especially involving the deleted key table. (HDDS-10296)
  • Fixed KeyManagerImpl#listKeys path normalization to correctly handle OBS/LEGACY buckets when ozone.om.enable.filesystem.paths is true. (HDDS-10319)
  • Fixed metadata not being updated when overwriting existing keys via S3 PutObject. (HDDS-10324)
  • Fixed SetTimes API not working with linked buckets due to missing link resolution. (HDDS-10369)
  • Fixed Recon not handling pre-existing MISSING_EMPTY containers correctly (introduced in HDDS-9695), leaving them marked as missing indefinitely. (HDDS-10370)
  • Fixed S3 listParts incompatibility for keys created before HDDS-9680 (missing ETag metadata) and NPE when ETag is null. (HDDS-10395)
  • Restricted directory deletion in LEGACY buckets via ozone sh key delete; users must use ozone fs interface. (HDDS-10397)
  • Fixed ArrayIndexOutOfBoundsException when listing keys in OBS buckets via S3/s3a under certain conditions. (HDDS-10399)
  • Fixed ozone admin CLI having hard-coded INFO log level, ignoring environment/config settings. (HDDS-10405)
  • Fixed Datanode startup failure (“Illegal configuration: raft.grpc.message.size.max must be 1m larger than …”) when using latest Ratis due to default config mismatch. (HDDS-11375)
  • Fixed Datanode startup failure (“checksum size setting 1024 is not in expected format”) due to incorrect type validation for hdds.ratis.raft.server.snapshot.creation.gap. (HDDS-10423)
  • Fixed Grafana dashboard Prometheus endpoint configuration for Datanodes and added missing Recon endpoint. (HDDS-10433)
  • Fixed Datanode Maintenance failing early incorrectly (logic refined). (HDDS-10463)
  • Fixed OM potentially crashing or failing requests if the configured S3 secret storage (Vault) is unavailable. (HDDS-10469)
  • Fixed audit log for key creation missing EC replication config details (parity, chunk size, codec). (HDDS-10472)
  • Fixed potential NullPointerException in OmUtils.getAllOMHAAddresses if OM HA config keys are missing. (HDDS-10508)
  • Fixed S3 GetObject ETag header returning "null" for objects without an ETag, causing issues with AWS SDK validation. Now omits the header if ETag is missing. (HDDS-10521)
  • Fixed MessageDigest instance in S3 endpoint potentially not being reset after exceptions (e.g., client cancellation), leading to incorrect ETags on subsequent requests using the same thread. (HDDS-10587)
  • Fixed issue where client might attempt Ratis streaming for keys defaulted to EC replication if bucket replication isn’t explicitly set. (HDDS-10832)
  • Fixed freon read/mixed operations failing with “Key not found” if prefix is unspecified; stopped adding random prefix. Fixed misleading random prefix log in ommg. (HDDS-10845)
  • Fixed Ozone CLI not respecting default ozone.om.service.id when only one service ID is configured. (HDDS-10861)
  • Fixed ClosePipelineCommandHandler potentially causing GroupMismatchException by calling removeGroup before getting peer list for propagation. (HDDS-10875)
  • Fixed Recon ReconContainerManager potentially throwing DuplicatedPipelineIdException when checking/adding containers due to race conditions or stale data. (HDDS-10880)
  • Improved logging clarity in Recon’s ReconNodeManager regarding datanode finalization status checks during upgrades. (HDDS-10883)
  • Fixed OM startup failure in single-node Docker container due to Ratis group directory mismatch when using default service ID. (HDDS-10909)
  • Fixed Recon startup failing silently or logging incorrect errors in non-HA SCM scenarios due to inability to fetch SCM roles or snapshot. (HDDS-10937)
  • Fixed OM decommission config (ozone.om.decommissioned.nodes) not working without service ID suffix when only one OM service ID is configured. (HDDS-10942)
  • Fixed EC key read corruption potentially occurring if a container’s replica index on a DN mismatches the index expected by the client (e.g., after container move). Added validation. (HDDS-10983)
  • Fixed S3 gateway potentially throwing exceptions (javax.xml.xpath.XPathExpressionException) during concurrent XML parsing (e.g., CompleteMultipartUpload, DeleteObjects). (HDDS-10777)
  • Fixed NullPointerException in XceiverClientRatis.watchForCommit when updateCommitInfosMap encounters a new Datanode ID in the response after a previous timeout removed it from commitInfoMap. (HDDS-10780)
  • Fixed potential OMLeaderNotReadyException after leader switch if transactions were pending in the double buffer, preventing lastNotifiedTermIndex update. (HDDS-10798)
  • Fixed various HTTP server components (Recon, SCM, OM, DN) failing to start if configured with a wildcard Kerberos principal (*) due to missing kerb-core dependency. (HDDS-10803)
  • Fixed S3 setBucketAcl causing UnsupportedOperationException due to attempting to modify an immutable list returned by OzoneVolume.getAcls(). (HDDS-11737)
  • Fixed SCM leadership metric (SCMLeader) potentially being reset to null by HTTP server initialization after the Raft server has already determined leadership. (HDDS-11742)
  • Fixed SnapshotDiffManager logging NativeLibraryNotLoadedException as ERROR even when native tools are optional; changed to WARN. (HDDS-11486)
  • Fixed potential NullPointerException when checking container balancer status (ozone admin containerbalancer status) if balancer is started but not fully initialized (e.g., waiting for DU info). (HDDS-11350)
  • Fixed ozone fs -rm -r prompt for volume deletion suggesting incorrect ozone sh volume delete options (-skipTrash, -id). (HDDS-11346)
  • Fixed ozone sh key list -h showing duplicate options (--all, --length) caused by a picocli version issue; the picocli upgrade was reverted. (HDDS-11446)
  • Fixed S3 CompleteMultipartUpload returning 500 Internal Server Error instead of S3-compliant InvalidRequest error when no parts are specified in the request body. (HDDS-11457)
  • Fixed multiple IOzoneAuthorizer instances potentially being created and leaked if Ratis snapshot installation fails repeatedly after stopping the metadata manager. (HDDS-11472)
  • Fixed ozone sh volume delete command line parsing error for the -r option; resolved as part of the HDDS-11346 fix. (HDDS-11535)
  • Fixed NullPointerException in OM when overwriting an empty file using multipart upload in FSO buckets (versioning disabled). (HDDS-12131)
  • Fixed the description of the Replication Manager interval configuration (hdds.scm.replication.thread.interval) to correctly state milliseconds instead of seconds; resolved by removing unsupported types. (HDDS-12144)
  • Fixed Grafana dashboard for Chunk read/write rates using incorrect interval variable ($__interval instead of $__rate_interval). (HDDS-12112)
  • Fixed Replication Manager potentially expiring pending container deletes incorrectly instead of retrying them if the Datanode doesn’t confirm deletion within the deadline. (HDDS-12127)
  • Fixed Replication Manager non-deterministically selecting replicas for deletion if preferred target nodes are overloaded, potentially deleting required replicas. (HDDS-12115)
  • Fixed delete container commands potentially running indefinitely or past their deadline due to long lock waits or slow disk I/O; added lock timeout and moved ICR earlier. (HDDS-12114)
  • Fixed Recon UI potentially switching from old UI to new UI automatically upon page refresh. (HDDS-12084)
  • Fixed missing local refresh button in new Recon UI’s Disk Usage page to reload data for the current path without navigating back to root. (HDDS-12085)
  • Fixed unnecessary parameters “Source Volume” & “Source Bucket” appearing in the metadata table for non-link buckets in the new Recon UI Disk Usage page. (HDDS-12073)
  • Fixed Recon API endpoints /api/v1/volumes and /api/v1/buckets missing from Swagger documentation. (HDDS-11300)
  • Fixed potential NullPointerException in Recon /api/v1/volumes and /api/v1/buckets endpoints if accessed before Recon tables are fully initialized after startup. (HDDS-11349)
  • Fixed ozone freon cr (closed container replication) command failing with NPE due to metrics map lookup failure in ReplicationSupervisor. (HDDS-12040)
  • Fixed incorrect display of Ozone Service ID name in Recon UI (New UI showed “OM ID”, Old UI showed “OM Service”). Corrected to “Ozone Service ID”. (HDDS-12049)
  • Fixed difference in Cluster Capacity % calculation (floor vs round) and Container Pre-Allocated Size display (committed vs 0) between new and old Recon UI. (HDDS-12042)
  • Fixed long path names wrapping to the next line in the new Recon UI Disk Usage page; made it scrollable instead. (HDDS-11957)
  • Fixed Recon failing to update version_number in RECON_SCHEMA_VERSION table (always -1), causing upgrade actions to run unnecessarily on fresh installs. (HDDS-11846)
  • Fixed serialization error (Conflicting/ambiguous property name) in Recon’s listKeys API due to Jackson ambiguity between key field and isKey() getter. Renamed getter. (HDDS-11848)
  • Fixed potential deadlock in OM between DoubleBuffer flush thread (waiting for DeletedTable lock during snapshot checkpoint) and KeyDeletingService (holding DeletedTable lock, waiting for Ratis future). (HDDS-11124)
  • Fixed OM crashing with IOException: Rocks Database is closed during SnapshotMoveDeletedKeys request processing if the snapshot was purged concurrently. (HDDS-11152)
  • Fixed containers potentially stuck in DELETING state after upgrade if they were affected by HDDS-8129 (incorrect block counts) and datanodes rejected delete commands due to negative counts. Added recovery logic. (HDDS-11136)
  • Fixed fs -mkdir incorrectly creating directories in OBS buckets (bypassing layout validation added in HDDS-11235). Reverted the optimization for mkdir. (HDDS-11348)
  • Fixed Datanode potentially failing heartbeats or other operations due to deadlock on StateContext#pipelineActions map under high load. Replaced with concurrent map. (HDDS-11331)
  • Fixed S3 gateway returning 403 Forbidden instead of 302 Redirect for root path (/) requests containing Authorization: Negotiate header (used by newer curl versions). (HDDS-11096)
  • Fixed DELETE_TENANT request logging an unnecessary and uninformative UPDATE_VOLUME audit entry, even on failure. (HDDS-11119)
  • Fixed intermittent timeout in TestBlockDeletion.testBlockDeletion potentially caused by race conditions or slow command processing. (HDDS-9962)
  • Fixed “Bad file descriptor” error in TestOmSnapshotFsoWithNativeLib.testSnapshotCompactionDag when using native RocksDB tools library. (HDDS-10149)
  • Fixed ManagedStatistics objects not being closed properly in OM and DN when RocksDB statistics are enabled, leading to resource leaks. (HDDS-10184)
  • Fixed race condition in RocksDatabase where a close operation could occur between assertClose() check and database operation, causing JVM crash. (HDDS-9527)
  • Fixed ReplicationManager metrics not being re-registered after RM restart via CLI (stop/start), causing metrics to stop reporting. (HDDS-9235)
  • Fixed infinite loop in WritableRatisContainerProvider.getContainer if SCM restarts and existing pipeline nodes are not found (e.g., DNs stopped), causing log flooding. (HDDS-8982)
  • Fixed SCM follower nodes logging NotLeaderException errors when processing Pipeline Reports, which is expected behavior for followers. Suppressed error logging. (HDDS-11695)
  • Addressed potential causes of FileNotFoundException: ... (Too many open files) and subsequent DN crashes during OM+DN decommission under heavy Freon load, likely due to excessive open file handles. (HDDS-11391)
  • Fixed Recon showing incorrect (zero) count for DELETED containers in cluster state summary API (/api/v1/clusterState). (HDDS-11389)
  • Fixed issue where OM could fail if IOzoneAuthorizer (e.g., Ranger plugin) fails to initialize during snapshot installation and reload attempts create multiple instances, leading to heap exhaustion. (HDDS-11472)
  • Fixed NoSuchUpload error when aborting multipart uploads for keys where the parent directory was missing (potentially due to FSO-related bugs or cleanup issues). (HDDS-11784)
  • Fixed secure acceptance tests failing on arm64 due to keytab checksum mismatch when using keytabs generated on amd64. Regenerated keytabs for multi-arch compatibility. (HDDS-11810)
  • Fixed race condition in datanode VERSION file creation where multiple threads could attempt to write using the same temporary file via AtomicFileOutputStream. (HDDS-12608)
  • Fixed SCM logging an error when updating sequence ID for a CLOSED container based on a replica report with higher BCSID; changed log level and added context. (HDDS-12409)
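
The GetObject fix above (HDDS-10521) replaces a literal "null" ETag header with no header at all when an object has no ETag. A minimal Python sketch of that pattern follows; `build_get_object_headers` is a hypothetical helper used only for illustration and is not part of the S3 Gateway code.

```python
def build_get_object_headers(content_length, etag=None):
    """Build response headers, leaving out ETag when none is stored.

    Hypothetical helper: sending ETag: "null" can trip client-side checksum
    validation, so the header is omitted rather than filled with a placeholder.
    """
    headers = {"Content-Length": str(content_length)}
    if etag is not None:
        # HTTP entity tags are sent as quoted strings.
        headers["ETag"] = f'"{etag}"'
    return headers
```

Clients that validate ETags then skip the check for such objects instead of comparing against a meaningless value.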

Security

  • Disabled REST endpoint for S3 secret manipulation by username for non-admin users via S3Gateway Secret REST endpoint. (HDDS-11040)

For more details, check out the Apache Ozone 2.0.0 JIRA list.

This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.

Generated-by: Google AI Studio + Gemini 2.5 Pro Preview 03-25, with input data from a filtered JIRA list using this prompt.

Image credit: Indiana Dunes National Lakeshore, Michigan City, Indiana, USA by Diego Delso, CC-BY-SA 3.0 / Text added to original