Release 2.0.0 available

Apache Ozone 2.0.0 adds 1708 new features, improvements and bug fixes on top of Ozone 1.4.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[2.0.0] - 2025-04-04
Added
- New pipeline choosing policy: CapacityPipelineChoosePolicy. This policy randomly chooses pipelines with relatively lower utilization. To use, configure hdds.scm.pipeline.choose.policy.impl to org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy. (HDDS-9345)
- APIs to fetch single datanode specific information, reducing data transfer from server to client. (HDDS-9648)
- Support for symmetric keys for delegation tokens. (HDDS-8829)
- A Storage Container Manager (SCM) can now be decommissioned from a set of SCM nodes. (HDDS-7852)
- Option to close all pipelines via CLI (ozone admin pipeline close --all). (HDDS-10742)
- Metrics to monitor bucket state including usage, quota, and available space. (HDDS-10476)
- Unit tests and documentation for creating keys/files with EC replication config using ofs/o3fs. (HDDS-10553)
- Support for passing Kerberos credentials in GrpcOmTransport. (HDDS-11041)
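The CapacityPipelineChoosePolicy entry above describes choosing pipelines at random while favoring lower utilization. One common way to get that behavior is to sample two candidates at random and keep the less utilized one. The sketch below illustrates that idea only; it is an assumption about the general technique, not the Ozone implementation, and the `Pipeline` type and `choose_pipeline` function are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical stand-in for a pipeline; not an Ozone class."""
    name: str
    utilization: float  # fraction of capacity in use, 0.0 to 1.0


def choose_pipeline(pipelines, rng=random):
    """Sample up to two distinct pipelines at random and keep the less utilized one."""
    if not pipelines:
        raise ValueError("no pipelines to choose from")
    candidates = rng.sample(pipelines, k=min(2, len(pipelines)))
    return min(candidates, key=lambda p: p.utilization)


if __name__ == "__main__":
    cluster = [Pipeline("p1", 0.82), Pipeline("p2", 0.35), Pipeline("p3", 0.60)]
    print(choose_pipeline(cluster, rng=random.Random(7)).name)
```

Sampling keeps the choice random, so load spreads across pipelines, while the comparison biases each choice away from the fuller candidate.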
Changed
- S3 Gateway endpoints for static content and admin purposes (/prom, /logs, etc.) are now served on a separate port (default: 19878). Config keys are under ozone.s3g.webadmin. (HDDS-7307)
- Changed the log level of the container-not-found message in CloseContainerCommandHandler to INFO. (HDDS-9958)
- Container scanner (hdds.datanode.container.scrub.enabled) is now enabled by default. (HDDS-10485)
- Upgraded jgrapht to 1.4.0. (HDDS-10503)
- Bumped follow-redirects to 1.15.6 in Ozone Recon. (HDDS-10526)
- Bumped axios to 0.28.0 in Ozone Recon. (HDDS-10669)
- Bumped es5-ext to 0.10.64 in Ozone Recon. (HDDS-10673)
- Bumped ip to 1.1.9 in Ozone Recon. (HDDS-10674)
- Bumped browserify-sign to 4.2.3 in Ozone Recon. (HDDS-10676)
- Bumped plotly.js to 2.25.2 in Ozone Recon. (HDDS-10677)
- Replaced ConcurrentHashMap with HashMap protected by ReadWriteLock in NodeStateMap for potential performance improvement. (HDDS-10830)
- Replaced ConcurrentHashMap with HashMap in PipelineStateMap as access is already protected by locks. (HDDS-10971)
- Bumped express to 4.21.0 in Ozone Recon. (HDDS-11460)
- Bumped vite to 4.5.5 in Ozone Recon. (HDDS-11467)
- Improved array handling efficiency, avoiding legacy conversions and double conversions. (HDDS-11544)
- Extracted common Kubernetes definitions for HttpFS and Recon from getting-started example. (HDDS-11845)
- Reverted workaround added by HDDS-8715 for thread renaming, as the underlying Hadoop issue HDFS-13566 is fixed in the current Hadoop version. (HDDS-12470)
- Migrated Ozone Recon UI build process from react-scripts/Jest to Vite/vitest. (HDDS-11017)
- Added wrapper methods for getting/setting port details (Standalone, Ratis, Rest) in DatanodeDetails, replacing direct usage. (HDDS-117)
- Refactored OMRequest building in TrashOzoneFileSystem to reduce code duplication. (HDDS-6796)
- Switched chunk file reading in Datanode to use Netty’s ChunkedNioFile for potential performance improvement. (HDDS-7188)
- Improved multipart upload part ETag generation to use MD5 hash of content for consistency. (HDDS-9680)
- Pipeline failure now triggers an immediate heartbeat to SCM to minimize client impact. (HDDS-9823)
- Improved performance of processing IncrementalContainerReport requests from DN in Recon by batching SCM lookups and reducing client timeouts. (HDDS-9883)
- Changed Recon datanode ‘Last Heartbeat’ display to show relative time values (e.g., “2s ago”) instead of absolute timestamps. (HDDS-9933)
- SCM UI now shows cluster storage usage percentage in addition to absolute values. (HDDS-9988)
- Added functionality to freon OmMetadataGenerator (ommg) Test. (HDDS-10025)
- Improved logs for SCMDeletedBlockTransactionStatusManager. (HDDS-10029)
- OzoneManagerRatisServer.getServer() now returns the specific Ratis Division for the group. (HDDS-10036)
- Reduced buffer copying in OMRatisHelper by using ByteBuffer. (HDDS-10037)
- Consolidated and added tests for the Ratis write path for prefix ACL operations. (HDDS-10066)
- Refined SCM start-up logs for clarity and reduced noise (removed duplicate balancer config, reduced cert info verbosity). (HDDS-10271)
- Removed unnecessary sorting when excluding Datanodes during Ratis Pipeline Creation based on pipeline limits. (HDDS-10345)
- CopyObjectResponse ETag is now based on the content hash of the copied key, consistent with PutObject. (HDDS-10403)
- Avoided unnecessary creation of ChunkInfo objects in container-service code by directly accessing proto fields. (HDDS-10410)
- Prefix ACL checks now correctly resolve bucket links. (HDDS-10412)
- Refined audit logging for bucket property update operations to include quota and replication details. (HDDS-10460)
- Implemented logic to fail Datanode decommission early if the cluster doesn’t have enough nodes to maintain replication requirements. (HDDS-10462)
- Refined audit logging for bucket creation to include quota, owner, and replication details. (HDDS-10475)
- Standardized byte array to String conversion for RocksDB LiveFileMetaData using UTF-8 and StringUtils.bytes2String, removing BouncyCastle dependency. (HDDS-10744)
- Added the ozone admin find-ec-missing-padding-blocks tool to detect keys affected by missing EC padding blocks (HDDS-10681). (HDDS-10751)
- Improved logging for signature verification failures in OzoneDelegationTokenSecretManager to aid debugging. (HDDS-10802)
- Implemented getHomeDirectory in OzoneFileSystem implementations to correctly return /user/<ugi user> in secure clusters, respecting impersonation. (HDDS-10905)
- Reduced client watch requests by using CommitInfoProto from NotReplicatedException (requires Ratis 3.1.0+ and config tuning). (HDDS-10932)
- Added Netty off-heap memory usage metrics to OM and SCM for better monitoring. (HDDS-11100)
- Enhanced ozone admin containerbalancer status output with richer information including start time, parameters, progress details, and involved datanodes using -v or --verbose. (HDDS-11120)
- Improved SCM WebUI display: formatted JVM properties, added DN version/UUID to list, formatted SCM HA info as a list. (HDDS-11196)
- Added statistical indicators (min, max, median, stdev) for DataNode storage usage to SCM UI/metrics. (HDDS-11206)
- Added statistics for Capacity, ScmUsed, Remaining, NonScmUsed storage space indicators. (HDDS-11252)
- Improved CLI display for OM/SCM roles with a --table option. (HDDS-11268)
- Added statistics for node status counts (Healthy, Dead, Decommissioning, EnteringMaintenance). (HDDS-11272)
- Allowed disabling OM version-specific features via internal config (e.g., atomic rewrite key). (HDDS-11378)
- Introduced schema versioning for Recon DB to handle upgrades and distinguish schema changes. (HDDS-11465)
- Added statistics for Pipeline and Container counts/states to SCM UI/metrics. (HDDS-11469)
- Improved --duration option handling in freon tests (ombg, ommg) for consistency with -n limit. (HDDS-11494)
- Made SCMDBDefinition a singleton to reflect its immutability. (HDDS-11555)
- Simplified DBColumnFamilyDefinition by removing redundant keyType/valueType fields (relying on Codec). (HDDS-11557)
- Made ReconSCMDBDefinition a singleton. (HDDS-11589)
- Clarified OM Ratis configuration change log message to avoid confusion about peer roles. (HDDS-11623)
- Optimized OmUtils.normalizeKey to check isDebugEnabled before performing string comparison. (HDDS-11669)
- Enhanced Recon metrics for background task status (lastRunStatus, currentTaskStatus) and queue monitoring. (HDDS-11680)
- Implemented OM-side filtering for ranged GET requests for specific MPU parts to reduce network overhead. (HDDS-11699)
- Refactored S3 request unmarshalling logic to reduce code duplication. (HDDS-11739)
- Improved efficiency of BufferUtils.writeFully for ByteBuffer[] using GatheringByteChannel. (HDDS-11860)
- The ozonefs-hadoop3-client jar may optionally be relocated to a different classpath prefix by specifying the Maven property proto.shaded.prefix. (HDDS-12116)
- Changed default Replication Manager command deadline to 12 minutes (SCM) and Datanode offset to 6 minutes. (HDDS-12135)
- Improved error messages in Ozone CLI for FileSystemExceptions (e.g., NoSuchFileException, AccessDeniedException) when not in verbose mode. (HDDS-12241)
- Returned explicit QUOTA_EXCEEDED S3 error code instead of a generic 500 internal error. (HDDS-12329)
- Optimized listMultipartUploads by removing duplicate key scanning in OmMetadataManagerImpl. (HDDS-12371)
- Changed ContainerID to be a value-based class, enforcing factory methods and improving efficiency with cached proto/hash. (HDDS-12541)
- Combined containerMap and replicaMap in SCM’s ContainerStateMap into a single map for simplicity and efficiency. (HDDS-12555)
- Moved StorageTypeProto enum from OM/SCM specific proto files to the common hdds.proto. This is a Java API incompatible change for internal protocols but wire compatible. (HDDS-12750)
- Added configuration (ozone.client.ratis.watch.type) to tune the replication level (ALL_COMMITTED or MAJORITY_COMMITTED) for client watch requests. (HDDS-2887)
- SCM StateMachine now uses Ratis notifyLeaderReady API instead of relying solely on notifyTermIndexUpdated. (HDDS-10690)
- Refactored OM request validateAndUpdateCache methods to pass ExecutionContext instead of just TermIndex. (HDDS-11975)
- Reduced unnecessary object creation (RunningDatanodeState, EndpointTasks) during Datanode heartbeat processing when state is RUNNING. (HDDS-11083)
- Improved replication metrics consistency across Datanode commands handled by ReplicationSupervisor and those handled directly. (HDDS-11376)
- Improved logging in Container Balancer’s AbstractFindTargetGreedy to detail why potential targets are excluded. (HDDS-10198)
- Refined ozone admin containerbalancer status output for better readability and detail, including time consumption and data units (MB/GB). (HDDS-11367)
- Added Pipeline count to ozone admin datanode usageinfo output. (HDDS-11357)
- Removed redundant CommandHandler thread pool size methods (already covered by ReplicationSupervisor metrics). (HDDS-11304)
- Replaced clusterId parameter in KeyValueHandler with initialization via setClusterId to prevent potential NPE during concurrent container creation under high load. (HDDS-11396)
- Added ozone.om.ratis.leader.election.minimum.timeout.duration.key config to OM RaftProperties for leader election timeout. (HDDS-10761)
- Added configuration (ozone.om.rocksdb.max_open_files) to set RocksDB max_open_files option for OM DB. (HDDS-11191)
- Standardized Datanode command metrics tracking across ReplicationSupervisor and direct command handlers. (HDDS-11444)
- Optimized Recon List Keys API by reusing calculated path prefix for consecutive keys with the same parent ID. (HDDS-11668)
- Optimized Recon List Keys API response generation by reducing object creation (avoiding OmKeyInfo) and memory buffering. (HDDS-11660)
- Optimized Recon List Keys API filtering logic by replacing predicate lambdas with simple IF statements for performance. (HDDS-11649)
- Added foundational schema upgrade action (InitialConstraintUpgradeAction) for Recon to handle constraints on existing tables (e.g., Unhealthy Containers) upon first upgrade to schema versioning. (HDDS-11615)
- Added Ozone wrapper configurations (ozone.scm.ipc.server.read.threadpool.size, ozone.hdds.datanode.ipc.server.read.threadpool.size) to increase ipc.server.read.threadpool.size for SCM and Datanode RPC servers (default 10). (HDDS-11302)
- Refactored ContainerStateMap to restrict ContainerAttribute generic type T to Enum, removing unused ownerMap/repConfigMap. (HDDS-12532)
- Refactored ContainerStateManager interface to remove redundant ContainerID parameters when ContainerReplica (which contains the ID) is already passed. (HDDS-12572)
- Refactored DB/Table classes to use the DB name as the thread name prefix implicitly, removing the explicit parameter. (HDDS-12590)
- Included ContainerInfo within ContainerAttribute to avoid extra map lookups in ContainerStateManager methods. (HDDS-12591)
- Enabled custom ValueCodec for TypedTable to allow performance optimizations like partial deserialization (e.g., OmKeyInfo without ACLs/locations). (HDDS-12582)
- Made ozone admin scm safemode --verbose show rule status even when SCM is not in safe mode. (HDDS-12548)
- Addressed thread safety issue in BlockOutputStream#failedServers by using a concurrent collection. (HDDS-12331)
- Added DatanodeID validation for incoming ContainerCommandRequests and on Ratis group joins to prevent operations on incorrect nodes. (HDDS-11667)
- Persisted the list of container IDs created on a Datanode to prevent recreation after volume failures, ensuring consistency for both Ratis and EC containers. (HDDS-11650)
- Added check for rocks_tools native library in ozone checknative CLI command output. (HDDS-11347)
- Added Ozone cluster growth rate metric (based on scm_node_manager_total_used rate) to Grafana dashboard using PromQL. (HDDS-12168)
- Added robust error handling for Recon OM background tasks (e.g., NSSummary) to prevent data inconsistencies if Recon crashes during partial event processing. (HDDS-12062)
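The HDDS-9680 entry above says multipart upload part ETags are now derived from an MD5 hash of the part content. As a rough illustration of why that gives consistent ETags, the sketch below computes a hex MD5 digest over the bytes of a part. The function name `part_etag` is hypothetical, and the exact encoding Ozone uses for the ETag value is not stated in these notes.

```python
import hashlib


def part_etag(content: bytes) -> str:
    """Return the hex MD5 digest of a part's content, used here as its ETag."""
    return hashlib.md5(content).hexdigest()


if __name__ == "__main__":
    # Identical content always yields the same ETag, regardless of upload order or time.
    print(part_etag(b"abc"))
```

Because the value depends only on the bytes uploaded, a client can recompute it locally and compare it with the ETag returned by the gateway.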
Deprecated
- LegacyReplicationManager (hdds.scm.replication.enable.legacy=true) is removed and no longer supported. (HDDS-11759)
- FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. New containers cannot be created with this layout. Support will be removed in a future release. (HDDS-11753)
Removed
- Removed LegacyReplicationManager implementation and the hdds.scm.replication.enable.legacy config property. (HDDS-11759)
- Removed unused resultCache and getMatchingContainerIDs method from ContainerStateMap. (HDDS-12445)
Fixed
- Fixed admin-only API handling for TriggerDBSyncEndpoint in Recon. (HDDS-11436)
- Fixed potential NullPointerException in OzoneManagerProtocolClientSideTranslatorPB.listStatusLight when startKey is null (e.g., via s3a). (HDDS-10367)
- Addressed memory leak caused by ThreadLocal usage in OMClientRequest (OMLockDetails). (HDDS-10385)
- Fixed Container Balancer incorrectly selecting containers with 0 or negative size for moving. (HDDS-10483)
- Fixed inability to write files when Datanode chunk data validation (hdds.datanode.chunk.data.validation.check) is enabled due to buffer position issue. (HDDS-10547)
- Fixed Recon startup failure (“used space cannot be negative”) by handling Datanode reports with negative used space gracefully. (HDDS-10614)
- Fixed IOException: ParentKeyInfo ... is null in Recon Namespace Summary task by handling cases where parent info might be missing. (HDDS-10855)
- Fixed EC Reconstruction failure (IllegalArgumentException: The chunk list has X entries, but the checksum chunks has Y entries) potentially caused by out-of-order EC stripe writes leading to inaccurate chunk lists. (HDDS-10985)
- Fixed OM crash (SnapshotChainManager: Failure while loading snapshot chain) caused by SstFilteringService directly updating snapshot info DB entries, potentially corrupting the chain if OM restarts before DoubleBuffer flush. (HDDS-11068)
- Resolved ClassCastException (RepeatedOmKeyInfo to OmKeyInfo) in Recon’s FileSizeCountTask due to improper event handling in OMDBUpdatesHandler for conflicting keys across tables (e.g., file and directory with the same name). (HDDS-11187)
- Fixed ContainerSizeCountTask in Recon logging ERROR for negative-sized containers; reduced log level as these are ignored functionally. (HDDS-12227)
- Fixed duplicate key violation in Recon’s FileSizeCountTask by correctly handling the isDbTruncated flag to allow updates instead of only inserts. (HDDS-12228)
- Made OzoneClientException extend IOException. (HDDS-64)
- Fixed various S3 gateway issues including multipart upload and other improvements. (HDDS-1186)
- Fixed SCM Decommissioning issue causing InvalidStateTransitionException after recommissioning the same SCM node. (HDDS-9608)
- Fixed Recon Disk Usage page UI issues with large numbers of keys/buckets/volumes (pie chart usability, axis ticks, path overflow). (HDDS-9626)
- Fixed Ozone admin namespace CLI du command printing incorrect validation error messages for root ("/") or volume paths. (HDDS-9644)
- Fixed Recon incorrectly including out-of-service (decommissioned, maintenance) nodes when checking container health status (over/under/mis-replication). (HDDS-9645)
- Fixed potential NullPointerException in ContainerStateMap.ContainerAttribute due to race condition between update and get operations. (HDDS-9527)
- Fixed Recon potentially showing duplicate DEAD datanodes after decommission/reformat/recommission cycles. (HDDS-10409) Recon now only allows removing DEAD nodes. (HDDS-11032)
- Fixed potential memory overflow in Recon’s Container Health Task due to unbounded list growth. (HDDS-9819)
- Reduced Ozone client heap memory utilization during writes by using pooled direct buffers for chunks. (HDDS-9843)
- Fixed Pipeline.nodesInOrder using ThreadLocal, making it inaccessible to other threads after being set. (HDDS-9848)
- Switched KeyValueContainerCheck.verifyChecksum to use direct/mapped buffers instead of heap buffers. (HDDS-9941)
- Fixed TokenRenewer implementations (O3FS, OFS) not closing the created OzoneClient. Removed duplicate implementation. (HDDS-9943)
- Fixed NSSummaryAdmin CLI commands not closing created OzoneClient instances and creating multiple instances unnecessarily. (HDDS-9944)
- Fixed incorrect synchronization in RatisSnapshotInfo, potentially leading to inconsistent term/index values. Class removed as redundant to TransactionInfo. (HDDS-9984)
- Fixed Options and ReadOptions instances not being closed properly in rocksdb-checkpoint-differ. (HDDS-10001)
- Renamed ManagedSstFileReader in rocksdb-checkpoint-differ to SstFileSetReader to avoid name collision with the class in hdds-managed-rocksdb. (HDDS-10007)
- Fixed potential NullPointerException in VolumeInfoMetrics.getCommitted() if HddsVolume.committedBytes is null. (HDDS-10027)
- Refined SCM RPC handler counts to be configurable per protocol (Client, Block, Datanode) instead of a single global count. (HDDS-10088)
- Removed static dbNameToCfHandleMap from RocksDatabase, using non-static columnFamilies map instead. (HDDS-10107)
- Fixed potential NullPointerException in OMDBCheckpointServlet lock acquisition when SstFilteringService is accessed before initialization. (HDDS-10138)
- Enabled Zero-Copy reads during container replication for improved performance. (HDDS-10144)
- Corrected the misspelled metric names createOmResoonseLatencyNs and validateAndUpdateCacneLatencyNs in OMPerformanceMetrics. (HDDS-10162)
- Fixed OmMetadataManagerImpl creating a new S3Batcher instance for each S3 secret operation instead of reusing one. (HDDS-10202)
- Ensured atomic updates in StateContext#updateCommandStatus using computeIfPresent to prevent race conditions. (HDDS-10210)
- Fixed Grafana dashboards: removed UID/hostnames, included secure/unsecure ports, corrected datastore count. (HDDS-10229)
- Prevented V3 Schema DatanodeStore from creating container DBs in incorrect locations under certain initialization paths. (HDDS-10230)
- Fixed ContainerStateManager finalizing OPEN containers without a healthy pipeline on follower SCMs; moved logic to leader-only path via Ratis. (HDDS-10231)
- Improved JSON response for Deleted Directories and Open Keys Insight Endpoints in Recon for better clarity (using actual names instead of Object IDs). (HDDS-10241)
- Fixed ContainerReport admin command showing incorrect values immediately after SCM restart before Replication Manager runs. (HDDS-10272)
- Fixed pagination on the OM DB Insights page in Recon. (HDDS-10282)
- Added support for direct ByteBuffers in Checksum calculations, using reflection for Java 9+ API while maintaining Java 8 compatibility. (HDDS-10288)
- Fixed ECReconstructionCoordinator ignoring ozone-site.xml client configurations and using default OzoneClientConfig. (HDDS-10294)
- Fixed potential orphan blocks during key overwrite operations, especially involving the deleted key table. (HDDS-10296)
- Fixed KeyManagerImpl#listKeys path normalization to correctly handle OBS/LEGACY buckets when ozone.om.enable.filesystem.paths is true. (HDDS-10319)
- Fixed metadata not being updated when overwriting existing keys via S3 PutObject. (HDDS-10324)
- Fixed SetTimes API not working with linked buckets due to missing link resolution. (HDDS-10369)
- Fixed Recon not handling pre-existing MISSING_EMPTY containers correctly (introduced in HDDS-9695), leaving them marked as missing indefinitely. (HDDS-10370)
- Fixed S3 listParts incompatibility for keys created before HDDS-9680 (missing ETag metadata) and NPE when ETag is null. (HDDS-10395)
- Restricted directory deletion in LEGACY buckets via ozone sh key delete; users must use ozone fs interface. (HDDS-10397)
- Fixed ArrayIndexOutOfBoundsException when listing keys in OBS buckets via S3/s3a under certain conditions. (HDDS-10399)
- Fixed ozone admin CLI having hard-coded INFO log level, ignoring environment/config settings. (HDDS-10405)
- Fixed Datanode startup failure (“Illegal configuration: raft.grpc.message.size.max must be 1m larger than …”) when using latest Ratis due to default config mismatch. (HDDS-11375)
- Fixed Datanode startup failure (“checksum size setting 1024 is not in expected format”) due to incorrect type validation for hdds.ratis.raft.server.snapshot.creation.gap. (HDDS-10423)
- Fixed Grafana dashboard Prometheus endpoint configuration for Datanodes and added missing Recon endpoint. (HDDS-10433)
- Fixed Datanode Maintenance failing early incorrectly (logic refined). (HDDS-10463)
- Fixed OM potentially crashing or failing requests if the configured S3 secret storage (Vault) is unavailable. (HDDS-10469)
- Fixed audit log for key creation missing EC replication config details (parity, chunk size, codec). (HDDS-10472)
- Fixed potential NullPointerException in OmUtils.getAllOMHAAddresses if OM HA config keys are missing. (HDDS-10508)
- Fixed S3 GetObject ETag header returning "null" for objects without an ETag, causing issues with AWS SDK validation. Now omits the header if ETag is missing. (HDDS-10521)
- Fixed MessageDigest instance in S3 endpoint potentially not being reset after exceptions (e.g., client cancellation), leading to incorrect ETags on subsequent requests using the same thread. (HDDS-10587)
- Fixed issue where client might attempt Ratis streaming for keys defaulted to EC replication if bucket replication isn’t explicitly set. (HDDS-10832)
- Fixed freon read/mixed operations failing with “Key not found” if prefix is unspecified; stopped adding random prefix. Fixed misleading random prefix log in ommg. (HDDS-10845)
- Fixed Ozone CLI not respecting default ozone.om.service.id when only one service ID is configured. (HDDS-10861)
- Fixed ClosePipelineCommandHandler potentially causing GroupMismatchException by calling removeGroup before getting peer list for propagation. (HDDS-10875)
- Fixed Recon ReconContainerManager potentially throwing DuplicatedPipelineIdException when checking/adding containers due to race conditions or stale data. (HDDS-10880)
- Improved logging clarity in Recon’s ReconNodeManager regarding datanode finalization status checks during upgrades. (HDDS-10883)
- Fixed OM startup failure in single-node Docker container due to Ratis group directory mismatch when using default service ID. (HDDS-10909)
- Fixed Recon startup failing silently or logging incorrect errors in non-HA SCM scenarios due to inability to fetch SCM roles or snapshot. (HDDS-10937)
- Fixed OM decommission config (ozone.om.decommissioned.nodes) not working without service ID suffix when only one OM service ID is configured. (HDDS-10942)
- Fixed EC key read corruption potentially occurring if a container’s replica index on a DN mismatches the index expected by the client (e.g., after container move). Added validation. (HDDS-10983)
- Fixed S3 gateway potentially throwing exceptions (javax.xml.xpath.XPathExpressionException) during concurrent XML parsing (e.g., CompleteMultipartUpload, DeleteObjects). (HDDS-10777)
- Fixed NullPointerException in XceiverClientRatis.watchForCommit when updateCommitInfosMap encounters a new Datanode ID in the response after a previous timeout removed it from commitInfoMap. (HDDS-10780)
- Fixed potential OMLeaderNotReadyException after leader switch if transactions were pending in the double buffer, preventing lastNotifiedTermIndex update. (HDDS-10798)
- Fixed various HTTP server components (Recon, SCM, OM, DN) failing to start if configured with a wildcard Kerberos principal (*) due to missing kerb-core dependency. (HDDS-10803)
- Fixed S3 setBucketAcl causing UnsupportedOperationException due to attempting to modify an immutable list returned by OzoneVolume.getAcls(). (HDDS-11737)
- Fixed SCM leadership metric (SCMLeader) potentially being reset to null by HTTP server initialization after the Raft server has already determined leadership. (HDDS-11742)
- Fixed SnapshotDiffManager logging NativeLibraryNotLoadedException as ERROR even when native tools are optional; changed to WARN. (HDDS-11486)
- Fixed potential NullPointerException when checking container balancer status (ozone admin containerbalancer status) if balancer is started but not fully initialized (e.g., waiting for DU info). (HDDS-11350)
- Fixed ozone fs -rm -r prompt for volume deletion suggesting incorrect ozone sh volume delete options (-skipTrash, -id). (HDDS-11346)
- Fixed ozone sh key list -h showing duplicate options (--all, --length) caused by a picocli version issue; the picocli upgrade was reverted. (HDDS-11446)
- Fixed S3 CompleteMultipartUpload returning 500 Internal Server Error instead of S3-compliant InvalidRequest error when no parts are specified in the request body. (HDDS-11457)
- Fixed multiple IOzoneAuthorizer instances potentially being created and leaked if Ratis snapshot installation fails repeatedly after stopping the metadata manager. (HDDS-11472)
- Fixed ozone sh volume delete command line parsing error for -r option; resolved as part of the HDDS-11346 fix. (HDDS-11535)
- Fixed NullPointerException in OM when overwriting an empty file using multipart upload in FSO buckets (versioning disabled). (HDDS-12131)
- Fixed the description of the Replication Manager interval configuration (hdds.scm.replication.thread.interval) to correctly state milliseconds instead of seconds; resolved by removing unsupported types. (HDDS-12144)
- Fixed Grafana dashboard for Chunk read/write rates using incorrect interval variable ($__interval instead of $__rate_interval). (HDDS-12112)
- Fixed Replication Manager potentially expiring pending container deletes incorrectly instead of retrying them if the Datanode doesn’t confirm deletion within the deadline. (HDDS-12127)
- Fixed Replication Manager non-deterministically selecting replicas for deletion if preferred target nodes are overloaded, potentially deleting required replicas. (HDDS-12115)
- Fixed delete container commands potentially running indefinitely or past their deadline due to long lock waits or slow disk I/O; added lock timeout and moved ICR earlier. (HDDS-12114)
- Fixed Recon UI potentially switching from old UI to new UI automatically upon page refresh. (HDDS-12084)
- Fixed missing local refresh button in new Recon UI’s Disk Usage page to reload data for the current path without navigating back to root. (HDDS-12085)
- Fixed unnecessary parameters “Source Volume” & “Source Bucket” appearing in the metadata table for non-link buckets in the new Recon UI Disk Usage page. (HDDS-12073)
- Fixed Recon API endpoints /api/v1/volumes and /api/v1/buckets missing from Swagger documentation. (HDDS-11300)
- Fixed potential NullPointerException in Recon /api/v1/volumes and /api/v1/buckets endpoints if accessed before Recon tables are fully initialized after startup. (HDDS-11349)
- Fixed ozone freon cr (closed container replication) command failing with NPE due to metrics map lookup failure in ReplicationSupervisor. (HDDS-12040)
- Fixed incorrect display of Ozone Service ID name in Recon UI (New UI showed “OM ID”, Old UI showed “OM Service”). Corrected to “Ozone Service ID”. (HDDS-12049)
- Fixed difference in Cluster Capacity % calculation (floor vs round) and Container Pre-Allocated Size display (committed vs 0) between new and old Recon UI. (HDDS-12042)
- Fixed long path names wrapping to the next line in the new Recon UI Disk Usage page; made it scrollable instead. (HDDS-11957)
- Fixed Recon failing to update version_number in RECON_SCHEMA_VERSION table (always -1), causing upgrade actions to run unnecessarily on fresh installs. (HDDS-11846)
- Fixed serialization error (Conflicting/ambiguous property name) in Recon’s listKeys API due to Jackson ambiguity between key field and isKey() getter. Renamed getter. (HDDS-11848)
- Fixed potential deadlock in OM between DoubleBuffer flush thread (waiting for DeletedTable lock during snapshot checkpoint) and KeyDeletingService (holding DeletedTable lock, waiting for Ratis future). (HDDS-11124)
- Fixed OM crashing with IOException: Rocks Database is closed during SnapshotMoveDeletedKeys request processing if the snapshot was purged concurrently. (HDDS-11152)
- Fixed containers potentially stuck in DELETING state after upgrade if they were affected by HDDS-8129 (incorrect block counts) and datanodes rejected delete commands due to negative counts. Added recovery logic. (HDDS-11136)
- Fixed fs -mkdir incorrectly creating directories in OBS buckets (bypassing layout validation added in HDDS-11235). Reverted the optimization for mkdir. (HDDS-11348)
- Fixed Datanode potentially failing heartbeats or other operations due to deadlock on StateContext#pipelineActions map under high load. Replaced with concurrent map. (HDDS-11331)
- Fixed S3 gateway returning 403 Forbidden instead of 302 Redirect for root path (/) requests containing Authorization: Negotiate header (used by newer curl versions). (HDDS-11096)
- Fixed DELETE_TENANT request logging an unnecessary and uninformative UPDATE_VOLUME audit entry, even on failure. (HDDS-11119)
- Fixed intermittent timeout in TestBlockDeletion.testBlockDeletion potentially caused by race conditions or slow command processing. (HDDS-9962)
- Fixed “Bad file descriptor” error in TestOmSnapshotFsoWithNativeLib.testSnapshotCompactionDag when using native RocksDB tools library. (HDDS-10149)
- Fixed ManagedStatistics objects not being closed properly in OM and DN when RocksDB statistics are enabled, leading to resource leaks. (HDDS-10184)
- Fixed race condition in RocksDatabase where a close operation could occur between assertClose() check and database operation, causing JVM crash. (HDDS-9527)
- Fixed ReplicationManager metrics not being re-registered after RM restart via CLI (stop/start), causing metrics to stop reporting. (HDDS-9235)
- Fixed infinite loop in WritableRatisContainerProvider.getContainer if SCM restarts and existing pipeline nodes are not found (e.g., DNs stopped), causing log flooding. (HDDS-8982)
- Fixed SCM follower nodes logging NotLeaderException errors when processing Pipeline Reports, which is expected behavior for followers. Suppressed error logging. (HDDS-11695)
- Addressed potential causes of FileNotFoundException: ... (Too many open files) and subsequent DN crashes during OM+DN decommission under heavy Freon load, likely due to excessive open file handles. (HDDS-11391)
- Fixed Recon showing incorrect (zero) count for DELETED containers in cluster state summary API (/api/v1/clusterState). (HDDS-11389)
- Fixed issue where OM could fail if IOzoneAuthorizer (e.g., Ranger plugin) fails to initialize during snapshot installation and reload attempts create multiple instances, leading to heap exhaustion. (HDDS-11472)
- Fixed NoSuchUpload error when aborting multipart uploads for keys where the parent directory was missing (potentially due to FSO-related bugs or cleanup issues). (HDDS-11784)
- Fixed secure acceptance tests failing on arm64 due to keytab checksum mismatch when using keytabs generated on amd64. Regenerated keytabs for multi-arch compatibility. (HDDS-11810)
- Fixed race condition in datanode VERSION file creation where multiple threads could attempt to write using the same temporary file via AtomicFileOutputStream. (HDDS-12608)
- Fixed SCM logging an error when updating sequence ID for a CLOSED container based on a replica report with higher BCSID; changed log level and added context. (HDDS-12409)
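The HDDS-12608 entry above describes a race where several threads wrote through the same temporary file. A general way to avoid that class of bug is to give every writer its own uniquely named temporary file and rename it into place once it is complete. The sketch below shows that pattern in Python as an analogy only; it is not the Ozone fix, and `write_atomically` is a hypothetical helper.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write data to path via a temp file unique to this call, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates a uniquely named file, so concurrent writers never share it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Rename within the same directory; readers see the old or the new file, never a partial one.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern the last completed rename wins, and no writer can truncate or interleave with another writer's in-progress temporary file.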
Security
- Disabled REST endpoint for S3 secret manipulation by username for non-admin users via S3Gateway Secret REST endpoint. (HDDS-11040)
For more details, check out the Apache Ozone 2.0.0 JIRA list.
This is a generally available (GA) release.
It represents a point of API stability and quality that we consider production-ready.
Generated-by: Google AI Studio + Gemini 2.5 Pro Preview 03-25, with input data from a filtered JIRA list using this prompt.
Image credit: Indiana Dunes National Lakeshore, Michigan City, Indiana, USA by Diego Delso, CC-BY-SA 3.0 / Text added to original