Changelog History

  • v0.16.0-incubating

    September 10, 2019
  • v0.15.1

    August 01, 2019

    2019-07-30

  • v0.15.1-incubating

    August 01, 2019

    2019-07-30

  • v0.15.0

    June 27, 2019

    📚 Apache Druid 0.15.0-incubating contains over 250 new features, performance/stability/documentation improvements, and bug fixes from 39 contributors. Major new features and improvements include:

    • 🆕 New Data Loader UI
    • 👌 Support for transactional Kafka topics
    • 🆕 New Moving Average query
    • Time ordering for Scan query
    • 🆕 New Moments Sketch aggregator
    • SQL enhancements
    • Light lookup module for routers
    • Core ORC extension
    • Core GCP extension
    • Documentation improvements

    The full list of changes is here: https://github.com/apache/incubator-druid/pulls?q=is%3Apr+is%3Aclosed+milestone%3A0.15.0

    📚 Documentation for this release is at: http://druid.apache.org/docs/0.15.0-incubating/

    Highlights

    🆕 New Data Loader UI (Batch indexing part)

    ✅ Druid has a new Data Loader UI, which is integrated with the Druid Console. The new Data Loader UI shows sampled data so that you can easily verify the ingestion spec, and it generates the final ingestion spec automatically. Users can now issue batch index tasks easily instead of writing a JSON spec by hand.

    ➕ Added by @vogievetsky and @dclim in #7572 and #7531, respectively.

    👌 Support Kafka Transactional Topics

    👍 The Kafka indexing service now supports Kafka Transactional Topics.

    👍 Please note that only Kafka 0.11.0 or later is supported after this change.

    ➕ Added by @surekhasaharan in #6496.

    🆕 New Moving Average Query

    A new query type was introduced to compute moving averages.

    👀 Please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-contrib/moving-average-query.html for more details.

    ➕ Added by @yurmix in #6430.

    Time Ordering for Scan Query

    👀 The Scan query type now supports time ordering. Please see http://druid.apache.org/docs/0.15.0-incubating/querying/scan-query.html#time-ordering for more details.

    ➕ Added by @justinborromeo in #7133.
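
As a rough illustration of the time-ordering option, the sketch below builds a native Scan query payload in Python. The datasource name, interval, and column names are made up for the example, and the helper function is hypothetical. The "order" field and its values follow the scan-query documentation linked above, so check that page for the exact requirements and limits in your version.

```python
import json

def build_time_ordered_scan(datasource, interval, columns, limit=100, order="ascending"):
    """Build a native Scan query dict that asks for time-ordered rows."""
    if order not in ("ascending", "descending", "none"):
        raise ValueError(f"unsupported order: {order}")
    columns = list(columns)
    # Assumption: time ordering expects __time among the returned columns,
    # so the helper adds it when the caller left it out.
    if order != "none" and "__time" not in columns:
        columns = ["__time", *columns]
    return {
        "queryType": "scan",
        "dataSource": datasource,   # hypothetical datasource name
        "intervals": [interval],
        "columns": columns,
        "limit": limit,
        "order": order,
    }

query = build_time_ordered_scan("wikipedia", "2019-06-01/2019-06-02", ["page", "user"])
print(json.dumps(query, indent=2))
```

The resulting JSON could then be posted to a Broker as a native query; the sketch stops short of the HTTP call so that it stays self-contained.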

    🆕 New Moments Sketch Aggregator

    👀 The Moments Sketch is a new sketch type for approximate quantile computation. Please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-contrib/momentsketch-quantiles.html for more details.

    ➕ Added by @edgan8 in #6581.

    SQL enhancements

    ✨ The Druid community has been striving to enhance SQL support, and it is no longer experimental.

    🆕 New SQL functions

    Autocomplete in Druid Console

    👍 Druid Console now supports autocomplete for SQL.

    ➕ Added by @shuqi7 in #7244.

    👍 Time-ordered scan support for SQL

    👍 Druid SQL now supports time-ordered scan queries.

    ➕ Added by @justinborromeo in #7373.

    🌐 Lookups view added to the web console

    🔧 You can now configure your lookups from the web console directly.

    ➕ Added by @shuqi7 in #7259.

    🌐 Misc web console improvements

    "NoSQL" mode : #7493 [@shuqi7]

    🌐 The web console now has a backup mode that allows it to function as best as it can if DruidSQL is disabled or unavailable.

    ➕ Added compaction configuration dialog : #7242 [@shuqi7]

    🔧 You can now configure the auto compaction settings for a data source from the Datasource view.

    Auto wrap query with limit : #7449 [@vogievetsky]

    The console query view will now (by default) wrap DruidSQL queries with a SELECT * FROM (...) LIMIT 1000 allowing you to enter queries like SELECT * FROM your_table without worrying about the impact to the cluster. You can still send 'raw' queries by selecting the option from the ... menu.
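
The wrapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the console's actual implementation; the function name and the handling of trailing semicolons are assumptions made for the example.

```python
def wrap_with_limit(sql, limit=1000):
    """Wrap a SQL query in an outer SELECT with a LIMIT."""
    # Drop surrounding whitespace and a trailing semicolon so the query
    # can be nested as a subquery.
    inner = sql.strip().rstrip(";").strip()
    return f"SELECT * FROM ({inner}) LIMIT {limit}"

wrapped = wrap_with_limit("SELECT * FROM your_table")
print(wrapped)
```

Sending the query "raw" corresponds to skipping this wrapper and submitting the original text unchanged.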

    SQL explain query : #7402 [@shuqi7]

    You can now click on the ... menu in the query view to get an explanation of the DruidSQL query.

    Surface is_overshadowed as a column in the segments table #7555 , #7425 [@shuqi7][@surekhasaharan]

    👀 The is_overshadowed column indicates whether a segment is overshadowed by any published segments. It can be useful for seeing which segments should be loaded by historicals. Please see http://druid.apache.org/docs/0.15.0-incubating/querying/sql.html for more details.

    👌 Improved status UI for actions on tasks, supervisors, and datasources : #7528 [@shuqi7]

    👀 This PR condenses the actions list into a tidy menu and lets you see the detailed status for supervisors and tasks. New actions for datasources around loading and dropping data by interval have also been added.

    Light Lookup Module for Routers

    👀 A light lookup module was introduced for Routers, so they now need only a minimal amount of memory. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/basic-cluster-tuning.html#router for basic memory tuning.

    ➕ Added by @clintropolis in #7222.

    Core ORC extension

    ⚡️ The ORC extension has been promoted to a core extension. Please read the 'Updating from 0.14.0-incubating and earlier' section below if you are using the ORC extension in an earlier version of Druid.

    ➕ Added by @clintropolis in #7138.

    Core GCP extension

    ⚡️ The GCP extension has been promoted to a core extension. Please read the 'Updating from 0.14.0-incubating and earlier' section below if you are using the GCP extension in an earlier version of Druid.

    ➕ Added by @drcrallen in #6953.

    Documentation Improvements

    🚀 Single-machine deployment example configurations and scripts

    👀 Several configurations and scripts were added for easy single machine setup. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/single-server.html for details.

    ➕ Added by @jon-wei in #7590.

    📇 Tool for migrating from local deep storage/Derby metadata

    👀 A new tool was added for easy migration from single machine to a cluster environment. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/deep-storage-migration.html for details.

    ➕ Added by @jon-wei in #7598.

    Basic tuning guide

    👀 A basic tuning guide was added to the documentation. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/basic-cluster-tuning.html for details.

    ➕ Added by @jon-wei in #7629.

    🔒 Security Improvement

    🔒 The Druid system table now requires only mandatory permissions instead of the read permission for the whole sys database. Please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-core/druid-basic-security.html for details.

    ➕ Added by @jon-wei in #7579.

    🚚 Deprecated/removed

    ⬇️ Drop support for automatic segment merge

    🔀 The automatic segment merge by the coordinator is not supported anymore. Please use auto compaction instead.

    ➕ Added by @jihoonson in #6883.

    ⬇️ Drop support for insert-segment-to-db tool

    📇 In Druid 0.14.x and earlier, Druid stored segment metadata (the descriptor.json file) in deep storage in addition to the metadata store. This behavior has changed in 0.15.0: the segment metadata file is no longer stored in deep storage. As a result, the insert-segment-to-db tool is no longer supported, since it relies on descriptor.json files in deep storage. Please see http://druid.apache.org/docs/0.15.0-incubating/operations/insert-segment-db.html for details.

    Please note that in 0.14.x and earlier versions, the kill task will fail if you're using HDFS as deep storage and the descriptor.json file is missing.

    ➕ Added by @jihoonson in #6911.

    ✂ Removed "useFallback" configuration for SQL

    🚚 This option was removed since it generates unscalable query plans and doesn't work with some SQL functions.

    ➕ Added by @gianm in #7567.

    ✂ Removed a public API in CompressionUtils for extension developers

    👕 public static void gunzip(File pulledFile, File outDir) was removed in #6908 by @clintropolis.

    Other behavior changes

    Coordinator await initialization before finishing startup

    🔧 A new configuration (druid.coordinator.segment.awaitInitializationOnStart) was added to make the Coordinator wait for segment view initialization before it finishes starting up. This option is enabled by default.

    ➕ Added by @QiuMM in #6847.

    Coordinator API behavior change

    📇 The coordinator periodically polls segment metadata from the metadata store and caches it in memory. In Druid 0.14.x and earlier, removing segments via the coordinator APIs (/druid/coordinator/v1/datasources/{dataSourceName} and /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}) immediately updated the in-memory segment cache as well as the metadata store. This behavior has changed in 0.15.0: the cache is now updated on each poll rather than immediately on removal. Until the cache is updated in the next poll, the APIs below can still return segments that were removed via the above API calls.

    • 📇 /druid/coordinator/v1/metadata/datasources/{dataSourceName}
    • 📇 /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments
    • 📇 /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments/{segmentId}
    • 📇 /druid/coordinator/v1/metadata/datasources
    • /druid/coordinator/v1/loadstatus

    ⚡️ Until the cache is updated in the next poll, the metrics below can also include segments that were removed via the above API calls.

    • segment/unavailable/count
    • segment/underReplicated/count

    This behavior was changed in #7595 by @surekhasaharan.
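
The effect of the per-poll cache update can be pictured with a toy model. The class below is purely illustrative and is not Druid code: the names are hypothetical, a Python set stands in for the metadata store, and poll() stands in for the coordinator's periodic refresh.

```python
class PollingSegmentCache:
    """Toy model: an in-memory segment view refreshed only by poll()."""

    def __init__(self, metadata_store):
        self._store = metadata_store          # stands in for the metadata store
        self._cached = set(metadata_store)    # snapshot taken at the last poll

    def remove_segment(self, segment_id):
        # 0.15.0-style removal: only the metadata store changes here;
        # the cached view is left alone until the next poll.
        self._store.discard(segment_id)

    def poll(self):
        # Periodic refresh: the cache catches up with the metadata store.
        self._cached = set(self._store)

    def list_segments(self):
        # What a read API backed by the cache would report.
        return sorted(self._cached)

store = {"seg_a", "seg_b"}
cache = PollingSegmentCache(store)
cache.remove_segment("seg_a")
before_poll = cache.list_segments()   # still includes the removed segment
cache.poll()
after_poll = cache.list_segments()    # removal is now visible
print(before_poll, after_poll)
```

Clients that remove segments and then immediately read the listed endpoints may therefore want to tolerate stale results for up to one poll period.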

    Listing Lookup API change

    🔧 The /druid/coordinator/v1/lookups/config API now returns a list of tiers currently active in the cluster in addition to ones known in the dynamic configuration.

    ➕ Added by @clintropolis in #7647.

    ZooKeeper loss

    🔧 With a new configuration (druid.zk.service.terminateDruidProcessOnConnectFail), Druid processes can terminate themselves when they lose their connection to ZooKeeper.

    ➕ Added by @michael-trelinski in #6740.

    ⚡️ Updating from 0.14.0-incubating and earlier

    Minimum compatible Kafka version change for Kafka Indexing Service

    ⚡️ Only Kafka 0.11.x or later is supported after #6496. Please consider updating your Kafka version if you're using an older one.

    ORC extension changes

    🚀 The ORC extension has been promoted to a core extension. When deploying 0.15.0-incubating, please ensure that your extensions directory does not have any older versions of the druid-orc-extensions extension.

    Additionally, even though the new core extension can index any data the old contrib extension could, the JSON spec for the ingestion task is incompatible and will need to be modified to work with the newer core extension.

    To migrate to 0.15.0-incubating:

    • In inputSpec of ioConfig, inputFormat must be changed from "org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat" to "org.apache.orc.mapreduce.OrcInputFormat".
    • 👍 The contrib extension supported a typeString property, which provided the schema of the ORC file. It was essentially required to get the types correct, but notably not the column names, which facilitated column renaming. In the core extension, column renaming can be achieved with flattenSpec expressions.
    • 📄 The contrib extension supported a mapFieldNameFormat property, which provided a way to specify a dimension to flatten OrcMap columns with primitive types. This functionality has also been replaced with flattenSpec expressions.

    👀 For more details and examples, please see http://druid.apache.org/docs/0.15.0-incubating/development/extensions-core/orc.html.
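
The first migration step above is mechanical, so it can be scripted. The following Python sketch rewrites the inputFormat class in an ioConfig dictionary; the helper name and the sample ioConfig (its "type", "paths", and other fields) are assumptions made for illustration. It does not touch typeString or mapFieldNameFormat, which need a hand-written flattenSpec as described in the linked ORC documentation.

```python
import copy

OLD_FORMAT = "org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat"
NEW_FORMAT = "org.apache.orc.mapreduce.OrcInputFormat"

def migrate_orc_io_config(io_config):
    """Return a copy of ioConfig with the ORC inputFormat class updated."""
    migrated = copy.deepcopy(io_config)       # leave the caller's spec untouched
    input_spec = migrated.get("inputSpec", {})
    if input_spec.get("inputFormat") == OLD_FORMAT:
        input_spec["inputFormat"] = NEW_FORMAT
    return migrated

# Hypothetical ioConfig fragment from a pre-0.15.0 ingestion spec.
old_io_config = {
    "type": "hadoop",
    "inputSpec": {
        "type": "static",
        "inputFormat": OLD_FORMAT,
        "paths": "/tmp/example.orc",
    },
}
new_io_config = migrate_orc_io_config(old_io_config)
print(new_io_config["inputSpec"]["inputFormat"])
```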

    GCP extension changes

    🚀 The GCP extension has been promoted to a core extension. When deploying 0.15.0-incubating, please ensure that your extensions directory does not have any older versions of the druid-google-extensions extension.

    ⬇️ Dropped auto segment merge

    🔧 The coordinator configuration for auto segment merge (druid.coordinator.merge.on) is not supported anymore. Please use auto compaction instead.

    ✂ Removed descriptor.json metadata file in deep storage

    📇 The segment metadata file (descriptor.json) is not stored in deep storage any more. If you are using HDFS as your deep storage and need to roll back to 0.14.x or earlier, then please consider that the kill task could fail because of the missing descriptor.json files.

    Credits

    🚀 Thanks to everyone who contributed to this release!

    @a2l007
    @asdf2014
    @capistrant
    @clintropolis
    @dampcake
    @dclim
    @donbowman
    @drcrallen
    @Dylan1312
    @edgan8
    @es1220
    @esevastyanov
    @FaxianZhao
    @fjy
    @gianm
    @glasser
    @hpandeycodeit
    @jihoonson
    @jon-wei
    @jorbay-au
    @justinborromeo
    @kamaci
    @KazuhitoT
    @leventov
    @lxqfy
    @michael-trelinski
    @peferron
    @puneetjaiswal
    @QiuMM
    @richardstartin
    @samarthjain
    @scrawfor
    @shuqi7
    @surekhasaharan
    @venkatramanp
    @vogievetsky
    @xueyumusic
    @xvrl
    @yurmix

  • v0.15.0-incubating

    June 27, 2019

  • v0.14.2

    May 27, 2019

    🚀 Apache Druid 0.14.2-incubating is a bug fix release that includes important fixes for the 'druid-datasketches' extension and the broker 'result' level caching.

    🐛 Bug Fixes

    • #7607 thetaSketch(with sketches-core-0.13.1) in groupBy always return value no more than 16384
    • 👻 #6483 Exception during sketch aggregations while using Result level cache
    • #7621 NPE when both populateResultLevelCache and grandTotal are set

    Credits

    🚀 Thanks to everyone who contributed to this release!

    @AlexanderSaydakov
    @clintropolis
    @jihoonson
    @jon-wei

    Apache Druid (incubating) is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

  • v0.14.2-incubating

    May 27, 2019

  • v0.14.1

    May 09, 2019

    📚 Apache Druid 0.14.1-incubating is a small patch release that includes a handful of bug and documentation fixes from 16 contributors.

    Important Notice

    This release fixes an issue in the druid-datasketches extension with quantile sketches, but introduces another one with theta sketches that was confirmed after the release was finalized. The new issue was caused by #7320 and is described in #7607. If you use theta sketches, we recommend not upgrading to this release. This will be fixed in the next release of Druid by #7619.

    🐛 Bug Fixes

    • ✅ use latest sketches-core-0.13.1 #7320
    • Adjust BufferAggregator.get() impls to return copies #7464
    • DoublesSketchComplexMetricSerde: Handle empty strings. #7429
    • 🖐 handle empty sketches #7526
    • ➕ Adds backwards-compatible serde for SeekableStreamStartSequenceNumbers. #7512
    • 👌 Support Kafka supervisor adopting running tasks between versions #7212
    • 🛠 Fix time-extraction topN with non-STRING outputType. #7257
    • 🛠 Fix two issues with Coordinator -> Overlord communication. #7412
    • ♻️ refactor druid-bloom-filter aggregators #7496
    • 🛠 Fix encoded taskId check in chatHandlerResource #7520
    • 🛠 Fix too many dentry cache slab objs (#7508). #7509
    • 🛠 Fix result-level cache for queries #7325
    • 🛠 Fix flattening Avro Maps with Utf8 keys #7258
    • Write null byte when indexing numeric dimensions with Hadoop #7020
    • 👷 Batch hadoop ingestion job doesn't work correctly with custom segments table #7492
    • 🛠 Fix aggregatorFactory meta merge exception #7504

    📚 Documentation Changes

    • 🛠 Fix broken link due to Typo. #7513
    • 📄 Some docs optimization #6890
    • ⚡️ Updated Javascript Affinity config docs #7441
    • 🛠 fix expressions docs operator table #7420
    • 🛠 Fix conflicting information in configuration doc #7299
    • ➕ Add missing doc link for operations/http-compression.html #7110

    ⚡️ Updating from 0.14.0-incubating and earlier

    Kafka Ingestion

    Updating from version 0.13.0-incubating or earlier directly to 0.14.1-incubating will not require downtime, unlike the migration path to 0.14.0-incubating, which required it because of the issue described in #6958. That issue has been fixed for this release in #7212. Likewise, rolling updates from version 0.13.0-incubating and earlier should also work properly due to #7512.

    Native Parallel Ingestion

    Thanks to the fixes in #7520, updating from 0.13.0-incubating directly to 0.14.1-incubating will not run into the issues with mixed versions of middle managers during a rolling update that could occur when updating to 0.14.0-incubating.

    Credits

    🚀 Thanks to everyone who contributed to this release!

    @AlexanderSaydakov
    @b-slim
    @benhopp
    @chrishardis
    @clintropolis
    @ferristseng
    @es1220
    @gianm
    @jihoonson
    @jon-wei
    @justinborromeo
    @kaka11chen
    @samarthjain
    @surekhasaharan
    @zhaojiandong
    @zhztheplayer

  • v0.14.1-incubating

    May 09, 2019

    📚 Apache Druid 0.14.1-incubating is a small patch release that includes a handful of bug and documentation fixes from 16 contributors.

    Important Notice

    This release fixes an issue with quantile sketches in the druid-datasketches extension, but introduces another one with theta sketches that was confirmed after the release was finalized, caused by #7320 and described in #7607. If you use theta sketches, we recommend not upgrading to this release. This will be fixed in the next release of Druid by #7619.

    🐛 Bug Fixes

    • ✅ use latest sketches-core-0.13.1 #7320
    • Adjust BufferAggregator.get() impls to return copies #7464
    • DoublesSketchComplexMetricSerde: Handle empty strings. #7429
    • 🖐 handle empty sketches #7526
    • ➕ Adds backwards-compatible serde for SeekableStreamStartSequenceNumbers. #7512
    • 👌 Support Kafka supervisor adopting running tasks between versions #7212
    • 🛠 Fix time-extraction topN with non-STRING outputType. #7257
    • 🛠 Fix two issues with Coordinator -> Overlord communication. #7412
    • ♻️ refactor druid-bloom-filter aggregators #7496
    • 🛠 Fix encoded taskId check in chatHandlerResource #7520
    • 🛠 Fix too many dentry cache slab objs #7508. #7509
    • 🛠 Fix result-level cache for queries #7325
    • 🛠 Fix flattening Avro Maps with Utf8 keys #7258
    • Write null byte when indexing numeric dimensions with Hadoop #7020
    • 👷 Batch hadoop ingestion job doesn't work correctly with custom segments table #7492
    • 🛠 Fix aggregatorFactory meta merge exception #7504

    📚 Documentation Changes

    • 🛠 Fix broken link due to Typo. #7513
    • 📄 Some docs optimization #6890
    • ⚡️ Updated Javascript Affinity config docs #7441
    • 🛠 fix expressions docs operator table #7420
    • 🛠 Fix conflicting information in configuration doc #7299
    • ➕ Add missing doc link for operations/http-compression.html #7110

    ⚡️ Updating from 0.14.0-incubating and earlier

    Kafka Ingestion

    Unlike the migration path to 0.14.0-incubating, updating from version 0.13.0-incubating or earlier directly to 0.14.1-incubating does not require downtime: the issue described in #6958 has been fixed for this release in #7212. Rolling updates from version 0.13.0-incubating and earlier should also work properly thanks to #7512.

    Native Parallel Ingestion

    Thanks to the fixes in #7520, updating from 0.13.0-incubating directly to 0.14.1-incubating should not run into the issues that a rolling update with mixed versions of middle managers could cause when updating to 0.14.0-incubating.

    Credits

    🚀 Thanks to everyone who contributed to this release!

    @AlexanderSaydakov
    @b-slim
    @benhopp
    @chrishardis
    👕 @clintropolis
    @ferristseng
    @es1220
    @gianm
    @jihoonson
    @jon-wei
    @justinborromeo
    @kaka11chen
    @samarthjain
    @surekhasaharan
    @zhaojiandong
    @zhztheplayer

    Apache Druid (incubating) is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

  • v0.14.0

    April 09, 2019

    📚 Apache Druid (incubating) 0.14.0-incubating contains over 200 new features, performance/stability/documentation improvements, and bug fixes from 54 contributors. Major new features and improvements include:

    • 🆕 New web console
    • Amazon Kinesis indexing service
    • Decommissioning mode for Historicals
    • Published segment cache in Broker
    • Bloom filter aggregator and expression
    • ⚡️ Updated Apache Parquet extension
    • 👮 Force push down option for nested GroupBy queries
    • 👍 Better segment handoff and drop rule handling
    • 👷 Automatically kill MapReduce jobs when Apache Hadoop ingestion tasks are killed
    • 👍 DogStatsD tag support for statsd emitter
    • 🆕 New API for retrieving all lookup specs
    • 🆕 New compaction options
    • More efficient cachingCost segment balancing strategy

    The full list of changes is here: https://github.com/apache/incubator-druid/pulls?q=is%3Apr+is%3Amerged+milestone%3A0.14.0

    📚 Documentation for this release is at: http://druid.io/docs/0.14.0-incubating/

    Highlights

    🆕 New web console

    new-druid-console

    🌐 Druid has a new web console that provides functionality that was previously split between the coordinator and overlord consoles.

    🔧 The new console allows the user to manage datasources, segments, tasks, data processes (Historicals and MiddleManagers), and coordinator dynamic configuration. The user can also run SQL and native Druid queries within the console.

    👀 For more details, please see http://druid.io/docs/0.14.0-incubating/operations/management-uis.html

    ➕ Added by @vogievetsky in #6923.

    Kinesis indexing service

    👍 Druid now supports ingestion from Kinesis streams, provided by the new druid-kinesis-indexing-service core extension.

    👀 Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/kinesis-ingestion.html for details.

    ➕ Added by @jsun98 in #6431.

    Decommissioning mode for Historicals

    🚚 Historical processes can now be put into a "decommissioning" mode, where the coordinator will no longer consider the Historical process as a target for segment replication. The coordinator will also move segments off the decommissioning Historical.

    👀 This is controlled via Coordinator dynamic configuration. For more details, please see http://druid.io/docs/0.14.0-incubating/configuration/index.html#dynamic-configuration.

    ➕ Added by @egor-ryashin in #6349.

    Published segment cache on Broker

    📇 The Druid Broker now has the ability to maintain a cache of published segments via polling the Coordinator, which can significantly improve response time for metadata queries on the sys.segments system table.

    📇 Please see http://druid.io/docs/0.14.0-incubating/querying/sql.html#retrieving-metadata for details.

    ➕ Added by @surekhasaharan in #6901

    Bloom filter aggregator and expression

    👍 A new aggregator for constructing Bloom filters at query time and support for performing Bloom filter checks within Druid expressions have been added to the druid-bloom-filter extension.

    👀 Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/bloom-filter.html

    ➕ Added by @clintropolis in #6904 and #6397

    ⚡️ Updated Parquet extension

    🚚 druid-parquet-extensions has been moved into the core extension set from the contrib extensions and now supports flattening and int96 values.

    👀 Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/parquet.html for details.

    ➕ Added by @clintropolis in #6360

    👮 Force push down option for nested GroupBy queries

    Outer query execution for nested GroupBy queries can now be pushed down to Historical processes; previously, the outer queries would always be executed on the Broker.

    👀 Please see #5471 for details.

    ➕ Added by @samarthjain in #5471.

    👍 Better segment handoff and retention rule handling

    Segment handoff will now ignore segments that would be dropped by a datasource's retention rules, avoiding ingestion failures caused by issue #5868.

    0️⃣ Period load rules will now include the future by default.

    👀 A new "Period Drop Before" rule has been added. Please see http://druid.io/docs/0.14.0-incubating/operations/rule-configuration.html#period-drop-before-rule for details.

    ➕ Added by @QiuMM in #6676, #6414, and #6415.

    👷 Automatically kill MapReduce jobs when Hadoop ingestion tasks are killed

    👷 Druid will now automatically terminate MapReduce jobs created by Hadoop batch ingestion tasks when the ingestion task is killed.

    ➕ Added by @ankit0811 in #6828.

    👍 DogStatsD tag support for statsd-emitter

    💅 The statsd-emitter extension now supports DogStatsD-style tags. Please see http://druid.io/docs/0.14.0-incubating/development/extensions-contrib/statsd.html

    ➕ Added by @deiwin in #6605, with support for constant tags added by @glasser in #6791.

    🆕 New API for retrieving all lookup specs

    👀 A new API for retrieving all lookup specs for all tiers has been added. Please see http://druid.io/docs/0.14.0-incubating/querying/lookups.html#get-all-lookups for details.

    ➕ Added by @jihoonson in #7025.

    🆕 New compaction options

    👀 Auto-compaction now supports the maxRowsPerSegment option. Please see http://druid.io/docs/0.14.0-incubating/design/coordinator.html#compacting-segments for details.

    👀 The compaction task now supports a new segmentGranularity option, deprecating the older keepSegmentGranularity option for controlling the segment granularity of compacted segments. Please see the segmentGranularity table in http://druid.io/docs/0.14.0-incubating/ingestion/compaction.html for more information on these properties.

    ➕ Added by @jihoonson in #6758 and #6780.

    More efficient cachingCost segment balancing strategy

    👷 The cachingCost Coordinator segment balancing strategy will now only consider Historical processes for balancing decisions. Previously the strategy would unnecessarily consider active worker tasks as well, which are not targets for segment replication.

    ➕ Added by @QiuMM in #6879.

    🆕 New metrics:

    • 🆕 New allocation rate metric jvm/heapAlloc/bytes, added by @egor-ryashin in #6710.
    • 🆕 New query count metric query/count, added by @QiuMM in #6473.
    • SQL query metrics sqlQuery/bytes and sqlQuery/time, added by @gaodayue in #6302.
    • Apache Kafka ingestion lag metrics ingest/kafka/maxLag and ingest/kafka/avgLag, added by @QiuMM in #6587
    • Task count metrics task/success/count, task/failed/count, task/running/count, task/pending/count, task/waiting/count, added by @QiuMM in #6657

    🆕 New interfaces for extension developers

    RequestLogEvent

    👀 It is now possible to control the fields in RequestLogEvent, emitted by EmittingRequestLogger. Please see #6477 for details. Added by @leventov.

    Custom TLS certificate checks

    👀 An extension point for custom TLS certificate checks has been added. Please see http://druid.io/docs/0.14.0-incubating/operations/tls-support.html#custom-tls-certificate-checks for details. Added by @jon-wei in #6432.

    Kafka Indexing Service no longer experimental

    🚚 The Kafka Indexing Service extension has been moved out of experimental status.

    SQL Enhancements

    ✨ Enhancements to dsql

    👍 The dsql command line client now supports CLI history, basic autocomplete, and specifying query timeouts in the query context.

    ➕ Added in #6929 by @gianm.

    ➕ Add SQL id, request logs, and metrics

    🔊 SQL queries now have an ID, and native queries executed as part of a SQL query will have the associated SQL query ID in the native query's request logs. SQL queries will now be logged in the request logs.

    Two new metrics, sqlQuery/time and sqlQuery/bytes, are now emitted for SQL queries.

    👀 Please see http://druid.io/docs/0.14.0-incubating/configuration/index.html#request-logging and http://druid.io/docs/0.14.0-incubating/querying/sql.html#sql-metrics for details.

    ➕ Added by @gaodayue in #6302

    👍 More SQL aggregator support

    👍 The following aggregators are now supported in SQL:

    • DataSketches HLL sketch
    • DataSketches Theta sketch
    • DataSketches quantiles sketch
    • 🛠 Fixed bins histogram
    • Bloom filter aggregator

    ➕ Added by @jon-wei in #6951 and @clintropolis in #6502

    Other SQL enhancements

    • 👍 SQL: Add support for queries with project-after-semijoin. #6756
    • 👍 SQL: Support for selecting multi-value dimensions. #6462
    • 👍 SQL: Support AVG on system tables. #601
    • SQL: Add "POSITION" function. #6596
    • SQL: Set INFORMATION_SCHEMA catalog name to "druid". #6595
    • SQL: Fix ordering of sort, sortProject in DruidSemiJoin. #6769

    ➕ Added by @gianm.

    ⚡️ Updating from 0.13.0-incubating and earlier

    ⬆️ Kafka ingestion downtime when upgrading

    ⬆️ Due to the issue described in #6958, existing Kafka indexing tasks can be terminated unnecessarily during a rolling upgrade of the Overlord. The terminated tasks will be restarted by the Overlord and will function correctly after the initial restart.

    Parquet extension changes

    🚀 The druid-parquet-extensions extension has been moved from contrib to core. When deploying 0.14.0-incubating, please ensure that your extensions-contrib directory does not have any older versions of the Parquet extension.

    ➕ Additionally, there are now two styles of Parquet parsers in the extension:

    • 📜 parquet-avro: Converts Parquet to Avro, and then parses the Avro representation. This was the existing parser prior to 0.14.0-incubating.
    • 📜 parquet: A new parser that parses the Parquet format directly. Only this new parser supports int96 values.

    ⚡️ Prior to 0.14.0-incubating, specifying a parquet type parser would have a task use the Avro-converting parser. In 0.14.0-incubating, to continue using the Avro-converting parser, you will need to update your ingestion specs to use parquet-avro instead.

    📜 The inputFormat field in the inputSpec for tasks using Parquet input must also match the choice of parser:

    • parquet: org.apache.druid.data.input.parquet.DruidParquetInputFormat
    • parquet-avro: org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat

    👀 Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/parquet.html for details.
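    The parser rename described above can be sketched as a small spec rewrite. This is only an illustration: the function name is hypothetical, and it assumes the parser lives at spec.dataSchema.parser inside the task JSON, which may not match every ingestion spec. It changes only the parser type; the inputFormat in the inputSpec still has to be set to match the chosen parser, as listed above.

```python
import copy


def keep_avro_parquet_parser(task_spec: dict) -> dict:
    """Return a copy of task_spec with parser type "parquet" renamed to "parquet-avro".

    Assumption: the parser sits at spec.dataSchema.parser. Specs without that
    path, or with any other parser type, are returned unchanged.
    """
    updated = copy.deepcopy(task_spec)
    parser = updated.get("spec", {}).get("dataSchema", {}).get("parser")
    if isinstance(parser, dict) and parser.get("type") == "parquet":
        # Before 0.14.0-incubating, "parquet" meant the Avro-converting parser.
        parser["type"] = "parquet-avro"
    return updated


if __name__ == "__main__":
    # Hypothetical, heavily trimmed pre-0.14.0 task spec.
    old_spec = {
        "type": "index_hadoop",
        "spec": {"dataSchema": {"dataSource": "example", "parser": {"type": "parquet"}}},
    }
    new_spec = keep_avro_parquet_parser(old_spec)
    print(new_spec["spec"]["dataSchema"]["parser"]["type"])
```

    The input spec is deep-copied, so the original dictionary is left untouched and can be diffed against the result before resubmitting a task.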

    ⚙ Running Druid with non-2.8.3 Hadoop

    If you plan to use Druid 0.14.0-incubating with Hadoop versions other than 2.8.3, you may need to do the following:

    • 📄 Set the Hadoop dependency coordinates to your target version as described in http://druid.io/docs/0.14.0-incubating/operations/other-hadoop.html under Tip #3: Use specific versions of Hadoop libraries.
    • 🏗 Rebuild Druid with your target version of Hadoop by changing hadoop.compile.version in the main Druid pom.xml and then following the standard build instructions.
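    For the first option above, a task spec can be pointed at a different Hadoop client version programmatically. The sketch below is an assumption-laden illustration: it assumes the task-level hadoopDependencyCoordinates field and the org.apache.hadoop:hadoop-client coordinate described in the linked documentation, and the function name is hypothetical.

```python
import copy


def with_hadoop_version(task_spec: dict, hadoop_version: str) -> dict:
    """Return a copy of task_spec that requests the given hadoop-client version.

    Assumption: the task accepts a top-level "hadoopDependencyCoordinates"
    list of Maven coordinates, as described in the other-hadoop documentation.
    """
    updated = copy.deepcopy(task_spec)
    updated["hadoopDependencyCoordinates"] = [
        f"org.apache.hadoop:hadoop-client:{hadoop_version}"
    ]
    return updated


if __name__ == "__main__":
    # Hypothetical task spec and target version, for illustration only.
    task = {"type": "index_hadoop", "spec": {}}
    print(with_hadoop_version(task, "2.7.3")["hadoopDependencyCoordinates"])
```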

    Other Behavior changes

    Old task cleanup

    📇 Old task entries in the metadata storage will now be cleaned up automatically together with their task logs. Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/configuration/index.html#task-logging and #6592 for details.

    Automatic processing buffer sizing

    0️⃣ The druid.processing.buffer.sizeBytes property has new default behavior if it is not set. Druid will now automatically choose a value for the processing buffer size using the following formula:

    processingBufferSize = totalDirectMemory / (numMergeBuffers + numProcessingThreads + 1)
    processingBufferSize = min(processingBufferSize, 1GB)
    

    Where:

    • totalDirectMemory: The direct memory limit for the JVM specified by -XX:MaxDirectMemorySize
    • numMergeBuffers: The value of druid.processing.numMergeBuffers.
    • numProcessingThreads: The value of druid.processing.numThreads.

    At most, Druid will use 1GB for the automatically chosen processing buffer size. The processing buffer size can still be specified manually.
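    As a quick way to estimate the value Druid would pick, the formula above can be written out as below. This is a sketch, not Druid's implementation: it assumes that "1GB" means 2^30 bytes and that the division rounds down, and the function name and sample numbers are hypothetical.

```python
GIB = 1024 ** 3  # assumption: "1GB" in the formula means 2**30 bytes


def auto_processing_buffer_size(total_direct_memory: int,
                                num_merge_buffers: int,
                                num_processing_threads: int) -> int:
    """Apply the two-step formula: divide direct memory, then cap at 1GB."""
    # Assumption: integer (floor) division, since the result is a byte count.
    per_buffer = total_direct_memory // (num_merge_buffers + num_processing_threads + 1)
    return min(per_buffer, GIB)


if __name__ == "__main__":
    # Hypothetical node: 8 GiB of direct memory, 2 merge buffers, 7 processing threads.
    print(auto_processing_buffer_size(8 * GIB, 2, 7))
    # Hypothetical larger node: the per-buffer share exceeds the cap.
    print(auto_processing_buffer_size(64 * GIB, 2, 7))
```

    With the smaller node the per-buffer share stays under the cap, while the larger node hits the 1GB limit, which is the case where setting druid.processing.buffer.sizeBytes manually may still be worthwhile.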

    👀 Please see #6588 for details.

    0️⃣ Retention rules now include the future by default

    👀 Please be aware that new retention rules will now include the future by default. Please see #6414 for details.

    Property changes

    Segment announcing

    ⚡️ The druid.announcer.type property used for choosing between Zookeeper or HTTP-based segment management/discovery has been moved to druid.serverview.type. If you were using http prior to 0.14.0-incubating, you will need to update your configs to use the new druid.serverview.type.

    👀 Please see the following for details:

    🛠 fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory

    0️⃣ The druid.peon.defaultSegmentWriteOutMediumFactory.@type property has been fixed. The property is now druid.peon.defaultSegmentWriteOutMediumFactory.type without the "@".

    👀 Please see #6656 for details.
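    The two property renames above can be applied to an existing properties file with a small script. The sketch below is illustrative only: it assumes simple key=value lines as in a runtime.properties file, and the function and constant names are hypothetical.

```python
# Old key -> new key, taken from the two property changes described above.
RENAMED_KEYS = {
    "druid.announcer.type": "druid.serverview.type",
    "druid.peon.defaultSegmentWriteOutMediumFactory.@type":
        "druid.peon.defaultSegmentWriteOutMediumFactory.type",
}


def rename_properties(text: str) -> str:
    """Rename old keys in key=value text; other lines pass through unchanged."""
    out = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        new_key = RENAMED_KEYS.get(key.strip())
        if sep and new_key is not None:
            out.append(f"{new_key}={value}")
        else:
            # Comments, blank lines, and unrelated properties are kept as-is.
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Hypothetical pre-0.14.0 configuration fragment.
    sample = "druid.announcer.type=http\n# a comment\ndruid.port=8082"
    print(rename_properties(sample))
```

    Values are copied verbatim, so the result should still be reviewed by hand before deploying it.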

    🗄 Deprecations

    Approximate Histogram aggregator

    🗄 The [ApproximateHistogram](http://druid.io/docs/0.14.0-incubating/development/extensions-core/approximate-histograms.html) aggregator has been deprecated; it is a distribution-dependent algorithm without formal error bounds and has significant accuracy issues.

    📄 The [DataSketches quantiles](http://druid.io/docs/0.14.0-incubating/development/extensions-core/datasketches-quantiles.html) aggregator should be used instead for quantile and histogram use cases.

    👀 Please see Histogram and Quantiles Aggregators for details.

    Cardinality/HyperUnique aggregator

    🐎 The Cardinality and HyperUnique aggregators have been deprecated in favor of the [DataSketches HLL](http://druid.io/docs/0.14.0-incubating/development/extensions-core/datasketches-hll.html) aggregator and Theta Sketch aggregator. These aggregators have better accuracy and performance characteristics.

    👀 Please see Count Distinct Aggregators for details.

    Query Chunk Period

    👀 The chunkPeriod query context configuration is now deprecated, along with the associated query/intervalChunk/time metric. Please see #6591 for details.

    keepSegmentGranularity for Compaction

    👀 The keepSegmentGranularity option for compaction tasks has been deprecated. Please see #6758 and the segmentGranularity table in http://druid.io/docs/0.14.0-incubating/ingestion/compaction.html for more information on these properties.

    Interface changes for extension developers

    SegmentId class

    👀 Druid now uses a SegmentId class instead of plain Strings to represent segment IDs. Please see #6370 for details.

    ➕ Added by @leventov.

    🚚 druid-api, druid-common, java-util moved to druid-core

    ⚡️ The druid-api, druid-common, and java-util modules have been moved into druid-core. Please update your dependencies accordingly if your project depended on these libraries.

    👀 Please see #6443 for details.

    Credits

    🚀 Thanks to everyone who contributed to this release!

    @a2l007
    @AlexanderSaydakov
    @anantmf
    @ankit0811
    @asdf2014
    @awelsh93
    @benhopp
    @Caroline1000
    👕 @clintropolis
    @dclim
    @deiwin
    @DiegoEliasCosta
    @drcrallen
    @dyf6372
    @Dylan1312
    @egor-ryashin
    @elloooooo
    @evans
    @FaxianZhao
    @gaodayue
    @gianm
    @glasser
    @Guadrado
    @hate13
    @hoesler
    @hpandeycodeit
    @janeklb
    @jihoonson
    @jon-wei
    @jorbay-au
    @jsun98
    @justinborromeo
    @kamaci
    @leventov
    @lxqfy
    @mirkojotic
    @navkumar
    @niketh
    @patelh
    @pzhdfy
    @QiuMM
    @rcgarcia74
    @richardstartin
    @robertervin
    @samarthjain
    @seoeun25
    @Shimi
    @surekhasaharan
    @taiii
    @thomask
    @VincentNewkirk
    @vogievetsky
    @yunwan
    @zhaojiandong