Cask Data Application Platform v2.6.1

@sreevatsanraman sreevatsanraman released this 30 Jan 04:53

Release Notes

CDAP Bug Fixes

  • Allow an unchecked Dataset upgrade upon Application deployment (CDAP-1253).
  • Update the Hive Dataset table when a Dataset is updated (CDAP-71).
  • Use Hadoop configuration files bundled with the Explore Service (CDAP-1250).

Known Issues

See also the Known Issues of version 2.6.0.

Typically, Datasets are bundled as part of Applications. When an Application is upgraded and redeployed, changes to its Datasets are not redeployed. This is because Datasets can be shared across Applications, and an incompatible schema change can break other Applications that use the Dataset. A workaround (CDAP-1253) is to allow unchecked Dataset upgrades. An upgrade updates the Dataset metadata (that is, its specification, including properties) as well as the Dataset runtime code. To prevent data loss, the existing data and the underlying HBase tables remain as is.

You can allow unchecked Dataset upgrades by setting the configuration property dataset.unchecked.upgrade to true in cdap-site.xml. With this property set, Datasets are upgraded when the Application is redeployed. The recommended process for deploying an upgraded Dataset is to first stop all Applications that use the Dataset, and then deploy the new version of the Application. This lets all containers (Flows, Services, etc.) pick up the new Dataset changes. When Datasets are upgraded using dataset.unchecked.upgrade, the system performs no schema compatibility checks. It is therefore very important that the developer verifies backward compatibility and makes sure that other Applications using the Dataset can work with the new changes.
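
As an illustration, the setting described above might look like the following fragment in cdap-site.xml. This is a minimal sketch that assumes the file follows the Hadoop-style property layout (a configuration element containing property entries); only the property name and value come from these notes, and the description text is illustrative.

```xml
<!-- Sketch of a cdap-site.xml fragment; assumes Hadoop-style property layout. -->
<configuration>
  <property>
    <name>dataset.unchecked.upgrade</name>
    <value>true</value>
    <!-- Illustrative description, not taken from the CDAP defaults -->
    <description>
      Allow Datasets to be upgraded on Application redeployment
      without schema compatibility checks.
    </description>
  </property>
</configuration>
```

If the file already contains a configuration element, only the property entry needs to be added inside it.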