How Cerner Uses CDH with Apache Kafka
Our thanks to Micah Whitacre, a senior software architect on Cerner Corp.’s Big Data Platforms team, for the post below about Cerner’s use case for CDH + Apache Kafka. (Kafka integration with CDH is...
Guidelines for Installing CDH Packages on Unsupported Operating Systems
Installing CDH on newer unsupported operating systems (such as Ubuntu 13.04 and later) can lead to conflicts. These guidelines will help you avoid them. Some of the more recently released operating...
For Apache Hadoop, The POODLE Attack Has Lost Its Bite
A significant vulnerability affecting the entire Apache Hadoop ecosystem has now been patched. What was involved? By now, you may have heard about the POODLE (Padding Oracle On Downgraded Legacy...
New in CDH 5.2: Improvements for Running Multiple Workloads on a Single HBase...
These new Apache HBase features in CDH 5.2 make multi-tenant environments easier to manage. Historically, Apache HBase treats all tables, users, and workloads with equal weight. This approach is...
Cloudera Enterprise 5.3 is Released
We’re pleased to announce the release of Cloudera Enterprise 5.3 (comprising CDH 5.3, Cloudera Manager 5.3, and Cloudera Navigator 2.2). This release continues the drumbeat for security functionality...
New in Cloudera Manager 5.3: Easier CDH Upgrades
An improved upgrade wizard in Cloudera Manager 5.3 makes it easy to upgrade CDH on your clusters. Upgrades can be hard, and any downtime to mission-critical workloads can have a direct impact on...
Checklist for Painless Upgrades to CDH 5
Following these best practices can make your upgrade path to CDH 5 relatively free of obstacles. Upgrading the software that powers mission-critical workloads can be challenging in any circumstance. In...
Cloudera Enterprise 5.4 is Released
We’re pleased to announce the release of Cloudera Enterprise 5.4 (comprising CDH 5.4, Cloudera Manager 5.4, and Cloudera Navigator 2.3). Cloudera Enterprise 5.4 (Release Notes) reflects critical...
New in CDH 5.4: Apache HBase Request Throttling
The following post about the new request throttling feature in HBase 1.1 (now shipping in CDH 5.4) was originally published on the ASF blog. We republish it here for your convenience. Running multiple...
New in CDH 5.4: Sensitive Data Redaction
The best data protection strategy is to remove sensitive information from everyplace it’s not needed. Have you ever wondered what sort of “sensitive” information might wind up in Apache Hadoop log...
Deploying Apache Kafka: A Practical FAQ
This post contains answers to common questions about deploying and configuring Apache Kafka as part of a Cloudera-powered enterprise data hub. Cloudera added support for Apache Kafka, the open standard...
How-to: Run Apache Mesos on CDH
Big Industries, Cloudera systems integration and reseller partner for Belgium and Luxembourg, has developed an integration of Apache Mesos and CDH that can be deployed and managed through Cloudera...
How-to: Prepare Your Apache Hadoop Cluster for PySpark Jobs
Proper configuration of your Python environment is a critical pre-condition for using Apache Spark’s Python API. One of the most enticing aspects of Apache Spark for data scientists is the API it...
How-to: Build a Machine-Learning App Using Sparkling Water and Apache Spark
Thanks to Michal Malohlava, Amy Wang, and Avni Wadhwa of H2O.ai for providing the following guest post about building ML apps using Sparkling Water and Apache Spark on CDH. The Sparkling Water project...
Cloudera Enterprise 5.5 is Now Generally Available
Cloudera Enterprise 5.5 (comprising CDH 5.5, Cloudera Manager 5.5, and Cloudera Navigator 2.4) has been released. Cloudera is excited to bring you news of Cloudera Enterprise 5.5. Our persistent...
Sustained Innovation in Apache Spark: DataFrames, Spark SQL, and MLlib
Cloudera has announced support for Spark SQL/DataFrame API and MLlib. This post explains their benefits for app developers, data analysts, data engineers, and data scientists. In July 2015, Cloudera...
Docker is the New QuickStart Option for Apache Hadoop and Cloudera
Now there’s an even quicker “QuickStart” option for getting hands-on with the Apache Hadoop ecosystem and Cloudera’s platform: a new Docker image. You might already be familiar with Cloudera’s popular...
New in Cloudera Labs: Apache HTrace (incubating)
Via a combination of beta functionality in CDH 5.5 and new Cloudera Labs packages, you now have access to Apache HTrace for doing performance tracing of your HDFS-based applications. HTrace is a new...
DistCp Performance Improvements in Apache Hadoop
Recent improvements to Apache Hadoop’s native backup utility, which are now shipping in CDH, make that process much faster. DistCp is a popular tool in Apache Hadoop for periodically backing up data...
New in Cloudera Enterprise 5.5: Improvements to HUE for Automatic HA Setup...
Cloudera Enterprise 5.5 improves the life of the admin through a deeper integration between HUE and Cloudera Manager, as well as a rebase on HUE 3.9. Cloudera Enterprise 5.5 contains a number of...