Real-time visualisation of Hadoop resources

At CERN we run multiple Hadoop clusters to satisfy the demanding requirements of our experiments and accelerator communities. The usage and criticality of the clusters are increasing dramatically as more users look to Hadoop to process and archive the vast amounts of data coming out of the LHC. Sometimes, we as Hadoop administrators are faced with …

Integrating Hadoop and Elasticsearch – Part 2 – Querying and Writing to Elasticsearch from Apache Spark

Introduction: In part 2 of the 'Integrating Hadoop and Elasticsearch' blog post series, we look at bridging Apache Spark and Elasticsearch. I assume that you have access to Hadoop and Elasticsearch clusters and are faced with the challenge of bridging these two distributed systems. As Spark code can be written in Scala, Python and Java, …