Categories
Big Data

Why your organization should start using Apache Mesos

This is the written version of my earlier talk, Introduction to Mesos, which I gave at the Agoda machine learning meetup. The post is split into the following parts: Challenges; Datacenter partitioning and resource management; Why Apache Mesos. Challenges: Everybody is moving towards microservices, especially if your company is doing well, then […]

Categories
Big Data

Writing a faster, memory-efficient lookup table in Scala

TL;DR: This blog post explains how to build a fast, memory-efficient data lookup from a CSV file in Scala. When you deal with a real-time application that requires a lookup before further processing, you will need to consider caching the data within the application. Memory will be a bottleneck if […]

Categories
Big Data

Why Apache Arrow is the future for open source Columnar In-Memory Analytics

Performance gets redefined when the data is in memory. Apache Arrow is a de facto standard for columnar in-memory analytics, and engineers from across the top-level Apache projects are contributing to its creation. In the coming years we can expect all the big data platforms to adopt Apache Arrow as their columnar in-memory layer. […]

Categories
Big Data

Dynamic Allocation in Apache Spark

This is a brief post about the elastic allocation of cluster resources to your Spark applications. As you may already know, Spark is a fast and general engine for big data processing that helps you run your code faster thanks to in-memory data sharing and general computation graphs. Apart from that, using the Spark […]

Categories
Big Data Security

Arbitrary Code Execution In Unsecured Apache Spark Cluster

Apache Spark Cluster: The figure above is the basic abstraction of a Spark cluster. Here, the driver program is the actual code (job) that you will be running over the Spark cluster. The Cluster Manager (the Master) coordinates the task allocation between executors. You can say the cluster manager acts as a job […]