TL;DR: This blog post explains how to build a fast, memory-efficient data lookup from a CSV file in Scala. When you deal with a real-time application that has to do a lookup before any further processing, you will need to consider caching the data within the application. Memory will be a bottleneck if […]
Shipped my first Raspberry Pi 3

My Raspberry Pi 3 was delivered last week, and today I got some time to play with it. I ordered the following items from the official dealer: Raspberry Pi 3, RP 3 USB Adapter, and RP 3 Case. I also purchased a 16GB microSD card from one of the shopping malls for about $4. This […]

So, I came back home after work and decided to check my internet usage since it's month end, and I noticed my WAN address on the ISP’s portal. I fired up nmap and took a dump of all the available routers on the network, basically scanning the network for hosts with port 80 open. Use the following […]
Performance gets redefined when the data is in memory. Apache Arrow is a de facto standard for columnar in-memory analytics, and engineers from across the top-level Apache projects are contributing to its creation. In the coming years we can expect all the big data platforms to adopt Apache Arrow as their columnar in-memory layer. […]
Dynamic Allocation in Apache Spark
This is a brief post about the elastic allocation of cluster resources to your Spark applications. As you may already know, Spark is a fast and general engine for big data processing that helps you run your code faster because of its in-memory data sharing and general computation graphs. Apart from that, using the spark […]
Apache Spark Cluster: The figure above is the basic abstraction of a Spark cluster. Here, the driver program is the actual code (job) that you will be running over the Spark cluster. Cluster Manager (the Master) coordinates the task allocation between executors. You can say the cluster manager acts as a job […]
If you have an Elasticsearch instance that is publicly available, upgrade to 1.4.3 or later immediately! Elasticsearch (the “E” in ELK) is a full-text search engine that makes data aggregation and querying easy. It has an extensive JSON API that allows everything from searching to system management. This post will show how a new vulnerability, CVE-2015-1427, […]
IRCCloud Session Validation Failure
On their back-end, they are not validating the session properly. They also use only one session variable to validate and authenticate the user throughout. This security flaw lets me access a logged-in account from an entirely different browser (from a different location) without actually logging in. So here are the steps that I took to make this […]
How to Exploit OpenSSL aka Heartbleed
I hope you already know how Heartbleed works. The Heartbleed Bug is a serious vulnerability in the popular OpenSSL cryptographic software library. This weakness allows stealing the information protected, under normal conditions, by the SSL/TLS encryption used to secure the Internet. You may want to look into my previous post to get a clear picture. Try this […]
How OpenSSL Heartbleed Works
What’s Heartbleed and why should I care about OpenSSL? In case you haven’t read the Heartbleed website, go do that. Here I’ll just give a quick overview. The Heartbleed bug is a particularly nasty bug. It allows an attacker to read up to 64KB of memory, and the security researchers have said: Without using any privileged information […]