In this post, I will discuss the two-phase commit (2PC) distributed transaction commit protocol and some of the problems associated with it. What is a distributed commit protocol? A commit protocol is an algorithm used for atomically committing a transaction. Atomicity implies that either all the changes (writes / updates) in the transaction will... Continue Reading →
Optimistic Locking
In this post, I will briefly discuss the optimistic locking technique, its advantages, and potential use cases. Pessimistic locking protocol: Let's first discuss the opposite of optimistic locking to set up the context. Pessimistic locking is the main locking paradigm used for guaranteeing mutual exclusion for a given piece of code subject to execution by reader and... Continue Reading →
Vectorized Processing in Analytical Query Engines
Traditional query processing algorithms are based on the "iterator" or "tuple-at-a-time" model, where a single tuple is pushed up through the query plan tree from one operator to another. Each operator typically has a next() method which outputs a tuple or record, and the latter is then consumed as an input record by the caller operator... Continue Reading →
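The tuple-at-a-time model described in this excerpt can be sketched roughly as follows. This is only an illustrative sketch, not code from the post: the operator classes `Scan`, `Filter`, and `Project`, the helper `run`, and the convention of returning `None` at end of input are all assumptions made for the example. The only thing taken from the excerpt is that each operator exposes a next() method whose output is consumed by its caller.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Row = Tuple  # a single tuple/record flowing through the plan


class Scan:
    """Leaf operator (hypothetical): emits one stored row per next() call."""

    def __init__(self, rows: Sequence[Row]) -> None:
        self._rows = rows
        self._pos = 0

    def next(self) -> Optional[Row]:
        if self._pos >= len(self._rows):
            return None  # assumed end-of-input marker
        row = self._rows[self._pos]
        self._pos += 1
        return row


class Filter:
    """Pulls rows from its child until one satisfies the predicate."""

    def __init__(self, child, predicate: Callable[[Row], bool]) -> None:
        self._child = child
        self._predicate = predicate

    def next(self) -> Optional[Row]:
        while True:
            row = self._child.next()
            if row is None or self._predicate(row):
                return row


class Project:
    """Keeps only the requested column positions of each row."""

    def __init__(self, child, columns: Sequence[int]) -> None:
        self._child = child
        self._columns = columns

    def next(self) -> Optional[Row]:
        row = self._child.next()
        if row is None:
            return None
        return tuple(row[i] for i in self._columns)


def run(root) -> List[Row]:
    """Drive the plan from the root, one tuple per next() call."""
    out: List[Row] = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


rows = [(1, "a"), (2, "b"), (3, "c")]
plan = Project(Filter(Scan(rows), lambda r: r[0] >= 2), [1])
result = run(plan)
```

Every row that reaches the root costs one next() call per operator in the plan, so the per-tuple call overhead grows with plan depth.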
Why Are Analytic Workloads Faster on Columnar Databases?
In this post, I will briefly summarize why analytic (OLAP) workloads perform better on columnar (aka column-oriented) databases than on traditional row-based (aka row-oriented) databases. Contents: Introduction, Storage Organization, Vectorized Query Execution, CPU Cache Friendly, Late Materialization, Compression. Introduction: Analytic workloads consist of operations like scans, joins, aggregations etc. These operations are concerned with data... Continue Reading →
Clustered Indexes vs. Non-Clustered Indexes
In this post, I would like to give a brief overview of Clustered and Non-Clustered Indexes. DISCLAIMER: I am an Oracle employee, and the views/opinions expressed in the article below are purely my own and do not express the views of my employer. Let's start with the similarities. Similarities: Both Clustered and Non-Clustered indexes are types... Continue Reading →
Primary Index vs. Secondary Index
In this very short post, I will give an overview of primary and secondary indexes in databases. DISCLAIMER: I am an Oracle employee, and the views/opinions expressed in the article below are purely my own and do not express the views of my employer. Let me start with the similarities. Similarities: Both the index structures... Continue Reading →