Musings about Coding, Business and other Geek Stuff Live and Direct from somewhere on the planet
August 15, 2003
Superfast databases with ksql

I am absolutely fascinated with KDB, a special database engine optimized for time-series analysis, which is normally a notoriously slow process.

This is of special interest in financial analysis and risk management for traders etc.

The way they have done it is interesting in its extreme simplicity. They use a column-based approach rather than the traditional row-based approach. How does that work? Each column in the table has its own data file. I'm assuming that anyone wanting to hack up their own version could use a version of Berkeley DB.
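To make the one-file-per-column idea concrete, here is a minimal sketch in Python. It is not how KDB actually lays out its files; the `.col` extension, the function names and the sample trade data are all made up for illustration. It only shows why a query that touches one column never has to read the others.

```python
import os
import tempfile
from array import array


def write_table(directory, columns):
    """Write each column to its own file: <directory>/<name>.col (hypothetical layout)."""
    for name, (typecode, values) in columns.items():
        with open(os.path.join(directory, name + ".col"), "wb") as f:
            array(typecode, values).tofile(f)


def read_column(directory, name, typecode):
    """Load a single column without opening any other column's file."""
    path = os.path.join(directory, name + ".col")
    col = array(typecode)
    count = os.path.getsize(path) // col.itemsize  # number of stored values
    with open(path, "rb") as f:
        col.fromfile(f, count)
    return col


def demo():
    """Store a tiny three-column table, then average the price column only."""
    with tempfile.TemporaryDirectory() as d:
        write_table(d, {
            "time": ("q", [1, 2, 3, 4]),               # 64-bit integers
            "price": ("d", [10.0, 10.5, 10.25, 11.0]),  # doubles
            "size": ("q", [100, 200, 150, 300]),
        })
        prices = read_column(d, "price", "d")
        return sum(prices) / len(prices)


if __name__ == "__main__":
    print(demo())
```

In a row-based layout the same average would mean reading every row, including the time and size fields it does not need.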

All the data seems to be stored sequentially, which makes sense for time series. They have 3 different table models, à la MySQL. You can pick in-memory for smaller datasets (sub 1G); the tables are of course stored to disk when written to. Shuffle, as far as I can work out, stores everything on disk and uses a large in-memory cache to optimize it. Parallel, the largest, is for insanely large datasets such as historical trading data.
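One reason sequential storage suits time series: if the time column is kept in ascending order, a time-range query becomes two binary searches plus a contiguous slice. The sketch below assumes that ordering and uses invented sample data; it says nothing about how KDB does this internally.

```python
from bisect import bisect_left, bisect_right


def time_range(times, values, start, end):
    """Return the values whose timestamp lies in [start, end].

    Assumes `times` is sorted ascending and lines up index-for-index with `values`.
    """
    lo = bisect_left(times, start)   # first index with times[i] >= start
    hi = bisect_right(times, end)    # first index with times[i] > end
    return values[lo:hi]


# Hypothetical tick data: two parallel columns.
times = [100, 105, 110, 110, 120, 130]
prices = [10.0, 10.5, 10.25, 10.3, 11.0, 10.75]

window = time_range(times, prices, 105, 120)
print(window)
```

The slice is one contiguous run of the price column, which is the access pattern disks and caches handle best.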

What it all boils down to anyway is blindingly fast queries and time-series analysis. While they support standard ANSI SQL-92, they have their own ksql dialect, which has extensions for time-series analysis. They also have JDBC drivers, which I haven't tried yet. I assume they are part of the commercial distro.
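As an illustration of the kind of calculation a time-series extension is for, here is a trailing moving average over a single column, written in plain Python. This is not ksql syntax, and the function name and windowing behaviour are assumptions chosen for the example.

```python
from collections import deque


def moving_average(values, window):
    """Trailing moving average over `values`.

    The first window-1 outputs average whatever has been seen so far.
    """
    out = []
    buf = deque()   # the values currently inside the window
    total = 0.0     # running sum of buf, so each step is O(1)
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()
        out.append(total / len(buf))
    return out


print(moving_average([1, 2, 3, 4, 5], 3))
```

Because the calculation walks one column in order, it fits the column-per-file, sequentially stored layout described above.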

The interesting thing is that the whole thing is written in their own interpreted language, K, which is a vector-processing functional language supposedly with roots in APL. I know nothing about that whole world, but it appears powerful. The K download is tiny at 130k and kdb itself is about 50k. (Update: Just found this other K resource: "K is all there is to K", with good links and background.)

It used to be in the 70s and 80s that there were various special-purpose database engines. During the 90s everyone kind of converged on the idea of one size fits all for database management. Databases had to be large monolithic servers like Oracle etc. Now with the advent of Prevalent Databases we have superfast memory-resident databases that are great for non-analytical stuff such as transaction processing, content management and most other database applications. Analytical applications where you need to query vast amounts of data are not really practical with Prevalent databases; however, enter KDB and there we go.

In the future, will there really be much need for the traditional database server? I don't think so; however, I'm sure it will take a few years for this to trickle into most real-world apps.

Posted by pelleb at August 15, 2003 03:41 PM
This entry was posted in the following Categories: Java