Thursday, October 22, 2009

Dbspj + mysqld, first trials


mysql> create table T1 (a int, b int, primary key(a)) engine = ndb;
mysql> select x.a,x.b,z.a,z.b from T1 x, T1 y, T1 z where x.b=y.a and y.b=z.a;
+----+------+----+------+
| a  | b    | a  | b    |
+----+------+----+------+
| 31 |   47 | 63 |   31 |
| 63 |   31 | 47 |   63 |
| 47 |   63 | 31 |   47 |
|  5 |   47 | 63 |   31 |
+----+------+----+------+
4 rows in set (0.01 sec)
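
The contents of T1 are not shown in the post. As a sanity check, here is a small sketch that loads one data set consistent with the result above into an in-memory SQLite database and runs the same three-way self-join. Both the four rows and the use of SQLite instead of ndb are assumptions made for illustration; the real table may hold more rows, and this says nothing about how Dbspj executes the query.

```python
import sqlite3

# ASSUMPTION: these rows are inferred from the result set above;
# the real T1 may contain additional rows that do not join.
ROWS = [(31, 47), (63, 31), (47, 63), (5, 47)]


def run_three_way_join(rows):
    """Load rows into an in-memory table and run the self-join from the post."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table T1 (a int, b int, primary key(a))")
    conn.executemany("insert into T1 values (?, ?)", rows)
    result = conn.execute(
        "select x.a, x.b, z.a, z.b from T1 x, T1 y, T1 z "
        "where x.b = y.a and y.b = z.a"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Sort for a stable print order; SQL itself gives no ordering guarantee.
    for row in sorted(run_three_way_join(ROWS)):
        print(row)
```

Each row of x follows x.b to a row y, then y.b to a row z, which is why the row with a = 5 produces the same z columns as the row with a = 31.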


- entire query is executed with one request to data nodes
- the code is only a hack (that bluntly examines mysqld's internal structures)
- the list of limitations is so long that I can't write it here due to bandwidth restrictions
But still super cool!

Quote from Jan, who implemented the experiment: "You can write queries that return correct result"

Wednesday, October 14, 2009

Dbspj preliminary numbers

So, some 5 months later...
- Dbspj has an ndbapi
- Dbspj works enough for simple benchmarks!

Reminder, what is Dbspj:
- It's a new feature for Ndb
- It makes it possible to push down linked operations (e.g. in SQL terminology: joins) to the data nodes
- It currently only supports left-outer-joins, and only some kinds of joins
- It is currently *not* in any way integrated with mysqld (for accelerating SQL access)
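
To make "linked operations" a bit more concrete, here is a toy model of the idea. It is not the ndbapi and does not reflect Dbspj internals: `ToyDataNode`, `lookup`, `linked_lookup` and `api_side_join` are hypothetical names invented for this sketch. The only thing it tries to show is the request count: joining in the API costs one request per lookup, while a pushed-down chain of lookups costs one request in total.

```python
class ToyDataNode:
    """HYPOTHETICAL in-memory stand-in for the data nodes; counts requests."""

    def __init__(self, rows):
        self.rows = dict(rows)  # pk -> value used as the key of the next lookup
        self.requests = 0

    def lookup(self, pk):
        """A single primary-key lookup: one request from the API."""
        self.requests += 1
        return self.rows.get(pk)

    def linked_lookup(self, pk, depth):
        """A pushed-down chain of lookups: one request, followed on the node."""
        self.requests += 1
        chain, key = [], pk
        for _ in range(depth + 1):
            value = self.rows.get(key)
            if value is None:
                break
            chain.append((key, value))
            key = value
        return chain


def api_side_join(node, pk, depth):
    """Follow the same chain from the API side, one request per lookup."""
    chain, key = [], pk
    for _ in range(depth + 1):
        value = node.lookup(key)
        if value is None:
            break
        chain.append((key, value))
        key = value
    return chain


if __name__ == "__main__":
    rows = [(31, 47), (63, 31), (47, 63), (5, 47)]
    a, b = ToyDataNode(rows), ToyDataNode(rows)
    print(api_side_join(a, 5, 2), "requests:", a.requests)
    print(b.linked_lookup(5, 2), "requests:", b.requests)
```

In this model the API-side version needs depth + 1 requests for a chain of depth + 1 rows, and the linked version always needs one.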

Anyway, here is the benchmark setup:
2 computers
- ndbapi running on one
- 2 datanodes running on other

In the images below:
- red is new code, blue is corresponding "current" code
- Y-axis is run-time, so lower is better
- X-axis is "depth", i.e. the number of tables joined

Note: this is debug-compiled, so the actual absolute numbers are
not that interesting... rather the comparison...



Query 1:
depth 1: select * from T t1, T t2 where t1.pk = constant and t2.pk = t1.pk
depth 2: select * from T t1, T t2, T t3 where t1.pk = constant and t2.pk = t1.pk and t3.pk = t2.pk
etc...




Query 2:
depth 1: select * from T t1, T t2 where t2.pk = t1.pk
depth 2: select * from T t1, T t2, T t3 where t2.pk = t1.pk and t3.pk = t2.pk
etc...
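
The "etc..." for both queries follows a simple pattern, which can be spelled out with a small generator. This is only a sketch of the SQL equivalent of each depth: the function name `benchmark_query` is made up here, `constant` is kept as the same placeholder used above, and the benchmark itself ran through the ndbapi, not through SQL text.

```python
def benchmark_query(depth, with_constant):
    """Build the SQL text for a given depth.

    with_constant=True gives Query 1 (t1 is looked up by a constant pk),
    with_constant=False gives Query 2 (no restriction on t1).
    HYPOTHETICAL helper; "constant" is a placeholder, not a real value.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # depth N joins N + 1 instances of T: t1 .. t(N+1)
    tables = ", ".join(f"T t{i}" for i in range(1, depth + 2))
    conditions = []
    if with_constant:
        conditions.append("t1.pk = constant")
    # each table is linked to the previous one on the primary key
    conditions += [f"t{i}.pk = t{i - 1}.pk" for i in range(2, depth + 2)]
    return f"select * from {tables} where " + " and ".join(conditions)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(f"Query 1, depth {d}: {benchmark_query(d, True)}")
        print(f"Query 2, depth {d}: {benchmark_query(d, False)}")
```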