Thursday, March 25, 2010

fast IO and ndb

i read Marks blog entry about fast IO for PBXT, InnoDB (1.0.6 plugin) and MyISAM with some interest.

then there was the durable, not durable and really not durable blog in which Mark and LinuxJedi discussed cluster, and Mark wondered how much IO one could get from 1 data-node with ndb.

so I decided to try.
the setup is similar.
ndb cluster using disk-tables, with a tablespace stored in /dev/shm.

Data occupied 7.3G, which is slightly less than (some of) the others, this is as we don't support having columns with indexes stored on disk. I.e any column that has an index will be stored only in memory instead.

ndb does however not support "handler" statements, so I used oltp-point-select-all-cols instead.

sysbench --test=oltp --mysql-host=foobar --mysql-user=root \
--mysql-password=pw \
--mysql-db=test \
--oltp-table-size=40000000 \
--max-time=60 --max-requests=0 \
--mysql-table-engine=ndb \
--db-ps-mode=disable --mysql-engine-trx=yes \
--oltp-read-only --oltp-skip-trx --oltp-test-mode=simple \
--oltp-point-select-all-cols \
--oltp-dist-type=uniform \
--oltp-range-size=1000 \
--num-threads=1 --seed-rng=1 run




results are not super great...
so i did a ndbapi program (which is roughly the equivalent of handler-statements)



these number look a bit better. but datanode (ndbmtd) was almost idle when running this...

so i made another experiment. Instead of retrieving 1 row at a time (set@r = rand() % 40000000; select * from sbtest where id = @r) i changed to retrive 16 rows at a time (set @r1 = rand(); set @r2 = rand(); select * from sbtest where id in (@r1...@r16).



i believe these results are relevant given that mark's aim was to test fast IO,
and i think that this rewrite wont affect other SE as much as it does with ndb.
and those numbers was quite ok.

as an extra bonus, i also tried with using our memory tables (alter table sbtest storage memory), also with ndbapi and 16 rows at a time.



these tests was executed on a 16-core machine Intel(R) Xeon(R) CPU E7420@2.13GHz
and in my config.ini I had

DiskPageBufferMemory = 1GB
DiskIOThreadPool=8
FileSystemPath=/dev/shm


note: I first create tables in myisam, then converted them to ndb disk-tables by issuing "alter table sbtest storage disk tablespace TS engine = ndb" and all tests was using ndb-cluster-connection-pool=4

all results:

SQL 2756 4998 7133 9130 10720 12222 13305 14190 14626 15287 15547 15955 16143 16334 16507 16757
ndbapi 4177 7581 10319 13162 15245 17064 18874 20652 20850 24131 25976 24910 29832 30666 32625 34841
b=16 1520 11374 35454 53396 55601 63248 71819 78468 103324 97330 97572 111099 125564 126790 133873 141588
mm b=16 73004 128296 172207 207361 246907 270783 293753 312435 327006 345085 346924 374837 360747 372192 376887 394862

Monday, March 15, 2010

Dbspj update, teaser for UC

5 months more...

reminder, what is Dbspj:
- It's a new feature for MySQL Cluster
- It gives the possibility to push-down SQL joins, i.e to evaluate the joins inside the data nodes.

latest and greatest:
- last 5 months spent on SQL integration
- big effort having it used only for cases that we actually support
- lots of testing using RQG
- a significant part of bugs found last couple weeks, has not been related to spj, but in fact "ordinary" optimizer bugs
- we think that it's quite usable now.

and the numbers:
- my plan was to present TPC-W as "realistic" numbers
- now they look more like fantastic numbers
- if i could, i would add a "25x" to my title
- visit my UC presentation (link) to learn more

and fyi: we also plan to provide a feature preview source (or binary) release for
O'Reilly MySQL Conference & Expo 2010


(i will of course also disclose all information provided in UC presentation after the event)