So some 5 month later...
- Dbspj has an ndbapi
- Dbspj works enough for simple benchmarks!
Reminder, what is Dbspj:
- It's a new feature for Ndb
- It gives the possibility to push-down linked operations (e.g in SQL terminology: joins)
- It currently only supports left-outer-joins, and only some kinds of joins
- It is currently *not* in anyway integrated with mysqld (for accelerating SQL access)
Anyway so here is the benchmark setup
2 computers
- ndbapi running on one
- 2 datanodes running on other
On images below:
- red is new code, blue is corresponding "current" code
- Y-axis is run-time, so lower is better
- X-axis is "depth", i.e no of tables joined
Note: this is debug-compiled, so the actually absolute numbers are
not that interesting...rather the comparison...
Query 1:
depth 1: select * from T t1, T t2 where T1.pk = constant and T2.pk = T1.pk
depth 2: select * from T t1, T t2, T t3 where T1.pk = constant and T2.pk = T1.pk and T3.pk = T2.pk
etc...
Query 2:
depth 1: select * from T t1, T t2 where T2.pk = T1.pk
depth 2: select * from T t1, T t2, T t3 where T2.pk = T1.pk and T3.pk = T2.pk
etc...
How to Benchmark Replication Performance in MySQL
-
In this blog, I will cover important aspects which you need to test when
benchmarking replication setup. MySQL has great tools that could be used to
test i...
2 years ago
3 comments:
Hi Jonas,
this is really exciting!
I'm just curious about a few things.
First, you mention that mysqld currently does not know about the feature, and thus does not emit NDBAPI calls to benefit from it. So I take it the SQL queries you mention are just to illustrate the kind of operation that is going on, just not literally?
Second, this query:
"depth 2: select * from T t1, T t2 where T1.pk = constant and T2.pk = T1.pk and T3.pk = T2.pk"
I guess that should have been:
depth 2: select * from T t1, T t2, T t3 where T1.pk = constant and T2.pk = T1.pk and T3.pk = T2.pk
Thirdly, I see that for the first query, that with "linked operations" the graph seems to flatten off at the end, wheras without linked operations time seems to increase linearly. What happes at higher depth?
Finally, how many rows are in the table? How do a full scan of T and a cartesian product of T,T compare to these results?
thanks in advance and kind regards,
Roland
Hi Roland,
1) illustrate, yes
2) correction, yes
3) flatten, don't know. But it's debug-compiled, and there can be bugs still...
4) 50 rows in table
(this only affects query 2 though)
5) comparison, don't really understand what you're asking
Jonas, thanks!
By compare I mean, could you add another series for the same query, but leave out the join condition, or better yet, do the cartesian as a subquery in the from clause and then apply a where. Just to see how that measures up against a supposedly more optimized way of joining.
Post a Comment