caching - Why skiplist memory locality is poor but balanced tree is good? -
a guy once challenged antirez(author of redis) why redis use skip list implementation sorted sets in ycombinator:
i looking @ redis yesterday , noticed this. there particular reason chose skip list instead of btrees except simplicity? skip lists consume more memory in pointers , slower btrees because of poor memory locality traversing them means lots of cache misses. suggested way improve throughput when guarantee each command's durability (at end of wiki page): http://code.google.com/p/redis/wiki/appendonlyfilehowto also, have thought accommodating read-only traffic in additional thread way utilize @ least 2 cores efficiently while sharing same memory?
then antirez answered:
there few reasons: 1) not memory intensive. it's basically. changing parameters probability of node have given number of levels make less memory intensive btrees. 2) sorted set target of many zrange or zrevrange operations, is, traversing skip list linked list. operation cache locality of skip lists @ least other kind of balanced trees. 3) simpler implement, debug, , forth. instance skip list simplicity received patch (already in redis master) augmented skip lists implementing zrank in o(log(n)). required little changes code. append durability & speed, don't think idea optimize redis @ cost of more code , more complexity use case imho should rare redis target (fsync() @ every command). no 1 using feature acid sql databases, performance hint big anyway. threads: our experience shows redis i/o bound. i'm using threads serve things virtual memory. long term solution exploit cores, assuming link fast can saturate single core, running multiple instances of redis (no locks, scalable linearly number of cores), , using "redis cluster" solution plan develop in future.
i read carefully, can't understand why skip list comes poor memory locality? , why balanced tree lead memory locality?
in opinion, memory locality storing data in continuous memory. think it's true when read data in address x
, cpu load data in address x+1
cache(based on experiments c, years ago). traversal array result high possibility cache hit , can array has memory locality.
but when comes skip list , balanced tree, both aren't arrays , don't store data continuously. think memory locality both poor. explain little me?
maybe guy meant there 1 key value @ skip list node (in case of default implementation) , n
keys @ b-tree node linear layout. can load bunch of b-tree keys node cache.
you've said:
both aren't arrays , don't store data continuously
but do. store data continiously @ b-tree node.
Comments
Post a Comment