Paper Link :
SPAR
This paper presents a new system (SPAR) for partitioning social
networks. SPAR minimizes replication overhead while maintaining data
locality. In experiments on Twitter and Facebook data sets, SPAR was
compared against several other algorithms and clearly outperformed
them. Given the huge demand on social networks, I think using SPAR
would bring a significant improvement in data replication. The paper is
also well written and conveys the idea clearly.
8. Detailed comments
A. Summary
1) Scalability of real systems is a complex field. It becomes more
difficult for social networks because the data is not disjoint. The
paper proposes SPAR, a middleware for partitioning and replicating
social network data. SPAR performs partitioning and replication
jointly. The authors explain replication with a graph of 10 nodes on 2
servers and give an overview of replication using DHT, Full and SPAR.
With SPAR the queries are resolved locally on the server, and as a
result the throughput is high. SPAR also gives the user the flexibility
to select its data source. SPAR is an online algorithm, so it is highly
useful for dynamic social graphs, which otherwise require recomputation
of the partitions.

The authors describe the minimum replication problem and give a
solution based on greedy optimization. The algorithm is triggered by
the addition or removal of a node, a server or an edge, and all 6 cases
are analyzed. A new user is added to the partition with the fewest
replicas. When a user is deleted, its master and its slaves are
deleted. When a new relation is formed, an edge is created between 2
users. The algorithm checks whether the two masters are co-located; if
so, no action is required. If not, it calculates the minimum number of
replicas that have to be created. The authors explain edge addition
with various cases. It is illustrated that minimizing the nodes is not
the only condition for minimizing replication. Server addition and
removal are also discussed in the paper. For server addition, SPAR was
able to reach a stable state irrespective of how the servers were
added.

Extensive experiments were conducted on data from Facebook and Twitter,
and SPAR was compared against other algorithms such as METIS, MO+ and
Random. The experiments were run with 0 and 2 replicas, and the
movement cost was computed for 4 to 512 servers. The overhead was
lowest with SPAR. The COV for read and write operations with SPAR was
0.37 and 0.0019 respectively, indicating that SPAR balances reads and
writes efficiently across servers. Using StatusNet, a Twitter clone,
the performance of SPAR was studied on top of MySQL and Cassandra, and
SPAR reduced network traffic by a factor of 8.
3) It is not very clear from the paper how new edges are added, or how edges are handled when metadata is attached to them.
4) Local load balancing is not addressed.
5) The formulation of the solution is very vague and could have been explained in a better way.
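
To make the edge-addition step from the summary more concrete, here is a rough sketch of how I understand the greedy choice. It is not the authors' implementation. The class `Cluster`, its methods and the option names are my own, and the sketch assumes no extra fault-tolerance replicas and ignores the load-balance constraint, so without that constraint users will tend to collect on one server. When the two masters are not co-located, it tries three configurations (leave both masters in place, move the first master, move the second master) and keeps the one that needs the fewest replicas.

```python
from collections import defaultdict


class Cluster:
    """Toy model: one master copy per user, replicas implied by friendships."""

    def __init__(self):
        self.master = {}                  # user -> server holding the master copy
        self.friends = defaultdict(set)   # user -> set of neighbouring users

    def add_user(self, user, server):
        self.master[user] = server

    def replicas(self):
        """Set of (user, server) replicas needed so that every master finds
        a copy of each of its neighbours on its own server."""
        needed = set()
        for user, neighbours in self.friends.items():
            for n in neighbours:
                if self.master[n] != self.master[user]:
                    needed.add((user, self.master[n]))
        return needed

    def add_edge(self, u, v):
        """Add the edge u-v and greedily pick the cheapest configuration."""
        self.friends[u].add(v)
        self.friends[v].add(u)
        su, sv = self.master[u], self.master[v]
        if su == sv:
            return "colocated"            # masters co-located: no action
        options = {
            "stay": (su, sv),             # keep masters, replicate as needed
            "move_u": (sv, sv),           # move u's master next to v
            "move_v": (su, su),           # move v's master next to u
        }
        best_name, best_cost = None, None
        for name, (pu, pv) in options.items():
            self.master[u], self.master[v] = pu, pv
            cost = len(self.replicas())   # total replicas in this configuration
            if best_cost is None or cost < best_cost:
                best_name, best_cost = name, cost
        self.master[u], self.master[v] = options[best_name]
        return best_name
```

Recomputing the full replica set for every option is wasteful and is done here only to keep the sketch short. An online system would only look at the two users and their neighbours.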