Browsing by Author "Bouhenni, Sarra"
Now showing 1 - 3 of 3
Results Per Page
Sort Options
- ItemA Survey on Distributed Graph Pattern Matching in Massive Graphs(ACM, 2021-02) Bouhenni, Sarra; Yahiaoui, Saïd; Nouali-Taboudjemat, Nadia; Kheddouci, HamamacheBesides its NP-completeness, the strict constraints of subgraph isomorphism are making it impractical for graph pattern matching (GPM) in the context of big data. As a result, relaxed GPM models have emerged as they yield interesting results in a polynomial time. However, massive graphs generated by mostly social networks require a distributed storing and processing of the data over multiple machines, thus, requiring GPM to be revised by adopting new paradigms of big graphs processing, e.g., Think-Like-A-Vertex and its derivatives. This article discusses and proposes a classification of distributed GPM approaches with a narrow focus on the relaxed models.
- ItemDistributed graph pattern matching via bounded dual simulation(Elsevier, 2022-09) Bouhenni, Sarra; Yahiaoui, Saïd; Nouali-Taboudjemat, Nadia; Kheddouci, HamamacheGraph Pattern Matching (GPM) finds subgraphs of a large data graph that are similar to an input query graph. It has many applications, such as pattern recognition, detecting plagiarism, and finding communities in social networks. Current real-world applications generate massive amounts of data that cannot be stored on the memory of a single machine, which raises the need for distributed storage and processing. Recent relaxed GPM models, although of polynomial time complexity, are nevertheless not distributed by nature. Moreover, the existing relaxed GPM algorithms are limited in terms of scalability. In this paper, we propose Bounded Dual Simulation (BDSim) as a new relaxed model for a scalable evaluation of GPM in massive graphs. BDSim captures more semantic similarities compared to graph simulation, dual simulation, and even strong simulation. It preserves the vertices’ proximity by eliminating cycles of unbounded length from the resulting match graph. Furthermore, we propose distributed vertex-centric algorithms to evaluate BDSim. We prove their effectiveness and efficiency through detailed theoretical validation and extensive experiments conducted on real-world and synthetic datasets. To the best of our knowledge, BDSim is the first relaxed GPM model that captures the cyclic structure of the query graph while being feasible in cubic time.
- ItemEfficient parallel edge-centric approach for relaxed graph pattern matching(Springer, 2022-02) Bouhenni, Sarra; Yahiaoui, Saïd; Nouali-Taboudjemat, NadiaPrior algorithms on graph simulation for distributed graphs are not scalable enough as they exhibit heavy message passing. Moreover, they are dependent on the graph partitioning quality that can be a bottleneck due to the natural skew present in real-world data. As a result, their degree of parallelism becomes limited. In this paper, we propose an efficient parallel edge-centric approach for distributed graph pattern matching. We design a novel distributed data structure called ST that allows a fine-grain parallelism, and hence guarantees linear scalability. Based on ST, we develop a parallel graph simulation algorithm called PGSim. Furthermore, we propose PDSim, an edge-centric algorithm that efficiently evaluates dual simulation in parallel. PDSim combines ST and PGSim in a Split-and-Combine approach to accelerate the computation stages. We prove the effectiveness and efficiency of these propositions through theoretical guarantees and extensive experiments on massive graphs. The achieved results confirm that our approach outperforms existing algorithms by more than an order of magnitude.