International Journal Papers

Permanent URI for this collectionhttp://dl.cerist.dz/handle/CERIST/17

Browse

Search Results

Now showing 1 - 10 of 10
  • Thumbnail Image
    Item
    Reducing thread divergence in GPU-based bees swarm optimization applied to association rule mining
    (John Wiley & Sons, Ltd., 2016) Bendjoudi, Ahcène; Djenouri, Youcef; Habbas, Zineb; Mehdi, Malika; Djenouri, Djamel
    The association rules mining (ARM) problem is one of the most important problems in the area of data mining. It aims at finding all relevant association rules from transactional databases. It is CPU time intensive and requires a huge computing power when dealing with large transactional databases. To deal with this issue, Graphics Processing Units (GPUs) are a powerful tool to speed up the search process. However, their performance is closely subject to thread/branch divergence resulting from the single instruction multiple data parallel model of GPUs. In this paper, we propose three approaches based on database reorganization, aiming to reduce thread divergence in GPU-based bees swarm optimization metaheuristic for ARM, respectively, named block-based reordering, transactions-based reordering, and transactions-based reordering with median value. Theoretical and experimental studies have been carried out using well-known large ARM instances. The experiments have been performed on an Intel Xeon 64 bit quad-core processor E5520 coupled to Nvidia Tesla C2075 448 cores. The results show that the proposed approaches minimize considerably the number of thread divergence and improve the overall execution time. Indeed, the number of thread divergence occurrences has been reduced by up to eight times making the execution much faster.
  • Thumbnail Image
    Item
    GPU-based Bees Swarm Optimization for Association Rules Mining
    (Springer, 2014) Djenouri, Youcef; Bendjoudi, Ahcène; Mehdi, Malika; Nouali-Taboudjemat, Nadia; Habbas, Zineb
    Association Rules Mining (ARM) is a well-known combinatorial optimization problem aiming at extracting relevant rules from given large scale data sets. According to the state of the art, the bio-inspired methods proved their efficiency by generating acceptable solutions in a reasonable time when dealing with small and medium size instances. Unfortunately, to cope with large instances such as the webdocs benchmark, these methods require more and more powerful processors and are time expensive. Nowadays, computing power is no longer a real issue. It can be provided by the power of emerging technologies such as GPUs that are massively multi-threaded processors. In this paper, we investigate the use of GPUs to speed up the computation. We propose two GPU-based bees swarm algorithms for association rules mining (SE-GPU and ME-GPU). SE-GPU aims at evaluating one rule at a time where each thread is associated with one transaction, whereas ME-GPU evaluates multiple rules in parallel on GPU where each thread is associated with several transactions. To validate our approaches, the two algorithms have been executed to solve well-known large ARM instances. Real experiments have been carried out on an Intel Xeon 64 bit quad-core processor E5520 coupled to an Nvidia Tesla C2075 GPU device. The results show that our approaches improve the execution time up to x100 over the sequential mono-core BSO-ARM algorithm. Moreover, the proposed approaches have been compared with CPU multi-core ones (1 to 8 cores). The results show that they are faster than the multi-core versions what ever the number of used cores.
  • Thumbnail Image
    Item
    Pruning Irrelevant Association Rules Using Knowledge Mining
    (2014) Djenouri, Youcef; Derias, Habiba; Bendjoudi, Ahcène
    The efficiency of existing association rules mining algorithms afford large number of delivered rules that the user can not exploit them easily. Consequently, thinking about another mining of these generated rules becomes essential task. For this, the present paper explores metarules extraction in order to prune the irrelevant rules. It first focuses on clustering association rules for large datasets. This allows the user better organising and interpreting the rules. To more go down in our mining, different dependencies between rules of the same cluster are extracted using meta-rules algorithm. Then, pruning algorithm uses these dependencies to delete the deductive rules and keep just the representative rules for each cluster. The proposed approach is tested on different experiments including clustering, meta-rules and pruning steps. The result is very promising in terms of the number of returned rules and their quality.
  • Thumbnail Image
    Item
    GPU-accelerated Bounding for Branch-and-Bound applied to a Permutation Problem using Data Access Optimization
    (John Wiley & Sons, 2013-11) Melab, Nouredine; Chakroun, Imen; Bendjoudi, Ahcène
    Branch-and-Bound (B\&B) algorithms are attractive methods for solving to optimality combinatorial optimization problems using an implicit enumeration of a dynamically built tree-based search space. Nevertheless, they are time-consuming when dealing with large problem instances. Therefore, pruning tree nodes (subproblems) is traditionally used as a powerful mechanism to reduce the size of the explored search space. Pruning requires to perform the bounding operation which consists of applying a lower bound function to the subproblems generated during the exploration process. Preliminary experiments performed on the Flow-Shop scheduling problem (FSP) have shown that the bounding operation consumes over $98\%$ of the execution time of the B\&B algorithm. In this paper, we investigate the use of GPU computing as a major complementary way to speed up the search. We revisit the design and implementation of the parallel bounding model on GPU accelerators. The proposed approach enables data access optimization. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based single core execution using an Intel Core i7-970 processor without GPU, speedups higher than $100$ times faster are achieved for large problem instances. At an equivalent peak performance, GPU-accelerated B\&B is twice faster than its multi-core counterpart.
  • Thumbnail Image
    Item
    P2P design and implementation of a parallel branch and bound algorithm for grids
    (Inderscience, 2009) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
    Solving optimally large instances of combinatorial optimisation problems using Branch and Bound (B&B) algorithms is CPU-time intensive and requires a large number of computational resources. To harness such huge amount of resources Peer-to-Peer (P2P) communications must be allowed between resources, and adaptive load balancing and fault-tolerance have to be dealt with when designing and implementing a B&B algorithm. In this paper, we propose a P2P design and implementation of a parallel B&B algorithm on top of the ProActive grid middleware. Load distribution and fault-tolerance strategies are proposed to deal with the dynamic and heterogeneous characteristics of the computational grid. The approach has been promisingly applied to the Flow-Shop scheduling problem and experimented on a computational pool of 1500 CPUs from the GRID’5000 Nation-wide experimental Grid.
  • Thumbnail Image
    Item
    Study of Maintenance Contribution to Joint Production and Preventive maintenance Scheduling in the Robustness Framework
    (Inderscience, 2010) Benbouzid-Sitayeb, Fatima; Bendjoudi, Ahcène; Benkhallat, Samir; Varnier, Cristophe; Zerhouni, Nouredine
    In this paper, we deal with a joint production and Preventive Maintenance (PM) scheduling problem in the robustness framework. The contributions of this paper are twofold. First, we will establish that the insertion of maintenance activities during production scheduling can hedge against some changes in the shop environment. Furthermore, we will check if respecting the optimal intervals of maintenance activities guarantees a minimal robustness threshold. Then, we will try to identify from the used optimisation criteria those that allow making predictive schedules more robust. The computational experiments in a flowshop show that joint production and PM schedules are more robust than production schedules and maintenance provides an acceptable tradeoff between equipment reliability and performance loss under disruption.
  • Thumbnail Image
    Item
    An adaptive hierarchical master-worker (AHMW) framework for grids - Application to B&B algorithms
    (Elsevier, 2012) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
    Well-suited to embarrassingly parallel applications, the master–worker (MW) paradigm has largely and successfully used in parallel distributed computing. Nevertheless, such a paradigm is very limited in scalability in large computational grids. A natural way to improve the scalability is to add a layer of masters between the master and the workers making a hierarchical MW (HMW). In most existing HMW frameworks and algorithms, only a single layer of masters is used, the hierarchy is statically built and the granularity of tasks is fixed. Such frameworks and algorithms are not adapted to grids which are volatile, heterogeneous and large scale environments. In this paper, we revisit the HMW paradigm to match such characteristics of grids. We propose a new dynamic adaptive multi-layer hierarchical MW (AHMW) dealing with the scalability, volatility and heterogeneity issues. The construction and deployment of the hierarchy and the task management (deployment, decomposition of work, distribution of tasks, . . .) are performed in a dynamic collaborative distributed way. The framework has been applied to the parallel Branch and Bound algorithm and experimented on the Flow-Shop scheduling problem. The implementation has been performed using the ProActive grid middleware and the large experiments have been conducted using about 2000 processors from the Grid’5000 French nation-wide grid infrastructure. The results demonstrate the high scalability of the proposed approach and its efficiency in terms of deployment cost, decomposition and distribution of work and exploration time. The results show that AHMW outperforms HMW and MW in scalability and efficiency in terms of deployment and exploration time.
  • Thumbnail Image
    Item
    Hierarchical Branch and Bound Algorithm for Computational Grids
    (Elsevier, 2012) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
    Branch and Bound (B&B) algorithms are efficiently used for exact resolution of combinatorial optimization problems (COPs). They are easy to parallelize using the Master/Worker paradigm (MW) but limited in scalability when solving large instances of COPs on large scale environments such as Grids. Indeed, the master process rapidly becomes a bottleneck. In this paper, we propose a new approach H-B&B for parallel B&B based on a hierarchical MW paradigm in order to deal with the scalability issue of the traditional MW-based B&B. The hierarchy is built dynamically and evolves over time according to the dynamic acquisition of computing nodes. The inner nodes of the hierarchy (masters) perform branching operations to generate sub-trees and the leaves (workers) perform a complete exploration of these sub-trees. Therefore, in addition to the parallel exploration of sub-trees, a parallel branching is adopted. H-B&B is applied to the Flow-Shop scheduling problem.Unlike most existing grid-based B&B algorithms, H-B&B has been experimented on a real computational grid i.e. Grid’5000. The results demonstrate the scalability and efficiency of H-B&B.
  • Thumbnail Image
    Item
    Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm
    (2013) Chakroun, Imen; Mezmaz, Mohand; Melab, Nouredine; Bendjoudi, Ahcène
    In this paper, we address the design and implementation of GPU-accelerated Branch-and-Bound algorithms (B&B) for solving Flow-shop scheduling optimization problems (FSP). Such applications are CPU-time consuming and highly irregular. On the other hand, GPUs are massively multi-threaded accelerators using the SIMD model at execution. A major issue which raises when executing on GPU a B&B applied to FSP is thread or branch divergence. Such divergence is caused by the lower bound function of FSP which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP dealing with thread divergence. Extensive experiments of the proposed approach have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, accelerations up to ×77.46 are achieved for large problem instances.
  • Thumbnail Image
    Item
    FTH-B&B: a Fault-Tolerant Hierarchical Branch and Bound for Large Scale Unreliable Environment
    (2013) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
    Solving to optimality large instances of combinatorial optimization problems using Brand and Bound (B&B) algorithms requires a huge amount of computing resources. In this paper, we investigate the design and implementation of such algorithms on computational grids. Most of existing grid-based B&B algorithms are based on the Master-Worker paradigm, their scalability is therefore limited. Moreover, even if the volatility of resources is a major issue in grids fault tolerance is rarely addressed. We propose FTH-B&B, a fault tolerant hierarchical B&B. FTH-B&B is based on different new mechanisms enabling to efficiently build and maintain balanced the hierarchy, and to store and recover work units (sub-problems). FTH-B&B has been implemented on top of ProActive and applied to the Flow-Shop scheduling problem. Very often, the validation of existing grid-based B&B works is performed either through simulation or a very small real grid. In this paper, we experimented FTH-B&B on the Grid'5000 real French nation-wide computational grid using up to 1900 processor cores distributed over 6 sites. The reported results show that the overhead induced by the approach is very low and an efficiency close to 100% can be achieved on some Taillards benchmarks of the Flow-Shop problem. In addition, the results demonstrate the robustness of the approach even in extreme failure situations.