Browsing by Author "Melab, Nouredine"

A Parallel P2P Branch-and-Bound Algorithm for Computational Grids
(2007-05-14) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Solving Combinatorial Optimization Problems (COPs) exactly using a Branch-and-Bound algorithm requires a huge amount of computational resources. The efficiency of such an algorithm can be improved by distributing, at large scale, the computation required to explore the search tree. In this paper, we propose ParallelBB, a P2P-based parallelization of the Branch-and-Bound algorithm for the computational Grid. The algorithm has been implemented using the ProActive distributed object Grid middleware. It has been applied to a mono-criterion permutation flow-shop problem, and experiments on the Grid5000 computational Grid show promising results.
An adaptive hierarchical master-worker (AHMW) framework for grids - Application to B&B algorithms
(Elsevier, 2012) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Well-suited to embarrassingly parallel applications, the master–worker (MW) paradigm has been widely and successfully used in parallel distributed computing. Nevertheless, such a paradigm is very limited in scalability on large computational grids. A natural way to improve scalability is to add a layer of masters between the master and the workers, making a hierarchical MW (HMW). In most existing HMW frameworks and algorithms, only a single layer of masters is used, the hierarchy is statically built, and the granularity of tasks is fixed. Such frameworks and algorithms are not suited to grids, which are volatile, heterogeneous, and large-scale environments. In this paper, we revisit the HMW paradigm to match these characteristics of grids. We propose a new dynamic adaptive multi-layer hierarchical MW (AHMW) that deals with the scalability, volatility, and heterogeneity issues. The construction and deployment of the hierarchy and the task management (deployment, decomposition of work, distribution of tasks, ...) are performed in a dynamic, collaborative, and distributed way. The framework has been applied to the parallel Branch and Bound algorithm and evaluated on the Flow-Shop scheduling problem. The implementation uses the ProActive grid middleware, and large-scale experiments have been conducted using about 2000 processors from the Grid’5000 French nation-wide grid infrastructure. The results demonstrate the high scalability of the proposed approach and its efficiency in terms of deployment cost, decomposition and distribution of work, and exploration time. The results show that AHMW outperforms HMW and MW in scalability and in efficiency in terms of deployment and exploration time.
Fault-Tolerant Mechanism for Hierarchical Branch and Bound Algorithm
(IEEE, 2011-05-16) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Solving large instances of Combinatorial Optimization Problems (COPs) exactly using Branch and Bound (B&B) algorithms requires a huge amount of computing resources. These resources can be offered by computational grids, and scalability can be achieved using a Hierarchical Master/Worker-based B&B, which pushes the limits of the traditional Master/Worker paradigm. However, the resources offered by grids are most of the time unreliable, volatile, and heterogeneous. Therefore, such algorithms must take fault tolerance into account. In this paper, we present FTH-B&B, a fault-tolerant hierarchical B&B, in order to deal with the fault tolerance issue. It is composed of several fault-tolerant Master/Worker-based sub-B&Bs that are organized hierarchically into groups and independently perform a fault-tolerance mechanism. Besides, a fault recovery mechanism is introduced to recover sub-problems and avoid their redundant exploration in case of failures. In addition, we propose a mechanism to keep the hierarchy safe and balanced during the lifetime of the algorithm. Our algorithm is applied to the Flow-Shop scheduling problem (FSP) and implemented on top of the ProActive grid middleware. Experiments on the Grid’5000 French nation-wide grid are promising and show its ability to remain efficient even in the presence of failures.
FTH-B&B: a Fault-Tolerant Hierarchical Branch and Bound for Large Scale Unreliable Environment
(2013) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Solving large instances of combinatorial optimization problems to optimality using Branch and Bound (B&B) algorithms requires a huge amount of computing resources. In this paper, we investigate the design and implementation of such algorithms on computational grids. Most existing grid-based B&B algorithms are based on the Master-Worker paradigm; their scalability is therefore limited. Moreover, even though the volatility of resources is a major issue in grids, fault tolerance is rarely addressed. We propose FTH-B&B, a fault-tolerant hierarchical B&B. FTH-B&B relies on several new mechanisms that make it possible to efficiently build the hierarchy and keep it balanced, and to store and recover work units (sub-problems). FTH-B&B has been implemented on top of ProActive and applied to the Flow-Shop scheduling problem. Very often, existing grid-based B&B works are validated either through simulation or on a very small real grid. In this paper, we evaluate FTH-B&B experimentally on the Grid'5000 real French nation-wide computational grid using up to 1900 processor cores distributed over 6 sites. The reported results show that the overhead induced by the approach is very low and that an efficiency close to 100% can be achieved on some of Taillard's benchmarks of the Flow-Shop problem. In addition, the results demonstrate the robustness of the approach even in extreme failure situations.
GPU-accelerated Bounding for Branch-and-Bound applied to a Permutation Problem using Data Access Optimization
(John Wiley & Sons, 2013-11) Melab, Nouredine; Chakroun, Imen; Bendjoudi, Ahcène
Branch-and-Bound (B&B) algorithms are attractive methods for solving combinatorial optimization problems to optimality using an implicit enumeration of a dynamically built tree-based search space. Nevertheless, they are time-consuming when dealing with large problem instances. Therefore, pruning tree nodes (subproblems) is traditionally used as a powerful mechanism to reduce the size of the explored search space. Pruning requires the bounding operation, which consists of applying a lower bound function to the subproblems generated during the exploration process. Preliminary experiments performed on the Flow-Shop scheduling problem (FSP) have shown that the bounding operation consumes over 98% of the execution time of the B&B algorithm. In this paper, we investigate the use of GPU computing as a major complementary way to speed up the search. We revisit the design and implementation of the parallel bounding model on GPU accelerators. The proposed approach enables data access optimization. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based single-core execution using an Intel Core i7-970 processor without GPU, speedups higher than 100× are achieved for large problem instances. At an equivalent peak performance, GPU-accelerated B&B is twice as fast as its multi-core counterpart.
H-B&B: A Hierarchical B&B for large scale environments
(2011-05-25) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
In this paper, we propose a parallel B&B (H-B&B) based on the Hierarchical Master/Worker paradigm. It aims at improving the scalability of traditional M/W-based B&Bs (M/W-B&B) by eliminating the bottlenecks created at the central master process. Unlike state-of-the-art approaches, H-B&B is fully dynamic: it is composed of several levels of masters and evolves over time according to the dynamic acquisition of new computing nodes and their disconnections. To evaluate our approach, we propose three execution scenarios. First, we evaluate the ability of H-B&B to scale up and deploy on a huge number of nodes. Second, we evaluate its efficiency and its ability to avoid bottlenecks. Finally, we evaluate its ability to scale down and to manage the release of part of the nodes or their volatility.
Hierarchical Branch and Bound Algorithm for Computational Grids
(Elsevier, 2012) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Branch and Bound (B&B) algorithms are efficiently used for the exact resolution of combinatorial optimization problems (COPs). They are easy to parallelize using the Master/Worker (MW) paradigm but limited in scalability when solving large instances of COPs on large-scale environments such as Grids. Indeed, the master process rapidly becomes a bottleneck. In this paper, we propose a new approach, H-B&B, for parallel B&B based on a hierarchical MW paradigm in order to deal with the scalability issue of the traditional MW-based B&B. The hierarchy is built dynamically and evolves over time according to the dynamic acquisition of computing nodes. The inner nodes of the hierarchy (masters) perform branching operations to generate sub-trees, and the leaves (workers) perform a complete exploration of these sub-trees. Therefore, in addition to the parallel exploration of sub-trees, parallel branching is adopted. H-B&B is applied to the Flow-Shop scheduling problem. Unlike most existing grid-based B&B algorithms, H-B&B has been evaluated on a real computational grid, namely Grid’5000. The results demonstrate the scalability and efficiency of H-B&B.
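As a rough illustration of the master/worker split described in this abstract (and not the authors' implementation), the following Python sketch lets a "master" step branch the root problem down to a fixed depth and then hands each resulting sub-tree to a "worker" routine that explores it completely with pruning. Everything here is an assumption made for the example: the function names (`makespan`, `lower_bound`, `branch`, `explore`, `h_bb`), the `master_depth` parameter, the sequential loop that stands in for distributed workers, and the deliberately simple lower bound (completion time of the partial schedule plus the remaining work on the last machine).

```python
from typing import List, Tuple

Times = List[List[int]]  # times[job][machine]: processing times (illustrative layout)


def makespan(times: Times, order: Tuple[int, ...]) -> int:
    """Completion time on the last machine after scheduling `order` (may be partial)."""
    finish = [0] * len(times[0])
    for job in order:
        for k in range(len(finish)):
            # A job starts on machine k when the machine is free and the job
            # has left machine k-1.
            start = max(finish[k], finish[k - 1] if k > 0 else 0)
            finish[k] = start + times[job][k]
    return finish[-1]


def lower_bound(times: Times, prefix: Tuple[int, ...]) -> int:
    """Simple valid bound: the last machine still has to process every unscheduled job."""
    rest = sum(times[j][-1] for j in range(len(times)) if j not in prefix)
    return makespan(times, prefix) + rest


def branch(prefix: Tuple[int, ...], n: int) -> List[Tuple[int, ...]]:
    """Branching: extend the partial permutation by each unscheduled job."""
    return [prefix + (j,) for j in range(n) if j not in prefix]


def explore(times: Times, prefix: Tuple[int, ...], best: int) -> int:
    """Worker role: depth-first exploration of the sub-tree rooted at `prefix`."""
    n = len(times)
    if len(prefix) == n:
        return min(best, makespan(times, prefix))
    for child in branch(prefix, n):
        if lower_bound(times, child) < best:  # prune sub-trees that cannot improve
            best = explore(times, child, best)
    return best


def h_bb(times: Times, master_depth: int = 1) -> int:
    """Master role: branch to `master_depth`, then delegate each sub-tree."""
    n = len(times)
    frontier: List[Tuple[int, ...]] = [()]
    for _ in range(min(master_depth, n)):
        frontier = [child for p in frontier for child in branch(p, n)]
    best = sum(map(sum, times)) + 1  # trivial upper bound on any makespan
    for sub_tree in frontier:  # on a grid, each sub-tree would go to a worker
        best = explore(times, sub_tree, best)
    return best
```

In this sketch the hierarchy is static and has a single level of masters; the approach described above instead builds and adapts a multi-level hierarchy dynamically as computing nodes are acquired.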
Overlay-Centric Load Balancing: Applications to UTS and B&B
(2012-09-24) Trong-Tuan, Vu; Derbel, Bilel; Asim, Ali; Bendjoudi, Ahcène; Melab, Nouredine
To deal with dynamic load balancing in large-scale distributed systems, we propose to organize computing resources following a logical peer-to-peer overlay and to distribute the load according to the so-defined overlay. We use a tree as a logical structure connecting distributed nodes, and we balance the load according to the size of the induced subtrees. We conduct extensive experiments involving up to 1000 computing cores and provide a thorough analysis of different properties of our generic approach for two different applications, namely the standard Unbalanced Tree Search and the more challenging parallel Branch-and-Bound algorithm. Substantial improvements are reported in comparison with the classical random work stealing and two finely tuned application-specific strategies taken from the literature.
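The idea of balancing load according to the size of induced subtrees can be sketched as follows. This is only an illustration under stated assumptions, not the protocol from the paper: the overlay is given as a plain dictionary of child lists, the split is a one-shot static division of a fixed number of work units (the paper addresses dynamic balancing), and the names `subtree_sizes` and `distribute` are hypothetical.

```python
from typing import Dict, List


def subtree_sizes(children: Dict[int, List[int]], root: int) -> Dict[int, int]:
    """Number of overlay nodes in the sub-tree rooted at each node."""
    sizes: Dict[int, int] = {}

    def visit(node: int) -> int:
        sizes[node] = 1 + sum(visit(c) for c in children.get(node, []))
        return sizes[node]

    visit(root)
    return sizes


def distribute(children: Dict[int, List[int]], root: int, units: int) -> Dict[int, int]:
    """Split `units` work items so each sub-tree gets a share proportional to its size."""
    sizes = subtree_sizes(children, root)
    assigned: Dict[int, int] = {}

    def give(node: int, amount: int) -> None:
        remaining, remaining_size = amount, sizes[node]
        for child in children.get(node, []):
            # Integer share proportional to the child's sub-tree size; working
            # on the remainder guarantees that no unit is lost to rounding.
            share = remaining * sizes[child] // remaining_size
            give(child, share)
            remaining -= share
            remaining_size -= sizes[child]
        assigned[node] = remaining  # what is left is kept by the node itself

    give(root, units)
    return assigned
```

For example, with the overlay `{0: [1, 2], 1: [3, 4]}` rooted at node 0, the sub-tree under node 1 contains three of the five nodes and therefore receives three fifths of the work before splitting it further among nodes 1, 3, and 4.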
P2P B&B and GA for the Flow-Shop Scheduling Problem
(Springer-Verlag, 2008-09) Bendjoudi, Ahcène; Guerdah, Samir; Mansoura, Madjid; Melab, Nouredine; Talbi, El-Ghazali
Solving Combinatorial Optimization Problems (COPs) exactly using a Branch-and-Bound (B&B) algorithm requires a huge amount of computational resources. The efficiency of such an algorithm can be improved by hybridizing it with meta-heuristics such as Genetic Algorithms (GA), which have proved their effectiveness since they generate acceptable solutions in a reasonable time. Moreover, distributing the computation at large scale, using for instance Peer-to-Peer (P2P) Computing, provides an efficient way to reach high computing performance. In this chapter, we propose ParallelBB and ParallelGA, which are P2P-based parallelizations of the B&B and GA algorithms for the computational Grid. The two algorithms have been implemented using the ProActive distributed object Grid middleware. The algorithms have been applied to a mono-criterion permutation flow-shop scheduling problem, and experiments on the Grid5000 computational Grid show promising results.
P2P design and implementation of a parallel branch and bound algorithm for grids
(Inderscience, 2009) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Solving large instances of combinatorial optimisation problems optimally using Branch and Bound (B&B) algorithms is CPU-time intensive and requires a large number of computational resources. To harness such a huge amount of resources, Peer-to-Peer (P2P) communications must be allowed between resources, and adaptive load balancing and fault tolerance have to be dealt with when designing and implementing a B&B algorithm. In this paper, we propose a P2P design and implementation of a parallel B&B algorithm on top of the ProActive grid middleware. Load distribution and fault-tolerance strategies are proposed to deal with the dynamic and heterogeneous characteristics of the computational grid. The approach has been applied to the Flow-Shop scheduling problem with promising results and evaluated on a computational pool of 1500 CPUs from the GRID’5000 nation-wide experimental Grid.
Parallel B&B Algorithm on Hybrid Multicore/GPU Architecture
(IEEE, 2013-11-15) Bendjoudi, Ahcène; Chekini, Mehdi; Gharbi, Makhlouf; Mehdi, Malika; Benatchba, Karima; Sitayeb-Benbouzid, Fatima; Melab, Nouredine
B&B algorithms are well-known techniques for the exact solving of combinatorial optimization problems (COPs). They perform an implicit enumeration of the search space instead of an exhaustive one. Based on a pruning technique, they considerably reduce the computation time required to explore the whole search space. Nevertheless, these algorithms remain inefficient when dealing with large combinatorial optimization instances. They are time-intensive and require huge computing power to solve such instances optimally. Nowadays, multi-core processors and GPU accelerators are often coupled together to achieve impressive performance. However, classical B&B algorithms must be rethought to deal with these two divergent architectures. In this paper, we propose a new B&B approach exploiting both the multi-core aspect of current processors and GPU accelerators. The proposed approaches have been executed to solve FSP instances that are well-known combinatorial optimization benchmarks. Real experiments have been carried out on an Intel Xeon 64-bit quad-core processor E5520 coupled to an Nvidia Tesla C2075 GPU device. The results show that our hybrid B&B approach speeds up the execution by up to 123× over the sequential mono-core B&B algorithm.
Parallel Branch and Bound on P2P Systems
(IEEE, 2007) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Most real-world and academic combinatorial optimization problems are NP-hard. For large instances, exact resolution is often impractical because of limited resources. Large-scale deployment on distributed systems such as peer-to-peer (P2P) systems, which exploit free CPU cycles, provides an efficient way to reach high computing performance by distributing the computation needed to solve these problems. In this paper, we are interested in solving optimization problems exactly using a parallel branch-and-bound algorithm on large-scale distributed systems. We propose ParallelBB, a parallelization of the branch-and-bound algorithm, and apply it to a mono-criterion permutation flow-shop problem. Furthermore, we develop P2PBB, the peer-to-peer implementation of our algorithm using ProActive
Parallel GPU-Accelerated Metaheuristics
(CRC Press, Taylor & Francis Group, 2013) Loukil, Lakhdar; Silhadi-Mehdi, Malika; Bendjoudi, Ahcène; Melab, Nouredine
Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm
(2013) Chakroun, Imen; Mezmaz, Mohand; Melab, Nouredine; Bendjoudi, Ahcène
In this paper, we address the design and implementation of GPU-accelerated Branch-and-Bound (B&B) algorithms for solving Flow-shop scheduling optimization problems (FSP). Such applications are CPU-time consuming and highly irregular. On the other hand, GPUs are massively multi-threaded accelerators using the SIMD model at execution. A major issue that arises when executing a B&B applied to FSP on a GPU is thread or branch divergence. Such divergence is caused by the lower bound function of FSP, which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP so that it deals with thread divergence. Extensive experiments with the proposed approach have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, accelerations of up to ×77.46 are achieved for large problem instances.
Reducing Thread Divergence in GPU-based B&B Applied to the Flow-shop problem
(2011-09-11) Chakroun, Imen; Bendjoudi, Ahcène; Melab, Nouredine
In this paper, we propose a pioneering work on designing and programming B&B algorithms on GPU. To the best of our knowledge, no contribution has been proposed to take up such a challenge. We focus on the parallel evaluation of the bounds for the Flow-shop scheduling problem. To deal with thread divergence caused by the bounding operation, we investigate two software-based approaches called thread data reordering and branch refactoring. Experiments show that the parallel evaluation of bounds speeds up execution by up to 54.5 times compared to a CPU version.
Reducing Thread Divergence in GPU-Based B&B Applied to the Flow-Shop Problem
(Springer Berlin Heidelberg, 2012) Chakroun, Imen; Bendjoudi, Ahcène; Melab, Nouredine
In this paper, we propose a pioneering work on designing and programming B&B algorithms on GPU. To the best of our knowledge, no contribution has been proposed to take up such a challenge. We focus on the parallel evaluation of the bounds for the Flow-shop scheduling problem. To deal with thread divergence caused by the bounding operation, we investigate two software-based approaches called thread data reordering and branch refactoring. Experiments show that the parallel evaluation of bounds speeds up execution by up to 54.5 times compared to a CPU version.