International Journal Papers
Permanent URI for this collection: http://dl.cerist.dz/handle/CERIST/17
6 results
Search Results
GPU-accelerated Bounding for Branch-and-Bound applied to a Permutation Problem using Data Access Optimization
(John Wiley & Sons, 2013-11) Melab, Nouredine; Chakroun, Imen; Bendjoudi, Ahcène
Branch-and-Bound (B&B) algorithms are attractive methods for solving combinatorial optimization problems to optimality using an implicit enumeration of a dynamically built tree-based search space. Nevertheless, they are time-consuming when dealing with large problem instances. Therefore, pruning tree nodes (subproblems) is traditionally used as a powerful mechanism to reduce the size of the explored search space. Pruning requires performing the bounding operation, which consists of applying a lower bound function to the subproblems generated during the exploration process. Preliminary experiments performed on the Flow-Shop scheduling problem (FSP) have shown that the bounding operation consumes over 98% of the execution time of the B&B algorithm. In this paper, we investigate the use of GPU computing as a major complementary way to speed up the search. We revisit the design and implementation of the parallel bounding model on GPU accelerators. The proposed approach enables data access optimization. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based single-core execution using an Intel Core i7-970 processor without GPU, speedups of more than 100 times are achieved for large problem instances. At an equivalent peak performance, GPU-accelerated B&B is twice as fast as its multi-core counterpart.

P2P design and implementation of a parallel branch and bound algorithm for grids
(Inderscience, 2009) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Solving large instances of combinatorial optimisation problems to optimality using Branch and Bound (B&B) algorithms is CPU-time intensive and requires a large number of computational resources. To harness such a huge amount of resources, Peer-to-Peer (P2P) communications must be allowed between resources, and adaptive load balancing and fault tolerance have to be dealt with when designing and implementing a B&B algorithm. In this paper, we propose a P2P design and implementation of a parallel B&B algorithm on top of the ProActive grid middleware. Load distribution and fault-tolerance strategies are proposed to deal with the dynamic and heterogeneous characteristics of the computational grid. The approach has been applied with promising results to the Flow-Shop scheduling problem and evaluated on a computational pool of 1500 CPUs from the Grid'5000 nation-wide experimental grid.

An adaptive hierarchical master-worker (AHMW) framework for grids - Application to B&B algorithms
(Elsevier, 2012) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Well-suited to embarrassingly parallel applications, the master–worker (MW) paradigm has been widely and successfully used in parallel distributed computing. Nevertheless, such a paradigm is very limited in scalability in large computational grids. A natural way to improve the scalability is to add a layer of masters between the master and the workers, forming a hierarchical MW (HMW). In most existing HMW frameworks and algorithms, only a single layer of masters is used, the hierarchy is statically built, and the granularity of tasks is fixed. Such frameworks and algorithms are not suited to grids, which are volatile, heterogeneous, large-scale environments. In this paper, we revisit the HMW paradigm to match such characteristics of grids. We propose a new dynamic adaptive multi-layer hierarchical MW (AHMW) dealing with the scalability, volatility and heterogeneity issues. The construction and deployment of the hierarchy and the task management (deployment, decomposition of work, distribution of tasks, ...) are performed in a dynamic, collaborative, distributed way. The framework has been applied to the parallel Branch and Bound algorithm and evaluated on the Flow-Shop scheduling problem. The implementation has been performed using the ProActive grid middleware, and the large-scale experiments have been conducted using about 2000 processors from the Grid'5000 French nation-wide grid infrastructure. The results demonstrate the high scalability of the proposed approach and its efficiency in terms of deployment cost, decomposition and distribution of work, and exploration time. The results show that AHMW outperforms HMW and MW in scalability and efficiency in terms of deployment and exploration time.

Hierarchical Branch and Bound Algorithm for Computational Grids
(Elsevier, 2012) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Branch and Bound (B&B) algorithms are efficiently used for exact resolution of combinatorial optimization problems (COPs). They are easy to parallelize using the Master/Worker paradigm (MW) but limited in scalability when solving large instances of COPs on large-scale environments such as grids. Indeed, the master process rapidly becomes a bottleneck. In this paper, we propose a new approach, H-B&B, for parallel B&B based on a hierarchical MW paradigm in order to deal with the scalability issue of the traditional MW-based B&B. The hierarchy is built dynamically and evolves over time according to the dynamic acquisition of computing nodes. The inner nodes of the hierarchy (masters) perform branching operations to generate sub-trees, and the leaves (workers) perform a complete exploration of these sub-trees. Therefore, in addition to the parallel exploration of sub-trees, a parallel branching is adopted. H-B&B is applied to the Flow-Shop scheduling problem. Unlike most existing grid-based B&B algorithms, H-B&B has been evaluated on a real computational grid, namely Grid'5000. The results demonstrate the scalability and efficiency of H-B&B.

Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm
(2013) Chakroun, Imen; Mezmaz, Mohand; Melab, Nouredine; Bendjoudi, Ahcène
In this paper, we address the design and implementation of GPU-accelerated Branch-and-Bound (B&B) algorithms for solving Flow-Shop scheduling optimization problems (FSP). Such applications are CPU-time consuming and highly irregular. On the other hand, GPUs are massively multi-threaded accelerators using the SIMD execution model. A major issue that arises when executing a B&B applied to FSP on a GPU is thread (or branch) divergence. Such divergence is caused by the lower bound function of FSP, which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP so as to deal with thread divergence. Extensive experiments on the proposed approach have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, speedups of up to ×77.46 are achieved for large problem instances.

FTH-B&B: a Fault-Tolerant Hierarchical Branch and Bound for Large Scale Unreliable Environment
(2013) Bendjoudi, Ahcène; Melab, Nouredine; Talbi, El-Ghazali
Solving large instances of combinatorial optimization problems to optimality using Branch and Bound (B&B) algorithms requires a huge amount of computing resources. In this paper, we investigate the design and implementation of such algorithms on computational grids. Most existing grid-based B&B algorithms are based on the Master-Worker paradigm; their scalability is therefore limited. Moreover, even though the volatility of resources is a major issue in grids, fault tolerance is rarely addressed. We propose FTH-B&B, a fault-tolerant hierarchical B&B. FTH-B&B is based on several new mechanisms that make it possible to efficiently build the hierarchy and keep it balanced, and to store and recover work units (sub-problems). FTH-B&B has been implemented on top of ProActive and applied to the Flow-Shop scheduling problem. Very often, the validation of existing grid-based B&B works is performed either through simulation or on a very small real grid. In this paper, we evaluated FTH-B&B on the Grid'5000 real French nation-wide computational grid using up to 1900 processor cores distributed over 6 sites. The reported results show that the overhead induced by the approach is very low and that an efficiency close to 100% can be achieved on some of Taillard's benchmarks of the Flow-Shop problem. In addition, the results demonstrate the robustness of the approach even in extreme failure situations.
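
The abstracts in this collection all build on the same core loop: branch on partial permutations, apply a lower bound function to each subproblem, and prune subproblems whose bound cannot beat the best known solution. The following is a minimal sequential sketch of that loop for the permutation Flow-Shop problem, not the authors' implementation. The function names (`completion_times`, `lower_bound`, `branch_and_bound`), the toy instance `P`, and the simple one-machine bound are assumptions chosen for brevity; the papers may use a different, stronger bound.

```python
def completion_times(order, p):
    """Completion time on each machine after scheduling `order` (a job list)."""
    done = [0] * len(p[0])
    for job in order:
        for m in range(len(done)):
            ready = done[m - 1] if m > 0 else 0  # this job's finish on machine m-1
            done[m] = max(done[m], ready) + p[job][m]
    return done


def lower_bound(prefix, remaining, p):
    """Assumed simple bound: for each machine, current load plus all remaining
    work on that machine plus the shortest tail through the later machines."""
    done = completion_times(prefix, p)
    if not remaining:
        return done[-1]  # complete schedule: the bound is the exact makespan
    best = 0
    for m in range(len(done)):
        load = sum(p[j][m] for j in remaining)
        tail = min(sum(p[j][m + 1:]) for j in remaining)
        best = max(best, done[m] + load + tail)
    return best


def branch_and_bound(p):
    """Depth-first B&B over job permutations; returns (best order, makespan)."""
    n = len(p)
    best_order = list(range(n))
    best_cost = completion_times(best_order, p)[-1]  # initial upper bound
    stack = [([], frozenset(range(n)), 0)]
    while stack:
        prefix, remaining, bound = stack.pop()
        if bound >= best_cost:
            continue  # pruned: the bound went stale after a better solution
        if not remaining:
            best_order, best_cost = prefix, bound
            continue
        for job in sorted(remaining):
            child, rest = prefix + [job], remaining - {job}
            child_bound = lower_bound(child, rest, p)  # the bounding operation
            if child_bound < best_cost:
                stack.append((child, rest, child_bound))
    return best_order, best_cost


# Hypothetical toy instance: P[job][machine] is a processing time.
P = [[3, 2, 4], [1, 4, 2], [5, 1, 3], [2, 3, 1]]
order, cost = branch_and_bound(P)
print(order, cost)
```

Each call to `lower_bound` depends only on its own subproblem, so a pool of subproblems can be bounded independently. That independence is what makes the bounding step a natural candidate for the parallel evaluation described in the GPU and grid abstracts above.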