Browsing by Author "Lebib, Fatma Zohra"
- Knowledge Discovery from Log Data Analysis in a Multi-source Search System based on Deep Cleaning
  (CERIST, 2019-07) Lebib, Fatma Zohra; Mellah, Hakima; Meziane, Abdelkrim
  In a multi-source search system, understanding users’ interests and behaviour is essential to improve the search and adapt the results to each user profile. Information characterizing users is often hidden in large log files and must be discovered, extracted and analyzed to build an accurate user profile. This paper presents an approach that analyzes the log data of a multi-source search system using web usage mining techniques. The aim is to capture, model and analyze the behavioural patterns and profiles of users interacting with this system. The proposed approach consists of two major steps: the first, “pre-processing”, eliminates unwanted data from log files based on predefined cleaning rules; the second, “processing”, extracts useful data on users’ previous queries. In addition to the conventional cleaning process, which removes irrelevant data from the log file such as accesses to multimedia files, error codes and accesses by Web robots, a deep cleaning step is proposed that analyzes the query structure of the different sources to eliminate further unwanted data. This accelerates the processing phase. The generated data can be used for personalizing user-system interaction, filtering information and recommending appropriate sources for the needs of each user.
- Selection of Information Sources using a Genetic Algorithm
  (CERIST, 2017-01-02) Lebib, Fatma Zohra; Drias, Habiba; Mellah, Hakima
  We address the problem of information source selection in the context of a large number of distributed sources. We formulate source selection as a combinatorial optimization problem in order to yield the best set of relevant information sources for a given query. We define a solution as a combination of sources drawn from a huge predefined set of sources. We propose a genetic algorithm that tackles the problem by maximizing the similarity between a selection and the query. Extensive experiments were performed on databases of scientific research documents covering different domains such as computer science and medicine. The results based on the precision measure are very encouraging.
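
The source-selection idea in the last abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoding, the similarity measure or the genetic operators, so everything below is an assumption made for illustration. A selection is encoded as a bit mask over the sources, each source and the query are toy term-weight vectors (`SOURCES`, `QUERY`), fitness is the cosine similarity between the summed vectors of the selected sources and the query, and the operators are tournament selection, one-point crossover, bit-flip mutation and elitism.

```python
import math
import random

# Hypothetical toy data: one term-weight vector per source, plus a query vector.
SOURCES = [[3, 2, 0, 0], [0, 0, 4, 1], [1, 0, 0, 5], [2, 2, 1, 0]]
QUERY = [1, 1, 0, 0]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fitness(mask, sources, query):
    """Similarity between the query and the summed vectors of the selected sources."""
    chosen = [vec for bit, vec in zip(mask, sources) if bit]
    if not chosen:
        return 0.0  # an empty selection matches nothing
    merged = [sum(column) for column in zip(*chosen)]
    return cosine(merged, query)


def select_sources(sources, query, pop_size=30, generations=60,
                   mutation_rate=0.05, seed=0):
    """Return (best_mask, best_fitness). Assumes at least two sources."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    n = len(sources)

    def score(mask):
        return fitness(mask, sources, query)

    def tournament(population):
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if score(a) >= score(b) else b

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=score)
    return best, score(best)


if __name__ == "__main__":
    mask, value = select_sources(SOURCES, QUERY)
    print("selected sources:", [i for i, bit in enumerate(mask) if bit])
    print("similarity:", round(value, 3))
```

With only four sources an exhaustive search would do; a genetic algorithm pays off when the number of sources makes enumerating all 2^n selections impractical, which is the setting the abstract describes.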