Log pre-processing and grammatical inference for Web usage mining
In this paper, we propose a pre-processing method for Web usage mining that recovers data missing from server log files. We also propose two levels of evaluation: directly on the reconstructed data, and after a machine learning step, by evaluating the inferred grammatical models. Our experiments show that the algorithm improves the quality of user data.

Keywords: log pre-processing, Web usage mining, grammatical inference, evaluation