Automated Feature Engineering Applied to Causality
The cause-effect pairs challenge was motivated by the contrast between the cost of performing controlled experiments to determine causality and the abundance of observational data. Our goal was to produce a value representing our confidence in a causal relationship, as inferred from observational data, to help identify the most promising variables for experimental verification. We created a novel approach centered on feature engineering that requires minimal human intervention. By applying standard machine learning algorithms to the pairs of points and computing the goodness of fit of these algorithms in various ways, we created almost 9000 features. This approach was successful enough to attain the highest score on the competition’s private leaderboard. We also outline the alternatives we considered, with explanations of why they were not used, as well as possible improvements that could greatly improve accuracy.
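To make the idea of goodness-of-fit features concrete, the following is a minimal sketch and not the pipeline used in the competition. It assumes a single model (a k-nearest-neighbour regressor written in plain Python) and a single metric (leave-one-out mean squared error); the abstract does not specify which algorithms or metrics were used. The function names `standardize`, `knn_fit_error`, and `pair_features` are hypothetical. The model is fit in both directions, and the two errors and their difference become features for that pair.

```python
import math
import random


def standardize(values):
    """Rescale a sequence to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid division by zero for constant input
    return [(v - mean) / std for v in values]


def knn_fit_error(inputs, targets, k=5):
    """Leave-one-out mean squared error of a k-nearest-neighbour regressor.

    Assumed stand-in for "goodness of fit"; any model/metric could be used.
    """
    n = len(inputs)
    total = 0.0
    for i in range(n):
        # Indices of the k points closest to point i, excluding i itself.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: abs(inputs[j] - inputs[i]),
        )[:k]
        prediction = sum(targets[j] for j in neighbours) / len(neighbours)
        total += (targets[i] - prediction) ** 2
    return total / n


def pair_features(a, b, k=5):
    """Fit the model in both directions and return the errors as features."""
    a_std, b_std = standardize(a), standardize(b)
    err_a_to_b = knn_fit_error(a_std, b_std, k)  # predict B from A
    err_b_to_a = knn_fit_error(b_std, a_std, k)  # predict A from B
    return {
        "err_a_to_b": err_a_to_b,
        "err_b_to_a": err_b_to_a,
        "err_diff": err_b_to_a - err_a_to_b,
    }


# Synthetic pair (illustrative data only): B is a noisy function of A.
rng = random.Random(0)
a = [rng.uniform(-1.0, 1.0) for _ in range(200)]
b = [x * x + rng.gauss(0.0, 0.05) for x in a]
features = pair_features(a, b)
```

In this synthetic pair, B is much easier to predict from A than A is from B, so the two errors differ markedly. Repeating the same procedure over many models, metrics, and preprocessing choices is one way such a feature set could grow to thousands of columns, which a downstream classifier would then consume.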