Identifying the features that explain a response variable is a long-standing problem in many areas of science. As data sets become more complex, the number of candidate features grows quickly and often exceeds the number of observations we can afford to collect. This poses major challenges for statisticians and scientists, because traditional variable selection methods fail in this setting. This talk reviews these challenges and the existing statistical methods that address them. We will discuss the advantages and disadvantages of those methods, ultimately motivating the novel approach presented in the main talk.