The repertoire of drugs for patients with cancer is rapidly expanding; however, cancers that appear pathologically similar often respond differently to the same drug regimens. Methods to better match patients to specific drugs are therefore in high demand. For example, patients over 65 with acute myeloid leukemia (AML), an aggressive blood cancer, have no better prognosis today than they did in 1980. For a growing number of diseases, a fair amount of molecular profile data from patients is available. The key step toward matching patients to drugs is to identify molecular markers in these data that predict treatment outcomes, such as response to each chemotherapy drug. However, due to the high dimensionality of the data (i.e., the number of variables is much greater than the number of samples), along with potential biological or experimental confounders, it remains an open challenge to identify robust biomarkers that replicate across different studies. In this talk, I will present two novel machine learning algorithms that address these challenges. These methods learn, in an unsupervised fashion, low-dimensional features that are likely to represent important molecular events in the disease process, based on molecular profiles from multiple populations of cancer patients. These algorithms led to the identification of novel molecular markers in AML and ovarian cancer.