Data preparation for machine learning is time-intensive, but data scientists must still pay attention to several considerations along the way. The following six steps are key parts of the process.

1. Problem formulation. Data preparation for building machine learning models involves much more than cleaning and structuring data.