We’re very happy to keep engaging with professional communities on topics we’re passionate about. This time, our Data Science and Machine Learning consulting expert Denis Dus spoke at PyCon BY’17, an annual international conference that brings the Python community together. At the event, Denis gave a talk on the reproducibility and automation of the machine learning process.
In his talk, Denis explained basic design concepts for automating iterative processes in machine learning and shared his experience of building data pipelines in one of his projects.

Photo Credit: PyCon BY.
Automating the machine learning process does not replace the data science expert. It frees the expert to focus on understanding the business problem, improving the model, and explaining the results, which are the true value drivers for the business.
Data scientists typically spend up to 80% of their time on data engineering tasks such as data extraction, cleaning, transformation, normalization, and feature extraction, leaving only 20% for modeling. Denis recommends considering automation if you repeatedly need to extract, clean, and transform data, if you want to update models on a regular basis, or if you want to make data science experiments easier to reproduce.
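To make the idea a bit more concrete, here is a minimal sketch of what an automated, reproducible pipeline could look like in plain Python. It is not taken from Denis's talk or project: the step names (`extract`, `clean`, `normalize`, `run_pipeline`), the synthetic data, and the seed value are all assumptions made for illustration only.

```python
import random
import statistics


def extract(seed: int, n_rows: int = 20) -> list:
    """Hypothetical stand-in for data extraction: synthetic readings with gaps."""
    rng = random.Random(seed)  # a local, seeded generator keeps every run repeatable
    return [None if i % 5 == 0 else rng.gauss(50.0, 10.0) for i in range(n_rows)]


def clean(rows: list) -> list:
    """Data cleaning: drop missing values."""
    return [x for x in rows if x is not None]


def normalize(rows: list) -> list:
    """Data normalization: rescale to zero mean and unit variance."""
    mean = statistics.fmean(rows)
    std = statistics.pstdev(rows)
    return [(x - mean) / std for x in rows]


def run_pipeline(seed: int, steps=(clean, normalize)) -> list:
    """Run extraction, then apply each step in order."""
    data = extract(seed)
    for step in steps:
        data = step(data)
    return data


first = run_pipeline(seed=42)
second = run_pipeline(seed=42)
print(first == second)  # identical seeds should give identical results
```

Because every step is an ordinary function and the only source of randomness is seeded, rerunning the pipeline with the same seed reproduces the same output, and adding a new step is a matter of extending the `steps` tuple. A real project would likely swap these toy steps for database queries, feature extraction, and model training.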
For more machine learning posts and other technology-related materials, please take a look at our blog.