We are currently looking for a Data Scientist / Machine Learning Engineer to join our general-purpose data science team and work on tasks covering a wide variety of business needs, without a specific focus on computer vision.
In this position, you will work with multiple data sources (usually numerical, textual and time-related data, less frequently visual data) and with datasets both large and small to develop, validate and deploy machine learning models, tune their performance and integrate them into data processing pipelines.
Responsibilities:
- Work with both structured and unstructured data, collaborate with data engineers on defining data storage formats, and specify data collection requirements;
- Go beyond solving technical tasks: understand business needs, propose appropriate solutions along with data collection and labelling requirements and recommendations, and explain the chosen approach to non-technical people;
- Set up reproducible experiments: select, train, validate and optimize machine learning models, and evaluate their quality in business-related terms;
- Integrate data preprocessing and model inference into general data processing pipelines;
- Research new tools, papers, etc. in the machine learning area.
Requirements:
- Strong knowledge and deep understanding of:
  - Classical machine learning (linear models, decision trees, ensembles for classification and regression tasks, clustering and dimensionality reduction);
  - Main concepts and stages of the modelling process (validation schemes, regularization, overfitting and generalization, data leakage, feature selection, etc.);
- Hands-on experience with Python scientific and ML-related libraries;
- Hands-on experience with gradient boosting libraries (xgboost, lightgbm or catboost);
- Experience with relational databases and SQL;
- Ability to implement space- and time-efficient algorithms and to understand which one is preferable and when;
- Good Python programming skills;
- Data visualization and presentation skills;
- Good spoken and written English (at least B1);
- Ability and desire to convert raw business requests into rigorously formulated machine learning tasks;
- Ability to formulate data gathering (or data labelling) requirements;
- At least 1 year of experience in machine learning.
Would Be a Plus:
- Hands-on experience with developing parallel code in Python;
- Familiarity with non-relational databases (Cassandra, Elasticsearch, MongoDB, etc.);
- Experience in software engineering, deployment and integration with data delivery systems and other components, building microservices, and providing APIs for model access;
- Experience in developing recommender systems and in time series analysis;
- Experience in Natural Language Processing;
- Experience in Deep Learning with applications to any data domain;
- Experience in setting up data labelling processes using third-party or in-house labelling tools;
- Participation in ML competitions (Kaggle, etc.);
- Master's degree, PhD, or equivalent experience in Mathematics or Computer Science.
You will work with smart people who love to solve hard problems, and who not only expect but also foster high performance. Email us at email@example.com.