Currently, we are looking for a Data Scientist / Machine Learning Engineer who will be part of the general-purpose data science team and work on tasks covering a wide variety of business needs, without a focus on computer vision.
In this position, you will work with multiple data sources (usually numerical, textual and time-related data, less frequently visual data) and with both large and small datasets to develop, validate and deploy machine learning models, tune their performance and integrate them into data processing pipelines.
- Deal with both structured and unstructured data, collaborate with data engineers on defining data storage formats, state data collection requirements;
- Not only solve technical tasks but also understand business needs, offer appropriate solutions together with data collection and labelling requirements and recommendations, and explain the chosen approach to non-technical people;
- Set up reproducible experiments: selection, training, validation and optimization of machine learning models, evaluation of their quality in business-related terms;
- Integrate data preprocessing and model inference into general data processing pipelines;
- Research new tools, papers, etc. in the machine learning area.
- Strong knowledge and deep understanding of:
— Classical machine learning (linear models, decision trees, ensembles for classification and regression tasks, clustering and dimensionality reduction);
— Main concepts and stages of the modelling process (validation scheme, regularization, overfitting and generalization, data leaks, feature selection, etc.);
- Hands-on experience with Python scientific and ML-related libraries;
- Hands-on experience with gradient boosting libraries (xgboost, lightgbm or catboost);
- Experience with relational databases and SQL;
- Ability to implement space- and time-efficient algorithms and to understand which one is preferable and when;
- Good Python programming skills;
- Data visualization and presentation skills;
- Good spoken and written English (at least B1);
- Ability and desire to convert raw business requests into strictly formulated machine learning tasks;
- Ability to formulate data gathering (or data labelling) requirements;
- At least 1 year of experience in machine learning.
Would be a plus:
- Hands-on experience with developing parallel code in Python;
- Familiarity with non-relational databases (Cassandra, Elasticsearch, MongoDB, etc.);
- Experience in software engineering, deployment and integration with data delivery systems and other components, building microservices, providing APIs for model access;
- Experience in developing recommender systems and in time series analysis;
- Experience in Natural Language Processing;
- Experience in Deep Learning with applications to any data domain;
- Experience in data labelling process setup using third-party or self-made labelling tools;
- Participation in ML competitions (Kaggle, etc.);
- Master's, PhD, or equivalent experience in Mathematics or Computer Science.
You will work with smart people who love to solve hard problems, and who not only expect but also foster high performance. Email us at firstname.lastname@example.org.