Data Scientist

We are currently looking for a Data Scientist / Machine Learning Engineer to join our general-purpose data science team and work on tasks covering a wide variety of business needs, with no focus on computer vision.

In this position, you will work with multiple data sources (usually numerical, textual and time-related data, less frequently visual data) and with datasets both large and small to develop, validate and deploy machine learning models, tune their performance and integrate them into data processing pipelines.

Responsibilities:

  • Deal with both structured and unstructured data, collaborate with data engineers on defining data storage formats, state data collection requirements;
  • Not only solve technical tasks but also understand business needs, offer appropriate solutions together with data collection and labelling requirements and recommendations, and explain the chosen approach to non-technical people;
  • Set up reproducible experiments: selection, training, validation and optimization of machine learning models, evaluation of their quality in business-related terms;
  • Integrate data preprocessing and model inference into general data processing pipelines;
  • Research new tools, papers, etc. in the machine learning area.

Requirements:

  • Strong knowledge and deep understanding of:
    — Classical machine learning (linear models, decision trees, ensembles for classification and regression tasks, clustering and dimensionality reduction);
    — Main concepts and stages of the modelling process (validation scheme, regularization, overfitting and generalization, data leaks, feature selection, etc.);
  • Hands-on experience with Python scientific and ML-related libraries;
  • Hands-on experience with gradient boosting libraries (xgboost, lightgbm or catboost);
  • Experience with relational databases and SQL;
  • Ability to implement space- and time-efficient algorithms and understand which one is preferable and when;
  • Good Python programming skills;
  • Data visualization and presentation skills;
  • Good spoken and written English (at least B1);
  • Ability and desire to convert raw business requests into strictly formulated machine learning tasks;
  • Ability to formulate data gathering (or data labelling) requirements;
  • At least 1 year of experience in machine learning.

Would Be a Plus:

  • Hands-on experience with developing parallel code in Python;
  • Familiarity with non-relational databases (Cassandra, Elasticsearch, MongoDB, etc.);
  • Experience in software engineering, deployment and integration with data delivery systems and other components, building microservices, providing APIs for model access;
  • Experience in developing recommender systems and in time series analysis;
  • Experience in Natural Language Processing;
  • Experience in Deep Learning with applications to any data domain;
  • Experience in setting up data labelling processes using third-party or self-made labelling tools;
  • Participation in ML competitions (Kaggle, etc.);
  • Master's degree, PhD, or equivalent experience in Mathematics or Computer Science.

You will work with smart people who love to solve hard problems, and who not only expect but also foster high performance. Email us at hrm@indatalabs.com.