Responsibilities and Duties
- Build and train ML models on large-scale datasets to solve various business use cases.
- Use data processing frameworks for feature engineering, working proficiently with both structured and unstructured data.
- Use machine learning and deep learning techniques such as regression, classification, clustering, CNNs, RNNs, and NLP models (e.g., BERT) to solve business use cases such as named entity resolution, forecasting, and anomaly detection.
- Collaborate to develop large-scale data modelling experiments, evaluating them against strong baselines and extracting key statistical insights and/or cause-and-effect relationships.
- Experience across a broad range of modern data science and analytics tools (e.g., R, SQL, Stata, NoSQL, Hive, Hadoop, Spark, Python).
- Proficiency in visualization tools such as Tableau, Cognos, and Power BI is required.
- Ability to work in large and medium-sized project teams as a self-directed contributor, with a proven track record of being detail-oriented, innovative, creative, and strategic.
- Ability to convey complex information in an understandable, compelling, and persuasive manner to audiences at all levels.
- Execute sound data curation, wrangling, and associated correlation processes.
- Synthesize analytical findings for consumption by the teams and senior executives.
- Articulate analytical findings in clear and concise deliverables, including presentations, discussions, and visualizations.
Qualifications and Skills
- Advanced degree in Computer Science, Data Science, or an equivalent discipline.
- Minimum of 5 years of working experience as a data scientist.
- Expertise with Python, PySpark, deep learning frameworks, and MLOps practices.
- Experience designing and building highly scalable, distributed ML models in production, with proficiency in Scala, applied machine learning, statistical methods, and algorithms.
- Experience with analytics tools (e.g., Tableau, SQL, Alteryx, Presto, Spark, Python).
- Experience with machine learning techniques and advanced analytics (e.g., regression, classification, clustering, time series, econometrics, causal inference, mathematical optimization).