The Demand Forecasting STO predicts future sales across Wayfair’s US, CA, DE, and UK catalogs, up to 18 months into the future. These forecasting products fuel the supply-chain and warehouse-management systems that increase product availability and fast-delivery options for Wayfair’s customers by effectively deploying capital from its 5,000+ suppliers.
PPD Algorithms and Data Science partners with Data Science and business teams to build systems that leverage the latest in machine learning while meeting business-specific needs for agility in decision-making. The team facilitates and advises on Data Science workflows while making sure business-user needs are met.
By joining Demand Forecasting Engineering, a team started in January 2019, you will have the opportunity to make a significant impact in a space that is growing extremely fast. The team partners with business and Data Science teams to make the current forecasting system faster at scale and more reliable, and ultimately to produce more accurate forecasts!
What You’ll Do
- Creating a fully automated feature-extraction pipeline that pulls features from inputs across pricing, promotions, out-of-stock, sort rank, etc., going back to 2013 to fuel the forecasting model
  - The automated part of this pipeline currently consists of ~30 tasks in an Airflow DAG
  - Tasks span Hive, SQL Server, Vertica, and Python
- Enabling end-users to rapidly make business adjustments to 2 million+ item-level forecasts for the Wayfair US and CA catalogs where machine-learning outputs do not meet business needs
- Productionizing further features (200 in total) to make them more scalable and automated, and integrating them into the existing pipeline
- Parallelizing existing processes, exploring options of K8s clusters, Airflow and Spark
- Rewriting SQL logic into Python, improving speed and testability
- Facilitating changes to the existing pipeline, with new features and/or changes to the current machine-learning model
- Currently the pipeline utilizes 16 XGBoost models run in parallel. One future avenue is a meta-modelling approach: training a neural net to choose between dozens of models. Other options include utilizing GPUs or running LightGBM on Spark
- Adding more future-facing features to the model; right now it has only one of these: planned future promotions
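The feature-extraction pipeline described above is a DAG of ~30 tasks: Airflow resolves the order in which tasks run from the dependencies each task declares on its upstream tasks. The scheduling idea can be sketched without Airflow itself, using Python's standard-library topological sorter. All task names below are hypothetical illustrations, not the team's actual pipeline:

```python
from graphlib import TopologicalSorter

# Hypothetical slice of a feature-extraction DAG: each task maps to the
# set of upstream tasks it depends on, the way Airflow operators declare
# upstream dependencies.
deps = {
    "extract_pricing_hive":     set(),
    "extract_promos_sqlserver": set(),
    "extract_oos_vertica":      set(),
    "join_features_python":     {"extract_pricing_hive",
                                 "extract_promos_sqlserver",
                                 "extract_oos_vertica"},
    "train_forecast_model":     {"join_features_python"},
}

# static_order() yields every task after all of its dependencies,
# which is exactly the guarantee a DAG scheduler provides.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

The three extraction tasks have no mutual dependencies, so a scheduler like Airflow is free to run them in parallel before the join and training steps.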
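The gain in testability from rewriting SQL logic in Python, mentioned above, comes from being able to run the logic on in-memory data with no database attached. A minimal hypothetical example (the query and function are illustrative, not the team's code): a GROUP BY aggregation expressed as a plain Python function.

```python
from collections import defaultdict

def units_by_sku(rows):
    """Python equivalent of:
        SELECT sku, SUM(units) FROM sales GROUP BY sku
    Rewritten as a function so it can be unit-tested without a database.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row["sku"]] += row["units"]
    return dict(totals)

# In-memory fixture standing in for the sales table.
sales = [
    {"sku": "SOFA-1", "units": 3},
    {"sku": "LAMP-7", "units": 5},
    {"sku": "SOFA-1", "units": 2},
]
print(units_by_sku(sales))  # {'SOFA-1': 5, 'LAMP-7': 5}
```

A test suite can now assert on edge cases (empty input, a single SKU) directly, which is awkward to do against SQL embedded in a pipeline task.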
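The meta-modelling avenue mentioned above, a model that chooses between many base forecasters, can be sketched in miniature. Here the "meta-model" is just a rule that picks the base model with the lowest historical error; in the approach described above it would instead be a trained neural net choosing among dozens of XGBoost/LightGBM models. All model names and error numbers below are illustrative:

```python
def pick_best_model(history):
    """history: {model_name: list of absolute errors on past forecasts}.
    Returns the model with the lowest mean absolute error -- the simplest
    possible stand-in for a learned meta-model.
    """
    def mae(errors):
        return sum(errors) / len(errors)
    return min(history, key=lambda name: mae(history[name]))

# Illustrative backtest errors for three hypothetical base forecasters.
errors = {
    "xgboost_a": [4.0, 5.0, 6.0],   # MAE 5.0
    "xgboost_b": [3.0, 3.5, 2.5],   # MAE 3.0
    "lightgbm":  [4.5, 4.0, 3.5],   # MAE 4.0
}
print(pick_best_model(errors))  # xgboost_b
```

A learned meta-model generalizes this rule: instead of one global winner, it can select a different base model per item or per horizon based on the item's features.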
What You’ll Need
- SQL development skills
- 3+ years of professional software experience; Python skills a plus
- Ability to lead small-to-medium-sized technical projects with mentorship
- Strong communication skills to navigate a fast-changing environment with multiple stakeholders
- Interest in Data Science and in solving Big Data problems, experience a plus