Data Science Engineer, Machine Learning Platform
PlayStation HQ in San Mateo, CA
They will consider candidates with mid- to senior-level professional experience. A more senior Team Lead position is also available.
They seek a Platform Engineer to support their transition to the cloud. In this role, you will deliver tools for their new machine learning, analytics and big data platform, which is hosted on AWS and built on EMR Permanent and Transitory Clusters, S3 storage, in-house software and open source tools. This will empower their global teams to quickly apply advanced Machine Learning to a variety of problems. They value positive personalities who inspire change. If this is you, please apply!
- Collaborate globally with data and cloud engineers to build a Machine Learning AWS-based platform.
- Engage with data scientists to improve the platform and ensure it follows standard methodologies that meet their requirements.
- Perform creative and complex application programming activities: coding, testing, implementation and documentation of solutions.
- Evaluate and debug services at every stage of the development cycle, from development to production.
- Document new and existing projects to improve community understanding and contribution.
- Strong experience in crafting, deploying and operating highly available, scalable and fault-tolerant systems using Amazon Web Services (EMR Clusters, S3, ELBs, EC2, EBS).
- Strong working knowledge of deploying and configuring Apache Spark clusters, ideally on EMR.
- Strong proficiency in Python.
- Detailed knowledge and understanding of more than one version control system, including Git.
- Knowledge of large open source projects and how they operate, preferably Airflow.
- Adept at working within Unix-like environments, with shell scripting and system-level knowledge.
- Practical exposure to Continuous Integration/Continuous Delivery tools like Jenkins to merge development with testing through pipelines.
- Big-Data Cloud Scalability.
- Hive metastore and Hadoop.
- JDBC/ODBC, SQL query processing, and distributed query engines.
- Configuration Management tools like Ansible and Terraform.
- Docker container infrastructure.
- Monitoring and logging tools like Splunk.
- JupyterHub deployment and Apache Livy integration.
- Visualization tools such as Tableau.
- Salary Offer: 0 ~ $3000
- Experience Level: Junior
- Total Years Experience: 0-5