Machine Learning Data Engineer – Phoenix, AZ

This role requires experience building data processing pipelines in the AWS cloud, creative problem solving, strong coding skills, and strong communication skills for working with different teams and stakeholders. The ideal candidate is highly organized and able to manage requests efficiently and accurately.


  • Engineer efficient, adaptable, and scalable data pipelines that process structured and unstructured data for cloud-based machine learning applications that enhance the Zoom meeting experience.
  • Build robust, reliable, and scalable tools and services that can power various machine learning solutions.
  • Search for open-source datasets and mine data using software tools.
  • Integrate machine learning software with Zoom’s core product.
  • Work closely with the product team to understand current challenges in using, understanding, and presenting the results of text analytics.
  • Work closely with the offshore team, providing requirements, technical leadership, code review, and result evaluation.


  • M.S. (Ph.D. preferred) in Computer Science, Electrical Engineering, Data Engineering, Cloud Computing, or a related technical field.
  • Understanding of the data lifecycle and concepts such as lineage, governance, privacy, retention, and anonymity.
  • Expertise in engineering data pipelines using big data technologies (Hive, Presto, Spark, Flink, etc.) on large-scale data sets.
  • Extensive knowledge of Spark, Hadoop, Kafka, and data processing pipelines. Experience with databases including MySQL, Redis, MongoDB, and HBase.
  • Hands-on experience in the AWS cloud environment. Familiarity with AWS cloud resources (S3, EC2, RDS, EMR, EKS, etc.).
  • Knowledge of ETL, and deep knowledge of feature engineering, data mining, machine learning, or information retrieval.
  • Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources. Expertise in distributed data processing patterns.
  • Strong coding experience using PySpark, Scala, or Python, and comfort with complex SQL.
  • Familiarity with cloud-based machine learning pipeline components and processes.
  • Familiarity with text-based NLP applications in information retrieval, classification, recommender systems, question answering systems, etc.
  • Experience in software development. Strong analytical skills and attention to detail. Strong mastery of Python and general software development skills (source code management, debugging, testing, deployment, etc.).
