Company: Motion Recruitment
Job description: Role
The Data Engineering team needs a Big Data engineering professional who can assist with important test automation and development tasks associated with mission-critical projects.
- Expected start is as soon as feasible.
- Expected duration of this role is 12 months, with the possibility of extension for up to an additional 12 months. Initial ramp-up time is expected to be about two months (a conservative estimate based on the current 100% remote work situation).
- This engineer will work closely with the team’s primary test engineer and the broader team to align cohesively with ongoing projects.
The Python Developer will perform the following tasks:
- Primary focus will be on QA test automation.
- Define a test automation strategy for new data pipelines, or modify the existing strategy to expand QA coverage of an existing pipeline.
- Create pipeline-specific input datasets and expected datasets to implement QA automation.
- Modify existing input datasets and expected datasets when business requirements for existing data pipelines change.
- Create or modify Python scripts that trigger pipeline-specific QA tests and validate the results against the expected outputs.
- Integrate with Jenkins CI/CD automation to run nightly QA tests automatically.
- Perform manual testing where automation is not feasible, or when QA tests need to be run on an ad hoc basis.
The role requires an engineer who is data-savvy and has an overall system quality mindset.
- Advanced Python scripting skills are a must. For example, the ability to work with ease with data formats such as CSV, JSON, and XML in Python and to create modular, reusable code. Should be able to write object-oriented code and understand list comprehensions in Python.
- Comfortable working in a Linux environment with bash, grep, awk, ssh, xargs, etc.
- Able to think critically about corner cases in data pipelines and create test cases that simulate those conditions. Should understand the significance of test coverage and be familiar with fault injection.
- Understands the problems associated with processing large datasets (tens of TB) and is conceptually familiar with the technologies available to solve those problems.
- Knowledge of Hadoop or Spark is not expected but would be a plus.
Location: Los Angeles, CA
Job date: Sat, 18 Sep 2021 00:38:49 GMT