PySpark / Java Developer (Data Engineer) - 100% Remote - Only W2

Remote, USA | Full-time | Posted 2026-05-31
    Key Responsibilities
  • Design, develop, and maintain scalable ETL pipelines and data processing applications
  • Build and optimize data workflows using PySpark, Java, and Hadoop ecosystem tools
  • Analyze business and technical requirements to produce detailed implementation designs
  • Perform unit testing, integration testing, and debugging of applications
  • Troubleshoot and resolve performance issues related to high-volume data processing
  • Develop and maintain SQL queries, stored procedures, and database objects
  • Work with structured and unstructured datasets for healthcare analytics
  • Generate statistical reports and support data validation processes
  • Collaborate with cross-functional teams to ensure end-to-end data pipeline efficiency
  • Follow software engineering best practices and maintain code quality standards
    Required Skills & Experience
  • Strong experience in ETL development, data processing, and database technologies
  • 5+ years of experience with Microsoft SQL Server and relational databases
  • Expertise in SQL performance tuning, indexing strategies, and query optimization
  • 2+ years of experience with Hadoop ecosystem tools (HDFS, Hive, Impala, Spark, Kafka, Oozie, YARN, Sqoop, Hue)
  • Hands-on experience with PySpark, Python, and/or Java
  • Experience working with large-scale data processing frameworks
  • Strong understanding of data transformation and data movement technologies
  • Ability to handle high-volume structured and unstructured datasets
  • Good understanding of end-to-end application/data pipeline lifecycle
