Data Scientist at Trusta Labs

Full Time · 1 month ago
Job Description

We are seeking a skilled Big Data Engineer to design, develop, and optimize ETL processes while ensuring data accuracy, completeness, and timeliness. The role involves collaborating with cross-functional teams to implement efficient data solutions and support business needs.

Key Responsibilities
  • Design, develop, and optimize big data ETL processes to meet business requirements
  • Participate in data warehouse architecture design and develop appropriate ETL solutions
  • Develop Spark applications for large-scale data processing, including data cleaning, transformation, and loading
  • Optimize Spark job performance to improve efficiency and reduce resource consumption
  • Write Python scripts for data collection, preprocessing, and monitoring tasks
  • Integrate Python code with Spark applications for complex data workflows
  • Develop in the PySpark environment to leverage the combined advantages of Python and Spark
  • Troubleshoot PySpark technical issues, including data type conversion errors and performance bottlenecks
  • Implement data quality monitoring strategies and conduct ETL quality checks
  • Establish data quality reporting mechanisms and provide decision-making support
  • Collaborate with data analysts, scientists, and warehouse engineers on projects
  • Participate in technical knowledge sharing to improve team capabilities
Job Requirements
  • Strong experience in big data ETL process design and optimization
  • Proficiency in Spark application development and performance tuning
  • Expertise in Python programming for data processing tasks
  • Hands-on experience with PySpark integration and development
  • Knowledge of data quality assurance methodologies and tools
  • Understanding of data warehouse architecture principles
  • Ability to troubleshoot complex data processing issues
  • Excellent collaboration and communication skills
  • Experience working in cross-functional data teams
  • Continuous-learning mindset and a willingness to share knowledge
Preferred Qualifications
  • Experience with additional big data technologies (Hadoop, Hive, etc.)
  • Knowledge of cloud-based data platforms (AWS, Azure, GCP)
  • Familiarity with data visualization and reporting tools
  • Understanding of machine learning concepts and applications
  • Previous experience implementing data governance frameworks
MyJob.one - Remote work. Real impact
