What are the responsibilities and job description for the Data Engineer position at CGT Staffing?
Job Details
Job Summary:
We are seeking a highly skilled Databricks Data Engineer to join our team. The primary responsibilities include designing, developing, and optimizing data pipelines and workflows on the Databricks platform, leveraging Apache Spark for large-scale data processing, and working within a collaborative environment to support data-driven decision-making. This role is critical in ensuring the efficiency and scalability of our data infrastructure and enhancing the organization's ability to derive insights from its data.
Key Responsibilities:
- Design, develop, and optimize end-to-end data pipelines and ETL/ELT workflows on the Databricks platform.
- Perform large-scale data processing using Apache Spark with a focus on performance tuning and optimization.
- Develop, test, and maintain SQL and Python code to process, cleanse, and integrate data from multiple sources.
- Ensure data quality, accuracy, and consistency across systems.
- Collaborate with stakeholders to gather business requirements and translate them into technical solutions.
- Implement data models using the Kimball methodology to support data warehousing.
- Manage and orchestrate data pipelines on the Azure cloud platform.
- Troubleshoot and resolve data-related issues, ensuring timely delivery of data products.
- Work within an agile environment, continuously improving data engineering processes and best practices.
Minimum Education & Experience Requirements:
- Bachelor's degree in Computer Science, Information Systems, or a related field (required).
- 3-5 years of hands-on experience with Databricks and Apache Spark in a data engineering capacity.
- Proficiency in SQL and Python, plus experience with ETL/ELT processes.
- Familiarity with the Azure cloud platform.
- Strong understanding of Kimball data warehousing methodology.
Special Requirements:
- Databricks or Azure certifications (preferred but not required).
- Knowledge of CI/CD pipelines and DevOps practices for data engineering (preferred).
Knowledge, Skills, and Abilities:
- Advanced proficiency in Databricks, Apache Spark, SQL, and Python.
- Strong problem-solving skills with the ability to work effectively in a collaborative environment.
- Excellent communication skills for working with both technical and non-technical stakeholders.
- Knowledge of data modeling techniques, specifically Kimball methodology.
- Ability to manage data pipelines and workflows in Azure.
- Attention to detail and a focus on quality assurance in data handling.