What are the responsibilities and job description for the Senior Data Engineer position at Reorg?
We are seeking a highly skilled and experienced Senior Data Engineer with a strong background in building and managing data pipelines, data warehouses, and data lakes. As a Senior Data Engineer, you will play a pivotal role in our organization's data infrastructure, enabling efficient and reliable data processing, storage, and analysis.
Responsibilities:
- Design and develop robust, scalable, and efficient data pipelines to support the extraction, transformation, and loading (ETL) processes from various data sources into data warehouses and data lakes (see the sketch after this list).
- Collaborate closely with cross-functional teams, including data scientists, analysts, and software engineers, to understand data requirements and design optimal solutions.
- Build and manage data warehouses and data lakes to store and organize large volumes of structured and unstructured data efficiently.
- Implement data governance processes and best practices to ensure data quality, integrity, and security throughout the data lifecycle.
- Identify and address performance bottlenecks, data inconsistencies, and data quality issues in data pipelines, warehouses, and lakes.
- Develop and maintain monitoring and alerting systems to proactively identify and resolve data-related issues.
- Continuously evaluate and explore emerging technologies and tools in the data engineering space to improve data processing efficiency and scalability.
- Mentor and guide junior data engineers, providing technical leadership and fostering a collaborative and innovative environment.
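As a concrete illustration of the pipeline work described in the first responsibility, here is a minimal ETL sketch in Python. It is a hypothetical example, not part of the role: the orders.csv source file, the SQLite warehouse.db stand-in (a real deployment would target a warehouse such as Redshift or Snowflake), and the order_id and amount columns are all assumptions made for illustration.

```python
import csv
import sqlite3
from pathlib import Path

# Hypothetical names for illustration only.
SOURCE_CSV = Path("orders.csv")   # assumed raw extract from an upstream system
WAREHOUSE_DB = "warehouse.db"     # SQLite stand-in for a real warehouse

def extract(path: Path) -> list[dict]:
    """Read raw rows from the source file."""
    with path.open(newline="") as f:
        return list(csv.DictReader(f))

def transform(rows: list[dict]) -> list[tuple]:
    """Apply basic cleaning: drop rows missing an ID, normalize amounts."""
    cleaned = []
    for row in rows:
        if not row.get("order_id"):
            continue  # simple data-quality gate
        cleaned.append((row["order_id"], float(row["amount"])))
    return cleaned

def load(rows: list[tuple]) -> None:
    """Write transformed rows into the warehouse table."""
    conn = sqlite3.connect(WAREHOUSE_DB)
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    load(transform(extract(SOURCE_CSV)))
```

In practice each stage would run as a separate, monitored task in an orchestrator; the DAG sketch after the requirements list shows that shape.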
Requirements:
- Bachelor's degree in Computer Science or a related field.
- Proven experience (minimum 5 years) in building and managing data pipelines, data warehouses, and data lakes in a production environment.
- Proficiency in programming languages such as Python and SQL, and experience with data processing frameworks like Apache Spark or Apache Beam.
- Experience with ETL/ELT frameworks and tools such as AWS Glue, dbt, Airflow, and Airbyte (see the DAG sketch after this list).
- In-depth knowledge of relational databases (e.g., MySQL, PostgreSQL) and experience with columnar storage technologies (e.g., Redshift, Snowflake).
- Strong understanding of distributed systems, data modeling, and database design principles.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and experience in deploying data infrastructure on the cloud.
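To show how the orchestration tools listed above fit together, here is a minimal Airflow 2.x DAG sketch. Everything specific in it is an assumption made for illustration: the orders_etl DAG id, the daily schedule, and the placeholder extract/transform/load callables.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder callables; a real pipeline would put the actual
# extract/transform/load logic here (e.g., the ETL sketch above).
def extract():
    print("pull raw data from source systems")

def transform():
    print("clean and reshape the extracted data")

def load():
    print("write transformed data into the warehouse")

# Hypothetical DAG: id, schedule, and start date are illustrative assumptions.
with DAG(
    dag_id="orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,      # do not backfill past runs
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Run stages strictly in order: extract -> transform -> load.
    extract_task >> transform_task >> load_task
```

A tool like dbt or AWS Glue could stand in for the transform step; the surrounding DAG structure stays the same.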