Bachelor’s degree (or equivalent) in statistics, computer science, or a related field.
Proficiency in Python and basic knowledge of SQL.
Good communication skills, both written and verbal.
Familiarity with collaboration tools (e.g., Atlassian suite) and version control systems (e.g., Git).
Experience or coursework with cloud platforms such as AWS or Google Cloud.
Basic knowledge of notebook environments such as Jupyter and Google Colab.
Ability to work well in a collaborative, team-oriented environment.
Attention to detail and strong time-management skills.
Familiarity with Linux servers and system administration is a plus.
Key Responsibilities:
Collaborate with the team to design, code, and test data pipelines using Python.
Assist in developing and maintaining data warehouses to store large, complex datasets.
Ingest and assemble data from various sources using SQL and Python.
Work with internal and external stakeholders to understand project needs and goals.
Participate in migrating development code into production and help optimize processes under guidance.
Help troubleshoot and improve existing infrastructure and codebases.